Data Pipelines

The Data Pipelines surface is the core of Zingle. It's where you create, manage, and ship AI-assisted data models — from initial requirement to production-ready pull request.


The pipeline list

Navigate to Data Pipelines in the sidebar to see all models in your workspace.

| Column | Description |
| --- | --- |
| Name | Model name (e.g., gold_customer_ltv) |
| Status | Active, Draft, or Pending Approval |
| Approval | Current approval state from the Approval Console |
| Owner | Team member responsible for the pipeline |
| Last updated | Timestamp of the most recent change |

Available actions

  • Create — open the AI modeling workspace to build a new pipeline
  • Search — filter models by name, status, or owner
  • Activate / Deactivate — toggle pipeline state
  • Edit — modify an existing pipeline's configuration
  • Observe in Airflow — deep-link to the corresponding Airflow DAG (requires Airflow connection)

The AI modeling workspace

Click Create from the pipeline list to open the full modeling workspace. This is a three-panel interface:

Left panel — Lineage canvas

A visual DAG showing the flow of data through modeling layers:

Bronze → Silver → Intermediate (int_) → Gold → Semantic
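
For example, a model like gold_customer_ltv might draw on silver-layer customer and order tables through an int_ model before feeding the semantic layer (the concrete table names are up to you).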

Click any node to inspect its schema, SQL, and test configuration in the bottom pane.

Bottom panel — Detail tabs

| Tab | Purpose |
| --- | --- |
| Schema Details | Column names, types, and constraints for the selected table |
| SQL Query | Generated SQL with syntax highlighting, formatting, and diff view |
| Tests | Data quality tests (not-null, unique, accepted values, custom) |
| Schedule | Execution frequency, timing, and compute engine assignment |
| PRs | History of pull requests raised for this pipeline |

Right panel — Unified Chat Assistant

This AI-powered chat interface drives the entire workflow. The assistant supports:

  • Natural language input — describe what you want to build
  • SQL paste — provide existing SQL for refactoring
  • Notebook upload — attach .ipynb or .py files
  • Autocomplete — intelligent suggestions as you type

Building a model — step by step

  1. Describe your requirement

    Tell the assistant what table you need. Be specific about:

    • Purpose — what business question does this answer?
    • Source tables — what data feeds into this?
    • Key columns — what fields matter most?
    • Business rules — any transformations, filters, or aggregations?

    The assistant analyzes your input and selects the appropriate modeling flow.
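
    For example, a first request might read (source table and column names here are illustrative):

    ```
    Build a gold_customer_ltv table that answers "what is each customer's
    lifetime value?" Source it from silver_customers and silver_orders.
    Key columns: customer_id, total_revenue, order_count, first_order_date.
    Business rules: exclude cancelled orders; one row per customer.
    ```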

  2. Review proposed schemas

    Zingle generates schemas for intermediate and gold tables. For each table:

    • Review column definitions in the Schema Details tab
    • Verify data types and nullability
    • Click Accept to lock in the schema, or ask the assistant for changes

    The lineage canvas updates in real time as schemas are accepted.
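
    As an illustration, an accepted schema for the example above might correspond to DDL like this (types and nullability are hypothetical):

    ```sql
    -- Illustrative only: Zingle presents this information in the
    -- Schema Details tab rather than as raw DDL.
    CREATE TABLE gold_customer_ltv (
        customer_id      VARCHAR NOT NULL,  -- primary key
        total_revenue    DECIMAL(18, 2),
        order_count      INTEGER,
        first_order_date DATE
    );
    ```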

  3. Review and accept SQL

    Switch to the SQL Query tab to see the generated transformation logic.

    • Use the diff view to compare before/after changes
    • Use the format button to standardize SQL style
    • Click Accept once the SQL is correct

    Modeling guidelines

    The SQL generator respects your workspace modeling guidelines. If you've defined rules like "no joins in bronze layer" or "use incremental materialization for large tables," the AI follows them.
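
    Continuing the example, the generated transformation might resemble this sketch (source tables and columns are hypothetical):

    ```sql
    -- Hypothetical gold-layer aggregation; the actual output depends on
    -- your sources and workspace modeling guidelines.
    SELECT
        c.customer_id,
        SUM(o.order_total)  AS total_revenue,
        COUNT(o.order_id)   AS order_count,
        MIN(o.ordered_at)   AS first_order_date
    FROM silver_customers AS c
    JOIN silver_orders AS o
      ON o.customer_id = c.customer_id
    WHERE o.status <> 'cancelled'
    GROUP BY c.customer_id;
    ```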

  4. Configure data quality tests

    In the Tests tab, Zingle suggests tests based on the schema:

    • not_null for required columns
    • unique for primary keys
    • accepted_values for categorical fields
    • Custom tests for business logic

    Accept, reject, or modify each suggestion.
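
    A common convention for custom tests, assumed here for illustration, is a query that returns violating rows, so an empty result means a pass:

    ```sql
    -- Hypothetical custom test for "lifetime value is never negative":
    -- any returned row counts as a failure.
    SELECT customer_id, total_revenue
    FROM gold_customer_ltv
    WHERE total_revenue < 0;
    ```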

  5. Set compute engine and schedule

    Choose the compute engine for this pipeline:

    • Snowflake — X-Small through 4X-Large
    • Managed DuckDB — for lightweight transformations

    Then set the execution schedule (frequency, start time, timezone).
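
    For example, a lightweight daily rollup might run on Managed DuckDB at 06:00 in your workspace timezone, while a heavy aggregation over large tables would likely call for a larger Snowflake warehouse.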

  6. Raise a pull request

    Once all gates pass — schemas accepted, SQL accepted, tests configured, compute set, schedule defined — the Review changes and raise PR button activates.

    Click it to:

    1. Review a summary of all proposed changes
    2. Select the target repository and branch
    3. Create the PR with all SQL, YAML, test, and semantic artifacts
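
    The resulting PR bundles everything generated in the previous steps. A purely illustrative layout (actual paths depend on your repository conventions):

    ```
    models/gold/gold_customer_ltv.sql    # transformation SQL
    models/gold/gold_customer_ltv.yml    # schema and test configuration
    semantic/gold_customer_ltv.yml       # semantic artifacts
    ```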

Model detail view

Click any model from the pipeline list to open its detail view. Here you can:

  • View model metadata (name, owner, creation date)
  • Switch between Production and Draft tabs
  • Trigger a model run
  • View step-by-step execution history
  • Discard draft changes
  • Navigate to individual step details (schema, query, configuration)

Quality gates

The PR button is intentionally gated. It only activates when all of the following are satisfied:

| Gate | Requirement |
| --- | --- |
| Schema accepted | Every intermediate and gold table schema must be reviewed and accepted |
| SQL accepted | Transformation SQL must be accepted for each table |
| Tests configured | All suggested tests must be accepted or explicitly rejected |
| Compute engine set | A compute engine must be assigned |
| Schedule defined | Execution schedule must be configured |

This ensures nothing ships to your repo without complete configuration.


Tips

  • Be specific in your descriptions. The more context you give the AI (purpose, source tables, column names, business rules), the better the initial output.
  • Use modeling guidelines. Configure them in Settings to get consistent naming, layer enforcement, and materialization patterns across all pipelines.
  • Review the diff view. Before accepting SQL, use the split diff editor to understand exactly what changed.