Using the ETL Dinámico Dashboard to Run Pipelines

The ETL Dinámico dashboard is a browser-based interface built with Streamlit that lets you configure and execute full ETL pipelines without writing a single line of Python. You choose a source table or write a custom SQL query, map its columns to a destination Data Warehouse table, apply optional transformations, and fire the pipeline — all from one screen. Every result is displayed inline with row counts, a JSON detail panel, and a live sample of the loaded data.

Starting the Dashboard

Launch the app from the root of the project with a single command:

streamlit run app.py

Streamlit will compile the app and open it automatically in your default browser at:

http://localhost:8501

Keep the terminal open for the duration of your session — closing it stops the server.

init_etl() is decorated with @st.cache_resource, which means DatabaseClient and ETLPipeline are instantiated once per server session and reused across every interaction. If you update connection strings in your .env file, you must restart the Streamlit process for the new values to take effect — a page refresh alone is not enough.

Dashboard Layout

The dashboard is divided into two top-level columns that sit side by side, followed by a shared mapping section and the execution controls below them.

┌─────────────────────────┬─────────────────────────────┐
│     Origen de datos     │     Destino en el DW        │
│  (Data Source — left)   │  (DW Destination — right)   │
└─────────────────────────┴─────────────────────────────┘
          ↓ Mapeo de columnas y transformaciones ↓
                    [ Ejecutar ETL ]

Data Source (Left Column)

The left column, labelled Origen de datos, is where you tell the pipeline what data to pull from the OLTP database. Two modes are available, selected via a horizontal radio button.

Mode 1 — Tabla existente (OLTP)

Select this mode to browse the non-system tables that are already present in the OLTP database.

Select a table

A dropdown lists every schema-qualified table returned by extract_tables(). System schemas are automatically excluded — specifically sys, INFORMATION_SCHEMA, guest, and the built-in database-role schemas (db_owner, db_accessadmin, db_securityadmin, db_ddladmin, db_backupoperator, db_datareader, db_datawriter, db_denydatareader, db_denydatawriter). The comparison is case-insensitive. Each entry is formatted as schema.TableName — for example, Sales.Customer.

Choose columns to extract

After selecting a table, a multiselect widget appears pre-populated with all columns. Each option is labelled with both the column name and its SQL type — for example, CustomerID (INTEGER). Deselect any columns you do not need before building your mappings.

Preview source data

Click Vista previa to load the first 10 rows of the selected table with the chosen columns using extract_by_table(source_table, columns=..., limit=10). The result is stored in st.session_state.preview_df so it survives re-renders triggered by other widget interactions — but each click of the button always issues a fresh database query and overwrites the cached frame. A collapsed expander below the preview lists every column name and its SQL type for reference.

Mode 2 — SQL personalizado

Select this mode to supply an arbitrary SQL query instead of selecting a table.

Write your query

A text area with a placeholder of SELECT * FROM .. WHERE .. accepts any valid T-SQL statement targeted at the OLTP database.

Execute and preview

Click Ejecutar consulta to run the query. The pipeline wraps it as a subquery and fetches a TOP 50 preview via extract_by_query(sql_query, limit=50). The resulting DataFrame is displayed inline and its column names are automatically forwarded to the mapping section below.

When you edit the SQL text area, the previously cached query result is cleared automatically so the mapping section always reflects the current query.

Data Warehouse Destination (Right Column)

The right column, labelled Destino en el DW, is where you choose the target table in the Data Warehouse.

Select a destination table

A dropdown lists all tables returned by DatabaseClient.get_olap_tables(), which uses SQLAlchemy’s inspect() against the OLAP engine. Select the dimension or fact table you want to load data into.

Inspect destination columns

Once a table is selected, an expandable panel (expanded by default) renders every column in that table as a markdown list item with the column name in bold and its SQL type in backtick code style:

- **CustomerKey** `INTEGER`
- **CustomerName** `NVARCHAR(100)`
- **TerritoryKey** `INTEGER`
- **ModifiedDate** `DATETIME`

Use this panel to confirm column names and types before building your mappings.

Column Mapping and Transformation Section

Below both columns, a horizontal divider separates the Mapeo de columnas y transformaciones section. This section is only rendered when at least one source column and at least one destination column are both available. For full details on how to configure mappings, transformations, concatenations, and the business key, see Column Mapping.

Running the ETL Pipeline

At the bottom of the mapping section, the Ejecutar ETL button (rendered as a full-width primary button) is the single trigger for the entire pipeline run.

st.button(
    "Ejecutar ETL",
    type="primary",
    use_container_width=True,
    on_click=_validar_y_ejecutar,
    args=(pipeline, source_table, sql_query, target_table, business_key, all_mappings),
)

Clicking the button calls _validar_y_ejecutar(), which first runs a set of validation checks and, if all pass, invokes pipeline.run_dynamic_etl() inside a spinner.

Validation Errors

If any required field is missing or misconfigured, one or more inline error messages appear before the pipeline is executed — nothing is written to the database. The possible validation errors are:

Condition	Error message
No source table selected and no SQL written	`Debes seleccionar una tabla o escribir una consulta SQL.`
No destination table selected	`Debes especificar la tabla destino.`
No business key selected	`Debes seleccionar la llave de negocio.`
No column mappings configured	`Debes configurar al menos un mapeo de columna.`
Business key not present in mapped target columns	`La llave de negocio '{key}' debe estar mapeada desde una columna origen.`
A concat mapping is missing `column1` or `column2`	`Cada concatenación debe tener dos columnas origen.`
A non-concat mapping is missing `source_column`	`Cada mapeo debe tener una columna origen.`
Any mapping (concat or non-concat) is missing `target_column`	`Cada mapeo debe tener una columna destino.`

ETL Results

When the pipeline completes successfully, the dashboard replaces the spinner with a structured results block.

Filas extraídas

Total rows read from the OLTP source — either the full table or the SQL query result — before any filtering or deduplication.

Filas cargadas

Rows actually inserted into the Data Warehouse after the incremental filter removed records whose business key already existed in the destination table.

Below the metrics:

Tabla destino — the name of the DW table that was written to, displayed as inline code.
Ver detalle completo de la respuesta — an expandable panel showing the full JSON result object returned by run_dynamic_etl():

{
  "status": "success",
  "rows_extracted": 847,
  "rows_loaded": 12,
  "table_destination": "dbo.DimCustomer",
  "sample_data": "..."
}

Vista previa de datos cargados — a subheading followed by a st.dataframe rendering of sample_data, which is the first 5 rows of the fully-transformed DataFrame that was submitted to the loader. This lets you verify column names and values without querying the DW manually. Note that this sample is taken from the complete transformed set before the incremental filter, so it may include rows that were not inserted because their business key already existed in the DW.

If an unhandled exception occurs during execution, a red st.error banner appears with the exception message and the run is aborted — no partial data is committed.

Get Started

Architecture

ETL Pipeline

UI Guide

Testing & Development

Starting the Dashboard

Dashboard Layout

Data Source (Left Column)

Mode 1 — Tabla existente (OLTP)

Mode 2 — SQL personalizado

Data Warehouse Destination (Right Column)

Column Mapping and Transformation Section

Running the ETL Pipeline

Validation Errors

ETL Results

Filas extraídas

Filas cargadas

Build docs developers (and LLMs) love

Get Started

Architecture

ETL Pipeline

UI Guide

Testing & Development

Documentation Index

​Starting the Dashboard

​Dashboard Layout

​Data Source (Left Column)

​Mode 1 — Tabla existente (OLTP)

​Mode 2 — SQL personalizado

​Data Warehouse Destination (Right Column)

​Column Mapping and Transformation Section

​Running the ETL Pipeline

​Validation Errors

​ETL Results

Filas extraídas

Filas cargadas

Build docs developers (and LLMs) love

Starting the Dashboard

Dashboard Layout

Data Source (Left Column)

Mode 1 — Tabla existente (OLTP)

Mode 2 — SQL personalizado

Data Warehouse Destination (Right Column)

Column Mapping and Transformation Section

Running the ETL Pipeline

Validation Errors

ETL Results