The ETL Dinámico dashboard is a browser-based interface built with Streamlit that lets you configure and execute full ETL pipelines without writing a single line of Python. You choose a source table or write a custom SQL query, map its columns to a destination Data Warehouse table, apply optional transformations, and fire the pipeline — all from one screen. Every result is displayed inline with row counts, a JSON detail panel, and a live sample of the loaded data.Documentation Index
Fetch the complete documentation index at: https://mintlify.com/obedc295/proyect_dw/llms.txt
Use this file to discover all available pages before exploring further.
Starting the Dashboard
Launch the app from the root of the project with a single command:init_etl() is decorated with @st.cache_resource, which means DatabaseClient and ETLPipeline are instantiated once per server session and reused across every interaction. If you update connection strings in your .env file, you must restart the Streamlit process for the new values to take effect — a page refresh alone is not enough.Dashboard Layout
The dashboard is divided into two top-level columns that sit side by side, followed by a shared mapping section and the execution controls below them.Data Source (Left Column)
The left column, labelled Origen de datos, is where you tell the pipeline what data to pull from the OLTP database. Two modes are available, selected via a horizontal radio button.Mode 1 — Tabla existente (OLTP)
Select this mode to browse the non-system tables that are already present in the OLTP database.Select a table
A dropdown lists every schema-qualified table returned by
extract_tables(). System schemas are automatically excluded — specifically sys, INFORMATION_SCHEMA, guest, and the built-in database-role schemas (db_owner, db_accessadmin, db_securityadmin, db_ddladmin, db_backupoperator, db_datareader, db_datawriter, db_denydatareader, db_denydatawriter). The comparison is case-insensitive. Each entry is formatted as schema.TableName — for example, Sales.Customer.Choose columns to extract
After selecting a table, a multiselect widget appears pre-populated with all columns. Each option is labelled with both the column name and its SQL type — for example,
CustomerID (INTEGER). Deselect any columns you do not need before building your mappings.Preview source data
Click Vista previa to load the first 10 rows of the selected table with the chosen columns using
extract_by_table(source_table, columns=..., limit=10). The result is stored in st.session_state.preview_df so it survives re-renders triggered by other widget interactions — but each click of the button always issues a fresh database query and overwrites the cached frame. A collapsed expander below the preview lists every column name and its SQL type for reference.Mode 2 — SQL personalizado
Select this mode to supply an arbitrary SQL query instead of selecting a table.Write your query
A text area with a placeholder of
SELECT * FROM .. WHERE .. accepts any valid T-SQL statement targeted at the OLTP database.When you edit the SQL text area, the previously cached query result is cleared automatically so the mapping section always reflects the current query.
Data Warehouse Destination (Right Column)
The right column, labelled Destino en el DW, is where you choose the target table in the Data Warehouse.Select a destination table
A dropdown lists all tables returned by
DatabaseClient.get_olap_tables(), which uses SQLAlchemy’s inspect() against the OLAP engine. Select the dimension or fact table you want to load data into.Inspect destination columns
Once a table is selected, an expandable panel (expanded by default) renders every column in that table as a markdown list item with the column name in bold and its SQL type in backtick code style:Use this panel to confirm column names and types before building your mappings.
Column Mapping and Transformation Section
Below both columns, a horizontal divider separates the Mapeo de columnas y transformaciones section. This section is only rendered when at least one source column and at least one destination column are both available. For full details on how to configure mappings, transformations, concatenations, and the business key, see Column Mapping.Running the ETL Pipeline
At the bottom of the mapping section, the Ejecutar ETL button (rendered as a full-width primary button) is the single trigger for the entire pipeline run._validar_y_ejecutar(), which first runs a set of validation checks and, if all pass, invokes pipeline.run_dynamic_etl() inside a spinner.
Validation Errors
If any required field is missing or misconfigured, one or more inline error messages appear before the pipeline is executed — nothing is written to the database. The possible validation errors are:| Condition | Error message |
|---|---|
| No source table selected and no SQL written | Debes seleccionar una tabla o escribir una consulta SQL. |
| No destination table selected | Debes especificar la tabla destino. |
| No business key selected | Debes seleccionar la llave de negocio. |
| No column mappings configured | Debes configurar al menos un mapeo de columna. |
| Business key not present in mapped target columns | La llave de negocio '{key}' debe estar mapeada desde una columna origen. |
A concat mapping is missing column1 or column2 | Cada concatenación debe tener dos columnas origen. |
A non-concat mapping is missing source_column | Cada mapeo debe tener una columna origen. |
Any mapping (concat or non-concat) is missing target_column | Cada mapeo debe tener una columna destino. |
ETL Results
When the pipeline completes successfully, the dashboard replaces the spinner with a structured results block.Filas extraídas
Total rows read from the OLTP source — either the full table or the SQL query result — before any filtering or deduplication.
Filas cargadas
Rows actually inserted into the Data Warehouse after the incremental filter removed records whose business key already existed in the destination table.
- Tabla destino — the name of the DW table that was written to, displayed as inline code.
- Ver detalle completo de la respuesta — an expandable panel showing the full JSON result object returned by
run_dynamic_etl():
- Vista previa de datos cargados — a subheading followed by a
st.dataframerendering ofsample_data, which is the first 5 rows of the fully-transformed DataFrame that was submitted to the loader. This lets you verify column names and values without querying the DW manually. Note that this sample is taken from the complete transformed set before the incremental filter, so it may include rows that were not inserted because their business key already existed in the DW.
st.error banner appears with the exception message and the run is aborted — no partial data is committed.