ETL Dinámico is a layered, modular Extract-Transform-Load system that connects a transactional SQL Server database (OLTP) to a Data Warehouse (OLAP). It lets you configure column mappings, apply transformations, and run incremental loads entirely through an interactive Streamlit dashboard, with no hardcoded schemas required.
Quickstart
Install dependencies, configure your .env, and run your first ETL pipeline in minutes.
Configuration
Set up OLTP and OLAP connection strings using environment variables.
Architecture Overview
Understand the four-layer design: Settings, DatabaseClient, Services, and UI.
ETL Pipeline
Learn how extraction, transformation, and incremental loading work together.
UI Dashboard Guide
Use the Streamlit interface to map columns, preview data, and execute ETL runs.
API Reference
Full reference for all public classes: ETLPipeline, DataExtractor, DataTransformer, and more.
How It Works
ETL Dinámico follows a clean three-phase pipeline orchestrated by ETLPipeline.run_dynamic_etl():
Extract
DataExtractor reads from your SQL Server OLTP source — either a full table or a custom SQL query — and returns a Pandas DataFrame.
Transform
DataTransformer applies per-column operations: uppercase/lowercase text, date component extraction (year, month, day), or concatenating two columns into one.
Key Features
Dynamic Column Mapping
Configure source→target column mappings and transformations at runtime through the UI — no code changes needed.
Incremental Loading
Business-key deduplication ensures only new records enter the Data Warehouse on every run.
Multiple Transform Types
Upper/lower case conversion, year/month/day extraction from dates, and multi-column concatenation.
Custom SQL Support
Write a custom SELECT query as the extraction source instead of selecting a full table.
Streamlit Dashboard
An interactive web UI lets you configure and run ETL pipelines without writing Python.
Automated Tests
Pytest suite covers connection health, transformer logic, incremental loader filtering, and full pipeline orchestration.
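The transform types and business-key deduplication described above can be sketched with pandas as follows. This is a minimal illustration under stated assumptions, not the project's actual code: apply_transform, filter_new_rows, the column names, and the sample data are all hypothetical, and the real DataTransformer and loader classes may expose a different interface.

```python
import pandas as pd


# Hypothetical helper, not part of ETL Dinámico's documented API.
def apply_transform(df, source, target, kind, other=None, sep=" "):
    """Return a copy of df with column `target` derived from `source`."""
    out = df.copy()
    col = df[source]
    if kind == "upper":
        out[target] = col.str.upper()
    elif kind == "lower":
        out[target] = col.str.lower()
    elif kind in ("year", "month", "day"):
        # Parse to datetime, then pull the requested date component.
        out[target] = getattr(pd.to_datetime(col).dt, kind)
    elif kind == "concat":
        # Join two columns into one, separated by `sep`.
        out[target] = col.astype(str) + sep + df[other].astype(str)
    else:
        raise ValueError(f"unknown transform: {kind}")
    return out


# Hypothetical helper: one possible reading of "business-key deduplication".
def filter_new_rows(incoming, existing_keys, key):
    """Keep only rows whose business key is not already in the warehouse."""
    deduped = incoming.drop_duplicates(subset=[key])
    return deduped[~deduped[key].isin(existing_keys)].reset_index(drop=True)


# Sample data standing in for an extracted OLTP table.
source = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "first_name": ["ana", "luis", "eva"],
    "last_name": ["ruiz", "mora", "soto"],
    "signup_date": ["2023-01-15", "2024-06-30", "2024-12-01"],
})

staged = apply_transform(source, "first_name", "first_name_upper", "upper")
staged = apply_transform(staged, "signup_date", "signup_year", "year")
staged = apply_transform(staged, "first_name", "full_name", "concat",
                         other="last_name")

# Assume customers 1 and 2 were loaded on a previous run.
new_rows = filter_new_rows(staged, existing_keys={1, 2}, key="customer_id")
print(new_rows[["customer_id", "full_name", "signup_year"]])
```

In this sketch only customer 3 survives the filter, because its business key is not among the keys already present in the target. Filtering in pandas before the load is just one way to do this; the project may perform the comparison differently, for example inside the database.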