Skip to main content

Documentation Index

Fetch the complete documentation index at: https://mintlify.com/obedc295/proyect_dw/llms.txt

Use this file to discover all available pages before exploring further.

The column mapping section is the core of the ETL Dinámico interface. It connects every column coming from the OLTP source to a specific column in the Data Warehouse destination, with an optional transformation applied in between. This section appears automatically once you have selected (or queried) a source and chosen a destination table — no configuration file or Python code required.

The Three-Column Mapping Grid

Each source column gets its own row in a three-column grid:
┌──────────────────────┬──────────────────────┬───────────────────────────┐
│   Columna origen     │   Transformación     │    Columna destino        │
│  (read-only)         │  (dropdown)          │  (dropdown)               │
└──────────────────────┴──────────────────────┴───────────────────────────┘
│  CustomerID          │  Ninguna       ▾     │  CustomerKey  (INTEGER) ▾ │
│  INTEGER             │                      │                           │
├──────────────────────┼──────────────────────┼───────────────────────────┤
│  FirstName           │  Upper         ▾     │  CustomerName (NVARCHAR)▾ │
│  NVARCHAR(50)        │                      │                           │
├──────────────────────┼──────────────────────┼───────────────────────────┤
│  BirthDate           │  Año           ▾     │  BirthYear    (INTEGER) ▾ │
│  DATETIME            │                      │                           │
└──────────────────────┴──────────────────────┴───────────────────────────┘

Column 1 — Source Column (read-only)

The leftmost cell displays the source column’s name in bold and its SQL data type in a code label beneath it. This cell is read-only — it reflects whichever columns you selected in the source panel or the columns returned by your SQL query.

Column 2 — Transformation

A dropdown that determines what operation is applied to the source column’s values before they are written to the destination. The available options map directly to transform types handled by DataTransformer:
UI Labeltransform_type valueOperation
NingunanoneDirect copy or rename — df[target_col] = df[source_col]
UpperupperConvert string to upper-case — .str.upper()
LowerlowerConvert string to lower-case — .str.lower()
AñoyearExtract the year from a datetime column — .dt.year
MesmonthExtract the month number from a datetime column — .dt.month
DíadayExtract the day of the month from a datetime column — .dt.day
Date transformations (Año, Mes, Día) call pd.to_datetime() on the source column before extracting the component, so string-encoded dates in ISO format are handled automatically. Numeric or text columns will raise an error if you attempt a date transform on them.

Column 3 — Target Column

A dropdown listing every column in the selected DW table, formatted as ColumnName (TYPE). It also includes a special first option:
  • — No mapear — — selecting this option excludes the source column from the ETL run entirely. The column will not appear in the DataFrame sent to the loader and will not consume a mapping slot.
Only rows where the target column is not — No mapear — produce a mapping entry in the final column_mappings list that is passed to pipeline.run_dynamic_etl().

Concatenation Mappings

Below the main grid, a separate Concatenaciones section lets you combine two source columns into a single target column using a space separator. This is handled by DataTransformer.concat_transform(), which produces:
df[new_column] = df[column1] + ' ' + df[column2]
1

Add a concat row

Click + Agregar concatenación. A new row appears with three dropdowns: Columna 1 and Columna 2 (both sourced from the available source columns) and Destino (sourced from the DW target columns). Clicking the button increments st.session_state.num_concat and triggers a st.rerun().
2

Configure the row

Choose the two source columns to combine and the single target column that will receive the concatenated result. For example, FirstName + LastNameFullName.
3

Remove a concat row

Click - Quitar última concatenación to remove the most recently added concat row. This decrements st.session_state.num_concat and re-renders the form. Only the last row is removed — earlier rows are preserved.
Each concat mapping is appended to all_mappings as a dict with "type": "concat":
{
    "type": "concat",
    "column1": "FirstName",
    "column2": "LastName",
    "target_column": "FullName"
}

Business Key Selector

Beneath the concatenation controls, a Configuración de carga section contains the business key dropdown. This is a required field and must be set before the ETL can run. The business key is the DW target column used by DataLoader.load_incremental() to detect duplicate records. Before inserting any rows, the loader reads all existing values of this column from the DW table and filters them out of the incoming DataFrame — only rows with a key that does not yet exist are inserted.
Use the primary key or natural key of the dimension or fact table as your business key. For a customer dimension this is typically CustomerKey or CustomerID; for a territory dimension it is TerritoryID; for a sales fact table it is SalesOrderID.

Full Mapping Example — DimCustomer

The table below shows a complete mapping configuration for loading OLTP Sales.Customer rows into a DimCustomer dimension table:
Source ColumnSource TypeTransformationTarget ColumnTarget Type
CustomerIDINTEGERNingunaCustomerKeyINTEGER
FirstNameNVARCHAR(50)UpperCustomerNameNVARCHAR(100)
EmailAddressNVARCHAR(50)LowerEmailNVARCHAR(50)
BirthDateDATETIMEAñoBirthYearINTEGER
ModifiedDateDATETIMENingunaModifiedDateDATETIME
MiddleNameNVARCHAR(50)— No mapear —(excluded)
Concat mapping: FirstName + LastNameFullName The resulting column_mappings list passed to run_dynamic_etl() would look like this:
column_mappings = [
    {"source_column": "CustomerID",   "transform_type": "none",  "target_column": "CustomerKey"},
    {"source_column": "FirstName",    "transform_type": "upper", "target_column": "CustomerName"},
    {"source_column": "EmailAddress", "transform_type": "lower", "target_column": "Email"},
    {"source_column": "BirthDate",    "transform_type": "year",  "target_column": "BirthYear"},
    {"source_column": "ModifiedDate", "transform_type": "none",  "target_column": "ModifiedDate"},
    {"type": "concat", "column1": "FirstName", "column2": "LastName", "target_column": "FullName"},
]
Business key: CustomerKey

Validation Rules

The _validar_y_ejecutar() function enforces the following rules before submitting the pipeline. All errors are shown as inline st.error banners and the pipeline is not started until every issue is resolved:
At least one of source_table or sql_query must be non-empty. You cannot run ETL without specifying where the data comes from.
The target_table value must be non-empty. Select a table from the DW destination dropdown.
The business_key value must be non-empty. Use the Llave de negocio selector in the Configuración de carga sub-section.
The column_mappings list must contain at least one entry. If every column is set to — No mapear —, this validation will fail.
The value selected as the business key must appear as target_column in at least one mapping — regular or concat. _validar_y_ejecutar builds mapped_targets from all entries in column_mappings regardless of type, and checks that business_key in mapped_targets. This guarantees the incremental deduplication key is present in the DataFrame being loaded.
Every entry in column_mappings that does not have "type": "concat" must provide a non-empty source_column value. A mapping row left with no source column selected will trigger this error: Cada mapeo debe tener una columna origen.
Every entry in column_mappings — whether a regular mapping or a concat mapping — must provide a non-empty target_column value. Missing the destination column triggers: Cada mapeo debe tener una columna destino.
Every entry with "type": "concat" must have non-empty column1 and column2 values. Adding a concat row and leaving either column on its default without reviewing it will trigger this error.

Build docs developers (and LLMs) love