Configure Column Mappings and Transformations in ETL

The column mapping section is the core of the ETL Dinámico interface. It connects every column coming from the OLTP source to a specific column in the Data Warehouse destination, with an optional transformation applied in between. This section appears automatically once you have selected (or queried) a source and chosen a destination table — no configuration file or Python code required.

The Three-Column Mapping Grid

Each source column gets its own row in a three-column grid:

┌──────────────────────┬──────────────────────┬───────────────────────────┐
│   Columna origen     │   Transformación     │    Columna destino        │
│  (read-only)         │  (dropdown)          │  (dropdown)               │
└──────────────────────┴──────────────────────┴───────────────────────────┘
│  CustomerID          │  Ninguna       ▾     │  CustomerKey  (INTEGER) ▾ │
│  INTEGER             │                      │                           │
├──────────────────────┼──────────────────────┼───────────────────────────┤
│  FirstName           │  Upper         ▾     │  CustomerName (NVARCHAR)▾ │
│  NVARCHAR(50)        │                      │                           │
├──────────────────────┼──────────────────────┼───────────────────────────┤
│  BirthDate           │  Año           ▾     │  BirthYear    (INTEGER) ▾ │
│  DATETIME            │                      │                           │
└──────────────────────┴──────────────────────┴───────────────────────────┘

Column 1 — Source Column (read-only)

The leftmost cell displays the source column’s name in bold and its SQL data type in a code label beneath it. This cell is read-only — it reflects whichever columns you selected in the source panel or the columns returned by your SQL query.

Column 2 — Transformation

A dropdown that determines what operation is applied to the source column’s values before they are written to the destination. The available options map directly to transform types handled by DataTransformer:

UI Label	`transform_type` value	Operation
Ninguna	`none`	Direct copy or rename — `df[target_col] = df[source_col]`
Upper	`upper`	Convert string to upper-case — `.str.upper()`
Lower	`lower`	Convert string to lower-case — `.str.lower()`
Año	`year`	Extract the year from a datetime column — `.dt.year`
Mes	`month`	Extract the month number from a datetime column — `.dt.month`
Día	`day`	Extract the day of the month from a datetime column — `.dt.day`

Date transformations (Año, Mes, Día) call pd.to_datetime() on the source column before extracting the component, so string-encoded dates in ISO format are handled automatically. Numeric or text columns will raise an error if you attempt a date transform on them.

Column 3 — Target Column

A dropdown listing every column in the selected DW table, formatted as ColumnName (TYPE). It also includes a special first option:

— No mapear — — selecting this option excludes the source column from the ETL run entirely. The column will not appear in the DataFrame sent to the loader and will not consume a mapping slot.

Only rows where the target column is not — No mapear — produce a mapping entry in the final column_mappings list that is passed to pipeline.run_dynamic_etl().

Concatenation Mappings

Below the main grid, a separate Concatenaciones section lets you combine two source columns into a single target column using a space separator. This is handled by DataTransformer.concat_transform(), which produces:

df[new_column] = df[column1] + ' ' + df[column2]

Add a concat row

Click + Agregar concatenación. A new row appears with three dropdowns: Columna 1 and Columna 2 (both sourced from the available source columns) and Destino (sourced from the DW target columns). Clicking the button increments st.session_state.num_concat and triggers a st.rerun().

Configure the row

Choose the two source columns to combine and the single target column that will receive the concatenated result. For example, FirstName + LastName → FullName.

Remove a concat row

Click - Quitar última concatenación to remove the most recently added concat row. This decrements st.session_state.num_concat and re-renders the form. Only the last row is removed — earlier rows are preserved.

Each concat mapping is appended to all_mappings as a dict with "type": "concat":

{
    "type": "concat",
    "column1": "FirstName",
    "column2": "LastName",
    "target_column": "FullName"
}

Business Key Selector

Beneath the concatenation controls, a Configuración de carga section contains the business key dropdown. This is a required field and must be set before the ETL can run. The business key is the DW target column used by DataLoader.load_incremental() to detect duplicate records. Before inserting any rows, the loader reads all existing values of this column from the DW table and filters them out of the incoming DataFrame — only rows with a key that does not yet exist are inserted.

Use the primary key or natural key of the dimension or fact table as your business key. For a customer dimension this is typically CustomerKey or CustomerID; for a territory dimension it is TerritoryID; for a sales fact table it is SalesOrderID.

Full Mapping Example — DimCustomer

The table below shows a complete mapping configuration for loading OLTP Sales.Customer rows into a DimCustomer dimension table:

Source Column	Source Type	Transformation	Target Column	Target Type
`CustomerID`	`INTEGER`	Ninguna	`CustomerKey`	`INTEGER`
`FirstName`	`NVARCHAR(50)`	Upper	`CustomerName`	`NVARCHAR(100)`
`EmailAddress`	`NVARCHAR(50)`	Lower	`Email`	`NVARCHAR(50)`
`BirthDate`	`DATETIME`	Año	`BirthYear`	`INTEGER`
`ModifiedDate`	`DATETIME`	Ninguna	`ModifiedDate`	`DATETIME`
`MiddleName`	`NVARCHAR(50)`	— No mapear —	(excluded)	—

Concat mapping: FirstName + LastName → FullName The resulting column_mappings list passed to run_dynamic_etl() would look like this:

column_mappings = [
    {"source_column": "CustomerID",   "transform_type": "none",  "target_column": "CustomerKey"},
    {"source_column": "FirstName",    "transform_type": "upper", "target_column": "CustomerName"},
    {"source_column": "EmailAddress", "transform_type": "lower", "target_column": "Email"},
    {"source_column": "BirthDate",    "transform_type": "year",  "target_column": "BirthYear"},
    {"source_column": "ModifiedDate", "transform_type": "none",  "target_column": "ModifiedDate"},
    {"type": "concat", "column1": "FirstName", "column2": "LastName", "target_column": "FullName"},
]

Business key: CustomerKey

Validation Rules

The _validar_y_ejecutar() function enforces the following rules before submitting the pipeline. All errors are shown as inline st.error banners and the pipeline is not started until every issue is resolved:

Must select source table OR write a SQL query

At least one of source_table or sql_query must be non-empty. You cannot run ETL without specifying where the data comes from.

Must specify a target table

The target_table value must be non-empty. Select a table from the DW destination dropdown.

Must select a business key

The business_key value must be non-empty. Use the Llave de negocio selector in the Configuración de carga sub-section.

Must configure at least one column mapping

The column_mappings list must contain at least one entry. If every column is set to — No mapear —, this validation will fail.

Business key must be in the mapped target columns

The value selected as the business key must appear as target_column in at least one mapping — regular or concat. _validar_y_ejecutar builds mapped_targets from all entries in column_mappings regardless of type, and checks that business_key in mapped_targets. This guarantees the incremental deduplication key is present in the DataFrame being loaded.

Non-concat mappings must have a source column

Every entry in column_mappings that does not have "type": "concat" must provide a non-empty source_column value. A mapping row left with no source column selected will trigger this error: Cada mapeo debe tener una columna origen.

Every mapping must have a target column

Every entry in column_mappings — whether a regular mapping or a concat mapping — must provide a non-empty target_column value. Missing the destination column triggers: Cada mapeo debe tener una columna destino.

Concat mappings require both source columns

Every entry with "type": "concat" must have non-empty column1 and column2 values. Adding a concat row and leaving either column on its default without reviewing it will trigger this error.

Get Started

Architecture

ETL Pipeline

UI Guide

Testing & Development

The Three-Column Mapping Grid

Column 1 — Source Column (read-only)

Column 2 — Transformation

Column 3 — Target Column

Concatenation Mappings

Business Key Selector

Full Mapping Example — DimCustomer

Validation Rules

Build docs developers (and LLMs) love

Get Started

Architecture

ETL Pipeline

UI Guide

Testing & Development

Documentation Index

​The Three-Column Mapping Grid

​Column 1 — Source Column (read-only)

​Column 2 — Transformation

​Column 3 — Target Column

​Concatenation Mappings

​Business Key Selector

​Full Mapping Example — DimCustomer

​Validation Rules

Build docs developers (and LLMs) love

The Three-Column Mapping Grid

Column 1 — Source Column (read-only)

Column 2 — Transformation

Column 3 — Target Column

Concatenation Mappings

Business Key Selector

Full Mapping Example — DimCustomer

Validation Rules