PruebaETL Documentation
Transform messy CSV data into clean, normalized SQL Server databases with automatic encoding detection, data cleaning, and schema generation
Quick start
Get up and running with PruebaETL in minutes
Prepare your CSV files
Place your customer and sales CSV files in the project directory. The tool supports multiple encodings (UTF-8, Latin-1, CP1252) and will automatically detect the correct one.
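One common way to detect an encoding is to try each candidate in turn and keep the first that decodes the file without errors. The sketch below illustrates that idea only; the function name `detect_encoding` and the candidate order are assumptions, not PruebaETL's actual implementation.

```python
from pathlib import Path

# Candidate encodings, tried in order (an assumed order, not taken from the tool).
# Latin-1 can decode any byte sequence, so it has to come last as the fallback.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252", "latin-1")


def detect_encoding(path, candidates=CANDIDATE_ENCODINGS):
    """Return the first candidate encoding that decodes the whole file."""
    raw = Path(path).read_bytes()
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes, try the next
        return encoding
    raise ValueError(f"none of {candidates} could decode {path}")
```

The returned name can be passed straight to `open(path, encoding=...)` when reading the CSV. Trial decoding only shows that the bytes are valid in an encoding, not that the text is what the author intended, so a quick look at accented characters in the output is still worthwhile.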
Normalize customer data
Run the customer normalization script to clean your customer data and extract cities and segments into separate tables.
Expected output
The script generates four files in the current directory:
ciudades.csv - Unique cities with IDs
segmentos.csv - Unique customer segments with IDs
clientes_normalizados.csv - Normalized customer data with foreign keys
clientes_normalizados_completo.csv - Denormalized view for reference
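To make the relationship between these files concrete, here is a minimal in-memory sketch of how flat customer rows could be split into city, segment, and customer tables. The column names (`cliente_id`, `nombre`, `ciudad`, `segmento`) and both function names are hypothetical; the real script may use different names and logic.

```python
def build_lookup(rows, column):
    """Assign a sequential ID (starting at 1) to each distinct value in `column`."""
    lookup = {}
    for row in rows:
        value = row[column]
        if value not in lookup:
            lookup[value] = len(lookup) + 1
    return lookup


def normalize_customers(rows):
    """Split flat customer rows into city, segment, and customer tables.

    Column names here are assumptions made for illustration.
    """
    cities = build_lookup(rows, "ciudad")
    segments = build_lookup(rows, "segmento")
    customers = [
        {
            "cliente_id": row["cliente_id"],
            "nombre": row["nombre"],
            # Replace the repeated text values with foreign keys.
            "ciudad_id": cities[row["ciudad"]],
            "segmento_id": segments[row["segmento"]],
        }
        for row in rows
    ]
    return cities, segments, customers
```

Each returned lookup maps a name to its ID, which corresponds to one row of a reference file, while the customer rows keep only the IDs.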
Normalize sales data
Run the sales normalization script to process transaction data. This creates normalized sales files with proper data types and reference tables for channels and currencies.
Key features
Everything you need for robust ETL processing
Multi-encoding support
Automatically detects and handles UTF-8, Latin-1 (ISO-8859-1), and CP1252 encodings
Data normalization
Extracts redundant data into separate tables following relational database design principles
Flexible date parsing
Accepts multiple date formats and normalizes to ISO standard (YYYY-MM-DD)
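As an illustration of multi-format date parsing, the sketch below tries a list of formats in order and returns the ISO form of the first match. The format list and the name `to_iso_date` are assumptions; the formats PruebaETL actually accepts may differ.

```python
from datetime import datetime

# Hypothetical format list. Order matters for ambiguous inputs such as
# 03/04/2024, which this list reads as day/month/year.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y", "%Y/%m/%d")


def to_iso_date(text, formats=DATE_FORMATS):
    """Return `text` as YYYY-MM-DD, or None if no format matches."""
    cleaned = text.strip()
    for fmt in formats:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    return None
```

Returning `None` for unparseable values lets the caller decide whether to reject the row or load the date as NULL.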
SQL Server integration
Generates complete SQL scripts with DDL, constraints, indexes, and data insertion
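The following sketch shows one way a script could emit T-SQL for a small reference table: a CREATE TABLE statement followed by one INSERT per row. The function names, the `NVARCHAR(100)` column size, and the table layout are assumptions for illustration, not the tool's generated schema.

```python
def sql_literal(value):
    """Render a Python value as a T-SQL literal, doubling single quotes."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "N'" + str(value).replace("'", "''") + "'"


def lookup_table_sql(table, id_column, name_column, lookup):
    """Build DDL plus INSERT statements for a {name: id} lookup table.

    Identifiers are interpolated as-is, so they must come from trusted code,
    never from user input.
    """
    statements = [
        f"CREATE TABLE {table} (\n"
        f"    {id_column} INT NOT NULL PRIMARY KEY,\n"
        f"    {name_column} NVARCHAR(100) NOT NULL UNIQUE\n"
        f");"
    ]
    # Emit rows in ID order so the script is stable between runs.
    for name, row_id in sorted(lookup.items(), key=lambda item: item[1]):
        statements.append(
            f"INSERT INTO {table} ({id_column}, {name_column}) "
            f"VALUES ({sql_literal(row_id)}, {sql_literal(name)});"
        )
    return "\n".join(statements)
```

For example, `lookup_table_sql("ciudades", "ciudad_id", "nombre", {"Lima": 1})` produces a script that can be run in SQL Server Management Studio or with `sqlcmd`.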
Data quality
Cleans text, handles missing values, validates data types, and removes duplicates
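A rough sketch of what text cleaning and duplicate removal can look like is shown below. The rules here (collapse whitespace, treat empty strings as missing, drop exact duplicates after cleaning) are assumptions chosen for illustration; the tool's own rules may be stricter or different.

```python
def clean_text(value):
    """Trim and collapse whitespace; map empty or missing text to None."""
    if value is None:
        return None
    collapsed = " ".join(value.split())
    return collapsed or None


def clean_rows(rows):
    """Clean every field, then drop rows that duplicate an earlier row."""
    seen = set()
    cleaned = []
    for row in rows:
        new_row = {key: clean_text(val) for key, val in row.items()}
        # Keys are unique, so sorting the items never compares the values.
        fingerprint = tuple(sorted(new_row.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(new_row)
    return cleaned
```

Cleaning before deduplicating matters: two rows that differ only in stray spaces become identical after cleaning and are then collapsed into one.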
Audit trail
Preserves raw data in separate tables for compliance and debugging
Explore by topic
Dive deeper into specific aspects of the ETL pipeline
ETL process
Understand the complete data transformation workflow
Database schema
Learn about the normalized relational structure
Customer data
How customer records are processed and cleaned
Sales data
Transaction data normalization and validation
Schema generation
SQL DDL and DML generation from normalized data
Functions reference
Complete API documentation for all Python functions
Ready to transform your data?
Start normalizing your CSV files and building production-ready SQL Server databases in minutes
Get Started