TinderJob is a full-cycle data analytics project built for DataTalent Solutions S.L. to optimize their reskilling programs. It combines automated web scraping of live Spanish tech job listings, statistical analysis of salary distributions and skill demand, bias auditing, and an interactive Streamlit dashboard with a CV-based job matching engine, all backed by real data from Tecnoempleo and the DS Salaries global dataset.
Introduction
Understand the project context, data sources, and team structure behind TinderJob.
Quickstart
Clone the repo, install dependencies, and run the scraper and dashboard in minutes.
Data Pipeline
Explore the scraper, cleaning pipeline, and data dictionary for the Tecnoempleo dataset.
Analysis Notebooks
Walk through the three sequential EDA notebooks covering descriptive stats, correlations, and bias.
Streamlit Dashboard
Run and navigate the interactive dashboard: market view, salary analysis, conditional probability, and TinderMatch.
Key Findings
Read the main insights: top skills, salary benchmarks, and strategic recommendations for DataTalent.
How TinderJob Works
Scrape live job listings
Run the Tecnoempleo scraper to collect fresh offers across 24 tech profiles — Data Scientist, DevOps, Cloud, Ciberseguridad, and more.
Clean and normalize the data
Execute the cleaning pipeline to remove duplicates, parse salary ranges into salario_min, salario_max, and salario_medio, and flag outliers with IQR.
Analyze with Jupyter notebooks
Run the three EDA notebooks to explore descriptive statistics, correlations, conditional probabilities, and data bias.
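As a rough illustration of the cleaning step above, the sketch below parses a salary range into salario_min, salario_max, and salario_medio, then flags outliers with the 1.5 x IQR rule. It is not the project's actual pipeline: the function names `parse_salary_range` and `flag_iqr_outliers` are hypothetical, the input format (amounts with `.` as the thousands separator, such as "30.000€ - 45.000€") is an assumption about how listings present salaries, and the quartile method is the Python standard library default, which may differ from the one used in the project.

```python
import re
from statistics import quantiles

# Matches amounts such as "30.000" (dot as thousands separator) or "3000".
# The input format is an assumption, not taken from the project.
AMOUNT_RE = re.compile(r"\d{1,3}(?:\.\d{3})+|\d+")


def parse_salary_range(text):
    """Return (salario_min, salario_max, salario_medio) for a salary string.

    Returns (None, None, None) when the string contains no amount.
    """
    amounts = [int(raw.replace(".", "")) for raw in AMOUNT_RE.findall(text)]
    if not amounts:
        return (None, None, None)
    low, high = min(amounts), max(amounts)
    return (low, high, (low + high) / 2)


def flag_iqr_outliers(values, k=1.5):
    """Return one boolean per value: True if it lies outside the IQR fences.

    The fences are Q1 - k * IQR and Q3 + k * IQR. At least two values
    are needed to compute quartiles.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Example with made-up listings (illustrative values, not project data).
raw_salaries = ["30.000€ - 45.000€", "40.000€", "No especificado"]
parsed = [parse_salary_range(s) for s in raw_salaries]

# Only rows with a known salario_medio take part in the outlier check.
medios = [row[2] for row in parsed if row[2] is not None]
flags = flag_iqr_outliers(medios)
```

Listings without a parseable salary are left out of the outlier check rather than imputed, so missing values do not distort the quartiles.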
TinderJob is an educational analytics project. The data is collected for research purposes and all salary statistics should be treated as directional benchmarks, not guarantees. See the Bias Report for a full disclosure of dataset limitations.