The Sesgos (Bias) tab makes dataset limitations transparent and directly actionable for the consultants and analysts who rely on TinderJob outputs. Every dataset carries assumptions, gaps, and structural choices made during collection, and when those gaps are invisible, they produce misleading conclusions. Before DataTalent uses any figure from TinderJob to design a curriculum, advise a candidate, or present to management, the team must understand exactly where the data falls short and how that affects interpretation. This tab surfaces three specific, quantified biases and closes with six concrete strategic recommendations.
Bias Summary Table
The following table reproduces the executive summary rendered at the top of the Sesgos tab:

| Bias | Dataset | Type | % Affected | Impact |
|---|---|---|---|---|
| MNAR — Missing Salaries | Tecnoempleo | MNAR | 80.7% null | No salary analysis possible from Tecnoempleo |
| Search Term Bias | Tecnoempleo | Selection | 24 fixed terms | Profiles outside these 24 are excluded |
| Geographic Underrepresentation | DS Salaries | Geographic | 2.3% (14/607) | Spanish stats from this dataset unreliable |
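The "% null" figure in the first row is a plain proportion of missing values in the salary column. A minimal sketch of that calculation, assuming missing salaries are represented as `None`; the function name `null_share_pct` and the toy column are illustrative and are not taken from the TinderJob code:

```python
def null_share_pct(values):
    """Return the percentage of entries that are None, rounded to one decimal."""
    if not values:
        raise ValueError("cannot compute a null share for an empty column")
    missing = sum(1 for value in values if value is None)
    return round(100 * missing / len(values), 1)


# Toy column (hypothetical values): four of five salaries are missing.
toy_salaries = [None, None, 32000, None, None]
print(null_share_pct(toy_salaries))  # 80.0
```

A high null share alone does not establish that the mechanism is MNAR; it only measures how much of the column is unusable.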
Chart 1 — Selection Bias: Offers by Search Term
A horizontal bar chart showing the number of offers per `busqueda` search term, sorted ascending. This chart is deliberately identical in structure to the demand chart in the Mercado España tab, but the framing here is critical.
The dataset contains only 24 predefined search terms. These were chosen at scraper design time and represent the developer’s prior assumptions about which tech profiles matter. Any role with a title that doesn’t map to one of these 24 terms — product manager, QA engineer, site reliability engineer, blockchain developer, and hundreds of others — is structurally absent from the dataset, not because those roles don’t exist in Spain, but because they were never scraped.
This is a selection bias by design, not a data quality failure. The remedy is straightforward: expand the scraper’s search term list before the next data collection run.
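The sketch below illustrates both ideas under stated assumptions: counting offers per `busqueda` value in ascending order (the shape of the chart) and listing role titles that no search term would match. The helper names, the three example terms, and the case-insensitive substring matching rule are all hypothetical; the real list has 24 terms and the scraper may match differently.

```python
from collections import Counter


def offers_per_term(offers):
    """Count offers per `busqueda` value, sorted ascending by count."""
    counts = Counter(offer["busqueda"] for offer in offers)
    return sorted(counts.items(), key=lambda item: item[1])


def uncovered_roles(role_titles, search_terms):
    """Return titles that contain none of the search terms (case-insensitive)."""
    terms = [term.lower() for term in search_terms]
    return [
        title
        for title in role_titles
        if not any(term in title.lower() for term in terms)
    ]


# Hypothetical stand-ins for the 24 predefined terms and the scraped offers.
SEARCH_TERMS = ["data engineer", "data scientist", "python"]
offers = [
    {"busqueda": "python"},
    {"busqueda": "python"},
    {"busqueda": "data engineer"},
]

print(offers_per_term(offers))
# [('data engineer', 1), ('python', 2)]

print(uncovered_roles(
    ["Senior Data Engineer", "QA Engineer", "Product Manager"],
    SEARCH_TERMS,
))
# ['QA Engineer', 'Product Manager']
```

Running a check like `uncovered_roles` against an external list of role titles is one way to decide which terms to add before the next collection run.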
Chart 2 — Geographic Underrepresentation: DS Salaries
A bar chart showing the top 15 countries by record count in the DS Salaries dataset, with Spain (country code `ES`) highlighted in Tinder red and all other countries in pink.
| Country | Records | Share |
|---|---|---|
| US | 355 | 58.5% |
| Spain (ES) | 14 | 2.3% |
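A rough sketch of how the rows behind such a chart could be derived: count records per country code, keep the top `n`, compute each share, and flag the highlighted country. The function name, the hex colors, and the `"XX"` code are placeholders; `"XX"` lumps together the 238 records (607 − 355 − 14) whose per-country breakdown is not given here.

```python
from collections import Counter

# Placeholder colors; the actual hex values used by the tab are not documented here.
HIGHLIGHT_COLOR = "#FF4458"  # stand-in for "Tinder red"
DEFAULT_COLOR = "#FFC0CB"    # stand-in for pink


def top_countries(country_codes, n=15, highlight="ES"):
    """Return the n most frequent country codes with share and bar color."""
    counts = Counter(country_codes)
    total = sum(counts.values())
    return [
        {
            "country": code,
            "records": count,
            "share_pct": round(100 * count / total, 1),
            "color": HIGHLIGHT_COLOR if code == highlight else DEFAULT_COLOR,
        }
        for code, count in counts.most_common(n)
    ]


# Only the US and ES counts come from the table above; "XX" is a placeholder
# for every other country combined.
codes = ["US"] * 355 + ["ES"] * 14 + ["XX"] * 238
rows = top_countries(codes)
spain = next(row for row in rows if row["country"] == "ES")
print(spain["records"], spain["share_pct"])  # 14 2.3
```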
Strategic Recommendations for DataTalent
The following six recommendations are rendered directly in the Sesgos tab as a numbered list:

- Always communicate the median salary (€93,444), never the mean. The right-skewed salary distribution makes the mean (€103,314) unrepresentative of what a typical tech worker earns.
- Do not use Tecnoempleo as a salary source. With 80.7% of salary values null and the MNAR mechanism confirmed, any salary analysis built on Tecnoempleo data would be unreliable and potentially misleading.
- Expand the scraper’s search term list to reduce selection bias. Prioritize high-growth roles currently absent from the 24 scraped terms.
- Complement with Spanish-specific sources (InfoJobs, LinkedIn Spain) for salary benchmarking. These sources have higher Spanish record density and are more representative of local compensation norms.
- Do not train automated selection or recommendation models on these datasets without applying debiasing techniques first. Models trained on biased data reproduce and amplify those biases at scale.
- Communicate uncertainty to management. All figures derived from this dataset are directional and indicative, not precise market measurements. Present them with appropriate confidence intervals or explicit caveats.
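The first and last recommendations can be illustrated with a small sketch: on a right-skewed sample the mean sits above the median, and a percentile bootstrap gives a rough interval to report alongside the median. The salary values, the function name `bootstrap_median_ci`, and its default parameters are illustrative assumptions, not figures or code from TinderJob; the percentile bootstrap is one common choice among several interval methods.

```python
import random
import statistics


def bootstrap_median_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median of `sample`."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    medians = sorted(
        statistics.median(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    )
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = int((1 - alpha / 2) * n_resamples) - 1
    return medians[lower_index], medians[upper_index]


# Toy right-skewed sample (hypothetical values): one very high salary
# pulls the mean upward while the median stays put.
salaries = [40_000, 45_000, 50_000, 55_000, 60_000,
            65_000, 70_000, 90_000, 250_000]

median_salary = statistics.median(salaries)
mean_salary = statistics.mean(salaries)
lower, upper = bootstrap_median_ci(salaries)

print(f"median={median_salary}, mean={mean_salary:.0f}")
print(f"95% bootstrap interval for the median: [{lower}, {upper}]")
```

With very small samples, such as the 14 Spanish records, an interval of this kind will be wide, which is itself a useful signal to pass on to management.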