Selection bias occurs when the process by which data enters a dataset is not random with respect to the population the dataset is meant to represent. In this case, the population of interest is all hiring activity in the labor markets covered by the dataset. The data collection process captures only one narrow slice of that activity: jobs that employers chose to post publicly on LinkedIn during the crawl window. This is not a representative sample of all hiring. It is a sample of a specific hiring behavior (public job advertising on one platform), and that behavior is systematically correlated with employer type, role type, and seniority level in ways that distort every downstream analysis.
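To make the mechanism concrete, here is a minimal Python sketch of how a mean shifts when the chance of being publicly posted differs across segments. The segment names, hire shares, posting rates, and salary figures are hypothetical values chosen only for illustration; they are not estimates derived from this dataset.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    share_of_hires: float  # fraction of ALL hires that fall in this segment
    posting_rate: float    # probability that a hire in this segment is publicly posted
    mean_salary: float     # average salary within the segment


def population_mean(segments):
    """Mean salary over all hires, posted or not."""
    return sum(s.share_of_hires * s.mean_salary for s in segments)


def observed_mean(segments):
    """Expected mean salary among publicly posted hires only."""
    # Each segment's weight in the observed sample is its share of hires
    # multiplied by its probability of being posted.
    weights = [s.share_of_hires * s.posting_rate for s in segments]
    total = sum(weights)
    return sum(w * s.mean_salary for w, s in zip(weights, segments)) / total


# Hypothetical illustration values, not taken from the dataset.
segments = [
    Segment("large_company", share_of_hires=0.30, posting_rate=0.80, mean_salary=50_000.0),
    Segment("small_company", share_of_hires=0.70, posting_rate=0.20, mean_salary=35_000.0),
]

print(round(population_mean(segments)))  # 39500
print(round(observed_mean(segments)))    # 44474
```

When every segment has the same posting rate, the two means coincide. The gap appears only when the probability of being posted is correlated with the quantity being measured, which is the situation described above.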
What Is Excluded
The LinkedIn dataset is silent on a large share of actual labor market activity. Excluded hiring channels include:
Employee Referrals
Referrals are the single largest hiring channel in many technology companies. Roles filled through internal referral networks are frequently never posted publicly: they move from “open requisition” directly to “candidate referred” without a job ad. The LinkedIn dataset contains zero evidence of this hiring volume.
Internal Promotions and Transfers
When a company promotes a software engineer to senior engineer, or transfers a product manager between divisions, no job posting is created. Internal mobility, a major driver of career progression, is entirely absent from this dataset.
Staffing Agency Placements
Many companies fill contract, temporary, and even permanent roles through staffing agencies that maintain their own candidate pools. These roles may never appear on LinkedIn; the agency matches candidates from its database without posting publicly.
Direct-Apply Company Career Pages
Large employers (especially in regulated industries like healthcare, finance, and government) maintain proprietary applicant tracking systems and career portals. They may post on LinkedIn, or they may not. Roles posted only on a company’s own careers page are excluded from this dataset.
Informal Networks and Executive Search
Senior leadership roles, board positions, and niche specialist roles are frequently filled through executive search firms (headhunters) working entirely outside public job boards. These represent some of the highest-salary, highest-impact hires and are entirely absent from the dataset.
Company Profile Bias
The types of employers who actively post on LinkedIn are not a cross-section of all employers:
- Large technology companies are over-represented: they have dedicated talent acquisition teams, employer branding budgets, and LinkedIn Recruiter subscriptions
- SMEs and micro-businesses are under-represented: smaller companies often rely on word-of-mouth, local networks, and direct referrals rather than platform-based sourcing
- Startups in early stages frequently hire through founder networks and angel investor communities before they have LinkedIn Recruiter accounts
- Public sector and NGO employers in Spain and the EU use official government employment portals (SEPE, EU Jobs) and are less consistently represented on LinkedIn
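One way to check this skew in practice is to compare each employer category's share of postings against its share in an external reference population. The sketch below assumes such reference shares are available (for example, from a business register); the category labels and all numbers are hypothetical placeholders, not values from this dataset.

```python
from collections import Counter


def representation_ratios(sample_labels, reference_shares):
    """Return sample share divided by reference share for each category.

    A ratio above 1 means the category is over-represented in the sample,
    below 1 means it is under-represented.
    """
    counts = Counter(sample_labels)
    total = len(sample_labels)
    ratios = {}
    for category, ref_share in reference_shares.items():
        if ref_share <= 0:
            raise ValueError(f"reference share for {category!r} must be positive")
        sample_share = counts[category] / total  # Counter returns 0 for absent categories
        ratios[category] = sample_share / ref_share
    return ratios


# Hypothetical posting labels and reference shares, for illustration only.
postings = ["large"] * 60 + ["medium"] * 30 + ["small"] * 10
reference = {"large": 0.2, "medium": 0.3, "small": 0.5}

for category, ratio in representation_ratios(postings, reference).items():
    print(category, round(ratio, 2))
# large 3.0
# medium 1.0
# small 0.2
```

A ratio near zero signals a category whose findings should not be generalized from the posting data alone.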
Impact on Analysis
| Dimension | Estimated Impact |
|---|---|
| Share of market activity captured | ~50% or less of actual hiring |
| Company size representation | Skewed toward large-cap, LinkedIn-active employers |
| Role type representation | Skewed toward publicly advertised, individual contributor roles |
| Seniority representation | Mid-level roles over-represented; C-suite and entry-level under-represented |
| Skill demand signal | Reflects public job description language, not actual day-to-day role requirements |
Mitigation
The most honest mitigation for selection bias is scope discipline: every conclusion drawn from this dataset should be explicitly scoped to “publicly posted LinkedIn roles” rather than “the labor market” in general.
- Frame all findings as applying to “publicly posted LinkedIn roles” in the stated time window
- Do not extrapolate salary or skill demand findings to sectors or company types that are structurally under-represented on LinkedIn
- Complement LinkedIn data with salary surveys (e.g., Hays Spain Salary Guide, Glassdoor), ATS aggregate data, or government labor statistics (INE, Eurostat) when making market-wide claims
- Weight or stratify analyses by company size when available to partially correct for large-company over-representation
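The weighting idea in the last item can be sketched as simple post-stratification: each posting is weighted by the ratio of its stratum's reference share to its share in the sample. This is a minimal illustration, assuming company-size labels exist for each posting and that reference shares come from an outside source; the strata, salaries, and shares below are hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_strata, reference_shares):
    """Weight per stratum = reference share / sample share."""
    counts = Counter(sample_strata)
    n = len(sample_strata)
    weights = {}
    for stratum, ref_share in reference_shares.items():
        if counts[stratum] == 0:
            # A stratum with no postings cannot be recovered by reweighting.
            raise ValueError(f"no sample rows in stratum {stratum!r}")
        weights[stratum] = ref_share / (counts[stratum] / n)
    return weights


def weighted_mean(values, strata, weights):
    """Mean of values, with each row weighted by its stratum's weight."""
    numerator = sum(weights[s] * v for v, s in zip(values, strata))
    denominator = sum(weights[s] for s in strata)
    return numerator / denominator


# Hypothetical postings: three from large companies, one from a small company.
strata = ["large", "large", "large", "small"]
salaries = [60_000.0, 60_000.0, 60_000.0, 30_000.0]
reference = {"large": 0.25, "small": 0.75}  # assumed external shares

w = poststratification_weights(strata, reference)
print(round(sum(salaries) / len(salaries)))          # 52500 (unweighted)
print(round(weighted_mean(salaries, strata, w)))     # 37500 (reweighted)
```

The correction is only partial. It rebalances the mix of company sizes, but within each stratum the postings are still the publicly advertised ones, and strata with no postings at all cannot be reweighted.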
The survivorship bias page covers a related but distinct issue: even within the universe of publicly posted LinkedIn jobs, only postings that were active during the crawl window are captured. Jobs posted and filled before the crawl, or pulled by the employer before crawl time, are also absent. See Survivorship Bias for details.