
Overview

The scrapeJobs() function is the main entry point for JobSpy JS. It scrapes one or more job boards in parallel and returns a unified result set with job postings from all requested sites.
import { scrapeJobs } from "jobspy-js";

const result = await scrapeJobs({
  site_name: ["indeed", "linkedin"],
  search_term: "software engineer",
  location: "San Francisco, CA",
  results_wanted: 20,
});

console.log(`Found ${result.jobs.length} jobs`);

Function Signature

interface ScrapeJobsResult {
  jobs: FlatJobRecord[];
  totalScraped: number;
  newCount: number;
  profile?: {
    name: string;
    lastRunAt: string | null;
    stateFile: string;
  };
}

async function scrapeJobs(
  params?: ScrapeJobsParams
): Promise<ScrapeJobsResult>

Core Parameters

site_name
string | string[] | Site | Site[]
Default: all sites
Job boards to scrape. Accepts site keys as strings or Site enum values.
Supported sites: linkedin, indeed, glassdoor, google, google_careers, zip_recruiter, bayt, naukri, bdjobs
Site names are normalized: "ziprecruiter", "zip_recruiter", and "zip-recruiter" all work.
// Single site
site_name: "linkedin"

// Multiple sites
site_name: ["indeed", "linkedin", "glassdoor"]

// Using enum
import { Site } from "jobspy-js";
site_name: [Site.INDEED, Site.LINKEDIN]
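
Normalization is not exposed by the library, but a minimal sketch of the behavior described above (case-insensitive matching with separators stripped) might look like this; `normalizeSiteKey` is a hypothetical helper, not a jobspy-js export:

```typescript
// Hypothetical helper illustrating site-key normalization: lowercase
// the input and strip "-", "_", and whitespace, so "ziprecruiter",
// "zip_recruiter", and "zip-recruiter" all resolve to the same key.
function normalizeSiteKey(name: string): string {
  return name.toLowerCase().replace(/[-_\s]/g, "");
}

normalizeSiteKey("zip-recruiter"); // "ziprecruiter"
```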
search_term
string
Job title or search query.
search_term: "react developer"
search_term: "senior python engineer"
search_term: "data scientist machine learning"
location
string
Job location. Can be a city, state, country, or “Remote”.
location: "San Francisco, CA"
location: "New York"
location: "London"
location: "Remote"
results_wanted
number
Default: 15
Maximum number of results per site. Total results may be up to results_wanted × number_of_sites.
results_wanted: 50  // Get up to 50 jobs from each site

Search Filters

is_remote
boolean
Default: false
Filter for remote jobs only.
is_remote: true
job_type
string
Filter by employment type.
Valid values: fulltime, parttime, contract, internship, temporary
job_type: "fulltime"
Not all job types are supported on every site. LinkedIn supports fulltime, parttime, internship, contract, and temporary. Indeed supports fulltime, parttime, contract, and internship.
distance
number
Default: 50
Search radius in miles from the specified location.
distance: 25  // Search within 25 miles
easy_apply
boolean
Filter for easy-apply jobs (supported on LinkedIn, Indeed, Glassdoor).
easy_apply: true
hours_old
number
Only return jobs posted within the last N hours.
hours_old: 24  // Jobs posted in last 24 hours
hours_old: 168 // Jobs posted in last week

Description & Format

description_format
string
Default: markdown
Format for job descriptions.
Options: markdown, html, plain
description_format: "markdown"  // Convert HTML to Markdown
description_format: "html"      // Keep original HTML
description_format: "plain"     // Strip all markup
linkedin_fetch_description
boolean
Default: false
Fetch full job descriptions from LinkedIn. Requires an extra HTTP request per job (slower).
linkedin_fetch_description: true
Enabling this option significantly increases scraping time. Use only when you need complete LinkedIn job descriptions.
indeed_fetch_description
boolean
Default: false
Fetch full descriptions by visiting Indeed job pages or direct links.
indeed_fetch_description: true

Site-Specific Options

google_search_term
string
Override search_term for the Google scraper only. Useful for customizing Google’s broader search syntax.
search_term: "software engineer",
google_search_term: "software engineer jobs near San Francisco CA"
linkedin_company_ids
number[]
Filter LinkedIn results to specific company IDs.
// Only jobs from Google (1441) and Microsoft (1035)
linkedin_company_ids: [1441, 1035]
Find LinkedIn company IDs by visiting a company page on LinkedIn and extracting the ID from the URL: https://www.linkedin.com/company/1441/ → ID is 1441

Salary & Compensation

enforce_annual_salary
boolean
Default: false
Convert all salary figures to annual equivalents.
  • Hourly rates are multiplied by 2,080 (40 hours/week × 52 weeks)
  • Monthly salaries are multiplied by 12
  • Weekly salaries are multiplied by 52
  • Daily salaries are multiplied by 260
enforce_annual_salary: true
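
The conversion rules above amount to a fixed multiplier per pay interval. This sketch is illustrative arithmetic, not the library's internal code:

```typescript
// Annual-salary conversion factors as documented above (illustrative).
type PayInterval = "yearly" | "monthly" | "weekly" | "daily" | "hourly";

const ANNUAL_FACTOR: Record<PayInterval, number> = {
  yearly: 1,
  monthly: 12,   // 12 months/year
  weekly: 52,    // 52 weeks/year
  daily: 260,    // 5 workdays/week × 52 weeks
  hourly: 2080,  // 40 hours/week × 52 weeks
};

function toAnnual(amount: number, interval: PayInterval): number {
  return amount * ANNUAL_FACTOR[interval];
}

toAnnual(50, "hourly"); // 104000
```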

Pagination

offset
number
Default: 0
Skip the first N results (pagination offset).
// First page
const page1 = await scrapeJobs({
  site_name: "indeed",
  search_term: "nurse",
  results_wanted: 20,
  offset: 0,
});

// Second page
const page2 = await scrapeJobs({
  site_name: "indeed",
  search_term: "nurse",
  results_wanted: 20,
  offset: 20,
});
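
A generic loop over offset can collect several pages in one call. Here `fetchPage` stands in for a scrapeJobs call at a given offset, and stopping on a short page is an assumption, since the docs do not specify how the end of results is signaled:

```typescript
// Generic offset-based pagination sketch. fetchPage is a stand-in for
// a call like scrapeJobs({ ..., offset, results_wanted: pageSize }).
async function fetchAllPages<T>(
  fetchPage: (offset: number) => Promise<T[]>,
  pageSize: number,
  maxPages = 5
): Promise<T[]> {
  const all: T[] = [];
  for (let page = 0; page < maxPages; page++) {
    const batch = await fetchPage(page * pageSize);
    all.push(...batch);
    if (batch.length < pageSize) break; // short page: assume no more results
  }
  return all;
}
```

The maxPages cap keeps a misbehaving source from turning this into an unbounded crawl.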

Deduplication & Profiles

profile
string
Named profile for deduplication tracking. When specified, JobSpy tracks which jobs you’ve already seen and filters them out on subsequent runs.
profile: "frontend-jobs"
See the Profiles & Deduplication guide for details on how state tracking works.
state_file
string
Path to state file for deduplication. Defaults to jobspy.json in the current directory.
state_file: "/path/to/my-state.json"
skip_dedup
boolean
Default: false
Skip deduplication filtering (state is still updated).
skip_dedup: true  // Return all jobs, but still update state

Output & Logging

verbose
number
Default: 0
Logging verbosity level.
  • 0 — Errors only
  • 1 — Warnings and errors
  • 2 — All logs (info, warnings, errors)
verbose: 2  // Enable debug logging

Return Value

The function returns a ScrapeJobsResult object:
interface ScrapeJobsResult {
  jobs: FlatJobRecord[];
  totalScraped: number;
  newCount: number;
  profile?: {
    name: string;
    lastRunAt: string | null;
    stateFile: string;
  };
}
Fields:
  • jobs — Array of job postings (see Job Fields below)
  • totalScraped — Total number of jobs scraped before deduplication
  • newCount — Number of new jobs after deduplication (same as jobs.length when using profiles)
  • profile — Profile metadata (only present when using profile parameter)

Job Fields

Each job in the jobs array is a FlatJobRecord with these fields:
  • id (string): Unique job ID with site prefix (e.g. "li-123", "in-abc")
  • site (string): Source site key (e.g. "linkedin", "indeed")
  • title (string): Job title
  • company (string): Company name
  • location (string): Formatted as "City, State, Country"
  • job_url (string): Canonical job URL on the board
  • job_url_direct (string): Direct employer/ATS URL (if available)
  • date_posted (string): ISO date "YYYY-MM-DD"
  • job_type (string): Comma-separated (e.g. "fulltime, contract")
  • is_remote (boolean): Whether the job is remote
  • description (string): Full job description (formatted per description_format)
  • min_amount (number): Minimum salary/pay amount
  • max_amount (number): Maximum salary/pay amount
  • interval (string): Pay interval: "yearly", "hourly", etc.
  • currency (string): Currency code (e.g. "USD", "EUR")
  • salary_source (string): "direct_data" or "description"
  • emails (string): Comma-separated emails extracted from the description
  • job_level (string): Seniority level (LinkedIn only)
  • job_function (string): Job function category (LinkedIn only)
  • company_industry (string): Industry classification
  • company_url (string): Company page on the job board
  • company_url_direct (string): Company's own website URL
  • company_logo (string): Company logo URL
  • listing_type (string): E.g. "sponsored"
See the type definitions for the complete list of available fields.

Examples

Basic Usage

import { scrapeJobs } from "jobspy-js";

const { jobs } = await scrapeJobs({
  site_name: "indeed",
  search_term: "software engineer",
  location: "New York, NY",
});

console.log(`Found ${jobs.length} jobs`);
for (const job of jobs) {
  console.log(`${job.title} at ${job.company}`);
}

Multiple Sites

const { jobs } = await scrapeJobs({
  site_name: ["linkedin", "indeed", "glassdoor"],
  search_term: "data scientist",
  location: "San Francisco, CA",
  results_wanted: 25,
});

// Group results by site
const bySite = jobs.reduce((acc, job) => {
  if (!acc[job.site]) acc[job.site] = [];
  acc[job.site].push(job);
  return acc;
}, {} as Record<string, typeof jobs>);

for (const [site, siteJobs] of Object.entries(bySite)) {
  console.log(`${site}: ${siteJobs.length} jobs`);
}

Remote Jobs with Salary Filter

const { jobs } = await scrapeJobs({
  site_name: ["linkedin", "indeed"],
  search_term: "frontend developer",
  is_remote: true,
  enforce_annual_salary: true,
});

const wellPaid = jobs.filter(
  (j) => j.min_amount && j.min_amount >= 100000
);
console.log(`${wellPaid.length} remote jobs paying $100k+`);

Recent Jobs Only

// Jobs posted in the last 24 hours
const { jobs } = await scrapeJobs({
  site_name: ["linkedin", "indeed"],
  search_term: "machine learning engineer",
  hours_old: 24,
  description_format: "plain",
});

LinkedIn Company Filter

// Only jobs from Google and Microsoft on LinkedIn
const { jobs } = await scrapeJobs({
  site_name: "linkedin",
  search_term: "product manager",
  linkedin_company_ids: [1441, 1035],
  linkedin_fetch_description: true,
});

With Profile Deduplication

// First run - returns all jobs
const run1 = await scrapeJobs({
  site_name: ["indeed", "linkedin"],
  search_term: "react developer",
  location: "Austin, TX",
  profile: "frontend-jobs",
});
console.log(`First run: ${run1.jobs.length} jobs`);

// Second run (hours later) - returns only new jobs
const run2 = await scrapeJobs({
  site_name: ["indeed", "linkedin"],
  search_term: "react developer",
  location: "Austin, TX",
  profile: "frontend-jobs",
});
console.log(`Second run: ${run2.jobs.length} new jobs`);
console.log(`Total scraped: ${run2.totalScraped}`);
console.log(`Already seen: ${run2.totalScraped - run2.newCount}`);

Behavior

Parallel Scraping

All sites are scraped concurrently using Promise.allSettled(). If one site fails, the others still return results. Failed scrapers are silently skipped.
const { jobs } = await scrapeJobs({
  site_name: ["linkedin", "indeed", "glassdoor"],
  search_term: "developer",
});

// You'll get results from whichever sites succeeded
if (jobs.length === 0) {
  console.log("No jobs found from any site");
}
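
The allSettled pattern described above can be sketched as follows; `scrapeAll` and its task signature are illustrative, not jobspy-js internals:

```typescript
// Run each scraper task concurrently; keep results from fulfilled
// tasks and drop rejected ones, mirroring the behavior described above.
async function scrapeAll<T>(tasks: Array<() => Promise<T[]>>): Promise<T[]> {
  const settled = await Promise.allSettled(tasks.map((task) => task()));
  return settled
    .filter((s): s is PromiseFulfilledResult<T[]> => s.status === "fulfilled")
    .flatMap((s) => s.value);
}
```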

Result Sorting

Results are sorted by:
  1. Site name (alphabetical)
  2. Date posted (newest first, within each site)
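
The documented ordering can be expressed as a comparator; field names follow FlatJobRecord, but the comparator itself is illustrative:

```typescript
interface SortableJob {
  site: string;
  date_posted: string; // ISO "YYYY-MM-DD", so string comparison orders by date
}

// Sort by site name A to Z, then by date_posted newest-first per site.
function sortJobs<T extends SortableJob>(jobs: T[]): T[] {
  return [...jobs].sort(
    (a, b) =>
      a.site.localeCompare(b.site) ||
      b.date_posted.localeCompare(a.date_posted)
  );
}
```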

Error Handling

try {
  const { jobs } = await scrapeJobs({
    site_name: ["linkedin", "indeed"],
    search_term: "developer",
  });
  
  if (jobs.length === 0) {
    console.log("No jobs found - try broadening your search");
  }
} catch (err) {
  // Only throws if all scrapers fail or params are invalid
  console.error("Scrape failed:", err);
}
Set verbose: 2 to see detailed logs for debugging:
const { jobs } = await scrapeJobs({
  search_term: "developer",
  verbose: 2,
});
