parseResume — Local PDF Resume Parser API Reference

The resume parser reads a PDF file from the local filesystem and extracts every git provider URL it can find using two methods: rendered text extraction and hyperlink annotation scanning. The methods run independently, so a failure in one does not block the other. The result has the same ResolverResult shape that scrapePortfolio returns, which makes it straightforward to combine portfolio and resume data downstream.

`parseResume`

Reads a local PDF file and resolves any git profile or repo links it contains into a structured ResolverResult.

async function parseResume(filePath: string): Promise<ResolverResult>

Parameters

filePath

string

required

Path to a local PDF file. Both relative paths (./resumes/janedoe.pdf) and absolute paths (/tmp/uploads/janedoe.pdf) are accepted. The path is passed unchanged to readFile from Node.js fs/promises, so relative paths resolve against the current working directory of the process.

parseResume only accepts local file paths. It cannot fetch remote PDFs over HTTP. If you have a PDF URL, download the file first and then pass the local path to parseResume.
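Because parseResume only reads local paths, a remote PDF has to be saved to disk first. The following is a minimal sketch of that step, assuming Node.js 18 or later (for the global fetch). The helper name downloadPdfToTemp and its injectable fetchImpl parameter are illustrative and not part of the library.

```typescript
import { mkdtemp, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper (not part of the library): saves a remote PDF to a
// temporary file and returns the local path.
async function downloadPdfToTemp(
  url: string,
  fetchImpl: typeof fetch = fetch, // injectable so the helper can be exercised offline
): Promise<string> {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Download failed with HTTP ${response.status}`);
  }
  const bytes = new Uint8Array(await response.arrayBuffer());
  // Each download gets its own fresh temporary directory.
  const dir = await mkdtemp(join(tmpdir(), 'resume-'));
  const localPath = join(dir, 'resume.pdf');
  await writeFile(localPath, bytes);
  return localPath;
}
```

The returned path can then be handed to parseResume. Unlike parseResume, this helper does throw on a failed download, so wrap the call in try/catch if that matters to your pipeline.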

Returns

Promise<ResolverResult>. This function never throws: the promise always resolves, and all errors are surfaced inside the result object.

source

string

The filePath argument, unchanged.

sourceType

'resume_file'

Always 'resume_file' for results from this function.

ownerProfile

ExtractedGitLink | null

The resolved candidate git profile, or null if none could be determined from the resume.

confidence

'high' | 'medium' | 'low' | 'none'

Confidence level from the owner disambiguation step.

ownedRepos

ExtractedGitLink[]

Repos whose owner username matches the resolved candidate username (case-insensitive).

contributions

ExtractedGitLink[]

Explicit PR and issue links found in the resume.

externalRepos

ExtractedGitLink[]

Repos referenced in the resume but owned by a different username.

allLinks

ExtractedGitLink[]

Every ExtractedGitLink parsed from the resume, before categorisation.

warnings

string[]

Diagnostic messages. Failures of either extraction method appear here (e.g. "Text extraction failed: ..." or "Annotation extraction failed: ..."), along with counts and disambiguator messages.

error

string | undefined

Set only when fs.readFile itself fails (file not found, permission denied, etc.). Individual extraction method failures are in warnings, not error.
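For orientation, the fields above can be restated as a local TypeScript sketch. This is not the library's own type definition (see the Types page for that). The ExtractedGitLink shape here is an assumption limited to the two properties used in this page's example (url and repo), and summarize is a hypothetical helper.

```typescript
// Assumed minimal shape: only `url` and `repo` appear in this page's example;
// the real ExtractedGitLink type likely has more fields.
interface ExtractedGitLink {
  url: string;
  repo?: string;
}

type Confidence = 'high' | 'medium' | 'low' | 'none';

// Local restatement of the documented fields, narrowed to resume results.
interface ResumeResolverResult {
  source: string;
  sourceType: 'resume_file';
  ownerProfile: ExtractedGitLink | null;
  confidence: Confidence;
  ownedRepos: ExtractedGitLink[];
  contributions: ExtractedGitLink[];
  externalRepos: ExtractedGitLink[];
  allLinks: ExtractedGitLink[];
  warnings: string[];
  error?: string;
}

// Hypothetical helper: one-line summary suitable for logs.
function summarize(result: ResumeResolverResult): string {
  if (result.error !== undefined) {
    return `${result.source}: unreadable (${result.error})`;
  }
  const owner = result.ownerProfile?.url ?? 'no owner resolved';
  return `${result.source}: ${owner}, confidence ${result.confidence}, ` +
    `${result.allLinks.length} link(s), ${result.warnings.length} warning(s)`;
}
```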

How it works

Read the file

Calls fs.readFile(filePath) to load the entire PDF into a Buffer. If this fails (file not found, permission error), result.error is set and the function returns immediately with empty arrays.
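The read step can be sketched as follows. This is an illustration of the "capture the error instead of throwing" pattern described above, not the library's source; the readPdf name and the ReadOutcome type are hypothetical.

```typescript
import { readFile } from 'node:fs/promises';

// Hypothetical outcome type: either the file bytes or an error message.
type ReadOutcome =
  | { ok: true; buffer: Buffer }
  | { ok: false; error: string };

// Reads the whole file into memory and converts any failure
// (missing file, permission denied, ...) into a value.
async function readPdf(filePath: string): Promise<ReadOutcome> {
  try {
    return { ok: true, buffer: await readFile(filePath) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}
```

A caller following the documented behaviour would copy the error message into result.error and return early with empty arrays.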

Method 1 — Text layer extraction

Dynamically imports unpdf and calls getDocumentProxy + extractText with mergePages: true. The resulting text string is passed to extractGitUrlsFromText, which regex-scans for all git provider URLs. Failures are caught and appended to result.warnings without affecting Method 2.
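The regex scan is not shown on this page, so the following is only a sketch of what a function like extractGitUrlsFromText might do with the merged text. The host list, the pattern, and the scanTextForGitUrls name are all assumptions.

```typescript
// Assumed host list; the library's actual GIT_HOSTS may differ.
const TEXT_SCAN_HOSTS = ['github.com', 'gitlab.com', 'bitbucket.org'];

// Hypothetical stand-in for extractGitUrlsFromText.
function scanTextForGitUrls(text: string): string[] {
  // Escape the dots so "github.com" does not match "githubXcom".
  const hosts = TEXT_SCAN_HOSTS.map((h) => h.replace(/\./g, '\\.')).join('|');
  const pattern = new RegExp(`https?://(?:www\\.)?(?:${hosts})/[^\\s<>()"']+`, 'gi');
  const matches = text.match(pattern) ?? [];
  // Drop sentence punctuation that often trails a URL in prose.
  return matches.map((url) => url.replace(/[.,;:!?]+$/, ''));
}
```

Text extracted from a PDF can break long URLs across lines, which a scan like this would truncate. That is one reason the annotation method below is useful as a second source.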

Method 2 — Hyperlink annotation extraction

Creates a second, independent copy of the PDF buffer (to avoid shared-state issues) and iterates over every page via pdf.getPage(i) + page.getAnnotations(). Any annotation with subtype 'Link' and a string url is checked against GIT_HOSTS, and matching URLs are collected. Failures are caught and appended to result.warnings.
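The filtering part of this step can be sketched as a pure function over annotation objects. The annotation shape is reduced to the two fields named above, and the host set and the gitUrlsFromAnnotations name are assumptions rather than the library's own code.

```typescript
// Minimal annotation shape for this sketch; real PDF annotation objects
// carry many more fields.
interface PdfAnnotationLike {
  subtype?: string;
  url?: unknown;
}

// Assumed host set; the library's actual GIT_HOSTS may differ.
const ANNOTATION_GIT_HOSTS = new Set(['github.com', 'gitlab.com', 'bitbucket.org']);

function gitUrlsFromAnnotations(annotations: PdfAnnotationLike[]): string[] {
  const urls: string[] = [];
  for (const annotation of annotations) {
    if (annotation.subtype !== 'Link' || typeof annotation.url !== 'string') continue;
    let host: string;
    try {
      host = new URL(annotation.url).hostname.toLowerCase().replace(/^www\./, '');
    } catch {
      continue; // not an absolute URL, so it cannot be a git provider link
    }
    if (ANNOTATION_GIT_HOSTS.has(host)) urls.push(annotation.url);
  }
  return urls;
}
```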

Deduplicate

Both URL lists are merged, trailing slashes are stripped, and the combined list is deduplicated with a Set. This handles the common case where a git URL appears as both visible text and a clickable hyperlink.
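A minimal sketch of the merge described above, with mergeAndDedupe as a hypothetical name:

```typescript
// Merge both URL lists, strip trailing slashes, and keep the first
// occurrence of each distinct URL.
function mergeAndDedupe(textUrls: string[], annotationUrls: string[]): string[] {
  const normalized = [...textUrls, ...annotationUrls].map((url) => url.replace(/\/+$/, ''));
  return Array.from(new Set(normalized));
}
```

Note that a Set compares strings exactly, so in this sketch two URLs that differ only in letter case would both survive.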

Parse and classify

Each unique URL is passed to parseGitLink. Results that are not null are added to result.allLinks.
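The behaviour of parseGitLink is not specified on this page. To illustrate the "parse, then drop nulls" flow, here is a sketch that classifies GitHub-style paths only. The classifyGitUrl name, the SketchLink shape, and the path rules are all assumptions.

```typescript
type LinkKind = 'profile' | 'repo' | 'pr' | 'issue';

// Hypothetical link shape for this sketch, not the library's ExtractedGitLink.
interface SketchLink {
  url: string;
  kind: LinkKind;
  owner: string;
  repo?: string;
}

// Hypothetical stand-in for parseGitLink: returns null for anything it
// does not recognise.
function classifyGitUrl(url: string): SketchLink | null {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return null;
  }
  const [owner, repo, section, id] = parsed.pathname.split('/').filter(Boolean);
  if (!owner) return null;
  if (!repo) return { url, kind: 'profile', owner };
  if (!section) return { url, kind: 'repo', owner, repo };
  if (section === 'pull' && id) return { url, kind: 'pr', owner, repo };
  if (section === 'issues' && id) return { url, kind: 'issue', owner, repo };
  return null;
}

// Mirrors the step above: keep only the results that are not null.
function parseAll(urls: string[]): SketchLink[] {
  return urls
    .map(classifyGitUrl)
    .filter((link): link is SketchLink => link !== null);
}
```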

Resolve owner and categorise

resolveOwnerAndCategorize(result.allLinks, 'resume') runs the full disambiguation logic and populates ownerProfile, confidence, ownedRepos, contributions, and externalRepos.
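The owner disambiguation logic is not described on this page, so the sketch below covers only the categorisation that follows it, taking the resolved username as an input. The categorize name and the CategorizableLink shape are hypothetical; the bucket rules follow the field descriptions under Returns.

```typescript
// Hypothetical link shape for this sketch.
interface CategorizableLink {
  url: string;
  kind: 'profile' | 'repo' | 'pr' | 'issue';
  owner: string;
  repo?: string;
}

interface Buckets {
  ownedRepos: CategorizableLink[];
  contributions: CategorizableLink[];
  externalRepos: CategorizableLink[];
}

function categorize(links: CategorizableLink[], ownerUsername: string): Buckets {
  const owner = ownerUsername.toLowerCase();
  const buckets: Buckets = { ownedRepos: [], contributions: [], externalRepos: [] };
  for (const link of links) {
    if (link.kind === 'pr' || link.kind === 'issue') {
      // Explicit PR and issue links count as contributions.
      buckets.contributions.push(link);
    } else if (link.kind === 'repo') {
      // Case-insensitive owner match decides owned vs external.
      const target = link.owner.toLowerCase() === owner ? buckets.ownedRepos : buckets.externalRepos;
      target.push(link);
    }
  }
  return buckets;
}
```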

Error vs warnings

| Scenario | Where it appears |
| --- | --- |
| File not found / permission denied | `result.error` |
| `unpdf` text extraction threw | `result.warnings` |
| Annotation iteration threw | `result.warnings` |
| Disambiguator could not find an owner | `result.warnings` (and `confidence: 'none'`) |

Always check result.warnings, even on success. A warning like "Text extraction failed: ..." means only the annotation method ran, so the result may be less complete than expected.
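One way to act on this advice is to pick out the extraction failures from the warnings list. The sketch below relies on the two message prefixes quoted in the warnings field description; treat the exact wording as an assumption that may change between versions. extractionFailures is a hypothetical helper.

```typescript
// Prefixes taken from the example messages in the warnings description.
const FAILURE_PREFIXES = ['Text extraction failed:', 'Annotation extraction failed:'];

// Returns only the warnings that report a failed extraction method.
function extractionFailures(warnings: string[]): string[] {
  return warnings.filter((warning) =>
    FAILURE_PREFIXES.some((prefix) => warning.startsWith(prefix)),
  );
}
```

An empty return value means both methods ran; one entry means the result came from a single method and may be incomplete.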

Examples

import { parseResume } from '@clyrisai/gitresolve';

const result = await parseResume('./resumes/janedoe.pdf');

if (result.error) {
  console.error('Could not read file:', result.error);
} else {
  console.log('Owner profile:', result.ownerProfile?.url);
  console.log('Confidence:',    result.confidence);
  console.log('Owned repos:',   result.ownedRepos.map(r => r.repo));
  console.log('Contributions:', result.contributions.length);
  console.log('Warnings:',      result.warnings);
}
