The resume parser reads a PDF file from the local filesystem and extracts every git provider URL it can find using two independent methods: rendered text extraction and hyperlink annotation scanning. Both methods run independently so that a failure in one does not block the other. The result is the same ResolverResult shape returned by scrapePortfolio, making it straightforward to combine portfolio and resume data downstream.

parseResume

Reads a local PDF file and resolves any git profile or repo links it contains into a structured ResolverResult.
async function parseResume(filePath: string): Promise<ResolverResult>

Parameters

filePath
string
required
Path to a local PDF file. Both relative paths (./resumes/janedoe.pdf) and absolute paths (/tmp/uploads/janedoe.pdf) are accepted. The path is passed directly to Node.js fs/promises.readFile.
parseResume only accepts local file paths. It cannot fetch remote PDFs over HTTP. If you have a PDF URL, download the file first and then pass the local path to parseResume.
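
One way to handle a remote PDF is to save it to a temporary file before calling parseResume. The sketch below is only an illustration: downloadPdfToTemp is a hypothetical helper that is not part of gitresolve, and it assumes the global fetch available in Node.js 18 or later.

```typescript
import { writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { randomUUID } from 'node:crypto';

// Hypothetical helper (not part of gitresolve): downloads a PDF and returns
// the path of the temporary file it was written to.
async function downloadPdfToTemp(
  url: string,
  fetchImpl: typeof fetch = fetch, // injectable so the helper can be tested offline
): Promise<string> {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Download failed with HTTP ${response.status}`);
  }
  const bytes = new Uint8Array(await response.arrayBuffer());
  const localPath = join(tmpdir(), `resume-${randomUUID()}.pdf`);
  await writeFile(localPath, bytes);
  return localPath;
}

// Usage (assumes the import shown in the Examples section):
//   const localPath = await downloadPdfToTemp('https://example.com/janedoe.pdf');
//   const result = await parseResume(localPath);
```

Unlike parseResume, this helper can throw, so a caller would normally wrap it in try/catch and delete the temporary file once parsing is done.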

Returns

Promise<ResolverResult> — this function never throws. All errors are surfaced inside the result object.
source
string
The filePath argument, unchanged.
sourceType
'resume_file'
Always 'resume_file' for results from this function.
ownerProfile
ExtractedGitLink | null
The resolved candidate git profile, or null if none could be determined from the resume.
confidence
'high' | 'medium' | 'low' | 'none'
Confidence level from the owner disambiguation step.
ownedRepos
ExtractedGitLink[]
Repos whose owner username matches the resolved candidate username (case-insensitive).
contributions
ExtractedGitLink[]
Explicit PR and issue links found in the resume.
externalRepos
ExtractedGitLink[]
Repos referenced in the resume but owned by a different username.
allLinks
ExtractedGitLink[]
Every ExtractedGitLink parsed from the resume, before categorisation.
warnings
string[]
Diagnostic messages. Extraction method failures appear here (e.g. "Text extraction failed: ..." or "Annotation extraction failed: ..."). Also includes counts and disambiguator messages.
error
string | undefined
Set only when fs.readFile itself fails (file not found, permission denied, etc.). Individual extraction method failures are in warnings, not error.
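
For orientation, the fields documented on this page can be written out as a TypeScript shape. This is a sketch, not the library's published typings: ResumeResolverResult and hasResolvedOwner are hypothetical names, and ExtractedGitLink is reduced to the two properties (url and repo) that this page's example reads.

```typescript
// Sketch only: anything marked "assumed" is an illustration,
// not the library's published typings.
interface ExtractedGitLink {
  url: string;              // read in this page's example
  repo?: string;            // read in this page's example; assumed optional
  [extra: string]: unknown; // remaining fields are not described on this page
}

// Hypothetical name: ResolverResult narrowed to what parseResume returns.
interface ResumeResolverResult {
  source: string;
  sourceType: 'resume_file';
  ownerProfile: ExtractedGitLink | null;
  confidence: 'high' | 'medium' | 'low' | 'none';
  ownedRepos: ExtractedGitLink[];
  contributions: ExtractedGitLink[];
  externalRepos: ExtractedGitLink[];
  allLinks: ExtractedGitLink[];
  warnings: string[];
  error?: string;
}

// Hypothetical helper: true when the file was read and an owner was resolved.
function hasResolvedOwner(result: ResumeResolverResult): boolean {
  return result.error === undefined && result.ownerProfile !== null;
}
```

Because the function never throws, a check like hasResolvedOwner takes the place of a try/catch around the call.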

How it works

1. Read the file

Calls fs.readFile(filePath) to load the entire PDF into a Buffer. If this fails (file not found, permission error), result.error is set and the function returns immediately with empty arrays.

2. Method 1: Text layer extraction

Dynamically imports unpdf and calls getDocumentProxy + extractText with mergePages: true. The resulting text string is passed to extractGitUrlsFromText, which regex-scans for all git provider URLs. Failures are caught and appended to result.warnings without affecting Method 2.
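
The regex scan over the merged text might look roughly like the following. This is a sketch under assumptions, not the library's extractGitUrlsFromText: scanTextForGitUrls is a hypothetical name, and the host list is illustrative because the real GIT_HOSTS constant is not shown on this page.

```typescript
// Illustrative host list; the contents of the real GIT_HOSTS are an assumption.
const ASSUMED_GIT_HOSTS = ['github.com', 'gitlab.com', 'bitbucket.org'];

// Escapes characters that have special meaning inside a regular expression.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical stand-in for extractGitUrlsFromText.
function scanTextForGitUrls(
  text: string,
  hosts: string[] = ASSUMED_GIT_HOSTS,
): string[] {
  const hostPattern = hosts.map(escapeRegExp).join('|');
  // Optional scheme and "www.", then a known host, then a non-empty path.
  const pattern = new RegExp(
    `\\b(?:https?:\\/\\/)?(?:www\\.)?(?:${hostPattern})\\/[^\\s<>()\\[\\]"']+`,
    'gi',
  );
  const matches = text.match(pattern) ?? [];
  // Drop punctuation that commonly trails a URL in running text.
  return matches.map((match) => match.replace(/[.,;:]+$/, ''));
}
```

Matching with an optional scheme lets the scan pick up bare mentions such as gitlab.com/janedoe, which are common in resumes. Whether the library does the same is not stated here.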

3. Method 2: Hyperlink annotation extraction

Creates a second independent copy of the PDF buffer (to avoid shared-state issues) and iterates every page via pdf.getPage(i) + page.getAnnotations(). Any annotation of subtype: 'Link' with a url string is checked against GIT_HOSTS. Matching annotation URLs are collected. Failures are caught and appended to result.warnings.
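
The page loop can be sketched against a minimal structural interface, so that the example does not depend on a particular PDF library. PdfLike, PageLike, AnnotationLike, collectGitLinkAnnotations, and the default host list are all hypothetical. Only numPages, getPage, getAnnotations, subtype, and url mirror the PDF.js-style calls named above.

```typescript
// Minimal structural types covering only the calls this step relies on.
interface AnnotationLike { subtype?: string; url?: unknown }
interface PageLike { getAnnotations(): Promise<AnnotationLike[]> }
interface PdfLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>;
}

// True when the URL parses and its hostname (minus "www.") is in the list.
function isGitHost(rawUrl: string, hosts: string[]): boolean {
  try {
    const hostname = new URL(rawUrl).hostname.toLowerCase().replace(/^www\./, '');
    return hosts.includes(hostname);
  } catch {
    return false; // not an absolute URL
  }
}

// Hypothetical sketch of the annotation scan; the default hosts are assumed.
async function collectGitLinkAnnotations(
  pdf: PdfLike,
  hosts: string[] = ['github.com', 'gitlab.com', 'bitbucket.org'],
): Promise<string[]> {
  const urls: string[] = [];
  // PDF.js-style page numbers start at 1, not 0.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    for (const annotation of await page.getAnnotations()) {
      if (annotation.subtype !== 'Link') continue;
      if (typeof annotation.url !== 'string') continue;
      if (isGitHost(annotation.url, hosts)) urls.push(annotation.url);
    }
  }
  return urls;
}
```

Scanning annotations catches links whose visible text is only a label such as "GitHub", which the text-layer scan in Method 1 cannot see.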

4. Deduplicate

Both URL sets are merged, trailing slashes are stripped, and the combined list is deduplicated with Set. This handles the common case where a git URL appears as both visible text and a clickable hyperlink.
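
A minimal sketch of this merge step, assuming both methods return plain string arrays (mergeAndDedupe is a hypothetical name):

```typescript
// Merges both URL lists, strips trailing slashes, and removes exact duplicates.
// A Set keeps first-seen order, so text-layer URLs come before annotation URLs.
function mergeAndDedupe(textUrls: string[], annotationUrls: string[]): string[] {
  const normalized = [...textUrls, ...annotationUrls].map((url) =>
    url.replace(/\/+$/, ''),
  );
  return Array.from(new Set(normalized));
}
```

As written, the sketch only collapses strings that are identical after the slash is removed. Whether the library also reconciles differences in scheme or letter case is not stated on this page.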

5. Parse and classify

Each unique URL is passed to parseGitLink. Results that are not null are added to result.allLinks.

6. Resolve owner and categorise

resolveOwnerAndCategorize(result.allLinks, 'resume') runs the full disambiguation logic and populates ownerProfile, confidence, ownedRepos, contributions, and externalRepos.
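
The case-insensitive ownership rule behind ownedRepos and externalRepos can be illustrated in isolation. This is not resolveOwnerAndCategorize and it skips disambiguation entirely. RepoLinkSketch, its owner field, and splitByOwner are hypothetical, since the real ExtractedGitLink fields are not listed on this page.

```typescript
// Hypothetical link shape; `owner` is an assumed field name.
interface RepoLinkSketch {
  url: string;
  owner: string;
}

// Splits repo links by comparing each owner to the candidate username,
// ignoring letter case.
function splitByOwner(repos: RepoLinkSketch[], candidateUsername: string) {
  const target = candidateUsername.toLowerCase();
  const ownedRepos: RepoLinkSketch[] = [];
  const externalRepos: RepoLinkSketch[] = [];
  for (const repo of repos) {
    if (repo.owner.toLowerCase() === target) {
      ownedRepos.push(repo);
    } else {
      externalRepos.push(repo);
    }
  }
  return { ownedRepos, externalRepos };
}
```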

Error vs warnings

| Scenario | Where it appears |
| --- | --- |
| File not found / permission denied | result.error |
| unpdf text extraction threw | result.warnings |
| Annotation iteration threw | result.warnings |
| Disambiguator could not find an owner | result.warnings (and confidence: 'none') |

Always check result.warnings even on success. A warning like "Text extraction failed: ..." means only the annotation method ran — the result may be less complete than expected.
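
A caller could turn the warnings into a quick health check like the one below. This is a sketch that assumes the failure messages begin with the prefixes quoted on this page ("Text extraction failed:" and "Annotation extraction failed:"). The exact wording may differ, and extractionHealth is a hypothetical helper.

```typescript
type ExtractionHealth = { textOk: boolean; annotationsOk: boolean };

// Reports which extraction methods completed, based on warning prefixes.
// The prefixes are assumed from the examples quoted in this page.
function extractionHealth(warnings: string[]): ExtractionHealth {
  return {
    textOk: !warnings.some((w) => w.startsWith('Text extraction failed:')),
    annotationsOk: !warnings.some((w) =>
      w.startsWith('Annotation extraction failed:'),
    ),
  };
}
```

If either flag is false, the result came from a single method and may be missing links that only the other method would have found.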

Examples

import { parseResume } from '@clyrisai/gitresolve';

const result = await parseResume('./resumes/janedoe.pdf');

if (result.error) {
  console.error('Could not read file:', result.error);
} else {
  console.log('Owner profile:', result.ownerProfile?.url);
  console.log('Confidence:',    result.confidence);
  console.log('Owned repos:',   result.ownedRepos.map(r => r.repo));
  console.log('Contributions:', result.contributions.length);
  console.log('Warnings:',      result.warnings);
}
