The resume parser reads a PDF file from the local filesystem and extracts every git provider URL it can find using two independent methods: rendered text extraction and hyperlink annotation scanning. Both methods run independently so that a failure in one does not block the other. The result is the same ResolverResult shape returned by scrapePortfolio, making it straightforward to combine portfolio and resume data downstream.
parseResume
Reads a local PDF file and resolves any git profile or repo links it contains into a structured ResolverResult.
Parameters
Path to a local PDF file. Both relative paths (./resumes/janedoe.pdf) and absolute paths (/tmp/uploads/janedoe.pdf) are accepted. The path is passed directly to Node.js fs/promises.readFile.
parseResume only accepts local file paths. It cannot fetch remote PDFs over HTTP. If you have a PDF URL, download the file first and then pass the local path to parseResume.
Returns
Promise<ResolverResult> — this function never throws. All errors are surfaced inside the result object.
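One way to honour a never-throws contract like this is to catch the read failure and record it on the result instead of rethrowing. The sketch below is illustrative only: SketchResult and readResumeSafely are hypothetical names, and only the error, warnings, and allLinks fields are taken from this page (the real ResolverResult has more).

```typescript
import { readFile } from "node:fs/promises";

// Hypothetical, trimmed-down stand-in for ResolverResult.
interface SketchResult {
  allLinks: string[];
  warnings: string[];
  error?: string;
}

// Hypothetical helper: resolves in every case and never rejects.
async function readResumeSafely(
  filePath: string,
): Promise<{ buffer: Uint8Array | null; result: SketchResult }> {
  const result: SketchResult = { allLinks: [], warnings: [] };
  try {
    // readFile resolves to a Buffer, which is a Uint8Array subclass.
    const buffer = await readFile(filePath);
    return { buffer, result };
  } catch (err) {
    // File not found, permission denied, etc. end up here.
    result.error = err instanceof Error ? err.message : String(err);
    return { buffer: null, result };
  }
}
```

Callers of a function shaped like this do not need their own try/catch; they inspect result.error instead.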
The filePath argument, unchanged.
Always 'resume_file' for results from this function.
The resolved candidate git profile, or null if none could be determined from the resume.
Confidence level from the owner disambiguation step.
Repos whose owner username matches the resolved candidate username (case-insensitive).
Explicit PR and issue links found in the resume.
Repos referenced in the resume but owned by a different username.
Every ExtractedGitLink parsed from the resume, before categorisation.
Diagnostic messages. Extraction method failures appear here (e.g. "Text extraction failed: ..." or "Annotation extraction failed: ..."). Also includes counts and disambiguator messages.
Set only when fs.readFile itself fails (file not found, permission denied, etc.). Individual extraction method failures are in warnings, not error.
How it works
Read the file
Calls fs.readFile(filePath) to load the entire PDF into a Buffer. If this fails (file not found, permission error), result.error is set and the function returns immediately with empty arrays.
Method 1: Text layer extraction
Dynamically imports unpdf and calls getDocumentProxy and extractText with mergePages: true. The resulting text string is passed to extractGitUrlsFromText, which regex-scans for all git provider URLs. Failures are caught and appended to result.warnings without affecting Method 2.
Method 2: Hyperlink annotation extraction
Creates a second independent copy of the PDF buffer (to avoid shared-state issues) and iterates every page via pdf.getPage(i) and page.getAnnotations(). Any annotation with subtype: 'Link' and a url string is checked against GIT_HOSTS. Matching annotation URLs are collected. Failures are caught and appended to result.warnings.
Deduplicate
Both URL sets are merged, trailing slashes are stripped, and the combined list is deduplicated with a Set. This handles the common case where a git URL appears as both visible text and a clickable hyperlink.
Parse and classify
Each unique URL is passed to parseGitLink. Results that are not null are added to result.allLinks.
Error vs warnings
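The extraction and deduplication steps above can be sketched as follows, with the emphasis on where a method failure ends up. This is a simplified illustration, not the library's implementation: Extractor, normalizeAndDedupe, and collectGitUrls are hypothetical names, and the two methods are passed in as plain functions so the sketch does not depend on unpdf.

```typescript
// Hypothetical signature for one extraction method (text layer or annotations).
type Extractor = () => Promise<string[]>;

// Strip trailing slashes, then deduplicate with a Set (insertion order is kept).
function normalizeAndDedupe(urls: string[]): string[] {
  return Array.from(new Set(urls.map((u) => u.replace(/\/+$/, ""))));
}

// Run each method inside its own try/catch so one failure cannot block the other.
async function collectGitUrls(
  methods: Array<{ label: string; run: Extractor }>,
): Promise<{ urls: string[]; warnings: string[] }> {
  const found: string[] = [];
  const warnings: string[] = [];
  for (const { label, run } of methods) {
    try {
      found.push(...(await run()));
    } catch (err) {
      // A method failure becomes a warning; it is never rethrown.
      const reason = err instanceof Error ? err.message : String(err);
      warnings.push(`${label} extraction failed: ${reason}`);
    }
  }
  return { urls: normalizeAndDedupe(found), warnings };
}
```

With a "Text" method that throws and an "Annotation" method that returns https://github.com/janedoe/ and https://github.com/janedoe, this sketch yields one URL and one warning beginning with "Text extraction failed:".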
| Scenario | Where it appears |
|---|---|
| File not found / permission denied | result.error |
| unpdf text extraction threw | result.warnings |
| Annotation iteration threw | result.warnings |
| Disambiguator could not find an owner | result.warnings (and confidence: 'none') |
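On the calling side, the table above suggests checking error first and treating warnings as non-fatal diagnostics. The following is a minimal sketch under stated assumptions: ResultLike and triage are hypothetical names, confidence is typed as a plain string because only the 'none' value appears on this page, and method failures are detected by matching the "extraction failed" wording from the example messages, which may not be a stable format.

```typescript
// Hypothetical subset of ResolverResult, limited to fields named on this page.
interface ResultLike {
  error?: string;
  warnings: string[];
  confidence: string;
}

type Triage = "failed" | "no-owner" | "partial" | "ok";

function triage(result: ResultLike): Triage {
  // The file could not be read at all; nothing else in the result is usable.
  if (result.error) return "failed";
  // The disambiguator could not find an owner.
  if (result.confidence === "none") return "no-owner";
  // One extraction method threw; the other may still have produced links.
  // Assumption: failure messages contain the phrase "extraction failed".
  if (result.warnings.some((w) => w.includes("extraction failed"))) return "partial";
  return "ok";
}
```

Because warnings also carries counts and disambiguator messages, a non-empty warnings array alone is not treated as a failure here.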