The disambiguator takes a flat list of ExtractedGitLink objects — profiles, repos, PRs, and issues — and works out which profile is the candidate’s own. It then categorises every repo link as either owned by the candidate or external. scrapePortfolio and parseResume both call this internally, but you can invoke it directly when you have already collected links from a custom source.

resolveOwnerAndCategorize

Determines the candidate owner from a list of parsed git links, splits repo links into owned and external buckets, and collects pull request and issue links as contributions.
function resolveOwnerAndCategorize(
  links: ExtractedGitLink[],
  sourceContext?: string,
  knownOwnerProfile?: ExtractedGitLink
): {
  ownerProfile: ExtractedGitLink | null;
  confidence: 'high' | 'medium' | 'low' | 'none';
  ownedRepos: ExtractedGitLink[];
  contributions: ExtractedGitLink[];
  externalRepos: ExtractedGitLink[];
  warnings: string[];
}

Parameters

links
ExtractedGitLink[]
required
The full list of parsed git links from a single source. Usually the allLinks array produced by scrapePortfolio or parseResume. May be empty; in that case the function returns confidence: 'none'.
sourceContext
string
An optional hint string used in warning messages to indicate where the links came from (e.g. 'portfolio', 'resume'). Does not affect the resolution algorithm.
knownOwnerProfile
ExtractedGitLink
If you already know the candidate’s git profile (from a different source, or from a direct input), supply it here. When provided, the disambiguation algorithm is bypassed entirely — ownerProfile is set to this value, confidence is forced to 'high', and a warning is added noting the bypass.

Returns

An object with the following shape. The underlying OwnerResolution interface is internal to the library — it is not exported from @clyrisai/gitresolve and cannot be imported by name. Use the field descriptions below as your type reference.
ownerProfile
ExtractedGitLink | null
The resolved candidate git profile, or null when no owner could be determined (Case 4).
confidence
'high' | 'medium' | 'low' | 'none'
How confident the algorithm is in the resolved owner. See the four cases below.
ownedRepos
ExtractedGitLink[]
Repo links (type 'repo') where username matches the resolved owner, case-insensitively. Duplicates (same URL) are removed.
contributions
ExtractedGitLink[]
All pull_request and issue links from the input, deduplicated by URL.
externalRepos
ExtractedGitLink[]
Repo links whose username does not match the resolved owner — third-party repos referenced on the page or in the document.
warnings
string[]
Diagnostic strings describing how the owner was determined (or why it could not be).

knownOwnerProfile bypass

When knownOwnerProfile is supplied, the function skips all four disambiguation cases. The owner is accepted as given, confidence is set to 'high', and the following warning is added:
Owner strictly determined by profile URL input: {username}
Repo categorisation still runs normally — ownedRepos and externalRepos are computed relative to the supplied username.
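
The bypass described above can be pictured with a minimal sketch. This is not the library's source: GitLink is a local stand-in for ExtractedGitLink limited to the fields shown on this page, and applyKnownOwner is a hypothetical helper name. Only the confidence value and the warning text are taken from the description above.

```typescript
// Local stand-in for ExtractedGitLink (assumption: only fields shown on this page).
type GitLink = {
  url: string;
  provider: string;
  type: string;
  username: string;
  repo?: string;
};

type Confidence = 'high' | 'medium' | 'low' | 'none';

// Hypothetical helper illustrating the bypass: the supplied profile is accepted
// as the owner, confidence is forced to 'high', and one warning is recorded.
function applyKnownOwner(known: GitLink): {
  ownerProfile: GitLink;
  confidence: Confidence;
  warnings: string[];
} {
  return {
    ownerProfile: known,
    confidence: 'high',
    warnings: [`Owner strictly determined by profile URL input: ${known.username}`],
  };
}

const bypassed = applyKnownOwner({
  url: 'https://github.com/janedoe',
  provider: 'github',
  type: 'profile',
  username: 'janedoe',
});

console.log(bypassed.confidence);  // 'high'
console.log(bypassed.warnings[0]); // 'Owner strictly determined by profile URL input: janedoe'
```

With the real function, the same effect comes from passing the profile as the third argument, knownOwnerProfile.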

The four disambiguation cases

Repo categorisation rules

After the owner is determined, each repo, pull request, and issue link in the links array is assigned to exactly one bucket:
| Bucket | Condition |
| --- | --- |
| ownedRepos | link.type === 'repo' AND link.username.toLowerCase() === ownerUsername.toLowerCase() |
| externalRepos | link.type === 'repo' AND username does NOT match owner |
| contributions | link.type === 'pull_request' OR link.type === 'issue' |
Profile links, gist links, and 'other' type links are not placed in any of the three buckets but remain in the input links array (accessible as allLinks on ResolverResult). Duplicate repo URLs (case-insensitive) are deduplicated across ownedRepos and externalRepos — the first occurrence is kept.
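
The rules above can be sketched as a small standalone function. It is an illustration under stated assumptions rather than the library's implementation: GitLink is a local stand-in for ExtractedGitLink, categorizeLinks is a hypothetical name, and the pull request URL in the sample data is made up. Comparing contribution URLs exactly (not case-insensitively) is also an assumption, since the page only says they are deduplicated by URL.

```typescript
// Local stand-in for ExtractedGitLink (assumption: only fields shown on this page).
type GitLink = {
  url: string;
  provider: string;
  type: 'profile' | 'repo' | 'pull_request' | 'issue' | 'gist' | 'other';
  username: string;
  repo?: string;
};

// Hypothetical helper: applies the bucket rules for an already-resolved owner.
function categorizeLinks(links: GitLink[], ownerUsername: string) {
  const owner = ownerUsername.toLowerCase();
  const seenRepoUrls = new Set<string>();         // shared by owned + external
  const seenContributionUrls = new Set<string>();
  const ownedRepos: GitLink[] = [];
  const externalRepos: GitLink[] = [];
  const contributions: GitLink[] = [];

  for (const link of links) {
    if (link.type === 'repo') {
      const key = link.url.toLowerCase();         // case-insensitive repo URL
      if (seenRepoUrls.has(key)) continue;        // first occurrence wins
      seenRepoUrls.add(key);
      if (link.username.toLowerCase() === owner) {
        ownedRepos.push(link);
      } else {
        externalRepos.push(link);
      }
    } else if (link.type === 'pull_request' || link.type === 'issue') {
      if (seenContributionUrls.has(link.url)) continue;
      seenContributionUrls.add(link.url);
      contributions.push(link);
    }
    // profile, gist, and 'other' links are intentionally left out of all buckets
  }

  return { ownedRepos, externalRepos, contributions };
}

// Illustrative input: one duplicate repo URL that differs only in case.
const sampleLinks: GitLink[] = [
  { url: 'https://github.com/janedoe', provider: 'github', type: 'profile', username: 'janedoe' },
  { url: 'https://github.com/JaneDoe/api-service', provider: 'github', type: 'repo', username: 'JaneDoe', repo: 'api-service' },
  { url: 'https://github.com/janedoe/api-service', provider: 'github', type: 'repo', username: 'janedoe', repo: 'api-service' },
  { url: 'https://github.com/facebook/react', provider: 'github', type: 'repo', username: 'facebook', repo: 'react' },
  { url: 'https://github.com/facebook/react/pull/1', provider: 'github', type: 'pull_request', username: 'facebook', repo: 'react' },
];

const buckets = categorizeLinks(sampleLinks, 'janedoe');
console.log(buckets.ownedRepos.length);    // 1
console.log(buckets.externalRepos.length); // 1
console.log(buckets.contributions.length); // 1
```

Using one set of seen URLs for both repo buckets mirrors the statement that duplicates are removed across ownedRepos and externalRepos, not within each bucket separately.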

When to call directly

In most cases you do not need to call resolveOwnerAndCategorize yourself — scrapePortfolio and parseResume call it internally. You should call it directly when:
  • You have collected ExtractedGitLink objects from a custom source not covered by the built-in functions.
  • You want to re-run disambiguation on a combined link list from multiple sources (e.g. merging allLinks from a portfolio result and a resume result).
  • You are testing your own parsing logic and want to verify categorisation.
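
For the multi-source case, one possible way to build the combined list is sketched below. mergeLinks is a hypothetical helper, GitLink is a local stand-in for ExtractedGitLink, and the sample arrays stand in for the allLinks values of a portfolio result and a resume result. Dropping duplicate URLs up front is optional, since repo and contribution links are deduplicated during categorisation anyway.

```typescript
// Local stand-in for ExtractedGitLink (assumption: only fields shown on this page).
type GitLink = {
  url: string;
  provider: string;
  type: string;
  username: string;
  repo?: string;
};

// Hypothetical helper: concatenates link lists in order and drops repeated URLs,
// keeping the first occurrence. Inputs are not mutated.
function mergeLinks(...sources: GitLink[][]): GitLink[] {
  const seen = new Set<string>();
  const merged: GitLink[] = [];
  for (const source of sources) {
    for (const link of source) {
      const key = link.url.toLowerCase();
      if (seen.has(key)) continue;
      seen.add(key);
      merged.push(link);
    }
  }
  return merged;
}

// Illustrative stand-ins for portfolioResult.allLinks and resumeResult.allLinks.
const portfolioLinks: GitLink[] = [
  { url: 'https://github.com/janedoe', provider: 'github', type: 'profile', username: 'janedoe' },
  { url: 'https://github.com/janedoe/api-service', provider: 'github', type: 'repo', username: 'janedoe', repo: 'api-service' },
];
const resumeLinks: GitLink[] = [
  { url: 'https://github.com/janedoe', provider: 'github', type: 'profile', username: 'janedoe' },
  { url: 'https://github.com/facebook/react', provider: 'github', type: 'repo', username: 'facebook', repo: 'react' },
];

const mergedLinks = mergeLinks(portfolioLinks, resumeLinks);
console.log(mergedLinks.length); // 3

// The merged list would then be passed to the real function, for example:
// const resolution = resolveOwnerAndCategorize(mergedLinks, 'portfolio+resume');
```

The 'portfolio+resume' string is only an illustrative sourceContext value; as noted in Parameters, it appears in warning messages and does not change the resolution.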

Examples

import { resolveOwnerAndCategorize } from '@clyrisai/gitresolve';
import type { ExtractedGitLink } from '@clyrisai/gitresolve';

const links: ExtractedGitLink[] = [
  {
    url: 'https://github.com/janedoe',
    provider: 'github',
    type: 'profile',
    username: 'janedoe',
  },
  {
    url: 'https://github.com/janedoe/api-service',
    provider: 'github',
    type: 'repo',
    username: 'janedoe',
    repo: 'api-service',
  },
  {
    url: 'https://github.com/facebook/react',
    provider: 'github',
    type: 'repo',
    username: 'facebook',
    repo: 'react',
  },
];

const resolution = resolveOwnerAndCategorize(links, 'custom');

console.log(resolution.ownerProfile?.username);  // 'janedoe'
console.log(resolution.confidence);              // 'high'
console.log(resolution.ownedRepos.length);       // 1
console.log(resolution.externalRepos.length);    // 1 (facebook/react)

dedupeProfilesByUsername

Removes duplicate profile links from an array, keeping the first occurrence of each unique username. The comparison is case-insensitive so JaneDoe and janedoe are treated as the same person.
function dedupeProfilesByUsername(profiles: ExtractedGitLink[]): ExtractedGitLink[]

Parameters

profiles
ExtractedGitLink[]
required
An array of ExtractedGitLink objects. Only username is used for deduplication — url, provider, and other fields are not compared.

Returns

A new array containing only the first occurrence of each unique lowercased username. The original array is not mutated. Link type is not checked: because username is the only dedup key, non-profile links are not filtered out and are deduplicated by the same username comparison.

Example

import { dedupeProfilesByUsername } from '@clyrisai/gitresolve';
import type { ExtractedGitLink } from '@clyrisai/gitresolve';

const profiles: ExtractedGitLink[] = [
  { url: 'https://github.com/JaneDoe',   provider: 'github', type: 'profile', username: 'JaneDoe' },
  { url: 'https://github.com/janedoe',   provider: 'github', type: 'profile', username: 'janedoe' },
  { url: 'https://gitlab.com/janedoe',   provider: 'gitlab', type: 'profile', username: 'janedoe' },
  { url: 'https://github.com/bobsmith',  provider: 'github', type: 'profile', username: 'bobsmith' },
];

const unique = dedupeProfilesByUsername(profiles);
// [
//   { username: 'JaneDoe', url: 'https://github.com/JaneDoe', ... },  ← first occurrence kept
//   { username: 'bobsmith', url: 'https://github.com/bobsmith', ... },
// ]
// 'janedoe' (github) and 'janedoe' (gitlab) are both dropped as duplicates
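
The behaviour shown above amounts to a first-occurrence filter keyed on the lowercased username. A minimal sketch of that technique follows; dedupeByUsername is a hypothetical reimplementation for illustration, not the library's code, and GitLink is a local stand-in for ExtractedGitLink.

```typescript
// Local stand-in for ExtractedGitLink (assumption: only fields shown on this page).
type GitLink = {
  url: string;
  provider: string;
  type: string;
  username: string;
};

// Hypothetical sketch: keep the first link seen for each lowercased username.
// Array.prototype.filter returns a new array, so the input is left untouched.
function dedupeByUsername(profiles: GitLink[]): GitLink[] {
  const seen = new Set<string>();
  return profiles.filter((profile) => {
    const key = profile.username.toLowerCase();
    if (seen.has(key)) return false; // later duplicate, any provider
    seen.add(key);
    return true;
  });
}

const inputProfiles: GitLink[] = [
  { url: 'https://github.com/JaneDoe', provider: 'github', type: 'profile', username: 'JaneDoe' },
  { url: 'https://github.com/janedoe', provider: 'github', type: 'profile', username: 'janedoe' },
  { url: 'https://gitlab.com/janedoe', provider: 'gitlab', type: 'profile', username: 'janedoe' },
  { url: 'https://github.com/bobsmith', provider: 'github', type: 'profile', username: 'bobsmith' },
];

const uniqueProfiles = dedupeByUsername(inputProfiles);
console.log(uniqueProfiles.map((p) => p.username)); // [ 'JaneDoe', 'bobsmith' ]
```

Because provider is not part of the key, the GitLab profile is dropped along with the lowercase GitHub one. If the same username on different providers should be kept apart, the key would need to include the provider; that would be a different function from the one documented here.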
