Translators are the core logic units that teach Zotero how to extract bibliographic metadata from a specific website, file format, or search result. Each translator is a self-contained JavaScript module paired with a JSON metadata header. The Zotero.Translators singleton, defined in src/common/translators.js, loads those modules from local storage, keeps them up to date by consulting the Zotero desktop client or the remote translator repository, and routes every page visit to the matching translators so that detection can proceed.

Translator Types

Every translator is assigned a bitmask translatorType field. The connector recognises four types, stored in the TRANSLATOR_TYPES enum (aliased from Zotero.Translator.TRANSLATOR_TYPES):
Type      Description
import    Reads a structured file format (BibTeX, RIS, MARC, etc.) and produces Zotero items
export    Serialises Zotero items to a structured format
web       Detects and scrapes metadata from a live webpage
search    Takes an identifier (DOI, ISBN, PMID) and queries a remote data source
A translator may combine multiple type bits. For example, a translator can be both import and export. During initialisation, _load() iterates over all stored translators and pushes each one into every _cache bucket whose bit it matches.
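The bucketing step can be sketched as follows. This is a minimal illustration rather than the connector's _load() code: the bit values (1, 2, 4, 8), the bucketByType helper, and the sample translators are assumptions made for the example.

```javascript
// Assumed bit values for illustration; consult Zotero.Translator.TRANSLATOR_TYPES for the real ones.
const TRANSLATOR_TYPES = { import: 1, export: 2, web: 4, search: 8 };

// Hypothetical helper: place each translator in every bucket whose bit is set.
function bucketByType(translators) {
  const cache = { import: [], export: [], web: [], search: [] };
  for (const translator of translators) {
    for (const [type, bit] of Object.entries(TRANSLATOR_TYPES)) {
      if (translator.translatorType & bit) {
        cache[type].push(translator);
      }
    }
  }
  return cache;
}

// Made-up sample data: one combined import/export translator and one web translator.
const buckets = bucketByType([
  { label: "Example import/export", translatorType: 1 | 2 },
  { label: "Example web", translatorType: 4 },
]);
console.log(Object.keys(buckets).filter((type) => buckets[type].length > 0));
```

A translator whose translatorType combines two bits ends up in two buckets, which is why the same instance can be returned for more than one type.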

Initialisation and the _cache Object

Zotero.Translators.init() is idempotent and promise-cached — subsequent calls return the same promise rather than re-loading.
// src/common/translators.js
this.init = async function() {
  if (_initializedPromise) return _initializedPromise;
  _cache = {"import":[], "export":[], "web":[], "search":[]};
  _translators = {};
  _initializedPromise = new Promise(async (resolve, reject) => {
    try {
      let translators = Zotero.Prefs.get("translatorMetadata");
      // No stored translators
      if (typeof translators !== "object" || !translators.length) {
        Zotero.debug(`Translators: First time launch, getting all translators.`);
        await this.updateFromRemote(true);
      }
      else {
        this._load(translators);
      }
      this.keepTranslatorsUpdated();
      resolve();
    }
    catch (e) {
      _initializedPromise = null;
      reject(e);
    }
  });
  return _initializedPromise;
};
The _cache object maps each type name to a sorted array of Zotero.Translator instances:
_cache = {
  "import":  [ /* sorted by priority */ ],
  "export":  [ /* sorted by priority */ ],
  "web":     [ /* sorted by priority */ ],
  "search":  [ /* sorted by priority */ ]
}
After every batch load, each array is sorted ascending by the priority field so that higher-priority (lower number) translators are tested first.
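A minimal sketch of that ordering, assuming each translator exposes a numeric priority. The sortByPriority helper and the sample values are hypothetical, and the sketch copies the array instead of sorting in place.

```javascript
// Hypothetical helper: return a copy sorted ascending by priority,
// so that lower numbers (higher priority) come first.
function sortByPriority(translators) {
  return [...translators].sort((a, b) => a.priority - b.priority);
}

// Made-up sample data.
const ordered = sortByPriority([
  { label: "Generic", priority: 300 },
  { label: "Site-specific", priority: 100 },
  { label: "Publisher", priority: 200 },
]);
console.log(ordered.map((translator) => translator.label));
```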

Translator Code Caching

Translator metadata (title, author, regexp, priority…) is always stored in the translatorMetadata preference as a JSON array. The JavaScript code for each translator is cached separately in the extension’s preference storage under the key prefix translatorCode_ followed by the translator’s UUID.
this.PREFS_TRANSLATOR_CODE_PREFIX = 'translatorCode_';
When code is needed for a translator, getCodeForTranslator() first checks in-memory (translator.code), then falls back to Zotero.Prefs.get('translatorCode_' + translatorID). On a cache miss it fetches from the Zotero desktop client or the remote repository and stores the result back into prefs for future calls.
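The lookup order described above might look roughly like this. It is a sketch under stated assumptions: the Map stands in for Zotero.Prefs, fetchCode stands in for the desktop client or repository request, and getCode is a hypothetical name, not the connector's getCodeForTranslator().

```javascript
const PREFS_TRANSLATOR_CODE_PREFIX = "translatorCode_";

// Hypothetical stand-in for the preference store; the connector uses Zotero.Prefs.
const prefs = new Map();

// Hypothetical helper mirroring the lookup order: memory, then prefs, then a fetch.
// `fetchCode` is an assumed async function standing in for the client/repo request.
async function getCode(translator, fetchCode) {
  if (translator.code) return translator.code;

  const key = PREFS_TRANSLATOR_CODE_PREFIX + translator.translatorID;
  let code = prefs.get(key);
  if (!code) {
    code = await fetchCode(translator.translatorID);
    prefs.set(key, code); // store the result for future calls
  }
  translator.code = code; // assumption: also keep the code in memory
  return code;
}
```

Keying the stored code by translator ID keeps the metadata array small, since code is only read for translators that are actually needed.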

URL Matching: getWebTranslatorsForLocation()

The inject script calls getWebTranslatorsForLocation(URI, rootURI) in the background to find which web translators apply to the current tab. Matching works as follows:
  1. Resolve potential proxied URLs: Zotero.Proxies.getPotentialProxies(rootURI) returns a map from candidate unproxied URLs to proxy objects, so that translator regexps can be tested against the real host.
  2. Test root and frame regexps: for every web translator in _cache["web"], the code tests the translator’s webRegexp.root against each candidate root URL. Generic translators (those without a webRegexp.root) are eligible only when they can run in the browser (RUN_MODE_IN_BROWSER).
  3. Test frame regexps for iframes: when the current URI is inside a frame (URI !== rootURI), translators that declare webRegexp.all are additionally tested against the frame URI.
  4. Fetch code and return: CodeGetter.getAll() concurrently fetches code for all potential translators (concurrency capped at 2 to avoid hammering the network), then the matched list is returned to the inject script for detectWeb() to run.
The _fullFrameDetectionWhitelist hard-codes hosts (currently ['resolver.ebscohost.com']) whose frames are treated as root frames, enabling full translator matching even inside an iframe context.
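The root and frame tests can be sketched in simplified form. This is one reading of the description, not the connector's implementation: matchWebTranslators is a hypothetical name, the runsInBrowser flag stands in for the RUN_MODE_IN_BROWSER check, proxy resolution is left out, and treating a whitelisted frame as the root is an assumption about how the whitelist is applied.

```javascript
// Hosts whose frames are treated as root frames (value taken from the text above).
const FULL_FRAME_DETECTION_WHITELIST = ["resolver.ebscohost.com"];

// Hypothetical, simplified matcher; proxy handling is omitted.
function matchWebTranslators(translators, uri, rootURI) {
  // Assumption: a whitelisted frame is matched as if it were the root frame.
  if (FULL_FRAME_DETECTION_WHITELIST.includes(new URL(uri).host)) {
    rootURI = uri;
  }
  const inFrame = uri !== rootURI;

  return translators.filter((translator) => {
    const { root, all } = translator.webRegexp;
    // Generic translators have no root regexp and must be able to run in the browser.
    const rootMatches = root ? root.test(rootURI) : Boolean(translator.runsInBrowser);
    if (!rootMatches) return false;
    // Inside a frame, a translator that declares webRegexp.all must also match the frame URI.
    if (inFrame && all) return all.test(uri);
    return true;
  });
}
```

The regexps are expected to be created without the g flag, so repeated test() calls carry no state between URLs.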

Translator Updates: updateFromRemote()

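Concurrent updates are deduplicated through a module-level promise. A call made while an update is in flight joins it, unless the reset flag is set, in which case the call waits for the in-flight update to settle and then starts a full re-fetch rather than a diff: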
this.updateFromRemote = async function(reset=false) {
  if (_updateFromRemotePromise) {
    if (!reset) {
      return _updateFromRemotePromise;     // join in-progress update
    }
    try { await _updateFromRemotePromise; } catch (e) {}
  }
  let promise = this._doUpdateFromRemote(reset);
  _updateFromRemotePromise = promise;
  promise.finally(() => { _updateFromRemotePromise = null; });
  return promise;
}
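The joining behaviour can be observed with a reduced, standalone version of the same pattern. Here doUpdate is a hypothetical stand-in for _doUpdateFromRemote() that only counts invocations, and update and demo are names invented for the sketch.

```javascript
let inFlight = null;
let runs = 0;

// Hypothetical stand-in for _doUpdateFromRemote(); it only counts invocations.
async function doUpdate(reset) {
  runs++;
  await new Promise((resolve) => setTimeout(resolve, 10));
  return reset ? "full" : "diff";
}

// Same joining logic as updateFromRemote(), reduced to a free function.
async function update(reset = false) {
  if (inFlight) {
    if (!reset) return inFlight; // join the in-progress update
    try { await inFlight; } catch (e) { /* ignored; a fresh run follows */ }
  }
  const promise = doUpdate(reset);
  inFlight = promise;
  promise.finally(() => { inFlight = null; });
  return promise;
}

// Two plain calls share one run; the reset call waits, then starts its own.
async function demo() {
  const results = await Promise.all([update(), update(), update(true)]);
  return { results, runs };
}
```

In this sketch three calls produce two underlying runs: the second call joins the first, and the reset call starts a new run once the first has settled.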
Internally, _doUpdateFromRemote tries two sources in order:
  1. Zotero desktop client — calls Zotero.Repo.getTranslatorMetadataFromZotero(), which posts to the connector’s getTranslators endpoint and records the current timestamp in connector.repo.lastCheck.localTime.
  2. Remote repository — calls Zotero.Repo.getTranslatorMetadataFromServer(reset), which fetches from https://repo.zotero.org/repo/metadata?version=<ext_version>&last=<repoTime>.
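The ordered fallback can be sketched generically. The firstAvailable helper is hypothetical, and the sketch assumes that a source signals failure by rejecting; the text above does not state the exact condition under which the connector moves on to the repository.

```javascript
// Hypothetical helper: try each metadata source in order and return the first result.
// Assumption: a source signals failure by rejecting.
async function firstAvailable(sources) {
  let lastError;
  for (const source of sources) {
    try {
      return await source();
    } catch (e) {
      lastError = e; // remember the failure and try the next source
    }
  }
  throw lastError ?? new Error("no sources given");
}
```

A caller would pass the desktop client lookup first and the repository lookup second, for example firstAvailable([fromZoteroClient, fromRepository]), where both function names are placeholders.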

Zotero.Repo: Fetching Translator Code

Zotero.Repo (src/common/repo.js) provides the lower-level fetch primitives:

getTranslatorCode(translatorID)

Tries the Zotero desktop client first (getTranslatorCode endpoint), then falls back to https://repo.zotero.org/repo/code/<id>?version=<ext_version>. Validates the embedded metadata JSON and triggers Zotero.Translators.updateTranslator() if the fetched code is newer than stored metadata.

getTranslatorMetadataFromServer(reset)

Fetches a full or differential metadata list from the repo. The reset=true path passes last=0; otherwise it sends the Unix timestamp stored in connector.repo.lastCheck.repoTime. The Date response header is used to update that timestamp.
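The two repository URLs described in this section can be assembled as below. The helper names and the version string used in the usage lines are made up for illustration; only the URL shapes come from the text above.

```javascript
const REPOSITORY_URL = "https://repo.zotero.org/repo";

// Hypothetical helper: build the metadata URL; reset forces last=0 (full list).
function metadataURL(extVersion, repoTime, reset = false) {
  const params = new URLSearchParams({
    version: extVersion,
    last: String(reset ? 0 : repoTime),
  });
  return `${REPOSITORY_URL}/metadata?${params}`;
}

// Hypothetical helper: build the code URL for a single translator ID.
function codeURL(translatorID, extVersion) {
  const params = new URLSearchParams({ version: extVersion });
  return `${REPOSITORY_URL}/code/${encodeURIComponent(translatorID)}?${params}`;
}

// "5.0.100" and the timestamp are placeholder values.
console.log(metadataURL("5.0.100", 1700000000));
console.log(metadataURL("5.0.100", 1700000000, true));
```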

Keeping Translators Updated: keepTranslatorsUpdated()

After init() completes, keepTranslatorsUpdated() runs as a self-scheduling async loop:
// ZOTERO_CONFIG.REPOSITORY_CHECK_INTERVAL = 86400  (24 hours)
// ZOTERO_CONFIG.REPOSITORY_RETRY_INTERVAL = 3600   (1 hour)

this.keepTranslatorsUpdated = async function() {
  const nextCascadeToRepo = Zotero.Prefs.get("connector.repo.lastCheck.localTime")
    + ZOTERO_CONFIG.REPOSITORY_CHECK_INTERVAL * 1000;
  const now = Date.now();
  const repoCheckIntervalHasExpired = nextCascadeToRepo <= now;
  let repoCheckFailed = false; // set when the update attempt below throws

  if (repoCheckIntervalHasExpired) {
    try { await this.updateFromRemote(); }
    catch (e) { repoCheckFailed = true; }
  }

  let nextCheckIn = Math.max(0, nextCascadeToRepo - now);
  if (repoCheckIntervalHasExpired && repoCheckFailed) {
    nextCheckIn = ZOTERO_CONFIG.REPOSITORY_RETRY_INTERVAL * 1000; // retry in 1h
  }
  await Zotero.Promise.delay(nextCheckIn);
  return this.keepTranslatorsUpdated();
}
The check interval is 24 hours (REPOSITORY_CHECK_INTERVAL: 86400). On failure, the retry interval drops to 1 hour (REPOSITORY_RETRY_INTERVAL: 3600).
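The delay arithmetic can be isolated into a pure function for clarity. The nextCheckDelay helper is hypothetical; the constants and the branching follow the loop shown above.

```javascript
const REPOSITORY_CHECK_INTERVAL = 86400; // seconds (24 hours)
const REPOSITORY_RETRY_INTERVAL = 3600;  // seconds (1 hour)

// Hypothetical pure helper: milliseconds to wait before the next loop iteration.
function nextCheckDelay(lastCheckMs, nowMs, repoCheckFailed) {
  const nextCascadeToRepo = lastCheckMs + REPOSITORY_CHECK_INTERVAL * 1000;
  const expired = nextCascadeToRepo <= nowMs;
  if (expired && repoCheckFailed) {
    return REPOSITORY_RETRY_INTERVAL * 1000; // failed check: retry in one hour
  }
  // Not yet due: wait out the remainder. Due and succeeded: 0, so the loop
  // runs again at once and re-reads the stored timestamp.
  return Math.max(0, nextCascadeToRepo - nowMs);
}

const HOUR = 3600 * 1000;
console.log(nextCheckDelay(0, 1 * HOUR, false) / HOUR);  // 23
console.log(nextCheckDelay(0, 25 * HOUR, true) / HOUR);  // 1
console.log(nextCheckDelay(0, 25 * HOUR, false) / HOUR); // 0
```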

Translator Hash Mechanism

When the Zotero desktop client responds to a ping request, it includes a translatorsHash (or sortedTranslatorHash) in the prefs object. The connector computes its own hash with getTranslatorsHash(sorted):
// Hash = MD5 of "<id>:<lastUpdated>," concatenated for every translator
let hashString = "";
for (let translator of translators) {
  hashString += `${translator.translatorID}:${translator.lastUpdated},`;
}
this[prop] = Zotero.Utilities.Connector.md5(hashString);
If the hashes differ, Zotero.Translators.updateFromRemote() is called immediately to pull the latest metadata from the desktop client.
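A standalone sketch of the comparison follows. To stay dependency-free it takes the hash function as a parameter, whereas the connector uses Zotero.Utilities.Connector.md5. The helper names are hypothetical, and sorting by translatorID when sorted is true is an assumption about what the flag does.

```javascript
// Hypothetical helper. `md5` is injected so the sketch has no dependencies.
// Assumption: `sorted` means ordering translators by translatorID before hashing.
function translatorsHash(translators, md5, sorted = false) {
  const list = sorted
    ? [...translators].sort((a, b) => a.translatorID.localeCompare(b.translatorID))
    : translators;
  let hashString = "";
  for (const translator of list) {
    hashString += `${translator.translatorID}:${translator.lastUpdated},`;
  }
  return md5(hashString);
}

// Hypothetical helper: an update is needed when the two hashes disagree.
function needsUpdate(localHash, clientHash) {
  return localHash !== clientHash;
}
```

Because lastUpdated is part of the hashed string, a single changed translator on the desktop client is enough to change the hash and trigger an update.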

Key Method Reference

init(): Promise<void>
Initialises the translator cache. Loads stored metadata from the translatorMetadata preference, or fetches from remote if this is the first launch. Idempotent.

get(translatorID): Promise<Zotero.Translator|false>
Returns a translator by UUID with code loaded (for RUN_MODE_IN_BROWSER translators). Returns false if not found.

getWithoutCode(translatorID): Promise<Zotero.Translator|false>
Returns a translator object without loading its code. Useful for metadata-only lookups.

getAllForType(type, debugMode?): Promise<Zotero.Translator[]>
Returns all translators of a given type ("import", "export", "web", "search") with code loaded. Pass debugMode=true to re-fetch code from the Zotero desktop client (Zotero Standalone).

getWebTranslatorsForLocation(URI, rootURI, callback?): Promise<[translators[], proxies[]]>
Finds web translators matching the given page URL. Returns a tuple of matched translators and corresponding proxy objects.

serialize(translators, properties): Object|Object[]
Converts one or more translator objects to plain JSON-serialisable objects using the supplied list of property names. Used internally with Zotero.Translator.TRANSLATOR_CACHING_PROPERTIES.

updateFromRemote(reset?): Promise<void>
Fetches updated translator metadata from the Zotero desktop client, falling back to the remote repository. Pass reset=true for a full re-fetch.

deleteTranslatorCode(id): Promise<void>
Removes cached code for a translator from both in-memory state and the preferences store (translatorCode_<id>).
