The crawler is the first stage in the signal pipeline. It connects to a Telegram channel via the GramJS client, retrieves raw message text for one or more calendar days, hands each message to the screen service for parsing, and then upserts any successfully-parsed signals into the parser-items MongoDB collection. Because every upsert is keyed on { channel, messageId }, the crawler is fully idempotent — you can re-run it over any date range without creating duplicate records.

CrawlerService

CrawlerService (packages/core/src/lib/services/core/CrawlerService.ts) is the low-level workhorse. It delegates scraping to ScraperService and parsing to CryptoYodaScreenService, then persists the results through ParserDbService.

crawlDay(stamp)

Crawls a single calendar day identified by a moment-stamp (an integer like 20260115).
public crawlDay = async (stamp: number) => {
  this.loggerService.log("crawlerService crawlDay", { stamp });
  return await this.crawlRange(stamp, stamp);
};
This is the method called during live mode — the 15-minute crontab always crawls just today.

crawlRange(fromStamp, toStamp)

Crawls an inclusive range of days in parallel. For each day in the range, it calls cryptoYodaScreenService.screenDay(date), then processes every returned message:
public crawlRange = async (fromStamp: number, toStamp: number) => {
  this.loggerService.log("crawlerService crawlRange", { fromStamp, toStamp });
  const screenList = await RUN_CRAWLER_FN(
    fromStamp,
    toStamp,
    this.cryptoYodaScreenService.screenDay,
  );
  for (const msg of screenList) {
    if (!msg.data) {
      this.loggerService.info("crawlerService crawlRange skip: data is null", {
        channel: msg.channel,
        messageId: msg.id,
        content: msg.content,
      });
      continue;
    }
    if (msg.type === "crypto_yoda_channel") {
      await this.parserDbService.create({
        channel: msg.channel,
        source: msg.channel,
        messageId: msg.id,
        publishedAt: msg.date,
        note: msg.content,
        symbol: `${msg.data.symbol}USDT`,
        direction: msg.data.direction,
        entry: msg.data.entry,
        targets: msg.data.targets,
        stoploss: msg.data.stoploss,
        content: msg.data,
      });
    }
  }
  return screenList;
};
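Messages where data is null (failed parsing) are logged and skipped. Only messages whose type equals "crypto_yoda_channel" are written to the database. Because the write is guarded by type, messages from other screen services can pass through the same loop without being stored as Crypto Yoda signals. Judging from the loop above, each additional channel type still needs its own persistence branch here if its signals should be saved.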
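The implementation of RUN_CRAWLER_FN is not shown on this page. As a rough sketch of the behaviour described above (an inclusive range of days, screened in parallel), something like the following could work. The names listUtcDays, screenRange, and ScreenedMessage are hypothetical, and the sketch takes Date values rather than moment-stamps so that it does not have to assume how stamps are encoded.

```typescript
// Hypothetical message shape; the real type lives in the screen service.
interface ScreenedMessage {
  id: number;
  channel: string;
  date: Date;
}

type ScreenDayFn = (date: Date) => Promise<ScreenedMessage[]>;

const DAY_MS = 24 * 60 * 60 * 1000;

// Lists every UTC midnight from `from` to `to`, both ends included.
function listUtcDays(from: Date, to: Date): Date[] {
  const start = Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate());
  const end = Date.UTC(to.getUTCFullYear(), to.getUTCMonth(), to.getUTCDate());
  const days: Date[] = [];
  for (let ts = start; ts <= end; ts += DAY_MS) {
    days.push(new Date(ts));
  }
  return days;
}

// Screens all days concurrently, then flattens the per-day results in day order.
async function screenRange(
  from: Date,
  to: Date,
  screenDay: ScreenDayFn,
): Promise<ScreenedMessage[]> {
  const perDay = await Promise.all(listUtcDays(from, to).map((day) => screenDay(day)));
  return ([] as ScreenedMessage[]).concat(...perDay);
}
```

Promise.all preserves input order, so the flattened list stays grouped by day even though the per-day requests run concurrently. It also rejects as soon as any single day fails, which may or may not match what the real helper does.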

CrawlerMainService

CrawlerMainService (packages/core/src/lib/services/main/CrawlerMainService.ts) is the orchestration layer. The two strategy crontabs call into it rather than calling CrawlerService directly, so mode-awareness and frame-lookup logic stay in one place.
import { inject } from "../../core/di";
import LoggerService from "../base/LoggerService";
import TYPES from "../../core/types";
import { getContext, getMode, listFrameSchema } from "backtest-kit";
import CrawlerService from "../core/CrawlerService";
import { getMomentStamp } from "get-moment-stamp";
import { signalJobSubject } from "../../../config/emitters";

export class CrawlerMainService {
  readonly loggerService = inject<LoggerService>(TYPES.loggerService);
  readonly crawlerService = inject<CrawlerService>(TYPES.crawlerService);

  public crawlLiveFrame = async (when: Date) => {
    this.loggerService.log("crawlerMainService crawlLiveFrame", { when });
    const mode = await getMode();
    if (mode === "backtest") {
      return;
    }
    const stamp = getMomentStamp(when);
    await this.crawlerService.crawlDay(stamp);
    await signalJobSubject.next();
  };

  public crawlBacktestFrame = async (when: Date) => {
    this.loggerService.log("crawlerMainService crawlFrame", { when });
    const { frameName } = await getContext();
    const frameList = await listFrameSchema();
    const { startDate, endDate } = frameList.find(
      (frame) => frame.frameName === frameName,
    );
    const fromStamp = getMomentStamp(startDate);
    const toStamp = getMomentStamp(endDate);
    await this.crawlerService.crawlRange(fromStamp, toStamp);
    await signalJobSubject.next();
  };
}

export default CrawlerMainService;

crawlLiveFrame(when: Date)

Called by the 15-minute live-mode crontab. It:
  1. Guards against backtest mode: calls getMode() and returns early if the runtime is "backtest". This prevents the live handler from accidentally running when backtest-kit replays history.
  2. Computes today's stamp: converts the when timestamp to an integer moment-stamp with getMomentStamp(when).
  3. Crawls today: delegates to crawlerService.crawlDay(stamp), which fetches and upserts all of today’s channel messages.
  4. Fires the signal job: calls signalJobSubject.next() so that SignalJobService immediately processes any new parser-items through the RiskOutline.

crawlBacktestFrame(when: Date)

Called once at strategy startup by the backtest-prepare crontab. It:
  1. Resolves the active frame: reads frameName from getContext() and looks up the matching entry in listFrameSchema() to obtain startDate and endDate.
  2. Crawls the full range: converts both dates to moment-stamps and passes them to crawlerService.crawlRange(fromStamp, toStamp), which fetches every day in the frame in parallel.
  3. Fires the signal job: emits signalJobSubject.next() so the backtest job picks up every newly-upserted row.
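In the listing above, the result of frameList.find(...) is destructured directly, so an unknown frameName would surface as a generic TypeError. A minimal sketch of a more explicit lookup is shown below. The resolveFrame helper and the FrameLike interface are hypothetical; only the frameName, startDate, and endDate field names come from the listing.

```typescript
// Hypothetical, simplified frame shape; the real schema comes from backtest-kit.
interface FrameLike {
  frameName: string;
  startDate: Date;
  endDate: Date;
}

// Returns the frame with the given name, or throws a descriptive error
// instead of letting a destructuring of `undefined` fail later.
function resolveFrame(frameList: FrameLike[], frameName: string): FrameLike {
  const frame = frameList.find((item) => item.frameName === frameName);
  if (!frame) {
    const known = frameList.map((item) => item.frameName).join(", ") || "(none)";
    throw new Error(`Frame "${frameName}" not found; known frames: ${known}`);
  }
  return frame;
}
```

With a helper like this, a misconfigured frame name fails with a message that lists the registered frames, which is usually easier to diagnose than a destructuring error.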

ScraperService

ScraperService (packages/core/src/lib/services/core/ScraperService.ts) owns the raw Telegram I/O. It calls getTelegram() to obtain an authenticated GramJS client, then streams messages from the channel for a specific calendar day:
public scrapeDay = async (channel: string, date: Date): Promise<ScraperMessage[]> => {
  const client = await getTelegram();

  const dayStart = new Date(date);
  dayStart.setUTCHours(0, 0, 0, 0);

  const dayEnd = new Date(date);
  dayEnd.setUTCHours(23, 59, 59, 999);

  const rows: ScraperMessage[] = [];

  for await (const message of client.iterMessages(channel, {
    offsetDate: Math.floor(dayEnd.getTime() / 1000) + 1,
    reverse: false,
  })) {
    if (!message.message) {
      continue;
    }
    const ts = message.date * 1000;
    if (ts < dayStart.getTime()) {
      break;
    }
    rows.push({
      id: message.id,
      content: message.message,
      channel,
      date: new Date(ts),
    });
  }
  return rows;
};
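The iterator starts just past the end of the day (offsetDate is one second after 23:59:59 UTC) and walks backwards, breaking as soon as it sees a text message older than midnight UTC. Messages without text are skipped before the date check. The number of messages fetched therefore scales with that day's traffic rather than with the channel's full history.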
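The window arithmetic can be checked without a Telegram session. The sketch below reproduces the day bounds from scrapeDay and replays the loop over an in-memory, newest-first array. RawMessage, utcDayWindow, and takeDay are hypothetical names, and the explicit offsetDate comparison only emulates the server-side filtering that the real iterator is assumed to perform.

```typescript
// Simplified stand-in for a Telegram message: `date` is in Unix seconds.
interface RawMessage {
  id: number;
  date: number;
  message?: string;
}

// Computes the UTC day window plus the offsetDate (Unix seconds) used to start iteration.
function utcDayWindow(date: Date) {
  const dayStart = new Date(date);
  dayStart.setUTCHours(0, 0, 0, 0);
  const dayEnd = new Date(date);
  dayEnd.setUTCHours(23, 59, 59, 999);
  return {
    startMs: dayStart.getTime(),
    endMs: dayEnd.getTime(),
    offsetDate: Math.floor(dayEnd.getTime() / 1000) + 1,
  };
}

// Walks a newest-first list and stops at the first text message older than the window.
function takeDay(newestFirst: RawMessage[], date: Date): RawMessage[] {
  const { startMs, offsetDate } = utcDayWindow(date);
  const rows: RawMessage[] = [];
  for (const message of newestFirst) {
    if (message.date >= offsetDate) continue; // emulates the offsetDate cut-off
    if (!message.message) continue; // media-only messages carry no text
    if (message.date * 1000 < startMs) break; // older than midnight UTC: done
    rows.push(message);
  }
  return rows;
}
```

Note that offsetDate works out to the next UTC midnight expressed in seconds, so the window covers the whole calendar day.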

Channel Configuration

The channel being scraped is identified by the constant:
const CHANNEL_NAME = "crypto_yoda_channel" as const;
defined at the top of CryptoYodaScreenService. This string is used as both the GramJS channel identifier passed to ScraperService.scrapeDay() and as the type tag on each parsed message. To target a different Telegram channel, create a new ScreenService following the same pattern as CryptoYodaScreenService:
  1. Define a CHANNEL_NAME constant with the new channel’s identifier.
  2. Define a SIGNAL_FORMAT object with regex patterns for the new channel’s message structure (see the Parser guide).
  3. Implement screenDay(date) and parseDay(messages) methods.
  4. Inject the new screen service into CrawlerService and add its screenDay call to RUN_CRAWLER_FN.
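The steps above can be sketched as follows. Everything here is illustrative: the channel name, the message layout ("#BTC LONG / Entry / Targets / SL"), the regex patterns, the "long" and "short" direction values, and the parseMessage and screenMessage helpers are all invented for the example. The actual SIGNAL_FORMAT structure is described in the Parser guide and may look different.

```typescript
// Hypothetical channel identifier, also used as the type tag.
const CHANNEL_NAME = "example_signals_channel" as const;

// Hypothetical patterns for an invented message layout.
const SIGNAL_FORMAT = {
  header: /#([A-Z0-9]+)\s+(LONG|SHORT)/i,
  entry: /Entry:\s*([\d.]+)/i,
  targets: /Targets?:\s*([\d., ]+)/i,
  stoploss: /SL:\s*([\d.]+)/i,
};

interface ParsedSignal {
  symbol: string;
  direction: "long" | "short";
  entry: number;
  targets: number[];
  stoploss: number;
}

// Returns the parsed signal, or null when any required field is missing.
function parseMessage(content: string): ParsedSignal | null {
  const header = SIGNAL_FORMAT.header.exec(content);
  const entry = SIGNAL_FORMAT.entry.exec(content);
  const targets = SIGNAL_FORMAT.targets.exec(content);
  const stoploss = SIGNAL_FORMAT.stoploss.exec(content);
  if (!header || !entry || !targets || !stoploss) {
    return null;
  }
  const [, symbol = "", side = ""] = header;
  const [, entryText = ""] = entry;
  const [, targetsText = ""] = targets;
  const [, stopText = ""] = stoploss;
  return {
    symbol: symbol.toUpperCase(),
    direction: side.toLowerCase() === "long" ? "long" : "short",
    entry: Number(entryText),
    targets: targetsText
      .split(",")
      .map((part) => part.trim())
      .filter((part) => part.length > 0)
      .map(Number),
    stoploss: Number(stopText),
  };
}

// Wraps a raw message in the { type, content, data } shape that crawlRange inspects.
function screenMessage(content: string) {
  return { type: CHANNEL_NAME, content, data: parseMessage(content) };
}
```

Returning null for unparseable text matches the contract that crawlRange relies on: messages with data set to null are logged and skipped rather than stored.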

Data Deduplication

parserDbService.create() performs an upsert keyed on the compound unique index { channel, messageId }. This means:
  • Re-crawling the same date range never creates duplicate records.
  • If a Telegram message is edited after the first crawl, the updated content is written over the old record on the next crawl.
  • Backtest preparation can be re-run safely: messages already in the collection are overwritten in place, and only messages not yet stored add new documents.
The messageId field is the Telegram-assigned integer message ID, which is stable and monotonically increasing within each channel. It is not a UUID generated by the application.
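The effect of keying on { channel, messageId } can be illustrated with an in-memory stand-in. The InMemoryParserStore class below is hypothetical and uses a Map in place of the MongoDB collection and its unique index; it is not how ParserDbService is implemented.

```typescript
// Reduced record shape, keeping only the fields needed for the illustration.
interface ParserItem {
  channel: string;
  messageId: number;
  note: string;
}

class InMemoryParserStore {
  private readonly rows = new Map<string, ParserItem>();

  // Inserts or overwrites the row for { channel, messageId } and reports which happened.
  upsert(item: ParserItem): "inserted" | "updated" {
    const key = `${item.channel}:${item.messageId}`;
    const existed = this.rows.has(key);
    this.rows.set(key, { ...item });
    return existed ? "updated" : "inserted";
  }

  find(channel: string, messageId: number): ParserItem | undefined {
    return this.rows.get(`${channel}:${messageId}`);
  }

  get size(): number {
    return this.rows.size;
  }
}
```

Upserting the same message twice leaves one row, and an edited message replaces the earlier note, which mirrors the three properties listed above.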
