parser-items MongoDB Collection Schema and Indexes

parser-items is the raw ingestion collection. The crawler upserts messages here; the signal job reads unvisited rows and processes them through the LLM. Each document is a fully parsed trading signal extracted from a Telegram channel message, ready for risk evaluation. The visited boolean is the state-machine flag that separates the ingestion stage from the enrichment stage: once SignalJobService successfully produces a screen-items document, it flips visited to true so the row is never reprocessed.

Schema Definition

Full IParserDto interface and ParserSchema from packages/core/src/schema/Parser.schema.ts:

import mongoose, { Document, Schema } from "mongoose";

interface IParserDto {
  channel:     string;
  source:      string;
  messageId:   number;
  publishedAt: Date;
  note:        string;

  symbol:    string;
  direction: "long" | "short";
  entry:     { from: number; to: number };
  targets:   number[];
  stoploss:  number;

  content: unknown;
}

interface ParserDocument extends IParserDto, Document {
  visited: boolean;
}

interface IParserRow extends IParserDto {
  id:          string;
  visited:     boolean;
  createDate:  Date;
  updatedDate: Date;
}

const DIRECTION_ENUM = ["long", "short"] as const;

const ParserSchema: Schema<ParserDocument> = new Schema(
  {
    channel:     { type: String, required: true, index: true },
    source:      { type: String, required: true, index: true },
    messageId:   { type: Number, required: true, index: true },
    publishedAt: { type: Date,   required: true, index: true },
    note:        { type: String, required: true },

    symbol:    { type: String, required: true, index: true },
    direction: { type: String, required: true, enum: DIRECTION_ENUM },
    entry: {
      from: { type: Number, required: true },
      to:   { type: Number, required: true },
    },
    targets:  { type: [Number], required: true },
    stoploss: { type: Number,   required: true },

    content: { type: Schema.Types.Mixed, required: true },
    visited: { type: Boolean, required: true, default: false, index: true },
  },
  { timestamps: { createdAt: "createDate", updatedAt: "updatedDate" }, minimize: false }
);

ParserSchema.index({ channel: 1, messageId: 1 }, { unique: true });
ParserSchema.index({ symbol: 1, publishedAt: -1 });

const ParserModel = mongoose.model<ParserDocument>("parser-items", ParserSchema);

export { ParserModel, IParserDto, IParserRow };

Fields

Message Identity

channel

string

required

Telegram channel name from which the message was scraped (e.g., "crypto_yoda_channel"). Part of the unique compound index with messageId. Indexed for channel-scoped queries.

source

string

required

Source identifier for the channel. Set equal to channel by the current crawler implementation. Indexed separately to support potential future multi-source setups.

messageId

number

required

Telegram’s numeric message ID within the channel. Combined with channel in a unique compound index to ensure idempotent upserts — scraping the same message twice produces a single document.

publishedAt

Date

required

The timestamp at which the original Telegram message was published. Used as the time axis for 4-hour window queries and backtest frame filtering. Indexed.

note

string

required

Full original message text preserved verbatim. Used for audit, replay, and to populate the note field of the resulting screen-items document.

Parsed Signal

symbol

string

required

Trading symbol with USDT suffix, e.g., "SOLUSDT", "BTCUSDT". The crawler appends "USDT" to the raw symbol extracted from the message. Indexed and used as the primary dimension in compound time-range queries.

direction

"long" | "short"

required

Trade direction parsed from Russian keywords in the original message (ЛОНГ → "long", ШОРТ → "short").

entry.from

number

required

Lower bound of the entry price zone extracted from the message.

entry.to

number

required

Upper bound of the entry price zone extracted from the message.

targets

number[]

required

Array of take-profit target prices in ascending order as parsed from the message. Passed directly to RiskOutlineContract as the targets argument.

stoploss

number

required

Stop-loss price level parsed from the message. Passed directly to RiskOutlineContract as the stoploss argument.
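
The symbol and direction rules described above can be sketched as two small pure functions. This is only an illustration, not the crawler's actual parser: the helper names parseDirection and toUsdtSymbol, the case-insensitive keyword match, and the trimming and upper-casing of the raw ticker are all assumptions made for the example.

```typescript
type Direction = "long" | "short";

// Hypothetical helper: maps the Russian keyword found in a message to a direction.
// Matching is case-insensitive here, which is an assumption of this sketch.
function parseDirection(text: string): Direction | null {
  const upper = text.toUpperCase();
  if (upper.includes("ЛОНГ")) return "long";
  if (upper.includes("ШОРТ")) return "short";
  return null; // no recognizable keyword in the message
}

// Hypothetical helper: appends the "USDT" suffix to the raw ticker.
// Trimming and upper-casing are assumptions, not documented crawler behavior.
function toUsdtSymbol(rawSymbol: string): string {
  return `${rawSymbol.trim().toUpperCase()}USDT`;
}
```

Returning null for an unrecognized keyword lets a caller reject the message before it reaches the schema, where direction is required and restricted to the two enum values.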

Processing State

visited

boolean

required

Tracks whether this row has been processed by SignalJobService. Defaults to false on insert. Flipped to true by ParserDbService.markVisited() after a successful screen-items document has been created. Indexed so that queries for unprocessed rows can use the index instead of scanning the whole collection.

content

Mixed

required

Raw parsed data object produced by the screen parser (CryptoYodaScreenService). Stored as Schema.Types.Mixed and propagated to screen-items.content unchanged.

createDate

Date

Auto-managed by Mongoose timestamps. Set once at document creation. Mapped from the default createdAt via { createdAt: "createDate" }.

updatedDate

Date

Auto-managed by Mongoose timestamps. Updated on every write. Mapped from the default updatedAt via { updatedAt: "updatedDate" }.

Indexes

| Index | Type | Purpose |
| --- | --- | --- |
| `{ channel: 1, messageId: 1 }` | Unique | Idempotent upserts: the same Telegram message is never stored twice |
| `{ symbol: 1, publishedAt: -1 }` | Compound | Efficient `findLast4HourRow` queries sorted by recency |
| `{ channel: 1 }` | Single-field | Channel-scoped filtering |
| `{ source: 1 }` | Single-field | Source-scoped filtering |
| `{ messageId: 1 }` | Single-field | Direct message-ID lookup |
| `{ publishedAt: 1 }` | Single-field | Time-range scans (backtest frame queries) |
| `{ symbol: 1 }` | Single-field | Symbol equality filter |
| `{ visited: 1 }` | Single-field | Fast scan for unprocessed rows in live mode |
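
To make the purpose of the `{ symbol: 1, publishedAt: -1 }` index concrete, here is an in-memory sketch of the kind of lookup it serves. The real `findLast4HourRow` is not shown in this page, so its semantics are assumed here: return the most recent row for a symbol published within the four hours before a reference time. The name findLast4HourRowSketch and the WindowRow type are hypothetical.

```typescript
interface WindowRow {
  symbol: string;
  publishedAt: Date;
}

const FOUR_HOURS_MS = 4 * 60 * 60 * 1000;

// Assumed semantics: newest row for `symbol` with publishedAt in [now - 4h, now].
// Against MongoDB, the same shape would be a filter on
// { symbol, publishedAt: { $gte: from, $lte: now } } sorted by { publishedAt: -1 }.
function findLast4HourRowSketch<T extends WindowRow>(
  rows: T[],
  symbol: string,
  now: Date,
): T | null {
  const to = now.getTime();
  const from = to - FOUR_HOURS_MS;
  const candidates = rows
    .filter((r) => {
      const t = r.publishedAt.getTime();
      return r.symbol === symbol && t >= from && t <= to;
    })
    // Newest first, mirroring the descending publishedAt order of the index.
    .sort((a, b) => b.publishedAt.getTime() - a.publishedAt.getTime());
  return candidates[0] ?? null;
}
```

Equality on symbol followed by a range and sort on publishedAt matches the field order of the compound index, which is why a single index can serve both the filter and the sort.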

The unique compound index on { channel: 1, messageId: 1 } is the crawler’s primary idempotency guard. ParserDbService.create() uses findOneAndUpdate with $setOnInsert, so re-crawling the same channel for the same day returns the existing documents without overwriting any data.
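
The insert-only behavior of $setOnInsert can be illustrated without a database. The sketch below is an in-memory stand-in, not ParserDbService itself: the class InMemoryParserStore, its upsert method, and the reduced field set are hypothetical and exist only to show that a second upsert with the same (channel, messageId) key leaves the stored row untouched.

```typescript
interface ParserInput {
  channel: string;
  messageId: number;
  note: string;
}

interface StoredRow extends ParserInput {
  visited: boolean;
}

// In-memory stand-in for the collection, keyed like the unique compound index.
class InMemoryParserStore {
  private rows = new Map<string, StoredRow>();

  // Emulates findOneAndUpdate(filter, { $setOnInsert: dto }, { upsert: true }):
  // fields are written only when no document matches the filter.
  upsert(dto: ParserInput): StoredRow {
    const key = `${dto.channel}:${dto.messageId}`;
    const existing = this.rows.get(key);
    if (existing) {
      return existing; // match found: nothing is overwritten
    }
    const created: StoredRow = { ...dto, visited: false }; // schema default
    this.rows.set(key, created);
    return created;
  }

  get size(): number {
    return this.rows.size;
  }
}
```

Because existing rows are returned as-is, a re-crawl cannot reset visited back to false, which keeps already-processed rows out of the signal job.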

State Machine

The visited flag governs the two-stage pipeline lifecycle for every parser row:

┌──────────────┐    Crawler upserts     ┌─────────────────────────────┐
│  Telegram    │ ──────────────────────▶│  parser-items               │
│  Channel     │                        │  visited: false  ◀── default │
└──────────────┘                        └───────────┬─────────────────┘
                                                    │
                                    SignalJobService reads
                                    findAllByVisited(false)
                                                    │
                                                    ▼
                                        ┌───────────────────┐
                                        │  RiskOutline LLM  │
                                        │  (agent-swarm-kit)│
                                        └────────┬──────────┘
                                                 │ success
                                                 ▼
                                        ┌──────────────────────┐
                                        │  screen-items        │
                                        │  (created via upsert)│
                                        └────────┬─────────────┘
                                                 │
                                    markVisited(row.id)
                                                 │
                                                 ▼
                                        ┌────────────────────────────┐
                                        │  parser-items              │
                                        │  visited: true  ◀── updated │
                                        └────────────────────────────┘

1. Ingestion: The crawler upserts new messages with visited: false (the schema default).
2. Job pickup: SignalJobService queries parserDbService.findAllByVisited(false) to find all pending rows.
3. Deduplication check: Before calling the LLM, screenDbService.findByParserItem(row.id) is checked. If a screen-items document already exists, the row is skipped even if visited is still false.
4. Enrichment: SignalLogicService.execute(row) runs the RiskOutline outline and returns an IScreenDto.
5. Persistence: The IScreenDto is written to screen-items and markVisited(row.id) sets visited: true.
6. Idempotency: Re-running the job skips all rows where visited: true, making repeated cron triggers safe.

The deduplication check in step 3 (findByParserItem) provides a safety net for crash-recovery scenarios where screenDbService.create() succeeded but markVisited() had not yet been called before the process exited.
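
The lifecycle above, including the crash-recovery case, can be sketched as a single loop over in-memory stand-ins. This is not the SignalJobService source: runSignalJobSketch, JobParserRow, JobScreenItem, and the synchronous enrich callback are hypothetical, and the real services are presumably asynchronous. The source only says that an already-screened row is skipped, so the sketch skips it without touching visited.

```typescript
interface JobParserRow {
  id: string;
  symbol: string;
  visited: boolean;
}

interface JobScreenItem {
  parserItemId: string;
  symbol: string;
}

// Runs one pass of the job and returns how many rows were sent to enrichment.
function runSignalJobSketch(
  parserRows: JobParserRow[],
  screenItems: JobScreenItem[],
  enrich: (row: JobParserRow) => JobScreenItem, // stand-in for the LLM call
): number {
  let enriched = 0;
  // Job pickup: the equivalent of findAllByVisited(false).
  const pending = parserRows.filter((row) => !row.visited);
  for (const row of pending) {
    // Deduplication check: the equivalent of findByParserItem(row.id).
    const alreadyScreened = screenItems.some((s) => s.parserItemId === row.id);
    if (alreadyScreened) {
      continue; // skipped; no second LLM call for this row
    }
    // Enrichment and persistence.
    screenItems.push(enrich(row));
    enriched += 1;
    // The equivalent of markVisited(row.id), only after the write succeeded.
    row.visited = true;
  }
  return enriched;
}
```

Setting visited only after the screen item is stored means a crash between the two writes leaves a row that looks pending, and the deduplication check is what prevents it from being enriched twice.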
