The Single Search page is listed in the ThreatDetect sidebar alongside Organisational Search and Exploratory Data Analysis. It is designed for targeted analysis of one individual, running their data through the same XGBoost + Isolation Forest pipeline as a CSV batch scan and returning a prediction, risk probability, anomaly score, confidence, and SHAP explanation for that record.
Single Search uses the same feature engineering, encoding, scaling, and model inference pipeline as the CSV analysis. The inputs required and the interpretation of outputs are identical — see Input data schema for the required columns and Interpreting results for what each output field means.
Single Search is best suited for targeted follow-up on specific individuals — for example, re-running analysis on an employee after their data has been updated, or spot-checking a person flagged in an earlier organisational scan with revised or corrected values. For analysing your entire workforce in one pass, use the Organisational Search via CSV page instead.

Input fields

The Single Search form requires the same columns as the CSV upload. You must supply values for all fields the model expects:
| Field | Type | Description |
| --- | --- | --- |
| employee_campus | string | Campus location (must match training set values) |
| total_printed_pages | numeric | Total pages the employee has printed |
| num_printed_pages_off_hours | numeric | Pages printed outside standard business hours |
| total_files_burned | numeric | Files written to removable media |
| has_criminal_record | binary (0/1) | Whether the employee has a criminal record |
| is_contractor | binary (0/1) | Whether the employee is a contractor |
| has_foreign_citizenship | binary (0/1) | Whether the employee holds foreign citizenship |
| entry_during_weekend | binary (0/1) | Whether the employee accessed the building on weekends |
| late_exit_flag | binary (0/1) | Whether the employee regularly exits late |
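
Before submitting a record, it can help to check that every field is present and has the expected type. The following is a minimal sketch of such a check, not ThreatDetect's own validation: the `validate_record` helper, the non-negativity rule for numeric fields, and the example values (including the campus name) are assumptions made for illustration.

```python
from typing import Dict, List

# Field names and types are taken from the input fields listed above.
REQUIRED_FIELDS: Dict[str, str] = {
    "employee_campus": "string",
    "total_printed_pages": "numeric",
    "num_printed_pages_off_hours": "numeric",
    "total_files_burned": "numeric",
    "has_criminal_record": "binary",
    "is_contractor": "binary",
    "has_foreign_citizenship": "binary",
    "entry_during_weekend": "binary",
    "late_exit_flag": "binary",
}


def validate_record(record: dict) -> List[str]:
    """Return a list of problems found in the record (empty if none)."""
    errors = []
    for field, kind in REQUIRED_FIELDS.items():
        if field not in record:
            errors.append(f"missing field: {field}")
            continue
        value = record[field]
        if kind == "string":
            if not isinstance(value, str) or not value.strip():
                errors.append(f"{field} must be a non-empty string")
        elif kind == "numeric":
            # Assumption: counts cannot be negative.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or value < 0:
                errors.append(f"{field} must be a non-negative number")
        elif kind == "binary":
            if isinstance(value, bool) or value not in (0, 1):
                errors.append(f"{field} must be 0 or 1")
    return errors


# Hypothetical record; every value here is made up.
example_record = {
    "employee_campus": "Campus A",
    "total_printed_pages": 120,
    "num_printed_pages_off_hours": 15,
    "total_files_burned": 0,
    "has_criminal_record": 0,
    "is_contractor": 1,
    "has_foreign_citizenship": 0,
    "entry_during_weekend": 1,
    "late_exit_flag": 0,
}

print(validate_record(example_record))
```

This check covers only presence and basic types. It cannot confirm that `employee_campus` matches a value seen in the training set, because that list is not given here.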

Understanding the output

After submitting the record, ThreatDetect returns the following fields:

Prediction

Either "Malicious" or "Normal", determined by comparing Risk_Prob against the model’s best_threshold. If Risk_Prob ≥ best_threshold, the employee is classified as Malicious.
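
The thresholding rule can be sketched in a few lines. The `classify` function name and the threshold value of 0.42 are hypothetical; the actual `best_threshold` comes from the trained model.

```python
def classify(risk_prob: float, best_threshold: float) -> str:
    """Apply the rule described above: Malicious if Risk_Prob >= best_threshold."""
    if not 0.0 <= risk_prob <= 1.0:
        raise ValueError("risk_prob must be between 0 and 1")
    return "Malicious" if risk_prob >= best_threshold else "Normal"


best_threshold = 0.42  # hypothetical value, for illustration only

print(classify(0.42, best_threshold))  # exactly at the threshold counts as Malicious
print(classify(0.41, best_threshold))
```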

Risk probability

Risk_Prob is a continuous value between 0 and 1 representing the XGBoost model’s estimated probability of malicious behavior. Values close to 1 indicate high risk; values close to 0 indicate low risk.

Confidence

Confidence tells you how certain the model is about the class it assigned:
  • If the prediction is Malicious: Confidence = Risk_Prob
  • If the prediction is Normal: Confidence = 1 − Risk_Prob
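A confidence score close to 0.5 means the employee is near the decision boundary, provided best_threshold is itself close to 0.5. In general, the closer Risk_Prob is to best_threshold, the more the result warrants human review.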
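
The two confidence rules translate directly into code. This is a sketch under stated assumptions: `confidence` and `distance_to_threshold` are hypothetical helper names, and the second helper is an illustrative addition (not a ThreatDetect output) for measuring how close a record sits to the decision boundary.

```python
def confidence(prediction: str, risk_prob: float) -> float:
    """Confidence as defined above: Risk_Prob for Malicious, 1 - Risk_Prob for Normal."""
    if prediction == "Malicious":
        return risk_prob
    if prediction == "Normal":
        return 1.0 - risk_prob
    raise ValueError(f"unknown prediction: {prediction!r}")


def distance_to_threshold(risk_prob: float, best_threshold: float) -> float:
    """Illustrative helper: how far Risk_Prob lies from the decision boundary."""
    return abs(risk_prob - best_threshold)


print(confidence("Malicious", 0.8))
print(confidence("Normal", 0.25))
print(distance_to_threshold(0.5, 0.5))
```

A small distance to the threshold flags a borderline record regardless of where the threshold sits.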

Anomaly score

Anomaly_Score is the IsolationForest decision_function output. More negative values indicate a behavioral profile that is more anomalous relative to the training population. This score is appended as a feature to the XGBoost input, so it directly influences Risk_Prob.
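
As a rough illustration of how the score is read and passed on, the sketch below compares two scores and appends one to a feature mapping. The helper names and all values are hypothetical, and ThreatDetect's actual feature layout may differ.

```python
from typing import Dict


def more_anomalous(score_a: float, score_b: float) -> bool:
    """True if score_a indicates a more anomalous profile (more negative) than score_b."""
    return score_a < score_b


def with_anomaly_feature(features: Dict[str, float], anomaly_score: float) -> Dict[str, float]:
    """Return a copy of the features with Anomaly_Score appended as the last entry."""
    extended = dict(features)
    extended["Anomaly_Score"] = anomaly_score
    return extended


# Hypothetical, already-scaled feature values.
features = {"total_printed_pages": 0.7, "total_files_burned": -0.2}
extended = with_anomaly_feature(features, -0.12)

print(list(extended))
print(more_anomalous(-0.12, 0.05))
```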

SHAP per-instance explanation

Alongside the numeric outputs, ThreatDetect displays a SHAP breakdown for the employee:
  • A human-readable list of the top features increasing risk (positive SHAP values) and reducing risk (negative SHAP values), showing the employee’s actual value for each feature.
  • A SHAP bar chart plotting the top 10 features by SHAP value. Red bars push the prediction toward Malicious; green bars push it toward Normal.
Positive SHAP values mean that feature’s value raises the probability of a Malicious prediction relative to the model’s baseline. Negative SHAP values lower it.
A SHAP value of exactly 0 means that feature had no marginal influence on this employee’s prediction beyond the model’s average baseline. The feature may still be globally important — it simply did not deviate from the baseline for this specific record.
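
The sign convention can be illustrated by splitting a set of per-instance SHAP values into risk-increasing and risk-reducing groups. This is a minimal sketch, not ThreatDetect's rendering code: the `split_shap` helper and the SHAP values are hypothetical, and ranking by absolute value is an assumption about how the "top" features are chosen.

```python
from typing import Dict, List, Tuple


def split_shap(
    shap_values: Dict[str, float], top_n: int = 10
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (risk-increasing, risk-reducing) features among the top_n by |SHAP|."""
    ranked = sorted(shap_values.items(), key=lambda item: abs(item[1]), reverse=True)
    ranked = ranked[:top_n]
    increasing = [(name, value) for name, value in ranked if value > 0]
    reducing = [(name, value) for name, value in ranked if value < 0]
    # Features with a SHAP value of exactly 0 land in neither group.
    return increasing, reducing


# Hypothetical SHAP values for one employee.
example_shap = {
    "total_files_burned": 0.31,
    "num_printed_pages_off_hours": 0.12,
    "is_contractor": -0.08,
    "late_exit_flag": 0.0,
}

increasing, reducing = split_shap(example_shap)
print("Increasing risk:", increasing)
print("Reducing risk:", reducing)
```
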
The global feature importance chart reflects how frequently each feature is used in tree splits across the entire model. Per-instance SHAP values reflect the actual influence on one specific employee’s prediction. A globally important feature can have a low SHAP value for an individual if their value for that feature is unremarkable.
A Malicious prediction is a probabilistic risk indicator, not a confirmed finding. Always combine ThreatDetect output with human judgement, supporting evidence, and your organisation’s review process before taking any action based on a result.
