The Organisational Search via CSV page lets you analyse an entire workforce at once. You upload a CSV containing employee behavioral data, ThreatDetect runs every record through the XGBoost model and IsolationForest scorer, and you receive a complete breakdown of risk by employee, including visualisations, SHAP explanations, and a downloadable results file.
Run a batch analysis
Navigate to Organisational Search via CSV
Open ThreatDetect in your browser and use the sidebar dropdown to select Organisational Search via CSV.
Upload your CSV file
Click Browse files (or drag and drop) in the file uploader. ThreatDetect accepts .csv files only. Once loaded, the app displays a preview of the first 10 rows and two summary metrics: Total Records and, if the employee_campus column is present, Unique Campuses.
Click Run Threat Detection
Click the Run Threat Detection button. The app processes every record — encoding categorical columns, scaling numeric columns, engineering derived features, and computing both XGBoost probabilities and IsolationForest anomaly scores. A spinner indicates that analysis is in progress.
Review the results
Once complete, the app displays an Organisational Threat Summary with four metrics, two charts, a feature importance chart, and a SHAP summary plot. See Understanding the results below for details on each output.
Required CSV columns
Your CSV must include the following columns. ThreatDetect raises an error if any are missing.

| Column | Type | Description |
|---|---|---|
| employee_campus | string | Campus or office location of the employee (must match training set values) |
| total_printed_pages | numeric | Total pages printed by the employee |
| num_printed_pages_off_hours | numeric | Pages printed outside standard hours |
| total_files_burned | numeric | Number of files written to removable media |
| has_criminal_record | binary (0/1) | Whether the employee has a criminal record |
| is_contractor | binary (0/1) | Whether the employee is a contractor |
| has_foreign_citizenship | binary (0/1) | Whether the employee holds foreign citizenship |
| entry_during_weekend | binary (0/1) | Whether the employee accessed the building on weekends |
| late_exit_flag | binary (0/1) | Whether the employee regularly exits late |
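Checking the header before you upload can save a failed run. The following is a minimal sketch using only the Python standard library; the `missing_columns` helper and the sample data (including the campus value `North`) are hypothetical and not part of ThreatDetect.

```python
import csv
import io

# Column names copied from the required-columns table.
REQUIRED_COLUMNS = [
    "employee_campus",
    "total_printed_pages",
    "num_printed_pages_off_hours",
    "total_files_burned",
    "has_criminal_record",
    "is_contractor",
    "has_foreign_citizenship",
    "entry_during_weekend",
    "late_exit_flag",
]


def missing_columns(csv_text):
    """Return the required columns absent from the CSV header, in table order."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS if col not in present]


# Hypothetical sample with only three of the nine required columns.
sample = "employee_campus,total_printed_pages,is_contractor\nNorth,12,0\n"
print(missing_columns(sample))
```

An empty list means every required column is present. This sketch checks column names only; it does not verify types or that employee_campus values match the training set.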
Understanding the results
Summary metrics
After detection runs, four metrics appear at the top of the results section:

| Metric | Description |
|---|---|
| Total Employees | Total number of records processed |
| Malicious | Count of employees predicted as malicious, with percentage in the delta label |
| Normal | Count of employees predicted as normal, with percentage in the delta label |
| Avg. Confidence | Mean confidence score across all employees (see confidence formula) |
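If you want to reproduce these four metrics from the downloaded results file, the arithmetic can be sketched as below. The `summarise` helper and the sample rows are illustrative assumptions; only the Prediction and Confidence column names come from the results file.

```python
def summarise(rows):
    """Compute the four summary metrics from a list of result rows.

    Each row is a dict with "Prediction" ("Malicious" or "Normal") and
    "Confidence" (a float), mirroring columns in the results file.
    """
    total = len(rows)
    malicious = sum(1 for r in rows if r["Prediction"] == "Malicious")
    normal = sum(1 for r in rows if r["Prediction"] == "Normal")

    def pct(count):
        # Percentage of all records; 0.0 for an empty batch.
        return 100.0 * count / total if total else 0.0

    avg_conf = sum(r["Confidence"] for r in rows) / total if total else 0.0
    return {
        "total": total,
        "malicious": malicious,
        "malicious_pct": pct(malicious),
        "normal": normal,
        "normal_pct": pct(normal),
        "avg_confidence": avg_conf,
    }


# Hypothetical rows for illustration only.
rows = [
    {"Prediction": "Malicious", "Confidence": 0.75},
    {"Prediction": "Normal", "Confidence": 1.0},
    {"Prediction": "Normal", "Confidence": 0.5},
    {"Prediction": "Normal", "Confidence": 0.75},
]
print(summarise(rows))
```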
Charts
Threat Prediction Count: A bar chart showing how many employees were classified as Malicious versus Normal. Use this for a quick visual split.
Risk Probability Distribution: A histogram of Risk_Prob values (0–1) for all employees. A vertical dashed red line marks the model’s decision threshold. Employees to the right of the line are classified as Malicious. The distribution shape reveals whether most employees cluster far from the threshold (clear cases) or are concentrated near it (uncertain cases).
Global Feature Importance (Top 15) — A horizontal bar chart of the top 15 XGBoost feature importances (F-score). These are the features that most frequently split the decision trees across the entire model, giving you a global view of what drives predictions at the organisational level.
Global SHAP Summary Plot — A SHAP beeswarm plot computed over a random sample of up to 100 records. Each dot represents one employee for one feature. The horizontal position shows the SHAP value (positive = pushes toward Malicious), and the colour shows the raw feature value (red = high, blue = low). This plot reveals how feature values relate to risk direction across the organisation.
Organisational risk insight
Below the charts, ThreatDetect displays one of two messages:

- Warning: shown if at least one employee is predicted Malicious, listing the count, percentage, and the three features with the highest global importance.
- Success: shown if no employees are flagged, confirming the organisation appears clean.
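The choice between the two messages can be sketched as a small function. This is an illustration of the rule described above, not the app's code: `risk_insight`, the message wording, and the importance scores are all hypothetical.

```python
def risk_insight(predictions, importances):
    """Return (level, message) for a batch, where level is "warning" or "success".

    predictions: list of "Malicious" / "Normal" labels.
    importances: dict mapping feature name to a global importance score.
    """
    total = len(predictions)
    flagged = sum(1 for p in predictions if p == "Malicious")
    if flagged == 0:
        return "success", "No employees were flagged as Malicious."

    # Three features with the highest global importance, highest first.
    top3 = sorted(importances, key=importances.get, reverse=True)[:3]
    pct = 100.0 * flagged / total
    message = (
        f"{flagged} of {total} employees ({pct:.1f}%) flagged as Malicious. "
        f"Top features: {', '.join(top3)}"
    )
    return "warning", message


# Hypothetical importance scores for illustration only.
scores = {
    "total_files_burned": 30,
    "is_contractor": 5,
    "late_exit_flag": 12,
    "total_printed_pages": 20,
}
print(risk_insight(["Normal", "Malicious", "Normal", "Normal"], scores))
```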
Per-employee explanation
After running detection, expand Explain a specific employee (SHAP per instance) to drill into any individual record.

- Select an employee from the dropdown. Each option shows the employee index, their prediction, and their confidence score.
- The app displays three metrics: Prediction, Confidence, and Anomaly Score for that employee.
- A human-readable list explains the top features pushing toward Malicious (increases risk) and toward Normal (reduces risk), showing the original feature value for each.
- A SHAP bar chart plots the top 10 features by SHAP value. Red bars push toward Malicious; green bars push toward Normal.
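The split between risk-increasing and risk-reducing features follows from the sign of each SHAP value. Below is a minimal sketch of that grouping, assuming you already have one employee's SHAP values as a dictionary; the `split_contributions` helper, the `top_n` default, and the sample values are hypothetical.

```python
def split_contributions(shap_values, top_n=3):
    """Split one employee's SHAP values into risk-increasing and risk-reducing lists.

    shap_values: dict mapping feature name to its SHAP value.
    Positive values push toward Malicious, negative values toward Normal;
    features with a value of exactly zero appear in neither list.
    """
    increases = sorted(
        ((name, value) for name, value in shap_values.items() if value > 0),
        key=lambda item: item[1],
        reverse=True,  # largest positive contribution first
    )[:top_n]
    reduces = sorted(
        ((name, value) for name, value in shap_values.items() if value < 0),
        key=lambda item: item[1],  # most negative contribution first
    )[:top_n]
    return increases, reduces


# Hypothetical SHAP values for a single employee.
example = {
    "total_files_burned": 0.42,
    "late_exit_flag": 0.10,
    "is_contractor": -0.25,
    "total_printed_pages": -0.05,
    "has_criminal_record": 0.0,
}
increases, reduces = split_contributions(example)
print("Increases risk:", increases)
print("Reduces risk:", reduces)
```

In the bar chart, the first group corresponds to the red bars and the second to the green bars.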
Downloading results
The Download results as CSV button inside the Detailed Results Table expander saves a file named threat_analysis_results.csv. This file includes all original columns from your upload plus:

- Prediction: "Malicious" or "Normal"
- Risk_Prob: probability of being malicious (0–1)
- Anomaly_Score: IsolationForest decision function output
- Confidence: model certainty for the assigned class
Confidence formula:
Confidence = Risk_Prob when the prediction is Malicious, and Confidence = 1 − Risk_Prob when the prediction is Normal. This means Confidence always represents how certain the model is about whichever class it chose, not the raw probability of being malicious.
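The formula can be expressed in a few lines of Python. This is a sketch under two assumptions that the text above does not confirm: the `threshold` default of 0.5 is a placeholder rather than the app's actual decision threshold, and a Risk_Prob exactly equal to the threshold is treated as Malicious.

```python
def classify(risk_prob, threshold=0.5):
    """Return (prediction, confidence) for a single Risk_Prob value.

    threshold=0.5 is an assumed placeholder, not the app's documented value.
    """
    if risk_prob >= threshold:  # tie-breaking at the threshold is an assumption
        return "Malicious", risk_prob
    return "Normal", 1.0 - risk_prob


print(classify(0.9))   # high probability, classified Malicious
print(classify(0.25))  # low probability, classified Normal
print(classify(0.4, threshold=0.3))  # Malicious with a Confidence below 0.5
```

The last call shows a consequence of the formula: if the decision threshold is not 0.5, a record near the threshold can be assigned a class with a Confidence below 0.5.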