Get your fraud detection system up and running in just a few steps.

Prerequisites

Before you begin, ensure you have the following installed:
  • Python 3.8 - Required for compatibility with the pinned library versions (e.g. scikit-learn 0.22.1, XGBoost 0.90)
  • Miniconda or Anaconda - For creating isolated Python environments
  • Git - For cloning the repository

Setup Steps

1. Clone the repository

Clone the fraud detection project to your local machine:
git clone https://github.com/sujith52/fraud.git
cd fraud
2. Create conda environment

Create a new conda environment with Python 3.8:
conda create -n fraud-env python=3.8 -y
conda activate fraud-env
Using a dedicated environment ensures dependency isolation and prevents conflicts with other Python projects.
3. Install dependencies

Install all required packages from requirements.txt:
pip install -r requirements.txt
This will install:
  • Flask 1.1.1 (web framework)
  • scikit-learn 0.22.1 (ML algorithms)
  • XGBoost 0.90 (gradient boosting)
  • pandas 0.25.3 (data processing)
  • Flask-MonitoringDashboard 3.0.6 (monitoring)
  • And 40+ other dependencies
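If installation appears to succeed but imports fail later, it can help to compare what pip actually installed against the pins in requirements.txt. A minimal stdlib sketch (the `parse_pin` and `check_pins` helpers are illustrative, not part of the project):

```python
from importlib.metadata import version, PackageNotFoundError

def parse_pin(line: str):
    """Split a pinned requirement like 'Flask==1.1.1' into (name, version).
    Returns None for comments, blank lines, or unpinned entries."""
    line = line.split("#", 1)[0].strip()
    if "==" not in line:
        return None
    name, _, pinned = line.partition("==")
    return name.strip(), pinned.strip()

def check_pins(requirements_text: str):
    """Yield (name, pinned_version, installed_version) per pinned requirement;
    installed_version is None when the package is missing."""
    for raw in requirements_text.splitlines():
        pin = parse_pin(raw)
        if pin is None:
            continue
        name, pinned = pin
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        yield name, pinned, installed

# Usage:
# with open("requirements.txt") as f:
#     for name, pinned, installed in check_pins(f.read()):
#         if installed != pinned:
#             print(f"{name}: want {pinned}, have {installed}")
```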
4. Run the application

Start the Flask application:
python main.py
The server will start at http://127.0.0.1:5001
* Serving Flask app 'main' (lazy loading)
* Environment: production
* Debug mode: on
* Running on http://127.0.0.1:5001/ (Press CTRL+C to quit)
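Because the app lazy-loads, the server may take a moment before it accepts requests. A generic polling helper such as the following (a sketch; `wait_until` and `server_is_up` are not part of the project) can block a script until the endpoint answers:

```python
import time
import urllib.request
import urllib.error

def wait_until(check, timeout=30.0, interval=0.5):
    """Poll `check()` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)
    return False

def server_is_up(url="http://127.0.0.1:5001/"):
    """True if the Flask app answers at all (any HTTP status counts)."""
    try:
        urllib.request.urlopen(url, timeout=2)
        return True
    except urllib.error.HTTPError:
        return True   # server responded, just with an error status
    except OSError:
        return False  # connection refused: not up yet

# Example: wait_until(server_is_up, timeout=60)
```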

Train a Model

Before making predictions, you need to train a model on your insurance claims data.

Using the API

Send a POST request to the /train endpoint:
curl -X POST http://127.0.0.1:5001/train \
  -H "Content-Type: application/json" \
  -d '{"folderPath": "Training_Batch_Files/"}'
Training may take several minutes depending on the dataset size. The system will:
  1. Validate and preprocess the data
  2. Perform K-Means clustering
  3. Train XGBoost and SVM models per cluster
  4. Select and save the best model for each cluster
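The same request can be issued from Python with the standard library. The endpoint and payload shape below mirror the curl example; `build_json_request` is a hypothetical helper, not part of the project API:

```python
import json
import urllib.request

def build_json_request(url: str, payload: dict) -> urllib.request.Request:
    """Construct a POST request with a JSON body, mirroring the curl call."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_json_request(
    "http://127.0.0.1:5001/train",
    {"folderPath": "Training_Batch_Files/"},
)
# With the server running, send it (training can take several minutes):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```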

Make a Prediction

Once models are trained, you can predict fraud on new insurance claims.

Using the API

Send a POST request to the /predict endpoint:
curl -X POST http://127.0.0.1:5001/predict \
  -H "Content-Type: application/json" \
  -d '{"filepath": "Prediction_Batch_files/"}'

Response

Prediction File created at Prediction_Output_File/Predictions.csv!!!
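The response is a plain-text message naming the output file. When scripting against the endpoint, the path can be pulled out of that message with a small helper (`extract_output_path` is illustrative, not part of the API):

```python
import re

def extract_output_path(message: str):
    """Extract the CSV path from a response like
    'Prediction File created at Prediction_Output_File/Predictions.csv!!!'"""
    match = re.search(r"created at\s+(\S+?\.csv)", message)
    return match.group(1) if match else None
```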

View Results

Prediction results are saved to Prediction_Output_File/Predictions.csv. Sample values from the Predictions column:
Predictions
N
Y
N
N
Y
Where:
  • Y = Fraud detected
  • N = No fraud detected
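Once the file exists, the flagged share can be tallied with the standard library's csv module. A sketch, assuming the output contains the Predictions column shown above:

```python
import csv
from collections import Counter

def summarize_predictions(lines):
    """Count Y/N labels in the Predictions column of the output CSV."""
    reader = csv.DictReader(lines)
    counts = Counter(row["Predictions"] for row in reader)
    return {
        "fraud": counts.get("Y", 0),
        "clean": counts.get("N", 0),
        "total": sum(counts.values()),
    }

# Usage:
# with open("Prediction_Output_File/Predictions.csv") as f:
#     print(summarize_predictions(f))
```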

Access Monitoring Dashboard

View API performance metrics and usage statistics:
http://127.0.0.1:5001/dashboard
The dashboard provides:
  • Request/response times
  • Endpoint usage statistics
  • Error rates
  • Performance graphs
The monitoring dashboard uses Flask-MonitoringDashboard and stores metrics in flask_monitoringdashboard.db.

Next Steps

Installation Guide

Detailed installation instructions and troubleshooting

Training Guide

Learn about the model training pipeline

API Reference

Complete API documentation

Deployment

Deploy to production with Gunicorn

Troubleshooting

Port already in use
Change the port by setting the PORT environment variable:
PORT=8000 python main.py

Missing packages or import errors
Ensure you’ve activated the conda environment and installed all dependencies:
conda activate fraud-env
pip install -r requirements.txt

Training can’t find or rejects your data
Ensure your training data CSV files are in the Training_Batch_Files/ directory with the correct naming format: fraudDetection_[DATESTAMP]_[TIMESTAMP].csv
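The exact stamp lengths aren’t specified here, so a permissive check that treats both stamps as digit runs can still catch obvious misnames before training (`matches_naming_format` is a hypothetical helper):

```python
import re

# Assumes DATESTAMP and TIMESTAMP are runs of digits; tighten the pattern
# if the project's schema file fixes their exact lengths.
NAME_PATTERN = re.compile(r"^fraudDetection_\d+_\d+\.csv$")

def matches_naming_format(filename: str) -> bool:
    """True if filename looks like fraudDetection_[DATESTAMP]_[TIMESTAMP].csv."""
    return bool(NAME_PATTERN.match(filename))
```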
