Quickstart

Get your fraud detection system up and running in just a few steps.

Prerequisites

Before you begin, ensure you have the following installed:

Python 3.8 - Required for compatibility with the pinned library versions
Miniconda or Anaconda - For creating isolated Python environments
Git - For cloning the repository

Setup Steps

Clone the repository

Clone the fraud detection project to your local machine:

git clone https://github.com/sujith52/fraud.git
cd fraud

Create conda environment

Create a new conda environment with Python 3.8:

conda create -n fraud-env python=3.8 -y
conda activate fraud-env

Using a dedicated environment ensures dependency isolation and prevents conflicts with other Python projects.

Install dependencies

Install all required packages from requirements.txt:

pip install -r requirements.txt

This will install:

Flask 1.1.1 (web framework)
scikit-learn 0.22.1 (ML algorithms)
XGBoost 0.90 (gradient boosting)
pandas 0.25.3 (data processing)
Flask-MonitoringDashboard 3.0.6 (monitoring)
And 40+ other dependencies
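
To confirm the install picked up the pinned versions, a small check like the one below can help. It is only a sketch: the distribution names in `EXPECTED` are assumed to match the names on PyPI, and `find_mismatches` is a hypothetical helper, not part of the project.

```python
from importlib import metadata

# Pins copied from the list above; distribution names are assumed to match PyPI.
EXPECTED = {
    "Flask": "1.1.1",
    "scikit-learn": "0.22.1",
    "xgboost": "0.90",
    "pandas": "0.25.3",
    "Flask-MonitoringDashboard": "3.0.6",
}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `find_mismatches(EXPECTED)` inside the activated fraud-env environment should return an empty dict if everything installed as pinned.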

Run the application

Start the Flask application:

python main.py

The server starts at http://127.0.0.1:5001.

Expected Output

* Serving Flask app 'main' (lazy loading)
* Environment: production
* Debug mode: on
* Running on http://127.0.0.1:5001/ (Press CTRL+C to quit)

Train a Model

Before making predictions, you need to train a model on your insurance claims data.

Using the API

Send a POST request to the /train endpoint:

curl -X POST http://127.0.0.1:5001/train \
  -H "Content-Type: application/json" \
  -d '{"folderPath": "Training_Batch_Files/"}'

Training may take several minutes depending on the dataset size. The system will:

Validate and preprocess the data
Perform K-Means clustering
Train XGBoost and SVM models per cluster
Select and save the best model for each cluster
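
The same training request can be sent from Python with only the standard library. This is a minimal sketch that mirrors the curl call above; the function names and the 30-minute timeout are illustrative assumptions, chosen because training may run for several minutes.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5001"


def build_train_request(folder_path, base_url=BASE_URL):
    """Build the POST /train request with the same JSON body as the curl call."""
    body = json.dumps({"folderPath": folder_path}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/train",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def train(folder_path="Training_Batch_Files/", timeout=1800):
    """Send the training request and return the raw response text."""
    request = build_train_request(folder_path)
    # Training can run for minutes, so the timeout (an assumed value) is generous.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, `print(train())` blocks until training finishes and then prints whatever the endpoint returns.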

Make a Prediction

Once models are trained, you can predict fraud on new insurance claims.

Using the API

Send a POST request to the /predict endpoint:

curl -X POST http://127.0.0.1:5001/predict \
  -H "Content-Type: application/json" \
  -d '{"filepath": "Prediction_Batch_files/"}'

Response

Prediction File created at Prediction_Output_File/Predictions.csv!!!
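
A script that calls /predict usually needs the output path rather than the whole message. The sketch below sends the request and pulls the path out of the response text. It assumes the response always follows the wording shown above; `predict` and `extract_output_path` are hypothetical helper names.

```python
import json
import re
import urllib.request


def predict(filepath="Prediction_Batch_files/", base_url="http://127.0.0.1:5001"):
    """POST the same JSON body as the curl call and return the response text."""
    body = json.dumps({"filepath": filepath}).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def extract_output_path(message):
    """Return the CSV path from a 'Prediction File created at ...' message, or None."""
    match = re.search(r"created at\s+(\S+?\.csv)", message)
    return match.group(1) if match else None
```

With trained models and a running server, `extract_output_path(predict())` should give the path of the predictions file; a `None` result suggests the server replied with something other than the success message.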

View Results

The prediction results will be saved as a CSV file at Prediction_Output_File/Predictions.csv:

Predictions
N
Y
N
N
Y

Where:

Y = Fraud detected
N = No fraud detected
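
For a quick summary of how many claims were flagged, the file can be tallied with the standard library. This sketch assumes the CSV has a column named Predictions, as in the sample above; `summarize_predictions` is an illustrative helper.

```python
import csv
from collections import Counter


def summarize_predictions(csv_path="Prediction_Output_File/Predictions.csv"):
    """Count how many rows are labelled Y (fraud) and N (no fraud)."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        # Only the Predictions column is read; any extra columns are ignored.
        return Counter(row["Predictions"].strip() for row in reader)
```

For the five sample rows above, the counter would hold 2 for Y and 3 for N.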

Access Monitoring Dashboard

View API performance metrics and usage statistics:

http://127.0.0.1:5001/dashboard

The dashboard provides:

Request/response times
Endpoint usage statistics
Error rates
Performance graphs

The monitoring dashboard uses Flask-MonitoringDashboard and stores metrics in flask_monitoringdashboard.db.

Next Steps

Installation Guide

Detailed installation instructions and troubleshooting

Training Guide

Learn about the model training pipeline

API Reference

Complete API documentation

Deployment

Deploy to production with Gunicorn

Troubleshooting

Port 5001 is already in use

Change the port by setting the PORT environment variable:

PORT=8000 python main.py
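
For context, an entry point typically reads the variable along these lines. The project's main.py is not shown here, so treat this as a sketch of the general pattern rather than its actual code; `resolve_port` is a hypothetical name and the 5001 default is taken from the URL used throughout this page.

```python
import os


def resolve_port(environ=os.environ, default=5001):
    """Return the port from PORT, falling back to the default when it is unset."""
    raw = environ.get("PORT")
    if raw is None or raw.strip() == "":
        return default
    port = int(raw)  # raises ValueError for non-numeric input
    if not 0 < port < 65536:
        raise ValueError("PORT must be between 1 and 65535, got %d" % port)
    return port
```

The resolved value would then be passed to the server, for example `app.run(port=resolve_port())` in a Flask app.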

ModuleNotFoundError when running

Ensure you’ve activated the conda environment and installed all dependencies:

conda activate fraud-env
pip install -r requirements.txt

Training data not found

Ensure your training data CSV files are in the Training_Batch_Files/ directory with the correct naming format: fraudDetection_[DATESTAMP]_[TIMESTAMP].csv
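
To spot misnamed files before training, the folder can be scanned with a pattern check. This is a sketch under one stated assumption: DATESTAMP and TIMESTAMP are treated as runs of digits, because their exact lengths are not given on this page. `split_by_name` is an illustrative helper.

```python
import re
from pathlib import Path

# Assumption: both stamps are digit runs; tighten the pattern once the lengths are known.
PATTERN = re.compile(r"^fraudDetection_\d+_\d+\.csv$")


def split_by_name(folder="Training_Batch_Files/"):
    """Return (matching, non_matching) CSV file names found in the folder."""
    good, bad = [], []
    for path in sorted(Path(folder).glob("*.csv")):
        (good if PATTERN.match(path.name) else bad).append(path.name)
    return good, bad
```

Any names in the second list are likely to be rejected or skipped during validation and are worth renaming first.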
