Fraud Detection System

End-to-end machine learning system for detecting fraudulent insurance claims using Flask, XGBoost, and K-Means clustering

Quick Start

Get up and running with the fraud detection system in minutes

1. Clone the repository

Clone the fraud detection project to your local machine:
git clone https://github.com/sujith52/fraud.git
cd fraud

2. Set up environment

Create a conda environment with Python 3.8 and install dependencies:
conda create -n fraud-env python=3.8 -y
conda activate fraud-env
pip install -r requirements.txt
The project depends on specific library versions (Flask 1.1.1, scikit-learn 0.22.1, XGBoost 0.90) for compatibility, so install from requirements.txt rather than upgrading packages individually.

3. Start the Flask application

Run the application locally:
python main.py
The server will start at http://127.0.0.1:5001

4. Make your first prediction

Upload a CSV file of insurance claims through the web interface, or send the API the path to a folder of batch files:
curl -X POST http://127.0.0.1:5001/predict \
  -H "Content-Type: application/json" \
  -d '{"filepath": "Prediction_Batch_files/"}'
A successful request returns a message such as:
Prediction File created at Prediction_Output_File/Predictions.csv!!!
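
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the server from step 3 is running, and that /predict accepts the JSON body shown in the curl command. The helper names and the 60-second timeout are illustrative and not part of the project.

```python
import json
import urllib.request


def build_predict_request(base_url, filepath):
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps({"filepath": filepath}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(request, timeout=60):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# With the Flask server running locally:
# request = build_predict_request("http://127.0.0.1:5001", "Prediction_Batch_files/")
# print(send_predict_request(request))
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network call is made.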

Key Features

Everything you need for production-ready fraud detection

Multi-Model Detection

Automatically selects the better of XGBoost and SVM, using cross-validation and AUC scoring
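
To illustrate selection by AUC, here is a minimal pure-Python sketch. It is not the project's code: it trains nothing, scores each candidate on a single held-out set instead of cross-validation folds, and the model names and scores are made-up illustration data. With cross-validation, the same comparison would use the AUC averaged over folds.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs the scores rank correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else (0.5 if p == n else 0.0)
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def select_best_model(labels, scores_by_model):
    """Return the name of the model with the highest AUC, plus all AUCs."""
    aucs = {name: auc(labels, scores) for name, scores in scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical fraud probabilities from two already-trained models.
labels = [0, 0, 1, 1]
candidates = {
    "xgboost": [0.1, 0.4, 0.35, 0.8],
    "svm": [0.2, 0.6, 0.3, 0.5],
}
best_name, all_aucs = select_best_model(labels, candidates)
```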

K-Means Clustering

Segments the data into clusters, choosing the number of clusters with the elbow method, so that a model can be trained for each cluster
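
One common way to locate the elbow is to pick the point on the within-cluster sum of squares (WCSS) curve that lies farthest from the straight line joining its first and last points. The sketch below assumes the WCSS values have already been computed for consecutive values of k. The numbers are illustrative, and the project may locate the elbow differently.

```python
def elbow_k(wcss, k_start=1):
    """Return the k whose WCSS point lies farthest from the chord joining
    the first and last points of the curve."""
    if len(wcss) < 3:
        raise ValueError("need WCSS for at least three values of k")
    last = len(wcss) - 1
    dy = wcss[-1] - wcss[0]  # rise of the chord
    dx = last                # run of the chord (index units)
    best_index, best_distance = 0, -1.0
    for i, value in enumerate(wcss):
        # Proportional to the perpendicular distance from (i, value) to the chord.
        distance = abs(dy * i - dx * (value - wcss[0]))
        if distance > best_distance:
            best_index, best_distance = i, distance
    return k_start + best_index


# Hypothetical WCSS values for k = 1..6.
wcss_values = [1000, 400, 150, 120, 100, 90]
chosen_k = elbow_k(wcss_values)
```

Because only the ranking of distances matters, the result is unaffected by the scale of the WCSS values.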

Data Validation

Validates schema, file naming conventions, and data types before processing
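
The checks described above might look roughly like the following sketch. The file-name pattern, column names, and types are hypothetical placeholders. The project's actual schema and naming convention are not stated here and will differ.

```python
import csv
import io
import re

# Hypothetical naming rule and schema, for illustration only.
FILE_NAME_PATTERN = re.compile(r"claims_\d{8}_\d{6}\.csv")
EXPECTED_COLUMNS = {"policy_number": int, "claim_amount": float, "incident_type": str}


def validate_file_name(name):
    """Check the file name against the (hypothetical) naming convention."""
    return FILE_NAME_PATTERN.fullmatch(name) is not None


def validate_csv(text):
    """Return a list of problems; an empty list means the file passed."""
    reader = csv.DictReader(io.StringIO(text))
    # Schema check: column names and order must match exactly.
    if reader.fieldnames != list(EXPECTED_COLUMNS):
        return ["unexpected columns: %r" % (reader.fieldnames,)]
    problems = []
    # Type check: every value must parse as its column's type.
    for line_no, row in enumerate(reader, start=2):
        for column, cast in EXPECTED_COLUMNS.items():
            try:
                cast(row[column])
            except (TypeError, ValueError):
                problems.append(
                    "line %d: %s is not a valid %s" % (line_no, column, cast.__name__)
                )
    return problems
```

Returning a list of problems, instead of raising on the first one, lets a batch run report every rejected row in one pass.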

Batch Processing

Processes multiple insurance claims in batch mode and writes the predictions to a CSV file

Flask API

RESTful API endpoints for training models and generating predictions

Monitoring Dashboard

Built-in Flask monitoring dashboard for tracking API performance

Ready to detect fraud?

Follow the quickstart guide to set up the system and start detecting fraudulent insurance claims in minutes.

Get Started
