Fraud Detection System
End-to-end machine learning system for detecting fraudulent insurance claims using Flask, XGBoost, and K-Means clustering
Quick Start
Get up and running with the fraud detection system in minutes
Set up environment
Create a conda environment with Python 3.8 and install the dependencies.
The project pins specific library versions (Flask 1.1.1, scikit-learn 0.22.1, XGBoost 0.90) for compatibility.
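Because the project depends on these exact versions, it can help to confirm what is actually installed before starting the server. The following is a minimal sketch, not part of the project itself: the helper names `installed_version` and `find_mismatches` are hypothetical, and the distribution name `xgboost` is assumed to be the name XGBoost is installed under.

```python
from importlib import metadata  # standard library since Python 3.8

# Pinned versions as stated in this guide. The distribution names are an
# assumption about how the packages are registered in the environment.
PINNED = {"Flask": "1.1.1", "scikit-learn": "0.22.1", "xgboost": "0.90"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a tuple of (expected, found)."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

An empty result from `find_mismatches(PINNED)` means the environment matches the pinned versions.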
Start the Flask application
Run the application locally. The server will start at http://127.0.0.1:5001
Key Features
Everything you need for production-ready fraud detection
Multi-Model Detection
Automatically selects the better-performing model, XGBoost or SVM, using cross-validation and AUC scoring
K-Means Clustering
Segments data into clusters, with the number of clusters chosen by the elbow method, so that a model can be trained and tuned per cluster
Data Validation
Validates schema, file naming conventions, and data types before processing
Batch Processing
Processes multiple insurance claims in batch mode and writes the results as CSV output
Flask API
RESTful API endpoints for training models and generating predictions
Monitoring Dashboard
Built-in Flask monitoring dashboard for tracking API performance
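The elbow method mentioned under K-Means Clustering can be sketched in a few lines. This page does not say how the system locates the elbow, so the example below assumes one common heuristic: given the within-cluster sum of squares (WCSS) for k = 1, 2, ..., pick the k whose point lies farthest from the straight line joining the first and last points of the curve. The function name `elbow_k` and the sample values are illustrative only.

```python
def elbow_k(wcss):
    """Pick k at the elbow of a WCSS curve, where wcss[i] is the value for k = i + 1.

    Heuristic (an assumption, not the project's confirmed method): choose the
    point with the largest distance to the chord from the first to the last point.
    """
    n = len(wcss)
    if n < 3:
        raise ValueError("need WCSS values for at least three values of k")

    # Chord endpoints: (1, wcss[0]) and (n, wcss[-1]).
    x1, y1, x2, y2 = 1, wcss[0], n, wcss[-1]

    best_k, best_gap = 1, 0.0
    for i, y in enumerate(wcss):
        k = i + 1
        # Numerator of the point-to-line distance formula. The denominator is
        # the same for every point, so it does not affect which k is largest.
        gap = abs((y2 - y1) * k - (x2 - x1) * y + x2 * y1 - y2 * x1)
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


# Illustrative WCSS values for k = 1..6 (made up for this example).
print(elbow_k([1000, 400, 150, 120, 100, 90]))  # prints 3
```

In practice the WCSS values would come from fitting K-Means once per candidate k, for example from the `inertia_` attribute of scikit-learn's `KMeans`.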
Explore by Topic
Deep dive into specific areas of the system
System Architecture
Understand the ML pipeline from data ingestion to prediction serving
Data Preprocessing
Feature engineering, encoding, scaling, and handling missing values
Model Selection
Hyperparameter tuning with GridSearchCV for XGBoost and SVM
Production Deployment
Deploy to Heroku or your own infrastructure with Gunicorn
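To make the AUC-based choice between XGBoost and SVM concrete, here is a minimal, dependency-free sketch. It is an illustration under stated assumptions rather than the project's code: the function names `roc_auc` and `pick_best_model`, the model labels, and the scores are all hypothetical, and the pairwise AUC computation is written out for clarity (scikit-learn's `roc_auc_score` would normally be used instead).

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    example gets the higher score; tied pairs count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def pick_best_model(labels, scores_by_model):
    """Return (name of the model with the highest AUC, dict of all AUCs)."""
    aucs = {name: roc_auc(labels, scores) for name, scores in scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical held-out labels (1 = fraud) and predicted fraud scores.
labels = [0, 0, 1, 1]
candidates = {
    "xgboost": [0.1, 0.4, 0.35, 0.8],
    "svm": [0.2, 0.6, 0.5, 0.55],
}
best, aucs = pick_best_model(labels, candidates)
print(best, aucs)  # prints: xgboost {'xgboost': 0.75, 'svm': 0.5}
```

With cross-validation, the same comparison would be applied to the mean AUC across folds rather than to a single held-out set.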
Ready to detect fraud?
Follow the quickstart guide to set up the system and start detecting fraudulent insurance claims in minutes.