Algorithmic Trading with Machine Learning

Overview

This project was developed as the final project for the Programming Technologies degree. It builds an algorithmic trading strategy that uses unsupervised learning (K-Means) to analyze and select S&P500 assets, optimizes portfolios from the selected assets, and compares their performance against the index.

Project Context

Final project for the Programming Technician degree, focused on practical application of Machine Learning and Big Data in finance.

Objectives

The project aimed to accomplish the following goals:

Data Collection

Download and process historical price data from the S&P500

Technical Indicators

Calculate technical indicators and relevant features for each stock

Liquidity Selection

Select the 150 most liquid assets each month

Returns Calculation

Calculate monthly returns for different time horizons

Risk Factors

Download Fama-French factors and calculate rolling betas

Clustering & Optimization

Group similar assets using K-Means and build optimal portfolios (max Sharpe ratio)

Performance Comparison

Compare portfolio performance against the S&P500

Technologies & Libraries

The project leverages a comprehensive Python stack for financial analysis and machine learning:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Technology Stack

Python

Core programming language

Pandas & NumPy

Data manipulation and numerical computing

yfinance

Financial data acquisition

scikit-learn

Machine learning algorithms (K-Means)

PyPortfolioOpt

Portfolio optimization

pandas_ta

Technical analysis indicators

Matplotlib

Data visualization

Jupyter Notebook

Interactive development environment

Methodology

The project follows a systematic approach to algorithmic trading:

1. Data Download & Cleaning

Historical stock prices from the S&P500 were obtained and technical indicators were calculated:

Technical Indicators

Volatility: Measure of price variability
RSI (Relative Strength Index): Momentum oscillator
Bollinger Bands: Volatility bands
ATR (Average True Range): Volatility indicator
MACD (Moving Average Convergence Divergence): Trend-following momentum indicator
Volume in Dollars: Trading activity measure
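As a rough illustration of the indicators listed above, the sketch below computes RSI, Bollinger Bands, ATR and dollar volume with plain pandas on synthetic prices. The project lists pandas_ta for this step, so this is not its code: the helper names (`rsi`, `bollinger_bands`, `atr`), the window lengths (14 and 20) and the simple-moving-average smoothing are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def rsi(close: pd.Series, length: int = 14) -> pd.Series:
    """RSI from simple rolling means of gains and losses (assumed smoothing)."""
    delta = close.diff()
    gains = delta.clip(lower=0).rolling(length).mean()
    losses = (-delta.clip(upper=0)).rolling(length).mean()
    rs = gains / losses
    return 100 - 100 / (1 + rs)


def bollinger_bands(close: pd.Series, length: int = 20, n_std: float = 2.0) -> pd.DataFrame:
    """Rolling mean plus/minus n_std rolling standard deviations."""
    mid = close.rolling(length).mean()
    std = close.rolling(length).std()
    return pd.DataFrame(
        {"bb_low": mid - n_std * std, "bb_mid": mid, "bb_high": mid + n_std * std}
    )


def atr(high: pd.Series, low: pd.Series, close: pd.Series, length: int = 14) -> pd.Series:
    """Average true range as a simple rolling mean of the true range."""
    prev_close = close.shift(1)
    true_range = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()], axis=1
    ).max(axis=1)
    return true_range.rolling(length).mean()


# Synthetic daily data for one hypothetical stock (not real market data)
rng = np.random.default_rng(0)
dates = pd.bdate_range("2023-01-02", periods=120)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(dates)))), index=dates)
high = close * 1.01
low = close * 0.99
volume = pd.Series(rng.integers(1_000_000, 5_000_000, len(dates)), index=dates)

indicators = bollinger_bands(close)
indicators["rsi"] = rsi(close)
indicators["atr"] = atr(high, low, close)
indicators["dollar_volume"] = close * volume
```

The first rows of `indicators` are NaN until each rolling window has enough observations, so a real pipeline would drop or mask them before clustering.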

2. Liquidity Filtering

The 150 most liquid assets per month were selected to ensure realistic trading operations and minimize slippage.

Liquidity filtering is crucial for algorithmic trading strategies to ensure that positions can be entered and exited without significant market impact.
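One way to express this filter in pandas is to aggregate dollar volume by month and keep the top-ranked tickers within each month. The sketch below runs on synthetic data with a hypothetical `top_n_liquid` helper; the tickers, the use of mean monthly dollar volume as the liquidity measure, and the data layout are assumptions, and `n` is reduced to 3 so the toy example stays readable.

```python
import numpy as np
import pandas as pd


def top_n_liquid(dollar_volume: pd.DataFrame, n: int = 150) -> pd.Series:
    """Keep the n most liquid tickers in each month.

    dollar_volume: daily dollar volume with dates as index and tickers as columns.
    Returns a Series of mean monthly dollar volume indexed by (month, ticker).
    """
    monthly = dollar_volume.groupby(dollar_volume.index.to_period("M")).mean()
    long = monthly.stack()
    long.index.names = ["month", "ticker"]
    # Rank 1 is the most liquid ticker within its month
    ranks = long.groupby(level="month").rank(ascending=False, method="first")
    return long[ranks <= n]


# Synthetic daily dollar volume for five hypothetical tickers
rng = np.random.default_rng(1)
dates = pd.bdate_range("2023-01-02", "2023-03-31")
tickers = ["AAA", "BBB", "CCC", "DDD", "EEE"]
dollar_volume = pd.DataFrame(
    rng.uniform(1e6, 1e8, size=(len(dates), len(tickers))),
    index=dates,
    columns=tickers,
)

selected = top_n_liquid(dollar_volume, n=3)
```

Because the ranking is recomputed every month, the selected universe can change from one month to the next.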

3. Returns Calculation

Monthly returns were calculated for different time horizons, with outlier control to prevent skewed results.
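A minimal sketch of multi-horizon returns with outlier control might look like the following. The source does not specify the details, so the lags (1, 2, 3, 6, 9 and 12 months), the 0.5% clipping quantile, the per-month geometric normalization and the `horizon_returns` helper are all assumptions for illustration.

```python
import numpy as np
import pandas as pd


def horizon_returns(
    monthly_close: pd.Series,
    lags: tuple = (1, 2, 3, 6, 9, 12),
    outlier_cutoff: float = 0.005,
) -> pd.DataFrame:
    """Per-month returns over several lookback horizons, clipped at the tails."""
    out = {}
    for lag in lags:
        raw = monthly_close.pct_change(lag)
        lower = raw.quantile(outlier_cutoff)
        upper = raw.quantile(1 - outlier_cutoff)
        clipped = raw.clip(lower=lower, upper=upper)
        # Convert the lag-month return to an equivalent per-month rate
        out[f"return_{lag}m"] = (1 + clipped) ** (1 / lag) - 1
    return pd.DataFrame(out)


# Synthetic month-end prices for one hypothetical stock
rng = np.random.default_rng(5)
months = pd.period_range("2020-01", periods=36, freq="M")
monthly_close = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0.005, 0.05, len(months)))), index=months
)

returns = horizon_returns(monthly_close)
```

Clipping at quantiles caps extreme observations instead of dropping them, so the row count is unchanged.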

4. Risk Factors

Fama-French factors were downloaded and rolling betas were calculated for each asset to capture systematic risk exposure.

The Fama-French factors (Market, Size, Value) provide a multi-factor model for understanding stock returns beyond simple market beta.
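To make "rolling betas" concrete, the sketch below estimates factor loadings by ordinary least squares over a sliding window, using NumPy's `lstsq` on simulated factor data. The column names follow the usual Fama-French labels for the Market, Size and Value factors (Mkt-RF, SMB, HML); the 24-month window, the `rolling_betas` helper and the simulated returns are assumptions, and the project may have used a dedicated rolling-regression routine instead.

```python
import numpy as np
import pandas as pd


def rolling_betas(
    excess_returns: pd.Series, factors: pd.DataFrame, window: int = 24
) -> pd.DataFrame:
    """OLS factor loadings over a trailing window; NaN until the window is full."""
    X = np.column_stack([np.ones(len(factors)), factors.to_numpy()])
    y = excess_returns.to_numpy()
    betas = np.full((len(y), factors.shape[1]), np.nan)
    for end in range(window, len(y) + 1):
        coef, *_ = np.linalg.lstsq(X[end - window:end], y[end - window:end], rcond=None)
        betas[end - 1] = coef[1:]  # drop the intercept
    return pd.DataFrame(betas, index=factors.index, columns=factors.columns)


# Simulated monthly factor returns and one asset with known loadings
rng = np.random.default_rng(2)
months = pd.period_range("2015-01", periods=60, freq="M")
factors = pd.DataFrame(
    rng.normal(0, 0.03, size=(60, 3)), index=months, columns=["Mkt-RF", "SMB", "HML"]
)
true_betas = np.array([1.2, 0.4, -0.3])
asset = pd.Series(
    factors.to_numpy() @ true_betas + rng.normal(0, 0.001, 60), index=months
)

betas = rolling_betas(asset, factors)
```

Since the simulated asset is built from fixed loadings plus small noise, the estimated betas should stay close to `true_betas` once the window is full.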

5. Clustering & Optimization

K-Means clustering was applied to group similar assets based on their characteristics, and optimal portfolios were built using the efficient frontier to maximize the Sharpe ratio.

Key Steps

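from sklearn.cluster import KMeans
from pypfopt.efficient_frontier import EfficientFrontier

# Cluster assets with similar characteristics
# (k and scaled_features are assumed to be defined earlier)
kmeans = KMeans(n_clusters=k)
clusters = kmeans.fit_predict(scaled_features)

# Build optimal portfolio (max Sharpe ratio)
# (expected_returns and cov_matrix are assumed to be estimated beforehand)
ef = EfficientFrontier(expected_returns, cov_matrix)
weights = ef.max_sharpe()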
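The clustering step can be tried end to end on synthetic data. The sketch below standardizes two hypothetical features per asset (an RSI-like value and a market beta), then fits scikit-learn's `KMeans`. The feature choice, the three clusters, and the `random_state` and `n_init` settings are assumptions for illustration, not the project's configuration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic features for 60 hypothetical assets: columns are [rsi, market_beta].
# Three well-separated groups of 20 assets each.
rng = np.random.default_rng(3)
centers = np.array([[30.0, 0.5], [50.0, 1.0], [70.0, 1.5]])
features = np.vstack(
    [center + rng.normal(0, [2.0, 0.05], size=(20, 2)) for center in centers]
)

# Standardize so that neither feature dominates the Euclidean distance
scaled_features = StandardScaler().fit_transform(features)

# Fix random_state so the cluster assignment is reproducible
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(scaled_features)
```

Scaling matters here because K-Means relies on Euclidean distance: without it, the RSI-like column (tens of points) would swamp the beta column (fractions of a point).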

6. Results Comparison

The optimized portfolio performance was compared against the S&P500 index to evaluate the strategy’s effectiveness.
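A comparison like this usually comes down to cumulative return and a risk-adjusted measure such as the annualized Sharpe ratio. The sketch below computes both for two simulated daily return series. The data are synthetic and are not the project's results; the 252-day annualization, the zero risk-free rate and the `summarize` helper are assumptions.

```python
import numpy as np
import pandas as pd


def summarize(daily_returns: pd.DataFrame, periods_per_year: int = 252) -> pd.DataFrame:
    """Cumulative return, annualized volatility and Sharpe ratio per column."""
    cumulative = (1 + daily_returns).prod() - 1
    ann_vol = daily_returns.std() * np.sqrt(periods_per_year)
    # Sharpe ratio with a risk-free rate of zero (assumption)
    sharpe = daily_returns.mean() / daily_returns.std() * np.sqrt(periods_per_year)
    return pd.DataFrame(
        {
            "cumulative_return": cumulative,
            "annualized_volatility": ann_vol,
            "sharpe_ratio": sharpe,
        }
    )


# Simulated daily returns for a strategy and a benchmark (not real results)
rng = np.random.default_rng(4)
dates = pd.bdate_range("2023-01-02", periods=252)
returns = pd.DataFrame(
    {
        "strategy": rng.normal(0.0005, 0.010, len(dates)),
        "benchmark": rng.normal(0.0004, 0.009, len(dates)),
    },
    index=dates,
)

summary = summarize(returns)
```

Plotting `(1 + returns).cumprod()` with Matplotlib gives the matching equity curves for a visual comparison.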

Results

The optimized portfolio delivered risk-adjusted returns competitive with the S&P500, which suggests that combining Machine Learning techniques with traditional financial analysis is useful for selecting asset combinations.

Project Resources

Access the complete project materials:

View Jupyter Notebook

Interactive notebook with full analysis

GitHub Repository

Complete project report and code

Key Takeaways

Machine Learning in Finance

Unsupervised learning techniques like K-Means can effectively identify patterns in financial data and group assets with similar characteristics.

Portfolio Optimization

Modern portfolio theory combined with algorithmic selection can produce portfolios with attractive risk-adjusted returns.

Practical Implementation

The project demonstrates a complete pipeline from data acquisition to strategy evaluation, providing a realistic framework for algorithmic trading.

This project showcases how Machine Learning and Big Data techniques can be practically applied to financial markets, providing a foundation for more sophisticated trading strategies.
