Overview
The Data Analysis feature provides statistical analysis, predictions, and AI-powered insights for water quality data. Reports include visualizations generated with matplotlib and can be exported to PDF.
Average Analysis: Statistical summaries with min/max/average
Time-Series: Trend analysis by day, month, or year
Predictions: AI-powered future value forecasting
Correlation: Multi-sensor relationship analysis
Analysis Types
1. Average Analysis
Calculate statistical summaries for sensor data over a time period:
POST /api/analysis/average/
Content-Type: application/json
Authorization: Bearer {access_token}

{
  "workspace_id": "workspace_123",
  "meter_id": "meter_456",
  "sensor_name": "ph",
  "start_date": "2024-01-01",
  "end_date": "2024-01-31"
}
Response:
{
  "message": "Analysis generating with id: analysis_abc123",
  "result": {
    "id": "analysis_abc123",
    "type": "average",
    "status": "saved",
    "data": {
      "period": {
        "start_date": "2024-01-01",
        "end_date": "2024-01-31"
      },
      "result": [
        {
          "sensor": "ph",
          "average": 7.2,
          "min": 6.8,
          "max": 7.6
        },
        {
          "sensor": "turbidity",
          "average": 2.3,
          "min": 1.1,
          "max": 4.2
        }
      ]
    }
  }
}
Features:
Calculate average, minimum, and maximum values
Analyze single sensor or all sensors
Custom date ranges
Includes statistical bar charts in PDF reports
Source: ~/workspace/source/app/features/analysis/presentation/routes/average.py:16
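The summary in the response can be reproduced locally as a sanity check. The following is a minimal sketch, assuming readings are already available as a mapping of sensor name to a list of values; the `summarize` function and its input shape are hypothetical and are not taken from the service's source.

```python
from statistics import fmean


def summarize(readings):
    """Return one average/min/max entry per sensor, skipping empty series."""
    summary = []
    for sensor, values in readings.items():
        if not values:
            continue  # no data in the period, so there is nothing to summarize
        summary.append({
            "sensor": sensor,
            "average": round(fmean(values), 2),
            "min": min(values),
            "max": max(values),
        })
    return summary


# Hypothetical sample readings, not real measurements
sample = {"ph": [6.8, 7.2, 7.6], "turbidity": [1.1, 4.2], "tds": []}
print(summarize(sample))
```

The output entries use the same keys as the `result` array shown above, which makes a side-by-side comparison with the API response straightforward.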
2. Average by Period
Analyze trends over time with period-based grouping:
POST /api/analysis/average-period/
Content-Type: application/json
Authorization: Bearer {access_token}

{
  "workspace_id": "workspace_123",
  "meter_id": "meter_456",
  "sensor_name": "temperature",
  "start_date": "2024-01-01",
  "end_date": "2024-12-31",
  "period_type": "months"
}
Period Types:
class PeriodEnum(str, Enum):
    DAYS = "days"      # Daily averages
    MONTHS = "months"  # Monthly averages
    YEARS = "years"    # Yearly averages
Response:
{
  "result": {
    "sensor": "temperature",
    "period_type": "months",
    "period": {
      "start_date": "2024-01-01",
      "end_date": "2024-12-31"
    },
    "averages": [
      { "date": "2024-01", "value": 18.5 },
      { "date": "2024-02", "value": 19.2 },
      { "date": "2024-03", "value": 21.1 }
    ]
  }
}
Features:
Time-series trend visualization
Line charts showing temporal patterns
Handles missing data (null values create gaps in charts)
Compare multiple sensors over the same period
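To make the period grouping concrete, here is a rough sketch of how readings could be bucketed by day, month, or year and averaged. The `average_by_period` function, the `PERIOD_FORMATS` mapping, and the `(date, value)` input shape are assumptions for illustration; the label formats are chosen only to match the "2024-01" style in the response above.

```python
from collections import defaultdict
from datetime import date
from statistics import fmean

# Assumed label formats, mirroring the "2024-01" labels in the sample response
PERIOD_FORMATS = {"days": "%Y-%m-%d", "months": "%Y-%m", "years": "%Y"}


def average_by_period(readings, period_type):
    """Group (date, value) pairs by period label and average each group.

    None values are ignored. A period with no usable values yields None,
    so a chart can show a gap instead of a misleading point.
    """
    fmt = PERIOD_FORMATS[period_type]
    buckets = defaultdict(list)
    for day, value in readings:
        buckets[day.strftime(fmt)].append(value)

    averages = []
    for label in sorted(buckets):  # zero-padded ISO labels sort chronologically
        present = [v for v in buckets[label] if v is not None]
        value = round(fmean(present), 2) if present else None
        averages.append({"date": label, "value": value})
    return averages


# Hypothetical readings, not real measurements
readings = [
    (date(2024, 1, 1), 18.0),
    (date(2024, 1, 2), 19.0),
    (date(2024, 2, 1), None),
    (date(2024, 3, 5), 21.1),
]
print(average_by_period(readings, "months"))
```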
3. Prediction Analysis
Forecast future sensor values using AI models:
POST /api/analysis/prediction/
Content-Type: application/json
Authorization: Bearer {access_token}

{
  "workspace_id": "workspace_123",
  "meter_id": "meter_456",
  "sensor_name": "ph",
  "start_date": "2024-01-01",
  "end_date": "2024-01-31",
  "prediction_days": 7,
  "period_type": "days"
}
Response:
{
  "message": "Analysis generating with id: pred_789",
  "result": {
    "sensor": "ph",
    "data": {
      "labels": ["2024-01-25", "2024-01-26", "2024-01-27"],
      "values": [7.2, 7.3, 7.1]
    },
    "pred": {
      "labels": ["2024-01-28", "2024-01-29", "2024-01-30"],
      "values": [7.2, 7.4, 7.3]
    }
  }
}
Features:
Historical data + predicted values
Configurable prediction horizon
Visual separation in charts (different line styles)
Supports all sensor types
Period-based predictions (daily, monthly, yearly)
Source: ~/workspace/source/app/features/analysis/presentation/routes/prediction.py:16
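Because the response keeps historical values under `data` and forecast values under `pred`, a client usually needs to merge them into one timeline while remembering which points are predicted. A minimal sketch, assuming the response shape shown above; the `flatten_forecast` helper is hypothetical.

```python
def flatten_forecast(result):
    """Flatten a prediction result into (label, value, is_predicted) rows."""
    rows = []
    for key, predicted in (("data", False), ("pred", True)):
        series = result[key]
        for label, value in zip(series["labels"], series["values"]):
            rows.append((label, value, predicted))
    return rows


# The "result" object from the sample response above
sample = {
    "sensor": "ph",
    "data": {
        "labels": ["2024-01-25", "2024-01-26", "2024-01-27"],
        "values": [7.2, 7.3, 7.1],
    },
    "pred": {
        "labels": ["2024-01-28", "2024-01-29", "2024-01-30"],
        "values": [7.2, 7.4, 7.3],
    },
}

for label, value, predicted in flatten_forecast(sample):
    marker = "forecast" if predicted else "observed"
    print(f"{label}  {value}  {marker}")
```

The boolean flag is what a chart would use to switch line styles between the observed and forecast segments.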
4. Correlation Analysis
Analyze relationships between multiple sensors:
POST /api/analysis/correlation/
Content-Type: application/json
Authorization: Bearer {access_token}

{
  "workspace_id": "workspace_123",
  "meter_id": "meter_456",
  "start_date": "2024-01-01",
  "end_date": "2024-01-31",
  "method": "pearson"
}
Correlation Methods:
class CorrMethodEnum(str, Enum):
    PEARSON = "pearson"    # Linear correlation
    SPEARMAN = "spearman"  # Rank-based correlation
Response:
{
  "result": {
    "method": "pearson",
    "sensors": ["ph", "temperature", "conductivity", "tds", "turbidity"],
    "matrix": [
      [1.0, 0.23, 0.45, 0.67, -0.12],
      [0.23, 1.0, 0.89, 0.78, 0.34],
      [0.45, 0.89, 1.0, 0.92, 0.21],
      [0.67, 0.78, 0.92, 1.0, 0.15],
      [-0.12, 0.34, 0.21, 0.15, 1.0]
    ]
  }
}
Features:
Correlation matrix heatmap visualization
Pearson or Spearman correlation methods
Identify sensor relationships and dependencies
All five sensors analyzed simultaneously
Source: ~/workspace/source/app/features/analysis/presentation/routes/correlation.py:18
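The matrix is symmetric, with row and column order following the `sensors` array, so only the upper triangle needs to be read. The sketch below extracts the strongly correlated sensor pairs from a response of this shape. The `strong_pairs` helper is hypothetical, and its default threshold of 0.7 is an assumption borrowed from the guidance in Best Practices.

```python
def strong_pairs(sensors, matrix, threshold=0.7):
    """Return (sensor_a, sensor_b, r) for pairs with |r| >= threshold.

    Only the upper triangle is scanned, so each pair appears once and
    the diagonal (always 1.0) is skipped.
    """
    pairs = []
    for i in range(len(sensors)):
        for j in range(i + 1, len(sensors)):
            r = matrix[i][j]
            if abs(r) >= threshold:
                pairs.append((sensors[i], sensors[j], r))
    # Strongest relationships first, regardless of sign
    return sorted(pairs, key=lambda pair: abs(pair[2]), reverse=True)


# Values copied from the sample response above
sensors = ["ph", "temperature", "conductivity", "tds", "turbidity"]
matrix = [
    [1.0, 0.23, 0.45, 0.67, -0.12],
    [0.23, 1.0, 0.89, 0.78, 0.34],
    [0.45, 0.89, 1.0, 0.92, 0.21],
    [0.67, 0.78, 0.92, 1.0, 0.15],
    [-0.12, 0.34, 0.21, 0.15, 1.0],
]

for a, b, r in strong_pairs(sensors, matrix):
    print(f"{a} / {b}: {r}")
```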
AI-Powered Insights
Chat with AI about your analysis results using OpenRouter integration.
Create AI Chat Session
POST /api/analysis/ai/{analysis_id}/session
Authorization: Bearer {access_token}
Response:
{
  "session_id": "analysis_abc123-user_456",
  "context": "Analysis type: average\nParameters: {...}\nResults: {...}",
  "created_at": "2024-01-15T10:30:00Z"
}
Chat with AI
POST /api/analysis/ai/{analysis_id}/chat
Content-Type: application/json
Authorization: Bearer {access_token}

{
  "message": "What does this correlation between TDS and conductivity tell us?"
}
Response:
{
  "response": "The strong positive correlation (0.92) between TDS and conductivity is expected and normal. As Total Dissolved Solids increase, the water's ability to conduct electricity also increases proportionally. This relationship is used to estimate TDS from conductivity measurements in the field. Your data shows a healthy, consistent relationship between these parameters.",
  "session_id": "analysis_abc123-user_456"
}
Features:
Contextual understanding of analysis data
Explain statistical results in plain language
Answer questions about trends and patterns
Provide water quality insights
Session-based conversation history
Source: ~/workspace/source/app/features/analysis/presentation/routes/ai_chat.py:46
Get Chat Session History
GET /api/analysis/ai/{analysis_id}/session
Authorization: Bearer {access_token}
Response:
{
  "session_id": "analysis_abc123-user_456",
  "context": "Analysis context...",
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:45:00Z",
  "messages": [
    {
      "id": "msg_1",
      "role": "user",
      "content": "What does this correlation mean?",
      "timestamp": "2024-01-15T10:32:00Z"
    },
    {
      "id": "msg_2",
      "role": "assistant",
      "content": "The correlation indicates...",
      "timestamp": "2024-01-15T10:32:05Z"
    }
  ],
  "metadata": {
    "analysis_id": "analysis_abc123",
    "analysis_type": "correlation",
    "workspace_id": "workspace_123",
    "meter_id": "meter_456"
  }
}
AI chat sessions are automatically created on first message if they don’t exist. The analysis must have status: "saved" before AI interaction.
Chart Generation
Analysis results include visualizations generated with matplotlib.
Chart Types
class ChartType(str, Enum):
    LINE = "line"        # Time-series trends
    BAR = "bar"          # Comparative statistics
    HEATMAP = "heatmap"  # Correlation matrices
Line Charts
class LineChartData(BaseModel):
    x_values: list[str]              # Date labels
    series: dict[str, list[float]]   # Sensor name -> values

class ChartConfig(BaseModel):
    chart_type: ChartType
    title: str
    x_label: str
    y_label: str
    period_type: str = "days"  # For x-axis formatting
    width: int = 140
    height: int = 100
Features:
Multiple data series on one chart
Automatic date formatting based on period type
Gap handling for missing data (None values)
Customizable dimensions
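One common way to get gaps in a line chart is to replace missing readings with NaN before plotting, since matplotlib leaves a break in the line wherever a y value is NaN. The sketch below shows that conversion in plain Python, plus a helper that reports where the unbroken runs of data are. Both function names are hypothetical, and whether the project's chart generator works this way is not stated in the source.

```python
import math


def to_plottable(values):
    """Replace None with NaN so the series keeps its length and alignment."""
    return [math.nan if v is None else float(v) for v in values]


def contiguous_runs(values):
    """Return (start, end) index pairs for unbroken runs of real readings."""
    runs = []
    start = None
    for i, v in enumerate(values):
        if v is not None and start is None:
            start = i  # a new run begins
        elif v is None and start is not None:
            runs.append((start, i - 1))  # the run ended just before the gap
            start = None
    if start is not None:
        runs.append((start, len(values) - 1))
    return runs


# Hypothetical series with one missing reading
series = [18.5, None, 21.1, 21.4]
print(to_plottable(series))
print(contiguous_runs(series))
```

Keeping the NaN placeholder, rather than dropping the missing point, keeps every series aligned with the shared `x_values` labels.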
Bar Charts
class BarChartData(BaseModel):
    categories: list[str]            # X-axis categories
    series: dict[str, list[float]]   # Series name -> values
Used for:
Average, min, max comparisons
Single-point statistics
Sensor comparisons
Heatmaps
class HeatmapData(BaseModel):
    data: list[list[float]]  # 2D correlation matrix
    x_labels: list[str]      # Sensor names
    y_labels: list[str]      # Sensor names
Used for:
Correlation matrices
Visual representation of sensor relationships
Color-coded correlation strength
Source: ~/workspace/source/app/features/analysis/infrastructure/matplotlib_chart_generator.py
PDF Report Generation
Generate comprehensive PDF reports with charts, tables, and analysis results.
Generate PDF Report
GET /api/analysis/report/{analysis_id}/report/pdf
Authorization: Bearer {access_token}
Response:
Content-Type: application/pdf
Filename: reporte_{analysis_type}_{timestamp}.pdf
Streaming download
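Since the report is streamed, a client can write it to disk chunk by chunk instead of holding the whole PDF in memory. The following is a small sketch of the writing side only; `save_chunks` is a hypothetical helper that accepts any iterable of byte chunks.

```python
from typing import Iterable


def save_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write byte chunks to `path` and return the number of bytes written."""
    written = 0
    with open(path, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty chunks that some streams emit
                written += fh.write(chunk)
    return written
```

With the `requests` library, the chunks could come from a call made with `stream=True`, for example `save_chunks(response.iter_content(chunk_size=8192), "report.pdf")`.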
Report Contents
Line charts for time-series data
Bar charts for statistical summaries
Heatmaps for correlation analysis
Automatic chart captioning
Formatted result tables
Statistical summaries
Sensor readings (limited to prevent excessive length)
Report Customization
class ReportConfig(BaseModel):
    title: str = "Analysis Report"
    author: str
    subject: str

class ReportSection(BaseModel):
    title: str
    content: str | None
    level: int  # Heading level (1, 2, 3)

class TableData(BaseModel):
    headers: list[str]
    rows: list[list[str]]
Source: ~/workspace/source/app/features/analysis/presentation/routes/report.py:31
Example PDF Structure
Average Analysis Report:
Header with timestamp
Analysis Information (ID, type, workspace, meter)
Period of Analysis (start/end dates)
Statistics Table (sensor, average, min, max)
Bar Charts (one per sensor showing min/avg/max)
Prediction Analysis Report:
Header with timestamp
Analysis Information
Line Charts (historical data + predictions with different line styles)
Prediction parameters and horizon
Correlation Analysis Report:
Header with timestamp
Analysis Information
Correlation Method (Pearson/Spearman)
Heatmap visualization
Correlation Matrix table
Analysis Management
Get Analysis Results
GET /api/analysis/average/{workspace_id}/{meter_id}/
Authorization: Bearer {access_token}
Update Analysis
Re-run analysis with updated parameters:
PUT /api/analysis/average/{analysis_id}/
Content-Type: application/json
Authorization: Bearer {access_token}

{
  "start_date": "2024-02-01",
  "end_date": "2024-02-29"
}
Updating an analysis re-processes the data with new parameters. The analysis status changes to "updating" during processing, then back to "saved" when complete.
Analysis Status
class AnalysisStatus(str, Enum):
    CREATING = "creating"  # Initial creation in progress
    UPDATING = "updating"  # Update in progress
    SAVED = "saved"        # Complete and ready
    ERROR = "error"        # Processing failed
Source: ~/workspace/source/app/features/analysis/domain/enums.py:22
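Because an analysis passes through "creating" or "updating" before it reaches "saved", clients typically poll until processing finishes. The sketch below shows one way to do that. `wait_until_saved` and its `fetch_status` callable are hypothetical: the caller supplies a function that returns the current status string, since this page does not document a dedicated status endpoint.

```python
import time
from enum import Enum


class AnalysisStatus(str, Enum):
    CREATING = "creating"
    UPDATING = "updating"
    SAVED = "saved"
    ERROR = "error"


def wait_until_saved(fetch_status, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll `fetch_status()` until the analysis is saved.

    Raises RuntimeError if processing fails and TimeoutError if the
    analysis is still in progress once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = AnalysisStatus(fetch_status())
        if status is AnalysisStatus.SAVED:
            return status
        if status is AnalysisStatus.ERROR:
            raise RuntimeError("analysis processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("analysis did not finish in time")
        sleep(interval)


# Simulated status sequence standing in for real API calls
statuses = iter(["creating", "creating", "saved"])
final = wait_until_saved(lambda: next(statuses), sleep=lambda seconds: None)
print(final.value)
```

Injecting `sleep` keeps the helper easy to test; in real use the default `time.sleep` applies.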
Data Storage
Analysis results are stored in Firebase:
Real-time updates during processing
Persistent storage of analysis configurations
Chart images stored separately
Efficient querying by workspace/meter/type
Source: ~/workspace/source/app/features/analysis/infrastructure/firebase_analysis_result.py
Best Practices
Appropriate Date Ranges: Use sufficient historical data for meaningful analysis (at least 30 days recommended).
Period Selection: Match period type to data frequency (daily for hourly data, monthly for daily data).
Correlation Interpretation: Absolute values above 0.7 indicate strong correlation; absolute values below 0.3 indicate weak correlation.
Prediction Horizons: Keep predictions short-term (7-14 days) for better accuracy.
AI Context: Ask the AI specific questions for better insights.
Report Sharing: Use PDF reports for stakeholder communication and documentation.
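Some of these recommendations can be checked on the client before a request is sent. The following is a minimal sketch, assuming ISO `YYYY-MM-DD` dates as used in the request bodies; the `review_request` helper is hypothetical, and its limits simply restate the 30-day and 14-day guidance above rather than any rule enforced by the API.

```python
from datetime import date
from typing import List, Optional


def review_request(start_date: str, end_date: str,
                   prediction_days: Optional[int] = None) -> List[str]:
    """Return advisory warnings for a request; an empty list means none."""
    warnings = []
    # Inclusive span: 2024-01-01 to 2024-01-31 counts as 31 days
    span = (date.fromisoformat(end_date) - date.fromisoformat(start_date)).days + 1
    if span < 30:
        warnings.append(f"only {span} days of history; 30 or more recommended")
    if prediction_days is not None and prediction_days > 14:
        warnings.append(
            f"prediction horizon of {prediction_days} days; 7-14 recommended"
        )
    return warnings


print(review_request("2024-01-01", "2024-01-31", prediction_days=7))
print(review_request("2024-01-01", "2024-01-10", prediction_days=30))
```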
Example: Complete Analysis Workflow
import time

import requests

API_BASE = "https://api.example.com/api"
token = "YOUR_ACCESS_TOKEN"  # Placeholder: replace with a valid access token
HEADERS = {"Authorization": f"Bearer {token}"}

# 1. Create average analysis
response = requests.post(
    f"{API_BASE}/analysis/average/",
    headers=HEADERS,
    json={
        "workspace_id": "workspace_123",
        "meter_id": "meter_456",
        "sensor_name": "ph",
        "start_date": "2024-01-01",
        "end_date": "2024-01-31",
    },
)
response.raise_for_status()  # Stop early if the request was rejected
analysis_id = response.json()["result"]["id"]
print(f"Analysis created: {analysis_id}")

# 2. Wait for analysis to complete (fixed delay; processing time may vary)
time.sleep(5)

# 3. Get analysis results
results = requests.get(
    f"{API_BASE}/analysis/average/workspace_123/meter_456/",
    headers=HEADERS,
)
print(f"Average pH: {results.json()['result']['data']['result'][0]['average']}")

# 4. Create AI chat session
ai_session = requests.post(
    f"{API_BASE}/analysis/ai/{analysis_id}/session",
    headers=HEADERS,
)
print(f"AI session created: {ai_session.json()['session_id']}")

# 5. Ask AI about results
ai_response = requests.post(
    f"{API_BASE}/analysis/ai/{analysis_id}/chat",
    headers=HEADERS,
    json={"message": "Is this pH level normal for drinking water?"},
)
print(f"AI: {ai_response.json()['response']}")

# 6. Generate PDF report
pdf_response = requests.get(
    f"{API_BASE}/analysis/report/{analysis_id}/report/pdf",
    headers=HEADERS,
)
pdf_response.raise_for_status()  # Avoid saving an error page as a PDF
with open("water_quality_report.pdf", "wb") as f:
    f.write(pdf_response.content)
print("PDF report saved")