
Overview

The statistics dashboard provides instant insights into the filtering results, including total records, filtered tasks, occurrence rates, and per-keyword breakdown.

Core Function

Statistics generation is handled by gerar_estatisticas() in app.py:55:
def gerar_estatisticas(df_original, df_filtrado, palavras_chave):
    """Gera estatísticas simples dos dados"""
    total = len(df_original)
    filtrados = len(df_filtrado)
    
    stats = {
        'total': total,
        'filtrados': filtrados,
        'percentual': round((filtrados/total)*100, 1) if total > 0 else 0,
        'por_palavra': {}
    }
    
    for palavra in palavras_chave:
        if not df_filtrado.empty:
            count = int(df_filtrado['Relato'].str.contains(
                palavra, case=False, na=False
            ).sum())
            if count > 0:
                stats['por_palavra'][palavra] = count
    
    return stats
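A minimal usage sketch follows. The sample DataFrames are invented, the regex-join filter is a simplified stand-in for the app's actual filtering step, and the function is repeated so the snippet runs standalone:

```python
import pandas as pd

# gerar_estatisticas as defined above, repeated so this snippet is self-contained
def gerar_estatisticas(df_original, df_filtrado, palavras_chave):
    """Gera estatísticas simples dos dados"""
    total = len(df_original)
    filtrados = len(df_filtrado)

    stats = {
        'total': total,
        'filtrados': filtrados,
        'percentual': round((filtrados / total) * 100, 1) if total > 0 else 0,
        'por_palavra': {}
    }

    for palavra in palavras_chave:
        if not df_filtrado.empty:
            count = int(df_filtrado['Relato'].str.contains(
                palavra, case=False, na=False
            ).sum())
            if count > 0:
                stats['por_palavra'][palavra] = count

    return stats

# Invented sample data: four task reports, two of which match a keyword
df_original = pd.DataFrame({'Relato': [
    'Monitor quebrado na sala 2',
    'Solicitar peça para a impressora',
    'Limpeza geral do laboratório',
    'Nenhum problema encontrado',
]})
palavras_chave = ['quebrado', 'solicitar peça']
mask = df_original['Relato'].str.contains(
    '|'.join(palavras_chave), case=False, na=False)
df_filtrado = df_original[mask]

print(gerar_estatisticas(df_original, df_filtrado, palavras_chave))
# {'total': 4, 'filtrados': 2, 'percentual': 50.0,
#  'por_palavra': {'quebrado': 1, 'solicitar peça': 1}}
```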

Metrics Provided

Total Records

Count of all rows in the original file (before filtering)

Filtered Tasks

Number of tasks matching at least one keyword

Occurrence Rate

Percentage of filtered tasks relative to total records

Per-Keyword Breakdown

Individual match count for each keyword used

Statistics Structure

The function returns a dictionary with this structure:
{
    'total': 1523,              # Total rows in original file
    'filtrados': 147,           # Rows matching keywords
    'percentual': 9.7,          # Percentage (1 decimal place)
    'por_palavra': {            # Per-keyword counts
        'quebrado': 45,
        'solicitar peça': 32,
        'trocar cabo': 28,
        'danificado': 18,
        'instalar': 24
    }
}

Calculation Details

Percentage Formula

percentual = round((filtrados/total)*100, 1) if total > 0 else 0
  • Divides filtered count by total count
  • Multiplies by 100 to get percentage
  • Rounds to 1 decimal place using round()
  • Returns 0 if total is 0 (prevents division by zero)
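The guarded expression can be exercised in isolation; `taxa_ocorrencia` is a hypothetical wrapper name used only for this illustration:

```python
def taxa_ocorrencia(filtrados, total):
    # Same guarded expression used in gerar_estatisticas
    return round((filtrados / total) * 100, 1) if total > 0 else 0

print(taxa_ocorrencia(147, 1523))  # 9.7 -- matches the sample structure above
print(taxa_ocorrencia(0, 0))      # 0   -- empty file, no ZeroDivisionError
```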

Per-Keyword Counting

Each keyword is counted independently using pandas string matching:
for palavra in palavras_chave:
    if not df_filtrado.empty:
        count = int(df_filtrado['Relato'].str.contains(
            palavra, case=False, na=False
        ).sum())
        if count > 0:
            stats['por_palavra'][palavra] = count
Keywords with zero matches are excluded from the por_palavra dictionary to keep the output clean.
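The matching itself is plain pandas string handling; a standalone illustration with invented data:

```python
import pandas as pd

relatos = pd.Series(['Cabo QUEBRADO na mesa 3',
                     'Solicitar peça para o projetor',
                     None])  # a missing report

# case=False -> case-insensitive match; na=False -> missing values
# count as "no match" instead of propagating NaN into the sum
mask = relatos.str.contains('quebrado', case=False, na=False)
print(int(mask.sum()))  # 1
```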

Important Behaviors

Overlapping Matches: A single task can match multiple keywords, so the sum of the per-keyword counts may exceed the total filtered count. Example: a task containing “trocar cabo quebrado” matches both “trocar cabo” and “quebrado”.
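The overlap can be demonstrated standalone with an invented one-row example:

```python
import pandas as pd

# One task whose text matches two different keywords
df = pd.DataFrame({'Relato': ['Trocar cabo quebrado do monitor']})
palavras = ['trocar cabo', 'quebrado']

counts = {p: int(df['Relato'].str.contains(p, case=False, na=False).sum())
          for p in palavras}
print(counts)                # {'trocar cabo': 1, 'quebrado': 1}
print(sum(counts.values()))  # 2 -- exceeds the single filtered row
```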

Empty DataFrame Handling

The function safely handles empty results:
if not df_filtrado.empty:
    # Only calculate per-keyword stats if there are results
This prevents errors when no tasks match any keywords.
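With an empty filtered result, the guard leaves the per-keyword dictionary untouched; a standalone sketch of the same pattern:

```python
import pandas as pd

df_filtrado = pd.DataFrame(columns=['Relato'])  # no task matched any keyword
por_palavra = {}
if not df_filtrado.empty:
    # Skipped entirely: no string matching runs against the empty frame
    por_palavra['quebrado'] = int(
        df_filtrado['Relato'].str.contains('quebrado', case=False, na=False).sum())
print(por_palavra)  # {}
```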

Usage in Application

Statistics are generated during file upload and stored in the Flask session:
# From app.py:141
stats = gerar_estatisticas(df_original, resultado_final, palavras_chave)
session['last_stats'] = stats
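The session round-trip can be exercised with Flask's test client. The `/set` and `/stats` routes below are hypothetical, written only to illustrate storing and reading `last_stats`; they are not part of app.py:

```python
from flask import Flask, jsonify, session

app = Flask(__name__)
app.secret_key = 'dev'  # sessions require a secret key

@app.route('/set')
def set_stats():  # hypothetical route for illustration
    session['last_stats'] = {'total': 1523, 'filtrados': 147, 'percentual': 9.7}
    return 'ok'

@app.route('/stats')
def get_stats():  # hypothetical route for illustration
    return jsonify(session.get('last_stats', {}))

with app.test_client() as client:
    client.get('/set')
    print(client.get('/stats').get_json())
```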

Display in Results Page

The stats dictionary is passed to the resultado.html template:
return render_template('resultado.html', 
                       table=tabela_html, 
                       has_results=not resultado_final.empty,
                       stats=stats,
                       palavras_utilizadas=palavras_chave)

Statistics in Exports

Statistics are exported to a dedicated “Estatísticas” sheet:
stats_df = pd.DataFrame([
    ['Total de Registros', stats.get('total', 'N/A')],
    ['Tarefas Encontradas', stats.get('filtrados', 'N/A')],
    ['Taxa de Ocorrência (%)', stats.get('percentual', 'N/A')],
    ['Data de Geração', datetime.now().strftime('%d/%m/%Y %H:%M')]
], columns=['Métrica', 'Valor'])
stats_df.to_excel(writer, index=False, sheet_name='Estatísticas')

Data Types

Field         Type    Example
total         int     1523
filtrados     int     147
percentual    float   9.7
por_palavra   dict    {"quebrado": 45}
The por_palavra counts are explicitly converted to int using int() to ensure consistent JSON serialization when storing in Flask sessions.
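The reason for the cast, shown standalone: summing a pandas boolean mask yields a NumPy integer, which Python's json module (used when serializing session data) rejects:

```python
import json
import pandas as pd

count = pd.Series(['quebrado', 'ok']).str.contains('quebrado').sum()
print(type(count))  # a NumPy integer type, not the built-in int

try:
    json.dumps({'quebrado': count})
except TypeError as e:
    print('not serializable:', e)

print(json.dumps({'quebrado': int(count)}))  # {"quebrado": 1}
```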

Performance

Statistics generation is fast even for large datasets:
  • Uses vectorized pandas operations
  • Runs in O(n × k) where n = filtered rows, k = number of keywords
  • Typically completes in less than 100ms for files with 10,000 rows
