Overview
The Extrator de Tarefas Auvo uses keyword-based filtering to identify actionable tasks in CSV and Excel reports. The filter is built on Python’s pandas library and uses case-insensitive regex pattern matching.
How It Works
The filtering process is handled by the processar_arquivo() function in app.py:32.
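The original listing is not reproduced here, so the following is only a minimal sketch of what such a function might look like, pieced together from the steps described on this page. The signature, the `COLUNAS` constant, and the error message are assumptions, not the actual contents of app.py.

```python
from pathlib import Path

import pandas as pd

# Assumption: column names taken from the "Column Output" table on this page.
COLUNAS = ["Data", "Cliente", "Endereco", "OS Digital", "Relato"]


def processar_arquivo(caminho, palavras_chave):
    """Hypothetical reconstruction; the real function in app.py may differ."""
    extensao = Path(caminho).suffix.lower()

    # Pick the pandas reader from the extension and skip the first 5 rows.
    if extensao == ".csv":
        df = pd.read_csv(caminho, skiprows=5)
    elif extensao in (".xls", ".xlsx"):
        df = pd.read_excel(caminho, skiprows=5)
    else:
        raise ValueError(f"Unsupported file format: {extensao}")

    # Join the keywords into one alternation; each keyword is treated as raw regex.
    padrao = "|".join(palavras_chave)

    # case=False ignores capitalization; na=False counts missing reports as no match.
    mascara = df["Relato"].str.contains(padrao, case=False, na=False)
    return df.loc[mascara, COLUNAS]
```

Because the keywords are joined without escaping, any regex metacharacter inside a keyword (for example a parenthesis or a dot) would be interpreted as part of the pattern in this sketch.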
Key Features
- Multi-format Support: Supports CSV, XLS, and XLSX files with automatic format detection
- Regex Patterns: Uses a pipe-separated regex for efficient multi-keyword matching
- Case-insensitive: Matches keywords regardless of capitalization
- Null-safe: Handles missing values with the na=False parameter
Filtering Logic
File Format Detection
The system detects the file format from the extension (.csv, .xls, .xlsx) and uses the appropriate pandas reader.
Skip Header Rows
Both CSV and Excel files are read with skiprows=5, which skips the first 5 rows that Auvo places at the top of its reports.
Regex Pattern Construction
Keywords are joined with the pipe operator (|) to create a single regex pattern: solicitar peça|quebrado|trocar cabo
Column Filtering
The ‘Relato’ (Report) column is searched using str.contains() with case=False for case-insensitive matching.
Default Keywords
The application comes with pre-configured keywords chosen for identifying maintenance and repair tasks. The keyword list is set at app.py:115 and stored in the Flask session as custom_keywords.
Customizing Keywords
Users can customize keywords through the /config route, which accepts comma-separated values and stores them in the session.
Configuration Route
The route is defined at app.py:108.
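The route's code is not shown on this page. As a rough illustration of the comma-separated parsing it describes, the step can be sketched as a plain function that does not depend on Flask. The name `parse_keywords` and the form field name in the comment are hypothetical.

```python
def parse_keywords(raw):
    """Split a comma-separated string into a clean list of keywords."""
    # Strip whitespace around each item and drop empty entries.
    return [kw.strip() for kw in raw.split(",") if kw.strip()]


# In a Flask view this might be used roughly as follows (assumed, not from app.py):
#     session["custom_keywords"] = parse_keywords(request.form["keywords"])
palavras = parse_keywords("solicitar peça, quebrado, , trocar cabo ")
padrao = "|".join(palavras)
print(padrao)  # solicitar peça|quebrado|trocar cabo
```

Dropping empty entries matters here: an empty string in the alternation would produce a pattern that matches every row.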
Technical Details
Pattern Matching Behavior
- Substring matching: The regex matches keywords anywhere in the text
- No word boundaries: trocar will match “trocar”, “trocaram”, and “destrocar”, but not “trocando”
- OR logic: Any keyword match includes the row in results
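The matching behavior listed above can be checked with Python's standard `re` module, whose search semantics pandas' `str.contains()` follows when given a regex. This is a small sketch with made-up sample reports. The stricter `\b` variant is shown only for comparison and is not something the application is stated to use.

```python
import re

relatos = [
    "Precisa TROCAR o cabo",
    "Destrocar os conectores",
    "Trocando a fonte",
    "Vidro quebrado",
]

# Substring matching with OR logic: any alternative found anywhere is a hit.
padrao = re.compile("solicitar peça|quebrado|trocar", re.IGNORECASE)
encontrados = [r for r in relatos if padrao.search(r)]

# For comparison: word boundaries restrict matches to whole words.
padrao_estrito = re.compile(r"\b(?:solicitar peça|quebrado|trocar)\b", re.IGNORECASE)
estritos = [r for r in relatos if padrao_estrito.search(r)]

print(encontrados)  # first, second, and fourth reports
print(estritos)     # first and fourth reports
```

"Trocando a fonte" is matched by neither pattern because it does not contain the literal sequence "trocar". A shorter keyword such as "troca" would be needed to catch conjugated forms.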
Performance Considerations
Column Output
Filtered results include exactly 5 columns:

| Column | Description |
|---|---|
| Data | Task date |
| Cliente | Client name |
| Endereco | Service address |
| OS Digital | Digital work order (with clickable links) |
| Relato | Task report/description |
Error Handling
The function raises ValueError for unsupported file formats.
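The original snippet is missing from this page. A minimal sketch of an extension check that raises `ValueError`, together with a caller that catches it, might look like the following. The helper name `detectar_formato`, the `SUPPORTED` mapping, and the message text are assumptions for illustration.

```python
from pathlib import Path

# Assumption: the three extensions named on this page, mapped to a reader kind.
SUPPORTED = {".csv": "csv", ".xls": "excel", ".xlsx": "excel"}


def detectar_formato(nome_arquivo):
    """Return the reader kind for a file name, or raise ValueError."""
    extensao = Path(nome_arquivo).suffix.lower()
    if extensao not in SUPPORTED:
        raise ValueError(f"Unsupported file format: {extensao or '(none)'}")
    return SUPPORTED[extensao]


# A caller can turn the exception into a user-facing message.
try:
    detectar_formato("relatorio.pdf")
except ValueError as erro:
    mensagem = str(erro)
    print(mensagem)  # Unsupported file format: .pdf
```

Lower-casing the suffix means a file named `RELATORIO.CSV` is still accepted in this sketch. Whether the real function does the same is not stated.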