Every BLE scanner observes the raw MAC addresses broadcast by nearby devices. Left unprocessed, those addresses can be linked back to individual people and tracked over time. The BLE People Counter reduces that risk on-device: raw MAC addresses are hashed with SHA-256 before any data is sent to the cloud, and the hash is one-way, so the original address cannot be decoded from it.

Why BLE scanning raises privacy concerns

Bluetooth Low Energy devices continuously broadcast advertisement packets that include a hardware identifier called a MAC address. In older devices, this address is static and globally unique, which means it can be used to fingerprint and track an individual across time and space. Even in environments where physical identities are unknown, a persistent MAC address is considered personal data under most privacy regulations, including GDPR. The system processes MAC addresses locally on the Raspberry Pi and never forwards them to the cloud.

The anonymize_mac method

The DetectionProcessor class in src/main.py contains the anonymization logic:
import hashlib  # required by anonymize_mac

def anonymize_mac(self, mac_address: str) -> str:
    """Anonymize a MAC address using SHA-256.

    Args:
        mac_address: MAC address of the device

    Returns:
        SHA-256 hash in hexadecimal (64 characters)
    """
    # Strip separators and uppercase so every spelling of an address hashes alike
    normalized_mac = mac_address.replace(":", "").replace("-", "").upper()

    hash_object = hashlib.sha256(normalized_mac.encode())
    return hash_object.hexdigest()
The transformation has three steps:
  1. Normalize: strip colons and hyphens, convert to uppercase. AA:BB:CC:DD:EE:FF becomes AABBCCDDEEFF. This ensures that the same address always produces the same hash regardless of how the OS formats the address string.
  2. Hash: apply SHA-256 to the UTF-8-encoded normalized string.
  3. Return: hexdigest() returns a lowercase 64-character hexadecimal string.
The output is always exactly 64 characters, which is validated by the Device dataclass:
def __post_init__(self):
    """Validations"""
    if len(self.device_hash) != 64:
        raise ValueError(
            f"The hash must be 64 characters long (SHA-256), got {len(self.device_hash)}"
        )
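
The three steps and the length check can be verified with a standalone sketch. The function below copies the method body shown earlier without the class around it; the sample addresses are made up.

```python
import hashlib


def anonymize_mac(mac_address: str) -> str:
    """Standalone copy of the method above: normalize, then SHA-256."""
    normalized_mac = mac_address.replace(":", "").replace("-", "").upper()
    return hashlib.sha256(normalized_mac.encode()).hexdigest()


# Three spellings of the same made-up address.
variants = ["AA:BB:CC:DD:EE:FF", "aa-bb-cc-dd-ee-ff", "aabbccddeeff"]
hashes = {anonymize_mac(v) for v in variants}

print(len(hashes))                # 1: every spelling maps to one hash
print(len(next(iter(hashes))))    # 64: length of the hex digest
```

Normalizing before hashing is what keeps a single address from being counted twice when two platforms format it differently.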

One-way transformation

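SHA-256 is a cryptographic hash function with no inverse: a stored hash cannot be decoded back into a MAC address. The hash is unsalted and deterministic, however, so a party that already knows or can enumerate candidate addresses can recompute their hashes and compare them with stored values. The cloud backend only ever receives device_hash values, it has no mechanism to reverse them into MAC addresses, and no raw address data is written to any log or database.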
The anonymization runs inside process_detections on the Pi, before the HTTPClient or MQTTClient transmits anything. There is no code path that sends a raw MAC address to the backend.
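
Because anonymize_mac uses no salt or key, the same address always yields the same hash. A minimal sketch of what that means in practice: a stored hash cannot be decoded, but a known address can be checked against it. The hash_matches helper is hypothetical and not part of the project, and the addresses are made up.

```python
import hashlib


def anonymize_mac(mac_address: str) -> str:
    """Standalone copy of the project's normalization and hashing."""
    normalized_mac = mac_address.replace(":", "").replace("-", "").upper()
    return hashlib.sha256(normalized_mac.encode()).hexdigest()


def hash_matches(candidate_mac: str, stored_hash: str) -> bool:
    """Hypothetical helper: test a known MAC against a stored device_hash."""
    return anonymize_mac(candidate_mac) == stored_hash


stored = anonymize_mac("AA:BB:CC:DD:EE:FF")       # what the backend would hold
print(hash_matches("aa:bb:cc:dd:ee:ff", stored))  # True
print(hash_matches("AA:BB:CC:DD:EE:00", stored))  # False
```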

What is stored in the database

The Device dataclass (defined in src/scanner/detection.py) represents the sanitized record that travels to the cloud:
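from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Device:
    """Device after processing and anonymization."""

    device_hash: str  # SHA-256 hash of the MAC address
    rssi: int
    zone: Zone  # Zone is defined elsewhere in the project
    timestamp: datetime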
When serialized for transmission, the payload is:
def to_dict(self) -> dict:
    return {
        "device_hash": self.device_hash,
        "rssi": self.rssi,
        "zone": self.zone.value,
        "timestamp": self.timestamp.isoformat(),
    }
The backend also adds device_id (the Pi’s identifier) when storing the record. In summary, the database contains:
| Field | Type | Description |
| --- | --- | --- |
| device_hash | string (64 chars) | SHA-256 hash of the normalized MAC address |
| rssi | integer (dBm) | Signal strength at time of detection |
| zone | string | near, medium, or far |
| timestamp | ISO 8601 | When the device was detected |
| device_id | string | Identifier of the Pi that made the detection |
No device name, no manufacturer information, no raw MAC address.
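
The record shape can be exercised end to end with a self-contained sketch. The Zone enum here is an assumption reconstructed from the three zone values listed above, the Device class is a condensed copy of the excerpts, and the address, RSSI, and timestamp are made-up sample values.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Zone(Enum):
    # Assumed definition, reconstructed from the zone values listed above.
    NEAR = "near"
    MEDIUM = "medium"
    FAR = "far"


@dataclass(frozen=True)
class Device:
    device_hash: str
    rssi: int
    zone: Zone
    timestamp: datetime

    def __post_init__(self):
        if len(self.device_hash) != 64:
            raise ValueError("device_hash must be 64 characters (SHA-256)")

    def to_dict(self) -> dict:
        return {
            "device_hash": self.device_hash,
            "rssi": self.rssi,
            "zone": self.zone.value,
            "timestamp": self.timestamp.isoformat(),
        }


raw_mac = "AA:BB:CC:DD:EE:FF"  # made-up address, never placed in the record
device_hash = hashlib.sha256(raw_mac.replace(":", "").upper().encode()).hexdigest()

device = Device(device_hash, rssi=-62, zone=Zone.NEAR,
                timestamp=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))
payload = device.to_dict()

print(sorted(payload))       # ['device_hash', 'rssi', 'timestamp', 'zone']
print(payload["timestamp"])  # 2024-01-01T12:00:00+00:00
```

The payload has no field that could carry the raw address, a device name, or a manufacturer, which matches the summary above.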

MAC address randomization on iOS and Android

Modern smartphones (iOS 14+, Android 10+) use randomized MAC addresses that change periodically. From the scanner’s perspective, a single phone may appear as multiple distinct addresses over time. Each randomized address is hashed independently, so one physical device can generate several different device_hash values within a session.
MAC randomization inflates raw device counts. A phone that changes its MAC address mid-scan will appear as two distinct entries. The DEVICES_PER_PERSON ratio (default 1.5) is partly designed to compensate for this inflation — see the zone tuning guide for how to adjust it.
The RSSI values reported alongside randomized addresses are still accurate readings of signal strength; only the identifier changes. Detections for the same zone will still aggregate correctly for people-counting purposes.
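
The effect of randomization on counts can be illustrated with a short sketch. How the project applies DEVICES_PER_PERSON is not shown here, so the estimate_people function, its division, and its rounding are assumptions; the addresses are made up.

```python
import hashlib

DEVICES_PER_PERSON = 1.5  # default ratio quoted above


def anonymize_mac(mac_address: str) -> str:
    """Standalone copy of the project's normalization and hashing."""
    normalized_mac = mac_address.replace(":", "").replace("-", "").upper()
    return hashlib.sha256(normalized_mac.encode()).hexdigest()


def estimate_people(device_hashes: set, ratio: float = DEVICES_PER_PERSON) -> int:
    """Hypothetical estimator: unique hashes divided by the ratio, rounded."""
    return round(len(device_hashes) / ratio)


# Made-up scan: the phone rotates its address once, the watch keeps its own.
seen_macs = [
    "4A:11:22:33:44:55",  # phone, first randomized address
    "6B:66:77:88:99:AA",  # same phone after rotation
    "C0:12:34:56:78:9A",  # watch
]
unique_hashes = {anonymize_mac(m) for m in seen_macs}

print(len(unique_hashes))              # 3 hashes from 2 physical devices
print(estimate_people(unique_hashes))  # 2
```

Nothing in the hashes links the phone's two addresses, so the correction can only be statistical, which is why the ratio needs tuning per site.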

GDPR considerations

No PII stored

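The database contains only SHA-256 hashes, RSSI values, zones, and timestamps. None of these fields directly identifies a natural person, although a hash remains a stable pseudonymous identifier for as long as a device keeps the same address.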

Aggregated output only

The dashboard displays estimated people counts per zone, not individual device histories. The raw detection table is an operational log, not a user profile.

On-device processing

Anonymization happens on the Raspberry Pi before any network transmission. Raw MAC addresses never leave the local device.

No retention of identifiers

Because hashes are one-way, a stored record cannot be decoded into a MAC address, even with full access to the database. Linking a record to a specific device requires already knowing that device's address and recomputing its hash.
If your deployment is in a jurisdiction with strict data-protection requirements, consult your legal team. You may also want to document the hashing algorithm and retention period in a privacy notice. Some regulators treat hashed device identifiers as pseudonymized personal data rather than anonymous data, so do not assume the stored records fall outside data-protection rules.
