Privacy (Layer 9)
Enforces Rules 21-24 of the constitution: privacy-by-design, no raw PII in LLM context, data classification, inviolable audit chain.
Modules
| Module | Responsibility |
|---|---|
| `pii_detector` | Regex + structural detection of email, CPF, CNPJ, credit_card, SSN, phone_br, IP, password keyvalue, api_key |
| `classifier` | Tier 0 (Public) / 1 (Internal) / 2 (Confidential) / 3 (Restricted) |
| `permissions` | RBAC + Purpose binding (DEBUG / FEATURE_BUILD / ANALYTICS / AUDIT / DSAR) |
| `audit_chain` | HMAC-chained append-only log; tampering detectable via `verify()` |
| `dsar` | LGPD/GDPR Data Subject Access Request handler |
| `deletion` | Cascade deletion across all stores |
| `retention` | TTL-driven sweep across episodic / semantic / procedural stores |
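To illustrate the kind of regex detection `pii_detector` is described as doing, here is a minimal sketch. The patterns, the `detect_pii` and `redact` names, and the placeholder format are assumptions made for this example, not the module's actual API. It covers only email and the punctuated forms of CPF and CNPJ.

```python
import re

# Illustrative patterns only (assumed, not taken from pii_detector).
# CPF and CNPJ are matched in their punctuated forms:
# 000.000.000-00 and 00.000.000/0000-00.
_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "cpf": re.compile(r"\b\d{3}\.\d{3}\.\d{3}-\d{2}\b"),
    "cnpj": re.compile(r"\b\d{2}\.\d{3}\.\d{3}/\d{4}-\d{2}\b"),
}


def detect_pii(text: str) -> list:
    """Return (kind, matched_text) pairs, grouped by pattern kind."""
    findings = []
    for kind, pattern in _PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((kind, match.group(0)))
    return findings


def redact(text: str) -> str:
    """Replace each match with a placeholder such as [EMAIL]."""
    for kind, pattern in _PATTERNS.items():
        text = pattern.sub(f"[{kind.upper()}]", text)
    return text
```

Redacting before text reaches an LLM prompt is one way to honor the "no raw PII in LLM context" rule. A real detector would also need checksum validation and the unpunctuated digit forms.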
Tier model
Tier 0 Public — docs, open-source, ToS-bound
Tier 1 Internal — default; operational metadata; no PII
Tier 2 Confidential — business secrets, project keys; no PII
Tier 3 Restricted — any PII, financial, health
Classification precedence:
- If any PII regex matches → `RESTRICTED`
- If text contains a `_CONFIDENTIAL_HINTS` marker (`api_key=`, `secret=`, `-----BEGIN`, `client_secret`, `internal-only`, `do not share`, `confidential`) → `CONFIDENTIAL`
- If `hints.explicit_tier` provided → use it
- If `hints.source == "public_docs"` or `hints.license` in (`MIT`, `Apache-2.0`, `BSD-3`) → `PUBLIC`
- Default → `INTERNAL`
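The precedence order above can be sketched as a single function that returns at the first matching rule. This is an illustration under stated assumptions, not the `classifier` module itself: `hints` is a plain dict here, the PII check is a stand-in email regex, and marker matching is case-insensitive.

```python
import re
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Stand-in for the pii_detector patterns (assumed): email only.
_PII_PATTERNS = [re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")]

# Marker list copied from the precedence rules.
_CONFIDENTIAL_HINTS = (
    "api_key=", "secret=", "-----BEGIN", "client_secret",
    "internal-only", "do not share", "confidential",
)

_PUBLIC_LICENSES = {"MIT", "Apache-2.0", "BSD-3"}


def classify(text: str, hints: Optional[dict] = None) -> Tier:
    """Apply the rules in order; the first match wins."""
    hints = hints or {}
    # 1. Any PII match forces the highest tier.
    if any(p.search(text) for p in _PII_PATTERNS):
        return Tier.RESTRICTED
    # 2. Confidential markers (case-insensitive here, an assumption).
    lowered = text.lower()
    if any(marker.lower() in lowered for marker in _CONFIDENTIAL_HINTS):
        return Tier.CONFIDENTIAL
    # 3. An explicit tier from the caller.
    if hints.get("explicit_tier") is not None:
        return Tier(hints["explicit_tier"])
    # 4. Known-public sources or licenses.
    if hints.get("source") == "public_docs" or hints.get("license") in _PUBLIC_LICENSES:
        return Tier.PUBLIC
    # 5. Default.
    return Tier.INTERNAL
```

Because content checks run before hints, a caller cannot downgrade text that contains PII or a secret marker by passing `explicit_tier`.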
Audit chain (Rule 24)
Every audit event includes:
- `prev_hmac`: HMAC of the previous event (chains the log)
- `hmac_`: HMAC of (event_blob + prev_hmac), signed with the `AUDIT_HMAC_KEY` env var
AuditChain.verify() recomputes the chain and reports any mismatch, catching:
- Modification: any field of an existing line differs from its HMAC
- Insertion: a forged line's `prev_hmac` doesn't match the previous line's `hmac_`
- Deletion: the next line's `prev_hmac` points to a vanished predecessor
In production, store `AUDIT_HMAC_KEY` in a secrets vault, never in a committed `.env` file. Rotate it quarterly.
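The chaining and verification described in this section can be sketched with the standard library. This is a minimal in-memory illustration, not the `AuditChain` implementation: the `append_event` and `verify` functions, the all-zero genesis value, the JSON serialization of the event blob, and the choice of SHA-256 are all assumptions.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed prev_hmac for the very first event


def _sign(key: bytes, event_blob: str, prev_hmac: str) -> str:
    """HMAC over (event_blob + prev_hmac), hex-encoded."""
    message = (event_blob + prev_hmac).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append_event(log: list, key: bytes, event: dict) -> dict:
    """Append an event linked to the current tail of the log."""
    prev_hmac = log[-1]["hmac_"] if log else GENESIS
    event_blob = json.dumps(event, sort_keys=True)
    entry = {
        "event": event,
        "prev_hmac": prev_hmac,
        "hmac_": _sign(key, event_blob, prev_hmac),
    }
    log.append(entry)
    return entry


def verify(log: list, key: bytes) -> list:
    """Return the indexes of entries that fail the chain check."""
    bad = []
    expected_prev = GENESIS
    for index, entry in enumerate(log):
        event_blob = json.dumps(entry["event"], sort_keys=True)
        recomputed = _sign(key, event_blob, entry["prev_hmac"])
        link_ok = entry["prev_hmac"] == expected_prev
        mac_ok = hmac.compare_digest(recomputed, entry["hmac_"])
        if not (link_ok and mac_ok):
            bad.append(index)
        expected_prev = entry["hmac_"]
    return bad
```

In this sketch the key is passed in explicitly; a caller could read it from the `AUDIT_HMAC_KEY` environment variable. Editing an entry breaks its own HMAC, while removing an entry breaks the `prev_hmac` link of its successor.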
DSAR workflow
```python
from src.privacy import (
    AuditChain, DeletionCascade, DSARHandler,
)
from src.privacy.dsar import DSARAction
from src.ai import EpisodicStore, SemanticStore, ProceduralStore

ep, sm, pr = EpisodicStore(), SemanticStore(), ProceduralStore()
ac = AuditChain()
dc = DeletionCascade(episodic=ep, semantic=sm, procedural=pr, audit=ac)
dh = DSARHandler(episodic=ep, semantic=sm, procedural=pr, audit=ac, deletion=dc)

# ACCESS: scan only
req = dh.new_request(subject="user@example.com", action=DSARAction.ACCESS)
resp = dh.handle(req)
print(resp.found_records, resp.status)

# ERASURE: scan + delete (delegates to DeletionCascade)
req = dh.new_request(subject="user@example.com", action=DSARAction.ERASURE)
resp = dh.handle(req)
# Every deletion appends a DELETE event to the audit chain with the identifier_hash payload
```
An ERASURE request handled without an injected `DeletionCascade` fails fast with `erasure_requires_deletion_cascade` rather than silently logging intent (a bug we caught in Sprint 1).
Permissions matrix (default)
| Role | Max tier | Allowed purposes |
|---|---|---|
| operator | RESTRICTED | all |
| auditor-haiku | RESTRICTED | AUDIT, DSAR |
| cortex | CONFIDENTIAL | FEATURE_BUILD, ANALYTICS |
| code-writer | INTERNAL | FEATURE_BUILD, DEBUG |
| test-runner | INTERNAL | FEATURE_BUILD |
| file-operator | CONFIDENTIAL | FEATURE_BUILD, DEBUG |
PermissionManager.enforce(role, tier, purpose) raises PermissionDenied if denied. check() returns (allowed, reason) without raising.
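The relationship between `check()` and `enforce()` can be sketched as follows, using the default matrix from the table. This is a standalone illustration, not the `PermissionManager` class: the module-level functions, the `Tier` and `Purpose` enums, and the reason strings are assumptions.

```python
from enum import Enum, IntEnum
from typing import Tuple


class Tier(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


class Purpose(Enum):
    DEBUG = "DEBUG"
    FEATURE_BUILD = "FEATURE_BUILD"
    ANALYTICS = "ANALYTICS"
    AUDIT = "AUDIT"
    DSAR = "DSAR"


class PermissionDenied(Exception):
    pass


# role -> (max tier, allowed purposes), mirroring the default matrix
_MATRIX = {
    "operator": (Tier.RESTRICTED, frozenset(Purpose)),
    "auditor-haiku": (Tier.RESTRICTED, frozenset({Purpose.AUDIT, Purpose.DSAR})),
    "cortex": (Tier.CONFIDENTIAL, frozenset({Purpose.FEATURE_BUILD, Purpose.ANALYTICS})),
    "code-writer": (Tier.INTERNAL, frozenset({Purpose.FEATURE_BUILD, Purpose.DEBUG})),
    "test-runner": (Tier.INTERNAL, frozenset({Purpose.FEATURE_BUILD})),
    "file-operator": (Tier.CONFIDENTIAL, frozenset({Purpose.FEATURE_BUILD, Purpose.DEBUG})),
}


def check(role: str, tier: Tier, purpose: Purpose) -> Tuple[bool, str]:
    """Return (allowed, reason) without raising. Reason strings are hypothetical."""
    entry = _MATRIX.get(role)
    if entry is None:
        return False, "unknown_role"
    max_tier, purposes = entry
    if tier > max_tier:
        return False, "tier_exceeds_role_maximum"
    if purpose not in purposes:
        return False, "purpose_not_allowed"
    return True, "ok"


def enforce(role: str, tier: Tier, purpose: Purpose) -> None:
    """Raise PermissionDenied when check() reports a denial."""
    allowed, reason = check(role, tier, purpose)
    if not allowed:
        raise PermissionDenied(reason)
```

Building `enforce()` on top of `check()` keeps the decision logic in one place, so callers that want to branch and callers that want an exception always agree.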
Retention
RetentionScheduler.sweep() runs nightly (or manually). For each record:
- Use `record.ttl_days` if present
- Otherwise default per policy (`episodic=90`, `procedural=180`, `semantic=None`)
- Compare against `record.created_at`; delete if expired
Every deletion writes a DELETE_TTL audit event with ttl_days + age_days payload.
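The sweep logic can be sketched as below. This is an illustration under assumptions, not the `RetentionScheduler` code: the `Record` dataclass, the standalone `sweep` function, treating a `None` TTL as "never expires", and counting a record as expired once its age strictly exceeds the TTL are all choices made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

# Per-store defaults from the policy; None is assumed to mean "no expiry".
DEFAULT_TTL_DAYS = {"episodic": 90, "procedural": 180, "semantic": None}


@dataclass
class Record:
    store: str
    created_at: datetime
    ttl_days: Optional[int] = None


def effective_ttl(record: Record) -> Optional[int]:
    """A record-level ttl_days wins over the store default."""
    if record.ttl_days is not None:
        return record.ttl_days
    return DEFAULT_TTL_DAYS.get(record.store)


def sweep(records: List[Record], now: datetime) -> Tuple[List[Record], List[dict]]:
    """Return (kept_records, audit_events) for one sweep pass."""
    kept, events = [], []
    for record in records:
        ttl = effective_ttl(record)
        age = now - record.created_at
        if ttl is not None and age > timedelta(days=ttl):
            # Mirrors the DELETE_TTL payload: ttl_days + age_days
            events.append({"type": "DELETE_TTL", "ttl_days": ttl, "age_days": age.days})
        else:
            kept.append(record)
    return kept, events
```

Passing `now` in explicitly, rather than reading the clock inside `sweep`, makes the expiry decision deterministic and easy to test.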