Privacy-safe data governance and analytics framework for real teams.
COVENANT.DATA is a Python framework that helps teams handle sensitive data with fewer leaks and fewer mistakes. It provides tools for:
- Data contracts - Define what data you have, how sensitive it is, and how it may be used
- PII detection - Automatically scan for personally identifiable information
- Policy enforcement - Control who can access data and for what purposes
- Redaction - Safely export data with sensitive fields removed
- Privacy budgets - Track differential privacy spending
- Audit logging - Tamper-evident logs of all sensitive operations
- Lineage tracking - Full history of data transformations
- Verifiable bundles - Share data with proof of what was removed
The framework follows these principles:
- No silent data loss - All operations are logged and verifiable
- No silent data leakage - Access requires explicit policy approval
- Auditability beats convenience - Every decision can be reviewed
- Security beats features - Safe defaults over powerful defaults
- Reproducibility beats speed - Deterministic outputs from given inputs
COVENANT.DATA does not guarantee:
- Legal compliance by itself - You must still review your specific requirements
- That user data is truthful - Garbage in, garbage out
- That PII detection is perfect without human review - Use detection as a tool, not a replacement for judgment
- That differential privacy settings are appropriate for every case - Privacy parameters require expertise
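
To illustrate why detection output needs human review, here is a minimal, standard-library sketch of pattern-based PII scanning. It is not COVENANT.DATA's `PIIDetector` or its default ruleset; the patterns, the `scan_records` helper, and the sample records are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns for illustration only; real rulesets are broader.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def scan_records(records):
    """Return (row_index, field, category) for every pattern match."""
    findings = []
    for index, record in enumerate(records):
        for field, value in record.items():
            if not isinstance(value, str):
                continue  # only string values are scanned in this sketch
            for category, pattern in PATTERNS.items():
                if pattern.search(value):
                    findings.append((index, field, category))
    return findings


records = [
    {"name": "Alice Smith", "contact": "alice@example.com"},
    {"name": "Bob Jones", "contact": "555-123-4567"},
    # Obfuscated address: no pattern matches, so it goes undetected.
    {"name": "Carol Diaz", "contact": "carol at example dot com"},
]

findings = scan_records(records)
for finding in findings:
    print(finding)
```

The scanner flags the plain email and phone number but misses the obfuscated address in the third record, and it never flags the `name` values, even though names are PII. Gaps like these are why scan results are best treated as input to a review step.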
pip install covenant-data
covenant init my-project
cd my-project

# Define a data contract
from covenant.schema.contract import Contract, ContractVersion, FieldTag
from covenant.schema import SensitivityLevel, PIICategory
contract = Contract(
    id="users-dataset",
    name="User Records",
    version=ContractVersion(major=1, minor=0, patch=0),
    schema={
        "fields": {
            "id": {"type": "string"},
            "name": {"type": "string"},
            "email": {"type": "string"},
            "age": {"type": "integer"},
        }
    },
    field_tags={
        "id": FieldTag(sensitivity=SensitivityLevel.PUBLIC),
        "name": FieldTag(
            sensitivity=SensitivityLevel.CONFIDENTIAL,
            pii_category=PIICategory.NAME,
            is_pii=True,
        ),
        "email": FieldTag(
            sensitivity=SensitivityLevel.RESTRICTED,
            pii_category=PIICategory.EMAIL,
            is_pii=True,
        ),
        "age": FieldTag(sensitivity=SensitivityLevel.INTERNAL),
    },
    allowed_purposes=["academic_research", "testing"],
    retention={},
    export_rules={},
)
print(f"Contract hash: {contract.hash()}")

# Scan a dataset for PII
from covenant.pii.detector import PIIDetector
from covenant.pii.ruleset import default_ruleset

data = [
    {"name": "Alice Smith", "email": "alice@example.com"},
    {"name": "Bob Jones", "email": "bob@example.com"},
]
detector = PIIDetector(ruleset=default_ruleset())
result = detector.scan_dataset("users", data)
print(f"Found {len(result.detections)} PII occurrences")
print(f"PII fields: {result.pii_fields()}")

# Create a redaction plan
from covenant.redact.plan import create_redaction_plan, full_redaction_action
plan = create_redaction_plan(
    dataset_id="users",
    contract_id=contract.id,
    actions=[
        full_redaction_action("email", reason="PII"),
        full_redaction_action("name", reason="PII"),
    ],
)
print(f"Plan hash: {plan.hash()}")

# Apply the redaction plan
from covenant.redact.engine import apply_redaction_plan
redacted, result = apply_redaction_plan(plan, data)
print(f"Original: {data}")
print(f"Redacted: {redacted}")
print(f"Hash change: {result.original_hash[:16]}... -> {result.redacted_hash[:16]}...")

# Evaluate a policy request
from covenant.policy.engine import PolicyEngine, PolicyRequest
from covenant.core.context import Actor, Purpose
request = PolicyRequest(
    actor=Actor(identity="researcher1", roles=["researcher"]),
    purpose=Purpose.ACADEMIC_RESEARCH,
    resource_id="users",
    contract_id=contract.id,
    requested_fields=["id", "age"],  # No PII
)
engine = PolicyEngine()
decision = engine.evaluate(request)
print(f"Decision: {decision.decision}")
print(f"Reason: {decision.reason}")

# Create a verifiable bundle
from covenant.bundles.bundle import make_bundle
from covenant.bundles.format import write_bundle
# audit_log, lineage_graph, csv_data (str), and report (JSON str) are not
# defined in this snippet; they must be created in earlier steps.
bundle = make_bundle(
    contract=contract,
    audit_log=audit_log,
    lineage=lineage_graph,
    artifacts=[
        ("users.csv", csv_data.encode(), "text/csv"),
        ("redaction_report.json", report.encode(), "application/json"),
    ],
    created_by="data-team",
)
write_bundle(bundle, "users-export.bundle")

# Initialize a project
covenant init
# Create a contract
covenant contract create --name "User Data" -o contract.json
# Validate a contract
covenant contract validate contract.json
# Scan for PII
covenant scan pii data.csv --output pii-report.json
# Create a redaction plan
covenant redact plan contract.json --fields email,name -o plan.json
# Apply redaction
covenant redact apply plan.json data.csv -o redacted.csv
# Check a policy
covenant policy check policy.json --actor user1 --purpose research
# Show audit log
covenant audit show audit.json
# Create a bundle
covenant bundle make --contract contract.json --data data.csv -o export.bundle
# Verify a bundle
covenant bundle verify export.bundle
# Verify storage
covenant verify store ./storage

src/covenant/
├── core/ # Identity, hashing, canonical encoding
├── schema/ # Data contracts, validation, migration
├── pii/ # PII detection, rulesets
├── redact/ # Redaction plans, engine, reports
├── policy/ # Policy language, evaluation, proofs
├── audit/ # Hash-chained audit log
├── lineage/ # Data lineage graph, replay
├── privacy/ # Budget ledger, DP releases
├── bundles/ # Portable, verifiable bundles
├── storage/ # Content-addressed storage, verify/repair
├── server/ # Local web UI
├── cli/ # Command-line interface
└── viz/ # Report generation
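
The `audit/` entry above mentions a hash-chained log. As a rough illustration of the general technique (not the framework's actual log format, which AUDIT_LOG.md describes), the sketch below chains entries with SHA-256 over canonical JSON. The names `entry_hash`, `append`, `verify`, and `GENESIS` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    # Canonical JSON (sorted keys, fixed separators) keeps the hash deterministic.
    body = json.dumps(
        {"prev": prev_hash, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(log, payload):
    """Append an entry whose hash covers the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(log):
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"actor": "researcher1", "action": "read", "fields": ["id", "age"]})
append(log, {"actor": "data-team", "action": "export", "fields": ["id"]})
print(verify(log))  # True

log[0]["payload"]["actor"] = "someone-else"  # simulate tampering
print(verify(log))  # False
```

Because each hash covers the previous one, changing an early entry invalidates every later entry. A chain like this detects tampering but does not prevent it. In practice the latest hash would also need to be anchored somewhere an attacker cannot rewrite.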
- ARCHITECTURE.md - System architecture
- CONTRACTS.md - Data contracts explained
- POLICY_LANGUAGE.md - Policy syntax and evaluation
- AUDIT_LOG.md - Audit logging format
- REDACTION.md - Redaction system
- PRIVACY_BUDGETS.md - Differential privacy tracking
- BUNDLES.md - Secure data sharing
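
For readers new to privacy budgets, the core bookkeeping can be sketched in a few lines. The example assumes basic sequential composition, where the epsilon values of successive releases simply add up. The `PrivacyBudget` class and `BudgetExceeded` exception are hypothetical and are not the framework's ledger API. PRIVACY_BUDGETS.md describes the real system.

```python
class BudgetExceeded(Exception):
    """Raised when a release would push spending past the total budget."""


class PrivacyBudget:
    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total = total_epsilon
        self.spent = 0.0
        self.releases = []  # (label, epsilon) pairs, in order of spending

    def remaining(self):
        return self.total - self.spent

    def spend(self, epsilon, label):
        """Record a release, refusing it if the budget would be exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total:
            raise BudgetExceeded(
                f"{label!r} needs {epsilon}, only {self.remaining()} left"
            )
        self.spent += epsilon
        self.releases.append((label, epsilon))


budget = PrivacyBudget(total_epsilon=1.0)
budget.spend(0.25, "row count")
budget.spend(0.5, "mean age")
try:
    budget.spend(0.5, "age histogram")  # would exceed the total of 1.0
except BudgetExceeded as exc:
    print(f"Refused: {exc}")
print(f"Remaining epsilon: {budget.remaining()}")  # 0.25
```

Refusing a release outright, rather than silently reducing its epsilon, keeps the ledger consistent with what was actually published. Choosing the total budget itself still requires expertise.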
See the examples/ directory for complete examples:
- ngo_case_intake/ - NGO case management with PII handling
- school_research_release/ - Academic research data export
- clinic_like_data_demo/ - Healthcare data with redaction
- redaction_review_workflow/ - Review and approve redactions
- policy_denial_and_explain/ - Understanding policy decisions
- privacy_budget_release_report/ - DP releases with budget tracking
- bundle_share_and_replay/ - Verifiable data sharing
- Version: 0.1.0 (Alpha)
- APIs may change before 1.0.0
- File formats are versioned with migration support
- All changes are documented in CHANGELOG.md
- Default deny - Access is denied unless explicitly allowed
- Audit everything - All sensitive operations are logged
- Encrypt at rest - Storage supports encryption (optional)
- Verify imports - Bundles verify signatures on load
- No secrets in logs - Audit logs never contain raw sensitive data
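
The default-deny principle can be shown with a small evaluation loop: a request is allowed only when some rule explicitly covers the actor's role, the purpose, and every requested field. This is a simplified sketch with hypothetical `Rule`, `Decision`, and `evaluate` names. It does not reproduce the `PolicyEngine` API or the policy language described in POLICY_LANGUAGE.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    role: str
    purpose: str
    fields: frozenset  # fields this rule permits


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def evaluate(rules, roles, purpose, requested_fields):
    """Allow only if one rule covers the role, purpose, and all fields."""
    requested = set(requested_fields)
    for rule in rules:
        if (
            rule.role in roles
            and rule.purpose == purpose
            and requested <= rule.fields
        ):
            return Decision(True, f"allowed by rule for role {rule.role!r}")
    # Falling through means nothing matched: deny by default.
    return Decision(False, "no rule explicitly allows this request")


rules = [Rule("researcher", "academic_research", frozenset({"id", "age"}))]

ok = evaluate(rules, ["researcher"], "academic_research", ["id", "age"])
denied = evaluate(rules, ["researcher"], "academic_research", ["id", "email"])
print(ok)
print(denied)
```

The denial path is the function's fall-through, so a missing or misconfigured rule results in refusal, not accidental access.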
Report security issues privately to security@covenant.data.
Bugs and feature requests: https://github.com/covenant-data/covenant.data/issues
MIT License - see LICENSE for details.
See CONTRIBUTING.md