Trust & Safety Infrastructure

Civitas AI

Enterprise-grade content moderation with ML-powered classification, configurable policies, human-in-the-loop review, and immutable audit trails.

EU AI Act Ready • NIST AI RMF • SOC 2 Controls

The Challenge

Content Moderation at Scale

Millions of pieces of user-generated content daily
Toxic, harmful, and policy-violating content
Real-time decision requirements
Multi-platform, multi-language challenges

Regulatory Pressure

EU AI Act compliance requirements
Transparency and explainability mandates
Human oversight obligations
Immutable audit trail requirements

Operational Complexity

Inconsistent moderation decisions
No visibility into AI decision-making
Difficult policy enforcement
Missing evidence for appeals

The Solution

ML-powered automated classification
Configurable, versioned policies
Human-in-the-loop escalation
Cryptographically secured audit trail

Architecture

🌐 Cloudflare Pages: React Frontend

🚪 Gateway: Rate Limiting, CORS, Auth

🤖 Moderation: HuggingFace ML

📋 Policy Engine: Configurable Rules

Supabase PostgreSQL (Pooled) • Upstash Redis (TLS) • Cloud Run (Serverless)

Live Demo

API in Action

Request / Response

POST /api/v1/moderate

{
  "content": "Hello...",
  "source": "demo"
}

Response

{
  "action": "allow",
  "category_scores": {...}
}

Policy Engine

Configurable Thresholds

Toxicity → Block 0.80

Hate → Block 0.70

Harassment → Warn 0.75

Profanity → Warn 0.90
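As a rough illustration of how thresholds like these could turn category scores into an action, here is a minimal sketch. The `Rule` type, the `evaluate` function, the inclusive `>=` comparison, and the "strictest action wins" ordering are all assumptions made for the example, not the Policy Engine's documented behavior.

```python
from dataclasses import dataclass

# Assumed severity order, used to pick the strictest triggered action.
SEVERITY = {"allow": 0, "warn": 1, "block": 2}


@dataclass(frozen=True)
class Rule:
    category: str
    action: str       # "warn" or "block"
    threshold: float  # assumed: rule fires when score >= threshold


# Threshold values taken from the list above.
RULES = [
    Rule("toxicity", "block", 0.80),
    Rule("hate", "block", 0.70),
    Rule("harassment", "warn", 0.75),
    Rule("profanity", "warn", 0.90),
]


def evaluate(category_scores: dict[str, float], rules=RULES) -> str:
    """Return the strictest action whose rule fires, or "allow" if none do."""
    action = "allow"
    for rule in rules:
        score = category_scores.get(rule.category, 0.0)
        if score >= rule.threshold and SEVERITY[rule.action] > SEVERITY[action]:
            action = rule.action
    return action
```

Under these assumptions, scores of 0.92 for toxicity and 0.95 for hate resolve to "block", while a lone harassment score of 0.80 resolves to "warn".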

Multi-Policy Support

Standard Community Guidelines

v1 • Global • Published

Active

Youth Safe Mode

v1 • Under 13 • Published

Active

Relaxed Forum Policy

v1 • US Forums • Draft

Draft
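One way to picture the policy list above is as versioned records with a scope and a status. The sketch below is illustrative only: the `Policy` fields and the rule that only published policies are enforced are assumptions inferred from the listing, not a documented schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    version: int
    scope: str
    status: str  # assumed values: "published" or "draft"


# The three policies shown above, expressed as records.
POLICIES = [
    Policy("Standard Community Guidelines", 1, "Global", "published"),
    Policy("Youth Safe Mode", 1, "Under 13", "published"),
    Policy("Relaxed Forum Policy", 1, "US Forums", "draft"),
]


def active_policies(policies=POLICIES) -> list[Policy]:
    """Assumed rule: published policies are enforced, drafts are not."""
    return [p for p in policies if p.status == "published"]
```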

Human-in-the-Loop

Review Queue Workflow

1. Content Escalated: ML confidence below threshold or edge case detected

2. Moderator Review: Human reviews content with ML recommendations

3. Decision with Rationale: Approve/Reject/Escalate with mandatory explanation

4. Evidence Recorded: Immutable audit trail with cryptographic hash
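The first and third steps of this workflow can be sketched in a few lines. Everything named here (`needs_human_review`, `ReviewDecision`, `Verdict`, and the 0.6 confidence cutoff) is hypothetical and chosen only to show the shape of the checks: escalate on low confidence, and refuse a decision that has no rationale.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    ESCALATE = "escalate"


def needs_human_review(confidence: float, min_confidence: float = 0.6) -> bool:
    """Step 1: escalate when model confidence is below the (assumed) cutoff."""
    return confidence < min_confidence


@dataclass(frozen=True)
class ReviewDecision:
    decision_id: str
    moderator_id: str
    verdict: Verdict
    rationale: str

    def __post_init__(self):
        # Step 3: the explanation is mandatory, so blank rationales are rejected.
        if not self.rationale.strip():
            raise ValueError("a rationale is required for every review decision")
```

Validating the rationale at construction time means a decision without an explanation can never reach the evidence-recording step.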

Moderator Actions

Compliance & Audit

Immutable Evidence Records

{
  "id": "e1000000-0000-...",
  "control_id": "MOD-001",
  "decision_id": "d0000000-...",
  "automated_action": "block",
  "category_scores": {
    "toxicity": 0.92,
    "hate": 0.95
  },
  "submission_hash": "sha256:a7f3b...",
  "immutable": true,
  "integrity_hash": "sha256:c9d2e..."
}

Audit Trail Features

✓ Cryptographic hash chain
✓ Tamper detection triggers
✓ Full decision lineage
✓ CSV/JSON export
✓ Policy version tracking
✓ Human review rationale
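To make the hash-chain and tamper-detection items concrete, here is a minimal sketch of one common construction: each record's hash covers its own payload plus the previous record's hash. The record layout, the `GENESIS` anchor, and the function names are assumptions for illustration; only the `integrity_hash` field name and the `sha256:` prefix come from the evidence record shown earlier.

```python
import hashlib
import json

# Assumed anchor value that the first record links back to.
GENESIS = "sha256:" + "0" * 64


def _digest(payload: dict, prev_hash: str) -> str:
    # Canonical JSON (sorted keys, compact separators) keeps the hash stable.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain: list[dict], payload: dict) -> dict:
    """Append a record whose hash covers the payload and the previous hash."""
    prev_hash = chain[-1]["integrity_hash"] if chain else GENESIS
    record = {
        "payload": payload,
        "prev_hash": prev_hash,
        "integrity_hash": _digest(payload, prev_hash),
    }
    chain.append(record)
    return record


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash in order; any edit or reordering breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["integrity_hash"] != _digest(record["payload"], prev_hash):
            return False
        prev_hash = record["integrity_hash"]
    return True
```

Because each hash depends on the one before it, changing any earlier record invalidates every later link, which is what lets a verifier detect tampering without trusting the storage layer.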

Regulatory Compliance

🇪🇺 EU AI Act: Art. 9, 13, 14, 15, 17 (12 controls mapped)

🏛️ NIST AI RMF: MAP, MEASURE, MANAGE, GOVERN (8 controls mapped)

🌐 ISO 42001: Clause 6, 8, 9 (6 controls mapped)

🔒 GDPR: Art. 22, 35 (5 controls mapped)

✅ SOC 2: CC6, CC7, CC8 (7 controls mapped)

18 implemented controls with full traceability to regulatory requirements (a single control can map to more than one framework, so the per-framework counts overlap)

Knowledge Graph

112 nodes • 138 relationships • Neo4j Aura

Integration Patterns

REST API

Direct HTTP integration with JSON payloads

POST /api/v1/moderate
Authorization: Bearer {api_key}
{"content": "...", "source": "web"}
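For reference, the same call can be assembled with nothing but the Python standard library. This is a sketch under assumptions: the helper names, the `base_url` parameter, and the `Content-Type` header are not stated in the source, and the response is assumed to be a JSON object like the one shown under Request / Response.

```python
import json
import urllib.request


def build_moderation_request(base_url: str, api_key: str, content: str,
                             source: str = "web") -> urllib.request.Request:
    """Assemble the POST /api/v1/moderate request without sending it."""
    body = json.dumps({"content": content, "source": source}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/moderate",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed; not stated above
        },
    )


def moderate(base_url: str, api_key: str, content: str,
             source: str = "web", timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response body."""
    req = build_moderation_request(base_url, api_key, content, source)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Splitting request construction from sending keeps the payload and headers easy to inspect in tests without touching the network.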
          

Mobile SDK

Native iOS/Android with offline queue

CivitasSDK.moderate(text) { result ->
  when (result.action) {
    ALLOW -> publish()
    BLOCK -> reject()
    else -> Unit // other actions (e.g. warn, escalate): handle per your policy
  }
}
          

LLM Guardrails

Pre/post-processing for LLM outputs

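def guarded_generate(prompt):
    # Moderate the model output before it reaches the user.
    llm_output = model.generate(prompt)
    result = civitas.moderate(llm_output)
    if result.action == "block":
        return SAFE_FALLBACK
    return llm_output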
          

Webhooks

Event-driven notifications

{
  "event": "moderation.decision",
  "action": "escalate",
  "decision_id": "..."
}
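A receiver for this payload might route on the `event` and `action` fields. The sketch below is hypothetical throughout: the function name and the returned step names ("enqueue_for_review", "remove_content", and so on) are placeholders, and it does no signature or authenticity check because the source does not describe one.

```python
import json


def handle_webhook(raw_body: bytes) -> str:
    """Parse a webhook payload and return the name of the follow-up step."""
    event = json.loads(raw_body)
    if event.get("event") != "moderation.decision":
        return "ignored"  # unrelated event types are skipped
    action = event.get("action")
    if action == "escalate":
        return "enqueue_for_review"
    if action == "block":
        return "remove_content"
    return "no_op"  # allow, warn, or any action this sketch does not handle
```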
          

Cloud Deployment

Live URLs

Frontend: civitas.pages.dev

API: gateway-xxx.run.app

Database: Supabase (us-west-2)

Graph: Neo4j Aura

4 Microservices • <100ms API Latency (p95) • 99.9% Uptime SLA

Roadmap

Phase 1: Foundation

Core moderation, policy engine, review queue, audit trail

Complete: ML Classification • Policy Rules • Evidence Chain

Phase 2: Scale

Multi-language support, custom ML models, real-time streaming

Q2 2026: i18n • Fine-tuning • WebSocket API

Phase 3: Enterprise

Multi-tenant, SSO, advanced analytics, SLA dashboard

Q4 2026: SAML/OIDC • Tenant Isolation • BI Integration

Get Started

Enterprise-grade content moderation, ready for production

Try the API • View on GitHub

Documentation

API reference, integration guides, and examples

Contact

proth1@gmail.com

License

MIT Open Source
