Multilingual PII detection with NLI verification. Single endpoint, two modes.
POST /api/detect
Behavior depends on the Accept header:
| Accept Header | Mode | Description |
|---|---|---|
| application/json | JSON | NER only, fast (~200 ms), no NLI |
| text/event-stream | SSE | Full pipeline: NER → NLI verification → streamed progressively |
POST /api/detect
Content-Type: application/json

{
  "text": "John Doe lives at 123 Main St, NRIC S1234567A",
  "labels": null,
  "budget": 5000
}
| Field | Type | Default | Description |
|---|---|---|---|
| text | string | required | Text to analyze |
| labels | string[] | null | Entity types to detect (null = auto-select all) |
| budget | number | 5000 | Max NLI verification time in ms (SSE mode only) |
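The request fields above can be assembled with a small client-side helper. The sketch below is illustrative only: `build_detect_request` is a hypothetical name, and the validation rules (non-empty text, positive budget) are assumptions rather than documented server behavior.

```python
import json
from typing import Dict, Optional, Sequence, Tuple

DEFAULT_BUDGET_MS = 5000  # default taken from the field table above


def build_detect_request(
    text: str,
    labels: Optional[Sequence[str]] = None,
    budget: int = DEFAULT_BUDGET_MS,
    stream: bool = False,
) -> Tuple[Dict[str, str], str]:
    """Return (headers, body) for POST /api/detect.

    stream=True selects SSE mode through the Accept header;
    otherwise the request asks for the plain JSON mode.
    """
    # Assumed client-side checks; the API's own validation is not documented here.
    if not isinstance(text, str) or not text:
        raise ValueError("text is required")
    if budget <= 0:
        raise ValueError("budget must be a positive number of milliseconds")

    headers = {
        "Content-Type": "application/json",
        "Accept": "text/event-stream" if stream else "application/json",
    }
    payload = {
        "text": text,
        "labels": list(labels) if labels is not None else None,
        "budget": budget,
    }
    return headers, json.dumps(payload)
```

Calling `build_detect_request("John Doe", stream=True)` yields headers with `Accept: text/event-stream` and a body in which `labels` is `null` and `budget` is `5000`.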
JSON mode: fast, NER-only detection. In the response, safe: null indicates that NLI verification was not performed.
curl -X POST https://pii.engineer/api/detect \
  -H "Content-Type: application/json" \
  -d '{"text": "John Doe, NRIC S1234567A"}'
{
  "entities": [
    {
      "type": "person",
      "value": "John Doe",
      "start": 0,
      "end": 8,
      "score": 0.99,
      "needs_review": false
    },
    {
      "type": "nric",
      "value": "S1234567A",
      "start": 15,
      "end": 24,
      "score": 0.98,
      "needs_review": false
    }
  ],
  "redacted": "[PERSON], NRIC [NRIC]",
  "safe": null,
  "leaks": [],
  "rounds": 0
}
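The `start` and `end` offsets in the response above can be used to rebuild the `redacted` string on the client. The following is a minimal sketch, not the service's implementation. It assumes that `end` is exclusive, that offsets index the text as Python string positions, that spans do not overlap, and that the placeholder is the uppercased `type` in square brackets (inferred from the sample response). `redact` is a hypothetical helper name.

```python
from typing import Dict, List


def redact(text: str, entities: List[Dict]) -> str:
    """Replace each [start, end) span with an uppercase [TYPE] placeholder."""
    out = text
    # Work from right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['type'].upper()}]"
        out = out[: ent["start"]] + placeholder + out[ent["end"]:]
    return out


text = "John Doe, NRIC S1234567A"
entities = [
    {"type": "person", "value": "John Doe", "start": 0, "end": 8},
    {"type": "nric", "value": "S1234567A", "start": 15, "end": 24},
]
print(redact(text, entities))  # [PERSON], NRIC [NRIC]
```

Replacing from the last span backwards avoids having to adjust offsets as the string length changes.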
SSE mode: full pipeline with NLI verification. Results are streamed progressively as Server-Sent Events.
curl -N -X POST https://pii.engineer/api/detect \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"text": "John Doe, NRIC S1234567A"}'
detection — NER results (always first, ~200ms)
event: detection
data: {"round": 1, "entities": [...], "redacted": "..."}
verification — NLI result per entity (streamed one by one)
event: verification
data: {"type": "person", "original_value": "John Doe", "entailment": 0.82, "leaking": true}
Only emitted for entities with score < 0.8. High-confidence entities are auto-verified.
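The threshold rule above lets a client predict which entities from a detection event should later produce a verification event. The sketch below mirrors only the stated rule (score below 0.8 goes to NLI). The server's actual selection logic is not shown in this document, and `split_by_confidence` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

AUTO_VERIFY_THRESHOLD = 0.8  # threshold stated in the rule above


def split_by_confidence(
    entities: List[Dict], threshold: float = AUTO_VERIFY_THRESHOLD
) -> Tuple[List[Dict], List[Dict]]:
    """Split entities into (needs_nli, auto_verified) by NER score."""
    needs_nli = [e for e in entities if e["score"] < threshold]
    auto_verified = [e for e in entities if e["score"] >= threshold]
    return needs_nli, auto_verified


# Illustrative scores, not real API output.
sample = [
    {"type": "person", "value": "John Doe", "score": 0.99},
    {"type": "address", "value": "123 Main St", "score": 0.75},
]
needs_nli, auto_verified = split_by_confidence(sample)
print([e["type"] for e in needs_nli])      # ['address']
print([e["type"] for e in auto_verified])  # ['person']
```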
refinement — Re-detection if leaks found
event: refinement
data: {"round": 2, "entities": [...], "redacted": "..."}
complete — Final result (always last)
event: complete
data: {
  "entities": [...],
  "redacted": "...",
  "safe": true,
  "leaks": [...],
  "rounds": 1,
  "verified": 3,
  "skipped": 0
}
| Field | Description |
|---|---|
| safe | true/false if fully verified, null if budget exceeded |
| verified | Number of entities verified (NLI or auto-skip) |
| skipped | Number of entities skipped due to budget timeout |
| rounds | Number of NER detection rounds (>1 if refinement triggered) |
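A client usually needs to turn the fields above into a single decision. The sketch below shows one way to do that. `summarize_complete` is a hypothetical helper, and it assumes that `safe: false` means verification finished but leaks remain, which the table implies without stating outright.

```python
from typing import Any, Dict


def summarize_complete(event: Dict[str, Any]) -> str:
    """Turn a `complete` payload into a one-line status string."""
    safe = event.get("safe")
    verified = event.get("verified", 0)
    skipped = event.get("skipped", 0)
    rounds = event.get("rounds", 0)

    if safe is None:
        # Budget ran out before every entity could be verified.
        return f"unverified: {verified} verified, {skipped} skipped"
    if safe:
        return f"safe: {verified} verified in {rounds} round(s)"
    # Assumption: safe == False means leaks were still present at the end.
    leaks = len(event.get("leaks", []))
    return f"leaking: {leaks} leak(s) after {rounds} round(s)"


done = {"safe": True, "leaks": [], "rounds": 1, "verified": 3, "skipped": 0}
print(summarize_complete(done))  # safe: 3 verified in 1 round(s)
```

Treating `None` separately from `False` matters here: a timed-out verification is not the same as a failed one, and callers may want to retry with a larger `budget` in the first case.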
The budget parameter caps the time (in ms) spent on NLI verification after NER completes. If the budget is exceeded, the remaining entities are skipped and the complete event has safe: null.
const r = await fetch("https://pii.engineer/api/detect", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Accept": "text/event-stream",
  },
  body: JSON.stringify({ text: "John Doe, phone 0123456789" }),
});

const reader = r.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
let currentEvent = null; // event name from the most recent "event:" line

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep the trailing partial line for the next chunk
  for (const line of lines) {
    if (line.startsWith("event: ")) currentEvent = line.slice(7).trim();
    else if (line.startsWith("data: ") && currentEvent) {
      const data = JSON.parse(line.slice(6));
      console.log(currentEvent, data);
      currentEvent = null;
    }
  }
}
import requests

r = requests.post(
    "https://pii.engineer/api/detect",
    json={"text": "John Doe, NRIC S1234567A"},
)
r.raise_for_status()  # fail early on HTTP errors
data = r.json()
print(data["redacted"])  # [PERSON], NRIC [NRIC]
print(data["entities"])  # [{type, value, start, end, score, needs_review}, ...]
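SSE mode can also be consumed from Python. The sketch below is one possible approach, not an official client. `parse_sse` and `stream_detect` are hypothetical names, and the parser assumes standard SSE framing in which each event is terminated by a blank line and each event's data is a single JSON document.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_name, parsed_json) pairs from decoded SSE lines."""
    event: Optional[str] = None
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].lstrip())
        elif line == "" and data_parts:
            # A blank line ends the event; multiple data lines are joined.
            yield (event or "message", json.loads("\n".join(data_parts)))
            event, data_parts = None, []
    if data_parts:
        # Flush a final event if the stream ended without a blank line.
        yield (event or "message", json.loads("\n".join(data_parts)))


def stream_detect(
    text: str, url: str = "https://pii.engineer/api/detect"
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Call the endpoint in SSE mode and yield events as they arrive."""
    import requests  # third-party; imported lazily so parse_sse has no dependencies

    with requests.post(
        url,
        json={"text": text},
        headers={"Accept": "text/event-stream"},
        stream=True,
        timeout=30,
    ) as r:
        r.raise_for_status()
        # iter_lines() yields bytes; decode explicitly as UTF-8.
        yield from parse_sse(raw.decode("utf-8") for raw in r.iter_lines())
```

A caller would iterate with `for name, data in stream_detect("John Doe, NRIC S1234567A"):` and stop once `name == "complete"`. Keeping the parser separate from the HTTP call makes it testable against canned lines without network access.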
# JSON mode (fast, NER only)
curl -X POST https://pii.engineer/api/detect \
  -H "Content-Type: application/json" \
  -d '{"text": "My name is Ahmad, IC T1234567G"}'

# SSE mode (full pipeline with NLI)
curl -N -X POST https://pii.engineer/api/detect \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"text": "My name is Ahmad, IC T1234567G"}'