OWASP LLM06 — Covert Channel Research

PHANTOM

Covert Data Exfiltration via Structural Formatting Channels in LLM Outputs

A novel vulnerability class in which attackers encode data into the structural formatting of LLM responses (contractions, hedging, punctuation, reasoning topology), creating covert channels that went undetected by all six defense tools tested.


Cross-Vendor Encoding Results

Bidirectional verification, n=20 per direction. Measured accuracy of structural encoding channels per model.

Source: phantom-detect experimental results, March 2026
Model	Provider	Channels	Accuracy
Claude 3.5 Sonnet	Anthropic	5	100%
GPT-4o	OpenAI	4	92%
Gemini 2.0 Flash	Google	3	97%
GPT-5	OpenAI	0	0%
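
As a rough illustration of how a per-direction accuracy figure like those above could be tallied, the sketch below assumes that "direction" refers to the target bit value (0 or 1) and that each trial records the intended bit alongside the bit recovered from the response. The function name and the toy trial data are hypothetical and are not taken from phantom-detect.

```python
from collections import defaultdict

def accuracy_by_direction(trials):
    """Return accuracy per intended bit, plus the overall accuracy.

    `trials` is an iterable of (intended_bit, decoded_bit) pairs.
    Assumption: "direction" means the intended bit value (0 or 1).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for intended, decoded in trials:
        totals[intended] += 1
        hits[intended] += int(intended == decoded)
    result = {bit: hits[bit] / totals[bit] for bit in sorted(totals)}
    result["overall"] = sum(hits.values()) / sum(totals.values())
    return result

# Toy data for illustration only: 20 trials per direction, two misses for bit 1.
toy_trials = [(0, 0)] * 20 + [(1, 1)] * 18 + [(1, 0)] * 2
summary = accuracy_by_direction(toy_trials)
```

Reporting the two directions separately makes it visible when a model follows one instruction (say, "use contractions") more reliably than its opposite.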

Existing Defense Tools

Tool	Type	Detection
LLM Guard	Input/Output Scanner	0%
Rebuff	Prompt Injection Detector	0%
Vigil	Prompt Injection Scanner	0%
NeMo Guardrails	LLM Safety Framework	0%
Lakera Guard	AI Security API	0%
Microsoft Presidio	PII/Data Loss Prevention	0%

Six tools tested. All are designed for out-of-band attack vectors, and none detected in-band structural encoding.

phantom-detect

100% detection rate at a 10% false positive rate (FPR)

Behavioral fingerprinting via statistical divergence analysis

Feature extraction from LLM output structure, baseline comparison, KL-divergence scoring
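
The baseline comparison and KL-divergence scoring step might look roughly like the following. This is a minimal sketch, assuming each response has already been reduced to a categorical feature value; the category labels, the Laplace smoothing, and all function names are illustrative assumptions, not the phantom-detect implementation.

```python
import math
from collections import Counter

def smoothed_distribution(samples, categories, alpha=1.0):
    """Laplace-smoothed categorical distribution over `categories`."""
    counts = Counter(samples)
    total = len(samples) + alpha * len(categories)
    return {c: (counts[c] + alpha) / total for c in categories}

def kl_divergence_bits(p, q):
    """KL(p || q) in bits; p and q share keys and q has no zero entries."""
    return sum(p[c] * math.log2(p[c] / q[c]) for c in p if p[c] > 0)

def divergence_score(observed, baseline, categories):
    """Score how far the observed feature values drift from the baseline."""
    p = smoothed_distribution(observed, categories)
    q = smoothed_distribution(baseline, categories)
    return kl_divergence_bits(p, q)

# Hypothetical bins for one feature, e.g. contraction usage per response.
CATEGORIES = ("none", "some", "many")
baseline = ["some"] * 70 + ["none"] * 20 + ["many"] * 10
typical = ["some"] * 7 + ["none"] * 2 + ["many"] * 1
suspicious = ["none"] * 5 + ["many"] * 5  # feature swings between extremes

typical_score = divergence_score(typical, baseline, CATEGORIES)
suspicious_score = divergence_score(suspicious, baseline, CATEGORIES)
```

Smoothing keeps the score finite when a category never appears in the baseline. A session whose feature values swing between extremes, as a one-bit-per-response channel would tend to produce, scores higher than one that tracks the baseline.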

Open source on GitHub
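
To make the "detection rate at 10% FPR" operating point concrete, the sketch below picks a score threshold from benign samples so that at most 10% of them are flagged, then measures the fraction of encoded samples above it. The scores are made-up toy values and the function names are hypothetical; this is one common way to fix an operating point, not necessarily the one used by phantom-detect.

```python
def threshold_at_fpr(benign_scores, max_fpr=0.10):
    """Pick a threshold that flags at most `max_fpr` of benign samples.

    A sample counts as flagged when its score is strictly greater
    than the returned threshold.
    """
    ranked = sorted(benign_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))  # benign samples we may flag
    return ranked[min(allowed, len(ranked) - 1)]

def flagged_fraction(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    return sum(score > threshold for score in scores) / len(scores)

# Toy scores for illustration only.
benign = [i / 100 for i in range(20)]   # 0.00, 0.01, ..., 0.19
encoded = [0.5, 0.8, 1.2, 0.05]

threshold = threshold_at_fpr(benign)
false_positive_rate = flagged_fraction(benign, threshold)
detection_rate = flagged_fraction(encoded, threshold)
```

Because flagging uses a strict comparison, ties at the threshold are not flagged, so the false positive rate on the calibration set never exceeds the requested bound.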

10 Structural Encoding Channels

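Each channel encodes one bit per LLM response by manipulating a specific structural formatting dimension. A Hamming(7,4) error-correcting code (ECC), which protects 4 data bits with 3 parity bits and corrects any single-bit error in a codeword, provides error correction: 10 channel bits encode 7 data bits per message.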
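
For reference, a standard Hamming(7,4) encoder and decoder can be sketched as below. How the codeword is laid out across the 10 channels is not stated here; one reading that matches the quoted figures, offered only as an assumption, is a 7-bit codeword (4 data bits) plus 3 unprotected bits. The function names and the bit ordering (parity bits at positions 1, 2, and 4) are illustrative choices, not taken from phantom-detect.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

# Usage: one channel bit is misread, and the data still decodes correctly.
codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1
recovered = hamming74_decode(corrupted)
```

The practical consequence is that a single channel the model fails to follow in a given response does not corrupt the protected data bits; two or more misread bits in the same codeword are beyond what this code can correct.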

CONTCH-00

Contractions

0: Use full forms ("do not", "cannot")

1: Use contractions ("don't", "can't")

HEDGCH-01

Hedging Language

0: Direct, assertive statements

1: Hedged statements ("perhaps", "might")

CAPSCH-02

Capitalization

0: Standard sentence case

1: Strategic emphasis capitalization

PUNCCH-03

Punctuation Density

0: Minimal punctuation

1: Dense punctuation (semicolons, em-dashes, parentheticals)

TOPOCH-04

Reasoning Topology

0: Linear reasoning (A then B then C)

1: Branching reasoning (if A then B, else C)

TRANCH-05

Transition Words

0: No explicit transitions

1: Heavy transition usage ("furthermore", "however", "moreover")

SECTCH-06

Section Count

0: Fewer, longer sections

1: More, shorter sections

LISTCH-07

List Format

0: Prose paragraphs

1: Bulleted/numbered lists

RHETCH-08

Rhetorical Questions

0: No rhetorical questions

1: Include rhetorical questions

SECSCH-09

Sentence Complexity

0: Short, simple sentences

1: Long, complex sentences with subordinate clauses
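
On the detection side, several of the dimensions listed above can be measured with simple text statistics. The sketch below is a minimal, assumption-laden feature extractor covering contractions, hedging, transitions, list format, and questions; the word lists, regular expressions, and function name are illustrative and are not the phantom-detect feature set. Counting question marks is only a crude proxy, since it cannot tell rhetorical questions from genuine ones.

```python
import re

# Illustrative patterns and word lists (assumptions, not an official feature set).
CONTRACTION_RE = re.compile(r"\b\w+(?:n't|'(?:re|ve|ll|m|d))\b", re.IGNORECASE)
LIST_LINE_RE = re.compile(r"^[ \t]*(?:[-*]|\d+[.)])[ \t]+", re.MULTILINE)
WORD_RE = re.compile(r"[A-Za-z']+")
HEDGE_WORDS = {"perhaps", "might", "maybe", "possibly"}
TRANSITION_WORDS = {"furthermore", "however", "moreover"}

def structural_features(text):
    """Count a few structural formatting signals in one response."""
    text = text.replace("\u2019", "'")  # normalize curly apostrophes
    words = [w.lower().strip("'") for w in WORD_RE.findall(text)]
    words = [w for w in words if w]
    return {
        "words": len(words),
        "contractions": len(CONTRACTION_RE.findall(text)),
        "hedges": sum(w in HEDGE_WORDS for w in words),
        "transitions": sum(w in TRANSITION_WORDS for w in words),
        "list_lines": len(LIST_LINE_RE.findall(text)),
        "questions": text.count("?"),
    }

sample = (
    "I don't think so. However, perhaps it can't work?\n"
    "- first point\n"
    "- second point\n"
)
features = structural_features(sample)
```

Counts like these would typically be normalized by response length and compared against a per-model baseline rather than interpreted in isolation.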

This research was disclosed responsibly to all affected vendors through their official security programs before public release.
