A researcher reads model weights at 09:14. Normal. At 02:34 the same account reads the same weights again, then bulk-downloads them to a personal Google Drive. Every credential checks out. No policy fires. The data is gone.
That is the move most tools never stop, because every step in it was permitted. Data exfiltration prevention earns its name only if it catches that move while the data is moving, rather than guessing at it in advance or reconstructing it from logs after the disclosure letter goes out. The tool that guesses ahead drowns you in false alarms and still misses the permitted transfer. The tool that reads logs afterward tells you what already left. Runtime data movement governance watches the move as it happens and resolves it to a real identity and the job behind it. The signal is the pattern across the moves, not the content of any one of them.
For the broader detection category, see our DDR security guide. To weigh a specific vendor, compare Hilt vs Cyberhaven.
What Counts as Exfiltration
Exfiltration is data moving from inside the organization to a destination outside it. It is usually the last step of a breach, the moment the stolen thing actually leaves. Three actors drive it:
- External attackers who gain access through compromised credentials, vulnerabilities, or supply chain attacks, then stage and extract targeted data
- Malicious insiders, meaning employees, contractors, or partners who intentionally steal proprietary data, trade secrets, or customer information
- Negligent insiders who accidentally expose data through misconfiguration, shadow IT, or pasting sensitive information into AI tools like ChatGPT or Claude
The numbers track that arc. IBM's Cost of a Data Breach Report put the average breach at $4.88 million in 2024. Mega-breaches involving large-scale exfiltration run past $300 million. The Identity Theft Resource Center counted over 3,300 breaches in 2025, against an estimated $34 billion in annual US losses.
How It Works
Prevention that catches the permitted move has to do three things: see the movement, judge it, and contain the host. Take them in order.
1. Watch the movement
Watch data movement everywhere it happens: cloud workloads, user endpoints, network boundaries. Where you watch from sets what you can see. The kernel is the deepest vantage, because anything that moves through the operating system passes through it no matter which application started the move.
| Vantage | What It Sees | Example Approach |
|---|---|---|
| Kernel | Data movement at the operating-system layer: file reads, writes, network connections, process activity | Runtime collector watching at the kernel (used by Hilt) |
| User-space | Application-level events: file access via APIs, browser activity | Agent hooks (used by Cyberhaven, DTEX, Varonis) |
| Network | Wire-level traffic: egress destinations, transfer volumes, protocol analysis | TAP/SPAN port capture |
A kernel vantage sees movement that passes through the operating system. User-space telemetry sees only what applications choose to report. That is an architectural gap, not a setting you can flip. Hilt watches at the kernel with one lightweight collector that reads metadata by default, stays off the path, and runs single-tenant in your own cloud. Content-aware inspection is there when you want it. The default means you do not have to read your data to watch it move.
2. Judge the behavior
What the collector observes gets scored against a model of normal that surfaces the anomalous move. Three tiers do the work:
Tier 1: Deterministic rules. Pattern matching against known-bad behavior and policy violations. Fast and predictable, blind to anything it has not seen before. This is where DLP lives.
Tier 2: Behavioral baselines. Statistical models learn normal for each user, service account, resource, and time window. A researcher reading model weights at 02:34, when their baseline says they work days, scores as unusual even with valid permissions in hand.
Tier 3: Pattern reasoning. Judgment that spans many signals. It connects a chain of permitted actions (read a file, compress it, package it, upload it) into a shape you can name. Each move resolves to a probabilistic, source-dependent identity, so the judgment is made across the moves, not on any single event.
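The Tier 3 idea of reading a chain rather than a single event can be illustrated with a minimal sketch. It assumes moves have already been resolved to an identity and labeled with one of four stage names taken from the example above. The `Move` type, the stage labels, and the 10-minute window are all hypothetical choices for illustration, not Hilt's implementation.

```python
from dataclasses import dataclass

# Hypothetical stage labels, taken from the chain described in the text.
CHAIN = ("read", "compress", "package", "upload")


@dataclass(frozen=True)
class Move:
    ts: float       # seconds since some reference point
    identity: str   # identity the move resolved to (assumed already resolved)
    action: str     # a CHAIN stage, or any other label


def _has_chain(seq, window_s):
    """True if seq (time-ordered) contains CHAIN as an ordered subsequence
    that starts and ends within window_s seconds."""
    for i, start in enumerate(seq):
        if start.action != CHAIN[0]:
            continue
        stage = 1  # index of the next stage we are waiting for
        for m in seq[i + 1:]:
            if m.ts - start.ts > window_s:
                break
            if m.action == CHAIN[stage]:
                stage += 1
                if stage == len(CHAIN):
                    return True
    return False


def find_chains(moves, window_s=600.0):
    """Return identities whose moves complete the full chain inside the window."""
    by_identity = {}
    for m in sorted(moves, key=lambda m: m.ts):
        by_identity.setdefault(m.identity, []).append(m)
    return [who for who, seq in by_identity.items() if _has_chain(seq, window_s)]
```

Each move here would pass a per-event check. Only the ordered sequence within a short window is flagged, which is the distinction the tier is drawing.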
3. Contain the host
Once an anomaly holds up, the system acts:
- Case: Write the finding as an investigation-ready narrative, not just a raw alert
- Quarantine: Isolate the affected host at the network level, with the isolation applied from the control plane
- Alert: Surface the event for SOC review and route it into existing playbooks
- Audit report: Generate compliance-ready documentation of the event
The shape of the response matters. Hilt isolates the host at the network level, acting from the control plane. It never sits inline and never blocks, drops, or alters traffic. DLP and UEBA tools hand the analyst a queue and wait, so containment takes hours to days. The Sophos Active Adversary Report clocks exfiltration finishing within 3 days of the first compromise. Most alert-based systems are still triaging when the data is already out.
Where It Sits Next to DLP, DDR, and Insider Risk
These categories overlap on the surface and solve different problems underneath:
| Category | Primary Focus | Detection Method | When It Acts | Blind Spots |
|---|---|---|---|---|
| Data Exfiltration Prevention | Govern data movement at runtime | Behavioral, watched at runtime | At runtime, as the move happens | Requires a collector in the environment |
| DLP (Data Loss Prevention) | Enforce content policies on known channels | Content inspection + rules | Before the move (policy-based) | Novel paths, encrypted data, valid-permission abuse |
| DDR (Data Detection & Response) | Detect and respond to data threats | Data lineage + flow tracking | During or after, semi-automated | Limited to tracked data flows |
| Insider Risk / UEBA | Detect malicious or negligent insiders | User behavioral analytics | After the fact (alert-based) | Endpoint-focused, limited cross-domain visibility |
| DSPM (Data Security Posture) | Discover and classify sensitive data | Scanning + classification | N/A (posture, not detection) | No runtime detection or response |
Read down the stack and the division of labor is clean. DSPM tells you where sensitive data lives. DLP enforces policy on known channels. Insider risk tools flag suspicious users. Runtime data movement governance catches the move itself as it forms, across channels the others do not cover, and resolves it to the identity and the job behind it. See the full feature comparison for the detailed breakdown.
How Data Actually Leaves
The method does not matter much to a kernel collector, but it helps to know the routes a buyer is defending. They sort into three lanes.
Off the endpoint
- USB and removable media: Copying files to external drives. Declining in frequency but still common in air-gapped environments.
- Email and messaging: Attaching files or pasting data into personal email (Gmail, Outlook), Slack, or Microsoft Teams. Even end-to-end encrypted platforms have security gaps that expose data before encryption occurs.
- Cloud sync: Uploading to personal Dropbox, Google Drive, OneDrive, or iCloud accounts.
- Shadow AI: Pasting proprietary code, customer data, or strategy documents into ChatGPT, Claude, Gemini, or Copilot. IBM's 2025 research put shadow AI breaches at $4.63 million on average, about $670,000 above a standard breach.
Out of the cloud
- Cross-region data transfer: Moving data from compliant to non-compliant storage regions.
- Service account abuse: Exploiting overly broad permissions to access and copy datasets outside normal scope.
- Container escape: Breaking out of containerized workloads to access host-level data.
- Pipeline manipulation: Modifying ETL jobs to copy data to unauthorized destinations.
Across the wire
- DNS tunneling: Encoding data in DNS queries to bypass network controls.
- Encrypted channels: Using TLS/SSL to obscure data transfers to attacker-controlled endpoints. Vendor-controlled encryption key management adds risk, as demonstrated by Microsoft's BitLocker key handover to authorities.
- Protocol abuse: Exfiltrating data through non-standard ports or protocols.
- Steganography: Hiding data within image files, audio, or video.
A kernel collector sees every one of these because it watches below the application layer. DNS tunneling, steganography, a non-standard port: the trick used to hide the bytes does not change the fact that they passed through the operating system. The movement is visible, and it resolves to the identity and the job behind the move.
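To make one of these routes concrete, here is a minimal sketch of a common heuristic for spotting DNS tunneling: encoded payloads tend to produce long, high-entropy leftmost labels. This is a generic technique, not a description of how Hilt detects it, and the function names and both thresholds (`min_len`, `min_entropy`) are illustrative assumptions that would need tuning against real traffic.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Bits per character of the string's empirical character distribution."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_like_tunnel(qname: str, min_len: int = 30, min_entropy: float = 3.5) -> bool:
    """Flag a query whose leftmost label is both long and high-entropy.

    Thresholds are assumptions for illustration only.
    """
    label = qname.rstrip(".").split(".")[0].lower()
    return len(label) >= min_len and shannon_entropy(label) >= min_entropy
```

A heuristic like this is noisy on its own, since CDN and telemetry hostnames can also look random. It is more useful as one signal among several than as a verdict.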
What to Demand from a Solution
Vantage
The architectural question that decides everything else: where does it watch from, the kernel or user-space?
A kernel vantage sees data movement at the operating-system layer, before application-level obfuscation and before user-space tools get a chance to intercept it. A user-space agent sees only what applications expose through APIs. Cyberhaven, DTEX Systems, Varonis, and Nightfall AI all watch in user-space. Hilt watches at the kernel across cloud, endpoint, and network with one collector, then resolves each move to the identity and the job behind it.
Cross-domain visibility
Exfiltration rarely stays in one domain. A typical chain runs cloud (read the sensitive data) to endpoint (stage it locally) to network (upload it out). Watch one domain and you see one frame of a three-frame story.
| Solution | Domains Covered |
|---|---|
| Hilt | Cloud + Endpoint + Network |
| Cyberhaven | Endpoint + SaaS |
| DTEX | Endpoint |
| Varonis | File + Cloud + SaaS |
| Nightfall AI | SaaS + Email + AI tools |
| CrowdStrike Falcon | Endpoint |
Performance impact
In a latency-sensitive shop, a monitor that taxes the hot path is a monitor that gets ripped out. The right design watches off the path. It adds negligible overhead and never sits between the data and its destination, so there is nothing to slow down and nothing to take down.
Hilt's collector footprint:
| Metric | Value |
|---|---|
| CPU overhead | ~0.1% of one core |
| Memory footprint | 4-8 MB RSS |
| Placement | Off the path, never inline |
| Privacy default | Metadata only (content-aware inspection available) |
Deployment speed
Time to first event varies by an order of magnitude:
| Solution | Time to First Event | Changes Required |
|---|---|---|
| Hilt | Minutes | One collector, no code changes |
| Cyberhaven | Days | Browser extension + agent |
| DTEX | Weeks | Agent deployment |
| Varonis | Weeks | Integration configuration |
Walk the Timeline
Take the researcher from the top of this page. Here is every move, in order, with the verdict on each:
| Time | User | Action | Status |
|---|---|---|---|
| 09:14 | researcher@corp | Read /datasets/model-weights/v3 | Normal |
| 09:31 | researcher@corp | Write /notebooks/experiment-log.ipynb | Normal |
| 14:22 | researcher@corp | Read /configs/hyperparams.yaml | Normal |
| 02:34 | researcher@corp | Read /datasets/model-weights/v3 | Anomaly: off-hours, off baseline |
| 02:35 | researcher@corp | Bulk download to personal Google Drive | Anomaly: volume far above this account's norm |
| 02:35 | researcher@corp | Egress to drive.google.com (personal) | Quarantine: host isolated at the network, case written |
Read any row on its own and it passes. Read them together and they spell exfiltration: off-hours access to sensitive data, then bulk movement to personal storage. The system resolves the chain to the identity and the job behind it, writes the case, and isolates the host at the network level from the control plane.
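The off-hours verdicts in the table can be reproduced with a deliberately small sketch of an hour-of-day baseline. It assumes the account's history is just the three daytime rows above, where a real baseline would draw on weeks of activity. The function names and the `min_share` threshold are hypothetical, and volume scoring is left out because the table gives no byte counts.

```python
from collections import Counter


def build_hour_baseline(history):
    """history: iterable of 'HH:MM' strings for one account.
    Returns a dict mapping hour -> fraction of past activity in that hour."""
    hours = Counter(int(t.split(":")[0]) for t in history)
    total = sum(hours.values())
    return {h: n / total for h, n in hours.items()}


def is_off_baseline(time_str, baseline, min_share=0.02):
    """True when the event's hour holds less than min_share of past activity."""
    hour = int(time_str.split(":")[0])
    return baseline.get(hour, 0.0) < min_share


# Assumed history: the three daytime rows from the table.
history = ["09:14", "09:31", "14:22"]
baseline = build_hour_baseline(history)

for t in ["09:14", "14:22", "02:34", "02:35"]:
    print(t, "anomaly" if is_off_baseline(t, baseline) else "normal")
```

The permissions never enter the calculation. The 02:34 read is flagged only because this account has no history at that hour.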
How to Start
For a team putting this to work:
- Find the gaps first. Map which data movement paths your DLP, EDR, and CASB cover, and which they do not. The uncovered paths are where exfiltration runs.
- Point it at the crown of the estate. Turn on behavioral monitoring where the data is most sensitive: financial systems, IP repositories, customer databases.
- Let it learn before it acts. Give the system time to build normal for each user, account, and resource before automated response goes live. The signal sharpens and you stop second-guessing what it flags.
- Wire it into what you run. Runtime data movement governance adds to your DLP, SIEM (Splunk, Microsoft Sentinel), and EDR (CrowdStrike Falcon, SentinelOne) rather than replacing them. Feed findings into the SIEM and the response into your SOAR.
- Measure the right things. Track mean time to detection, false positive rate, and data-at-risk reduced. Alert volume is vanity. See our FAQ for common deployment questions.
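Two of the metrics named above can be computed from reviewed findings with a few lines. This is a sketch under assumptions: the `Finding` record and its fields are hypothetical, and "false positive rate" is taken in the sense SOC teams usually mean, the share of flagged findings that review rejected (strictly, the false discovery rate).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Finding:
    started_at: float    # when the anomalous movement began (epoch seconds)
    detected_at: float   # when the system flagged it
    true_positive: bool  # analyst verdict after review


def mean_time_to_detect(findings):
    """Mean seconds from start of movement to detection, confirmed findings only.
    Returns None when there are no confirmed findings."""
    delays = [f.detected_at - f.started_at for f in findings if f.true_positive]
    return mean(delays) if delays else None


def false_positive_share(findings):
    """Share of flagged findings that review rejected. None if nothing was flagged."""
    if not findings:
        return None
    rejected = sum(1 for f in findings if not f.true_positive)
    return rejected / len(findings)
```

Tracking both together matters, because a system can drive detection time down simply by flagging more, and the rejected share is what exposes that trade.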
Book a 30-minute technical call to watch runtime data movement governance run in your own cloud. One collector at the kernel, first findings in minutes.
FAQ
What is data exfiltration prevention? Data exfiltration prevention is the practice of governing data movement at runtime so anomalous transfers are caught while the data moves. It watches movement as it happens and resolves each move to the identity and job behind it, which is how it catches transfers where permissions were valid but the pattern across moves was abnormal.
How is data exfiltration prevention different from DLP? DLP enforces content-based policies on known channels (email, USB, cloud storage). Runtime data movement governance watches movement as it happens and catches anomalous patterns across any channel, including novel paths, encrypted transfers, and valid-permission abuse that DLP cannot see, because it reads the pattern across moves rather than the content of any single one.
Why does where you watch from matter for data exfiltration prevention? Where you watch determines what you can see. A kernel vantage observes data movement at the operating-system layer, before application-level obfuscation and regardless of which application initiated the move. Hilt watches at the kernel with a single lightweight collector that reads metadata by default, stays off the path, and runs single-tenant in your own cloud, using roughly 0.1% of one core and 4-8 MB of resident memory (RSS). Content-aware inspection is available when you want it, but the default means you do not have to read your data to see it move.
Can data exfiltration prevention detect insider threats? Yes. Behavioral baselines learn normal patterns for each user and flag deviations. This catches malicious insiders who hold valid permissions but exhibit abnormal behavior, such as reading sensitive data outside working hours, moving unusual volumes, or transferring data to personal storage.
How long does it take to deploy a data exfiltration prevention solution? Deployment time varies by solution. Hilt deploys with a single collector and no code changes, surfacing first findings in minutes. User-space solutions like Cyberhaven, DTEX, and Varonis typically require days to weeks for agent deployment, integration configuration, and baseline calibration.