2026-04-10

Deepfake Fraud in Video Calls: How to Stop It

In January 2024, a finance worker at a multinational company was tricked into transferring roughly $25 million after joining a video call with what appeared to be the company's CFO and several colleagues. Every other participant on the call was a deepfake. The attacker used real-time face-swapping technology to impersonate multiple executives simultaneously. This was not a movie plot - it happened, and it is happening with increasing frequency.

The Billion-Dollar Problem: Deepfake-Driven CEO Fraud

Deepfake-enabled fraud losses exceeded $1.1 billion in the United States alone in 2025, and the figure is still climbing. CEO fraud, a form of business email compromise (BEC), has been among the costliest categories of cybercrime for over a decade. Deepfakes have supercharged it by adding video and audio impersonation to what was previously an email-only attack.

The economics of deepfake fraud are terrifying for defenders:

How Real-Time Video Deepfakes Actually Work

Modern deepfake video calls use a pipeline of technologies running in real time:

  1. Face capture - The attacker's webcam captures their face in real time
  2. Face swapping - A neural network maps the attacker's facial expressions and movements onto the target's face model with under 50 ms of latency
  3. Voice cloning - A text-to-speech model generates the target's voice from text input, or a voice conversion model transforms the attacker's speech into the target's voice in real time
  4. Virtual camera injection - The deepfake video output is routed through a virtual camera driver that video conferencing software (Zoom, Teams, Meet) treats as a real webcam
  5. Background synthesis - The background is replaced with a plausible setting (the target's office, a conference room) using standard virtual background technology

The entire pipeline runs on a consumer gaming GPU. The attacker needs only a few photos or a short video of the target to build the face model, and a few seconds of audio to clone the voice.

Why Existing Video Conferencing Security Fails

Current video conferencing platforms have no built-in defense against deepfakes.

Real-Time Liveness and Human Verification During Calls

The solution is not to detect deepfakes (a losing arms race) but to verify that a real human is physically present at the other end of the call. This requires hardware-based liveness detection that cannot be spoofed by virtual camera injection.

Deploying POY Verify for High-Stakes Video Authentication

POY Verify can be integrated into video call workflows to provide real-time human verification:

  1. Pre-call verification - Before a high-stakes call begins, each participant completes a 30-second POY verification on their device. Using on-device Secure Enclave processing, this confirms that a real human is physically present
  2. Trust score display - Each verified participant's trust score is visible to other participants, providing a real-time confidence signal
  3. Periodic re-verification - For extended high-risk sessions, the system can require periodic re-verification check-ins to ensure the same human remains present throughout the call
  4. Tamper-evident logging - Every verification event is cryptographically logged, creating an audit trail that proves each participant was a verified human at each checkpoint
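
To make the tamper-evident logging step concrete, here is a minimal sketch of one common way to build such a log: a hash chain, in which every entry commits to the hash of the entry before it. This is an illustration only, not POY Verify's actual implementation or API. The names `VerificationLog`, `append`, `verify`, and the event strings are hypothetical, chosen for this example.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field
from typing import List, Optional

GENESIS_HASH = "0" * 64  # stand-in "previous hash" for the very first entry


def _entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


@dataclass
class VerificationLog:
    """Hypothetical append-only log; each entry commits to the one before it."""

    entries: List[dict] = field(default_factory=list)

    def append(self, participant_id: str, event: str,
               timestamp: Optional[float] = None) -> dict:
        """Record one verification event and chain it to the previous entry."""
        payload = {
            "participant_id": participant_id,
            "event": event,
            "timestamp": time.time() if timestamp is None else timestamp,
        }
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        entry = {
            "payload": payload,
            "prev_hash": prev_hash,
            "hash": _entry_hash(prev_hash, payload),
        }
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != _entry_hash(prev_hash, entry["payload"]):
                return False
            prev_hash = entry["hash"]
        return True


if __name__ == "__main__":
    log = VerificationLog()
    log.append("cfo", "pre_call_verified")
    log.append("cfo", "reverified")
    print("chain intact:", log.verify())
    log.entries[0]["payload"]["event"] = "edited after the fact"
    print("chain intact after edit:", log.verify())
```

A hash chain on its own only makes edits detectable to someone who holds a trusted copy of the latest hash. A production system would presumably also sign entries and anchor the chain head somewhere the logging party cannot rewrite; both are left out of this sketch.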

This approach does not try to detect whether the video feed is a deepfake. Instead, it establishes an independent, hardware-backed proof channel that confirms a real human is present - regardless of what appears on the video feed. The deepfake can show whatever it wants; the liveness verification proves that a real person is actually there.
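
One way to picture the independent channel is as a gate on the action itself: a high-stakes approval goes ahead only if every required participant has a recent verification on record, and nothing from the video feed enters the decision. The sketch below assumes a simple record type and a 10-minute freshness window. `Verification`, `stale_participants`, `may_approve`, and the window length are all hypothetical choices for illustration, not part of any POY Verify interface.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Verification:
    """Hypothetical record of one successful liveness check."""

    participant_id: str
    verified_at: float  # Unix seconds when the check succeeded


def stale_participants(required: Iterable[str],
                       verifications: Iterable[Verification],
                       now: float,
                       max_age_seconds: float = 600.0) -> List[str]:
    """Return sorted IDs of required participants with no fresh verification."""
    # Keep only the most recent verification per participant.
    latest: Dict[str, float] = {}
    for v in verifications:
        if v.verified_at > latest.get(v.participant_id, float("-inf")):
            latest[v.participant_id] = v.verified_at

    stale = []
    for pid in required:
        age = now - latest.get(pid, float("-inf"))
        # Missing, expired, or future-dated verifications all count as stale.
        if age > max_age_seconds or age < 0:
            stale.append(pid)
    return sorted(stale)


def may_approve(required: Iterable[str],
                verifications: Iterable[Verification],
                now: float,
                max_age_seconds: float = 600.0) -> bool:
    """Allow the action only when every required participant is fresh."""
    return not stale_participants(required, verifications, now, max_age_seconds)


if __name__ == "__main__":
    records = [
        Verification("cfo", 1000.0),
        Verification("controller", 950.0),
    ]
    print(may_approve(["cfo", "controller"], records, now=1100.0))
    print(stale_participants(["cfo", "controller", "ceo"], records, now=1100.0))
```

The gate fails closed: a participant with no record, an expired record, or a timestamp in the future blocks the approval, which matches the idea that the video feed alone should never be enough to authorize a transfer.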

For organizations handling high-value decisions via video (M&A discussions, board meetings, financial approvals, classified briefings), this independent verification channel is becoming essential. The cost of a single successful deepfake CEO fraud attack ($2.4M average) dwarfs the cost of implementing verification for the calls that matter most.

Prove You Are Real

POY Verify is the privacy-first human verification layer for the internet. No data collected. No identity required.
