When Looking Real Isn’t Enough: Why We Need to Talk About the Uncanny Valley
By Michelle Collins, Chief Revenue Officer, CodeBaby
I’ve been thinking a lot about NVIDIA’s Audio2Face lately. If you haven’t seen it in action, it’s genuinely impressive – the technology can take audio input and generate real-time facial animations that are remarkably lifelike. As a partner of NVIDIA, we’ve had a front-row seat to what this kind of innovation makes possible for conversational AI.
But here’s what keeps me up at night: just because we can make avatars look and move more realistically doesn’t mean we should always push for maximum realism. Because there’s this weird psychological phenomenon that happens when digital humans get too close to real without quite getting there. And ignoring it isn’t just a UX problem, it’s an ethical issue.
That Creeping Feeling of “Something’s Off”
Back in 1970, a roboticist named Masahiro Mori identified what he called the “uncanny valley”: that unsettling feeling you get when something looks almost human but not quite. We’ve all felt it. That moment when you’re watching a CGI character and your brain just… rejects it somehow. Even if you can’t put your finger on exactly what’s wrong.
The research on this is actually fascinating. fMRI studies suggest that when people see almost-human faces, the parts of their brain associated with error detection light up like crazy. Your brain is basically screaming “wait, something’s not right here” even when you can’t consciously identify the problem.
And here’s the kicker: this response doesn’t just make people uncomfortable. It actively undermines trust. Which is pretty much the opposite of what we’re trying to achieve with conversational AI.
The Problem with Perfect
The thing is, we’re not just building digital humans for entertainment anymore. These avatars are showing up in healthcare settings, educational environments, financial services – places where trust isn’t just nice to have, it’s absolutely critical.
When someone is anxious about a medical procedure or confused about their insurance options, the last thing they need is an avatar that triggers their brain’s “something’s wrong here” alarm. It doesn’t matter how technically impressive the animation is if it’s making people less comfortable engaging with it.
This is where I think a lot of the industry is getting it wrong. We’re treating realism as the goal when it should be a means to an end. The actual goal? Creating interactions that feel natural, trustworthy, and helpful.
What Audio2Face Gets Right (And What We Need to Add)
Don’t get me wrong, I’m genuinely excited about what NVIDIA is building. Audio2Face is an extraordinary tool, and the fact that they’ve open-sourced the technology stack is accelerating innovation across the board.
But technology is just the engine. We still need to figure out where we’re driving.
At CodeBaby, we’re looking at Audio2Face as an opportunity to pair cutting-edge animation with genuinely human-centric design. That means making some intentional choices:
Strategic stylization over photorealism: Our avatars don’t need to look like they could walk off the screen and into your living room. In fact, a bit of stylization can help sidestep the uncanny valley entirely while still conveying emotion clearly and effectively.
Crystal-clear transparency: Users should always know they’re talking to a digital agent. We’re not trying to trick anyone into thinking our avatars are human. That’s not just ethically questionable, it also sets interactions up to fail when the technology inevitably hits its limits.
Context matters: An avatar helping someone navigate insurance options needs different design considerations than one leading a training simulation. We design for the emotional and cognitive context of each specific use case.
The Ethics Piece We Can’t Ignore
Look, I know “ethics in AI” can sound like corporate buzzword bingo. But when you’re building technology that can evoke genuine emotional responses from people, you have a responsibility to think carefully about how it’s designed and deployed.
Realistic avatars are powerful engagement tools. That’s exactly why they can be dangerous if misused. The same features that make them effective at building connection can be weaponized for manipulation if there aren’t intentional guardrails.
We’ve built our approach around three principles that guide everything we do:
Transparency first: No surprises, no deception. Users know what they’re interacting with.
Psychological safety: We design avatars that foster comfort and clarity, not unease or confusion.
Empathy with purpose: Technology should amplify human connection, not exploit or manipulate it.
What Responsible Realism Actually Looks Like
The future of conversational AI isn’t about creating digital humans that are indistinguishable from real ones. It’s about creating digital interfaces that are trustworthy, respectful, and genuinely helpful in ways that align with human values and needs.
NVIDIA’s Audio2Face gives us incredible new capabilities. But capability without thoughtful design is just tech for tech’s sake. And in conversational AI – where we’re literally creating the interface between humans and increasingly powerful systems – that’s not good enough.
The uncanny valley is there for a reason. It’s our brain’s way of saying “hey, pay attention to this.” And as an industry, I think we need to listen to that signal instead of trying to engineer our way past it without considering the implications.
Because at the end of the day, the measure of successful conversational AI isn’t how real it looks. It’s whether people feel comfortable using it, trust the information it provides, and come away from the interaction feeling helped rather than unsettled.
That’s the line we’re trying to walk at CodeBaby. And honestly, I think it’s the only responsible path forward.