Wreck a Nice Beach 
2025, Voice User Interface, Unity, Video, Interview with Crazy Minnow Studio, 6:05

We read mouths. We interpret them. We design them. We have programmed them to behave in specific ways. Wreck a Nice Beach is a voice user interface that treats the mouth as a visual language, using lip-synchronization software to approximate speech as shape. The interface employs audio amplitude and phoneme detection to generate visemes, the visual counterparts of phonemes. By intentionally misaligning sound and shape, the work produces new facial expressions, meanings, and modes of communication, exposing the gap between what is said and what is shown.
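As a rough illustration of that pipeline, the sketch below drives viseme blendshapes in Unity from audio amplitude and a phoneme label. Every name in it is hypothetical (it is not this work's code and not SALSA's API), and the phoneme-classification step is stubbed out; it shows only the general shape of such a system.

using System.Collections.Generic;
using UnityEngine;

// Hypothetical sketch: amplitude- and phoneme-driven viseme animation.
public class VisemeDriver : MonoBehaviour
{
    public SkinnedMeshRenderer face;   // mesh with one blendshape per viseme
    public AudioSource voice;          // the speaking audio

    // Phoneme label -> blendshape index on the face mesh (assumed ordering).
    readonly Dictionary<string, int> visemeFor = new Dictionary<string, int>
    {
        { "m", 0 },   // closed, flat mouth
        { "o", 1 },   // rounded, jaw-expanding shape
        { "e", 2 },   // wide, smile-like spread
    };

    readonly float[] samples = new float[256];

    void Update()
    {
        // Amplitude: RMS of the current audio window scales mouth openness.
        voice.GetOutputData(samples, 0);
        float sum = 0f;
        foreach (float s in samples) sum += s * s;
        float amplitude = Mathf.Sqrt(sum / samples.Length);

        // A real system classifies the current phoneme from the audio;
        // a fixed label stands in here to show the mapping step.
        string phoneme = "m";

        if (visemeFor.TryGetValue(phoneme, out int shape))
            face.SetBlendShapeWeight(shape, Mathf.Clamp01(amplitude * 20f) * 100f);
    }
}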

Ventriloquy, making one's voice appear to come from somewhere else, is a useful description of what all lip-sync software does. The video is built from an interview with Crazy Minnow Studio, creators of SALSA Lip-Sync, an animation tool used to puppeteer character mouths in video games and 3D simulations. The tool's own creators are ventriloquized by their software, their voices translated into computational gestures that both mimic and distort human expression. Phonemes are deliberately reassigned to mismatched visemes: an 'm,' which should produce a closed, flat mouth, is mapped to an 'o,' triggering a jaw-expanding shape instead. Familiar words produce unfamiliar faces. The mouth becomes legible as a designed object, shaped by assumptions about how speech should look.
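A minimal sketch of that reassignment, assuming a simple enum of mouth shapes; the names and tables are illustrative, not SALSA's actual data model:

using System.Collections.Generic;

// Illustrative mouth shapes; real viseme sets are larger.
public enum Viseme { ClosedFlat, RoundedOpen, WideSmile }

public static class VisemeMaps
{
    // Conventional mapping: each phoneme gets its matching mouth shape.
    public static readonly Dictionary<char, Viseme> Expected = new Dictionary<char, Viseme>
    {
        { 'm', Viseme.ClosedFlat },
        { 'o', Viseme.RoundedOpen },
        { 'e', Viseme.WideSmile },
    };

    // Deliberately misaligned: 'm' now triggers the jaw-expanding 'o' shape,
    // so familiar words produce unfamiliar faces.
    public static readonly Dictionary<char, Viseme> Misaligned = new Dictionary<char, Viseme>
    {
        { 'm', Viseme.RoundedOpen },
        { 'o', Viseme.ClosedFlat },
        { 'e', Viseme.WideSmile },
    };
}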

The title, Wreck a Nice Beach, comes from a well-known failure in early speech recognition. Systems that parsed phonemes consistently heard "recognize speech" as "wreck a nice beach." That mishearing captures something essential about how machines process voice: through approximation, pattern matching, and inference, not fidelity. The lip-sync system in this work operates through the same logic, translating sound into shape through computational models of expression. From the vocoder's error to the viseme's approximation of an 'e'-like smile, machine-mediated speech has always been a site of translation, not transmission. As AI-driven systems increasingly speak on our behalf in games, virtual production, and social media, the cultural and aesthetic assumptions embedded in those translations become urgent questions. Communication technologies, particularly AI-driven interfaces, function as interpretive systems, not transparent channels. When software speaks for us, it doesn't just represent us. It redefines what expression, agency, and presence mean in digital environments.
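The mishearing itself can be read off the phone sequences. The ARPAbet transcriptions below are simplified and approximate; in casual speech the G reduces, the Z and S merge, and the unaspirated P after S sounds like B, leaving one acoustic stream that supports two segmentations:

using System;

class Mishearing
{
    static void Main()
    {
        // "recognize speech": R EH K AH G N AY Z | S P IY CH
        string[] recognizeSpeech = { "R", "EH", "K", "AH", "G", "N", "AY", "Z", "S", "P", "IY", "CH" };

        // "wreck a nice beach": R EH K | AH | N AY S | B IY CH
        string[] wreckANiceBeach = { "R", "EH", "K", "AH", "N", "AY", "S", "B", "IY", "CH" };

        Console.WriteLine(string.Join(" ", recognizeSpeech));
        Console.WriteLine(string.Join(" ", wreckANiceBeach));
    }
}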

Thank you to Crazy Minnow Studio  


FINAL VIDEO
Video excerpts
Unity prototype
Unity simulation: waiting mode when no audio is present
Still from Unity simulation
