Fixing a key audio issue
How I designed an AI audio feature to significantly reduce frustration and boost learning.
Imagine you’re trying to learn a language and notice a word in the app that you know isn’t right. You look it up online and verify that you aren’t going crazy! You go through the work of changing it, only to start an activity and hear the old, outdated, incorrect audio play.
Frustrating, right? This was a common bug reported by users of Embark, a language-learning app for Mormon missionaries used at the Missionary Training Center (MTC).
As the lead UX designer on Embark, I designed a new audio generation feature to solve this issue, significantly reducing user complaints and ensuring a smoother learning experience.
Product
Embark App
Team
1 Project manager
1 developer
Role
Design lead
Background
Embark supports over 60 languages, and the team is always working to improve content accuracy. However, regional variations can lead to discrepancies—like a word being different in Spain versus Mexico. To address this, we allow users to edit words.
Each word also has audio attached, which plays during learning activities. The challenge was that edited words retained the original audio, causing mismatches and user frustration. We needed a solution to keep the audio consistent with edited text.
Project goal
Fix the audio issue for edited words
Ensure that edited words in Embark always have accurate and consistent audio.
Research
Finding a related issue
While reviewing Jira tickets about the audio mismatch issue, I found another common problem: many users complained about poor audio recordings for certain words. The content team was working to re-record these, but that takes time because Embark supports so many languages. I realized we could address this with the same design by letting users generate new audio even for words they hadn’t edited, just to replace the low-quality recordings.
Brainstorming
Our team brainstormed potential solutions to the mismatched audio problem. Working closely with my project manager and a developer, we came up with two main ideas:
Idea 1
Let users record their own audio
If users recorded their own audio, it could introduce pronunciation errors. Many of them were just learning the language and might mispronounce words, and hearing incorrect pronunciations during activities could reinforce bad habits.
Idea 2
Generate audio using AI
Using AI to generate audio automatically seemed faster and easier. However, we were concerned about the quality of the AI-generated audio.
To decide on the best approach, I worked with a developer to test the quality of AI-generated audio. We tried several words across several languages and were impressed with both the speed and the quality of the results. Based on these tests, the team decided to move forward with AI-generated audio.
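For context, the spot check itself was simple: generate audio for a handful of sample words in each language and listen back as a team. Below is a minimal sketch of that kind of script, assuming a generic text-to-speech HTTP endpoint; the endpoint URL, request shape, and word lists are placeholders for illustration, not Embark’s actual service.

```ts
// Sketch of a TTS quality spot check: generate audio for a few sample words
// per language and save the clips so the team can listen to them.
// The endpoint, request shape, and word lists below are hypothetical.
import { writeFile } from "node:fs/promises";

const TTS_ENDPOINT = "https://example.com/api/tts"; // placeholder endpoint

const samples: Record<string, string[]> = {
  es: ["biblioteca", "ordenador", "computadora"],
  pt: ["obrigado", "saudade"],
  tl: ["salamat", "kumusta"],
};

async function generateClip(language: string, word: string): Promise<void> {
  const response = await fetch(TTS_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ language, text: word }),
  });
  if (!response.ok) throw new Error(`TTS failed for "${word}" (${language})`);
  const audio = Buffer.from(await response.arrayBuffer());
  await writeFile(`${language}-${word}.mp3`, audio);
}

async function main() {
  for (const [language, words] of Object.entries(samples)) {
    for (const word of words) {
      await generateClip(language, word);
      console.log(`Saved ${language}-${word}.mp3`);
    }
  }
}

main();
```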
Design
Once we finalized our approach, I got to work on the design. I added an audio card to the word-edit page. This card allows users to listen to the current audio and generate new audio if they don’t like what they hear.
Audio card
Editing a word
If a user edits a word, new audio is generated automatically, but they can always switch back to the original Embark audio if they prefer.
Generating new audio
In the new design, users can also generate new audio from this card without editing the word, in case they don’t like the default recording. A loading state shows visually that the audio is being generated, and a toast confirms when it’s done.
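As a rough sketch of how that interaction could be wired up, assuming a React-style front end: the button enters a loading state while the request is in flight, then a toast confirms completion. The component, endpoint, and helper names here are illustrative, not Embark’s actual code.

```tsx
// Illustrative sketch of the "generate new audio" flow on the audio card.
// requestNewAudio and showToast stand in for whatever API client and toast
// utility the app actually uses; they are hypothetical names.
import { useState } from "react";

async function requestNewAudio(wordId: string): Promise<string> {
  // Placeholder: ask the backend to generate audio, return the new audio URL.
  const res = await fetch(`/api/words/${wordId}/audio`, { method: "POST" });
  if (!res.ok) throw new Error("Audio generation failed");
  return (await res.json()).audioUrl;
}

function showToast(message: string) {
  // Placeholder for the app's toast component.
  console.log(message);
}

export function GenerateAudioButton({ wordId }: { wordId: string }) {
  const [isGenerating, setIsGenerating] = useState(false);

  async function handleGenerate() {
    setIsGenerating(true); // loading state: spinner, disabled button
    try {
      await requestNewAudio(wordId);
      showToast("New audio is ready");
    } catch {
      showToast("Couldn't generate audio. Please try again.");
    } finally {
      setIsGenerating(false);
    }
  }

  return (
    <button onClick={handleGenerate} disabled={isGenerating}>
      {isGenerating ? "Generating…" : "Generate new audio"}
    </button>
  );
}
```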
Testing
To evaluate the updated design, I conducted a usability test with 5 missionaries at the MTC. Since the audio generation happens automatically, my main goals were to confirm that users understood that AI was generating new audio for any edits they made and that the system was replacing outdated audio with an accurate version.
Results
Understanding of new design
100% of participants quickly understood the feature and found it easy to use. They unanimously agreed that the automatic audio updates would be a significant improvement to their learning experience.
Conclusion
After we launched the update, user feedback showed that the automatic audio generation was working well. Complaints about mismatched or low-quality audio stopped coming in. It’s clear that we’ve improved the overall user experience.