Youdao Voice Translation Guide

Real-time speech recognition and translation in Conversation and Simultaneous Interpretation modes

Whether you're traveling abroad and need to talk with locals, attending an international conference that requires real-time translation, or watching foreign-language videos and want instant subtitles, Youdao Translate's voice translation feature has you covered. It recognizes and translates speech in real time, so language differences don't get in the way of a conversation.

Youdao's voice translation combines Automatic Speech Recognition (ASR) with a neural machine translation engine to recognize your speech and translate it into the target language in real time. The feature offers two core modes: Conversation Mode for face-to-face bilingual communication, and Simultaneous Interpretation Mode for one-way listening translation. Supported languages include Chinese, English, Japanese, Korean, French, German, and Spanish, among others.

How to Use Voice Translation

Follow these steps to start using Youdao's voice translation feature:

Step 1: Open Youdao Translate

Launch the Youdao Translate mobile app or desktop software. Make sure your device is connected to the internet, because voice translation relies on a network connection. Haven't installed it yet? Visit our download page.

Step 2: Select Voice Translation Mode

Tap the "Voice" or microphone icon on the home screen to enter the voice translation interface. Select your source and target languages at the top. Choose "Conversation" mode for face-to-face dialogue, or "Simultaneous" mode for one-way interpretation.

Step 3: Start Speaking

In Conversation mode, hold the microphone button while you speak and release it to translate. In Simultaneous mode, tap the start button for continuous listening. You'll need to grant microphone permission on first use. Speak at a normal pace with clear pronunciation for best results.

Step 4: View Translation Results

The recognized text and its translation appear in real time. In Conversation mode, tap the play button to hear the translation read aloud. In Simultaneous mode, translations appear as live subtitles on screen, with bilingual display supported.

Conversation Mode vs. Simultaneous Mode

Youdao Translate provides two distinct voice translation modes designed for different scenarios. Here's a detailed comparison:

| Feature | Conversation Mode | Simultaneous Mode |
|---|---|---|
| Best For | Face-to-face bilingual communication: asking directions, shopping, ordering food | One-way listening: meetings, videos, lectures, speeches |
| Interaction | Hold-to-speak, release to translate; both parties take turns | Tap start for continuous listening and real-time translation |
| Direction | Bidirectional (auto-switch between languages) | Unidirectional (fixed source → target language) |
| Audio Output | Supports auto-play of translated speech | Text-based subtitle display |
| Platforms | Mobile App | Mobile App + Desktop |
| Recommended | Travel, business meetings | Online conferences, foreign videos, live streams |

Supported Languages

Youdao voice translation currently supports speech recognition and translation for these major languages:

Chinese (Mandarin), English, Japanese, Korean, French, German, Spanish, Portuguese, Italian, Russian, Thai, Vietnamese, Indonesian, Dutch, Polish, Turkish

The list of supported voice recognition and text translation languages can change with each update. Check the current app for the language pair you need.

Tips for Better Voice Translation

Voice Translation Best Practices

  • Moderate Speed: Speak at a natural conversational pace. Speaking too quickly can reduce speech recognition accuracy, especially in noisy environments.
  • Clear Pronunciation: Enunciate clearly and avoid mumbling. Standard Mandarin or standard English accents yield the best recognition accuracy.
  • Minimize Background Noise: Use voice translation in quiet environments whenever possible. In noisy settings, speak close to the microphone or use an earphone microphone to reduce interference.
  • Short Sentences: Long sentences may produce less accurate translations. Break complex statements into short segments of one or two sentences each.
  • Verify Language Settings: Confirm that the source and target languages are set correctly before you start. A wrong language selection can cause translation failures.
  • Use Playback: In Conversation mode, use the play button to read translations aloud for the other person. This is often more effective than showing text on screen.
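
The short-sentence tip above can be sketched in code, for example when preparing a longer script to read aloud one segment at a time. This is a minimal illustration in Python and is not part of Youdao Translate. The function name `chunk_sentences` and the punctuation-based splitting rule are assumptions made for this example, and the rule will mis-split abbreviations and decimals such as "3.5".

```python
import re


def chunk_sentences(text: str, max_sentences: int = 2) -> list[str]:
    """Group text into segments of at most `max_sentences` sentences.

    Hypothetical helper for illustration only; it is not a Youdao API.
    """
    if max_sentences < 1:
        raise ValueError("max_sentences must be at least 1")
    # Split after Latin end punctuation followed by whitespace, or directly
    # after CJK end punctuation (which is usually not followed by a space).
    parts = re.split(r"(?<=[.!?])\s+|(?<=[。！？])", text.strip())
    sentences = [p.strip() for p in parts if p and p.strip()]
    # Join consecutive sentences into groups of `max_sentences`.
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


if __name__ == "__main__":
    script = (
        "Excuse me. Where is the train station? "
        "I need to catch the 9 am train. Thank you!"
    )
    for segment in chunk_sentences(script):
        print(segment)
```

Each printed segment is short enough to speak in a single hold-to-speak turn in Conversation mode.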

Frequently Asked Questions

Can voice translation work offline?
Voice translation requires an internet connection, because both real-time speech recognition and translation run on Youdao's cloud-based AI engine. If you have no internet access, type your text manually and use the offline translation feature instead. Future versions may include offline speech recognition support.

Does voice translation support dialects and accents?
Currently, Youdao's speech recognition is optimized for standard Mandarin Chinese and standard English, along with the other supported languages. Recognition accuracy may drop with regional dialects (such as Cantonese or Sichuanese) or heavy accents. For best results, use standard pronunciation. The Youdao team is continuously improving support for accents and dialects.

Can simultaneous interpretation translate videos and meetings?
Yes. On the desktop version, Simultaneous Interpretation mode can capture system audio and generate bilingual subtitles automatically, which makes it well suited to online meetings (Zoom, Teams), video websites, and online courses. On mobile, it listens through the device's microphone. After a meeting ends, you can export the translation log as meeting minutes with one click.
