Text to Speech

Convert text to speech free in your browser. Pick from dozens of voices, adjust rate, pitch, volume. 100% private, no upload. Works in any modern browser.

Tip: commas = short pause, periods = full stop, semicolons = mid pause, question marks add rising intonation.
Voice & Playback Settings
Available voices come from your operating system and browser. Different devices ship with different voices.
Rate: 0.5x to 2.0x
Pitch: 0 to 2.0
Volume: 0% to 100%

About the Text to Speech Tool

This text-to-speech reader uses the Web Speech API built into modern browsers, so every word is spoken locally on your device. No text is uploaded, nothing is stored on a server, and the tool works offline once the page is loaded. Choose any voice your operating system provides, tune the speaking rate, pitch and volume, and watch the current word highlight while it speaks. It is ideal for proofreading written drafts, learning pronunciation in a foreign language, creating quick voice-overs, or making content more accessible to readers who struggle with long blocks of text.
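As a rough illustration of the word highlighting mentioned above, the sketch below maps the character offset reported by the Web Speech API's boundary event to a word range in the text. The helper name wordRangeAt and the highlight() placeholder in the comments are assumptions made for this example, not the tool's actual code, and not every voice fires boundary events.

```typescript
// Hypothetical helper: find the word that starts at `charIndex` in `text`.
// `charIndex` is the offset a SpeechSynthesisUtterance "boundary" event reports.
function wordRangeAt(
  text: string,
  charIndex: number
): { start: number; end: number } | null {
  if (charIndex < 0 || charIndex >= text.length) return null;
  const match = /^\S+/.exec(text.slice(charIndex));
  if (match === null) return null; // the offset points at whitespace
  return { start: charIndex, end: charIndex + match[0].length };
}

// Browser wiring, shown as comments because it needs a page and a user click:
//   const utterance = new SpeechSynthesisUtterance(text);
//   utterance.onboundary = (event) => {
//     const range = wordRangeAt(text, event.charIndex);
//     if (range) highlight(range.start, range.end); // highlight() is a placeholder
//   };
//   window.speechSynthesis.speak(utterance);
```

Keeping the offset-to-word lookup separate from the event handler makes it easy to check without a speech engine.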

How does this text-to-speech tool work?

The tool calls the browser's built-in window.speechSynthesis interface, part of the W3C Web Speech API. When you click Speak, the browser hands your text to a speech engine, usually the one built into your operating system: for example, the Windows speech platform (SAPI and OneCore voices) on Windows, AVSpeechSynthesizer on macOS and iOS, Google Text-to-Speech on Android and Chromebooks, or eSpeak NG on many Linux distributions. With a local voice, the engine generates the audio on your device and plays it through your speakers, so no text leaves your device and the tool keeps working without an internet connection once the page is loaded. Most voices you see come from the operating system, so the list changes depending on which device and OS you are using.
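The call sequence described above can be sketched in a few lines. This is a minimal sketch, not the tool's actual source: speakText, speakInBrowser and the two small interfaces are names invented for the example. The synthesizer is passed in as a parameter so the same logic can be exercised outside a browser.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface SpokenUtterance {
  text: string;
  lang: string;
}
interface SpeechQueue {
  speak(utterance: SpokenUtterance): void;
  cancel(): void;
}

// Hypothetical helper: cancel anything still queued, then speak `text`.
function speakText(
  synth: SpeechQueue,
  makeUtterance: (text: string) => SpokenUtterance,
  text: string,
  lang: string = "en-US"
): SpokenUtterance | null {
  const trimmed = text.trim();
  if (trimmed.length === 0) return null; // nothing to say
  synth.cancel(); // drop speech left over from a previous click
  const utterance = makeUtterance(trimmed);
  utterance.lang = lang;
  synth.speak(utterance);
  return utterance;
}

// Browser entry point: returns false when the Web Speech API is unavailable.
function speakInBrowser(text: string): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;
  const make = (t: string): SpokenUtterance => new g.SpeechSynthesisUtterance(t);
  return speakText(g.speechSynthesis, make, text) !== null;
}
```

In a page, speakInBrowser would typically be called from a click handler, since browsers may refuse to speak without a user gesture.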

Why do I see different voices on different devices?

Voices are not bundled with the website; they come from your operating system, your browser, and any extra language packs you install. A fresh Windows 11 machine typically ships with Microsoft David and Zira in English plus one default voice per installed display language. macOS includes Siri voices (which browsers may not expose to web pages) and dozens of system voices such as Samantha, Daniel and Karen. Android devices use the Google Text-to-Speech engine, which can download additional high-quality voices on demand. Chromebooks add Google natural voices over the network. To get more voices, open your OS settings, look for a Speech, Spoken Content or Language pack option, and install the languages or voice qualities you want. They will appear in this dropdown the next time you load the page; if they do not, restart the browser.
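A page can read the installed voices with speechSynthesis.getVoices(). Some browsers return an empty list until the voiceschanged event fires, so a voice dropdown usually waits for it. The sketch below shows one way to do that and to group voice names by language. The names onVoicesReady, groupByLanguage, VoiceInfo and VoiceSource are assumptions for illustration.

```typescript
interface VoiceInfo {
  name: string;
  lang: string;
}
interface VoiceSource {
  getVoices(): VoiceInfo[];
  addEventListener(type: string, listener: () => void): void;
}

// Calls `callback` as soon as a non-empty voice list is available.
// If the list is empty at first, it waits for "voiceschanged"; the callback
// may then run again whenever the browser reports a changed list.
function onVoicesReady(
  source: VoiceSource,
  callback: (voices: VoiceInfo[]) => void
): void {
  const initial = source.getVoices();
  if (initial.length > 0) {
    callback(initial);
    return;
  }
  source.addEventListener("voiceschanged", () => callback(source.getVoices()));
}

// Groups voice names under their language tag, e.g. "en-US".
function groupByLanguage(voices: VoiceInfo[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const voice of voices) {
    // Assumption: normalize a tag written with an underscore ("en_US").
    const key = voice.lang.replace("_", "-");
    const list = groups[key] || (groups[key] = []);
    list.push(voice.name);
  }
  return groups;
}

// In a browser: onVoicesReady(window.speechSynthesis, (v) => render(groupByLanguage(v)));
// render() is a placeholder for whatever fills the dropdown.
```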

What do the rate, pitch and volume sliders do?

Rate controls speaking speed, ranging from 0.5x (half speed) to 2.0x (double speed). A rate of 1.0 is the voice's natural cadence, around 150 to 180 words per minute for most English voices. Pitch shifts the fundamental frequency of the voice: 0 sounds very low and growly, 1.0 is the natural pitch, and 2.0 is a high cartoon-like tone. Volume scales playback from silence (0) to maximum (1.0); this is independent of your system volume, so set both for the final level. Try a few combinations to find a voice you can listen to comfortably for long periods — many listeners prefer 1.1x rate with a slightly lower pitch for sustained reading.
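The slider values map onto the rate, pitch and volume properties of a SpeechSynthesisUtterance. As a minimal sketch, assuming the ranges quoted above and a default of 1.0 for each setting, the helper below clamps raw slider input before it is applied. The function names and the fallback defaults are illustrative assumptions, not the tool's actual code.

```typescript
interface PlaybackSettings {
  rate: number;   // 0.5 to 2.0 in this tool
  pitch: number;  // 0 to 2.0
  volume: number; // 0 to 1.0
}

// Clamps `value` into [min, max]; missing or non-finite input gives `fallback`.
function clampOr(
  value: number | undefined,
  min: number,
  max: number,
  fallback: number
): number {
  if (value === undefined || !isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: turn partial slider input into safe settings.
function normalizeSettings(input: Partial<PlaybackSettings>): PlaybackSettings {
  return {
    rate: clampOr(input.rate, 0.5, 2.0, 1.0),
    pitch: clampOr(input.pitch, 0, 2.0, 1.0),
    volume: clampOr(input.volume, 0, 1.0, 1.0),
  };
}

// The volume slider is labelled in percent; the API expects 0 to 1.
function percentToVolume(percent: number): number {
  return clampOr(percent / 100, 0, 1.0, 1.0);
}

// In a browser:
//   const s = normalizeSettings({ rate: 1.1, pitch: 0.9 });
//   utterance.rate = s.rate; utterance.pitch = s.pitch; utterance.volume = s.volume;
```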

Can I save the spoken audio as an MP3 or WAV file?

Not directly. The Web Speech API exposes only playback; it does not return the raw waveform to JavaScript and offers no way to route synthesized speech into a recordable stream, so the page cannot encode the speech into an audio file. To capture audio, use your operating system's built-in screen recorder (Windows Game Bar, macOS QuickTime Player, Chromebook Screen Capture) or a virtual audio cable plus any free audio recorder while the tool is playing. Some recorders, QuickTime Player among them, record the microphone rather than system audio by default, so a virtual audio device may be needed. For an automated file export, you would need a cloud TTS service such as Amazon Polly, Google Cloud TTS, or Microsoft Azure Speech; these return MP3 or WAV, but they are billed by usage beyond any free tier and they process your text on their servers.

Why does speech cut off or stop unexpectedly in Chrome?

Chrome has a long-standing issue where speech can stop silently after roughly 15 seconds of a single utterance, reportedly most often with network-backed voices. The tool sends each Speak request as one utterance and issues a resume() nudge right after speak(), which keeps the engine awake on most recent Chrome versions. If you still hit truncation, split long passages into shorter paragraphs and click Speak again per paragraph, or switch to Microsoft Edge, whose online natural voices are higher quality and do not show this cutoff (note that those voices are synthesized in the cloud, not on your device). Firefox and Safari generally handle long utterances reliably. Pausing and resuming repeatedly can also cause Chrome to drop the queue; a single Stop followed by Speak is the safest recovery.
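Splitting a long passage by hand can also be automated. The sketch below is one possible approach, not what this tool does: it breaks text at sentence ends, packs sentences into chunks under a character budget, and queues one utterance per chunk (speak() appends to the browser's queue). The 200-character default and all function names are assumptions for illustration.

```typescript
// Splits at ".", "!" or "?" only when followed by whitespace or end of text,
// so decimals such as "3.14" stay intact.
function splitSentences(text: string): string[] {
  const sentences: string[] = [];
  let start = 0;
  for (let i = 0; i < text.length; i++) {
    const isTerminator = ".!?".indexOf(text.charAt(i)) !== -1;
    const atBoundary = i + 1 === text.length || /\s/.test(text.charAt(i + 1));
    if (isTerminator && atBoundary) {
      const sentence = text.slice(start, i + 1).trim();
      if (sentence.length > 0) sentences.push(sentence);
      start = i + 1;
    }
  }
  const tail = text.slice(start).trim();
  if (tail.length > 0) sentences.push(tail);
  return sentences;
}

// Packs sentences into chunks of at most `maxChars` characters.
// A single sentence longer than the budget is kept whole as its own chunk.
function splitIntoChunks(text: string, maxChars: number = 200): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const sentence of splitSentences(text)) {
    if (current.length > 0 && current.length + 1 + sentence.length > maxChars) {
      chunks.push(current);
      current = "";
    }
    current = current.length > 0 ? current + " " + sentence : sentence;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Queues one utterance per chunk and returns how many were queued.
function queueChunks(
  synth: { speak(utterance: unknown): void },
  makeUtterance: (text: string) => unknown,
  text: string,
  maxChars: number = 200
): number {
  const chunks = splitIntoChunks(text, maxChars);
  for (const chunk of chunks) synth.speak(makeUtterance(chunk));
  return chunks.length;
}
```

In a browser, synth would be window.speechSynthesis and makeUtterance would wrap new SpeechSynthesisUtterance(text). Shorter chunks keep each utterance well under the cutoff, at the cost of a slight gap between chunks on some engines.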

How can I control pronunciation and pauses?

The Web Speech API does not accept SSML markup in most browsers, so pacing has to be done through punctuation. Commas insert a short pause (roughly 150 ms, though the exact length depends on the voice), semicolons and dashes give a mid-length pause, and periods and question marks add a longer stop with intonation. To add a longer silence, try an ellipsis or a row of dots on its own line; engines treat this differently, so test it with your chosen voice. For pronunciation, you can phonetically respell tricky words, for example writing 'Vietnam' as 'vee-et-nam' or 'IPv6' as 'I P V six'. Acronyms in all caps are usually read letter by letter, while mixed case is read as a word. Test different spellings and pick the one that sounds best with your chosen voice.
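If you respell the same words often, the substitution can be applied automatically before the text is spoken. The following is a minimal sketch using the two respellings from the paragraph above; the table and the function names are assumptions for illustration and are not a feature of this tool.

```typescript
// Example table: written form -> phonetic respelling.
const RESPELLINGS: Record<string, string> = {
  "Vietnam": "vee-et-nam",
  "IPv6": "I P V six",
};

// Escapes characters that have a special meaning in a regular expression.
function escapeRegExp(source: string): string {
  return source.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replaces whole-word, case-sensitive matches of each table key.
// Entries are applied in turn, so a respelling should not contain another key.
function applyRespellings(text: string, table: Record<string, string>): string {
  let result = text;
  for (const word of Object.keys(table)) {
    const replacement = table[word];
    if (replacement === undefined) continue;
    const pattern = new RegExp("\\b" + escapeRegExp(word) + "\\b", "g");
    // A replacer function avoids "$" in the respelling being treated specially.
    result = result.replace(pattern, () => replacement);
  }
  return result;
}

// Usage: pass applyRespellings(userText, RESPELLINGS) to the utterance,
// while still showing the original userText on screen.
```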

Is this tool really private?

Yes, as long as you use a local voice. All processing then happens inside the browser tab using your operating system's speech engine. The text you type never leaves your computer; we do not send it to our server, to any analytics platform, or to any third-party TTS provider. You can verify this by opening your browser's developer tools, switching to the Network panel and clicking Speak: the page makes no outgoing requests. The exception is network-backed voices, such as the Chromebook 'natural' voices that Google delivers over the network and that clearly say 'natural' in the voice name; for these, the browser sends the text to its vendor's service for synthesis. If privacy is critical, deselect those and choose a voice marked as local-only or system-default.
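Each voice object in the Web Speech API carries a localService flag that the browser sets to true for on-device voices. A page can use it to hide or label network-backed voices. The sketch below shows that filter; the helper names are assumptions for illustration, and the flag is only as reliable as what the browser reports.

```typescript
interface VoiceEntry {
  name: string;
  lang: string;
  localService: boolean; // true when the browser reports an on-device voice
}

// Keeps only voices the browser reports as local.
function localVoicesOnly(voices: VoiceEntry[]): VoiceEntry[] {
  return voices.filter((voice) => voice.localService);
}

// Hypothetical helper: first local voice whose language tag starts with
// `langPrefix` (for example "en" or "en-US"), or null if there is none.
function pickLocalVoice(voices: VoiceEntry[], langPrefix: string): VoiceEntry | null {
  const wanted = langPrefix.toLowerCase();
  for (const voice of localVoicesOnly(voices)) {
    if (voice.lang.toLowerCase().indexOf(wanted) === 0) return voice;
  }
  return null;
}

// In a browser:
//   const voice = pickLocalVoice(window.speechSynthesis.getVoices(), "en");
//   if (voice) utterance.voice = voice;
```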

Who benefits the most from text-to-speech?

Writers use it to proofread drafts, because the ear catches awkward phrasing and dropped words that the eye glides over. Language learners use it to hear native pronunciation of vocabulary lists. People with dyslexia, ADHD or low vision use it as an assistive reading tool. Podcasters and YouTubers generate quick voice-overs for placeholder narration. Teachers turn handouts into audio versions for accessibility. Developers test interfaces with screen-reader-like output. Drivers and commuters convert articles into hands-free audio. The tool is intentionally lightweight and free so anyone — including users with slow connections or older hardware — can use it without signup, without payment, and without installing anything.