Welcome to your first step in building an AI Cooking Helper with TypeScript and Express! In this lesson, you will learn how to add audio features to your API using Text-to-Speech (TTS) technology. Imagine you are cooking and your hands are messy — being able to listen to recipe steps instead of reading them can make the process much easier and more enjoyable. By the end of this lesson, you will know how to generate audio from text and serve it through your API, making your cooking assistant more helpful and accessible.
Now, let’s talk about gTTS. gTTS stands for Google Text-to-Speech. It is a library that takes text and turns it into spoken audio using Google’s TTS engine. In Node.js, you can use the gtts package to access this functionality. This is useful for making your app more accessible, especially for users who prefer listening over reading.
Important: gTTS sends your text to Google’s servers to generate the audio, so it requires an active internet connection. If your server or development environment is offline, gTTS will not work.
To use gtts in your project, you can install it from npm by running npm install gtts.
In the CodeSignal environment, this library is already installed, so there is no need to run this command.
Let’s build the function that will turn text into audio. We will use the gTTS class from the gtts package and Node.js streams. Here’s how we do it, step by step:
First, we need to import the classes we will use:
- gTTS is the main class for converting text to speech.
- Readable is used to work with audio data as a stream, which is efficient for web APIs.
Now, let’s write the function that takes some text and returns a readable audio stream:
Let’s break this down:
- const tts = new gTTS(text); creates a gTTS object with the text you want to convert.
- We try to get a readable stream from the gTTS object. If available, we return it directly.
- The function returns a readable stream, which can be sent directly to the client.
This function does not save any files to disk. The audio passes through memory as a stream, which is fast and efficient for web APIs.
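To make the steps above concrete, here is a minimal sketch of what such a function could look like. It is not necessarily the exact code used in this course. To keep the example runnable without an internet connection, the speech engine is passed in as a parameter: SpeechEngine, EngineFactory, and fakeEngine are hypothetical names introduced only for this illustration. Throwing an error when no stream is available is also an assumption, since the lesson does not say what happens in that case.

```typescript
import { Readable } from "node:stream";

// Minimal shape this sketch relies on: an object that may expose stream().
// Assumption: instances of the gTTS class from the gtts package fit this shape.
interface SpeechEngine {
  stream?: () => Readable;
}

// Hypothetical factory type, so the real engine can be swapped for a fake one.
type EngineFactory = (text: string) => SpeechEngine;

// Takes some text and returns a readable audio stream. Nothing is written to disk.
function ttsStream(text: string, createEngine: EngineFactory): Readable {
  const tts = createEngine(text);

  // Try to get a readable stream from the engine; return it directly if available.
  if (typeof tts.stream !== "function") {
    // Assumption: failing loudly is acceptable when no stream is available.
    throw new Error("The speech engine does not expose a stream() method");
  }
  return tts.stream();
}

// Offline stand-in for the real engine, used only to exercise the function.
const fakeEngine: EngineFactory = (text) => ({
  stream: () => Readable.from([`fake audio for: ${text}`]),
});

const audio = ttsStream("Preheat the oven to 180 degrees.", fakeEngine);
console.log(audio instanceof Readable); // true
```

In a real project, you would pass a factory that builds the actual engine, for example (t) => new gTTS(t), assuming the gtts package is installed and imported.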
Now that we have a function to generate audio, let’s create an API endpoint that uses it. We want users to be able to send some text to our API and get back an audio file.
In your Express routes file, add the following code:
Let’s explain what happens here:
- const text = String(req.query.text ?? ""); gets the text parameter from the URL query string. For example, /api/tts?text=Hello%20world.
- If no text is provided, the function returns an error message and a 400 status code.
- If text is provided, it calls ttsStream(text) to get the audio stream.
- stream.pipe(res); streams the audio back to the user as an MP3 file.
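The handler logic described above can be sketched as follows. This is an illustration under stated assumptions, not the course's exact route code. So that it runs without Express or a network connection, it uses small structural stand-ins (TtsRequest, TtsResponse, AudioSource) and a hypothetical makeTtsHandler helper. The error message text and the audio/mpeg Content-Type header are also assumptions; audio/mpeg is the standard MIME type for MP3.

```typescript
// Stand-ins for the parts of the request and response objects used here.
interface TtsRequest {
  query: { text?: unknown };
}

interface TtsResponse {
  status(code: number): TtsResponse;
  json(body: unknown): unknown;
  setHeader(name: string, value: string): unknown;
}

// Anything that can be piped into the response, such as a readable stream.
interface AudioSource {
  pipe(destination: TtsResponse): unknown;
}

// Builds the handler around a ttsStream-like function passed in by the caller.
function makeTtsHandler(ttsStream: (text: string) => AudioSource) {
  return (req: TtsRequest, res: TtsResponse): void => {
    // Read the text parameter from the query string, defaulting to "".
    const text = String(req.query.text ?? "");

    // No text provided: answer with an error message and a 400 status code.
    if (text === "") {
      res.status(400).json({ error: "Missing 'text' query parameter" });
      return;
    }

    // Text provided: generate the audio stream and send it to the client.
    const stream = ttsStream(text);
    res.setHeader("Content-Type", "audio/mpeg"); // assumption: MP3 output
    stream.pipe(res);
  };
}

// Recording response used only to try the handler without a server.
function makeFakeResponse() {
  const log: string[] = [];
  const res: TtsResponse = {
    status(code) { log.push(`status ${code}`); return res; },
    json(body) { log.push(`json ${JSON.stringify(body)}`); return res; },
    setHeader(name, value) { log.push(`header ${name}: ${value}`); return res; },
  };
  return { res, log };
}

const handler = makeTtsHandler(() => ({
  pipe: () => { /* a real stream would start sending MP3 bytes here */ },
}));

const missing = makeFakeResponse();
handler({ query: {} }, missing.res);
console.log(missing.log[0]); // status 400
```

In an Express app, the same body would sit inside the callback registered for GET /api/tts, with the real req and res objects taking the place of the stand-ins.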
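If you visit /api/tts?text=Welcome%20to%20your%20smart%20cooking%20assistant. on the host and port where your server is running, you will receive an MP3 audio file that says: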
“Welcome to your smart cooking assistant.”
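Because the text travels in the query string, spaces and special characters need to be URL-encoded. The sketch below shows one way to build such a URL. buildTtsUrl is a hypothetical helper introduced for illustration, and the /api/tts path is taken from this lesson.

```typescript
// Builds a request URL for the TTS endpoint, encoding the text so that
// spaces and characters like "&" do not break the query string.
function buildTtsUrl(text: string, basePath: string = "/api/tts"): string {
  return `${basePath}?text=${encodeURIComponent(text)}`;
}

console.log(buildTtsUrl("Hello world")); // /api/tts?text=Hello%20world
```

Without encoding, a recipe step such as "salt & pepper" would be cut off at the ampersand, because & separates query parameters.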
In this lesson, you learned how to add Text-to-Speech (TTS) support to your Express API using TypeScript and the gtts library. You wrote a function to convert text into audio streams and created a new /api/tts endpoint that returns audio files to users. This makes your cooking assistant more helpful, especially for users who want to listen to recipes while cooking.
Next, you will get a chance to practice generating audio and working with the new endpoint. You will try out different texts and see how the API responds with audio. This hands-on practice will help you become comfortable with adding audio features to your web applications.
