Welcome to the first lesson of our course on audio processing and transcription using Howler.js, TypeScript, and OpenAI Whisper. This lesson lays the groundwork by building a system where users can select an audio file, play it, pause it, or stop it—using backend-controlled logic and frontend audio playback powered by Howler.js.
Howler.js is a JavaScript audio library that simplifies working with sound in the browser. It supports powerful features like streaming, cross-browser playback, volume control, playback-position tracking, and more.
The backend doesn’t have access to the browser’s audio system or hardware. It cannot play or pause actual audio because:

- Node.js runs on the server (not the user’s machine).
- There is no audio output device available to “play” sound.

Instead, the backend only tracks playback intent—like which file the user wants to play—while the frontend (in the browser) uses Howler.js to play the audio.

- Backend = Playback state manager
- Frontend = Actual audio player
We’ll use Howler.js in the frontend. Locally, you can install it using:
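For example, with npm (the package on the npm registry is named `howler`):

```shell
npm install howler
```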
Also make sure that you've installed the required packages for the backend, including TypeScript and Express:
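A typical backend setup might look like the following (the exact dev dependencies are an assumption; adjust to your project):

```shell
npm install express
npm install --save-dev typescript @types/express @types/node
```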
By the end of this lesson, you'll be able to:
- Serve audio files and playback routes from a TypeScript backend.
- Use Howler.js to control audio playback in the browser.
- Sync backend playback state with frontend playback logic.
We'll use a simple TypeScript module to store which audio file is currently selected.
This file acts like a tiny in-memory database:
- `setCurrentAudio()` tells the backend which file is being played.
- `getCurrentAudio()` lets us fetch the current selection.
- `clearCurrentAudio()` resets everything when audio stops.
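A minimal sketch of such a module might look like this (the function names come from the list above; the file name `audioState.ts` is an assumption):

```typescript
// audioState.ts — a tiny in-memory "database" for the current selection.
// (A sketch; the course's actual file may differ in details.)

let currentAudio: string | null = null;

// Tell the backend which file is being played.
export function setCurrentAudio(filePath: string): void {
  currentAudio = filePath;
}

// Fetch the current selection (null when nothing is selected).
export function getCurrentAudio(): string | null {
  return currentAudio;
}

// Reset everything when audio stops.
export function clearCurrentAudio(): void {
  currentAudio = null;
}
```

Because the state lives in a module-level variable, it persists across requests for as long as the server process runs—which is all we need for tracking playback intent.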
Next, we define the routes that the frontend will call when the user presses Play, Pause, or Stop.
These routes allow the browser to:
- Start playback (`/play`)
- Pause audio (`/pause`)
- Stop and clear the current file (`/stop`)
- Get the currently selected audio (`/current`)
The actual sound isn’t played here—this just manages state.
Now let’s look at the core of the playback logic. This happens in the browser, inside public/app.js.
Here’s the handlePlayback() function that gets triggered when the user clicks Play, Pause, or Stop.
The handlePlayback function is responsible for syncing user actions (like play, pause, or stop) with both the backend and the Howler.js audio player in the browser.
When a user clicks one of the control buttons, the function first sends a request to the appropriate backend route (/play, /pause, or /stop) to update the playback state. Then, it uses Howler.js to handle the actual audio playback in the browser.
If the user chooses to play an audio file, a new Howl instance is created with the selected file path. The `html5: true` setting ensures that large audio files are streamed efficiently. The `onplay` callback starts a timer that tracks the current position of the audio in real time using `seek()`, while the `onend` callback cleans up that timer and resets playback flags when the audio finishes.
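A sketch of that logic might look like this (variable names such as `positionTimer` are assumptions; the course's actual `public/app.js` may differ):

```javascript
// public/app.js — core playback logic (a sketch, not the definitive file).
// Assumes Howler.js is loaded globally, e.g. via a <script> tag.

let sound = null;
let positionTimer = null;
let lastKnownPosition = 0;

async function handlePlayback(action, filePath) {
  // 1. Update the backend's playback state first.
  await fetch(`/${action}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ file: filePath }),
  });

  // 2. Drive the actual audio in the browser with Howler.js.
  if (action === "play") {
    sound = new Howl({
      src: [`/${filePath}`], // which file to play (served from the backend)
      html5: true,           // stream via a native audio element (good for large files)
      onplay: () => {
        // Poll the playhead every 200ms so we always know where we are.
        positionTimer = setInterval(() => {
          lastKnownPosition = sound.seek();
        }, 200);
      },
      onend: () => {
        // Clean up the timer and reset playback state.
        clearInterval(positionTimer);
        lastKnownPosition = 0;
      },
    });
    sound.play();
  } else if (action === "pause" && sound) {
    sound.pause();
  } else if (action === "stop" && sound) {
    sound.stop();
    clearInterval(positionTimer);
  }
}
```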
What This Does:

- `src: [`/${filePath}`]`: Tells Howler.js which file to play (from the server).
- `html5: true`: Uses native browser audio instead of the Web Audio API, which works better for large files like long MP3s.
- `onplay`: This event runs as soon as audio starts playing.
  - Starts a `setInterval()` every 200ms to track the current position using `seek()`.
  - This value (`lastKnownPosition`) will help us later during transcription.
- `onend`: Called when the audio finishes.
  - Clears the tracking interval and resets playback state.
This setup lets us continuously monitor audio position—critical for timestamp-based transcription.
Each button in your HTML UI is connected to this handlePlayback() function:
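The wiring might look like the following sketch. The element IDs (`playBtn`, `pauseBtn`, `stopBtn`, `audioSelect`) are assumptions—adapt them to your actual HTML—and `handlePlayback()` is assumed to be in scope from `public/app.js`:

```javascript
// Sketch: connecting the UI buttons to handlePlayback().
// Element IDs are hypothetical; match them to your own markup.
function wireButtons() {
  document.getElementById("playBtn").addEventListener("click", () => {
    // The selected file path is read from an assumed <select> element.
    handlePlayback("play", document.getElementById("audioSelect").value);
  });
  document.getElementById("pauseBtn").addEventListener("click", () => {
    handlePlayback("pause");
  });
  document.getElementById("stopBtn").addEventListener("click", () => {
    handlePlayback("stop");
  });
}
// In the browser: document.addEventListener("DOMContentLoaded", wireButtons);
```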
When the user interacts with the interface:
- The correct route is called on the backend to update playback state.
- The Howler.js instance plays, pauses, or stops the audio accordingly.
In this lesson, you built the core playback system of your application:
- You learned what Howler.js is and why it's used in the browser.
- You created backend routes to track playback state.
- You used Howler.js in the frontend to control audio playback.
- You monitored audio position in real time using `seek()` and `setInterval()`.
This gives you a complete backend-driven playback system with real audio happening in the browser. You now have everything you need to load files, control playback, and track what’s happening.
