
Building a web app using Speechly Browser Client

Learn how to add voice features to a web app using Speechly Browser Client.

Getting started

This guide assumes basic knowledge of JavaScript app development. We'll create a simple HTML/JS web app and use Parcel as the bundler. Feel free to use your favorite bundler instead; this guide doesn't go deep into bundler configuration.

You'll also need a Speechly account and a Speechly application that's using a Conformer model. If you are new to Speechly, you can follow our quick start guide to get started.

Project setup

Before we get started, you need to create a directory for your project. Then, create some HTML and JS files inside a src directory and add some content into them.

mkdir src
touch src/index.html src/app.js
src/app.js
console.log('Hello world');
src/index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>My Speechly App</title>
<script type="module" src="app.js"></script>
</head>
<body>
<h1>Hello world</h1>
</body>
</html>

Next, install Parcel.

npm install --save-dev parcel
# or
yarn add --dev parcel

Then, update package.json with a source entry and a start script so Parcel knows which file to serve.

package.json
{
"source": "src/index.html",
"scripts": {
"start": "parcel"
},
"devDependencies": {
"parcel": "^2.8.3"
}
}

Finally, start the development server.

npm start

Open localhost:1234 to see the application running.

Installation

Now that our project is set up, install the @speechly/browser-client package.

npm install @speechly/browser-client
# or
yarn add @speechly/browser-client

Then, import BrowserClient, create a new client instance, and pass your App ID to it. You can get your App ID from the Speechly Dashboard or with the Speechly CLI list command.

src/app.js
import { BrowserClient } from '@speechly/browser-client';

const client = new BrowserClient({
appId: 'YOUR-APP-ID',
logSegments: true,
debug: true,
});

The debug and logSegments properties might be helpful when developing, as they display changes in the client state as well as log the API output.

If you have debug enabled, you should see from the developer console that the client has connected to the API. Now you are ready to capture microphone audio!
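Since debug and logSegments are mainly useful during development, you may want to switch them off in production builds. The following is a minimal sketch, not part of the Speechly API: buildClientOptions is a hypothetical helper, and it assumes your bundler sets process.env.NODE_ENV at build time (Parcel is expected to do this, but check your own setup).

```javascript
// Hypothetical helper (not part of @speechly/browser-client):
// enables the logging options only outside production builds.
function buildClientOptions(appId, nodeEnv) {
  const isDev = nodeEnv !== 'production';
  return {
    appId,
    logSegments: isDev, // log API output to the console
    debug: isDev, // log client state changes
  };
}

// Assumption: NODE_ENV is provided by the bundler or the runtime.
const options = buildClientOptions('YOUR-APP-ID', process.env.NODE_ENV);

// In src/app.js this object would be passed to the constructor:
// const client = new BrowserClient(options);
```

The returned object has the same three properties used above, so it can be passed to the constructor unchanged.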

Capture microphone audio

The easiest way to capture audio from the browser microphone is to create a button that toggles the microphone on and off.

First, import BrowserMicrophone and create a new microphone instance.

src/app.js
import { BrowserClient, BrowserMicrophone } from '@speechly/browser-client';

const microphone = new BrowserMicrophone();
const client = new BrowserClient({
appId: 'YOUR-APP-ID',
logSegments: true,
debug: true,
});

Next, create a button and a click handler for it where you attach the microphone to the client.

src/index.html
<button id="mic">Start microphone</button>
src/app.js
// ...
const micBtn = document.getElementById('mic');

const attachMicrophone = async () => {
if (microphone.mediaStream) return;
await microphone.initialize();
await client.attach(microphone.mediaStream);
};

const handleClick = async () => {
if (client.isActive()) {
await client.stop();
micBtn.innerText = 'Start microphone';
} else {
await attachMicrophone();
await client.start();
micBtn.innerText = 'Stop microphone';
}
};

micBtn.addEventListener('click', handleClick);

attachMicrophone is a helper function that initializes the microphone and attaches its media stream to the client. Browsers don't allow microphone access without user interaction, which is why it must be called from a user-initiated action such as a click.

The start and stop methods are used for manually controlling audio processing. Used together with client.isActive, they give you a simple on/off microphone toggle button.
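Because start and stop are asynchronous, a quick double click could trigger the handler again before the first call has finished. The sketch below shows one way to guard against that. createMicToggle and createFakeClient are hypothetical names introduced for illustration; the fake client only mimics the isActive, start, and stop methods used in this guide so the logic can be tried outside a browser. Whether the real client already tolerates overlapping calls is not covered here, so treat the guard as a precaution.

```javascript
// Hypothetical factory: builds a click handler from a client-like object.
// `client` needs isActive(), start() and stop(); `attach` and `setLabel`
// are callbacks supplied by the caller.
function createMicToggle(client, attach, setLabel) {
  let busy = false; // true while a start/stop call is still in flight

  return async function handleClick() {
    if (busy) return; // ignore clicks until the previous toggle settles
    busy = true;
    try {
      if (client.isActive()) {
        await client.stop();
        setLabel('Start microphone');
      } else {
        await attach();
        await client.start();
        setLabel('Stop microphone');
      }
    } finally {
      busy = false;
    }
  };
}

// Stand-in for BrowserClient, for illustration only.
function createFakeClient() {
  let active = false;
  return {
    isActive: () => active,
    start: async () => { active = true; },
    stop: async () => { active = false; },
  };
}
```

In the app, the handler could be created with createMicToggle(client, attachMicrophone, (label) => { micBtn.innerText = label; }) and registered with addEventListener in place of handleClick.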

The first time you press the microphone button you will be prompted for microphone permissions. If you have logSegments enabled, you should see the API output in the developer console.

React to API updates

A common pattern when working with Speechly Browser Client is to use the client.onSegmentChange method, which adds a listener for current segment change events.

First, create two elements for tentative and final transcripts.

src/index.html
<button id="mic">Start microphone</button>
<div id="transcripts"></div>
<p id="tentative"></p>

Next, in the onSegmentChange callback, create a transcript string from the segment.words array and render it to the appropriate elements.

src/app.js
//...
const transcripts = document.getElementById('transcripts');
const tentative = document.getElementById('tentative');

client.onSegmentChange((segment) => {
const text = segment.words.map((word) => word.value).join(' ');
tentative.innerHTML = `<em>${text}</em>`;
if (segment.isFinal) {
transcripts.innerHTML += `<p>${text}</p>`;
tentative.innerHTML = '';
}
});

segment is a structure that accumulates speech recognition (ASR) and natural language understanding (NLU) results. When segment.isFinal is false, the segment might be updated several times. When true, the segment won't be updated anymore and subsequent callbacks within the same audio context refer to the next segment.
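To make the tentative/final distinction concrete, here is a small sketch that runs without a browser or the Speechly client. It feeds three hand-written updates of the same segment through a reducer. segmentToText and applySegment are hypothetical helpers, and the sample objects contain only the fields used in this guide (words, word.value, isFinal); real segments carry more data. Skipping empty word slots is a defensive assumption, not documented behavior.

```javascript
// Join the word values of a segment into one string.
// Empty or missing entries are skipped defensively.
function segmentToText(segment) {
  return segment.words
    .filter((word) => word && word.value)
    .map((word) => word.value)
    .join(' ');
}

// Hypothetical reducer: keeps the finalized transcripts and the
// tentative transcript of the segment currently being updated.
function applySegment(state, segment) {
  const text = segmentToText(segment);
  if (segment.isFinal) {
    // The segment will not change anymore: store it and clear the tentative text.
    return { finals: [...state.finals, text], tentative: '' };
  }
  // The segment may still change: overwrite the tentative text.
  return { finals: state.finals, tentative: text };
}

// Hand-written sample updates for a single segment.
const updates = [
  { words: [{ value: 'turn' }], isFinal: false },
  { words: [{ value: 'turn' }, { value: 'on' }], isFinal: false },
  { words: [{ value: 'turn' }, { value: 'on' }, { value: 'lights' }], isFinal: true },
];

let state = { finals: [], tentative: '' };
for (const segment of updates) {
  state = applySegment(state, segment);
}

// After the loop, finals holds one entry ('turn on lights') and tentative is empty.
console.log(state);
```

In the app, the onSegmentChange callback could call applySegment and then render state.finals and state.tentative to the two elements, which keeps the DOM code separate from the transcript logic.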

Next steps

By now you should have a basic understanding of how to work with Speechly Browser Client and a working web app that's able to handle speech input and produce a transcript. Now you're ready to learn about some more advanced features of Speechly: