
Getting started with Speechly

This step-by-step guide is for anyone using Speechly for the first time.

Quick introduction

Speechly is a voice technology that offers Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) tools and APIs. We've built it from the ground up, and here are some features we think you'll like:

Create a Speechly account

Before you can use Speechly products, you'll need to create a Speechly account. Signing up is free and includes:

  • 50h of API quota
  • All the latest models
  • Cloud deployment
  • Batch and Streaming transcription
  • NLU features

Create a Speechly application

Once you have your account set up, you need to create a Speechly application. Each application in Speechly hosts its own training data and settings. Your project can contain as many applications as you like.

For now we'll create an empty application that you can expand later on. You can do this using Speechly Dashboard or Speechly CLI.

Speechly Dashboard

  1. Open Create a new application
  2. Give your application a Name
  3. Press Create application
  4. Deploy the application

Speechly CLI

To access your project from Speechly CLI, you need to create a Speechly API token by going to Project settings → API tokens. Make sure to copy and store the token; you'll need it soon.

Install Speechly CLI:

# Using Homebrew
brew tap speechly/tap
brew install speechly

# Using Scoop
scoop bucket add speechly https://github.com/speechly/scoop-bucket
scoop install speechly

Add your project:

speechly projects add -apikey YOUR_API_TOKEN

Create a new application:

mkdir my-app
cd my-app
speechly create "My first app"

Copy the App ID and deploy the application:

speechly deploy YOUR_APP_ID .

Transcribe live audio

Streaming transcription works in real time and is perfect for working with live audio, for example when capturing audio from the device microphone. To demonstrate this, we'll be using Speechly Dashboard.

Open Preview

  1. Open Speechly Dashboard
  2. Open your Application
  3. Go to the Preview tab

Start talking

Once there, press the microphone button, give the site access to your microphone, and start talking. Notice how the transcript appears in real time!


Analyze the response

Open the browser developer console to see the JSON response that's emitted for each speech segment:

{
"id": 0,
"contextId": "9af98c09-c974-4393-9368-0c64a6c3e583",
"isFinal": true,
"words": [
{
"value": "welcome",
"index": 2
},
{
"value": "to",
"index": 3
},
{
"value": "my",
"index": 4
},
{
"value": "first",
"index": 5
},
{
"value": "application",
"index": 6
}
],
"entities": [],
"intent": {
"intent": "",
"isFinal": true
}
}
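As a rough illustration of working with this response, the sketch below parses the sample segment and joins its words into a plain transcript. It is not part of any Speechly client library: the helper name `segment_to_text` is hypothetical, and ordering the words by their `index` field is an assumption based on the sample above.

```python
import json

# Sample segment copied from the response shown above.
segment_json = """
{
  "id": 0,
  "contextId": "9af98c09-c974-4393-9368-0c64a6c3e583",
  "isFinal": true,
  "words": [
    {"value": "welcome", "index": 2},
    {"value": "to", "index": 3},
    {"value": "my", "index": 4},
    {"value": "first", "index": 5},
    {"value": "application", "index": 6}
  ],
  "entities": [],
  "intent": {"intent": "", "isFinal": true}
}
"""


def segment_to_text(segment: dict) -> str:
    """Join the words of a segment into one string (hypothetical helper).

    Assumption: sorting by "index" gives the spoken word order.
    """
    words = sorted(segment.get("words", []), key=lambda w: w["index"])
    return " ".join(w["value"] for w in words)


segment = json.loads(segment_json)
transcript = segment_to_text(segment)
print(transcript)  # prints: welcome to my first application
```

The same function can be applied to any segment object with a `words` list shaped like the one above.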

What next?

Now that you have received your first streaming transcript using Speechly, check out Speechly On-device. It's an excellent way of transcribing live audio in real time, accurately and cost-effectively, right on the user's device.

Transcribe pre-recorded audio

Batch transcription works asynchronously and is perfect for working with pre-recorded audio. To demonstrate this, we'll be using the transcribe command. If you used Speechly Dashboard in the previous steps, now is a good time to install and set up Speechly CLI.

Choose an audio file

Use an existing audio file, record your own, or use our sample audio file. We only support 1-channel, 16-bit, 16 kHz PCM WAV files at the moment.

Tip

Convert audio files with Audacity or with SoX by running: sox in.wav -c 1 -b 16 -r 16000 out.wav
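If you are unsure whether a file already meets the format requirement, a quick local check can help before uploading. The following is a minimal sketch using Python's standard `wave` module. The helper names `is_supported_wav` and `write_silence` are hypothetical, and the silent test files exist only to demonstrate the check.

```python
import os
import tempfile
import wave


def is_supported_wav(path: str) -> bool:
    """Return True if the file is a 1-channel, 16-bit, 16 kHz PCM WAV."""
    try:
        with wave.open(path, "rb") as wav:
            return (
                wav.getnchannels() == 1
                and wav.getsampwidth() == 2  # 2 bytes per sample = 16 bits
                and wav.getframerate() == 16000
            )
    except (wave.Error, EOFError):
        # Not a WAV file the wave module can read (or a truncated file).
        return False


def write_silence(path: str, channels: int, sample_rate: int) -> None:
    """Write 0.1 s of 16-bit silence; used only to demo the check."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * channels * (sample_rate // 10))


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "mono16k.wav")
    bad = os.path.join(tmp, "stereo44k.wav")
    write_silence(good, channels=1, sample_rate=16000)
    write_silence(bad, channels=2, sample_rate=44100)
    good_ok = is_supported_wav(good)
    bad_ok = is_supported_wav(bad)

print(good_ok, bad_ok)  # prints: True False
```

A file that fails the check can be converted with the SoX command from the tip above.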

Info

The transcribe command also supports transcribing multiple files. Create a JSON Lines file with each audio file on its own line using the format: {"audio": "path/to/file.wav"}. Then simply pass the JSON Lines file as the input file!
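One way to produce such a JSON Lines file is with a short script. This is a sketch only: the helper `to_json_lines` and the file names are hypothetical placeholders, and the only part taken from the text above is the `{"audio": ...}` line format.

```python
import json


def to_json_lines(paths):
    """Return JSON Lines text with one {"audio": path} object per line."""
    return "".join(json.dumps({"audio": p}) + "\n" for p in paths)


# Hypothetical file names; replace them with your own recordings.
audio_files = ["recordings/first.wav", "recordings/second.wav"]
manifest = to_json_lines(audio_files)
print(manifest, end="")

# To save the result, write it to a file of your choice, for example:
# with open("files.jsonl", "w", encoding="utf-8") as f:
#     f.write(manifest)
```

The saved file would then take the place of the single audio path when running the transcribe command.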

Upload audio

Open your terminal, navigate to the directory that contains your audio file, and run:

speechly transcribe path/to/file.wav -a YOUR_APP_ID

You can use the same App ID as in the previous example.

See results

Your transcript will appear in the terminal once it's ready.

Example output
hi i'm neil degrasse tyson astrophysicist in addition to probing the
secrets of the universe also a movie buff today i introduce you to a
film that everyone thought was lost forever until a print was recently
discovered in a hollywood vault future thirty eight forgotten treasure
from nineteen thirty eight it's one of the first color pictures preceding
gone with the wind and the wizard of oz by a year but what interest me
most is the science finally a movie that gets time travel right

What next?

Now that you have received your first batch transcript using Speechly, check out Speechly On-premise. It's an excellent way of transcribing large quantities of pre-recorded audio accurately and asynchronously on-premise or in your private cloud.

Need help?

Post a question to our GitHub Discussions page. For more concrete technical problems, please file an issue.

Try to be as specific as you can. Describe what you are trying to do, how you do it, and what errors (if any) you are getting. We are happy to help!