
Speechly On-device

Transcribe live audio in real-time, accurately and cost-effectively, right on the user's device.

Most speech recognition solutions are SaaS products. This requires sending large amounts of audio over the Internet to be processed in the cloud. However, for services that require real-time processing of tens of thousands of hours of audio per day, cloud-based solutions are often too expensive or cannot deliver transcripts in real-time. These difficulties are easily overcome by deploying the speech recognition software directly on the user’s device.

Speechly on-device is, at its core, a C library (the Speechly Decoder) that runs on a variety of CPUs and operating systems. It uses the same proprietary speech recognition models, trained on tens of thousands of hours of speech, that also power Speechly cloud. It can be built against different machine learning frameworks, such as TensorFlow Lite, Core ML, and ONNX Runtime, to provide optimal performance on each platform.

Enterprise feature

Deploying Speechly on-device is available on Enterprise plans.

Overview

Deploying Speechly on-device is a bit different to deploying it in the cloud, but the core concepts are the same:

  1. Select the application you want to use, or create a new one.
  2. From the Models section, select a small model and deploy changes if necessary.
  3. Download the model bundle and the Speechly Decoder library and import them into your project.
  4. Use the Speechly Decoder API to transcribe live audio in real-time (see the sketch below).
Tip

If you make changes to your training data, remember to deploy the changes, download the updated model bundle and import it to your project.
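
In code, the last step is a streaming loop: feed small chunks of 1-channel, 16 kHz audio to the decoder as they arrive from the microphone and read back transcript updates in real time. The Swift sketch below only illustrates this data flow; the StreamingDecoder protocol and its method names are placeholders, not the actual Speechly Decoder API (see the API reference at the end of this page for the real interface).

import Foundation

// Placeholder interface: the real Speechly Decoder API uses different,
// platform-specific names. This only illustrates the expected data flow.
protocol StreamingDecoder {
    func write(_ samples: [Int16])   // feed 16 kHz, 1-channel PCM samples
    func transcript() -> String      // current (partial) transcript
    func close()                     // finish the audio stream
}

// Feed microphone audio to the decoder chunk by chunk and print
// the partial transcript after each chunk.
func transcribe(chunks: AnyIterator<[Int16]>, with decoder: StreamingDecoder) {
    for chunk in chunks {
        decoder.write(chunk)
        print(decoder.transcript())
    }
    decoder.close()
}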

Technical specifications
  • Supported CPUs: x86, x86_64, arm32, arm64
  • Minimum CPU requirement: Depends on the platform; real-time decoding on modern arm64 SoCs consumes about 1-2 CPU cores.
  • Supported operating systems: Android (including Oculus), iOS, Windows, Linux, macOS, BSD variants.
  • Input audio: 1-channel, at least 16 kHz sample rate.
  • Impact on binary size (non-mobile platforms): ~6 MB when linked statically, ~500 kB when using a dynamic library that relies on e.g. TensorFlow Lite or Core ML.
  • Model size: 70-140 MB depending on accuracy requirements and available resources.

Select an appropriate model

Speechly offers models of different sizes and capabilities. For on-device use, only small models are supported.

Speechly Dashboard

  1. Go to Application → Overview → Model
  2. Select the small model you want to use

config.yaml

Add the following line to your config.yaml:

model: small-lowlatency-LATEST

Download the Speechly Decoder library

The Speechly Decoder library is available for Android, iOS, and Unity. For other platforms, such as native Windows applications, we can provide either a pre-compiled dynamic or a static library plus the required header files.

Integrating the library doesn't require expertise in speech recognition, but you must be able to capture real-time audio from the device microphone. You can download the library from Speechly Dashboard by going to Application → Integrate.
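
Capturing audio in the required format is the platform-specific part. As a reference, the sketch below shows one way to do it on Apple platforms with AVAudioEngine and AVAudioConverter, resampling the microphone input to the 1-channel, 16 kHz PCM the decoder expects. The MicCapture class and its process callback are illustrative and not part of the Speechly Decoder library; AVAudioSession setup and microphone permission handling are omitted for brevity.

import AVFoundation

// Captures microphone audio and converts it to 1-channel, 16 kHz,
// 16-bit PCM buffers. Requires microphone permission and, on iOS,
// an appropriately configured AVAudioSession.
final class MicCapture {
    private let engine = AVAudioEngine()

    func start(process: @escaping (AVAudioPCMBuffer) -> Void) throws {
        let input = engine.inputNode
        let hwFormat = input.outputFormat(forBus: 0)

        guard let target = AVAudioFormat(commonFormat: .pcmFormatInt16,
                                         sampleRate: 16_000,
                                         channels: 1,
                                         interleaved: true),
              let converter = AVAudioConverter(from: hwFormat, to: target) else {
            throw NSError(domain: "MicCapture", code: -1)
        }

        input.installTap(onBus: 0, bufferSize: 4096, format: hwFormat) { buffer, _ in
            // Resample the hardware-format buffer into the target format.
            let ratio = target.sampleRate / hwFormat.sampleRate
            let capacity = AVAudioFrameCount(Double(buffer.frameLength) * ratio) + 1
            guard let out = AVAudioPCMBuffer(pcmFormat: target, frameCapacity: capacity) else { return }
            var consumed = false
            _ = converter.convert(to: out, error: nil) { _, status in
                if consumed {
                    status.pointee = .noDataNow
                    return nil
                }
                consumed = true
                status.pointee = .haveData
                return buffer
            }
            process(out) // hand the 16 kHz mono buffer to the decoder
        }
        try engine.start()
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
    }
}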

Download a model bundle

To use the Speechly Decoder library, you need a model bundle. Model bundles are available for three machine learning frameworks: ONNX Runtime, TensorFlow Lite, and Core ML.

The model bundle you need depends on the platform you are developing for: for example, the iOS instructions below use a Core ML bundle and the Android instructions a TensorFlow Lite bundle. Note that all model bundles have a predefined lifetime, after which the Speechly Decoder library refuses to load them.

Speechly Dashboard

  1. Go to Application → Overview → Model
  2. Click the version you want to download

Speechly CLI

Use the download command:

speechly download YOUR_APP_ID . --model coreml
# Available options are: ort, tflite, coreml and all

Get started with iOS

If you would like to try out Speechly on-device streaming transcription on iOS, there's an iOS example application you can run in the simulator or on an iOS device.

Before you start

Make sure you have created and deployed a Speechly application. You'll also need a Core ML model bundle and the SpeechlyDecoder.xcframework library. See above for instructions on how to download them.

Copy the example app

Copy the example app using degit:

npx degit speechly/speechly/examples/ios-decoder-example my-ios-app
cd my-ios-app

Add dependencies

Open Decoder.xcodeproj in Xcode and add both SpeechlyDecoder.xcframework and YOUR_MODEL_BUNDLE.coreml.bundle to the project by dragging and dropping them into the Frameworks folder:

(Xcode screenshot)

Make sure Copy items if needed, Create groups and Add to targets are selected:

(Xcode screenshot)

In Decoder/SpeechlyManager.swift update the model bundle resource URL:

let bundle = Bundle.main.url(
    forResource: "YOUR_MODEL_BUNDLE.coreml",
    withExtension: "bundle"
)!
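
If the app crashes on this force-unwrap at launch, the model bundle most likely wasn't copied into the app target. A guard with an explicit message (a small optional variation, not required by the example app) makes this easier to diagnose:

guard let bundle = Bundle.main.url(
    forResource: "YOUR_MODEL_BUNDLE.coreml",
    withExtension: "bundle"
) else {
    fatalError("Model bundle not found - check that it was added to the app target")
}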

Run the app

Run the app and grant it microphone permissions when prompted.
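
Microphone access on iOS also requires an NSMicrophoneUsageDescription entry in the app's Info.plist; without it, the app cannot access the microphone. The permission prompt itself comes from the standard AVAudioSession API. A minimal sketch (the function name is illustrative):

import AVFoundation

// Ask the user for microphone permission and report the result on the main queue.
func requestMicrophoneAccess(completion: @escaping (Bool) -> Void) {
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        DispatchQueue.main.async {
            completion(granted)
        }
    }
}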

Get started with Android

If you would like to try out Speechly on-device streaming transcription on Android, there's an Android example application you can run in an emulator or on an Android device.

Before you start

Make sure you have created and deployed a Speechly application. You'll also need a TensorFlow Lite model bundle and the SpeechlyDecoder.aar library. See above for instructions on how to download them.

Copy the example app

Copy the example app using degit:

npx degit speechly/speechly/examples/android-decoder-example my-android-app
cd my-android-app

Add dependencies

Put SpeechlyDecoder.aar in a directory that Gradle can find. For example, add a flatDir repository to the dependencyResolutionManagement section in your settings.gradle:

dependencyResolutionManagement {
    repositories {
        flatDir {
            dirs '/path/to/decoder'
        }
    }
}

In your build.gradle dependencies section add:

dependencies {
    implementation 'org.tensorflow:tensorflow-lite:2.9.0'
    implementation(name: 'SpeechlyDecoder', ext: 'aar')
}

If the model bundle is packaged as part of the application, make sure it is not compressed when building the .apk by updating the android section in your build.gradle:

android {
    aaptOptions {
        noCompress 'bundle'
    }
}

Open DecoderTest in Android Studio and add YOUR_MODEL_BUNDLE.tflite.bundle to the project by dragging and dropping it into the build/src/main/assets folder:

(Android Studio screenshot)

In MainActivity.java, update the model bundle resource:

this.bundle = loadAssetToByteBuffer("YOUR_MODEL_BUNDLE.tflite.bundle");

Run the app

Run the app and grant it microphone permissions when prompted.

Get started with C

If you would like to try out Speechly on-device streaming transcription using plain C, there's a C example program that you can compile and run. The repository contains a README describing how to use it and any prerequisites.

API reference

Speechly Decoder API