Speechly Annotation Language (SAL)
SAL is a powerful syntax designed to make annotating example utterances faster and easier.
This document covers all SAL features with examples as well as SAL semantics and template expansion.
Annotation syntax
The annotation syntax is used to annotate intents and entities.
Intent
The intent in an utterance indicates what the user in general wants. Every example utterance must have an intent assigned to it.
Intents are defined by prepending the example with *intent_name. The rest of the sentence after the *intent_name part will be recognized as having intent intent_name.
For example:
*show_products show all products
An utterance "Show all products" will return the intent show_products.
Entity
Entities are “local snippets of information” in an utterance that describe details relevant to the user's need.
Entities are annotated with the [entity value](entity name) notation.
For example:
*show_products show [jeans](category)
An utterance "Show jeans" will return the value jeans for entity name category.
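To make the annotation syntax concrete, the following is a minimal Python sketch that reads an annotated example utterance and returns its intent, plain text, and entities. The function name parse_annotated_utterance and the regular expression are illustrative assumptions, not part of any Speechly tooling, and the sketch handles only plain annotations without template notation.

```python
import re

# Hypothetical helper for illustration only; it is not a Speechly API.
# Matches one [entity value](entity name) annotation.
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")

def parse_annotated_utterance(line):
    """Split '*intent text [value](name)' into (intent, text, entities)."""
    intent_token, _, rest = line.strip().partition(" ")
    if not intent_token.startswith("*") or len(intent_token) == 1:
        raise ValueError("an example utterance must start with *intent_name")
    # Collect (entity name, entity value) pairs in order of appearance.
    entities = [(m.group(2), m.group(1)) for m in ENTITY_PATTERN.finditer(rest)]
    # Replace each annotation with its bare value to recover the spoken text.
    text = ENTITY_PATTERN.sub(lambda m: m.group(1), rest)
    return intent_token[1:], text, entities

print(parse_annotated_utterance("*show_products show [jeans](category)"))
# → ('show_products', 'show jeans', [('category', 'jeans')])
```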
Template notation
Template notation is used to define templates that are expanded into a large set of example utterances during model training.
Lists
Lists are defined by [exp1 | exp2 | ... | exp_N], where exp1, exp2, ... are arbitrary SAL expressions.
When a template having a list is expanded, only one of the list elements is used in the final example utterance.
For example, the template
*show_products [show | view | i want to see] products
is equivalent to writing
*show_products show products
*show_products view products
*show_products i want to see products
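The equivalence above can be reproduced with a short Python sketch that enumerates every choice for each list. The helper name expand_lists is hypothetical, and the sketch assumes flat (non-nested) lists written on a single line; it is not how the training system itself processes templates.

```python
import re

# Matches a flat list such as [show | view | i want to see].
# A bracketed entity value without "|" (e.g. [jeans]) is deliberately not matched.
FLAT_LIST = re.compile(r"\[([^\[\]]*\|[^\[\]]*)\]")

def expand_lists(template):
    """Return every utterance obtained by choosing one item per flat list."""
    match = FLAT_LIST.search(template)
    if match is None:
        return [template]
    before, after = template[:match.start()], template[match.end():]
    results = []
    for item in match.group(1).split("|"):
        item = item.strip()
        # If the list is an entity value, keep the brackets around the chosen item.
        if after.startswith("("):
            item = "[" + item + "]"
        results.extend(expand_lists(before + item + after))
    return results

for utterance in expand_lists("*show_products [show | view | i want to see] products"):
    print(utterance)
```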
Optional parts
A substring of an example utterance can be declared optional by enclosing it in curly braces: {this substring is optional}. The optional part can be an arbitrary SAL expression.
When the template is expanded, each optional part is either included in or omitted from the resulting example utterance.
For example, the template
*show_products {show} products {please}
is equivalent to writing
*show_products show products please
*show_products show products
*show_products products please
*show_products products
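As an illustration, the four variants above can be enumerated with a small Python sketch that tries each optional part both ways. The name expand_optionals is an assumption made for this example; a template with k optional parts yields up to 2^k utterances.

```python
import re

# Matches an innermost optional part such as {please}.
OPTIONAL = re.compile(r"\{([^{}]*)\}")

def expand_optionals(template):
    """Return all utterances with each optional part kept or omitted."""
    match = OPTIONAL.search(template)
    if match is None:
        # Collapse the extra whitespace left behind by omitted parts.
        return [" ".join(template.split())]
    before, after = template[:match.start()], template[match.end():]
    kept = expand_optionals(before + match.group(1) + after)
    omitted = expand_optionals(before + after)
    return kept + omitted

for utterance in expand_optionals("*show_products {show} products {please}"):
    print(utterance)
```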
Variables
Variables are declared with the syntax variable_name = <arbitrary-SAL-expression>, and their value is accessed with $variable_name. You can assign any SAL expression to a variable.
For example, a common use case for variables is a list of entity values:
categories = [jeans | shoes | shirts | accessories]
*show_products show $categories(category)
Note that $categories(category) above is shorthand for [$categories](category). When the entity value is looked up from a variable, the brackets are not necessary.
Variables can also be used to assemble complex phrases from simple components:
digit = [one | two | three | four | five | six | seven | eight | nine | zero]
symbol = [hash | slash | dash]
product_code = $digit $digit $symbol $digit $digit $digit $digit
Above, product_code defines a template that expands to all possible utterances that start with two digits, followed by one of the symbols, followed by four digits, such as "six four dash nine nine zero four" or "one two hash three four five six".
Every variable x must be declared in your configuration before it can be used with the $x notation.
✅ This is ok
// Variable defined before it's used
x = [hello | hi | greetings]
*greet $x
🚫 These are not ok
// Variable used before it's defined.
*greet $x
x = [hello | hi | greetings]
// Recursive variable declaration.
products = [
jeans
shoes
$products
]
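The declare-before-use rule can be checked mechanically. Below is a rough Python sketch of such a check; find_undeclared_references is a hypothetical helper, and it assumes single-line definitions, // comment lines as used in the examples above, and that names starting with SPEECHLY. are predefined. A variable that refers to itself is reported too, because its name counts as declared only after its own definition has been read.

```python
import re

# Assumed shapes: "name = expression" for definitions, "$name" for references.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=(.*)$")
REFERENCE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z0-9_]+)*)")

def find_undeclared_references(lines):
    """Return (line_number, variable_name) for each use before declaration."""
    declared = set()
    problems = []
    for number, line in enumerate(lines, start=1):
        if line.strip().startswith("//"):
            continue  # skip comment lines
        assignment = ASSIGNMENT.match(line)
        body = assignment.group(2) if assignment else line
        for name in REFERENCE.findall(body):
            if name not in declared and not name.startswith("SPEECHLY."):
                problems.append((number, name))
        if assignment:
            # Declared only now, so a self-reference above is flagged.
            declared.add(assignment.group(1))
    return problems

print(find_undeclared_references(["*greet $x", "x = [hello | hi | greetings]"]))
# → [(1, 'x')]
```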
Standard variables
Standard variables are predefined variables that make supporting certain common but somewhat complex expressions easier. They look like regular variables and are used in the same way, but you don't need to define them. They are useful when your application must support numbers, dates, times, sequences of alphanumeric characters, email addresses, etc.
*schedule Let's book a meeting for $SPEECHLY.DATE(meeting_date) at $SPEECHLY.TIME(meeting_time)
*send Send an email to $SPEECHLY.EMAIL(recipient)
*find My order ID is $SPEECHLY.IDENTIFIER(order_id)
See Standard variables to learn more.
Permutations
A permutation generates all possible orderings of the given list of expressions. It is defined with the syntax ![exp1 | exp2 | ... | exp_N], where exp1, exp2, ... can be arbitrary SAL expressions.
For example
*book Book a ticket ![from [New York](from) | to [London](to) | for [two](num_passengers)]
is equivalent to writing:
*book Book a ticket from [New York](from) to [London](to) for [two](num_passengers)
*book Book a ticket from [New York](from) for [two](num_passengers) to [London](to)
*book Book a ticket to [London](to) from [New York](from) for [two](num_passengers)
*book Book a ticket to [London](to) for [two](num_passengers) from [New York](from)
*book Book a ticket for [two](num_passengers) from [New York](from) to [London](to)
*book Book a ticket for [two](num_passengers) to [London](to) from [New York](from)
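The six orderings can also be generated programmatically. The following Python sketch uses itertools.permutations from the standard library; expand_permutation is a name chosen for this illustration, and the permutation items are passed in as already-split strings rather than parsed from SAL.

```python
from itertools import permutations

def expand_permutation(prefix, parts):
    """Return one utterance per ordering of the permutation items."""
    return [" ".join([prefix, *ordering]) for ordering in permutations(parts)]

parts = [
    "from [New York](from)",
    "to [London](to)",
    "for [two](num_passengers)",
]

# Three items give 3! = 6 utterances.
for utterance in expand_permutation("*book Book a ticket", parts):
    print(utterance)
```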
Semantics
A SAL configuration consists of SAL expressions. A SAL expression is either:
- an example utterance,
- a template,
- a partial template, or
- a variable definition.
Every line in a SAL configuration must define an example utterance, a template, or a variable definition.
Example utterances
A SAL expression is an example utterance if it does not contain any template notation. That is, it can only be expanded to itself.
All the following SAL expressions are example utterances:
*search show [red](color) [pants](product)
*book book a flight from [New York](depart_city) to [London](arrival_city)
*greeting hello
All SAL expressions that are example utterances must start by defining an intent.
Templates
A SAL expression is a template if it contains template notation and can be expanded to an example utterance.
All of these SAL expressions are templates:
*search show {[red | green | blue](color)} [pants | shirts | shoes](product)
*book book a flight ![from $city(depart_city) | to $city(arrival_city)]
*greeting $all_greeting_phrases
In general a template starts with an intent definition. However, if a variable definition expands to a template, then a reference to this variable is also a template.
✅ This is okay
all_my_intents = [
*buy buy $company(stock_name) for $amount(value) dollars
*sell sell $company(stock_name) for $amount(value) dollars
]
$all_my_intents
🚫 This is not okay
all_my_intents = [
buy $company(stock_name) for $amount(value) dollars
sell $company(stock_name) for $amount(value) dollars
]
$all_my_intents
This is because the expressions in the all_my_intents list do not expand to valid example utterances as they are missing the intent definition.
Partial templates
A partial template is just like a template, but it does not expand to a valid example utterance. That is, it is missing the intent definition.
Examples of partial templates are:
hello
[New York | London | Paris | Berlin | Tokyo]
{can i have} a [large](size) [coffee](product)
A partial template is meaningful only as
- a list item,
- an optional part,
- a permutation item,
- the right-hand side of a variable definition.
A partial template cannot be used on its own; it must always appear as part of a complete template.
Variable definitions
A SAL expression is a variable definition if it has the format:
LHS = RHS
where LHS is a variable name and RHS is either an example utterance, a template, or a partial template. A variable definition must appear before the variable is used or referred to.
How template expansion works
The templates in your SAL configuration are randomly expanded to example utterances during training. The training system does not exhaustively expand all possible utterances from the templates, but randomly generates a sufficient number of example utterances.
A template is expanded by processing it left-to-right. Whenever template notation is encountered, the expansion algorithm expands the part in question according to its expansion rule. These are given below for lists, optional parts, variables, and permutations. The algorithm is applied recursively if applying the expansion rule resolves to something that can be further expanded.
Expansion rules
- Lists: a list item is selected uniformly at random from all list items.
- Optional parts: the expression enclosed in the Optional part is expanded with probability 0.5, and omitted with probability 0.5.
- Variables: the variable reference is replaced with the variable's value, and the expansion algorithm proceeds from there.
- Permutations: the expressions in the permutation list are arranged in the resulting example utterance so that every arrangement has equal probability. That is, if there are N expressions in the permutation, each of the N! possible arrangements has probability 1/N!.
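The four rules can be sketched as a small recursive function in Python. This is an illustration under stated assumptions, not the training system's implementation: the template is given as an already-parsed tree of (kind, payload) tuples, the node kinds ("seq", "list", "opt", "var", "perm", "entity") are names invented here, and no SAL parser is included.

```python
import random

def expand(node, variables, rng):
    """Randomly expand a template tree into a list of output tokens."""
    if isinstance(node, str):   # plain token: output unchanged
        return [node]
    kind, payload = node
    if kind == "seq":           # process the parts left to right
        return [tok for part in payload for tok in expand(part, variables, rng)]
    if kind == "list":          # one item, chosen uniformly at random
        return expand(rng.choice(payload), variables, rng)
    if kind == "opt":           # kept with probability 0.5, else omitted
        return expand(payload, variables, rng) if rng.random() < 0.5 else []
    if kind == "var":           # substitute the value and keep expanding
        return expand(variables[payload], variables, rng)
    if kind == "perm":          # uniformly random ordering of the items
        order = rng.sample(payload, len(payload))
        return [tok for part in order for tok in expand(part, variables, rng)]
    if kind == "entity":        # expand the value, then re-attach the name
        value, name = payload
        return ["[" + " ".join(expand(value, variables, rng)) + "](" + name + ")"]
    raise ValueError("unknown node kind: " + str(kind))

# Tree for: *search show {[red | green | blue](color)} [pants | shirts | shoes](product)
template = ("seq", [
    "*search", "show",
    ("opt", ("entity", (("list", ["red", "green", "blue"]), "color"))),
    ("entity", (("list", ["pants", "shirts", "shoes"]), "product")),
])

print(" ".join(expand(template, {}, random.Random())))
```

Because each call makes independent random choices, repeated calls produce different example utterances from the same template.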
An example
The workings of the expansion algorithm are best illustrated by an example. Suppose we are given the template:
*search show {[red | green | blue](color)} [pants | shirts | shoes](product)
Since running the algorithm involves randomness, the following is an example of one possible outcome:
- Read token *search, which does not expand; output *search.
- Read token show, which does not expand; output show.
- Read the optional part {[red | green | blue](color)}, flip a coin, and decide to skip the optional part; output nothing.
- Read the list [pants | shirts | shoes], apply its expansion rule, and select one of the items uniformly at random; output [shoes].
- Read token (product), which does not expand; output (product).
Concatenating the output yields the example utterance:
*search show [shoes](product)
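As a quick check of the expansion rules, the probability of this particular outcome and the number of distinct utterances the template can produce can be worked out in a few lines of Python. The variable names are illustrative; the numbers follow directly from the rules for optional parts and lists.

```python
from fractions import Fraction

p_skip_optional = Fraction(1, 2)  # the optional part is omitted
p_pick_shoes = Fraction(1, 3)     # one of three list items is selected
p_outcome = p_skip_optional * p_pick_shoes

# The color slot has 3 values plus "absent"; the product slot has 3 values.
distinct_utterances = (3 + 1) * 3

print(p_outcome, distinct_utterances)
# → 1/6 12
```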