vladmaraev/speechstate

SpeechState

Free browser-based spoken dialogue system. Based on XState.

SDK

Spawn SpeechState

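import { assign } from "xstate";
import { speechstate } from "speechstate";

// Start with this action in your state chart: it spawns SpeechState
// and stores a reference to the spawned actor in context as `spstRef`.
assign({
    spstRef: ({ spawn }) => {
        return spawn(speechstate, {
            input: {
                settings: ...
            },
        });
    },
}),

The `settings` input has the following shape:

interface AzureCredentials {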
  endpoint: string;
  key: string;
}

interface Settings {
  locale?: string;
  azureCredentials: string | AzureCredentials;
  azureRegion: string;
  azureLanguageCredentials?: AzureLanguageCredentials;
  asrDefaultCompleteTimeout?: number;
  asrDefaultNoInputTimeout?: number;
  speechRecognitionEndpointId?: string;
  ttsDefaultVoice?: string;
  ttsLexicon?: string;
}
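
As a minimal sketch of what a `settings` value might look like, the snippet below restates the interfaces above and fills them in. The helper `makeSettings`, the token endpoint URL, the region identifier `northeurope`, the locale, and the placeholder key are assumptions made for illustration, not values taken from SpeechState itself. `AzureLanguageCredentials` is declared under Types further down and is repeated here only so that the snippet compiles on its own.

```typescript
interface AzureCredentials {
  endpoint: string;
  key: string;
}

interface AzureLanguageCredentials {
  endpoint: string;
  key: string;
  projectName: string;
  deploymentName: string;
}

interface Settings {
  locale?: string;
  azureCredentials: string | AzureCredentials;
  azureRegion: string;
  azureLanguageCredentials?: AzureLanguageCredentials;
  asrDefaultCompleteTimeout?: number;
  asrDefaultNoInputTimeout?: number;
  speechRecognitionEndpointId?: string;
  ttsDefaultVoice?: string;
  ttsLexicon?: string;
}

// Hypothetical helper (not part of speechstate): builds a Settings object
// for a Speech resource in the given Azure region.
function makeSettings(key: string, region: string = "northeurope"): Settings {
  return {
    azureCredentials: {
      // Assumption: the regional token-issuing endpoint of the Speech resource.
      endpoint: `https://${region}.api.cognitive.microsoft.com/sts/v1.0/issueToken`,
      key,
    },
    azureRegion: region,
    locale: "en-US", // assumed locale
    ttsDefaultVoice: "en-US-DavisNeural",
  };
}

// Placeholder key: replace it with the key of your own Speech resource.
const settings = makeSettings("<your KEY 1>");
```

The resulting object is what would go in place of the elided `settings` value in the spawn action. All optional fields can be left out.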

Events

Example action:

({ context }) =>
  context.spstRef.send({
    type: "SPEAK",
    value: { utterance: "Hello world", voice: "en-GB-RyanNeural" },
  });

DM to SpeechState

  • { type: "PREPARE" }
  • { type: "SPEAK"; value: Agenda }
  • { type: "LISTEN"; value?: RecogniseParameters }
  • { type: "CONTROL" }
  • { type: "STOP" }
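
The event shapes listed above can be written as a TypeScript discriminated union, so that the compiler checks each event before it is sent. This is only a sketch: the union name `DMToSpeechStateEvent` and the helpers `speak` and `listen` are hypothetical and not exported by SpeechState, and the interfaces are copied from the Types section so that the snippet stands alone.

```typescript
interface Agenda {
  utterance: string;
  voice?: string;
  streamURL?: string;
}

interface AzureLanguageCredentials {
  endpoint: string;
  key: string;
  projectName: string;
  deploymentName: string;
}

interface RecogniseParameters {
  noInputTimeout?: number;
  completeTimeout?: number;
  locale?: string;
  hints?: string[];
  nlu?: boolean | AzureLanguageCredentials;
}

// Hypothetical union of the events a dialogue manager may send.
type DMToSpeechStateEvent =
  | { type: "PREPARE" }
  | { type: "SPEAK"; value: Agenda }
  | { type: "LISTEN"; value?: RecogniseParameters }
  | { type: "CONTROL" }
  | { type: "STOP" };

// Hypothetical helper: builds a SPEAK event, omitting `voice` when not given.
function speak(utterance: string, voice?: string): DMToSpeechStateEvent {
  return { type: "SPEAK", value: voice ? { utterance, voice } : { utterance } };
}

// Hypothetical helper: builds a LISTEN event, with or without parameters.
function listen(params?: RecogniseParameters): DMToSpeechStateEvent {
  return params ? { type: "LISTEN", value: params } : { type: "LISTEN" };
}
```

With these helpers, the example action could be written as `context.spstRef.send(speak("Hello world", "en-GB-RyanNeural"))`.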

SpeechState to DM

  • { type: "ASRTTS_READY" }
  • { type: "ASR_NOINPUT" }
  • { type: "RECOGNISED"; value: Hypothesis[]; nluValue?: any }
  • { type: "SPEAK_COMPLETE" }
  • { type: "ASR_STARTED" }
  • { type: "TTS_STARTED" }
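
On the receiving side, the same idea gives an exhaustive `switch` over the events listed above. The sketch below is illustrative: the union name `SpeechStateToDMEvent` and the function `describeEvent` are hypothetical, and the returned strings are simply paraphrases of the event names, not documented semantics.

```typescript
interface Hypothesis {
  utterance: string;
  confidence: number;
}

// Hypothetical union of the events a dialogue manager may receive.
type SpeechStateToDMEvent =
  | { type: "ASRTTS_READY" }
  | { type: "ASR_NOINPUT" }
  | { type: "RECOGNISED"; value: Hypothesis[]; nluValue?: any }
  | { type: "SPEAK_COMPLETE" }
  | { type: "ASR_STARTED" }
  | { type: "TTS_STARTED" };

// Hypothetical helper: turns an incoming event into a short log line.
// Every member of the union has a case, so no default branch is needed.
function describeEvent(event: SpeechStateToDMEvent): string {
  switch (event.type) {
    case "ASRTTS_READY":
      return "ASR and TTS ready";
    case "ASR_NOINPUT":
      return "no input";
    case "RECOGNISED": {
      // Narrowed to the RECOGNISED member, so `value` is available here.
      const first = event.value[0];
      return first ? `recognised: ${first.utterance}` : "recognised: (empty)";
    }
    case "SPEAK_COMPLETE":
      return "speaking complete";
    case "ASR_STARTED":
      return "ASR started";
    case "TTS_STARTED":
      return "TTS started";
  }
}
```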

Types:

interface Hypothesis {
    utterance: string;
    confidence: number;
}

interface Agenda {
  utterance: string;
  voice?: string; // defaults to "en-US-DavisNeural"
  streamURL?: string;
}

interface RecogniseParameters {
  noInputTimeout?: number;
  completeTimeout?: number;
  locale?: string;
  hints?: string[];
  nlu?: boolean | AzureLanguageCredentials;
}

interface AzureLanguageCredentials {
  endpoint: string;
  key: string;
  projectName: string;
  deploymentName: string;
}
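
A `RECOGNISED` event carries an array of `Hypothesis` values. The following sketch picks the one with the highest confidence. The function `bestHypothesis` and its `minConfidence` threshold are hypothetical, and the code does not assume that the array arrives in any particular order or that confidence falls in any particular range.

```typescript
interface Hypothesis {
  utterance: string;
  confidence: number;
}

// Hypothetical helper: returns the hypothesis with the highest confidence
// that is at least `minConfidence`, or undefined if none qualifies.
function bestHypothesis(
  hypotheses: Hypothesis[],
  minConfidence: number = 0,
): Hypothesis | undefined {
  let best: Hypothesis | undefined;
  for (const h of hypotheses) {
    if (h.confidence < minConfidence) continue;
    if (best === undefined || h.confidence > best.confidence) {
      best = h;
    }
  }
  return best;
}
```

In a guard or action, this could be applied as `bestHypothesis(event.value, 0.5)`, for example to ask the user to repeat when the result is `undefined`.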

How to run SpeechState

Create an Azure account and enable speech services

  1. Apply for free student credits at https://azure.microsoft.com/en-us/free/students/. You should be able to log in with your GU account.
  2. Make sure that you are logged into the Azure portal (https://portal.azure.com/).
  3. Create a Resource group (you can use the search field):
    • Subscription: Azure for students
    • Resource group: any name
    • Region: (Europe) North Europe
  4. Create a Speech service:
    • Name: any name
    • Subscription: Azure for students
    • Location: (Europe) North Europe
    • Pricing tier: Free (F0)
    • Resource group: group name from the previous step
  5. Within your Speech service, go to Resource Management → Keys and Endpoint and copy your KEY 1.

Sequence diagrams

docs/diagrams/dm-speechstate.svg