C
-
int
DS_CreateModel(const char *aModelPath, const char *aAlphabetConfigPath, unsigned int aBeamWidth, ModelState **retval)
Create an object providing an interface to a trained DeepSpeech model.
- Return
Zero on success, non-zero on failure.
- Parameters
aModelPath: The path to the frozen model graph.
aAlphabetConfigPath: The path to the configuration file specifying the alphabet used by the network. See alphabet.h.
aBeamWidth: The beam width used by the decoder. A larger beam width tends to give better results at the cost of longer decoding time.
[out] retval: A ModelState pointer.
-
void
DS_FreeModel(ModelState *ctx)
Frees associated resources and destroys model object.
-
int
DS_EnableDecoderWithLM(ModelState *aCtx, const char *aLMPath, const char *aTriePath, float aLMAlpha, float aLMBeta)
Enable decoding using beam scoring with a KenLM language model.
- Return
Zero on success, non-zero on failure (invalid arguments).
- Parameters
aCtx: The ModelState pointer for the model being changed.
aLMPath: The path to the language model binary file.
aTriePath: The path to the trie file built from the same vocabulary as the language model binary.
aLMAlpha: The alpha hyperparameter of the CTC decoder. Language model weight.
aLMBeta: The beta hyperparameter of the CTC decoder. Word insertion weight.
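To make the roles of aLMAlpha and aLMBeta more concrete, the sketch below shows the scoring form commonly used in CTC beam search with a language model, where alpha weights the language model log-probability and beta weights the number of words. This illustrates the general technique and is an assumption, not a copy of the DeepSpeech decoder; the function name combined_beam_score and its arguments are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper, not part of the DeepSpeech API.
 * Combines the three terms commonly used to rank a beam candidate:
 *   score = log P_acoustic + alpha * log P_lm + beta * word_count
 * Assumption: the decoder follows this general form. */
double combined_beam_score(double log_p_acoustic, double log_p_lm,
                           int word_count, float alpha, float beta)
{
    assert(word_count >= 0);
    return log_p_acoustic
         + (double)alpha * log_p_lm
         + (double)beta * (double)word_count;
}
```

Under this form, a larger alpha makes the language model count for more relative to the acoustic model, and a larger beta favors candidates with more words.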
-
char *
DS_SpeechToText(ModelState *aCtx, const short *aBuffer, unsigned int aBufferSize, unsigned int aSampleRate)
Use the DeepSpeech model to perform Speech-To-Text.
- Return
The STT result. The user is responsible for freeing the string using DS_FreeString(). Returns NULL on error.
- Parameters
aCtx: The ModelState pointer for the model to use.
aBuffer: A 16-bit, mono raw audio signal at the appropriate sample rate.
aBufferSize: The number of samples in the audio signal.
aSampleRate: The sample-rate of the audio signal.
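Because aBufferSize counts samples rather than bytes, a raw PCM byte length has to be converted before the call. The following is a minimal sketch of that arithmetic, assuming 16-bit mono audio as described above; the helper names pcm16_sample_count and pcm16_duration_ms are hypothetical and not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not part of the DeepSpeech API. */

/* Number of 16-bit mono samples in a raw PCM buffer of num_bytes bytes. */
unsigned int pcm16_sample_count(size_t num_bytes)
{
    assert(num_bytes % sizeof(int16_t) == 0); /* whole samples only */
    return (unsigned int)(num_bytes / sizeof(int16_t));
}

/* Duration in milliseconds of num_samples samples at sample_rate Hz. */
unsigned int pcm16_duration_ms(unsigned int num_samples,
                               unsigned int sample_rate)
{
    assert(sample_rate > 0);
    return (unsigned int)(((uint64_t)num_samples * 1000u) / sample_rate);
}
```

For example, 32000 bytes of 16-bit mono audio hold 16000 samples, which is one second at a 16000 Hz sample rate.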
-
Metadata *
DS_SpeechToTextWithMetadata(ModelState *aCtx, const short *aBuffer, unsigned int aBufferSize, unsigned int aSampleRate)
Use the DeepSpeech model to perform Speech-To-Text and output metadata about the results.
- Return
Outputs a struct of individual letters along with their timing information. The user is responsible for freeing Metadata by calling DS_FreeMetadata(). Returns NULL on error.
- Parameters
aCtx: The ModelState pointer for the model to use.
aBuffer: A 16-bit, mono raw audio signal at the appropriate sample rate.
aBufferSize: The number of samples in the audio signal.
aSampleRate: The sample-rate of the audio signal.
-
int
DS_CreateStream(ModelState *aCtx, unsigned int aSampleRate, StreamingState **retval)
Create a new streaming inference state. The streaming state returned by this function can then be passed to DS_FeedAudioContent() and DS_FinishStream().
- Return
Zero for success, non-zero on failure.
- Parameters
aCtx: The ModelState pointer for the model to use.
aSampleRate: The sample-rate of the audio signal.
[out] retval: An opaque pointer that represents the streaming state. Can be NULL if an error occurs.
-
void
DS_FeedAudioContent(StreamingState *aSctx, const short *aBuffer, unsigned int aBufferSize)
Feed audio samples to an ongoing streaming inference.
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
aBuffer: An array of 16-bit, mono raw audio samples at the appropriate sample rate.
aBufferSize: The number of samples in aBuffer.
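Streaming callers typically feed audio in pieces rather than in one call. The sketch below shows one way to split a buffer into fixed-size chunks; it takes the feeding function as a callback with the same shape as DS_FeedAudioContent so that it compiles without the library. The names feed_fn, feed_in_chunks and counting_feed are hypothetical, and the chunk size is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque type, declared here only so the sketch compiles on its own. */
typedef struct StreamingState StreamingState;

/* Same shape as DS_FeedAudioContent(aSctx, aBuffer, aBufferSize). */
typedef void (*feed_fn)(StreamingState *, const short *, unsigned int);

/* Hypothetical helper: pass `total` samples to `feed` in pieces of at
 * most `chunk` samples. The last piece may be shorter. */
void feed_in_chunks(StreamingState *state, const short *samples,
                    unsigned int total, unsigned int chunk, feed_fn feed)
{
    assert(chunk > 0 && feed != NULL);
    unsigned int offset = 0;
    while (offset < total) {
        unsigned int n = total - offset;
        if (n > chunk) {
            n = chunk;
        }
        feed(state, samples + offset, n);
        offset += n;
    }
}

/* Stand-in callback for trying the helper without the library:
 * it only counts calls and samples. */
static unsigned int g_feed_calls = 0;
static unsigned int g_feed_samples = 0;

static void counting_feed(StreamingState *state, const short *buf,
                          unsigned int n)
{
    (void)state;
    (void)buf;
    g_feed_calls++;
    g_feed_samples += n;
}
```

When linking against the library, DS_FeedAudioContent should be usable as the feed argument in place of the stand-in, since its documented signature matches feed_fn.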
-
char *
DS_IntermediateDecode(StreamingState *aSctx)
Compute the intermediate decoding of an ongoing streaming inference. This is an expensive process as the decoder implementation isn’t currently capable of streaming, so it always starts from the beginning of the audio.
- Return
The STT intermediate result. The user is responsible for freeing the string using DS_FreeString().
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
-
char *
DS_FinishStream(StreamingState *aSctx)
Signal the end of an audio signal to an ongoing streaming inference and return the STT result over the whole audio signal.
- Return
The STT result. The user is responsible for freeing the string using DS_FreeString().
- Note
This method will free the state pointer (aSctx).
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
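Since finishing a stream frees the state pointer, the caller's copy of that pointer must not be used again. One possible guard, sketched below, clears the caller's pointer as part of finishing. The finishing function is passed as a callback with the same shape as DS_FinishStream so the sketch compiles without the library; finish_fn, finish_and_clear and fake_finish are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque type, declared here only so the sketch compiles on its own. */
typedef struct StreamingState StreamingState;

/* Same shape as DS_FinishStream(aSctx). */
typedef char *(*finish_fn)(StreamingState *);

/* Hypothetical helper: finish the stream once and clear the caller's
 * pointer so it cannot be reused after the state has been freed. */
char *finish_and_clear(StreamingState **state, finish_fn finish)
{
    assert(state != NULL && finish != NULL);
    if (*state == NULL) {
        return NULL; /* already finished, or never created */
    }
    char *text = finish(*state);
    *state = NULL;
    return text;
}

/* Stand-in for trying the helper without the library. */
static int g_finish_calls = 0;

static char *fake_finish(StreamingState *state)
{
    static char text[] = "hello";
    (void)state;
    g_finish_calls++;
    return text;
}
```

With the real library, the returned string would still need to be released with DS_FreeString() once the caller is done with it.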
-
Metadata *
DS_FinishStreamWithMetadata(StreamingState *aSctx)
Signal the end of an audio signal to an ongoing streaming inference and return per-letter metadata.
- Return
Outputs a struct of individual letters along with their timing information. The user is responsible for freeing Metadata by calling DS_FreeMetadata(). Returns NULL on error.
- Note
This method will free the state pointer (aSctx).
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
-
void
DS_FreeStream(StreamingState *aSctx)
Destroy a streaming state without decoding the computed logits. This can be used if you no longer need the result of an ongoing streaming inference and don’t want to perform a costly decode operation.
- Note
This method will free the state pointer (aSctx).
- Parameters
aSctx: A streaming state pointer returned by DS_CreateStream().
-
void
DS_FreeString(char *str)
Free a char* string returned by the DeepSpeech API.
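Strings returned by the API are owned by the library until released, so a common pattern is to copy the text into caller-owned storage and release the original right away. The sketch below is one way to do that; the release function is passed as a callback with the same shape as DS_FreeString so the example compiles without the library. The names release_fn, take_result and counting_release are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as DS_FreeString(str). */
typedef void (*release_fn)(char *);

/* Hypothetical helper: copy an API-owned result into the caller's
 * buffer, then release the original.
 * Returns 0 on success, 1 if the text was truncated to fit,
 * and -1 if result is NULL (the API's error value). */
int take_result(char *result, release_fn release,
                char *out, size_t out_size)
{
    assert(release != NULL && out != NULL && out_size > 0);
    if (result == NULL) {
        out[0] = '\0';
        return -1;
    }
    size_t len = strlen(result);
    int truncated = len >= out_size;
    size_t n = truncated ? out_size - 1 : len;
    memcpy(out, result, n);
    out[n] = '\0';
    release(result);
    return truncated ? 1 : 0;
}

/* Stand-in for trying the helper without the library. */
static int g_release_calls = 0;

static void counting_release(char *s)
{
    free(s);
    g_release_calls++;
}
```

When linking against the library, DS_FreeString should be usable as the release argument, since its documented signature matches release_fn.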
-
void
DS_PrintVersions()
Print version of this library and of the linked TensorFlow library.