Python
Model
-
class Model(*args, **kwargs)
Class holding a DeepSpeech model.
- Parameters
-
createStream(sample_rate=16000)
Create a new streaming inference state. The streaming state returned by this function can then be passed to feedAudioContent() and finishStream().
- Parameters
aSampleRate (int) – The sample rate of the audio signal.
- Returns
Object holding the stream
- Throws
RuntimeError on error
-
enableDecoderWithLM(*args, **kwargs)
Enable decoding using beam scoring with a KenLM language model.
- Parameters
aLMPath (str) – The path to the language model binary file.
aTriePath (str) – The path to the trie file built from the same vocabulary as the language model binary.
aLMAlpha (float) – The alpha hyperparameter of the CTC decoder. Language Model weight.
aLMBeta (float) – The beta hyperparameter of the CTC decoder. Word insertion weight.
- Returns
Zero on success, non-zero on failure (invalid arguments).
- Type
-
feedAudioContent(*args, **kwargs)
Feed audio samples to an ongoing streaming inference.
- Parameters
aSctx (object) – A streaming state pointer returned by createStream().
aBuffer (int array) – An array of 16-bit, mono raw audio samples at the appropriate sample rate.
aBufferSize (int) – The number of samples in aBuffer.
-
finishStream(*args, **kwargs)
Signal the end of an audio signal to an ongoing streaming inference and return the STT result over the whole audio signal.
- Parameters
aSctx (object) – A streaming state pointer returned by createStream().
- Returns
The STT result.
- Type
-
finishStreamWithMetadata(*args, **kwargs)
Signal the end of an audio signal to an ongoing streaming inference and return per-letter metadata.
- Parameters
aSctx (object) – A streaming state pointer returned by createStream().
- Returns
Outputs a struct of individual letters along with their timing information.
- Type
-
intermediateDecode(*args, **kwargs)
Compute the intermediate decoding of an ongoing streaming inference. This is an expensive process as the decoder implementation isn’t currently capable of streaming, so it always starts from the beginning of the audio.
- Parameters
aSctx (object) – A streaming state pointer returned by createStream().
- Returns
The STT intermediate result.
- Type
Metadata
-
class Metadata
Stores the entire CTC output as an array of character metadata objects.
-
confidence()
Approximate confidence value for this transcription. This is roughly the sum of the acoustic model logit values for each timestep/character that contributed to the creation of this transcription.
-
items()
List of items
- Returns
A list of MetadataItem() elements
- Type
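Per-letter metadata is often regrouped into words with a start time each. The following sketch shows one way to do that grouping. It works on any iterable of item-like objects and makes two assumptions that are not stated in this section: that each item exposes the letter as `character` and its time in seconds as `start_time`. `Item` and `words_with_start_times` are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a metadata item; attribute names are assumptions.
Item = namedtuple("Item", ["character", "start_time"])


def words_with_start_times(items):
    """Group letters into (word, start_time) pairs, splitting on spaces."""
    words = []
    letters = []
    word_start = None
    for item in items:
        if item.character == " ":
            # A space closes the current word, if there is one.
            if letters:
                words.append(("".join(letters), word_start))
                letters = []
                word_start = None
        else:
            if not letters:
                word_start = item.start_time
            letters.append(item.character)
    if letters:
        words.append(("".join(letters), word_start))
    return words


sample = [Item("h", 0.1), Item("i", 0.2), Item(" ", 0.3),
          Item("y", 0.4), Item("o", 0.5)]
grouped = words_with_start_times(sample)
```

With real metadata, the list of items obtained from the Metadata object would be passed in place of `sample`.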
-