Kaldi (software)

{{Short description|Open-source speech recognition software toolkit}}

{{Multiple issues|

{{notability|Product|date=October 2022}}

{{primary sources|date=October 2022}}

}}

{{Infobox software

| name = Kaldi

| logo =

| screenshot =

| caption =

| developer = Daniel Povey and others

| latest_release_version = 5.5.636

| latest_release_date = {{Start date and age|2020|2}}

| repo = https://github.com/kaldi-asr/kaldi

| programming language = C++

| operating_system = Unix systems (Linux, BSD, OSX 10.{8,9} etc.), Windows (via Cygwin)

| genre = Speech recognition

| license = Apache License v.2.0{{cite web|url=http://kaldi-asr.org/doc/legal.html|title=Kaldi: Legal stuff|website=kaldi-asr.org}}

| website = {{url|kaldi-asr.org}}

}}

Kaldi is an open-source toolkit for speech recognition and signal processing, written in C++ and freely available under the Apache License v2.0.

Kaldi aims to provide software that is flexible and extensible,{{cite web|url=http://kaldi-asr.org/doc/about.html|title=Kaldi: About the Kaldi project|website=kaldi-asr.org}} and is intended for use by automatic speech recognition (ASR) researchers for building a recognition system.

It supports linear transforms, MMI, boosted MMI and MCE discriminative training, feature-space discriminative training, and deep neural networks.{{cite web|url=http://kaldi-asr.org/doc/dnn.html|title=Kaldi: Deep Neural Networks in Kaldi|website=kaldi-asr.org}}

Kaldi can generate acoustic features such as MFCC, filter bank (fbank) and fMLLR features. In recent deep neural network research, it is therefore often used to pre-process raw waveforms into acoustic features for end-to-end neural models.
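
As an illustration of what filter bank (fbank) feature extraction involves, the following is a minimal NumPy sketch of log mel filter bank computation. It is not Kaldi's implementation and omits steps a production front end typically applies, such as dithering and pre-emphasis. The function name <code>log_mel_fbank</code> and all parameter values (16 kHz audio, 400-sample frames, 160-sample shift, 23 mel bins) are assumptions chosen for the example.

```python
import numpy as np


def hz_to_mel(hz):
    # A common mel-scale formula; other variants exist.
    return 1127.0 * np.log(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (np.exp(mel / 1127.0) - 1.0)


def log_mel_fbank(signal, sample_rate=16000, frame_len=400,
                  frame_shift=160, n_fft=512, n_mels=23):
    """Return an (n_frames, n_mels) array of log mel filter bank energies.

    Illustrative sketch only; assumes len(signal) >= frame_len.
    """
    # Slice the waveform into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // frame_shift
    idx = (np.arange(frame_len)[None, :]
           + frame_shift * np.arange(n_frames)[:, None])
    frames = signal[idx] * np.hamming(frame_len)

    # Power spectrum of each frame (zero-padded to n_fft points).
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2

    # Triangular filters with centers equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    filters = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        left, center, right = hz_points[m:m + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        filters[m] = np.maximum(0.0, np.minimum(rising, falling))

    # Floor the energies before the log to avoid log(0).
    return np.log(np.maximum(power @ filters.T, 1e-10))


if __name__ == "__main__":
    # One second of a 1 kHz tone as a stand-in for real speech.
    t = np.arange(16000) / 16000.0
    feats = log_mel_fbank(np.sin(2 * np.pi * 1000.0 * t))
    print(feats.shape)
```

Each row of the returned array describes one short frame of audio; a matrix of this kind is what a neural acoustic model would typically consume in place of the raw waveform.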

Kaldi has been incorporated as part of the [https://chimechallenge.github.io/chime6/ CHiME Speech Separation and Recognition Challenge] over several successive events.{{cite web|url=http://spandh.dcs.shef.ac.uk/chime_challenge/software.html|title=The 4th CHiME Speech Separation and Recognition Challenge|author=|date=|website=|publisher=|accessdate=15 February 2017|archive-date=16 February 2017|archive-url=https://web.archive.org/web/20170216210537/http://spandh.dcs.shef.ac.uk/chime_challenge/software.html|url-status=dead}}{{cite web|url=http://spandh.dcs.shef.ac.uk/chime_challenge/chime2015/software.html|title=The 3rd CHiME Speech Separation and Recognition Challenge|author=|date=|website=|publisher=|accessdate=15 February 2017|archive-date=26 July 2017|archive-url=https://web.archive.org/web/20170726013949/http://spandh.dcs.shef.ac.uk/chime_challenge/chime2015/software.html|url-status=dead}}Emmanuel Vincent, Jon Barker, Shinji Watanabe, Jonathan Le Roux, Francesco Nesta, et al. The second 'CHiME' Speech Separation and Recognition Challenge: Datasets, tasks and baselines. ICASSP - 38th International Conference on Acoustics, Speech, and Signal Processing - 2013, May 2013, Vancouver, Canada. pp. 126-130, 2013. The software was initially developed as part of a 2009 workshop at Johns Hopkins University.{{cite web|title=History of the Kaldi project|url=http://kaldi-asr.org/doc/history.html|accessdate=26 July 2017}}

Kaldi is named after Kaldi, the legendary Ethiopian goatherd who is said to have discovered the coffee plant.{{Cite web|url=https://kaldi-asr.org/doc/about.html|title=Kaldi: About the Kaldi project}}

See also

{{Portal|Free and open-source software}}

References

{{Reflist}}