All model builders strive to emulate reality in their creations. With artificial intelligence (AI) model building, that striving for perfection never ceases. Iteration is a fundamental tenet of model building. Rapid research cycles drive continual refinements to our technology. The faster we can iterate, the better the quality of the solutions we deliver to our customers.
Over the past few years, Digital Reasoning has tasked its Audio Research Team with developing audio analytics software that is highly effective at processing the noisy, domain-specific voice data that we typically encounter within the trading operations of major banks. Our audio development pipeline contains various deep learning models, trained on large volumes of audio and text data, that feed features to our downstream natural language understanding (NLU) models. A crucial part of our audio pipeline is our automatic speech recognition (ASR) model.
ASR model training involves massive amounts of computation, training data, and time, but it is the basis of our ability to experiment and increase our rate of iteration. Once trained, the ASR model serves as our out-of-the-box model to commence customer deployments, with fine-tuning to further improve accuracy for individual customers’ needs.
In an article published on Medium, Digital Reasoning’s Audio Research Team explains the creation of our ASR model in further depth, focusing on our advancements in performance and showing how they have significantly improved training times.
Click here to read the article on Medium.