I think the future of access to large audio collections lies in speech analytics rather than voice-to-text. Many large media companies are using dialog search to search across videos, television programming, and news footage. There is no transcript, and they do not want to create one. They simply search the audio directly, kind of like OCR for voice. In this scenario you would not create a transcript as a search mechanism; you would create a full or partial transcript only as needed to document something as text, such as quotations from the speaker.