Tender: Speech recognition, both via an Automatic Speech Recognition system and human-based speech transcription
Contracting authority: United Nations Office at Geneva
Phase: “Request for Information” – RFI
RFI deadline: 4/10
The purpose of this Request for Information (RFI) is to provide the United Nations Office at Geneva (UNOG) with information about vendor capabilities in several aspects of speech recognition, both via an Automatic Speech Recognition (ASR) system and human-based speech transcription, and about the latest developments and solutions on the speech recognition market. This will help UNOG gain a better understanding of the range of existing options and related natural language processing services in this field. Based on the findings, UNOG may consider launching a competitive solicitation process for this solution.
As a major conference hub, the Division of Conference Management (DCM) provides services to thousands of multilingual meetings per year. Speeches at the conferences will be relayed into the six official UN languages (Arabic, Chinese, English, French, Spanish, and Russian). In this regard, UNOG is interested in receiving information about available solutions for implementing speech recognition for live captioning of meetings and for generating searchable verbatim transcripts in the six official UN languages, so that they can be made available shortly after a meeting for a number of uses.
Furthermore, UNOG is looking into the next level of services arising from the availability of data generated through speech recognition, e.g. extractive text summarization, data mining and sentiment analysis, topic and entity recognition, readability assessment, and other text post-processing options.
The solutions proposed in the response to this RFI should address the following areas:
– Ability to support direct recognition on an audio feed and produce captioning in real time in at least one of the six official UN languages;
– Ability to support, detect and smoothly relay between the six official UN languages;
– Ability to generate highly accurate transcriptions of recorded audio with a reasonable delay;
– Search-engine capabilities within the provided user interface, if any;
– Post-processing capabilities based on generated text.
Please detail whether your solution supports live captioning, which languages and dialects are supported in general, and what other post-processing services are offered. Please also provide information about the supported formats and interfaces for transcription output and about search-engine features.
Responses to this RFI should not exceed 15 pages.