VoIP Calls with Minutes using AWS Transcribe

In this use case, you will see how to start an audio conference with speech-to-text minutes for English-language audio.

In the click-to-dial use case, we showed how simple it is to place a call from a browser. Such a call can even be linked to a conference or to the PSTN. Even greater value can be offered to end users by adding advanced AWS services such as the speech recognition service Amazon Transcribe. Transcribe analyses audio files and returns their content transcribed to text for further processing: archiving conference minutes, searching, translation (see Amazon Translate) or natural language processing (see Amazon Comprehend).

You will be presented with a simple webpage from which you can join an audio conferencing room either as a participant or as the organiser with recording. The recorded audio will be transcribed and sent to you as a plain-text email.

Known Limitations

This use case relies on an AWS service, Amazon Transcribe, which entered General Availability on April 4th, 2018. At the date of the GA release, the maximum number of concurrent calls was limited to ten.

How To Use It

Once the Transcribe service is available to you, you can start the CloudFormation stack creation by visiting the following link. Include an email address to which the transcribed text will be sent.

When the CloudFormation process completes, you will find a link in the Outputs tab. Follow the link and accept the self-signed certificate. You can share the browser link with additional participants.
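The link can also be read programmatically rather than from the console. The following is a minimal sketch in Python, assuming the boto3 CloudFormation client; the stack name and output key in the usage comment are hypothetical, since this document does not name them.

```python
def find_stack_output(cfn_client, stack_name, output_key):
    """Return the value of one CloudFormation stack output, or None if absent.

    cfn_client is expected to behave like boto3.client("cloudformation").
    """
    response = cfn_client.describe_stacks(StackName=stack_name)
    stack = response["Stacks"][0]
    # "Outputs" is missing while the stack is still being created.
    for output in stack.get("Outputs", []):
        if output["OutputKey"] == output_key:
            return output["OutputValue"]
    return None


# Usage (hypothetical names, requires boto3 and AWS credentials):
#   import boto3
#   url = find_stack_output(boto3.client("cloudformation"),
#                           "voip-minutes", "ConferenceURL")
```

The client is passed in as an argument so that the function can be exercised without AWS access.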

In the meantime, you should have received an email asking you to verify your email address. Approve it to be able to receive the text transcription of your conference call.

Return to the conferencing webpage. If you are the conference organiser, join the room using the “Join&Record” button, otherwise use the “My Bridge” button. You will be prompted for permission to use your microphone.

Once the organiser has left the conference, using the same button they used to join, an audio file will be created and processed. After several minutes, the transcription will be delivered to the email address given during the CloudFormation process.

What Is Orchestrated

An SBC instance processes the calls and stores WAV files on the Monitor instance. The Monitor pushes the WAV files through S3 to Amazon Transcribe and retrieves the resulting text through S3 as well. Finally, the Monitor sends an SNS message to the email address.
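The flow described above can be sketched in Python with boto3-style clients. This is an illustration under stated assumptions, not the Monitor's actual implementation: the function names, the "recordings/" key prefix, the use of a single bucket for input and output, and the polling interval are all hypothetical.

```python
import json
import time


def transcribe_recording(s3, transcribe, wav_path, bucket, job_name,
                         poll_seconds=15, timeout_seconds=1800):
    """Upload a WAV file, run a transcription job, return the transcript text.

    s3 and transcribe are expected to behave like boto3.client("s3") and
    boto3.client("transcribe").
    """
    key = f"recordings/{job_name}.wav"  # hypothetical key layout
    s3.upload_file(wav_path, bucket, key)

    transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        LanguageCode="en-US",
        MediaFormat="wav",
        Media={"MediaFileUri": f"s3://{bucket}/{key}"},
        OutputBucketName=bucket,  # result is written as <job_name>.json
    )

    # Poll until the job finishes, fails, or the timeout expires.
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = transcribe.get_transcription_job(
            TranscriptionJobName=job_name)["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            break
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "transcription failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_name} did not finish in time")
        time.sleep(poll_seconds)

    # Fetch the JSON result from S3 and extract the plain transcript.
    body = s3.get_object(Bucket=bucket, Key=f"{job_name}.json")["Body"].read()
    return json.loads(body)["results"]["transcripts"][0]["transcript"]


def email_minutes(sns, topic_arn, transcript):
    """Publish the transcript to an SNS topic with an email subscription."""
    sns.publish(TopicArn=topic_arn, Subject="Conference minutes",
                Message=transcript)
```

Polling keeps the sketch short; an event-driven design (for example, reacting to the result object appearing in S3) would avoid the wait loop.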

What Else You Should Know

This scenario is easy to implement thanks to the ABC SBC’s integrated media features (recording, conferencing, announcements) and the ability to combine them with AWS media processing features. Minor modifications of the ABC SBC configuration make it possible to add scenarios such as “leave us a voice message”, and/or to pass the resulting text to a language translation engine.
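As an illustration of the translation step, the sketch below passes a transcript to Amazon Translate through a boto3-style client. The function names, the target language, and the chunk size are assumptions; the chunk size is chosen conservatively and should be checked against the current per-request limit of the service.

```python
def split_for_translation(text, max_bytes=5000):
    """Split text on whitespace into chunks of at most max_bytes (UTF-8).

    A single word longer than max_bytes is kept whole in its own chunk.
    """
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if current and len(candidate.encode("utf-8")) > max_bytes:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def translate_minutes(translate, transcript, target_language="de"):
    """Translate an English transcript chunk by chunk and rejoin the result.

    translate is expected to behave like boto3.client("translate").
    """
    parts = [
        translate.translate_text(
            Text=chunk,
            SourceLanguageCode="en",
            TargetLanguageCode=target_language,
        )["TranslatedText"]
        for chunk in split_for_translation(transcript)
    ]
    return " ".join(parts)
```

Splitting on whitespace can cut a sentence in two, which may reduce translation quality at chunk boundaries; splitting on sentence ends would be a reasonable refinement.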