Audio in SAP Enable Now

I’m not a great fan of using audio in training materials. This is primarily because of the additional development time, effort, and cost – and potentially incurring all of these again, over and over, during updates and maintenance. There have been studies that argue the case for pandering to ‘auditory learners’ and posit that we retain more if we hear something as well as see it, but these studies have largely been debunked. I’ll concede that there is some value in having audio – generally as a ‘nice to have’ or to provide a (very subjectively) ‘more professional’ deliverable – but at what cost? So I have generally avoided using audio, as I just don’t see the cost/benefit.

That said, SAP Enable Now is making me reconsider this. Not necessarily because I now see an increased benefit, but because I see a decreased cost. And decreased to almost zero, at that.

SAP Enable Now allows you to record audio ‘voiceovers’ for simulations (in Demo mode) and for Book Pages. This, I don’t like for all of the reasons noted above. However, Enable Now also provides fairly decent Text-to-Speech functionality. Using this functionality, Enable Now will create an audio rendition of all of the ‘bubble text’ in a simulation, pretty much at the click of a button. And if you don’t want just a straight reading of the bubble text, then you can enter a separate ‘override’ text that will be used for the Text-to-Speech generation. This is handy for cases where the Text-to-Speech generated audio doesn’t quite get the pronunciation right, because you can just enter the text phonetically (as I have had to do for “SAP”, replacing it with “essay pea”!).
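If you have a lot of bubble text to work through, the phonetic ‘override’ idea can be sketched as a simple substitution. To be clear, this is not how Enable Now does it internally (you type the override text into the editor yourself), and it does not call any Enable Now API. It is just a minimal Python sketch, with a hypothetical function name and lookup table, of how you might prepare override text outside the tool before pasting it in.

```python
import re

# Hypothetical table of phonetic respellings. "essay pea" is the respelling
# mentioned above; add your own entries as you find mispronounced terms.
PHONETIC_OVERRIDES = {
    "SAP": "essay pea",
}


def make_override_text(bubble_text, overrides=PHONETIC_OVERRIDES):
    """Return bubble_text with whole-word terms swapped for their respellings."""
    if not overrides:
        return bubble_text
    # Try longer terms first so a long term wins over a shorter one it contains.
    terms = sorted(overrides, key=len, reverse=True)
    # \b restricts matches to whole words, so "SAPUI5" is left untouched.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(lambda match: overrides[match.group(0)], bubble_text)


print(make_override_text("Log on to SAP and open the SAPUI5 app."))
```

The whole-word match is a deliberate choice here: a plain find-and-replace would also mangle product names that merely contain “SAP”.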

For Book Pages, things are slightly more complicated in that Enable Now does not scrape any text off the screen to use for Text-to-Speech (it would be prohibitively complicated to even attempt to do this, with possibly multiple text boxes, hidden elements, and so on). This means that you need to enter all of the ‘source text’ for Text-to-Speech generation yourself – but this is still much easier than actually recording it (and even if you recorded it you’d likely have to type it up as a script anyway).

If you recall the days of SAP Tutor and its Wizard, you’d be right to be skeptical about ‘generated audio’ – as was I. That sounded like it was being narrated by Stephen Hawking, and was largely unlistenable. However, in an exceptionally smart move, Enable Now makes use of Windows’ built-in Text-to-Speech capabilities, instead of implementing this functionality itself, or requiring third-party add-ins (as is the case with Adobe Captivate using NeoSpeech). You just select one of the Windows-installed ‘voices’ for Enable Now to use, and your speech is generated pretty close to ‘natural speech’. It’s not perfect, and won’t be mistaken for an actual person, but it is certainly ‘fit for government’. There’s also a rumor that Enable Now will eventually support Google’s Cloud Text-to-Speech engine, which should make it sound even closer to a real human.

If you need to provide translations of your training material, then this approach helps there, too. The ‘dictionary’ bubble text (the default text generated during recording) can be translated automatically just by selecting a new language in the simulation’s properties, and any custom bubble text (or Book Page content) can be translated as usual. Then, once all of the translated text is in place, just switch your Windows Text-to-Speech voice to another language, and generate the audio in exactly the same way. Simple!

All that said, what you are really doing is generating the audio during editing. The audio files generated using Text-to-Speech are stored in the simulation project and distributed (or accessed via the cloud) along with all of the other content. What would be ideal is for Enable Now to dynamically create and play the Text-to-Speech audio at display time – and in the language chosen by the user – so that the Developer doesn’t really need to do anything. We’re a little ways off that, but it is probably coming, most likely through an interface/API to Google Translate.

Until then, SAP Enable Now’s currently available Text-to-Speech functionality is perfectly adequate for those clients who insist on having audio. In fact, I can see myself providing audio in my simulations by default, just because I can – with minimal effort and almost no increase in development time and cost.