Audio in SAP Enable Now

I’m not a great fan of using audio in training materials. This is primarily because of the additional development time, effort, and cost – and potentially incurring all of these again, over and over, during updates and maintenance. There have been studies that argue the case for pandering to ‘auditory learners’, and posit that we retain more if we hear it as well as see it, but these studies have largely been debunked. I’ll concede that there is some value in having audio – generally as a ‘nice to have’ or to provide a (very subjectively) ‘more professional’ deliverable – but at what cost? So I have generally avoided using audio as I just don’t see the cost/benefit.

That said, SAP Enable Now is making me reconsider this. Not necessarily because I now see an increased benefit, but because I see a decreased cost. And decreased to almost zero, at that.

SAP Enable Now allows you to record audio ‘voiceovers’ for simulations (in Demo mode) and for Book Pages. This, I don’t like for all of the reasons noted above. However, Enable Now also provides fairly decent Text-to-Speech functionality. Using this functionality, Enable Now will create an audio rendition of all of the ‘bubble text’ in a simulation, pretty much at the click of a button. And if you don’t want just a straight reading of the bubble text, then you can enter a separate ‘override’ text that will be used for the Text-to-Speech generation. This is handy for cases where the Text-to-Speech generated audio doesn’t quite get the pronunciation right, because you can just enter the text phonetically (as I have had to do for “SAP”, replacing it with “essay pea”!).
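
If you have a lot of terms that need this phonetic treatment, it can help to keep them in one list and apply them consistently when preparing your override text. The following is only a rough sketch of that idea in Python, outside of Enable Now: the `PRONUNCIATION_OVERRIDES` table and the `to_speech_text` function are hypothetical names of my own (they are not part of Enable Now), and only the “SAP” entry comes from my own experience described above.

```python
import re

# Hypothetical lookup of terms the speech engine mispronounces.
# Only the "SAP" entry comes from the article; add your own as needed.
PRONUNCIATION_OVERRIDES = {
    "SAP": "essay pea",
}


def to_speech_text(bubble_text, overrides=PRONUNCIATION_OVERRIDES):
    """Return the bubble text with whole-word terms swapped for phonetic spellings."""
    if not overrides:
        return bubble_text
    # Try longer terms first so a long term wins over a shorter one it contains.
    terms = sorted(overrides, key=len, reverse=True)
    # \b limits matches to whole words, so "SAPUI5" is left untouched.
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(lambda match: overrides[match.group(1)], bubble_text)


print(to_speech_text("Log on to SAP and open the SAPUI5 app."))
# prints: Log on to essay pea and open the SAPUI5 app.
```

The result is what you would paste into the ‘override’ text field; the visible bubble text stays exactly as it was.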

For Book Pages, things are slightly more complicated in that Enable Now does not scrape any text off the screen to use for Text-to-Speech (it would be prohibitively complicated to even attempt to do this, with possibly multiple text boxes, hidden elements, and so on). This means that you need to enter all of the ‘source text’ for Text-to-Speech generation yourself – but this is still much easier than actually recording it (and even if you recorded it you’d likely have to type it up as a script anyway).

If you recall the days of SAP Tutor and its Wizard, you’d be right to be skeptical about ‘generated audio’ – as was I. That sounded like it was being narrated by Stephen Hawking, and was largely unlistenable. However, in an exceptionally smart move, Enable Now makes use of Windows’ built-in Text-to-Speech capabilities, instead of implementing this functionality itself, or requiring third-party add-ins (as is the case with Adobe Captivate using NeoSpeech). You just select one of the Windows-installed ‘voices’ for Enable Now to use, and your speech is generated pretty close to ‘natural speech’. It’s not perfect, and won’t be mistaken for an actual person, but it is certainly ‘fit for government’. There’s also a rumor that Enable Now will eventually support Google’s Cloud Text-to-Speech engine, which should make it sound even closer to a real human.

If you need to provide translations of your training material, then this approach helps there, too. The ‘dictionary’ bubble text (the default text generated during recording) can be translated automatically just by selecting a new language in the simulation’s properties, and any custom bubble text (or Book Page content) can be translated as usual. Then, once all of the translated text is in place, just switch your Windows Text-to-Speech voice to another language, and generate the audio in exactly the same way. Simple!

All that said, what you are really doing is generating the audio during editing. The audio files generated using Text-to-Speech are stored in the simulation project and distributed (or accessed via the cloud) along with all of the other content. What would be ideal is for Enable Now to dynamically create and play the Text-to-Speech audio at display time – and in the language chosen by the user – so that the Developer doesn’t really need to do anything. We’re a little ways off that, but it is probably coming, most likely through an interface/API to Google Translate.

Until then, SAP Enable Now’s currently-available Text-to-Speech functionality is perfectly adequate for those clients who insist on having audio. In fact, I can see myself providing audio in my simulations by default, just because I can – with minimal effort and almost no increase in development time and cost.

12 thoughts on “Audio in SAP Enable Now”

  1. Hi Manual,
    Can we add a robotic voiceover? I mean, for Enable Now video content, is it possible to attach a robotic voiceover based on the content?

  2. Changed the google text to speech on my mp4 but when I replay video it still has old voice. How do I change that?

    1. Did you re-generate the audio? Or just change the default voice? Audio is generated and then stored, so if you change the voice it should use after you generated it the first time, it won’t magically go back and update any existing audio, and you need to re-generate it. If you did re-generate it but it’s not picking up the new audio, it must be a caching issue (open the Folder containing the audio and play it from there, to confirm it was saved OK). Let me know if that didn’t help.

  3. Is there a way to add your voiceover after you have created the video, as it’s very slow to do it with the video?

    1. What do you mean by “the video”? You mean an .mp4 you generated via Tools > Generate Video? No, not in SAP Enable Now, but there are 3rd party tools that effectively let you add audio to an MP4 (e.g. Camtasia). If you mean a simulation recording (still in SAP Enable Now project form) and as an alternative to capturing during recording, yes. You need to select menu option Project > Convert to Audio Project, and then you can add/record audio per Step.

  4. Hi Dirk,
    I have already written to you but just to repeat my question: do we have a way of adding closed captions or subtitles in Demo mode? This is for a simulation we are creating. We saw that there was an option for Book Pages in the settings but could not find any for Demo mode. My next question is whether there is an option to stagger the timing for several elements on one screen. For example, the first bullet appears and then the second and so on.

    Thanks
    Vasumathi

    1. Hi, Vasumathi.

      Closed Captions apply to Text-to-Speech texts (which you need to explicitly enter in the Book Page – there is no defaulted text like there is for simulation project Bubbles). To have this text displayed in the Book Reader, go to Tools | Settings | Playback Settings | book reader | Visual Properties and select Show Subtitles. The text will be displayed at the bottom of the page, by default as white text in a black box. Note that this only applies to the Book Reader – subtitles are not visible in preview in the Editor.

      For your second point, yes, you can absolutely do this using a Time Control – you just need to have each element in its own object (such as a text box). Don’t forget to trigger the Time Control when the page is loaded. See page 275 of my book for an example.

      Hope this helps,
      Dirk

  5. As predicted, the 1902 (February 2019) cloud release has introduced the ability to use Google Cloud Text-to-Speech for providing audio in your simulation projects. This is still generated during editing and not generated on the fly at display time, but the results of my initial testing are very good – the voice is MUCH more natural – especially if you use the ‘Wavenet’ voices.
