The digital transformation project: Speech recognition – an unusual challenge

In recent years, “speech recognition” has established itself as synonymous with efficiency, flexibility and accuracy. For a speech recognition project to be successful, we need not only to find and introduce the right solution but also to manage the expectations that we have of this technology.

Although the concept and technology of medical speech recognition are nothing new, introducing it often takes longer than anticipated. Two of the reasons for this are unrealistic expectations and a lack of understanding of the technology. That said, experience shows that if these issues are taken into account from the outset, projects run swiftly and smoothly and to the satisfaction of all those involved. As a result, speech recognition increases efficiency tremendously within a short period of time and pays for itself in just a few months.

Once the decision has been made to use speech recognition, two camps often form among project teams and users. Some expect it to massively speed up medical documentation with immediate effect and perfect accuracy, without any need for correction. Others are opposed to it, because digital transformation projects appear to pose many risks and because they believe that some doctors cannot use speech recognition correctly. The truth lies somewhere in between. It is important to manage expectations and concerns properly. Project leaders are often asked the following questions:

Does the software understand different accents?

Speech recognition has made tremendous progress in this area. It is important to distinguish between dialects and accents. Users dictate in standard German, and the text is written accordingly. However, it makes no difference whether users have a Swiss or a foreign accent or whether they come from Zürich or Bern. The software will even understand what is being dictated if users have a cold or are working in a noisy environment.

Does the software understand medical terminology?

Not only does the software recognise medical terms, it also spells them correctly. Its vocabulary contains medical terms for a variety of fields, and words can be added or formatted. What is important is that the chosen system understands both medical and general texts correctly using the same vocabulary. This is where the products available on the market vary greatly.

What effects does the software have on processes?

There are basically two methods of working, which can be combined depending on process requirements:

1. Front-end speech recognition:
In this scenario, users such as doctors or therapists prepare texts directly in the hospital information system or another application. There is no need to pass a dictation on to the secretarial staff, which significantly reduces processing time.

2. Back-end speech recognition:
This process is very similar to digital dictation without speech recognition. Users dictate on a mobile dictation device or smartphone and send the audio files to the server. Secretarial staff receive the audio files as well as the recognised text from the speech recognition server. This saves time for the staff, as texts only need to be edited rather than typed.

What are the main effects on quality and processes?

Texts are checked for quality with both front-end and back-end speech recognition. If necessary, they are then corrected either by doctors or by the secretarial staff. This is far more efficient than typing texts by hand or using traditional methods of dictation and having them typed by the secretarial staff. Speech recognition does not reduce the quality of content, nor does it call for major changes to the document processes.

What precisely can we achieve with this project?

According to a recent study conducted by the manufacturer Nuance, doctors dictate on average 120 words per minute but type only 40 words per minute. Medical editors can type an average of 192 lines per hour but can read and correct 559 lines per hour. As a result, both front-end and back-end speech recognition significantly improve productivity.
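To put these figures into perspective, the short calculation below works out the ratios they imply. It is only a rough sketch: the four rates are the ones quoted above, while the report length of 300 words is a purely hypothetical value chosen for illustration, and the calculation ignores the time needed to check and correct the recognised text.

```python
# Rates quoted in the text (attributed there to a Nuance study).
DICTATION_WPM = 120             # words per minute when dictating
TYPING_WPM = 40                 # words per minute when typing
TYPED_LINES_PER_HOUR = 192      # lines per hour typed by a medical editor
CORRECTED_LINES_PER_HOUR = 559  # lines per hour read and corrected

# Hypothetical report length, assumed for illustration only.
REPORT_WORDS = 300


def speedup(faster_rate: float, slower_rate: float) -> float:
    """Return how many times higher the first rate is than the second."""
    return faster_rate / slower_rate


def minutes_needed(words: float, words_per_minute: float) -> float:
    """Return the minutes needed to produce `words` at the given rate."""
    return words / words_per_minute


doctor_speedup = speedup(DICTATION_WPM, TYPING_WPM)
editor_speedup = speedup(CORRECTED_LINES_PER_HOUR, TYPED_LINES_PER_HOUR)
typing_minutes = minutes_needed(REPORT_WORDS, TYPING_WPM)
dictation_minutes = minutes_needed(REPORT_WORDS, DICTATION_WPM)

print(f"Doctors: dictating is {doctor_speedup:.1f}x faster than typing")
print(f"Editors: correcting is {editor_speedup:.1f}x faster than typing")
print(
    f"{REPORT_WORDS}-word report: {typing_minutes:.1f} min typed, "
    f"{dictation_minutes:.1f} min dictated"
)
```

In other words, the quoted rates correspond to roughly a threefold gain on both sides: doctors produce text about three times faster by dictating, and editors process just under three times as many lines by correcting instead of typing.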

Front-end speech recognition enables doctors to enter text directly into patient files, eliminating media discontinuity and the need for interfaces. Back-end speech recognition, on the other hand, does not change the process compared to digital dictation and typing, but it does take pressure off the secretarial staff.

Finally, speech recognition also saves a significant amount of time and money, as demonstrated by projects in various clinics and hospitals. Reto Heusser, managing partner at Voicepoint, sums it up as follows: “Introduced successfully, speech recognition greatly reduces the amount of time between completing an examination and preparing a report. Some of our customers improve efficiency by an average of 60 percent.”

Voicepoint participates in numerous speech recognition projects in hospitals, clinics and doctors’ practices and places great importance on customer-specific needs and suitable working methods. This ensures that customers can work towards a solution that is tailored to their individual needs and can be introduced successfully with careful planning.

Would you like a personal consultation or do you have any questions about our products or services? Contact us – we will be happy to help you.