Challenges in Measuring Automatic Transcription Accuracy

This post continues our series of articles on Automatic Speech Recognition, the foundational technology that powers Descript’s automatic transcription. The marquee article in this series will test the accuracy rates of today’s biggest ASR vendors — like Google, Amazon, and IBM. Before we publish the results, we wanted to explore the reasons why declaring one ASR provider to rule them all is a bit trickier than it sounds.

Over the last couple of years you may have seen headlines proclaiming that AI-enhanced computers have reached parity with (and even surpassed!) the speech recognition capabilities of humans. It’s a claim that’s both exciting and, given the “creative” interpretations of voice assistants like Siri and Alexa, tough to swallow.

Speech recognition has gotten better, sure. But try using your phone to record a typical, noisy meeting in a boomy conference room—then pass the resulting audio through one of the leading automatic speech recognition engines. You’re liable to wind up with something closer to word salad than meeting minutes.

So what are these researchers on about? To understand why their claims actually have merit — and the associated caveats—we need to explore the industry’s standard accuracy test, Word Error Rate.

How Word Error Rate Works

Measuring transcription accuracy seems like a task that should be reasonably straightforward: you tally how many words the transcription engine gets correct, contrast that with how many it gets wrong, and there you go… Right?

And indeed, that’s essentially how the experts do it. They use fancy math formulas and terms like Word Error Rate (WER) and Levenshtein distance, but conceptually it’s pretty intuitive: words wrong, divided by the number of words that should be there. It’s a linguistic batting average.

At a high level, WER works like this: add up the number of words that the ASR engine got wrong — namely words that have been incorrectly Inserted, Deleted, or Substituted — and divide that by the number of words that should be in the transcript. The resulting percentage is your Word Error Rate.
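As a rough illustration of the calculation described above, here is a minimal sketch in Python. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example, not the tooling used in our tests. It computes the word-level Levenshtein distance (the smallest number of insertions, deletions, and substitutions) and divides it by the number of words in the reference.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the minimum number of edits needed to turn the reference
    # words seen so far into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from the hypothesis
                curr[j - 1] + 1,     # insertion: extra word in the hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            ))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the quick brown fox", "the quick brown box"))  # 0.25
```

Because insertions count as errors, the result can exceed 1.0 when the hypothesis contains many extra words.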

Now, in order to discern what the ASR engines are getting right and wrong we need to have an accurate transcript to compare to. These are called reference or ‘ground truth’ transcripts, and they’re hand-transcribed and checked by humans. Each reference transcript is then automatically aligned with its ASR-generated counterpart, so the test can tell which words are supposed to be where. This is important: if the test isn’t using the optimal alignment, it can count what should be a single Substitution error as a pair of Insertion/Deletion errors, inflating the WER.
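To make the alignment point concrete, the sketch below builds the full edit-distance table and walks back through it to count each error type. The function name `count_errors` and the example sentences are hypothetical, chosen only for illustration. With the optimal alignment, one misheard word is counted as a single Substitution; a poor alignment that treated "main" as deleted and "mean" as inserted would count two errors for the same mistake.

```python
def count_errors(ref: list[str], hyp: list[str]) -> dict[str, int]:
    """Count substitutions, deletions and insertions along an optimal alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # every reference word deleted
    for j in range(cols):
        dist[0][j] = j  # every hypothesis word inserted
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,
                             dist[i - 1][j] + 1,
                             dist[i][j - 1] + 1)

    # Walk back from the bottom-right corner, preferring the diagonal step
    # (match or substitution) whenever it is part of an optimal path.
    counts = {"substitutions": 0, "deletions": 0, "insertions": 0}
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        cost = 0 if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] else 1
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + cost:
            counts["substitutions"] += cost
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["insertions"] += 1
            j -= 1
    return counts


reference = "we met on main street".split()
hypothesis = "we met on mean street".split()
print(count_errors(reference, hypothesis))
# {'substitutions': 1, 'deletions': 0, 'insertions': 0}
```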

You may be wondering how WER handles stylistic differences. For example, some ASR engines will transcribe numbers as words, while others use the corresponding digits (1, 3, 5). And if an ASR engine says “going to” but the reference transcript says “gonna”, what then? Such cases are addressed via a normalization process that specifies which contractions are valid, that “Street” and “St.” mean the same thing, and so on.
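A normalization pass along these lines might look like the sketch below. The `EQUIVALENTS` table and the `normalize` function are hypothetical and deliberately tiny. A real rule set would be much longer and would need context to handle ambiguous cases ("St." can also mean "Saint"). Both transcripts go through the same function before they are aligned and scored.

```python
import re

# Hypothetical equivalence rules, for illustration only.
EQUIVALENTS = {
    "gonna": "going to",
    "st": "street",
    "1": "one",
    "3": "three",
    "5": "five",
}


def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation, and map stylistic variants to one form."""
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)  # drop punctuation but keep apostrophes
    words = []
    for token in text.split():
        # A replacement may expand to several words ("gonna" -> "going to").
        words.extend(EQUIVALENTS.get(token, token).split())
    return words


asr_output = "We're gonna meet at 5 Main St."
reference = "we're going to meet at five main street"
print(normalize(asr_output) == normalize(reference))  # True
```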

Issues with WER

The fundamental problem with WER is that every word is worth the same number of points. Whether it’s a name or adjective, “a” or “Antarctica” — they all count the same.

Of course, reality tends to disagree: anyone could tell you that not all words in a sentence are equally important — and that some errors matter more than others. But because these factors depend on context and meaning, it’s difficult to develop a test that can be broadly applied without a litany of caveats.

Which is why you’re reading a litany of caveats.

Along with ignoring the importance of words, WER is also a brutally harsh judge: it gives no partial credit. Even if a mis-transcribed word is just one character off, WER treats it the same as a complete, nonsensical whiff.

Now consider the following two sentences:

  • It’s a matter of free peach.
  • It’s a matter of free.

Using Word Error Rate, these two sentences would receive the same score: it’s just as bad to transcribe “peach” as it is to simply omit the word. To a human, the first sentence is obviously more useful — but WER doesn’t care (granted, if the ASR engine guessed “free lasagna” nobody would be campaigning for partial credit).
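The sketch below checks this with a small, generic edit-distance function. It assumes the reference sentence is "it's a matter of free speech", with punctuation and capitalization already normalized away. The helper `edit_distance` is written for this example only. Applied to lists of words it gives the numerator of WER, and applied to two strings it gives a character-level distance, which shows how close "peach" is to the intended word even though WER gives it no credit.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two sequences (lists of words, or strings)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,             # deletion
                            curr[j - 1] + 1,         # insertion
                            prev[j - 1] + (x != y))) # match or substitution
        prev = curr
    return prev[-1]


# Assumed reference sentence for the example in the text.
reference = "it's a matter of free speech".split()
substituted = "it's a matter of free peach".split()
omitted = "it's a matter of free".split()

wer_substituted = edit_distance(reference, substituted) / len(reference)
wer_omitted = edit_distance(reference, omitted) / len(reference)
print(wer_substituted == wer_omitted)  # True: each is 1 error out of 6 words

# At the character level the near miss is only two edits away.
print(edit_distance("speech", "peach"))  # 2
```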

Another issue with WER is its total disregard for speaker labels and punctuation. These may or may not be important, depending on your use-case—but it’s obviously a major simplification.

It’s also worth considering what we even mean by “accuracy” in this context. A 100%-verbatim transcript is likely to include many words that are essentially meaningless: “uhms”, “uhs”, false starts, and duplicates — words that can actually interfere with reading comprehension. We can tweak the test to account for some of this, but it’s a good reminder that WER is just a proxy for evaluating how transcripts will be used in the real world.

Better than the Rest

Despite these compromises, Word Error Rate is the most widely-used measure of transcription accuracy by a long shot, and it’s what we use for our testing. While imperfect, its prevalence and endurance in the field attest to its utility all the same.

There’s also a body of evidence that shows that WER correlates with other measures of accuracy that the test itself doesn’t take into account, like Keyword Error Rate — which weights each word depending on its likely importance (and is vastly more complex to calculate). After conducting an experiment comparing the two metrics, researchers concluded “the use of Word Error Rate is sufficient especially for cases where WER remains below 25%.”

Even WER’s critics begrudgingly admit its supremacy. In a research paper asking Does WER Really Predict Performance? — which is generally fairly critical of WER — the authors state the following:

“The purpose of this paper is not to postulate a better alternative to WER for evaluating transcript quality; we stipulate that no better alternative likely exists if the task at hand is taken to be speech transcription for its own sake.”

WE’Re Winning!

In recent years, researchers from Baidu, IBM, Microsoft, and Google (among others) have been sprinting toward wringing ever-lower Word Error Rates from their speech recognition engines — with remarkable results.

Spurred by advances involving neural networks and deep learning, along with massive datasets compiled by these tech giants, WERs have improved enough to generate headlines about meeting and surpassing human performance, based on findings that professional human transcriptionists have a WER of around 5.1% to 5.9% (people mishear things a lot!).

By comparison, Microsoft researchers report that their ASR engine has a WER of 5.1%, and IBM Watson’s is 5.5%. Google claims an error rate of just 4.9%.

Figure: WERs, based on published research papers

The catch is that most of these tests were conducted using the same set of audio recordings: namely a corpus called Switchboard, which consists of a large number of recorded phone conversations spanning a broad array of topics. Switchboard has been used in the field for many years and is nearly ubiquitous in the current literature—so it’s a reasonable choice. By testing against the same audio corpus, researchers can make apples-to-apples comparisons between themselves and competitors. (Google is the exception; it uses its own, internal test corpus, which is opaque to outsiders).

But this homogeneity leads to a sort of tunnel vision: those claims of surpassing human transcriptionists are based on a very specific kind of audio. If the footage you’re working with doesn’t involve phone calls — then which system is best? Audio is not one-size-fits-all: depending on whether footage has been recorded via a phone or professional mic, from two inches or twenty feet away, with or without accents, featuring two people or twelve — there are a lot of variables, and they can have a substantial impact on transcription accuracy.

That’s one reason Descript decided to run its own tests: we deal with so many different kinds of audio, it makes sense to test with a broader sample, and to get a sense for whether different ASR providers excel at different things.
