== Methodology ==

=== Rating Scales ===

The most commonly used rating scale is the '''Absolute Category Rating''' (ACR) scale, which maps subjective quality ratings to numerical values:

{| class="wikitable"
|-
! Score !! Quality Level !! Description
|-
| 5 || Excellent || Completely natural speech; artifacts imperceptible
|-
| 4 || Good || Mostly natural speech; artifacts just perceptible but not annoying
|-
| 3 || Fair || Equally natural and unnatural; artifacts perceptible and slightly annoying
|-
| 2 || Poor || Mostly unnatural speech; artifacts annoying but not objectionable
|-
| 1 || Bad || Completely unnatural speech; artifacts very annoying and objectionable
|}

Alternative scales may use different ranges (e.g., 1-100) or different qualitative descriptors, depending on the specific application and testing requirements.<ref>https://www.twilio.com/docs/glossary/what-is-mean-opinion-score-mos</ref>

=== Testing Procedures ===

Traditional MOS testing involves several key steps:

# '''Subject Selection''': Recruiting appropriate test participants, typically naive listeners without specialized training in audio quality assessment
# '''Environment Control''': Conducting tests in acoustically controlled environments that meet ITU-T specifications (for example, the listening-room conditions described in ITU-T P.800)
# '''Stimulus Presentation''': Playing audio samples to subjects in randomized order to minimize bias
# '''Rating Collection''': Having subjects rate each stimulus on the chosen scale
# '''Statistical Analysis''': Calculating the arithmetic mean and confidence intervals for each stimulus
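
The presentation and analysis steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names <code>presentation_order</code> and <code>mos_with_ci</code> are hypothetical, the sample ratings are made up, and the confidence interval assumes a normal approximation (z = 1.96 for roughly 95%).

```python
import math
import random
import statistics


def presentation_order(stimuli, seed):
    """Return a shuffled copy of the stimulus list for one listener."""
    order = list(stimuli)
    random.Random(seed).shuffle(order)  # seeded so the order is reproducible
    return order


def mos_with_ci(ratings, z=1.96):
    """Return (mean, lower, upper) for the ratings of one stimulus.

    Assumption: normal approximation for the interval. With few
    listeners, a Student's t quantile would be more appropriate.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.fmean(ratings)
    # Half-width = z * sample standard deviation / sqrt(number of ratings)
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, mean - half_width, mean + half_width


# Illustrative ACR ratings (1-5) from eight listeners for one stimulus
ratings = [4, 5, 3, 4, 4, 5, 3, 4]
mos, lower, upper = mos_with_ci(ratings)
print(f"MOS = {mos:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Seeding the shuffle per listener keeps each listener's order reproducible while still varying the order between listeners.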

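Alongside ACR, ITU-T P.800 also describes comparative methods such as:

* '''Degradation Category Rating''' (DCR): Subjects hear a reference followed by the processed audio and rate the degradation of the processed version on a five-point scale
* '''Comparison Category Rating''' (CCR): Subjects hear two stimuli in random order and rate the quality of the second relative to the first, typically on a scale from −3 to +3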
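
As a rough sketch of how CCR ratings might be aggregated, the Python snippet below averages comparison ratings into a comparison MOS (CMOS). It assumes a seven-point scale from −3 to +3 and that each trial records whether the processed sample was played second; the function name <code>cmos</code> and the trial format are hypothetical.

```python
from statistics import fmean


def cmos(trials):
    """Average CCR ratings, expressed as processed relative to reference.

    trials: iterable of (rating, processed_second) pairs, where rating
    is the -3..+3 judgment of the second sample relative to the first.
    """
    signed = []
    for rating, processed_second in trials:
        if not -3 <= rating <= 3:
            raise ValueError("CCR rating must lie between -3 and +3")
        # Flip the sign when the reference was played second, so every
        # value describes the processed sample relative to the reference.
        signed.append(rating if processed_second else -rating)
    return fmean(signed)


# Illustrative trials: (rating, was the processed sample played second?)
trials = [(-1, True), (2, False), (0, True), (1, False)]
print(f"CMOS = {cmos(trials):+.2f}")
```

A negative result indicates that listeners judged the processed audio worse than the reference on average.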

=== Objective Estimation ===

While traditional MOS relies on human evaluation, objective models have been developed to predict MOS values automatically. Key standardized methods include:

* '''PESQ''' (ITU-T P.862): Perceptual Evaluation of Speech Quality, introduced in 2001
* '''POLQA''' (ITU-T P.863): Perceptual Objective Listening Quality Assessment, approved in 2011
* '''PSQM''' (ITU-T P.861): Perceptual Speech Quality Measure, the first standardized method from 1997

These algorithms analyze acoustic properties of audio signals to estimate human perceptual quality, enabling automated quality monitoring and real-time assessment.<ref>https://en.wikipedia.org/wiki/Perceptual_Evaluation_of_Speech_Quality</ref>
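
To illustrate how an objective score relates to the MOS scale, the sketch below maps a raw PESQ score (roughly −0.5 to 4.5) onto a MOS-like value using the logistic mapping of ITU-T P.862.1. The function name <code>pesq_raw_to_mos_lqo</code> is hypothetical, and the coefficients should be verified against the recommendation before use; computing the raw score itself requires a full PESQ implementation, which is not shown here.

```python
import math


def pesq_raw_to_mos_lqo(raw_score):
    """Map a raw PESQ score to MOS-LQO with a logistic function.

    Assumption: coefficients as given in ITU-T P.862.1; check them
    against the standard before relying on the output.
    """
    return 0.999 + 4.0 / (1.0 + math.exp(-1.4945 * raw_score + 4.6607))


# Illustrative raw scores spanning the PESQ output range
for raw in (-0.5, 2.0, 3.5, 4.5):
    print(f"raw PESQ {raw:+.1f} -> MOS-LQO {pesq_raw_to_mos_lqo(raw):.2f}")
```

The mapping is monotonic, so a higher raw score always yields a higher estimated MOS, and the result stays within the range of the five-point scale.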