GLOSSARY OF SPEAKER RECOGNITION AND AUDIO IDENTIFICATION

By @forensicfield

This is a quick-reference list of important technical terms from Forensic Speaker Identification.

Which is taken from –

Accent

The pronunciation used by a speaker (as opposed to other things like choice of words or syntax) that is characteristic of a particular area or social group.

Acoustic Forensic Analysis

The expert use of acoustic, as opposed to auditory, information to compare forensic speech samples.

Acoustic Phonetics Or Speech Acoustics

That part of phonetics that deals with the acoustic properties of speech sounds, and how they are transmitted between speaker and hearer.

Allomorph

The realisation of a morpheme. For example, two allomorphs of the English plural morpheme are s, as in gnats, and es, as in horses.

Allophone

A speech sound functioning as the realization of a phoneme.

Articulation Rate

A measure of how fast someone speaks, usually quantified in terms of syllables per second, exclusive of pauses.

Articulatory Phonetics

The study of how speech sounds are made by the speaker.

Auditory Forensic Analysis Or Technical Speaker Recognition By Listening

The expert use of auditory, as opposed to acoustic, information to compare forensic speech samples.

Aural-spectrographic Identification

Highly controversial method of speaker identification using both visual examination of spectrograms and listening.

Between-speaker Variation

The fact that different speakers of the same language differ in some aspects of their speech. One of the conditions that makes forensic speaker identification possible.

Centisecond or csec or cs

Unit for quantifying duration in acoustic phonetics: one hundredth of a second.

Cepstrum

A very common parameter used in automatic speaker recognition, one effect of which is to smooth the spectrum.

Closed Set Comparison

An unusual situation in forensic speaker identification where it is known that the offender is present among the suspects.

Convergence

The tendency for two participants in a conversation to become more similar in their speech behaviour to signal in-group membership. Speakers can also diverge from one another.

Conversation analysis

The study of how conversation is structured and regulated.

Decibel or dB

Unit for quantifying amplitude in acoustic phonetics.
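
As a rough illustration of how decibels are used, the sketch below expresses one amplitude relative to another with the standard 20 × log10 relation for amplitude ratios. The function name and the example values are assumptions made for illustration only.

```python
import math

def amplitude_ratio_to_db(amplitude, reference):
    """Express an amplitude relative to a reference amplitude in decibels."""
    if amplitude <= 0 or reference <= 0:
        raise ValueError("amplitudes must be positive")
    # Decibels for amplitude ratios use 20 * log10(ratio).
    return 20.0 * math.log10(amplitude / reference)

# Hypothetical values: doubling the amplitude adds roughly 6 dB,
# and a tenfold increase in amplitude corresponds to 20 dB.
print(round(amplitude_ratio_to_db(2.0, 1.0), 2))
print(round(amplitude_ratio_to_db(10.0, 1.0), 2))
```

Because the scale is logarithmic, a decibel value always describes a ratio, so the choice of reference matters as much as the measured amplitude.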

Dialectology

The study of how language varies with geographical location.

Digitising

The process of converting an analogue speech signal, e.g. from a cassette recorder, into a digital form that can be used by a computer for speech analysis.

Diphthong

A vowel in a single syllable that involves a change in quality from one target to another, as in how, or high.

F- (Or Formant) Pattern

The ensemble of formant frequencies in a given sound or word.

False Negative

In speaker recognition, deciding that two speech samples have come from different speakers when in fact they are from the same speaker.

False Positive

In speaker recognition, deciding that two speech samples have come from the same speaker when in fact they are from different speakers.
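
The two error types above can be counted over a set of comparisons whose true answer is known. The following is a minimal sketch, assuming each trial is recorded as a pair of booleans (truth, decision); the function name and the trial data are hypothetical.

```python
def error_rates(trials):
    """Return (false_positive_rate, false_negative_rate).

    Each trial is a pair (same_speaker_truth, same_speaker_decision).
    A rate is None when there are no trials of the relevant kind.
    """
    trials = list(trials)
    same = [decision for truth, decision in trials if truth]
    diff = [decision for truth, decision in trials if not truth]
    # False negative: truly the same speaker, but judged different.
    false_negative_rate = same.count(False) / len(same) if same else None
    # False positive: truly different speakers, but judged the same.
    false_positive_rate = diff.count(True) / len(diff) if diff else None
    return false_positive_rate, false_negative_rate

# Hypothetical outcomes of six comparisons.
trials = [
    (True, True), (True, False), (True, True),
    (False, False), (False, True), (False, False),
]
fp_rate, fn_rate = error_rates(trials)
print(fp_rate, fn_rate)
```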

FFT or Fast Fourier Transform

A common method of spectral analysis in acoustic phonetics.
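
To make the idea concrete, here is a minimal sketch of a recursive radix-2 FFT, used to find the strongest frequency in a short test signal. It is an illustration only, not the routine any particular speech analysis package uses; the function names, the 8000 Hz sampling rate and the 1000 Hz test tone are all assumptions.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of the even-indexed samples
    odd = fft(samples[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def peak_frequency_hz(samples, sample_rate_hz):
    """Frequency (in Hz) of the strongest bin below half the sampling rate."""
    spectrum = fft(samples)
    half = len(samples) // 2
    magnitudes = [abs(c) for c in spectrum[:half]]
    peak_bin = max(range(half), key=magnitudes.__getitem__)
    return peak_bin * sample_rate_hz / len(samples)

# Hypothetical test signal: 64 samples of a 1000 Hz sine sampled at 8000 Hz.
sample_rate_hz = 8000
signal = [math.sin(2 * math.pi * 1000 * t / sample_rate_hz) for t in range(64)]
print(peak_frequency_hz(signal, sample_rate_hz))
```

With 64 samples at 8000 Hz, the bins are 125 Hz apart, so the frequency resolution of the analysis depends on how long a stretch of signal is analysed.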

Formant

A very important acoustic parameter in forensic speaker identification. Formants reflect the size and shape of the speaker’s vocal tract.

Formant Bandwidth

An acoustic parameter that reflects the degree to which acoustic energy is absorbed in the vocal tract during speech.

Fundamental Frequency (Or F0)

A very important acoustic parameter in forensic speaker identification. F0 is the acoustic correlate of the rate of vibration of the vocal cords.

Hertz (or Hz)

Unit for quantifying frequency: so many times per second. 100 Hz for example means one hundred times per second.

Incidential Difference

One of the ways in which speakers can differ in their phonemic structure.

Indexical Information

Information in speech that signals the speaker as belonging to a particular group, e.g. male, middle-class, with a cold, Vietnamese immigrant, and so on.

Intonation

The use of pitch to signal things like questions or statements, or the emotional attitude of the speaker.

Kilohertz (or kHz)

Unit for quantifying frequency: so many thousand times per second. 1 kHz for example means one thousand times per second.

Linear Prediction

A commonly used method of digital speech analysis.

Long-term

A common type of quantification in forensic speaker identification whereby a parameter, usually fundamental frequency, is measured over a long stretch of speech rather than a single speech sound or word.

Manner (of articulation)

The type of obstruction in the vocal tract used in making a consonant, e.g. fricative, or stop.

Millisecond (or msec or ms)

Common unit for quantifying duration in acoustic phonetics: one-thousandth of a second.

Morpheme

A unit of linguistic analysis used in describing the structure of words: the smallest meaningful unit in a language. For example, the word dogs consists of two morphemes: {dog} and {plural}.

Naive Speaker Recognition

When an untrained listener attempts to recognize a speaker, as in voice line-ups, etc.

Open Set Comparison

The usual situation in forensic speaker identification where it is not known whether the offender is present among the suspects.

Parameter (Or Dimension, Or Feature)

A generic term for anything used to compare forensic speech samples, e.g. mean fundamental frequency, articulation rate, phonation type.

Phonation Type

The way the vocal cords vibrate, giving rise to auditorily different qualities, e.g. creaky voice, or breathy voice.

Phone

A technical name for speech sound.

Phoneme

A unit of linguistic analysis: the name for a contrastive sound in a language. For example, bat and pat begin with two different phonemes.

Phonemics

The study of how speech sounds function contrastively, to distinguish words in a given language. Phonemics is an important conceptual framework for the comparison of forensic speech samples.

Phonetic Quality

One of two very important descriptive components of a voice, the other being voice quality. Describes those aspects of a voice that have to do with the realisation of speech sounds.

Phonetics

The study of all aspects of speech, but especially how speech sounds are made, their acoustic properties, and how the acoustic properties of speech sounds are perceived as speech by listeners.

Phonology

One of the main sub-areas in linguistics. Phonology studies the function and organisation of speech sounds, both within a particular language, and in languages in general.

Pitch

  • An important auditory property of speech.
  • Pitch and pitch range can be used to characterise an individual’s voice.
  • Often used loosely as another term for fundamental frequency, which is its main acoustic correlate.

Pitch Accent

The use of pitch to signal differences between words that is partly like tone and partly like stress. Japanese is a pitch-accent language.

Place (of articulation)

Where in the vocal tract a consonantal sound is made.

Posterior Odds

In forensic speaker identification, the odds in favour of the hypothesis of common origin for two or more speech samples after the forensic-phonetic evidence, in the form of the likelihood ratio, is taken into account. The posterior odds are the product of prior odds and LR.
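
The product rule stated above can be sketched in a few lines. The prior odds and likelihood ratio below are invented numbers for illustration, not values from any case, and the function names are assumptions.

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Posterior odds are the product of the prior odds and the likelihood ratio."""
    if prior_odds < 0 or likelihood_ratio < 0:
        raise ValueError("odds and likelihood ratios cannot be negative")
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1.0 + odds)

# Hypothetical example: prior odds of 1 to 100 and an LR of 500.
prior = 1 / 100
lr = 500
post = posterior_odds(prior, lr)
print(round(post, 3))                       # about 5 to 1 in favour
print(round(odds_to_probability(post), 3))  # about 0.833
```

In this example, evidence that strongly supports common origin still yields only moderate posterior odds because the prior odds were low, which is why the two quantities are kept separate.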

Prior Odds

In forensic speaker identification, the odds in favour of the hypothesis of common origin for two or more speech samples before the forensic-phonetic evidence is taken into account.

Realisational Difference

One of the ways in which speakers can differ in their phonemic structure.

Segmentals

A generic term for vowels and consonants.

Sociolect

A way of talking that is typical of a particular social group.

Sociolinguistics

The study of how language varies with sociological variables like age, sex, income, education, etc.

Spectral Slope

An acoustic parameter that relates to the way the vocal cords vibrate.

Spectrogram

A picture of the distribution of acoustic energy in speech. It normally shows how frequency varies with time. Spectrograms are often used to illustrate an acoustic feature or features of importance.

Speech Perception

That part of phonetics that studies how the acoustic properties of speech sounds are perceived by the listener.

Spectrum

The result of an acoustic analysis showing how much energy is present at what frequencies in a given amount of speech.

Standard Deviation

A statistical measure quantifying the spread of a variable around a mean value.

Stress

Prominence of one syllable in a word used to signal linguistic information, like the difference between IMplant (noun) and imPLANT (verb) in English.

Subglottal Resonance

A frequency in speech attributable to structures below the vocal cords, e.g. the trachea.

Suprasegmentals

A generic term for tone, stress and intonation.

Syllable (Or Speaking) Rate

A measure of how fast someone speaks, usually quantified in terms of syllables per second, inclusive of pauses.
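
The difference between this measure and articulation rate (see the Articulation Rate entry) comes down to whether pause time is counted. A minimal sketch, assuming the syllable count and the durations have already been measured; the function names and figures are hypothetical.

```python
def syllable_rate(syllable_count, total_duration_s):
    """Syllables per second, inclusive of pauses."""
    if total_duration_s <= 0:
        raise ValueError("total duration must be positive")
    return syllable_count / total_duration_s

def articulation_rate(syllable_count, total_duration_s, pause_duration_s):
    """Syllables per second, exclusive of pauses."""
    speaking_time_s = total_duration_s - pause_duration_s
    if speaking_time_s <= 0:
        raise ValueError("pause time must be shorter than the total duration")
    return syllable_count / speaking_time_s

# Hypothetical sample: 60 syllables in 15 s of speech, 3 s of which are pauses.
print(syllable_rate(60, 15.0))           # 4.0 syllables per second
print(articulation_rate(60, 15.0, 3.0))  # 5.0 syllables per second
```

For the same stretch of speech the articulation rate can never be lower than the syllable rate, since removing pauses only shortens the time being divided by.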

Systemic Difference

One of the ways in which speakers can differ in their phonemic structure.

Tone

The use of pitch to signal different words, as in tone languages like Chinese.

Variance

A statistical measure quantifying the variability of a variable; the square of the standard deviation.
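
The relationship between variance and standard deviation can be checked with Python's standard statistics module. The fundamental frequency values below are made up for illustration, and the sample (n - 1) versions of both measures are used as an assumption.

```python
import statistics

# Hypothetical fundamental frequency measurements in Hz.
f0_hz = [110.0, 120.0, 115.0, 125.0, 130.0]

mean_f0 = statistics.mean(f0_hz)
sd_f0 = statistics.stdev(f0_hz)      # sample standard deviation
var_f0 = statistics.variance(f0_hz)  # sample variance

print(mean_f0)           # 120.0
print(round(sd_f0, 2))   # 7.91
print(round(var_f0, 2))  # 62.5
# The variance equals the square of the standard deviation.
print(abs(sd_f0 ** 2 - var_f0) < 1e-9)
```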

Voice Quality

One of two very important descriptive components of a voice, the other being phonetic quality. Describes those long-term or short-term aspects of a voice that do not have to do with the realisation of speech sounds.

Voiceprint Identification

Highly controversial method of speaker identification exclusively using visual examination of spectrograms.

Voice Print

Another name for spectrogram. Usually avoided because of its association with voiceprint identification.

Voicing / Phonation

Refers to the activity of the vocal cords.
