Forensic Phonetics and Acoustics: January 2010

Monday, 4 January 2010

Article Abstracts II

Forensic Phonetics

Michael Jessen
Bundeskriminalamt BKA

Abstract

An overview of forensic phonetics is presented, focusing on speaker identification as its core task. Speaker profiling/speaker classification is applied when the offender has been recorded, but no suspect has been found. Auditory speaker identification by victims and witnesses becomes relevant when no speech recording of the offender is available. It can take the form of familiar-speaker identification or unfamiliar-speaker identification, and in the latter case a voice line-up/voice parade can be carried out. When recordings of both the offender and a suspect are available, a voice comparison is done by an expert in forensic speech analysis. Current issues and domains in voice comparison analysis include the Bayesian approach to forensic reasoning and the Likelihood Ratio, the use of formant frequency measurements, non-analytic perception and Exemplar Theory, forensic automatic speaker identification, and the interaction between different methods.

Speaker-specific formant dynamics: An experiment on Australian English /aI/

Kirsty McDougall (2004)
Department of Linguistics, University of Cambridge

ABSTRACT

Formant frequency dynamics are relevant to forensic speaker identification since they are determined by the shape and size of a speaker’s vocal tract and the way he or she configures the articulators for speech. This study investigates individual differences in the formant dynamics of /aI/ produced by five male Australian English speakers, and the effects of changes in speaking rate and prosodic stress on these differences. F1, F2 and F3 frequencies are examined at equidistant time-normalized intervals through /aI/. At each measurement point a degree of speaker individuality is present, and speaker differentiation improves as increasing numbers of measurement points are considered in combination. Patterns of speaker-specific behaviour are generally consistent across different rate-stress conditions. Discriminant analyses based on predictors from all three formants yield classification rates of 88–95%, with nuclear-stressed /aI/ performing best. The findings suggest that further research to develop techniques for characterizing individual speakers using formant dynamics is warranted.

KEYWORDS speaker identity, formant frequency dynamics, diphthongs, speaking rate, prosodic stress
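
The abstract above describes measuring formants at equidistant time-normalized intervals through the diphthong. As a minimal sketch of that idea (not the author's actual procedure), the following Python function resamples a formant track at evenly spaced points between vowel onset and offset using linear interpolation. The function name, the sample track and the choice of five points are illustrative assumptions.

```python
def time_normalize(times, values, n_points=11):
    """Sample a formant track at n_points equidistant positions
    between its first and last time stamp (0% to 100% of the vowel),
    using linear interpolation between neighbouring measurements."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs of equal length")
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    start, end = times[0], times[-1]
    resampled = []
    j = 0  # index of the segment [times[j], times[j + 1]] containing t
    for k in range(n_points):
        t = start + (end - start) * k / (n_points - 1)
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        v0, v1 = values[j], values[j + 1]
        weight = (t - t0) / (t1 - t0)
        resampled.append(v0 + weight * (v1 - v0))
    return resampled


# Hypothetical F1 track (seconds, Hz) with unevenly spaced measurements.
f1_times = [0.0, 0.05, 0.10, 0.20]
f1_values = [700.0, 750.0, 600.0, 400.0]
f1_normalized = time_normalize(f1_times, f1_values, n_points=5)
```

Because every token is reduced to the same number of points regardless of its duration, tokens produced at different speaking rates can be compared point by point.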

Beware of the ‘telephone effect’: the influence of telephone transmission on the measurement of formant frequencies

Hermann J. Künzel
Department of Phonetics, University of Marburg

ABSTRACT

Speech scientists often have to work with speech signals that have been transmitted over the telephone. Although the acoustic properties of telephone transmission such as the band-pass filter characteristics are well known, little attention has been paid to their effect on the measurement of speech parameters. This study deals with artefacts introduced by the lower cut-off slope of the transmission channel on vowel formants. For theoretical reasons, frequency components may be assumed to be attenuated the lower they are. Therefore F1 of most vowels can be expected to be affected most. Attenuation of the lower components of a formant will necessarily increase the relative weight of the higher components for the determination of a formant and thus cause an artificial upward shift of its centre frequency. An empirical investigation with directly and telephone-transmitted samples from ten male and ten female subjects shows that the predicted effect on F1 does in fact occur for all tested vowels except /a/, whose F1 is too high to be affected by the slope of the band-pass. The consequences of measurement errors arising from such artefacts are discussed with special reference to speaker identification and empirical dialectology.

KEYWORDS telephone transmission, spectrographic analysis, spectrographic shifting, forensic speaker recognition, dialectology
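
The reasoning in this abstract (attenuating the lower components of a formant raises its apparent centre frequency) can be illustrated with a toy model. The sketch below is not Künzel's measurement method: it assumes a Gaussian-shaped resonance sampled at the harmonics of a 100 Hz voice, a Butterworth-style high-pass magnitude response with a 300 Hz cut-off, and an amplitude-weighted centroid as the "measured" formant. All of these choices and numbers are assumptions made for illustration.

```python
import math


def resonance_weight(f, centre, bandwidth):
    """Gaussian-shaped amplitude of a harmonic near a formant (toy model)."""
    return math.exp(-0.5 * ((f - centre) / bandwidth) ** 2)


def highpass_gain(f, cutoff, order):
    """Magnitude response of a Butterworth-style high-pass filter."""
    return 1.0 / math.sqrt(1.0 + (cutoff / f) ** (2 * order))


def apparent_centre(centre, f0=100.0, bandwidth=100.0,
                    cutoff=None, order=4, max_hz=2000.0):
    """Amplitude-weighted centroid of the harmonics around a formant,
    optionally after applying the high-pass slope of the channel."""
    numerator = denominator = 0.0
    k = 1
    while k * f0 <= max_hz:
        f = k * f0
        weight = resonance_weight(f, centre, bandwidth)
        if cutoff is not None:
            weight *= highpass_gain(f, cutoff, order)
        numerator += f * weight
        denominator += weight
        k += 1
    return numerator / denominator


# Upward shift for a low F1 (300 Hz) versus a high, /a/-like F1 (750 Hz).
shift_low = apparent_centre(300.0, cutoff=300.0) - apparent_centre(300.0)
shift_high = apparent_centre(750.0, cutoff=300.0) - apparent_centre(750.0)
```

In this toy setting the low formant moves upward by tens of hertz while the high formant barely moves, which mirrors the qualitative pattern reported in the abstract.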

Article Abstracts I

Forensic Phonetics: Issues in speaker identification evidence

Andrew Butcher (2002)
Centre for Human Communication Research
Flinders Medical Research Institute
Flinders University, Adelaide, Australia

Abstract

The field of forensic phonetics has developed over the last 20 years or so and embraces a number of areas involving analysis of the recorded human voice. The area in which expert opinion is most frequently sought is that of speaker identification – the question of whether two or more recordings of speech (from suspect and perpetrator) are from the same speaker. Automated analysis (in which Australia is a world leader) is only possible where recording conditions are identical. In the most frequently encountered real-world forensic situation, comparison is required between a police interview recording and recordings made via telephone intercepts or listening devices. This necessitates a complex procedure, involving auditory and acoustic comparison of both linguistic and non-linguistic features of the speech samples in order to build up a profile of the speaker. The most commonly used measures are average fundamental frequency and the first and second formant frequencies of vowels. Much work is still needed to develop appropriate statistical procedures for the evaluation of phonetic evidence. This means estimating the probability of finding the observed differences between samples from the same speaker and the probability of finding those same differences between samples from two different speakers. Thus there needs to be an acceptance that the outcome will not be an absolute identification or exclusion of the suspect. By itself, your voice is not a complete giveaway.
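
The evaluation described above, comparing the probability of the observed differences under the same-speaker and the different-speaker hypotheses, amounts to a likelihood ratio. The sketch below is a simplified illustration and not the procedure from the paper: it assumes that the difference in mean fundamental frequency between two samples is normally distributed around zero, with a small spread for same-speaker pairs and a larger spread for different-speaker pairs. The function names and the spreads of 5 Hz and 20 Hz are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Probability density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(diff, sd_same, sd_diff):
    """p(diff | same speaker) / p(diff | different speakers).
    Values above 1 support the same-speaker hypothesis,
    values below 1 support the different-speaker hypothesis."""
    return normal_pdf(diff, 0.0, sd_same) / normal_pdf(diff, 0.0, sd_diff)


# Hypothetical spreads (Hz) of mean-F0 differences; not taken from the paper.
SD_SAME, SD_DIFF = 5.0, 20.0
lr_small_difference = likelihood_ratio(2.0, SD_SAME, SD_DIFF)
lr_large_difference = likelihood_ratio(30.0, SD_SAME, SD_DIFF)
```

The result is a graded strength of evidence rather than a yes or no answer, which is consistent with the abstract's point that the outcome will not be an absolute identification or exclusion.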

Effects of voice disguise on speaking fundamental frequency

Hermann J. Künzel (2000)
Department of Phonetics, University of Marburg

ABSTRACT

Patterns of voice disguise in forensic cases involving speaker identification or speaker profiling may contain clues to features of the undisguised voice of a speaker. In a longitudinal and synchronous study, 100 subjects were asked to read a text on five occasions during a period of six months, first using their normal voices, and subsequently with two out of three modes of voice disguise, (1) raising fundamental frequency, (2) lowering fundamental frequency, (3) denasalization by firmly pinching their nose. The focus of this investigation is on fundamental frequency (F0). Results show that most subjects were in fact able consistently to change their F0 according to the mode of disguise they had selected. However, there were differences between both sexes with regard to their preference of disguise modes as well as to the individual articulatory ‘strategies’ which they employed to implement them. Results corroborate experience with forensic casework, that is, they show that there is a constant relation between the F0 of a speaker’s natural speech behaviour and the kind of disguise he will use in an incriminating phone call. Speakers with higher-than-average F0 tend to increase their F0 levels. This process may or may not involve register changes from modal voice to falsetto. Speakers with lower-than-average F0 prefer to disguise their voices by lowering F0 even more and often end up with permanently creaky voice. The latter trend can be observed much more clearly in males. Females are generally more reluctant to make drastic changes to their fundamental frequency patterns.

KEYWORDS speaker identification, voice disguise, fundamental frequency, synchronous aspects, longitudinal aspects

Issues in transcription: factors affecting the reliability of transcripts as evidence in legal cases

Helen Fraser (2003)
School of Languages Cultures and Linguistics, University of New England

ABSTRACT

This article considers the reliability of transcripts used as evidence in court, especially transcripts of poor recordings. Background information about human speech and speech perception is presented, and the implications of this information for the use of transcripts of different kinds in legal contexts are considered. Finally, recommendations are made to allow judgement of the reliability of existing transcripts, to ensure that newly created transcripts are reliable, and to ensure that transcripts are presented to a jury appropriately.

KEYWORDS transcription, forensic phonetics, human speech perception, transcript reliability

A recent voice parade

Francis Nolan (2003)
University of Cambridge

ABSTRACT

An account is given of a case in which a voice parade contributed significantly to prosecution evidence. A witness had overheard his landlord arranging for another younger man to set fire to a house (where a fire later that night resulted in a woman’s death), and claimed to know the voice. A voice parade was constructed using composite samples from this suspect’s interview tapes, and, as foils, composite samples from police interviews with similar young men from the London Asian community. The witness identified the man from the voice parade, and also recognized him in a visual parade. This, together with other evidence, resulted in both men being convicted. The paper outlines the problems involved in picking foils from the interview tapes supplied by the police, discusses the format and conduct of the resulting parade including the question asked of the witness, and summarizes challenges in court to the fairness of the parade. In conclusion ways are suggested in which the procedure might be streamlined and its reliability improved.

KEYWORDS Forensic speech analysis, earwitness identification, line-up parade, voice description.

Digital audio recording analysis: the Electric Network Frequency (ENF) Criterion

Catalin Grigoras (2005)
National Institute of Forensic Expertise, Bucharest, Romania

ABSTRACT

This article reports on the Electric Network Frequency Criterion as a means of assessing the integrity of digital audio evidence. A brief description is given of phenomena that determine ENF variations. In most situations, to reach a non-authenticity opinion, the visual inspection of spectrograms and comparison with an ENF database are enough. A more detailed investigation, in the time domain, requires short-time-window measurements and analyses. The stability of the ENF over geographical distances has been established by comparison of synchronized recordings made at different locations on the same network. A real case is presented, in which the ENF Criterion was used to investigate an audio file created with a secret surveillance system. By applying the ENF Criterion in forensic audio analysis, one can determine whether and where a digital recording has been edited, establish whether it was made at the time claimed, and identify the time and date of the registering operation.

KEYWORDS Electrical Network Frequency Criterion, forensic audio, forensic acoustics, audio authentication, digital audio recordings
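
The comparison with an ENF database mentioned in the abstract can be pictured as sliding the ENF sequence extracted from a recording along a longer reference sequence and looking for the best match. The sketch below is a minimal illustration under stated assumptions, not the method from the paper: it assumes both sequences are already available as lists of frequency values at the same sampling interval, and it uses mean squared error as the matching criterion. The function name and the sample values are hypothetical.

```python
def best_enf_offset(reference, recording):
    """Slide `recording` along `reference` and return (offset, error),
    where offset is the position with the smallest mean squared error."""
    n = len(recording)
    if n == 0 or n > len(reference):
        raise ValueError("recording must be non-empty and no longer than reference")
    best_offset, best_error = 0, float("inf")
    for offset in range(len(reference) - n + 1):
        error = sum((reference[offset + i] - recording[i]) ** 2
                    for i in range(n)) / n
        if error < best_error:
            best_offset, best_error = offset, error
    return best_offset, best_error


# Hypothetical ENF values in Hz (one value per time step) for a 50 Hz network.
enf_database = [50.00, 50.01, 49.99, 49.98, 50.02,
                50.03, 50.00, 49.97, 49.99, 50.01]
enf_from_recording = [50.021, 50.029, 50.001, 49.971]
offset, error = best_enf_offset(enf_database, enf_from_recording)
```

A small error at one offset is consistent with the recording having been made at the corresponding time; a recording that matches nowhere, or matches in disjoint pieces, would call for closer inspection.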

GSM interference cancellation for forensic audio: a report on work in progress

Philip Harrison (2001)
J P French Associates, England

ABSTRACT

A central aspect of forensic phonetic casework concerns the transcription of noisy recordings. An increasing problem in this area of work is the contamination of recordings with interference caused by radio transmissions from GSM mobile phones. Transmitting phones emit short-duration radio-frequency pulses at a rate of 217 Hz. The induced interference signal contains the 217 Hz fundamental and a large number of harmonics that overlap the frequency range of speech, and therefore severely degrade speech intelligibility. Listener fatigue is increased due to the harsh sound of the interference, and overall the transcription of such audio samples is problematic. This paper describes the ongoing development of a filter to assist the forensic phonetician in carrying out the transcription of such contaminated recordings.

KEYWORDS forensic audio, transcription, GSM interference, adaptive filter
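
To see why the 217 Hz interference is so damaging, it helps to list where its harmonics fall. The sketch below simply enumerates the integer multiples of 217 Hz inside a frequency band; the band of 100 Hz to 4000 Hz is an assumption chosen for illustration and does not come from the paper, and the code does not implement the filter the paper describes.

```python
def harmonics_in_band(fundamental_hz, low_hz, high_hz):
    """Return the harmonics k * fundamental_hz (k = 1, 2, ...)
    that fall inside the closed band [low_hz, high_hz]."""
    if fundamental_hz <= 0 or low_hz > high_hz:
        raise ValueError("need a positive fundamental and low_hz <= high_hz")
    harmonics = []
    k = 1
    while k * fundamental_hz <= high_hz:
        if k * fundamental_hz >= low_hz:
            harmonics.append(k * fundamental_hz)
        k += 1
    return harmonics


# Assumed band for illustration only: 100 Hz to 4000 Hz.
gsm_harmonics = harmonics_in_band(217, 100, 4000)
```

Under that assumption the interference places a component roughly every 217 Hz across the whole band, so no part of the speech spectrum is left untouched.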

The ‘Mobile Phone Effect’ on Vowel Formants

Catherine Byrne* and Paul Foulkes** (2004)
*University of Sheffield, **University of York

ABSTRACT

This study analyses the effect of mobile phone transmission on vowel formant frequencies, based on the study presented by Künzel (2001). Six male and six female speakers read a short passage into a mobile phone. Two simultaneous recordings were made, one at the far end of the phone line and the other via a microphone directly in front of the speaker. Measurements of F1, F2 and F3 were taken from between 15 and 25 stressed vowels per speaker in both sets of recordings. Due to the filtering effect of the phone transmission, F1 frequencies for most vowels were found to be higher than their counterparts in the direct recordings. The overall effect of the mobile phone on F1 frequencies was considerably greater than the landline telephone effect found by Künzel (2001): on average the F1 values in the mobile condition were 29 per cent higher than in the direct condition. On the whole F2 measures were not significantly affected, in line with Künzel’s findings. F3 frequencies were also generally unaffected by the mobile phone transmission. Exceptions were found, however, particularly for individual speakers with relatively high F3s. In these cases the mobile recordings tended to yield significantly lower values. The consequences of measurement errors arising from the different recording conditions are discussed with reference to forensic speaker identification.

KEYWORDS formant analysis, mobile phone transmission, forensic speaker identification

Forensic Phonetics and Forensic Speech Science
Prepared by: Burcu ÖNDER

FORENSIC PHONETICS is the application of the expertise of linguists and phoneticians to forensic purposes, whether in phonetic research or in investigations that may shed light on legal cases.
The main topics phoneticians work on in such research and investigations are, in outline:
• Deciphering recordings of poor audio quality
• Analysis of disputed utterances
• Assessment of speaker behaviour
• Speaker profiling
• Speaker identification
• Speaker identification by witnesses.

A. Speaker Profiling:

• The focus is on the speaker's sex, age, regional and social background, and individual peculiarities (pronunciation disorders of pathological origin).
• Phonetic and linguistic analysis is required.
• The degree of certainty of the results varies with the length and quality of the material, the analyst's experience in this area, and the adequacy of the available descriptive dialectological knowledge.

B. Speaker Recognition:

• This is the main area in which forensic phoneticians work. In many countries around the world, 70% of forensic cases concern speaker recognition.
• The voice of the unknown person in the incriminating recording (covert recordings, telephone intercepts) is compared with a known recording of the suspect's voice (police interview).
• Speaker recognition is carried out worldwide using three different methods: the voiceprint method, automatic speaker recognition, and the phonetic-acoustic method.
• Although the voiceprint method is still applied in some countries, it has long since lost its validity in today's academic world. The reason is that speech is not only specific to the individual; every speech act is also unique in its own right. No feature of speech is constant. A person's voice can vary a great deal from one occasion to another. For example, whispering, shouting, a cold, emotional state, and alcohol and drug use are important factors that can alter a person's voice, and the type of recording (telephone, digital) also causes significant changes in it.
• In automatic speaker recognition systems, complex acoustic signals are converted into a mathematical model. The model is highly abstract and is not directly related to individual phonetic data. The computer compares the new voice sample with reference models and reaches its result within its own mathematical system. There are automatic speaker recognition systems that give very good results, but all of them operate on clean recordings made under laboratory conditions. As is well known, however, the recordings involved in forensic cases are generally of poor quality and hard to understand. For this reason these systems are not yet reliable enough to be used on their own.
• The phonetic-acoustic method is currently the most reliable method. Speech is examined in all its components: syntax, word choice, phonetics, acoustics, stress, intonation, rhythm, etc.

B.1. The speaker recognition process:

• All relevant phonetic and linguistic data in the recordings are analysed.
• The analysis is both auditory and acoustic.
• The features of the known speaker (B) are compared with those of the unknown speaker (X) in order to assess similarities and differences.
• The consistency between the recordings is established.
• It is then determined how distinctive the features underlying that consistency are relative to other members of the population, with emphasis on features that are uncommon in the population. The best feature is one that varies widely between speakers but varies little within a single speaker.
• A single feature is not sufficient to identify a person. The features examined vary from person to person. Dozens of components are examined in each case.
• Because of the complex nature of speech, the identification procedure generally results in a graded assessment report rather than proof.
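
The criterion in the list above, that the best feature varies widely between speakers but little within a speaker, can be expressed as a ratio of between-speaker variance to within-speaker variance. The sketch below is an illustrative assumption about how such a ratio might be computed, not a procedure described in the article; the function name and the sample measurements (mean F0 in Hz and articulation rate in syllables per second for three speakers) are hypothetical.

```python
from statistics import mean, pvariance


def variance_ratio(samples_by_speaker):
    """Between-speaker variance of the speaker means divided by the
    average within-speaker variance. Higher values indicate a feature
    that separates speakers better."""
    speaker_means = [mean(values) for values in samples_by_speaker.values()]
    between = pvariance(speaker_means)
    within = mean([pvariance(values) for values in samples_by_speaker.values()])
    return between / within


# Hypothetical repeated measurements for three speakers.
mean_f0_hz = {
    "A": [100.0, 102.0, 98.0],
    "B": [130.0, 128.0, 132.0],
    "C": [115.0, 117.0, 113.0],
}
articulation_rate = {
    "A": [5.0, 6.0, 4.0],
    "B": [5.2, 4.2, 6.2],
    "C": [4.9, 5.9, 3.9],
}
ratio_f0 = variance_ratio(mean_f0_hz)
ratio_rate = variance_ratio(articulation_rate)
```

With these made-up numbers the F0 feature has a far higher ratio than articulation rate, so it would be the more useful of the two for distinguishing these particular speakers.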

Ideally, one would be able to characterize a phonetic feature with the certainty of a fingerprint or a DNA profile, but unfortunately this is not possible. Although there is no single characteristic sufficient to distinguish one person's speech from that of everyone else, the use of certain features provides a basis for relatively successful identifications. Consequently, a voice recording should always be used as evidence together with other corroborating evidence.

C. Speaker Recognition by Witnesses:

• The procedure resembles the speaker identification process, but here the person who has to carry out the analysis is the witness.
• Most research has shown that, in this process, a person without expertise in phonetics performs differently from a phonetician, because many factors affect the witness's performance. These include the listener's degree of familiarity with the voice, whether exposure to the voice was active (a two-way conversation) or passive (merely overhearing it), the conditions under which the sample was heard (directly or over the telephone), and the length of time between exposure to the voice and the recognition procedure. In such cases, the reliability of the witness should first be tested using tests prepared by a phonetician, and the speaker recognition procedure should then proceed according to the witness's degree of reliability.

An analyst may work for either the prosecution or the defence. It should be remembered that the phonetician carrying out the analysis is not someone trying to win the case for their own side. Their task is to present the data objectively.

References:

Baldwin, J. and French, J.P. (1990) Forensic Phonetics. London: Pinter.

Braun, A. and H. Künzel (1998) Is forensic speaker identification unethical – or can it be unethical not to do it? Forensic Linguistics 5(1), 10-21.

Broeders, A. (1996) Earwitness identification: common ground, disputed territory and uncharted areas. Forensic Linguistics 3(1), 3-13.

Broeders, A. (1999) Some observations on the use of probability scales in forensic identification. Forensic Linguistics 6(2), 228-241.

Broeders, A. and A. Rietveld (1995) Speaker identification by earwitnesses. In J-P. Köster & A. Braun (eds) Studies in Forensic Phonetics. Trier: Trier University Press.

Butcher, A. (2002) Forensic phonetics: issues in speaker identification evidence. Paper presented at the Inaugural International Conference of the Institute of Forensic Studies: ‘Forensic Evidence: Proof and Presentation’, Prato, Italy.

Clark, J. & Foulkes, P. (2007) Identification of voices in disguised speech. The International Journal of Speech, Language and the Law 14(2), 195-221.

French, J. P. and Harrison, P. T. (2007) Position Statement concerning use of impressionistic likelihood terms in forensic speaker comparison cases. The International Journal of Speech, Language and the Law 14(1), 137-144.

Foulkes, P. & French, J.P. (2001) Forensic phonetics and sociolinguistics. In Mesthrie, R. (ed.) Concise Encyclopedia of Sociolinguistics. Amsterdam: Elsevier Press. pp. 329- 332.

Nolan, F. (1997) Speaker recognition and forensic phonetics. In W.J. Hardcastle and J. Laver (eds), A Handbook of Phonetic Sciences. Oxford: Blackwell.
