
frequency of phonemes in American English

asked 2015-03-19 14:40:42 -0400

anonymous user


I am trying to find out the frequency of consonant phonemes in American-accented English as actually spoken, as opposed to 'proper' dictionary pronunciation. I would like to include the glottal stop in this list as well, so that words like 'hat' or 'kitten', in which Americans are most likely to use a glottal stop in place of the 't's, are factored in.

I would like all of the actual sounds taken into account, such as how 'exit' is pronounced 'egzit', or how many past tense -ed verbs are pronounced with a 't' instead, and so on. Where would I be able to find this kind of information?

Thanks for your time.


(Transferred from old LINGUIST List Ask-a-Linguist site)


2 answers


answered 2015-03-21 20:57:57 -0400

usagi5886

It seems like what you're looking for is a speech corpus from which you could extract frequency counts of the different phones of the language. If so, of the various corpora available online, I'd recommend starting here:

Both of them can be obtained for free for academic purposes. Depending on how narrow or broad their transcriptions are, you might be able to extract the information you're looking for. An in-depth examination of the data will probably require programming skills, so if you don't know how to use Python, R, etc., you could ask a friend/colleague/professor for help.
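As a rough illustration of the kind of counting involved, here is a minimal Python sketch. It assumes the corpus has already been read into lists of phone labels; the toy data, the label set, and the function name `phone_frequencies` are all made up for the example and do not reflect the file format of any particular corpus.

```python
from collections import Counter

# Hypothetical toy data: each utterance is a list of phone labels, as they
# might look after parsing a narrowly transcribed corpus. Real corpora use
# their own label sets and file formats.
utterances = [
    ["h", "æ", "ʔ"],            # "hat" with a final glottal stop
    ["k", "ɪ", "ʔ", "n"],       # "kitten" with a medial glottal stop
    ["ɛ", "g", "z", "ɪ", "t"],  # "exit" pronounced "egzit"
]

# Vowel labels that occur in this toy sample only (an assumption for the demo).
VOWELS = frozenset({"æ", "ɪ", "ɛ"})


def phone_frequencies(utterances, exclude=frozenset()):
    """Return (relative_frequencies, raw_counts) over all phones not in `exclude`."""
    counts = Counter(
        phone for utt in utterances for phone in utt if phone not in exclude
    )
    total = sum(counts.values())
    relative = {phone: n / total for phone, n in counts.items()}
    return relative, counts


# Count consonants only, most frequent first.
rel, counts = phone_frequencies(utterances, exclude=VOWELS)
for phone, n in counts.most_common():
    print(f"{phone}\t{n}\t{rel[phone]:.1%}")
```

Whether the glottal stop shows up as its own label at all depends on how narrow the corpus transcription is, so it is worth checking the corpus documentation before trusting the counts.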

Also, I should note that the answer will depend on what kind of speech you're looking at. Just like text corpora differ in many ways depending on whether they are from novels, news, etc., your distributions of frequency counts will probably depend on whether you're looking at spontaneous speech or semi-structured/elicited speech, and what kind of discourse it is. (Some of these effects will be because certain kinds of words, containing certain kinds of sounds, are more/less frequent in one genre compared to another.)



You might check out Phonological CorpusTools to more easily search such corpora without extensive programming knowledge. E.g., if you have the Buckeye corpus, you can load it into PCT and do a "summary" of the corpus to get frequency counts for each phoneme.

K. C. Hall (2015-04-15 20:47:21 -0400)

answered 2015-03-19 14:44:49 -0400

anonymous

I am not sure I understand the question. You may want to submit the question again, telling us why you want this information.

You might want to start by reading up on phonetics a bit. Ladefoged's course on phonetics is available online, with sounds.

You need to:

  1. Understand the difference between a phoneme and an allophone. For example, /p/ sounds quite different in 'pin', 'spin', and 'nip'.

  2. Understand allomorphs. For example, the plural (written as (e)s) is pronounced differently in 'cats', 'dogs', and 'horses'. This concept applies to the regular past tense.

  3. Understand that there is dialectal variation in both phonemes and allophones.
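To make point 2 concrete, the regular plural and past tense endings can be sketched as a choice that depends on the final sound of the stem. The Python below is a simplified illustration only: the sound classes are abbreviated, the function names are made up for the example, and the vowel in the syllabic endings (shown here as 'ɪ') varies by speaker and dialect.

```python
# Simplified, hypothetical sound classes; a real analysis would use a full
# feature system rather than hand-listed sets.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}


def plural_allomorph(final_phone):
    """Pick the regular plural ending from the stem's final sound."""
    if final_phone in SIBILANTS:
        return "ɪz"   # 'horses'
    if final_phone in VOICELESS:
        return "s"    # 'cats'
    return "z"        # 'dogs' (voiced consonants and vowels)


def past_tense_allomorph(final_phone):
    """Pick the regular past tense ending from the stem's final sound."""
    if final_phone in {"t", "d"}:
        return "ɪd"   # 'wanted', 'needed'
    if final_phone in VOICELESS:
        return "t"    # 'walked'
    return "d"        # 'played', 'hugged'


for word, final in [("cat", "t"), ("dog", "g"), ("horse", "s")]:
    print(word, "+ plural ->", plural_allomorph(final))
for word, final in [("walk", "k"), ("want", "t"), ("play", "eɪ")]:
    print(word, "+ past ->", past_tense_allomorph(final))
```

This is why a count based on spelling alone would mislead: the same written ending corresponds to several different sounds.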

Hope this helps. I am not sure what you mean by 'frequency' in this question.

Anthea Fraser Gupta

Hi, Kevin,

I can't answer your question with references either, off the top of my head, though as Dr. Fischer says there are such references available. In fact, with a little (OK, a lot of) study of the sound system of American English (btw, note also that there's no such thing, really: there is a vast array of different variants of 'American English', from 'Yankee' to 'Southern' to 'Midwest' to 'Western', and each one includes a wide variety of different 'accents', without even mentioning isolated variants like Ozark, Appalachian, etc. and others like 'Hispanic', 'Yiddish (English)', etc. Likewise, each of the varieties I have mentioned (most of which cover a large geographic area) is itself divided into subvarieties. Indeed, in the focal, original areas of American English (for historical reasons, mostly found on or near the East Coast), even small areas can have a great variety of different ways of pronouncing the same words. One example: if we try to describe 'Philadelphia English', we need to take into account many different subvarieties. I remember one article about a kid who moved from one suburb of Philadelphia (a western one, I think) to King of Prussia (another Philadelphia suburb, though I'm not sure exactly where it is) when she was six years old. Years later, natives of KofP could still tell she was not a native speaker of KofP English, although most linguists would consider six years old plenty young enough to acquire a new language or dialect natively. In any case, the pronunciation differences, even between very closely related variants, are often detectable and therefore describable; and there are huge numbers of such differences. Now, an Aussie, for example, would be unlikely to detect all these subtle differences and would mostly notice only an 'American accent'. So a separate issue would be the audience you are directing your study to.)

To return to the previous, interrupted sentence: you could take a speech synthesizer for American English (i.e., a program that converts written text to speech, a much easier task than speech recognition), but it would have to be a very good one to capture all (or at least some ...


Last updated: Mar 21 '15