Hi Diana
This is a HUGE topic.
I think you have to keep at least three dimensions in mind. Let's use an example so we have something to talk about: "John loves Mary".
First, what is the frequency you are measuring? You can look at the frequency of words, phoneme sequences, stress patterns, part of speech sequences, syntactic phrases, syllable types. ANY linguistic unit you can imagine.
Second, what are you computing frequencies over? Some corpus? A dictionary? How are you tokenizing/lemmatizing? For example, does "loves" get counted separately from "loved"?
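To make the counting question concrete, here is a minimal sketch in Python. The toy corpus, the `tokenize` helper, and the hand-written `LEMMAS` table are all invented for illustration; a real study would use an actual corpus and a proper lemmatizer. It only shows that the same text gives different frequencies depending on whether you count word forms or lemmas.

```python
from collections import Counter
import re

# Toy corpus, invented for illustration only.
corpus = "John loves Mary. Mary loved John. John loves books."

# Hypothetical hand-written lemma table (an assumption, not a real resource).
LEMMAS = {"loves": "love", "loved": "love", "books": "book"}

def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())

tokens = tokenize(corpus)

# Counting by surface form keeps "loves" and "loved" apart.
form_counts = Counter(tokens)

# Counting by lemma pools them under "love"; unknown tokens map to themselves.
lemma_counts = Counter(LEMMAS.get(t, t) for t in tokens)

print(form_counts.most_common())
print(lemma_counts.most_common())
```

Under form counting, "loves" and "loved" are separate entries; under lemma counting they collapse into a single "love" entry, so the answer to "how frequent is this word?" depends on that choice.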
Finally, what is your frequency effect meant to explain? In the psycholinguistic literature, frequency effects have been found for production, comprehension, lexical access, etc. In other areas, frequency effects show up in historical change, all over the place in sociolinguistics, phonology, phonetics, etc.
I hope this helps. The take-home point is that there is lots and lots of work in this general area, and you probably want to narrow your focus to something more specific than "frequency effects in language" per se.
Mike Hammond
Hi, Diana,
There are, as Dr. Hammond pointed out, lots of things in language that could be considered to be, or to be caused by, "frequency effects". Likewise, there are many ways of considering elements to be "frequent". Word frequency is usually measured by lemma (i.e., roughly, the dictionary headword: a conglomerate of all morphological forms of a given "word"; and do read up on the many varied definitions of the term "word"). There are, however, reasons for assuming that different morphological forms of the "same" word might have different frequencies of use and therefore even be pronounced differently. A case in point in English: the first-syllable vowel is reduced in 'mistake' (verb) but unreduced in 'mistook' (past tense of the same verb). Frequency figures for different forms of the same word are not easy to come by, for example, in the Thorndike & Lorge counts from the 1930s and 40s, so if you get interested in that aspect of things, I would recommend using the Carroll et al. (1970 or 71) Word Frequency Book, which is both more recent and lists separately all the forms encountered in a 5M-word corpus.
I have pretty much gone along with your understanding of the term "frequency" as principally applying to word frequency. You should be aware, though, that several studies might be taken to suggest that word frequency, especially for words like homonyms, might not be the whole story (time/thyme are not pronounced quite the same; see mistake/mistook above, etc.). Grammaticalization studies likewise might be taken as suggesting that even frequent words may have different pronunciations depending on their subpart of speech (e.g., 'well' as an adverb or as an interjection/marker).
Many of these themes (and I have concentrated here on word-form pronunciation, because it's what I know best) have only recently come to the fore in linguistics (e.g., markers and interjections have for some centuries not really been taken seriously in ...