Should all sentence lengths be considered in the Menzerath-Altmann Law?

Hi there,

I am currently analysing some English texts in terms of the Menzerath-Altmann law, in the form: the longer the sentences are, the longer the words are (following Arens' law: the longer the sentences are, the shorter the phrases are, and so the longer the words are). My results show that the texts, with 15277 sentences in total, do NOT conform to the Menzerath-Altmann law, or at least not strongly. I measured sentence length in words and word length in syllables.

However, I wanted to know whether it is 'legitimate' to ignore extremely short or extremely long sentences in the analysis. For example, only 22 sentences consist of just one word, and only 30 sentences consist of 40 words or more. Eliminating these sentences gives a much stronger conformity to the Menzerath-Altmann law.

Does anybody have any experience with this in statistical analyses? Is it "allowed" to ignore the extremes on either side, for example by dropping the shortest 0.5% and longest 0.5% of the analysed sentences? Or does one have to consider all sentences equally?
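To make the trimming concrete, here is a minimal sketch of what dropping the shortest 0.5% and longest 0.5% could look like. It assumes each sentence is stored as a `(words, syllables)` pair; the function names and the toy data are hypothetical and are not the analysis or the corpus described above.

```python
from collections import defaultdict

def trim_by_length(sentences, lower_frac=0.005, upper_frac=0.005):
    """Drop the shortest lower_frac and the longest upper_frac of sentences.

    `sentences` is a list of (words, syllables) pairs (assumed representation).
    """
    ordered = sorted(sentences, key=lambda s: s[0])  # sort by length in words
    n = len(ordered)
    start = int(n * lower_frac)       # number of shortest sentences to drop
    stop = n - int(n * upper_frac)    # index after the last sentence to keep
    return ordered[start:stop]

def mean_syllables_per_word(sentences):
    """Map sentence length (in words) to mean word length (in syllables)."""
    totals = defaultdict(lambda: [0, 0])  # length -> [syllables, words]
    for words, syllables in sentences:
        totals[words][0] += syllables
        totals[words][1] += words
    return {length: syl / wrd for length, (syl, wrd) in sorted(totals.items())}

# Toy data only: 5 one-word sentences, 990 ten-word sentences, 5 fifty-word sentences.
toy = [(1, 1)] * 5 + [(10, 15)] * 990 + [(50, 100)] * 5
full_profile = mean_syllables_per_word(toy)
trimmed_profile = mean_syllables_per_word(trim_by_length(toy))
```

Comparing `full_profile` with `trimmed_profile` shows how much of the fitted trend depends on the extremes. Note that when the cut-off falls inside a group of equally long sentences, this sketch drops an arbitrary subset of that group, so cutting at a fixed length threshold instead may be easier to justify and report.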

Thanks in advance for any help and advice!

reklaw4
