Looking for the right corpus data

For my PhD research, I am building a computational model that predicts how two languages converge in the bilingual mind. With this model, I wish to establish which elements of language are more likely to converge and are therefore more vulnerable to contact-induced change.

Put simply, the model takes parental input that is coded for morphosyntax and semantics and, based on the structural and conceptual overlap, produces output that should correspond to the child’s actual linguistic output. To do so, I need a corpus that contains a reasonable amount of parental input in two languages (where one parent speaks one language and the other parent speaks another) and, of course, the child’s output. Furthermore, because contact-induced change is probably more visible in languages that do not share a large part of their structures, I would like to look at bilingual data from languages of different language families.

Are you aware of a transcribed corpus that contains these data? Specifically, I am looking for a corpus that contains a reasonable amount of bilingual input following the one-parent-one-language (OPOL) policy, a combination of languages from two different language families, and the child’s output (the age range does not matter).

Thanks a lot!