Rather than 'converging', bilinguals appear to keep their two (or more) languages fairly well compartmentalized from the very earliest stages. (They also know which language to use with whom, and whether and how to code-switch.) It sounds like you are working from a different assumption, something like 'language mixing', which I think is now accepted as very unusual rather than the norm. Even in what we call code-switching, the changes from one code (language) to the other are almost always made in 'chunks', that is, at major syntactic boundaries. (There are some exceptions, such as pidgins and a few other cases, usually where there is insufficient exposure to one of the languages.)

Probably the largest corpus for (mainly child) language acquisition is CHILDES, which is multilingual and transcribed in ASCII (plain text). The site also has programs to analyze and manipulate the data. Much of the data is from bilingual children, though I don't know how much of it covers both languages. If this corpus is unsuitable for you (IMO unlikely), try Googling for something more suitable. There are also corpora of language learners (e.g., ESL learners).


