Analyzing the breadth of vocabulary covered in a TV show

asked 2016-08-15 16:22:27 -0500

Good afternoon, everybody,

I am developing a course/product that uses a telenovela (TV series) to teach people Spanish. I would like to analyze the episode transcripts to determine which words the show covers and which it does not. More specifically, I need to compare the words used in the show against a frequency list for the language, to find out whether any important words are missing from the show.
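
To make the comparison concrete, here is a rough Python sketch of the kind of analysis I have in mind. The function names (`tokenize`, `missing_words`, `coverage`), the `top_n` cutoff, and the idea that the frequency list is a plain list of words ordered from most to least frequent are all my own placeholders, not taken from any existing tool. It also treats inflected forms (e.g. "estás" vs. "estar") as different words, which is probably too naive for Spanish.

```python
import re
import unicodedata
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens, keeping accented letters."""
    # Normalize to NFC so that accented letters are single code points.
    text = unicodedata.normalize("NFC", text).lower()
    # [^\W\d_]+ matches runs of Unicode letters (no digits, no underscore),
    # so punctuation such as "¿" and "," is dropped.
    return re.findall(r"[^\W\d_]+", text)


def missing_words(transcript, frequency_list, top_n=1000):
    """Return the words among the top_n most frequent that never appear."""
    used = Counter(tokenize(transcript))
    top = [w.lower() for w in frequency_list[:top_n]]
    return [w for w in top if w not in used]


def coverage(transcript, frequency_list, top_n=1000):
    """Fraction of the top_n frequency-list words that occur in the transcript."""
    top = frequency_list[:top_n]
    if not top:
        return 0.0
    missing = missing_words(transcript, frequency_list, top_n)
    return 1 - len(missing) / len(top)


if __name__ == "__main__":
    # Toy data for illustration only; a real run would load the episode
    # transcripts and a full frequency list from files.
    transcript = "Hola, ¿cómo estás? Yo estoy bien. ¿Y tú?"
    frequency_list = ["de", "la", "que", "yo", "y", "cómo", "bien"]
    print(missing_words(transcript, frequency_list))
    print(coverage(transcript, frequency_list))
```

What I do not know is whether this simple word-matching approach is adequate, or whether the transcripts need to be lemmatized first so that conjugated verbs count toward their dictionary form.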

Does anybody know how I can do this? A method that works for English but not Spanish would also be helpful, since I am interested in analyzing the transcripts of a few English shows as well.

Thank you to everyone for taking the time to read this!
