
Corpus analysis software program

asked 2015-04-28 19:10:51 -0400

Purplecow

updated 2015-04-28 19:11:35 -0400

Is there any software program that I can use for analyzing language corpora other than English corpora? I tried EmEditor, but it doesn't seem to work.


2 answers


answered 2015-04-28 20:04:27 -0400

See the answer here:

For example, AntConc should be useful. General Unix command-line tools can also process or analyze corpora in other languages, assuming that you are talking about other scripts or text encodings (e.g. a Unicode encoding such as UTF-8). Otherwise, you can always code simple tools in Python or Java yourself. For Python, you might want to look into NLTK, which can process various file formats and Unicode-encoded texts.
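To give an idea of what "coding a simple tool yourself" could look like, here is a minimal sketch of a word-frequency counter in plain Python 3 (standard library only, no NLTK). The function names `count_tokens` and `count_tokens_in_file` are made up for this illustration, and the code assumes the corpus is UTF-8 text in a language that separates words with spaces or punctuation. Scripts without word spacing (Chinese, Japanese, Thai) or scripts that rely heavily on combining marks would need a proper tokenizer instead of the simple regular expression used here.

```python
import re
import unicodedata
from collections import Counter

# In Python 3, \w in a str pattern matches letters and digits from any
# script, not just ASCII, so this works for Cyrillic, Greek, etc.
TOKEN_RE = re.compile(r"\w+")


def count_tokens(lines):
    """Count lowercase word tokens in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # NFC normalization makes composed and decomposed forms of the
        # same character (e.g. "é") count as one token type.
        normalized = unicodedata.normalize("NFC", line)
        counts.update(tok.lower() for tok in TOKEN_RE.findall(normalized))
    return counts


def count_tokens_in_file(path, encoding="utf-8"):
    """Stream a corpus file line by line, so large files fit in memory.

    The default encoding is an assumption; pass the real one if the
    corpus is stored as something else (e.g. "cp1251" or "shift_jis").
    """
    with open(path, encoding=encoding) as handle:
        return count_tokens(handle)
```

A call such as `count_tokens_in_file("corpus.txt").most_common(20)` (where `corpus.txt` is a placeholder file name) would then list the twenty most frequent word forms. Because the file is read one line at a time, this also sidesteps the file-size limits that some text editors have.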



answered 2015-05-01 19:36:47 -0400

usagi5886

updated 2015-05-01 20:31:10 -0400

If you're just looking for something like EmEditor, i.e. a text editor for Windows that can handle large files, there are many out there. I'd try out the software suggested at these pages:

Many of the listed programs are free, and most should have full Unicode support (hence be usable with non-English corpora).

I personally use Notepad++, and a colleague of mine swears by Crimson Editor, but I'm not sure whether either of them supports files as large as you need.
