Hello, I was just hoping to add my name as a past developer. It was going to happen before, but there were problems with releasing updates to the website at the time. I was primarily involved in the effort to get OpenNLP into Apache. I wrote a lot of tools and code that were untracked at the time, and created nearly all of the models on this page: https://opennlp.sourceforge.net/models-1.5/ I also authored a paper with Dr. Jason Baldridge about the results of this effort, and much of the old SourceForge wiki: https://web.archive.org/web/20121117050721/http://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Main_Page including the entirety of this: https://web.archive.org/web/20100917162145/http://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Newlang Thanks, Sean Adams
|
Hi @scadams, thanks for doing this. The origin of those old SourceForge models comes up a lot! I'd love to learn more about the data and how they were trained. They are left on SourceForge and weren't moved over because we can't be sure they'd fit under the Apache license since no one really knew their training data. I'd love to read your paper with Jason also. As someone who got involved past that point I think it would be a good read to have and maybe even put on the website. |
|
I would also love to know more about the paper. Maybe you can provide a reference to it, so we can get an idea of how the models were trained, etc. :) |
|
Here, does this work? I've never attached a file like this before. |
|
@jzonthemtn @rzo1 Hopefully this answers your questions: The training data came from CoNLL 2006 and all of it can be downloaded here along with documentation and license information: CoNLL-X Shared Task: Multi-lingual Dependency Parsing. These are/were the data sources for each language:
The Language Data section of the wiki I mentioned above describes how the models were trained using this data. |
|
@scadams Thanks a lot for that info. The project has been asked so many times it's nice to be able to give an answer now! |