Update team.ad #94

Merged
jzonthemtn merged 1 commit into apache:main from scadams:patch-1
Sep 27, 2025

Conversation

@scadams
Contributor

@scadams scadams commented Sep 16, 2025

Hello, I was just hoping to add my name as a past developer. It was going to happen before, but there were problems with releasing updates to the website at the time.

I was primarily involved in the effort to get OpenNLP into Apache. I wrote a lot of tools and code that were untracked at the time, and created nearly all of the models on this page: https://opennlp.sourceforge.net/models-1.5/

And I authored a paper with Dr. Jason Baldridge about the results of this effort.

I also authored much of the old Sourceforge wiki: https://web.archive.org/web/20121117050721/http://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Main_Page including the entirety of this: https://web.archive.org/web/20100917162145/http://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Newlang

Thanks,
Sean Adams

@rzo1 rzo1 requested a review from jzonthemtn September 16, 2025 06:05
@jzonthemtn
Contributor

Hi @scadams, thanks for doing this.

The origin of those old SourceForge models comes up a lot! I'd love to learn more about the data and how they were trained. They were left on SourceForge rather than moved over because we couldn't be sure they'd fit under the Apache license, since no one really knew what their training data was.

I'd love to read your paper with Jason also. As someone who got involved past that point I think it would be a good read to have and maybe even put on the website.

@rzo1
Contributor

rzo1 commented Sep 16, 2025

I would also love to know more about the paper. Maybe you can provide a reference to it, so we can get an idea of how the models were trained, etc. :)

@scadams
Contributor Author

scadams commented Sep 18, 2025

Here, does this work? I've never attached a file like this before.

Converseon-sponsoredOpenNLPdevelopment.pdf

@jzonthemtn jzonthemtn merged commit fe8fea5 into apache:main Sep 27, 2025
1 check passed
@scadams scadams deleted the patch-1 branch September 28, 2025 08:06
@scadams
Contributor Author

scadams commented Sep 28, 2025

@jzonthemtn @rzo1 Hopefully this answers your questions:

The training data came from CoNLL 2006 and all of it can be downloaded here along with documentation and license information: CoNLL-X Shared Task: Multi-lingual Dependency Parsing

These are/were the data sources for each language:

  • Danish: The Danish Dependency Treebank
  • Dutch: The Alpino Treebank
  • Portuguese: The Floresta Sintá(c)tica project
  • Swedish: Talbanken05 Swedish treebank

The Language Data section of the wiki I mentioned above describes how the models were trained using this data.

@jzonthemtn
Contributor

@scadams Thanks a lot for that info. The project has been asked so many times it's nice to be able to give an answer now!
