endpnt edited this page Jan 30, 2012 · 19 revisions

Ohai, welcome to Andoc!

We need developers! It's a big project and we would like to have a first real alpha working as soon as possible.

  • We are a strictly inclusive community. All genders, non-genders, aliens and so on are more than welcome.
  • Andoc is software, not a service.
  • We aim for a solid and sustainable community of developers, researchers, scientists and supporters.

Want to help?

  • start with the README
  • clone the code and follow the INSTALL.md
  • play around with the UI workflow and mark some text elements.
  • generate the graph visualisation as described in the INSTALL.md
  • have a look at the current tasks

Contact

@endpnt on Twitter or here.

Concept

The following graph shows the possible structure and concept of Andoc. (Sadly, it is already outdated: the triple concept and 4store will be abandoned.)

Concept Diagram

Workflow

  1. Apply structure information to the initial document. The start and end text positions of headers, body and lists are added as blocks.
  2. Apply semantic information to the structured document. Select start and end text positions to identify persons, places, dates and events.
  3. Finalize the document and mark it as completed.
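
The three steps above can be sketched as a small data structure. This is only an illustration, not Andoc's actual data model: the names `Span`, `AnnotatedDocument`, `add_structure`, `add_semantic` and `finalize` are hypothetical, and the sketch assumes end-exclusive character offsets into the plain document text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    """A marked region of the text; `end` is exclusive (assumption)."""
    start: int
    end: int
    label: str  # e.g. "header", "body", "list", "person", "place", "date"


@dataclass
class AnnotatedDocument:
    """Hypothetical container for the three workflow steps."""
    text: str
    structure: list = field(default_factory=list)  # step 1: structural blocks
    semantics: list = field(default_factory=list)  # step 2: semantic marks
    completed: bool = False                        # step 3: finalized flag

    def _make_span(self, start, end, label):
        # Refuse edits once the document is finalized.
        if self.completed:
            raise ValueError("document is already finalized")
        # Positions must describe a non-empty range inside the text.
        if not (0 <= start < end <= len(self.text)):
            raise ValueError("span lies outside the document text")
        return Span(start, end, label)

    def add_structure(self, start, end, label):
        span = self._make_span(start, end, label)
        self.structure.append(span)
        return span

    def add_semantic(self, start, end, label):
        span = self._make_span(start, end, label)
        self.semantics.append(span)
        return span

    def excerpt(self, span):
        # Return the text covered by a span.
        return self.text[span.start:span.end]

    def finalize(self):
        self.completed = True


# Usage: mark a header, then a person, then finalize.
doc = AnnotatedDocument("Letter\nAda met Charles in London in 1833.")
header = doc.add_structure(0, 6, "header")
person = doc.add_semantic(7, 10, "person")
doc.finalize()
```

Keeping structural and semantic spans in separate lists mirrors the two-pass workflow; a real implementation might instead store both kinds of mark in one collection with a type field.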

Tools

The diagram includes a couple of external tools that can be used in the overall setup.

  • igraph: for analyzing graph data (networks). (Good tutorial.)
  • Solr: a Java-based full-text search engine.
  • Redis: a fast in-memory key/value store with a publish/subscribe update model. This helps to distribute current changes for the web-based collaboration.
  • Pincaster: an in-memory database for geo-location data. It allows a fast nearby search.
  • 4store (abandoned): a database to store semantic RDF data (a triple store). It allows easy "mash-ups" of semantic information. Data can be accessed (or combined) using SPARQL.
  • CouchDB: an Erlang-based document store.
