René Witte edited this page Oct 7, 2017 · 3 revisions

TextMining-LODeXporter Wiki

The LODeXporter is a GATE component that can export NLP annotations directly to a triplestore, with configurable vocabularies, for use in LOD applications.

Compiling

To compile LODeXporter, you need a JDK (version 8 or later), Apache Ant, and Apache Ivy. The default ant target builds the plugin (LODeXporter.jar) and generates the Javadoc documentation. The ant test target runs the JUnit tests.

Running

Load the LODeXporter processing resource (PR) through GATE's plugin manager (for details on working with GATE PRs, please refer to the GATE User Guide). When creating a new LODeXporter PR, you can set the following parameters:

  • Initialization Parameters:
    • mappingFile: The file containing the RDF mapping rules for transforming annotations to triples
    • rdfStoreDir: The directory of the TDB triplestore containing the RDF mapping rules for transforming annotations to triples
      • Note: you cannot set both mappingFile and rdfStoreDir. The first is for file-based rule import and triple export, the second for TDB-based rule import and triple export.
    • subjectMappingSparql: The SPARQL query applied on the mapping rules to obtain all rules used for exporting GATE annotation types to triple subjects
    • propertyMappingSparql: The SPARQL query applied on the mapping rules to obtain all rules used for exporting GATE annotation features to triple properties and objects/literals
    • relationMappingSparql: The SPARQL query applied on the mapping rules to obtain all rules used for exporting relations between GATE annotations
  • Run-Time Parameters:
    • inputASName: The name of the AnnotationSet to export
    • customURI: customURI
    • exportFilePath: The output directory for the generated triple files in N-Quads format (".nq"); used only for file-based export, not for TDB-based export (see rdfStoreDir above)
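
The rule that mappingFile and rdfStoreDir are mutually exclusive can be illustrated with a small check. This is only a sketch in Python, not part of LODeXporter: the function name export_mode, the dictionary representation of the parameters, and the example values are assumptions made for illustration. The behaviour when neither parameter is set is not described on this page, so the sketch simply returns None in that case.

```python
def export_mode(init_params):
    """Return "file" or "tdb" depending on which initialization parameter is set.

    Hypothetical helper: it mirrors the documented constraint that
    mappingFile and rdfStoreDir cannot both be set.
    """
    has_file = bool(init_params.get("mappingFile"))
    has_store = bool(init_params.get("rdfStoreDir"))
    if has_file and has_store:
        raise ValueError("set either mappingFile or rdfStoreDir, not both")
    if has_file:
        return "file"   # file-based rule import and triple export
    if has_store:
        return "tdb"    # TDB-based rule import and triple export
    return None         # neither set: behaviour not specified on this page


# Example values are made up for illustration.
print(export_mode({"mappingFile": "mapping.rdf"}))
print(export_mode({"rdfStoreDir": "/data/tdb"}))
```

In file mode, exportFilePath determines where the N-Quads files are written; in TDB mode it is not used.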

Quick Start Guide

For a basic example of using LODeXporter to generate triples from a document, start by loading GATE's ANNIE pipeline. Then load the LODeXporter PR and add it to the end of the pipeline, keeping the provided default parameter values. It will export some of the annotations in the text (e.g., Person) to the output triple file (/tmp/<sessionID>.nq by default).
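
To check what was exported, it can help to count the triples in the output file by predicate. The following is a minimal sketch in Python using only the standard library. It uses a simplified N-Quads tokenizer (one statement per line, no trailing comments) rather than a full parser, and the IRIs in the sample data are invented for illustration; the actual IRIs depend on your mapping rules.

```python
import re
from collections import Counter

# Simplified N-Quads term pattern: an IRI, a blank node, or a literal
# with an optional datatype or language tag.
TERM = re.compile(
    r'<[^>]*>'
    r'|_:\S+'
    r'|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?'
)


def parse_nquads(lines):
    """Yield (subject, predicate, object, graph) tuples.

    graph is None when a statement has no graph label.
    Blank lines and lines starting with '#' are skipped.
    """
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not line.endswith("."):
            raise ValueError(f"statement does not end with '.': {line!r}")
        terms = TERM.findall(line[:-1])
        if len(terms) not in (3, 4):
            raise ValueError(f"expected 3 or 4 terms: {line!r}")
        graph = terms[3] if len(terms) == 4 else None
        yield terms[0], terms[1], terms[2], graph


def count_predicates(lines):
    """Count statements per predicate IRI."""
    return Counter(p for _, p, _, _ in parse_nquads(lines))


# Invented sample data; real output depends on the mapping rules in use.
sample = [
    '<http://example.org/doc1#Person_1> '
    '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> '
    '<http://xmlns.com/foaf/0.1/Person> <http://example.org/graph> .',
    '<http://example.org/doc1#Person_1> '
    '<http://www.w3.org/2000/01/rdf-schema#label> '
    '"John Smith" <http://example.org/graph> .',
]

for predicate, n in count_predicates(sample).items():
    print(n, predicate)
```

To run this on real output, open the generated .nq file with `open(path, encoding="utf-8")` and pass the file object to count_predicates in place of sample.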
