Package org.apache.any23.extractor
Class ExtractionResultImpl
- java.lang.Object
  - org.apache.any23.extractor.ExtractionResultImpl
- All Implemented Interfaces:
- ExtractionResult, IssueReport, TagSoupExtractionResult
public class ExtractionResultImpl extends Object implements TagSoupExtractionResult
A default implementation of ExtractionResult; it receives extraction output from one Extractor working on one document, and passes the output on to a TripleHandler. It deals with details such as creation of ExtractionContext objects and closing any open contexts at the end of extraction.

The close() method must be invoked after the extractor has finished processing.

There is usually no need to provide additional implementations of the ExtractionWriter interface.
- Author:
- Richard Cyganiak (richard@cyganiak.de), Michele Mostarda (michele.mostarda@gmail.com)
- See Also:
- TripleHandler, ExtractionContext
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.any23.extractor.IssueReport
IssueReport.Issue, IssueReport.IssueLevel
-
Nested classes/interfaces inherited from interface org.apache.any23.extractor.TagSoupExtractionResult
TagSoupExtractionResult.PropertyPath, TagSoupExtractionResult.ResourceRoot
-
-
Constructor Summary
Constructors
ExtractionResultImpl(ExtractionContext context, Extractor<?> extractor, TripleHandler tripleHandler)
-
Method Summary
Modifier and type, method, and description:
void addPropertyPath(Class<? extends MicroformatExtractor> extractor, org.eclipse.rdf4j.model.Resource propertySubject, org.eclipse.rdf4j.model.Resource property, org.eclipse.rdf4j.model.BNode object, String[] path)
    Adds a property path to the list of the extracted data.
void addResourceRoot(String[] path, org.eclipse.rdf4j.model.Resource root, Class<? extends MicroformatExtractor> extractor)
    Adds a root property to the extraction result, specifying also the path corresponding to the root of data which generated the property and the extractor responsible for such addition.
void close()
    Close the result.
ExtractionContext getExtractionContext()
Collection<IssueReport.Issue> getIssues()
    Returns all the collected issues.
int getIssuesCount()
List<TagSoupExtractionResult.PropertyPath> getPropertyPaths()
    Returns all the collected property paths.
List<TagSoupExtractionResult.ResourceRoot> getResourceRoots()
    Returns all the collected property roots.
boolean hasIssues()
void notifyIssue(IssueReport.IssueLevel level, String msg, long row, long col)
    Notifies that an issue occurred while performing an extraction on an input stream.
ExtractionResult openSubResult(ExtractionContext context)
    Open a result nested in the current one.
void printReport(PrintStream ps)
    Prints out the content of the report.
String toString()
void writeNamespace(String prefix, String uri)
    Write a namespace.
void writeTriple(org.eclipse.rdf4j.model.Resource s, org.eclipse.rdf4j.model.IRI p, org.eclipse.rdf4j.model.Value o)
    Write a triple.
void writeTriple(org.eclipse.rdf4j.model.Resource s, org.eclipse.rdf4j.model.IRI p, org.eclipse.rdf4j.model.Value o, org.eclipse.rdf4j.model.IRI g)
    Writes a triple.
-
-
-
Constructor Detail
-
ExtractionResultImpl
public ExtractionResultImpl(ExtractionContext context, Extractor<?> extractor, TripleHandler tripleHandler)
-
-
Method Detail
-
hasIssues
public boolean hasIssues()
-
getIssuesCount
public int getIssuesCount()
-
printReport
public void printReport(PrintStream ps)
Description copied from interface: IssueReport
Prints out the content of the report.
- Specified by:
- printReport in interface IssueReport
- Parameters:
- ps - a PrintStream to use for generating the report.
-
getIssues
public Collection<IssueReport.Issue> getIssues()
Description copied from interface: IssueReport
Returns all the collected issues.
- Specified by:
- getIssues in interface IssueReport
- Returns:
- a collection of IssueReport.Issue objects.
-
openSubResult
public ExtractionResult openSubResult(ExtractionContext context)
Description copied from interface: ExtractionResult
Open a result nested in the current one.
- Specified by:
- openSubResult in interface ExtractionResult
- Parameters:
- context - the context to be used to open the sub result.
- Returns:
- the instance of the nested extraction result.
-
getExtractionContext
public ExtractionContext getExtractionContext()
-
writeTriple
public void writeTriple(org.eclipse.rdf4j.model.Resource s, org.eclipse.rdf4j.model.IRI p, org.eclipse.rdf4j.model.Value o, org.eclipse.rdf4j.model.IRI g)
Description copied from interface: ExtractionResult
Writes a triple. Parameters can be null, in which case the triple is silently ignored.
- Specified by:
- writeTriple in interface ExtractionResult
- Parameters:
- s - subject
- p - predicate
- o - object
- g - graph
-
writeTriple
public void writeTriple(org.eclipse.rdf4j.model.Resource s, org.eclipse.rdf4j.model.IRI p, org.eclipse.rdf4j.model.Value o)
Description copied from interface: ExtractionResult
Write a triple. Parameters can be null, in which case the triple is silently ignored.
- Specified by:
- writeTriple in interface ExtractionResult
- Parameters:
- s - subject
- p - predicate
- o - object
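
The null-handling contract described above can be illustrated with a minimal, self-contained sketch. It does not use the Any23 or RDF4J types: `NullTolerantTripleSink` is a hypothetical stand-in, plain strings replace `Resource`, `IRI` and `Value`, and the rule that any null term causes the triple to be dropped is an assumption about how "silently ignored" is meant.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a triple sink; not part of Any23. */
class NullTolerantTripleSink {
    // Each accepted triple is stored as {subject, predicate, object}.
    private final List<String[]> triples = new ArrayList<>();

    /** Records the triple, or silently drops it if any term is null (assumed rule). */
    void writeTriple(String s, String p, String o) {
        if (s == null || p == null || o == null) {
            return; // no exception and no log entry: the call is a no-op
        }
        triples.add(new String[] {s, p, o});
    }

    /** Number of triples that were actually accepted. */
    int size() {
        return triples.size();
    }
}
```

With this shape, an extractor can pass along values that may be missing without guarding every call site; the cost is that dropped triples leave no trace.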
-
writeNamespace
public void writeNamespace(String prefix, String uri)
Description copied from interface: ExtractionResult
Write a namespace.
- Specified by:
- writeNamespace in interface ExtractionResult
- Parameters:
- prefix - the prefix of the namespace
- uri - the long IRI identifying the namespace
-
notifyIssue
public void notifyIssue(IssueReport.IssueLevel level, String msg, long row, long col)
Description copied from interface: IssueReport
Notifies that an issue occurred while performing an extraction on an input stream.
- Specified by:
- notifyIssue in interface IssueReport
- Parameters:
- level - issue level.
- msg - issue message.
- row - issue row.
- col - issue column.
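
How the issue-reporting methods (notifyIssue, hasIssues, getIssuesCount, getIssues, printReport) fit together can be sketched with a small stand-alone collector. Everything here is hypothetical: `SimpleIssueReport`, its `Level` enum and the report line format are illustrative assumptions, not the Any23 implementation.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical issue collector; names and output format are illustrative only. */
class SimpleIssueReport {
    enum Level { WARNING, ERROR }

    /** Immutable record of one reported issue and its position in the input. */
    static final class Issue {
        final Level level;
        final String message;
        final long row;
        final long col;

        Issue(Level level, String message, long row, long col) {
            this.level = level;
            this.message = message;
            this.row = row;
            this.col = col;
        }
    }

    private final List<Issue> issues = new ArrayList<>();

    /** Stores an issue together with the row and column where it was detected. */
    void notifyIssue(Level level, String msg, long row, long col) {
        issues.add(new Issue(level, msg, row, col));
    }

    boolean hasIssues() {
        return !issues.isEmpty();
    }

    int getIssuesCount() {
        return issues.size();
    }

    /** Returns a read-only view so callers cannot alter the report. */
    Collection<Issue> getIssues() {
        return Collections.unmodifiableList(issues);
    }

    /** Writes one line per issue to the given stream. */
    void printReport(PrintStream ps) {
        for (Issue issue : issues) {
            ps.println(issue.level + " (" + issue.row + "," + issue.col + "): " + issue.message);
        }
    }
}
```

A caller would typically check hasIssues() after extraction and only then call printReport, for example with `System.err` as the target stream.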
-
close
public void close()
Description copied from interface: ExtractionResult
Close the result.
Extractors should close their results as soon as possible, but they do not have to; the environment will close any remaining ones. Implementations should be robust against multiple close() invocations.
- Specified by:
- close in interface ExtractionResult
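
One common way to make close() robust against repeated invocations is a guard flag. The sketch below shows that pattern on a hypothetical `IdempotentResult` class; it is not the Any23 code, and the counter merely stands in for whatever cleanup (such as closing open contexts) a real implementation performs.

```java
/** Hypothetical result holder illustrating an idempotent close(); not Any23 code. */
class IdempotentResult implements AutoCloseable {
    private boolean closed = false;
    private int cleanupRuns = 0;

    @Override
    public void close() {
        if (closed) {
            return; // second and later calls are no-ops
        }
        closed = true;
        cleanupRuns++; // placeholder for the real cleanup work
    }

    boolean isClosed() {
        return closed;
    }

    /** How many times the cleanup work actually ran. */
    int cleanupRuns() {
        return cleanupRuns;
    }
}
```

Because the cleanup runs at most once, both the extractor and the surrounding environment can call close() without coordinating with each other.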
-
addResourceRoot
public void addResourceRoot(String[] path, org.eclipse.rdf4j.model.Resource root, Class<? extends MicroformatExtractor> extractor)
Description copied from interface: TagSoupExtractionResult
Adds a root property to the extraction result, specifying also the path corresponding to the root of data which generated the property and the extractor responsible for such addition.
- Specified by:
- addResourceRoot in interface TagSoupExtractionResult
- Parameters:
- path - the path from the document root to the local root of the data generating the property.
- root - the property root node.
- extractor - the extractor responsible for such extraction.
-
getResourceRoots
public List<TagSoupExtractionResult.ResourceRoot> getResourceRoots()
Description copied from interface: TagSoupExtractionResult
Returns all the collected property roots.
- Specified by:
- getResourceRoots in interface TagSoupExtractionResult
- Returns:
- an unmodifiable list of TagSoupExtractionResult.ResourceRoot objects.
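
What "unmodifiable list" means for a caller can be shown with the standard `Collections.unmodifiableList` wrapper. The `RootRegistry` class below is a hypothetical stand-in that stores strings instead of ResourceRoot objects; whether Any23 uses this exact wrapper internally is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical registry exposing its contents as a read-only list; not Any23 code. */
class RootRegistry {
    private final List<String> roots = new ArrayList<>();

    /** Adds a root through the owning object, the only supported way to mutate. */
    void addRoot(String root) {
        roots.add(root);
    }

    /** Returns a read-only view; mutating it throws UnsupportedOperationException. */
    List<String> getRoots() {
        return Collections.unmodifiableList(roots);
    }
}
```

Callers can iterate and read the returned list freely, but attempts to add or remove elements fail, so the collected roots can only change through the registry itself.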
-
addPropertyPath
public void addPropertyPath(Class<? extends MicroformatExtractor> extractor, org.eclipse.rdf4j.model.Resource propertySubject, org.eclipse.rdf4j.model.Resource property, org.eclipse.rdf4j.model.BNode object, String[] path)
Description copied from interface: TagSoupExtractionResult
Adds a property path to the list of the extracted data.
- Specified by:
- addPropertyPath in interface TagSoupExtractionResult
- Parameters:
- extractor - the identifier of the extractor responsible for retrieving such property.
- propertySubject - the subject of the property.
- property - the property IRI.
- object - the property object if any, null otherwise.
- path - the path of the HTML node from which the property literal has been extracted.
-
getPropertyPaths
public List<TagSoupExtractionResult.PropertyPath> getPropertyPaths()
Description copied from interface: TagSoupExtractionResult
Returns all the collected property paths.
- Specified by:
- getPropertyPaths in interface TagSoupExtractionResult
- Returns:
- a valid list of property paths.
-
-