Package org.apache.any23.extractor
Class ExtractionResultImpl
- java.lang.Object
  - org.apache.any23.extractor.ExtractionResultImpl
- All Implemented Interfaces:
ExtractionResult, IssueReport, TagSoupExtractionResult
public class ExtractionResultImpl extends Object implements TagSoupExtractionResult
A default implementation of ExtractionResult; it receives extraction output from one Extractor working on one document, and passes the output on to a TripleHandler. It deals with details such as creation of ExtractionContext objects and closing any open contexts at the end of extraction.

The close() method must be invoked after the extractor has finished processing.

There is usually no need to provide additional implementations of the ExtractionWriter interface.
- Author:
- Richard Cyganiak (richard@cyganiak.de), Michele Mostarda (michele.mostarda@gmail.com)
- See Also:
TripleHandler, ExtractionContext
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.any23.extractor.IssueReport
IssueReport.Issue, IssueReport.IssueLevel
-
Nested classes/interfaces inherited from interface org.apache.any23.extractor.TagSoupExtractionResult
TagSoupExtractionResult.PropertyPath, TagSoupExtractionResult.ResourceRoot
-
-
Constructor Summary
Constructors
ExtractionResultImpl(ExtractionContext context, Extractor<?> extractor, TripleHandler tripleHandler)
-
Method Summary
Modifier and Type, Method, Description
void addPropertyPath(Class<? extends MicroformatExtractor> extractor, org.eclipse.rdf4j.model.Resource propertySubject, org.eclipse.rdf4j.model.Resource property, org.eclipse.rdf4j.model.BNode object, String[] path)
    Adds a property path to the list of the extracted data.
void addResourceRoot(String[] path, org.eclipse.rdf4j.model.Resource root, Class<? extends MicroformatExtractor> extractor)
    Adds a root property to the extraction result, specifying also the path corresponding to the root of data which generated the property and the extractor responsible for such addition.
void close()
    Close the result.
ExtractionContext getExtractionContext()
Collection<IssueReport.Issue> getIssues()
    Returns all the collected issues.
int getIssuesCount()
List<TagSoupExtractionResult.PropertyPath> getPropertyPaths()
    Returns all the collected property paths.
List<TagSoupExtractionResult.ResourceRoot> getResourceRoots()
    Returns all the collected property roots.
boolean hasIssues()
void notifyIssue(IssueReport.IssueLevel level, String msg, long row, long col)
    Notifies an issue occurred while performing an extraction on an input stream.
ExtractionResult openSubResult(ExtractionContext context)
    Open a result nested in the current one.
void printReport(PrintStream ps)
    Prints out the content of the report.
String toString()
void writeNamespace(String prefix, String uri)
    Write a namespace.
void writeTriple(org.eclipse.rdf4j.model.Resource s, org.eclipse.rdf4j.model.IRI p, org.eclipse.rdf4j.model.Value o)
    Write a triple.
void writeTriple(org.eclipse.rdf4j.model.Resource s, org.eclipse.rdf4j.model.IRI p, org.eclipse.rdf4j.model.Value o, org.eclipse.rdf4j.model.IRI g)
    Writes a triple.
-
-
-
Constructor Detail
-
ExtractionResultImpl
public ExtractionResultImpl(ExtractionContext context, Extractor<?> extractor, TripleHandler tripleHandler)
-
-
Method Detail
-
hasIssues
public boolean hasIssues()
-
getIssuesCount
public int getIssuesCount()
-
printReport
public void printReport(PrintStream ps)
Description copied from interface: IssueReport
Prints out the content of the report.
- Specified by:
printReport in interface IssueReport
- Parameters:
ps - a PrintStream to use for generating the report.
-
getIssues
public Collection<IssueReport.Issue> getIssues()
Description copied from interface: IssueReport
Returns all the collected issues.
- Specified by:
getIssues in interface IssueReport
- Returns:
- a collection of IssueReport.Issues.
-
openSubResult
public ExtractionResult openSubResult(ExtractionContext context)
Description copied from interface: ExtractionResult
Open a result nested in the current one.
- Specified by:
openSubResult in interface ExtractionResult
- Parameters:
context - the context to be used to open the sub result.
- Returns:
- the instance of the nested extraction result.
-
getExtractionContext
public ExtractionContext getExtractionContext()
-
writeTriple
public void writeTriple(org.eclipse.rdf4j.model.Resource s, org.eclipse.rdf4j.model.IRI p, org.eclipse.rdf4j.model.Value o, org.eclipse.rdf4j.model.IRI g)
Description copied from interface: ExtractionResult
Writes a triple. Parameters can be null, in which case the triple is silently ignored.
- Specified by:
writeTriple in interface ExtractionResult
- Parameters:
s - subject
p - predicate
o - object
g - graph
-
writeTriple
public void writeTriple(org.eclipse.rdf4j.model.Resource s, org.eclipse.rdf4j.model.IRI p, org.eclipse.rdf4j.model.Value o)
Description copied from interface: ExtractionResult
Writes a triple. Parameters can be null, in which case the triple is silently ignored.
- Specified by:
writeTriple in interface ExtractionResult
- Parameters:
s - subject
p - predicate
o - object
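
The null-handling contract described for writeTriple can be illustrated with a small stand-in. This is a minimal sketch, not the Any23 implementation: NullTolerantWriterSketch and RecordingSink are hypothetical names, and plain strings replace the RDF4J Resource, IRI, and Value types so that the example runs without any library on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in sketch; none of these types belong to Any23 or RDF4J. */
public class NullTolerantWriterSketch {

    /** Hypothetical sink that records statements as plain strings. */
    static final class RecordingSink {
        final List<String> statements = new ArrayList<>();

        // Mirrors the documented contract: a null subject, predicate, or
        // object causes the triple to be dropped without an exception.
        void writeTriple(String s, String p, String o) {
            if (s == null || p == null || o == null) {
                return;
            }
            statements.add(s + " " + p + " " + o);
        }
    }

    public static void main(String[] args) {
        RecordingSink sink = new RecordingSink();
        sink.writeTriple("ex:doc", "dc:title", "\"Hello\"");
        sink.writeTriple("ex:doc", null, "\"ignored\""); // silently skipped
        System.out.println(sink.statements.size()); // prints 1
    }
}
```

The practical consequence for extractor authors is that a failed lookup producing null does not need a guard at every call site, at the cost of missing triples going unreported.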
-
writeNamespace
public void writeNamespace(String prefix, String uri)
Description copied from interface: ExtractionResult
Write a namespace.
- Specified by:
writeNamespace in interface ExtractionResult
- Parameters:
prefix - the prefix of the namespace
uri - the long IRI identifying the namespace
-
notifyIssue
public void notifyIssue(IssueReport.IssueLevel level, String msg, long row, long col)
Description copied from interface: IssueReport
Notifies an issue that occurred while performing an extraction on an input stream.
- Specified by:
notifyIssue in interface IssueReport
- Parameters:
level - issue level.
msg - issue message.
row - issue row.
col - issue column.
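
The issue-reporting methods (notifyIssue, hasIssues, getIssuesCount, getIssues, printReport) work together as a small collector. The following is a minimal sketch of that pattern under stated assumptions: IssueReportSketch, Report, Issue, and Level are hypothetical stand-ins, the enum constants are illustrative, and the report layout printed here is invented rather than taken from Any23.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Stand-in sketch of an issue collector; not the Any23 classes. */
public class IssueReportSketch {

    /** Hypothetical severity levels, for illustration only. */
    enum Level { WARNING, ERROR, FATAL }

    /** Immutable record of one reported issue. */
    static final class Issue {
        final Level level;
        final String message;
        final long row;
        final long col;

        Issue(Level level, String message, long row, long col) {
            this.level = level;
            this.message = message;
            this.row = row;
            this.col = col;
        }

        @Override
        public String toString() {
            return level + ": '" + message + "' (" + row + "," + col + ")";
        }
    }

    static final class Report {
        private final List<Issue> issues = new ArrayList<>();

        void notifyIssue(Level level, String msg, long row, long col) {
            issues.add(new Issue(level, msg, row, col));
        }

        boolean hasIssues() {
            return !issues.isEmpty();
        }

        int getIssuesCount() {
            return issues.size();
        }

        // Callers get a read-only view so they cannot alter the report.
        Collection<Issue> getIssues() {
            return Collections.unmodifiableList(issues);
        }

        // Invented layout: a count line followed by one line per issue.
        void printReport(PrintStream ps) {
            ps.println("Issues: " + issues.size());
            for (Issue issue : issues) {
                ps.println("  " + issue);
            }
            ps.flush();
        }
    }

    public static void main(String[] args) {
        Report report = new Report();
        report.notifyIssue(Level.WARNING, "unclosed tag", 12, 4);
        report.printReport(System.out);
    }
}
```

A caller would typically check hasIssues() after extraction and only then print or inspect the collected entries.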
-
close
public void close()
Description copied from interface: ExtractionResult
Close the result.
Extractors should close their results as soon as possible, but are not required to; the environment will close any remaining ones. Implementations should be robust against multiple close() invocations.
- Specified by:
close in interface ExtractionResult
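
One way to satisfy the requirement that close() tolerate repeated invocations is a closed flag checked on entry. The sketch below also closes any nested results that are still open, which is one reading of "the environment will close any remaining ones"; that cascading behaviour is an assumption of this example. IdempotentCloseSketch and Result are hypothetical names, and the shared log list exists only to make the order of events visible.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in sketch of an idempotent, cascading close; not the Any23 class. */
public class IdempotentCloseSketch {

    static final class Result {
        private final String name;
        private final List<String> log;
        private final List<Result> subResults = new ArrayList<>();
        private boolean closed = false;

        Result(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        // Opens a nested result and remembers it so close() can reach it.
        Result openSubResult(String subName) {
            Result sub = new Result(subName, log);
            subResults.add(sub);
            return sub;
        }

        void close() {
            if (closed) {
                return; // second and later calls are no-ops
            }
            closed = true;
            for (Result sub : subResults) {
                sub.close(); // already-closed children ignore this
            }
            log.add("closed " + name);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Result root = new Result("root", log);
        Result nested = root.openSubResult("nested");
        nested.close();
        root.close();
        root.close(); // harmless repeat
        System.out.println(log); // prints [closed nested, closed root]
    }
}
```

Setting the flag before doing any work keeps the method safe even if closing a child triggers another close() call on the parent.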
-
addResourceRoot
public void addResourceRoot(String[] path, org.eclipse.rdf4j.model.Resource root, Class<? extends MicroformatExtractor> extractor)
Description copied from interface: TagSoupExtractionResult
Adds a root property to the extraction result, specifying also the path corresponding to the root of data which generated the property and the extractor responsible for such addition.
- Specified by:
addResourceRoot in interface TagSoupExtractionResult
- Parameters:
path - the path from the document root to the local root of the data generating the property.
root - the property root node.
extractor - the extractor responsible for such extraction.
-
getResourceRoots
public List<TagSoupExtractionResult.ResourceRoot> getResourceRoots()
Description copied from interface: TagSoupExtractionResult
Returns all the collected property roots.
- Specified by:
getResourceRoots in interface TagSoupExtractionResult
- Returns:
- an unmodifiable list of TagSoupExtractionResult.ResourceRoots.
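
Because the returned list is documented as unmodifiable, callers should treat it as a read-only view. The sketch below shows what that means in practice using java.util.Collections.unmodifiableList; whether Any23 uses this exact call is an assumption, and UnmodifiableRootsSketch and RootRegistry are hypothetical names with strings standing in for ResourceRoot objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Stand-in sketch of a read-only accessor; not the Any23 class. */
public class UnmodifiableRootsSketch {

    static final class RootRegistry {
        private final List<String> roots = new ArrayList<>();

        void addResourceRoot(String root) {
            roots.add(root);
        }

        // Returns a view: reads reflect later additions, writes are rejected.
        List<String> getResourceRoots() {
            return Collections.unmodifiableList(roots);
        }
    }

    public static void main(String[] args) {
        RootRegistry registry = new RootRegistry();
        registry.addResourceRoot("_:hcard1");
        List<String> view = registry.getResourceRoots();
        try {
            view.add("_:rogue");
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only view");
        }
        System.out.println(view.size()); // prints 1
    }
}
```

Callers that need to sort or filter the roots should copy the list first, for example with new ArrayList<>(view).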
-
addPropertyPath
public void addPropertyPath(Class<? extends MicroformatExtractor> extractor, org.eclipse.rdf4j.model.Resource propertySubject, org.eclipse.rdf4j.model.Resource property, org.eclipse.rdf4j.model.BNode object, String[] path)
Description copied from interface: TagSoupExtractionResult
Adds a property path to the list of the extracted data.
- Specified by:
addPropertyPath in interface TagSoupExtractionResult
- Parameters:
extractor - the identifier of the extractor responsible for retrieving such property.
propertySubject - the subject of the property.
property - the property IRI.
object - the property object if any, null otherwise.
path - the path of the HTML node from which the property literal has been extracted.
-
getPropertyPaths
public List<TagSoupExtractionResult.PropertyPath> getPropertyPaths()
Description copied from interface: TagSoupExtractionResult
Returns all the collected property paths.
- Specified by:
getPropertyPaths in interface TagSoupExtractionResult
- Returns:
- a valid list of property paths.
-
-