Package org.apache.any23.extractor.rdf
Class BaseRDFExtractor
- java.lang.Object
  - org.apache.any23.extractor.rdf.BaseRDFExtractor
- All Implemented Interfaces:
Extractor<InputStream>, Extractor.ContentExtractor
- Direct Known Subclasses:
FunctionalSyntaxExtractor, JSONLDExtractor, ManchesterSyntaxExtractor, NQuadsExtractor, NTriplesExtractor, RDFa11Extractor, RDFaExtractor, RDFXMLExtractor, TriXExtractor, TurtleExtractor
public abstract class BaseRDFExtractor extends Object implements Extractor.ContentExtractor
Base class for a generic RDF Extractor.ContentExtractor.
- Author:
- Michele Mostarda (mostarda@fbk.eu), Hans Brende (hansbrende@apache.org)
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.any23.extractor.Extractor
Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor
-
Constructor Summary
Constructors
Constructor and Description
BaseRDFExtractor()
BaseRDFExtractor(boolean verifyDataType, boolean stopAtFirstError)
    Constructor that sets the validation and error-handling policies.
-
Method Summary
Modifier and Type, Method, and Description
protected abstract org.eclipse.rdf4j.rio.RDFParser getParser(ExtractionContext extractionContext, ExtractionResult extractionResult)
boolean isStopAtFirstError()
boolean isVerifyDataType()
void run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, InputStream in, ExtractionResult extractionResult)
    Executes the extractor.
void setStopAtFirstError(boolean b)
    If true, the extractor will stop at the first parsing error; if false, the extractor will attempt to ignore all parsing errors.
void setVerifyDataType(boolean verifyDataType)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.any23.extractor.Extractor
getDescription
-
Constructor Detail
-
BaseRDFExtractor
public BaseRDFExtractor()
-
BaseRDFExtractor
public BaseRDFExtractor(boolean verifyDataType, boolean stopAtFirstError)
Constructor that sets the validation and error-handling policies.
- Parameters:
verifyDataType - if true, data types will be verified; if false, they will be ignored.
stopAtFirstError - if true, the parser will stop at the first parsing error; if false, it will ignore non-blocking errors.
-
Method Detail
-
getParser
protected abstract org.eclipse.rdf4j.rio.RDFParser getParser(ExtractionContext extractionContext, ExtractionResult extractionResult)
-
isVerifyDataType
public boolean isVerifyDataType()
-
setVerifyDataType
public void setVerifyDataType(boolean verifyDataType)
-
isStopAtFirstError
public boolean isStopAtFirstError()
-
setStopAtFirstError
public void setStopAtFirstError(boolean b)
Description copied from interface: Extractor.ContentExtractor
If true, the extractor will stop at the first parsing error; if false, the extractor will attempt to ignore all parsing errors.
- Specified by:
setStopAtFirstError in interface Extractor.ContentExtractor
- Parameters:
b - tolerance flag.
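The difference between the two settings of the tolerance flag can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: StopAtFirstErrorSketch, its toy validity check, and the use of IllegalStateException are hypothetical stand-ins and are not part of Any23 or RDF4J. It shows only the general idea of aborting at the first malformed statement versus skipping it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in (not Any23 code): models only the stop-at-first-error flag.
class StopAtFirstErrorSketch {

    // Toy validity rule, assumed for illustration: three terms followed by a final ".".
    static boolean isValid(String line) {
        String trimmed = line.trim();
        return trimmed.endsWith(".") && trimmed.split("\\s+").length == 4;
    }

    // Returns the accepted statements. With stopAtFirstError set, the first
    // malformed line aborts processing; otherwise malformed lines are skipped.
    static List<String> parse(List<String> lines, boolean stopAtFirstError) {
        List<String> accepted = new ArrayList<>();
        for (String line : lines) {
            if (isValid(line)) {
                accepted.add(line);
            } else if (stopAtFirstError) {
                // Stand-in for the parse failure a real extractor would report.
                throw new IllegalStateException("Parse error: " + line);
            }
            // Tolerant mode: fall through and continue with the next line.
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> input = List.of("<s> <p> <o> .", "garbage", "<s> <p> <o2> .");
        System.out.println("tolerant mode kept: " + parse(input, false));
        try {
            parse(input, true);
        } catch (IllegalStateException e) {
            System.out.println("strict mode stopped: " + e.getMessage());
        }
    }
}
```

In this model, strict mode discards everything once an error is hit, while tolerant mode keeps the statements on either side of the bad line. Whether a concrete extractor behaves exactly this way depends on its underlying parser.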
-
run
public void run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, InputStream in, ExtractionResult extractionResult) throws IOException, ExtractionException
Description copied from interface: Extractor
Executes the extractor. It will be invoked only once; extractors are not reusable.
- Specified by:
run in interface Extractor<InputStream>
- Parameters:
extractionParameters - the parameters to be applied during the extraction.
extractionContext - the document context.
in - the extractor input data.
extractionResult - the collector for the extracted data.
- Throws:
IOException - on error while reading from the input stream.
ExtractionException - on other errors, such as parse errors.
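Because an extractor is documented as single-use, a caller that processes several documents would typically create a fresh instance for each one and handle read failures separately from extraction failures. The following is a minimal sketch of that calling pattern, not Any23 code: OneShotExtractor, SketchExtractionException, toyExtractor, and extractAll are hypothetical stand-ins introduced for illustration, and the real run method takes additional parameters (extraction parameters, context, and result collector) that are omitted here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch (not Any23 code) of the "one extractor per document" pattern.
class FreshExtractorPerDocumentSketch {

    // Stand-in for an extraction-level failure such as a parse error.
    static class SketchExtractionException extends Exception {
        SketchExtractionException(String message) {
            super(message);
        }
    }

    // Stand-in for a single-use extractor; the real interface has more parameters.
    interface OneShotExtractor {
        void run(InputStream in) throws IOException, SketchExtractionException;
    }

    // Runs one fresh extractor per document and returns how many succeeded.
    static int extractAll(List<byte[]> documents, Supplier<OneShotExtractor> factory) {
        int succeeded = 0;
        for (byte[] document : documents) {
            OneShotExtractor extractor = factory.get(); // never reused across documents
            try (InputStream in = new ByteArrayInputStream(document)) {
                extractor.run(in);
                succeeded++;
            } catch (IOException e) {
                System.err.println("read failure: " + e.getMessage());
            } catch (SketchExtractionException e) {
                System.err.println("extraction failure: " + e.getMessage());
            }
        }
        return succeeded;
    }

    // Toy extractor, assumed for illustration: rejects empty input.
    static OneShotExtractor toyExtractor() {
        return in -> {
            if (in.read() == -1) {
                throw new SketchExtractionException("empty document");
            }
        };
    }

    public static void main(String[] args) {
        List<byte[]> docs = List.of("a".getBytes(StandardCharsets.UTF_8), new byte[0]);
        int ok = extractAll(docs, FreshExtractorPerDocumentSketch::toyExtractor);
        System.out.println("documents extracted without error: " + ok);
    }
}
```

Passing a factory rather than an instance keeps the single-use constraint visible at the call site: the loop cannot accidentally invoke run twice on the same object.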
-