Package org.apache.any23.extractor
Interface Extractor<Input>
Type Parameters:
- Input - the type of the input data to be processed.
All Known Subinterfaces:
- Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor
All Known Implementing Classes:
- AdrExtractor, BaseRDFExtractor, CSVExtractor, EmbeddedJSONLDExtractor, EntityBasedMicroformatExtractor, FunctionalSyntaxExtractor, GeoExtractor, HAdrExtractor, HCalendarExtractor, HCardExtractor, HCardExtractor, HeadLinkExtractor, HEntryExtractor, HEventExtractor, HGeoExtractor, HItemExtractor, HListingExtractor, HProductExtractor, HRecipeExtractor, HRecipeExtractor, HResumeExtractor, HResumeExtractor, HReviewAggregateExtractor, HReviewExtractor, HTMLMetaExtractor, ICalExtractor, ICBMExtractor, JCalExtractor, JSONLDExtractor, LicenseExtractor, ManchesterSyntaxExtractor, MicrodataExtractor, MicroformatExtractor, NQuadsExtractor, NTriplesExtractor, RDFa11Extractor, RDFaExtractor, RDFXMLExtractor, SpeciesExtractor, TitleExtractor, TriXExtractor, TurtleExtractor, TurtleHTMLExtractor, XCalExtractor, XFNExtractor, XPathExtractor, YAMLExtractor
 
public interface Extractor<Input>
Defines the signature of a generic Extractor.

Nested Class Summary
Nested Classes (modifier and type, interface, description):
- static interface Extractor.BlindExtractor
- static interface Extractor.ContentExtractor - This interface specializes an Extractor able to handle InputStream as input format.
- static interface Extractor.TagSoupDOMExtractor
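
The nested interfaces listed above narrow the Input type parameter to one kind of input. The pattern can be sketched as follows, under loud assumptions: the `Extractor` and `ContentExtractor` types declared here are simplified local stand-ins with a one-argument `run`, not the Any23 interfaces, and `ByteCountingExtractor` is a hypothetical implementation invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class InputSpecializationSketch {

    /** Simplified stand-in for Extractor<Input>; the documented run(...) takes four arguments. */
    public interface Extractor<Input> {
        String run(Input in) throws IOException;
    }

    /** Stand-in sub-interface that fixes Input to InputStream. */
    public interface ContentExtractor extends Extractor<InputStream> {
    }

    /** Hypothetical implementation: reports how many bytes it read. */
    public static class ByteCountingExtractor implements ContentExtractor {
        @Override
        public String run(InputStream in) throws IOException {
            long count = 0;
            while (in.read() != -1) {
                count++;
            }
            return "read " + count + " bytes";
        }
    }

    /** Feeds a small in-memory stream to the extractor. */
    public static String demo() {
        ContentExtractor extractor = new ByteCountingExtractor();
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(data)) {
            return extractor.run(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the sub-interface pins the type argument, an implementing class receives an `InputStream` directly and needs no cast.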
Method Summary
Methods (modifier and type, method, description):
- ExtractorDescription getDescription() - Returns an ExtractorDescription of this extractor.
- void run(ExtractionParameters extractionParameters, ExtractionContext context, Input in, ExtractionResult out) - Executes the extractor.
 
Method Detail
run
void run(ExtractionParameters extractionParameters, ExtractionContext context, Input in, ExtractionResult out) throws IOException, ExtractionException
Executes the extractor. This method is invoked only once per instance; extractors are not reusable.
Parameters:
- extractionParameters - the parameters to be applied during the extraction.
- context - the document context.
- in - the extractor input data.
- out - the collector for the extracted data.
Throws:
- IOException - on error while reading from the input stream.
- ExtractionException - on other errors, such as parse errors.
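
The contract above (a single invocation per instance, IOException for read failures, ExtractionException for parse failures) can be illustrated with a minimal sketch. Everything except the shape of the two interface methods is an assumption: the parameter, context, result, exception and description classes below are simplified local stand-ins, not the Any23 classes, and `KeyValueExtractor`, its `used` guard, and the `extract` helper are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ExtractorRunSketch {

    // --- Simplified stand-ins, NOT the real Any23 classes ---
    public static class ExtractionParameters { }

    public static class ExtractionContext {
        final String documentUri;
        public ExtractionContext(String documentUri) { this.documentUri = documentUri; }
    }

    public static class ExtractionResult {
        final List<String> statements = new ArrayList<>();
        void write(String statement) { statements.add(statement); }
    }

    public static class ExtractionException extends Exception {
        public ExtractionException(String message) { super(message); }
    }

    public static class ExtractorDescription {
        final String name;
        public ExtractorDescription(String name) { this.name = name; }
    }

    /** Mirrors the two documented method signatures, using the stand-in types. */
    public interface Extractor<Input> {
        void run(ExtractionParameters extractionParameters, ExtractionContext context,
                 Input in, ExtractionResult out) throws IOException, ExtractionException;

        ExtractorDescription getDescription();
    }

    /** Hypothetical extractor: emits one statement per "key=value" line. */
    public static class KeyValueExtractor implements Extractor<InputStream> {
        private boolean used = false; // illustrative guard for the "not reusable" rule

        @Override
        public void run(ExtractionParameters extractionParameters, ExtractionContext context,
                        InputStream in, ExtractionResult out)
                throws IOException, ExtractionException {
            if (used) {
                throw new IllegalStateException("extractor already used; create a new instance");
            }
            used = true;
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) { // read failures surface as IOException
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue;
                }
                int eq = trimmed.indexOf('=');
                if (eq <= 0) { // parse failures surface as ExtractionException
                    throw new ExtractionException("not a key=value line: " + trimmed);
                }
                out.write(context.documentUri + " " + trimmed.substring(0, eq)
                        + " " + trimmed.substring(eq + 1));
            }
        }

        @Override
        public ExtractorDescription getDescription() {
            return new ExtractorDescription("key-value-sketch");
        }
    }

    /** Runs a fresh extractor over the text and returns the collected statements. */
    public static List<String> extract(String text) {
        Extractor<InputStream> extractor = new KeyValueExtractor(); // new instance per run
        ExtractionResult out = new ExtractionResult();
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        try {
            extractor.run(new ExtractionParameters(),
                    new ExtractionContext("http://example.org/doc"), in, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ExtractionException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
        return out.statements;
    }

    public static void main(String[] args) {
        System.out.println(extract("title=Hello\nlang=en\n"));
    }
}
```

In this sketch a caller builds a new extractor for every document rather than caching one, which is one way to respect the single-invocation rule.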
 
getDescription
ExtractorDescription getDescription()
Returns an ExtractorDescription of this extractor.
Returns:
- the object representing the extractor description.
 
 