Package org.apache.any23.extractor
This package contains classes and interfaces modeling the Extractor API.
Interface Summary
Interface: Description
ExtractionResult: Interface defining the methods that a representation of an extraction result must have.
Extractor<Input>: Defines the signature of a generic Extractor.
Extractor.BlindExtractor
Extractor.ContentExtractor: This interface specializes an Extractor able to handle InputStream as input format.
Extractor.TagSoupDOMExtractor
ExtractorDescription: Defines a minimal signature for an Extractor description.
ExtractorFactory<T extends Extractor<?>>: Interface defining a factory for Extractor.
ExtractorRegistry: An interface that enables a registry for extractors to be implemented by different implementors of this API.
IssueReport: This interface models an issue reporter.
TagSoupExtractionResult: This interface models a specific ExtractionResult able to collect property roots generated by HTML Microformat extractions.
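
The relationship between an extractor, its factory, and a registry of factories can be sketched as follows. This is only an illustration of the pattern: SketchExtractor, SketchExtractorFactory, SketchRegistry and TitleExtractor are hypothetical stand-ins invented for this example, and their method names and signatures are assumptions, not the actual Any23 interfaces.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExtractorRegistrySketch {

    /** Hypothetical stand-in for a generic extractor; NOT the Any23 Extractor interface. */
    interface SketchExtractor<I> {
        List<String> run(I input);
    }

    /** Hypothetical stand-in for a factory that names and creates one kind of extractor. */
    interface SketchExtractorFactory<T extends SketchExtractor<?>> {
        String name();
        T create();
    }

    /** Hypothetical registry keeping factories by name, in registration order. */
    static final class SketchRegistry {
        private final Map<String, SketchExtractorFactory<?>> factories = new LinkedHashMap<>();

        void register(SketchExtractorFactory<?> factory) {
            // Reject a second factory registered under an already used name.
            if (factories.putIfAbsent(factory.name(), factory) != null) {
                throw new IllegalArgumentException("Already registered: " + factory.name());
            }
        }

        List<String> names() {
            return new ArrayList<>(factories.keySet());
        }
    }

    /** Toy extractor: emits the text between <title> and </title>, if present. */
    static final class TitleExtractor implements SketchExtractor<String> {
        @Override
        public List<String> run(String input) {
            List<String> out = new ArrayList<>();
            int start = input.indexOf("<title>");
            int end = input.indexOf("</title>");
            if (start >= 0 && end > start) {
                out.add("title=" + input.substring(start + "<title>".length(), end).trim());
            }
            return out;
        }
    }

    static final class TitleExtractorFactory implements SketchExtractorFactory<TitleExtractor> {
        @Override
        public String name() { return "title"; }

        @Override
        public TitleExtractor create() { return new TitleExtractor(); }
    }

    /** Registers one factory, then runs the extractor it creates on a small document. */
    static List<String> demo() {
        SketchRegistry registry = new SketchRegistry();
        TitleExtractorFactory factory = new TitleExtractorFactory();
        registry.register(factory);

        List<String> result = new ArrayList<>(registry.names());
        result.addAll(factory.create().run("<html><title>Hello</title></html>"));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keeping creation behind a factory lets a registry list and select extractors by name without instantiating them, which is one plausible reason for separating the two roles.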
Class Summary
Class: Description
ExampleInputOutput: A reporter for example input and output of an extractor.
ExtractionContext: This class provides the context for the processing of a single Extractor.
ExtractionParameters: This class models the parameters to be used to perform an extraction.
ExtractionResultImpl: A default implementation of ExtractionResult; it receives extraction output from one Extractor working on one document, and passes the output on to a TripleHandler.
ExtractorGroup: Models a group of ExtractorFactory instances, providing simple access methods.
ExtractorRegistryImpl: Singleton class acting as a registry for all the various Extractor implementations.
IssueReport.Issue: This class defines a generic issue traced by this extraction result.
SimpleExtractorFactory<T extends Extractor<?>>: This class is a simple, default implementation of ExtractorFactory.
SingleDocumentExtraction: This class acts as a facade where all extractors (for a given MIME type) can be called on a single document.
SingleDocumentExtractionReport: This class provides the report for a SingleDocumentExtraction run.
TagSoupExtractionResult.PropertyPath: Defines a property path object.
TagSoupExtractionResult.ResourceRoot: Defines a property root object.
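
The facade idea described above (several extractors run on one document, each writing through its own result object to a shared handler, with issues collected for a report) might look roughly like this. Every type here (SketchHandler, SketchResult, SketchExtractor) and the runAll method are hypothetical names made up for the sketch; they only mirror the roles described in the summary and are not the Any23 classes or their signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleDocumentSketch {

    /** Hypothetical sink for extracted statements (plays the role of a triple handler). */
    interface SketchHandler {
        void receive(String extractorName, String statement);
    }

    /** Hypothetical per-extractor result: forwards output to the handler and records issues. */
    static final class SketchResult {
        private final String extractorName;
        private final SketchHandler handler;
        private final List<String> issues = new ArrayList<>();

        SketchResult(String extractorName, SketchHandler handler) {
            this.extractorName = extractorName;
            this.handler = handler;
        }

        void write(String statement) {
            handler.receive(extractorName, statement);
        }

        void notifyIssue(String message) {
            issues.add(extractorName + ": " + message);
        }

        List<String> issues() {
            return issues;
        }
    }

    /** Hypothetical extractor working on a document given as a plain string. */
    interface SketchExtractor {
        String name();
        void run(String document, SketchResult out) throws Exception;
    }

    /** Facade: runs every extractor on the same document and returns the collected issues. */
    static List<String> runAll(String document, List<SketchExtractor> extractors, SketchHandler handler) {
        List<String> issues = new ArrayList<>();
        for (SketchExtractor extractor : extractors) {
            SketchResult result = new SketchResult(extractor.name(), handler);
            try {
                extractor.run(document, result);
            } catch (Exception e) {
                // A failing extractor is reported as an issue and does not stop the others.
                result.notifyIssue("failed: " + e.getMessage());
            }
            issues.addAll(result.issues());
        }
        return issues;
    }

    /** Runs one working and one failing extractor; returns handler output followed by issues. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        List<SketchExtractor> extractors = new ArrayList<>();
        extractors.add(new SketchExtractor() {
            @Override
            public String name() { return "length"; }

            @Override
            public void run(String document, SketchResult out) {
                out.write("length=" + document.length());
            }
        });
        extractors.add(new SketchExtractor() {
            @Override
            public String name() { return "broken"; }

            @Override
            public void run(String document, SketchResult out) throws Exception {
                throw new Exception("unsupported input");
            }
        });

        List<String> issues = runAll("hello", extractors,
                (name, statement) -> log.add(name + " -> " + statement));
        log.addAll(issues);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Giving each extractor its own result object keeps issue reports attributable to the extractor that raised them, while the handler sees a single stream of output for the document.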
Enum Summary
Enum: Description
ExtractionParameters.ValidationMode: Declares the supported validation actions.
IssueReport.IssueLevel: Possible issue levels.
Exception Summary
Exception: Description
ExtractionException: Defines a specific exception raised during the metadata extraction phase.