public interface Extractor<Input>

Type Parameters:
Input - the type of the input data to be processed.

Defines the signature of a generic extractor, parameterized by the type of input it processes.
| Nested Class Summary | |
|---|---|
| static interface | Extractor.BlindExtractor: specializes an Extractor to handle a URI as its input format. |
| static interface | Extractor.ContentExtractor: specializes an Extractor to handle an InputStream as its input format. |
| static interface | Extractor.TagSoupDOMExtractor: specializes an Extractor to handle a Document as its input format. |
| Method Summary | |
|---|---|
| ExtractorDescription | getDescription(): Returns an ExtractorDescription of this extractor. |
| void | run(ExtractionParameters extractionParameters, ExtractionContext context, Input in, ExtractionResult out): Executes the extractor. |
| Method Detail |
|---|
void run(ExtractionParameters extractionParameters,
         ExtractionContext context,
         Input in,
         ExtractionResult out)
  throws IOException,
         ExtractionException

Parameters:
extractionParameters - the parameters to be applied during the extraction.
context - the document context.
in - the extractor input data.
out - the collector for the extracted data.

Throws:
IOException - on error while reading from the input stream.
ExtractionException - on other errors, such as parse errors.

ExtractorDescription getDescription()

Returns:
an ExtractorDescription of this extractor.
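
The following is a minimal sketch of how an implementation of this interface might look, not the library's own code. To stay self-contained it declares simplified stand-ins for ExtractionParameters, ExtractionContext, ExtractionResult, ExtractionException and ExtractorDescription. The documentUri field, the collect method, the getExtractorName accessor, the LineCountExtractor class and the extractFacts helper are all assumptions made for illustration; only the run and getDescription signatures and the ContentExtractor specialization follow the documentation above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch: every type below is a simplified, hypothetical stand-in
// for the library type of the same name, not the library's actual definition.
final class ExtractorSketch {

    static final class ExtractionParameters { }

    static final class ExtractionContext {
        final String documentUri; // assumed field, for illustration only
        ExtractionContext(String documentUri) { this.documentUri = documentUri; }
    }

    static final class ExtractionException extends Exception {
        ExtractionException(String message) { super(message); }
    }

    interface ExtractorDescription {
        String getExtractorName(); // assumed accessor
    }

    interface ExtractionResult {
        void collect(String fact); // hypothetical collector method
    }

    // Mirrors the two methods documented above.
    interface Extractor<Input> {
        void run(ExtractionParameters extractionParameters,
                 ExtractionContext context,
                 Input in,
                 ExtractionResult out) throws IOException, ExtractionException;

        ExtractorDescription getDescription();

        // Specialization for InputStream input, as in the nested class summary.
        interface ContentExtractor extends Extractor<InputStream> { }
    }

    // Toy extractor: reports how many lines the input stream contains.
    static final class LineCountExtractor implements Extractor.ContentExtractor {
        @Override
        public void run(ExtractionParameters extractionParameters,
                        ExtractionContext context,
                        InputStream in,
                        ExtractionResult out) throws IOException, ExtractionException {
            if (context.documentUri == null) {
                // Non-I/O failures are reported as ExtractionException.
                throw new ExtractionException("missing document URI");
            }
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            int lines = 0;
            while (reader.readLine() != null) { // readLine may throw IOException
                lines++;
            }
            out.collect(context.documentUri + " lineCount " + lines);
        }

        @Override
        public ExtractorDescription getDescription() {
            return () -> "line-count";
        }
    }

    // Demo helper: runs the toy extractor over a string and returns the collected facts.
    static List<String> extractFacts(String documentUri, String content) {
        List<String> facts = new ArrayList<>();
        InputStream in = new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        try {
            new LineCountExtractor().run(
                    new ExtractionParameters(), new ExtractionContext(documentUri), in, facts::add);
        } catch (IOException | ExtractionException e) {
            throw new IllegalStateException(e);
        }
        return facts;
    }

    public static void main(String[] args) {
        System.out.println(extractFacts("http://example.org/doc", "a\nb\nc\n"));
    }
}
```

In this sketch the caller owns both the input and the ExtractionResult collector, so the extractor only reads from in and writes to out. That matches the void return type of run, where results are delivered through the out argument instead of a return value.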