public class XPathExtractor extends Object implements Extractor.TagSoupDOMExtractor
An Extractor.TagSoupDOMExtractor able to apply XPathExtractionRules and generate quads.

See also: XPathExtractionRule

Nested interfaces inherited from Extractor: Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor

| Modifier and Type | Field and Description |
|---|---|
| static ExtractorFactory<XPathExtractor> | factory |
| static String | NAME |


| Constructor and Description |
|---|
| XPathExtractor(List<XPathExtractionRule> rules) |


| Modifier and Type | Method and Description |
|---|---|
| void | add(XPathExtractionRule rule) |
| boolean | contains(XPathExtractionRule rule) |
| ExtractorDescription | getDescription() - Returns an ExtractorDescription of this extractor. |
| void | remove(XPathExtractionRule rule) |
| void | run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, Document in, ExtractionResult out) - Executes the extractor. |

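
The run method receives its input as a DOM Document, and the class description says the extractor applies XPath-based rules to it. As a rough illustration of what evaluating an XPath expression against a DOM Document looks like, the sketch below uses only the JDK's javax.xml.xpath API. It is not the library's implementation: the class name XPathOverDomSketch and the helpers parse and select are hypothetical, and the input is assumed to be well-formed XML rather than TagSoup-repaired HTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the documented API.
final class XPathOverDomSketch {

    /** Parses a well-formed XML/XHTML string into a DOM Document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse input", e);
        }
    }

    /** Evaluates one XPath expression and returns the text content of each matched node. */
    static List<String> select(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><h1>Title</h1><p>First</p><p>Second</p></body></html>");
        System.out.println(select(doc, "/html/body/p")); // prints [First, Second]
    }
}
```

In the documented class, the matched values would presumably be handed to the ExtractionResult collector as quads rather than returned as strings; that step is omitted here because its API is not described on this page.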
public static final String NAME
public static final ExtractorFactory<XPathExtractor> factory
public XPathExtractor(List<XPathExtractionRule> rules)
public void add(XPathExtractionRule rule)
public void remove(XPathExtractionRule rule)
public boolean contains(XPathExtractionRule rule)
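
The constructor and the add, remove and contains methods above suggest that the extractor keeps a mutable collection of rules. The sketch below mirrors those signatures with stand-in types to show how such a collection might be used. It assumes a list-backed store and equality-based lookup, neither of which this page confirms; SketchRule and RuleHolderSketch are hypothetical names, not library classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for XPathExtractionRule; only a name is modelled. */
interface SketchRule {
    String getName();
}

/** Hypothetical list-backed holder mirroring the add/remove/contains signatures. */
final class RuleHolderSketch {
    private final List<SketchRule> rules = new ArrayList<>();

    RuleHolderSketch(List<SketchRule> initialRules) {
        // Copy the initial rules so later changes to the caller's list have no effect here.
        rules.addAll(initialRules);
    }

    void add(SketchRule rule) {
        rules.add(rule);
    }

    void remove(SketchRule rule) {
        rules.remove(rule);
    }

    boolean contains(SketchRule rule) {
        return rules.contains(rule);
    }

    public static void main(String[] args) {
        SketchRule titleRule = () -> "title";
        RuleHolderSketch holder = new RuleHolderSketch(new ArrayList<>());
        holder.add(titleRule);
        System.out.println(holder.contains(titleRule)); // prints true
        holder.remove(titleRule);
        System.out.println(holder.contains(titleRule)); // prints false
    }
}
```

Whether the real class allows duplicate rules or copies the list passed to its constructor is not stated here, so the copy in the sketch is a design choice of the example only.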
public void run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, Document in, ExtractionResult out) throws IOException, ExtractionException
Description copied from interface: Extractor
Executes the extractor.

Specified by: run in interface Extractor<Document>

Parameters:
extractionParameters - the parameters to be applied during the extraction.
extractionContext - The document context.
in - The extractor input data.
out - the collector for the extracted data.

Throws:
IOException - On error while reading from the input stream.
ExtractionException - On other error, such as parse errors.

public ExtractorDescription getDescription()
Description copied from interface: Extractor
Returns an ExtractorDescription of this extractor.

Specified by: getDescription in interface Extractor<Document>

Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.