Package org.apache.any23.extractor.html
Class HCalendarExtractor
- java.lang.Object
  - org.apache.any23.extractor.html.MicroformatExtractor
    - org.apache.any23.extractor.html.HCalendarExtractor
- All Implemented Interfaces:
- Extractor<Document>, Extractor.TagSoupDOMExtractor
public class HCalendarExtractor extends MicroformatExtractor
Extractor for the hCalendar microformat.
- Author:
- Gabriele Renzi
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.any23.extractor.Extractor
Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor
Field Summary
Fields inherited from class org.apache.any23.extractor.html.MicroformatExtractor
BEGIN_SCRIPT, END_SCRIPT, valueFactory
Constructor Summary
Constructor             Description
HCalendarExtractor()
Method Summary
Modifier and Type       Method              Description
protected boolean       extract()           Performs the extraction of the data and writes them to the model.
ExtractorDescription    getDescription()    Returns the description of this extractor.
Methods inherited from class org.apache.any23.extractor.html.MicroformatExtractor
addBNodeProperty, addBNodeProperty, addIRIProperty, conditionallyAddLiteralProperty, conditionallyAddResourceProperty, conditionallyAddStringProperty, fixLink, fixLink, getCurrentExtractionResult, getDocumentIRI, getExtractionContext, getHTMLDocument, includes, openSubResult, run, setCurrentExtractionResult
Method Detail
getDescription
public ExtractorDescription getDescription()
Description copied from class: MicroformatExtractor
Returns the description of this extractor.
- Specified by:
- getDescription in interface Extractor<Document>
- Specified by:
- getDescription in class MicroformatExtractor
- Returns:
- a human readable description.
extract
protected boolean extract() throws ExtractionException
Description copied from class: MicroformatExtractor
Performs the extraction of the data and writes them to the model. The nodes generated in the model can have any name or implicit label, but if possible they SHOULD have names (either URIs or AnonId) that are uniquely derivable from their position in the DOM tree, so that multiple extractors can merge information.
- Specified by:
- extract in class MicroformatExtractor
- Returns:
- true if extraction is successful
- Throws:
- ExtractionException if there is an error during extraction
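The recommendation that node names be uniquely derivable from their position in the DOM tree can be illustrated with a short sketch. This is not Any23's implementation: the class DomPathNames and its methods parse, pathOf, and stableId are hypothetical names introduced here, and the example uses only the JDK's own DOM API. It shows one possible way to turn an element's position into a deterministic name, so that two independent passes over the same document arrive at the same identifier.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper, not part of Any23.
class DomPathNames {

    /** Parses a well-formed XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    /** Builds a positional path such as /html[1]/body[1]/div[2] by walking up to the root. */
    static String pathOf(Node node) {
        StringBuilder path = new StringBuilder();
        for (Node n = node;
             n != null && n.getNodeType() == Node.ELEMENT_NODE;
             n = n.getParentNode()) {
            // Count preceding sibling elements with the same name (1-based index).
            int index = 1;
            for (Node s = n.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
                if (s.getNodeType() == Node.ELEMENT_NODE
                        && s.getNodeName().equals(n.getNodeName())) {
                    index++;
                }
            }
            path.insert(0, "/" + n.getNodeName() + "[" + index + "]");
        }
        return path.toString();
    }

    /** Derives a name from the path; the same position always yields the same name. */
    static String stableId(Node node) {
        return "node" + Integer.toHexString(pathOf(node).hashCode());
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><div/><div class=\"vevent\"/></body></html>");
        Node event = doc.getElementsByTagName("div").item(1);
        System.out.println(pathOf(event));   // prints /html[1]/body[1]/div[2]
        System.out.println(stableId(event)); // same value on every run for this position
    }
}
```

Because the identifier depends only on the element's position, a second extractor visiting the same element would compute the same name, which is what allows statements from several extractors to attach to one node.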