Class HCardExtractor
- java.lang.Object
  - org.apache.any23.extractor.html.MicroformatExtractor
    - org.apache.any23.extractor.html.EntityBasedMicroformatExtractor
      - org.apache.any23.extractor.html.microformats2.HCardExtractor

All Implemented Interfaces:
Extractor<Document>, Extractor.TagSoupDOMExtractor
public class HCardExtractor extends EntityBasedMicroformatExtractor
Extractor for the h-Card microformat.
Author:
Nisala Nirmana
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.any23.extractor.Extractor
Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor
Field Summary
Fields inherited from class org.apache.any23.extractor.html.MicroformatExtractor
BEGIN_SCRIPT, END_SCRIPT, valueFactory
Constructor Summary
Constructors:
- HCardExtractor()
Method Summary
Methods (modifier and type, signature, description):
- protected boolean extractEntity(Node node, ExtractionResult out)
  Extracts an entity from a DOM node.
- org.eclipse.rdf4j.model.Resource extractEntityAsEmbeddedProperty(HTMLDocument fragment, org.eclipse.rdf4j.model.BNode card, ExtractionResult out)
- protected String getBaseClassName()
  Returns the base class name for the extractor.
- ExtractorDescription getDescription()
  Returns the description of this extractor.
- protected void resetExtractor()
  Resets the internal status of the extractor to prepare it for a new extraction session.
Methods inherited from class org.apache.any23.extractor.html.EntityBasedMicroformatExtractor
extract, getBlankNodeFor
Methods inherited from class org.apache.any23.extractor.html.MicroformatExtractor
addBNodeProperty, addBNodeProperty, addIRIProperty, conditionallyAddLiteralProperty, conditionallyAddResourceProperty, conditionallyAddStringProperty, fixLink, fixLink, getCurrentExtractionResult, getDocumentIRI, getExtractionContext, getHTMLDocument, includes, openSubResult, run, setCurrentExtractionResult
Method Detail
getDescription
public ExtractorDescription getDescription()
Description copied from class: MicroformatExtractor
Returns the description of this extractor.
Specified by:
getDescription in interface Extractor<Document>
Specified by:
getDescription in class MicroformatExtractor
Returns:
a human-readable description.
getBaseClassName
protected String getBaseClassName()
Description copied from class: EntityBasedMicroformatExtractor
Returns the base class name for the extractor.
Specified by:
getBaseClassName in class EntityBasedMicroformatExtractor
Returns:
a string containing the base class name of the extractor.
resetExtractor
protected void resetExtractor()
Description copied from class: EntityBasedMicroformatExtractor
Resets the internal status of the extractor to prepare it for a new extraction session.
Specified by:
resetExtractor in class EntityBasedMicroformatExtractor
extractEntity
protected boolean extractEntity(Node node, ExtractionResult out) throws ExtractionException
Description copied from class: EntityBasedMicroformatExtractor
Extracts an entity from a DOM node.
Specified by:
extractEntity in class EntityBasedMicroformatExtractor
Parameters:
node - the DOM node.
out - the extraction result collector.
Returns:
true if the extraction has produced something, false otherwise.
Throws:
ExtractionException - if there is an error during extraction
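The contract documented for getBaseClassName, resetExtractor and extractEntity can be illustrated with a minimal, dependency-free sketch. It does not use Any23. The names HCardSketchDemo, EntityExtractorSketch and HCardSketch are hypothetical, a List<String> stands in for ExtractionResult, and both the "h-card" class name and the point at which resetExtractor() is called are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration only; none of these types belong to Any23.
class HCardSketchDemo {

    /** Stand-in for an entity-based extractor: the driver loop is fixed, subclasses fill in the hooks. */
    abstract static class EntityExtractorSketch {
        protected abstract String getBaseClassName();
        protected abstract void resetExtractor();
        protected abstract boolean extractEntity(Node node, List<String> out);

        /** Calls extractEntity on every element carrying the base class name; returns how many produced output. */
        public final int extract(Document doc, List<String> out) {
            resetExtractor(); // assumption: state is cleared once per extraction run
            int produced = 0;
            NodeList all = doc.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                Element el = (Element) all.item(i);
                List<String> classes = Arrays.asList(el.getAttribute("class").trim().split("\\s+"));
                if (classes.contains(getBaseClassName()) && extractEntity(el, out)) {
                    produced++;
                }
            }
            return produced;
        }
    }

    /** Toy h-card hook implementations: an entity is the element's trimmed text content. */
    static class HCardSketch extends EntityExtractorSketch {
        @Override protected String getBaseClassName() { return "h-card"; } // assumed root class name
        @Override protected void resetExtractor() { /* this sketch keeps no state */ }
        @Override protected boolean extractEntity(Node node, List<String> out) {
            String text = node.getTextContent().trim();
            if (text.isEmpty()) {
                return false; // nothing produced for this node
            }
            out.add(text);
            return true;
        }
    }

    /** Parses well-formed markup with the JDK's XML parser (a stand-in for a TagSoup DOM). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("sample markup must be well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<body><div class=\"h-card\"><span class=\"p-name\">Ada Lovelace</span></div>"
                + "<div class=\"h-card\"></div></body>");
        List<String> out = new ArrayList<>();
        int produced = new HCardSketch().extract(doc, out);
        System.out.println(produced + " entity extracted: " + out);
    }
}
```

In this sketch the empty second element makes extractEntity return false, which mirrors the documented return value: true only when the extraction has produced something.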
extractEntityAsEmbeddedProperty
public org.eclipse.rdf4j.model.Resource extractEntityAsEmbeddedProperty(HTMLDocument fragment, org.eclipse.rdf4j.model.BNode card, ExtractionResult out) throws ExtractionException
Throws:
ExtractionException