Class MicroformatExtractor

    • Constructor Detail

      • MicroformatExtractor

        public MicroformatExtractor()
    • Method Detail

      • extract

        protected abstract boolean extract()
                                    throws ExtractionException
        Performs the extraction of the data and writes it to the model. Nodes generated in the model can have any name or implicit label, but where possible they SHOULD have names (either URIs or AnonId) that are uniquely derivable from their position in the DOM tree, so that multiple extractors can merge information.
        Returns:
        true if extraction is successful
        Throws:
        ExtractionException - if there is an error during extraction
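        The recommendation that node names be derivable from DOM position can be illustrated with a minimal sketch. It uses only the JDK's org.w3c.dom API, not Any23 itself. The class DomPositionNames, the methods positionPath and nodeId, and the hash-based naming scheme are all hypothetical and chosen for illustration; the identifiers Any23 actually generates may be built differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper, not part of Any23: derives a stable name from a node's DOM position.
class DomPositionNames {

    /** Builds an XPath-like path (for example /html[1]/body[1]/div[2]) for an element node. */
    static String positionPath(Node node) {
        StringBuilder path = new StringBuilder();
        for (Node cur = node;
             cur != null && cur.getNodeType() == Node.ELEMENT_NODE;
             cur = cur.getParentNode()) {
            // 1-based index among preceding sibling elements with the same name.
            int index = 1;
            for (Node sib = cur.getPreviousSibling(); sib != null; sib = sib.getPreviousSibling()) {
                if (sib.getNodeType() == Node.ELEMENT_NODE
                        && sib.getNodeName().equals(cur.getNodeName())) {
                    index++;
                }
            }
            path.insert(0, "/" + cur.getNodeName() + "[" + index + "]");
        }
        return path.toString();
    }

    /** Assumed naming scheme: the same position always yields the same identifier. */
    static String nodeId(Node node) {
        return "node" + Integer.toHexString(positionPath(node).hashCode());
    }

    /** Parses a small well-formed document; checked exceptions are wrapped for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><div/><div/></body></html>");
        Node secondDiv = doc.getElementsByTagName("div").item(1);
        System.out.println(positionPath(secondDiv));
        System.out.println(nodeId(secondDiv));
    }
}
```

        Because the identifier depends only on the node's position, two extractors that visit the same element independently arrive at the same name, which is what allows their statements to be merged.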
      • getDocumentIRI

        public org.eclipse.rdf4j.model.IRI getDocumentIRI()
      • getCurrentExtractionResult

        protected ExtractionResult getCurrentExtractionResult()
        Returns the ExtractionResult associated with the extraction session.
        Returns:
        a valid extraction result.
      • setCurrentExtractionResult

        protected void setCurrentExtractionResult(ExtractionResult out)
      • conditionallyAddStringProperty

        protected boolean conditionallyAddStringProperty(Node n,
                                                         org.eclipse.rdf4j.model.Resource subject,
                                                         org.eclipse.rdf4j.model.IRI p,
                                                         String value)
        Helper method that adds a literal property to a subject only if the value of the property is a valid string.
        Parameters:
        n - the HTML node from which the property value has been extracted.
        subject - the property subject.
        p - the property IRI.
        value - the property value.
        Returns:
        true if the value has been accepted and added, false otherwise.
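        The conditional-add behaviour described above can be sketched without any Any23 or RDF4J types. This sketch assumes that "valid string" means non-null and non-empty after trimming; the check the library actually applies may differ. The class ConditionalPropertySketch and its string-based triple store are hypothetical stand-ins, and the HTML node parameter is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an extractor's output; not the Any23 implementation.
class ConditionalPropertySketch {

    // Each accepted statement is stored as "subject property \"value\"".
    final List<String> triples = new ArrayList<>();

    /**
     * Adds the statement only if the value is usable.
     * Assumption: a value is valid when it is non-null and not blank after trimming.
     */
    boolean conditionallyAddStringProperty(String subject, String property, String value) {
        if (value == null) {
            return false;
        }
        String trimmed = value.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        triples.add(subject + " " + property + " \"" + trimmed + "\"");
        return true;
    }

    public static void main(String[] args) {
        ConditionalPropertySketch out = new ConditionalPropertySketch();
        out.conditionallyAddStringProperty("_:card", "vcard:fn", "  Ada Lovelace ");
        out.conditionallyAddStringProperty("_:card", "vcard:nickname", "   ");
        System.out.println(out.triples);
    }
}
```

        Returning a boolean lets the caller track whether an extraction pass produced any statement at all, without checking each value before the call.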
      • conditionallyAddLiteralProperty

        protected boolean conditionallyAddLiteralProperty(Node n,
                                                          org.eclipse.rdf4j.model.Resource subject,
                                                          org.eclipse.rdf4j.model.IRI property,
                                                          org.eclipse.rdf4j.model.Literal literal)
        Helper method that adds a literal property to a node.
        Parameters:
        n - the HTML node from which the property value has been extracted.
        subject - the property subject.
        property - the property IRI.
        literal - the property value.
        Returns:
        true if the literal has been accepted and added, false otherwise.
      • conditionallyAddResourceProperty

        protected boolean conditionallyAddResourceProperty(org.eclipse.rdf4j.model.Resource subject,
                                                           org.eclipse.rdf4j.model.IRI property,
                                                           org.eclipse.rdf4j.model.IRI uri)
        Helper method that adds an IRI property to a node.
        Parameters:
        subject - the property subject.
        property - the property IRI.
        uri - the property object.
        Returns:
        true if the resource has been added, false otherwise.
      • addBNodeProperty

        protected void addBNodeProperty(Node n,
                                        org.eclipse.rdf4j.model.Resource subject,
                                        org.eclipse.rdf4j.model.IRI property,
                                        org.eclipse.rdf4j.model.BNode bnode)
        Helper method that adds a BNode property to a node.
        Parameters:
        n - the HTML node used for extracting such property.
        subject - the property subject.
        property - the property IRI.
        bnode - the property value.
      • addBNodeProperty

        protected void addBNodeProperty(org.eclipse.rdf4j.model.Resource subject,
                                        org.eclipse.rdf4j.model.IRI property,
                                        org.eclipse.rdf4j.model.BNode bnode)
        Helper method that adds a BNode property to a node.
        Parameters:
        subject - the property subject.
        property - the property IRI.
        bnode - the property value.
      • addIRIProperty

        protected void addIRIProperty(org.eclipse.rdf4j.model.Resource subject,
                                      org.eclipse.rdf4j.model.IRI property,
                                      org.eclipse.rdf4j.model.IRI object)
        Helper method that adds an IRI property to a node.
        Parameters:
        subject - subject to add
        property - predicate to add
        object - object to add
      • fixLink

        protected org.eclipse.rdf4j.model.IRI fixLink(String link)
      • fixLink

        protected org.eclipse.rdf4j.model.IRI fixLink(String link,
                                                      String defaultSchema)