Package org.apache.any23.extractor.html
Class TagSoupParserTest
- java.lang.Object
-
- org.apache.any23.extractor.html.TagSoupParserTest
-
public class TagSoupParserTest extends Object
Reference test class for the TagSoupParser parser.
- Author:
- Davide Palmisano (dpalmisano@gmail.com), Michele Mostarda (michele.mostarda@gmail.com)
-
-
Constructor Summary
Constructors
Constructor    Description
TagSoupParserTest()
-
Method Summary
Modifier and Type    Method    Description
void    tearDown()
void    testEmptySpanElements()    Test related to issue 78; disabled until the underlying NekoHTML bug has been fixed.
void    testExplicitEncodingBehavior()
void    testImplicitEncodingBehavior()    Tests the NekoHTML parser without forcing it to use a specific encoding charset.
void    testParseSimpleHTML()
-
Method Detail
-
tearDown
public void tearDown() throws org.eclipse.rdf4j.repository.RepositoryException
- Throws:
org.eclipse.rdf4j.repository.RepositoryException
-
testParseSimpleHTML
public void testParseSimpleHTML() throws IOException
- Throws:
IOException
-
testExplicitEncodingBehavior
public void testExplicitEncodingBehavior() throws IOException, ExtractionException, org.eclipse.rdf4j.repository.RepositoryException
- Throws:
IOException
ExtractionException
org.eclipse.rdf4j.repository.RepositoryException
-
testImplicitEncodingBehavior
public void testImplicitEncodingBehavior() throws IOException, ExtractionException, org.eclipse.rdf4j.repository.RepositoryException
Tests the NekoHTML parser without forcing it to use a specific encoding charset. This test may fail if something changes in the Neko library, such as its auto-detection of the encoding.
- Throws:
IOException - if there is an error interpreting the input data
ExtractionException - if there is an exception during extraction
org.eclipse.rdf4j.repository.RepositoryException - if an error is encountered whilst loading content from a storage connection
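The difference between the explicit and implicit encoding tests can be illustrated with a minimal standalone sketch. It does not use the Any23 or NekoHTML APIs; the class name EncodingSketch, the decode helper, and the sample markup are all hypothetical and exist only for illustration. It shows why decoding the same bytes with a wrongly detected charset changes the text a parser would see, which is the kind of behavior an auto-detection test is sensitive to.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of the Any23 test suite.
public class EncodingSketch {

    // Decodes raw bytes with the given charset. This stands in for a parser
    // that has been told (or has guessed) which encoding to use.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Hypothetical sample markup; \u00F6 is "o with diaeresis".
        String original = "<p>M\u00F6ller</p>";
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        // Explicit encoding: the caller states the charset, so the text round-trips.
        String explicit = decode(raw, StandardCharsets.UTF_8);

        // Simulated wrong guess: the same bytes read as ISO-8859-1 are garbled.
        String wrongGuess = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println(explicit.equals(original));   // true
        System.out.println(wrongGuess.equals(original)); // false
    }
}
```

When the charset is passed explicitly the result is independent of any detection heuristics. A test that relies on detection, by contrast, can start failing after a library upgrade even though the input document is unchanged.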
-
testEmptySpanElements
public void testEmptySpanElements() throws IOException
Test related to issue 78; disabled until the underlying NekoHTML bug has been fixed.
- Throws:
IOException - if there is an error interpreting the input data
-