Package org.apache.any23.extractor.html
Class TagSoupParserTest
java.lang.Object
    org.apache.any23.extractor.html.TagSoupParserTest
public class TagSoupParserTest extends Object
Reference test class for the TagSoupParser parser.
Author:
Davide Palmisano (dpalmisano@gmail.com), Michele Mostarda (michele.mostarda@gmail.com)
Constructor Summary
Constructor          Description
TagSoupParserTest()
Method Summary
Modifier and Type  Method                          Description
void               tearDown()
void               testEmptySpanElements()         Test related to issue 78; disabled until the underlying NekoHTML bug has been fixed.
void               testExplicitEncodingBehavior()
void               testImplicitEncodingBehavior()  Tests the NekoHTML parser without forcing a specific encoding charset.
void               testParseSimpleHTML()
Method Detail
tearDown
public void tearDown() throws org.eclipse.rdf4j.repository.RepositoryException
Throws:
org.eclipse.rdf4j.repository.RepositoryException
testParseSimpleHTML
public void testParseSimpleHTML() throws IOException
Throws:
IOException
testExplicitEncodingBehavior
public void testExplicitEncodingBehavior() throws IOException, ExtractionException, org.eclipse.rdf4j.repository.RepositoryException
Throws:
IOException
ExtractionException
org.eclipse.rdf4j.repository.RepositoryException
testImplicitEncodingBehavior
public void testImplicitEncodingBehavior() throws IOException, ExtractionException, org.eclipse.rdf4j.repository.RepositoryException
Tests the NekoHTML parser without forcing a specific encoding charset. Because the test relies on the library's automatic encoding detection, it may fail if that detection changes in a future NekoHTML release.
Throws:
IOException - if there is an error interpreting the input data
ExtractionException - if there is an exception during extraction
org.eclipse.rdf4j.repository.RepositoryException - if an error is encountered whilst loading content from a storage connection
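The difference between explicit and implicit encoding handling can be illustrated with a minimal sketch that uses only the JDK. It does not call TagSoupParser or NekoHTML; the class EncodingSketch and its methods decode and guessCharset are hypothetical names introduced here, and the byte-order-mark heuristic is an assumed stand-in for whatever detection the parser actually performs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: plain JDK code, not the TagSoupParser or NekoHTML API.
final class EncodingSketch {

    // Explicit behavior: the caller names the charset, so decoding is deterministic.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    // Hypothetical stand-in for auto-detection (an assumption, not the real
    // algorithm): trust a UTF-8 byte order mark if present, otherwise use the
    // supplied fallback charset.
    static Charset guessCharset(byte[] raw, Charset fallback) {
        boolean hasUtf8Bom = raw.length >= 3
                && (raw[0] & 0xFF) == 0xEF
                && (raw[1] & 0xFF) == 0xBB
                && (raw[2] & 0xFF) == 0xBF;
        return hasUtf8Bom ? StandardCharsets.UTF_8 : fallback;
    }

    public static void main(String[] args) {
        // UTF-8 bytes of a small HTML fragment containing one non-ASCII character.
        byte[] raw = "<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8);

        // Explicit charset: the text round-trips unchanged.
        String explicit = decode(raw, StandardCharsets.UTF_8);

        // Implicit charset: no BOM, so the guess falls back to ISO-8859-1
        // and the two-byte UTF-8 sequence is read as two separate characters.
        String implicit = decode(raw, guessCharset(raw, StandardCharsets.ISO_8859_1));

        System.out.println(explicit.equals("<p>caf\u00e9</p>"));        // true
        System.out.println(implicit.equals("<p>caf\u00c3\u00a9</p>"));  // true
    }
}
```

The sketch suggests why a test that depends on detection is more fragile than one that pins the charset: the decoded text changes whenever the guess or its fallback changes, even though the input bytes are identical.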
testEmptySpanElements
public void testEmptySpanElements() throws IOException
Test related to issue 78; disabled until the underlying NekoHTML bug has been fixed.
Throws:
IOException - if there is an error interpreting the input data