public class DefaultWebCrawler
extends edu.uci.ics.crawler4j.crawler.WebCrawler
WebCrawler implementation.

| Constructor and Description |
|---|
| DefaultWebCrawler() |

| Modifier and Type | Method and Description |
|---|---|
| boolean | shouldVisit(edu.uci.ics.crawler4j.crawler.Page referringPage, edu.uci.ics.crawler4j.url.WebURL url)<br>Override this method to specify whether the given URL should be visited or not. |
| void | visit(edu.uci.ics.crawler4j.crawler.Page page)<br>Override this method to implement the single page processing logic. |
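
The page does not say which rule DefaultWebCrawler applies in shouldVisit. As an illustration only, the kind of decision such an override usually makes can be sketched in plain Java without the crawler4j dependency. The class name VisitFilterSketch, the extension list, and the prefix https://www.example.com/ are all assumptions made for this sketch; none of them come from DefaultWebCrawler or crawler4j.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of crawler4j or DefaultWebCrawler.
final class VisitFilterSketch {

    // Extensions treated as non-HTML resources (assumed list, adjust as needed).
    private static final Pattern SKIPPED_EXTENSIONS =
            Pattern.compile(".*\\.(css|js|gif|jpe?g|png|mp3|mp4|zip|gz|pdf)$");

    // Assumed site prefix; a real crawler would take this from its own configuration.
    private static final String ALLOWED_PREFIX = "https://www.example.com/";

    private VisitFilterSketch() {
        // Static helper, no instances.
    }

    // Returns true when the URL is on the allowed site and does not end in a skipped extension.
    static boolean shouldVisit(String url) {
        if (url == null) {
            return false;
        }
        // Compare case-insensitively so that ".PNG" and ".png" are treated alike.
        String href = url.toLowerCase(Locale.ROOT);
        return href.startsWith(ALLOWED_PREFIX)
                && !SKIPPED_EXTENSIONS.matcher(href).matches();
    }
}
```

In a WebCrawler subclass, an overridden shouldVisit(referringPage, url) could pass url.getURL() to a helper like this and return its result. Keeping the rule in a separate static method also lets it be tested without starting a crawl.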

Methods inherited from class edu.uci.ics.crawler4j.crawler.WebCrawler: getMyController, getMyId, getMyLocalData, getThread, handlePageStatusCode, handleUrlBeforeProcess, init, isNotWaitingForNewURLs, onBeforeExit, onContentFetchError, onContentFetchError, onPageBiggerThanMaxSize, onParseError, onRedirectedStatusCode, onStart, onUnexpectedStatusCode, onUnhandledException, run, setThread, shouldFollowLinksIn

public boolean shouldVisit(edu.uci.ics.crawler4j.crawler.Page referringPage,
edu.uci.ics.crawler4j.url.WebURL url)
Overrides: shouldVisit in class edu.uci.ics.crawler4j.crawler.WebCrawler

public void visit(edu.uci.ics.crawler4j.crawler.Page page)

Overrides: visit in class edu.uci.ics.crawler4j.crawler.WebCrawler

Copyright © 2010–2019 The Apache Software Foundation. All rights reserved.