Title:
WEBPAGE CRAWLING METHOD AND APPARATUS
Document Type and Number:
WIPO Patent Application WO/2018/157686
Kind Code:
A1
Abstract:
A webpage crawling method, comprising: configuring a crawling task and a crawling policy, wherein the crawling task comprises a target website and the crawling policy comprises a URL restriction policy; generating a crawling list according to the target website; crawling the webpages of the target website in the crawling list in sequence to acquire the website links in those webpages; filtering the website links according to the URL restriction policy to remove invalid links; and adding the website links that remain after filtering, as links of the target website, to the crawling list for subsequent crawling.
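
The loop described in the abstract can be sketched roughly as follows. This is an illustrative reading, not the claimed implementation: the abstract does not say what the URL restriction policy contains, so the rules in `is_allowed` (same host, http/https only, no static-asset suffixes) are assumptions, as are the names `crawl`, `is_allowed`, `LinkExtractor`, the `fetch` callback, and the `max_pages` limit. Pages are served from an in-memory dictionary so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def is_allowed(url, target_site, blocked_suffixes=(".jpg", ".png", ".css", ".js")):
    """Hypothetical URL restriction policy (assumed, not from the source):
    keep only http(s) links on the target host that are not static assets."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return False
    if parts.netloc != urlparse(target_site).netloc:
        return False
    return not parts.path.lower().endswith(blocked_suffixes)


def crawl(target_site, fetch, max_pages=100):
    """Crawl in list order; fetch(url) returns HTML text, or None on failure."""
    crawl_list = deque([target_site])  # crawling list generated from the target website
    seen = {target_site}               # every URL ever queued, to avoid re-crawling
    visited = []
    while crawl_list and len(visited) < max_pages:
        url = crawl_list.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop the #fragment before filtering.
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen and is_allowed(link, target_site):
                seen.add(link)
                crawl_list.append(link)  # surviving links are queued for later crawling
    return visited


# Usage with made-up in-memory pages standing in for a real website.
pages = {
    "http://example.com/": '<a href="/a">A</a> <a href="http://other.org/x">X</a> <a href="/logo.png">L</a>',
    "http://example.com/a": '<a href="/">home</a> <a href="mailto:x@example.com">m</a> <a href="/b#top">B</a>',
    "http://example.com/b": "",
}
visited = crawl("http://example.com/", pages.get)
print(visited)
```

In this toy run the off-site link, the image link, and the `mailto:` link are dropped by the assumed policy, so only pages of the target site enter the crawling list. A real policy could equally be a set of regular expressions or path prefixes; the source does not say.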

Inventors:
SHAN CHANGMEI (CN)
LI LING (CN)
Application Number:
PCT/CN2018/074262
Publication Date:
September 07, 2018
Filing Date:
January 26, 2018
Assignee:
ZTE CORP (CN)
International Classes:
G06F17/30
Domestic Patent References:
WO2012142092A1 (2012-10-18)
Foreign References:
CN104182412A (2014-12-03)
CN104063448A (2014-09-24)
Attorney, Agent or Firm:
AFD CHINA INTELLECTUAL PROPERTY LAW OFFICE (CN)