Title:
METHOD AND DEVICE FOR COLLECTING WEBSITES
Document Type and Number:
WIPO Patent Application WO/2023/008785
Kind Code:
A1
Abstract:
The present invention relates to a method and device for collecting websites. Its objective is to collect websites by combining an automatic method with a manual one. To this end, the invention provides a method by which an electronic device collects websites, the method comprising: step a of accessing a web server corresponding to a URL and receiving a website corresponding to the URL; step b of, if there is a CAPTCHA on the website, obtaining a first solution key on the basis of a CAPTCHA solution model; step c of transmitting the first solution key to the web server and receiving an authentication result; step d of recalculating the first solution key if the authentication of the first solution key fails, and transmitting a CAPTCHA solution request signal to a user terminal if the authentication fails a preset number of times or more; and step e of receiving a second solution key from the user terminal, transmitting it to the web server, and crawling the website.
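
The control flow of steps a to e can be illustrated with a minimal sketch. This is not the applicant's implementation: the `Page` type, the `collect` function, its callable parameters (`fetch`, `model_solve`, `submit`, `ask_user`, `crawl`), and the default `max_failures=3` are all hypothetical names and values chosen for illustration. The sketch only shows the retry-then-escalate logic the abstract describes, with every external interaction injected as a callable.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Page:
    """Hypothetical stand-in for a received website."""
    url: str
    has_captcha: bool
    body: str = ""


def collect(
    url: str,
    fetch: Callable[[str], Page],             # step a: access server, receive page
    model_solve: Callable[[Page], str],       # step b: automatic solution model
    submit: Callable[[Page, str], bool],      # step c: send key, get auth result
    ask_user: Callable[[Page], str],          # steps d/e: request key from user terminal
    crawl: Callable[[Page], str],             # step e: crawl the page
    max_failures: int = 3,                    # assumed value for the "preset number"
) -> Optional[str]:
    page = fetch(url)                         # step a
    if not page.has_captcha:
        return crawl(page)

    failures = 0
    while failures < max_failures:
        first_key = model_solve(page)         # step b (recomputed on each retry, step d)
        if submit(page, first_key):           # step c
            return crawl(page)
        failures += 1

    # Step d: automatic attempts failed the preset number of times,
    # so hand the challenge to a human operator.
    second_key = ask_user(page)
    if submit(page, second_key):              # step e
        return crawl(page)
    return None                               # manual key was rejected as well
```

Passing the collaborators in as callables keeps the sketch independent of any particular HTTP client, solver model, or messaging channel; a caller would supply its own implementations for each role.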

Inventors:
CHOI JAE MIN (KR)
YOON CHANG HOON (KR)
KIM YEON KEUN (KR)
Application Number:
PCT/KR2022/010199
Publication Date:
February 02, 2023
Filing Date:
July 13, 2022
Assignee:
S2W INC (KR)
International Classes:
G06F16/951; G06F16/955; G06F21/31; G06N20/00
Foreign References:
US9977892B2    2018-05-22
US9471767B2    2016-10-18
Other References:
JIANYI ZHANG; XIALI HEI; ZHIQIANG WANG: "Typer vs. CAPTCHA: Private information based CAPTCHA to defend against crowdsourcing human cheating", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 29 April 2019 (2019-04-29), XP081268349
ZAHRA NOURY; MAHDI REZAEI: "Deep-CAPTCHA: a deep learning based CAPTCHA solver for vulnerability assessment", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 1 January 1900 (1900-01-01), XP081696363
Attorney, Agent or Firm:
DODAM IP LAW FIRM (KR)