Title:
Method, Apparatus, and Computer Program for Extracting Web Page Content
Document Type and Number:
Japanese Patent JP7347179
Kind Code:
B2
Abstract:
To provide a method and device for extracting web page content. SOLUTION: The method of extracting web page content includes the steps of: calculating the similarity between a web page feature and the representative set of each of at least one web page feature cluster, each representative set including samples of web page features that have a relatively high degree of similarity to one another within the corresponding web page feature cluster; determining the representative set that has the highest similarity to the web page feature; updating the web page feature cluster associated with the determined representative set using the web page feature; recalculating the representative set of the updated web page feature cluster; and extracting content from the web page on the basis of the extraction template associated with the updated web page feature cluster. SELECTED DRAWING: Figure 2
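
The steps listed in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not specify a similarity metric, a feature representation, or a template format, so the sketch assumes that a web page feature is the set of element paths on the page, that similarity is the Jaccard index, that a representative set holds the k members with the highest mean similarity to the rest of their cluster, and that an extraction template is a tuple of paths to read. The names `Cluster`, `jaccard`, `similarity_to_set`, `assign`, and `extract` are hypothetical.

```python
from dataclasses import dataclass, field


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard index of two feature sets (an assumed similarity metric)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass
class Cluster:
    template: tuple                       # assumed template: paths whose text is extracted
    members: list = field(default_factory=list)
    representatives: list = field(default_factory=list)

    def recalculate_representatives(self, k: int = 2) -> None:
        """Keep the k members with the highest mean similarity to the other members."""
        def mean_sim(i: int) -> float:
            others = [m for j, m in enumerate(self.members) if j != i]
            if not others:
                return 1.0
            return sum(jaccard(self.members[i], o) for o in others) / len(others)

        ranked = sorted(range(len(self.members)), key=mean_sim, reverse=True)
        self.representatives = [self.members[i] for i in ranked[:k]]


def similarity_to_set(feature: frozenset, representatives: list) -> float:
    """Mean similarity between one page feature and a representative set."""
    if not representatives:
        return 0.0
    return sum(jaccard(feature, r) for r in representatives) / len(representatives)


def assign(feature: frozenset, clusters: list) -> Cluster:
    """Pick the most similar cluster, add the feature, refresh its representatives."""
    best = max(clusters, key=lambda c: similarity_to_set(feature, c.representatives))
    best.members.append(feature)
    best.recalculate_representatives()
    return best


def extract(page: dict, cluster: Cluster) -> dict:
    """Apply the cluster's template: read the text at each templated path."""
    return {path: page[path] for path in cluster.template if path in page}


# Two toy clusters; paths and templates are made up for illustration.
news = Cluster(
    template=("body/h1", "body/div.article"),
    members=[
        frozenset({"body/h1", "body/div.article", "body/div.author"}),
        frozenset({"body/h1", "body/div.article"}),
    ],
)
shop = Cluster(
    template=("body/span.price",),
    members=[frozenset({"body/div.product", "body/span.price"})],
)
for c in (news, shop):
    c.recalculate_representatives()

# A new page, represented as path -> text; its feature is the set of paths.
page = {"body/h1": "Headline", "body/div.article": "Story text", "body/div.date": "2019"}
chosen = assign(frozenset(page), [news, shop])
content = extract(page, chosen)
```

In this toy run the new page shares two paths with each news member and none with the shop member, so it joins the news cluster and the news template is applied. Comparing against a small representative set rather than every cluster member keeps the per-page cost bounded as clusters grow, which appears to be the motivation for the representative set in the abstract.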

Inventors:
Summer welcome torch
John Junguang
Haruka Meng
Jenu Yen
Application Number:
JP2019221285A
Publication Date:
September 20, 2023
Filing Date:
December 06, 2019
Assignee:
Fujitsu Limited (富士通株式会社)
International Classes:
G06F16/84
Domestic Patent References:
JP2007199966A
JP2009181301A
JP2005092889A
Foreign References:
US20180300576
Attorney, Agent or Firm:
Tadashige Ito
Tadahiko Ito



 