Title:
Method for clustering or categorizing documents
Document Type and Number:
Japanese Patent JP4774073
Kind Code:
B2
Abstract:
Documents are clustered or categorized to generate a model associating documents with classes. Outlier measures are computed for the documents, indicating how well each document fits the model. Outlier documents are identified to a user based on the outlier measures and a user-selected outlier criterion. Ambiguity measures are computed for the documents, indicating the number of classes with which each document has similarity under the model. If a document is annotated with a label class, a possible corrective label class is identified when the annotated document has higher similarity with that class under the model than with its annotated label class. The clustering or categorizing is then repeated, adjusted based on received user input, to generate an updated model associating documents with classes. Outlier and ambiguity measures are also calculated at runtime for new documents classified using the model.
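
The measures described in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so everything here is an assumption made for illustration: each document is represented by a hypothetical dictionary of per-class similarities in [0, 1], the outlier measure is taken as one minus the best similarity, and the ambiguity measure is taken as the exponential of the entropy of the normalized similarities (an "effective number of classes"). The function names are hypothetical and do not come from the patent.

```python
import math


def outlier_score(similarities):
    """Assumed outlier measure: high when the document fits no class well."""
    return 1.0 - max(similarities.values())


def ambiguity(similarities):
    """Assumed ambiguity measure: effective number of classes the document
    resembles, computed as exp(entropy) of the normalized similarities."""
    total = sum(similarities.values())
    if total <= 0:
        return 0.0  # no similarity to any class
    probs = [s / total for s in similarities.values() if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


def corrective_label(similarities, annotated):
    """Return a possible corrective class if some class is more similar to
    the document than its annotated class; otherwise return None."""
    best = max(similarities, key=similarities.get)
    if similarities[best] > similarities[annotated]:
        return best
    return None


def flag_outliers(docs, threshold):
    """Apply a user-selected criterion (here a simple threshold, which is an
    assumption) and return the ids of documents to show to the user."""
    return [doc_id for doc_id, sims in docs.items()
            if outlier_score(sims) > threshold]


# Hypothetical per-class similarities for three documents.
docs = {
    "d1": {"sports": 0.9, "politics": 0.1},
    "d2": {"sports": 0.5, "politics": 0.5},
    "d3": {"sports": 0.1, "politics": 0.1},
}
outliers = flag_outliers(docs, threshold=0.6)
suggestion = corrective_label(docs["d1"], annotated="politics")
```

In this sketch, a document split evenly between two classes has an ambiguity of about 2, and a document annotated with a class that is not its most similar one receives the most similar class as a suggested correction. A real system would derive the similarities from the trained clustering or categorization model and rerun training after the user acts on the flagged documents.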

Inventors:
Jean-Michel Renders
Caroline Privault
Ludovic Menuge
Application Number:
JP2008095354A
Publication Date:
September 14, 2011
Filing Date:
April 01, 2008
Assignee:
XEROX CORPORATION
International Classes:
G06F17/30
Foreign References:
US6003027
US20030139828
US6751600
US7043492
Other References:
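Yoshiharu Ishikawa and one other, "An Improved Clustering Method Based on the Concept of Forgetting," DBSJ Letters, Japan, The Database Society of Japan, December 18, 2003, Vol. 2, No. 3, pp. 53-56
Masaki Aono, "A Method for Detecting Outlier Documents Using a Cluster Granularity Hierarchy," IPSJ SIG Technical Report (2005-DBS-137(I)), Japan, Information Processing Society of Japan, July 13, 2005, Vol. 2005, No. 67, pp. 1-7
Masaki Aono and one other, "On a Method for Improving Concept Search Performance for Patent Data Using Document-Word Co-clustering," DEWS2005 Proceedings [online], Japan, IEICE Technical Committee on Data Engineering, May 2, 2005, pp. 1-8
Xinhao WANG and three others, "Improving Chinese Text Categorization by Outlier Learning," Proceedings of 2005 IEEE International Conference on Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE'05 [online], November 1, 2005, pp. 602-607 [downloaded from IEEE Xplore]
Hongyu Li and one other, "Outlier Detection in Benchmark Classification Tasks," Proceedings. 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 [online], May 19, 2006, pp. V-557 to V-560 [downloaded from IEEE Xplore]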
Attorney, Agent or Firm:
Kenji Yoshida
Jun Ishida