Title:
METHOD FOR SEARCHING RELATION SUDDEN RISING WORD AND SYSTEM THEREOF
Document Type and Number:
WIPO Patent Application WO/2009/038285
Kind Code:
A1
Abstract:
A method and system for searching for a related term having rapidly increasing popularity is provided. The method includes: analyzing a search log and extracting a daily search frequency for each search term; comparing peaks of the daily search frequency, extracted for each search term in a predetermined period; and analyzing relevance between candidate search terms in which the peaks have occurred together in the predetermined period as a result of the comparison and filtering out a candidate search term having no relevance.

Inventors:
KIM DONGWOOK (KR)
Application Number:
PCT/KR2008/004634
Publication Date:
March 26, 2009
Filing Date:
August 08, 2008
Assignee:
NHN CORP (KR)
KIM DONGWOOK (KR)
International Classes:
G06F17/30
Foreign References:
KR100522029B1 2005-10-18
KR20050102869A 2005-10-27
KR20060029709A 2006-04-07
KR20070095552A 2007-10-01
Attorney, Agent or Firm:
MUHANN PATENT & LAW FIRM (6th Floor Myeonglim Building,51-8 Nonhyeon-dong, Gangnam-gu, Seoul 135-814, KR)
Claims:
CLAIMS

1. A method for searching for a related term having rapidly increasing popularity, the method comprising: analyzing a search log and extracting a daily search frequency for each search term; comparing peaks of the daily search frequency, extracted for each search term in a predetermined period; and analyzing relevance between candidate search terms in which the peaks have occurred together in the predetermined period as a result of the comparison, and filtering out a candidate search term having no relevance.

2. The method of claim 1, wherein the extracting of the daily search frequency for each search term comprises: extracting the daily search frequency for each search term in the predetermined period; and analyzing the daily search frequency for each search term, and extracting a search term whose search frequency increases more rapidly than a predetermined reference value of increasing and decreases more rapidly than a predetermined reference value of decreasing, and extracting time information when the peak has occurred.

3. The method of claim 1, wherein the comparing of the peaks of the extracted daily search frequency for each search term in the predetermined period comprises: comparing the peaks of the extracted daily search frequency for each search term in the predetermined period; and searching for candidate search terms in which the peaks have occurred together in the predetermined period as the result of the comparison.

4. The method of claim 1, further comprising: analyzing time information of rapid increase and time information of rapid decrease of peaks to be compared and establishing the predetermined period so that the time information of rapid decrease is greater than the time information of rapid increase.

5. The method of claim 1, wherein the analyzing of relevance between candidate search terms in which the peaks have occurred together in the predetermined period as the result of the comparison and filtering out the candidate search term having no relevance comprises: analyzing the candidate search terms in which the peaks have occurred together in the predetermined period as the result of the comparison and determining whether relevance exists between the search terms; and filtering out a candidate search term having no relevance from the candidate search terms as the result of the determination.

6. The method of claim 5, wherein the determining of whether relevance exists between the candidate search terms analyzes the search terms and determines the analyzed search term has relevance when the analyzed search term is a correlative search term.

7. The method of claim 5, wherein the determining of whether relevance exists between the candidate search terms measures a number of search sessions where the search terms are inputted and a number of search sessions where a pair of search terms included in the search terms are inputted, and determines whether correlation exists in search terms.

8. The method of claim 5, wherein the determining of whether relevance exists between the candidate search terms measures a number of user identifiers where the search terms are inputted and a number of user identifiers wherein the pair of the search terms included in the search terms are inputted, and determines whether correlation exists.

9. The method of claim 5, wherein the determining of whether relevance exists between the candidate search terms measures a number of Internet Protocol (IP) addresses where the search terms are inputted and a number of IP addresses where a pair of search terms including the search term are inputted, and determines whether correlation exists.

10. The method of claim 5, wherein the determining of whether relevance exists between the candidate search terms analyzes the search terms, and, when a single search term is included in a portion of another search term, determines there is relevance between the search terms.

11. The method of claim 1, further comprising: recording and maintaining the search log including the daily search frequency in a database, wherein the analyzing of the search log and extracting of the daily search frequency for each search term analyzes the search log with reference to the database and extracts the daily search frequency for each search term.

12. At least one computer-readable storage medium storing instructions for implementing the method of any one of claims 1 through 11.

13. A system for searching for a related term having rapidly increasing popularity, the system comprising: an extraction unit analyzing a search log and extracting daily search frequency for each search term; a comparison unit analyzing a search log and comparing the extracted daily search frequency for each search term; and a filtering unit analyzing relevance between candidate search terms in which peaks have occurred together in the predetermined period as a result of the comparison and filtering out a candidate search term having no relevance.

14. The system of claim 13, wherein the extraction unit extracts the daily search frequency for each search term within the predetermined period, analyzes the daily search frequency for each search term and extracts a search term whose search frequency increases more rapidly than a predetermined reference value of increasing and decreases more rapidly than a predetermined reference value of decreasing, and extracts time information when the peak has occurred.

15. The system of claim 13, wherein the comparison unit compares the peaks of the extracted daily search frequency for each search term in the predetermined period, and searches for candidate search terms in which peaks have occurred together in the predetermined period as the result of the comparison.

16. The system of claim 13, wherein the comparison unit analyzes time information of a rapid increase and time information of a rapid decrease of the peaks to be compared and establishes the predetermined period so that the time information of rapid decrease is greater than the time information of rapid increase.

17. The system of claim 13, wherein the filtering unit analyzes the candidate search terms in which the peaks have occurred together in the predetermined period as the result of the comparison, determines whether relevance exists between the candidate search terms, and filters out a candidate search term having no relevance from the candidate search terms as the result of the determination.

18. The system of claim 13, wherein the filtering unit analyzes the search terms and determines the analyzed search term has relevance when the analyzed search term is a correlative search term.

19. The system of claim 13, wherein the filtering unit measures a number of search sessions where the search terms are inputted and a number of search sessions where a pair of search terms included in the search terms are inputted, determines whether correlation exists, and filters out a candidate search term having no relevance from the candidate search terms as a result of the determination.

20. The system of claim 13, wherein the filtering unit measures a number of user identifiers where the search terms are inputted and a number of user identifiers where the pair of the search terms included in the search terms are inputted, determines whether correlation exists, and filters out a candidate search term having no relevance from the candidate search terms as a result of the determination.

21. The system of claim 13, wherein the filtering unit measures a number of IP addresses where the search terms are inputted and a number of IP addresses where a pair of search terms including the search term are inputted, determines whether correlation exists, and filters out a candidate search term having no relevance from the candidate search terms as a result of the determination.

22. The system of claim 13, wherein the filtering unit analyzes the search terms, and, when a single search term is included in a portion of another search term, determines there is relevance between those search terms.

23. The system of claim 13, further comprising a database recording and maintaining the search log including the daily search frequency, wherein the extraction unit analyzes the search log with reference to the database and extracts the daily search frequency for each search term.

Description:

METHOD FOR SEARCHING RELATION SUDDEN RISING WORD AND SYSTEM THEREOF

Technical Field

The present invention relates to a method and a system for searching for a related term having rapidly increasing popularity, and more particularly, to a method and a system for searching for related terms that share the same peak in the change of search frequency over time.

Background Art

When a user inputs a search query, a search service system generally provides the user with search results corresponding to that query (for example, web pages containing the search query, articles including the search query, images with filenames including the search query, and the like).

Recent search service systems, in order to provide more accurate and faster search results, also provide users with related search queries, which are extracted on the basis of their relevance to the inputted search query. This matters because a search service system usually provides different search results for different queries. As an example, the search result when a user inputs "motor vehicle" is different from the search result when a user inputs "car". A user therefore wants to input the search query most relevant to the desired information, yet users sometimes have difficulty coming up with such a query. Consequently, recent search service systems provide users with search queries related to the inputted search query, enabling users to search again with other search queries.

Here, the related search query may refer to a search query that is a higher or lower concept of the inputted search query ("Foreign language" when "Japanese" is inputted, or "Japanese" when "Foreign language" is inputted), to a search query that is a synonym of the inputted query ("bookstore" when "bookshop" is inputted), to a search query that has a similar meaning to the inputted query ("tail" when "rear" is inputted), to a search query that is a related word form ("saw, seen, seeing" when "see" is inputted), and to other search queries that are related in meaning. However, related search queries are not limited to queries that are related in meaning and may relate to the inputted query from various perspectives. For example, when "Chanho Park" is inputted by a user, "Baseball", which is his occupation, "Major League", which is the league he belongs to, "Hanyang University", which is the school he graduated from, "Texas Rangers", which is his current team, and "Byeonghyeon Kim", who is another Korean baseball pitcher in the major league, may be shown.

However, a conventional method for searching for a related term is limited to search terms which are higher or lower in concept, to search terms which have synonym relationships, or to search terms simply related to the search query. Thus, the conventional method for searching for the related term has the disadvantage that there is a high possibility of extracting a result which is not related to the inputted search query, and a possibility of not satisfying user demand for a higher-quality service in which the desired information can be obtained more promptly and more accurately.

Accordingly, there is a need for a new technology which may provide users with a related term with higher accuracy by effectively collecting data related to inputted search queries, by systematically analyzing the collected data, and by accurately determining the related term having rapidly increasing popularity using the change of search frequency over time.

Disclosure of Invention

Technical Goals

According to an aspect of the present invention, there is provided a method and system for searching for a related term having rapidly increasing popularity which has the same peak representing a change of a search frequency over time.

According to an aspect of the present invention, there is provided a method and system which provides a related term having rapidly increasing popularity by searching for peaks in a time distribution of a search term, finding out candidate search terms by comparing the peaks, and filtering out a candidate search term having no relevance from the candidate search terms.

Technical Solutions

According to embodiments of the present invention, a method for searching for a related term having rapidly increasing popularity includes: analyzing a search log and extracting a daily search frequency for each search term; comparing peaks of the daily search frequency, extracted for each search term in a predetermined period; and analyzing relevance between candidate search terms in which the peaks have occurred together in the predetermined period as a result of the comparison and filtering out a candidate search term having no relevance.

According to embodiments of the present invention, a system for searching for a related term having rapidly increasing popularity includes: an extraction unit analyzing a search log and extracting daily search frequency for each search term; a comparison unit analyzing a search log and comparing the extracted daily search frequency for each search term; and a filtering unit analyzing relevance between candidate search terms in which peaks have occurred together in the predetermined period as a result of the comparison and filtering out a candidate search term having no relevance.

Advantageous Effects

According to the present invention, it is possible to provide a method and system for searching for a related term having rapidly increasing popularity which has the same peak representing a change of a search frequency over time.

According to the present invention, it is possible to provide a method and system for searching for a related term having rapidly increasing popularity by searching for peaks in a time distribution of a search term, searching for candidate search terms by comparing the peaks, and filtering out a candidate search term having no relevance.

Brief Description of Drawings

FIG. 1 is a diagram illustrating an operational relationship between a system for searching for a related term having rapidly increasing popularity of the present invention and a user terminal;

FIG. 2 is a diagram illustrating an example of a daily search frequency for each search term according to the present invention;

FIG. 3 is a diagram illustrating a configuration of a system for searching for a related term having rapidly increasing popularity according to an embodiment of the present invention;

FIG. 4 is a flowchart illustrating a method for searching for a related term having rapidly increasing popularity according to an embodiment of the present invention;

FIG. 5 is a flowchart illustrating operations of extracting a daily search frequency for each search term according to an embodiment of the present invention;

FIG. 6 is a flowchart illustrating operations of comparing peaks of a daily search frequency for each search term according to an embodiment of the present invention; and

FIG. 7 is a flowchart illustrating operations of filtering out a search term having no relevance according to an embodiment of the present invention.

Best Mode for Carrying Out the Invention

FIG. 1 is a diagram illustrating an operational relationship between a system for searching for a related term having rapidly increasing popularity of the present invention and a user terminal.

Referring to FIG. 1, a system 110 for searching for a related term having rapidly increasing popularity receives search terms which are inputted by users using user terminals 130-1 to 130-n via a communication network 120. The communication network 120 may include the Internet and various wired and wireless communication networks for data communication between the system 110 for searching for the related term having rapidly increasing popularity and the user terminals 130-1 to 130-n. The user terminals 130-1 to 130-n receive, from a user, search terms which are related to a desired object to be searched for, and forward the search terms to the system 110 for searching for the related term having rapidly increasing popularity via the communication network 120.

The system 110 for searching for the related term having rapidly increasing popularity records and maintains a search log including a daily search frequency for each search term which is inputted by the users.

FIG. 2 is a diagram illustrating an example of a daily search frequency for each search term according to the present invention.

Referring to FIG. 2, the daily search frequency varies every day. This is because the number of user inputs for a search term varies depending on the issues or interests of society at the time. That is, a peak in the daily search frequency of a search term over time indicates the point at which the search term becomes a social issue or is otherwise meaningful. Some search terms share a peak at the same time, and such search terms are likely to be relevant to each other.

As illustrated in FIG. 2, the system 110 for searching for the related term having rapidly increasing popularity searches for search terms having the same peak which indicates changes in daily search frequency over time.

As an example, when a search term of 'Ministry of Justice' has been inputted from user terminals 130-1 to 130-n, the system 110 for searching for the related term having rapidly increasing popularity may extract other terms which have shown the same peak at the same time point such as the 'Ministry of Labor, Ministry of Administration, Blue House, Ministry of Health and Welfare, Sashiro, Journal of Law, Bar examination, Judicial examination, 815 special pardons, homepage of National Police Agency, pardons for drunken driving, and Drive License Management Agency'.

The system 110 for searching for the related term having rapidly increasing popularity, as illustrated in FIG. 2, searches for the peaks in a time distribution, compares the retrieved peaks to find candidate search terms, and filters out a candidate search term having no relevance from the candidate search terms, thereby searching for the related term having rapidly increasing popularity with the relevance.

The system 110 for searching for the related term having rapidly increasing popularity provides users with the retrieved related term having rapidly increasing popularity via the user terminals 130-1 to 130-n in the communication network 120.

FIG. 3 is a diagram illustrating a configuration of the system 110 of FIG. 1 for searching for a related term having rapidly increasing popularity according to an embodiment of the present invention.

Referring to FIG. 3, the system 110 for searching for the related term having rapidly increasing popularity includes a search term input unit 310, a database 320, an extraction unit 330, a comparison unit 340 and a filtering unit 350.

The search term input unit 310 receives search terms from users. That is, users input, into the search term input unit 310, search terms which may be related to a desired object.

The database 320 records and maintains a search log including a daily search frequency for each search term. In other words, the database 320 may record and maintain the search log including time information of inputted search terms, a daily search frequency for each search term, and the like.

The extraction unit 330 analyzes the search log with reference to the database 320 and extracts the daily search frequency of each search term. The extraction unit 330 extracts the daily search frequency of each search term from the database 320 in a predetermined period.
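The extraction of a daily search frequency from a search log can be sketched as follows. This is only an illustrative sketch, not the implementation disclosed here: the log layout (a list of timestamp and term pairs), the function name `daily_frequencies`, and the year in the sample dates are all assumptions, since the description gives only "June 24" and "July 2".

```python
from collections import Counter, defaultdict
from datetime import date, datetime


def daily_frequencies(log, start, end):
    """Count searches per term and per day for entries dated start..end inclusive.

    `log` is assumed to be an iterable of (timestamp, term) pairs.
    """
    freq = defaultdict(Counter)
    for timestamp, term in log:
        day = timestamp.date()
        if start <= day <= end:  # keep only the predetermined period
            freq[term][day] += 1
    return freq


# Hypothetical log entries; the year 2007 is an assumption for illustration.
log = [
    (datetime(2007, 6, 24, 9, 0), "Ministry of Justice"),
    (datetime(2007, 6, 24, 9, 5), "Ministry of Justice"),
    (datetime(2007, 6, 25, 8, 0), "Ministry of Justice"),
    (datetime(2007, 7, 2, 12, 0), "Lee Hyori"),
]
freq = daily_frequencies(log, date(2007, 6, 1), date(2007, 6, 30))
```

Here the "Lee Hyori" entry falls outside the chosen period and is therefore not counted.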

Also, the extraction unit 330 analyzes the daily search frequency of each search term and extracts a peak with rapid increase or decrease in a short period of time. In other words, the extraction unit 330 analyzes the daily search frequency of each search term and extracts a search term whose search frequency increases more rapidly than a predetermined reference value of increasing and decreases more rapidly than a predetermined reference value of decreasing, and extracts time information when the peak has occurred. As an example, as the analysis result of the daily search frequency of each search term, the extraction unit 330 may extract a search term in which a peak has occurred and time information when the peak has occurred, the peak showing a tenfold increase in popularity within a day and a tenfold decrease in popularity within a week. As an example, the extraction unit 330 may extract a search term and time information in a pair as in (Lee Hyori, July 2) and (Ministry of Justice, June 24).
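One possible reading of this peak criterion is sketched below. It is a sketch under stated assumptions rather than the disclosed implementation: the function name `find_peaks`, the interpretation of "10 times" as a tenfold ratio against the previous day, the treatment of missing days as zero searches, and the sample counts are all hypothetical.

```python
from datetime import date, timedelta


def find_peaks(series, rise_factor=10.0, fall_factor=10.0, fall_window=7):
    """Return (up_date, down_date) pairs for peaks in a date -> count mapping.

    A peak is assumed to start on a day whose count is at least `rise_factor`
    times the previous day's count, and to end on the first day within
    `fall_window` days whose count has dropped by at least `fall_factor`.
    """
    peaks = []
    if not series:
        return peaks
    one_day = timedelta(days=1)
    first, last = min(series), max(series)
    day = first + one_day  # the first day has no previous day to compare with
    while day <= last:
        today = series.get(day, 0)
        # Floor the previous count at 1 so a day with no searches still
        # gives a usable baseline (an assumption of this sketch).
        yesterday = max(series.get(day - one_day, 0), 1)
        if today >= rise_factor * yesterday:
            for offset in range(1, fall_window + 1):
                later = day + offset * one_day
                if later > last:  # do not read past the observed data
                    break
                if series.get(later, 0) * fall_factor <= today:
                    peaks.append((day, later))
                    break
        day += one_day
    return peaks


# Hypothetical daily counts for one search term (the year is an assumption).
counts = [(20, 100), (21, 110), (22, 95), (23, 105), (24, 1500),
          (25, 900), (26, 400), (27, 120), (28, 100)]
ministry = {date(2007, 6, d): n for d, n in counts}
peaks = find_peaks(ministry)
```

With these sample counts the only detected peak rises on June 24 and has fallen by June 27, which corresponds to a pair such as (Ministry of Justice, June 24).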

The comparison unit 340 compares the daily search frequency extracted for each search term in a predetermined period. In other words, the comparison unit 340 compares peaks for the daily search frequency for each search term and searches for candidate search terms having the same peak in the same period as a result of the comparison. Also, the comparison unit 340 may analyze time information of a rapid increase and time information of a rapid decrease for the peaks being compared and may establish the predetermined period so that the time information of the rapid decrease is greater than the time information of the rapid increase. As an example, let peak.up denote the date of increase for a specific peak and peak.down denote the date of decrease for that peak, and assume there are a first peak (peak1) and a second peak (peak2). When |peak1.up - peak2.up| < delta1 and |peak1.down - peak2.down| < delta2, it may be assumed that the first peak and the second peak are identical peaks. Here, delta1 refers to a time difference for an increase of the first peak and the second peak, and delta2 refers to a time difference for a decrease of the first peak and the second peak. Since the timing of the decrease generally differs more than the timing of the increase, the comparison unit 340 may compare the peaks of the extracted daily search frequency for each search term in the predetermined period so as to have a larger delta2 value than a delta1 value.
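The comparison rule above translates almost directly into code. The following is a minimal sketch: the `Peak` class, the function name `same_peak`, and the default tolerances of 2 and 4 days are assumptions chosen only so that delta2 is larger than delta1, and the year in the sample dates is hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Peak:
    up: date    # date of the rapid increase (peak.up)
    down: date  # date of the rapid decrease (peak.down)


def same_peak(peak1, peak2,
              delta1=timedelta(days=2), delta2=timedelta(days=4)):
    """True when |peak1.up - peak2.up| < delta1 and
    |peak1.down - peak2.down| < delta2."""
    return (abs(peak1.up - peak2.up) < delta1
            and abs(peak1.down - peak2.down) < delta2)


a = Peak(up=date(2007, 6, 24), down=date(2007, 6, 27))
b = Peak(up=date(2007, 6, 25), down=date(2007, 6, 29))
c = Peak(up=date(2007, 7, 2), down=date(2007, 7, 5))
```

In this example `same_peak(a, b)` holds, because the increases differ by one day and the decreases by two days, while `same_peak(a, c)` does not.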

The filtering unit 350 analyzes relevance between the candidate search terms in which the peaks have occurred together in the predetermined period as a result of the comparison and filters out a candidate search term having no relevance. In other words, the filtering unit 350 analyzes the candidate search terms in which the peaks have occurred together in the predetermined period as the result of the comparison, determines whether relevance exists between the search terms, and filters out a candidate search term having no relevance from the candidate search terms as the result of the determination. As an example, the filtering unit 350 analyzes the candidate search terms, and, when they are related search terms, may perform filtering so that the related search terms may be selected as related search terms with rapidly increasing popularity. As another example, the filtering unit 350 may measure a number of search sessions where the search terms are inputted, and a number of search sessions where a pair of search terms included in the search terms are inputted, determine whether correlation exists in search terms, and filter out a candidate search term from the candidate search terms as a result of the determination. As another example, the filtering unit 350 may measure a number of user identifiers where the search terms are inputted and a number of user identifiers where a pair of search terms included in the search terms are inputted to determine whether the relevance exists between the search terms and filter out a candidate search term having no relevance from the candidate search terms according to the determination result. As another example, the filtering unit 350 may measure a number of Internet Protocol (IP) addresses where the above search terms are inputted, and a number of IP addresses where a pair of search terms including the search term are inputted to determine relevance and filter out a candidate search term having no relevance from the candidate search terms according to the relevance decision result. As another example, the filtering unit 350 may analyze the search terms, and, when a single search term is included in a portion of another search term, determine there is relevance between the search terms and consequently determine the search terms to be related terms having rapidly increasing popularity.
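Two of the relevance checks described above, counting co-occurrence and testing whether one term is contained in another, can be sketched as follows. The text does not state how the counts are turned into a correlation decision, so the ratio used here (sessions containing both terms divided by sessions containing either) and the threshold of 0.2 are assumptions, as are the function names and the sample term "Ministry of Justice homepage".

```python
def is_contained(a, b):
    """True when one search term appears inside the other."""
    return a != b and (a in b or b in a)


def cooccurrence_ratio(a, b, sessions):
    """Share of sessions containing either term that contain both.

    `sessions` is assumed to be a list of sets of search terms. Grouping the
    log by user identifier or by IP address instead yields the same
    computation for the other two variants.
    """
    n_a = sum(1 for s in sessions if a in s)
    n_b = sum(1 for s in sessions if b in s)
    n_ab = sum(1 for s in sessions if a in s and b in s)
    either = n_a + n_b - n_ab
    return n_ab / either if either else 0.0


def filter_candidates(term, candidates, sessions, threshold=0.2):
    """Keep candidates judged relevant to `term`; drop the rest."""
    return [c for c in candidates
            if is_contained(term, c)
            or cooccurrence_ratio(term, c, sessions) >= threshold]


# Hypothetical sessions and candidates for illustration.
sessions = [
    {"Ministry of Justice", "815 special pardons"},
    {"Ministry of Justice", "815 special pardons"},
    {"Ministry of Justice"},
    {"Lee Hyori"},
]
candidates = ["815 special pardons", "Lee Hyori", "Ministry of Justice homepage"]
related = filter_candidates("Ministry of Justice", candidates, sessions)
```

In this sample, "815 special pardons" is kept because it co-occurs with the query in two of three sessions, "Ministry of Justice homepage" is kept by containment, and "Lee Hyori" is filtered out.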

In this way, the system 110 for searching for the related term having rapidly increasing popularity may provide the related term having rapidly increasing popularity by searching for peaks in a time distribution of search terms, finding candidate search terms by comparing the peaks, and filtering out a candidate search term having no relevance from the candidate search terms.

FIG. 4 is a flowchart illustrating a method for searching for a related term having rapidly increasing popularity according to an embodiment of the present invention.

Referring to FIG. 4, the system 110 for searching for the related term having rapidly increasing popularity records and maintains a search log in the database. In other words, in operation S410, the system 110 for searching for the related term having rapidly increasing popularity may record and maintain the search log including time information and the search frequency regarding the search terms inputted by users.

In operation S420, the system 110 for searching for the related term having rapidly increasing popularity refers to the database, and analyzes the search log to extract a daily search frequency for each search term.

FIG. 5 is a flowchart illustrating operations of extracting a daily search frequency for each search term according to an embodiment of the present invention.

Referring to FIG. 5, the system 110 for searching for the related term having rapidly increasing popularity extracts a daily search frequency for each search term in a predetermined period in operation S510. That is, in S510, the system 110 for searching for the related term having rapidly increasing popularity may extract the daily search frequency for each search term in the predetermined period as illustrated in FIG. 2.

In operation S520, the system 110 for searching for the related term having rapidly increasing popularity analyzes the daily search frequency of each search term and extracts a search term whose search frequency increases more rapidly than a predetermined reference value of increasing and decreases more rapidly than a predetermined reference value of decreasing, and extracts time information when the peak has occurred. As an example, in operation S520, the system 110 for searching for the related term having rapidly increasing popularity may extract a search term in which a peak has occurred and time information when the peak has occurred, the peak showing a tenfold increase in popularity within a day and a tenfold decrease in popularity within a week. As an example, the system 110 for searching for the related term having rapidly increasing popularity may extract a search term and time information in a pair as in (Lee Hyori, July 2) and (Ministry of Justice, June 24).

In operation S430, the system 110 for searching for the related term having rapidly increasing popularity compares the peaks for the extracted daily search frequency of each search term. Also, in operation S430, the system 110 for searching for the related term having rapidly increasing popularity may analyze time information of a rapid increase and time information of a rapid decrease of peaks to be compared and establish the predetermined period so that the time information of the rapid decrease is greater than the time information of the rapid increase.

FIG. 6 is a flowchart illustrating operations of comparing peaks of a daily search frequency for each search term according to an embodiment of the present invention.

Referring to FIG. 6, the system 110 for searching for the related term having rapidly increasing popularity compares the peaks of the extracted search term in the predetermined period. In operation S620, the system 110 for searching for the related term having rapidly increasing popularity, according to the peak comparison result, searches for candidate search terms in which peaks have occurred together in a predetermined period.
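The search for candidate search terms whose peaks occurred together can be sketched as a lookup over the peaks extracted for every term. This is a hypothetical sketch: the function name `candidates_for`, the representation of a peak as an (up date, down date) tuple, the tolerance defaults, and the sample dates (including the year) are assumptions.

```python
from datetime import date, timedelta


def candidates_for(term, peaks_by_term,
                   delta1=timedelta(days=2), delta2=timedelta(days=4)):
    """Return the other terms that share at least one peak with `term`.

    `peaks_by_term` maps each search term to a list of (up, down) date pairs.
    """
    def together(p, q):
        # Rises must lie within delta1 and falls within delta2 of each other.
        return abs(p[0] - q[0]) < delta1 and abs(p[1] - q[1]) < delta2

    own = peaks_by_term.get(term, [])
    return sorted(
        other for other, peaks in peaks_by_term.items()
        if other != term and any(together(p, q) for p in own for q in peaks)
    )


# Hypothetical peaks for three terms.
peaks_by_term = {
    "Ministry of Justice": [(date(2007, 6, 24), date(2007, 6, 27))],
    "Judicial examination": [(date(2007, 6, 24), date(2007, 6, 28))],
    "Lee Hyori": [(date(2007, 7, 2), date(2007, 7, 5))],
}
found = candidates_for("Ministry of Justice", peaks_by_term)
```

The returned terms are only candidates; whether they are actually relevant is decided in the subsequent filtering operation.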

In operation S440, the system 110 for searching for the related term having rapidly increasing popularity analyzes relevance between candidate search terms in which peaks have occurred together in the predetermined period as a result of the comparison and filters out a candidate search term having no relevance.

FIG. 7 is a flowchart illustrating operations of filtering out a search term having no relevance according to an embodiment of the present invention.

Referring to FIG. 7, in operation S710, the system 110 for searching for the related term having rapidly increasing popularity analyzes the candidate search terms in which the peaks have occurred together in the predetermined period as the result of the comparison and determines whether relevance exists between the search terms. In operation S710, the system 110 for searching for the related term having rapidly increasing popularity analyzes the search terms and determines the analyzed search term has relevance when the analyzed search term is a correlative search term. Specifically, in operation S710, the system 110 for searching for the related term having rapidly increasing popularity may measure a number of search sessions where the search terms are inputted and a number of search sessions where a pair of search terms included in the search terms are inputted, and determine whether correlation exists. As another example, in operation S710, the system 110 for searching for the related term having rapidly increasing popularity may measure a number of user identifiers where the search terms are inputted and a number of user identifiers where the pair of the search terms included in the search terms are inputted, and determine whether correlation exists. As another example, in operation S710, the system 110 for searching for the related term having rapidly increasing popularity may measure a number of IP addresses where the search terms are inputted and a number of IP addresses where a pair of search terms including the search term are inputted, and determine whether correlation exists. As another example, in operation S710, the system 110 for searching for the related term having rapidly increasing popularity may analyze the search terms, and, when a single search term is included in a portion of another search term, determine there is relevance between the search terms.

In operation S720, the system 110 for searching for the related term having rapidly increasing popularity filters out a candidate search term having no relevance from the candidate search terms according to the relevance decision result. That is, in operation S720, the system 110 for searching for the related term having rapidly increasing popularity may filter out the candidate search term having no relevance from the above candidate search terms according to the relevance decision result and provide the remaining candidate search terms as related terms having rapidly increasing popularity.

As described above, a method for searching for a related term having rapidly increasing popularity according to the present invention may provide a related term having a rapidly increasing popularity by searching for peaks in a time distribution of search terms, finding candidate search terms by comparing the peaks, and filtering out a candidate search term having no relevance.

The method for searching for the related term having rapidly increasing popularity according to the above-described exemplary embodiments may be recorded in computer-readable media including program instructions to implement various operations embodied by a computer. The media may also include, alone or in combination with the program instructions, data files, data structures, and the like. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD ROM disks and DVD; magneto-optical media such as optical disks; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory (ROM), random access memory (RAM), flash memory, and the like. The media may also be a transmission medium such as optical or metallic lines, wave guides, and the like, including a carrier wave transmitting signals specifying the program instructions, data structures, and the like. Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter. The described hardware devices may be configured to act as one or more software modules in order to perform the operations of the above-described embodiments of the present invention.

The foregoing descriptions of specific embodiments of the present invention have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed, and obviously many modifications and variations are possible in light of the above teaching. Therefore, it is intended that the scope of the invention be defined by the claims appended thereto and their equivalents.

Although a few embodiments of the present invention have been shown and described, the present invention is not limited to the described embodiments. Instead, it would be appreciated by those skilled in the art that changes may be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.