Database creation device and search system

Document No.: 1406188. Publication date: 2020-03-06.

Abstract (Database creation device and search system; created by 坂本大辅, 2019-08-27): The invention provides a database creation device and a search system capable of creating a database with improved usefulness. A data processing server (2) acquires Japanese data and foreign language data from an external server (6), translates the foreign language data into Japanese by machine translation to create machine translation data, combines the machine translation data with the Japanese data as a part of the Japanese data to create mixed data, and creates stored data from the mixed data.

1. A database creation device, characterized by comprising:

a text information acquisition unit that performs a predetermined filtering process on public information disclosed by a predetermined medium to acquire, as text information associated with a predetermined field, 1st language text information whose text is written in a predetermined 1st language and 2nd language text information whose text is written in one or more 2nd languages other than the predetermined 1st language;

a translated text information creating unit that creates translated text information by translating the 2nd language text information into the predetermined 1st language by a predetermined translation method;

a mixed text information creating unit that creates mixed text information by combining the translated text information, as a part of the 1st language text information, with the 1st language text information; and

a database creating unit that performs an extraction process of extracting perceptual information from the mixed text information and a noise removal process of removing noise information from the mixed text information, and then creates a database for search by associating the perceptual information with the mixed text information from which the noise information has been removed.

2. The database creation device according to claim 1, characterized in that,

in the noise removal process, when a predetermined noun associated with the predetermined field is included in the mixed text information and the part of speech following the predetermined noun is anything other than a subject-case, object-case, or possessive-case particle, the mixed text information including the predetermined noun is removed as the noise information.

3. A search system, characterized by comprising:

the database creation device according to claim 1;

a database storage unit that stores the database;

a search unit that searches the database stored in the database storage unit based on a predetermined keyword associated with the predetermined field;

a discrimination unit that discriminates the perceptual information in a search result of the search unit into a plurality of categories of perceptual information; and

a display unit that displays the plurality of categories of perceptual information in colors different from each other.

4. A search system, characterized by comprising:

the database creation device according to claim 1;

a database storage unit that stores the database;

a search unit that searches the database stored in the database storage unit based on a predetermined keyword associated with the predetermined field;

a discrimination unit that discriminates the perceptual information in a search result of the search unit into perceptual information classified into a plurality of stages from the top level to the bottom level; and

a display unit that displays the perceptual information classified into the plurality of stages in a stepwise manner in order from the top level to the bottom level.

5. A search system, characterized by comprising:

the database creation device according to claim 1;

a database storage unit that stores the database;

a search unit that searches the database stored in the database storage unit based on a predetermined search period; and

a display unit that displays a plurality of pieces of perceptual information in a search result of the search unit and, when any one of the plurality of pieces of perceptual information is selected, displays a related word corresponding to the selected perceptual information and information of the database.

6. A search system, characterized by comprising:

the database creation device according to claim 2;

a database storage unit that stores the database;

a search unit that searches the database stored in the database storage unit based on a predetermined keyword associated with the predetermined field;

a discrimination unit that discriminates the perceptual information in a search result of the search unit into a plurality of categories of perceptual information; and

a display unit that displays the plurality of categories of perceptual information in colors different from each other.

7. A search system, characterized by comprising:

the database creation device according to claim 2;

a database storage unit that stores the database;

a search unit that searches the database stored in the database storage unit based on a predetermined keyword associated with the predetermined field;

a discrimination unit that discriminates the perceptual information in a search result of the search unit into perceptual information classified into a plurality of stages from the top level to the bottom level; and

a display unit that displays the perceptual information classified into the plurality of stages in a stepwise manner in order from the top level to the bottom level.

8. A search system, characterized by comprising:

the database creation device according to claim 2;

a database storage unit that stores the database;

a search unit that searches the database stored in the database storage unit based on a predetermined search period; and

a display unit that displays a plurality of pieces of perceptual information in a search result of the search unit and, when any one of the plurality of pieces of perceptual information is selected, displays a related word corresponding to the selected perceptual information and information of the database.

Technical Field

The present invention relates to a database creating device and the like that create a database for search.

Background

Conventionally, the database creation device described in patent document 1 (Japanese Patent Application Laid-Open No. 2011-48527) has been known. In this database creation device, perceptual expressions are extracted from Japanese text information, and a search target database is created by associating perceptual information with each search target using the created perceptual expression database.

Another known database creation device is described in patent document 2 (Japanese Patent Application Laid-Open No. 2010-272075). This database creation device creates a search target database by extracting perceptual expressions from Japanese text information using a perceptual expression dictionary and perceptual expression extraction rules, and by generating perceptual information for each search target using a perceptual vector dictionary.

Disclosure of Invention

The database creation devices of patent documents 1 and 2 create a database based only on Japanese text information. Because the data collection range is limited, the usefulness of the database is low, and the usefulness of search results obtained from the database is reduced accordingly.

The present invention has been made to solve the above problems, and an object thereof is to provide a database creating device and the like capable of creating a database while improving usefulness.

Means for solving the problems

In order to achieve the above object, a database creation device according to the present invention includes: a text information acquisition unit that performs a predetermined filtering process on public information disclosed by a predetermined medium to acquire, as text information associated with a predetermined field, 1st language text information whose text is written in a predetermined 1st language and 2nd language text information whose text is written in one or more 2nd languages other than the predetermined 1st language; a translated text information creating unit that creates translated text information by translating the 2nd language text information into the predetermined 1st language by a predetermined translation method; a mixed text information creating unit that creates mixed text information by combining the translated text information, as a part of the 1st language text information, with the 1st language text information; and a database creating unit that performs an extraction process of extracting perceptual information from the mixed text information and a noise removal process of removing noise information from the mixed text information, and then creates a database for search by associating the perceptual information with the mixed text information from which the noise information has been removed.

According to this database creation device, the predetermined filtering process is performed on the public information disclosed by the predetermined medium, and the 1st language text information, whose text is written in the predetermined 1st language, and the 2nd language text information, whose text is written in one or more 2nd languages other than the predetermined 1st language, are acquired as text information associated with the predetermined field. The 2nd language text information is then translated into the 1st language by the predetermined translation method to create the translated text information, and the translated text information is combined, as a part of the 1st language text information, with the 1st language text information to create the mixed text information. Since the database for search is created from this mixed text information, the database can, unlike in patent documents 1 and 2, draw on text in two or more languages contained in the public information disclosed by the predetermined medium. Thus, for example, a search of the database covers a wider range of information than in patent documents 1 and 2, which improves the usefulness of the database.

In addition, an extraction process of extracting perceptual information from the mixed text information and a noise removal process of removing noise information from the mixed text information are performed, and the database is created by associating the perceptual information with the mixed text information from which the noise information has been removed. As a result, for example, a search of the database returns appropriate information while avoiding information that constitutes noise, which further improves the usefulness of the database. Note that the "predetermined medium" in this specification includes mass media such as TV, radio, and newspapers, web media such as electronic bulletin boards, blogs, and SNS, and multimedia.

In the present invention, it is preferable that, in the noise removal process, when a predetermined noun associated with the predetermined field is included in the mixed text information and the part of speech following the predetermined noun is anything other than a subject-case, object-case, or possessive-case particle, the mixed text information including the predetermined noun is removed as the noise information.

According to this database creation device, in the noise removal process, when a predetermined noun associated with the predetermined field is included in the mixed text information and the part of speech following the predetermined noun is anything other than a subject-case, object-case, or possessive-case particle, the mixed text information including the predetermined noun is removed as the noise information. In such a case, the predetermined noun is highly likely to be used as part of a word other than a noun. It is therefore possible to keep noise information containing such easily confused words out of the database, which further improves the usefulness of the database.

The search system of the present invention is characterized by comprising: the database creation device; a database storage unit that stores the database; a search unit that searches the database stored in the database storage unit based on a predetermined keyword associated with the predetermined field; a discrimination unit that discriminates the perceptual information in the search result of the search unit into a plurality of categories of perceptual information; and a display unit that displays the plurality of categories of perceptual information in colors different from each other.

According to this search system, the database stored in the database storage unit is searched based on the predetermined keyword associated with the predetermined field, and the perceptual information in the search result of the search unit is classified into a plurality of categories of perceptual information. Further, since the plurality of categories of perceptual information are displayed in colors different from each other, the user of the search system can grasp the categories of perceptual information in the search result at a glance, which improves convenience.

The search system of the present invention is characterized by comprising: the database creation device; a database storage unit that stores the database; a search unit that searches the database stored in the database storage unit based on a predetermined keyword associated with the predetermined field; a discrimination unit that discriminates the perceptual information in the search result of the search unit into perceptual information classified into a plurality of stages from the top level to the bottom level; and a display unit that displays the perceptual information classified into the plurality of stages in a stepwise manner in order from the top level to the bottom level.

According to this search system, the database stored in the database storage unit is searched based on the predetermined keyword associated with the predetermined field, and the perceptual information in the search result of the search unit is classified into a plurality of stages from the top level to the bottom level. The perceptual information classified into the plurality of stages is then displayed in a stepwise manner in order from the top level to the bottom level. In this way, the user of the search system can work through the perceptual information in the search result step by step from the top level to the bottom level, and can thereby study in detail what perceptual information the search result contains.

The search system of the present invention is characterized by comprising: the database creation device; a database storage unit that stores a database; a search unit that searches the database stored in the database storage unit based on a predetermined search period; and a display unit that displays the plurality of perceptual information in the search result of the search unit and displays the related word corresponding to the selected perceptual information and the information of the database when any one of the plurality of perceptual information is selected.

According to this search system, the database stored in the database storage unit is searched based on a predetermined search period, and a plurality of perceptual information in the search result of the search unit is displayed. When any one of the plurality of perceptual information is selected, the related word corresponding to the selected perceptual information and the information of the database are displayed. Thus, the user of the search system can refer to the related word corresponding to the selected perceptual information and the information of the database, and the convenience can be improved.

Drawings

Fig. 1 is a diagram schematically showing the configuration of a database creating apparatus and a search system according to an embodiment of the present invention.

Fig. 2 is a flowchart showing the stored data creating process.

Fig. 3 is a diagram showing an example of acquired text data.

Fig. 4 is a diagram showing an example of Japanese data.

Fig. 5 is a diagram showing an example of foreign language data.

Fig. 6 is a diagram showing an example of data that does not require translation.

Fig. 7 is a diagram showing an example of translation data.

Fig. 8 is a diagram showing an example of machine translation data.

Fig. 9 is a diagram showing an example of quasi-Japanese data.

Fig. 10 is a diagram showing an example of mixed data.

Fig. 11 is a diagram showing an example of data that does not require analysis.

Fig. 12 is a diagram showing an example of analysis data.

Fig. 13 is a diagram showing an example of large and small classifications of perceptual information.

Fig. 14 is a diagram showing an example of the stored data.

Fig. 15 is a diagram showing a communication operation in the 1st search process of the search system.

Fig. 16 is a diagram showing a display example of a related word.

Fig. 17 is a diagram showing an example of display of a large classification of perceptual information.

Fig. 18 is a diagram showing an example of display of small classifications of perceptual information.

Fig. 19 is a diagram showing an example of display of an original text of a database.

Fig. 20 is a diagram showing a communication operation in the 2nd search process of the search system.

Detailed Description

Hereinafter, a search system and a database creation device according to an embodiment of the present invention will be described with reference to the drawings. Since the database creation device of the present embodiment is included in the search system, the following description covers the search system together with the functions and configuration of the database creation device.

As shown in fig. 1, a search system 1 according to the present embodiment includes a data processing server 2, a database server 3, and a plurality of search terminals 4 (only two of which are shown).

The data processing server 2 includes a processor, a memory (such as a RAM or a ROM), an I/O interface, and the like, and executes a stored data creating process and the like described later based on an arithmetic program in the memory.

A plurality of external servers 6 (only three are shown in the figure) are connected to the data processing server 2 via a network 5 (for example, the internet). In this case, various SNS servers, servers for predetermined media (e.g., news agencies), servers for searching websites, and the like correspond to the external server 6. In the present embodiment, the medium constituted by the external server 6 corresponds to a predetermined medium, and the data in the external server 6 corresponds to public information disclosed in the predetermined medium.

In the stored data creating process described later, the data processing server 2 acquires text information from the external servers 6, creates stored data, and outputs the stored data to the database server 3.

In the present embodiment, the data processing server 2 corresponds to a database creation device, a text information acquisition means, a translated text information creation means, a mixed text information creation means, a database creation means, a search means, and a classification means.

The database server 3 includes a processor, a memory, an I/O interface, and the like, as does the data processing server 2. In the database server 3, the stored data input from the data processing server 2 is stored in the memory as a part of the database. In the present embodiment, the database server 3 corresponds to a database storage means.

The search terminal 4 is a personal computer type terminal, and includes a display 4a, a storage 4b, an input interface 4c, and the like. Application software for search processing (hereinafter referred to as "search software") is installed in the storage 4b, and the input interface 4c consists of a keyboard, a mouse, and the like for operating the search terminal 4.

In the search terminal 4, as will be described later, while the search software is running, a search of the database and the like is executed by the data processing server 2 in response to the user's operation of the input interface 4c. In the present embodiment, the search terminal 4 corresponds to a search means and a display means.

Next, the stored data creating process will be described with reference to fig. 2. As described below, this process creates stored data, which forms a part of the database, from text data input from the external servers 6 to the data processing server 2, and is executed by the data processing server 2 at a predetermined control cycle.

The data acquired in the stored data creating process, the created data, and the calculated data are all stored in the RAM of the memory of the data processing server 2.

As shown in the figure, first, data is acquired (fig. 2/step 1). Specifically, text data including vehicle-related terms is acquired by applying a predetermined filtering process to the data input from the external servers 6 to the data processing server 2. In this case, text data such as that shown in fig. 3 is acquired, for example. In the figure, "X" represents a vehicle name, and "Y company" represents the name of a vehicle manufacturer.

The term related to a vehicle is a term in a field related to a vehicle such as a two-wheeled vehicle and a four-wheeled vehicle, and specifically, a vehicle name, a vehicle manufacturer name, a company (manager) name of a vehicle manufacturer, a vehicle component term, a vehicle racing term, a racer name, and the like correspond to the term related to a vehicle. In the present embodiment, the vehicle-related field corresponds to a predetermined field.

Next, a language classification process is performed (fig. 2/step 2). Specifically, the text data acquired as described above is classified into Japanese data and foreign language data. For example, the text data shown in fig. 3 is classified into the Japanese data shown in fig. 4 and the foreign language data shown in fig. 5.
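
The classification in step 2 can be sketched as follows. This is only an illustration under stated assumptions: the specification does not say how the language of a record is detected, so the sketch treats any record containing hiragana or katakana as Japanese. The function name `classify_language` and the sample records are hypothetical.

```python
import re

# Assumption: a record counts as Japanese if it contains at least one
# hiragana or katakana character (Unicode range U+3040 to U+30FF).
# The specification does not state the actual detection method.
KANA = re.compile(r"[\u3040-\u30ff]")


def classify_language(records):
    """Split text records into (japanese_data, foreign_language_data)."""
    japanese, foreign = [], []
    for text in records:
        (japanese if KANA.search(text) else foreign).append(text)
    return japanese, foreign


# Hypothetical sample records, not taken from fig. 3.
records = ["Xの乗り心地が良い", "I love the new X", "https://example.com/x"]
japanese_data, foreign_data = classify_language(records)
```

A kana check is deliberately crude: a record written only in kanji would be misclassified, so a real system would likely need a proper language identifier.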

Next, once the text data has been classified as described above, it is determined whether foreign language data exists (fig. 2/step 3). If the determination is negative (fig. 2/step 3 … NO), that is, if the text data consists only of Japanese data, the process proceeds to the analysis data selection process (fig. 2/step 8) described later.

On the other hand, when the determination is affirmative (fig. 2/step 3 … YES), the translation data selection process is executed (fig. 2/step 4). In this process, data that needs to be translated is selected as translation data from the foreign language data classified as described above. For example, in the case of the foreign language data shown in fig. 5, the URL data shown in fig. 6 does not need to be translated, so the data shown in fig. 7 is selected as the translation data.
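
A minimal sketch of the translation data selection in step 4, assuming that the only records exempt from translation are those consisting solely of a URL (the text gives URL data as the example; other exemptions may exist). `select_translation_data` and the sample records are hypothetical.

```python
import re

# Assumption: a record that is nothing but a URL needs no translation.
URL_ONLY = re.compile(r"^\s*https?://\S+\s*$")


def select_translation_data(foreign_data):
    """Return (translation_data, untranslated_data) for step 4."""
    to_translate, untranslated = [], []
    for text in foreign_data:
        (untranslated if URL_ONLY.match(text) else to_translate).append(text)
    return to_translate, untranslated


# Hypothetical foreign language records, not taken from fig. 5.
foreign_data = ["I love the new X", "https://example.com/x"]
translation_data, url_data = select_translation_data(foreign_data)
```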

Next, a machine translation process is performed (fig. 2/step 5). In this process, machine translation is performed on the translation data to obtain machine translation data. For example, when machine translation is performed on the translation data shown in fig. 7, the machine translation data shown in fig. 8 is obtained.

Next, quasi-Japanese data is created (fig. 2/step 6). If the translation data selection process left some data unselected, that is, data that was not machine-translated, quasi-Japanese data is created by combining that data with the machine translation data. For example, by combining the URL data shown in fig. 6 with the machine translation data shown in fig. 8, the quasi-Japanese data shown in fig. 9 is created. On the other hand, when no data was excluded from machine translation, the machine translation data is used as the quasi-Japanese data as it is.

Subsequently, mixed data is created (fig. 2/step 7). Specifically, the mixed data is created by combining the quasi-Japanese data with the Japanese data. For example, the quasi-Japanese data shown in fig. 9 is combined with the Japanese data shown in fig. 4 to create the mixed data shown in fig. 10.
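
Steps 6 and 7 amount to two combinations, which can be sketched as below. The sketch assumes each data set is a list of strings and that "combining" means simple concatenation; the record order and the function name `build_mixed_data` are assumptions, and the sample strings are hypothetical.

```python
def build_mixed_data(japanese_data, machine_translation_data, untranslated_data):
    """Return (quasi_japanese_data, mixed_data) for steps 6 and 7."""
    # Step 6: quasi-Japanese data is the machine translation data plus any
    # foreign language records that were excluded from translation.
    # With nothing excluded, it equals the machine translation data.
    quasi_japanese = list(machine_translation_data) + list(untranslated_data)
    # Step 7: mixed data is the Japanese data combined with the
    # quasi-Japanese data.
    mixed = list(japanese_data) + quasi_japanese
    return quasi_japanese, mixed


quasi, mixed = build_mixed_data(
    ["Xが好き"],                  # Japanese data (hypothetical)
    ["新しいXが大好き"],          # machine translation data (hypothetical)
    ["https://example.com/x"],    # data that needed no translation
)
```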

When the mixed data is created in this way or when the foreign language data is not present in the above determination, the analysis data selection process is executed (fig. 2/step 8).

In this process, the analysis data to be analyzed is selected from the mixed data or the Japanese data. For example, when the mixed data shown in fig. 10 has been created, the data shown in fig. 11 consists only of titles and lists of nouns and does not need to be analyzed, so the data shown in fig. 12 is selected as the analysis data.

Next, a perceptual extraction process is performed (fig. 2/step 9). In this process, the perceptual information of the analysis data is classified and extracted using a language understanding algorithm that interprets and determines the structure of a sentence and the connections between words. Specifically, as shown in fig. 13, the perceptual information of the analysis data is extracted in two levels: three large categories, "positive", "neutral", and "negative", and a number of small categories under each large category.

In this figure, the categories "happy", …, "want to buy" are the small categories under the large category "positive", and the categories "surprised", …, "invite" are the small categories under the large category "neutral". Likewise, the categories "angry", …, "do not want to buy" are the small categories under the large category "negative".
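
The two-level classification can be represented as a simple lookup table, sketched below. Only the small categories actually named in the text are listed; the elided ones are left out rather than guessed. The names `SMALL_TO_LARGE` and `small_categories_of` are hypothetical.

```python
# Two-level perceptual categories. Only the small categories quoted in the
# text appear here; the ones elided with "…" are intentionally omitted.
SMALL_TO_LARGE = {
    "happy": "positive",
    "want to buy": "positive",
    "surprised": "neutral",
    "invite": "neutral",
    "angry": "negative",
    "do not want to buy": "negative",
}


def small_categories_of(large):
    """List the small categories that sit under one large category."""
    return [small for small, parent in SMALL_TO_LARGE.items() if parent == large]
```

Storing the mapping from small to large category means each record only needs its small category; the large category can always be derived.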

Next, a noise removal process is performed (fig. 2/step 10). In this process, morphological analysis is first performed on the analysis data. When a predetermined noun among the vehicle-related terms is included in the analysis data, whether the analysis data is noise data is determined based on the part of speech following the predetermined noun.

Specifically, if the part of speech following the predetermined noun is a case particle (Japanese: 格助詞) and that case particle marks the subject case, the object case, or the possessive case, the data is determined not to be noise data; otherwise, it is determined to be noise data. Data determined to be noise data is removed from the analysis data.

For example, in the analysis data shown in fig. 12, the data of No. 8 includes the vehicle name "Fit (フィット)", but the word following the noun "Fit" is "する", which is a verb rather than a case particle, so the data is determined to be noise data. The data of No. 8 is therefore removed from the analysis data of fig. 12.
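
The noise rule can be sketched as follows, reading it as: the token right after the predetermined noun must be a case particle marking the subject, object, or possessive case, otherwise the record is noise. Several things here are assumptions: the input is taken to be already tokenized by some morphological analyzer into (surface, part of speech) pairs, the particles が, を, and の are assumed to be the ones meant, the label `"case_particle"` is made up, and a noun at the very end of a record is treated as noise.

```python
# Assumption: が (subject), を (object), and の (possessive) are the case
# particles the rule accepts. The text names the cases, not the particles.
ACCEPTED_PARTICLES = {"が", "を", "の"}


def is_noise(tokens, target_nouns):
    """tokens: list of (surface, pos) pairs from a morphological analyzer.

    Returns True when a target noun is followed by anything other than an
    accepted case particle. Records without a target noun are never noise.
    """
    for i, (surface, _pos) in enumerate(tokens):
        if surface not in target_nouns:
            continue
        if i + 1 >= len(tokens):
            return True  # assumption: nothing follows the noun, so noise
        next_surface, next_pos = tokens[i + 1]
        if next_pos != "case_particle" or next_surface not in ACCEPTED_PARTICLES:
            return True
    return False


# "フィットする" (to fit): the noun is followed by a verb, so it is noise.
noisy = [("服", "noun"), ("が", "case_particle"), ("フィット", "noun"), ("する", "verb")]
# "フィットが欲しい": the noun is followed by the subject particle.
clean = [("フィット", "noun"), ("が", "case_particle"), ("欲しい", "adjective")]
```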

Next, stored data is created (fig. 2/step 11). Specifically, the stored data is created by associating the analysis data from which the noise has been removed in the noise removal process with the perceptual information extracted in the perceptual extraction process. For example, the stored data shown in fig. 14 is created by associating the analysis data shown in fig. 12, excluding the data of No. 8, with the perceptual information.
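
A minimal sketch of this association step, assuming the analysis data and the extracted perceptual information are both keyed by record number and that the noise removal process yields a set of record numbers to drop. The record layout, the function name `create_stored_data`, and the sample values are hypothetical and do not reproduce fig. 12 or fig. 14.

```python
def create_stored_data(analysis_data, perceptual_info, noise_ids):
    """Associate each non-noise analysis record with its perceptual information."""
    stored = []
    for number, text in analysis_data.items():
        if number in noise_ids:
            continue  # removed by the noise removal process
        stored.append(
            {"no": number, "text": text, "perceptual": perceptual_info[number]}
        )
    return stored


# Hypothetical sample data.
analysis_data = {7: "Xが欲しい", 8: "服がフィットする"}
perceptual_info = {7: "want to buy", 8: "happy"}
stored_data = create_stored_data(analysis_data, perceptual_info, noise_ids={8})
```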

Next, the stored data created in the above manner is output to the database server 3 (fig. 2/step 12), and the present process ends. The stored data is thereby stored in the database server 3 as a part of the database.

Next, the 1st search process executed by the search system 1 will be described with reference to fig. 15. The 1st search process is executed when the user operates the input interface 4c to input a keyword and a search period while the search software is running on the search terminal 4.

As shown in the figure, first, a keyword and a search period are input to the search terminal 4 as search information by the user's operation of the input interface 4c (fig. 15/step 30). An example in which the user inputs the company name "HONDA" as the keyword will be described below.

Next, a search information signal is transmitted from the search terminal 4 to the data processing server 2 (fig. 15/step 31). The search information signal includes the keyword and the search period as data.

When the data processing server 2 receives the search information signal, it performs perceptual information statistical processing (fig. 15/step 32). In this processing, the database in the database server 3 is searched based on the keyword and the search period included in the search information signal, and the number of hits of the perceptual information in the search result is counted. Specifically, the number of hits of each of three large categories and/or the number of hits of each of a plurality of small categories in the perceptual information are counted.
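
The counting in step 32 can be sketched as below. The sketch assumes each database record carries its text, a date, and its large and small perceptual categories; the specification does not describe the record layout, so the field names, the function `count_hits`, and the sample records are all hypothetical.

```python
from collections import Counter
from datetime import date


def count_hits(database, keyword, start, end):
    """Count hits per large and per small category for one search."""
    large_hits, small_hits = Counter(), Counter()
    for record in database:
        # A record is a hit when it contains the keyword and falls inside
        # the search period (both ends inclusive, by assumption).
        if keyword in record["text"] and start <= record["date"] <= end:
            large_hits[record["large"]] += 1
            small_hits[record["small"]] += 1
    return large_hits, small_hits


database = [
    {"text": "HONDAのXが欲しい", "date": date(2019, 8, 1), "large": "positive", "small": "want to buy"},
    {"text": "HONDAに驚いた", "date": date(2019, 8, 2), "large": "neutral", "small": "surprised"},
    {"text": "HONDAのXが欲しい", "date": date(2018, 1, 1), "large": "positive", "small": "want to buy"},
    {"text": "Y社に怒っている", "date": date(2019, 8, 3), "large": "negative", "small": "angry"},
]
large_hits, small_hits = count_hits(database, "HONDA", date(2019, 8, 1), date(2019, 8, 31))
```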

Next, based on the statistical result of the perceptual information, related word and perceptual large-category display data is created (fig. 15/step 33). This data is used to display words associated with the keyword and the ratio of the three large categories of perceptual information.

Next, the related word and the perceptual large classification display signal are transmitted from the data processing server 2 to the search terminal 4 (fig. 15/step 34). The related word and perceptual large-category display signal includes the related word and perceptual large-category display data.

When the search terminal 4 receives the related word and perceptual large-category display signal, the related words and the large categories of perceptual information are displayed on the display 4a of the search terminal 4 in accordance with the related word and perceptual large-category display data (fig. 15/step 35). In this case, as shown in fig. 16, the related words, that is, words that have a large number of hits and are related to the keyword, are displayed in the form of a word cloud centered on the keyword "HONDA".
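
One way to pick the related words is to count the words that co-occur with the keyword in the search hits, as sketched below. This is an assumption: the specification does not say how related words are chosen beyond their having many hits. The input is taken to be already tokenized, and `related_words` and the sample tokens are hypothetical.

```python
from collections import Counter


def related_words(tokenized_hits, keyword, top_n):
    """Return the top_n words that co-occur most often with the keyword."""
    counts = Counter()
    for tokens in tokenized_hits:
        if keyword in tokens:
            # Count every other word in a record that mentions the keyword.
            counts.update(token for token in tokens if token != keyword)
    return counts.most_common(top_n)


hits = [["HONDA", "X", "engine"], ["HONDA", "X"], ["Y", "engine"]]
cloud_words = related_words(hits, "HONDA", top_n=2)
```

In a word cloud, the returned counts would typically drive the font size of each word around the keyword.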

As shown in fig. 17, for example, the large categories of the perceptual information are displayed in the form of a pie chart. As shown in the figure, the three large categories "positive", "neutral", and "negative" of the perceptual information are displayed as three separate regions of the chart. The area of each region is set according to that category's share of the hits, and the regions are distinguished by colors different from each other.
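
Sizing the regions by hit ratio can be sketched as follows. The hit counts and the color assignments are hypothetical (the specification only requires that the colors differ), and `pie_slices` is an illustrative name.

```python
def pie_slices(large_hits, colors):
    """Turn hit counts per large category into chart slices."""
    total = sum(large_hits.values())
    slices = []
    for category, hits in large_hits.items():
        share = hits / total if total else 0.0
        slices.append(
            {
                "category": category,
                "share": share,            # fraction of all hits
                "angle": 360.0 * share,    # sweep angle in degrees
                "color": colors[category],
            }
        )
    return slices


# Hypothetical counts and colors.
large_hits = {"positive": 6, "neutral": 3, "negative": 3}
colors = {"positive": "green", "neutral": "gray", "negative": "red"}
slices = pie_slices(large_hits, colors)
```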

Then, when the user selects any one of the three major categories after visually confirming the major category of the perceptual information displayed on the display 4a (fig. 15/step 36), a perceptual major category selection signal is transmitted from the search terminal 4 to the data processing server 2 (fig. 15/step 37).

The perceptual large category selection signal indicates the large category selected by the user. The user selects a large category by operating the input interface 4c to press any one of the three large-category regions on the display 4a (the regions of the chart in fig. 17). Hereinafter, an example in which the user selects "positive" as the large category of the perceptual information will be described.

When the data processing server 2 receives the perceptual large category selection signal, it creates perceptual small category display data (fig. 15/step 38). The perceptual small category display data is created, based on the perceptual large category selection signal, as data for displaying the small categories subordinate to the large category of the perceptual information selected by the user.

Next, the perceptual small category display signal is transmitted from the data processing server 2 to the search terminal 4 (fig. 15/step 39). This signal includes the perceptual small category display data.

When the search terminal 4 receives the perceptual small category display signal, the small categories of the perceptual information are displayed on the display 4a of the search terminal 4 in accordance with the perceptual small category display data (fig. 15/step 40). In this case, the small categories are displayed in the form of a bar chart as shown in fig. 18, for example, and the length of each bar is set in accordance with the number of hits.

Then, when the user visually confirms the small categories of the perceptual information displayed on the display 4a and selects any one of them (fig. 15/step 41), a perceptual small category selection signal is transmitted from the search terminal 4 to the data processing server 2 (fig. 15/step 42).

The perceptual small category selection signal indicates the small category selected by the user. The user selects a small category by operating the input interface 4c to press any one of the display areas of the small categories (the stippled areas of the bar chart) displayed on the display 4a. Hereinafter, an example in which the user selects "show/appreciate" as the small category of the perceptual information will be described.

When the data processing server 2 receives the perceptual small category selection signal, it creates related word and original text display data (fig. 15/step 43). This data is created as data for displaying the words related to the keyword input by the user together with the original texts in the database that correspond to the small category of the perceptual information selected by the user.

Next, the data processing server 2 transmits the related word and original text display signal to the search terminal 4 (fig. 15/step 44). This signal includes the related word and original text display data.

When the search terminal 4 receives the related word and original text display signal, the related words and the original texts of the database are displayed on the display 4a of the search terminal 4 in accordance with the related word and original text display data (fig. 15/step 45).

In this case, as in fig. 16, the related words are displayed in the form of a word cloud centered on the word with the largest number of hits. Thus, the user can grasp which words, and how many of them, are related to the keyword "HONDA" and to the selected small category of the perceptual information in the medium constituted by the external server 6 during the search period.

Further, as shown in fig. 19, for example, the original texts in the database are displayed as a table in which the date, the medium name, and the text corresponding to the small category of the perceptual information are arranged. Thus, the user can determine which kinds of perceptual information appear most often, during the search period, in the text data in the medium associated with the keyword "HONDA". The 1st search processing is executed in the above manner.

Next, the 2nd search processing executed by the search system 1 will be described with reference to fig. 20. The 2nd search processing is executed when the user, by operating the input interface 4c while the search software is running on the search terminal 4, inputs only a search period.

As shown in the figure, first, only the search period is input to the search terminal 4 as search information by the user operating the input interface 4c (fig. 20/step 50).

Thereby, the search information signal is transmitted from the search terminal 4 to the data processing server 2 (fig. 20/step 51). The search information signal contains the search period as data.

When the search information signal is received, the data processing server 2 performs perceptual information statistical processing (fig. 20/step 52). In this processing, the database in the database server 3 is searched based on the search period included in the search information signal, and the perceptual information in the search result is counted. Specifically, the number of hits of each of the plurality of small classifications in the perceptual information is counted.
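The statistical processing described above can be sketched as a filter on the search period followed by a count per small category. The `Record` fields, the inclusive treatment of the period boundaries, and the sample records are assumptions for illustration; the category name "dislike" is a hypothetical placeholder.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    """One stored item; the field names are assumptions for this sketch."""
    published: date
    medium: str
    text: str
    small_category: str


def count_small_categories(records, start, end):
    """Count hits per small category for records dated within [start, end]."""
    in_period = (r for r in records if start <= r.published <= end)
    return Counter(r.small_category for r in in_period)


# Hypothetical database contents.
records = [
    Record(date(2019, 1, 10), "blog", "text a", "show/appreciate"),
    Record(date(2019, 1, 20), "SNS", "text b", "show/appreciate"),
    Record(date(2019, 2, 5), "blog", "text c", "dislike"),
    Record(date(2019, 3, 1), "SNS", "text d", "dislike"),
]
counts = count_small_categories(records, date(2019, 1, 1), date(2019, 2, 28))
```

Only the first three records fall inside the period, so the last "dislike" entry is not counted.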

Then, based on the statistical result of the perceptual information, perceptual small category display data is created (fig. 20/step 53). As described above, the perceptual small category display data is created as data for displaying the small categories of the perceptual information.

Next, a perceptual small category display signal is transmitted from the data processing server 2 to the search terminal 4 (fig. 20/step 54). This signal includes the perceptual small category display data.

When the search terminal 4 receives the perceptual small category display signal, the small categories of the perceptual information are displayed on the display 4a of the search terminal 4 in accordance with the perceptual small category display data (fig. 20/step 55). In this case, the small categories of the perceptual information are displayed in the form of a bar chart, for example, as in fig. 18 described above.

Then, when the user visually confirms the small categories of the perceptual information displayed on the display 4a and selects any one of them by operating the input interface 4c (fig. 20/step 56), a perceptual small category selection signal is transmitted from the search terminal 4 to the data processing server 2 (fig. 20/step 57).

When the data processing server 2 receives the perceptual small category selection signal, it creates related word and original text display data (fig. 20/step 58). This data is created as data for displaying the related words and the original texts of the database that correspond to the small category of the perceptual information selected by the user.

Next, the related word and original text display signal is transmitted from the data processing server 2 to the search terminal 4 (fig. 20/step 59). This signal includes the related word and original text display data.

When the search terminal 4 receives the related word and original text display signal, the related words and the original texts of the database are displayed on the display 4a of the search terminal 4 in accordance with the related word and original text display data (fig. 20/step 60).

In this case, the related words are displayed in the form of a word cloud, for example, as in fig. 16 described above. Further, the original texts of the database are displayed as a table in which the dates, the medium names, and the texts corresponding to the small category of the perceptual information are arranged, for example, as in fig. 19 described above. The 2nd search processing is executed in the above manner.

As described above, the data processing server 2 of the search system 1 according to the present embodiment executes the stored data creating process shown in fig. 2. In this processing, Japanese data containing Japanese as text and foreign language data containing foreign languages other than Japanese as text are acquired as text data of the vehicle-related field from data in the external server 6 (step 1). Then, machine translation data is created by machine translating the foreign language data into Japanese (step 5), and mixed data is created by combining the machine translation data with the Japanese data as a part of the Japanese data (step 7). Next, analysis data is selected from the mixed data (step 8), and stored data is created from the analysis data (steps 9 to 11). Then, the stored data is stored in the database server 3 as a part of the database.
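The flow of steps 5 to 11 can be summarized in a short sketch. The translation, selection, extraction, and noise checks are passed in as callables because the embodiment's actual implementations are not reproduced here; all function names, the dictionary layout of the stored data, and the toy inputs are assumptions for illustration only.

```python
def create_stored_data(japanese_data, foreign_data, translate,
                       select_for_analysis, extract_perceptual, is_noise):
    """Sketch of the stored data creating process (steps 5 to 11)."""
    # Step 5: translate the foreign language data into the 1st language.
    machine_translated = [translate(text) for text in foreign_data]
    # Step 7: treat the translations as part of the Japanese data.
    mixed = list(japanese_data) + machine_translated
    # Step 8: select the data to be analyzed.
    analysis = [text for text in mixed if select_for_analysis(text)]
    stored = []
    for text in analysis:
        perceptual = extract_perceptual(text)  # step 9
        if is_noise(text):                     # step 10
            continue
        # Step 11: associate the perceptual information with the text.
        stored.append({"text": text, "perceptual": perceptual})
    return stored


# Toy stand-ins; English strings stand in for Japanese text.
stored = create_stored_data(
    japanese_data=["great car", "car spam", "weather report"],
    foreign_data=["car review"],
    translate=lambda text: "translated:" + text,
    select_for_analysis=lambda text: "car" in text,
    extract_perceptual=lambda text: "positive" if "great" in text else "neutral",
    is_noise=lambda text: "spam" in text,
)
```

In this toy run, "weather report" is not selected for analysis and "car spam" is dropped as noise, leaving two stored entries.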

Therefore, unlike the cases of patent documents 1 and 2, a database can be created using text data in two or more languages from among the data disclosed by the medium constituted by the external server 6. Thus, for example, when searching the database, it is possible to search a wider range of information than in the cases of patent documents 1 and 2, and thus the usefulness of the database can be improved.

When creating the stored data from the analysis data, a sensitivity extraction process for extracting sensitivity information (step 9) and a noise removal process for removing noise information constituting noise from the analysis data (step 10) are performed. Then, the perceptual information is associated with the analysis data from which the noise information has been removed, thereby creating stored data (step 11). This makes it possible to search for appropriate information while avoiding searching for information constituting noise, for example, when searching a database. This can further improve the usefulness of the database.

In the noise removal processing, when a predetermined noun that is a vehicle-related term is included in the analysis data, the mixed data including the predetermined noun is removed as noise information if the part of speech following the predetermined noun is anything other than a particle marking the subject, the object, or the possessive case. When the predetermined noun is followed by a part of speech other than these particles, it is highly likely to be used as part of a word other than a noun. Therefore, noise information containing such easily confused words can be kept out of the database, and the usefulness of the database can be further improved.
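The sketch below is one way to read this rule, assuming the text has already been split into (surface, part of speech) pairs by a morphological analyzer that is not shown. The particle set, the tag names, the sample noun "フィット", and the handling of a noun at the end of the text are all assumptions for illustration and are not taken from the embodiment.

```python
# Assumed set of Japanese particles marking subject, object, or possessive.
ALLOWED_PARTICLES = {"が", "は", "を", "の"}


def is_noise(tokens, target_nouns, allowed=ALLOWED_PARTICLES):
    """Return True if a target noun is followed by anything but an allowed particle."""
    for index, (surface, _pos) in enumerate(tokens):
        if surface not in target_nouns:
            continue
        if index + 1 >= len(tokens):
            # Assumption: a target noun at the very end counts as noise,
            # because no allowed particle follows it.
            return True
        next_surface, next_pos = tokens[index + 1]
        if next_pos != "particle" or next_surface not in allowed:
            return True
    return False


target_nouns = {"フィット"}  # hypothetical vehicle-related noun

# Noun followed by the object particle: kept.
kept = [("フィット", "noun"), ("を", "particle"), ("買っ", "verb"), ("た", "auxiliary")]
# Noun followed by a verb, so it is likely part of another expression: removed.
removed = [("服", "noun"), ("が", "particle"), ("フィット", "noun"), ("する", "verb")]

kept_is_noise = is_noise(kept, target_nouns)
removed_is_noise = is_noise(removed, target_nouns)
```

Here the first token list is kept and the second is flagged as noise, since "する" is not one of the assumed particles.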

In the 1st search processing shown in fig. 15, the database is searched based on the keyword and the search period. Then, the perceptual information in the search result is displayed in the form of a circular graph divided into the three large categories "positive", "neutral", and "negative", as shown in fig. 17. In the graph, the areas of the three large category regions are set according to their shares of the number of hits, and the regions are distinguished from one another by different colors. Thus, the user can determine the proportions of the three large categories of perceptual information in the search result at a glance.

When any one of the three large categories of the perceptual information is selected, the small categories subordinate to the selected large category are displayed in the form of a bar chart corresponding to the number of hits, as shown in fig. 18. Thus, when the user selects any one of the three large categories, the proportions of the subordinate small categories can be determined at a glance. As described above, the user can first confirm the proportions of the three large categories and then, by selecting one of them, confirm the proportions of its subordinate small categories in stages, so that high convenience can be ensured.
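The staged confirmation can be sketched as a lookup from small category to large category followed by a count restricted to the selected large category. Placing "show/appreciate" under "positive" follows the example in the description; every other mapping entry, the function name, and the sample hits are hypothetical.

```python
from collections import Counter

# "show/appreciate" under "positive" follows the example in the text;
# all other entries are hypothetical placeholders.
SMALL_TO_LARGE = {
    "show/appreciate": "positive",
    "like": "positive",
    "no opinion": "neutral",
    "dislike": "negative",
}


def bars_for_large_category(small_category_hits, selected_large):
    """Return (small category, hit count) pairs under the selected large category."""
    counts = Counter(
        small for small in small_category_hits
        if SMALL_TO_LARGE.get(small) == selected_large
    )
    # Largest count first, so the longest bar comes first.
    return counts.most_common()


# Hypothetical search result: one small category label per hit.
hits = ["like", "show/appreciate", "like", "dislike",
        "like", "no opinion", "show/appreciate"]
positive_bars = bars_for_large_category(hits, "positive")
```

Each returned pair corresponds to one bar, with the hit count giving the bar length.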

On the other hand, in the 2nd search processing shown in fig. 20, the database is searched based only on the search period. Then, the perceptual information of the many small categories in the search result is displayed in the form of a bar chart corresponding to the number of hits, as shown in fig. 18. Thus, the user can determine the proportions of the many small categories of perceptual information within the search period at a glance, and high convenience can be ensured.

In the embodiment, the vehicle-related field is set as the predetermined field, but a field other than the vehicle-related field may be set as the predetermined field. For example, a clothing-related field, a food-related field, a toy-related field, or the like may be set as the predetermined field.

In the embodiment, the 1st language is Japanese, but the 1st language may be a language other than Japanese, such as English or German. The 2nd language may be any language other than the 1st language. For example, when the 1st language is English, the 2nd language may be Japanese, German, or the like.

Further, the embodiment is an example in which the medium constituted by the external server 6 is a predetermined medium, but the predetermined medium of the present invention is not limited to this, and may be mass media such as TV, radio, and newspaper, and network media such as electronic bulletin boards, blogs, and SNS. In this case, when mass media such as TV, radio, and newspaper are used as a predetermined medium, public information (video information, voice information, and character information) disclosed on the TV, radio, and newspaper may be input as text data into the data processing server 2 via an input interface such as a personal computer.

On the other hand, although the embodiment uses the machine translation method as an example of the predetermined translation method, the predetermined translation method of the present invention is not limited to this, and any method may be used as long as it can translate the 2nd language text information into the 1st language. For example, the 2nd language text information may be translated into the 1st language by human translation.

In addition, although the embodiment is an example in which the perceptual information is divided into two stages of a large classification and a small classification, the perceptual information of the present invention is not limited to this, and may be divided into a plurality of stages from the top to the bottom. For example, the perceptual information may be classified into three or more stages.
