Localized mechanism searching method, electronic equipment and storage medium

Document No.: 1952965 Publication date: 2021-12-10

Reading note: this technology, "Localized mechanism searching method, electronic equipment and storage medium", was designed and created by 袁俊, 王杰, 路烽, 裘加林 and 陈建群 on 2021-11-12. Its main content is as follows: The invention relates to a method, an electronic device and a storage medium for localized mechanism search. The method comprises: determining position information, a keyword and a second search word based on a first search word; determining, in a pre-constructed word bank, a first set matched with the second search word and, at the same time, determining, in a pre-constructed semantic database, a second set matched with the second search word; determining the union of the first set and the second set; determining, in the union, a third set matched with the position information; determining, in a pre-constructed mechanism map, a fourth set matched with the position information and a fifth set matched with the keyword; and taking the elements in the union of the third set, the fourth set and the fifth set as the localization mechanisms found for the first search word. By using the word bank, the semantic database and the mechanism map, the invention makes the search results meet the user's needs not only at the semantic level but also at the geographic location level, which improves the accuracy of the search results.

1. A method for localized mechanism search, the method comprising:

acquiring a first search word;

determining position information, a keyword and a second search word based on the first search word; the second search word is obtained by address normalization of the first search word;

determining N1 first localization mechanisms matched with the second search word in a pre-constructed word bank to form a first set, and simultaneously determining N2 second localization mechanisms matched with the second search word in a pre-constructed semantic database to form a second set; the word bank stores the address-normalized name of each sixth localization mechanism, and the semantic database stores the name semantic vector of each sixth localization mechanism;

determining a union of the first set and the second set;

in the union set, determining a third localization mechanism matched with the position information to form a third set;

in a mechanism map constructed in advance, a fourth localization mechanism matched with the position information is determined, and a fourth set is formed by the intersection of the fourth localization mechanism and the union set; the mechanism map comprises attribute information of each sixth localization mechanism;

in a pre-constructed mechanism map, determining a fifth localization mechanism matched with the keyword, and forming a fifth set by the intersection of the fifth localization mechanism and the union set;

and taking the elements in the third set, the fourth set and the fifth set as localization mechanisms searched by the first search term.

2. The method of claim 1, wherein determining location information, keywords, and second search terms based on the first search term comprises:

performing word segmentation processing on the first search word;

determining whether a first participle representing a place name exists;

if the first participle exists, taking the first participle as the position information; if the first participle does not exist, determining the IP address from which the first search word was sent, and taking the location of the IP address as the position information;

determining a second participle representing the service type;

taking the participles except the first participle and the second participle as keywords;

if the first participle exists, mapping the first participle to the administrative division level to obtain a third participle, and replacing the first participle with the third participle to obtain the second search word.

3. The method of claim 1, wherein before determining the N1 first localization mechanisms matched with the second search word in the pre-constructed word bank to form the first set, and determining the N2 second localization mechanisms matched with the second search word in the pre-constructed semantic database to form the second set, the method further comprises:

acquiring the name, the identification and the address of each sixth localization mechanism;

performing word segmentation processing on the name of each sixth localization mechanism, determining a fourth participle representing a place name, mapping the fourth participle to the administrative division level to obtain a fifth participle, and replacing the fourth participle with the fifth participle to obtain a processed name;

constructing a word bank according to the processed name, the identifier and the address of each sixth localization mechanism;

determining a mechanism vector of each processed name according to a pre-trained coding model;

constructing a semantic database according to the mechanism vector, the identifier and the address of each sixth localization mechanism;

the coding model consists of a first mapping embedding layer, a neural network encoding DNNEncoder layer, an outer product OuterProduct layer, a first fully connected Dense layer and a cross entropy loss calculation CrossEntropyLoss layer;

the training process of the coding model comprises the following steps:

acquiring a first sample search word and the name and the label of a first sample localization mechanism; the names of the first sample search word and the first sample localization mechanism are subjected to address normalization;

mapping the first sample search word and the name of the first sample localization mechanism from the semantic space to the vector space through the first embedding layer respectively, and encoding the mapped vectors through the DNNEncoder layer respectively to obtain a first coding result and a second coding result;

inputting the first coding result and the second coding result into an OuterProduct layer for outer product calculation;

inputting the calculated result and the label of the first sample localization mechanism into the CrossEntropyLoss layer, and calculating the cross entropy loss;

if the cross entropy loss is smaller than a preset threshold value, finishing the training of the coding model;

and if the cross entropy loss is not less than the preset threshold, adjusting the coding model according to the calculation result, and repeatedly executing the training process aiming at the adjusted coding model until the calculation result of the cross entropy loss is less than the preset threshold.

4. The method of claim 1, wherein the attribute information of any sixth localization mechanism comprises one or more of: address information, name, nickname, features;

the mechanism map comprises points and edges;

the name of each sixth localization mechanism corresponds to a unique point in the mechanism map;

the nickname of each sixth localization mechanism corresponds to a unique point in the mechanism map;

each element in a sixth set corresponds to a unique point in the mechanism map; the sixth set is the union of the features of all sixth localization mechanisms;

each administrative division corresponds to a unique point in the mechanism map;

an edge exists between the point corresponding to the name of a sixth localization mechanism and the point corresponding to the nickname of the same sixth localization mechanism; the edge points from the point corresponding to the name to the point corresponding to the nickname, and the relationship of the edge is "nickname";

an edge exists between the point corresponding to the name of a sixth localization mechanism and the point corresponding to a feature of the same sixth localization mechanism; the edge points from the point corresponding to the name to the point corresponding to the feature, and the relationship of the edge is "feature";

according to the membership relationship between administrative divisions, an edge exists between the points corresponding to the administrative divisions concerned; the edge points from the point corresponding to one administrative division to the point corresponding to the administrative division to which it belongs, and the relationship of the edge is "located in";

the administrative division to which each sixth localization mechanism belongs is determined according to the address information of the sixth localization mechanism; an edge exists between the point corresponding to the name of each sixth localization mechanism and the point corresponding to the administrative division to which it belongs; the edge points from the point corresponding to the name to the point corresponding to that administrative division, and the relationship of the edge is "located in";

the determining, in the pre-constructed mechanism map, a fourth localization mechanism matched with the position information comprises:

determining a point corresponding to the position information in the pre-constructed mechanism map;

determining, as the fourth localization mechanism matched with the position information, each sixth localization mechanism whose point has an edge with the "located in" relationship to the point corresponding to the position information;

the determining, in the pre-constructed mechanism map, a fifth localization mechanism matched with the keyword comprises:

determining a point corresponding to the keyword in the pre-constructed mechanism map;

and determining, as the fifth localization mechanism matched with the keyword, each sixth localization mechanism whose point has an edge with the "feature" relationship to the point corresponding to the keyword.
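
As an illustration of the point-and-edge lookups described in claim 4, the following minimal Python sketch stores labelled, directed edges and returns the mechanisms whose points have an edge to a given point. The class `MechanismMap`, its method names, and the sample data are hypothetical names chosen for this sketch; the text does not prescribe a particular data structure or graph store.

```python
from collections import defaultdict


class MechanismMap:
    """Hypothetical in-memory mechanism map with labelled, directed edges."""

    def __init__(self):
        self._mechanisms = set()           # points that are mechanism names
        self._incoming = defaultdict(set)  # (target, relation) -> source points

    def add_mechanism(self, name):
        self._mechanisms.add(name)

    def add_edge(self, source, relation, target):
        self._incoming[(target, relation)].add(source)

    def matching_mechanisms(self, target, relation):
        """Mechanisms with an edge of the given relation pointing at target."""
        sources = self._incoming.get((target, relation), set())
        # Keep only mechanism points; administrative divisions are skipped.
        return {s for s in sources if s in self._mechanisms}


graph = MechanismMap()
graph.add_mechanism("Hangzhou city gynecological hospital")
graph.add_edge("Hangzhou city gynecological hospital", "located in", "Hangzhou city")
graph.add_edge("Hangzhou city gynecological hospital", "feature", "gynecology")
graph.add_edge("Hangzhou city", "located in", "Zhejiang province")

# Fourth localization mechanisms: matched by position information.
fourth = graph.matching_mechanisms("Hangzhou city", "located in")
# Fifth localization mechanisms: matched by keyword.
fifth = graph.matching_mechanisms("gynecology", "feature")
print(fourth, fifth)
```

In this sketch the edge from "Hangzhou city" to "Zhejiang province" is stored but never returned as a mechanism, because only points registered as mechanism names pass the filter.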

5. The method of claim 1, wherein the first search term is input by a user;

after taking the elements in the third set, the fourth set and the fifth set as the localization mechanisms searched by the first search term, the method further comprises:

determining the matching degree between each seventh localization mechanism and the first search word, and acquiring the characteristics of each seventh localization mechanism, the user portrait of the user and historical behavior information; the seventh localization mechanism is a searched localization mechanism;

mapping the matching degree, the characteristics of the seventh localization mechanism, the user portrait and the historical behavior information from a semantic space to a vector space through a second embedding layer respectively to obtain the mapped matching degree, the mapped characteristics, the mapped user portrait and the mapped historical behavior information;

determining an attention weight according to the mapped historical behavior information and the mapped features, and performing a Kronecker product operation on the attention weight and the mapped historical behavior information;

passing the Kronecker product operation result, the mapped user portrait, the mapped characteristics and the mapped matching degree sequentially through a first fusion dimensionality reduction Concat layer, a first activation function WRelu, a first information processing function WMice, a second activation function Softmax and an Output layer to sort the seventh localization mechanisms;

feeding back the seventh localization mechanisms to the user in the sorted order.

6. The method of claim 5, wherein determining the attention weight based on the mapped historical behavior information and the mapped features comprises:

passing the mapped historical behavior information and the mapped features sequentially through a feature product layer, a third activation function WRelu, a second Concat layer and a second information processing function WMice, and then obtaining the attention weight through a second Dense layer.

7. The method of claim 5, wherein determining N1 first localization mechanisms matched with the second search word in the pre-constructed word bank comprises:

determining the literal similarity Score_e between the address-normalized name of each sixth localization mechanism in the word bank and the second search word, and selecting the N1 sixth localization mechanisms with the largest Score_e as the first localization mechanisms;

the determining, in the pre-constructed semantic database, N2 second localization mechanisms matched with the second search word comprises:

determining the vector cosine similarity Score_m between the name semantic vector of each sixth localization mechanism in the semantic database and the second search word, and selecting the N2 sixth localization mechanisms with the largest Score_m as the second localization mechanisms;

the determining a matching degree between each seventh localization mechanism and the first search word comprises:

for any seventh localization mechanism, the matching degree between that seventh localization mechanism and the first search word is $\mathrm{Score} = (\lambda_1 \times \mathrm{Score}_e + \lambda_2 \times \mathrm{Score}_m)/2$;

wherein $\lambda_1$ is the weight of the word bank, and $\lambda_2$ is the weight of the semantic database.
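
The scoring in claim 7 can be illustrated with a short Python sketch. It assumes that the literal similarity and the semantic vectors are already available; the function names `cosine_similarity`, `top_n` and `match_degree`, the sample scores, and the weight values are hypothetical and chosen only for this illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n(scores, n):
    """Return the ids of the n mechanisms with the largest score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def match_degree(score_e, score_m, lambda_1, lambda_2):
    """Score = (lambda_1 * Score_e + lambda_2 * Score_m) / 2, as in claim 7."""
    return (lambda_1 * score_e + lambda_2 * score_m) / 2


# Hypothetical literal-similarity scores for three mechanisms.
literal_scores = {"org-a": 0.9, "org-b": 0.5, "org-c": 0.7}
first_mechanisms = top_n(literal_scores, 2)

# Hypothetical weights: word bank and semantic database weighted equally.
score = match_degree(score_e=0.8, score_m=0.6, lambda_1=1.0, lambda_2=1.0)
print(first_mechanisms, round(score, 3))
```

With equal weights of 1.0 the matching degree reduces to the plain average of the two similarities; other weight values shift the balance between literal and semantic matching.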

8. The method of claim 6, wherein the activation function WRelu is: $\mathrm{WRelu}(x_i) = x_i$ when $x_i > 0$; $\mathrm{WRelu}(x_i) = 0.25\,x_i$ when $x_i \le 0$;

wherein i is a parameter identifier of the input of the activation function WRelu, and $x_i$ is the i-th parameter value input to the activation function WRelu;

the information processing function is $\mathrm{WMice}(s) = p(s) \cdot s^2 + (1 - p(s)) \cdot 0.25\,s^2$;

wherein s is a parameter value input to the information processing function WMice, σ() is a mean function, Var() is a variance function, and ε is a preset infinitesimal value.

9. An electronic device, comprising:

a memory;

a processor; and

a computer program;

wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-8.

10. A computer-readable storage medium, having stored thereon a computer program; the computer program is executed by a processor to implement the method of any one of claims 1-8.

Technical Field

The present invention relates to the field of information retrieval technologies, and in particular, to a method, an electronic device, and a storage medium for localized mechanism search.

Background

In a search scenario, a user often needs to search for a localization mechanism, that is, a local institution such as a public place with a specific function: a hospital, a shopping mall, a movie theater, and so on. For the mechanism search problem, existing search technology generally performs literal word matching between the user's search word and the mechanism names stored in a database, and recalls the mechanisms with the highest matching degree as candidate mechanisms.

If matching relies only on the literal similarity of the text, the recalled results may not be what the user intended, and the returned information may even be of no value to the user. For example, a user in Hangzhou searches for "Hangzhou women and children health care hospital", and the database contains "Zhejiang women and children health care hospital" and "Guangzhou women and children health care hospital". With the existing text similarity matching method, the search system matches and returns "Guangzhou women and children health care hospital", whereas the best result here is "Zhejiang women and children health care hospital". In addition, literal similarity based on text matching cannot handle semantic similarity, for example two differently worded expressions that both mean "women and children", which affects the user experience.

Disclosure of Invention

(I) technical problem to be solved

In view of the above-mentioned shortcomings and drawbacks of the prior art, the present invention provides a method, an electronic device, and a storage medium for localized organization searching.

(II) technical scheme

In order to achieve the purpose, the invention adopts the main technical scheme that:

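in a first aspect, the present invention provides a method for localized mechanism search, the method comprising:

acquiring a first search word;

determining position information, a keyword and a second search word based on the first search word; the second search word is obtained by address normalization of the first search word;

determining N1 first localization mechanisms matched with the second search word in a pre-constructed word bank to form a first set, and simultaneously determining N2 second localization mechanisms matched with the second search word in a pre-constructed semantic database to form a second set; the word bank stores the address-normalized name of each sixth localization mechanism, and the semantic database stores the name semantic vector of each sixth localization mechanism;

determining a union of the first set and the second set;

in the union set, determining a third localization mechanism matched with the position information to form a third set;

in a pre-constructed mechanism map, determining a fourth localization mechanism matched with the position information, and forming a fourth set by the intersection of the fourth localization mechanism and the union set; the mechanism map comprises attribute information of each sixth localization mechanism;

in the pre-constructed mechanism map, determining a fifth localization mechanism matched with the keyword, and forming a fifth set by the intersection of the fifth localization mechanism and the union set;

and taking the elements in the third set, the fourth set and the fifth set as localization mechanisms searched by the first search term.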

In a second aspect, the present invention provides an electronic device, comprising:

a memory;

a processor; and

a computer program;

wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method as described in the first aspect above.

In a third aspect, the present invention provides a computer-readable storage medium, characterized by a computer program stored thereon; the computer program is executed by a processor to implement the method as described in the first aspect above.

(III) advantageous effects

The invention discloses a method for localized mechanism search, which comprises the following steps: acquiring a first search word; determining position information, a keyword and a second search word based on the first search word, the second search word being obtained by address normalization of the first search word; determining N1 first localization mechanisms matched with the second search word in a pre-constructed word bank to form a first set, and simultaneously determining N2 second localization mechanisms matched with the second search word in a pre-constructed semantic database to form a second set, the word bank storing the address-normalized name of each sixth localization mechanism and the semantic database storing the name semantic vector of each sixth localization mechanism; determining a union of the first set and the second set; in the union set, determining a third localization mechanism matched with the position information to form a third set; determining a fourth localization mechanism matched with the position information in a pre-constructed mechanism map, and forming a fourth set by the intersection of the fourth localization mechanism and the union set, the mechanism map comprising attribute information of each sixth localization mechanism; determining a fifth localization mechanism matched with the keyword in the pre-constructed mechanism map, and forming a fifth set by the intersection of the fifth localization mechanism and the union set; and taking the elements in the third set, the fourth set and the fifth set as localization mechanisms searched by the first search term. Through the word bank, the semantic database and the mechanism map, the method ensures that the search results meet the user's needs not only at the semantic level but also at the geographic location level and other levels, which improves the accuracy of the search results.
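
The way the five sets are combined can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the record type `Org`, the function `assemble_results`, and the rule that a mechanism "matches" the position information when its `city` field equals it are all hypothetical, and the candidates recalled from the word bank, the semantic database and the mechanism map are passed in as ready-made sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Org:
    """Hypothetical record for one localization mechanism."""
    org_id: str
    name: str
    city: str


def assemble_results(first_set, second_set, location, graph_by_location, graph_by_keyword):
    """Combine the candidate sets in the order given by the method."""
    union = first_set | second_set
    # Third set: union members whose own address matches the position information.
    third = {org for org in union if org.city == location}
    # Fourth and fifth sets: mechanism-map matches restricted to the union.
    fourth = graph_by_location & union
    fifth = graph_by_keyword & union
    # The searched mechanisms are the elements of the three sets taken together.
    return third | fourth | fifth


a = Org("1", "Zhejiang women and children health care hospital", "Hangzhou city")
b = Org("2", "Guangzhou women and children health care hospital", "Guangzhou city")
c = Org("3", "Hangzhou city gynecological hospital", "Hangzhou city")

result = assemble_results(
    first_set={a, b},          # recalled from the word bank
    second_set={b, c},         # recalled from the semantic database
    location="Hangzhou city",  # position information
    graph_by_location={a, c},  # mechanism-map matches for the location
    graph_by_keyword={c},      # mechanism-map matches for the keyword
)
print(sorted(org.org_id for org in result))
```

In this sample the Guangzhou hospital is recalled by both the word bank and the semantic database, yet it is dropped because none of the location or keyword filters selects it.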

Drawings

FIG. 1 is a schematic flow chart of a method for localized mechanism search according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a coding model architecture according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a mechanism map according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an implementation architecture of a seventh localization mechanism sorting process according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of an implementation architecture of an attention weight determination process according to an embodiment of the present invention;

FIG. 6 is a schematic flow chart of another method for localized mechanism search according to an embodiment of the present invention.

Detailed Description

For the purpose of better explaining the present invention and to facilitate understanding, the present invention will be described in detail by way of specific embodiments with reference to the accompanying drawings.

In a search scenario, a user often needs to search for a localization mechanism, for example a public place with a specific function such as a hospital, a shopping mall or a movie theater. If only literal matching is performed between the search word input by the user and the names of mechanisms, without considering the semantic information and position information of the mechanisms, two problems arise: short names and formal names of mechanisms cannot be matched, and mechanisms are recommended across cities. Both affect the user experience.

In order to make the search results meet the user's needs not only at the semantic level but also at the geographic location level and other levels, the invention provides a method, an electronic device and a storage medium for localized mechanism search.

Referring to fig. 1, the implementation process of the method for localized institution search provided in this embodiment is as follows:

101, obtaining a first search word.

The first search term is entered by a user when conducting a localized institutional search. For example, the first search term is "Hangzhou women's health care institute". For another example, the first search term is "Hangzhou Lefu Shi center".

Because the first search term is input by the user, and users often cannot describe the localization mechanism they want precisely, the first search term may be the exact name of the desired localization mechanism, or it may not be the exact name and only describe the content of that name.

For example, a user wants to search for "Hangzhou city obstetrical and gynecological hospital". If the user knows the exact name, the first search term may be "Hangzhou city obstetrical and gynecological hospital". If the user does not know the exact name and only knows that it is a maternal-child healthcare hospital in Hangzhou city, or believes that the hospital is the maternal-child healthcare hospital of Hangzhou city, or has forgotten the name and only remembers that it is related to mothers and children in Hangzhou city, the first search term may be "the maternal-child healthcare hospital in Hangzhou city" or simply "the maternal-child healthcare hospital".

It should be noted that the present embodiment and the following embodiments are directed to a scenario in which a user wants to search for a localization mechanism, where the localization mechanism may be a hospital, and may also be an organization such as a movie theater, a shopping mall, and the like. For convenience of description, the present embodiment and the following embodiments only take the search for hospitals as an example, and the search for other institutions is not separately illustrated.

In addition, "first" in "first search term" serves only as an identifier that distinguishes the search term input by the user from search terms obtained by later processing; it has no other substantive meaning. That is, the first search term is simply the search term input by a user who performs a localized mechanism search with the method of this embodiment.

102, determining the position information, the keyword and the second search word based on the first search word.

The second search word is obtained by address normalization of the first search word.

Specifically:

102-1, performing word segmentation processing on the first search word.

This step performs word segmentation with an existing word segmentation method, for example an off-the-shelf word segmentation tool.

Taking the first search word "gynecologic hospitals in Hangzhou city" as an example, word segmentation yields the following participles: "Hangzhou city", "gynecology" and "hospital".
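
The segmentation result above can be reproduced with a small Python sketch. The text only says that an existing word segmentation method is used, so the technique shown here, greedy forward maximum matching against a dictionary, is a stand-in chosen for illustration. The function `segment`, the dictionary `VOCABULARY`, and the English query with source-language word order are all assumptions.

```python
def segment(query, vocabulary, max_len=3):
    """Greedy longest-match segmentation over whitespace-separated tokens."""
    tokens = query.split()
    participles = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink it.
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + size])
            # A single token is always accepted, so the loop always advances.
            if phrase in vocabulary or size == 1:
                participles.append(phrase)
                i += size
                break
    return participles


VOCABULARY = {"Hangzhou city", "gynecology", "hospital"}  # hypothetical dictionary
participles = segment("Hangzhou city gynecology hospital", VOCABULARY)
print(participles)  # ['Hangzhou city', 'gynecology', 'hospital']
```

A production system would normally rely on a dedicated segmentation library; the sketch only shows what the input and output of step 102-1 look like.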

102-2, determining whether a first participle characterizing a place name exists.

This step also uses an existing scheme to determine whether a first participle exists among all the participles obtained in 102-1. For example, a place noun in the text is recognized as the first participle through part-of-speech analysis or named entity recognition.

Taking the example in step 102-1 as an example, it is determined that there is a first participle, i.e., "Hangzhou city".

It should be noted that "first" in "first participle" serves only as an identifier that distinguishes the participle representing a place name from participles representing other content; it has no other substantive meaning. That is, the first participle is simply one of the participles obtained in step 102-1 that satisfies a special condition: it represents a place name.

102-3, if the first word segmentation exists, the first word segmentation is used as position information, if the first word segmentation does not exist, the IP address for sending the first search word is determined, and the position of the IP address is used as the position information.

If the search word input by the user (i.e. the first search word) includes a participle representing a place noun (i.e. the first participle), the first participle is directly used as the position information. For example, if the first participle is "Hangzhou city", the position information is Hangzhou city.

If the search word input by the user (i.e. the first search word) does not include a participle representing a place noun (i.e. the first participle), the IP address of the user's device at the time the search word was input is obtained, and the location of the IP address is used as the position information. For example, if the first search word is "maternal health care hospital" and none of its participles represents a place noun, the IP address of the device on which the user input "maternal health care hospital" is obtained; if the IP address belongs to Beijing city, the position information is Beijing city.

At this point, the position information is determined.
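
Steps 102-2 and 102-3 can be sketched in Python as follows. The sketch assumes that place names are recognized by a simple set lookup and that the IP-to-location mapping is a dictionary; in practice these would be a named entity recognizer and an IP geolocation service. The names `determine_location`, `PLACE_NAMES` and `IP_TO_CITY`, and the sample IP address, are hypothetical.

```python
def determine_location(participles, place_names, ip_address, ip_to_city):
    """Return the position information for a segmented first search word."""
    for participle in participles:
        if participle in place_names:
            # A first participle exists: use it directly as position information.
            return participle
    # No first participle: fall back to the location of the sender's IP address.
    return ip_to_city.get(ip_address)


PLACE_NAMES = {"Hangzhou city", "Beijing city"}  # stand-in for place-name recognition
IP_TO_CITY = {"203.0.113.7": "Beijing city"}     # stand-in for an IP geolocation lookup

with_place = determine_location(
    ["Hangzhou city", "gynecology", "hospital"], PLACE_NAMES, "203.0.113.7", IP_TO_CITY
)
without_place = determine_location(
    ["maternal health care", "hospital"], PLACE_NAMES, "203.0.113.7", IP_TO_CITY
)
print(with_place, "|", without_place)
```

The place name in the query takes priority over the IP address, so a user located in Beijing who types "Hangzhou city" still gets Hangzhou city as the position information.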

102-4, determining a second participle characterizing the service type.

This step also uses an existing scheme to determine whether a second participle exists among all the participles obtained in 102-1. For example, a service type noun in the text is recognized as the second participle through part-of-speech analysis or named entity recognition.

The service types in this step and the subsequent steps, such as hospitals, movie theaters, markets, and the like, represent the service types of the localization mechanisms.

Taking the example in step 102-1 as an example, a second participle, namely "hospital", is determined.

It should be noted that "second" in "second participle" serves only as an identifier that distinguishes the participle characterizing the service type from participles characterizing other content; it has no other substantive meaning. That is, the second participle is simply one of the participles obtained in step 102-1 that satisfies a special condition: it characterizes a service type.

102-5, using the participles except the first participle and the second participle as keywords.

Through the above steps, the participles obtained in step 102-1 may belong to three categories, one category is a first participle, i.e., a participle representing a location name, one category is a second participle, i.e., a participle representing a service type, and one category is a participle not representing a location name nor a service type.

In a specific implementation, there may be no first participle and/or no second participle.

For example, the participles obtained in step 102-1 may belong to only two categories: first participles and participles that characterize neither place name nor service type; or second participles and participles that characterize neither place name nor service type; or first participles and second participles.

For another example, the participles obtained in step 102-1 may belong to only one category: all of them are second participles, or all of them are participles that characterize neither place name nor service type. The case in which all of them are first participles can occur in theory, but its probability in practice is very low.

This step takes the participles that characterize neither place name nor service type as the keywords.

Still taking the example in step 102-1 as an example, "gynecology" is the keyword.

At this point, the keyword is determined.
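
The three-way split of participles in steps 102-2, 102-4 and 102-5 can be sketched in Python as follows. The sketch assumes that place names and service types are recognized by set lookup, which stands in for part-of-speech analysis or named entity recognition. The function `classify` and the sets `PLACE_NAMES` and `SERVICE_TYPES` are hypothetical.

```python
def classify(participles, place_names, service_types):
    """Split participles into first participles, second participles and keywords."""
    first = [p for p in participles if p in place_names]
    second = [p for p in participles if p in service_types]
    # Keywords: everything that is neither a place name nor a service type.
    keywords = [
        p for p in participles if p not in place_names and p not in service_types
    ]
    return first, second, keywords


PLACE_NAMES = {"Hangzhou city"}
SERVICE_TYPES = {"hospital", "movie theater", "shopping mall"}

first, second, keywords = classify(
    ["Hangzhou city", "gynecology", "hospital"], PLACE_NAMES, SERVICE_TYPES
)
print(first, second, keywords)  # ['Hangzhou city'] ['hospital'] ['gynecology']
```

Any of the three lists may be empty, which corresponds to the cases described above in which the participles fall into only one or two categories.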

102-6, if the first segmentation exists, mapping the first segmentation to the administrative division level to obtain a third segmentation, and replacing the first segmentation with the third segmentation to obtain a second search term.

Mapping the first participle to the administrative division level to obtain the third participle, and replacing the first participle with the third participle to obtain the second search word, is an address normalization process: the place nouns in the first search word are normalized according to administrative division level. Address normalization addresses the problem that the similarity between administrative divisions cannot be understood during semantic recognition, and it improves the accuracy and interpretability of semantic recognition.

For example, the administrative division levels are as follows:

place: Zhejiang province; hypernym: province name;

place: Jiangsu province; hypernym: province name;

place: Hangzhou city; hypernym: grade city;

place: Jinhua city; hypernym: grade city;

place: urban area; hypernym: political area;

place: sunny area; hypernym: political area;

place: Jiashan county; hypernym: basic level county;

place: Tung Lu county; hypernym: basic level county.

If the first segmentation is 'Hangzhou city', mapping the first segmentation to the administrative division level to obtain a third segmentation which is 'grade City', replacing the first segmentation with a third segmentation to obtain a second search word which is 'grade City gynecological hospital'.
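The address normalization of step 102-6 can be sketched as a table lookup followed by a substitution. This is only a minimal illustration: the table `ADMIN_LEVEL`, the function `normalize_address` and the example segmentation are hypothetical names and data, and the hypernym for Hangzhou city is written here as "prefecture-level city".

```python
# Hypothetical lookup table; the entries follow the example levels in the text.
ADMIN_LEVEL = {
    "Zhejiang province": "province name",
    "Jiangsu province": "province name",
    "Hangzhou city": "prefecture-level city",
    "Jinhua city": "prefecture-level city",
    "Jiashan county": "primary county",
    "Tonglu county": "primary county",
}


def normalize_address(segments):
    """Replace every place-name segment with its administrative-level hypernym.

    Segments that are not in the table (keywords, service types) are kept as-is.
    """
    return [ADMIN_LEVEL.get(segment, segment) for segment in segments]


# Assumed segmentation of a first search term (illustrative only).
first_segments = ["Hangzhou city", "gynecological hospital"]
second_search_term = " ".join(normalize_address(first_segments))
print(second_search_term)
```

A dictionary lookup keeps the mapping explicit and auditable, which fits the interpretability argument made for address normalization.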

It should be noted that "third" in the third segment is used for identification only, to distinguish the normalized segment from segments representing other content, and has no substantive meaning. That is, the third segment is simply a segment: a normalized segment that represents a place name and is mapped from one of the segments obtained in step 102-1.

Likewise, "second" in the second search term is used for identification only, to distinguish the search term after address normalization from the search term input by the user and from other search terms obtained after processing, and has no other substantive meaning. That is, the second search term is a search term obtained by performing address normalization on the search term input by the user.

In addition, this embodiment and the following embodiments do not restrict the execution order of 102-2 and 102-4. This embodiment only takes executing 102-2 first and then 102-4 as an example; in a specific implementation, 102-4 may be executed first and then 102-2, or 102-2 and 102-4 may be executed at the same time.

Similarly, this embodiment and the following embodiments do not restrict the execution order of 102-3 and 102-6. This embodiment only takes executing 102-3 first and then 102-6 as an example; in a specific implementation, 102-6 may be executed first and then 102-3, or 102-3 and 102-6 may be executed at the same time, provided that both are executed after 102-2.

That is, after the segments are obtained in step 102-1, the order in which the position information, the keyword and the second search term are determined is not limited. This embodiment takes as an example determining the position information first (steps 102-2 and 102-3), then the keyword (steps 102-4 and 102-5), and finally the second search term (step 102-6).

After these steps, address normalization has been performed on the search term input by the user (i.e. the first search term), and the normalized search term (i.e. the second search term) has been obtained.

103, in the pre-constructed word bank, determining N1 first localization mechanisms matching the second search term to form a first set, and at the same time, in the pre-constructed semantic database, determining N2 second localization mechanisms matching the second search term to form a second set.

The word bank stores the address-normalized name of each sixth localization mechanism, and the semantic database stores the name semantic vector of each sixth localization mechanism.

This step calculates the literal similarity Score_e between the address-normalized name of each sixth localization mechanism in the word bank and the second search term (e.g., Score_e calculated by the BM25 algorithm), selects the N1 sixth localization mechanisms with the largest Score_e as the first localization mechanisms, and forms the first set from all the first localization mechanisms.

This step also determines the vector cosine similarity Score_m between the name semantic vector of each sixth localization mechanism in the semantic database and the second search term, selects the N2 sixth localization mechanisms with the largest Score_m as the second localization mechanisms, and forms the second set from all the second localization mechanisms.

N1 and N2 are preset positive integers that control the number of first and second localization mechanisms. This embodiment and subsequent embodiments do not limit the relationship between N1 and N2: N1 may be equal to, smaller than, or larger than N2.
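The two retrievals of step 103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the BM25 variant and its parameters `k1` and `b` are common defaults chosen here, the toy word bank and vectors are invented, and the second search term is assumed to have already been encoded into `query_vector` by the coding model.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized name against a tokenized query."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n(ids, scores, n):
    """Identifiers of the n highest-scoring entries, best first."""
    ranked = sorted(zip(ids, scores), key=lambda pair: pair[1], reverse=True)
    return [identifier for identifier, _ in ranked[:n]]


# Toy data (hypothetical): address-normalized names and name semantic vectors.
ids = ["ID1", "ID2", "ID3"]
word_bank = [
    ["primary county", "first", "people", "hospital"],
    ["prefecture-level city", "gynecological", "hospital"],
    ["province name", "people", "hospital"],
]
semantic_db = [[0.5, 0.8, 0.0], [0.1, 0.9, 0.2], [0.0, 0.2, 0.9]]

query_tokens = ["prefecture-level city", "gynecological", "hospital"]
query_vector = [0.2, 1.0, 0.1]  # assumed encoding of the second search term

N1, N2 = 2, 2
score_e = bm25_scores(query_tokens, word_bank)
score_m = [cosine(query_vector, v) for v in semantic_db]
first_set = top_n(ids, score_e, N1)
second_set = top_n(ids, score_m, N2)
```

With this toy data the two sets overlap only partially, which illustrates why the literal and semantic searches may return different localization mechanisms for the same second search term.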

Although the word bank and the semantic database store the same localization mechanisms, the stored contents differ: the word bank stores the address-normalized name of each sixth localization mechanism, while the semantic database stores its name semantic vector. Therefore, the localization mechanisms obtained by searching the word bank and the semantic database with the same second search term may be identical, partially identical, or completely different.

It should be noted that "first", "second" and "sixth" in the first, second and sixth localization mechanisms are used for identification only, to distinguish localization mechanisms from different sources, and have no substantive meaning. That is, the word bank and the semantic database store the information of each localization mechanism; to distinguish these from localization mechanisms stored elsewhere, the localization mechanisms stored in the word bank and the semantic database are named sixth localization mechanisms.

In this step, some localization mechanisms are selected from the word bank according to the second search term. To distinguish them from the other localization mechanisms in the word bank, the selected ones are named first localization mechanisms; that is, a first localization mechanism is a localization mechanism that is stored in the word bank and matches the second search term. Such a mechanism is a sixth localization mechanism in the word bank and becomes a first localization mechanism once selected, so the first localization mechanisms are a subset of the sixth localization mechanisms.

Likewise, some localization mechanisms are selected from the semantic database according to the second search term. To distinguish them from the other localization mechanisms in the semantic database, the selected ones are named second localization mechanisms; that is, a second localization mechanism is a localization mechanism that is stored in the semantic database and matches the second search term. Such a mechanism is a sixth localization mechanism in the semantic database and becomes a second localization mechanism once selected, so the second localization mechanisms are a subset of the sixth localization mechanisms.

This embodiment does not limit the relationship between the first and second localization mechanisms: they may be the same, different, or partially the same. Any localization mechanism that is in the word bank and matches the second search term can be a first localization mechanism; any localization mechanism that is in the semantic database and matches the second search term can be a second localization mechanism.

Before this step is executed, the word bank and the semantic database are constructed as follows:

201, acquiring the name, identification (ID) and address of each sixth localization mechanism.

The address is the actual operating address of the sixth localization mechanism, not its registered address.

202, performing word segmentation on the name of each sixth localization mechanism, determining a fourth segment representing a place name, mapping the fourth segment to the administrative division level to obtain a fifth segment, and replacing the fourth segment with the fifth segment to obtain a processed name.

Specifically, for the name of any sixth localization mechanism, an existing word segmentation method (such as the ending word segmentation) is first used to segment the name; an existing scheme is then used to determine whether a fourth segment exists among the resulting segments; address normalization is performed on the fourth segment; and the fourth segment is replaced by the normalized segment to obtain the processed name.

The address normalization process in this step is similar to that of 102-6; see 102-6 for details, which are not repeated here.

It should be noted that "fourth" and "fifth" in the fourth and fifth segments are used merely as labels to distinguish different objects, and have no substantive meaning. That is, the fourth segment is simply a segment in the name of a sixth localization mechanism that characterizes a place name. The fifth segment is also simply a segment: it is the normalized form of the fourth segment, obtained by mapping, and it also represents a place name.

203, constructing the word bank according to the processed name, the identification and the address of each sixth localization mechanism.

For example, if the sixth localization mechanism is the first people hospital in Fuyang district, step 201 acquires its name "first people hospital in Fuyang district", its identification "ID 1", and its address "No. 429 North Ring Road, Fuchun Street, Fuyang district, Hangzhou, Zhejiang". In step 202, "Fuyang district" (the fourth segment) is mapped according to the administrative division level (as in the example of step 102-6) to obtain "primary county" (the fifth segment), which gives the processed name "first people hospital in primary county". In step 203, "first people hospital in primary county", "ID 1" and "No. 429 North Ring Road, Fuchun Street, Fuyang district, Hangzhou, Zhejiang" are stored in the word bank as the information of this sixth localization mechanism.
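Steps 201 to 203 can be sketched as building one record per localization mechanism. The field names, the function `build_word_bank` and the pre-segmented name are assumptions made for illustration; the mapping of the place-name segment follows the worked example above.

```python
# Hypothetical mapping for the worked example (place-name segment -> hypernym).
ADMIN_LEVEL = {"Fuyang district": "primary county"}


def build_word_bank(mechanisms):
    """Return one word-bank record per mechanism: processed name, ID and address."""
    bank = []
    for mechanism in mechanisms:
        # Step 202: replace the place-name segment with its hypernym.
        processed = [ADMIN_LEVEL.get(seg, seg) for seg in mechanism["name_segments"]]
        # Step 203: store the processed name together with the ID and address.
        bank.append({
            "id": mechanism["id"],
            "processed_name": " ".join(processed),
            "address": mechanism["address"],
        })
    return bank


# Step 201: acquired name (already segmented here), ID and operating address.
mechanisms = [{
    "id": "ID 1",
    "name_segments": ["first people hospital in", "Fuyang district"],
    "address": "No. 429 North Ring Road, Fuchun Street, Fuyang district, Hangzhou, Zhejiang",
}]
word_bank = build_word_bank(mechanisms)
```

Only the name is normalized; the address is stored unchanged so that the actual operating location remains available to later steps.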

In addition, the word bank may use the ElasticSearch framework.

204, determining the mechanism vector of each processed name according to the pre-trained coding model.

The coding model is a neural network model composed of a first embedding (mapping) layer, a DNNEncoder (neural network coding) layer, an OuterProduct (outer product) layer, a first Dense (full connection) layer and a CrossEntropyLoss (cross entropy loss calculation) layer.

The training process of the coding model is as follows:

301, acquiring a first sample search term, the name of a first sample localization mechanism, and a label.

Both the first sample search term and the name of the first sample localization mechanism have undergone address normalization.

The address normalization process is similar to that of 102-6; see 102-6 for details, which are not repeated here.

302, mapping the first sample search term and the name of the first sample localization mechanism respectively from the semantic space to the vector space through the first embedding layer, and encoding the mapped vectors respectively through the DNNEncoder layer to obtain a first coding result and a second coding result.

In a specific implementation, the embedding layer may use 2 embedding modules to map the first sample search term and the name of the first sample localization mechanism respectively; correspondingly, the DNNEncoder layer may use 2 DNNEncoder modules to encode the 2 mapping results respectively.

If the 2 modules process the inputs separately in this way, the two embedding modules have the same parameters and structure, as do the two DNNEncoder modules, and the DNNEncoder modules use a BERT structure.

303, inputting the first coding result and the second coding result into the OuterProduct layer, and performing the outer product calculation.

The calculated result and the label of the first sample localization mechanism are input into the CrossEntropyLoss layer, and the cross entropy loss is calculated.

And if the cross entropy loss is less than the preset threshold value, finishing the training of the coding model.

If the cross entropy loss is not less than the preset threshold, the coding model is adjusted according to the calculation result, and the training process is repeatedly executed aiming at the adjusted coding model until the calculation result of the cross entropy loss is less than the preset threshold.

It should be noted that "first" in the first sample search term and the first sample localization mechanism is used for identification only, to distinguish them from the search terms and localization mechanisms in the training samples of other models, and has no other substantive meaning. That is, the first sample search term is a search term used as a training sample of the coding model, and the first sample localization mechanism is a localization mechanism used as a training sample of the coding model.

Similarly, "first" in the first embedding layer and the first Dense layer is used for identification only, to distinguish them from the embedding layers and Dense layers in other models, and has no other substantive meaning. That is, the first embedding layer is the embedding layer of the coding model, and the first Dense layer is the Dense layer of the coding model. In a specific implementation, the coding model architecture of the present embodiment is as shown in fig. 2.

During training, initial weights are first set for the coding model. Each training sample consists of three parts: the first part is a search text (i.e. the first sample search term), the second part is a mechanism name (i.e. the name of the first sample localization mechanism), and the third part is a label whose value is 0 or 1 (0 indicates that the search text and the mechanism name do not refer to the same mechanism, and 1 indicates that they do), for example, the search text "primary county woman protection institute", the mechanism name "primary county woman healthcare institute", and the label 1. When the coding model is trained, the search text and the mechanism name are encoded through the embedding layer and the DNNEncoder layer; the sentence vectors output by the neural network undergo an outer product in the OuterProduct layer and then pass through the Dense layer; and the cross entropy loss between the output of the Dense layer and the given label is calculated in the CrossEntropyLoss layer and used to update the model.
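The tail of the training pass (outer product, Dense layer, cross entropy loss, threshold check) can be sketched numerically as below. This is a simplified illustration under stated assumptions: the two coding results are stand-in vectors rather than BERT outputs, the Dense layer is a single linear unit with invented weights, the loss is taken to be a sigmoid-based binary cross entropy because the label is 0 or 1, and `THRESHOLD` is a hypothetical value for the preset threshold.

```python
import math


def outer_product(u, v):
    """Flattened outer product of two vectors (row-major order)."""
    return [a * b for a in u for b in v]


def dense(x, weights, bias):
    """Single-unit fully connected layer: weighted sum plus bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def binary_cross_entropy(logit, label):
    """Cross entropy between sigmoid(logit) and a 0/1 label."""
    p = 1.0 / (1.0 + math.exp(-logit))
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))


# Stand-ins for the first and second coding results (assumed values).
search_code = [0.5, -0.2]
name_code = [0.4, 0.1]
label = 1  # search text and mechanism name refer to the same mechanism

weights = [1.0, 0.5, -0.5, 1.0]  # hypothetical Dense-layer parameters
bias = 0.0

logit = dense(outer_product(search_code, name_code), weights, bias)
loss = binary_cross_entropy(logit, label)

THRESHOLD = 0.1  # hypothetical preset threshold
training_finished = loss < THRESHOLD
```

In a real training loop the weights would be updated from the gradient of `loss` and the pass repeated until `training_finished` becomes true.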

The coding model is a late-interaction semantic similarity model: it retains the precision of an interaction-based semantic model while meeting the online inference speed requirement of a representation-based model.

205, constructing the semantic database according to the mechanism vector, the identification and the address of each sixth localization mechanism.

Continuing the above example, if the sixth localization mechanism is the first people hospital in Fuyang district, step 201 acquires its name "first people hospital in Fuyang district", its identification "ID 1", and its address "No. 429 North Ring Road, Fuchun Street, Fuyang district, Hangzhou, Zhejiang". In step 204, the mechanism vector of "first people hospital in Fuyang district" is obtained, and in step 205, this mechanism vector, "ID 1" and "No. 429 North Ring Road, Fuchun Street, Fuyang district, Hangzhou, Zhejiang" are stored in the semantic database as the information of this sixth localization mechanism.

In addition, the semantic database may be an HNSW (Hierarchical Navigable Small World) database.

104, determining the union of the first set and the second set.

105, in the union, determining third localization mechanisms matching the location information to form a third set.

Wherein the location information is the location information determined in step 102.

That is, address normalization is performed on the search term input by the user to obtain the second search term; according to the second search term, N1 matching localization mechanisms (i.e. the first localization mechanisms, the elements of the first set) are found in the word bank, and N2 matching localization mechanisms (i.e. the second localization mechanisms, the elements of the second set) are found in the semantic database. Among the elements of the union of the two sets, the localization mechanisms that match the user's intended location (i.e. the location information) are determined; these are the third localization mechanisms, the elements of the third set.

In addition, "matching" in this step means having the same administrative division level: if the name of a localization mechanism in the union (i.e. its address-normalized name) has the same administrative division level as the location information, that localization mechanism is a third localization mechanism and an element of the third set.

For example, if an element of the union is "prefecture-level city women and children health care hospital" and the location information is "Hangzhou", whose hypernym in the administrative division levels is prefecture-level city, then "prefecture-level city women and children health care hospital" has the same administrative division level as the location information; it is a third localization mechanism and an element of the third set.

For another example, if an element of the union is "province name women and children health care hospital" and the location information is "Hangzhou", whose hypernym in the administrative division levels is prefecture-level city, then "province name women and children health care hospital" has a different administrative division level from the location information and is not a third localization mechanism.
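The matching rule of step 105 can be sketched as a filter on the union. This is a minimal illustration: the table `ADMIN_LEVEL` and the function `third_set` are hypothetical, the union holds address-normalized names, and a name is assumed to match when it contains the hypernym of the location information.

```python
# Hypothetical table: location information -> administrative-level hypernym.
ADMIN_LEVEL = {
    "Hangzhou": "prefecture-level city",
    "Zhejiang": "province name",
}


def third_set(union_names, location):
    """Names in the union whose administrative level equals that of the location."""
    level = ADMIN_LEVEL.get(location)
    if level is None:
        return set()  # unknown location: nothing can be matched
    return {name for name in union_names if level in name}


union = {
    "prefecture-level city women and children health care hospital",
    "province name women and children health care hospital",
}
matched = third_set(union, "Hangzhou")
```

Because both the names and the location are reduced to the same small set of hypernyms, the comparison is a plain string test rather than a geographic lookup.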

It should be noted that "third" in the third localization mechanism is used for identification only, to distinguish the localization mechanisms in the union that match the location information from the other localization mechanisms, and has no other substantive meaning. That is, a third localization mechanism is a localization mechanism that is an element of the union of the first set and the second set and that matches the location information; it is named a third localization mechanism for the sake of distinction. From the perspective of the first set, since all localization mechanisms in the first set are named first localization mechanisms, a third localization mechanism drawn from the first set is also a first localization mechanism. Likewise, from the perspective of the second set, since all localization mechanisms in the second set are named second localization mechanisms, a third localization mechanism drawn from the second set is also a second localization mechanism.

106, in the pre-constructed mechanism map, determining fourth localization mechanisms matching the location information, and forming a fourth set from the intersection of the fourth localization mechanisms and the union.

The mechanism map includes the attribute information of each sixth localization mechanism.

It should be noted that "fourth" in the fourth localization mechanism is used for identification only, to distinguish the localization mechanisms in the mechanism map that match the location information from the other localization mechanisms in the mechanism map, and has no other substantive meaning. That is, a fourth localization mechanism is a localization mechanism in the mechanism map that matches the location information; it is named a fourth localization mechanism for the sake of distinction. Since the localization mechanisms stored in the mechanism map are named sixth localization mechanisms, a fourth localization mechanism is in essence a sixth localization mechanism that satisfies a certain condition (i.e. it matches the location information).

Before this step is executed, a mechanism knowledge graph (i.e. the mechanism map) is constructed from the information of all localization mechanisms. The mechanism knowledge graph contains the attribute information of the mechanisms, where the attribute information includes, but is not limited to, one or more of the following: address information, name, nickname, feature (for a hospital, for example, a feature is a featured department), and mechanism rating.

The mechanism map not only enables simple location matching, but also supports searching for localization mechanisms of the same type at different administrative division levels.

The mechanism map is a graph consisting of points and edges, as shown in fig. 3, and is formed as follows:

The name of each sixth localization mechanism (e.g., Zhejiang province people hospital in fig. 3) corresponds to a unique point in the mechanism map.

The nickname of each sixth localization mechanism (e.g., provincial people hospital in fig. 3) corresponds to a unique point in the mechanism map.

Each element of the sixth set (e.g., pediatrics in fig. 3) corresponds to a unique point in the mechanism map, where the sixth set is the union of the features of all sixth localization mechanisms (in fig. 3, the sixth set is {gynecology, pediatrics, infectious department, oncology}).

Each administrative division (e.g., China, Shanghai, Zhejiang, Hangzhou, Jinhua in fig. 3) corresponds to exactly one point in the mechanism map.

There is an edge between the point corresponding to the name of a sixth localization mechanism and the point corresponding to its nickname; the edge points from the name's point to the nickname's point, and the relationship of the edge is nickname ("Nickname" in fig. 3). As shown in fig. 3, an edge exists between the point corresponding to Zhejiang province people hospital and the point corresponding to provincial people hospital; the edge points from the former to the latter, and its relationship is Nickname.

There is an edge between the point corresponding to the name of a sixth localization mechanism and the point corresponding to each of its features; the edge points from the name's point to the feature's point, and the relationship of the edge is feature ("FamousDeparatent" in fig. 3). As shown in fig. 3, there is an edge between the point corresponding to Zhejiang province people hospital and the point corresponding to infectious department; the edge points from the former to the latter, and its relationship is famous department.

According to the membership relationship between administrative divisions, there is an edge between the points corresponding to two administrative divisions; the edge points from the point corresponding to an administrative division to the point corresponding to the administrative division it belongs to, and the relationship of the edge is located ("Located" in fig. 3). As shown in fig. 3, there is an edge between the point corresponding to Zhejiang and the point corresponding to Hangzhou; because Hangzhou belongs to Zhejiang in the administrative division, the edge points from Hangzhou's point to Zhejiang's point, and its relationship is Located.

The administrative division to which each sixth localization mechanism belongs is determined according to its address information. There is an edge between the point corresponding to the name of each sixth localization mechanism and the point corresponding to the administrative division to which it belongs; the edge points from the name's point to the administrative division's point, and the relationship of the edge is located ("Located" in fig. 3). As shown in fig. 3, Zhejiang province people hospital is determined to be located in Hangzhou according to its address information, so an edge exists between the point corresponding to Zhejiang province people hospital and the point corresponding to Hangzhou; the edge points from the former to the latter, and its relationship is Located.

Based on the mechanism map shown in fig. 3, this step determines the point corresponding to the location information in the pre-constructed mechanism map. Each sixth localization mechanism whose point has an edge to the point corresponding to the location information, with the relationship located ("Located" in fig. 3), is determined to be a fourth localization mechanism matching the location information.

For example, if the location information is "Hangzhou", the point corresponding to Hangzhou is determined in the mechanism map shown in fig. 3, and the sixth localization mechanisms whose points have an edge to Hangzhou's point (i.e. Zhejiang province people hospital and the Zhejiang science medical institute affiliated obstetrics and gynecology hospital) are determined to be the fourth localization mechanisms matching the location information.

The fourth set is then {Zhejiang province people hospital, Zhejiang science medical institute affiliated obstetrics and gynecology hospital} ∩ (first set ∪ second set).
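The lookup of step 106 can be sketched by storing the mechanism map as a list of directed, labelled edges. This is only an illustration: the triples reproduce the edges described for fig. 3, the relation label for features is spelled "FamousDepartment" here as an assumption, and the contents of `union` are hypothetical.

```python
PEOPLE_HOSPITAL = "Zhejiang province people hospital"
OBGYN_HOSPITAL = (
    "Zhejiang science medical institute affiliated obstetrics and gynecology hospital"
)

# (source point, relationship, target point) triples described for fig. 3.
EDGES = [
    (PEOPLE_HOSPITAL, "Nickname", "provincial people hospital"),
    (PEOPLE_HOSPITAL, "FamousDepartment", "infectious department"),
    (PEOPLE_HOSPITAL, "Located", "Hangzhou"),
    (OBGYN_HOSPITAL, "FamousDepartment", "gynecology"),
    (OBGYN_HOSPITAL, "Located", "Hangzhou"),
    ("Hangzhou", "Located", "Zhejiang"),
]
MECHANISM_NAMES = {PEOPLE_HOSPITAL, OBGYN_HOSPITAL}


def located_in(edges, location):
    """Mechanisms whose name point has a Located edge to the location point."""
    return {
        source
        for source, relation, target in edges
        if relation == "Located" and target == location and source in MECHANISM_NAMES
    }


# Hypothetical union of the first and second sets from step 104.
union = {PEOPLE_HOSPITAL, "some other hospital"}

fourth_mechanisms = located_in(EDGES, "Hangzhou")
fourth_set = fourth_mechanisms & union
```

Filtering on `MECHANISM_NAMES` keeps administrative divisions such as Hangzhou, which also carry Located edges, out of the result.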

107, in the pre-constructed mechanism map, determining fifth localization mechanisms matching the keyword, and forming a fifth set from the intersection of the fifth localization mechanisms and the union.

The keywords in this step are the keywords obtained in step 102.

It should be noted that "fifth" in the fifth localization mechanism is used for identification only, to distinguish the localization mechanisms in the mechanism map that match the keyword from the other localization mechanisms in the mechanism map, and has no other substantive meaning. That is, a fifth localization mechanism is a localization mechanism in the mechanism map that matches the keyword; it is named a fifth localization mechanism for the sake of distinction. Since the localization mechanisms stored in the mechanism map are named sixth localization mechanisms, a fifth localization mechanism is in essence a sixth localization mechanism that satisfies a certain condition (i.e. it matches the keyword).

Based on the mechanism map shown in fig. 3, this step determines the point corresponding to the keyword in the pre-constructed mechanism map. Each sixth localization mechanism whose point has an edge to the point corresponding to the keyword, with the relationship feature (i.e. "famous department" in fig. 3), is determined to be a fifth localization mechanism matching the keyword.

For example, if the keyword is "gynecology", the point corresponding to gynecology is determined in the mechanism map shown in fig. 3, and the sixth localization mechanism whose point has an edge to that point with the relationship feature (i.e. the Zhejiang science medical institute affiliated obstetrics and gynecology hospital) is determined to be the fifth localization mechanism matching the keyword.

The fifth set is then {Zhejiang science medical institute affiliated obstetrics and gynecology hospital} ∩ (first set ∪ second set).
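Step 107 follows the same pattern as the location lookup, with the feature relationship in place of Located. In this sketch the edge list is a reduced, assumed copy of the edges described for fig. 3, the relation label is spelled "FamousDepartment" as an assumption, and `union` is hypothetical.

```python
PEOPLE_HOSPITAL = "Zhejiang province people hospital"
OBGYN_HOSPITAL = (
    "Zhejiang science medical institute affiliated obstetrics and gynecology hospital"
)

# Feature edges described for fig. 3: (mechanism name, relationship, feature).
EDGES = [
    (PEOPLE_HOSPITAL, "FamousDepartment", "infectious department"),
    (OBGYN_HOSPITAL, "FamousDepartment", "gynecology"),
]


def with_feature(edges, keyword):
    """Mechanisms whose name point has a feature edge to the keyword point."""
    return {
        source
        for source, relation, target in edges
        if relation == "FamousDepartment" and target == keyword
    }


# Hypothetical union of the first and second sets from step 104.
union = {PEOPLE_HOSPITAL, OBGYN_HOSPITAL}

fifth_mechanisms = with_feature(EDGES, "gynecology")
fifth_set = fifth_mechanisms & union
```

A keyword with no point in the map, or with no incoming feature edge, simply yields an empty fifth set.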

108, taking the elements in the union of the third set, the fourth set and the fifth set as the localization mechanisms found by the first search term.

The union of the first set and the second set gives all localization mechanisms related to the search term. The third set collects the localization mechanisms in that union that match the user's intended location. The fourth set collects the localization mechanisms in that union that are related to the user's intended location in the mechanism knowledge graph. The fifth set collects the localization mechanisms in that union that are related to the user's intended keyword in the mechanism knowledge graph. The union of the third, fourth and fifth sets therefore covers the localization mechanisms the user intends to find, so this step takes the elements of that union as the localization mechanisms found by the first search term.
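The set operations of steps 104 to 108 can be summarized in a few lines. The identifiers and the contents of every set below are invented for illustration; only the sequence of union and intersection operations follows the text.

```python
# Hypothetical results of step 103, expressed as mechanism identifiers.
first_set = {"ID1", "ID2", "ID3"}
second_set = {"ID2", "ID4"}

# Step 104: union of the first and second sets.
union = first_set | second_set

# Steps 105-107: each set is restricted to the union (the members are assumed).
third_set = {"ID2"} & union           # same administrative level as the location
fourth_set = {"ID3", "ID9"} & union   # Located edge to the location in the map
fifth_set = {"ID4"} & union           # feature edge to the keyword in the map

# Step 108: the search result is the union of the three sets.
result = third_set | fourth_set | fifth_set
```

Since every set is intersected with the union first, the result never contains a mechanism that neither the word bank nor the semantic database returned ("ID9" above is dropped).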

However, the localization mechanisms found in this way differ in how well they fit the user's intent; some fit better than others. Therefore, after the localization mechanisms found by the first search term are obtained, they are sorted by how well they fit the user's intent and displayed to the user from top to bottom in descending order of fit, so that the user can more easily find high-quality resources that meet the user's needs. The specific sorting process is as follows:

401, determining the matching degree between each seventh localization mechanism and the first search term, and acquiring the characteristics of each seventh localization mechanism, the user portrait of the user, and the user's historical behavior information.

Wherein the seventh localization mechanism is the searched localization mechanism.

Since Score_e and Score_m have been calculated in step 103, this step calculates, for any seventh localization mechanism, the matching degree between it and the first search term as Score = (λ1 × Score_e + λ2 × Score_m) / 2.

Wherein λ1 is the weight of the word bank, and λ2 is the weight of the semantic database.

λ1 is the accuracy of the word bank on a given test set, and λ2 is the accuracy of the semantic database on a given test set.
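The matching degree is a weighted average of the two similarity scores and can be written directly as a function. The function name `matching_score` and the numeric values below are illustrative assumptions; the text does not say how a mechanism that lacks one of the two scores is handled, so this sketch expects both.

```python
def matching_score(score_e, score_m, lambda1, lambda2):
    """Score = (lambda1 * Score_e + lambda2 * Score_m) / 2."""
    return (lambda1 * score_e + lambda2 * score_m) / 2


# Assumed values: test-set accuracies used as weights, and two similarity scores.
lambda1, lambda2 = 0.9, 0.7
score = matching_score(score_e=0.8, score_m=0.6, lambda1=lambda1, lambda2=lambda2)
```

Weighting each score by the accuracy of its source lets the more reliable of the two retrieval paths contribute more to the final ranking signal.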

It should be noted that "seventh" in the seventh localization mechanism is used for identification only, and is used for distinguishing the mechanism map from other localization mechanisms, and has no other essential meaning. That is, the seventh localization mechanism is a localization mechanism, which is the localization mechanism searched by the first search word, and is named as the seventh localization mechanism for distinction.

In addition, the features of the seventh localization mechanism are, for example, its grade ranking, user rating, etc.

The user portrait is, for example, the user's age, sex, ID, etc.

The historical behavior information is, for example, the organizations the user has selected in the past.

402, mapping the matching degree, the features of the seventh localization mechanism, the user portrait and the historical behavior information from the semantic space to the vector space through a second embedding layer respectively, to obtain the mapped matching degree, the mapped features, the mapped user portrait and the mapped historical behavior information.

It should be noted that "second" in the second embedding layer is only used for identification, and is used for distinguishing the embedding layers in other models, and has no other essential meaning. That is, the second embedding layer is an embedding layer for sorting the seventh localization mechanism.

403, determining an attention weight according to the mapped historical behavior information and the mapped features, and performing a Kronecker product operation on the attention weight and the mapped historical behavior information.

Wherein, the process of determining the attention weight is as follows: the mapped historical behavior information and the mapped features are passed sequentially through a feature product layer, a third activation function WRelu, a second Concat layer and a second information processing function WMice, and then through a second Dense layer to obtain the attention weight.

It should be noted that the "second" in the second Concat layer, the second information processing function WMice and the second Dense layer is used for identification only, to distinguish them from the Concat layer, the information processing function WMice and the Dense layer in other models, and has no other essential meaning. That is, the second Concat layer is a Concat layer for determining the attention weight, the second information processing function WMice is an information processing function WMice for determining the attention weight, and the second Dense layer is a Dense layer for determining the attention weight.

The "third" in the third activation function WRelu is used to distinguish different activation functions without any other essential meaning. That is, the third activation function WRelu is an activation function, and its function is WRelu.

Wherein, the information processing function WMice(s) = p(s)·s² + (1 − p(s))·0.25s². Wherein s is a parameter value input to the information processing function WMice, σ() is a mean function, Var() is a variance function, and ℇ is a preset infinitesimal value.

ℇ is used to prevent the denominator from being 0.

The activation function is: WRelu(x_i) = x_i when x_i > 0; WRelu(x_i) = 0.25x_i when x_i ≤ 0;

wherein i is the index of a parameter input to the activation function WRelu, and x_i is the i-th parameter value input to the activation function WRelu.
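The piecewise definition of WRelu can be sketched as follows. The function name `wrelu` and the choice of a plain Python list as input are assumptions for illustration; in a real model the same rule would be applied element-wise to a tensor.

```python
def wrelu(values):
    """Apply WRelu element-wise: x if x > 0, otherwise 0.25 * x."""
    return [x if x > 0 else 0.25 * x for x in values]


# Positive inputs pass through unchanged; non-positive inputs are scaled by 0.25.
activated = wrelu([2.0, 0.0, -4.0])
```

Unlike a plain ReLU, the negative branch keeps a non-zero slope of 0.25, so negative inputs still contribute a gradient.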

Specifically, the attention weight may be determined by a network structure whose architecture is shown in fig. 5.

As shown in fig. 5, the attention weight is obtained by passing the mapped historical behavior information and the mapped features through a product layer, an activation function WRelu, a Concat layer, an information processing function WMice, and a Dense layer in sequence.
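The Kronecker product operation of step 403 can be sketched for the simplest case. The text does not state the shapes of the attention weight or of the mapped historical behavior information, so treating both as 1-D vectors (Python lists) is an assumption, as is the function name.

```python
def kronecker_product(a, b):
    """Kronecker product of two 1-D vectors: every element of a times every element of b."""
    return [x * y for x in a for y in b]


# With a single attention weight, the product simply rescales the history vector.
weighted_history = kronecker_product([0.5], [2.0, 4.0])

# With two weights, the history vector is repeated once per weight.
expanded = kronecker_product([1.0, 2.0], [3.0, 4.0])
```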

404, passing the Kronecker product operation result, the mapped user portrait and the mapped features sequentially through a first Concat (fusion and dimensionality reduction) layer, a first activation function WRelu, a first information processing function WMice, a second activation function Softmax and an Output layer to obtain the ranking of the seventh localization mechanisms.

It should be noted that the "first" in the first Concat layer and the first information processing function WMice is used for identification only, to distinguish them from the Concat layer and the information processing function WMice in other models, and has no other essential meaning. That is, the first Concat layer is a Concat layer for sorting the seventh localization mechanisms, and the first information processing function WMice is an information processing function WMice for sorting the seventh localization mechanisms.

The "first" and "second" in the first activation function WRelu and the second activation function Softmax are used for identification only, to distinguish different activation functions, and have no other essential meaning. That is, the first activation function WRelu is an activation function whose function is WRelu, and the second activation function Softmax is an activation function whose function is Softmax.

Wherein, the information processing function WMice(s) = p(s)·s² + (1 − p(s))·0.25s². Wherein s is a parameter value input to the information processing function WMice, σ() is a mean function, Var() is a variance function, and ℇ is a preset infinitesimal value.

ℇ is used to prevent the denominator from being 0.

The activation function is: WRelu(x_i) = x_i when x_i > 0; WRelu(x_i) = 0.25x_i when x_i ≤ 0;

wherein i is the index of a parameter input to the activation function WRelu, and x_i is the i-th parameter value input to the activation function WRelu.

The sorting of the seventh localization mechanisms performed in steps 401 to 404 can be implemented by a neural network whose architecture is shown in fig. 4.

In the architecture of fig. 4, the matching degree, the features of the localization mechanism, the user portrait, and the historical behavior information are mapped from the semantic space to the vector space through the embedding layer, respectively, to obtain the mapped matching degree, the mapped features, the mapped user portrait, and the mapped historical behavior information. An attention weight (i.e., attentionweight in fig. 4) is determined according to the mapped historical behavior information and the mapped features, and a Kronecker product operation (i.e., ⨂ in fig. 4) is performed on the attention weight and the mapped historical behavior information. The Kronecker product operation result, the mapped user portrait and the mapped features are passed sequentially through the Concat layer (i.e., Concat & Flatten in fig. 4), the activation function WRelu, the information processing function WMice, the activation function Softmax and the Output layer to obtain the ranking of the localization mechanisms.

405, feeding the seventh localization mechanisms back to the user in the ranked order.

Through the above steps 401 to 405, the seventh localization mechanisms can be ranked and fed back to the user in the ranked order. During ranking, the two similarities obtained from the word bank and the semantic database are combined by weighted average and then bucketed; the grade ranking and the user score of the mechanism are added; the mechanisms the current user has selected in the past, together with their grades and user scores, are taken as mechanism features; and the age, sex and ID of the user are added as the user's portrait information, so that a score is obtained for each mechanism. This ranking scheme takes full account of how well the name of the localization mechanism matches the user search, the attributes of the localization mechanism, the user portrait and the historical behavior information. On the basis of accurate returned results, it ensures that high-quality mechanisms preferred by the user are ranked first, making it easier for the user to find high-quality resources that meet the user's needs.

The method for localized organization search provided by this embodiment recalls organizations by combining address normalization of the localized organizations, a semantic model, a knowledge graph and other technologies, and integrates organization information and the user portrait for personalized ranking. The organization information is ranked according to the organization rating, the user rating and the like. In the flow diagram, the red line represents the processing flow for the organizations in the database.

The method for localized organization search provided in this embodiment is described again below, taking fig. 6 as an example. In fig. 6, the dotted line represents the processing flow performed on the localization mechanisms in the database: when the method provided in this embodiment is executed, the mechanism names in the database are first converted into semantic vectors and stored in the semantic database, and their text is stored in the word bank. The solid line represents the processing flow for the search term of an online user.

When the method for localized mechanism search is initialized, all mechanisms are first subjected to address normalization, then encoded into vectors by the coding model and stored in the semantic database. When a user inputs a search word, the location in the search word is first normalized, the result is processed by the coding model to obtain a semantic vector, and the cosine similarity between this vector and the mechanism vectors stored in the semantic database is calculated; the N1 mechanisms with the highest cosine similarity are taken as set A (namely a first set). Meanwhile, the normalized search word is input into the word bank, the BM25 algorithm is used to calculate the literal similarity with the mechanisms in the word bank, and the N2 mechanisms with the highest similarity are taken as set B (namely a second set). The union of set A and set B gives set C, and the total matching degree score is a weighted average of the cosine similarity from set A and the literal similarity from set B.
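The two-channel recall described above can be sketched as follows. This is a simplified illustration, not the embodiment's implementation: the vectors are toy values standing in for the coding model's output, the BM25 scores are passed in as a precomputed dictionary rather than computed, and all names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n(scores, n):
    """Keep the n highest-scoring entries of a {name: score} dictionary."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])


def recall(query_vec, org_vecs, lexical_scores, n1, n2):
    """Return set A (semantic top N1), set B (lexical top N2) and their union C."""
    set_a = top_n({name: cosine(query_vec, vec) for name, vec in org_vecs.items()}, n1)
    set_b = top_n(lexical_scores, n2)
    return set_a, set_b, set(set_a) | set(set_b)


# Toy data: three organizations, a 2-D "semantic" space, assumed BM25 scores.
org_vecs = {"Org A": [1.0, 0.0], "Org B": [0.0, 1.0], "Org C": [1.0, 1.0]}
bm25 = {"Org A": 0.4, "Org B": 3.2, "Org C": 1.1}
set_a, set_b, set_c = recall([1.0, 0.0], org_vecs, bm25, n1=2, n2=1)
```

Keeping the scores alongside the names in set A and set B makes it straightforward to compute the weighted total matching degree for each member of set C afterwards.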

After the candidate organization set C is obtained, the address information of the user search is matched against the address information of the localization organizations in set C. If the user search word contains address information, that address information is used; if it does not, the address information is derived from the user's network IP. Address information contained in the user search word takes priority. In addition, an organization knowledge graph is constructed from the information of the organizations; it comprises the address information, organization names, organization ratings, organization features (such as the specialty departments of a hospital) and the like. The graph supports not only simple location matching but also searches for local organizations of the same type at different administrative division levels.

When the user search is matched against the localization mechanisms in set C, the location in the user search is first matched against the mechanism locations in set C to obtain a location-matched localization mechanism set D (namely a third set). Then the location searched by the user is queried in the mechanism graph, and the localization mechanisms that are located at that location and are in set C form set E (namely a fourth set). Next, the place-related text and "hospital" are filtered out of the user search word; for example, a search for "Hangzhou tumor hospital" yields "tumor". This keyword is used as a node name to query the mechanism graph for the nodes whose Famous Department relation points to that node, and the intersection of the result with set C gives set F (namely a fifth set). Finally, the union of D, E and F is taken as the final set of localization mechanisms.
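The construction of sets D, E and F can be sketched with ordinary set operations. The sketch assumes the graph lookups have already been resolved into two dictionaries (location to organizations, keyword to organizations), and it uses exact string equality for the location match in set D, which is a simplification. All organization names and data are hypothetical.

```python
def final_mechanisms(set_c, location, keyword, org_location, located_in, famous_department):
    """Union of D (address match), E (graph location match) and F (graph keyword match)."""
    # D: candidates whose stored address equals the searched location.
    set_d = {org for org in set_c if org_location.get(org) == location}
    # E: organizations the graph places at the location, restricted to the candidates.
    set_e = located_in.get(location, set()) & set_c
    # F: organizations whose Famous Department points to the keyword, restricted to the candidates.
    set_f = famous_department.get(keyword, set()) & set_c
    return set_d | set_e | set_f


set_c = {"Hospital A", "Hospital B", "Hospital C", "Hospital D"}
org_location = {"Hospital A": "Hangzhou", "Hospital B": "Zhejiang",
                "Hospital C": "Ningbo", "Hospital D": "Ningbo"}
located_in = {"Hangzhou": {"Hospital A", "Hospital B", "Hospital X"}}
famous_department = {"tumor": {"Hospital C", "Hospital Y"}}

result = final_mechanisms(set_c, "Hangzhou", "tumor",
                          org_location, located_in, famous_department)
```

In this toy data, "Hospital B" is missed by the plain address match but recovered through the graph, which is the situation the knowledge graph is meant to handle.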

The final localization mechanisms are then ranked, and the ranking results are output to the user.

The following takes a user in Hangzhou entering "maternal and child health care hospital" as an example.

After the Hangzhou user inputs "maternal and child health care hospital", part-of-speech analysis and normalization leave the input as "maternal and child health care hospital". The search word contains no address information, so the user address "Hangzhou" is obtained from the user's IP. The search word is looked up in the word bank, and the vector obtained from the coding model is looked up in the semantic database; the two results are merged to obtain the following candidate institution names:

Prefecture-level city maternal and child health care hospital

Province maternal and child health care hospital

District maternal and child health care hospital

Then the edit distance between the location of each candidate institution and "Hangzhou" is calculated, and none of the candidate addresses is found to match Hangzhou. "Hangzhou" is then looked up in the institution graph, and the intersection of all hospitals located in Hangzhou with the previously obtained candidate hospitals is taken; the matched hospitals are "Zhejiang Maternal and Child Health Care Hospital" and "Shangcheng District Maternal and Child Health Care Hospital". Next, the keyword "maternal and child" is looked up in the institution graph, and the intersection of all hospitals whose characteristic department is maternal and child with the previously obtained candidate hospitals is taken; the matched hospitals are again "Zhejiang Maternal and Child Health Care Hospital" and "Shangcheng District Maternal and Child Health Care Hospital". Taking the union of the above sets gives "Zhejiang Maternal and Child Health Care Hospital" and "Shangcheng District Maternal and Child Health Care Hospital".

A weighted average of the literal similarity and the vector similarity is computed for each of the two hospitals and then bucketed (for example, a weighted average score greater than 0.8 maps to 1, a score from 0.6 to 0.8 maps to 2, and a score less than 0.6 maps to 3). Then the user portrait information such as age and sex, the historical hospital visit information, the hospital grade and the user score of the hospital on the platform are one-hot encoded to form a group of features. The features are input into the structure shown in fig. 4 to obtain the scores of the two matched hospitals, and the two hospitals are displayed in order of score, with the higher-scoring institution placed before the lower-scoring one.
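The bucketing and one-hot encoding in this example can be sketched as follows. The thresholds come from the example in the text; the function names and the treatment of the boundary values 0.6 and 0.8 as belonging to bucket 2 are assumptions.

```python
def bucket(score: float) -> int:
    """Map a weighted average score to bucket 1, 2 or 3 using the example thresholds."""
    if score > 0.8:
        return 1
    if score >= 0.6:
        return 2
    return 3


def one_hot(index: int, size: int) -> list:
    """Return a list of `size` zeros with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


# A score of 0.7 falls in bucket 2, which becomes the second of three positions.
encoded = one_hot(bucket(0.7) - 1, 3)
```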

The following takes a Hangzhou user entering "Hangzhou tumor hospital" as an example.

After part-of-speech analysis and normalization, the "Hangzhou tumor hospital" entered by the Hangzhou user becomes "prefecture-level city tumor hospital", and the location is Hangzhou. No candidate hospitals are obtained through the word bank and the semantic database. At this point, "Hangzhou" and "tumor" ("tumor" being what remains once the first participle representing the place name and "hospital" are removed) are used as two nodes of the institution graph shown in fig. 3, and a chain query is performed in the institution graph, which returns Zhejiang Provincial People's Hospital, located in Hangzhou and having oncology among its characteristic departments, as the matching hospital. Because only one institution is matched, no ranking is needed and it is fed back to the user directly.

The method for localized organization search provided by this embodiment comprises: acquiring a first search word; determining position information, a keyword and a second search word based on the first search word, the second search word being obtained by address normalization of the first search word; determining N1 first localization mechanisms matched with the second search word in a pre-constructed word bank to form a first set, and simultaneously determining N2 second localization mechanisms matched with the second search word in a pre-constructed semantic database to form a second set, wherein the word bank stores the address-normalized names of the sixth localization mechanisms and the semantic database stores the name semantic vectors of the sixth localization mechanisms; determining the union of the first set and the second set; determining, in the union, the third localization mechanisms matched with the position information to form a third set; determining the fourth localization mechanisms matched with the position information in a pre-constructed mechanism map, the intersection of the fourth localization mechanisms and the union forming a fourth set, wherein the mechanism map comprises attribute information of each sixth localization mechanism; determining the fifth localization mechanisms matched with the keyword in the pre-constructed mechanism map, the intersection of the fifth localization mechanisms and the union forming a fifth set; and taking the elements in the union of the third set, the fourth set and the fifth set as the localization mechanisms searched by the first search term. Through the word bank, the semantic database and the mechanism map, the method ensures that the search result meets the user's needs not only at the semantic level but also at the geographic location and other levels, which improves the accuracy of the search result.

Based on the same inventive concept as the method for localized organization search, the present embodiment provides an electronic device, which includes a memory, a processor, and a computer program.

Wherein the computer program is stored in the memory and configured to be executed by the processor to implement the localized mechanism search method as shown in figure 1.

Specifically, the method comprises:

acquiring a first search word;

determining position information, a keyword and a second search word based on the first search word; the second search word is obtained by address normalization of the first search word;

determining a union of the first set and the second set;

in the union set, determining a third localization mechanism matched with the position information to form a third set;

determining a fourth localization mechanism matched with the position information in a mechanism map constructed in advance, and forming a fourth set by the intersection of the fourth localization mechanism and the union set; the organization map comprises attribute information of each sixth localization organization;

determining a fifth localization mechanism matched with the keyword in a mechanism map constructed in advance, and forming a fifth set by the intersection of the fifth localization mechanism and the union set;

and taking the elements in the union of the third set, the fourth set and the fifth set as the localization mechanisms searched by the first search term.

Optionally, determining the location information, the keyword, and the second search term based on the first search term includes:

performing word segmentation processing on the first search word;

determining whether a first participle representing a place name exists;

determining a second participle representing the service type;

taking the participles except the first participle and the second participle as keywords;

and if the first participle exists, mapping the first participle to its administrative division level to obtain a third participle, and replacing the first participle with the third participle to obtain the second search word.
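The steps above can be sketched as follows. The sketch assumes the search word has already been segmented into participles by some tokenizer, and it uses two small hypothetical lookup tables for place names and service types; none of these names or values come from the embodiment.

```python
def parse_search_term(participles, place_levels, service_types):
    """Return (location, keywords, second_search_word) for a segmented search word."""
    location = None
    keywords = []
    normalized = []
    for participle in participles:
        if location is None and participle in place_levels:
            # First participle: a place name, replaced by its administrative level.
            location = participle
            normalized.append(place_levels[participle])
        else:
            normalized.append(participle)
            # Anything that is neither the place name nor a service type is a keyword.
            if participle not in service_types:
                keywords.append(participle)
    return location, keywords, " ".join(normalized)


# Hypothetical lookup tables for illustration only.
PLACE_LEVELS = {"Hangzhou": "prefecture-level city", "Zhejiang": "province"}
SERVICE_TYPES = {"hospital"}

parsed = parse_search_term(["Hangzhou", "tumor", "hospital"], PLACE_LEVELS, SERVICE_TYPES)
no_place = parse_search_term(["tumor", "hospital"], PLACE_LEVELS, SERVICE_TYPES)
```

When no place name is present, the location is returned as `None`, which corresponds to the case where the position information has to come from another source such as the user's IP.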

Optionally, before determining N1 first localization mechanisms matching the second search term in the pre-constructed word library to form the first set, and determining N2 second localization mechanisms matching the second search term in the pre-constructed semantic database to form the second set, the method further includes:

acquiring the name, the identification and the address of each sixth localization mechanism;

performing word segmentation processing on the name of each sixth localization mechanism, determining a fourth participle representing a place name, mapping the fourth participle to its administrative division level to obtain a fifth participle, and replacing the fourth participle with the fifth participle to obtain a processed name;

constructing a word bank according to the processed name, the identifier and the address of each sixth localization mechanism;

determining a mechanism vector of each processed name according to a pre-trained coding model;

constructing a semantic database according to the mechanism vector, the identifier and the address of each sixth localization mechanism;

the coding model consists of a first mapping (embedding) layer, a neural network encoding (DNN Encoder) layer, an outer product (Outer Product) layer, a first fully connected (Dense) layer and a cross-entropy loss calculation (Cross Entropy Loss) layer;

the training process of the coding model is as follows:

mapping the first sample search word and the name of a first sample localization mechanism respectively from the semantic space to the vector space through the first embedding layer, and encoding each mapped vector through the DNN Encoder layer to obtain a first coding result and a second coding result;

inputting the first coding result and the second coding result into an Outer Product layer, and performing Outer Product calculation;

inputting the calculated result and the label of the first sample localization mechanism into a Cross entropy Loss layer, and calculating the Cross entropy Loss;

if the cross entropy loss is smaller than a preset threshold value, finishing the training of the coding model;

if the cross entropy loss is not less than the preset threshold, the coding model is adjusted according to the calculation result, and the training process is repeatedly executed aiming at the adjusted coding model until the calculation result of the cross entropy loss is less than the preset threshold.

Optionally, the attribute information of any sixth localization mechanism includes one or more of: address information, name, nickname, features;

the mechanism map comprises points and edges;

the name of each sixth localization mechanism corresponds to a unique point in the mechanism map;

the nickname of each sixth localization authority corresponds to only one point in the authority map;

each element in the sixth set corresponds to a unique point in the organizational graph; the sixth set is the union of the features of all sixth localization mechanisms;

each administrative division corresponds to a unique point in the mechanism map;

determining a fourth localization mechanism matching the location information in the pre-constructed mechanism map, comprising:

determining points corresponding to the position information in a pre-constructed mechanism map;

determining, as the fourth localization mechanism matched with the position information, the sixth localization mechanism corresponding to each point that is connected by an edge to the point corresponding to the position information;

determining a fifth localization mechanism matching the keyword in the pre-constructed mechanism graph, comprising:

determining points corresponding to the keywords in a pre-constructed mechanism map;

and determining, as the fifth localization mechanism matched with the keyword, the sixth localization mechanism corresponding to each point that is connected to the point corresponding to the keyword by an edge whose relation is a feature relation.
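The two graph lookups above can be sketched by storing the mechanism map as a list of (source point, relation, target point) edges. The relation names "LocatedIn" and "FamousDepartment", the organization names and the edge data are all hypothetical and chosen only for illustration.

```python
def points_linked_to(edges, target, relation):
    """Return every source point that has an edge with the given relation to `target`."""
    return {source for source, rel, dest in edges if dest == target and rel == relation}


# Toy mechanism map: organizations, administrative divisions and features are points.
EDGES = [
    ("Hospital A", "LocatedIn", "Hangzhou"),
    ("Hospital A", "FamousDepartment", "tumor"),
    ("Hospital B", "LocatedIn", "Hangzhou"),
    ("Hospital C", "LocatedIn", "Ningbo"),
    ("Hospital C", "FamousDepartment", "tumor"),
]

fourth = points_linked_to(EDGES, "Hangzhou", "LocatedIn")      # location match
fifth = points_linked_to(EDGES, "tumor", "FamousDepartment")   # keyword match
chain = fourth & fifth                                         # both conditions at once
```

A production system would more likely issue these lookups against a graph database; the linear scan here only shows which edges each query selects.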

Optionally, the first search term is input by a user;

after the elements in the third set, the fourth set and the fifth set are used as the localization mechanism searched by the first search term, the method further comprises the following steps:

determining the matching degree between each seventh localization mechanism and the first search word, and acquiring the features of each seventh localization mechanism, the user portrait of the user and historical behavior information; the seventh localization mechanism is the searched localization mechanism;

the seventh localization mechanism is fed back to the user in the order.

Optionally, determining the attention weight according to the mapped historical behavior information and the mapped features includes:

Optionally, determining N1 first localization mechanisms matching the second search term in a pre-constructed thesaurus, comprising:

determining N2 second localization mechanisms matching the second search term in a pre-constructed semantic database, comprising:

determining a degree of match between each seventh localization mechanism and the first search term, comprising:

for any seventh localization mechanism, the degree of match between that seventh localization mechanism and the first search term is Score = (λ1 × Score_e + λ2 × Score_m)/2;

wherein λ1 is the weight of the lexicon, and λ2 is the weight of the semantic database.

Optionally, the activation function is: WRelu(x_i) = x_i when x_i > 0; WRelu(x_i) = 0.25x_i when x_i ≤ 0;

wherein i is the index of a parameter input to the activation function WRelu, and x_i is the i-th parameter value input to the activation function WRelu;

an information processing function WMice(s) = p(s)·s² + (1 − p(s))·0.25s²;

The electronic device provided by this embodiment performs the following: acquiring a first search word; determining position information, a keyword and a second search word based on the first search word, the second search word being obtained by address normalization of the first search word; determining N1 first localization mechanisms matched with the second search word in a pre-constructed word bank to form a first set, and simultaneously determining N2 second localization mechanisms matched with the second search word in a pre-constructed semantic database to form a second set, wherein the word bank stores the address-normalized names of the sixth localization mechanisms and the semantic database stores the name semantic vectors of the sixth localization mechanisms; determining the union of the first set and the second set; determining, in the union, the third localization mechanisms matched with the position information to form a third set; determining the fourth localization mechanisms matched with the position information in a pre-constructed mechanism map, the intersection of the fourth localization mechanisms and the union forming a fourth set, wherein the mechanism map comprises attribute information of each sixth localization mechanism; determining the fifth localization mechanisms matched with the keyword in the pre-constructed mechanism map, the intersection of the fifth localization mechanisms and the union forming a fifth set; and taking the elements in the union of the third set, the fourth set and the fifth set as the localization mechanisms searched by the first search term. Through the word bank, the semantic database and the mechanism map, the electronic device ensures that the search result meets the user's needs not only at the semantic level but also at the geographic location and other levels, which improves the accuracy of the search result.

Based on the same inventive concept as the method of localized agency search, the present embodiment provides a computer-readable storage medium on which a computer program is stored. The computer program is executed by a processor to implement the localized mechanism search method as shown in fig. 1.