Satisfaction improving system and method based on scenic spot evaluation

Document No.: 1963765. Published: 2021-12-14.

Abstract: This technology, "Satisfaction improving system and method based on scenic spot evaluation", was designed by 付萍 (Fu Ping) and 陈海江 (Chen Haijiang) on 2021-09-08. The invention relates to the technical field of tourism big data, in particular to a satisfaction improving system and method based on scenic spot evaluation. Comments are obtained from OTA platforms, the total emotion value of each scenic spot's comment text is calculated, and the emotional tendency of the text (positive, negative or neutral) is determined. The LDA topic clustering method is then used to analyze the topics of the positive and negative comments and, combined with a time variable, to mine the factors that influence tourist satisfaction and the changes in tourists' focus of attention. By establishing key satisfaction features of the scenic spot in each dimension, assigning values at the emotional level of each comment, and matching them against the tourists' review text, the invention obtains the tourists' evaluation tendencies in different dimensions. These are used to measure satisfaction in specific dimensions such as overall experience, basic scenic spot conditions, scenic spot tourism consumption, scenic spot tourism resources and scenic spot convenience, so that tourist comments can be analyzed in multiple dimensions, the scenic spot can be analyzed in a targeted manner, and the management level can be improved.

1. A satisfaction promotion method based on scenic spot evaluation is characterized by comprising the following steps:

S1, selecting a scenic spot to be evaluated, and using a crawler to acquire the comment data that tourists have posted about the scenic spot on an OTA website;

S2, cleaning the acquired comment data, and screening out invalid and redundant data according to data cleaning rules;

S3, quantifying the total emotion value contained in the comment data through text emotion analysis, and calculating the text emotion value of each scenic spot comment by summing over its clauses;

S4, analyzing the topics of positive and negative comments through topic clustering, identifying the main reasons for satisfaction or dissatisfaction in the scenic spot evaluations, and improving satisfaction by addressing those reasons.

2. A satisfaction promotion method based on scenic spot evaluation according to claim 1, wherein the data cleaning rules comprise: removing duplicate data, removing missing and invalid data, deleting short sentences, deleting English text, digits and special characters, removing stop words, and performing Chinese word segmentation.

3. The scenic spot evaluation-based satisfaction promotion method according to claim 2, wherein the duplicate data includes content-duplicate data and system default comment data;

the invalid data comprises content-missing data, blank comments that appear multiple times in the data set, null data, and HTML hypertext tag data;

the short sentence deletion deletes data whose comment content is too short;

the Chinese word segmentation converts the data into structured data through word segmentation.

4. The satisfaction improving method based on scenic spot evaluation as claimed in claim 1, wherein the method comprises the following steps when performing emotion value calculation through text emotion analysis:

T1, segmenting the comment data into clauses;

T2, performing word segmentation and stop word removal on the segmented clauses;

T3, locating the emotion words and assigning values to them;

T4, applying weighting adjustments to the clauses;

T5, summing over the clauses to obtain the emotion value of the text.

5. A satisfaction promotion method based on scenic spot evaluation as claimed in claim 4, wherein in said step T2, jieba is used to perform Chinese word segmentation and stop word removal on each clause: each clause is first segmented into words, the segmented words are then matched against the stop word dictionary constructed for this method, words that match are deleted, and words that do not match are retained.

6. A satisfaction promotion method based on scenic spot evaluation as claimed in claim 4, wherein in said step T5, when summing over the clauses, it is assumed that a scenic spot online comment text is divided into n clauses whose emotion values are $senti_1, senti_2, \ldots, senti_n$ respectively; then the emotional tendency value of the whole scenic spot online comment is:

$$Senti = \sum_{i=1}^{n} senti_i$$

7. The satisfaction promotion method based on scenic spot evaluation according to claim 1, wherein the topic clustering uses a topic clustering method to analyze the topics of the positive and negative comments in terms of overall experience, scenic spot basic conditions, scenic spot tourism consumption, scenic spot tourism resources and scenic spot convenience.

8. The satisfaction improving method based on scenic spot evaluation as claimed in claim 7, wherein the topic clustering uses an LDA topic clustering model, which is a three-layer Bayesian model; by training and optimizing on the text data, the model calculates the probability that a document belongs to a topic and the probability of each word under a topic, finally forming a three-layer document-topic-word Bayesian model; wherein

P(term | document) = P(topic | document) × P(term | topic)

When the LDA topic clustering model is applied, three parameters are set: the number of topics and the hyper-parameters α and β. The number of topics is set according to the actual text. For the hyper-parameters, the larger α is, the closer the whole document is to a single topic; the larger β is, the greater the importance of the specialized vocabulary under each topic.

9. A satisfaction promotion system based on scenic spot evaluation, the system being used for realizing the satisfaction promotion method based on scenic spot evaluation according to any one of claims 1 to 8, characterized by comprising:

a source data acquisition module, used for acquiring the comment data that tourists have published publicly on OTA websites;

a data cleaning module, used for screening out invalid and redundant data according to the data cleaning rules;

a text emotion analysis module, used for analyzing the comments to obtain a text emotion value;

and a topic clustering module, used for identifying the main reasons for satisfaction or dissatisfaction in the scenic spot evaluations.

10. The system of claim 9, wherein the text emotion analysis module comprises:

a text emotion analysis part, used for splitting each whole comment into clauses at the punctuation marks in the comment;

a word segmentation and stop word removal part, used for segmenting each clause into words with jieba and deleting the stop words in each clause using the stop word list constructed for this method;

an emotion word locating and assignment part, used for matching the words in each clause against the constructed emotion dictionary and assigning emotion values to the emotion words that match;

a weighted summation part, used for matching the degree adverbs and negation words that precede the emotion words in each clause against the degree adverb dictionary and the negation adverb dictionary of the constructed emotion dictionary, assigning the corresponding weights, and calculating the emotional tendency value of the whole clause;

and a whole-text emotional tendency value part, used for summing the tendency values of all the clauses in a single comment text to obtain the emotional tendency value of the whole text.

Technical Field

The invention relates to the technical field of tourism big data, in particular to a satisfaction degree improving system and method based on scenic spot evaluation.

Background

The most common way to investigate scenic spot satisfaction is the questionnaire survey. When online travel platforms rate scenic spots, there is usually only a single overall score, and comprehensive, multi-dimensional evaluations of a scenic spot are rare.

From the perspective of a scenic spot manager, the overall rating of the scenic spot is easy to obtain, but the tourists' actual complaints cannot be clearly identified. Existing approaches are time-consuming and labor-intensive, their accuracy is hard to control, and they cannot pinpoint problems, which makes it difficult to summarize experience and improve scenic spot management.

Disclosure of Invention

Aiming at the defects of the prior art, the invention discloses a satisfaction improving system and method based on scenic spot evaluation. It addresses the problem that trends in scenic spot satisfaction cannot be dynamically monitored, and helps tourism management departments and scenic spot managers improve their management level based on the factors identified for improvement.

The invention is realized by the following technical scheme:

in a first aspect, the invention provides a satisfaction degree improving method based on scenic spot evaluation, which comprises the following steps:

S1, selecting a scenic spot to be evaluated, and using a crawler to acquire the comment data that tourists have posted about the scenic spot on an OTA website;

S2, cleaning the acquired comment data, and screening out invalid and redundant data according to data cleaning rules;

S3, quantifying the total emotion value contained in the comment data through text emotion analysis, and calculating the text emotion value of each scenic spot comment by summing over its clauses;

S4, analyzing the topics of positive and negative comments through topic clustering, identifying the main reasons for satisfaction or dissatisfaction in the scenic spot evaluations, and improving satisfaction by addressing those reasons.

Further, in the method, the data cleaning rules comprise: removing duplicate data, removing missing and invalid data, deleting short sentences, deleting English text, digits and special characters, removing stop words, and performing Chinese word segmentation.

Further, the duplicate data comprises content-duplicate data and system default comment data;

the invalid data comprises content-missing data, blank comments that appear multiple times in the data set, null data, and HTML hypertext tag data;

the short sentence deletion deletes data whose comment content is too short;

the Chinese word segmentation converts the data into structured data through word segmentation.

Furthermore, in the method, when emotion value calculation is performed through text emotion analysis, the steps are as follows:

T1, segmenting the comment data into clauses;

T2, performing word segmentation and stop word removal on the segmented clauses;

T3, locating the emotion words and assigning values to them;

T4, applying weighting adjustments to the clauses;

T5, summing over the clauses to obtain the emotion value of the text.

Further, in the step T2, jieba is used to perform Chinese word segmentation and stop word removal on each clause: each clause is first segmented into words, the segmented words are then matched against the stop word dictionary constructed for this method, words that match are deleted, and words that do not match are retained.

Further, in the step T5, when summing over the clauses, it is assumed that a scenic spot online comment text is divided into n clauses whose emotion values are $senti_1, senti_2, \ldots, senti_n$ respectively; then the emotional tendency value of the whole scenic spot online comment is:

$$Senti = \sum_{i=1}^{n} senti_i$$

Furthermore, in the method, the topic clustering analyzes the topics of the positive and negative comments in terms of overall experience, scenic spot basic conditions, scenic spot tourism consumption, scenic spot tourism resources and scenic spot convenience.

Further, the topic clustering uses an LDA topic clustering model, which is a three-layer Bayesian model. By training and optimizing on the text data, the model calculates the probability that a document belongs to a topic and the probability of each word under a topic, finally forming a three-layer document-topic-word Bayesian model; wherein

P(term | document) = P(topic | document) × P(term | topic)

When the LDA topic clustering model is applied, three parameters are set: the number of topics and the hyper-parameters α and β. The number of topics is set according to the actual text. For the hyper-parameters, the larger α is, the closer the whole document is to a single topic; the larger β is, the greater the importance of the specialized vocabulary under each topic.

In a second aspect, the present invention discloses a satisfaction promoting system based on scenic spot evaluation, which is used for implementing the satisfaction promoting method based on scenic spot evaluation of the first aspect, and comprises:

a source data acquisition module, used for acquiring the comment data that tourists have published publicly on OTA websites;

a data cleaning module, used for screening out invalid and redundant data according to the data cleaning rules;

a text emotion analysis module, used for analyzing the comments to obtain a text emotion value;

and a topic clustering module, used for identifying the main reasons for satisfaction or dissatisfaction in the scenic spot evaluations.

Further, the text emotion analysis comprises:

a text emotion analysis part, used for splitting each whole comment into clauses at the punctuation marks in the comment;

a word segmentation and stop word removal part, used for segmenting each clause into words with jieba and deleting the stop words in each clause using the stop word list constructed for this method;

an emotion word locating and assignment part, used for matching the words in each clause against the constructed emotion dictionary and assigning emotion values to the emotion words that match;

a weighted summation part, used for matching the degree adverbs and negation words that precede the emotion words in each clause against the degree adverb dictionary and the negation adverb dictionary of the constructed emotion dictionary, assigning the corresponding weights, and calculating the emotional tendency value of the whole clause;

and a whole-text emotional tendency value part, used for summing the tendency values of all the clauses in a single comment text to obtain the emotional tendency value of the whole text.

The invention has the beneficial effects that:

The invention establishes key satisfaction features of the scenic spot in each dimension, assigns values at the emotional level of each comment, and matches them against the tourists' review text to obtain the tourists' evaluation tendencies in different dimensions. These are used to measure satisfaction in specific dimensions such as overall experience, scenic spot basic conditions, scenic spot tourism consumption, scenic spot tourism resources and scenic spot convenience. This helps tourism management departments and scenic spot managers analyze tourist comments in multiple dimensions, analyze the scenic spot in a targeted manner, and improve the management level.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a flow chart of a satisfaction promotion method based on scenic spot assessment;

FIG. 2 is a flow diagram of a text sentiment analysis module according to an embodiment of the present invention;

FIG. 3 is a flowchart of LDA topic analysis according to an embodiment of the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example 1

The embodiment discloses a satisfaction improving method based on scenic spot evaluation, which comprises the following steps:

1. Source data acquisition: selecting a scenic spot to be evaluated, and using a crawler to acquire the comment data that tourists have published publicly on OTA websites;

2. Data cleaning: screening the acquired data to remove useless junk data;

3. Text emotion analysis: using a text emotion analysis method to quantify the total emotion value contained in each comment, and calculating the text emotion value of each scenic spot comment by summing over its clauses;

4. Topic clustering: using a topic clustering method to analyze the topics of positive and negative comments in terms of overall experience, scenic spot basic conditions, scenic spot tourism consumption, scenic spot tourism resources and scenic spot convenience, and identifying the main reasons for satisfaction or dissatisfaction in the scenic spot evaluations.

According to the method, the comments of the OTA platform are obtained, the total emotion value of the comment text of each scenic spot is calculated, the emotion tendency of the comment text is judged, wherein the emotion tendency comprises positive direction, negative direction and neutrality, then the topics of the positive and negative direction comments are analyzed by using an LDA topic clustering method, and influence factors of the satisfaction degree of the tourists and the change of the attention focus of the tourists are mined in combination with time variables.

Example 2

In this embodiment, Wuzhizhou Island in Hainan is taken as an example. Scenic spot comments were crawled from the Ctrip platform, and 28000 online comments on the Wuzhizhou Island scenic area posted on Ctrip from 2015 to 2020 were selected as the data set. A comment often discusses not just one aspect of the scenic spot but several of its features, so topic extraction and emotion analysis cannot simply be applied to a whole comment; each comment must first be split into clauses before further analysis.

The data cleaning comprises the following steps:

The first step is removing duplicate data. Because the comment set is large, some content is inevitably repeated. For example, many users forget to review a travel product after purchasing it on a travel website, and once the review period ends the system automatically records a default positive review. Such data cannot represent the user's emotional attitude and should be removed. Duplicate data mainly comprises content-duplicate data and system default comment data.

The second step is removing missing and invalid data. This mainly comprises removing content-missing data, deleting the blank comments that appear many times in the data set, deleting null data, and removing HTML hypertext tag data such as "<html></html>" and "<ul/>".

The third step is short sentence deletion. Very short comments such as "ok", "good" and "average" express the user's emotional attitude toward the scenic spot but do not identify the specific object of that attitude, so they are deleted.

The fourth step is deleting English text, digits and special characters. Digits such as "1", "2", "3" and special characters such as "@" and "#" that appear in the text data have no analytical value, so they are deleted in the data preprocessing stage.

The fifth step is removing stop words. Much of the text is useless for analysis and must be removed so that it does not affect the accuracy of subsequent emotion processing. These are mainly frequently used words such as "our", "everybody" and "so", and modal particles such as "wong" and "ya". Removing them makes the text analysis result more accurate.

The sixth step is Chinese word segmentation. A computer cannot directly process unstructured data such as Chinese text, so to analyze the scenic spot comments the text must be segmented into words and converted into structured data.
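The first four cleaning steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the function name `clean_comments`, the default-review string and the minimum length of 5 characters are hypothetical, and the steps are applied per comment in a slightly different order than listed above. Stop word removal and word segmentation are left to a later stage.

```python
import re

# Hypothetical text of the platform's automatic default review (assumption).
SYSTEM_DEFAULT_COMMENTS = {"系统默认好评"}
# Assumed cut-off below which a comment counts as "short" (assumption).
MIN_LENGTH = 5


def clean_comments(comments):
    """Return the comments that survive duplicate, invalid, short and character cleaning."""
    seen = set()
    cleaned = []
    for raw in comments:
        if raw is None:
            continue  # null data
        text = re.sub(r"<[^>]+>", "", raw)          # strip HTML tags such as <ul/>
        text = re.sub(r"[A-Za-z0-9@#]+", "", text)  # strip English, digits, "@" and "#"
        text = text.strip()
        if not text or text in SYSTEM_DEFAULT_COMMENTS:
            continue  # blank comment or system default review
        if len(text) < MIN_LENGTH:
            continue  # short comment with no identifiable object
        if text in seen:
            continue  # content-duplicate comment
        seen.add(text)
        cleaned.append(text)
    return cleaned


raw_comments = [
    "<ul/>风景真的很美，非常值得一游",
    "风景真的很美，非常值得一游",
    "",
    None,
    "好",
    "系统默认好评",
    "门票太贵了123@#，排队两小时",
]
cleaned_comments = clean_comments(raw_comments)
```

With the sample input above, only two comments remain: the first review (kept once) and the ticket-price complaint with its digits and symbols removed.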

When text emotion analysis is performed in this embodiment, the scenic spot comment "the scenery is really beautiful, the varied scenery is too much to take in, the temperature is very comfortable, no disappointment, very much worth a trip" is taken as an example of a whole sentence. The steps and process of the emotion value calculation are as follows:

Sentence segmentation. The comment "the scenery is really beautiful, the varied scenery is too much to take in, the temperature is very comfortable, no disappointment, very much worth a trip" is input into the software and split at its punctuation marks into five short clauses, so that the emotional tendency of each clause can be analyzed subsequently. The resulting clauses are as follows:

Clause 1: "the scenery is really beautiful"

Clause 2: "the varied scenery is too much to take in"

Clause 3: "the temperature is very comfortable"

Clause 4: "no disappointment"

Clause 5: "very much worth a trip"
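Splitting a comment into clauses at punctuation marks can be sketched as below. The function name `split_clauses` and the exact punctuation set are assumptions for illustration, and the example uses the English rendering of the comment rather than the original Chinese text.

```python
import re

# Chinese and ASCII punctuation treated as clause boundaries (assumed set).
CLAUSE_BOUNDARY = re.compile(r"[，。！？；,.!?;]+")


def split_clauses(comment):
    """Split a comment into non-empty clauses at punctuation marks."""
    return [part.strip() for part in CLAUSE_BOUNDARY.split(comment) if part.strip()]


comment = (
    "the scenery is really beautiful, the varied scenery is too much to take in, "
    "the temperature is very comfortable, no disappointment, very much worth a trip"
)
clauses = split_clauses(comment)
```

For this comment the function returns five clauses, one per comma-separated segment.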

Chinese word segmentation and stop word removal. jieba is used to perform Chinese word segmentation and stop word removal on each clause. The clauses are first segmented into words, with the following results:

After word segmentation, clause 1: "scenery", "really", "very", "beautiful"

After word segmentation, clause 2: "various", "scenery", "people", "too much to take in"

After word segmentation, clause 3: "temperature", "very", "comfortable"

After word segmentation, clause 4: "no", "disappointment"

After word segmentation, clause 5: "very", "worth", "a trip"

Stop word removal is performed after the clauses are segmented. The segmented words are matched against the stop word dictionary constructed for this method; words that match are deleted, and words that do not match are retained. The results are as follows:

After stop word removal, clause 1: "scenery", "very", "beautiful"

After stop word removal, clause 2: "various", "scenery", "too much to take in"

After stop word removal, clause 3: "temperature", "very", "comfortable"

After stop word removal, clause 4: "no", "disappointment"

After stop word removal, clause 5: "very", "worth", "a trip"
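The stop word filtering step can be sketched as a simple dictionary lookup. The token lists below are English glosses of the segmented clauses and the two-entry stop word set is a hypothetical stand-in for the constructed stop word dictionary; in a real pipeline the tokens would come from a segmenter such as jieba (for example `jieba.lcut(clause)`), which is not imported here so that the sketch stays self-contained.

```python
# Hypothetical stop word dictionary, chosen to match the worked example (assumption).
STOP_WORDS = {"really", "people"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Delete tokens found in the stop word dictionary and keep all others in order."""
    return [token for token in tokens if token not in stop_words]


segmented = [
    ["scenery", "really", "very", "beautiful"],
    ["various", "scenery", "people", "too much to take in"],
    ["temperature", "very", "comfortable"],
    ["no", "disappointment"],
    ["very", "worth", "a trip"],
]
filtered = [remove_stop_words(tokens) for tokens in segmented]
```

Degree adverbs such as "very" and negation words such as "no" are deliberately absent from the stop word set, because the later weighting step depends on them.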

Locating and assigning emotion words. A word judged to be a positive emotion word is assigned 1, and a word judged to be a negative emotion word is assigned -1. All emotion words in the clauses and their assigned values are then extracted, as follows:

Clause 1: $senti_1 = 1$, "beautiful"; clause 2: $senti_2 = 1$, "too much to take in"; clause 3: $senti_3 = 1$, "comfortable"; clause 4: $senti_4 = -1$, "disappointment"; clause 5: $senti_5 = 1$, "worth".

Weighting adjustment. The degree adverbs and negation adverbs that modify each emotion word are located in order to calculate the emotional tendency value of the whole clause. The text is matched against the constructed degree adverb dictionary; if a match succeeds, the corresponding emotion weight is applied, and if not, no processing is performed. The results after the degree adverb weights are applied are as follows: clause 1: $senti_1' = 1.25$, "very", "beautiful"; clause 2: $senti_2' = 1$, "too much to take in"; clause 3: $senti_3' = 1$, "comfortable"; clause 4: $senti_4' = -1$, "disappointment"; clause 5: $senti_5' = 1.1$, "very", "worth".

Clause summation. Once the emotional tendency values of all clauses in the text are determined, and since all the clauses together make up the single comment text, the emotional tendency value of the text is the sum of the emotional tendency values of its clauses. Suppose that a scenic spot online comment text is divided into n clauses whose emotion values are $senti_1, senti_2, \ldots, senti_n$ respectively; then the emotional tendency value of the whole scenic spot online comment is:

$$Senti = \sum_{i=1}^{n} senti_i$$

For the scenic spot comment "the scenery is really beautiful, the varied scenery is too much to take in, the temperature is very comfortable, no disappointment, very much worth a trip", after locating, assignment and weighting (the negation word "no" in clause 4 reverses the sign of "disappointment"), the emotion value of the whole comment text is: 1.25 + 1 + 1 + 1 + 1.25 = 5.5.
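Clause scoring and summation can be sketched as follows. This is an illustration only, not the embodiment's implementation: the lexicon entries, the 1.25 weight for "very", the treatment of negation as a sign flip, and the function names are all assumptions, and the short input is a separate toy example rather than the comment analyzed above.

```python
# Illustrative dictionaries; the entries and weights are assumptions.
EMOTION_WORDS = {"beautiful": 1, "comfortable": 1, "worth": 1, "disappointment": -1}
DEGREE_ADVERBS = {"very": 1.25}
NEGATION_WORDS = {"no", "not"}


def clause_score(tokens):
    """Score one clause: modifiers before an emotion word scale or flip its value."""
    score = 0.0
    weight = 1.0
    for token in tokens:
        if token in DEGREE_ADVERBS:
            weight *= DEGREE_ADVERBS[token]
        elif token in NEGATION_WORDS:
            weight *= -1
        elif token in EMOTION_WORDS:
            score += weight * EMOTION_WORDS[token]
            weight = 1.0  # modifiers apply only to the next emotion word
    return score


def text_score(clauses):
    """Emotional tendency value of a comment: the sum of its clause scores."""
    return sum(clause_score(tokens) for tokens in clauses)


toy_comment = [
    ["scenery", "very", "beautiful"],  # 1.25 * 1
    ["no", "disappointment"],          # -1 * -1
    ["temperature", "comfortable"],    # 1
]
toy_total = text_score(toy_comment)
```

Resetting the weight after each emotion word is one possible design choice; it keeps a degree adverb or negation from leaking onto later emotion words in the same clause.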

The final emotional tendency values of all single comment texts are then divided into three categories: positive, negative and neutral. A comment whose emotional tendency value is greater than 0 is defined as a positive comment and assigned 1; a comment whose emotional tendency value is less than 0 is defined as a negative comment and assigned -1; a comment whose emotional tendency value equals 0 is defined as a neutral comment and assigned 0.
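The three-way labelling rule is a sign test on the emotional tendency value. A minimal sketch, with the function name `polarity` chosen here for illustration:

```python
def polarity(emotion_value):
    """Map an emotional tendency value to 1 (positive), -1 (negative) or 0 (neutral)."""
    if emotion_value > 0:
        return 1
    if emotion_value < 0:
        return -1
    return 0


labels = [polarity(value) for value in (5.5, -2.0, 0)]
```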

When the LDA topic model analysis is performed in this embodiment, the positive and negative comment data sets are each divided into 5 topics: overall experience, scenic spot basic conditions, scenic spot tourism consumption, scenic spot tourism resources, and scenic spot convenience. Combining the LDA topic model results with the high-frequency vocabulary gives the following overall picture of tourist satisfaction with the Wuzhizhou Island scenic area. Tourists are satisfied with the facilities and equipment, the arrangement of the attractions, the tour guides, the service provided by staff, the local specialty food, the natural scenery, and the online booking platform. Dissatisfaction is mainly reflected in the following: some tourists think that the arrangement of attractions and the service of scenic spot staff still have room for improvement; there is unreasonable pricing at attractions and shops; and some tourists are dissatisfied with group-purchase meals in the scenic area.

The data are divided by year into 6 segments. The emotional tendency of each text is analyzed, the positive comments in each period are selected, and the positive comment strength of each period is calculated, that is, the percentage of that year's comments that are positive. The trend in tourist satisfaction is reflected by the trend in positive comment strength. To analyze the reasons for decreases in scenic spot satisfaction, the negative comments from 2015 to 2020 are clustered separately by year. The focuses of the tourists who left negative comments are as follows:

2015: the high-frequency keywords are "charge", "prices", "scenery", "bad", "route"

2016: the high-frequency keywords are "queue", "pick-up", "not worth it", "crowded", "money"

2017: the high-frequency keywords are "entrance ticket", "queue", "crowded", "very expensive", "force"

2018: the high-frequency keywords are "service attitude", "not worth it", "too expensive", "consumption", "seafood"

2019: the high-frequency keywords are "price", "queue", "regret", "disappointment", "rip-off"

2020: the high-frequency keywords are "queuing", "service", "too expensive", "commercialization", "consumption"
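The positive comment strength per year, described before the keyword lists, can be computed as in the sketch below. The record layout of (year, polarity label) pairs and the function name are assumptions for illustration, and the sample records are made-up toy data, not figures from the embodiment.

```python
from collections import defaultdict


def positive_share_by_year(records):
    """Return {year: percentage of that year's comments labelled positive (1)}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for year, label in records:
        totals[year] += 1
        if label == 1:
            positives[year] += 1
    return {year: 100.0 * positives[year] / totals[year] for year in sorted(totals)}


# Toy (year, polarity) records for illustration only.
toy_records = [
    (2015, 1), (2015, 1), (2015, -1), (2015, 0),
    (2016, 1), (2016, -1), (2016, -1), (2016, -1),
]
share = positive_share_by_year(toy_records)
```

Comparing the resulting percentages across consecutive years gives the trend in positive comment strength that the embodiment uses as a proxy for the satisfaction trend.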

Example 3

This embodiment discloses a satisfaction improving system based on scenic spot evaluation, comprising:

a source data acquisition module, used for acquiring the comment data that tourists have published publicly on OTA websites;

a data cleaning module, used for screening out invalid and redundant data according to the data cleaning rules;

a text emotion analysis module, used for analyzing the comments to obtain a text emotion value;

and a topic clustering module, used for identifying the main reasons for satisfaction or dissatisfaction in the scenic spot evaluations.

In this embodiment, emotion analysis of the emotional tendency of online scenic spot comments yields the tourists' evaluation tendencies and satisfaction in different dimensions, and topic clustering identifies what tourists are most and least satisfied with.

Example 4

The emotion analysis module of this embodiment, as shown in FIG. 2, calculates the emotional tendency value through the following steps:

1. Text emotion analysis. The whole comment is split into clauses at the punctuation marks in the comment.

2. Word segmentation and stop word removal. Each clause is segmented into words with jieba, and the stop words in each clause are deleted using the stop word list constructed for this method.

3. Locating emotion words and assigning emotion values. After the word segmentation and stop word removal of step 2, the words in each clause are matched against the constructed emotion dictionary, and emotion values are assigned to the emotion words that match.

4. Weighted summation. The degree adverbs and negation words that precede the emotion words in each clause are matched against the degree adverb dictionary and the negation adverb dictionary of the constructed emotion dictionary, the corresponding weights are assigned, and the emotional tendency value of the whole clause is calculated.

5. Calculating the emotional tendency value of the whole text. The tendency values of all the clauses in a single comment text are summed to obtain the emotional tendency value of the whole text.

In this embodiment, once the emotional tendency value of the whole text is calculated, the text is given an emotional attitude according to that value: if the emotional tendency value is positive, the comment is judged to be a positive comment; if it is negative, the comment is judged to be a negative comment; and if it is zero, the comment is judged to be a neutral comment.

Example 5

This embodiment discloses an LDA topic clustering model, which can be regarded as a three-layer Bayesian model. As shown in FIG. 3, by training and optimizing on the text data, the model calculates the probability that a document belongs to a topic and the probability of each word under a topic, finally forming a three-layer document-topic-word Bayesian model.

P(term | document) = P(topic | document) × P(term | topic)

In this embodiment, three parameters need to be set for LDA: the number of topics and the hyper-parameters α and β. The number of topics is set according to the actual text. The hyper-parameters α and β are generally set to 0.01 but can also be chosen by the user. The larger α is, the closer the whole document is to a single topic; the larger β is, the greater the importance of the specialized vocabulary under each topic.
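The document-topic-word relation can be illustrated numerically. In the sketch below, each topic contributes the product P(topic | document) × P(term | topic) from the formula above, and, as in the standard LDA formulation, the probability of a term in a document is the sum of these contributions over all topics. The topic names, the probability tables and the function name are made-up values for illustration; a trained LDA model would estimate them from the comment data.

```python
def term_probability(doc_topic, topic_term, term):
    """Sum P(topic | document) * P(term | topic) over all topics of the document."""
    return sum(
        p_topic * topic_term[topic].get(term, 0.0)
        for topic, p_topic in doc_topic.items()
    )


# Made-up distributions for one document and two topics (illustration only).
doc_topic = {"consumption": 0.75, "resources": 0.25}  # P(topic | document)
topic_term = {                                        # P(term | topic)
    "consumption": {"price": 0.5, "queue": 0.5},
    "resources": {"scenery": 1.0},
}

p_price = term_probability(doc_topic, topic_term, "price")
p_scenery = term_probability(doc_topic, topic_term, "scenery")
```

Here "price" only occurs under the consumption topic, so its probability in the document is 0.75 × 0.5; "scenery" only occurs under the resources topic, so its probability is 0.25 × 1.0.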

The topics are overall experience, scenic spot basic conditions, scenic spot tourism consumption, scenic spot tourism resources and scenic spot convenience. From the characteristic high-frequency words that appear under each topic, the key features that tourists are satisfied with can be found in the topic clusters of the positive comments, and the key features that tourists are dissatisfied with can be found in the topic clusters of the negative comments.

In conclusion, the invention establishes key satisfaction features of the scenic spot in each dimension, assigns values at the emotional level of each comment, and matches them against the tourists' review text to obtain the tourists' evaluation tendencies in different dimensions. These are used to measure satisfaction in specific dimensions such as overall experience, scenic spot basic conditions, scenic spot tourism consumption, scenic spot tourism resources and scenic spot convenience. This helps tourism management departments and scenic spot managers analyze tourist comments in multiple dimensions, analyze the scenic spot in a targeted manner, and improve the management level.

The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
