Processing method and processing device of thesis document, electronic equipment and storage medium

Document No.: 700996  Publication date: 2021-04-13

Reading note: This technology, "Processing method and processing device of thesis document, electronic equipment and storage medium", was designed by 辛洋 and 皮霞林 on 2019-10-09. Main content: An embodiment of the invention provides a processing method, a processing device, an electronic device and a storage medium for a thesis document. The method comprises: acquiring the style attribute of each paragraph in a thesis document to be processed; based on the numbering style and the numbering content in the style attribute, determining the part corresponding to the largest paragraph interval formed by paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed; determining the thesis elements of the different title paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to each title paragraph; determining the thesis elements of the non-body part of the thesis document to be processed; and, according to the correspondence between different thesis elements and different style attributes preset in a thesis template, setting new style attributes for the paragraphs corresponding to each determined thesis element in the thesis document to be processed. The embodiment reduces the typesetting difficulty for the user and improves the user experience.

1. A method of processing a thesis document, the method comprising:

acquiring the style attribute of each paragraph in the thesis document to be processed, wherein the style attribute is used for representing the paragraph style and the font style of each paragraph;

determining, based on the numbering style and the numbering content in the style attribute, the part corresponding to the largest paragraph interval composed of paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed, wherein the body part comprises: a title paragraph and a text content paragraph;

determining the thesis elements of different title paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to the title paragraphs; wherein one thesis element is used for representing the paragraphs in a thesis document that have the same style attribute;

determining the thesis elements of the non-body part of the thesis document to be processed, wherein the non-body part is the part of the thesis document to be processed other than the body part;

and setting new style attributes for the paragraphs corresponding to the determined thesis elements in the thesis document to be processed, according to the correspondence between different thesis elements and different style attributes preset in a thesis template.

2. The method of claim 1, wherein the step of acquiring the style attribute of each paragraph in the thesis document to be processed comprises:

acquiring at least the numbering format and the numbering content of the numbered paragraphs in the thesis document to be processed.

3. The method according to claim 1, wherein the step of determining, based on the numbering style and the numbering content in the style attribute, the part corresponding to the largest paragraph interval composed of paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed comprises:

grouping paragraphs with the same numbering style and consecutive numbering content into one paragraph interval, to obtain a plurality of paragraph intervals;

determining the part corresponding to the largest of the plurality of paragraph intervals as the body part; the starting position of the body part is the starting position of the largest paragraph interval, and the ending position of the body part is the nearest position after the largest paragraph interval that contains a preset keyword.

4. The method of claim 1, wherein the step of determining the thesis elements of different title paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to each title paragraph comprises:

identifying the title paragraphs with the same numbering style as belonging to the same level;

determining the thesis elements corresponding to the different levels and the thesis elements corresponding to the text content paragraphs; the thesis elements are used to represent the style attributes of paragraphs in a thesis document.

5. The method of claim 1, wherein the step of determining the thesis elements of the non-body part of the thesis document to be processed comprises:

for the non-body part of the thesis document to be processed, determining, according to a pre-established correspondence between different preset keywords and different thesis elements, the thesis element corresponding to a preset keyword recognized in the non-body part as the thesis element of the paragraph in which the preset keyword is located;

and determining the thesis element corresponding to the paragraph following the paragraph in which the preset keyword is located.

6. The method as claimed in claim 1, wherein the correspondence between different thesis elements and different style attributes is preset in a thesis template, and the step of setting new style attributes for the paragraphs corresponding to the determined thesis elements in the thesis document to be processed according to the correspondence between different thesis elements and different style attributes preset in the thesis template comprises:

generating an index for the thesis document to be processed, wherein the index represents the correspondence between the paragraph sequence numbers in the thesis document to be processed and different thesis elements, and a paragraph sequence number is the position of a paragraph in the sequential order of all paragraphs of the thesis document to be processed;

searching the thesis template for a first thesis element, wherein the first thesis element is a thesis element of the same type as a thesis element recorded in the index;

acquiring a first style attribute of the first thesis element, wherein the first style attribute is the style attribute corresponding to the first thesis element in the thesis template;

determining a second style attribute according to the first style attribute, wherein the second style attribute is the style attribute of the thesis element in the index;

and setting the second style attribute on the paragraph identified by the paragraph sequence number corresponding to the thesis element in the index.

7. The method of claim 1, wherein after the step of determining the thesis elements of the non-body part of the thesis document to be processed, the method further comprises:

creating a blank document;

and copying the content of the thesis document to be processed into the blank document, wherein the blank document comprises an index.

8. An apparatus for processing a thesis document, the apparatus comprising:

an acquisition module, configured to acquire the style attribute of each paragraph in a thesis document to be processed, wherein the style attribute is used for representing the paragraph style and the font style of each paragraph;

a first determining module, configured to determine, based on the numbering style and the numbering content in the style attribute, the part corresponding to the largest paragraph interval composed of paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed, wherein the body part comprises: a title paragraph and a text content paragraph;

a second determining module, configured to determine the thesis elements of different title paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to each title paragraph; wherein one thesis element is used for representing the paragraphs in a thesis document that have the same style attribute;

a third determining module, configured to determine the thesis elements of the non-body part of the thesis document to be processed, wherein the non-body part is the part of the thesis document to be processed other than the body part;

and a setting module, configured to set new style attributes for the paragraphs corresponding to the determined thesis elements in the thesis document to be processed, according to the correspondence between different thesis elements and different style attributes preset in a thesis template.

9. The apparatus of claim 8, wherein the acquisition module is specifically configured to:

acquire at least the numbering format and the numbering content of the numbered paragraphs in the thesis document to be processed.

10. The apparatus of claim 8, wherein the first determining module comprises:

a dividing submodule, configured to group paragraphs with the same numbering style and consecutive numbering content into one paragraph interval, to obtain a plurality of paragraph intervals;

a first determining submodule, configured to determine the part corresponding to the largest of the plurality of paragraph intervals as the body part; the starting position of the body part is the starting position of the largest paragraph interval, and the ending position of the body part is the nearest position after the largest paragraph interval that contains a preset keyword.

11. The apparatus of claim 8, wherein the second determining module comprises:

an identifying submodule, configured to identify the title paragraphs with the same numbering style as belonging to the same level;

a second determining submodule, configured to determine the thesis elements corresponding to the different levels and the thesis elements corresponding to the text content paragraphs; the thesis elements are used to represent the style attributes of paragraphs in a thesis document.

12. The apparatus of claim 8, wherein the third determining module comprises:

a third determining submodule, configured to determine, for the non-body part of the thesis document to be processed and according to a pre-established correspondence between different preset keywords and different thesis elements, the thesis element corresponding to a preset keyword recognized in the non-body part as the thesis element of the paragraph in which the preset keyword is located;

and a fourth determining submodule, configured to determine the thesis element corresponding to the paragraph following the paragraph in which the preset keyword is located.

13. The apparatus of claim 8, wherein the setting module comprises:

a generation submodule, configured to generate an index for the thesis document to be processed, wherein the index represents the correspondence between the paragraph sequence numbers in the thesis document to be processed and different thesis elements, and a paragraph sequence number is the position of a paragraph in the sequential order of all paragraphs of the thesis document to be processed;

a searching submodule, configured to search the thesis template for a first thesis element, wherein the first thesis element is a thesis element of the same type as a thesis element recorded in the index;

an obtaining submodule, configured to obtain a first style attribute of the first thesis element, wherein the first style attribute is the style attribute corresponding to the first thesis element in the thesis template;

a fifth determining submodule, configured to determine a second style attribute according to the first style attribute, wherein the second style attribute is the style attribute of the thesis element in the index;

and a setting submodule, configured to set the second style attribute on the paragraph identified by the paragraph sequence number corresponding to the thesis element in the index.

14. The apparatus of claim 8, further comprising:

an establishing module, configured to create a blank document;

and a copying module, configured to copy the content of the thesis document to be processed into the blank document, wherein the blank document comprises an index.

15. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

the memory is configured to store a computer program;

the processor is configured to implement the method steps of any of claims 1 to 7 when executing the program stored in the memory.

16. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 7.

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for processing a thesis document, an electronic device, and a storage medium.

Background

When editing a thesis document, the user generally needs to consider not only the content of the thesis, such as the topic selection and research direction, but also its typesetting, so that the thesis meets the format requirements.

A thesis generally includes parts such as a Chinese abstract, an English abstract, a table of contents and an introduction. Each part usually includes multiple paragraphs; for example, the introduction part may consist of a paragraph containing only the word "Introduction" and paragraphs containing the introduction content. In the prior art, the style attributes of a thesis document are set as follows: the user terminal receives a style attribute modification instruction for a specified paragraph in the thesis document (for example, to modify the line spacing or the font size of the paragraph), and then modifies the style attribute of the specified paragraph according to that instruction.

However, a thesis includes a plurality of parts, and each paragraph of each part has a plurality of style attributes, such as line spacing and font size. The user therefore has to send a separate setting instruction for every style attribute to be set, which makes the operation difficult and the user experience poor.

Disclosure of Invention

The embodiment of the invention aims to provide a processing method, a processing device, electronic equipment and a storage medium for a thesis document, so as to reduce the operation difficulty of a user when the style attribute of the thesis document is set. The specific technical scheme is as follows:

In a first aspect, an embodiment of the present invention provides a method for intelligently applying a thesis template, where the method includes:

acquiring the style attribute of each paragraph in the thesis document to be processed, wherein the style attribute is used for representing the paragraph style and the font style of each paragraph;

determining, based on the numbering style and the numbering content in the style attribute, the part corresponding to the largest paragraph interval composed of paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed, wherein the body part comprises: a title paragraph and a text content paragraph;

determining the thesis elements of different title paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to the title paragraphs; wherein one thesis element is used for representing the paragraphs in a thesis document that have the same style attribute;

determining the thesis elements of the non-body part of the thesis document to be processed, wherein the non-body part is the part of the thesis document to be processed other than the body part;

and setting new style attributes for the paragraphs corresponding to the determined thesis elements in the thesis document to be processed, according to the correspondence between different thesis elements and different style attributes preset in a thesis template.

Optionally, the step of acquiring the style attribute of each paragraph in the thesis document to be processed includes:

acquiring at least the numbering format and the numbering content of the numbered paragraphs in the thesis document to be processed.

Optionally, the step of determining, based on the numbering style and the numbering content in the style attribute, the part corresponding to the largest paragraph interval formed by paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed includes:

grouping paragraphs with the same numbering style and consecutive numbering content into one paragraph interval, to obtain a plurality of paragraph intervals;

determining the part corresponding to the largest of the plurality of paragraph intervals as the body part; the starting position of the body part is the starting position of the largest paragraph interval, and the ending position of the body part is the nearest position after the largest paragraph interval that contains a preset keyword.

Optionally, the step of determining the thesis elements of different title paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to each title paragraph includes:

identifying the title paragraphs with the same numbering style as belonging to the same level;

determining the thesis elements corresponding to the different levels and the thesis elements corresponding to the text content paragraphs; the thesis elements are used to represent the style attributes of paragraphs in a thesis document.

Optionally, the step of determining the thesis elements of the non-body part of the thesis document to be processed includes:

for the non-body part of the thesis document to be processed, determining, according to a pre-established correspondence between different preset keywords and different thesis elements, the thesis element corresponding to a preset keyword recognized in the non-body part as the thesis element of the paragraph in which the preset keyword is located;

and determining the thesis element corresponding to the paragraph following the paragraph in which the preset keyword is located.
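The keyword-based labelling of the non-body part can be illustrated with a minimal Python sketch. The keyword table, the element names (such as `abstract_title`) and the exact-match rule are assumptions made for illustration only; the source does not specify the preset keywords or how they are matched.

```python
from typing import Dict, List, Tuple

# Hypothetical preset keywords, each mapped to (element of the keyword
# paragraph, element of the paragraph that follows it).
KEYWORD_ELEMENTS: Dict[str, Tuple[str, str]] = {
    "abstract": ("abstract_title", "abstract_content"),
    "references": ("references_title", "references_content"),
    "acknowledgements": ("acknowledgements_title", "acknowledgements_content"),
}


def label_non_body(paragraphs: List[str]) -> Dict[int, str]:
    """Map paragraph index -> thesis element for keyword paragraphs and
    for the paragraph immediately after each keyword paragraph."""
    labels: Dict[int, str] = {}
    for idx, text in enumerate(paragraphs):
        # Assumption: a keyword is recognized when the whole paragraph,
        # ignoring case and surrounding whitespace, equals the keyword.
        match = KEYWORD_ELEMENTS.get(text.strip().lower())
        if match is None:
            continue
        title_element, content_element = match
        labels[idx] = title_element
        if idx + 1 < len(paragraphs):
            labels[idx + 1] = content_element
    return labels


non_body = ["Abstract", "This thesis studies ...", "References", "[1] Some entry"]
labels = label_non_body(non_body)
```

If the paragraph after a keyword is itself a keyword paragraph, the later iteration overwrites its label with the title element, so keyword paragraphs always win in this sketch.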

Optionally, the correspondence between different thesis elements and different style attributes is preset in a thesis template, and the step of setting new style attributes for the paragraphs corresponding to the determined thesis elements in the thesis document to be processed according to the correspondence between different thesis elements and different style attributes preset in the thesis template includes:

generating an index for the thesis document to be processed, wherein the index represents the correspondence between the paragraph sequence numbers in the thesis document to be processed and different thesis elements, and a paragraph sequence number is the position of a paragraph in the sequential order of all paragraphs of the thesis document to be processed;

searching the thesis template for a first thesis element, wherein the first thesis element is a thesis element of the same type as a thesis element recorded in the index;

acquiring a first style attribute of the first thesis element, wherein the first style attribute is the style attribute corresponding to the first thesis element in the thesis template;

determining a second style attribute according to the first style attribute, wherein the second style attribute is the style attribute of the thesis element in the index;

and setting the second style attribute on the paragraph identified by the paragraph sequence number corresponding to the thesis element in the index.
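The index-and-template steps above can be sketched in Python as follows. This is a simplified illustration: styles are modelled as plain dictionaries, the element names are hypothetical, and the second style attribute is assumed to be a straight copy of the first, since the source does not say how the second style attribute is derived from the first.

```python
from typing import Dict

Style = Dict[str, object]


def apply_template(
    index: Dict[int, str],              # paragraph sequence number -> thesis element
    template: Dict[str, Style],         # thesis element -> style preset in the template
    paragraph_styles: Dict[int, Style]  # current style of each paragraph
) -> Dict[int, Style]:
    """Return new paragraph styles with the template applied via the index."""
    updated = dict(paragraph_styles)
    for paragraph_no, element in index.items():
        # Look up the template element of the same type (the "first thesis element").
        first_style = template.get(element)
        if first_style is None:
            continue  # no matching element in the template: leave the paragraph as is
        # Assumption: the second style attribute is a copy of the first.
        second_style = dict(first_style)
        updated[paragraph_no] = second_style
    return updated


index = {0: "heading_1", 1: "body_text", 2: "caption"}
template = {
    "heading_1": {"font_size": 16, "bold": True},
    "body_text": {"font_size": 12, "line_spacing": 1.5},
}
current = {0: {"font_size": 12}, 1: {"font_size": 10}, 2: {"font_size": 9}}
restyled = apply_template(index, template, current)
```

Returning a new mapping rather than mutating the input keeps the original styles available, which is convenient if the result is to be written into a separate document.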

Optionally, after the step of determining the thesis elements of the non-body part of the thesis document to be processed, the method further includes:

creating a blank document;

and copying the content of the thesis document to be processed into the blank document, wherein the blank document comprises an index.

In a second aspect, an embodiment of the present invention provides an apparatus for processing a thesis document, where the apparatus includes:

an acquisition module, configured to acquire the style attribute of each paragraph in a thesis document to be processed, wherein the style attribute is used for representing the paragraph style and the font style of each paragraph;

a first determining module, configured to determine, based on the numbering style and the numbering content in the style attribute, the part corresponding to the largest paragraph interval composed of paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed, wherein the body part comprises: a title paragraph and a text content paragraph;

a second determining module, configured to determine the thesis elements of different title paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to each title paragraph; wherein one thesis element is used for representing the paragraphs in a thesis document that have the same style attribute;

a third determining module, configured to determine the thesis elements of the non-body part of the thesis document to be processed, wherein the non-body part is the part of the thesis document to be processed other than the body part;

and a setting module, configured to set new style attributes for the paragraphs corresponding to the determined thesis elements in the thesis document to be processed, according to the correspondence between different thesis elements and different style attributes preset in a thesis template.

Specifically, the acquisition module is configured to:

acquire at least the numbering format and the numbering content of the numbered paragraphs in the thesis document to be processed.

Specifically, the first determining module includes:

a dividing submodule, configured to group paragraphs with the same numbering style and consecutive numbering content into one paragraph interval, to obtain a plurality of paragraph intervals;

a first determining submodule, configured to determine the part corresponding to the largest of the plurality of paragraph intervals as the body part; the starting position of the body part is the starting position of the largest paragraph interval, and the ending position of the body part is the nearest position after the largest paragraph interval that contains a preset keyword.

Specifically, the second determining module includes:

an identifying submodule, configured to identify the title paragraphs with the same numbering style as belonging to the same level;

a second determining submodule, configured to determine the thesis elements corresponding to the different levels and the thesis elements corresponding to the text content paragraphs; the thesis elements are used to represent the style attributes of paragraphs in a thesis document.

Specifically, the third determining module includes:

a third determining submodule, configured to determine, for the non-body part of the thesis document to be processed and according to a pre-established correspondence between different preset keywords and different thesis elements, the thesis element corresponding to a preset keyword recognized in the non-body part as the thesis element of the paragraph in which the preset keyword is located;

and a fourth determining submodule, configured to determine the thesis element corresponding to the paragraph following the paragraph in which the preset keyword is located.

Specifically, the setting module includes:

a generation submodule, configured to generate an index for the thesis document to be processed, wherein the index represents the correspondence between the paragraph sequence numbers in the thesis document to be processed and different thesis elements, and a paragraph sequence number is the position of a paragraph in the sequential order of all paragraphs of the thesis document to be processed;

a searching submodule, configured to search the thesis template for a first thesis element, wherein the first thesis element is a thesis element of the same type as a thesis element recorded in the index;

an obtaining submodule, configured to obtain a first style attribute of the first thesis element, wherein the first style attribute is the style attribute corresponding to the first thesis element in the thesis template;

a fifth determining submodule, configured to determine a second style attribute according to the first style attribute, wherein the second style attribute is the style attribute of the thesis element in the index;

and a setting submodule, configured to set the second style attribute on the paragraph identified by the paragraph sequence number corresponding to the thesis element in the index.

Specifically, the apparatus further comprises:

an establishing module, configured to create a blank document;

and a copying module, configured to copy the content of the thesis document to be processed into the blank document, wherein the blank document comprises an index.

In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus; the memory, as a machine-readable storage medium, stores machine-executable instructions executable by the processor, and the machine-executable instructions cause the processor to implement the method steps of the processing method of the thesis document provided by the first aspect of the embodiment of the invention.

In a fourth aspect, the present invention provides a computer-readable storage medium, in which a computer program is stored, where the computer program is executed by a processor to perform the method steps of the processing method for thesis documents provided in the first aspect of the present invention.

According to the processing method, processing device, electronic device and storage medium for a thesis document provided by the embodiments of the invention, the style attribute of each paragraph in the thesis document to be processed is acquired; based on the numbering style and the numbering content in the style attribute, the part corresponding to the largest paragraph interval formed by paragraphs with the same numbering style and consecutive numbering content is determined as the body part of the thesis document to be processed; the thesis elements of the different title paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to each title paragraph are determined; the thesis elements of the non-body part are then determined; and new style attributes are set for the paragraphs corresponding to the determined thesis elements according to the correspondence between different thesis elements and different style attributes preset in the thesis template. The user only needs to send a setting instruction once, and the style attributes of every paragraph of the thesis document to be processed are set according to that instruction. This avoids the problem that the user has to send a separate setting instruction for each style attribute, which makes the operation difficult and the experience poor. Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for processing a thesis document according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating step S102 according to an embodiment of the present invention;

FIG. 3 is a flowchart illustrating step S103 according to an embodiment of the present invention;

FIG. 4 is a flowchart illustrating step S104 according to an embodiment of the present invention;

FIG. 5 is a flowchart illustrating step S105 according to an embodiment of the present invention;

FIG. 6 is a flowchart of another method for processing a thesis document according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of a device for processing a thesis document according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of a first determining module according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a second determining module according to an embodiment of the present invention;

FIG. 10 is a block diagram of a third determining module according to an embodiment of the present invention;

FIG. 11 is a schematic structural diagram of a setting module according to an embodiment of the present invention;

FIG. 12 is a schematic structural diagram of another processing apparatus for a thesis document according to an embodiment of the present invention;

FIG. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As shown in FIG. 1, an embodiment of the present invention provides a method for processing a thesis document, which may include the following steps:

S101, acquiring the style attribute of each paragraph in the thesis document to be processed.

In the embodiment of the present invention, the thesis document to be processed may refer to a thesis document on a user terminal to which the style attributes in a thesis template are to be applied. The style attribute may be used to represent the paragraph style and the font style of each paragraph. The paragraph style may include the numbering format, numbering content, line spacing, first-line indentation and the like of the paragraph; the font style may include the Chinese font size, Chinese typeface, English font size, English typeface and the like.
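As an illustration of what step S101 collects, the style attribute of one paragraph could be modelled as below. This is only a sketch: the class names, field names and default values are hypothetical and are not taken from the source or from any particular document library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParagraphStyle:
    numbering_format: Optional[str] = None   # None when the paragraph is not numbered
    numbering_content: Optional[str] = None  # e.g. "2.3" for a numbered heading
    line_spacing: float = 1.0
    first_line_indent: float = 0.0


@dataclass(frozen=True)
class FontStyle:
    chinese_typeface: Optional[str] = None
    chinese_font_size: Optional[float] = None
    english_typeface: Optional[str] = None
    english_font_size: Optional[float] = None


@dataclass(frozen=True)
class StyleAttribute:
    paragraph: ParagraphStyle
    font: FontStyle


def is_numbered(style: StyleAttribute) -> bool:
    """A paragraph counts as numbered when it carries a numbering format."""
    return style.paragraph.numbering_format is not None


heading = StyleAttribute(
    paragraph=ParagraphStyle(numbering_format="decimal", numbering_content="2.3"),
    font=FontStyle(chinese_font_size=14.0),
)
plain = StyleAttribute(paragraph=ParagraphStyle(), font=FontStyle())
```

Keeping the numbering format and numbering content next to the other paragraph properties mirrors the description, in which both belong to the paragraph style and are the inputs to step S102.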

S102, determining, based on the numbering style and the numbering content in the style attribute, the part corresponding to the largest paragraph interval composed of paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed.

The body part may include title paragraphs and text content paragraphs. It is the part of the thesis document to be processed that mainly presents the subject of the thesis. One characteristic that distinguishes the body part from the other parts is that it has title paragraphs at multiple levels, and these title paragraphs have two style attributes, a numbering style and numbering content, so a paragraph interval formed by title paragraphs of different levels can be determined based on these two style attributes. However, some non-title paragraphs also have both a numbering style and numbering content (for example, the paragraphs of the reference section), so multiple paragraph intervals are obtained. Since the body part is the part of the thesis document to be processed that contains the most paragraphs, and the more paragraphs a part contains the larger its paragraph interval is, the part corresponding to the largest paragraph interval can be determined as the body part of the thesis document to be processed.

S103, determining the thesis elements of different title paragraphs in the body part and the thesis elements of the text content paragraph corresponding to each title paragraph.

The body part can be divided into title paragraphs and text content paragraphs. Title paragraphs at the same level correspond to one type of thesis element, and the paragraphs of the body part other than title paragraphs correspond to another type of thesis element. The thesis elements of the title paragraphs at different levels in the body part are determined by dividing the paragraphs into levels, and the paragraphs of the body part that are not title thesis elements are determined as text content thesis elements. One thesis element is used to represent the paragraphs having the same style attribute in the thesis document.

S104, determining the paper elements of the non-text part of the paper document to be processed.

The non-text part can be divided into paragraphs containing a preset keyword and paragraphs without a preset keyword. The thesis element of a paragraph containing a preset keyword is determined according to the pre-established correspondence between different preset keywords and different thesis elements, and a paragraph without a preset keyword is determined as the content thesis element of the thesis element nearest to that paragraph. The non-text part of the thesis document to be processed is the part other than the body part, and comprises contents such as the Chinese abstract, the English abstract, the table of contents, the introduction and the like.

S105, setting new style attributes for paragraphs corresponding to the paper elements determined in the paper document to be processed according to the corresponding relations between the different paper elements and the different style attributes preset in the paper template.

An index is generated for the thesis document to be processed; for each thesis element in the index, the thesis element of the same type is searched for in the thesis template, and the style attribute corresponding to that thesis element in the thesis template is set on the paragraphs whose paragraph sequence numbers correspond to the thesis element in the index. The preset correspondence between different thesis elements and different style attributes can be represented by a table; for example, Table 1 includes three different types of thesis elements and multiple style attributes. The correspondence can be set by a technician according to actual business requirements; for example, the font size of the Chinese abstract can be set to 12, 15 or 22.

Table 1 thesis element and style attribute correspondence table

As an optional implementation manner of the embodiment of the present invention, the step S101 specifically includes:

acquiring at least the numbering format and the numbering content of the numbered paragraphs in the thesis document to be processed.

It can be understood that there are many style attributes. Since the numbering format and the numbering content of the numbered paragraphs are used in the subsequent steps to determine the body part of the thesis document to be processed, at least these two attributes are obtained; of course, style attributes such as the Chinese font size, English font size, line spacing, and the like can also be obtained.
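Illustratively, the style attribute obtained in step S101 may be held in a small record type. The following is only a minimal sketch under stated assumptions: the class names, field names and default values (`ParagraphStyle`, `FontStyle`, `StyleAttribute`, `is_numbered`) are hypothetical and are not taken from the embodiment, which does not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ParagraphStyle:
    # Hypothetical fields mirroring the paragraph style listed in the text.
    numbering_format: Optional[str]   # e.g. "%1.%2"; None for an unnumbered paragraph
    numbering_content: Optional[str]  # e.g. "1.2"; None for an unnumbered paragraph
    line_spacing: float = 1.0
    first_line_indent: float = 0.0


@dataclass(frozen=True)
class FontStyle:
    # Hypothetical fields mirroring the font style listed in the text.
    chinese_size: float = 12.0
    chinese_typeface: str = ""
    english_size: float = 12.0
    english_typeface: str = ""


@dataclass(frozen=True)
class StyleAttribute:
    paragraph: ParagraphStyle
    font: FontStyle = field(default_factory=FontStyle)


def is_numbered(style: StyleAttribute) -> bool:
    """True when both the numbering format and the numbering content are present."""
    p = style.paragraph
    return p.numbering_format is not None and p.numbering_content is not None
```

Only the two numbering fields are mandatory in this sketch, which matches the statement that at least the numbering format and numbering content are acquired.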

As an optional implementation manner of the embodiment of the present invention, as shown in FIG. 2, step S102 may specifically include:

S1021, dividing paragraphs with the same numbering style and consecutive numbering content into one paragraph interval, to obtain a plurality of paragraph intervals.

The paragraphs with the same numbering style and consecutive numbering content are all title paragraphs, and whether a paragraph is a title paragraph is determined as follows:

a predicted value of each paragraph in the thesis document to be processed being a title is calculated according to the style attribute, and the paragraphs serving as titles are determined among the paragraphs of the thesis document to be processed according to the calculated predicted values. Calculating, according to the style attribute, the predicted value of a paragraph being a title is prior art and is not described again here.

Paragraphs with the same numbering style and consecutive numbering content are divided into one paragraph interval, so that a plurality of paragraph intervals are obtained. A paragraph interval comprises title paragraphs having the same numbering style and consecutive numbering content, as well as non-title paragraphs, and is written as [smallest paragraph number among the title paragraphs, largest paragraph number among the title paragraphs]. For example, if the numbers 1.1, 1.2 and 1.3 have the same numbering style and are consecutive, and the paragraphs numbered 1.1, 1.2 and 1.3 have paragraph numbers 7, 12 and 15 respectively, the paragraph interval is [7, 15].

S1022, determining the part corresponding to the largest paragraph interval among the plurality of paragraph intervals as the body part.

The paragraph intervals contain different numbers of paragraphs, and the more paragraphs an interval contains, the larger it is; the part corresponding to the largest of the plurality of paragraph intervals is determined as the body part. The starting position of the body part is the starting position of the largest paragraph interval, and the ending position of the body part is the nearest position after the largest paragraph interval that contains a preset keyword; if the thesis document to be processed does not contain a preset keyword, the ending position of the body part is the ending position of the thesis document to be processed.
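Illustratively, steps S1021 and S1022 may be sketched as follows. This is a sketch under stated assumptions rather than the implementation of the embodiment: numbering content is assumed to be dotted decimal text such as "1.2", two numbers are treated as consecutive when only the last component increases by one, and the function names and the tuple layout are hypothetical. The text does not say whether the keyword paragraph at the ending position itself belongs to the body part, so the sketch only returns its paragraph number.

```python
from typing import Dict, List, Tuple

# (paragraph number, numbering style, numbering content) of a numbered title paragraph
Numbered = Tuple[int, str, str]
Interval = Tuple[int, int]


def is_consecutive(prev: str, cur: str) -> bool:
    """Assumed rule: same prefix, last component larger by exactly one."""
    p, c = prev.split("."), cur.split(".")
    if len(p) != len(c) or p[:-1] != c[:-1]:
        return False
    return p[-1].isdigit() and c[-1].isdigit() and int(c[-1]) == int(p[-1]) + 1


def paragraph_intervals(numbered: List[Numbered]) -> List[Interval]:
    """S1021: one interval per run of same-style, consecutively numbered paragraphs."""
    runs: Dict[str, List[List[Tuple[int, str]]]] = {}
    for para_no, style, content in sorted(numbered):
        style_runs = runs.setdefault(style, [])
        if style_runs and is_consecutive(style_runs[-1][-1][1], content):
            style_runs[-1].append((para_no, content))
        else:
            style_runs.append([(para_no, content)])
    return sorted((run[0][0], run[-1][0])
                  for style_runs in runs.values() for run in style_runs)


def body_range(intervals: List[Interval], keyword_paragraphs: List[int],
               last_paragraph: int) -> Interval:
    """S1022: start of the largest interval, and the nearest keyword position after it."""
    start, end = max(intervals, key=lambda iv: iv[1] - iv[0])
    later = [k for k in keyword_paragraphs if k > end]
    return start, (min(later) if later else last_paragraph)
```

With the example from the text, the paragraphs numbered 1.1, 1.2 and 1.3 at paragraph numbers 7, 12 and 15 yield the interval (7, 15).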

As an optional implementation manner of the embodiment of the present invention, step S102 may further include:

Step A, dividing paragraphs with the same numbering style and consecutive numbering content into one paragraph interval, to obtain a plurality of paragraph intervals.

Step B, determining the interval relation between the paragraph intervals, where the interval relation is one of separation, intersection and inclusion, and intersection and inclusion are collectively called the non-separation relation.

Separation means that two paragraph intervals do not overlap; intersection means that two paragraph intervals partially overlap; inclusion means that one paragraph interval lies completely within the other. Illustratively, if one paragraph interval is [1, 1] and the other is [2, 2], the two intervals have no overlapping part and their interval relation is separation; if one paragraph interval is [1, 5] and the other is [2, 2], the second interval is completely contained in the first and their interval relation is inclusion; if one paragraph interval is [1, 2] and the other is [2, 3], the overlapping part of the two intervals is [2, 2] and their interval relation is intersection.

Step C, if the interval relation between a first paragraph interval and a second paragraph interval is the separation relation, keeping the two paragraph intervals unchanged; the first and second paragraph intervals are taken from the obtained paragraph intervals sorted by the smallest paragraph number in each interval, with the second paragraph interval arranged after the first.

Step D, if the interval relation between the two paragraph intervals is the non-separation relation, taking the union of the two paragraph intervals to obtain a new paragraph interval.

For example, if the first paragraph interval is [1, 2] and the second paragraph interval is [2, 3], the two intervals overlap, so their interval relation is intersection, which is a non-separation relation, and the two intervals are merged to obtain a new paragraph interval [1, 3].

Step E, obtaining the largest paragraph interval among the finally obtained paragraph intervals, and determining the part corresponding to it as the body part of the thesis document to be processed.
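Illustratively, steps B to E may be sketched as an interval classification followed by a sorted merge. The function names are hypothetical, intervals are assumed to be closed (start, end) pairs of paragraph numbers, and the size of an interval is taken as the number of paragraphs it spans.

```python
from typing import List, Tuple

Interval = Tuple[int, int]


def interval_relation(a: Interval, b: Interval) -> str:
    """Step B: classify two closed intervals as separation, inclusion or intersection."""
    (a0, a1), (b0, b1) = a, b
    if a1 < b0 or b1 < a0:
        return "separation"
    if (a0 <= b0 and b1 <= a1) or (b0 <= a0 and a1 <= b1):
        return "inclusion"
    return "intersection"


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Steps C and D: keep separated intervals, replace non-separated ones by their union."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):  # sorted by smallest paragraph number
        if merged and interval_relation(merged[-1], (start, end)) != "separation":
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def largest_interval(intervals: List[Interval]) -> Interval:
    """Step E: the merged interval spanning the most paragraphs."""
    return max(merge_intervals(intervals), key=lambda iv: iv[1] - iv[0])
```

Sorting first means each interval only has to be compared with the most recently merged one, which is a design choice of this sketch and not something the text requires.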

In the processing method of a thesis document provided by the embodiment of the present invention, paragraphs with the same numbering style and consecutive numbering content are divided into one paragraph interval to obtain a plurality of paragraph intervals, and the part corresponding to the largest of these paragraph intervals is determined as the body part. The thesis document to be processed is thereby divided into a body part and a non-text part, and the style attributes can be set in different ways for the different parts, which improves the accuracy of setting the style attributes of the thesis document to be processed.

As an optional implementation manner of the embodiment of the present invention, as shown in FIG. 3, step S103 may specifically include:

S1031, identifying title paragraphs with the same numbering style as the same level.

It can be understood that multiple levels may be obtained by identifying title paragraphs having the same numbering style as the same level. Because title thesis elements at the same level have the same numbering format in the thesis document to be processed, title paragraphs with the same numbering style can be identified as the same level, and the thesis elements corresponding to the different levels are then determined.

S1032, determining the corresponding thesis elements of different levels and the corresponding thesis elements of the text content paragraphs.

After the smallest paragraph sequence number in each level is obtained, a group of smallest paragraph sequence numbers is obtained. The paragraph corresponding to the smallest sequence number in this group, together with all paragraphs at its level, is determined as the first-level title thesis element; the paragraph corresponding to the second smallest sequence number, together with all paragraphs at its level, is determined as the second-level title thesis element; and so on, so that the title thesis elements of every level can be determined. Here, paragraphs at the same level are the paragraphs identified as the same level. For example, suppose the title paragraphs with paragraph numbers 3, 7 and 9 are at one level and the title paragraphs with paragraph numbers 5, 10 and 12 are at another level. The smallest paragraph number in each level is obtained, namely 3 and 5, which form the group of smallest paragraph numbers. Since 3 is the smallest number in this group, the paragraph with paragraph number 3 and the paragraphs at its level, namely those with paragraph numbers 7 and 9, are determined as the first-level title thesis element.

The paragraphs of the body part other than those corresponding to the title thesis elements of the different levels are determined as text content thesis elements, where a thesis element is used to represent the style attribute of each paragraph in the thesis document.
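Illustratively, the ranking of levels by their smallest paragraph sequence number may be sketched as follows. The function name and the input layout (one list of paragraph numbers per identified level) are assumptions made for illustration.

```python
from typing import Dict, List


def rank_title_levels(levels: List[List[int]]) -> Dict[int, int]:
    """Map each title paragraph number to its level (1 = first-level title).

    Each inner list holds the paragraph numbers already identified as one level;
    the level whose smallest paragraph number comes first is ranked first.
    """
    ordered = sorted(levels, key=min)
    return {para_no: rank
            for rank, group in enumerate(ordered, start=1)
            for para_no in group}
```

With the example from the text, the groups {3, 7, 9} and {5, 10, 12} have smallest paragraph numbers 3 and 5, so paragraphs 3, 7 and 9 become first-level titles and paragraphs 5, 10 and 12 become second-level titles.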

As an optional implementation manner of the embodiment of the present invention, step S1031 may further include:

Step A, dividing title paragraphs with the same style attribute into a paragraph group.

Step B, determining the management interval of each title paragraph in each paragraph group according to the paragraph numbers and the following expression.

The management interval may indicate the interval formed by a title paragraph of the body part and the text content belonging to that title.

Step C, when a title paragraph has a next adjacent title paragraph in the paragraph group to which it belongs, the management interval of the title paragraph is: [paragraph number of the title paragraph, paragraph number of the next adjacent title paragraph in the same paragraph group minus 1]; when no next adjacent title paragraph exists in the paragraph group to which the title paragraph belongs, the management interval of the title paragraph is: [paragraph number of the title paragraph, paragraph number of the title paragraph].

For example, if the paragraph number of a title paragraph is 1 and the paragraph number of the next adjacent title paragraph in the same paragraph group is 4, the management interval of the title paragraph is [1, 3]; if the paragraph number of a title paragraph is 6 and there is no next adjacent title paragraph in the same paragraph group, the management interval of the title paragraph is [6, 6].

Step D, sorting the title paragraphs by paragraph number.

Step E, determining the interval relation between the management interval of a first paragraph and the management interval of a second paragraph, where the first paragraph and the second paragraph are two title paragraphs adjacent in the paragraph-number order, and the second paragraph follows the first in that order.

Step F, when the interval relation is the separation relation, judging whether the style attribute of the first paragraph is the same as the style attribute of the second paragraph.

Whether the style attribute of the first paragraph is the same as that of the second paragraph may be judged as follows:

First, it is judged whether the first paragraph and the second paragraph both have numbers.

If both have numbers, whether the style attribute of the first paragraph is the same as that of the second paragraph is judged according to the numbering format of the first paragraph and the numbering format of the second paragraph. If the numbering formats are the same, the style attributes of the first paragraph and the second paragraph are judged to be the same.

If the first paragraph and the second paragraph do not both have numbers, that is, neither paragraph is numbered, or only one of them is numbered and the other is not, whether the style attribute of the first paragraph is the same as that of the second paragraph is judged according to the text settings of the paragraphs. If the text settings of the paragraphs are the same, the style attributes of the first paragraph and the second paragraph are judged to be the same.

In one case, the text settings of a paragraph include the font size, whether the paragraph is centered and whether it is bold; when the font size, centering and bold settings are all the same, the text settings of the paragraphs are the same.

Step G, if they are the same, determining that the first paragraph and the second paragraph are same-level paragraphs.

Step H, if they are different, searching for a similar paragraph, where a similar paragraph is a title paragraph that precedes the first paragraph in the paragraph-number order and has the same style attribute as the second paragraph; if a similar paragraph exists, determining that the second paragraph and the similar paragraph are at the same level; if no similar paragraph exists, determining that, of the first paragraph and the second paragraph, the paragraph with the smaller paragraph number is the paragraph one level above the paragraph with the larger paragraph number.

When searching for a similar paragraph, the preceding title paragraphs are examined one by one, recursively, in paragraph-number order, starting from the title paragraph immediately before the first paragraph.

Step I, when the interval relation is the non-separation relation, performing the step of searching for a similar paragraph.

If a similar paragraph exists, determining that the second paragraph and the similar paragraph are at the same level.

If no similar paragraph exists, determining that, of the first paragraph and the second paragraph, the paragraph with the smaller paragraph number is the paragraph one level above the paragraph with the larger paragraph number.
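Illustratively, steps A to I may be sketched as follows. This is one possible reading of the steps, not the implementation of the embodiment: the `Title` record, its default values and the function names are hypothetical, the first title in paragraph-number order is assumed to be level 1, and a paragraph "one level above" is represented by giving the later paragraph the earlier paragraph's level plus one.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Title:
    para_no: int
    numbering_format: Optional[str]  # None when the title is not numbered
    font_size: float = 12.0
    centered: bool = False
    bold: bool = False


def same_style(a: Title, b: Title) -> bool:
    """Step F: compare numbering formats if both are numbered, else text settings."""
    if a.numbering_format is not None and b.numbering_format is not None:
        return a.numbering_format == b.numbering_format
    return (a.font_size, a.centered, a.bold) == (b.font_size, b.centered, b.bold)


def management_intervals(titles: List[Title]) -> Dict[int, Tuple[int, int]]:
    """Steps A to C: group same-style titles, then span each title up to the next one."""
    groups: List[List[Title]] = []
    for t in sorted(titles, key=lambda t: t.para_no):
        for g in groups:
            if same_style(g[0], t):
                g.append(t)
                break
        else:
            groups.append([t])
    spans: Dict[int, Tuple[int, int]] = {}
    for g in groups:
        for cur, nxt in zip(g, g[1:]):
            spans[cur.para_no] = (cur.para_no, nxt.para_no - 1)
        spans[g[-1].para_no] = (g[-1].para_no, g[-1].para_no)
    return spans


def assign_levels(titles: List[Title]) -> Dict[int, int]:
    """Steps D to I: walk adjacent title pairs and derive a level for each title."""
    ordered = sorted(titles, key=lambda t: t.para_no)
    if not ordered:
        return {}
    spans = management_intervals(ordered)
    levels = {ordered[0].para_no: 1}  # assumption: the first title is level 1
    for i in range(1, len(ordered)):
        first, second = ordered[i - 1], ordered[i]
        separated = spans[first.para_no][1] < spans[second.para_no][0]
        if separated and same_style(first, second):
            levels[second.para_no] = levels[first.para_no]        # step G
            continue
        # Steps H and I: look backwards, starting just before the first paragraph.
        similar = next((t for t in reversed(ordered[:i - 1])
                        if same_style(t, second)), None)
        if similar is not None:
            levels[second.para_no] = levels[similar.para_no]      # same level
        else:
            levels[second.para_no] = levels[first.para_no] + 1    # one level below
    return levels
```

For the example in the text, titles at paragraph numbers 1 and 4 in one group and a lone title at paragraph number 6 in another group give the management intervals [1, 3], [4, 4] and [6, 6].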

As an optional implementation manner of the embodiment of the present invention, as shown in FIG. 4, step S104 may specifically include:

S1041, for the non-text part of the thesis document to be processed, determining, according to the pre-established correspondence between different preset keywords and different thesis elements, the thesis element corresponding to a preset keyword identified in the non-text part, and using the determined thesis element as the thesis element of the paragraph where the preset keyword is located.

In the non-text part of the thesis document to be processed, some paragraphs contain preset keywords and some do not. For a paragraph containing a preset keyword, its thesis element can be determined through the pre-established correspondence between different preset keywords and different thesis elements. This correspondence can be represented by a table, as shown in Table 2, and can be set by a technician according to actual business requirements; for example, the preset keywords corresponding to the thesis element for the two-character heading of the Chinese abstract can be set to "abstract", "content abstract" or "content outline".

The full text of the thesis document to be processed is traversed to look for the preset keywords in Table 2; the thesis element corresponding to each preset keyword found is determined from the table and used as the thesis element of the paragraph where that keyword is located.

Table 2 thesis element and preset keyword correspondence table

S1042, determining a thesis element corresponding to a next paragraph of the paragraph where the preset keyword is located.

For a paragraph without a preset keyword in the non-text part, the nearest thesis element containing a preset keyword before the paragraph is searched for, and the paragraph is determined as the content thesis element of the thesis element found. For example, if a paragraph does not contain a preset keyword and the nearest thesis element containing a preset keyword before it is the thesis element for the two-character heading "introduction", the paragraph is determined as the introduction content thesis element.
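Illustratively, steps S1041 and S1042 may be sketched as follows. The keyword table below is hypothetical, since the contents of Table 2 are not reproduced in this text, and so are the element names and the function name. As a further assumption, a paragraph is treated as containing a preset keyword only when its whole text equals the keyword, so that a content paragraph that merely mentions the word is not mistaken for a heading.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for Table 2: preset keyword -> heading thesis element.
KEYWORD_TO_ELEMENT: Dict[str, str] = {
    "abstract": "abstract_heading",
    "content abstract": "abstract_heading",
    "introduction": "introduction_heading",
}


def classify_non_text(paragraphs: List[str]) -> List[Optional[str]]:
    """Return one thesis element per paragraph of the non-text part."""
    elements: List[Optional[str]] = []
    current: Optional[str] = None  # nearest preceding heading element
    for text in paragraphs:
        key = text.strip().lower()
        if key in KEYWORD_TO_ELEMENT:            # S1041: keyword paragraph
            current = KEYWORD_TO_ELEMENT[key]
            elements.append(current)
        elif current is not None:                # S1042: content of nearest heading
            elements.append(current.replace("_heading", "_content"))
        else:
            elements.append(None)                # no keyword seen yet
    return elements
```

Paragraphs that appear before any preset keyword are left unclassified in this sketch, because the text does not say how they are handled.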

As an optional implementation manner of the embodiment of the present invention, as shown in FIG. 5, step S105 may specifically include:

S1051, generating an index for the thesis document to be processed.

The index in the embodiment of the invention represents the correspondence between paragraph sequence numbers and the different thesis elements in the thesis document to be processed, where a paragraph sequence number is the position of a paragraph in the ordered sequence of all paragraphs of the thesis document to be processed. Table 3 gives an example of the correspondence between the body title thesis elements and paragraph sequence numbers; the index table may include paragraph sequence numbers and thesis element types, and one thesis element type may correspond to multiple paragraph sequence numbers.

TABLE 3 thesis element and paragraph number index Table

S1052, searching for a first thesis element in the thesis template.

For the thesis document to be processed, the thesis element types corresponding to the paragraph sequence numbers in the index are obtained one by one in paragraph-sequence-number order, and a first thesis element is then searched for in the thesis template, where the first thesis element is of the same type as the thesis element recorded in the index.

S1053, obtaining the first style attribute of the first thesis element.

The style attribute corresponding to the first thesis element found is obtained from the correspondence table between thesis elements and style attributes preset in the thesis template; the first style attribute is the style attribute corresponding to the first thesis element in the thesis template.

S1054, determining a second style attribute according to the first style attribute.

The second style attribute is the style attribute of a thesis element in the index, and the second style attribute may be the same as the first style attribute.

S1055, setting the second style attribute on the paragraph identified by the paragraph sequence number corresponding to the thesis element in the index.

In the index generated for the thesis document to be processed, the paragraph sequence number corresponding to the thesis element for which the second style attribute is to be set is determined, and the second style attribute is set on the paragraph identified by that paragraph sequence number.
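Illustratively, steps S1051 to S1055 may be sketched with plain dictionaries. The function names, the element names and the dictionary layout of the template (a stand-in for Table 1) are hypothetical. As assumptions of this sketch, the second style attribute is simply a copy of the first, and thesis elements that the template does not list are left unchanged.

```python
from typing import Dict, List

Style = Dict[str, object]


def build_index(element_of: Dict[int, str]) -> Dict[str, List[int]]:
    """S1051: map each thesis element type to its paragraph sequence numbers."""
    index: Dict[str, List[int]] = {}
    for para_no in sorted(element_of):
        index.setdefault(element_of[para_no], []).append(para_no)
    return index


def apply_template(index: Dict[str, List[int]],
                   template: Dict[str, Style],
                   styles: Dict[int, Style]) -> Dict[int, Style]:
    """S1052 to S1055: copy each element's template style onto its paragraphs."""
    for element, para_nos in index.items():
        first_style = template.get(element)   # S1052, S1053: look up in the template
        if first_style is None:
            continue                          # assumption: unknown elements untouched
        second_style = dict(first_style)      # S1054: here identical to the first
        for para_no in para_nos:              # S1055: set on every indexed paragraph
            styles[para_no] = dict(second_style)
    return styles
```

Because one thesis element type can correspond to several paragraph sequence numbers, a single template lookup serves all paragraphs of that type.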

The processing method of the thesis document provided by the embodiment of the invention obtains the style attribute of each paragraph in the thesis document to be processed; determines, based on the numbering style and the numbering content in the style attribute, the part corresponding to the largest paragraph interval formed by paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed; determines the thesis elements of the different title paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to the title paragraphs; determines the thesis elements of the non-text part of the thesis document to be processed; and sets a new style attribute for the paragraph corresponding to each determined thesis element according to the correspondence between different thesis elements and different style attributes preset in the thesis template. The user only needs to issue a setting instruction once, and the style attributes of every paragraph of the thesis document to be processed are set according to that instruction. This avoids the situation in which the user has to issue a separate setting instruction for each style attribute, which makes operation difficult and the experience poor.

As shown in FIG. 6, an embodiment of the present invention further provides a method for processing a thesis document, where the method may include:

S201, obtaining the style attribute of each paragraph in the thesis document to be processed.

This step is the same as step S101 in the embodiment shown in fig. 1, and is not described again here.

S202, based on the numbering style and the numbering content in the style attribute, determining a part corresponding to a maximum paragraph interval formed by the paragraphs with the same numbering style and continuous numbering content as a text part of the paper document to be processed.

This step is the same as step S102 in the embodiment shown in fig. 1, and is not described again here.

S203, determining the thesis elements of different title paragraphs in the body part and the thesis elements of the text content paragraph corresponding to each title paragraph.

This step is the same as step S103 in the embodiment shown in fig. 1, and is not described again here.

S204, determining the paper elements of the non-text part of the paper document to be processed.

This step is the same as step S104 in the embodiment shown in fig. 1, and is not described again here.

S205, establishing a blank document.

S206, copying the content of the thesis document to be processed into the blank document, where the blank document includes the index.

It can be understood that copying the thesis document to be processed into the blank document yields a new thesis document to be processed; the subsequent step of setting style attributes is performed on this new document, while the style attributes of each paragraph of the original thesis document to be processed remain unchanged. When a user wants to keep the style attributes of the original thesis document to be processed, a new thesis document can be generated in this way and the style attributes set on it.

S207, setting new style attributes for paragraphs corresponding to the paper elements determined in the paper document to be processed according to the corresponding relations between the different paper elements and the different style attributes preset in the paper template.

This step is the same as step S105 in the embodiment shown in fig. 1, and is not described again here.

The processing method of the thesis document provided by the embodiment of the invention obtains a new thesis document to be processed by creating a blank document and copying the thesis document to be processed into it, where the new thesis document includes the index. The step of setting style attributes is carried out on the new thesis document without changing the style attributes of the original, so the style attributes of the original thesis document to be processed are retained and the user experience is improved.
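Illustratively, the copy performed in steps S205 and S206 may be sketched as follows, with a document modelled as a dictionary. The dictionary layout and the function name are assumptions made for illustration; a deep copy is used so that later style changes on the new document cannot reach the original.

```python
import copy
from typing import Dict, List


def copy_to_blank_document(original: Dict[str, list],
                           index: Dict[str, List[int]]) -> Dict[str, object]:
    """Create a blank document, then copy the content and the index into it."""
    new_document: Dict[str, object] = {"paragraphs": [], "index": {}}    # S205
    new_document["paragraphs"] = copy.deepcopy(original["paragraphs"])   # S206
    new_document["index"] = copy.deepcopy(index)
    return new_document
```

Styles are then set on the returned document only, so the original document keeps the style attributes it had before processing.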

As shown in fig. 7, an embodiment of the present invention further provides a device for processing a thesis document, including:

an obtaining module 301, configured to obtain a style attribute of each paragraph in the paper document to be processed, where the style attribute is used to represent a paragraph style and a font style of each paragraph.

A first determining module 302, configured to determine, based on the numbering style and the numbering content in the style attribute, a portion corresponding to a maximum paragraph interval formed by paragraphs with the same numbering style and consecutive numbering content as a body portion of the paper document to be processed, where the body portion includes: a title paragraph and a text content paragraph.

A second determining module 303, configured to determine thesis elements of different title paragraphs in the body part and a thesis element of a text content paragraph corresponding to each title paragraph; wherein one paper element is used to represent paragraphs in a paper document having the same style attribute.

A third determining module 304, configured to determine a thesis element of a non-text portion of the to-be-processed thesis document, where the non-text portion is another portion of the to-be-processed thesis document except for the text portion.

The setting module 305 is configured to set a new style attribute for a paragraph corresponding to each paper element determined in the to-be-processed paper document according to a correspondence between different paper elements and different style attributes preset in the paper template.

As an optional implementation manner of the embodiment of the present invention, the obtaining module 301 is specifically configured to:

acquire at least the numbering format and the numbering content of the numbered paragraphs in the thesis document to be processed.

As an optional implementation manner of the embodiment of the present invention, as shown in fig. 8, the first determining module 302 includes:

A dividing submodule 3021, configured to divide paragraphs with the same numbering style and consecutive numbering content into one paragraph interval, to obtain a plurality of paragraph intervals.

A first determining submodule 3022, configured to determine the part corresponding to the largest paragraph interval of the plurality of paragraph intervals as the body part; the starting position of the body part is the starting position of the largest paragraph interval, and the ending position of the body part is the nearest position after the largest paragraph interval that contains a preset keyword.

As an optional implementation manner of the embodiment of the present invention, as shown in fig. 9, the second determining module 303 includes:

An identifying submodule 3031, configured to identify title paragraphs having the same numbering style as the same level.

A second determining submodule 3032, configured to determine the thesis elements corresponding to the different levels and the thesis elements corresponding to the text content paragraphs; a thesis element is used to represent the style attribute of each paragraph in the thesis document.

As an optional implementation of the embodiment of the present invention, as shown in FIG. 10, the third determining module 304 includes:

The third determining submodule 3041 is configured to, for the non-body part of the thesis document to be processed and according to a pre-established correspondence between different preset keywords and different thesis elements, determine the thesis element corresponding to a preset keyword identified in the non-body part as the thesis element of the paragraph in which that keyword is located.

The fourth determining submodule 3042 is configured to determine the thesis element corresponding to the paragraph that follows the paragraph in which the preset keyword is located.
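The keyword lookup performed by these two submodules might look like the sketch below. The table `KEYWORD_ELEMENTS`, its keywords, and all element labels are hypothetical examples and do not come from the embodiment; matching by simple substring containment is likewise an assumption.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical correspondence: keyword -> (element of the keyword paragraph,
#                                          element of the paragraph that follows it)
KEYWORD_ELEMENTS: Dict[str, Tuple[str, str]] = {
    "Abstract": ("abstract_title", "abstract_text"),
    "References": ("references_title", "reference_entry"),
    "Acknowledgements": ("acknowledgements_title", "acknowledgements_text"),
}


def label_non_body(texts: List[str]) -> List[Optional[str]]:
    """Label paragraphs of the non-body part; None means no element was determined."""
    labels: List[Optional[str]] = [None] * len(texts)
    for i, text in enumerate(texts):
        for keyword, (own, following) in KEYWORD_ELEMENTS.items():
            if keyword in text:
                labels[i] = own  # element of the paragraph containing the keyword
                if i + 1 < len(texts):
                    labels[i + 1] = following  # element of the next paragraph
                break
    return labels
```

Because the loop runs front to back, a following paragraph that itself contains a keyword is relabelled with its own keyword element when the loop reaches it.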

As an optional implementation of the embodiment of the present invention, as shown in FIG. 11, the setting module 305 includes:

The generation submodule 3051 is configured to generate an index for the thesis document to be processed, where the index represents the correspondence between paragraph numbers in the thesis document to be processed and different thesis elements, and a paragraph number is the sequence number of a paragraph within the ordered sequence of all paragraphs of the thesis document to be processed.

The searching submodule 3052 is configured to search the thesis template for a first thesis element, where the first thesis element is a thesis element of the same type as a thesis element recorded in the index.

The obtaining submodule 3053 is configured to obtain a first style attribute of the first thesis element, where the first style attribute is the style attribute corresponding to the first thesis element in the thesis template.

The fifth determining submodule 3054 is configured to determine a second style attribute according to the first style attribute, where the second style attribute is the style attribute of the thesis element in the index.

The setting submodule 3055 is configured to apply the second style attribute to the paragraph identified by the paragraph number that corresponds to the thesis element in the index.
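The index lookup and style transfer described above can be sketched as follows. Representing a style attribute as a plain dictionary, using 0-based paragraph numbers, and deriving the second style attribute as an unchanged copy of the first are all assumptions of this illustration; the name `apply_template` is hypothetical.

```python
from typing import Dict, List


def apply_template(
    paragraph_styles: List[dict],
    index: Dict[int, str],
    template: Dict[str, dict],
) -> List[dict]:
    """Return new per-paragraph styles after applying the template through the index.

    index:    paragraph number (0-based here) -> thesis element
    template: thesis element -> style attribute defined in the thesis template
    """
    result = [dict(style) for style in paragraph_styles]  # leave the input untouched
    for number, element in index.items():
        first_style = template.get(element)  # look up the same element type in the template
        if first_style is None:
            continue  # assumption: paragraphs with no matching template element keep their style
        second_style = dict(first_style)  # assumption: second style is a copy of the first
        result[number] = second_style
    return result
```

With `index = {0: "heading_1", 1: "body_text"}` and a template that defines both elements, a single call restyles every indexed paragraph, which corresponds to the single setting instruction the embodiment describes.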

As an optional implementation of the embodiment of the present invention, on the basis of the apparatus structure shown in FIG. 7 and as shown in FIG. 12, the processing device for a thesis document according to the embodiment of the present invention may further include:

The creating module 401 is configured to create a blank document.

The copying module 402 is configured to copy the content of the thesis document to be processed into the blank document, where the blank document includes the index.

The processing device for a thesis document provided by the embodiment of the present invention acquires the style attribute of each paragraph in the thesis document to be processed. Based on the numbering style and the numbering content in the style attribute, it determines the part corresponding to the maximum paragraph interval formed by paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed. It then determines the thesis elements of the different heading paragraphs in the body part and of the text content paragraphs corresponding to each heading paragraph, determines the thesis elements of the non-body part, and, according to the correspondence preset in the thesis template between different thesis elements and different style attributes, sets a new style attribute for the paragraph corresponding to each determined thesis element. The user therefore needs to issue a setting instruction only once for the style attributes of all paragraphs to be set, which avoids the problem of the user having to issue a separate setting instruction for every style attribute, with the resulting operating difficulty and poor experience.

An embodiment of the present invention further provides an electronic device, as shown in FIG. 13, including a processor 501, a communication interface 502, a memory 503 and a communication bus 504, where the processor 501, the communication interface 502 and the memory 503 communicate with one another via the communication bus 504, and the memory 503 is used for storing a computer program;

the processor 501, when executing the program stored in the memory 503, implements the following steps:

acquiring the style attribute of each paragraph in the thesis document to be processed;

determining, based on the numbering style and the numbering content in the style attribute, the part corresponding to the maximum paragraph interval formed by paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed;

determining the thesis elements of the different heading paragraphs in the body part and the thesis elements of the text content paragraphs corresponding to each heading paragraph;

determining the thesis elements of the non-body part of the thesis document to be processed;

and setting, according to the correspondence preset in the thesis template between different thesis elements and different style attributes, a new style attribute for the paragraph corresponding to each determined thesis element in the thesis document to be processed.

The communication bus mentioned for the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single thick line in the figure, but this does not mean that there is only one bus or only one type of bus.

The communication interface is used for communication between the electronic device and other devices.

The memory may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.

The processor may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

The electronic device provided by the embodiment of the present invention acquires the style attribute of each paragraph in the thesis document to be processed. Based on the numbering style and the numbering content in the style attribute, it determines the part corresponding to the maximum paragraph interval formed by paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed. It then determines the thesis elements of the different heading paragraphs in the body part and of the text content paragraphs corresponding to each heading paragraph, determines the thesis elements of the non-body part, and, according to the correspondence preset in the thesis template between different thesis elements and different style attributes, sets a new style attribute for the paragraph corresponding to each determined thesis element. The user therefore needs to issue a setting instruction only once for the style attributes of all paragraphs to be set, which avoids the problem of the user having to issue a separate setting instruction for every style attribute, with the resulting operating difficulty and poor experience.

In yet another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored; when executed by a processor, the computer program implements the steps of any one of the above methods for processing a thesis document.

With the computer-readable storage medium provided by the embodiment of the present invention, the stored computer program, when executed, acquires the style attribute of each paragraph in the thesis document to be processed. Based on the numbering style and the numbering content in the style attribute, it determines the part corresponding to the maximum paragraph interval formed by paragraphs with the same numbering style and consecutive numbering content as the body part of the thesis document to be processed. It then determines the thesis elements of the different heading paragraphs in the body part and of the text content paragraphs corresponding to each heading paragraph, determines the thesis elements of the non-body part, and, according to the correspondence preset in the thesis template between different thesis elements and different style attributes, sets a new style attribute for the paragraph corresponding to each determined thesis element. The user therefore needs to issue a setting instruction only once for the style attributes of all paragraphs to be set, which avoids the problem of the user having to issue a separate setting instruction for every style attribute, with the resulting operating difficulty and poor experience.

Because the device and storage medium embodiments are substantially similar to the method embodiments, they are described relatively briefly; for the relevant points, reference may be made to the corresponding parts of the description of the method embodiments. It should be noted that the device and the storage medium according to the embodiments of the present invention are, respectively, a device and a storage medium to which the above method for processing a thesis document is applied; all embodiments of that method are applicable to the device and the storage medium and can achieve the same or similar beneficial effects.

It should be noted that, herein, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant points, reference may be made to the corresponding parts of the description of the method embodiment.

The above describes only preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
