Multi-level fingerprint identification method for multiple browser extensions

Document No.: 1904793. Published: 2021-11-30.

Summary: Multi-level fingerprint identification method for multiple browser extensions, designed by 刘亮, 吕婷 and 祝方舟 on 2021-09-07. The invention relates to a multi-level fingerprint identification method for multiple browser extensions and belongs to the field of network information security within computing. It extends existing browser extension fingerprints: it refines the existing extension fingerprinting technique based on modifications to the web page DOM (Document Object Model) and proposes a new extension fingerprinting technique based on JavaScript properties, which can improve the uniqueness of extension fingerprints. Using a custom decoy page, an extension fingerprint database is collected and built, and on that basis a multi-level browser extension identification method is proposed that can effectively defeat existing defenses against browser extension fingerprinting while improving the accuracy of extension identification.

1. A multi-level fingerprint identification method for multiple browser extensions, characterized by comprising the following steps:

(1) Decoy page generation. A decoy page is designed so that, when it is loaded by the user's browser, it triggers as many page modifications as possible by the extensions the user has installed, so that more extension fingerprint information is captured.

(2) Obtain the list of all extensions and load them one by one with an automated program; by extracting the differences in the web page DOM and in the accessible JavaScript properties before and after each extension is loaded, automatically build an extension fingerprint database that serves as the reference for extension identification.

(3) Extension identification. When the user visits the constructed decoy page, the modifications made by extensions to the web page DOM and JavaScript properties are collected, analyzed and sent to the attack server. The attack server compares the received fingerprint information with the extension fingerprints in the extension fingerprint database using a fuzzy matching algorithm, and finally obtains the list of extensions installed by the user.

2. The decoy page generation of claim 1, comprising the following steps:

(1) For extensions that modify basic HTML elements: create a basic decoy page that contains all HTML tags and attributes.

(2) For extensions that modify web page text: build a database of text content that triggers different extensions, and add that text content to the basic page built in step (1).

(3) For extensions that modify the values of particular attributes: generate a list of attributes, and add every attribute in the list, with all of its possible values, to the decoy page.

(4) Add JavaScript code to the decoy page to capture the current web page DOM and the JavaScript property list.

3. The construction of the extension fingerprint database of claim 1, comprising the following steps:

(1) Access the decoy page generated in claim 2 and obtain the web page DOM and the JavaScript property list at that time.

(2) Obtain the list of extension IDs for the Chrome browser.

(3) Load the extensions one by one and access the decoy page generated in claim 2; after each extension is loaded, obtain the web page DOM and the JavaScript property list as modified by the extension and send them to the attack server.

(4) The attack server compares the web page DOM and the JavaScript property list obtained in step (1) with those obtained in step (3).

(5) Process the comparison result.

(6) Store the processed result in the extension fingerprint database, each record of which mainly comprises: extension ID, added DOM elements, deleted DOM elements, added JavaScript properties, and deleted JavaScript properties.

4. The processing of the comparison result of claim 3, comprising the following steps:

(1) Split the web page DOM comparison result by tag, filter out the specific value of every tag attribute, and keep only the tag name, the attribute names of the tag, and the text content in the page.

(2) Split the comparison result of the JavaScript property list by property, where each property comprises a property name and its corresponding value.

5. The extension identification of claim 1, comprising the following steps:

(1) A user with installed extensions accesses the decoy page generated in claim 2.

(2) After the page has loaded, the JavaScript code in the decoy page automatically collects the DOM and the JavaScript property list of the current page and sends them to the attack server.

(3) The attack server compares the information obtained in step (2) with the DOM and JavaScript property list of the original decoy page obtained in step (1) of claim 3.

(4) The comparison result is processed according to claim 4 to obtain a quadruple consisting of added DOM elements, deleted DOM elements, added JavaScript properties, and deleted JavaScript properties.

(5) The quadruple of step (4) is compared with the extension fingerprint database of claim 3 using a fuzzy matching algorithm to obtain a candidate extension list.

(6) Every extension in the candidate list whose fingerprint is a subset of the fingerprint of any other extension in the list is filtered out, yielding the final extension list.

6. The fuzzy matching algorithm of claim 5, comprising the following steps:

(1) For each extension fingerprint in the extension fingerprint database, count the tags in its added DOM elements and in its deleted DOM elements, and count the properties in its added JavaScript properties and in its deleted JavaScript properties.

(2) For each extension fingerprint in the extension fingerprint database, determine whether each tag in its added DOM elements is contained in the added DOM elements of the quadruple of claim 5, and count the tags that satisfy this condition; handle the deleted DOM elements in the same way; likewise count the matching added and deleted JavaScript properties.

(3) If the tag counts from step (2) are greater than or equal to the product of the corresponding tag counts from step (1) and a threshold T1, and the property counts likewise satisfy a threshold T2, add the extension to the candidate list.

Technical Field

The invention relates to a multi-level fingerprint identification method for multiple browser extensions and belongs to the field of network information security within computing.

Background

Browser extensions are third-party applications that customize the browsing experience. While they provide convenience [1-3], they also pose many security and privacy threats [4, 5]. Once a user's list of installed extensions is obtained, the user's identity and privacy are exposed, in two ways. On one hand, different users install different extensions according to their needs and preferences, so each user's extension list carries personal characteristics. An attacker can derive a fingerprint from the list to identify the user, which enables identification of users who are not logged in, anonymous tracking, and personalized advertising. On the other hand, some browser extensions reveal personal information directly; for example, a user who has installed the Trump Filter extension may reveal his or her political leanings.

For security reasons, the browser prevents loaded web pages from directly reading the list of extensions the user has installed. Nevertheless, recent research has found that the list can still be obtained indirectly through browser extension fingerprinting techniques [6, 7]. The attacker first creates a decoy web page (a custom page consisting of HTML, CSS, and JavaScript). When a user visits the decoy page, the browser loads it and executes its JavaScript code. That code automatically collects and generates browser extension fingerprints by comparing and analyzing the modifications that extensions make to the decoy page, the extensions' WAR (web-accessible resource) IDs, and so on, and sends the fingerprints to a server. The server matches the fingerprint information against an extension fingerprint database collected in advance and thereby infers the list of extensions installed by the user.

To protect user privacy, many researchers have recently begun studying defenses against browser extension fingerprinting. Trickel et al. [8] designed the CloakX system, which randomizes WAR IDs and the values of the class and id attributes of tags in the DOM, so that WAR-based extension fingerprints are invalidated and DOM-modification-based extension fingerprints are obfuscated. This significantly reduces the effectiveness of privacy inference and extension identification attacks that rely on existing browser extension fingerprinting techniques. The invention provides a multi-level extension identification technique that combines an improved extension fingerprint based on DOM modification with a newly proposed extension fingerprint based on JavaScript properties. It also addresses the case where a user has installed several extensions whose fingerprints inevitably overlap and interfere with one another, so that the extensions can still be identified effectively and the list of extensions installed by the user obtained.

The references cited above are as follows:

[1] Iqbal U, Shafiq Z, Qian Z. The ad wars: retrospective measurement and analysis of anti-adblock filter lists[C]//Proceedings of the 2017 Internet Measurement Conference. 2017: 171-183.

[2] Mathur A, Vitak J, Narayanan A, et al. Characterizing the use of browser-based blocking extensions to prevent online tracking[C]//Fourteenth Symposium on Usable Privacy and Security (SOUPS 2018). 2018: 103-116.

[3] Snyder P, Taylor C, Kanich C. Most websites don't need to vibrate: A cost-benefit approach to improving browser security[C]//Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2017: 179-194.

[4] Calzavara S, Bugliesi M, Crafa S, et al. Fine-grained detection of privilege escalation attacks on browser extensions[C]//European Symposium on Programming Languages and Systems. Springer, Berlin, Heidelberg, 2015: 510-534.

[5] Onarlioglu K, Buyukkayhan A S, Robertson W, et al. Sentinel: Securing legacy firefox extensions[J]. Computers & Security, 2015, 49: 147-161.

[6] Sjösten A, Van Acker S, Sabelfeld A. Discovering browser extensions via web accessible resources[C]//Proceedings of the Seventh ACM on Conference on Data and Application Security and Privacy. 2017: 329-336.

[7] Sjösten A, Van Acker S, Picazo-Sanchez P, et al. Latex gloves: Protecting browser extensions from probing and revelation attacks[J]. Power, 2018, 57.

[8] Trickel E, Starov O, Kapravelos A, et al. Everyone is different: Client-side diversification for defending against extension fingerprinting[C]//28th USENIX Security Symposium (USENIX Security 19). 2019: 1679-1696.

Disclosure of the Invention

The technical problem to be solved by the invention is as follows:

The invention aims to provide a multi-level extension fingerprint identification technique for multiple browser extensions, in order to solve the problem that extension fingerprints interfere with one another when several extensions are loaded at the same time. In a real environment, a user often installs several extensions; different extensions modify web pages differently and therefore generate different fingerprints. The fingerprints of multiple extensions inevitably affect and overwrite one another, which destroys the integrity and determinism of each fingerprint. Existing research on browser extension fingerprinting analyzes only the fingerprint of a single extension, and with the appearance of various defenses, current extension fingerprinting techniques either no longer work or have become far less effective. To further improve the fingerprintability of extensions and the uniqueness of extension fingerprints, the invention designs a fuzzy matching algorithm that combines DOM-based extension fingerprints with JavaScript-property-based extension fingerprints, so that the list of extensions installed by the user can still be identified accurately even when the user has installed several extensions at once and applied existing defenses.

To solve the above technical problem, the invention adopts the following technical solution:

A multi-level fingerprint identification method for multiple browser extensions, comprising the following steps:

(1) Decoy page generation

In this scheme, a decoy page is first designed using HTML, CSS and JavaScript. It includes as much content as possible that can trigger extensions to modify the web page DOM and JavaScript properties, so that more extension fingerprint information is captured. This decoy page serves as the page visited during the subsequent construction of the extension fingerprint database and during extension identification.

(2) Construction of the extension fingerprint database

The invention uses both the operations an extension performs on the web page DOM and its modifications to JavaScript properties to generate extension fingerprints and build the extension fingerprint database. A fingerprint formed from both kinds of information can, to a certain extent, improve the fingerprintability of extensions and the uniqueness of extension fingerprints. The extension fingerprint database serves as the reference for extension identification.

(3) Extension identification

A real user is simulated by loading several different extensions at the same time and visiting the decoy page, and the fingerprint information is obtained by an automated fingerprint collection program. The fingerprint information is compared with the extension fingerprint database using a fuzzy matching algorithm to obtain a candidate extension list. Finally, the candidate list is filtered to obtain the final extension list, that is, the set of extensions installed by the user.

Compared with the prior art, the invention, by adopting the above technical solution, has the following beneficial effects:

(1) The multi-level extension fingerprint identification technique combines extension fingerprints based on DOM modification with extension fingerprints based on JavaScript properties. The former ignores the attribute values written by the extension and can therefore defeat existing defenses against DOM-modification-based fingerprints; the latter provides a new fingerprint feature for identifying extensions, further increasing the number of extensions that can be fingerprinted. Together, the two improve the uniqueness of extension fingerprints.

(2) Extension identification uses a fuzzy matching algorithm. Through a matching threshold, it solves the problem of fingerprint matching failing because the fingerprints of different extensions partially change under mutual interference, and at the same time it improves the accuracy of extension identification.
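To illustrate effect (1), the following Python sketch shows why dropping attribute values makes a DOM-based fingerprint insensitive to randomized class and id values. It is an illustration only, not the implementation of the invention: the helper `tag_signatures`, the class `TagSignatureParser` and the sample markup are assumptions introduced here, and text content, which the method also keeps, is left out for brevity.

```python
from html.parser import HTMLParser


class TagSignatureParser(HTMLParser):
    """Collect (tag name, attribute names) pairs and discard attribute values."""

    def __init__(self):
        super().__init__()
        self.signatures = set()

    def handle_starttag(self, tag, attrs):
        # Keep only attribute names; values such as randomized class/id are dropped.
        names = tuple(sorted(name for name, _value in attrs))
        self.signatures.add((tag, names))


def tag_signatures(html):
    """Return the set of value-free tag signatures found in an HTML fragment."""
    parser = TagSignatureParser()
    parser.feed(html)
    parser.close()
    return parser.signatures


# Hypothetical markup injected by the same extension, without and with a
# defence that randomizes the class and id values.
original = '<div class="ext-banner" id="ext-root"><span class="label">Hi</span></div>'
randomized = '<div class="x9f2k" id="q71ab"><span class="zz03">Hi</span></div>'

# The value-free signatures are identical, so the fingerprint still matches.
same = tag_signatures(original) == tag_signatures(randomized)
```

Under these assumptions both fragments reduce to the same two signatures, a `div` carrying `class` and `id` and a `span` carrying `class`, which is the property the method relies on.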

Drawings

FIG. 1 is a schematic diagram of decoy page generation.

FIG. 2 is a schematic diagram of the construction of the extension fingerprint database.

FIG. 3 is a schematic diagram of extension identification.

FIG. 4 is a schematic diagram of the fuzzy matching algorithm.

Detailed Description

The invention is described in further detail below with reference to the accompanying drawings.

To improve the uniqueness of extension fingerprints and the accuracy of extension identification, the browser extension fingerprint identification method provided by the invention comprises three parts: 1. decoy page generation; 2. construction of the extension fingerprint database; 3. extension identification. The fuzzy matching algorithm used in part 3 is described separately in (4).

(1) Decoy page generation

As shown in FIG. 1, the decoy page consists mainly of four parts: all HTML tags with their basic attributes, text content, special attributes, and JavaScript code. The basic idea behind the decoy page is that it contains content rich enough that, when a user visits it, as many different extensions as possible are triggered to modify the web page DOM and JavaScript properties.

The specific generation process is as follows:

1. Use a crawler to obtain all extension IDs (each of which uniquely identifies one extension) from the Chrome Web Store and generate an extension ID list.

2. Using the extension ID list from step 1, automatically fetch, with a headless browser, the first paragraph of the text description in the overview section of each extension's detail page. Punctuation is removed with regular expressions, and repeated words and stop words are removed with natural language processing, to eliminate redundancy in the text.

3. Create a basic web page and add all HTML tags and their corresponding basic attributes to it, obtaining the initial decoy page.

4. Build a basic web page from the text content of step 2, load the extensions one by one according to the extension ID list, capture the web page DOM before and after each extension is loaded, and compare the two. If the comparison shows that text content in the page changed, add that text to a text content database. All text content modified by extensions makes up the final text content database, which is added to the decoy page.

5. Collect all attributes that take a fixed set of values (for example, autocomplete can only be on or off) and generate a candidate attribute list. For each attribute in the candidate list, search the text from step 2; if the attribute appears in the text, add it with all of its possible values to the decoy page.

6. Inject JavaScript code that automatically captures the web page DOM and JavaScript properties.
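Steps 2 and 5 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the stop-word list, the table of enumerated attributes, the choice of the `input` element, and the function names `clean_description` and `attribute_snippets` are all hypothetical, since the description does not specify them.

```python
import re

# Hypothetical stop-word list and enumerated-attribute table; neither is
# specified in the description, so both are placeholders for illustration.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "for", "in", "on", "your", "with"}
ENUMERATED_ATTRIBUTES = {
    "autocomplete": ["on", "off"],
    "contenteditable": ["true", "false"],
    "draggable": ["true", "false"],
}


def clean_description(text):
    """Strip punctuation, then drop stop words and repeated words (step 2)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    seen = set()
    kept = []
    for word in words:
        if word in STOP_WORDS or word in seen:
            continue
        seen.add(word)
        kept.append(word)
    return kept


def attribute_snippets(words):
    """Emit one element per possible value of each attribute named in the text (step 5)."""
    snippets = []
    for name, values in ENUMERATED_ATTRIBUTES.items():
        if name in words:
            for value in values:
                snippets.append(f'<input {name}="{value}">')
    return snippets


description = "Fill forms faster: autocomplete your forms, and the autocomplete stays on!"
words = clean_description(description)
snippets = attribute_snippets(words)
```

With the sample description, `words` keeps each informative word once, and `snippets` contains one `input` element for each of the two `autocomplete` values, which would then be appended to the decoy page.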

(2) Construction of the extension fingerprint database

Once the extension fingerprint information captured by the attack page has been sent to the attack server, the server must identify the list of extensions installed by the user. To make this possible, the fingerprint of every extension needs to be collected in advance and stored in an extension fingerprint database, which serves as the reference database for extension identification. FIG. 2 shows a schematic diagram of the construction of the extension fingerprint database.

The specific construction process is as follows:

1. Access the original decoy page in a browser with no extensions installed.

2. After the page has loaded, the JavaScript code of the decoy page automatically collects the web page DOM and JavaScript properties.

3. Download the extensions one by one using the extension ID list.

4. Start the Chrome browser through the headless browser and load the extensions one by one.

5. Access the decoy page in the browser with the extension installed.

6. Collect the web page DOM and JavaScript properties after the extension has been loaded.

7. Compare the web page DOM and JavaScript properties obtained in step 6 with those obtained in step 2.

8. For the web page DOM comparison result, save, tag by tag, the tag name, the attribute names in the tag, and the text content between tags, and filter out the specific attribute values; for the JavaScript property comparison result, keep, property by property, the property name and its corresponding value.

9. Store the result in the database, where each extension fingerprint consists of five parts: extension ID, added tag set, deleted tag set, added JavaScript property set, and deleted JavaScript property set.
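Steps 7 to 9 amount to a set difference between two snapshots of the decoy page. The Python sketch below shows one way to express that; the snapshot layout (a `dom` set of value-free tag signatures and a `properties` mapping from property name to value), the function name `build_fingerprint` and the sample data are assumptions made for illustration, not details taken from the description.

```python
def build_fingerprint(extension_id, before, after):
    """Diff two page snapshots and return a five-part fingerprint record.

    Each snapshot is a dict with a "dom" set of tag signatures (attribute
    values already filtered out) and a "properties" dict that maps
    JavaScript property names to string values.
    """
    before_props = set(before["properties"].items())
    after_props = set(after["properties"].items())
    return {
        "extension_id": extension_id,
        "add_dom": after["dom"] - before["dom"],
        "delete_dom": before["dom"] - after["dom"],
        # A property whose value changed shows up in both property sets.
        "add_properties": after_props - before_props,
        "delete_properties": before_props - after_props,
    }


# Hypothetical snapshots: the extension removes an <img> and injects a <div>
# and one new window property.
baseline = {
    "dom": {("p", ("class",)), ("img", ("src",))},
    "properties": {"window.name": "", "navigator.webdriver": "false"},
}
with_extension = {
    "dom": {("p", ("class",)), ("div", ("class", "id"))},
    "properties": {
        "window.name": "",
        "navigator.webdriver": "false",
        "window.__extHook": "function",
    },
}

fingerprint = build_fingerprint("ext-a", baseline, with_extension)
```

Storing name and value pairs for properties mirrors step 8, which keeps the property value, while the DOM side keeps only names so that randomized attribute values do not change the record.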

(3) Extension identification

As shown in FIG. 3, the quadruple Q <add_dom, delete_dom, add_properties, delete_properties> is the fingerprint generated jointly by the multiple extensions installed by the user, and item <war_id, add_dom, delete_dom, add_properties, delete_properties> is one extension fingerprint in the extension fingerprint database. add_dom, delete_dom, add_properties and delete_properties are, respectively, the added DOM elements, deleted DOM elements, added JavaScript properties and deleted JavaScript properties.

The specific identification process is as follows:

1. The user accesses the decoy page of the invention.

2. The automated program in the page obtains the DOM and JavaScript properties of the current page.

3. The DOM and JavaScript properties are compared with the corresponding content of the original page.

4. The DOM part of the comparison result is processed by tag and the JavaScript property list is processed by property, yielding the quadruple Q.

5. The first item in the extension fingerprint database is taken.

6. Q is compared with the item using the fuzzy matching algorithm, and the comparison result is returned.

7. If result is False, select the next item in the database and return to step 6; if result is True, store the item in the candidate list, then select the next item and return to step 6. If the item is the last extension fingerprint in the database, go to step 8.

8. Remove from the candidate list every extension whose fingerprint is a subset of the fingerprint of any other extension in the list; what remains is the list of extensions installed by the user.
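The subset filtering of step 8 can be sketched in Python as below. The description does not define "subset" for a fingerprint with four components, so this sketch assumes that one fingerprint is a subset of another when each of its four sets is contained in the corresponding set of the other and the two fingerprints are not identical; the names `FIELDS`, `is_subset` and `filter_candidates` and the sample candidates are likewise hypothetical.

```python
FIELDS = ("add_dom", "delete_dom", "add_properties", "delete_properties")


def is_subset(inner, outer):
    """True if every component of `inner` is contained in `outer` and the two differ."""
    contained = all(inner[f] <= outer[f] for f in FIELDS)
    identical = all(inner[f] == outer[f] for f in FIELDS)
    return contained and not identical


def filter_candidates(candidates):
    """Drop each candidate whose fingerprint is a strict subset of another candidate's."""
    return [
        c for c in candidates
        if not any(is_subset(c, other) for other in candidates if other is not c)
    ]


# Hypothetical candidate list produced by fuzzy matching.
ext_a = {"extension_id": "ext-a", "add_dom": {"div", "span"}, "delete_dom": set(),
         "add_properties": {"p1"}, "delete_properties": set()}
ext_b = {"extension_id": "ext-b", "add_dom": {"div"}, "delete_dom": set(),
         "add_properties": set(), "delete_properties": set()}
ext_c = {"extension_id": "ext-c", "add_dom": set(), "delete_dom": set(),
         "add_properties": {"p2"}, "delete_properties": set()}

final = [c["extension_id"] for c in filter_candidates([ext_a, ext_b, ext_c])]
```

Here `ext-b` is dropped because everything it changes is also changed by `ext-a`, while `ext-c` survives because its added property appears in no other candidate. Requiring the fingerprints to differ keeps two extensions with identical fingerprints from eliminating each other.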

(4) Fuzzy matching algorithm

As shown in FIG. 4, Q is the fingerprint generated jointly by the multiple extensions installed by the user, and item is one extension fingerprint in the extension fingerprint database. Q[0]~Q[3] store, respectively, the captured added DOM nodes, deleted DOM nodes, added JS properties and deleted JS properties. item[0] stores the extension ID, and item[1]~item[4] correspond to Q[0]~Q[3]. count(item[1]) and count(item[2]) are the tag counts of the added DOM and the deleted DOM in the extension fingerprint. count+ and count- are the numbers of added and deleted tags, respectively, of the item's DOM-related fingerprint that are contained in Q. T1 and T2 are the matching thresholds for the DOM-modification-based fingerprint and the JavaScript-property-based fingerprint, respectively. result, result1 and result2 are Booleans. round() is a rounding function.

The fuzzy matching algorithm has two parts, fingerprint matching based on DOM modification and fingerprint matching based on JavaScript properties, and its output is the logical AND of the two matching results. The two parts differ only in that one counts tags and the other counts properties. Each part in turn covers two components: added fingerprints and deleted fingerprints. Only the tag counting for the added DOM is described in detail.

The specific fuzzy matching algorithm comprises the following steps:

1. Input Q, item, T1, T2.

2. Initialize count+ and count- to 0.

3. Select the first element of item[1].

4. Assign the element to i.

5. Check whether i is contained in Q[0]; if so, go to step 6; otherwise, go to step 7.

6. Increment count+ by 1.

7. Check whether i is the last element of item[1]; if so, go to the next step; otherwise, select the next element of item[1] and return to step 4.

8. Replace item[1] and Q[0] with item[2] and Q[1], and repeat steps 3 to 7 to obtain the updated count-.

9. result1 = (count+ ≥ round(count(item[1]) * T1)) ∧ (count- ≥ round(count(item[2]) * T1)).

10. Replace item[1], item[2], Q[0], Q[1] and T1 with item[3], item[4], Q[2], Q[3] and T2, respectively, and repeat steps 2 to 9 to obtain result2.

11. result = result1 ∧ result2.

12. Output result.
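The twelve steps can be condensed into a short Python sketch. It follows the indexing of Q and item given above, but the function names, the sample fingerprints and the threshold values 0.7 and 0.8 are assumptions for illustration; the description does not state concrete values for T1 and T2. Python's built-in `round()` rounds exact halves to the nearest even integer, which may differ from the rounding function intended in FIG. 4.

```python
def contained_count(item_part, q_part):
    """Number of elements of the stored fingerprint part that also occur in Q (steps 3-7)."""
    return sum(1 for element in item_part if element in q_part)


def meets_threshold(item_added, item_deleted, q_added, q_deleted, threshold):
    """Apply the test of step 9 to one fingerprint type (DOM tags or JS properties)."""
    count_plus = contained_count(item_added, q_added)
    count_minus = contained_count(item_deleted, q_deleted)
    # round() here is Python's built-in; the rounding mode in the source is unspecified.
    return (count_plus >= round(len(item_added) * threshold)
            and count_minus >= round(len(item_deleted) * threshold))


def fuzzy_match(q, item, t1, t2):
    """Return True if `item` (ID plus four sets) fuzzily matches quadruple `q`."""
    result1 = meets_threshold(item[1], item[2], q[0], q[1], t1)  # DOM-based part
    result2 = meets_threshold(item[3], item[4], q[2], q[3], t2)  # JS-property part
    return result1 and result2


# Hypothetical observed quadruple Q and two database items.
q = ({"div", "span", "a", "img"}, set(), {"p1", "p2", "p3"}, set())
item_hit = ("ext-a", {"div", "span", "a", "p"}, set(), {"p1", "p2"}, set())
item_miss = ("ext-b", {"div", "span", "video", "audio"}, set(), {"p1", "p2"}, set())

hit = fuzzy_match(q, item_hit, 0.7, 0.8)    # 3 of 4 tags found, 3 >= round(2.8)
miss = fuzzy_match(q, item_miss, 0.7, 0.8)  # 2 of 4 tags found, 2 < round(2.8)
```

In this example `ext-a` still matches even though one of its four tags was overwritten by another extension, which is the tolerance the thresholds are meant to provide. Note that, as the steps are written, an item whose four sets are all empty satisfies every comparison, so such records would need to be excluded from the database or handled separately.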
