Genomic data analysis systems and methods

Document No.: 1643216. Publication date: 2019-12-20.

Summary: This technology, Genomic data analysis systems and methods (基因组数据分析系统和方法), was created by Andrew Warren (安德鲁·沃伦) on 2018-03-29. Embodiments relate to methods and systems for analyzing genomic data, such as genetic variations. Some embodiments relate to the efficient analysis and presentation of certain genetic variations of an individual.

1. A computer-implemented method for displaying genetic variation data, comprising:

receiving genetic variation data from genomic sequence data of an individual;

creating an index for the documents of the determined genetic variation data;

receiving a selection from a user to select at least one filter from a plurality of filters for a feature of interest in the genetic variation data;

searching the index based on the selected filter to generate a filtered genetic variation of the individual;

identifying a genetic variation that is a translocation, together with a first point and a second point, the first point being the location of a first breakpoint of the translocation on a first axis and the second point being the location of a second breakpoint of the translocation on a second axis, the second axis comprising a linear representation of a genome; and

displaying a browser page on a display device, the browser page displaying the filtered genetic variation of the individual, wherein the browser page includes a first map having the first axis, the first axis including a linear representation of a genome having a location on the first axis where the genetic variation is located, and wherein different types of genetic variations are identified by different icons, wherein, for a genetic variation that is a translocation, the browser page displays the first point and the second point connected using a straight line or a curved line.

2. The method of claim 1, further comprising:

determining the genetic variation data from the individual.

3. The method of claim 1 or 2, further comprising displaying a second map comprising a magnified view of the first axis and a non-magnified view of the second axis.

4. The method of claim 3, wherein the display switches directly from the first map to the second map.

5. The method of any of claims 1-4, wherein the different icons are each selectable for launching the second map.

6. The method of any of claims 1 to 5, further comprising highlighting a translocation when a user hovers over or selects a corresponding translocation.

7. The method of any of claims 1 to 6, further comprising:

displaying a pop-up window with details of a genetic variation when a user selects an icon for the genetic variation.

8. The method of any of claims 1-7, wherein the genetic variation data is stored at a location remote from a server that performs the search.

9. The method of any one of claims 2 to 8, wherein determining genetic variation data comprises invoking a plurality of variant calling tools.

10. The method of claim 9, further comprising:

creating annotated genetic variation data with the variant calling tools, wherein the annotated genetic variation data comprises at least one feature selected from the group consisting of: the type of genetic variation, the locus of the genetic variation, and the quality score of the genetic variation.

11. The method of any one of claims 1 to 10, wherein the filter selectively provides genetic variation associated with at least one feature of the group consisting of: the whole genome, chromosomes, types of genetic variation, quality metrics, clinical indications, population frequency, and overlapping database variations.

12. The method of claim 11, wherein the clinical indication is a phenotype associated with a genetic variation.

13. The method of any of claims 1-12, wherein creating an index comprises creating an inverted index.

14. The method of claim 13, wherein searching the index comprises searching the inverted index.

15. The method of any one of claims 1 to 14, wherein the genetic variation comprises at least one variation selected from the group consisting of: inversion, deletion, insertion, duplication, substitution, and translocation.

16. An electronic system for analyzing genetic variation data, comprising:

an information module running on a processor and adapted to determine genetic variation data from genomic sequence data from an individual;

an indexing module adapted to create an index of documents in memory for the determined genetic variation data;

a selection module adapted to present a browser page displaying a plurality of filters that can be used for a feature of interest in the genetic variation data, and to receive a selection from a user to select at least one filter from the plurality of filters;

a search module adapted to search the index based on the selected filter and generate a filtered genetic variation of the individual;

an identification module adapted to identify a genetic variation that is a translocation, together with a first point and a second point, the first point being the location of a first breakpoint of the translocation on a first axis and the second point being the location of a second breakpoint of the translocation on a second axis, the second axis comprising a linear representation of a genome; and

a browser module adapted to return a browser page that displays the filtered genetic variation of the individual, wherein the browser page includes a first map having the first axis, the first axis including a linear representation of a genome having a location on the first axis where the filtered genetic variation is located, and wherein different types of the filtered genetic variation are identified by different icons, wherein, for a genetic variation that is a translocation, the browser page displays the first point and the second point connected using a straight line or a curved line.

17. The system of claim 16, wherein the returned browser page displays a second map comprising a magnified view of the first axis and a non-magnified view of the second axis.

18. The system of claim 17, wherein the display switches directly from the first map to the second map.

19. The system of claim 17 or 18, wherein the different icons are each selectable for launching the second map.

20. The system of any of claims 16 to 19, wherein the returned browser page highlights a corresponding translocation when a user hovers over the translocation or selects the translocation.

21. The system of any of claims 16-20, wherein the returned browser page displays a pop-up window of details of the genetic variation when a user selects an icon for the genetic variation.

22. The system of any one of claims 16 to 21, wherein the genetic variation data is stored at a location remote from a server that performs the search.

23. The system of any one of claims 16 to 22, wherein the information module is adapted to invoke a plurality of variant calling tools.

24. The system of claim 23, wherein the variant calling tools create annotated genetic variation data comprising at least one feature selected from the group consisting of: the type of genetic variation, the locus of the genetic variation, and the quality score of the genetic variation.

25. The system of any one of claims 16 to 24, wherein the filter selectively provides genetic variation related to at least one feature of the group consisting of: the whole genome, chromosomes, types of genetic variation, quality metrics, clinical indications, population frequency, and overlapping database variations.

26. The system of claim 25, wherein the clinical indication is a phenotype associated with a genetic variation.

27. The system of any one of claims 16 to 26, wherein the indexing module is adapted to create an inverted index.

28. The system of claim 27, wherein the indexing module is adapted to search the inverted index.

29. The system of any one of claims 16 to 28, wherein the genetic variation comprises at least one variation selected from the group consisting of: inversion, deletion, insertion, duplication, substitution, and translocation.

30. An electronic method for displaying a browser page summarizing genetic variations, comprising:

determining genetic variation data from genome sequence data of a whole genome of the individual;

creating a genome-wide index of documents for the determined genetic variation data across the whole genome;

presenting a browser comprising a plurality of filters that can be used for features of interest in the genetic variation data;

receiving a user selection for selecting at least one filter;

searching the genome-wide index based on the selected one or more filters; and

displaying a browser page that summarizes the genetic variation in response to the selected one or more filters.

31. The method of claim 30, wherein the genetic variation data is stored at a location remote from a server that performs the search.

32. The method of claim 30, wherein the genetic variation comprises at least one structural variation.

33. The method of claim 32, wherein the structural variation comprises at least one of inversion, deletion, insertion, duplication, and translocation.

34. The method of claim 30, wherein determining genetic variation data comprises invoking a plurality of variant calling tools to identify potential variations and variation loci.

35. The method of claim 34, wherein the variant calling tools create annotated genetic variation data for use in the search, the annotated genetic variation data comprising at least a type of each variation, a locus of each variation, and a quality score for each identified variation.

36. The method of claim 30, wherein the plurality of filters includes filters that are selectively applicable to one or more chromosomes and up to the whole genome, and includes at least one of a type of variation, a quality metric, an overlapping database variation, a clinical filter, and a population frequency.

37. The method of claim 30, wherein the search is performed by creating an inverted index of the documents and performing the search on the inverted index.

38. The method of claim 30, wherein displaying the browser page comprises: formatting the browser page by representing a type and a location of each type of variant with an icon for the variant.

39. The method of claim 38, wherein the icon is placed at each locus represented as a variation on a map of the individual's whole genome.

40. The method of claim 30, wherein, for each translocation variant, the browser page includes a map comprising an axis indicating the locus of a breakend of the variant, an axis indicating the locus of the matching breakend, and a line or curve connecting the two breakends.

41. The method of claim 40, comprising automatically plotting a Bézier curve connecting each expected locus to each actual locus for each translocation variant.

42. The method of claim 30, comprising highlighting a translocation variant on the map when a user hovers over or selects the corresponding translocation variant.

43. The method of claim 30, comprising scaling the map based on user input, wherein the scaling changes the scale of the breakend locus axis without changing the scale of the matching breakend locus axis.

44. The method of claim 30, comprising displaying a pop-up window of details of a particular variation when a user selects an icon for the particular variation.

Technical Field

Background

Increasingly complex systems have been developed to determine genomic information and to analyze that information for properties of interest. Such systems may process a sample of genetic material to perform nucleotide sequencing, such as next-generation sequencing, on the material. These systems may also include an information component designed to piece together extended segments of nucleotide sequence from the genetic material and ultimately determine the sequence of an individual's entire chromosomes and genome.

One aspect of genetic analysis involves determining genetic variation. Different types of variation include insertions, deletions, substitutions, duplications, translocations, and inversions. Currently, challenges facing genomic analysis include the identification and classification of genetic variations, presenting the genetic variations to human researchers and clinicians, and manipulating the data in the most beneficial and instructive way for the user.

Disclosure of Invention

Some embodiments include a computer-implemented method for displaying genetic variation data, comprising: receiving genetic variation data from genomic sequence data of an individual; creating an index of documents for the determined genetic variation data; receiving a selection from a user of at least one filter from a plurality of filters for a feature of interest in the genetic variation data; searching the index based on the selected filter to generate a filtered genetic variation of the individual; identifying a genetic variation that is a translocation, together with a first point and a second point, the first point being the location of a first breakpoint of the translocation on a first axis and the second point being the location of a second breakpoint of the translocation on a second axis, the second axis comprising a linear representation of a genome; and displaying a browser page on a display device, the browser page displaying the filtered genetic variation of the individual, wherein the browser page includes a first map having the first axis, the first axis including a linear representation of a genome having a location on the first axis where the genetic variation is located, and wherein different types of genetic variations are identified by different icons, wherein, for a genetic variation that is a translocation, the browser page displays the first point and the second point connected using a straight line or a curved line. Some embodiments further comprise determining the genetic variation data from the individual.

Some embodiments further comprise displaying a second map comprising a magnified view of the first axis and a non-magnified view of the second axis. In some embodiments, the display switches directly from the first map to the second map. In some embodiments, the different icons are each selectable for launching the second map.
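
The magnified and non-magnified axes described above can be illustrated with a simple coordinate mapping. The following is a minimal sketch, not the implementation from this disclosure: the function `to_pixel`, the toy genome length, and the pixel width are all assumptions introduced for illustration. It shows that zooming changes where a breakpoint lands on the first axis while the second axis keeps its whole-genome scale.

```python
def to_pixel(position, view_start, view_end, width_px):
    """Linearly map a genome coordinate in [view_start, view_end] to a pixel x."""
    if view_end <= view_start:
        raise ValueError("view_end must be greater than view_start")
    fraction = (position - view_start) / (view_end - view_start)
    return fraction * width_px


GENOME_LENGTH = 3_000_000  # toy value for illustration, not a real genome length
WIDTH_PX = 1000            # hypothetical width of the drawing area

# First map: both axes show the whole (toy) genome.
first_axis_view = (0, GENOME_LENGTH)
second_axis_view = (0, GENOME_LENGTH)

# Second map: only the first axis is magnified, here to a 300 kb window.
zoomed_first_axis_view = (1_200_000, 1_500_000)

breakpoint_1, breakpoint_2 = 1_350_000, 2_400_000
x1_before = to_pixel(breakpoint_1, *first_axis_view, WIDTH_PX)
x1_after = to_pixel(breakpoint_1, *zoomed_first_axis_view, WIDTH_PX)
x2 = to_pixel(breakpoint_2, *second_axis_view, WIDTH_PX)  # unchanged by the zoom
```

Keeping one view window per axis means a zoom only has to replace the first axis window; the second breakpoint keeps its position in the whole-genome context.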

Some embodiments also include highlighting the translocation when the user hovers over or selects the corresponding translocation. Some embodiments further include a pop-up window that displays details for the genetic variation when the user selects the icon for the genetic variation.

In some embodiments, the genetic variation data is stored at a location remote from a server that performs the search.

In some embodiments, determining genetic variation data comprises invoking a plurality of variant calling tools. Some embodiments further comprise creating annotated genetic variation data with the variant calling tools, wherein the annotated genetic variation data comprises at least one feature selected from the group consisting of: the type of genetic variation, the locus of the genetic variation, and the quality score of the genetic variation.
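
As an illustration of annotated variation data carrying a type, a locus, and a quality score, the sketch below defines a small record and merges the output of several tools. The record layout, the tool names, and the rule of keeping the highest-scoring call per type and locus are assumptions made for this example; the disclosure does not specify how calls from different tools are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedVariant:
    """One annotated call (hypothetical schema)."""
    variant_type: str  # e.g. "deletion", "translocation"
    chromosome: str    # e.g. "chr1"
    position: int      # coordinate of the call on the chromosome
    quality: float     # quality score reported by the tool
    caller: str        # name of the tool that produced the call


def merge_calls(*call_sets):
    """Merge calls from several tools, keeping the best-scoring call per (type, locus)."""
    best = {}
    for calls in call_sets:
        for call in calls:
            key = (call.variant_type, call.chromosome, call.position)
            if key not in best or call.quality > best[key].quality:
                best[key] = call
    # Sort by locus so display code can walk the genome in order.
    return sorted(best.values(), key=lambda c: (c.chromosome, c.position))


tool_a = [AnnotatedVariant("deletion", "chr1", 1000, 30.0, "tool_a")]
tool_b = [
    AnnotatedVariant("deletion", "chr1", 1000, 45.0, "tool_b"),
    AnnotatedVariant("translocation", "chr2", 500, 60.0, "tool_b"),
]
merged = merge_calls(tool_a, tool_b)
```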

In some embodiments, the filter selectively provides genetic variation associated with at least one feature of the group consisting of: the whole genome, chromosomes, types of genetic variation, quality metrics, clinical indications, population frequency, and overlapping database variations. In some embodiments, the clinical indication is a phenotype associated with a genetic variation.

In some embodiments, creating the index includes creating an inverted index. In some embodiments, searching the index includes searching an inverted index.
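
The inverted index mentioned above can be sketched in a few lines. This is a generic illustration of the technique, assuming each variation is stored as a small document of field and value pairs; the field names and the helper functions `build_inverted_index` and `search` are hypothetical and are not taken from this disclosure.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each (field, value) pair to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(documents):
        for field, value in doc.items():
            index[(field, value)].add(doc_id)
    return index


def search(index, documents, **filters):
    """Return the documents that match every field=value filter."""
    postings = [index.get((field, value), set()) for field, value in filters.items()]
    if not postings:
        return list(documents)  # no filter selected: everything matches
    matching_ids = set.intersection(*postings)
    return [documents[i] for i in sorted(matching_ids)]


variants = [
    {"type": "deletion", "chromosome": "chr1", "position": 1000},
    {"type": "translocation", "chromosome": "chr1", "position": 5000},
    {"type": "deletion", "chromosome": "chr2", "position": 700},
]
index = build_inverted_index(variants)
hits = search(index, variants, type="deletion", chromosome="chr1")
```

Because each filter resolves to a precomputed set of document ids, combining filters is a set intersection rather than a scan over every variation.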

In some embodiments, the genetic variation comprises at least one variation selected from the group consisting of: inversion, deletion, insertion, duplication, substitution, and translocation.

Some embodiments include an electronic system for analyzing genetic variation data, comprising: an information module running on a processor and adapted to determine genetic variation data from genomic sequence data from an individual; an indexing module adapted to create an index of documents in memory for the determined genetic variation data; a selection module adapted to present a browser page displaying a plurality of filters available for a feature of interest in the genetic variation data, and to receive a selection from a user of at least one filter from the plurality of filters; a search module adapted to search the index based on the selected filter and generate a filtered genetic variation of the individual; an identification module adapted to identify a genetic variation that is a translocation, together with a first point and a second point, the first point being the location of a first breakpoint of the translocation on a first axis and the second point being the location of a second breakpoint of the translocation on a second axis, the second axis comprising a linear representation of a genome; and a browser module adapted to return a browser page, the browser page displaying the filtered genetic variation of the individual, wherein the browser page includes a first map having the first axis, the first axis including a linear representation of a genome having a location on the first axis where the filtered genetic variation is located, and wherein different types of the filtered genetic variation are identified by different icons, wherein, for a genetic variation that is a translocation, the browser page displays the first point and the second point connected using a straight line or a curved line.

In some embodiments, the returned browser page displays a second map that includes a magnified view of the first axis and a non-magnified view of the second axis. In some embodiments, the display switches directly from the first map to the second map. In some embodiments, the different icons are each selectable for launching the second map.

In some embodiments, the returned browser page highlights the translocation when the user hovers over or selects the corresponding translocation. In some embodiments, the returned browser page displays a pop-up window of details of the genetic variation when the user selects the icon for the genetic variation.

In some embodiments, the genetic variation data is stored at a location remote from a server that performs the search.

In some embodiments, the information module is adapted to invoke a plurality of variant calling tools.

In some embodiments, the variant calling tool creates annotated genetic variant data that includes at least one feature selected from the group consisting of: the type of genetic variation, the locus of the genetic variation, and the quality score of the genetic variation.

In some embodiments, the filter selectively provides genetic variation associated with at least one feature of the group consisting of: the whole genome, chromosomes, types of genetic variation, quality metrics, clinical indications, population frequency, and overlapping database variations. In some embodiments, the clinical indication is a phenotype associated with a genetic variation.

In some embodiments, the indexing module is adapted to create an inverted index. In some embodiments, the indexing module is adapted to search the inverted index.

In some embodiments, the genetic variation comprises at least one variation selected from the group consisting of: inversion, deletion, insertion, duplication, substitution, and translocation.

Some embodiments include an electronic method for displaying a browser page summarizing genetic variations, comprising: determining genetic variation data from genome sequence data of a whole genome of an individual; creating a genome-wide index of documents for the determined genetic variation data across the whole genome; presenting a browser, the browser including a plurality of filters available for a feature of interest in the genetic variation data; receiving a user selection of at least one filter; searching the genome-wide index based on the selected one or more filters; and displaying a browser page that summarizes the genetic variation in response to the selected one or more filters.

In some embodiments, the genetic variation data is stored at a location remote from a server that performs the search.

In some embodiments, the genetic variation comprises at least one structural variation. In some embodiments, the structural variation comprises at least one of inversion, deletion, insertion, duplication, and translocation.

In some embodiments, determining genetic variation data comprises invoking a plurality of variant calling tools to identify potential variations and variation loci. In some embodiments, the variant calling tools create annotated genetic variation data for use in the search, the annotated genetic variation data including at least a type of each variation, a locus of each variation, and a quality score for each identified variation.

In some embodiments, the plurality of filters includes filters that are selectively applicable to one or more chromosomes and up to the entire genome, and includes at least one of a type of variation, a quality metric, an overlapping database variation, a clinical filter, and a population frequency.
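
A filter selection of this kind can be modeled as a single predicate built from whichever filters the user picked. The sketch below is illustrative only: the function `make_filter`, the field names, and the threshold values are assumptions, and leaving every filter unselected stands in for the whole-genome case.

```python
def make_filter(chromosomes=None, variant_types=None,
                min_quality=None, max_population_frequency=None):
    """Build a predicate from the selected filters; None means 'not selected'."""
    def accept(variant):
        if chromosomes is not None and variant["chromosome"] not in chromosomes:
            return False
        if variant_types is not None and variant["type"] not in variant_types:
            return False
        if min_quality is not None and variant["quality"] < min_quality:
            return False
        if (max_population_frequency is not None
                and variant["population_frequency"] > max_population_frequency):
            return False
        return True
    return accept


variants = [
    {"type": "deletion", "chromosome": "chr1", "quality": 50.0, "population_frequency": 0.001},
    {"type": "deletion", "chromosome": "chr1", "quality": 10.0, "population_frequency": 0.001},
    {"type": "inversion", "chromosome": "chr7", "quality": 80.0, "population_frequency": 0.2},
]

# Keep rare, high-quality calls anywhere in the genome.
rare_high_quality = make_filter(min_quality=30.0, max_population_frequency=0.01)
selected = [v for v in variants if rare_high_quality(v)]

# No filter selected: every variation across the whole genome is kept.
whole_genome = [v for v in variants if make_filter()(v)]
```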

In some embodiments, the search is performed by creating an inverted index of documents and performing a search on the inverted index.

In some embodiments, displaying the browser page comprises formatting the browser page by representing the type and location of each variant with an icon for that type of variant. In some embodiments, the icon is placed at each locus represented as a variation on a map of the individual's whole genome. In some embodiments, for each translocation variant, the map comprises an axis indicating the locus of a breakend of the variant, an axis indicating the locus of the matching breakend, and a line or curve connecting the two breakends. Some embodiments further comprise automatically plotting a Bézier curve connecting each expected locus to each actual locus for each translocation variant. Some embodiments further include highlighting a translocation variant on the map when the user hovers over or selects the corresponding translocation variant. Some embodiments further comprise scaling the map based on user input, wherein the scaling changes the scale of the breakend locus axis without changing the scale of the matching breakend locus axis. Some embodiments also include displaying a pop-up window with details of a particular variation when the user selects an icon for that variation.
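
The curve connecting the two ends of a translocation can be sketched as a cubic Bézier curve between two parallel axes. This is a minimal illustration under stated assumptions: the control-point placement, the axis heights, and the helper names `cubic_bezier`, `translocation_curve`, and `svg_path` are choices made for this example, not details from this disclosure.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def translocation_curve(x_first, x_second, y_first=0.0, y_second=100.0):
    """Control points for an S-shaped curve between two points on parallel axes."""
    y_mid = (y_first + y_second) / 2.0
    return ((x_first, y_first), (x_first, y_mid), (x_second, y_mid), (x_second, y_second))


def svg_path(control_points):
    """Format the control points as an SVG cubic Bezier path ('M ... C ...')."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = control_points
    return f"M {x0} {y0} C {x1} {y1}, {x2} {y2}, {x3} {y3}"


# Pixel positions of the two ends on the first and second axes (toy values).
points = translocation_curve(x_first=200.0, x_second=800.0)
midpoint = cubic_bezier(*points, 0.5)
path = svg_path(points)
```

Placing both inner control points at the vertical midpoint makes the curve leave each axis at a right angle, which keeps many overlapping translocations distinguishable near the axes.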

Drawings

FIG. 1 depicts an embodiment of a genetic analysis system that determines and reports genomic information.

FIG. 2 depicts an embodiment of a genetic information analysis system.

FIG. 3 depicts an embodiment of certain components of the system of FIG. 2.

FIG. 4 is a flow diagram illustrating example logic in obtaining and analyzing genetic information via a system.

FIG. 5 is a flow diagram illustrating example logic in searching and reporting genetic information via the system.

FIG. 6A depicts a portion of a browser screen including a filter selection window that includes a regular filter, a results filter, and all fused results filters.

FIG. 6B depicts a portion of a browser screen including a filter selection window including a filter for selecting a particular genomic region.

FIG. 6C depicts a portion of a browser screen including a filter selection window that includes various population filters.

FIG. 7 depicts a map in a browser page showing the occurrence of genomic variations along the whole genome, including indications of certain genetic variations at a first locus and indications of certain second loci related to the first locus.

FIG. 8 depicts a browser page including an enlarged view of a portion of the browser page depicted in FIG. 7.

FIG. 9 depicts a portion of a browser page including an enlarged view of the portion of the browser page depicted in FIG. 8 and including a pop-up window displaying information of structural variation.

FIG. 10 depicts a portion of a browser page that includes an enlarged view of a portion of the browser page depicted in FIG. 9 and includes different icons of different types of identified genetic variations.

FIG. 11 depicts a portion of a browser page including an enlarged view of a portion of the browser page depicted in FIG. 10 and including an icon of an identified genetic variation shown as an inverted triangle.

FIG. 12 depicts a portion of a browser page that includes an enlarged view of a portion of the browser page depicted in FIG. 10 and includes icons, shown as triangles, of the identified genetic variations.

FIG. 13 depicts a portion of a browser page including an enlarged view of a portion of the browser page depicted in FIG. 10 and including an icon of the identified genetic variation shown as an inverted semi-circle.

FIG. 14 depicts a portion of a browser page that includes an enlarged view of the portion of the browser page depicted in FIG. 10 and includes icons, shown as crosses, of identified genetic variations.

FIG. 15 depicts a portion of a browser page that includes an enlarged view of a portion of the browser page depicted in FIG. 10 and includes icons, shown as circles, of identified genetic variations.

FIG. 16 depicts a portion of a browser page including an enlarged view of a portion of the browser page depicted in FIG. 10 and including icons, shown as diamonds, of identified genetic variations.

