Automatic determination of collision energy of mass spectrometer

Document No.: 1631738    Publication date: 2020-01-14

Summary: The present technology, Automatic determination of collision energy of mass spectrometer, was devised by P. F. Yip, H. L. Cardasis and James L. Stephenson Jr. on 2018-05-07. The present disclosure establishes new dissociation parameters that can be used to determine the collision energy (CE) required to achieve a desired degree of dissociation of a given analyte precursor ion using collision-cell type collision induced dissociation. The selection is based solely on the molecular weight MW and charge state z of the analyte precursor ion. Metrics are proposed that can serve as parameters for the "degree of dissociation", and a predictive model is built for the CE required to achieve a range of values of each metric. Each model is a simple smooth function of only the MW and z of the precursor ion. By incorporating a real-time mass spectral deconvolution (m/z to mass) algorithm, the method according to the invention enables control of the degree of dissociation through automatic real-time selection of collision energy in a precursor-dependent manner.

1. A method for identifying an intact protein in a sample containing a plurality of intact proteins using a mass spectrometer, the method comprising:

(a) introducing the sample into an ionization source of the mass spectrometer;

(b) generating a plurality of ion species from the plurality of intact proteins using the ionization source, each protein thereby generating a respective subset of the plurality of ion species, wherein each ion species in each subset is a multiply-protonated ion species generated from a respective one of the intact proteins;

(d) automatically identifying each subset of the plurality of ion species by performing a mathematical analysis on the data generated by the mass analysis and assigning a charge state z to each identified ion species and a molecular weight MW to each intact protein;

(e) selecting one of the ion species;

(f) automatically calculating a collision energy CE for fragmenting the selected ion species using the following relationship:

CE(D_p) = c + (1/k) ln[(1/D_p) - 1],

wherein D_p is the fraction of the selected ion species that is desired to remain unfragmented after the fragmentation, and c and k are functions of only the charge state z of the selected ion species and the molecular weight MW of the intact protein from which the selected ion species was derived;

(g) isolating the selected ion species and fragmenting it using the automatically calculated collision energy, thereby forming fragment ion species; and

(h) mass analysing the fragment ion species.

2. A method for identifying intact proteins within a sample containing a plurality of intact proteins using a mass spectrometer, the method comprising:

(a) introducing the sample into an ionization source of the mass spectrometer;

(e) selecting one of the ion species;

(f) automatically calculating a collision energy CE for fragmenting the selected ion species using the following relationship:

wherein D_E is a parameter corresponding to a desired distribution of fragment ion species produced by said fragmentation, z is the assigned charge state of said selected ion species, MW is the molecular weight of said intact protein from which said selected ion species was produced, and b₁, b₂ and b₃ are predetermined parameters that vary according to D_E;

(g) isolating the selected ion species and fragmenting it using the automatically calculated collision energy, thereby forming fragment ion species; and

(h) mass analysing the fragment ion species.

Technical Field

The present invention relates to mass spectrometry and, more particularly, to methods and apparatus for mass spectrometric analysis of complex mixtures of proteins or polypeptides by tandem mass spectrometry. More specifically, the present invention relates to methods and apparatus for fragmenting precursor ions using collision induced dissociation, in which both the selection of precursor ions to fragment and the magnitude of the collision energy imparted to the selected precursor ions are determined automatically.

Background

The study of proteins in living cells and tissues (proteomics) is an active area of clinical and basic scientific research, as metabolic control in cells and tissues is exerted at the protein level. For example, comparison of protein expression levels between healthy and diseased tissues, or between pathogenic and non-pathogenic microbial strains, may accelerate the discovery and development of new pharmaceutical compounds or agricultural products. Furthermore, analysis of protein expression patterns in diseased tissue, or in tissue excised from an organism receiving treatment, can serve as a diagnostic of disease state or of the effectiveness of a treatment strategy, and can provide prognostic information regarding the appropriate treatment regimen and treatment selection for individual patients. Further, identification of the sets of proteins in samples derived from microorganisms (e.g., bacteria) can provide a means to identify the species and/or strains of the microorganisms and, with respect to bacteria, possible drug resistance of such species or strains.

Mass Spectrometry (MS) is currently considered to be a valuable analytical tool for biochemical mixture analysis and protein identification, as it can be used to provide detailed protein and peptide structural information. Thus, conventional protein analysis methods typically combine two-dimensional (2D) gel electrophoresis for separation and quantification with mass spectrometric identification of proteins. Also, capillary liquid chromatography and various other "front-end" separation or chemical fractionation techniques have been used in conjunction with electrospray ionization tandem mass spectrometry to facilitate large-scale protein identification without gel electrophoresis. Qualitative differences between mass spectra can be identified by using mass spectrometry, and proteins corresponding to peaks that occur only in certain mass spectra serve as candidate biomarkers.

The term "top-down proteomics" refers to an analytical method in which a protein sample is introduced intact into a mass spectrometer without prior enzymatic, chemical or other digestion. Top-down analysis allows the study of intact proteins, enabling identification, determination of primary structure and localization of post-translational modifications (PTMs) directly at the protein level. Top-down proteomic analysis typically consists of: introducing intact protein into an ionization source of a mass spectrometer; determining the intact mass of the protein; fragmenting the protein ions; and measuring the mass-to-charge ratio (m/z) and abundance of each fragment thus produced. This sequence of instrument steps is commonly referred to as tandem mass spectrometry, or alternatively as "MS/MS" analysis. This technique can also be used advantageously for polypeptide studies. The resulting fragment spectra are many times more complex than those of simple peptides. Interpretation of such fragment mass spectra typically involves comparing observed fragmentation patterns to a protein sequence database comprising compiled experimental fragmentation results generated from known samples, or alternatively to theoretically predicted fragmentation patterns. For example, Liu et al. ("Top-Down Protein Identification/Characterization of a Priori Unknown Proteins via Ion Trap Collision-Induced Dissociation and Ion/Ion Reactions in a Quadrupole/Time-of-Flight Tandem Mass Spectrometer") describe top-down protein identification and characterization of modified and unmodified unknown proteins with masses of up to 28 kDa.

One advantage of top-down over bottom-up analysis is that proteins can be identified directly, rather than being inferred from peptides as in so-called "bottom-up" analysis. Another advantage is that alternative forms of a protein, such as post-translational modifications and splice variants, can be identified. However, top-down analysis has a disadvantage compared to bottom-up analysis in that many proteins can be difficult to isolate and purify. Thus, in mass spectrometry, each protein in an incompletely separated mixture can produce a plurality of ion species, each corresponding to a different degree of protonation and thus a different charge state, and each such ion species can produce a plurality of isotopic variants. A single MS mass spectrum measured in a top-down analysis can easily contain hundreds or even thousands of peaks belonging to different analytes, with these peaks interleaved across a given m/z range and with ion signals of very different intensities overlapping.
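By way of illustration only, the relationship between a protein's molecular weight and the m/z values of its multiply-protonated ion species can be sketched as follows. The sketch assumes that charging occurs solely by protonation and uses an average molecular weight; the function names and the 29 kDa example value are hypothetical and are not taken from the disclosure.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton, in daltons


def mz_for_charge(mw_da: float, z: int) -> float:
    """Return m/z of the [M + zH]^z+ ion of a neutral molecule of mass mw_da."""
    if z < 1:
        raise ValueError("charge state z must be a positive integer")
    return (mw_da + z * PROTON_MASS_DA) / z


def charge_envelope(mw_da: float, z_min: int, z_max: int) -> dict:
    """Map each charge state in [z_min, z_max] to its m/z value."""
    return {z: mz_for_charge(mw_da, z) for z in range(z_min, z_max + 1)}


if __name__ == "__main__":
    # Hypothetical 29 kDa protein observed at charge states +22 to +28.
    for z, mz in charge_envelope(29000.0, 22, 28).items():
        print(f"z = +{z}: m/z = {mz:.2f}")
```

A single protein thus contributes one peak cluster per charge state, which is why the envelopes of co-eluting proteins can interleave within the same m/z range.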

When front-end sample fractionation, such as two-dimensional gel electrophoresis or liquid chromatography, is performed prior to MS analysis, the complexity of each individual mass spectrum can be reduced. Nevertheless, the mass spectra of such sample fractions may still include signatures of multiple proteins and/or polypeptides. The general technique of performing mass spectrometry (MS) analysis of ions generated from compounds separated by liquid chromatography (LC) may be referred to as "LC-MS". If the mass spectrometry is performed as tandem mass spectrometry (MS/MS), the process may be referred to as "LC-MS/MS". In a conventional LC-MS/MS experiment, a sample is first analyzed by mass spectrometry to determine the mass-to-charge ratios (m/z) of ions derived from the sample and to identify (i.e., select) mass spectral peaks of interest. The sample is then further analyzed by performing a product ion MS/MS scan on the selected peak or peaks. More specifically, a full-scan mass spectrum constituting an initial survey scan is obtained in the first stage of analysis (commonly referred to as "MS1"). One or more precursor ion species are then selected from the full-scan mass spectrum. Fragmentation of the selected precursor ion species may be accomplished, for example, using a collision cell or using another form of fragmentation cell, such as one employing surface induced dissociation, electron transfer dissociation or photo-dissociation. In the second stage (commonly referred to as "MS/MS" or "MS2"), the resulting fragment (product) ions are detected for further analysis using the same mass analyzer or a second mass analyzer. The resulting product spectrum shows a set of fragmentation peaks (a fragment set) that can be used in many cases as a means to derive structural information about the precursor ion species.

FIG. 1A shows a hypothetical experimental scenario in which fractions attributable to different analyte species are well resolved chromatographically (in time) when introduced into the mass spectrometer. Curves A10 and A12 represent the hypothetical concentration of each respective analyte at various times, where concentration is expressed as a percentage on a relative intensity (r.i.) scale and time is plotted along the abscissa as retention time. Curves A10 and A12 can be readily determined from measurements of the total ion current input into the mass spectrometer. A threshold intensity level A8 of the total ion current is set, below which only MS1 data are acquired. When the first analyte (detected as peak A10) elutes, the total ion current intensity crosses the threshold A8 at time t1. When this occurs, the on-board processor or other controller of the mass spectrometer may initiate acquisition of one or more MS/MS spectra. Subsequently, the leading edge of another elution peak A12 is detected. When the total ion current again crosses the threshold intensity A8 at time t3, one or more additional MS/MS scans are initiated. Typically, peaks A10 and A12 correspond to elution of different analytes, and thus different precursor ions are selected for fragmentation during elution of the first analyte (between time t1 and time t2) than during elution of the second analyte (between time t3 and time t4). Since different precursor ions will typically have different m/z ratios and different charge states, the experimental conditions required to produce optimal fragmentation may differ between the two elution periods.
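The threshold-triggered acquisition described above can be sketched, under simplifying assumptions, as a scan over a sampled total ion current trace. The function name, the sampled values and the threshold below are hypothetical; an actual instrument controller would operate on streaming data and is not described here.

```python
def msms_trigger_windows(times, tic, threshold):
    """Return (start, end) retention-time windows in which the total ion
    current exceeds the threshold, i.e. where MS/MS scans would be triggered."""
    windows = []
    start = None
    for t, intensity in zip(times, tic):
        above = intensity > threshold
        if above and start is None:
            start = t                    # upward crossing (e.g. t1 or t3)
        elif not above and start is not None:
            windows.append((start, t))   # downward crossing (e.g. t2 or t4)
            start = None
    if start is not None:                # trace ended while still above threshold
        windows.append((start, times[-1]))
    return windows


if __name__ == "__main__":
    times = list(range(10))                      # retention time, arbitrary units
    tic = [1, 2, 30, 80, 20, 3, 2, 40, 90, 4]    # hypothetical total ion current
    print(msms_trigger_windows(times, tic, threshold=10))
```

With the hypothetical trace shown, two windows are returned, corresponding to the two well-separated elution peaks of FIG. 1A.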

In more complex analyte mixtures, there may be components whose elution peaks overlap almost completely, as shown in the graph of ion current intensity versus retention time of FIG. 1B. In this example, elution peak A11 represents the ion current attributable to a precursor ion produced by a first analyte, while elution peak A13 represents the ion current attributable to a different precursor ion produced by a second analyte, where the masses and/or charge states of these precursor ions differ from each other. In the hypothetical case shown in FIG. 1B, the elution of the compounds that produce the different ions overlaps almost completely, and the mass spectral intensity of the first precursor ion is consistently greater than that of the second precursor ion during co-elution. As shown hypothetically in FIG. 1C, a mass spectrum acquired at any time during co-elution of the two analytes, e.g., between time t6 and time t7, may contain peaks of all precursor ions, where the set of lines indicated by envelope 78 arises from ionization of the first analyte and the set of lines indicated by envelope 76 arises from ionization of the second analyte. Under these conditions, automated mass spectrometry must not only be able to distinguish between the different precursor ions associated with the different respective analytes, but must also be able to adjust the collision energy imparted to the different precursor ions so that each is optimally fragmented. Indeed, as described below, it is important to scale the applied collision energy properly even when the analytes do not co-elute. Correct scaling is particularly important when the properties (e.g., MW and/or z) of the various analytes differ significantly, regardless of relative elution times.
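To illustrate how peaks within one envelope can be assigned a charge state and a molecular weight, the following minimal sketch uses two peaks assumed to belong to the same analyte at consecutive charge states. This is a textbook simplification under stated assumptions (charging by protonation only, correctly paired peaks) and is not the real-time deconvolution algorithm referred to in the present disclosure; all names are hypothetical.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton, in daltons


def deconvolve_adjacent_peaks(mz_high: float, mz_low: float):
    """Estimate (z, MW) from two peaks of one analyte at consecutive charge states.

    mz_high is assumed to carry charge z and mz_low charge z + 1, so that
    z = (mz_low - m_p) / (mz_high - mz_low) and MW = z * (mz_high - m_p).
    """
    if mz_high <= mz_low:
        raise ValueError("mz_high must be greater than mz_low")
    z = round((mz_low - PROTON_MASS_DA) / (mz_high - mz_low))
    mw = z * (mz_high - PROTON_MASS_DA)
    return z, mw


if __name__ == "__main__":
    # Hypothetical peaks of a 29 kDa protein at charge states +22 and +23.
    mz_22 = 29000.0 / 22 + PROTON_MASS_DA
    mz_23 = 29000.0 / 23 + PROTON_MASS_DA
    print(deconvolve_adjacent_peaks(mz_22, mz_23))
```

Once z and MW are assigned in this way, each precursor ion in interleaved envelopes such as 76 and 78 can be given its own collision energy.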

One common method of causing ion fragmentation in MS/MS analysis is collision induced dissociation (CID), in which a population of analyte precursor ions is accelerated into target neutral gas molecules, such as nitrogen (N₂) or argon (Ar), to impart internal vibrational energy to the precursor ions, which can lead to bond breakage and dissociation. The fragment ions are analyzed to provide useful information about the structure of the precursor ions. The term "collision induced dissociation" encompasses techniques in which energy is imparted to precursor ions by a resonance excitation process, which may be referred to as RE-CID techniques. This resonance excitation method includes applying an auxiliary alternating-current (AC) voltage to the trapping electrodes in addition to the main RF trapping voltage. The auxiliary voltage typically has a relatively low amplitude (about 1 volt (V)) and a duration on the order of tens of milliseconds. The frequency of the auxiliary voltage is chosen to match the frequency of motion of the ions, which in turn is determined by the amplitude and frequency of the main trapping field and the mass-to-charge ratio (m/z) of the ions. Owing to the resonance between the motion of the ions and the applied voltage, the energy of the ions increases and the amplitude of their motion grows.

FIG. 2 schematically illustrates another method of collision induced dissociation, sometimes referred to as higher-energy collisional dissociation (HCD). In the HCD method, selected ions are temporarily stored in or passed through a multipole ion storage device 52, which may, for example, comprise a multipole ion trap. At a particular time, the potential on the gate electrode assembly 54 is changed to accelerate selected precursor ions 6 out of the ion storage device and into a collision cell 56 containing inert target gas molecules 8. The ions are accelerated so that they collide with the target molecules with a kinetic energy determined by the potential difference between the collision cell and the storage device.
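As a worked illustration of the last statement, the laboratory-frame kinetic energy gained by an ion of charge state z accelerated through a potential difference of ΔV volts is z × ΔV electron volts. The sketch below merely evaluates this relation; the function name and example voltages are hypothetical and do not correspond to any particular instrument setting.

```python
def lab_frame_energy_ev(z: int, delta_v: float) -> float:
    """Kinetic energy (eV) gained by an ion of charge state z accelerated
    through a potential difference of delta_v volts."""
    if z == 0:
        raise ValueError("a neutral species is not accelerated")
    return abs(z) * abs(delta_v)


if __name__ == "__main__":
    # The same 20 V potential difference imparts very different energies
    # to precursor ions of different charge states.
    for z in (1, 8, 22):
        print(f"z = +{z}: {lab_frame_energy_ev(z, 20.0):.0f} eV")
```

This dependence on charge state is one reason why a single potential setting cannot be expected to suit precursor ions of widely differing z.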

When using HCD or RE-CID to generate fragment ions in MS/MS experiments, it is highly desirable to set up the instrument so as to impart the correct amount of collision energy to the selected precursor ions. For HCD, the collision energy (CE) is set by setting the potential difference through which ions are accelerated into the HCD cell. There, the ions collide with the resident gas one or more times until their internal energy exceeds the vibrational energy threshold for bond breakage, thereby producing dissociation product ions. The product ions may retain sufficient kinetic energy that further collisions result in successive dissociation events. The optimum collision energy varies with the properties of the selected precursor ion. Setting the HCD collision energy too high can lead to such successive dissociation events, resulting in a large number of small, non-specific product ions. Conversely, setting this potential too low results in too little fragmentation to provide useful information, since the mass spectral signatures of at least some fragment ions may be weak or absent. In either case, sufficient structural information about the precursor ion will not be available from the product ion mass spectrum, and thus no identification or structural (or sequence) elucidation can be provided. Analytes of different sizes, structures, and charge capacities dissociate to varying degrees at any given CE. Thus, using only a single collision energy setting for all precursor ions during an automated mass spectrometry experiment carries the risk of inadequate or unacceptable fragmentation of certain ions. Nevertheless, mass spectrometry procedures are often performed on samples or sample fractions of reduced chemical diversity for a variety of reasons (e.g., ionization, chromatography, fragmentation, etc.). Reduced chemical diversity increases the likelihood that a collision energy tuned on similar analytes will be appropriate.

Although resonance excitation CID (RE-CID) and HCD produce similar mass spectra for the same charge state of the same protein, the exact collision energy optimum required to produce the maximum amount of structural information may differ greatly. In the case of RE-CID, since the applied auxiliary frequency matches the fundamental frequency of motion of the precursor ions, the internal energy of the precursor ions increases until the minimum dissociation energy is reached and product ions are produced. The degree of fragmentation reaches a maximum with increasing applied energy and levels off as the precursor ions are depleted. If the applied fragmentation energy is increased further, beyond the onset of this plateau region, the relative abundances of the individual product ions remain approximately constant, and little to no additional relevant structural information is obtained.

In contrast, in the case of HCD fragmentation, the collision activation process is a function only of the potential difference between the HCD cell and the adjacent ion optical element. Thus, any product ions formed in the HCD cell may undergo further fragmentation by virtue of their excess internal energy. Since the HCD process uses nitrogen as the collision gas, rather than the helium commonly used in RE-CID experiments, higher-energy fragmentation and more structural information can be obtained from the HCD process if a near-optimal collision energy is applied. In the RE-CID process, increasing the applied collision energy beyond its optimum reduces the amount of precursor ions remaining, but does not significantly alter the relative amounts of fragment ions. In HCD fragmentation, increasing the applied collision energy beyond its optimum value typically results in further fragmentation of the fragment ions.

Fig. 3A shows a general comparison between the effect of increasing energy on the number of identifiable protein fragment ions produced by HCD fragmentation (curve 151) and the effect of increasing energy on the number of such identifiable ions produced by RE-CID fragmentation (curve 152). Curve 152 shows the effect of varying the applied resonance energy on fragmentation of the precursor ion from the protein myoglobin. In this example, the amount of structural information will remain relatively constant as the collision energy increases beyond 25% RCE. In contrast, when the HCD process is used (curve 151), the structural information content obtained has a well-defined maximum for an HCD energy of about 28% RCE. In the case where the collision energy is less than or exceeds the optimal RCE setting, the quality of the structural information obtained from the HCD experiment may be drastically degraded.

The effect of varying the applied HCD fragmentation energy is well illustrated by the fragmentation of the +8 charge state precursor ion of the protein ubiquitin, as shown by the product ion mass spectra of FIGS. 3B-3D. FIG. 3B shows the limited number of fragment ions that result when a suboptimal RCE setting of 25% is used. In many experimental cases, this limited fragmentation will not allow correct identification of the protein by searching standard tandem mass spectral libraries or using sequence information from available databases. However, when the RCE setting is changed to 30%, HCD fragmentation of the same precursor ion is optimal, and the resulting product ion mass spectrum (FIG. 3C) shows a rich array of fragments of various charge states that enables identification of the protein by any of several methods. Finally, as shown in FIG. 3D, further increasing the RCE setting to 40% results in excessive fragmentation, in which case most of the product ions generated are singly charged low-mass fragments that reflect the amino acid composition of the protein more than the actual protein sequence itself. Therefore, it is highly desirable to adjust the collision energy for HCD fragmentation of unknown proteins and complex mixtures in real time to maximize the available information content.

U.S. Patent No. 6,124,591 to Schwartz et al. describes a method of generating product ions in a quadrupole ion trap by RE-CID in which the amplitude of the applied resonance excitation voltage is substantially linearly related to the precursor ion m/z ratio. The technique described in U.S. Patent No. 6,124,591 attempts to normalize both the primary variation in optimum resonance excitation voltage amplitude among different ions and the variation from instrument to instrument. Schwartz et al. further found that the contributions of differing structure, charge state, and stability are secondary in determining the optimum applied collision energy, and that these secondary effects can be modeled by simple correction factors.

According to the teachings of Schwartz et al., a simple and fast calibration of the substantially linear relationship between the optimal applied CE and m/z can be performed on an instrument-by-instrument basis. FIG. 4A schematically illustrates the principle of generating and using a calibration curve. Initially, a calibration curve for a particular mass spectrometer is generated by fitting a linear relationship to calibration data at which a particular percentage reduction (e.g., 90%) in precursor ion intensity is observed. The linear relationship is shown in FIG. 4A by line 22. Schwartz et al. found that a two-point calibration is sufficient to characterize the linear relationship and, more simply, that a single-point calibration can be used if the intercept of the line is fixed at some value or at zero. In a typical calibration, the calibration line 22 is assumed to pass through the origin, as shown in FIG. 4A, and the single-point calibration involves measuring or calculating the applied collision energy at a specified reference mass-to-charge ratio (m/z)₀, shown as reference point 29. Typically, the reference point is at m/z = 500 Da, and the reference collision energy value measured at, or extrapolated to, 500 Da during calibration may be denoted CE₅₀₀.

Once the instrument calibration is determined, subsequent operation of the mass spectrometer typically does not use the full CE value represented by line 22, but rather a relative collision energy (RCE) value, expressed as a percentage of the CE value given by line 22 at any given m/z. For example, lines 24, 26, and 28 shown in FIG. 4A represent RCE values of 75%, 50%, and 25%, respectively. The user may then simply specify the desired RCE value. A simple scalar charge correction factor f(z) accounts for the secondary effect of the precursor ion charge state z on the optimum applied CE. These general relationships, initially determined for RE-CID fragmentation, have been found to be valid for HCD fragmentation as well. With these simplifications, the absolute collision energy CE_actual (expressed in electron volts) applied to each precursor for HCD fragmentation is automatically set according to the following equation:

wherein CE_actual is the applied collision energy, typically expressed in electron volts (eV), RCE is the relative collision energy, a percentage value typically defined by the user for each experiment, and f(z) is the charge correction factor. Table 1 in FIG. 4B lists accepted charge correction factors. Note that the numerator and denominator of the fraction in parentheses are both expressed in units of daltons, Da (or more precisely thomson, Th). Although this equation is generally adequate for fine-tuning the absolute CE applied to samples spanning a narrow range of precursor ion properties, it should be noted that, since f(z) takes a fixed value for z ≥ 5, the resulting collision energy is too high for heavier molecules with higher charge states (such as proteins and polypeptides), leading to excessive fragmentation of these species.
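The conventional scaling described above can be sketched as follows. The sketch assumes, on the basis of the definitions in the preceding paragraphs, a relationship of the form CE_actual = (RCE/100) × CE₅₀₀ × ((m/z)/500) × f(z); the charge correction factors shown are hypothetical placeholders, since Table 1 of FIG. 4B is not reproduced here, and the CE₅₀₀ value of 40 eV is likewise illustrative only.

```python
# Hypothetical charge correction factors, for illustration only. The accepted
# values are those of Table 1 in FIG. 4B, which is not reproduced in this text.
ILLUSTRATIVE_CHARGE_FACTORS = {1: 1.0, 2: 0.9, 3: 0.85, 4: 0.8}
ILLUSTRATIVE_FACTOR_Z5_PLUS = 0.75  # fixed value assumed for every z >= 5

REFERENCE_MZ = 500.0  # reference mass-to-charge ratio of the calibration


def charge_factor(z: int) -> float:
    """Return the (illustrative) charge correction factor f(z)."""
    if z < 1:
        raise ValueError("charge state z must be a positive integer")
    return ILLUSTRATIVE_CHARGE_FACTORS.get(z, ILLUSTRATIVE_FACTOR_Z5_PLUS)


def normalized_collision_energy_ev(rce_percent, ce_500_ev, mz, z):
    """Assumed form: CE_actual = (RCE/100) * CE_500 * (m/z / 500) * f(z)."""
    return (rce_percent / 100.0) * ce_500_ev * (mz / REFERENCE_MZ) * charge_factor(z)


if __name__ == "__main__":
    # Two precursors near m/z 1320 at very different charge states receive
    # the same collision energy, because f(z) is constant for z >= 5.
    for z in (5, 22):
        ce = normalized_collision_energy_ev(30.0, 40.0, 1320.0, z)
        print(f"z = +{z}: CE_actual = {ce:.2f} eV")
```

Under these assumptions the computed energy depends on z only through f(z), which illustrates why the conventional scheme cannot distinguish among the high charge states typical of intact proteins.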

Recently, mass spectrometry of intact proteins and polypeptides has gained widespread popularity. In such applications, the size, structure and charge capacity of the analytes within a sample may vary greatly, so that very different collision energies are required to achieve the same degree of dissociation. It has been found that, even if the range of charge correction factors is extended and extrapolated to charge states above +5, the above equation does not adequately normalize the collision energy for all precursors in a polypeptide or intact protein sample. A revised model is therefore required for these analytes.

Disclosure of Invention

The present teachings relate to establishing new dissociation parameters for determining the HCD (collision-cell type CID) collision energy (CE) required to achieve a desired degree of dissociation of a given analyte precursor ion. The selection is based solely on the molecular weight (MW) and charge state (z) of the analyte precursor ion. To this end, the inventors have devised two different metrics that can be used as measures of the "degree of dissociation" D and that replace the earlier relative collision energy and normalized collision energy parameters. These two new metrics are relative precursor decay (D_p) and spectral entropy (D_E), although other metrics describing the degree of dissociation are conceivable in the future. The inventors have further developed predictive models for the collision energy required to achieve a range of values of each such metric. Each model is a simple smooth function of only the MW and z of the precursor ion. In combination with a real-time spectral deconvolution algorithm capable of determining the molecular weights of the analyte molecules, these new techniques make it possible to control the degree of dissociation through automatic, real-time, precursor-dependent selection of collision energy. With these novel collision energy determination methods, the need for the user to "tune" or "optimize" the collision energy for different compounds or applications is eliminated, since a single "degree of dissociation" parameter setting applies across all sampled MW and z. This capability is advantageous for intact protein analysis, in which the precursors in a single sample may span a wide range of physical properties. Existing methods are tailored to a limited range of analyte properties (such as those of simple peptides) and do not adequately address the complexity of intact protein and polypeptide analysis.
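A minimal sketch of the two metrics follows, under stated assumptions. For D_p, the precursor survival fraction is assumed to follow a two-parameter logistic curve in CE with midpoint c and steepness k (consistent with the logistic regression fits mentioned for FIG. 6A), which is then inverted to give the CE for a desired D_p. For the entropy metric, the Shannon entropy of the normalized peak intensities is assumed as a stand-in; how D_E is derived from such entropies is not specified in this text and is not modeled. In the disclosure c and k are functions of MW and z; here they are passed as plain numbers with hypothetical values, and all function names are illustrative.

```python
import math


def precursor_survival(ce: float, c: float, k: float) -> float:
    """Assumed logistic survival curve: fraction of precursor left at energy ce."""
    return 1.0 / (1.0 + math.exp(k * (ce - c)))


def collision_energy_for_survival(d_p: float, c: float, k: float) -> float:
    """Invert the assumed logistic curve: CE = c + (1/k) * ln(1/D_p - 1)."""
    if not 0.0 < d_p < 1.0:
        raise ValueError("d_p must lie strictly between 0 and 1")
    return c + (1.0 / k) * math.log(1.0 / d_p - 1.0)


def spectral_entropy(intensities) -> float:
    """Shannon entropy (natural log) of normalized peak intensities
    (an assumed definition, used here for illustration)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must contain positive intensity")
    probs = [i / total for i in intensities if i > 0]
    return -sum(p * math.log(p) for p in probs)


if __name__ == "__main__":
    c, k = 20.0, 0.4  # hypothetical model parameters for one (MW, z) pair
    for d_p in (0.9, 0.5, 0.1):
        print(f"D_p = {d_p}: CE = {collision_energy_for_survival(d_p, c, k):.2f}")
    # An unfragmented spectrum (one peak) has zero entropy; spreading the
    # signal over many fragment peaks increases it.
    print(spectral_entropy([100.0]), spectral_entropy([25.0, 25.0, 25.0, 25.0]))
```

In this sketch a single user-facing setting (the desired D_p) maps to a different CE for every precursor once c and k are looked up for that precursor's MW and z, which is the precursor-dependent behavior described above.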

Drawings

To further clarify the above and other advantages and features of the present disclosure, a more particular description of the disclosure will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. It is appreciated that these drawings depict only illustrative embodiments of the disclosure and are therefore not to be considered limiting of its scope. The disclosure will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:

FIG. 1A is a schematic of an analysis of two analyte fractions showing well separated chromatographic elution peaks;

FIG. 1B is a schematic of a portion of a chromatogram having highly overlapping elution peaks, both above an analysis threshold;

FIG. 1C is a schematic representation of a hypothetical plurality of staggered mass spectral peaks of two simultaneously eluting protein or polypeptide analytes;

FIG. 2 is a schematic illustration of a conventional apparatus and method for fragmenting ions by collision-induced dissociation;

FIG. 3A is a general graphical comparison between the effect of increasing energy on the number of identifiable protein fragment ions produced by HCD fragmentation and the effect of increasing energy on the number of such identifiable ions produced by RE-CID fragmentation;

FIGS. 3B, 3C and 3D are mass spectra of fragment ions generated by HCD fragmentation of the +8 charge state precursor ion of the protein ubiquitin using relative collision energy settings of 25%, 30% and 40%, respectively;

FIG. 4A is a graph showing the relationship between applied collision energy and precursor ion mass-to-charge ratio according to a known "normalized collision energy" manipulation technique;

FIG. 4B is a table showing correction coefficients applied to known normalized collision energy manipulation techniques to compensate for the effect of precursor ion charge states on the degree of fragmentation produced by collision induced dissociation;

FIG. 5A is a schematic diagram of a system for generating and automatically analyzing a chromatography/mass spectrum according to the present teachings;

FIG. 5B is a schematic diagram of an exemplary mass spectrometer suitable for use in conjunction with methods according to the present teachings, the mass spectrometer including a hybrid system including a quadrupole mass filter, a dual pressure quadrupole ion trap mass analyzer, and an electrostatic trap mass analyzer;

FIG. 6A is a set of plots of the percentage of individual precursor ion species remaining after fragmentation as a function of applied collision energy, with logistic regression curves fitted to the data, where the precursor ion species are the +22, +24, +26, and +28 charge states of carbonic anhydrase, which has a molecular weight of about 29 kDa;

FIG. 6B is a table of parameters that may be used, in a model according to the present teachings, to calculate the collision energy that should be provided experimentally to yield various desired precursor ion survival percentages D_p, the parameters being tabulated for each selected D_p value;

FIG. 7A is a combination of five representative product ion mass spectra with different degrees of collision-induced dissociation, showing the variation in the "total spectral entropy" values calculated in accordance with the present teachings;

FIG. 7B is an example of dividing each of two product ion mass spectra into two regions, determining a first spectral entropy E₁ associated with each first region and a second spectral entropy E₂ associated with each second region, and comparing E₁, E₂ and the total spectral entropy E_total;

FIG. 8A is a set of graphs of the total spectral entropy (top panel), E₁ (middle panel) and E₂ (lower panel), calculated from product ion mass spectra according to the present teachings, as functions of the collision energy imparted to the indicated precursor ion charge states of myoglobin (about 17 kDa);

FIG. 8B is a table of parameters that may be used, in another model according to the present teachings, to calculate the collision energy that should be provided experimentally to yield a set of product ions distributed according to the product ion entropy parameter D_E, the parameters being tabulated for each selected D_E value;

FIG. 9A is a comparison, as a function of mass-to-charge ratio, between the conventionally calculated collision energy (solid line) and the collision energy calculated according to the entropy model of the present teachings (dashed line), for an ion charge state of +5 and the default setting of the conventional relative collision energy;

FIG. 9B is a comparison between the scaled conventionally calculated collision energy (solid line) and the collision energy calculated according to the entropy model of the present teachings (dashed line), where the conventionally calculated collision energy of FIG. 9A is scaled by a scaling factor of 0.79475;

FIG. 10 is a graph of charge state scaling factors that may be applied to conventionally calculated collision energies to reconcile those conventionally calculated collision energies with certain calculations determined in accordance with the present teachings;

FIG. 11 is a tabular representation of the charge state scaling factor graphically depicted in FIG. 10;

FIG. 12 is a flow chart of a method for tandem mass spectrometry analysis of a protein or polypeptide using automated collision energy determination according to the present teachings;

FIG. 13A is a pictorial representation of a computer screen information display showing peak cluster decomposition results calculated from a mass spectrum of a five-component protein mixture consisting of cytochrome-c, lysozyme, myoglobin, trypsin inhibitor and carbonic anhydrase, as produced by computer software employing a method according to the present teachings; and

FIG. 13B is a diagram of a computer screen information display showing peak cluster decomposition results produced by computer software employing a method according to the present teachings, the display showing an expanded portion of the decomposition results shown in FIG. 13A.

FIG. A1 shows a mass spectrum and a series of m/z values studied by the methods taught in the appendix.

Detailed Description

The following description is presented to enable any person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Various modifications to the described embodiments will be readily apparent to those skilled in the art, and the generic principles herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the embodiments and examples shown, but is to be accorded the widest scope consistent with the claims. The specific features and advantages of the present invention will become more apparent from the following discussion taken in conjunction with FIGS. 1-13.

FIG. 5A is a schematic example of a general system 30 for generating and automatically analyzing chromatography/mass spectrometry spectra, as may be employed in conjunction with the methods of the present teachings. A chromatograph 33, such as a liquid chromatograph, high-performance liquid chromatograph or ultra-high-performance liquid chromatograph, receives a sample 32 of an analyte mixture and at least partially separates the analyte mixture into individual chemical components in accordance with well-known chromatographic principles. The resulting at least partially separated chemical components are transferred to a mass spectrometer 34 at different respective times for mass analysis. As each chemical component is received by the mass spectrometer, it is ionized by an ionization source 112 of the mass spectrometer. From each chemical component, the ionization source may produce a plurality of ions comprising a plurality of ion species (i.e., a plurality of precursor ion species) of differing charge or mass. Thus, a plurality of ion species of differing respective mass-to-charge ratios may be produced for each chemical component, each such component eluting from the chromatograph at its own characteristic time. These various ion species are analyzed, generally by spatial or temporal separation, by a mass analyzer 139 of the mass spectrometer and are detected by a detector 35. As a result of this process, the ion species may be appropriately identified according to their various mass-to-charge ratios (m/z). As shown in FIG. 5A, the mass spectrometer includes a reaction cell 23 for fragmenting precursor ions, or causing other reactions of precursor ions, so as to generate a plurality of product ions comprising a plurality of product ion species.

Still referring to FIG. 5A, a programmable processor 37 is electrically coupled to the detector of the mass spectrometer and receives the data produced by the detector during chromatographic/mass spectrometric analysis of one or more samples. The programmable processor may comprise a stand-alone computer, or may simply comprise a circuit board or any other programmable logic device operated by firmware or software. Optionally, the programmable processor may also be electrically coupled to the chromatograph and/or the mass spectrometer so as to transmit electronic control signals to one or the other of these instruments and thereby control their operation. The nature of such control signals may be determined in response to the data transmitted from the detector to the programmable processor, or to an analysis of that data performed by a method in accordance with the present teachings. The programmable processor may also be electrically coupled to a display or other output 38, for direct output of data or data analysis results to a user, or to an electronic data storage device 36. The programmable processor shown in FIG. 5A is generally operable to receive precursor ion chromatography/mass spectrometry spectra and product ion chromatography/mass spectrometry spectra from the chromatography/mass spectrometry apparatus, and to automatically perform the various instrument control, data analysis, data retrieval and data storage operations in accordance with the various methods discussed below.

FIG. 5B is a schematic diagram of a particular example mass spectrometer 200 that may be used to perform methods according to the present teachings. The mass spectrometer shown in FIG. 5B is a hybrid mass spectrometer that includes more than one type of mass analyzer. Specifically, the mass spectrometer 200 includes an ion trap mass analyzer 216 and an Orbitrap™ analyzer 212, the Orbitrap™ analyzer being a type of electrostatic trap mass analyzer. The Orbitrap™ mass analyzer 212 employs image charge detection, in which ions are detected indirectly by detecting the image current induced on electrodes by the motion of the ions within the trap. Various analytical methods according to the present teachings employ multiple mass analysis data acquisitions. A hybrid mass spectrometer system can therefore be employed advantageously to improve duty cycle by using two or more analyzers simultaneously. However, a hybrid system of the type shown in FIG. 5B is not required, and methods according to the present teachings can be employed on any mass analyzer system that is capable of performing tandem mass spectrometry and that employs collision-induced dissociation. Suitable types of mass analyzers and mass spectrometers include, but are not limited to, triple quadrupole mass spectrometers, quadrupole time-of-flight (q-TOF) mass spectrometers, and quadrupole Orbitrap™ mass spectrometers.

In operation of the mass spectrometer 200, the electrospray ion source 201 delivers ions of the sample to be analyzed to the aperture of the skimmer 202, where they enter the first vacuum chamber. Upon entry, the ions are captured by the stacked ring ion guide 204 and focused into a tight beam. The first ion optical transfer assembly 203a transfers the beam into the downstream high-vacuum regions of the mass spectrometer. Most of the remaining neutral molecules and undesirable high-velocity ion clusters (e.g., solvated ions) are separated from the ion beam by the curved beam guide 206. Neutral molecules and ion clusters follow a straight path, whereas the ions of interest are bent through 90 degrees under the influence of a drag field, thereby producing the separation.

The quadrupole mass filter 208 of the mass spectrometer 200 functions in its conventional sense as a tunable mass filter that passes only ions within a selected narrow m/z range. The subsequent ion optical transfer assembly 203b transfers the filtered ions to a curved quadrupole ion trap ("C-trap") assembly 210. The C-trap 210 is able to transfer ions along a path between the quadrupole mass filter 208 and the ion trap mass analyzer 216. The C-trap 210 is also able to temporarily collect and store a population of ions and then deliver them, as a pulse or packet, into the Orbitrap™ mass analyzer 212. The transfer of ion packets is controlled by applying a potential difference between the C-trap 210 and a set of injection electrodes 211 disposed between the C-trap 210 and the Orbitrap™ mass analyzer 212. The curvature of the C-trap is designed so that the ion packet is spatially focused to match the angular acceptance of the entrance aperture of the Orbitrap™ mass analyzer 212.

The multipole ion guide 214 and optical transfer assembly 203b are used to guide ions between the C-trap 210 and the ion trap mass analyzer 216. The multipole ion guide 214 provides temporary ion storage capability so that ions generated in a first processing step of the analysis method can be retrieved later for processing in a subsequent step. The multipole ion guide 214 may also act as a fragmentation cell. The individual gate electrodes along the path between the C-trap 210 and the ion trap mass analyzer 216 are controllable so that ions can be transferred in either direction, depending on the order of ion processing steps required in any particular analysis method.

The ion trap mass analyzer 216 is a dual-pressure quadrupole linear ion trap (i.e., a two-dimensional trap) comprising a high-pressure linear trap cell 217a and a low-pressure linear trap cell 217b. The two cells are adjacent to one another and are separated by a plate lens having an aperture that permits ion transfer between the two cells and that acts as a pumping restriction, allowing different pressures to be maintained in the two traps. The environment of the high-pressure cell 217a favors ion cooling and ion fragmentation by collision-induced dissociation, electron transfer dissociation, or ion-ion reactions such as proton transfer reactions. The environment of the low-pressure cell 217b favors analytical scanning with high resolution and mass accuracy. The low-pressure cell includes a dual-dynode ion detector 215.

As shown in fig. 5B, mass spectrometer 200 further comprises a control unit 37, which may be connected to the various components of system 200 by electronic links. As shown in fig. 5A, discussed previously, the control unit 37 may be connected to one or more additional "front end" devices that supply samples to the mass spectrometer 200 and may perform various sample preparation and/or separation steps prior to supplying sample material to the mass spectrometer. For example, as part of controlling the operation of the liquid chromatograph, controller 37 may control the overall flow rate of the fluid within the liquid chromatograph, including the application of various reagents or mobile phases to the various samples. Control unit 37 may also act as a data processing unit, for example to process (e.g. in accordance with the present teachings) data from mass spectrometer 200 or to forward data to one or more external servers for processing and storage (external servers not shown).

Data collection for model development

Dissociation mass spectral data (MS/MS tandem mass spectral data) were collected on the following 11 protein standards: ubiquitin (~8 kDa), cytochrome c (~12 kDa), lysozyme (~14 kDa), RNAse A (~14 kDa), myoglobin (~17 kDa), trypsin inhibitor (~19 kDa), rituximab LC (~25 kDa), carbonic anhydrase (~29 kDa), GAPDH (~35 kDa), enolase (~46 kDa) and bovine serum albumin (~66 kDa). The samples were introduced by direct infusion and ionized by electrospray ionization. These proteins were selected for model building because of their well-known fragmentation patterns and their performance as typical top-down protein standards. Approximately 10 charge states of each protein were selected for MS/MS analysis by HCD dissociation. In these experiments, the absolute collision energy (CE) applied to each precursor ion was varied from 5 to 50 eV in steps of 1 electron volt (eV). From these data, a logistic regression fit of the decay curve was obtained for each charge state analyzed. The metrics D_p and D_E were calculated for each mass spectrum, and these values were then used to build predictive models of the CE (as a function of precursor MW and z) required to achieve a range of D values.

Precursor decay model

Method 1

For each protein standard, the intensity of the remaining precursor ion relative to the measured total ion current, D_p, was calculated for each precursor ion charge state z at each absolute collision energy (CE). The variation of D_p with CE follows a typical decay curve, as shown in FIG. 6A, in which decay curves 302, 304, 306 and 308 are the precursor ion decay curves for the +22, +24, +26 and +28 charge states of carbonic anhydrase, respectively. The inventors modeled this variation by logistic regression:

$$CE = c + \frac{1}{k}\,\ln\!\left(\frac{1}{D_p} - 1\right) \qquad \text{(Equation 2)}$$

where the parameter c is the CE at which 50% of the precursor remains, and the parameter k sets the (negative) slope of the curve at c. Curve 304 of FIG. 6A, corresponding to z = 24, carries additional annotations that illustrate how the parameters c and k are determined for that charge state. Specifically, point 311 is the point at which curve 304 crosses the 50% threshold, so the parameter c is about 17.6 eV. Further, line 313 is the tangent to curve 304 at point 311, and the parameter k is determined from the slope of this tangent. Computationally, the values of c and k are obtained by a least-squares fit to the calculated relative remaining intensities. The best-fit parameters depend on the molecular weight MW of the protein standard and on the charge state z at which the protein is fragmented. The parameters c and k can be modeled as simple products of powers of MW and z. The best-fit powers for c and k are again obtained by least-squares fitting, as shown below.

$$c = 0.0018 \times MW^{1.6} \times z^{-2.2} \qquad \text{(Equation 3)}$$

$$k = 0.00025 \times MW^{1.7} \times z^{1.9} \qquad \text{(Equation 4)}$$

Using Method 1, once the molecular weight MW and charge z have been determined (as described below), the values of the parameters c and k can be calculated from Equations 3 and 4. Then, for any desired percentage D_p of remaining precursor ion, the calculated c and k values can be used in Equation 2 to calculate the collision energy CE that must be applied.
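The final step of Method 1 can be illustrated with a minimal Python sketch. It assumes the logistic decay form D_p = 1 / (1 + exp(k(CE − c))), which is consistent with c being the CE at which 50% of the precursor remains; the function names are hypothetical, c = 17.6 eV is the value read from curve 304, and k = 0.5 per eV is an illustrative value only, not one taken from the disclosure.

```python
import math

def precursor_survival(ce, c, k):
    """Logistic decay: fraction of precursor remaining at collision energy ce (eV)."""
    return 1.0 / (1.0 + math.exp(k * (ce - c)))

def ce_for_survival(d_p, c, k):
    """Invert the logistic curve: CE needed to leave a fraction d_p of precursor."""
    if not 0.0 < d_p < 1.0:
        raise ValueError("d_p must lie strictly between 0 and 1")
    return c + math.log(1.0 / d_p - 1.0) / k

# c = 17.6 eV is the value read off curve 304 in the text; k = 0.5 per eV is a
# hypothetical steepness chosen only for illustration.
c_example, k_example = 17.6, 0.5
for target in (0.9, 0.5, 0.1):
    print(target, round(ce_for_survival(target, c_example, k_example), 2))
```

In practice c and k would first be computed from Equations 3 and 4 for the MW and z of the precursor; here they are passed in directly so that the inversion step stands on its own.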

Method 2

The second method departs from "Method 1" described above after the step of modeling each decay curve by the logistic regression of Equation 2. Rather than expressing the parameter c as one function of the two variables MW and z and the parameter k as another function of the same two variables, the second method adopts a more stepwise strategy. In this method, a target percentage D_p of remaining relative precursor intensity is specified first. Equation 2 (using the c and k values determined from the individual decay curves) is then used to tabulate all combinations of CE, MW and z values that yield the target precursor ion percentage D_p. A least-squares fit is then used to obtain the functional form of CE at that target as a product of powers of MW and z. In this way, a more tailored model of the appropriate CE can be obtained for each D_p of interest. In such a tailored model, the collision energy (CE) required to achieve a given percentage D_p of precursor ion survival can be calculated from a set of equations of the form:

$$CE(D_p) = a1 \times MW^{a2} \times z^{a3} \qquad \text{(Equation 5)}$$

where a1, a2 and a3 are parameters that are pre-calculated and tabulated for each D_p value of interest. These parameters are listed, for various values of D_p, in Table 2, which is provided in FIG. 6B.
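One way to obtain a1, a2 and a3 for a single target D_p is to fit the power law in logarithmic space, where ln CE = ln a1 + a2 ln MW + a3 ln z is linear in the unknowns. The sketch below takes that approach using only the standard library. It is an assumption that the fit is linearized in this way (the text says only that least squares is used), and the function names and the synthetic parameter values in the usage lines are hypothetical.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system a x = b by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for j in range(col, 4):
                m[r][j] -= f * m[col][j]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    return x

def fit_power_law(rows):
    """rows: list of (mw, z, ce). Returns (a1, a2, a3) for CE = a1 * MW^a2 * z^a3."""
    design = [(1.0, math.log(mw), math.log(z)) for mw, z, _ in rows]
    target = [math.log(ce) for _, _, ce in rows]
    # Normal equations (X^T X) beta = X^T y for the log-linear model
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(3)] for i in range(3)]
    xty = [sum(d[i] * t for d, t in zip(design, target)) for i in range(3)]
    ln_a1, a2, a3 = solve3(xtx, xty)
    return math.exp(ln_a1), a2, a3

# Synthetic check: data generated from a known (hypothetical) power law.
rows = [(mw, z, 0.1 * mw ** 0.9 * z ** -1.4)
        for mw in (8000, 17000, 29000, 66000) for z in (8, 15, 22, 30)]
print(fit_power_law(rows))
```

Fitting in log space weights relative rather than absolute errors in CE; a direct nonlinear least-squares fit would be an equally reasonable choice.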

Entropy model

For centroid product ion mass spectra, another measure of the degree of dissociation, total spectral entropy, is defined as follows:

$$E_{total} = -\sum_i p_i \ln(p_i) \qquad \text{(Equation 6)}$$

where p_i is the centroid intensity (or area) of the mass spectral peak (in m/z) of index i, normalized by the total intensity (or area) of all such peaks, i.e., by the total ion current (TIC). The sum runs over all centroids in the mass spectrum (all i). As noted above, the calculated values of the total spectral entropy of HCD product ion spectra were found to closely reflect the degree of dissociation observed in the data up to an E_total of about 0.7, beyond which the location of the ion current becomes an important consideration (FIG. 7A). To enhance the ability to distinguish (or resolve) the range from "ideal dissociation" to excessive fragmentation (high total spectral entropy), the total entropy is divided into a first partial entropy (E₁) and a second partial entropy (E₂), where E₁ represents the entropy of the region of the MS/MS spectrum from the minimum m/z to half of the precursor ion m/z, and E₂ represents the entropy of the spectral region from half of the precursor ion m/z to the final m/z (FIG. 7B). Thus, E₁ is calculated using Equation 6 by summing only over the p_i values of the m/z peak centroids within the E₁ region, and E₂ is likewise calculated using Equation 6 by summing only over the p_i values of the m/z peak centroids within the E₂ region. In the calculation of both E₁ and E₂, the denominator used to compute p_i is still the total ion current of the whole mass spectrum (the E₁ and E₂ regions together).
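The entropy definitions above can be sketched in Python as follows. The sketch assumes the conventional Shannon sign (a leading minus, so that all values are non-negative) and assigns a centroid lying exactly at half the precursor m/z to the E₂ region; both choices, and the function name, are assumptions made for illustration.

```python
import math

def spectral_entropies(centroids, precursor_mz):
    """centroids: list of (mz, intensity). Returns (e_total, e1, e2)."""
    tic = sum(i for _, i in centroids if i > 0)
    if tic <= 0:
        raise ValueError("spectrum has no positive intensity")
    half = precursor_mz / 2.0
    e1 = e2 = 0.0
    for mz, intensity in centroids:
        if intensity <= 0:
            continue
        p = intensity / tic          # normalized by the TIC of the whole spectrum
        term = -p * math.log(p)
        if mz < half:
            e1 += term               # region below half the precursor m/z
        else:
            e2 += term               # region from half the precursor m/z upward
    return e1 + e2, e1, e2

# Two equal peaks, one on each side of half the precursor m/z (1000 / 2 = 500).
print(spectral_entropies([(300.0, 50.0), (900.0, 50.0)], precursor_mz=1000.0))
```

Because both partial entropies use the whole-spectrum TIC as the denominator, E₁ and E₂ always sum to E_total in this formulation.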

FIG. 8A shows the E_total, E₁ and E₂ values calculated for selected precursor ion charge states of myoglobin, a protein of approximately 17 kDa from the model data set. In each case the curves are plotted as functions of the applied collision energy. Curves 426, 526 and 626 represent E_total, E₁ and E₂, respectively, calculated for the +26 charge state of myoglobin. Similarly, curves 424, 524 and 624 correspond to the +24 charge state; curves 421, 521 and 621 to the +21 charge state; curves 417, 517 and 617 to the +17 charge state; and curves 415, 515 and 615 to the +15 charge state.

Considering the curves for all of the proteins, it was observed that: (a) the E₁ values increase monotonically over the CE range of interest; (b) the E₁ curves are much smoother than the E₂ curves; and (c) all of the E₁ curves can be modeled well by logistic regression. The drawback of using the E₁ data alone is that the curves are relatively featureless, which makes it difficult to normalize the different E₁ values. However, use is made of the fact that each E₂ curve almost always contains a well-defined maximum, which defines a reference CE for each charge state of each protein standard. Accordingly, the inventors modeled the relationship between MW, precursor z and the CE value at the maximum of the E₂ curve, which results in Equation 7:

$$CE^{E2max} = 0.1 \times MW^{0.93} \times z^{-1.5} \qquad \text{(Equation 7)}$$

Applying this set of reference CE values to the E₁ curves, the E₁ value at the E₂ maximum can be determined for each charge state of each protein standard. Furthermore, by applying a logistic fit to each E₁ curve, a CE that yields any desired fractional value of the reference entropy can be defined for each z of each standard. This fractional reference entropy becomes the new parameter D_E. Specifically, for any particular z, the parameter D_E is defined as

$$D_E = \frac{E_1(CE)}{E_1(CE^{E2max})} \qquad \text{(Equation 8)}$$

where E₁(CE^E2max) is the value of the first partial entropy E₁ at the collision energy CE^E2max that is associated with the maximum in the second partial entropy E₂. The set of CE values corresponding to any particular fractional entropy value can be fitted to a power-law form similar to Equation 7, which is written in general form as:

$$CE(D_E) = b1 \times MW^{b2} \times z^{b3} \qquad \text{(Equation 9)}$$

where b1, b2 and b3 are parameters that are pre-calculated and tabulated for various values of D_E, as shown in Table 3, which appears in FIG. 8B. As expected, Equation 7 is recovered at D_E = 1. The concept of spectral entropy can also readily be extended to capture dissociation in other ways. For example, rather than calculating entropy from the m/z distribution alone, an m/z-to-mass deconvolution step may first be performed on the product ion spectrum to obtain the charges and molecular weights of the product ions. An entropy of molecular weight and an entropy of charge state can then readily be defined from the distributions of product ion molecular weight and charge, respectively.
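A table-driven evaluation of Equation 9 might look like the following Python sketch. Only the D_E = 1.0 row is populated, using the coefficients of Equation 7 on the assumption that this row coincides with Equation 7; the actual Table 3 values of FIG. 8B are not reproduced here, and the names `ENTROPY_TABLE` and `collision_energy` are hypothetical.

```python
# Rows keyed by D_E. Only the 1.0 row comes from the text (Equation 7);
# the real Table 3 (FIG. 8B) values must be substituted for any other level.
ENTROPY_TABLE = {1.0: (0.1, 0.93, -1.5)}

def collision_energy(mw, z, d_e=1.0, table=ENTROPY_TABLE):
    """Evaluate CE(D_E) = b1 * MW^b2 * z^b3 for a tabulated entropy fraction."""
    if d_e not in table:
        raise KeyError(f"no tabulated parameters for D_E = {d_e}")
    b1, b2, b3 = table[d_e]
    return b1 * mw ** b2 * z ** b3

# Example: a myoglobin-sized precursor (about 17 kDa) at charge state +20.
print(round(collision_energy(17000.0, 20), 2))
```

Keeping the parameters in a lookup table rather than in code mirrors the tabulated form described above and allows a per-instrument calibration to replace the values without touching the calculation.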

Equation 9 above may be used to determine the collision energy to be applied experimentally during HCD fragmentation so as to produce a spread of product ion m/z values corresponding to a given value of the entropy parameter D_E, calculated as discussed above. To the best of the inventors' knowledge, this is the first example of a model that prescribes the applied collision energy on the basis of a desired property of the set of product ions. The invention is not limited to the use of one particular metric (D_E) to represent the distribution or spread of the product ions, since other measures of the m/z spread of the product ions may be advantageous in certain situations.

The b1, b2 and b3 values listed in each row of Table 3 are associated with a certain product ion spread ("entropy fraction") D_E, as given by Equation 8, where D_E takes the values 0.1, 0.2, …, 2.0. The default level 1.0 corresponds to the maximum entropy E_max of the fragmentation spectrum, and the corresponding parameter set is obtained by modeling the relationship between MW, z and the collision energy at which E_max is observed. Levels below and above 1.0 are defined relative to E_max and are modeled separately to provide optimal collision energies for lesser and greater degrees of fragmentation, respectively. In general, before conducting an experiment or analysis on a sample containing unknown compounds, it may be necessary to determine the parameters p₁, p₂, p₃ for a particular instrument by acquiring initial test data on known standards, as described above (i.e., to perform a calibration).

Real-time fine calibration

Small instrument-to-instrument variability, as well as drift over time on any particular instrument, is to be expected. In view of this, a mechanism is provided to correct automatically for variability that results in a fixed offset from any given model. For example, under the entropy model, if D_E is set to 0.68 and the rolling average D_E of the most recent mass spectra (e.g., the 100 most recent) deviates from this value by more than ±15%, the system should adjust automatically so that the actually measured D_E moves closer to the desired "target" D_E. It is expected that a simple multiplicative correction coefficient will suffice, without modifying the coefficients of the underlying equations.
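A rolling-average correction of this kind could be sketched in Python as below. The window length and the 15% tolerance follow the example in the text; the class name, the 5% step size and the rule of restarting the window after each adjustment are hypothetical choices made for illustration, and the sketch assumes that the measured D_E rises with collision energy.

```python
from collections import deque

class EntropyDriftCorrector:
    """Maintains a multiplicative CE correction from a rolling mean of measured D_E."""

    def __init__(self, target_d_e, window=100, tolerance=0.15):
        self.target = target_d_e
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)
        self.factor = 1.0  # multiplicative correction applied to the model CE

    def update(self, measured_d_e):
        """Record one measured D_E and return the current CE correction factor."""
        self.recent.append(measured_d_e)
        if len(self.recent) == self.recent.maxlen:
            mean = sum(self.recent) / len(self.recent)
            deviation = (mean - self.target) / self.target
            if abs(deviation) > self.tolerance:
                # Hypothetical update rule: nudge CE up when fragmentation is too
                # low and down when too high, then restart the rolling window.
                self.factor *= 1.05 if deviation < 0 else 1 / 1.05
                self.recent.clear()
        return self.factor

    def corrected_ce(self, model_ce):
        return model_ce * self.factor

corrector = EntropyDriftCorrector(target_d_e=0.68)
```

Each new product ion spectrum would supply one call to `update`, and the CE predicted by the model would be passed through `corrected_ce` before being applied; the coefficients of the underlying equations are left untouched.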

New method for adapting conventional charge state correction coefficients

FIG. 9A shows a comparison between the collision energy calculated conventionally, for example as described in U.S. Pat. No. 6,124,591, for z = 5 and a relative collision energy (RCE) of 35% (curve 703), and the collision energy calculated according to the entropy model using an entropy fraction D_E of 1.0 (curve 704). For the entropy model calculation, the molecular weight was calculated as (m/z − 1.007) × z. Like the NCE curve (which is a straight line by definition), the curve calculated from the entropy model appears linear over the relevant m/z range of 500 to 2000. It should therefore be possible to apply a scaling factor to the NCE curve to obtain a fitted curve that matches the trend of the collision energy values calculated from the entropy model. Indeed, the fitted curve 705 matches the entropy model curve well (FIG. 9B). Such scaling can be performed by curve fitting, with essentially the same goodness of fit, for all charge states in the range 1 to 100 (data not shown).
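The scaling step can be sketched in Python as a one-parameter least-squares fit. The molecular weight expression and the entropy-model coefficients (Equation 7, D_E = 1.0) are taken from the text; `placeholder_nce` is a hypothetical straight-line stand-in for the conventional calculation, which is not reproduced here, so the printed scale factor is illustrative only and is not expected to match the 0.79475 of FIG. 9B.

```python
PROTON_MASS = 1.007  # Da, the value used in the text

def molecular_weight(mz, z):
    """Molecular weight of a multiply protonated ion from its m/z and charge."""
    return (mz - PROTON_MASS) * z

def entropy_model_ce(mz, z):
    """Equation 7 (D_E = 1.0) evaluated from m/z and z."""
    return 0.1 * molecular_weight(mz, z) ** 0.93 * z ** -1.5

def fit_scale(conventional, model):
    """Least-squares s minimizing sum((s*x - y)^2): s = sum(x*y) / sum(x*x)."""
    num = sum(x * y for x, y in zip(conventional, model))
    den = sum(x * x for x in conventional)
    return num / den

def placeholder_nce(mz, z):
    # Hypothetical stand-in for the conventional NCE calculation (a straight
    # line in m/z); substitute the instrument's actual expression.
    return 0.03 * mz

z = 5
grid = range(500, 2001, 100)
scale = fit_scale([placeholder_nce(mz, z) for mz in grid],
                  [entropy_model_ce(mz, z) for mz in grid])
print(scale)
```

Repeating the fit for each charge state yields one scaling factor per z, which is the quantity plotted against charge state in FIG. 10.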

The resulting scaling factors for the first five charge states are significantly below 1, which means that the entropy model tends to assign lower collision energies than the standard NCE method using the default RCE value of 35%. The scaling factors for z = {1..5} resulting from the fit therefore differ significantly from the conventional correction factors used in the normalized collision energy model, and similar deviations are expected for "intermediate" charge states in the range of about 6 to 10 (when the RCE correction factors are extrapolated to higher charge states, z > 5). However, for reasons of compatibility, modification of the established correction coefficients for the low charge states (Table 1) should be avoided.

To solve this problem, the two approaches are combined as follows: the curve of conventional correction coefficients is extrapolated in steps of −0.05 until it intersects the curve of scaling coefficients determined by the curve fitting described herein. The intersection is observed at z ≈ 10, which marks the transition from the conventional approach to the novel entropy approach described herein. The resulting scaling factors are shown in FIG. 10 as curves 708a and 708b. The resulting extended NCE curve (FIG. 10, curves 708a and 708b) is thus defined as follows:

For z = {1..5}, the conventional correction coefficients given in Table 1 are used.

For z = {6..10}, the correction coefficients are extrapolated by decreasing the last value, f(5) = 0.75, in steps of 0.05, i.e., f(z = {6..10}) = {0.70, 0.65, 0.60, 0.55, 0.50}.

For z > 10, the correction coefficient is given by the scaling factor resulting from the fit described above, normalized to the applied NCE correction coefficient of 0.75 (to avoid double scaling).

The extended NCE coefficients are given in table 4 shown in fig. 11.
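The three-part definition above can be sketched as a Python lookup. The conventional coefficients (Table 1) and the fitted, normalized coefficients for z > 10 (Table 4) are passed in as dictionaries because their values are not reproduced in this section; only f(5) = 0.75 and the step of 0.05 come from the text, and the function name is hypothetical.

```python
def extended_correction_factor(z, conventional, fitted):
    """
    conventional: dict z -> coefficient for z = 1..5 (Table 1 of the document).
    fitted: dict z -> normalized scaling factor for z > 10 (Table 4), assumed
            to be already normalized as described in the text.
    """
    if z < 1:
        raise ValueError("charge state must be a positive integer")
    if z <= 5:
        return conventional[z]
    if z <= 10:
        # Linear extrapolation from f(5) in steps of -0.05
        return round(conventional[5] - 0.05 * (z - 5), 2)
    return fitted[z]

conventional = {5: 0.75}  # only f(5) is quoted in the text
print([extended_correction_factor(z, conventional, {}) for z in range(6, 11)])
```

Supplying the remaining Table 1 entries and the Table 4 entries as the two dictionaries extends the lookup to the full charge state range.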

Summary of examples of molecular weight calculation methods

The models above require knowledge of the molecular weight (MW) of an analyte in order to estimate the optimal collision energy for fragmenting selected ions of that analyte. When protein and polypeptide molecules are ionized by electrospray ionization, the resulting ions consist mainly of intact molecules carrying multiple adducted protons. In this case, the charge on each major analyte ion species is simply equal to the number of adducted protons, and the molecular weight can be readily determined, at least in principle, provided that the individual multiply protonated molecular ion species represented in the mass spectrum can be identified and assigned to groups (i.e., charge state series) according to their molecular origin. Unfortunately, this identification and assignment is often complicated by the fact that a typical mass spectrum contains lines representing multiple overlapping charge state series, and by the fact that the signature of each ion species of a given charge state may be split by isotopic variation.

Because samples of biological origin are often very complex, a single MS mass spectrum can easily contain hundreds or even thousands of peaks belonging to different analytes, all interwoven within a given m/z range, in which ion signals of very different intensities overlap and suppress one another. The resulting computational challenge is to trace each peak back to one or more analytes. Eliminating "noise" and determining correct charge assignments are the first steps in meeting this challenge. Once the charge of a peak has been determined, the known relationships between the charge states in a charge state series can be used to group together the charge states that belong to the same analyte. This information can in turn be used to determine the molecular weight of one or more analytes, in a process best described as mathematical decomposition (also referred to in the art as mathematical deconvolution).
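The relationship between neighboring members of a charge state series can be illustrated with the textbook adjacent-peak calculation sketched below in Python. This is not the deconvolution algorithm of the cited publications; it assumes two peaks already known to be adjacent charge states of the same multiply protonated molecule, and the function names are hypothetical.

```python
PROTON_MASS = 1.007  # Da, matching the value used elsewhere in the text

def charge_from_adjacent_peaks(mz_high, mz_low):
    """
    mz_high carries charge z and mz_low carries charge z + 1 for the same molecule.
    Since mz = MW / z + PROTON_MASS, z = (mz_low - PROTON_MASS) / (mz_high - mz_low).
    Returns the integer z of the peak at mz_high.
    """
    if mz_high <= mz_low:
        raise ValueError("mz_high must exceed mz_low")
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))

def molecular_weight(mz, z):
    """Molecular weight of the neutral molecule from one member of the series."""
    return (mz - PROTON_MASS) * z

# Synthetic example: a 16951 Da molecule observed at charge states +20 and +21.
mw_true = 16951.0
mz_20 = mw_true / 20 + PROTON_MASS
mz_21 = mw_true / 21 + PROTON_MASS
z = charge_from_adjacent_peaks(mz_20, mz_21)
print(z, molecular_weight(mz_20, z))
```

With real data the two peaks are not known in advance to belong together, which is why the grouping of peaks into series is the harder part of the problem.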

Furthermore, the mathematical deconvolution needed to identify the various overlapping charge state series must be performed in "real time" (i.e., as the mass spectral data are being acquired), because the deconvolution results for a precursor ion mass spectrum are used immediately to select the ion species to be dissociated and to determine the appropriate collision energy to apply during dissociation, which may differ from species to species. Success requires both a data acquisition strategy that anticipates multiple mass spectral lines for each ion species and an optimized real-time data analysis strategy. In general, the deconvolution process should be completed in less than one second. An algorithm that achieves the required analysis of complex samples within such time constraints, operating as application software, is described in U.S. pre-grant publication No. 2016/0268112 A1, the disclosure of which is incorporated herein by reference in its entirety. Alternatively, co-pending European patent application No. 16188157, filed on September 9, 2016, teaches another suitable mathematical deconvolution algorithm. The text of the aforementioned European patent application is included as an appendix to this document, and its drawing is included as FIG. A1 in the accompanying set of drawings. The algorithm may also be encoded into a hardware processor coupled to the mass spectrometer so as to run even faster. The following paragraphs briefly summarize some of the main features of the computational deconvolution algorithm described in the above-mentioned U.S. pre-grant publication No. 2016/0268112 A1.

Only centroids are used.

Standard mass spectrometric charge assignment algorithms use the full profile data of the lines in a mass spectrum. In contrast, the computational method described in U.S. pre-grant publication No. 2016/0268112 A1 uses centroids. The main advantage of using centroids rather than line profiles is data reduction: typically, the number of profile data points is about an order of magnitude greater than the number of centroids. Any algorithm that uses centroids will gain a significant advantage in computational efficiency over the standard assignment methods. For applications that require real-time charge assignment, it is preferable to design an algorithm that requires only centroid data. The main drawback of using centroids is imprecision in the m/z values. Factors such as mass accuracy, resolution and peak-picking efficiency all tend to compromise the quality of centroid data. However, these concerns can be largely mitigated by building m/z imprecision into the algorithm that uses the centroid data.

The intensity is binary.

As described in U.S. Pre-Grant Publication No. 2016/0268112 A1, mass spectral line intensities are encoded as binary (Boolean) variables (true/false or present/absent). The Boolean method considers only whether a centroid intensity is above a threshold. An intensity takes the Boolean value "True" if it meets user-settable criteria based on signal intensity, signal-to-noise ratio, or both; otherwise it is assigned the value "False", regardless of the actual value of the intensity. A well-known disadvantage of using Boolean values is the loss of information. However, if a large number of data points is available, e.g., the thousands of centroids in a typical high-resolution mass spectrum, then the sheer number of Boolean variables more than compensates for the loss of intensity information. The cited deconvolution algorithm takes advantage of this data abundance to achieve both efficiency and accuracy.
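The thresholding described above can be sketched as follows. This is a minimal illustration only, not the implementation of the cited publication: the function name `encode_boolean`, the parameter names, and the use of a per-centroid noise estimate are assumptions introduced for this example.

```python
from typing import List, Optional


def encode_boolean(intensities: List[float],
                   noise: List[float],
                   min_intensity: Optional[float] = None,
                   min_snr: Optional[float] = None) -> List[bool]:
    """Encode centroid intensities as True/False flags.

    A centroid is True only if it passes every criterion that is set:
    an absolute intensity threshold, a signal-to-noise threshold, or both.
    (Hypothetical helper; names and criteria are illustrative assumptions.)
    """
    flags = []
    for inten, n in zip(intensities, noise):
        passed = True
        if min_intensity is not None:
            passed = passed and inten >= min_intensity
        if min_snr is not None:
            # A non-positive noise estimate cannot yield a valid ratio.
            passed = passed and n > 0 and (inten / n) >= min_snr
        flags.append(passed)
    return flags


flags = encode_boolean([100.0, 5.0, 50.0], [10.0, 10.0, 1.0],
                       min_intensity=20.0, min_snr=3.0)
```

After this step the actual intensity values are discarded; only the flags are carried forward.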

In an alternative embodiment, additional accuracy can be achieved without significant loss of computation speed by using approximate intensity values rather than only Boolean true/false variables. For example, one can envision a scheme in which only peaks of similar heights are compared with one another. The additional information is easily accommodated by discretizing the intensity values into a small number of low-resolution bins (e.g., "low," "medium," "high," and "very high"). Such binning can strike a good balance, retaining "height information" without sacrificing the computational simplicity of a greatly simplified intensity representation.

To achieve computational efficiency comparable to that obtained when using Boolean variables alone, while still incorporating intensity information, one approach is to encode each intensity as a byte, which is the same size as a Boolean variable. This is easily achieved by using the logarithm of the intensity (rather than the raw intensity) in the calculations, together with an appropriate logarithm base. The logarithm of the intensity may be further converted to an integer. If the logarithm base is chosen appropriately, the log(intensity) values will all fall comfortably within the range 0-255, which can be represented in one byte. In addition, the rounding error incurred in converting a double-precision variable to an integer can be minimized by careful selection of the logarithm base.

To further minimize any performance degradation that may be caused by byte arithmetic (rather than Boolean arithmetic), the computations used to separate or group centroids may need to compute only intensity ratios, rather than the byte-valued intensities themselves. The calculation of a ratio is very efficient because: 1) the logarithm of a ratio requires no floating-point division, only a simple difference of logarithms, which in this case reduces to the subtraction of two bytes; and 2) to recover the ratio from the difference of the logarithmic values, only exponentiation of that difference is required. Since such a calculation will only ever encounter the exponentials of a limited, predefined set of numbers (i.e., all possible integer differences between two bytes, -255 to +255), the exponentials can be pre-calculated and stored as a look-up array. Therefore, the use of a byte representation of the log intensities together with a pre-computed exponential look-up array does not compromise computational efficiency.
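The byte encoding and ratio look-up described in the two preceding paragraphs can be sketched as below. The logarithm base of 1.1 and all names (`LOG_BASE`, `to_byte`, `RATIO_LOOKUP`, `intensity_ratio`) are assumptions chosen for illustration; the cited publication may use different values.

```python
import math

# Assumed logarithm base: 1.1**255 is roughly 3.6e10, so intensities from
# 1 up to about 3.6e10 map onto the byte range 0-255.
LOG_BASE = 1.1


def to_byte(intensity: float) -> int:
    """Encode an intensity as the rounded base-LOG_BASE logarithm, clamped to 0-255."""
    if intensity <= 1.0:
        return 0
    return min(255, int(round(math.log(intensity, LOG_BASE))))


# Exponentials for every possible difference of two bytes (-255 to +255),
# computed once so that no exponentiation is needed at run time.
RATIO_LOOKUP = [LOG_BASE ** d for d in range(-255, 256)]


def intensity_ratio(byte_a: int, byte_b: int) -> float:
    """Recover the ratio intensity_a / intensity_b from two byte codes.

    The division becomes one integer subtraction plus one array look-up.
    Because each logarithm was rounded to an integer, the result is
    accurate only to within roughly one step of LOG_BASE.
    """
    return RATIO_LOOKUP[(byte_a - byte_b) + 255]


a, b = to_byte(1000.0), to_byte(100.0)
ratio = intensity_ratio(a, b)  # close to 10, up to the rounding error
```

A smaller base reduces the rounding error of each ratio but narrows the dynamic range that fits in one byte, which is the trade-off behind the careful choice of base mentioned above.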

Binning of mass-to-charge ratio values

As described in U.S. Pre-Grant Publication No. 2016/0268112 A1, the mass-to-charge ratio values are transformed and assembled into low-resolution bins, and the relative charge state intervals are pre-calculated once and cached for efficiency. In addition, the m/z values of the mass spectral lines are converted from their normal linear scale in daltons to a more natural dimensionless logarithmic representation. This transformation greatly simplifies the calculation of m/z values for, e.g., any peaks that belong to the same protein but potentially represent different charge states. The transformation does not compromise accuracy. When computing with the transformed variables, the cached relative m/z values can be used to improve computational efficiency.
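One way to see why a logarithmic scale simplifies the arithmetic is the following sketch. It assumes protonated ions, so that m/z = MW/z + m_p and therefore ln(m/z - m_p) = ln(MW) - ln(z); on this scale the spacing between two charge states depends only on the charges, not on the protein. The subtraction of the proton mass, the bin width, the charge limit, and all names are assumptions made for this illustration and are not taken from the cited publication.

```python
import math

PROTON_MASS = 1.007276   # Da; assumes the charge carriers are protons
BIN_WIDTH = 1e-4         # assumed bin width on the log scale (about 100 ppm)
MAX_CHARGE = 60          # assumed upper limit on the charge state


def log_bin(mz: float) -> int:
    """Map an m/z value (must exceed PROTON_MASS) to a low-resolution log-scale bin."""
    return math.floor(math.log(mz - PROTON_MASS) / BIN_WIDTH)


# Relative charge-state positions, computed once and cached.
# CHARGE_OFFSET[z] is ln(z) expressed in bins; index 0 is an unused placeholder.
CHARGE_OFFSET = [0] + [round(math.log(z) / BIN_WIDTH)
                       for z in range(1, MAX_CHARGE + 1)]


def predicted_bin(bin_index: int, z_from: int, z_to: int) -> int:
    """Bin where charge state z_to of the same molecule should appear,
    given the bin of its charge state z_from. Only integer additions."""
    return bin_index + CHARGE_OFFSET[z_from] - CHARGE_OFFSET[z_to]


# Hypothetical molecule of 16951 Da observed at charge 10:
mw = 16951.0
bin_z10 = log_bin(mw / 10 + PROTON_MASS)
bin_z11_predicted = predicted_bin(bin_z10, 10, 11)
```

Because of the flooring and rounding, a predicted bin can differ from the actual bin by one position, so a look-up would normally tolerate neighboring bins.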

Charge state scores based on simple counts and statistical selection criteria.

As described in U.S. Pre-Grant Publication No. 2016/0268112 A1, all relevant mass spectral information is encoded as Boolean arrays. Scoring a charge state against a centroid then reduces to a simple count of the yes/no (true/false) Boolean values found at the transformed m/z positions appropriate to the charge state being queried. This approach bypasses computationally expensive operations involving double-precision variables. Once scores have been compiled for a range of potential charge states, the optimal value can be readily selected by a simple statistical procedure. Using statistical criteria is more rigorous and reliable than using an arbitrary score cutoff or simply selecting the highest-scoring charge state.
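The counting score can be illustrated with the sketch below. It is not the algorithm of the cited publication: the occupied bins are held in a Python set (a sparse stand-in for a Boolean array), the log transform assumes protonated ions, only the three nearest charge states on each side are probed, and the selection rule (best score at least two standard deviations above the mean score) is a placeholder statistical criterion. All names and constants are assumptions.

```python
import math
from statistics import mean, pstdev
from typing import Dict, Optional, Set

PROTON_MASS = 1.007276   # Da; assumes protonated ions
BIN_WIDTH = 1e-3         # assumed log-scale bin width


def to_bin(mz: float) -> int:
    return round(math.log(mz - PROTON_MASS) / BIN_WIDTH)


def score_charges(query_mz: float, occupied: Set[int],
                  z_max: int = 30, tol: int = 1) -> Dict[int, int]:
    """For each candidate charge z of the query centroid, count how many
    neighboring charge states have an occupied bin at the expected position."""
    b = to_bin(query_mz)
    scores = {}
    for z in range(1, z_max + 1):
        count = 0
        for z_other in range(max(1, z - 3), z + 4):
            if z_other == z:
                continue
            target = b + round((math.log(z) - math.log(z_other)) / BIN_WIDTH)
            # True/False test only: is any bin within the tolerance occupied?
            if any((target + d) in occupied for d in range(-tol, tol + 1)):
                count += 1
        scores[z] = count
    return scores


def pick_charge(scores: Dict[int, int], n_sigma: float = 2.0) -> Optional[int]:
    """Accept the best charge only if its score is a statistical outlier."""
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    best = max(scores, key=scores.get)
    if sigma > 0 and scores[best] >= mu + n_sigma * sigma:
        return best
    return None


# Synthetic spectrum: one hypothetical 16951 Da protein at charges 8 to 20.
mw = 16951.0
occupied = {to_bin(mw / z + PROTON_MASS) for z in range(8, 21)}
scores = score_charges(mw / 12 + PROTON_MASS, occupied)
assigned = pick_charge(scores)
```

In this synthetic case the true charge collects the full count of six, while charge 6 (a harmonic of 12) still collects a partial count, which illustrates why a selection criterion is applied to the whole score distribution rather than to a single score in isolation.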

Iterative optimization of charge state assignments

The teachings of the aforementioned U.S. Pre-Grant Publication No. 2016/0268112 A1 employ an iterative process that is defined by complete self-consistency of the charge assignments. The final key feature of the method is the use of an appropriate optimality condition to guide the charge assignments toward a solution. The optimality condition is simply defined as the most consistent assignment of charges to all centroids in the mass spectrum. The basis for this condition is that the charge state assigned to each centroid should be consistent with the charge states assigned to the other centroids in the mass spectrum. The algorithm described in said publication implements an iterative process that guides the generation of charge state assignments according to this optimality condition. This process conforms to the accepted norms for an optimization process: an appropriate optimality condition is first defined, an algorithm is then designed to satisfy the condition, and finally the effectiveness of the algorithm is judged by how well it satisfies the condition.

Examples of mass spectrum deconvolution results

FIG. 13A shows the deconvolution results for a five-protein mixture consisting of cytochrome c, lysozyme, myoglobin, trypsin inhibitor, and carbonic anhydrase, where the deconvolution was performed according to the teachings of U.S. Pre-Grant Publication No. 2016/0268112 A1. The top display panel 1203 of the graphical user interface display shows the data acquired from the mass spectrometer, represented as centroids. The centrally located main display panel 1201 illustrates each peak as a respective symbol. The horizontally disposed mass-to-charge ratio (m/z) scale 1207, which applies to both the top panel 1203 and the central panel 1201, is shown below the central panel. The panel 1205 on the left side of the display shows the calculated molecular weight or weights of the protein molecules, in daltons. The molecular weight (MW) scale of the side panel 1205 is oriented vertically on the display, perpendicular to the horizontally oriented m/z scale 1207 that pertains to the detected ions. In this example, each horizontal line in the central panel 1201 indicates the detection of a protein, and the dashed contour lines correspond to the algorithm-assigned ion charge states, which are displayed as a direct result of the transformation calculation discussed previously. FIG. 13B shows a display pertaining to the same data set, in which the molecular weight (MW) scale is greatly expanded relative to the view shown in FIG. 13A. The expanded view of FIG. 13B shows the well-resolved isotopes of a single protein charge state (lowermost portion of the left panel 1205) as well as potential adduct or impurity peaks (two are present in the display). The most abundant of these three molecules is the trypsin inhibitor protein.

FIG. 12 is a flow chart of a method (method 800) for tandem mass spectrometry analysis of proteins or polypeptides using automated collision energy determination according to the present teachings. In step 802 of method 800 (FIG. 12), a sample or sample portion comprising a plurality of proteins and/or polypeptides is input to a mass spectrometer and ionized. Preferably, the ionization is performed by an ionization technique or ionization source that produces ion species of a type from which the molecular weights of the various protein or polypeptide compounds can be calculated from measurements of the mass-to-charge ratios (m/z) of the ions. In particular, the ionization technique or ionization source preferably generates, from each analyte compound, ion species that comprise a series of charge states, wherein each such ion species comprises an otherwise intact molecule of the analyte compound together with one or more adducts. Electrospray and thermospray ionization are two examples of suitable ionization techniques, since the major ion species generated from proteins and/or polypeptides by these techniques are multiply protonated molecules having different degrees of protonation. Ions generated by the ionization source and introduced into the mass spectrometer from the ion source may be referred to as "first-generation ions".

After the first-generation ions are introduced into the mass spectrometer, they are mass analyzed in step 804 to generate a mass spectrum, referred to herein as an "MS1" mass spectrum to indicate that it relates to the first-generation ions. A mass spectrum is simply a list or table of ion currents (intensities proportional to the number of ions detected) measured at each of a plurality of m/z values, typically maintained in computer-readable memory. The MS1 spectrum is then automatically examined in step 806 in a manner that enables the molecular weights of the various protein or polypeptide compounds to be calculated from the m/z values of the ions whose presence is detected in the mass spectrum. Performing this step may, where necessary, require prior mathematical decomposition (deconvolution) of the mass spectral data into individual identified charge state sequences, wherein each charge state sequence corresponds to a different respective protein or polypeptide compound. The mathematical deconvolution and identification of charge state sequences can be performed according to the method described in the aforementioned U.S. Pre-Grant Publication No. 2016/0268112 A1. Alternatively, the mathematical deconvolution may be performed by any equivalent algorithm. For example, co-pending European Patent Application No. 16188157, filed on September 9, 2016, teaches such an alternative mathematical algorithm. The text of the aforementioned European patent application is included as an appendix to this document, and its drawings are included as drawing A1 in the accompanying set of drawings. In some cases, the algorithm should be an optimized algorithm so that the required deconvolution can be performed within the time constraints imposed by the mass spectrometry experiment of which the method 800 forms a part.
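For multiply protonated ions, the molecular weight calculation mentioned above follows from the relation m/z = (MW + z·m_p)/z, where m_p is the proton mass. The sketch below shows the standard arithmetic for two adjacent peaks of one charge state sequence; the function names and the example mass are hypothetical, and the sketch assumes that protons are the only charge carriers.

```python
PROTON_MASS = 1.007276  # Da; assumes purely protonated ions


def charge_from_adjacent(mz_high: float, mz_low: float) -> int:
    """Charge of the peak at mz_high, given the next peak of the same
    sequence at mz_low (which carries one more proton).

    From mz_high = MW/z + m_p and mz_low = MW/(z+1) + m_p it follows that
    z = (mz_low - m_p) / (mz_high - mz_low).
    """
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


def molecular_weight(mz: float, z: int) -> float:
    """Neutral molecular weight from an m/z value and its charge state."""
    return z * (mz - PROTON_MASS)


# Hypothetical 16951 Da molecule observed at charges 10 and 11.
mz_10 = 16951.0 / 10 + PROTON_MASS
mz_11 = 16951.0 / 11 + PROTON_MASS
z = charge_from_adjacent(mz_10, mz_11)
mw = molecular_weight(mz_10, z)
```

In a real spectrum the peaks of several proteins overlap, which is why the deconvolution into separate charge state sequences is needed before this arithmetic can be applied.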

In step 808 of method 800 (FIG. 12), at least one precursor ion species, having a corresponding m/z, is selected from each of one or more of the charge state sequences identified in the previous step. Preferably, if more than one precursor ion species is selected, the different precursor ion species are selected from different charge state sequences. Then, in step 810, an optimal collision energy (CE) is calculated for each selected precursor ion species. Each calculated optimal collision energy is later imparted to the ions of the corresponding selected precursor ion species in an ion fragmentation step. The calculation of the optimal collision energy for an ion species uses the calculated molecular weight of the molecular species from which that ion species was generated. Optionally, the identified z value of each selected ion species may also be included in the calculation of the optimal collision energy for that ion species.

The optimal collision energy may be calculated in step 810 according to the methods taught herein. For example, if the optimal collision energy is to be selected such that a desired residual percentage of precursor ion intensity, D_p, remains after fragmentation, then the collision energy can be calculated using Equation 2, where the parameters c and k are determined according to Equations 3 and 4, or by equations of the same form as these two equations but with different values determined from a prior calibration of the particular mass spectrometer apparatus. Alternatively, the optimal collision energy may be selected so as to leave a residual percentage of precursor ion intensity, D_p, remaining after fragmentation by using Equation 5 in combination with the parameter values listed in Table 2. As another alternative, the optimal collision energy may be selected such that the product ion distribution present after fragmentation of the selected precursor ion species is consistent with a certain desired entropy parameter, D_E, by using Equation 9 in conjunction with the parameter values listed in Table 3.

In step 812 of method 800, selected precursor ion species are isolated within the mass spectrometer by known isolation means. For example, if MS1 ion species are temporarily stored in a multipole ion trap device, auxiliary oscillating voltages (auxiliary AC voltages) may be applied to the electrodes of the trap to cause all but a particular selected species to be expelled from the ion trap, thereby isolating only the selected species within the trap. Subsequently, in step 814, ions of the selected and isolated precursor ion species are fragmented by the HCD technique to produce fragment ions, wherein the previously calculated optimal collision energy is imparted to the selected ions to initiate fragmentation. At step 815, a mass spectrum (i.e., MS2 spectrum) of the fragment ions is acquired and stored in computer readable memory.

If any selected precursor ion species remain unfragmented after step 815 has been performed, execution returns to step 814, followed again by step 815, so that another selected precursor ion species is isolated and fragmented. Otherwise, execution proceeds to either step 818 or step 820. In step 818, the m/z or molecular weight of the selected precursor ion obtained from the MS1 spectrum is combined with information from the MS2 spectrum to identify, or to determine structural information about, a polypeptide or protein in the sample or sample portion being analyzed. Optional step 818 need not be performed immediately after step 816; it may be deferred until after termination of the method 800, or performed at an even later time, provided that the information from the associated MS1 and MS2 mass spectra is stored for later use and analysis. Finally, if it is determined at step 820 that there are additional samples or sample portions to be analyzed, execution returns to step 802, where the next sample or sample portion is analyzed. The individual sample portions may be generated by fractionating an initially homogeneous sample (e.g., by capillary electrophoresis, liquid chromatography, etc.) such that the material input to the mass spectrometer in each execution of step 802 is chemically simpler than the original unfractionated sample. Certain measured aspects of the fractionation (e.g., observed retention times) may be combined with the corresponding MS1 and MS2 information to identify one or more analytes during subsequent performance of step 818.
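The control flow of method 800 can be summarized in the following schematic sketch. It is not instrument control code: every callable passed to `run_method_800` is a hypothetical stand-in for an instrument or software operation, the function and key names are assumptions, and the collision energy model is left as a supplied callable because the relevant equations are defined elsewhere in this document.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# (m/z, charge state z, molecular weight MW) of a selected precursor ion species
Precursor = Tuple[float, int, float]


def run_method_800(
    sample_portions: Iterable[Any],
    acquire_ms1: Callable[[Any], Any],
    deconvolve: Callable[[Any], List[List[Precursor]]],
    select_precursors: Callable[[List[List[Precursor]]], List[Precursor]],
    collision_energy: Callable[[float, int], float],
    fragment_and_scan: Callable[[float, float], Any],
) -> List[Dict[str, Any]]:
    """Schematic loop over the steps of method 800 (all callables hypothetical)."""
    records = []
    for portion in sample_portions:                  # step 802, repeated via step 820
        ms1 = acquire_ms1(portion)                   # step 804: MS1 spectrum
        sequences = deconvolve(ms1)                  # step 806: charge state sequences
        precursors = select_precursors(sequences)    # step 808
        energies = [collision_energy(mw, z)          # step 810: CE from MW and z
                    for (_, z, mw) in precursors]
        # One pass per selected precursor, until none remain unfragmented.
        for (mz, z, mw), ce in zip(precursors, energies):
            ms2 = fragment_and_scan(mz, ce)          # steps 812, 814, 815
            records.append({"portion": portion, "mz": mz, "z": z, "mw": mw,
                            "collision_energy": ce, "ms2": ms2})
    return records  # stored MS1/MS2 information for optional step 818
```

Returning the stored records rather than identifying proteins inside the loop mirrors the statement above that step 818 may be deferred until after the method has terminated.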

Conclusion: model testing

The precursor decay model and the entropy model were tested by incorporating the parameters D_p and D_E, together with the mass spectral deconvolution algorithm of the aforementioned U.S. Pre-Grant Publication No. 2016/0268112 A1, into existing data acquisition control software. The protein fraction of E. coli cell lysates was analyzed by MS/MS analysis of liquid chromatography fractions, using the precursor ion decay and product ion entropy models as well as various optimized fixed normalized collision energies. In these experiments, it was observed that using either model to calculate the optimal collision energy improved the control of the degree of dissociation relative to the optimized, fixed, conventional normalized collision energy scheme. With the methods of the present teachings, this improved fragmentation led to improvements in protein identification across various data sets.

Appendix: method for identifying monoisotopic mass of molecular species
