Method and system for generating a centerline of an object and computer readable medium


Note: This technology, "Method and system for generating a centerline of an object and computer readable medium", was designed and created by 王昕�, 尹游兵, 宋麒, 白军杰, 陆易, 吴毅, 高峰, and 曹坤琳 on 2020-03-24. Abstract: The present disclosure provides methods and systems, and computer-readable media, for generating a centerline of an object. The method includes receiving an image containing the object. The method also includes generating the centerline of the object by tracking a sequence of image blocks. Generating the centerline of the object by tracking the sequence of image blocks includes, for each image block other than the initial image block: tracking the current image block according to the position and the action of the previous image block; outputting a policy function and a value function using a trained learning network based on the current image block, wherein the learning network comprises an encoder followed by a first learning network and a second learning network, and the learning network is trained by maximizing a cumulative reward; and determining an action of the current image block.

1. A computer-implemented method for generating a centerline of an object, comprising:

receiving an image containing the object, wherein the image is acquired by an imaging device; and

generating, by a processor, a centerline of the object by tracking a sequence of image blocks, including for each image block other than the initial image block:

tracking the current image block according to the position and the action of the previous image block;

outputting a policy function and a value function using a trained learning network based on a current image block, the learning network comprising an encoder followed by a first learning network and a second learning network, wherein the learning network is trained by maximizing a cumulative reward; and

determining an action of the current image block.

2. The method of claim 1, wherein outputting a policy function and a value function using a trained learning network based on a current image block comprises:

determining a first vector with the encoder based on a current image block;

determining a second vector using the first learning network based on the first vector; and

outputting a policy function and a value function using the second learning network based on a vector obtained by concatenating the first vector, the second vector, and additional inputs including at least a reward and an action of a previous image block.

3. The method of claim 2, wherein the additional inputs include a reward and an action of a previous image block and a tracking speed of a current image block.

4. The method of claim 2, wherein the learning network is trained by maximizing a cumulative reward and minimizing auxiliary losses for detecting bifurcation, endpoint, and loop closure.

5. The method according to claim 4, characterized in that the policy function, the value function, and the detection results of the bifurcation, endpoint, and loop closure are each output with a separate fully connected layer at the end of the second learning network.

6. The method of claim 2, wherein the reward for each image block incorporates a point-to-curve distance, representing a distance between a position of the current image block and a centerline of the object, and a similarity of intensity between the current image block and a next image block.

7. The method of claim 1, wherein the initial image block is predetermined, and the step of tracking the sequence of image blocks ends when a termination state is reached or when a maximum total length is reached.

8. The method of claim 1, wherein the image comprises a 3D image and the space of actions comprises six actions.

9. The method of claim 1, wherein the encoder is a convolutional neural network, and wherein the first learning network and the second learning network are both recurrent neural networks.

10. A system for generating a centerline of an object, comprising:

an interface configured to receive an image containing the object, wherein the image is acquired by an imaging device; and

a processor configured to:

generating a centerline of the object by tracking the sequence of image blocks, including for each image block other than the initial image block:

tracking the current image block according to the position and the action of the previous image block;

outputting a policy function and a value function using a trained learning network based on a current image block, the learning network comprising an encoder followed by a first learning network and a second learning network, wherein the learning network is trained by maximizing a cumulative reward; and

determining an action of the current image block.

11. The system of claim 10, wherein the processor is further configured to:

determine a first vector with the encoder based on a current image block,

determine a second vector based on the first vector using the first learning network, and

output a policy function and a value function using the second learning network based on a vector obtained by concatenating the first vector, the second vector, and additional inputs including at least a reward and an action of a previous image block.

12. The system of claim 11, wherein the additional inputs include a reward and an action for a previous image block and a tracking speed for a current image block.

13. The system of claim 11, wherein the processor is further configured to train the learning network by maximizing a cumulative reward and minimizing auxiliary losses for detecting bifurcations, endpoints, and loop closures.

14. The system of claim 13, wherein the policy function and the value function and the detection results of the bifurcation, endpoint, and loop closure are separately output using separate fully connected layers at the end of the second learning network.

15. The system of claim 11, wherein the reward for each image block incorporates a point-to-curve distance, representing a distance between a position of the current image block and a centerline of the object, and a similarity of intensity between the current image block and a next image block.

16. The system of claim 10, wherein the initial image block is pre-defined and the processor is configured to end tracking the sequence of image blocks when a termination state is reached or a maximum episode length is reached.

17. The system of claim 10, wherein the image comprises a 3D image and the space of actions comprises six actions.

18. The system of claim 10, wherein the encoder is a convolutional neural network and the first learning network and the second learning network are both recurrent neural networks.

19. A non-transitory computer readable medium having stored therein instructions, which when executed by a processor, implement a method for generating a centerline of an object, the method comprising:

receiving an image containing the object, wherein the image is acquired by an imaging device; and

generating, by a processor, a centerline of the object by tracking a sequence of image blocks, including for each image block other than the initial image block:

tracking the current image block according to the position and the action of the previous image block;

outputting a policy function and a value function using a trained learning network based on a current image block, the learning network comprising an encoder followed by a first learning network and a second learning network, wherein the learning network is trained by maximizing a cumulative reward; and

determining an action of the current image block.

Technical Field

The present disclosure relates generally to medical image processing and analysis. More particularly, the present disclosure relates to a method and system for generating a centerline of an object (e.g., a blood vessel, an airway, a lactiferous duct, etc.) in an image.

Background

Various biomedical image applications involve complex objects of tree structure, such as blood vessels and airways. Objects that present a tree structure are commonly observed in humans, including human airways, blood vessels (arteries and veins, capillaries), neural structures, and ducts extending from the nipple, among others. Recent technological advances in medical imaging (CT, MRI, fundus camera imaging, etc.) make it possible to acquire medical images (2D, 3D, or 4D) including the above-described structures.

The centerline is a skeletal (or medial) representation of a shape such that each point on it is equidistant from the shape's boundaries. Centerlines provide a compact representation that emphasizes geometric and topological properties of the object, such as connectivity, length, and orientation. For example, in clinical practice, centerline extraction is essential for quantitative measurement of tree structures (including length, radius, angle, etc.). Current centerline tracking methods can be divided into two broad categories: morphological skeletonization methods and least-cost-path-based methods. Morphological skeletonization methods, such as erosion and thinning, commonly operate on segmentation masks, and small perturbations or noise on the image/mask can easily lead to spurious branches. In contrast, the least-cost-path-based approach constructs a cost image and computes the best path from a starting point to an end point. The cost image is typically computed from image intensity or derived metrics. In addition, to ensure that the extracted centerline lies inside the lumen, the least-cost-path method can be applied to the segmentation mask itself, with the cost image computed based on a distance transform. Although least-cost-path-based algorithms are generally more robust than morphological tracking algorithms, they still have serious limitations. On the one hand, the start and end points are either specified manually, which increases user interaction, or detected using a priori information, which may result in missing points or in the detection of unnecessary points. On the other hand, cost images computed from intensity or intensity-derived metrics may not handle large variations in image intensity and quality well. Moreover, computing the cost image requires an additional step to extract the mask, which is in fact a very difficult task.
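For context only, the following is a minimal sketch of the conventional least-cost-path idea criticized above, not of the disclosed method: Dijkstra's algorithm run over a 2D cost image between manually specified start and end points. The cost definition, 4-connected grid, and seed handling are illustrative assumptions.

```python
import heapq

import numpy as np


def least_cost_path(cost, start, end):
    """Dijkstra shortest path on a 2D cost image (4-connected grid).

    `cost` is a positive per-pixel cost (e.g. derived from inverted vessel
    intensity); `start` and `end` are (row, col) seed points, and the end
    point is assumed to be reachable from the start point.
    """
    h, w = cost.shape
    dist = np.full((h, w), np.inf)
    dist[start] = 0.0
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == end:
            break
        if d > dist[node]:
            continue  # stale heap entry
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and d + cost[nr, nc] < dist[nr, nc]:
                dist[nr, nc] = d + cost[nr, nc]
                prev[(nr, nc)] = node
                heapq.heappush(heap, (dist[nr, nc], (nr, nc)))
    # Backtrack from the end point to recover the path (the "centerline").
    path, node = [end], end
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]
```

A typical intensity-derived cost for bright vessels might be `cost = 1.0 / (image + 1e-3)`, again only an assumption; as the text notes, such a cost handles large intensity variations poorly, which motivates the learning-based tracker described below.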

Due to the lack of robustness of the above conventional methods, clinicians or technicians often track the centerline manually or using some semi-automatic tool, which is laborious and time consuming, and the results may be prone to error.

The conventional methods have disadvantages. For example, intensity-based least-cost-path algorithms lack robustness in the presence of large differences in image intensity. For segmentation-based centerline tracking algorithms, the segmentation step must be performed throughout the entire scan ("full slice"). Medical images are typically very large, so the segmentation step itself is very time consuming. Segmentation-based centerline tracking algorithms also require manual specification, or extraction based on a priori information, of the start or end points. Segmentation-based centerline tracking algorithms are not end-to-end models: they rely on post-processing to handle the smoothness of the tracked centerline and on the output of previous multi-step models, so the results are not optimal for the imaged objects. Moreover, centerlines are extracted separately, one path after another, so tree structures are not handled well.

Disclosure of Invention

The present disclosure is provided to enable robust automatic extraction of tree-structure centerlines in an end-to-end manner by introducing a deep reinforcement learning (DRL) algorithm.

In one aspect, a computer-implemented method for generating a centerline of an object is disclosed. The method includes receiving an image containing an object. The image is acquired by an imaging device. The method also includes generating, by a processor, a centerline of the object by tracking a sequence of image blocks. Generating, by the processor, the centerline of the object by tracking the sequence of image blocks includes, for each image block other than the initial image block: tracking the current image block based on the position and the action of the previous image block; outputting a policy function and a value function using a trained learning network based on the current image block, wherein the learning network comprises an encoder followed by a first learning network and a second learning network; and determining an action of the current image block. The learning network may be trained by maximizing a cumulative reward.

In another aspect, a system for generating a centerline of an object is disclosed. The system includes an interface configured to receive an image containing an object. The image is acquired by an imaging device. The system also includes a processor configured to generate a centerline of the object by tracking a sequence of image blocks. The processor is further configured to, for each image block other than the initial image block: track the current image block based on the position and the action of the previous image block; output a policy function and a value function using a trained learning network based on the current image block, wherein the learning network comprises an encoder followed by a first learning network and a second learning network; and determine an action of the current image block. The learning network may be trained by maximizing a cumulative reward.

In another aspect, a non-transitory computer-readable medium having instructions stored thereon is disclosed. The instructions, when executed by a processor, implement a method for generating a centerline of an object. The method includes receiving an image containing an object, wherein the image is acquired by an imaging device. The method also includes generating, by the processor, a centerline of the object by tracking a sequence of image blocks. The method further comprises, for each image block other than the initial image block: tracking the current image block based on the position and the action of the previous image block; outputting a policy function and a value function using a trained learning network based on the current image block, wherein the learning network comprises an encoder followed by a first learning network and a second learning network; and determining an action of the current image block. The learning network may be trained by maximizing a cumulative reward.

To handle the tracking of tree-structured centerlines in biomedical images, the reinforcement learning process can be improved by augmenting the loss function with auxiliary tasks that provide additional training information supporting the learning of tracking-related tasks. Three auxiliary tasks may be considered, namely bifurcation detection, endpoint detection, and loop detection. The bifurcation detection task involves identifying bifurcations; this auxiliary task is intended to improve trajectory planning at bifurcations and to keep tracking along all bifurcating branches. The endpoint detection task directly detects the endpoints of the tree; an agent is trained to predict whether the current location is an endpoint, in order to stop tracking. The loop detection task detects loop closures directly from the tracked trajectory; the agent is trained to predict whether the current location has been previously visited. The trained learning network may serve as the agent.

Advantages of the disclosed methods, systems, and media for generating a centerline of an object in an image may be summarized as follows. The model is an end-to-end deep network (with images as its input), and the action of each image patch can be determined by a policy function and a value function updated for the respective image patch so as to track the sequence of image patches, thereby generating the centerline of the object accurately, quickly, and robustly. Furthermore, in some embodiments, the reinforcement learning process of the model may be improved by introducing auxiliary output layers and corresponding auxiliary tasks that provide more training information, so that the model is trained from various structural and topological features. In this way, the trained model can jointly learn the goal-driven reinforcement learning problem and better solve the centerline tracking problem in tree structures. Moreover, the method of the present disclosure avoids scanning the whole image in both the training phase and the prediction phase.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.

Drawings

In the drawings, which are not necessarily drawn to scale, like reference numerals may describe similar components in different views. Like reference numerals having letter suffixes or different letter suffixes may represent different instances of similar components. The drawings illustrate generally, by way of example, but not by way of limitation, various embodiments and, together with the description and claims, serve to explain the disclosed embodiments. Such embodiments are illustrative, and are not intended to be exhaustive or exclusive embodiments of the present method, system, or non-transitory computer-readable medium having instructions thereon for carrying out the method.

Fig. 1(a) shows an overview of a method for generating a centerline of an object in an image according to an embodiment of the present disclosure;

FIG. 1(b) shows an overview of a conventional segmentation-based approach;

FIG. 2 illustrates an architecture of a deep reinforcement learning (DRL) network for generating centerlines of objects according to an embodiment of the present disclosure;

FIG. 3 illustrates an architecture of a DRL network for generating a centerline of an object, according to another embodiment of the present disclosure;

FIG. 4 illustrates an example architecture of a DRL network for generating a centerline of an object according to yet another embodiment of the present disclosure;

FIG. 5 shows a flowchart of an example method for generating a centerline of an object, in accordance with an embodiment of the present disclosure;

FIG. 6 shows a schematic diagram of a training phase and a prediction phase; and

Fig. 7 shows a block diagram illustrating an example centerline generation system in accordance with an embodiment of the present disclosure.

Detailed Description

In the following, the technical term "object" is used to contrast with the background of the image. For medical images, the "subject" may represent organs and tissues of interest, such as blood vessels, airways, glands. For optical character recognition, an "object" may represent a character. In some embodiments, medical images are used as an example of an image and blood vessels are used as an example of an "object", but the methods, apparatuses and systems in embodiments may be easily and smoothly converted to extract centerlines of other objects in other types of images. Also, the technical term "image" may refer to a complete image or an image block cropped from an image.

FIGS. 1(a) and 1(b) respectively illustrate an overview of a method for generating centerlines of objects in images according to embodiments of the present disclosure and a conventional segmentation-based method, so that the two methods can be compared. According to embodiments of the present disclosure, a method for generating centerlines of objects in images may be implemented by means of an end-to-end learning framework, particularly a DRL end-to-end method, which incorporates at least one auxiliary task (detecting bifurcations, end points, loops, etc.) into the primary task of tracking the centerline of an object (e.g., a blood vessel).

As shown in FIG. 1(a), the end-to-end learning framework of the present disclosure may work as follows. An image may be fed into a DRL tracker. Initial image patches may be pre-set; for example, image patches containing a portion of the object rather than a portion of the background may be set as the initial image patches.

The DRL tracker may be applied to individual image blocks at each tracking step and may decide on the action for the current image block. The DRL tracker may be implemented at least in part by a learning network that may be trained by also considering the performance of auxiliary tasks, such as the detection of bifurcations, end points, and loops.
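As a concrete illustration of feeding patches to such a tracker, the sketch below crops a fixed-size image block around a position; the patch size, the (z, y, x) ordering, and the zero padding are assumptions, and the seed is simply presumed to lie inside the object.

```python
import numpy as np


def crop_patch(volume, center, size=32):
    """Crop a cubic `size`-voxel patch from `volume` centered at (z, y, x).

    Regions falling outside the volume are zero-padded, so the returned
    patch always has shape (size, size, size).
    """
    half = size // 2
    patch = np.zeros((size,) * 3, dtype=volume.dtype)
    src, dst = [], []
    for c, dim in zip(center, volume.shape):
        lo, hi = c - half, c - half + size
        src.append(slice(max(lo, 0), min(hi, dim)))
        dst.append(slice(max(lo, 0) - lo, size - (hi - min(hi, dim))))
    patch[tuple(dst)] = volume[tuple(src)]
    return patch


# Hypothetical usage: the initial image patch is cropped around a pre-set
# seed that lies on the object (e.g. a vessel) rather than the background.
# initial_patch = crop_patch(volume, seed)
```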

In some embodiments, the DRL tracker as shown in FIG. 1(a) may be implemented with an asynchronous advantage actor-critic (A3C) architecture.

The A3C architecture has an intelligent agent that is trained to perform the task of tracking the centerline of an object included in an image, which may be a 2D image or a 3D image. The DRL tracker, as shown in FIG. 1(a), acts as the intelligent agent. At each time step t, the agent selects an action a_t; thereafter, the agent moves from the current image patch to the next location according to the selected action. The image block at the next location then becomes the current image block. In the A3C architecture, an actor corresponds to the policy function (how to choose an action, e.g., a probability distribution over actions), while a critic corresponds to the value function (how to evaluate the performance of an action). For the training phase, the cumulative reward may be calculated by means of an advantage function, upon which the learning network (i.e., the agent) may be trained to learn and update the policy function and the value function, as is well known for the A3C architecture and will not be described further herein.
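The following sketch shows, under stated assumptions, how an A3C-style objective over one tracked trajectory could look: the critic regresses discounted returns, the actor is weighted by the advantage, and an entropy bonus encourages exploration. The discount factor and loss weights are illustrative, not values from the patent.

```python
import torch


def a3c_loss(log_probs, values, rewards, entropies,
             gamma=0.99, value_coef=0.5, entropy_coef=0.01):
    """Actor-critic loss over one rollout (sketch).

    `log_probs`, `values`, and `entropies` are 1-D tensors collected step
    by step from the network; `rewards` is a sequence of scalar rewards.
    """
    returns, R = [], torch.zeros(1)
    for r in reversed(list(rewards)):        # discounted cumulative reward
        R = r + gamma * R
        returns.insert(0, R)
    returns = torch.cat(returns)
    advantage = returns - values             # return minus critic prediction
    policy_loss = -(log_probs * advantage.detach()).mean()   # actor term
    value_loss = advantage.pow(2).mean()                      # critic term
    return policy_loss + value_coef * value_loss - entropy_coef * entropies.mean()
```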

Fig. 2 shows an example of the structure of a DRL network 200 for generating a centerline of an object according to an embodiment of the present disclosure, in particular how to track the next image patch (expected to belong to the centerline) from the current image patch 204 in an individual tracking step (one time step). As described above, the operation of one time step as shown in fig. 2 may be performed iteratively until the maximum episode length is reached, and the centerline may be generated from the sequence of tracked image patches.

As shown in FIG. 2, the architecture of the DRL network 200 includes an encoder 201 that receives an image containing an object. As an example, the encoder 201 may be any convolutional neural network (CNN) architecture. The image may be acquired by an imaging device and may be a 2D image, a 3D image, or a 4D image. The image may be acquired directly by various imaging modalities, such as, but not limited to, CT, digital subtraction angiography (DSA), MRI, functional MRI, dynamic contrast enhanced MRI, diffusion MRI, spiral CT, cone beam computed tomography (CBCT), positron emission tomography (PET), single photon emission computed tomography (SPECT), X-ray imaging, optical tomography, fluorescence imaging, ultrasound imaging, and radiotherapy portal imaging, or may be reconstructed based on original images acquired by the imaging device.

As shown in FIG. 2, the architecture of the DRL network 200 according to embodiments also includes a first learning network 202 and a second learning network 203 following the encoder 201. The policy function (π) and the value function (V) are the outputs of the DRL network 200; they share all intermediate representations and are each computed from the top of the model using a separate output layer (e.g., a linear layer).

In fig. 2, as an example, a first fully-connected layer 210 is cascaded to the end of the main part 203 'of the second learning network and outputs a policy function, and a second fully-connected layer 211 is cascaded to the end of the main part 203' of the second learning network and outputs a value function. Hereinafter, the main portion 203' of the second learning network together with the output layers (e.g., the first fully-connected layer 210 and the second fully-connected layer 211) constitute the second learning network 203.

The first fully-connected layer 210 and the second fully-connected layer 211 comprise a plurality of nodes, each node being connected to a respective node of the main part 203' of the second learning network. The first learning network 202 and the second learning network 203 are trained by maximizing the cumulative reward so as to learn a policy function and a value function for a given state observation (s_t). In some embodiments, the intensity of the image block obtained by the action performed at the (t-1)-th tracking step (t being an integer greater than 1) may be used as the state observation (s_t).

The first and second learning networks 202 and 203 (or the main part 203' of the second learning network) may be multi-layer perceptron (MLP) layers or layers of stacked recurrent neural networks (RNNs). The stacked RNNs may be added to the network architecture in order to take into account contextual information along the centerline.
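A minimal sketch of a FIG. 2 style architecture follows, assuming a small 3D CNN encoder, GRU cells for the two learning networks, and 128-dimensional hidden states; all layer sizes are placeholders rather than values from the patent.

```python
import torch
import torch.nn as nn


class CenterlineTracker(nn.Module):
    """Encoder followed by two stacked recurrent networks and two heads."""

    def __init__(self, n_actions=6, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(                       # encoder 201
            nn.Conv3d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(32, hidden),
        )
        self.rnn1 = nn.GRUCell(hidden, hidden)   # first learning network 202
        self.rnn2 = nn.GRUCell(hidden, hidden)   # main part 203' of second network
        self.policy_head = nn.Linear(hidden, n_actions)   # FC layer 210
        self.value_head = nn.Linear(hidden, 1)            # FC layer 211

    def forward(self, patch, h1, h2):
        feat = self.encoder(patch)     # "first vector"
        h1 = self.rnn1(feat, h1)       # "second vector"
        h2 = self.rnn2(h1, h2)
        return self.policy_head(h2), self.value_head(h2), h1, h2
```

In this sketch the recurrent states h1 and h2 are carried from one tracking step to the next (initialized to zeros), which is how contextual information along the centerline would enter the model.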

Fig. 3 shows an architecture of a DRL network 300 for generating a centerline of an object according to another embodiment of the present disclosure. In fig. 3, the architecture of the DRL network 300 includes an encoder 301 that receives an image containing an object, followed by a first learning network 302 and a second learning network 303. A first fully connected layer 310 is cascaded to the main portion 303' of the second learning network and outputs a policy function as one output of the DRL network 300, and a second fully connected layer 311 is cascaded to the main portion 303' of the second learning network and outputs a value function as another output of the DRL network 300.

According to this embodiment, the input to the second learning network 303 is a concatenated vector consisting of the output of the encoder 301, the output of the first learning network 302, and additional inputs; the architecture of the DRL network 300 of this embodiment is otherwise similar to the embodiment of FIG. 2. The additional inputs may include the reward r_(t-1) and the action a_(t-1) of step (t-1), the tracking speed v_t of step t, etc., as will be described in detail below. A joint reward function includes the point-to-curve distance, which is used to measure the effect of the transition from state to state, and the similarity of intensities, such as the average integral of the intensity (difference) between the current location and the next location of the agent; the joint reward function may be used as the reward function for step (t-1). The point-to-curve distance may indicate the distance between the location of the current image block and the centerline of the object, and may be minimized to maximize the reward for the current image block (or current step). Any function for calculating the point-to-curve distance may be used. As an example, the average distance from sample points in an image block to sample points on the nearest centerline may be calculated as the point-to-curve distance to measure the distance between the image block and the centerline. The intensity similarity between the current image block and the next image block of the image may be maximized in order to maximize the reward for the current image block. As an example, the average integral of the intensity (difference) between the current image block and the next image block of the image may be calculated. Since the joint reward function is used for each image patch, the topology of the tracked image patches matches the centerline of the object as closely as possible. Other additional inputs are contemplated and may be used.
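The joint reward just described might be sketched as follows; the nearest-point approximation of the point-to-curve distance, the mean-absolute-difference intensity similarity, and the weights `alpha` and `beta` are all assumptions made for illustration.

```python
import numpy as np


def step_reward(pos, cur_patch, next_patch, centerline_pts, alpha=1.0, beta=0.1):
    """Joint reward for one tracking step (sketch).

    `pos` is the current patch position, `centerline_pts` an (N, 3) array of
    ground-truth centerline points, and `cur_patch`/`next_patch` the image
    blocks before and after the move.
    """
    # Point-to-curve distance: distance from the current position to the
    # nearest ground-truth centerline point (smaller is better).
    d = np.linalg.norm(centerline_pts - np.asarray(pos), axis=1).min()
    # Intensity similarity between the current and next patches, here the
    # negated mean absolute intensity difference (larger is better).
    sim = -np.abs(next_patch.astype(float) - cur_patch.astype(float)).mean()
    return -alpha * d + beta * sim
```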

FIG. 4 illustrates an architecture of a DRL network 400 for generating centerlines of objects according to another embodiment of the present disclosure. The architecture illustrated in FIG. 4 differs from the architecture illustrated in FIG. 3 in that three additional auxiliary tasks are considered: bifurcation detection, endpoint detection, and loop detection. The bifurcation detection task involves identifying bifurcations; this auxiliary task is intended to improve trajectory planning at bifurcations and keep tracking along all bifurcating branches. The endpoint detection task directly detects the endpoints of the tree and trains the agent to predict whether the current location is an endpoint, in order to stop tracking. The loop detection task directly detects loop closures from the tracked trajectory and trains the agent to predict whether the current location has been previously visited. Fully connected layers 412, 413, 414 are cascaded to the main portion 403' of the second learning network and output the detection results of bifurcation, endpoint, and loop, respectively. As illustrated in FIG. 4, the second learning network 403 thus has five output layers, e.g., fully connected layers 410, 411, 412, 413, and 414, for outputting the policy function, the value function, the bifurcation detection result, the endpoint detection result, and the loop detection result, respectively. The DRL network 400 may be trained by simultaneously maximizing the cumulative reward and minimizing the losses of the bifurcation, endpoint, and loop detections.
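A sketch of how the auxiliary heads could enter the training objective: each of the bifurcation, endpoint, and loop heads is treated as a binary classifier and its cross-entropy is added to the reinforcement-learning loss, so maximizing the cumulative reward and minimizing the auxiliary losses happen in one backward pass. The weighting and the use of binary cross-entropy are assumptions, not specifics from the patent.

```python
import torch.nn.functional as F


def total_loss(rl_loss, bif_logit, end_logit, loop_logit,
               bif_label, end_label, loop_label, aux_weight=0.1):
    """Combine the RL objective with the three auxiliary detection losses.

    Logits come from the extra fully connected heads; labels are float
    tensors of the same shape (1.0 if the current location is a
    bifurcation / endpoint / previously visited position, else 0.0).
    """
    aux = (F.binary_cross_entropy_with_logits(bif_logit, bif_label)
           + F.binary_cross_entropy_with_logits(end_logit, end_label)
           + F.binary_cross_entropy_with_logits(loop_logit, loop_label))
    return rl_loss + aux_weight * aux
```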

Fig. 5 shows a flowchart of a process for generating a centerline of an object in accordance with an embodiment of the present disclosure. As shown in fig. 5, the process of generating the centerline of the object starts with receiving an image containing the object at step S10 and then proceeds to step S20, where the centerline of the object is generated by tracking a sequence of image blocks. The step of generating the centerline of the object by tracking the sequence of image blocks at step S20 comprises, for each image block other than the initial image block: tracking the current image block based on the position and the action of the previous image block at step S200; outputting a policy function and a value function using a learning network based on the current image block at step S210; learning the policy function and the value function by maximizing the cumulative reward at step S220; and determining an action of the current image block at step S230. In particular, the learning network may include an encoder followed by a first learning network and a second learning network. As an example, the learning/training step S220 may be performed offline or online. In some embodiments, step S220 may not be performed during the prediction phase; instead, the learning network may be trained offline, so that, in step S210, the policy function and the value function may be determined based on the current image block by means of the trained learning network. In some embodiments, for a given run, the individual image blocks (individual steps) may share the same learning network, whose parameters have been determined by training.
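At prediction time the loop implied by FIG. 5 might look like the sketch below. All helpers are passed in as parameters and are assumptions rather than parts of the patent: `model` maps a patch (plus recurrent states) to policy logits and a value, `crop_fn` extracts the current image block, `actions[i]` gives the displacement of action i, and `stop_fn` implements the termination state.

```python
import torch


def track_centerline(volume, seed, model, crop_fn, actions, stop_fn,
                     max_steps=500, hidden=128):
    """Greedy prediction-phase tracking loop (sketch)."""
    pos, path = tuple(seed), [tuple(seed)]
    h1 = h2 = torch.zeros(1, hidden)
    for _ in range(max_steps):                    # maximum episode length
        patch = torch.from_numpy(crop_fn(volume, pos)).float()[None, None]
        with torch.no_grad():
            logits, _value, h1, h2 = model(patch, h1, h2)
        action = int(logits.argmax(dim=1))        # most probable action
        pos = tuple(p + d for p, d in zip(pos, actions[action]))
        if stop_fn(pos, path):                    # termination state reached
            break
        path.append(pos)
    return path
```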

In some embodiments, outputting the policy function and the value function based on the current image block by means of the trained learning network comprises the following steps: determining a first vector based on the current image block with the encoder 201, 301, 401; determining a second vector based on the first vector with the first learning network 202, 302, 402; and outputting a policy function and a value function with the second learning network 203, 303, 403, based on a vector obtained by concatenating the first vector, the second vector, and additional inputs including, for example, the reward and the action of the previous image block.

In some embodiments, the additional inputs include a reward and an action for a previous image block and a tracking speed for a current image block.
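A sketch of this FIG. 3 style variant follows, in which the second recurrent network receives the concatenation of the first vector, the second vector, and the additional inputs (previous reward, previous action as a one-hot vector, and tracking speed). The encoding of the action and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn


class TrackerWithExtraInputs(nn.Module):
    """Tracker whose second network consumes a concatenated input vector."""

    def __init__(self, n_actions=6, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(                        # encoder 301
            nn.Conv3d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, hidden))
        self.rnn1 = nn.GRUCell(hidden, hidden)                # network 302
        # first vector + second vector + reward(1) + one-hot action + speed(1)
        self.rnn2 = nn.GRUCell(2 * hidden + n_actions + 2, hidden)
        self.policy_head = nn.Linear(hidden, n_actions)       # FC layer 310
        self.value_head = nn.Linear(hidden, 1)                # FC layer 311

    def forward(self, patch, h1, h2, prev_reward, prev_action, speed):
        first = self.encoder(patch)                           # first vector
        h1 = self.rnn1(first, h1)                             # second vector
        extra = torch.cat([prev_reward, prev_action, speed], dim=1)
        h2 = self.rnn2(torch.cat([first, h1, extra], dim=1), h2)
        return self.policy_head(h2), self.value_head(h2), h1, h2
```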

In some embodiments, the learning network may be trained by maximizing the cumulative reward (e.g., within an episode) and minimizing the auxiliary losses for detecting bifurcations, endpoints, and loop closures.

In some embodiments, the policy function, the value function, and the detection results of bifurcations, endpoints, and loop closures are output from respective fully connected layers 410, 411, 412, 413, 414, each cascaded to the preceding learning network (e.g., the main part 203', 303', 403' of the second learning network), as shown in figs. 2 to 4.

In some embodiments, the reward for each image block may combine a point-to-curve distance and a similarity of intensity between the current image block and the next image block, where the point-to-curve distance represents a distance between the position of the current image block and the centerline of the object. In this way, the tracked image blocks may be inhibited from deviating from the centerline. In addition, the similarity of the texture (or intensity distribution) of the image blocks located on the centerline may be considered to further improve the tracking accuracy.

In some embodiments, the initial image block may be preset and selected. The step of tracking the sequence of image blocks ends when a termination state is reached or when a maximum episode length is reached. The parameters of the agent are denoted by Θ. The gradient of Θ propagates back from the actor-critic outputs to the lower-level layers. In this way, the agent may be trained in an end-to-end manner.

In some embodiments, the image comprises a 3D image and the action space comprises six actions. As described above, the six actions include left, right, up, down, front, and back. In this case, the image is a 3D image, and the tracked object centerline can be presented to the user in a 3D mode.
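Such a six-action space for a 3D image could be encoded as unit voxel displacements, as in the sketch below; the axis ordering (z, y, x), the mapping of action names to axes, and the one-voxel step are illustrative assumptions.

```python
# Six unit displacements, one per action, for a (z, y, x) volume.
ACTIONS = {
    0: (0, 0, -1),   # left
    1: (0, 0, +1),   # right
    2: (0, -1, 0),   # up
    3: (0, +1, 0),   # down
    4: (-1, 0, 0),   # front
    5: (+1, 0, 0),   # back
}


def move(pos, action, step=1):
    """Move the patch center by the chosen action's displacement."""
    return tuple(p + step * d for p, d in zip(pos, ACTIONS[action]))
```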

According to some embodiments, the encoder may be a convolutional neural network, and both the first learning network and the second learning network may be RNNs. The RNNs may be long short-term memory (LSTM) networks, gated recurrent units (GRU), convolutional gated recurrent units (CGRU), or convolutional LSTMs (CLSTM).

Unlike the traditional supervised and unsupervised learning of other deep learning networks, reinforcement learning may indirectly take the ground truth into account through the reward. As an example, the reward for each image block may incorporate a point-to-curve distance representing the distance between the location of the current image block and the ground truth centerline of the object.

The training phase may be an offline process, in which a database of training data labeled with ground truth may be assembled. Given a 3D volumetric image and a list of ground truth vessel centerline points, the tracking model of the agent can be learned so as to track the centerline along an optimal trajectory. Batch normalization, entropy regularization, and the like may also be used to stabilize and improve training. The prediction phase may be an online process. For an unseen test sample, a starting point, for example at the root of a blood vessel, is provided to the system. If the agent moves out of the volume, or forms a loop, i.e., moves to a location that has been previously visited, the centerline tracking process stops.
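The two stopping conditions mentioned here (leaving the volume or closing a loop) could be checked as in the following sketch, where `visited` is assumed to be the set of positions already placed on the tracked path.

```python
def should_stop(pos, volume_shape, visited):
    """Return True if tracking should terminate at `pos` (sketch)."""
    out_of_volume = any(p < 0 or p >= s for p, s in zip(pos, volume_shape))
    loop_closure = pos in visited        # position was visited before
    return out_of_volume or loop_closure
```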

Next, the training and prediction phases for centerline tracking and/or generation are described in detail with reference to fig. 6, which shows an overview of an embodiment of a centerline tracking system 600 including a training phase and a prediction phase. As shown, the centerline tracking system 600 may include a centerline tracking model training unit 602 and a centerline tracking prediction unit 604. The centerline tracking model training unit 602 acquires training images labeled with ground truth from the training image database 601 to train the centerline tracking model and, as a result, outputs the trained centerline tracking model to the centerline tracking prediction unit 604. The centerline tracking prediction unit 604 is communicatively coupled to the image block extraction unit 603, which may extract one or more image blocks from the medical images in the medical image database 606; the centerline tracking prediction unit 604 may then predict the centerline of the object by tracking the sequence of image blocks, eventually generating the centerline of the object as the prediction result. According to embodiments of the present disclosure, the policy function and the value function as described above may be learned by maximizing the cumulative reward during training of the model. After iterative training on the training data, a trained centerline tracking model is obtained; in particular, the parameters of the learning network are optimized. In some embodiments, the centerline tracking prediction unit 604 may be communicatively coupled to the training image database 601 via a network 605. In this way, the centerline prediction results obtained by the centerline tracking prediction unit 604, in particular medical images annotated with centerlines, can be fed back into the training image database 601 as training samples after confirmation by a radiologist or clinician. Thereby, the training image database 601 can be expanded.

Fig. 7 illustrates a block diagram of an exemplary centerline generation system 700 in accordance with an embodiment of the present disclosure. The centerline generation system 700 may include a network interface 707 through which the centerline generation system 700 (or a centerline generation device therein, referring to components other than the network interface 707) may be connected to a network (not shown), such as, but not limited to, a local area network or the internet in a hospital. The network may connect the centerline generation system 700 with external devices such as an image acquisition device (not shown), a medical image database 708, and an image data storage device 709. The image acquisition device may use any type of imaging modality, such as, but not limited to, CT, Digital Subtraction Angiography (DSA), MRI, functional MRI, dynamic contrast enhanced MRI, diffusion MRI, helical CT, Cone Beam Computed Tomography (CBCT), Positron Emission Tomography (PET), Single Photon Emission Computed Tomography (SPECT), X-ray, optical tomography, fluoroscopic imaging, ultrasound imaging, radiotherapy portal imaging.

In some embodiments, the centerline generation system 700 may be a dedicated smart device or a general-purpose smart device. For example, the system 700 may employ a computer customized for image data acquisition and image data processing tasks, or a server in the cloud. For example, the system 700 may be integrated into an image acquisition device.

The centerline generation system 700 may include an image processor 701 and a memory 704, and may additionally include at least one of an input/output 702 and an image display 703.

The image processor 701 may be a processing device including one or more general-purpose processing devices, such as a microprocessor, a central processing unit (CPU), a graphics processing unit (GPU), and the like. More particularly, the image processor 701 may be a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor running other instruction sets, or a processor running a combination of instruction sets, for example a Pentium™, Core™, or Xeon™ series microprocessor manufactured by Intel™, a Turion™, Athlon™, Sempron™, Opteron™, FX™, or Phenom™ family processor manufactured by AMD™, or any of the various processors manufactured by Sun Microsystems. The image processor 701 may also include a graphics processing unit, such as a GPU from Nvidia™, a GMA or Iris™ series GPU manufactured by Intel™, or a Radeon™ series GPU manufactured by AMD™. The image processor 701 may also include an accelerated processing unit, such as the Desktop A-4 (6,6) series manufactured by AMD™, the Xeon Phi™ series manufactured by Intel™, and so on. The disclosed embodiments are not limited to any type of processor or processor circuit otherwise configured to meet the computing requirements of receiving, identifying, analyzing, maintaining, generating, and/or providing a large amount of imaging data, or of manipulating such imaging data, so as to perform the following operations consistent with the disclosed embodiments: generating a centerline of an object by tracking a sequence of image blocks based on an input image using trained first and second learning networks, and, for each image block other than an initial image block set or selected in advance, tracking the current image block based on the position and action of the previous image block, outputting a policy function and a value function using the first and second learning networks based on the current image block, learning the policy function and the value function by maximizing a cumulative reward, and determining an action of the current image block. In addition, the term "processor" or "image processor" may include more than one processor, for example, a multi-core design or multiple processors each having a multi-core design. The image processor 701 may execute sequences of computer program instructions stored in the memory 704 to perform the various operations, processes, and methods disclosed herein.

The image processor 701 may be communicatively coupled to the memory 704 and configured to execute the computer-executable instructions stored therein to perform the steps of the method as described above. The memory 704 may include read only memory (ROM), flash memory, random access memory (RAM), dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM) or Rambus DRAM, static memory (e.g., flash memory, static random access memory, etc.), and the like, on which computer-executable instructions are stored in any format. In some embodiments, the memory 704 may store computer-executable instructions of one or more image processing programs 705. The computer program instructions may be accessed by the image processor 701, read from the ROM or any other suitable storage location, and loaded into the RAM for execution by the image processor 701. For example, the memory 704 may store one or more software applications. The software applications stored in the memory 704 may include, for example, an operating system (not shown) for a general-purpose computer system and for soft control devices.

Further, the memory 704 may store the entire software application or only a portion of the software application (e.g., the image processing program 705) so as to be executable by the image processor 701. Additionally, the memory 704 may store a plurality of software modules for implementing various steps of a method for generating a centerline of an object in an image or a process for training a learning network according to the present disclosure. For example, the encoder 201, 301, 401, the first learning network 202, 302, 402 and the second learning network 203, 303, 403 (as shown in fig. 2-4) may be implemented as software modules stored on the memory 704.

Further, the memory 704 may store data generated/buffered when the computer program is executed, such as medical image data 706, including medical images transmitted from the image acquisition device, the medical image database 708, the image data storage device 709, and the like. In some embodiments, the medical image data 706 may comprise images received from an image acquisition apparatus to be processed by the image processing program 705 and may comprise medical image data generated during execution of a method of generating a centerline of an object and/or training a learning network.

Further, the image processor 701 may execute the image processing program 705 to implement a method for generating a center line of an object. In this manner, each online centerline generation process may generate a new piece of training data to update the medical image data 706. The image processor 701 may train the first and second learning networks in an online manner to update existing parameters (e.g., weights) in the current learning network. In some embodiments, the updated parameters of the trained learning network may be stored in the medical image data 706 and then may be used for the same object of the same patient in the next centerline generation. Thus, if the image processor 701 determines that the centerline generation system 700 has already performed centerline generation for the same subject of the current patient, the most recently updated learning network for centerline generation may be invoked and used directly.

In some embodiments, the image processor 701 may associate the input image with an automatically (or semi-automatically) generated centerline of the subject as medical image data 706 for presentation and/or transmission when performing an online centerline generation process. In some embodiments, the input image along with the generated centerline may be displayed on an image display 703 for viewing by a user. In some embodiments, medical image data by associating the input image with the generated centerline may be transmitted to the medical image database 708 for access, acquisition, and utilization by other medical devices as needed.

In some embodiments, an image data storage device 709 may be provided to exchange image data with the medical image database 708, and the memory 704 may communicate with the medical image database 708 to obtain images of the current patient. For example, the image data storage device 709 may reside in another medical image acquisition device, such as a CT scanner that performs a scan of the patient. Patient slices covering an object (e.g., a blood vessel) may be transmitted, reconstructed into a volumetric image, and saved to the medical image database 708, and the centerline generation system 700 may retrieve the volumetric image of the object from the medical image database 708 and generate the centerline of the object in the volumetric image.

In some embodiments, the memory 704 may communicate with the medical image database 708 to send and save the input image associated with the generated centerline into the medical image database 708 as a piece of ground truth-labeled training data, which may be used for training as described above.

For example, the image display 703 may be an LCD, CRT, or LED display.

Input/output 702 may be configured to allow centerline generation system 700 to receive and/or transmit data. Input/output 702 may include one or more digital and/or analog communication devices that allow system 700 to communicate with a user or other machines and devices. For example, input/output 702 may include a keyboard and mouse that allow a user to provide input.

In some embodiments, the image display 703 may present a user interface such that a user may conveniently and intuitively correct (e.g., edit, move, modify, etc.) the automatically generated object centerline through the input/output 702 and the user interface.

The network interface 707 may include a network adapter, a cable connector, a serial connector, a USB connector, a parallel connector, a high-speed data transmission adapter such as fiber optic, USB 6.0, or lightning, a wireless network adapter such as a Wi-Fi adapter, or a telecommunications (4G/LTE, 5G, 6G or higher, etc.) adapter. The system 700 may be connected to a network through the network interface 707. The network may provide the functionality of a local area network (LAN), a wireless network, a cloud computing environment (e.g., software as a service, platform as a service, infrastructure as a service, etc.), a client-server, a wide area network (WAN), and the like, through various communication protocols currently used or developed in the future.

Various operations or functions are described herein that may be implemented as or defined as software code or instructions. Such content may be directly executable ("object" or "executable" form), source code, or difference code ("delta" or "patch" code). The software code or instructions may be stored in a computer-readable storage medium and, when executed, may cause a machine to perform the described functions or operations. A computer-readable storage medium includes any mechanism for storing information in a form accessible by a machine (e.g., a computing device, an electronic system, etc.), such as recordable or non-recordable media (e.g., read only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, etc.).

The example methods described herein may be implemented at least in part by a machine or computer. Some examples may include a non-transitory computer-readable or machine-readable medium encoded with instructions operable to configure an electronic device to perform a method as described in the above examples. An implementation of such a method may include software code, such as microcode, assembly language code, high-level language code, and so forth. Various software programming techniques may be used to create the various programs or program modules. For example, the program parts or program modules may be designed in or by Java, Python, C++, assembly language, or any known programming language. One or more of such software portions or modules may be integrated into a computer system and/or computer-readable medium. Such software code may include computer-readable instructions for performing various methods. The software code may form part of a computer program product or a computer program module. Further, in an example, the software code can be tangibly stored on one or more volatile, non-transitory, or non-volatile tangible computer-readable media, e.g., during execution or at other times. Examples of such tangible computer-readable media may include, but are not limited to, hard disks, removable magnetic disks, removable optical disks (e.g., compact disks and digital video disks), magnetic cassettes, memory cards or sticks, random access memories (RAMs), read only memories (ROMs), and the like.

Moreover, although illustrative embodiments have been described herein, the scope includes any and all embodiments having equivalent elements, modifications, omissions, combinations (e.g., of aspects across various embodiments), adaptations or alterations based on the present disclosure. The elements in the claims are to be interpreted broadly based on the language employed in the claims and not limited to examples described in the specification or during the prosecution of the application. Further, the steps of the disclosed methods may be modified in any manner, including by reordering steps or inserting or deleting steps. It is intended, therefore, that the description be regarded as illustrative only, with a true scope being indicated by the following claims and their full scope of equivalents.
