Fishbone-shaped clock tree and implementation method

Document No.: 322825 Publication date: 2021-11-30

Reading note: this technology, "Fishbone-shaped clock tree and implementation method" (一种鱼骨状时钟树及实现方法), was designed by 王锐 (Wang Rui), 关娜 (Guan Na), 李建军 (Li Jianjun), 莫军 (Mo Jun) and 王亚波 (Wang Yabo) on 2021-08-10. Abstract: The invention belongs to the technical field of clock trees and discloses a fishbone clock tree and an implementation method thereof. The fishbone clock tree comprises one main clock tree and a plurality of sub clock trees. The main clock tree is led out from the PLL, the sub clock trees are led out from the main clock tree, each sub clock tree is provided with a plurality of clock branch points, and each clock branch point serves as the source from which a traditional clock tree is established. Beneficial effects: establishing the main clock tree and the sub clock trees produces a fishbone-shaped clock tree structure, which reduces the buffer units between the PLL and the sub-modules of the chip and lowers chip power consumption; the main clock tree and the sub clock trees also act as a common path, so OCV consumes less of the clock period and final timing closure is easier.

1. A fishbone clock tree is characterized by comprising a main clock tree and a plurality of sub clock trees, wherein the main clock tree is led out from a PLL (phase locked loop), the sub clock trees are led out from the main clock tree, the sub clock trees are provided with a plurality of clock branching points, and the clock branching points are used as the source of a traditional clock tree to establish the traditional clock tree.

2. The fishbone clock tree of claim 1, wherein the main clock tree is disposed on a central axis of the chip.

3. The fishbone clock tree of claim 1, wherein the main clock tree includes a plurality of first buffer units connected in sequence, the sub-clock tree includes a plurality of second buffer units connected in sequence, and the sub-clock tree is led out from a connecting line between two adjacent first buffer units.

4. The fishbone clock tree of claim 3, wherein the first buffer units of the main clock tree are disposed outside the sub-modules of the chip, the second buffer units pass through the sub-modules of the chip, each sub-module that a sub-clock tree passes through includes a plurality of second buffer units, and a clock branch point is led out between two adjacent second buffer units.

5. A method for realizing a fishbone clock tree is characterized by comprising the following steps:

obtaining the sub-module layout of the chip;

leading out a main clock tree from the PLL of the chip and setting the main clock tree according to the layout of the sub-modules so that the main clock tree is positioned on the central axis of the chip;

leading out a plurality of sub-clock trees from the main clock tree, and enabling the sub-clock trees to be uniformly distributed on two sides of the main clock tree;

and leading out clock bifurcation points from the sub-clock trees, and establishing the traditional clock tree by taking the clock bifurcation points as the source of the traditional clock tree.

6. The method as claimed in claim 5, wherein the main clock tree includes a plurality of first buffer units connected in sequence, the sub clock tree includes a plurality of second buffer units connected in sequence, and the sub clock tree is derived from a connecting line between two adjacent first buffer units.

7. The method as claimed in claim 5, wherein the first buffer units of the main clock tree are disposed outside the sub-modules of the chip, the second buffer units pass through the sub-modules of the chip, each sub-module that a sub-clock tree passes through includes a plurality of second buffer units, and a clock branch point is led out between two adjacent second buffer units.

8. The method as claimed in claim 5, wherein the sub-module layout of the chip includes a single-layer sub-module layout and a double-layer sub-module layout.

9. The method as claimed in claim 8, wherein when the chip is in a single-layer sub-module layout, the sub-clock tree derived from the main clock tree divides the chip into a plurality of sub-regions, and the conventional clock tree is established at each sub-region by a clock branch point derived from the sub-clock tree.

10. The method as claimed in claim 8, wherein when the chip is configured as a dual-layer sub-module, a plurality of uniformly distributed sub-modules in a second layer are disposed on a sub-module in a first layer of the chip, a main clock tree is led out from the PLL of the chip and is disposed on a central axis of the sub-modules in the second layer, a plurality of sub-clock trees are led out from the main clock tree and pass through the sub-modules in the second layer, so that the sub-modules in the second layer that are located on the same layer and the same column of the main clock tree are located on the same sub-clock tree, and a clock bifurcation point is disposed in the sub-modules in the second layer and serves as a source of a conventional clock tree to dispose the conventional clock tree.

Technical Field

The invention relates to the technical field of clock trees, in particular to a fishbone clock tree and an implementation method thereof.

Background

A clock tree is a network built from balanced buffer units (buffer/inverter cells). It has a source point, generally a clock input port or possibly the output pin of a cell inside the design, and is built up level by level from buffer units. Key factors for measuring the quality of a clock tree include: clock tree length, clock tree common path, clock transition time, clock skew, clock tree noise, and clock duty cycle.

The clock tree construction scheme is a very important step in the back-end physical design of a chip, and the quality of the clock tree directly affects the chip's power consumption and running speed. High-compute chips pursue both extreme speed and extreme power efficiency, while Internet of Things chips pursue extremely low power consumption. For any chip, however, constructing a clock tree with few levels, a long common path, low noise interference and easy timing closure is a problem on which back-end designers must spend considerable time and effort.

When a traditional clock tree is built, one buffer unit is first led out from the PLL (phase-locked loop); two next-level buffer units are then placed with that buffer unit as their source, third-level buffer units are placed below each next-level buffer unit, and the process is repeated until the buffer units reach the sub-modules of the chip. With this traditional method, the number of clock tree levels from the PLL to each sub-module is the same: a sub-module close to the PLL has as many levels as one far from it, and synchronous timing checks exist among the sub-modules.
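The level count of such a tree can be estimated with a few lines of Python. This is a minimal sketch, assuming a fully populated tree in which every buffer drives exactly two buffers and each last-level buffer serves one sub-module; the function name and the figure of 36 sub-modules are illustrative assumptions, not values from the patent.

```python
def balanced_tree_shape(num_sinks: int) -> tuple[int, int]:
    """Return (levels, total_buffers) for a 1-to-2 buffer tree.

    Level 1 is the single buffer led out from the PLL; level k holds
    2**(k - 1) buffers. The tree is grown until the last level has at
    least num_sinks buffers (assumption: one sub-module per leaf buffer).
    """
    if num_sinks < 1:
        raise ValueError("need at least one sink")
    levels = (num_sinks - 1).bit_length() + 1   # ceil(log2(n)) splits plus the root
    total_buffers = 2 ** levels - 1             # 1 + 2 + 4 + ... + 2**(levels - 1)
    return levels, total_buffers


# Hypothetical chip with 36 sub-modules: every sub-module, near or far,
# sits behind the same number of buffer levels.
levels, total = balanced_tree_shape(36)
print(levels, total)
```

Every sub-module pays for the full depth regardless of its distance from the PLL, which is the cost the following paragraphs criticize.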

Those skilled in the art know that too many clock buffers make the clock tree long and power-hungry, tend to congest channel routing, consume more routing resources and increase clock tree noise. Too many buffer units also cause the clock tree to branch early and leave little common path, so OCV (on-chip variation) consumes more of the clock period and final timing closure is difficult.

Therefore, the existing clock tree construction method needs to be improved to reduce chip power consumption and the difficulty of timing closure.

Disclosure of Invention

The purpose of the invention is to provide a novel clock tree and an implementation method that reduce chip power consumption and the difficulty of timing closure.

In order to achieve the above object, the present invention provides a fishbone clock tree, which includes a main clock tree and several sub clock trees, wherein the main clock tree is led out from a PLL, the sub clock trees are led out from the main clock tree, the sub clock trees are provided with several clock branching points, and the clock branching points are used as the source of a traditional clock tree to establish a traditional clock tree.

Further, the main clock tree is arranged on a central axis of the chip.

Furthermore, the main clock tree comprises a plurality of first buffer units connected in sequence, the sub clock tree comprises a plurality of second buffer units connected in sequence, and the sub clock tree is led out from a connecting line between two adjacent first buffer units.

Furthermore, the first buffer units of the main clock tree are arranged outside the sub-modules of the chip, the second buffer units pass through the sub-modules of the chip, each sub-module that a sub-clock tree passes through comprises a plurality of second buffer units, and a clock bifurcation point is led out between every two adjacent second buffer units.

The invention also discloses a method for realizing the fishbone clock tree, which comprises the following steps:

Acquiring the sub-module layout of the chip.

Leading out a main clock tree from the PLL of the chip and setting the main clock tree according to the layout of the sub-modules so that the main clock tree is positioned on the central axis of the chip.

Leading out several sub-clock trees from the main clock tree, with the sub-clock trees uniformly distributed on two sides of the main clock tree.

Leading out clock bifurcation points from the sub-clock trees, and establishing the traditional clock tree by taking the clock bifurcation points as the source of the traditional clock tree.

Furthermore, the main clock tree comprises a plurality of first buffer units connected in sequence, the sub clock tree comprises a plurality of second buffer units connected in sequence, and the sub clock tree is led out from a connecting line between two adjacent first buffer units.

Furthermore, the first buffer units of the main clock tree are arranged outside the sub-modules of the chip, the second buffer units pass through the sub-modules of the chip, each sub-module that a sub-clock tree passes through comprises a plurality of second buffer units, and a clock bifurcation point is led out between every two adjacent second buffer units.

Furthermore, the sub-module layout of the chip comprises a single-layer sub-module layout and a double-layer sub-module layout.

Furthermore, when the chip is in a single-layer sub-module layout, the sub-clock tree led out from the main clock tree divides the chip into a plurality of sub-regions, and a traditional clock tree is established at each sub-region through a clock branch point led out from the sub-clock tree.

Furthermore, when the chip is in a double-layer sub-module layout, a plurality of second-layer sub-modules which are uniformly distributed are arranged on a first-layer sub-module of the chip, a main clock tree is led out from a PLL of the chip and is arranged at the central axis position of the second-layer sub-modules, a plurality of sub-clock trees are led out from the main clock tree and pass through the second-layer sub-modules, so that second-layer sub-modules which are positioned on the same layer and the same column of the main clock tree are positioned on the same sub-clock tree, clock branching points are arranged in the second-layer sub-modules, and the clock branching points are used as the source of the traditional clock tree to arrange the traditional clock tree.

Compared with the prior art, the fishbone clock tree and the implementation method thereof have the following beneficial effects: establishing the main clock tree and the sub-clock trees produces a fishbone clock tree structure, which reduces the buffer units between the PLL and the sub-modules of the chip and lowers chip power consumption; the main clock tree and the sub clock trees also act as a common path, so OCV consumes less of the clock period and final timing closure is easier.

Drawings

FIG. 1 is a diagram illustrating a conventional clock tree in the background of the present invention;

FIG. 2 is a schematic diagram of a fishbone clock tree for a single-level sub-module layout according to the present invention;

FIG. 3 is a schematic diagram of a two-layer sub-module layout of the chip of the present invention;

FIG. 4 is a schematic diagram of a fishbone clock tree for a two-level sub-module layout chip according to the present invention;

FIG. 5 is a schematic diagram of a sub-clock tree structure with clock branching points according to the present invention.

In the figures: 1, main clock tree; 2, sub clock tree; 3, data flow direction.

Detailed Description

The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.

Example 1:

Referring to FIG. 2, FIG. 4 and FIG. 5, the invention discloses a fishbone clock tree, which comprises a main clock tree and a plurality of sub clock trees, wherein the main clock tree is led out from a PLL, the sub clock trees are led out from the main clock tree, the sub clock trees are provided with a plurality of clock bifurcation points, and the clock bifurcation points are used as the source of the traditional clock tree to establish the traditional clock tree.

In this embodiment, the main clock tree is disposed on a central axis of the chip. The main clock tree is placed on the central axis of the chip or at the center of the chip's sub-modules, so that the sub-modules are distributed as evenly as possible on both sides of the main clock tree, which shortens the physical wiring length and reduces the number of wires. Meanwhile, when the main clock tree is positioned on the central axis of the chip, the sub-clock trees can conveniently and uniformly connect the sub-modules together. The actual sub-module layout of the chip should be considered when setting the sub-clock trees, which are set according to the number and positions of the sub-modules.

In this embodiment, the main clock tree includes a plurality of first buffer units connected in sequence, the sub clock tree includes a plurality of second buffer units connected in sequence, and the sub clock tree is led out from a connection line between two adjacent first buffer units.

In this embodiment, the plurality of first buffer units on the main clock tree form a main clock tree path, and the plurality of second buffer units on the sub-clock tree form a secondary clock tree path.

In this embodiment, the first buffer units of the main clock tree are disposed outside the sub-modules of the chip, the second buffer units pass through the sub-modules of the chip, each sub-module through which a sub-clock tree passes includes a plurality of second buffer units, and a clock bifurcation point is led out between two adjacent second buffer units.

An alternative embodiment is: two second buffer units are arranged in the sub-module, and a clock bifurcation point is led out on the connecting line between the two second buffer units.

In this embodiment, establishing the main clock tree and the sub-clock trees produces the fishbone-shaped clock tree structure, which reduces the buffer units between the PLL and the sub-modules of the chip and lowers chip power consumption; the main clock tree and the sub clock trees also act as a common path, so OCV consumes less of the clock period and final timing closure is easier.
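The link between common path length and the OCV cost can be illustrated with a simplified derating model. This is a sketch under stated assumptions, not the patent's analysis: it assumes a setup check in which the non-common part of the launch clock path is scaled by a late derate, the non-common part of the capture clock path by an early derate, and the shared path contributes nothing. The function name, the derate factors and the delays are hypothetical.

```python
def ocv_clock_pessimism(launch_noncommon_ns: float,
                        capture_noncommon_ns: float,
                        late_derate: float = 1.05,
                        early_derate: float = 0.95) -> float:
    """Worst-case clock skew (ns) seen by a setup check under OCV derating.

    Only the non-common clock segments are derated; delay on the common
    path is shared by launch and capture and therefore cancels.
    """
    return launch_noncommon_ns * late_derate - capture_noncommon_ns * early_derate


# Hypothetical numbers: the same nominal latency, split differently.
short_common = ocv_clock_pessimism(1.0, 1.0)  # tree branches early
long_common = ocv_clock_pessimism(0.2, 0.2)   # tree branches late
print(short_common, long_common)
```

With equal nominal delays the penalty is proportional to the non-common delay, so moving delay into the common path directly reduces the share of the clock period lost to OCV.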

Example 2:

The invention also discloses a method for implementing the fishbone clock tree, which applies the fishbone clock tree of embodiment 1 to a chip and mainly comprises the following steps:

Step 1, obtaining the sub-module layout of the chip.

Step 2, leading out a main clock tree from the PLL of the chip and setting the main clock tree according to the layout of the sub-modules so that the main clock tree is positioned on the central axis of the chip.

Step 3, leading out a plurality of sub-clock trees from the main clock tree, with the sub-clock trees uniformly distributed on two sides of the main clock tree.

Step 4, leading out clock bifurcation points from the sub-clock trees, and establishing the traditional clock tree by taking the clock bifurcation points as the source of the traditional clock tree.
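The four steps above can be sketched as a small data-structure builder. This is an illustrative model only: the class names, the buffer naming scheme, the "north"/"south" side labels and the ordering of the south-side sub-modules are assumptions made for the example, and the layout is passed in by hand instead of being read from a real floorplan.

```python
from dataclasses import dataclass, field


@dataclass
class SubClockTree:
    tap_between: tuple[str, str]  # adjacent first buffer units the tree is led out between
    side: str                     # which side of the main clock tree it serves
    branch_points: list[str] = field(default_factory=list)


@dataclass
class FishboneClockTree:
    main_buffers: list[str]
    sub_trees: list[SubClockTree]


def build_fishbone(layout: list[dict[str, list[str]]]) -> FishboneClockTree:
    """layout (step 1): one dict per column, listing sub-modules per side."""
    # Step 2: a chain of first buffer units along the central axis, from the PLL.
    main = [f"main_buf{i}" for i in range(len(layout) + 1)]
    sub_trees = []
    for i, column in enumerate(layout):
        for side in ("north", "south"):
            modules = column.get(side, [])
            if not modules:
                continue
            # Step 3: lead a sub clock tree out between two adjacent first buffers.
            # Step 4: one branch point per sub-module; each branch point would be
            # the source of a traditional clock tree inside that sub-module.
            sub_trees.append(SubClockTree(
                tap_between=(main[i], main[i + 1]),
                side=side,
                branch_points=[f"{m}/branch" for m in modules],
            ))
    return FishboneClockTree(main, sub_trees)


# One hypothetical column; sub-module names are borrowed from the description.
fishbone = build_fishbone([{"north": ["H12", "H6", "H0"],
                            "south": ["H18", "H24", "H30"]}])
print(fishbone.sub_trees[0])
```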

In step 1, since different chips have different numbers of sub-modules during design and the arrangement positions of the sub-modules are different, the actual sub-module layout of the chip should be considered when applying the fishbone clock tree to different chips.

In this embodiment, the sub-module layout of the chip includes a single-layer sub-module layout and a double-layer sub-module layout. FIG. 2 is a schematic diagram of a single-layer sub-module chip, and FIG. 3 is a schematic diagram of a double-layer sub-module chip.

In step 2, the main clock tree is placed on the central axis of the chip or at the center of the chip's sub-modules, so that the sub-modules are distributed as evenly as possible on both sides of the main clock tree, which shortens the physical wiring length and reduces the number of wires. Meanwhile, when the main clock tree is positioned on the central axis of the chip, the sub-clock trees can conveniently and uniformly connect the sub-modules together. The actual sub-module layout of the chip should be considered when setting the sub-clock trees, which are set according to the number and positions of the sub-modules.
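One simple way to pick an axis that balances the sub-modules on both sides is to take the median of their center coordinates. The patent does not prescribe this computation; the sketch below is an assumption, and the coordinates are hypothetical.

```python
from statistics import median


def central_axis(centers_y: list[float]) -> float:
    """Return a y coordinate with as many sub-module centers above as below."""
    return median(centers_y)


def split_by_axis(centers_y: list[float], axis_y: float) -> tuple[list[float], list[float]]:
    """Partition sub-module centers into those above and those below the axis."""
    north = [y for y in centers_y if y > axis_y]
    south = [y for y in centers_y if y < axis_y]
    return north, south


# Hypothetical sub-module center coordinates (arbitrary units).
centers = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
axis = central_axis(centers)
north, south = split_by_axis(centers, axis)
print(axis, len(north), len(south))
```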

In step 3, the main clock tree includes a plurality of first buffer units connected in sequence, the sub clock tree includes a plurality of second buffer units connected in sequence, and the sub clock tree is led out from a connecting line between two adjacent first buffer units.

In step 4, clock bifurcation points are led out from the sub-clock trees, and the clock bifurcation points are used as the source of the traditional clock tree to establish the traditional clock tree. Referring to fig. 5, a third buffer unit is disposed at a clock branch point led out from the sub-clock tree, and the third buffer unit is used as a source of the conventional clock tree to construct the conventional clock tree.

In this embodiment, the traditional clock tree is structured and set as follows: one buffer unit is chosen as the source, two next-level buffer units are placed under the source buffer unit, and third-level buffer units are placed under each next-level buffer unit; that is, two next-level buffer units are placed under each buffer unit of the previous level. Levels of buffer units are added in this way until all the chip sub-modules are connected.

In this embodiment, the first buffer units of the main clock tree are disposed outside the sub-modules of the chip, the second buffer units pass through the sub-modules of the chip, each sub-module through which a sub-clock tree passes includes a plurality of second buffer units, and a clock bifurcation point is led out between two adjacent second buffer units.
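The traditional tree that hangs off a branch point can be sketched as a recursive 1-to-2 split. This is a minimal illustration, assuming the sinks are split in halves at every level and that branches with no sinks are simply omitted; the function names, the dotted buffer naming and the sink names are hypothetical.

```python
def build_binary_clock_tree(source: str, sinks: list[str]) -> dict[str, list[str]]:
    """Return an adjacency map in which each buffer drives at most two children.

    Every sink ends up behind the same number of buffer levels, mirroring
    the equal-level property of the traditional clock tree.
    """
    splits = (len(sinks) - 1).bit_length() if sinks else 0
    tree: dict[str, list[str]] = {}

    def grow(driver: str, group: list[str], remaining: int) -> None:
        if remaining == 0:
            tree[driver] = list(group)  # last-level buffer drives its sink directly
            return
        mid = (len(group) + 1) // 2
        tree[driver] = []
        for suffix, half in (("0", group[:mid]), ("1", group[mid:])):
            if half:  # skip branches that would carry no sinks
                child = f"{driver}.{suffix}"
                tree[driver].append(child)
                grow(child, half, remaining - 1)

    grow(source, list(sinks), splits)
    return tree


def buffers_to(tree: dict[str, list[str]], node: str, sink: str, count: int = 1) -> int:
    """Count the buffers between the source (inclusive) and a sink; 0 if absent."""
    for child in tree.get(node, []):
        if child == sink:
            return count
        found = buffers_to(tree, child, sink, count + 1)
        if found:
            return found
    return 0


sinks = [f"ff{i}" for i in range(5)]
tree = build_binary_clock_tree("buffer4", sinks)
depths = [buffers_to(tree, "buffer4", s) for s in sinks]
print(depths)
```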

Since embodiment 2 builds on embodiment 1, the technical features they share are not described again.

Example 3:

On the basis of embodiment 2 and referring to FIG. 2, when the chip has a single-layer sub-module layout, the sub-clock trees led out from the main clock tree divide the chip into a plurality of sub-regions, and a conventional clock tree is established in each sub-region from a clock branch point led out from the sub-clock tree.

Example 4:

On the basis of embodiment 2, when the chip has a double-layer sub-module layout, a plurality of uniformly distributed second-layer sub-modules are arranged on a first-layer sub-module of the chip, a main clock tree is led out from the PLL of the chip and placed on the central axis of the second-layer sub-modules, and a plurality of sub-clock trees are led out from the main clock tree and pass through the second-layer sub-modules, so that the second-layer sub-modules which are positioned on the same layer and in the same column of the main clock tree are positioned on the same sub-clock tree. A clock branching point is arranged in each second-layer sub-module and is used as the source of a traditional clock tree to arrange the traditional clock tree.

Referring to FIG. 3 and FIG. 4, the double-layer sub-module layout of the chip is as follows: second-layer sub-modules are placed on the first-layer sub-module, which is the large box in FIG. 4. There are a plurality of second-layer sub-modules.

In this embodiment, when building the fishbone clock tree, the main clock tree is placed on the central axis of the second-layer sub-modules, and the sub-clock trees are then led out at the positions where the second-layer sub-modules are arranged and pass through those sub-modules. Reference numeral 1 in FIG. 4 is the main clock tree, which is led directly from the PLL and runs laterally across the middle of the chip while satisfying the clock transition time requirement. Reference numeral 2 is a sub-clock tree, which is led out from the main clock tree and passes through the sub-modules. The main clock tree and the sub-clock trees together form the fishbone clock tree structure.
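Grouping the second-layer sub-modules into sub-clock trees can be sketched as a bucketing step by column and by side of the main clock tree. The grid below is an assumption made for illustration: FIG. 4 is not reproduced here, so the 6 x 6 arrangement, the numbering rule H(6 * row + col) and the axis position are hypothetical and chosen only to be consistent with the sub-module names used in the description.

```python
from collections import defaultdict


def group_into_sub_trees(positions: dict[str, tuple[int, int]],
                         axis_row: float) -> dict[tuple[int, str], list[str]]:
    """Map (column, side) to its sub-modules, nearest the main clock tree first.

    positions maps a sub-module name to (column, row); rows below axis_row
    are labelled "north", the others "south" (labels are an assumption).
    """
    groups: dict[tuple[int, str], list[tuple[float, str]]] = defaultdict(list)
    for name, (col, row) in positions.items():
        side = "north" if row < axis_row else "south"
        groups[(col, side)].append((abs(row - axis_row), name))
    return {key: [name for _, name in sorted(members)]
            for key, members in groups.items()}


# Hypothetical 6 x 6 grid of second-layer sub-modules, axis between rows 2 and 3.
positions = {f"H{6 * row + col}": (col, row) for row in range(6) for col in range(6)}
sub_trees = group_into_sub_trees(positions, axis_row=2.5)
print(sub_trees[(0, "north")], sub_trees[(0, "south")])
```

Each resulting list is one sub-clock tree: the sub-modules it passes through, in the order the clock reaches them.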

In this embodiment, reference numeral 3 in FIG. 4 denotes the data flow direction; the data is generated by the chip's main control logic. For example: data generated by the main control is processed by sub-module H12 on the north side, flows into H6 for processing, and then enters H0. After H0 processes it twice, the data is sent back to H6, which processes it again and sends it back to H12; after H12 processes it, the data returns to the chip's main control unit, completing one round of data processing. FIG. 4 shows 12 such data processing links.

Referring to FIG. 5, the clock common path of the flip-flops in sub-module H12 is buffer0 and buffer4. A conventional clock tree is constructed after buffer4 (in the same way as in FIG. 1), so the non-common clock paths of the flip-flops inside H12 are confined to the H12 sub-module, and the whole path from the PLL to buffer4 is clock common path. The clock common path of the flip-flops in the H6 module is buffer0, buffer1, buffer2 and buffer5. Similarly, a conventional clock tree is constructed after buffer5, the non-common clock paths of the flip-flops inside H6 are confined to the H6 sub-module, and the whole path from the PLL to buffer5 is clock common path. For flip-flops that interact between sub-modules H12 and H6, the clock common path reaches only buffer0, and the clock skew between the flip-flops in H12 and H6 is the delay of buffer1 + buffer2. This interface is fully considered in the logic design stage, so the timing margin is sufficient.
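The common path between two clock endpoints is the shared prefix of their buffer chains. The short sketch below uses the buffer names given for FIG. 5; the function name is an assumption.

```python
def common_clock_path(path_a: list[str], path_b: list[str]) -> list[str]:
    """Return the buffers shared by two clock paths, starting from the PLL side."""
    common = []
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common.append(a)
    return common


# Buffer chains from the PLL to the branch points of H12 and H6 (names from FIG. 5).
h12_path = ["buffer0", "buffer4"]
h6_path = ["buffer0", "buffer1", "buffer2", "buffer5"]

common = common_clock_path(h12_path, h6_path)
h12_only = h12_path[len(common):]
h6_only = h6_path[len(common):]
print(common, h12_only, h6_only)
```

The two paths diverge after buffer0. If buffer4 and buffer5 contribute similar delay (an assumption, since both are the last buffer before a conventional clock tree), the remaining difference is buffer1 + buffer2, which matches the skew stated above.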

If the chip structure in FIG. 4 used a conventional clock tree to connect the PLL directly to the sub-modules, the connection would be as in FIG. 1. The number of clock tree levels from the PLL to every sub-module would be the same, although sub-modules H0, H6, H12, H18, H24 and H30, which are physically close to the PLL, do not need that many clock tree buffers given their physical distance. Meanwhile, synchronous timing checks exist among H0, H6, H12, H18, H24 and H30. Too many clock buffers make the clock tree long and power-hungry, tend to congest channel routing, consume more routing resources and increase clock tree noise. They also cause the clock tree to branch early and leave little common path, so OCV consumes more of the clock period and final timing closure is difficult.

To sum up, the embodiments of the invention provide a fishbone clock tree and an implementation method thereof. Compared with the prior art, the beneficial effects are as follows: establishing the main clock tree and the sub-clock trees produces a fishbone clock tree structure, which reduces the buffer units between the PLL and the sub-modules of the chip and lowers chip power consumption; the main clock tree and the sub clock trees also act as a common path, so OCV consumes less of the clock period and final timing closure is easier.

The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and substitutions without departing from the technical principle of the present invention, and these modifications and substitutions should also be regarded as falling within the protection scope of the present invention.
