Three-dimensional integrated chip, construction method thereof, data processing method and electronic equipment

Document No.: 274551 Publication date: 2021-11-19

Reading note: The present technology, "Three-dimensional integrated chip, construction method thereof, data processing method and electronic equipment", was designed by 周小锋 on 2021-10-25. The invention relates to a three-dimensional integrated chip, a construction method thereof, a data processing method and electronic equipment. It relates to a high-performance computing and supercomputing system realized by three-dimensional heterogeneous integration of logic wafers or logic chips with memory wafers or memory chips. The system can achieve higher memory bandwidth, lower memory power consumption and lower latency, and memory capacity and memory bandwidth can be expanded by vertically stacking multiple layers of memory wafers.

1. A three-dimensional integrated chip is characterized in that the three-dimensional integrated chip at least comprises a memory unit and a logic unit,

the memory unit includes at least one memory array and at least one first three-dimensional integrated port, wherein:

the memory array comprises at least one memory bank;

each memory bank corresponds to a first three-dimensional integrated port;

the logic unit comprises at least one logic subunit, the logic subunit at least comprises a second three-dimensional integrated port, at least one computing core and a memory controller, wherein:

the second three-dimensional integrated port is connected with the at least one first three-dimensional integrated port, so that the memory unit and the logic unit are connected in a three-dimensional heterogeneous integration manner;

the computing core is configured to access the at least one memory bank and implement a corresponding computing function using the accessed data;

the memory controller is coupled to the computing core and selectively coupled to the second three-dimensional integrated port, the memory controller configured to control memory access between the computing core and the at least one memory bank via the coupling of the second three-dimensional integrated port and the at least one first three-dimensional integrated port.

2. The three-dimensional integrated chip of claim 1, wherein the memory controller is configured to enable access to the corresponding memory bank by the compute core through the connection of the second three-dimensional integrated port and the first three-dimensional integrated port corresponding to the access address based on the access address in the access instruction received from the compute core.

3. The three-dimensional integrated chip of claim 1 or 2, wherein the logic unit further comprises:

a test and repair unit selectively connected with the second three-dimensional integration port, the test and repair unit configured to test and repair the at least one memory bank via connection of the second three-dimensional integration port and the first three-dimensional integration port.

4. The three-dimensional integrated chip of claim 3, wherein the logic unit further comprises:

a multiplexer switch connecting the memory controller and the test and repair unit, the multiplexer switch configured to selectively connect the memory controller with the second three-dimensional integrated port to enable memory access or the test and repair unit with the second three-dimensional integrated port to enable testing and repair.

5. The three-dimensional integrated chip according to claim 1 or 2, wherein the three-dimensional integrated chip comprises at least two memory arrays and at least two logic sub-units, and the logic sub-units further comprise at least one routing unit configured to connect the at least two logic sub-units;

wherein the routing unit in the source logical subunit is configured to implement, based on a target access address in the access instruction received from the computation core in the source logical subunit, access to the corresponding target access address by the computation core in the source logical subunit at least through a connection of the routing unit in the source logical subunit and the routing unit in the target logical subunit.

6. The three-dimensional integrated chip of claim 5, wherein the target access address is an address of a compute core in a target logical subunit, wherein

The routing unit in the source logical subunit is configured to enable access of the compute core in the source logical subunit to the compute core in the target logical subunit through at least a connection of the routing unit in the source logical subunit and the routing unit in the target logical subunit based on an address of the compute core in the target logical subunit in the access instruction received from the compute core in the source logical subunit.

7. The three-dimensional integrated chip of claim 5, wherein the target access address is an address of a bank of a target memory array, wherein

The routing unit in the source logic subunit is configured to implement access of the computational core in the source logic subunit to the memory bank of the target memory array at least through connection of the routing unit in the source logic subunit, the routing unit in the logic subunit corresponding to the target memory array, and the second three-dimensional integration port to the first three-dimensional integration port corresponding to the memory bank of the target memory array based on the address of the memory bank of the target memory array in the access instruction received from the computational core in the source logic subunit.

8. The three-dimensional integrated chip of claim 1 or 2, wherein the memory unit further comprises:

at least one Error Correction Code (ECC) unit, wherein each of the at least one memory bank has an ECC unit configured to detect and correct errors in the data stored in that memory bank.

9. The three-dimensional integrated chip of claim 1, wherein the logic subunit further comprises:

a buffer module configured to convert the working voltage of the logic unit into the working voltage of the memory unit, or to convert the working voltage of the memory unit into the working voltage of the logic unit, when the logic unit performs storage access on the memory unit.

10. The three-dimensional integrated chip according to claim 1 or 2, wherein the logic unit is a high-performance computing logic unit or a supercomputing logic unit.

11. A method for constructing a three-dimensional integrated chip, wherein the three-dimensional integrated chip at least comprises a memory unit and a logic unit, the method comprising:

constructing the memory unit, the memory unit comprising at least one memory array and at least one first three-dimensional integrated port, wherein:

the memory array comprises at least one memory bank;

each memory bank corresponds to a first three-dimensional integrated port;

constructing the logic unit, wherein the logic unit comprises at least one logic subunit, and the logic subunit at least comprises a second three-dimensional integrated port, at least one computing core and a memory controller, wherein:

the second three-dimensional integrated port is connected with the at least one first three-dimensional integrated port;

the computing core is configured to access the at least one memory bank and implement a corresponding computing function using the accessed data;

the memory controller is coupled to the computing core and selectively coupled to the second three-dimensional integrated port, the memory controller configured to control memory access between the computing core and the at least one memory bank via the coupling of the second three-dimensional integrated port and the at least one first three-dimensional integrated port.

12. An electronic device, characterized in that it comprises a three-dimensional integrated chip according to any one of claims 1-10.

13. A data processing method of a three-dimensional integrated chip, wherein the data processing method is based on the three-dimensional integrated chip of any one of claims 1 to 10, and the method comprises:

receiving an access instruction from a computing core, and acquiring an access address based on the access instruction;

and controlling storage access between the computing core and the at least one memory bank through the connection of the second three-dimensional integrated port and the at least one first three-dimensional integrated port.

Technical Field

The present invention relates to the field of memory, and more particularly to High Performance Computing (HPC) and supercomputing systems based on three-dimensional integrated circuits (3D-IC). More particularly, the present invention relates to a three-dimensional integrated chip, a method of constructing the same, a data processing method, and an electronic device.

Background

High Bandwidth Memory (HBM) is known in the prior art and is a type of memory implemented by stacked memory technology. Fig. 1 shows a system configuration diagram of a high bandwidth memory in the prior art. In high bandwidth memory, memory dies are first stacked and then connected together using Through-Silicon Vias (TSVs), providing higher bandwidth to the CPU/GPU by increasing the number and speed of memory access I/Os. The high bandwidth memory standard is implemented in a 2.5D package, with the I/O of the high bandwidth memory interconnected to the I/O of the CPU/GPU through a silicon interposer. Compared with conventional memory, which is interconnected to the CPU/GPU through a PCB, this interposer-based interconnection has significant advantages in speed and power consumption.

The prior-art high bandwidth memory solution has two main drawbacks. First, in a high bandwidth memory, the Dynamic Random Access Memory (DRAM) is interconnected with the GPU/CPU/SoC using micro-bumps and a Redistribution Layer (RDL). However, the pitch of the micro-bumps is about 40 microns. The resulting integration density is low, which limits further increases in I/O count and bandwidth. Second, interconnecting stacked memory dies by means of through-silicon vias creates large parasitic capacitances and thermal resistances. These large parasitic parameters result in high transmission delay and high power consumption, limiting further improvements in system bandwidth and power consumption.

Therefore, there is a need to solve the above technical problems in the prior art.

Disclosure of Invention

The present invention relates to a system that can achieve higher memory bandwidth, lower memory power consumption, and lower access latency. High-performance computing systems and supercomputing computing systems are realized by three-dimensional heterogeneous integrated logic wafers or logic chips and memory wafers or memory chips. In addition, expansion of memory capacity and memory bandwidth may also be achieved by stacking multiple layers of memory wafers or memory chips vertically.

According to a first aspect of the present invention, there is provided a three-dimensional integrated chip comprising at least a memory unit and a logic unit,

the memory unit includes at least one memory array and at least one first three-dimensional integrated port, wherein:

the memory array comprises at least one memory bank;

each memory bank corresponds to a first three-dimensional integrated port;

the logic unit comprises at least one logic subunit, the logic subunit at least comprises a second three-dimensional integrated port, at least one computing core and a memory controller, wherein:

the second three-dimensional integrated port is connected with the at least one first three-dimensional integrated port, so that the memory unit and the logic unit are connected in a three-dimensional heterogeneous integration manner;

the computing core is configured to access the at least one memory bank and implement a corresponding computing function using the accessed data;

the memory controller is coupled to the computing core and selectively coupled to the second three-dimensional integrated port, the memory controller configured to control memory access between the computing core and the at least one memory bank via the coupling of the second three-dimensional integrated port and the at least one first three-dimensional integrated port.

According to a preferred embodiment of the three-dimensional integrated chip of the present invention, the memory controller implements access to the corresponding memory banks by the computation core through connection of the second three-dimensional integrated port and the first three-dimensional integrated port corresponding to the access address based on the access address in the access instruction received from the computation core.

According to a preferred embodiment of the three-dimensional integrated chip of the present invention, the logic unit further comprises:

a test and repair unit selectively connected with the second three-dimensional integration port, the test and repair unit configured to test and repair the at least one memory bank via connection of the second three-dimensional integration port and the first three-dimensional integration port.

According to a preferred embodiment of the three-dimensional integrated chip of the present invention, the logic unit further comprises:

a multiplexer switch connecting the memory controller and the test and repair unit, the multiplexer switch configured to selectively connect the memory controller with the second three-dimensional integrated port to enable memory access or the test and repair unit with the second three-dimensional integrated port to enable testing and repair.

According to a preferred embodiment of the three-dimensional integrated chip of the present invention, the three-dimensional integrated chip comprises at least two memory arrays and at least two logic subunits, and the logic subunits further comprise at least one routing unit configured to connect the at least two logic subunits;

wherein the routing unit in the source logical subunit is configured to implement, based on a target access address in the access instruction received from the computation core in the source logical subunit, access to the corresponding target access address by the computation core in the source logical subunit at least through a connection of the routing unit in the source logical subunit and the routing unit in the target logical subunit.

According to a preferred embodiment of the three-dimensional integrated chip of the present invention, the target access address is an address of a computing core in a target logical subunit, wherein

The routing unit in the source logical subunit is configured to enable access of the compute core in the source logical subunit to the compute core in the target logical subunit through at least a connection of the routing unit in the source logical subunit and the routing unit in the target logical subunit based on an address of the compute core in the target logical subunit in the access instruction received from the compute core in the source logical subunit.

According to a preferred embodiment of the three-dimensional integrated chip of the present invention, the target access address is an address of a bank of a target memory array, wherein

The routing unit in the source logic subunit is configured to implement access of the computational core in the source logic subunit to the memory bank of the target memory array at least through connection of the routing unit in the source logic subunit, the routing unit in the logic subunit corresponding to the target memory array, and the second three-dimensional integration port to the first three-dimensional integration port corresponding to the memory bank of the target memory array based on the address of the memory bank of the target memory array in the access instruction received from the computational core in the source logic subunit.

According to a preferred embodiment of the three-dimensional integrated chip of the present invention, the memory unit further includes:

at least one Error Correction Code (ECC) unit, wherein each of the at least one memory bank has an ECC unit configured to detect and correct errors in the data stored in that memory bank.

According to a preferred embodiment of the three-dimensional integrated chip of the present invention, the logic subunit further comprises: a buffer module configured to convert the working voltage of the logic unit into the working voltage of the memory unit, or to convert the working voltage of the memory unit into the working voltage of the logic unit, when the logic unit performs storage access on the memory unit.

According to a preferred embodiment of the three-dimensional integrated chip of the present invention, the logic unit is a high-performance computing logic unit or a supercomputing logic unit.

According to a second aspect of the present invention, there is provided a method of constructing a three-dimensional integrated chip including at least a memory unit and a logic unit, the method comprising:

constructing the memory unit, the memory unit comprising at least one memory array and at least one first three-dimensional integrated port, wherein:

the memory array comprises at least one memory bank;

each memory bank corresponds to a first three-dimensional integrated port;

constructing the logic unit, wherein the logic unit comprises at least one logic subunit, and the logic subunit at least comprises a second three-dimensional integrated port, at least one computing core and a memory controller, wherein:

the second three-dimensional integrated port is connected with the at least one first three-dimensional integrated port;

the computing core is configured to access the at least one memory bank and implement a corresponding computing function using the accessed data;

the memory controller is coupled to the computing core and selectively coupled to the second three-dimensional integrated port, the memory controller configured to control memory access between the computing core and the at least one memory bank via the coupling of the second three-dimensional integrated port and the at least one first three-dimensional integrated port.

According to a third aspect of the invention, there is provided an electronic device comprising the three-dimensional integrated chip according to the first aspect of the invention.

According to a fourth aspect of the present invention, there is provided a data processing method of a three-dimensional integrated chip, the data processing method being based on the three-dimensional integrated chip provided in the first aspect, the method comprising: receiving an access instruction from a computing core, and acquiring an access address based on the access instruction; and controlling storage access between the computing core and the at least one memory bank through the connection of the second three-dimensional integrated port and the at least one first three-dimensional integrated port.

Drawings

The invention will be more readily understood by the following description in conjunction with the accompanying drawings, in which:

Fig. 1 shows a system configuration diagram of a high bandwidth memory in the prior art.

Fig. 2 schematically shows an embodiment of a three-dimensional integrated chip according to the invention.

Fig. 3 schematically shows another embodiment of a three-dimensional integrated chip according to the invention.

Fig. 4 schematically shows an embodiment of a logic cell of a three-dimensional integrated chip according to the invention.

Figure 5 schematically shows a typical memory cell used in the present invention.

Figure 6 schematically shows an embodiment of a memory network according to the invention.

Detailed Description

Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.

Fig. 2 schematically shows an embodiment of a three-dimensional integrated chip according to the invention.

The three-dimensional integrated chip shown in fig. 2 includes one logic wafer 210 and two memory wafers 220. The logic wafer 210 is a wafer for implementing logic functions, and the memory wafer 220 is a wafer for implementing memory functions.

As shown in Fig. 2, the logic wafer 210 and the two memory wafers 220 are stacked and integrated in a vertical direction. The "vertical direction" herein refers to the thickness direction of the logic wafer 210 and the memory wafers 220.

In addition, adjacent wafers among the two memory wafers 220 and the logic wafer 210 are bonded together, so that three-dimensional heterogeneous integration of the logic wafer 210 and the two memory wafers 220 is realized.

Specifically, the memory wafer 220 near the logic wafer 210 (i.e., the lower memory wafer 220 in Fig. 2) is connected directly to the logic wafer 210 by a hybrid bonding technique (shown schematically in Fig. 2). The memory wafer 220 remote from the logic wafer 210 (i.e., the upper memory wafer 220 in Fig. 2) is first connected by the hybrid bonding technique (shown schematically in Fig. 2) to the lower memory wafer 220, then passes through the through-silicon vias (TSVs) 240 of the lower memory wafer 220 (shown schematically in Fig. 2), and is finally connected to the logic wafer 210 by the hybrid bonding between the lower memory wafer 220 and the logic wafer 210.

Thus, the logic wafer 210 enables independent storage access to the upper memory wafer 220 and the lower memory wafer 220 in fig. 2.
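The connection path described above for Fig. 2 can be sketched as a small model. This is only an illustrative sketch: the function name `path_to_logic`, the layer numbering and the hop labels are hypothetical and are not part of the invention as described.

```python
# Hypothetical model of the Fig. 2 stack. Index 0 is the logic wafer 210;
# indices 1 and 2 are the lower and upper memory wafers 220.
STACK = ["logic_wafer_210", "memory_wafer_220_lower", "memory_wafer_220_upper"]


def path_to_logic(layer_index: int) -> list[str]:
    """Return the interconnect hops from a memory wafer down to the logic wafer.

    layer_index 1 is the memory wafer adjacent to the logic wafer.
    """
    if not 1 <= layer_index < len(STACK):
        raise ValueError("layer_index must name a memory wafer in the stack")
    hops = []
    for i in range(layer_index, 0, -1):
        hops.append("hybrid_bond")   # bond between layer i and layer i - 1
        if i > 1:
            hops.append("tsv_240")   # pass through the TSVs of layer i - 1
    return hops


print(path_to_logic(1))
print(path_to_logic(2))
```

Under these assumptions the lower memory wafer needs a single hybrid bond, while the upper memory wafer needs a hybrid bond, the TSVs of the lower wafer, and a second hybrid bond, which matches the path described for Fig. 2.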

As can also be seen in fig. 2, the logic wafer 210 is connected to the substrate 230 by a Bump (Bump) process. Bump processes include, but are not limited to, BOPCOA, BOAC, HOTROD as known in the art.

Additionally, Fig. 2 depicts a logic wafer and memory wafers. Wafers are the basic raw material for manufacturing semiconductor devices: semiconductor material of extremely high purity is made into wafers through processes such as crystal pulling and slicing. A wafer is processed through a series of semiconductor manufacturing steps to form tiny circuit structures, and is then cut, packaged and tested to form chips. The invention is not limited to logic wafers and memory wafers; the logic wafer 210 may also be a logic chip, and the memory wafer 220 may also be a memory chip. Preferably, the memory wafer 220 may be a dynamic random access memory wafer or a dynamic random access memory chip.

Fig. 2 shows two memory wafers 220, which together form a two-layer memory wafer stack. It should be understood that, to meet bandwidth and storage capacity requirements, the memory wafers 220 may form more layers, including but not limited to 4 layers, 6 layers, and the like. In addition, a plurality of memory wafers may be disposed in the same layer of the multi-layer memory wafer stack. For example, two memory wafers may be disposed in the same memory wafer layer, the two memory wafers being independent of each other.

In addition, the multi-layer memory wafer may be disposed on two sides of the logic wafer respectively. Fig. 3 schematically illustrates such an embodiment, taking two memory wafers 220 as an example. As shown in fig. 3, one memory wafer 220 is disposed above the logic wafer 210, and another memory wafer 220 is disposed below the logic wafer 210. Therefore, the memory access path from the logic wafer 210 to the memory wafer 220 is shortened, and the memory access efficiency is improved.

Specifically, the memory wafers 220 on both sides of the logic wafer 210 are connected directly to the logic wafer 210 by a hybrid bonding technique (shown schematically in Fig. 3). The logic wafer 210 is first connected to the lower memory wafer 220 by the hybrid bonding technique (shown schematically in Fig. 3), then routed through the through-silicon vias (TSVs) 240 of the lower memory wafer 220 (shown schematically in Fig. 3), and then connected to the substrate 230 by a bump process.

Fig. 4 schematically shows an embodiment of a logic cell of a three-dimensional integrated chip according to the invention. Fig. 5 schematically illustrates a typical memory cell (i.e., memory array) used in the present invention. The three-dimensional integrated chip at least comprises a logic unit as shown in fig. 4 and a memory unit as shown in fig. 5, and the logic unit and the memory unit are stacked and integrated, preferably by bonding or more preferably by hybrid bonding.

As shown in Fig. 4, the logic unit includes at least a compute core 410 and a memory controller 420 connected to the compute core 410. The logic unit also includes input/output three-dimensional heterogeneous integrated port regions (i.e., hybrid bonding ports, shown schematically as two rows in the figure), through which the logic unit is connected with a memory unit (e.g., the memory unit shown in Fig. 5). In Fig. 4, these input/output three-dimensional heterogeneous integrated port regions are located in a buffer module 460 (the box containing the two rows). The illustration in Fig. 4 is only schematic, and the input/output three-dimensional heterogeneous integrated port regions may be located in other modules. The buffer module 460 performs voltage conversion: when the logic unit performs a storage access to the memory unit, it converts the operating voltage of the logic unit into the operating voltage of the memory unit, or converts the operating voltage of the memory unit into the operating voltage of the logic unit.

Fig. 5 schematically shows a typical memory unit used in the present invention. The memory unit shown in Fig. 5 also includes input/output three-dimensional heterogeneous integrated port regions (i.e., hybrid bonding ports, likewise shown schematically as two rows in the figure), through which logic units (e.g., the logic unit shown in Fig. 4) are connected with the memory unit.

For the logic unit of Fig. 4 and the memory unit of Fig. 5, the three-dimensional heterogeneous integrated port regions of the logic unit (i.e., hybrid bonding ports, shown schematically as two rows in the figure) are connected to the corresponding three-dimensional heterogeneous integrated port regions of the memory unit (i.e., hybrid bonding ports, likewise shown schematically as two rows in the figure). It should be noted that the memory unit in Fig. 5 has 8 memory banks; each memory bank has a storage space of 128 Mb and a data bit width of 128 bits, and each memory bank corresponds to an independent three-dimensional heterogeneous integrated port region. The three-dimensional heterogeneous integrated port regions of the logic unit in Fig. 4 correspond to the three-dimensional heterogeneous integrated port regions of the memory unit in Fig. 5. That is, the logic unit in Fig. 4 is connected to the 8 memory banks of the memory unit in Fig. 5, so that the logic unit can perform storage access to all 8 memory banks.

In addition, 8 banks are shown in fig. 5. It should be understood that to meet bandwidth and storage capacity requirements, the memory banks in a memory unit may also be 16, 32, etc., and the storage space per memory bank may also be 64Mb, 32Mb, etc. The invention is not limited in this respect.
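As a worked example of the figures quoted for Fig. 5, the following sketch computes the total capacity and the combined data width of one memory unit. It assumes binary megabits (1 Mb = 1024 × 1024 bits), which the text does not state, and the function names are hypothetical.

```python
MBIT = 1024 * 1024  # bits per Mb; binary megabits are an assumption


def unit_capacity_bits(num_banks: int, bank_mbit: int) -> int:
    """Total data capacity of a memory unit, in bits."""
    return num_banks * bank_mbit * MBIT


def aggregate_width_bits(num_banks: int, bank_width_bits: int) -> int:
    """Combined data width when every bank is accessed through its own port."""
    return num_banks * bank_width_bits


capacity = unit_capacity_bits(8, 128)    # 8 banks of 128 Mb each
print(capacity // MBIT, "Mb")            # 1024 Mb
print(capacity // (8 * MBIT), "MiB")     # 128 MiB
print(aggregate_width_bits(8, 128))      # 1024 bits across the 8 ports
```

The same functions cover the variants mentioned above, for example `unit_capacity_bits(16, 64)` for 16 banks of 64 Mb. Converting the combined width into a bandwidth figure would additionally require an interface clock rate, which the text does not give.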

Furthermore, Fig. 5 also shows that each memory bank includes an error correction code (ECC) unit with a capacity of 8 Mb. These ECC units store additional check bits used to detect and correct errors in the data stored in each memory bank. The ECC units may be implemented using ECC algorithms known in the art. In addition, the storage capacity of the ECC unit is not limited to 8 Mb; it may vary depending on the bandwidth and storage capacity of the memory bank, the ECC algorithm employed, and so on.
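The ratio between ECC storage and data storage can be worked out directly from the numbers above. The sketch below also computes the minimum number of check bits a single-error-correcting Hamming code needs for a given data word. The text names no ECC algorithm or code word size, so the Hamming calculation and the 128-bit word are purely illustrative assumptions.

```python
from fractions import Fraction


def ecc_overhead(ecc_mbit: int, data_mbit: int) -> Fraction:
    """Ratio of ECC storage to data storage for one memory bank."""
    return Fraction(ecc_mbit, data_mbit)


def hamming_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


overhead = ecc_overhead(8, 128)      # 8 Mb of ECC per 128 Mb bank
print(overhead)                      # 1/16
print(128 * overhead)                # 8 check bits per 128 data bits
print(hamming_check_bits(128))       # 8
```

Under these assumptions, 8 Mb of ECC storage per 128 Mb bank is a 1/16 overhead, which is the budget a single-error-correcting Hamming code over 128-bit words would need. A different algorithm or word size would change the required ECC capacity, consistent with the remark that the ECC capacity is not limited to 8 Mb.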

Referring back to Fig. 4, in the normal operating mode of the three-dimensional integrated chip, the compute core 410 in the logic unit has direct access to the memory banks connected to it through the three-dimensional heterogeneous integrated port regions.

For example, suppose the computing core 410 receives an access instruction whose access address is the 128 Mb memory bank in the upper left corner of the memory unit shown in Fig. 5. Based on the access address, the memory controller 420 accesses the upper-left memory bank through the physical connection port formed by the three-dimensional heterogeneous integrated port region of the logic unit and the corresponding three-dimensional heterogeneous integrated port region of the upper-left memory bank. The access instruction is, for example, a read instruction or a write instruction. When the access instruction is a read instruction, the memory controller 420 reads data from the upper-left memory bank through the physical connection port formed by these two port regions. When the access instruction is a write instruction, it also carries the data to be written, and the memory controller 420 writes that data into the upper-left memory bank through the same physical connection port.

As another example, suppose the computing core 410 receives an access instruction whose access address is the 128 Mb memory bank in the upper right corner of the memory unit shown in Fig. 5. Based on the access address, the memory controller 420 accesses the upper-right memory bank through the physical connection port formed by the three-dimensional heterogeneous integrated port region of the logic unit and the corresponding three-dimensional heterogeneous integrated port region of the upper-right memory bank. The access instruction is, for example, a read instruction or a write instruction. When the access instruction is a read instruction, the memory controller 420 reads data from the upper-right memory bank through the physical connection port formed by these two port regions. When the access instruction is a write instruction, it also carries the data to be written, and the memory controller 420 writes that data into the upper-right memory bank through the same physical connection port.

Therefore, in the three-dimensional integrated chip of the invention, the logic unit can perform independent memory access on different memory banks of the memory unit, and the memory access processes among different memory banks are not interfered with each other. Therefore, the three-dimensional integrated chip of the invention can further improve the bandwidth and the access efficiency.
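The address-based bank selection described above can be illustrated with a minimal software sketch. The class names, the word-addressed layout and the bank-major address mapping (bank index = address divided by words per bank) are assumptions made for illustration; the text does not specify how addresses map to memory banks.

```python
from dataclasses import dataclass, field

BANK_BITS = 128 * 1024 * 1024             # 128 Mb per bank (binary Mb assumed)
WORD_BITS = 128                           # 128-bit data width per bank
WORDS_PER_BANK = BANK_BITS // WORD_BITS   # words addressable in one bank


@dataclass
class Bank:
    """One memory bank behind its own port region (sparse storage)."""
    words: dict[int, int] = field(default_factory=dict)


class MemoryController:
    """Hypothetical controller that picks a bank port from the access address."""

    def __init__(self, num_banks: int = 8):
        self.banks = [Bank() for _ in range(num_banks)]

    def decode(self, address: int) -> tuple[int, int]:
        # Assumed mapping: consecutive word addresses fill one bank at a time.
        bank, offset = divmod(address, WORDS_PER_BANK)
        if not 0 <= bank < len(self.banks):
            raise ValueError("address outside the memory unit")
        return bank, offset

    def read(self, address: int) -> int:
        bank, offset = self.decode(address)
        return self.banks[bank].words.get(offset, 0)

    def write(self, address: int, data: int) -> None:
        if not 0 <= data < (1 << WORD_BITS):
            raise ValueError("data does not fit the 128-bit bank width")
        bank, offset = self.decode(address)
        self.banks[bank].words[offset] = data


mc = MemoryController()
mc.write(0, 0xAB)                  # lands in bank 0
mc.write(WORDS_PER_BANK, 0xCD)     # lands in bank 1, through a different port
print(mc.decode(WORDS_PER_BANK), hex(mc.read(0)), hex(mc.read(WORDS_PER_BANK)))
```

Because each bank object has its own storage and its own decoded port, an access to one bank does not touch the state of any other bank, which mirrors the independence between memory banks described above.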

In addition, as shown in FIG. 4, the logic unit further includes a test and repair unit 440 for testing and repairing at least one memory cell. In addition, the logic unit also includes a multiplexer switch 450.

In the normal operating mode of the three-dimensional integrated chip, the multiplexer switch 450 connects the computing core 410 and the memory controller 420 to the three-dimensional heterogeneous integrated port region, so that at least one memory bank of at least one memory unit can be accessed directly. The specific procedure is as described above.

In the test mode of the three-dimensional integrated chip, the multiplexer switch 450 connects the test and repair unit 440 to the three-dimensional heterogeneous integrated port region in order to test and repair at least one memory bank of at least one memory unit.

Specifically, when the memory needs to be tested and repaired, the multiplexer switch 450 disconnects the path from the computing core 410 and the memory controller 420 to the three-dimensional heterogeneous integrated port region and connects the path between the test and repair unit 440 and that region. The test and repair unit 440 then tests and repairs each memory bank of the memory unit. For example, the test and repair unit 440 tests and repairs the upper-left memory bank of the memory unit through the physical connection port formed by the three-dimensional heterogeneous integrated port region of the logic unit and the three-dimensional heterogeneous integrated port region corresponding to that bank; as another example, it tests and repairs the upper-right memory bank of the memory unit in the same way, through the port region corresponding to the upper-right bank.
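The mode switching performed by the multiplexer switch can be sketched as follows. This is a behavioral illustration only: the names `Mode`, `MultiplexerSwitch`, `select`, `owner` and `forward` are hypothetical, and the sketch assumes that exactly one requester is connected to the port region at any time, which is how the text above reads.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"   # computing core and memory controller own the port region
    TEST = "test"       # test and repair unit owns the port region


class MultiplexerSwitch:
    """Connects exactly one requester to the 3D integrated port region."""

    def __init__(self):
        self.mode = Mode.NORMAL

    def select(self, mode):
        """Switch the path; the previously connected block is disconnected."""
        self.mode = mode

    def owner(self):
        """Name of the block currently connected to the port region."""
        if self.mode is Mode.NORMAL:
            return "memory_controller"
        return "test_and_repair"

    def forward(self, requester, request):
        """Pass a request through only if the requester owns the path."""
        if requester != self.owner():
            raise PermissionError(
                f"{requester} is disconnected in {self.mode.value} mode"
            )
        return (requester, request)


mux = MultiplexerSwitch()
normal_result = mux.forward("memory_controller", "read bank upper_left")
mux.select(Mode.TEST)
test_result = mux.forward("test_and_repair", "test bank upper_left")
```

In this model a request from the memory controller raises `PermissionError` while the switch is in test mode, mirroring the disconnected path described above.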

In addition, as shown in fig. 4, the logic unit further includes a routing unit 430, which connects the multiplexer switch 450 and the three-dimensional heterogeneous integrated port region. The routing unit 430 is used to interconnect a plurality of memories to form a system on chip.

Figure 6 schematically shows an embodiment of a memory network, i.e. a system on chip, according to the invention. In the system on chip shown in fig. 6, a plurality of logic subunits and a plurality of memory subunits are integrated via routing units. Each logic subunit in fig. 6 may be constituted by the logic unit in fig. 4, and each memory subunit in fig. 6 may be constituted by the memory unit in fig. 5. The system on chip is formed by interconnecting the routing units located in the logic subunits.

The ability of the compute core 410 in the logic unit to directly access the memory banks connected through its three-dimensional heterogeneous integrated port region was discussed above with respect to fig. 4 and 5. When the system on chip shown in fig. 6 is formed, the compute core in one logic subunit of the three-dimensional integrated chip can also indirectly access, through the routing unit 430, the compute core in another logic subunit or the memory bank to which another logic subunit belongs.

For example, when the compute core in the upper-left logic subunit shown in fig. 6 (hereinafter referred to as the "first logic subunit") receives an access instruction, and the access address carried by the access instruction is a memory bank of the memory unit belonging to the upper-right logic subunit shown in fig. 6 (hereinafter referred to as the "second logic subunit"), the memory controller of the first logic subunit, based on the access address, accesses the memory bank corresponding to the access address through the access path formed by the routing unit in the first logic subunit (hereinafter referred to as the "first routing unit"), the routing unit in the middle logic subunit of the first row (hereinafter referred to as the "intermediate routing unit") and the routing unit in the second logic subunit (hereinafter referred to as the "second routing unit"), and then through the physical connection port formed by the three-dimensional heterogeneous integrated port region in the second logic subunit and the three-dimensional heterogeneous integrated port region of the memory bank corresponding to the access address.

For another example, when the compute core in the first logic subunit shown in fig. 6 receives an access instruction, and the access address carried by the access instruction points to the compute core in the second logic subunit shown in fig. 6, the memory controller in the first logic subunit, based on the access address, accesses the compute core in the second logic subunit through the access path formed by the first routing unit in the first logic subunit, the intermediate routing unit in the intermediate logic subunit, the second routing unit in the second logic subunit, and the memory controller.
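The access path through the first, intermediate and second routing units can be reproduced with a small routing sketch. The patent does not specify a routing algorithm; the dimension-order (XY) scheme below, the `(row, column)` coordinates and the function name `xy_route` are assumptions chosen because they yield the path described above for a mesh such as the one in fig. 6.

```python
def xy_route(source, target):
    """Return the (row, col) routing units visited from source to target.

    Dimension-order routing: move along the row first (changing the column
    index), then along the column (changing the row index).
    """
    row, col = source
    target_row, target_col = target
    path = [(row, col)]
    while col != target_col:
        col += 1 if target_col > col else -1
        path.append((row, col))
    while row != target_row:
        row += 1 if target_row > row else -1
        path.append((row, col))
    return path


# Assumed coordinates: first logic subunit at the upper left, second logic
# subunit at the upper right of a mesh with three columns.
first, second = (0, 0), (0, 2)
path = xy_route(first, second)
# path is [(0, 0), (0, 1), (0, 2)]:
# first routing unit, intermediate routing unit, second routing unit
```

After the last hop, the request would leave the routing network and reach either the target compute core or, via the second logic subunit's port region, the addressed memory bank.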

Fig. 6 shows a memory network with a mesh topology only as a schematic example. The memory networks of the present invention include, but are not limited to, bus, star, ring, tree, mesh, and hybrid topologies.

Applications of the present invention may include, but are not limited to, high performance computing systems, supercomputing systems, and the like.

When the logic wafer 210 is a high performance computing logic wafer, the obtained corresponding three-dimensional integrated chip is a three-dimensional integrated high performance computing chip.

In the prior art, the performance of high performance computing systems is limited by memory bandwidth and power consumption. However, in the invention, since the high-performance computing logic wafer and the memory wafer are stacked and integrated in a hybrid bonding mode, and the hybrid bonding mode can realize higher-density integration, the realization of higher memory bandwidth is facilitated; in addition, since the high performance computing system of the present invention has a lower parasitic parameter (capacitance), lower memory access power consumption can be achieved. In addition, since the memory wafers can also be stacked vertically in multiple layers, expansion of memory capacity and memory bandwidth can be further achieved.

Therefore, the three-dimensional integrated chip obtained by the invention can realize higher memory bandwidth, higher memory capacity and lower memory access power consumption, and can meet the requirements of a high-performance computing system.

In addition, the different memory banks in the memory array connected to the three-dimensional heterogeneous integrated port region of the high-performance computation logic unit (i.e., the memory array to which the high-performance computation core belongs) each have their own corresponding three-dimensional heterogeneous integrated port region. The high-performance computation core can therefore access the different memory banks in its memory array independently, the access processes of different memory banks do not interfere with one another, and access bandwidth and access efficiency can be remarkably improved.

For example, when the high-performance computing core receives an access instruction, and an access address carried by the access instruction is a memory bank at the upper right corner of the memory unit shown in fig. 5, the memory controller in the high-performance logic unit accesses the memory bank at the upper right corner through a physical connection port formed by the three-dimensional heterogeneous integrated port region of the high-performance logic unit and the three-dimensional heterogeneous integrated port region corresponding to the memory bank at the upper right corner based on the access address. For another example, when the high-performance computing core receives the access instruction and the access address carried by the access instruction is the lower left bank in the memory unit shown in fig. 5, the memory controller in the high-performance logic unit accesses the lower left bank through the physical connection port formed by the three-dimensional heterogeneous integrated port region of the high-performance logic unit and the three-dimensional heterogeneous integrated port region corresponding to the lower left bank based on the access address. That is to say, since the upper right and lower left banks have corresponding three-dimensional heterogeneous integrated port regions, the high-performance computation core can perform the storage access of the upper right bank and the lower left bank in parallel and independently, which can greatly improve the access bandwidth and the access efficiency.
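The benefit of giving each memory bank its own port region can be made concrete with a toy timing model. The model is an assumption, not something stated in the patent: every access occupies a port for exactly one cycle, a shared port serves one access per cycle, and per-bank ports serve one access per bank per cycle. The function names are hypothetical.

```python
from collections import Counter


def cycles_shared_port(requests):
    """All banks behind one port: accesses are fully serialized."""
    return len(requests)


def cycles_per_bank_ports(requests):
    """One port per bank: only accesses to the same bank are serialized."""
    per_bank = Counter(requests)
    return max(per_bank.values(), default=0)


# Each entry names the bank targeted by one access instruction.
requests = ["upper_right", "lower_left", "upper_right", "lower_left"]
shared = cycles_shared_port(requests)        # 4 cycles
parallel = cycles_per_bank_ports(requests)   # 2 cycles
```

Under these assumptions the four accesses finish in two cycles instead of four, because the upper-right and lower-left banks are served at the same time.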

When the logic wafer 210 is a supercomputing logic wafer, the obtained corresponding three-dimensional integrated chip is a three-dimensional integrated supercomputing chip.

In the prior art, the performance of a supercomputing system is limited by memory bandwidth and access latency. However, in the invention, the super-computation logic wafer and the memory wafer are stacked and integrated in a hybrid bonding mode, and the hybrid bonding mode can realize higher-density integration, so that higher memory bandwidth is realized; in addition, the distributed super-computation core improves the computation parallelism, thereby further improving the computation speed of the super-computation chip.

The super computation logic unit comprises a super computation core, an instruction cache, a data cache, a memory controller and a cache module. Similar to the above description, there may be an input-output three-dimensional heterogeneous integrated port region in the cache module, and the cache module is configured for voltage conversion. And the memory controller directly accesses the memory array to which the super computation logic unit belongs by connecting the input/output three-dimensional heterogeneous integrated port area in the cache module with the input/output three-dimensional heterogeneous integrated port area in the memory array.

For example, when the supercomputing computing core receives an access instruction, the supercomputing computing core first determines whether the access instruction and the access data are stored in the instruction cache and the data cache. If the access instruction and the access data are stored in the instruction cache and the data cache, the access results are returned from the instruction cache and the data cache to the supercomputing computational core. And if the access instruction and the access data are not stored in the instruction cache and the data cache, the memory controller accesses the storage array and returns an access result to the super computing core through a physical connection port formed by the input/output three-dimensional heterogeneous integrated port area in the cache module and the input/output three-dimensional heterogeneous integrated port area in the storage array based on an access address carried in the access instruction.

In the super computation logic unit, the caching strategies of the instruction cache and the data cache are established according to the access frequency or the writing time sequence of the super computation cores. And writing the access instruction and the access data into an instruction cache and a data cache or a storage array according to the cache policies of the instruction cache and the data cache. The presence of an instruction cache and a data cache in the supercomputing logic unit further increases the computation speed.
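The lookup flow and the two kinds of caching strategy mentioned above (by access frequency or by writing order) can be sketched as a small cache in front of the storage array. Everything here is illustrative: the class `SmallCache`, the policy names `"frequency"` and `"write_order"`, and the use of a Python dictionary as the storage array are assumptions, and only the data cache is modeled; an instruction cache would follow the same pattern.

```python
class SmallCache:
    """Data cache in front of a storage array, with a selectable eviction policy."""

    def __init__(self, capacity, policy, storage_array):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        if policy not in ("frequency", "write_order"):
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.storage_array = storage_array  # backing store: address -> data
        self.lines = {}   # address -> data; dicts keep insertion (write) order
        self.uses = {}    # address -> number of accesses while cached
        self.hits = 0
        self.misses = 0

    def _evict(self):
        if self.policy == "frequency":
            # Evict the least frequently accessed line.
            victim = min(self.lines, key=lambda address: self.uses[address])
        else:
            # Evict the line that was written into the cache first.
            victim = next(iter(self.lines))
        del self.lines[victim]
        del self.uses[victim]

    def read(self, address):
        if address in self.lines:           # hit: answer from the cache
            self.hits += 1
            self.uses[address] += 1
            return self.lines[address]
        self.misses += 1                    # miss: go to the storage array
        data = self.storage_array[address]
        if len(self.lines) >= self.capacity:
            self._evict()
        self.lines[address] = data
        self.uses[address] = 1
        return data


storage = {0: "a", 1: "b", 2: "c"}
by_frequency = SmallCache(2, "frequency", storage)
by_write_order = SmallCache(2, "write_order", storage)
for address in (0, 0, 1, 2):
    by_frequency.read(address)
    by_write_order.read(address)
```

With this access sequence the frequency policy keeps address 0, which was read twice, and evicts address 1, while the write-order policy evicts address 0 because it entered the cache first.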

In addition, in the case where the three-dimensional integrated supercomputing chip is formed in a manner similar to that shown in fig. 6, the supercomputing logic unit described above may form a supercomputing logic sub-unit of the three-dimensional integrated supercomputing chip. Similar to what is described with respect to fig. 6, a supercomputing computational core in one supercomputing-logic subunit can indirectly access a supercomputing computational core in another supercomputing-logic subunit or a memory bank to which the other supercomputing-logic subunit belongs through a routing unit.

For example, when the access instruction is received by the supercomputing computational core in the source supercomputing logic subunit, and the access address carried by the access instruction is the memory bank of the memory unit of the target supercomputing logic subunit, the memory controller of the source supercomputing logic subunit accesses the memory bank corresponding to the access address through the access path formed by the routing unit in the source supercomputing logic subunit, the routing unit in one or more intermediate supercomputing logic subunits, and the routing unit in the target supercomputing logic subunit, based on the access address, and through the physical connection port formed by the three-dimensional heterogeneous integrated port area in the target supercomputing logic subunit and the three-dimensional heterogeneous integrated port area of the memory bank corresponding to the access address.

For another example, when the super computation core in the source super computation logic subunit receives the access instruction and the access address carried by the access instruction is the super computation core in the target super computation logic subunit, the memory controller in the source super computation logic subunit accesses the super computation core in the target super computation logic subunit through an access path formed by the routing unit in the source super computation logic subunit, the intermediate routing unit in the one or more intermediate super computation logic subunits, the routing unit in the target super computation logic subunit, and the memory controller based on the access address.

By forming a supercomputing system on chip in a manner similar to that shown in fig. 6, each supercomputing core can, in the manner described above, independently access the storage array to which it belongs and compute independently, or access another supercomputing core or the storage array to which that core belongs via a routing unit, so that the computation speed is significantly increased.

The three-dimensional integrated chips proposed in the present invention are memory chips or memory dies (e.g., ROM, SDRAM, RAM, DRAM, SRAM, FLASH, EPROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices) for storing data and/or computer code. The three-dimensional integrated chip may be or include non-transitory volatile memory or non-volatile memory, etc.

The invention also provides a data processing method based on the three-dimensional integrated chip in any of the embodiments, and the method comprises the following steps: receiving an access instruction from a computing core, and acquiring an access address based on the access instruction; and controlling storage access between the computing core and the at least one storage body through the connection of the second three-dimensional integrated port and the at least one first three-dimensional integrated port. It can be understood that the data processing method further includes functions that can be implemented by the three-dimensional integrated chip, and details are not described herein.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. It is to be understood that the scope of the invention is defined by the claims.
