AMD Instinct

{{Short description|Brand of data center GPUs by AMD}}

{{Use American English|date=December 2022}}

{{Use mdy dates|date=December 2022}}

{{Infobox graphics processing unit

| name = AMD Instinct

| image = AMD Radeon Instinct wordmark.svg

| created = {{Start date and age|2017|June|20}}

| designfirm = AMD

| marketed_by = AMD

| codename =

| architecture = {{Unbulleted list|GCN 3|GCN 4|GCN 5|CDNA|CDNA 2|CDNA 3}}

| model = MI Series

| fab =

| transistors1 = 5.7B (Polaris 10) 14 nm

| transistors2 = 8.9B (Fiji) 28 nm

| transistors3 = 12.5B (Vega 10) 14 nm

| transistors4 = 13.2B (Vega 20) 7 nm

| transistors5 = 25.6B (Arcturus) 7 nm

| transistors6 = 58.2B (Aldebaran) 6 nm

| transistors7 = 146B (Antares) 5 nm

| transistors8 = 153B (Aqua Vanjaram) 5 nm

| entry =

| midrange =

| highend =

| enthusiast =

| d3dversion =

| openclversion =

| openglversion =

| predecessor = {{Unbulleted list|AMD FirePro|Radeon Sky series}}

| variant =

| successor =

| numcores = 36–304 Compute Units (CUs)}}

AMD Instinct is AMD's brand of data center GPUs.{{cite news |last=Smith |first=Ryan |date=December 12, 2016 |title=AMD Announces Radeon Instinct: GPU Accelerators for Deep Learning, Coming in 2017 |url=http://www.anandtech.com/show/10905/amd-announces-radeon-instinct-deep-learning-2017 |access-date=December 12, 2016 |publisher=Anandtech}}{{cite news |last=Shrout |first=Ryan |date=December 12, 2016 |title=Radeon Instinct Machine Learning GPUs include Vega, Preview Performance |url=https://www.pcper.com/reviews/Graphics-Cards/Radeon-Instinct-Machine-Learning-GPUs-include-Vega-Preview-Performance |access-date=December 12, 2016 |publisher=PC Per}} It replaced AMD's FirePro S brand in 2016. Compared to the Radeon brand of mainstream consumer/gamer products, the Instinct product line is intended to accelerate deep learning, artificial neural network, and high-performance computing/GPGPU applications.

The AMD Instinct product line directly competes with Nvidia's Tesla and Intel's Xeon Phi and Data Center GPU lines of machine learning and GPGPU cards.

The brand was originally known as AMD Radeon Instinct, but AMD dropped the Radeon name before the AMD Instinct MI100 was introduced in November 2020.

In June 2022, supercomputers based on AMD's Epyc CPUs and Instinct GPUs took the lead on the Green500 list of the most power-efficient supercomputers, holding the top four spots with more than a 50% efficiency lead over any other system.{{Cite web |title=Green500 Release June 2022 |url=https://www.top500.org/lists/green500/2022/06/ |access-date=2024-05-09 |publisher=TOP500}} One of them, the AMD-based Frontier, has been the fastest supercomputer in the world on the TOP500 list since June 2022, a position it still held as of November 2023.{{Cite web |title=Top500 Release June 2022 |url=https://www.top500.org/lists/top500/2022/06/ |access-date=2024-05-09 |publisher=TOP500}}{{Cite web |title=Top500 Release November 2023 |url=https://www.top500.org/lists/top500/2023/11/ |access-date=2024-05-09 |publisher=TOP500}}

Products

[[File:Больше не майнинговые Играем на AMD BC-160 и Instinct Mi50 (Мой Компьютер) 25.png|thumb]]

{| class="wikitable"
|+ AMD Instinct GPU generations
|-
! rowspan="2" | Accelerator
! rowspan="2" | Launch date
! rowspan="2" | Architecture
! rowspan="2" | Lithography
! rowspan="2" | Compute Units
! colspan="3" | Memory
! rowspan="2" | PCIe support
! rowspan="2" | Form factor
! colspan="8" | Processing power
! rowspan="2" | TBP
|-
! Size
! Type
! Bandwidth (GB/s)
! FP16
! BF16
! FP32
! FP32 matrix
! FP64
! FP64 matrix
! INT8
! INT4
|-
! MI6
| rowspan="3" | 2016-12-12{{Cite web |last=Smith |first=Ryan |title=AMD Announces Radeon Instinct: GPU Accelerators for Deep Learning, Coming In 2017 |url=https://www.anandtech.com/show/10905/amd-announces-radeon-instinct-deep-learning-2017 |access-date=2024-06-03 |website=www.anandtech.com}}
| GCN 4
| 14 nm
| 36
| 16 GB
| GDDR5
| 224
| rowspan="3" | 3.0
| rowspan="7" | PCIe
| 5.7 TFLOPS
| rowspan="5" | N/A
| 5.7 TFLOPS
| rowspan="5" | N/A
| 358 GFLOPS
| rowspan="6" | N/A
| rowspan="3" | N/A
| rowspan="6" | N/A
| 150 W
|-
! MI8
| GCN 3
| 28 nm
| rowspan="2" | 64
| 4 GB
| HBM
| 512
| 8.2 TFLOPS
| 8.2 TFLOPS
| 512 GFLOPS
| 175 W
|-
! MI25
| rowspan="3" | GCN 5
| 14 nm
| rowspan="2" | 16 GB
| rowspan="4" | HBM2
| 484
| 24.6 TFLOPS
| 12.3 TFLOPS
| 768 GFLOPS
| 300 W
|-
! MI50
| rowspan="2" | 2018-11-06{{Cite web |last=Smith |first=Ryan |title=AMD Announces Radeon Instinct MI60 & MI50 Accelerators: Powered By 7nm Vega |url=https://www.anandtech.com/show/13562/amd-announces-radeon-instinct-mi60-mi50-accelerators-powered-by-7nm-vega |access-date=2024-06-03 |website=www.anandtech.com}}
| rowspan="3" | 7 nm
| 60
| rowspan="2" | 1024
| rowspan="6" | 4.0
| 26.5 TFLOPS
| 13.3 TFLOPS
| 6.6 TFLOPS
| 53 TOPS
| 300 W
|-
! MI60
| 64
| rowspan="2" | 32 GB
| 29.5 TFLOPS
| 14.7 TFLOPS
| 7.4 TFLOPS
| 59 TOPS
| 300 W
|-
! MI100
| 2020-11-16
| CDNA
| 120
| 1200
| 184.6 TFLOPS
| 92.3 TFLOPS
| 23.1 TFLOPS
| 46.1 TFLOPS
| 11.5 TFLOPS
| 184.6 TOPS
| 300 W
|-
! MI210
| 2022-03-22{{Cite web |last=Smith |first=Ryan |title=AMD Releases Instinct MI210 Accelerator: CDNA 2 On a PCIe Card |url=https://www.anandtech.com/show/17326/amd-releases-instinct-mi210-accelerator-cdna-2-on-a-pcie-card |access-date=2024-06-03 |website=www.anandtech.com}}
| rowspan="3" | CDNA 2
| rowspan="3" | 6 nm
| 104
| 64 GB
| rowspan="3" | HBM2E
| 1600
| colspan="2" | 181 TFLOPS
| 22.6 TFLOPS
| 45.3 TFLOPS
| 22.6 TFLOPS
| 45.3 TFLOPS
| colspan="2" | 181 TOPS
| 300 W
|-
! MI250
| rowspan="2" | 2021-11-08{{Cite web |last=Smith |first=Ryan |title=AMD Announces Instinct MI200 Accelerator Family: Taking Servers to Exascale and Beyond |url=https://www.anandtech.com/show/17054/amd-announces-instinct-mi200-accelerator-family-cdna2-exacale-servers |access-date=2024-06-03 |website=www.anandtech.com}}
| 208
| rowspan="2" | 128 GB
| rowspan="2" | 3200
| rowspan="2" | OAM
| colspan="2" | 362.1 TFLOPS
| 45.3 TFLOPS
| 90.5 TFLOPS
| 45.3 TFLOPS
| 90.5 TFLOPS
| colspan="2" | 362.1 TOPS
| 560 W
|-
! MI250X
| 220
| colspan="2" | 383 TFLOPS
| 47.9 TFLOPS
| 95.7 TFLOPS
| 47.9 TFLOPS
| 95.7 TFLOPS
| colspan="2" | 383 TOPS
| 560 W
|-
! MI300A
| rowspan="2" | 2023-12-06{{Cite web |last1=Smith |first1=Ryan |last2=Bonshor |first2=Gavin |title=The AMD Advancing AI & Instinct MI300 Launch Live Blog (Starts at 10am PT/18:00 UTC) |url=https://www.anandtech.com/show/21181/the-amd-advancing-ai-live-blog-starts-at-10am-pt1800-utc |access-date=2024-06-03 |website=www.anandtech.com}}
| rowspan="3" | CDNA 3
| rowspan="3" | 6 & 5 nm
| 228
| 128 GB
| rowspan="2" | HBM3
| rowspan="2" | 5300
| rowspan="3" | 5.0
| APU SH5 socket
| colspan="2" | 980.6 TFLOPS<br />1961.2 TFLOPS (with sparsity)
| colspan="2" | 122.6 TFLOPS
| 61.3 TFLOPS
| 122.6 TFLOPS
| 1961.2 TOPS<br />3922.3 TOPS (with sparsity)
| N/A
| 550 W<br />760 W (with liquid cooling)
|-
! MI300X
| rowspan="2" | 304
| 192 GB
| rowspan="2" | OAM
| colspan="2" rowspan="2" | 1307.4 TFLOPS<br />2614.9 TFLOPS (with sparsity)
| colspan="2" rowspan="2" | 163.4 TFLOPS
| rowspan="2" | 81.7 TFLOPS
| rowspan="2" | 163.4 TFLOPS
| rowspan="2" | 2614.9 TOPS<br />5229.8 TOPS (with sparsity)
| rowspan="2" | N/A
| rowspan="2" | 750 W
|-
! MI325X
| 2024-10-10{{Cite web |last=Smith |first=Ryan |title=AMD Plans Massive Memory Instinct MI325X for Q4'24, Lays Out Accelerator Roadmap to 2026 |url=https://www.anandtech.com/show/21422/amd-instinct-mi325x-reveal-and-cdna-architecture-roadmap-computex |access-date=2024-06-03 |website=www.anandtech.com}}
| 256 GB
| HBM3E
| 6000
|}

The three initial Radeon Instinct products were announced on December 12, 2016, and released on June 20, 2017, with each based on a different architecture.{{cite web |author=WhyCry |date=December 12, 2016 |title=AMD announces first VEGA accelerator:RADEON INSTINCT MI25 for deep-learning |url=https://videocardz.com/64677/amd-announces-first-vega-accelerator-radeon-instinct-mi25-for-deep-learning |website=VideoCardz |access-date=June 6, 2022}}{{cite web |last=Mujtaba |first=Hassan |date=June 21, 2017 |title=AMD Radeon Instinct MI25 Accelerator With 16 GB HBM2 Specifications Detailed – Launches Today Along With Instinct MI8 and Instinct MI6 |url=https://wccftech.com/amd-radeon-instinct-mi25-mi8-mi6-graphics-accelerators/ |website=Wccftech |access-date=June 6, 2022}}

= MI6 =

The MI6 is a passively cooled, Polaris 10-based card with 16 GB of GDDR5 memory and a TDP under 150 W. At 5.7 TFLOPS (FP16 and FP32), the MI6 is intended primarily for inference rather than neural-network training. Its peak double-precision (FP64) compute performance is 358 GFLOPS.{{cite web |title=Radeon Instinct MI6 |url=http://instinct.radeon.com/product/mi/radeon-instinct-mi6/ |website=Radeon Instinct |publisher=AMD |access-date=June 22, 2017 }}{{Dead link|date=July 2023 |bot=InternetArchiveBot |fix-attempted=yes }}

= MI8 =

The MI8 is a Fiji-based card, analogous to the R9 Nano, with 4 GB of High Bandwidth Memory and a TDP under 175 W. At 8.2 TFLOPS (FP16 and FP32), the MI8 is marketed toward inference. Its peak double-precision (FP64) compute performance is 512 GFLOPS.{{cite web |title=Radeon Instinct MI8 |url=http://instinct.radeon.com/product/mi/radeon-instinct-mi8/ |website=Radeon Instinct |publisher=AMD |access-date=June 22, 2017 }}{{Dead link|date=July 2023 |bot=InternetArchiveBot |fix-attempted=yes }}

= MI25 =

The MI25 is a Vega-based card using HBM2 memory. It delivers 12.3 TFLOPS using FP32 numbers. In contrast to the MI6 and MI8, the MI25 increases performance when using lower-precision numbers, reaching 24.6 TFLOPS using FP16 numbers. The MI25 is rated at a TDP under 300 W with passive cooling. It also provides 768 GFLOPS of peak double-precision (FP64) performance at 1/16 the FP32 rate.{{cite web |title=Radeon Instinct MI25 |url=http://instinct.radeon.com/product/mi/radeon-instinct-mi25/ |website=Radeon Instinct |publisher=AMD |access-date=June 22, 2017 }}{{Dead link|date=July 2023 |bot=InternetArchiveBot |fix-attempted=yes }}
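These precision ratios can be sanity-checked arithmetically: GCN 3 (Fiji) and GCN 4 (Polaris) execute FP16 at the same rate as FP32, GCN 5 (Vega) doubles FP16 throughput via packed math, and all three run FP64 at 1/16 the FP32 rate. A minimal Python sketch of the check (FP32 figures are the ones quoted above; small discrepancies, such as 356 vs. the quoted 358 GFLOPS for the MI6, arise because the FP32 inputs are themselves rounded):

```python
# Peak-throughput ratios for the first three Radeon Instinct cards.
# FP32 figures are the quoted values; FP16 and FP64 are derived from
# each architecture's per-CU execution-rate ratios.

CARDS = {
    # name: (fp32_tflops, fp16_ratio); FP64 is 1/16 of FP32 on all three
    "MI6 (Polaris 10)": (5.7, 1),   # GCN 4: FP16 at the FP32 rate
    "MI8 (Fiji)":       (8.2, 1),   # GCN 3: FP16 at the FP32 rate
    "MI25 (Vega 10)":   (12.3, 2),  # GCN 5: packed math doubles FP16
}

for name, (fp32, fp16_ratio) in CARDS.items():
    fp16 = fp32 * fp16_ratio         # TFLOPS
    fp64 = fp32 / 16 * 1000          # GFLOPS, at 1/16 rate
    print(f"{name}: FP16 {fp16:.1f} TFLOPS, FP64 {fp64:.0f} GFLOPS")
```

The derived values agree with the quoted figures to within the rounding of the FP32 inputs.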

= MI300 series =

[[File:超级核显来袭!AMD新品现场体验 (2160p 60fps VP9-128kbit AAC)-00.06.51.561.png|thumb]]

The MI300A and MI300X are data center accelerators that use the CDNA 3 architecture, which is optimized for high-performance computing (HPC) and generative artificial intelligence (AI) workloads. The CDNA 3 architecture features a scalable chiplet design that leverages TSMC’s advanced packaging technologies, such as CoWoS (chip-on-wafer-on-substrate) and InFO (integrated fan-out), to combine multiple chiplets on a single interposer. The chiplets are interconnected by AMD’s Infinity Fabric, which enables high-speed and low-latency data transfer between the chiplets and the host system.

The MI300A is an accelerated processing unit (APU) that integrates 24 Zen 4 CPU cores with six CDNA 3 GPU chiplets totaling 228 CUs, plus 128 GB of HBM3 memory. The Zen 4 CPU cores are built on the 5 nm process node and support the x86-64 instruction set, as well as the AVX-512 and BFloat16 extensions; they can run general-purpose applications and provide host-side computation for the GPU chiplets. The MI300A has a peak performance of 61.3 TFLOPS of FP64 (122.6 TFLOPS FP64 matrix) and 980.6 TFLOPS of FP16 (1961.2 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. The MI300A supports PCIe 5.0 and CXL 2.0 interfaces, which allow it to communicate with other devices and accelerators in a heterogeneous system.
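The MI300A's headline figures are internally consistent with its shader configuration. As a hedged illustration (the ~2.1 GHz peak engine clock and the per-clock throughput ratios below are assumptions consistent with the quoted totals, not values stated in this article):

```python
# Sketch: deriving MI300A peak throughput from its configuration.
# Assumptions: ~2.1 GHz peak engine clock; 2 FLOPs per FP64 FMA;
# matrix units at 2x the FP64 vector rate; FP16 at 16x the FP64
# vector rate. AMD's datasheet quotes the TFLOPS figures directly.

CUS = 228          # compute units (MI300A GPU section)
SPS_PER_CU = 64    # stream processors per CU
CLOCK_HZ = 2.1e9   # assumed peak engine clock

fp64_vector = CUS * SPS_PER_CU * CLOCK_HZ * 2 / 1e12  # TFLOPS
fp64_matrix = fp64_vector * 2
fp16 = fp64_vector * 16

print(f"FP64 vector: {fp64_vector:.1f} TFLOPS")  # ~61.3
print(f"FP64 matrix: {fp64_matrix:.1f} TFLOPS")  # ~122.6
print(f"FP16:        {fp16:.1f} TFLOPS")         # ~980.6
```

The results reproduce the quoted 61.3 / 122.6 / 980.6 TFLOPS figures to one decimal place.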

The MI300X is a dedicated generative AI accelerator that replaces the CPU cores with additional GPU chiplets and HBM memory, resulting in a total of 304 CUs (64 stream processors per CU) and 192 GB of HBM3 memory. The MI300X is designed to accelerate generative AI applications, such as natural language processing, computer vision, and deep learning. The MI300X has a peak performance of 653.7 TFLOPS of TF32 (1307.4 TFLOPS with sparsity) and 1307.4 TFLOPS of FP16 (2614.9 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. The MI300X also supports PCIe 5.0 and CXL 2.0 interfaces, as well as AMD's ROCm software stack, which provides a unified programming model and tools for developing and deploying generative AI applications on AMD hardware.{{cite web |title= AMD CDNA 3 Architecture|url=https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf |website=AMD CDNA Architecture|publisher=AMD|access-date=December 7, 2023}}{{cite web |title= AMD INSTINCT MI300A APU|url=https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/data-sheets/amd-instinct-mi300a-data-sheet.pdf |website=AMD Instinct Accelerators|publisher=AMD|access-date=December 7, 2023}}{{cite web |title= AMD INSTINCT MI300X APU|url=https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/data-sheets/amd-instinct-mi300x-data-sheet.pdf |website=AMD Instinct Accelerators|publisher=AMD|access-date=December 7, 2023}}
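Similarly, the MI300X's 192 GB capacity and 5.3 TB/s bandwidth decompose into per-stack HBM3 figures. The stack count (eight) and ~5.2 Gbit/s pin speed in this sketch are assumptions chosen to be consistent with the quoted totals, not numbers taken from this article:

```python
# Sketch: MI300X HBM3 capacity and bandwidth from per-stack figures.
# Assumed: 8 stacks of 24 GB each, a 1024-bit interface per stack,
# and ~5.2 Gbit/s per pin (consistent with the quoted 5.3 TB/s).

STACKS = 8
GB_PER_STACK = 24        # GB per HBM3 stack
PIN_SPEED_GBPS = 5.2     # Gbit/s per pin
BUS_WIDTH_BITS = 1024    # interface width per stack

capacity_gb = STACKS * GB_PER_STACK
bandwidth_gbs = STACKS * PIN_SPEED_GBPS * BUS_WIDTH_BITS / 8  # GB/s

print(capacity_gb)           # 192 GB
print(round(bandwidth_gbs))  # ~5325 GB/s, i.e. ~5.3 TB/s
```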

Software

{{Main|ROCm}}

= ROCm =

As of 2022, the following software components are grouped under the ROCm (Radeon Open Compute) meta-project.

== MxGPU ==

The MI6, MI8, and MI25 products all support AMD's MxGPU virtualization technology, enabling sharing of GPU resources across multiple users.{{cite news |last=Kampman |first=Jeff |date=December 12, 2016 |url=https://techreport.com/review/31093/amd-opens-up-machine-learning-with-radeon-instinct |title=AMD opens up machine learning with Radeon Instinct |publisher=TechReport |access-date=December 12, 2016}}

== MIOpen ==

MIOpen is AMD's library for GPU acceleration of deep learning. Much of it extends GPUOpen's Boltzmann Initiative software. It is intended to compete with the deep-learning portions of Nvidia's CUDA library, and supports deep-learning frameworks including Theano, Caffe, TensorFlow, MXNet, Microsoft Cognitive Toolkit, Torch, and Chainer. Programming is supported in OpenCL and Python, and CUDA code can be compiled through AMD's Heterogeneous-compute Interface for Portability (HIP) and Heterogeneous Compute Compiler.

Chipset table

See also

References

{{reflist}}