NVLink

{{infobox Information appliance

|name=NVLink

|image=200px

|caption=

|manufacturer=Nvidia

|type=Multi-GPU and CPU technology

|releasedate=

|price=

|connectivity=

|lifespan=

|unitssold=

|media=

|os=

|input=

|power=

|storage=

|memory=

|display=

|service=

|dimensions=

|weight=

|predecessor=Scalable Link Interface

|successor=

|related=}}

NVLink is a wire-based, serial, multi-lane, short-range communications link developed by Nvidia. Unlike PCI Express, a device can have multiple NVLinks, and devices communicate over a mesh network rather than through a central hub. The protocol was first announced in March 2014 and uses a proprietary high-speed signaling interconnect (NVHS).[http://www.fudzilla.com/news/graphics/41420-nvidia-nvlink-2-0-arrives-in-ibm-servers-next-year Nvidia NVLINK 2.0 arrives in IBM servers next year] by Jon Worrel on fudzilla.com on August 24, 2016

Principle

NVLink was developed by Nvidia for data and control code transfers in processor systems, both between CPUs and GPUs and directly between GPUs. NVLink specifies a point-to-point connection with data rates of 20, 25 and 50 Gbit/s per differential pair (for v1.0, v2.0 and v3.0 or later, respectively). For NVLink 1.0 and 2.0, eight differential pairs form a "sub-link", and two sub-links, one for each direction, form a "link". Starting with NVLink 3.0, only four differential pairs form a sub-link. For NVLink 2.0 and higher, the total data rate is 25 GB/s for a sub-link and 50 GB/s for a link. Each V100 GPU supports up to six links, so each GPU is capable of up to 300 GB/s of total bidirectional bandwidth.{{Cite web|url=http://images.nvidia.com/content/pdf/dgx1-v100-system-architecture-whitepaper.pdf|title=NVIDIA DGX-1 With Tesla V100 System Architecture}}{{cite web |url=http://blogs.nvidia.com/blog/2014/11/14/what-is-nvlink/ |title=What Is NVLink? |publisher=Nvidia |date=2014-11-14}} NVLink products introduced to date focus on the high-performance application space. Announced on May 14, 2020, NVLink 3.0 increases the data rate per differential pair from 25 Gbit/s to 50 Gbit/s while halving the number of pairs per sub-link from 8 to 4. With 12 links for an Ampere-based A100 GPU, this brings the total bandwidth to 600 GB/s.{{cite news|url=https://www.anandtech.com/show/15801/nvidia-announces-ampere-architecture-and-a100-products|title=NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator|author=Ryan Smith|date=May 14, 2020|publisher=AnandTech}} Hopper has 18 NVLink 4.0 links, enabling a total bandwidth of 900 GB/s.{{Cite web |last=Jacobs |first=Blair |date=2022-03-23 |title=Nvidia reveals next-gen Hopper GPU architecture |url=https://www.club386.com/nvidia-reveals-next-gen-hopper-gpu-architecture/ |access-date=2022-05-04 |website=Club386 |language=en-GB}} Thus NVLink 2.0, 3.0 and 4.0 all provide 50 GB/s per bidirectional link, with 6, 12 and 18 links respectively.
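
The arithmetic behind these totals can be sketched as follows. This is a minimal illustration rather than an official formula: the function names are hypothetical, the pair counts and per-pair rates are the ones quoted above, and the link count of 4 for NVLink 1.0 is taken from the P100 figures elsewhere in this article.

```python
def sublink_gbytes_per_s(pairs_per_sublink, gbit_per_pair):
    """Data rate of one sub-link (one direction) in GB/s."""
    return pairs_per_sublink * gbit_per_pair / 8  # 8 bits per byte


def total_bidirectional_gbytes_per_s(pairs_per_sublink, gbit_per_pair, links):
    """Aggregate rate over all links, counting both directions."""
    per_link = 2 * sublink_gbytes_per_s(pairs_per_sublink, gbit_per_pair)
    return per_link * links


# (pairs per sub-link, Gbit/s per pair, links per GPU); illustrative inputs
GENERATIONS = {
    "NVLink 1.0 (P100)": (8, 20, 4),
    "NVLink 2.0 (V100)": (8, 25, 6),
    "NVLink 3.0 (A100)": (4, 50, 12),
}

totals = {
    name: total_bidirectional_gbytes_per_s(*params)
    for name, params in GENERATIONS.items()
}

for name, total in totals.items():
    print(f"{name}: {total:.0f} GB/s")
```

Under these assumptions the sketch yields 160, 300 and 600 GB/s for the three generations, which matches the totals given in the text.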

Performance

The following table shows a basic metrics comparison based upon standard specifications:

class="wikitable sortable" style="text-align:center;" ! Interconnect !style="max-width:0"\| Transfer rate ! Line code ! Modulation ! Effective payload rate per lane or NVLink (unidir.) ! Max. total lane length {{efn\|text=PCIe: incl. 5" for PCBs}} ! Total Links (NVLink) ! Total Bandwidth (PCIe x16 or NVLink) ! Realized in design
PCIe 1.x	2.5 GT/s	8b/10b		0.25 GB/s	{{cvt\|50\|cm}}		8 GB/s
PCIe 2.x	5 GT/s	8b/10b		0.50 GB/s	{{cvt\|50\|cm}}		16 GB/s
PCIe 3.x	8 GT/s	128b/130b		0.99 GB/s	{{cvt\|50\|cm}}{{Cite web\|url=https://www.elektronik-kompendium.de/sites/com/0904051.htm\|title=PCIe - PCI Express (1.1 / 2.0 / 3.0 / 4.0 / 5.0)\|website=www.elektronik-kompendium.de}}		31.51 GB/s	Pascal, Volta, Turing
PCIe 4.0	16 GT/s	128b/130b		1.97 GB/s	{{cvt\|20\|-\|30\|cm\|0}}		63.02 GB/s	style="max-width:0"\| Volta on Xavier, Ampere, POWER9
PCIe 5.0	32 GT/s{{Cite web\|url=https://www.tomshardware.com/news/pcie-4.0-5.0-pci-sig-specification,38460.html\|title=PCIe 5.0 Is Ready For Prime Time\|first=Paul \|last=Alcorn \|website=Tom's Hardware\|date=17 January 2019}}	128b/130b		3.94 GB/s			126.03 GB/s	Hopper
PCIe 6.0	64 GT/s	236B/256B{{cite web \|title=The PCIe 6.0 Specification Webinar Q&A: A Deeper Dive into FLIT Mode, PAM4, and Forward Error Correction (FEC) PCI-SIG \|url=https://pcisig.com/blog/pcie-60-specification-webinar-qa-deeper-dive-flit-mode-pam4-and-forward-error-correction-fec \|website=pcisig.com \|publisher=PCI-SIG \|access-date=28 November 2024 \|quote="We considered various FLIT sizes and settled on 256 Bytes with 236 bytes of TLP payload and a TLP efficiency of 92%."}}	FLIT PAM4 w/ FEC	7.56 GB/s			242 GB/s	Blackwell
NVLink 1.0	20 GT/s		NRZ	20 GB/s		4	160 GB/s	Pascal, POWER8+
NVLink 2.0	25 GT/s		NRZ	25 GB/s		6	300 GB/s	Volta, POWER9
NVLink 3.0	50 GT/s		NRZ	25 GB/s		12	600 GB/s	Ampere
NVLink 4.0	50 GT/s {{Cite web\|url=https://hc34.hotchips.org/assets/program/conference/day2/Network%20and%20Switches/NVSwitch%20HotChips%202022%20r5.pdf \|title=NVLink-Network Switch - NVIDIA's Switch Chip for High Communication-Bandwidth SuperPODs \|website=HotChips 34\|date=23 August 2022}}		PAM4 differential-pair	25 GB/s		18	900 GB/s	Hopper, Nvidia Grace
NVLink 5.0{{cite web \|title=NVIDIA Blackwell Architecture Technical Overview \|url=https://resources.nvidia.com/en-us-blackwell-architecture?ncid=no-ncid \|website=NVIDIA \|access-date=28 November 2024 \|page=8 \|language=en \|quote=Fifth-generation NVLink doubles the performance of fourth- generation NVLink in NVIDIA Hopper. While the new NVLink in Blackwell GPUs also uses two high-speed differential pairs in each direction to form a single link as in the Hopper GPU, NVIDIA Blackwell doubles the effective bandwidth per link to 50 GB/sec in each direction.}}	100 GT/s		PAM4 differential-pair	50 GB/s		18	1800 GB/s	Blackwell, Nvidia Grace
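
The PCIe payload figures in the table above can be approximated from the transfer rate and the line code. The following is a rough sketch with hypothetical function names; it assumes that the "Total Bandwidth" column for PCIe refers to an x16 slot with both directions added together, which is what the listed numbers appear to imply.

```python
def payload_gbytes_per_lane(gt_per_s, payload_bits, coded_bits):
    """Unidirectional payload rate of one lane in GB/s.

    gt_per_s     -- raw transfer rate in GT/s (one bit per transfer)
    payload_bits -- data bits per code block (e.g. 128 for 128b/130b)
    coded_bits   -- bits on the wire per code block (e.g. 130)
    """
    return gt_per_s * payload_bits / coded_bits / 8


def x16_bidirectional_gbytes(gt_per_s, payload_bits, coded_bits):
    """Payload rate of a 16-lane slot, both directions added together."""
    return payload_gbytes_per_lane(gt_per_s, payload_bits, coded_bits) * 16 * 2


for name, rate, payload, coded in [
    ("PCIe 1.x", 2.5, 8, 10),
    ("PCIe 3.x", 8.0, 128, 130),
    ("PCIe 4.0", 16.0, 128, 130),
]:
    lane = payload_gbytes_per_lane(rate, payload, coded)
    slot = x16_bidirectional_gbytes(rate, payload, coded)
    print(f"{name}: {lane:.3f} GB/s per lane, {slot:.2f} GB/s for x16")
```

With 8b/10b coding only 80% of the transferred bits carry payload, whereas 128b/130b raises this to roughly 98.5%, which is why PCIe 3.x nearly doubles the payload rate of PCIe 2.x despite a smaller increase in raw transfer rate.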

The following table shows a comparison of relevant bus parameters for real world semiconductors that all offer NVLink as one of their options:

class="wikitable sortable" style="text-align:center;" ! Semiconductor ! Board/bus delivery variant ! Interconnect ! Transmission technology rate (per lane) ! Lanes per sub-link (out + in) ! Sub-link data rate (per data direction){{efn \|name=datarate}} ! Sub-link or unit count ! Total data rate (out + in){{efn \|name=datarate}} ! Total lanes (out + in) ! Total data rate (out + in){{efn \|name=datarate}}
Nvidia GP100	P100 SXM,{{Cite web\|url=https://geizhals.de/nvidia-tesla-p100-sxm2-nvtp100-sxm-a1501151.html\|title=NVIDIA Tesla P100 [SXM2], 16GB HBM2 (NVTP100-SXM) {{pipe}} heise online Preisvergleich / Deutschland\|first=heise\|last=online\|website=geizhals.de}} P100 PCI-E{{Cite web\|url=https://geizhals.de/pny-tesla-p100-pcie-tcsp100m-16gb-pb-nvtp100-16-a1501119.html\|title=PNY Tesla P100 [PCIe], 16GB HBM2 (TCSP100M-16GB-PB/NVTP100-16) ab € 4990,00 (2020) {{pipe}} heise online Preisvergleich / Deutschland\|first=heise\|last=online\|website=geizhals.de\|date=14 August 2023 }}	PCIe 3.0	{{0}}8 GT/s	16 + 16 {{efn \|name=fractions}}	128 Gbit/s = 16 GB/s	1	16 + 16 GB/s[https://www.nextplatform.com/2016/05/04/nvlink-takes-gpu-acceleration-next-level/ NVLink Takes GPU Acceleration To The Next Level] by Timothy Prickett Morgan at nextplatform.com on May 4, 2016	32 {{efn \|name=diff_pair}}	{{0}}32 GB/s
Nvidia GV100	V100 SXM2,{{Cite web\|url=https://www.techpowerup.com/gpu-specs/tesla-v100-sxm2-16-gb.c3018\|title=NVIDIA Tesla V100 SXM2 16 GB Specs\|website=TechPowerUp\|date=14 August 2023 }} V100 PCI-E{{Cite web\|url=https://geizhals.de/pny-quadro-gv100-vcqgv100-pb-a1800874.html\|title=PNY Quadro GV100, 32GB HBM2, 4x DP (VCQGV100-PB) ab € 10199,00 (2020) {{pipe}} heise online Preisvergleich / Deutschland\|first=heise\|last=online\|website=geizhals.de\|date=14 August 2023 }}	PCIe 3.0	{{0}}8 GT/s	16 + 16 {{efn \|name=fractions}}	128 Gbit/s = 16 GB/s	1	{{0}}16 + {{0}}16 GB/s	32 {{efn \|name=diff_pair}}	{{0}}32 GB/s
Nvidia TU104	GeForce RTX 2080, Quadro RTX 5000	PCIe 3.0	{{0}}8 GT/s	16 + 16 {{efn \|name=fractions}}	128 Gbit/s = 16 GB/s	1	{{0}}16 + {{0}}16 GB/s	32 {{efn \|name=diff_pair}}	{{0}}32 GB/s
Nvidia TU102	GeForce RTX 2080 Ti, Quadro RTX 6000/8000	PCIe 3.0	{{0}}8 GT/s	16 + 16 {{efn \|name=fractions}}	128 Gbit/s = 16 GB/s	1	{{0}}16 + {{0}}16 GB/s	32 {{efn \|name=diff_pair}}	{{0}}32 GB/s
Nvidia GA100{{Cite web\|url=http://www.nextplatform.com/2020/05/14/nvidia-unifies-ai-compute-with-ampere-gpu/\|title=Nvidia Unifies AI Compute With "Ampere" GPU\|first=Timothy Prickett\|last=Morgan\|date=May 14, 2020\|website=The Next Platform}}{{cite web \|url=https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet.pdf \|title=Data sheet \|website=www.nvidia.com \|access-date=2020-09-15}} Nvidia GA102{{cite web\|url=https://www.nvidia.com/content/dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf\|access-date=2 May 2023\|website=nvidia.com\|title=NVIDIA ampere GA102 GPU Architecture Whitepaper}}	Ampere A100 (SXM4 & PCIe){{cite web\|url=https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet.pdf\|access-date=2 May 2023\|website=nvidia.com\|title=Tensor Core GPU}}	PCIe 4.0	{{0}}16 GT/s	16 + 16 {{efn \|name=fractions}}	256 Gbit/s = 32 GB/s	1	{{0}}32 + {{0}}32 GB/s	32 {{efn \|name=diff_pair}}	{{0}}64 GB/s
Nvidia GP100	P100 SXM, (not available with P100 PCI-E){{cite web\|url=https://www.theregister.co.uk/2016/06/20/nvidia_tesla_p100_pcie_card/\|title=All aboard the PCIe bus for Nvidia's Tesla P100 supercomputer grunt\|author=Chris Williams\|website=theregister.co.uk \|date=June 20, 2016}}	NVLink 1.0	20 GT/s	{{0}}8 + {{0}}8 {{efn \|name=sub_bundling}}	160 Gbit/s = 20 GB/s	4	{{0}}80 + {{0}}80 GB/s	64	160 GB/s
Nvidia GV100	V100 SXM2{{Cite web\|url=https://www.heise.de/newsticker/meldung/Nvidia-Tesla-V100-PCIe-Steckkarte-mit-Volta-Grafikchip-und-16-GByte-Speicher-angekuendigt-3753051.html\|title=Nvidia Tesla V100: PCIe-Steckkarte mit Volta-Grafikchip und 16 GByte Speicher angekündigt\|first=heise\|last=online\|website=heise online\|date=22 June 2017 }} (not available with V100 PCI-E)	NVLink 2.0	25 GT/s	{{0}}8 + {{0}}8 {{efn \|name=sub_bundling}}	200 Gbit/s = 25 GB/s	6[https://www.hardwareluxx.de/index.php/galerie/komponenten/grafikkarten/gtc17-keynote-volta.html GV100 Blockdiagramm] in "GTC17: NVIDIA präsentiert die nächste GPU-Architektur Volta - Tesla V100 mit 5.120 Shadereinheiten und 16 GB HBM2" by Andreas Schilling on hardwareluxx.de on May 10, 2017	150 + 150 GB/s	96	300 GB/s
Nvidia TU104	GeForce RTX 2080, Quadro RTX 5000{{cite web \|last1=Angelini \|first1=Chris \|title=Nvidia's Turing Architecture Explored: Inside the GeForce RTX 2080 \|url=https://www.tomshardware.com/reviews/nvidia-turing-gpu-architecture-explored,5801-7.html \|website=Tom's Hardware \|access-date=28 February 2019 \|ref=tomshardware-geforcertx \|page=7 \|date=14 September 2018 \|quote=TU102 and TU104 are Nvidia's first desktop GPUs rocking the NVLink interconnect rather than a Multiple Input/Output (MIO) interface for SLI support. The former makes two x8 links available, while the latter is limited to one. Each link facilitates up to 50 GB/s of bidirectional bandwidth. So, GeForce RTX 2080 Ti is capable of up to 100 GB/s between cards and RTX 2080 can do half of that.}}	NVLink 2.0	25 GT/s	{{0}}8 + {{0}}8 {{efn \|name=sub_bundling}}	200 Gbit/s = 25 GB/s	1	{{0}}25 + {{0}}25 GB/s	16	{{0}}50 GB/s
Nvidia TU102	GeForce RTX 2080 Ti, Quadro RTX 6000/8000	NVLink 2.0	25 GT/s	{{0}}8 + {{0}}8 {{efn \|name=sub_bundling}}	200 Gbit/s = 25 GB/s	2	{{0}}50 + {{0}}50 GB/s	32	100 GB/s
Nvidia GA100	Ampere A100 (SXM4 & PCIe)	NVLink 3.0	50 GT/s	{{0}}4 + {{0}}4 {{efn \|name=sub_bundling}}	200 Gbit/s = 25 GB/s	12{{Cite web\|url=https://www.hardwareluxx.de/index.php/news/hardware/grafikkarten/53450-a100-pcie-nvidia-ga100-gpu-kommt-auch-als-pci-express-variante.html\|title=A100 PCIe: NVIDIA GA100-GPU kommt auch als PCI-Express-Variante\|first=Andreas\|last=Schilling\|website=Hardwareluxx\|date=22 June 2020 \|accessdate=2 May 2023}}	300 + 300 GB/s	96	600 GB/s
Nvidia GA102	GeForce RTX 3090, Quadro RTX A6000	NVLink 3.0	28.125 GT/s	{{0}}4 + {{0}}4 {{efn \|name=sub_bundling}}	112.5 Gbit/s = 14.0625 GB/s	4	56.25 + 56.25 GB/s	16	112.5 GB/s
NVSwitch for Hopper{{cite web \|url=https://www.nvidia.com/en-us/data-center/nvlink/ \|title= NVLINK AND NVSWITCH \|website=www.nvidia.com \|access-date=2021-02-07}}	(fully connected 64 port switch)	NVLink 4.0	106.25 GT/s	{{0}}9 + {{0}}9 {{efn \|name=sub_bundling}}	450 Gbit/s	18	3600 + 3600 GB/s	128	7200 GB/s
Nvidia Grace CPU{{cite web \| url=https://www.hpcwire.com/2024/02/22/a-big-memory-nvidia-gh200-next-to-your-desk-closer-than-you-think/ \| title=A Big Memory Nvidia GH200 Next to Your Desk: Closer Than You Think \| date=23 February 2024 }}	Nvidia GH200 Superchip	PCIe-5 (4x, 16x) @ 512 GB/s
Nvidia Grace CPU	Nvidia GH200 Superchip	NVLink-C2C @ 900 GB/s
Nvidia Hopper GPU	Nvidia GH200 Superchip	NVLink-C2C @ 900 GB/s
Nvidia Hopper GPU	Nvidia GH200 Superchip	NVLink 4 (18x) @ 900 GB/s

{{notelist| refs=

{{efn |name=datarate |text=Data rate columns are maximum theoretical values.}}

{{efn |name=sub_bundling |text=sample value; NVLink sub-link bundling should be possible.}}

{{efn |name=fractions |text=sample value; other fractions for the PCIe lane usage should be possible.}}

{{efn |name=diff_pair |text=a single PCIe lane transfers data over a differential pair.}}

}}

Real-world performance can be estimated by applying various encapsulation overheads as well as the usage rate. These overheads come from several sources:{{citation needed|date=November 2024}}

128b/130b line code (see e.g. PCI Express data transmission for versions 3.0 and higher)
Link control characters
Transaction header
Buffering capabilities
DMA usage on computer side

These limitations usually reduce the achievable data rate to between 90 and 95% of the transfer rate.{{citation needed|date=November 2024}} NVLink benchmarks show an achievable transfer rate of about 35.3 Gbit/s{{contradictory inline |reason=This isn't 90-95%|date=November 2024}} (host to device) for a 40 Gbit/s (2 sub-lanes uplink) NVLink connection to a P100 GPU in a system driven by a set of IBM POWER8 CPUs.{{cite web|url=https://www.microway.com/hpc-tech-tips/comparing-nvlink-vs-pci-e-nvidia-tesla-p100-gpus-openpower-servers/ |title=Comparing NVLink vs PCI-E with NVIDIA Tesla P100 GPUs on OpenPOWER Servers |author=Eliot Eshelman |website=microway.com |date=January 26, 2017}}
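
The ratio between the measured and nominal rates quoted above can be checked with a short calculation. This is only a sketch using the two figures from the text; the function name is hypothetical.

```python
def link_efficiency(measured_rate, nominal_rate):
    """Fraction of the nominal transfer rate that was actually achieved."""
    return measured_rate / nominal_rate


# Figures quoted above for the POWER8 + P100 benchmark (same unit for both)
efficiency = link_efficiency(35.3, 40.0)
in_typical_range = 0.90 <= efficiency <= 0.95

print(f"efficiency: {efficiency:.1%}, within 90-95%: {in_typical_range}")
```

The result is roughly 88%, slightly below the 90 to 95% range stated for typical overheads.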

Usage with plug-in boards

A small number of high-end gaming and professional graphics boards expose extra connectors for joining them into an NVLink group. For these plug-in boards, a similar number of slightly varying, relatively compact, PCB-based interconnection plugs exist. Typically only boards of the same type will mate together due to their physical and logical design. Some setups require two identical plugs to achieve the full data rate. The typical plug is U-shaped, with a fine-pitch edge connector on each of the end strokes of the shape, facing away from the viewer. The width of the plug determines how far apart the plug-in cards must be seated on the main board of the host computer system, so the card spacing is commonly dictated by the matching plug (known available plug widths are 3 to 5 slots and also depend on board type).{{Cite web|url=https://www.nvidia.com/de-de/design-visualization/nvlink-bridges/|title=NVIDIA Quadro NVLink Grafikprozessor-Zusammenschaltung in Hochgeschwindigkeit|website=NVIDIA}}{{Cite web|url=https://www.nvidia.com/de-de/geforce/graphics-cards/rtx-2080-ti/|title=Grafik neu erfunden: NVIDIA GeForce RTX 2080 Ti-Grafikkarte|website=NVIDIA}} The interconnect is often referred to as Scalable Link Interface (SLI), a technology from 2004, because of its structural design and appearance, even though the modern NVLink-based design is technically quite different, with different features at its basic levels. Reported real-world devices are:{{Cite web|url=https://www.pugetsystems.com/labs/articles/NVLink-on-NVIDIA-GeForce-RTX-2080-2080-Ti-in-Windows-10-1253/|title=NVLink on NVIDIA GeForce RTX 2080 & 2080 Ti in Windows 10|website=Puget Systems|date=5 October 2018 }}

Quadro GP100 (a pair of cards will make use of up to 2 bridges;[http://www.shopblt.com/cgi-bin/shop/shop.cgi?action=enter&thispage=011004001508_B2PU977P.shtml]{{dead link|date=September 2020}} the setup realizes either 2 or 4 NVLink connections with up to 160 GB/s{{Cite web|url=https://www.hardwareluxx.de/index.php/news/hardware/grafikkarten/41825-nvidia-praesentiert-quadro-gp100-mit-gp100-gpu-und-16-gb-hbm2.html|title=NVIDIA präsentiert Quadro GP100 mit GP100-GPU und 16 GB HBM2|first=Andreas|last=Schilling|website=Hardwareluxx|date=5 February 2017 }} - this might resemble NVLink 1.0 with 20 GT/s)
Quadro GV100 (a pair of cards will need up to 2 bridges and realize up to 200 GB/s - this might resemble NVLink 2.0 with 25 GT/s and 4 links)

GeForce RTX 2080 based on TU104 (with single bridge "GeForce RTX NVLink-Bridge"{{Cite web|url=https://www.nvidia.com/de-de/geforce/graphics-cards/rtx-2080/|title=NVIDIA GeForce RTX 2080 Founders Edition Graphics Card|website=NVIDIA}})
GeForce RTX 2080 Ti based on TU102 (with single bridge "GeForce RTX NVLink-Bridge")
Quadro RTX 5000{{Cite web|url=https://www.nvidia.com/en-us/design-visualization/quadro-desktop-gpus/|title=NVIDIA Quadro Graphics Cards for Professional Design Workstations|website=NVIDIA}} based on TU104{{Cite web|url=https://www.hardwareinside.de/nvidia-quadro-rtx-6000-und-rtx-5000-ready-fuer-pre-order-36399/|title=NVIDIA Quadro RTX 6000 und RTX 5000 Ready für Pre-Order|date=October 1, 2018}} (with single bridge "NVLink" up to 50 GB/s{{Cite web|url=https://www.pny.com/professional/explore-our-products/learn-about-nvidia-quadro/nvlink|title=NVLink {{pipe}} pny.com|website=www.pny.com}} - this might resemble NVLink 2.0 with 25 GT/s and 1 link)
Quadro RTX 6000 based on TU102 (with single bridge "NVLink HB" up to 100 GB/s - this might resemble NVLink 2.0 with 25 GT/s and 2 links)
Quadro RTX 8000 based on TU102{{Cite web|url=https://www.techpowerup.com/gpu-specs/quadro-rtx-8000.c3306|title=NVIDIA Quadro RTX 8000 Specs|website=TechPowerUp|date=14 August 2023 }} (with single bridge "NVLink HB" up to 100 GB/s - this might resemble NVLink 2.0 with 25 GT/s and 2 links)

Service software and programming

For the Tesla, Quadro and Grid product lines, the NVML API (Nvidia Management Library API) offers a set of functions for programmatically controlling some aspects of NVLink interconnects on Windows and Linux systems, such as component evaluation and versions, along with status/error querying and performance monitoring.{{Cite web|url=http://docs.nvidia.com/deploy/nvml-api/index.html|title=NvLink Methods|website=docs.nvidia.com}} Further, the NCCL library (Nvidia Collective Communications Library) enables developers to build, for example, high-performance implementations for artificial intelligence and similarly computation-hungry workloads on top of NVLink.{{Cite web|url=https://developer.nvidia.com/nccl|title=NVIDIA Collective Communications Library (NCCL)|date=May 10, 2017|website=NVIDIA Developer}} The page "3D Settings" » "Configure SLI, Surround, PhysX" in the Nvidia Control Panel and the CUDA sample application "simpleP2P" use such APIs to provide their NVLink-related features. On Linux, the command-line sub-command "nvidia-smi nvlink" provides a similar set of advanced information and control.
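
As an illustration of the command-line route, the following sketch wraps the "nvidia-smi nvlink" sub-command in a small helper. The helper name nvlink_status is hypothetical, the use of the --status option is an assumption about the installed driver tools, and the output format depends on the driver version, so the text is returned unparsed. On machines without the Nvidia driver tools the helper simply returns None.

```python
import shutil
import subprocess


def nvlink_status(timeout_s=10.0):
    """Return the raw text of `nvidia-smi nvlink --status`, or None.

    None is returned when nvidia-smi is not installed, cannot be
    started, times out, or exits with a non-zero status.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "nvlink", "--status"],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout


report = nvlink_status()
print(report if report is not None else "NVLink status not available")
```

Programs that need structured data rather than text would more likely use the NVML functions directly; this wrapper only shows how the command-line tool can be invoked from a script.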

History

On 5 April 2016, Nvidia announced that NVLink would be implemented in the Pascal-microarchitecture-based GP100 GPU, as used in, for example, Nvidia Tesla P100 products.{{cite web |url=https://devblogs.nvidia.com/parallelforall/inside-pascal/ |title=Inside Pascal: NVIDIA's Newest Computing Platform |date=2016-04-05}} With the introduction of the DGX-1 high-performance computer, it became possible to have up to eight P100 modules in a single rack system connected to up to two host CPUs. AnandTech noted that the carrier board (...) allows for a dedicated board for routing the NVLink connections – each P100 requires 800 pins, 400 for PCIe + power, and another 400 for the NVLinks, adding up to nearly 1600 board traces for NVLinks alone (...). Each CPU has a direct connection to four P100 units via PCIe, and each P100 has one NVLink to each of the three other P100s in the same CPU group plus one more NVLink to one P100 in the other CPU group. Each NVLink (link interface) offers 20 GB/s up and 20 GB/s down, with 4 links per GP100 GPU, for an aggregate bandwidth of 80 GB/s up and another 80 GB/s down.[http://www.anandtech.com/show/10229/nvidia-announces-dgx1-server NVIDIA Unveils the DGX-1 HPC Server: 8 Teslas, 3U, Q2 2016] by anandtech.com on April, 2016 NVLink supports routing, so that in the DGX-1 design, for every P100, four of the other seven P100s are directly reachable and the remaining three are reachable with only one hop. According to depictions in Nvidia's blog-based publications from 2014, NVLink allows bundling of individual links for increased point-to-point performance, so that, for example, a design with two P100s and all links established between the two units would allow the full NVLink bandwidth of 80 GB/s between them.[https://devblogs.nvidia.com/parallelforall/how-nvlink-will-enable-faster-easier-multi-gpu-computing/ How NVLink Will Enable Faster, Easier Multi-GPU Computing] by Mark Harris on November 14, 2014
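
The reachability claim for the eight-GPU DGX-1 layout can be checked with a small graph model. This is a sketch under stated assumptions: the GPU numbering (0 to 3 in one CPU group, 4 to 7 in the other) and the pairing of GPU i with GPU i + 4 for the cross-group link are illustrative choices, not taken from Nvidia documentation.

```python
from collections import deque


def dgx1_links():
    """NVLink connections as unordered GPU pairs (assumed numbering)."""
    links = set()
    for start in (0, 4):                      # two CPU groups of four GPUs
        group = range(start, start + 4)
        for a in group:
            for b in group:
                if a < b:
                    links.add((a, b))         # full mesh inside a group
    for i in range(4):
        links.add((i, i + 4))                 # one link to the other group
    return links


def link_distances(source, links, gpu_count=8):
    """Breadth-first search: number of NVLinks traversed to reach each GPU."""
    neighbours = {gpu: set() for gpu in range(gpu_count)}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    distance = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return distance


links = dgx1_links()
for gpu in range(8):
    dist = link_distances(gpu, links)
    direct = sorted(g for g, d in dist.items() if d == 1)
    via_one = sorted(g for g, d in dist.items() if d == 2)
    print(f"GPU {gpu}: direct {direct}, via one intermediate GPU {via_one}")
```

In this model every GPU uses exactly four links, reaches four peers directly and reaches the remaining three through a single intermediate GPU, which is consistent with the description above.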

At GTC 2017, Nvidia presented its Volta generation of GPUs and indicated the integration of a revised version 2.0 of NVLink that would allow total I/O data rates of 300 GB/s for a single chip. Nvidia further announced pre-orders, with delivery promised for Q3 2017, of the DGX-1 and DGX Station high-performance computers, which would be equipped with V100 GPU modules and implement NVLink 2.0 either in a networked fashion (two groups of four V100 modules with inter-group connectivity) or as one fully interconnected group of four V100 modules.

In 2017–2018, IBM and Nvidia delivered the Summit and Sierra supercomputers for the US Department of Energy{{cite web |url=http://www.teratec.eu/actu/calcul/Nvidia_Coral_White_Paper_Final_3_1.pdf |title=Whitepaper: Summit and Sierra Supercomputers |date=2014-11-01}} which combine IBM's POWER9 family of CPUs and Nvidia's Volta architecture, using NVLink 2.0 for the CPU-GPU and GPU-GPU interconnects and InfiniBand EDR for the system interconnects.{{cite web |url=http://www.anandtech.com/show/8727/nvidia-ibm-supercomputers |title=Nvidia Volta, IBM POWER9 Land Contracts For New US Government Supercomputers |publisher=AnandTech |date=2014-11-17}}

In 2020, Nvidia announced that it would no longer add new SLI driver profiles for RTX 2000 series and older GPUs from January 1, 2021.{{cite web |url=https://www.pcworld.com/article/3573384/rip-nvidia-slams-the-final-nail-in-slis-coffin-no-new-profiles-after-2020.html |publisher=PC World |date=2020-09-18 |title=RIP: Nvidia slams the final nail in SLI's coffin, no new profiles after 2020}}

References

Category:Nvidia

Category:Computer buses

Category:Serial buses