ARM Cortex-A78

{{Short description|Microprocessor core model by ARM}}

{{Infobox CPU

| name = ARM Cortex-A78

| image =

| image_size =

| alt =

| caption =

| produced-start = 2020

| produced-end =

| soldby =

| designfirm = ARM Ltd.

| manuf1 =

| cpuid =

| code =

|slowest=2.4

|fastest=3.0 GHz in phones and 3.3 GHz in tablets/laptops

| slow-unit = GHz

| fast-unit =

| fsb-slowest =

| fsb-fastest =

| fsb-slow-unit =

| fsb-fast-unit =

| hypertransport-slowest =

| hypertransport-fastest =

| hypertransport-slow-unit =

| hypertransport-fast-unit =

| qpi-slowest =

| qpi-fastest =

| qpi-slow-unit =

| qpi-fast-unit =

| dmi-slowest =

| dmi-fastest =

| dmi-slow-unit =

| dmi-fast-unit =

| data-width =

| address-width =

| virtual-width =

| l1cache = 32–64 KB (parity): 32 KB L1 instruction cache and 32 KB L1 data cache, or 64 KB L1 instruction cache and 64 KB L1 data cache

| l2cache = 256–512 KiB (private L2, ECC)

| l3cache = Optional, 512 KB to 4 MB (A78, A78AE)
Optional, 512 KB to 8 MB (A78C)

| l4cache =

| llcache =

| application =

| size-from =

| size-to =

| microarch = ARM Cortex-A78

| arch = ARMv8-A

| instructions =

| extensions = ARMv8.1-A, ARMv8.2-A, cryptography, RAS, ARMv8.3-A LDAPR instructions

| transistors =

| numcores = 1–4 per cluster (A78, A78AE)
1–8 per cluster (A78C)

| gpu =

| co-processor =

| pack1 =

| sock1 =

| core1 =

| pcode1 = Hercules

| model1 =

| brand1 =

| variant = ARM Cortex-X1

| predecessor = ARM Cortex-A77

| successor = ARM Cortex-A710

}}

The ARM Cortex-A78 is a central processing unit core implementing the ARMv8.2-A 64-bit instruction set. It was designed by ARM Ltd.'s Austin centre.{{Cite web|title=Cortex-A78|url=https://developer.arm.com/ip-products/processors/cortex-a/cortex-a78|access-date=2020-07-01|website=Arm Developer|language=en}}

Design

The ARM Cortex-A78 is the successor to the ARM Cortex-A77. It can be paired with ARM Cortex-X1 and/or ARM Cortex-A55 CPUs in a DynamIQ configuration to deliver both performance and efficiency. ARM claims energy savings of up to 50% over its predecessor.

The Cortex-A78 is a 4-wide decode, out-of-order superscalar design with a 1.5K-entry macro-op (MOP) cache. It can fetch 4 instructions or 6 MOPs per cycle, and rename and dispatch 6 MOPs and 12 μops per cycle. The out-of-order window holds 160 entries. The backend has 13 execution ports, a pipeline depth of 14 stages, and execution latencies of 10 stages.
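The front-end figures above can be turned into a rough back-of-the-envelope estimate. The sketch below is a simplified illustration, not ARM's performance model: it assumes each cycle is served either entirely from the MOP cache (6 MOPs) or entirely from the decoders (4 instructions, counted here as 4 MOPs), and that the 6-MOP rename/dispatch width is the only other limit. The function name and its parameter are hypothetical.

```python
# Widths taken from the paragraph above.
DECODE_WIDTH = 4      # instructions per cycle from the decoders
MOP_CACHE_WIDTH = 6   # MOPs per cycle from the MOP cache
DISPATCH_WIDTH = 6    # MOPs per cycle at rename/dispatch


def average_frontend_width(mop_cache_cycle_fraction: float) -> float:
    """Estimate average MOPs supplied per cycle.

    mop_cache_cycle_fraction is the (assumed) fraction of cycles in which
    the front end is fed from the MOP cache rather than the decoders.
    Treating one decoded instruction as one MOP is a simplification.
    """
    if not 0.0 <= mop_cache_cycle_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    supplied = (mop_cache_cycle_fraction * MOP_CACHE_WIDTH
                + (1.0 - mop_cache_cycle_fraction) * DECODE_WIDTH)
    # Rename/dispatch cannot accept more than its own width.
    return min(supplied, DISPATCH_WIDTH)


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        print(fraction, average_frontend_width(fraction))
```

Under these assumptions the supply rate moves between the decode width and the MOP cache width as the share of MOP-cache cycles grows.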

The processor follows the standard Cortex-A roadmap. At a 2.1 GHz (5 nm) design point, it improves on its predecessor in the following ways:

  • 7% better performance
  • 4% lower power consumption
  • 5% smaller core area, meaning about 15% area savings for a quad-core cluster, which can be used for extra GPU or NPU resources

Scalability is extended through additional support from the DynamIQ Shared Unit (DSU). A smaller 32 KB L1 cache configuration is available as an option alongside the 64 KB configuration. To offset the smaller L1 cache, the branch predictor is better at covering irregular search patterns and can follow two taken branches per cycle, which results in fewer L1 cache misses and helps hide pipeline bubbles to keep the core well supplied. The pipeline is one cycle longer than that of the A77, which allows the A78 to hit a clock frequency target of around 3 GHz. The A78 is a 6-instruction-per-cycle design.

ARM also introduced a second integer multiply unit in the execution unit and an additional load address generation unit (AGU), which increases data load bandwidth by 50%. Other optimizations of the core include fused instructions{{Cite web|url=https://en.wikichip.org/wiki/macro-operation_fusion#Arm|title = Macro-Operation Fusion (MOP Fusion) - WikiChip}} and efficiency improvements to the instruction schedulers, register renaming structures, and the re-order buffer.

L2 cache is available up to 512 KB and has double the bandwidth to maximize performance, while the shared L3 cache is available up to 4 MB, double that of previous generations. A DynamIQ Shared Unit (DSU) also allows for an 8 MB configuration with the ARM Cortex-X1.{{Cite web|last=Frumusanu|first=Andrei|title=Arm's New Cortex-A78 and Cortex-X1 Microarchitectures: An Efficiency and Performance Divergence|url=https://www.anandtech.com/show/15813/arm-cortex-a78-cortex-x1-cpu-ip-diverging|access-date=2020-06-17|website=www.anandtech.com}}{{Cite web|date=2020-05-26|title=Arm Unveils the Cortex-A78: When Less Is More|url=https://fuse.wikichip.org/news/3536/arm-unveils-the-cortex-a78-when-less-is-more/|access-date=2020-06-17|website=WikiChip Fuse|language=en-US}}{{Cite web|last=Triggs|first=Robert|date=2020-05-26|title=Arm Cortex-X1 and Cortex-A78 CPUs: Big cores with big differences|url=https://www.androidauthority.com/arm-cortex-x1-cortex-a78-1119666/|access-date=2020-06-15|website=Android Authority|language=en-US}}{{Cite web|title=ARM's Cortex-A78 CPU and Mali-G78 GPU will power 2021's best Android phones|url=https://www.theverge.com/circuitbreaker/2020/5/26/21267893/arm-cortex-a78-mali-g78-cpu-gpu-designs-smartphones-2021-samsung-qualcomm-apple|access-date=2020-06-15|website=www.theverge.com|date=26 May 2020|language=en}}

Variants

=Cortex-A78C=

The Cortex-A78C is targeted at productivity and gaming applications. It increases the maximum core count per cluster from 4 to 8 and the maximum L3 cache size from 4 MB to 8 MB. [https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-cortex-a78c]

=Cortex-A78AE=

The Cortex-A78AE is targeted at security and safety applications. [https://community.arm.com/arm-community-blogs/b/embedded-and-microcontrollers-blog/posts/arm-cortex-a78ae-on-the-road-to-an-autonomous-future]
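The per-variant cluster limits given in this article (1–4 cores and an optional 512 KB to 4 MB L3 for the A78 and A78AE; 1–8 cores and an optional 512 KB to 8 MB L3 for the A78C) can be captured in a small lookup table. The following Python sketch is illustrative only: the names <code>ClusterLimits</code>, <code>LIMITS</code> and <code>is_valid_cluster</code> are hypothetical, and the check covers just the ranges stated here, not every constraint in ARM's documentation (for example, which intermediate L3 sizes are permitted).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClusterLimits:
    max_cores: int     # maximum cores of this type per cluster
    max_l3_mb: float   # maximum optional shared L3 size in MB


# Limits as listed in this article's infobox and Variants section.
LIMITS = {
    "A78": ClusterLimits(max_cores=4, max_l3_mb=4),
    "A78AE": ClusterLimits(max_cores=4, max_l3_mb=4),
    "A78C": ClusterLimits(max_cores=8, max_l3_mb=8),
}

MIN_L3_MB = 0.5  # 512 KB, the smallest L3 size listed


def is_valid_cluster(variant: str, cores: int,
                     l3_mb: Optional[float] = None) -> bool:
    """Return True if the core count and L3 size fall within the listed ranges.

    l3_mb=None models a cluster without the optional L3 cache.
    """
    limits = LIMITS[variant]
    if not 1 <= cores <= limits.max_cores:
        return False
    if l3_mb is None:
        return True
    return MIN_L3_MB <= l3_mb <= limits.max_l3_mb


if __name__ == "__main__":
    print(is_valid_cluster("A78", 4, 4))    # within the A78 ranges
    print(is_valid_cluster("A78", 8))       # too many cores for an A78 cluster
    print(is_valid_cluster("A78C", 8, 8))   # within the A78C ranges
```

Keeping the limits in one table makes the difference between the variants explicit: only the A78C entry raises both the core count and the L3 ceiling.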

Licensing

The Cortex-A78 is available as a SIP core to licensees whilst its design makes it suitable for integration with other SIP cores (e.g. GPU, display controller, DSP, image processor, etc.) into one die constituting a system on a chip (SoC).{{Citation needed|date=June 2020}}

Usage

The Cortex-A78 was first used in the Samsung Exynos 1080 SoC, introduced in November 2020.{{Cite web|last=Frumusanu|first=Andrei|title=Samsung Announces Exynos 1080 - 5nm Premium-Range SoC with A78 Cores|url=https://www.anandtech.com/show/16244/samsung-announces-exynos-1080-5nm-midrange-with-a78-cores|access-date=2020-11-13|website=www.anandtech.com}}{{Cite web|title=Exynos 1080 5G Mobile Processor: Specs, Features {{!}} Samsung Exynos|url=https://www.samsung.com/semiconductor/minisite/exynos/products/mobileprocessor/exynos-1080/|access-date=2021-01-11|website=Samsung Semiconductor|language=en}} It is also used in the Samsung Exynos 2100 SoC. The custom Kryo 680 Gold core used in the Snapdragon 888 SoC is based on the Cortex-A78 microarchitecture.{{Cite web|last=Frumusanu|first=Andrei|title=Qualcomm Details The Snapdragon 888: 3rd Gen 5G & Cortex-X1 on 5nm|url=https://www.anandtech.com/show/16271/qualcomm-snapdragon-888-deep-dive|access-date=2021-01-11|website=www.anandtech.com}}{{Cite web|date=2020-12-02|title=Everything you need to know about the Qualcomm Snapdragon 888|url=https://www.xda-developers.com/qualcomm-snapdragon-888-explained-specs-features/|access-date=2021-01-11|website=xda-developers|language=en-US}} The Cortex-A78 is also used in the MediaTek Dimensity 1200 and 8000 series, in Nvidia's BlueField-3 and 3X DPUs, and in the HiSilicon Kirin 9000s, released in August 2023.

References

{{Reflist}}

{{Application ARM-based chips}}

Category:ARM processors