Multimedia Acceleration eXtensions

{{redirect|MAX-2|the video game|Mechanized Assault & Exploration 2}}

The Multimedia Acceleration eXtensions or MAX are instruction set extensions to the Hewlett-Packard PA-RISC instruction set architecture (ISA). MAX was developed to improve the performance of multimedia applications that were becoming more prevalent during the 1990s.

MAX instructions operate on 32- or 64-bit SIMD data types consisting of multiple 16-bit integers packed in general purpose registers. The available functionality includes additions, subtractions and shifts.
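The packed layout can be illustrated with a minimal C sketch. The helper names `pack4x16` and `lane16` are hypothetical, and placing subword 0 at the most significant end is an assumption made for illustration rather than something stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 16-bit subwords into one 64-bit register image.
   Assumption: h[0] is the leftmost (most significant) subword. */
uint64_t pack4x16(const uint16_t h[4]) {
    return ((uint64_t)h[0] << 48) | ((uint64_t)h[1] << 32) |
           ((uint64_t)h[2] << 16) |  (uint64_t)h[3];
}

/* Read subword i (0 = leftmost, 3 = rightmost) back out of the register. */
uint16_t lane16(uint64_t r, unsigned i) {
    return (uint16_t)(r >> (48u - 16u * (i & 3u)));
}

/* Usage: four 16-bit samples travel together in one general-purpose register. */
void pack_demo(void) {
    const uint16_t samples[4] = {10, 20, 30, 40};
    uint64_t reg = pack4x16(samples);
    assert(lane16(reg, 0) == 10);
    assert(lane16(reg, 3) == 40);
}
```

A 32-bit register would hold two such subwords in the same way.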

The first version, MAX-1, was for the 32-bit PA-RISC 1.1 ISA. The second version, MAX-2, was for the 64-bit PA-RISC 2.0 ISA.

Notability

The approach is notable because the set of instructions is much smaller than in other multimedia CPUs, and also more general-purpose. The small set and simplicity of the instructions reduce the recurring costs of the electronics, as well as the costs and difficulty of the design. The general-purpose nature of the instructions increases their overall value. These instructions require only small changes to a CPU's arithmetic-logic unit. A similar design approach promises to be a successful model for the multimedia instructions of other CPU designs.{{cite book|last1=Lee|first1=Ruby|author1-link=Ruby B. Lee|last2=Huck|first2=Jerry|title=COMPCON '96. Technologies for the Information Superhighway Digest of Papers |chapter=64-bit and multimedia extensions in the PA-RISC 2.0 architecture |date=February 25, 1996|pages=152–160|doi=10.1109/CMPCON.1996.501762|isbn=0-8186-7414-8|s2cid=13081443 }}{{cite journal|last1=Lee|first1=Ruby B.|authorlink=Ruby B. Lee|title=Accelerating Multimedia with Enhanced Microprocessors|journal=IEEE Micro|date=April 1995|volume=15|issue=2|pages=22–32|url=http://www.princeton.edu/~rblee/HPpapers/accelMultimediawEnhancedMicroproc.pdf|accessdate=21 September 2014|doi=10.1109/40.372347}} The set is also small because the CPU already included powerful shift and bit-manipulation instructions: "Shift pair" which shifts a pair of registers, "extract" and "deposit" of bit fields, and all the common bit-wise logical operations (and, or, exclusive-or, etc.).
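The general behaviour of the pre-existing bit-manipulation operations named above can be sketched in C. This is an illustration only: the function names are hypothetical, bit positions are counted from the least significant bit for simplicity (which is not necessarily the ISA's own numbering), and the shift-pair helper assumes a right shift of the concatenated pair with the low 32 bits kept.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `len` bits set (len in 1..32). */
static uint32_t field_mask(unsigned len) {
    return (len >= 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
}

/* "Extract": read an unsigned field of `len` bits starting at bit `pos` (pos < 32). */
uint32_t extract_u(uint32_t r, unsigned pos, unsigned len) {
    return (r >> pos) & field_mask(len);
}

/* "Deposit": write the low `len` bits of `value` into `target` at bit `pos`. */
uint32_t deposit(uint32_t target, uint32_t value, unsigned pos, unsigned len) {
    uint32_t mask = field_mask(len) << pos;
    return (target & ~mask) | ((value << pos) & mask);
}

/* "Shift pair": treat hi:lo as one 64-bit value, shift right by sa (0..31),
   and keep the low 32 bits. Passing the same register twice gives a rotate. */
uint32_t shift_pair(uint32_t hi, uint32_t lo, unsigned sa) {
    uint64_t pair = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(pair >> (sa & 31u));
}

void bitfield_demo(void) {
    assert(extract_u(0xABCD1234u, 8, 8) == 0x12u);
    assert(shift_pair(0x12345678u, 0x12345678u, 8) == 0x78123456u);
}
```

Because operations of this kind were already available, the multimedia extensions did not need to duplicate them.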

This set of multimedia instructions has also performed well in practice. In 1996, the 64-bit MAX-2 instructions enabled real-time performance of MPEG-1 and MPEG-2 video while increasing the area of a RISC CPU by only 0.2%.{{cite journal|last1=Lee|first1=Ruby B.|authorlink=Ruby B. Lee|title=Subword Parallelism with MAX-2|journal=IEEE Micro|date=August 1996|volume=16|issue=4|pages=51–59|url=http://homepages.cae.wisc.edu/~ece734/mmx/00526925.pdf|accessdate=21 September 2014|doi=10.1109/40.526925}}

Implementations

MAX-1 was first implemented in the PA-7100LC in 1994. It is usually credited as the first SIMD extension to an ISA. MAX-2, for the 64-bit PA-RISC 2.0 ISA, was first implemented in the PA-8000 microprocessor, released in 1996.

The basic approach to the arithmetic in MAX-2 is to "interrupt the carries" between the 16-bit subwords, and to choose among modular arithmetic, signed saturation, and unsigned saturation. This requires only small changes to the arithmetic logic unit.
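The effect of interrupting the carries can be emulated in portable C. The sketch below is a software model, not a description of the hardware: `hadd_mod` masks off each lane's top bit so that no carry can cross a 16-bit boundary, while `hadd_ss` and `hadd_us` apply textbook signed and unsigned saturation lane by lane. Treating both operands of `hadd_us` as unsigned is a simplifying assumption; the exact operand interpretation of the real instruction should be checked against the architecture manual.

```c
#include <assert.h>
#include <stdint.h>

#define LANE_MSB 0x8000800080008000ULL  /* top bit of each 16-bit lane */

/* Modular add: sum the low 15 bits of every lane so that no carry can
   cross a lane boundary, then restore each lane's top bit with XOR. */
uint64_t hadd_mod(uint64_t a, uint64_t b) {
    uint64_t low = (a & ~LANE_MSB) + (b & ~LANE_MSB);
    return low ^ ((a ^ b) & LANE_MSB);
}

/* Read lane i (0 = rightmost) as an unsigned or a sign-extended value. */
static int32_t lane_u(uint64_t r, int i) {
    return (int32_t)((r >> (16 * i)) & 0xFFFFu);
}
static int32_t lane_s(uint64_t r, int i) {
    int32_t v = lane_u(r, i);
    return (v & 0x8000) ? v - 0x10000 : v;
}

/* Signed saturation: clamp each lane sum to [-32768, 32767]. */
uint64_t hadd_ss(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        int32_t s = lane_s(a, i) + lane_s(b, i);
        if (s > INT16_MAX) s = INT16_MAX;
        if (s < INT16_MIN) s = INT16_MIN;
        r |= (uint64_t)(uint16_t)s << (16 * i);
    }
    return r;
}

/* Unsigned saturation (simplified: both operands unsigned): clamp to 65535. */
uint64_t hadd_us(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        int32_t s = lane_u(a, i) + lane_u(b, i);
        if (s > 0xFFFF) s = 0xFFFF;
        r |= (uint64_t)s << (16 * i);
    }
    return r;
}

void hadd_demo(void) {
    /* 0xFFFF + 1 wraps to 0 in modular mode and clamps in unsigned saturation. */
    assert(hadd_mod(0xFFFFULL, 0x0001ULL) == 0x0000ULL);
    assert(hadd_us(0xFFFFULL, 0x0001ULL) == 0xFFFFULL);
}
```

Saturation is useful for pixel and audio data, where clamping at the extreme value is preferable to wrapping around.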

MAX-1

{| class="wikitable"
! width="100" | Instruction
! width="300" | Description
|-
| HADD
| Parallel add with modulo arithmetic
|-
| HADD,ss
| Parallel add with signed saturation
|-
| HADD,us
| Parallel add with unsigned saturation
|-
| HSUB
| Parallel subtract with modulo arithmetic
|-
| HSUB,ss
| Parallel subtract with signed saturation
|-
| HSUB,us
| Parallel subtract with unsigned saturation
|-
| HAVE
| Parallel average
|-
| HSHLADD
| Parallel shift left and add with signed saturation
|-
| HSHRADD
| Parallel shift right and add with signed saturation
|}
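Two of the operations in the table above can be modelled on a 32-bit register holding two 16-bit subwords. This is a sketch under stated assumptions: the names `hshladd` and `have_trunc` are hypothetical, the shift count `k` is assumed to be a small constant (1 to 3), and the average truncates, whereas the rounding rule of the real instruction is not described here and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend 16-bit lane i (0 = rightmost) of a 32-bit register image. */
static int32_t lane_s32(uint32_t r, int i) {
    int32_t v = (int32_t)((r >> (16 * i)) & 0xFFFFu);
    return (v & 0x8000) ? v - 0x10000 : v;
}

/* Clamp a wide intermediate result to the signed 16-bit range. */
static uint32_t sat16(int32_t v) {
    if (v > INT16_MAX) v = INT16_MAX;
    if (v < INT16_MIN) v = INT16_MIN;
    return (uint32_t)(uint16_t)v;
}

/* Shift left and add with signed saturation: each lane = sat((a << k) + b).
   The shift is written as a multiply to avoid left-shifting negative values. */
uint32_t hshladd(uint32_t a, unsigned k, uint32_t b) {
    uint32_t r = 0;
    for (int i = 0; i < 2; i++) {
        int32_t v = lane_s32(a, i) * (int32_t)(1u << (k & 3u)) + lane_s32(b, i);
        r |= sat16(v) << (16 * i);
    }
    return r;
}

/* Average of unsigned lanes, truncating (assumed rounding rule). */
uint32_t have_trunc(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int i = 0; i < 2; i++) {
        uint32_t s = ((a >> (16 * i)) & 0xFFFFu) + ((b >> (16 * i)) & 0xFFFFu);
        r |= (s >> 1) << (16 * i);
    }
    return r;
}

void max1_demo(void) {
    /* Multiply both lanes by 5 as (x << 2) + x, without a multiplier. */
    assert(hshladd(0x00030004u, 2, 0x00030004u) == 0x000F0014u);
}
```

Shift-and-add operations of this kind allow multiplication by small constants to be composed from a few instructions.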

MAX-2

MAX-2 instructions are register-to-register instructions that operate on multiple integers packed in 64-bit quantities. All have a one-cycle latency in the PA-8000 microprocessor and its derivatives. Memory accesses use the standard 64-bit loads and stores.

The "MIX" and "PERMH" instructions are a notable innovation because they rearrange subwords within registers without accessing memory. This can substantially speed up many operations.

{| class="wikitable"
! width="100" | Instruction
! width="300" | Description
|-
| HADD
| Parallel add with modulo arithmetic
|-
| HADD,ss
| Parallel add with signed saturation
|-
| HADD,us
| Parallel add with unsigned saturation
|-
| HSUB
| Parallel subtract with modulo arithmetic
|-
| HSUB,ss
| Parallel subtract with signed saturation
|-
| HSUB,us
| Parallel subtract with unsigned saturation
|-
| HSHLADD
| Parallel shift left and add with signed saturation
|-
| HSHRADD
| Parallel shift right and add with signed saturation
|-
| HAVG
| Parallel average
|-
| HSHR
| Parallel shift right signed
|-
| HSHR,u
| Parallel shift right unsigned
|-
| HSHL
| Parallel shift left
|-
| MIX
| Mix 16-bit sub-words in a 64-bit word; MIX Left, Ra,Rb,Rc, Rc:=a1,b1,a3,b3; MIX Right, Rc:=a2,b2,a4,b4
|-
| MIXW
| Mix 32-bit sub-words in a 64-bit word; e.g. MIXW Left, Ra,Rb,Rc, Rc:=a1,a2,b1,b2; MIXW Right, Rc:=a3,a4,b3,b4
|-
| PERMH
| Permute 16-bit sub-words of the source in any possible permutation in the destination register, including repetitions.
|}
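The data-rearrangement instructions in the table above can be modelled in C to show how they combine. The sketch assumes that sub-words a1 to a4 are numbered from the most significant end of the register, and the selector array passed to `permh` (0 = leftmost sub-word) is a hypothetical encoding chosen for illustration. Under those assumptions, a 4×4 matrix of 16-bit values held in four registers can be transposed with eight mix operations and no memory traffic.

```c
#include <assert.h>
#include <stdint.h>

/* MIX Left:  a1,b1,a3,b3    MIX Right: a2,b2,a4,b4 */
uint64_t mix_left(uint64_t a, uint64_t b) {
    return (a & 0xFFFF0000FFFF0000ULL) | ((b >> 16) & 0x0000FFFF0000FFFFULL);
}
uint64_t mix_right(uint64_t a, uint64_t b) {
    return ((a << 16) & 0xFFFF0000FFFF0000ULL) | (b & 0x0000FFFF0000FFFFULL);
}

/* MIXW Left: a1,a2,b1,b2    MIXW Right: a3,a4,b3,b4 */
uint64_t mixw_left(uint64_t a, uint64_t b) {
    return (a & 0xFFFFFFFF00000000ULL) | (b >> 32);
}
uint64_t mixw_right(uint64_t a, uint64_t b) {
    return (a << 32) | (b & 0x00000000FFFFFFFFULL);
}

/* PERMH: destination sub-word i takes source sub-word sel[i]
   (0 = leftmost); repetitions are allowed. */
uint64_t permh(uint64_t r, const unsigned sel[4]) {
    uint64_t out = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t h = (r >> (48u - 16u * (sel[i] & 3u))) & 0xFFFFu;
        out = (out << 16) | h;
    }
    return out;
}

/* Transpose a 4x4 matrix of 16-bit elements, one row per register. */
void transpose4x4(uint64_t row[4]) {
    uint64_t t0 = mix_left(row[0], row[1]);   /* a1 b1 a3 b3 */
    uint64_t t1 = mix_right(row[0], row[1]);  /* a2 b2 a4 b4 */
    uint64_t t2 = mix_left(row[2], row[3]);   /* c1 d1 c3 d3 */
    uint64_t t3 = mix_right(row[2], row[3]);  /* c2 d2 c4 d4 */
    row[0] = mixw_left(t0, t2);               /* a1 b1 c1 d1 */
    row[1] = mixw_left(t1, t3);               /* a2 b2 c2 d2 */
    row[2] = mixw_right(t0, t2);              /* a3 b3 c3 d3 */
    row[3] = mixw_right(t1, t3);              /* a4 b4 c4 d4 */
}

void permute_demo(void) {
    const unsigned reverse[4] = {3, 2, 1, 0};
    assert(permh(0x0001000200030004ULL, reverse) == 0x0004000300020001ULL);
}
```

A selector that repeats one index, such as {0, 0, 0, 0}, broadcasts a single sub-word to all four positions.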

References

{{reflist}}

* [http://www.openpa.net/pa-risc_architecture.html#max Multimedia Acceleration eXtensions (MAX-1 and MAX-2) PA-RISC CPU Architecture] OpenPA.net

{{Multimedia extensions}}

Category:Computer-related introductions in 1994

Category:HP microprocessors

Category:SIMD computing