Lyra2

{{Short description|Key derivative function}}

{{Multiple issues|

{{advert|date=October 2017}}

{{primary sources|date=October 2017}}

{{more citations needed|date=October 2017}}

}}

Lyra2 is a password hashing scheme (PHS) that can also function as a key derivation function (KDF). It gained recognition during the Password Hashing Competition in July 2015,{{Cite web|url=https://password-hashing.net/|title=Password Hashing Competition|website=password-hashing.net|access-date=2016-03-22}} which was won by Argon2. It is also used in proof-of-work algorithms such as Lyra2REv2,{{Cite web|url=https://en.bitcoinwiki.org/wiki/Lyra2REv2|title=Lyra2REv2|website=eprint.iacr.org|access-date=2016-03-22}} adopted by Vertcoin{{Cite web|url=http://vertcoin.org|title=Vertcoin|website=vertcoin.org|access-date=2019-10-08}} and MonaCoin,{{Cite web|url=https://monacoin.org|title=MonaCoin|website=monacoin.org|access-date=2019-10-08}} among other cryptocurrencies.{{Cite conference|last1=van Beirendonck|first1=M.|last2=Trudeau|first2=L.|last3=Giard|first3=P.|last4=Balatsoukas-Stimming|first4=A.|date=2019-05-29|title=A Lyra2 FPGA Core for Lyra2REv2-Based Cryptocurrencies|conference=IEEE International Symposium on Circuits and Systems (ISCAS)|location=Sapporo, Japan|publisher=IEEE|pages=1–5|doi=10.1109/ISCAS.2019.8702498|arxiv=1807.05764}}

Lyra2 was designed by Marcos A. Simplicio Jr., Leonardo C. Almeida, Ewerton R. Andrade, Paulo C. F. dos Santos, and Paulo S. L. M. Barreto from Escola Politécnica da Universidade de São Paulo.{{Cite web|url=https://eprint.iacr.org/2015/136|title=Cryptology ePrint Archive: Report 2015/136|website=eprint.iacr.org|access-date=2016-03-22}} It is based on Lyra,{{Cite journal|last1=Almeida|first1=Leonardo C.|last2=Andrade|first2=Ewerton R.|last3=Barreto|first3=Paulo S. L. M.|last4=Simplicio Jr|first4=Marcos A. |date=2014-01-04|title=Lyra: password-based key derivation with tunable memory and processing costs|journal=Journal of Cryptographic Engineering|language=en|volume=4|issue=2|pages=75–89|doi=10.1007/s13389-013-0063-5|issn=2190-8508|citeseerx=10.1.1.642.8519|s2cid=5245769 }}{{Cite web|url=https://eprint.iacr.org/2014/030|title=Cryptology ePrint Archive: Report 2014/030|website=eprint.iacr.org|access-date=2016-03-22}} which had been created by the same team. Lyra2 includes:

  • The ability to configure the desired amount of memory, processing time, and parallelism for the algorithm.
  • High memory usage with processing time similar to scrypt.

In addition, it:{{Cite journal|last1=Andrade|first1=E.|last2=Simplicio Jr|first2=M. |last3=Barreto|first3=P.|last4=Santos|first4=P.|date=2016-01-01|title=Lyra2: efficient password hashing with high security against time-memory trade-offs|journal=IEEE Transactions on Computers|volume=PP|issue=99|pages=3096–3108|doi=10.1109/TC.2016.2516011|s2cid=37232444 |issn=0018-9340}}

  • Provides higher security against time-memory trade-offs.
  • Allows legitimate users to better benefit from the parallelism capabilities of their own platforms.
  • Increases the costs of creating dedicated hardware to attack the algorithm.
  • Balances resistance against side-channel threats and attacks using cheaper, slower storage devices.

Lyra2 is released into the public domain.{{Citation needed | date=September 2024}}

Design

As any PHS, Lyra2 takes as input a salt and a password, creating a pseudorandom output that can then be used as key material for cryptographic algorithms or as an authentication string.{{Cite journal|url=http://csrc.nist.gov/publications/nistpubs/800-108/sp800-108.pdf|title=Recommendation for Key Derivation Using Pseudorandom Functions (Revised)|last=Chen|first=Lily|website=Computer Security|publisher=NIST|doi=10.6028/NIST.SP.800-108|year=2009|doi-access=free}}{{failed verification|date=December 2019}}{{Citation needed|date=December 2019}}

Internally, the scheme's memory is organized as a matrix that is expected to remain in memory during the whole password hashing process. Since its cells are iteratively read and written, discarding a cell for saving memory leads to the need of recomputing it whenever it is accessed once again, until the point it was last modified.

The construction and visitation of the matrix is done using a stateful combination of the absorbing, squeezing and duplexing operations of the underlying sponge (i.e., its internal state is never reset to zero), ensuring the sequential nature of the whole process.

Also, the number of times the matrix's cells are revisited after initialization is defined by the user, allowing Lyra2's execution time to be fine-tuned according to the target platform's resources.

= Inputs =

  • {{mono|password}}
  • {{mono|salt}}
  • {{mono|t_cost}} - execution time
  • {{mono|m_cost}} - memory required
  • {{mono|outlen}}

The algorithm additionally enables parameterization in terms of:

  • degree of parallelism (number of threads p)
  • underlying permutation function (can be seen as the main cryptographic primitive)
  • number of blocks used by the underlying permutation function (bitrate)
  • number of rounds performed for the underlying permutation function (\rho)
  • number of bits to be used in rotations (\omega)
  • output length(\kappa)

= Functions/symbols =

; || : Concatenate two strings

; {{code|^}} : Bitwise XOR

; {{code|[+]}} : Word wise add operation (i.e., ignoring carries between words)

; {{code|%}} : Modulus

; {{code|W}} : The target machine's word size (usually, 32 or 64)

; {{code|omega}} : Number of bits to be used in rotations (recommended: a multiple of the machine's word size, W)

; {{code|>>>}} : Right rotation

; {{code|rho}} : Number of rounds for reduced squeeze or duplexing operations

; {{code|blen}} : Sponge's block size in bytes

; {{code|H or H_i}} : Sponge with block size blen (in bytes) and underlying permutation f

; {{code|H.absorb(input)}} : Sponge's absorb operation on input

; {{code|H.squeeze(len)}} : Sponge's squeeze operation of len bytes

; {{code|H.squeeze_{rho}(len)}} : Sponge's squeeze operation of len bytes using rho rounds of f

; {{code|H.duplexing(input,len)}} : Sponge's duplexing operation on input, producing len bytes

; {{code|H.duplexing_{rho}(input,len)}}:Sponge's duplexing operation on input, using rho rounds of f, producing len bytes

; {{code|pad(string)}} : Pads a string to a multiple of blen bytes (padding rule: 10*1)

; {{code|lsw(input)}} : The least significant word of input

; {{code|len(string)}} : Length of a string, in bytes

; {{code|syncThreads()}} : Synchronize parallel threads

; {{code|swap(input1,input2)}} : Swap the value of two inputs

; {{code|C}} : Number of columns on the memory matrix (usually, 64, 128, 256, 512 or 1024)

; {{code|P}} : Degree of parallelism ({{code|1=P >= 1}} and {{code|1=(m_cost/2) % P == 0}})

= Algorithm without parallelism =

{{pre|1=

  • Bootstrapping phase: Initializes the sponge's state and local variables
  1. Byte representation of input parameters (others can be added)

params = outlen || len(password) || len(salt) || t_cost || m_cost || C

  1. Initializes the sponge's state (after that, password can be overwritten)

H.absorb( pad(password || salt || params) )

  1. Initializes visitation step, window and first rows that will feed

gap = 1

stp = 1

wnd = 2

sqrt = 2

prev0 = 2

row1 = 1

prev1 = 0

  • Setup phase: Initializes a (m_cost x C) memory matrix, its cells having blen-byte cells
  1. Initializes M[0], M[1] and M[2]

for col = 0 to C-1

M[0][C-1-col] = H.squeeze_{rho}(blen)

for col = 0 to C-1

M[1][C-1-col] = H.duplexing_{rho}( M[0][col], blen)

for col = 0 to C-1

M[2][C-1-col] = H.duplexing_{rho}( M[1][col], blen)

  1. Filling Loop: initializes remainder rows

for row0 = 3 to m_cost-1

# Columns Loop: M[row0] is initialized and M[row1] is updated

for col = 0 to C-1

rand = H.duplexing_{rho}( M[row1][col] [+] M[prev0][col] [+] M[prev1][col], blen)

M[row0][C-1-col] = M[prev0][col] ^ rand

M[row1][col] = M[row1][col] ^ ( rand >>> omega )

# Rows to be revisited in next loop

prev0 = row0

prev1 = row1

row1 = (row1 + stp) % wnd

# Window fully revisited

if (row1 = 0)

# Doubles window and adjusts step

wnd = 2 * wnd

stp = sqrt + gap

gap = -gap

# Doubles sqrt every other iteration

if (gap = -1)

sqrt = 2 * sqrt

  • Wandering phase: Iteratively overwrites pseudorandom cells of the memory matrix
  1. Visitation Loop: (2 * m_cost * t_cost) rows revisited in pseudorandom fashion

for wCount = 0 to ( (m_cost * t_cost) - 1)

# Picks pseudorandom rows

row0 = lsw(rand) % m_cost

row1 = lsw( rand >>> omega ) % m_cost

# Columns Loop: updates both M[row0] and M[row1]

for col = 0 to C-1

# Picks pseudorandom columns

col0 = lsw( ( rand >>> omega ) >>> omega ) % C

col1 = lsw( ( ( rand >>> omega ) >>> omega ) >>> omega ) % C

rand = H.duplexing_{rho}( M[row0][col] [+] M[row1][col] [+] M[prev0][col0] [+] M[prev1][col1], blen)

M[row0][col] = M[row0][col] ^ rand

M[row1][col] = M[row1][col] ^ ( rand >>> omega )

# Next iteration revisits most recently updated rows

prev0 = row0

prev1 = row1

  • Wrap-up phase: output computation
  1. Absorbs a final column with a full-round sponge

H.absorb( M[row0][0] )

  1. Squeezes outlen bits with a full-round sponge

output = H.squeeze(outlen)

  1. Provides outlen-long bitstring as output

return output

}}

= Algorithm with parallelism =

for each i in [0..P]

** Bootstrapping phase: Initializes the sponge's state and local variables

# Byte representation of input parameters (others can be added)

params = outlen || len(password) || len(salt) || t_cost || m_cost || C || P || i

# Initializes the sponge's state (after that, password can be overwritten)

H_i.absorb( pad(password || salt || params) )

# Initializes visitation step, window and first rows that will feed

gap = 1

stp = 1

wnd = 2

sqrt = 2

sync = 4

j = i

prev0 = 2

rowP = 1

prevP = 0

** Setup phase: Initializes a (m_cost x C) memory matrix, its cells having blen-byte cells

# Initializes M_i[0], M_i[1] and M_i[2]

for col = 0 to C-1

M_i[0][C-1-col] = H_i.squeeze_{rho}(blen)

for col = 0 to C-1

M_i[1][C-1-col] = H_i.duplexing_{rho}( M_i[0][col], blen)

for col = 0 to C-1

M_i[2][C-1-col] = H_i.duplexing_{rho}( M_i[1][col], blen)

# Filling Loop: initializes remainder rows

for row0 = 3 to ( (m_cost / P) - 1 )

# Columns Loop: M_i[row0] is initialized and M_j[row1] is updated

for col = 0 to C-1

rand = H_i.duplexing_{rho}( M_j[rowP][col] [+] M_i[prev0][col] [+] M_j[prevP][col], blen)

M_i[row0][C-1-col] = M_i[prev0][col] ^ rand

M_j[rowP][col] = M_j[rowP][col] ^ ( rand >>> omega )

# Rows to be revisited in next loop

prev0 = row0

prevP = rowP

rowP = (rowP + stp) % wnd

# Window fully revisited

if (rowP = 0)

# Doubles window and adjusts step

wnd = 2 * wnd

stp = sqrt + gap

gap = -gap

# Doubles sqrt every other iteration

if (gap = -1)

sqrt = 2 * sqrt

# Synchronize point

if (row0 = sync)

sync = sync + (sqrt / 2)

j = (j + 1) % P

syncThreads()

syncThreads()

** Wandering phase: Iteratively overwrites pseudorandom cells of the memory matrix

wnd = m_cost / (2 * P)

sync = sqrt

off0 = 0

offP = wnd

# Visitation Loop: (2 * m_cost * t_cost / P) rows revisited in pseudorandom fashion

for wCount = 0 to ( ( (m_cost * t_cost) / P) - 1)

# Picks pseudorandom rows and slices (j)

row0 = off0 + (lsw(rand) % wnd)

rowP = offP + (lsw( rand >>> omega ) % wnd)

j = lsw( ( rand >>> omega ) >>> omega ) % P

# Columns Loop: update M_i[row0]

for col = 0 to C-1

# Picks pseudorandom column

col0 = lsw( ( ( rand >>> omega ) >>> omega ) >>> omega ) % C

rand = H_i.duplexing_{rho}( M_i[row0][col] [+] M_i[prev0][col0] [+] M_j[rowP][col], blen)

M_i[row0][col] = M_i[row0][col] ^ rand

# Next iteration revisits most recently updated rows

prev0 = row0

# Synchronize point

if (wCount = sync)

sync = sync + sqrt

swap(off0,offP)

syncThreads()

syncThreads()

** Wrap-up phase: output computation

# Absorbs a final column with a full-round sponge

H_i.absorb( M_i[row0][0] )

# Squeezes outlen bits with a full-round sponge

output_i = H_i.squeeze(outlen)

  1. Provides outlen-long bitstring as output

return output_0 ^ ... ^ output_{P-1}

Security analysis

Against Lyra2, the processing cost of attacks using 1/2^{n+2} of the amount of memory employed by a legitimate user is expected to be between O(2^{2nT}R^{3}) and O(2^{2nT}R^{n+2}), the latter being a better estimate for n \gg 1, instead of the O(R) achieved when the amount of memory is O(R), where T is a user-defined parameter to define a processing time.

This compares well to Scrypt, which displays a cost of O(R^{2}) when the memory usage is highO(1),{{Cite web|url=https://www.tarsnap.com/scrypt/scrypt.pdf|title=Stronger Key Derivation via Sequential Memory-Hard Functions|last=Percival|first=Colin|website=TARSNAP|publisher=The Technical BSD Conference}} and with other solutions in the literature, for which the results are usually O(R^{T+1}).{{Cite web|url=https://eprint.iacr.org/2013/525|title=Cryptology ePrint Archive: Report 2013/525|website=eprint.iacr.org|access-date=2016-03-22}}{{Cite web|url=https://www.uni-weimar.de/fileadmin/user/fak/medien/professuren/Mediensicherheit/Research/Theses/sascha-schmidt-master-thesis-catena.pdf|title=Implementation of the Catena Password-Scrambling Framework|last=Schmidt|first=Sascha|website=Bauhaus-Universität Weimar|publisher=Faculty of Media}}{{Cite web|url=https://github.com/P-H-C/phc-winner-argon2/blob/master/argon2-specs.pdf|title=P-H-C/phc-winner-argon2|website=GitHub|access-date=2016-03-22}}

Nonetheless, in practice these solutions usually involve a value of R (memory usage) lower than those attained with the Lyra2 for the same processing time.{{Cite web|url=http://article.gmane.org/gmane.comp.security.phc/2237|title=Gmane -- Another PHC candidates mechanical tests|website=article.gmane.org|access-date=2016-03-22}}{{Cite web|url=http://article.gmane.org/gmane.comp.security.phc/1992|title=Gmane -- A review per day Lyra2|website=article.gmane.org|access-date=2016-03-22}}{{Cite web|url=http://article.gmane.org/gmane.comp.security.phc/1596|title=Gmane -- Lyra2 initial review|website=article.gmane.org|access-date=2016-03-22}}{{Cite web|url=http://article.gmane.org/gmane.comp.security.phc/1849|title=Gmane -- Memory performance and ASIC attacks|website=article.gmane.org|access-date=2016-03-22}}{{Cite web|url=http://article.gmane.org/gmane.comp.security.phc/1830|title=Gmane -- Quick analysis of Argon|website=article.gmane.org|access-date=2016-03-22}}

Performance

File:Lyra2-Bench.pdf and memory-hard PHC finalists with minimum parameters.]]

The processing time obtained with an SSE single-core implementation of Lyra2 is illustrated in the hereby shown figure. This figure was extracted from, and is very similar to, third-party benchmarks performed during the PHC context.

The results depicted correspond to the average execution time of Lyra2 configured with C=256, \rho=1, b=768 bits (i.e., the inner state has 256 bits), and different T and R settings, giving an overall idea of possible combinations of parameters and the corresponding usage of resources.

As shown in this figure, Lyra2 is able to execute in: less than 1 s while using up to 400 MB (with R = 2^{14} and T=5) or up to 1 GB of memory (with R \approx 4.2\cdot10^{4} and T=1); or in less than 5 s with 1.6 GB (with R = 2^{16} and T=6).

All tests were performed on an Intel Xeon E5-2430 (2.20 GHz with 12 Cores, 64 bits) equipped with 48 GB of DRAM, running Ubuntu 14.04 LTS 64 bits, and the source code was compiled using GCC 4.9.2.

Extensions

Lyra offers two main extensions:{{Cite web|url=https://password-hashing.net/submissions/specs/Lyra2-v3.pdf|title=The Lyra2 reference guide|last1=Simplicio Jr|first1=Marcos A.|last2=Almeida|first2=Leonardo C.|last3=Andrade|first3=Ewerton R.|last4=Santos|first4=Paulo C.|last5=Barreto|first5=Paulo S. L. M.|website=PHC|publisher=The Password Hashing Competition}}

  • Lyra2-δ: Provides more control over the algorithm's bandwidth usage.
  • Lyra2p: Takes advantage of parallelism capabilities on the user's platform.

See also

References

{{reflist}}