Searchable symmetric encryption

{{Short description|System allowing searching of encrypted documents}}

File:Searchable Symmetric Encryption (SSE) scheme.png

Searchable symmetric encryption (SSE) is a form of encryption that allows one to efficiently search over a collection of encrypted documents or files without the ability to decrypt them.{{Cite book|last1=Dawn Xiaoding Song|last2=Wagner|first2=D.|last3=Perrig|first3=A.|title=Proceeding 2000 IEEE Symposium on Security and Privacy. S&P 2000 |chapter=Practical techniques for searches on encrypted data |chapter-url=http://dx.doi.org/10.1109/secpri.2000.848445|year=2000|pages=44–55|publisher=IEEE Comput. Soc|doi=10.1109/secpri.2000.848445|isbn=0-7695-0665-8|s2cid=2829840}}{{Cite book|last1=Curtmola|first1=Reza|last2=Garay|first2=Juan|last3=Kamara|first3=Seny|last4=Ostrovsky|first4=Rafail|title=Proceedings of the 13th ACM conference on Computer and communications security |chapter=Searchable symmetric encryption |date=2006-10-30|chapter-url=https://doi.org/10.1145/1180405.1180417|series=CCS '06|location=Alexandria, Virginia, USA|publisher=Association for Computing Machinery|pages=79–88|doi=10.1145/1180405.1180417|isbn=978-1-59593-518-2|s2cid=961719}}{{Cite journal |last1=Amorim |first1=Ivone |last2=Costa |first2=Ivan |date=2023-07-01 |title=Leveraging Searchable Encryption through Homomorphic Encryption: A Comprehensive Analysis |journal=Mathematics |language=en |volume=11 |issue=13 |pages=2948 |doi=10.3390/math11132948 |doi-access=free |issn=2227-7390|arxiv=2306.14407 }} SSE can be used to outsource files to an untrusted cloud storage server without ever revealing the files in the clear but while preserving the server's ability to search over them.

Description

A searchable symmetric encryption scheme is a symmetric-key encryption scheme that encrypts a collection of documents \mathbf{D} = (\mathrm{D_1}, \dots, \mathrm{D_n}), where each document \mathrm{D_i} \subseteq \mathbb{W} is viewed as a set of keywords from a keyword space \mathbb{W}. Given the encryption key K and a keyword w \in \mathbb{W}, one can generate a search token tk with which the encrypted data collection can be searched for w. The result of the search is the subset of encrypted documents that contain the keyword w.

= Static SSE =

A static SSE scheme consists of three algorithms \mathsf{SSE = (Setup, Token, Search)} that work as follows:

  • \mathsf{Setup} takes as input a security parameter k and a document collection \mathbf{D} and outputs a symmetric key K, an encrypted index \mathbf{I}, and an encrypted document collection \mathbf{ED}
  • \mathsf{Token} takes as input the secret key K and a keyword w and outputs a search token tk
  • \mathsf{Search} takes as input the encrypted index \mathbf{I}, the encrypted document collection \mathbf{ED} and a search token tk and outputs a set of encrypted documents \mathbf{R} \subseteq \mathbf{ED}

A static SSE scheme is used by a client and an untrusted server as follows. The client encrypts its data collection using the \mathsf{Setup} algorithm which returns a secret key K and an encrypted document collection \mathbf{ED}. The client keeps K secret and sends \mathbf{ED} and \mathbf{I} to the untrusted server. To search for a keyword w, the client runs the \mathsf{Token} algorithm on K and w to generate a search token tk which it sends to the server. The server runs Search with \mathbf{ED}, \mathbf{I}, and tk and returns the resulting encrypted documents back to the client.

= Dynamic SSE =

A dynamic SSE scheme supports, in addition to search, the insertion and deletion of documents. A dynamic SSE scheme consists of seven algorithms \mathsf{SSE = (Setup, Token, Search, InsertToken, Insert, DeleteToken, Delete)} where \mathsf{Setup}, \mathsf{Token} and \mathsf{Search} are as in the static case and the remaining algorithms work as follows:

  • \mathsf{InsertToken} takes as input the secret key K and a new document \mathrm{D_{n+1}} and outputs an insert token itk
  • \mathsf{Insert} takes as input the encrypted document collection EDC and an insert token itk and outputs an updated encrypted document collection \mathbf{ED'}
  • \mathsf{DeleteToken} takes as input the secret key K and a document identifier id and outputs a delete token dtk
  • \mathsf{Delete} takes as input the encrypted data collection \mathrm{EDC} and a delete token dtk and outputs an updated encrypted data collection \mathbf{ED'}

To add a new document \mathrm{D_{n+1}} the client runs \mathsf{InsertToken} on K and \mathrm{D_{n+1}}to generate an insert token itk which it sends to the server. The server runs \mathsf{Insert} with \mathbf{ED} and itk and stores the updated encrypted document collection. To delete a document with identifier id, the client runs the \mathsf{DeleteToken} algorithm with K and id to generate a delete token dtk which it sends to the server. The server runs \mathsf{Delete} with \mathbf{ED} and dtk and stores the updated encrypted document collection.

An SSE scheme that does not support \mathsf{DeleteToken} and \mathsf{Delete} is called semi-dynamic.

History of Searchable Symmetric Encryption

The problem of searching on encrypted data was considered by Song, Wagner and Perrig though previous work on Oblivious RAM by Goldreich and Ostrovsky{{Cite journal|last1=Goldreich|first1=Oded|last2=Ostrovsky|first2=Rafail|date=May 1996|title=Software protection and simulation on oblivious RAMs|url=http://dx.doi.org/10.1145/233551.233553|journal=Journal of the ACM|volume=43|issue=3|pages=431–473|doi=10.1145/233551.233553|hdl=1721.1/103684 |s2cid=7502114 |issn=0004-5411|hdl-access=free}} could be used in theory to address the problem. This work proposed an SSE scheme with a search algorithm that runs in time O(s), where s = |\mathbf{D}|. Goh{{Cite web|last=Goh|first=Eu-Jin|title=Secure Indexes|year=2003 |url=https://eprint.iacr.org/2003/216}} and Chang and Mitzenmacher{{Cite book|last1=Chang|first1=Yan-Cheng|last2=Mitzenmacher|first2=Michael|title=Applied Cryptography and Network Security |chapter=Privacy Preserving Keyword Searches on Remote Encrypted Data |date=2005|editor-last=Ioannidis|editor-first=John|editor2-last=Keromytis|editor2-first=Angelos|editor3-last=Yung|editor3-first=Moti|series=Lecture Notes in Computer Science|volume=3531 |language=en|location=Berlin, Heidelberg|publisher=Springer|pages=442–455|doi=10.1007/11496137_30|isbn=978-3-540-31542-1|doi-access=free}} gave new SSE constructions with search algorithms that run in time O(n), where n is the number of documents. Curtmola, Garay, Kamara and Ostrovsky later proposed two static constructions with O(\mathrm{opt} ) search time, where \mathrm{opt} is the number of documents that contain w, which is optimal. This work also proposed a semi-dynamic construction with O(\mathrm{opt}\cdot \log(u)) search time, where u is the number of updates. An optimal dynamic SSE construction was later proposed by Kamara, Papamanthou and Roeder.{{Cite book|last1=Kamara|first1=Seny|last2=Papamanthou|first2=Charalampos|last3=Roeder|first3=Tom|title=Proceedings of the 2012 ACM conference on Computer and communications security |chapter=Dynamic searchable symmetric encryption |date=2012-10-16|chapter-url=https://doi.org/10.1145/2382196.2382298|series=CCS '12|location=New York, NY, USA|publisher=Association for Computing Machinery|pages=965–976|doi=10.1145/2382196.2382298|isbn=978-1-4503-1651-4|s2cid=243046 }}

Goh and Chang and Mitzenmacher proposed security definitions for SSE. These were strengthened and extended by Curtmola, Garay, Kamara and Ostrovsky who proposed the notion of adaptive security for SSE. This work also was the first to observe leakage in SSE and to formally capture it as part of the security definition. Leakage was further formalized and generalized by Chase and Kamara.{{Cite book|last1=Chase|first1=Melissa|last2=Kamara|first2=Seny|title=Advances in Cryptology - ASIACRYPT 2010 |chapter=Structured Encryption and Controlled Disclosure |date=2010|editor-last=Abe|editor-first=Masayuki|series=Lecture Notes in Computer Science|volume=6477 |language=en|location=Berlin, Heidelberg|publisher=Springer|pages=577–594|doi=10.1007/978-3-642-17373-8_33|isbn=978-3-642-17373-8|doi-access=free}} Islam, Kuzu and Kantarcioglu described the first leakage attack.{{Cite journal|last1=Islam|first1=Mohammad|last2=Kuzu|first2=Mehmet|last3=Kantarcioglu|first3=Murat|title=Access Pattern disclosure on Searchable Encryption:Ramification, Attack and Mitigation|url=https://www.ndss-symposium.org/wp-content/uploads/2017/09/06_1.pdf|journal=Network and Distributed System Security (NDSS) Symposium}}

All the previously mentioned constructions support single keyword search. Cash, Jarecki, Jutla, Krawczyk, Rosu and Steiner{{Cite book|last1=Cash|first1=David|last2=Jarecki|first2=Stanislaw|last3=Jutla|first3=Charanjit|last4=Krawczyk|first4=Hugo|last5=Roşu|first5=Marcel-Cătălin|last6=Steiner|first6=Michael|title=Advances in Cryptology – CRYPTO 2013 |chapter=Highly-Scalable Searchable Symmetric Encryption with Support for Boolean Queries |date=2013|editor-last=Canetti|editor-first=Ran|editor2-last=Garay|editor2-first=Juan A.|series=Lecture Notes in Computer Science|volume=8042 |language=en|location=Berlin, Heidelberg|publisher=Springer|pages=353–373|doi=10.1007/978-3-642-40041-4_20|isbn=978-3-642-40041-4|doi-access=free}} proposed an SSE scheme that supports conjunctive search in sub-linear time in n. The construction can also be extended to support disjunctive and Boolean searches that can be expressed in searchable normal form (SNF) in sub-linear time. At the same time, Pappas, Krell, Vo, Kolesnikov, Malkin, Choi, George, Keromytis and Bellovin{{Cite book|last1=Pappas|first1=Vasilis|last2=Krell|first2=Fernando|last3=Vo|first3=Binh|last4=Kolesnikov|first4=Vladimir|last5=Malkin|first5=Tal|last6=Choi|first6=Seung Geol|last7=George|first7=Wesley|last8=Keromytis|first8=Angelos|last9=Bellovin|first9=Steve|title=2014 IEEE Symposium on Security and Privacy |chapter=Blind Seer: A Scalable Private DBMS |date=May 2014|pages=359–374 |publisher=IEEE|doi=10.1109/sp.2014.30|isbn=978-1-4799-4686-0 |s2cid=9165575 |doi-access=free}} described a construction that supports conjunctive and all disjunctive and Boolean searches in sub-linear time.

Security

SSE schemes are designed to guarantee that the untrusted server cannot learn any partial information about the documents or the search queries beyond some well-defined and reasonable leakage. The leakage of a scheme is formally described using a leakage profile which itself can consists of several leakage patterns. SSE constructions attempt to minimize leakage while achieving the best possible search efficiency.

SSE security can be analyzed in several adversarial models but the most common are:

  • the persistent model, where an adversary is given the encrypted data collection and a transcript of all the operations executed on the collection;
  • the snapshot model,{{Cite journal|last1=Amjad|first1=Ghous|last2=Kamara|first2=Seny|last3=Moataz|first3=Tarik|date=2019-01-01|title=Breach-Resistant Structured Encryption|journal=Proceedings on Privacy Enhancing Technologies|language=en|volume=2019|issue=1|pages=245–265|doi=10.2478/popets-2019-0014|s2cid=4047057 |doi-access=free}} where an adversary is only given the encrypted data collection (but possibly after each operation).

= Security in the Persistent Model =

In the persistent model, there are SSE schemes that achieve a wide variety of leakage profiles. The most common leakage profile for static schemes that achieve single keyword search in optimal time is \Lambda_{\mathrm{opt}} which reveals the number of documents in the collection, the size of each document in the collection, if and when a query was repeated and which encrypted documents match the search query.{{Cite web|title=Dynamic Searchable Encryption in Very-Large Databases: Data Structures and Implementation – NDSS Symposium|url=https://www.ndss-symposium.org/ndss2014/programme/dynamic-searchable-encryption-very-large-databases-data-structures-and-implementation/|access-date=2022-02-22|language=en-US}} It is known, however, how to construct schemes that leak considerably less at an additional cost in search time and storage.{{Cite book|last1=Kamara|first1=Seny|last2=Moataz|first2=Tarik|last3=Ohrimenko|first3=Olya|title=Advances in Cryptology – CRYPTO 2018 |chapter=Structured Encryption and Leakage Suppression |date=2018|editor-last=Shacham|editor-first=Hovav|editor2-last=Boldyreva|editor2-first=Alexandra|chapter-url=https://link.springer.com/chapter/10.1007/978-3-319-96884-1_12|series=Lecture Notes in Computer Science|volume=10991 |language=en|location=Cham|publisher=Springer International Publishing|pages=339–370|doi=10.1007/978-3-319-96884-1_12|isbn=978-3-319-96884-1|s2cid=51603585 }}{{Cite web|title=Revisiting Leakage Abuse Attacks – NDSS Symposium|url=https://www.ndss-symposium.org/ndss-paper/revisiting-leakage-abuse-attacks/|access-date=2022-02-22|language=en-US}}

When considering dynamic SSE schemes, the state-of-the-art constructions with optimal time search have leakage profiles that guarantee forward privacy{{Cite web|title=Practical Dynamic Searchable Encryption with Small Leakage – NDSS Symposium|url=https://www.ndss-symposium.org/ndss2014/programme/practical-dynamic-searchable-encryption-small-leakage/|access-date=2022-02-22|language=en-US}} which means that inserts cannot be correlated with past search queries.

= Security in the Snapshot Model =

In the snapshot model, efficient dynamic SSE schemes with no leakage beyond the number of documents and the size of the collection can be constructed. When using an SSE construction that is secure in the snapshot model one has to carefully consider how the scheme will be deployed because some systems might cache previous search queries.{{Cite book|last1=Grubbs|first1=Paul|last2=Ristenpart|first2=Thomas|last3=Shmatikov|first3=Vitaly|title=Proceedings of the 16th Workshop on Hot Topics in Operating Systems |chapter=Why Your Encrypted Database is Not Secure |date=2017-05-07|chapter-url=https://doi.org/10.1145/3102980.3103007|series=HotOS '17|location=New York, NY, USA|publisher=Association for Computing Machinery|pages=162–168|doi=10.1145/3102980.3103007|isbn=978-1-4503-5068-6|s2cid=10111288 }}

= Cryptanalysis =

A leakage profile only describes the leakage of an SSE scheme but it says nothing about whether that leakage can be exploited or not. Cryptanalysis is therefore used to better understand the real-world security of a leakage profile. There is a wide variety of attacks working in different adversarial models, based on a variety of assumptions and attacking different leakage profiles.{{Cite book|last1=Yao|first1=Jing|last2=Zheng|first2=Yifeng|last3=Guo|first3=Yu|last4=Wang|first4=Cong|title=Proceedings of the 8th International Workshop on Security in Blockchain and Cloud Computing |chapter=SoK: A Systematic Study of Attacks in Efficient Encrypted Cloud Data Search |date=2020-10-06|chapter-url=https://doi.org/10.1145/3384942.3406869|series=SBC '20|location=New York, NY, USA|publisher=Association for Computing Machinery|pages=14–20|doi=10.1145/3384942.3406869|isbn=978-1-4503-7609-9|s2cid=222179683 }}{{Cite journal|last1=Kamara|first1=Seny|last2=Kati|first2=Abdelkarim|last3=Moataz|first3=Tarik|last4=Schneider|first4=Thomas|last5=Treiber|first5=Amos|last6=Yonli|first6=Michael|date=2021|title=Cryptanalysis of Encrypted Search with LEAKER - A framework for LEakage AttacK Evaluation on Real-world data|url=https://eprint.iacr.org/2021/1035}}

See also

References