Privacy and technology have a complex relationship. Often, the two are pitted against each other and interpreted through a lens of competing interests, where an advancement in one leads necessarily to a reduction in the other. Although perhaps inconvenient to acknowledge, there is some truth to this perspective. It is based on a genuine insight dating back to at least 1890, when Samuel Warren and Louis Brandeis first wrote a ground-breaking article on “The Right to Privacy.”Footnote 1
After observing and—at least in the case of WarrenFootnote 2—directly experiencing the detrimental effects of unregulated tabloid journalism on individuals and their families, Warren and Brandeis sought to formulate a new principle on which to ground claims of privacy. Instead of commercial property rights, the two argued it was an individual’s “inviolate personality” that gave privacy its true legitimacy as a fundamental right. When viewed in this way, privacy is granted a certain primacy over other interests, including the economic interests of others. In the case of Warren and Brandeis, this meant recognizing that privacy and certain uses of the then-recent technical innovations of “instantaneous photographs and newspaper enterprise” could not coexist. For privacy to advance, meaningful limits had to be placed on this technology.
Yet, privacy and technology are not always in conflict. In fact, it is possible for the two to enjoy a kind of symbiotic relationship, where each supports and enhances the existence of the other. Instead of a zero-sum game, privacy and technology can work together to form an alliance of sorts, resulting in a “win-win” situation. The technologies that fall squarely into this category of values are known as privacy-enhancing technologies (PETs).
This Tech-Know blog post is a continuation of our series on PETs. In a previous post, we discussed differential privacy and federated learning. In another, we analyzed the pros and cons of synthetic data. Here, we turn to a new topic: homomorphic encryption.
Introduction
Homomorphic encryption is a special form of encryption whose structure enables mathematical operations to be performed directly on encrypted data or “ciphertext,” without the need for prior decryption. The result of a homomorphic operation is a new ciphertext, which, when decrypted, is equivalent to the result of the same operation performed on the plaintext message. Formally, homomorphic encryption satisfies the following property:
- Dec(Enc(a) ⊕ Enc(b)) = Dec(Enc(a + b)), where ⊕ represents the homomorphic addition operation; or more generally
- Dec(f’(Enc(x))) = Dec(Enc(f(x))), where f’(x) represents the homomorphic version of the function f(x).
How homomorphic encryption works conceptually, at a high level, is perhaps best explained through an analogy.Footnote 3
In all encryption schemes, the ciphertext is meant to act like a securely locked box containing the original plaintext message. Only someone with access to the secret key can unlock the box and read the contents of the message inside. Otherwise, the box should be impenetrable. The only other forms of processing are trivial, non-interactive ones: The box can be transmitted, or it can be stored. This property of encryption schemes is the source of the distinction we often hear between encrypted “in transit” and encrypted “at rest.”
However, with homomorphic encryption, the design of the ciphertext is more sophisticated. It continues to act like a securely locked box containing the original message, but in addition, the box comes equipped with a pair of special built-in gloves. Using the gloves, anyone can reach inside and manipulate the contents of the message. For example, they can add or multiply values. However, their mode of access is limited. While the gloves allow someone to “touch” the message, they do not allow them to “see” it. Anyone using the gloves remains locked out of the box and, as such, should not gain knowledge of the actual contents of the message inside.
Unlocking the box is the same as with traditional encryption schemes. Only someone with access to the secret key can do it. However, now, instead of the original message, the key holder sees the result of the manipulations performed on it.
This ability of homomorphic encryption to allow third-party processors to perform certain types of computations on encrypted data is what has led some to describe it as cryptography’s “holy grail.” No longer does the protection afforded by encryption limit the forms of processing to the transmission and storage of sensitive data. With homomorphic encryption, it is possible to protect the confidentiality of such data even while it is being used.
This new dynamic of encrypted “in use” opens up a number of new and emerging applications of homomorphic encryption. The use case most often mentioned is cloud computing. In traditional cloud computing, a cloud service provider can ultimately gain access to customer data through its role as system administrator of the services it provides. While customer data should be encrypted during transit and at rest, it is not during processing, so long as the cloud services go beyond mere data storage; hence, a point of access is available to the cloud service provider.
Beyond general external threats such as hackers, this means that customer data is subject to additional risks such as insider threats by cloud employees and secret access demands by governments. However, with homomorphic encryption, it is possible for customer data to remain encrypted throughout its cloud processing and only become decrypted when it reaches the IT systems owned and managed by the customer. This would help protect against the privacy and security risks mentioned above.
Another possible application relates to blockchain technology. By default, all data on a blockchain is public and readable to anyone who has access to the ledger. For payment applications such as Bitcoin, this means that users’ balances and transfers are stored on-chain in plaintext. The same is true of smart contract platforms such as Ethereum, albeit with the difference that the public on-chain data is not necessarily financial in nature. It could be of any type, including medical records, depending on the purpose of the smart contract.
The public-by-default nature of blockchain technology poses significant risks to the privacy of individuals. For example, some investors have been subjected to acts of targeted violence in the form of “crypto muggings.” At a societal level, the very idea of an immutable public database that contains the entire history of transactions performed on it contradicts many well-established privacy principles. While proponents of blockchain often cite the absence of a central authority as a virtue that promotes greater freedom, the same decentralized structure also holds the potential for mass surveillance at a scale and level of efficiency even the most oppressive of regimes would find difficult to replicate.
However, with homomorphic encryption, it is possible for public on-chain data to remain encrypted throughout its processing on a blockchain. In general, the applicable use cases focus on smart contract blockchains and require the capabilities of another PET, namely zero-knowledge proofs (ZKPs). For example, using a combination of single- and multi-key homomorphic encryption as well as ZKPs, researchers have shown that it is possible to design a blockchain-based smart contract platform that can operate on multiple users’ encrypted data fully on-chain, without the need for off-chain coordination of transactions in plaintext.Footnote 4 However, an important trade-off to consider in this approach is that the multi-key design would require what is, in effect, a central authority to manage the keys. This use of homomorphic encryption is still an active area of research.
Yet, where exactly are we in this quest for cryptography’s “holy grail”? Beyond the simple analogy of a locked box with gloves, how does homomorphic encryption work, really? Does it hold only advantages over non-encrypted forms of processing? Or does it also have certain disadvantages and present ongoing challenges to practitioners?
In this blog post, we will explore some key aspects of homomorphic encryption in the hopes of unlocking some answers to the above questions. To clarify, this document does not provide guidance on the application of homomorphic encryption under federal privacy laws. Our aim is to discuss some benefits and challenges to help contextualize our understanding of it from a technical perspective, as opposed to a legal or policy perspective.
What is encryption?
Homomorphic encryption is a complex technology, consisting of multiple parts and processes. Rather than jumping straight into a discussion of it, it is helpful to take an incremental approach, where we begin with a discussion of its most basic functionality and move gradually towards its more sophisticated features. At its core, homomorphic encryption is a form of encryption. Thus, the first step to understanding it is to understand the basics of encryption.
When we hear the word “encryption,” we may be tempted to think immediately of modern digital communications and the use of encryption in applications such as instant messaging, online banking or virtual private networks. While we do not actually “see” encryption at work, we know it is there, working in the background, through indicators such as the “s” in “https” or, what is now more common, the small image of a padlock next to the address bar in our web browser.
However, encryption is in fact an age-old practice, with algorithms to obfuscate the meaning of messages dating back millennia. For example, Julius Caesar is said to have used a substitution cipher to help protect the confidentiality of his communications. The algorithm was very simple. To encrypt, simply replace each letter of the message with the letter three places ahead in the alphabet, wrapping around to the beginning in the case of the last three letters. To decrypt, simply reverse the process.
While the Caesar cipher would by no means be considered secure by today’s standards, its design includes a set of basic components that are present in some form in all encryption schemes. Many of the terms we have been using so far correspond to these components. In general, there are four to mention:
- Plaintext. This is the message or data in its original, non-encrypted form, either before it is input into an encryption algorithm or after the ciphertext undergoes decryption. A plaintext message is not necessarily human-readable text. It may consist of any data format, including image or video files, so long as the message is interpretable by the receiver. It is also sometimes called “cleartext.”
- Ciphertext. This is the message or data in its obfuscated, encrypted form after the plaintext has passed through the encryption algorithm. Ciphertext should exhibit properties that make it equivalent to randomized, nonsensical data when analyzed by anyone without access to the secret key.
- Secret key. This is the piece of information used by the algorithm for both encryption and decryption. It should be kept secret and only be known to the sender and receiver of the message. In the Caesar example above, the key is equivalent to the number 3, denoting the number of letters in the alphabet to move forward or back. With the emergence of computers capable of performing brute-force attacks in the 20th century, the key length (or size) became an essential factor to determining the level of security of an encryption scheme.
- Algorithm. Also known as a “cipher,” this is the process that transforms plaintext into ciphertext, and ciphertext into plaintext, based on the key. In general, the decryption process works as the inverse of encryption. In the Caesar example above, the algorithm consists of substituting letters between two versions of the same alphabet using modular arithmetic. Today, for an encryption algorithm to be considered secure, it should produce ciphertexts with at least the following three basic properties:
- Confusion. This is when the relationship between the secret key and the ciphertext is obscured through complex transformations, such that each character of the ciphertext depends on multiple parts of the key.
- Diffusion. This is when the transformations that ultimately produce the ciphertext depend equally on all characters of the plaintext such that their effect is dispersed across multiple parts of the ciphertext, with the goal of hiding the statistical properties of the plaintext.
- Semantic security. This is when the encryption algorithm is non-deterministic or randomized, such that the same plaintext encrypts to multiple ciphertexts, whose properties do not reveal any information about the plaintext message, except its length.
Over the years, secret-key encryption schemes—also known as “symmetric” schemes—have led to the development of some of the most efficient and secure cryptographic tools we have today. For example, the Advanced Encryption Standard (AES) is the only algorithm the Canadian Centre for Cyber Security publicly recommends for protecting the confidentiality of certain types of sensitive government information.Footnote 5
What is public-key encryption?
Having discussed the basics of encryption, the next step to grasping homomorphic encryption is to understand an important advancement in cryptography that laid the foundations for its eventual discovery.
Up until the mid-1970s, all encryption schemes were exclusively symmetric, with only a single secret key used for both encryption and decryption. However, in 1976, a revolution of sorts occurred in cryptography when researchers discovered that certain high-level mathematical problems in number theory could be used to develop encryption algorithms with two keys: a public key for encryption and a private key for decryption. This new form of encryption was known as public-key or “asymmetric” cryptography.
The main discovery that led to the development of public-key encryption was a new class of mathematical expressions called “trapdoor one-way functions.”Footnote 6 These are arithmetic functions that are easy to compute in one direction, but hard to compute in the other direction, except if some special “trapdoor” information is known. The trapdoor information is a secret property of the function that unlocks an efficient means of inverting it. It works like a hidden button one can push to access the other side of the equation. Otherwise, the security of the function rests on the assumption that it is computationally infeasible to compute the inverse, either by reverse engineering the function itself or by discovering its trapdoor information.
An example of some trapdoor information is the solution to an integer factorization problem. For example, given two large prime numbers, it is easy to compute their product. However, going in the reverse direction, no efficient (non-quantum) algorithm is known to exist that can find a set of prime factors given only their product. By using the product of two large prime numbers as the base for modular arithmetic operations, it is possible to create a trapdoor one-way function where the prime factors of the base act as a passageway to the other side of the equation. This is the trapdoor technique developed by the widely used Rivest-Shamir-Adleman (RSA) encryption scheme.Footnote 7
The structure of a trapdoor one-way function lays the foundation for a new type of cipher. By making the one-way function public, but keeping the trapdoor information private, it is possible to design an algorithm where the key is split into two parts, corresponding to separate processes for encryption and decryption:
- Public key. This is the key used for encryption. It consists of the parameter values used to instantiate the one-way function for a given user. This information is meant to be shared publicly, so that anyone who wishes to send the user an encrypted message can do so. Each user should have a unique public key.
- Private key. This is the key used for decryption. It consists of the special trapdoor information that unlocks an efficient means of inverting the user’s one-way function. This information is meant to be kept secret by the user, so that only they can decrypt messages encrypted with their public key.
Beyond data encryption, public-key encryption schemes can be used to enhance the security of other important aspects of encrypted communications. This is a key difference between public-key encryption schemes and single-key, symmetric schemes. Whereas symmetric schemes can provide strong and efficient data encryption, they fall short in other areas. The opposite is the case for public-key schemes. While public-key schemes can be used for data encryption, they are comparably inefficient at it. Where public-key encryption schemes really show their utility is with respect to issues that arise at the protocol-level of encrypted communications.
In general, public-key encryption schemes provide two additional security functions:
- Secure key exchange. Before public-key encryption, two parties who wished to communicate over an insecure channel faced a challenging dilemma. To use encryption, they first needed to exchange a secret key, but to exchange a secret key, they needed to establish a secure alternative channel, since the one they wished to communicate over was insecure. Typically, the solution involved some form of trusted courier. However, with public-key encryption, this additional complexity is no longer needed. Once the public keys are verified as authentic (typically through a certificate authority), the same insecure channel can be used for both encrypted data and key exchange. For key exchange, either both parties simply share their public keys, or they use them to exchange a secret key for subsequent use in symmetric encryption.
- Digital signatures. A final set of security concerns arise at the margins of encrypted communications. How can we verify that an encrypted message was sent by the person we believe we are communicating with and not someone else? How can we ensure that the message was not altered in any way in transit? The single-key design of symmetric schemes makes them ill-equipped to deal with these problems by default. However, the dual-key structure of asymmetric schemes offers an elegant solution in the form of digital signatures. By “signing” a message with their private key, a sender can prove to a recipient not only that it was them who sent the message, but also that the message was not tampered with. This is possible given the special trapdoor relationship between private and public keys. The recipient can verify the authenticity of the sender’s signature by inverting the function on which the signature is based using the sender’s public key and then checking whether the result matches the message. Since only the holder of the private key could create such a signature, this proves that the message was not only unmodified but really sent by them.
What is homomorphic encryption?
We are now at the point where we can begin to discuss homomorphic encryption in earnest. The term “homomorphic” stems from the ancient Greek homos (“same”) and morphê (“shape or form”), which when combined translate literally to “having the same shape or form.” Within the context of encryption, the term is used to describe the relationship between the space of possible plaintexts and the space of generated ciphertexts. An encryption scheme is said to be homomorphic if and only if there is a mapping between the plaintext space and the ciphertext space that preserves their structure with respect to a set of mathematical operations.
The idea of homomorphic encryption was first raised within the context of public-key encryption. In the late 1970s, shortly after the RSA public-key encryption scheme was developed, a group of researchers (including two of the inventors of RSA) observed that RSA has an additional property. It is possible to multiply RSA-generated ciphertexts together while preserving the underlying plaintext structure of the result. The reason for this has to do with the nature of the trapdoor one-way function used in RSA. During encryption, the sender is required to raise the plaintext to a predetermined power e (as defined in the recipient’s public key). Since exponentiation is distributive, that is, since (xy)e = (xe)(ye), the RSA encryption algorithm comes with the additional property that the structure of the plaintext space is preserved in the case of ciphertext multiplication. The researchers provided other proof-of-concept examples of encryption schemes with structure-preserving mappings. They called these schemes “privacy homomorphisms.”Footnote 8
The example of RSA above demonstrates an important point about the nature of homomorphic encryption. It adds new features to public-key cryptography, but the features are not one size fits all. There are different ways in which the structure between the plaintext space and ciphertext space can be preserved. For example, a scheme may be homomorphic with respect to multiplication but not addition (as in the case of RSA or the later ElGamal cryptosystem) or with respect to addition but not multiplication (as in the case of the Paillier scheme). Alternatively, there can be limitations on the complexity of supported expressions. For example, a scheme may be homomorphic both with respect to addition and multiplication, but only for a limited number of low-order polynomials. Lastly, it is possible for a scheme to preserve all features, that is, to support both multiplication and addition an infinite number of times.
These different possibilities have led to a more nuanced classification system. The term “homomorphic” is ambiguous on its own. To capture the different types of structure-preserving mappings, homomorphic encryption schemes are qualified with an additional term. In general, there are three main types to consider:
- Partially homomorphic. These are schemes that support a single type of operation only—for example, either addition or multiplication—but with no restrictions on the complexity of expressions. Their mapping between plaintext and ciphertext preserves the structural depth but not the breadth of possible operations. This enables them to carry out an infinite number of a particular operation on encrypted data.
- Somewhat homomorphic. These are schemes that support any type of operation—for example, both addition and multiplication—but only up to a certain level of complexity of expressions. Their mapping between plaintext and ciphertext preserves the structural breadth but not the depth of possible operations. This enables them to carry out a finite number of any operation on encrypted data.
An important feature of somewhat homomorphic encryption schemes is that their ciphertexts are designed to be “noisy,” in the sense that a small random term is added to the ciphertext during encryption. This is by design. Having noisy ciphertexts is what enables somewhat homomorphic encryption schemes to protect the confidentiality of the plaintext message while also having the flexibility to support any type of operation. However, this flexibility comes with an important limitation. With each operation, the total amount of noise grows. After too many operations, the noise gets to the point where it begins to overflow into the output, thereby affecting the accuracy of the result. To avoid this situation, the ciphertext must be decrypted before the point of overflow. Decryption removes the accumulated noise by rendering the output into plaintext. A key mathematical problem that enables this type of noisy, yet flexible ciphertext construction is what is known as the “learning with errors” problem.Footnote 9 - Fully homomorphic. These are schemes that support any type of operation, with no restrictions on the complexity of expressions. As we will discuss in further detail below, what enables fully homomorphic encryption to preserve both the breadth and depth of possible operations is not a mapping between plaintext and ciphertext. It is rather a special technique that overcomes the limits of somewhat homomorphic encryption. This technique enables fully homomorphic encryption schemes to support an infinite number of any operation on encrypted data.
Supported operations | Number of operations | |
---|---|---|
Partially homomorphic | Single type (either addition or multiplication) | Infinite |
Somewhat homomorphic | Any type (both addition and multiplication) | Finite |
Fully homomorphic | Any type (both addition and multiplication) | Infinite |
What is fully homomorphic encryption?
In the remainder of this section, we will focus on the inner workings of fully homomorphic encryption. Today when people use the term “homomorphic” on its own, that is, without further qualifying it, implicitly what they mean is fully homomorphic. This is what is meant by talk of the “holy grail” of cryptography. Fully homomorphic is the most flexible, advanced form of homomorphic encryption.
Despite the concept of fully homomorphic encryption being introduced in the 1970s, it was not until the late 2000s that the first scheme was actually developed. What led to it was not the discovery of a new trapdoor one-way function with more robust homomorphic properties. The key advancement was more a shift in paradigm than mathematics. In his 2009 PhD thesis, Craig Gentry discovered a way to transform a somewhat homomorphic scheme into a fully homomorphic scheme by using a technique he called “bootstrapping.”Footnote 10
To understand bootstrapping, it is helpful to revisit and expand on our earlier analogy of a locked box with built-in gloves.Footnote 11 This analogy was adequate to introduce the basic idea of homomorphic encryption, but ultimately it is too simplistic. A key detail missing is that after a certain amount of work, the gloves begin to stiffen and eventually cease to function. Thus, like the noise in a somewhat homomorphic ciphertext, the gloves enable someone to manipulate the contents of the locked box in any way, but only for a limited amount of time.
Bootstrapping works as follows. Imagine that instead of a single locked box, there are many—possibly infinite—locked boxes with built-in gloves. Imagine further that it is possible to insert a locked box into another locked box through a special unidirectional slot in the side of each. With these additional capabilities, it is possible to overcome the glove-stiffening limitations of any single locked box by creating an assembly line of boxes where each box is numbered and set up in advance to contain the private key of the box immediately preceding it. For example, the private key to unlock box #1 is placed in box #2 and the private key to unlock box #2 is placed in box #3, and so on. Now, when someone begins to manipulate the contents of box #1 but the gloves eventually stiffen, they simply insert box #1 into box #2 and then use the gloves for box #2 to unlock box #1 using the private key placed in box #2 and continue their work, repeating the process as necessary.
More formally, bootstrapping is the process of “refreshing” the accumulated noise in a somewhat homomorphic ciphertext by encrypting it again with the public key (i.e., placing it into another box) and then homomorphically evaluating its decryption function using the encryption of its private key under the public key (i.e., opening it from inside another box). In other words, bootstrapping wraps the ciphertext in another layer of encryption and then decrypts the inner layer homomorphically to produce a “recrypted” ciphertext with the same private key. Since decryption removes the noise in a somewhat homomorphic ciphertext, the process of bootstrapping has the same effect. However, instead of rendering the ciphertext into plaintext, the homomorphic evaluation of the decryption function transforms the ciphertext into a new “no noise” version of itself, ready for more homomorphic operations.Footnote 12 In this sense, it is like performing the identity function (id(x) = x) on a ciphertext.
To enable bootstrapping, a fully homomorphic encryption scheme must share additional information with the third-party data processor, beyond the ciphertext itself. In general, this amounts to the following new component:
- Evaluation key. This is the key used in bootstrapping to homomorphically evaluate the decryption function to refresh the accumulated noise in the ciphertext. It consists of the encryption of the private key under its own associated public key.
What are the benefits?
Homomorphic encryption is the culmination of many important developments in the field of cryptography. From symmetric secret keys to asymmetric public-private key pairs to hybrid evaluation keys, homomorphic encryption builds upon insights developed in other encryption methodologies. The result is a PET with several benefits for the protection of personal information. In general, there are five main ones to mention:
- It enhances the security and confidentiality of third-party processing. In general, third-party processing of personal information is done on unencrypted, plaintext data. This raises several privacy and security risks for individuals. As discussed in the introduction, these include external attacks by hackers, insider threats by employees and secret access demands by governments in the case of cloud computing, as well as crypto muggings and mass surveillance in the case of blockchain technologies. However, homomorphic encryption can help protect against these risks by ensuring that all data remains encrypted throughout its processing lifecycle, whether in the cloud or on a blockchain.
- It preserves data accuracy. The process of encrypting personal information does not affect the underlying meaning or internal structure of the information. Its values and the relationships they hold remain the same. What changes with encryption is only at the outward level of appearance and confidentiality. It shields access to the information and hides it from view. It is as if the information is translated into another language that only the private key holders can understand. To anyone else, the information is gibberish, but to those with knowledge of the key, it retains its full original form. Strictly speaking, this means that the identifiability of the information remains the same. The association between the information and the individuals it is about is not reduced or removed.
A consequence of this structural retention is that homomorphic encryption preserves data accuracy in its operations. Unlike anonymization, there is no general tradeoff between data protection and utility. Because the underlying meaning of the information remains the same, the result of homomorphic operations is equivalent to those performed on plaintext. To complete the analogy above, homomorphic encryption is like translating the information into a special language where the actual words are incomprehensible to non-holders of the private key, but the articulation of phonemes nonetheless produces a “melody” whose notes can be manipulated to change the words into different ones. - Predetermined, bounded-depth computations can be performed efficiently with “levelled fully” homomorphic encryption. The efficiency of fully homomorphic encryption will be discussed in more detail in the next section. At this point, suffice it to say that the process of bootstrapping is computationally expensive and can act like a bottleneck on the performance of complex operations. If there was a way to avoid bootstrapping while still leveraging some of the computational flexibility afforded by fully homomorphic encryption, this would make for a noteworthy option to consider.
In the section “What is homomorphic encryption?,” we discussed the three main types of homomorphic encryption: partial, somewhat and fully. However, if we go a level deeper, it is possible to add a further subclass to the list. “Levelled fully” homomorphic encryption is a type of somewhat homomorphic encryption with the additional property that it is possible to specify in advance the supported level of operational complexity as a parameter.Footnote 13 This ability grants levelled fully homomorphic encryption a potentially virtuous middle position between the constraints of somewhat and fully homomorphic encryption. Unlike somewhat homomorphic, there is no internal limit to the complexity of supported expressions; and unlike fully homomorphic, there is no negative performance hit due to bootstrapping. The main drawback is that the total complexity of homomorphic operations must be known in advance and supported by available hardware and network capacity. However, if these conditions can be met, levelled fully homomorphic encryption can be used to efficiently perform computations with predetermined, bounded-depth complexity. - Current algorithms are considered quantum safe. Quantum computing is an emerging multidisciplinary field of research that seeks to leverage the counterintuitive properties of quantum mechanics to design and implement algorithms that can solve certain problems more quickly than classical computers. Not every problem has an exponential quantum speedup, but one that researchers have discovered does is integer factorization. Using Shor’s algorithm, it is possible for a quantum computer to efficiently determine the prime factors of a number by exploiting new structural information made possible by the entanglement of quantum bits in a superposition.Footnote 14 This has important implications for the security of public-key encryption schemes such as RSA. Recall from the section on “What is public-key encryption?” that the security of RSA is based on the supposed hardness of factoring the product of two large primes. With a large enough quantum computer, this assumption no longer holds. It would be possible to break RSA by determining a user’s private trapdoor information from their public key.
As of today, no quantum computer large enough to break public-key encryption schemes has been built. Advancements in the field suggest it is an increasingly realistic possibility. However, the question of how long it may take is subject to much debate. Researchers estimate it would require a computer with 20 million quantum bits. The largest quantum bit count to date is around 1000.
Yet regardless of how close (or far) the threat of a large-scale quantum computer may be, it appears that homomorphic encryption will not be affected by it. This is because the type of mathematical problem on which the security of current homomorphic algorithms is based is considered quantum safe. In other words, no quantum algorithm is thought to exist that can efficiently solve it. The reason for this has to do with the noise embedded in the homomorphic ciphertexts. Researchers have shown that the security of this noise can be reduced to the problem of finding the shortest non-zero vector in a multidimensional lattice.Footnote 15 Of the four quantum-resistant cryptographic algorithms the U.S. National Institute for Standardization and Technology has so far selected for standardization, three are based on the problem of structured lattices.Footnote 16 - It can be configured to support secure multiparty computations. By default, the computations performed using homomorphic encryption involve the data of a single entity only. Using a similar key-generation protocol as public-key cryptography, each user of homomorphic encryption creates its own set of keys consisting of a public key for encryption, a private key for decryption and an evaluation key for bootstrapping. While this approach protects users’ data individually, a limitation of it is that multiple users cannot combine their data and perform joint homomorphic operations to produce aggregate results. Because the processes of encryption, decryption and bootstrapping depend on a set of keys specific to each user, the only way joint homomorphic operations would be possible is if multiple users decide to use the same public key. However, the confidentiality of each user’s data would not be guaranteed in this case since the holder of the associated private key could decrypt all data.
However, it is possible to configure homomorphic encryption to support multiparty computations in a secure manner. To do so, a more complex key-generation protocol is needed. Instead of each user creating its own set of keys, the users who wish to perform joint homomorphic operations on their combined data participate in a distributed protocol that results in the creation of a group-specific set of shared keys where the private key is split into multiple parts and each user obtains a share of it. For encryption, the process is simple: each user encrypts their own data using the public key generated in the protocol. However, for decryption, the process is more complicated. Since each user only has access to part of the private key, individually they can only partially decrypt the result of the homomorphic operations. To obtain a full and accurate plaintext result, a preset “threshold” of users must combine their partial decryptions together.Footnote 17 This approach ensures that no single or minority group of members can act dishonestly to decrypt other members’ data outside of the established protocol. A full discussion of how homomorphic encryption can be used to support secure multiparty computation is beyond the scope of this blog.
What are the challenges?
While homomorphic encryption offers several benefits, it also comes with several challenges. It is a common refrain worth repeating that no PET is a “silver bullet.”Footnote 18 There are always risks and tradeoffs to consider as part of any technological enhancement to privacy. In the case of homomorphic encryption, the challenges are varied. Some relate to the current state of art, others are inherent to the nature of the technology itself. However, in general, there are five to mention:
- Implementations of bootstrapping are still too inefficient for many use cases. What first enabled fully homomorphic encryption to become a reality has also proven to be one of the biggest challenges to its widespread adoption. The irony of bootstrapping is that while it provides a means of overcoming the limits of somewhat homomorphic encryption, the computational overhead it incurs can be so great that it outweighs any gain in utility from the increased complexity of supported operations. For example, tests done on Gentry’s original bootstrapping procedure showed that performance was extremely slow, taking up to 30 minutes to complete on standard hardware.Footnote 19 While significant improvements in efficiency have been made since that time (and continue to be made), the technology is still not yet at the point where implementations can sufficiently overcome the performance bottleneck inherent in bootstrapping for many real-world applications. A Google product manager recently summarized the company’s experience with developing fully homomorphic encryption applications as running “typically 1000 times slower” than their plaintext counterparts.Footnote 20
- Input data cannot be audited for quality control or traceability purposes. Homomorphic encryption enables mathematical operations to be performed on encrypted data, without the need for prior decryption into plaintext. While this form of encryption “in use” protects the confidentiality of information while it is being processed, this same privacy-enhancing feature also limits the ability of third-party processors to put in place important measures to facilitate computational robustness and transparency. For example, before performing operations on some input data, third-party processors often first check the data values themselves to ensure they are of sufficient quality and free of errors. This preliminary phase helps ensure overall data quality and protects against “data poisoning” attacks. However, data quality checks of this nature are not possible with homomorphic encryption. Because third-party processors never get access to the raw plaintext data, they cannot directly verify the validity of the data values inputted into their computations.
Another practice limited by homomorphic encryption is data traceability. To facilitate transparency and explainability of decisions produced or augmented by artificial intelligence (AI), regulators have advised organizations to document the datasets and processes that yield an AI model’s decision, including its training data.Footnote 21 Being able to trace datasets back to their original source and know which datasets were used to train an AI model also helps discover sources of bias and discrimination. However, the confidentiality protections of homomorphic encryption make this type of transparency measure difficult to achieve in practice. Because third-party processors cannot “see” the data they are manipulating, they cannot build a comprehensive record of training data capable of enhancing visibility into the behavior of their AI model. - It may be vulnerable to key-recovery attacks. By default, communication in homomorphic encryption is two-way: a client shares some encrypted data with a third-party processor, who then performs some homomorphic operations on the data and sends the encrypted result back to the client. If all goes well, the communication ends at this point, at least with respect to that particular request. However, if a decryption error occurs, for example due to the ciphertext getting corrupted during transmission or computation, the process is typically repeated, with the client informing the third-party processor of the error.
This type of back-and-forth interaction with error handling is a normal feature of cloud-based services. While it may seem innocuous at first, in the case of homomorphic encryption it actually acts as an additional attack vector that can be exploited by a malicious or compromised third-party processor to extract the client’s secret key. In general, the attack works in the following manner. Instead of returning a valid homomorphically computed ciphertext, the malicious or compromised third-party processor purposely introduces curated perturbations into a series of ciphertexts and observes the client’s reaction (or lack thereof) to glean information about the noise values in the ciphertexts. Once enough noise values are learned, the processor combines this information into a system of equations, which it then solves to recover the client’s secret key. Researchers have developed practical attacks of this nature for all major homomorphic encryption schemes.Footnote 22 - There is generally a tradeoff between verifiable computing and server function confidentiality. Different measures can be applied to help protect against the above threat of key-recovery attacks, including organizational measures such as contractual agreements between the client and the third-party processor. However, if we stay within the pure technical domain of applied cryptography, then the key measure to consider is what is known as “verifiable computing.” This is where the third-party processor provides, in addition to the result of the computation requested by the client, a proof that the computation was performed correctly. By providing such a proof, the third-party processor can demonstrate to the client that the ciphertext produced by it was not maliciously tampered with and that it is in fact the result of a set of homomorphic operations expected by the client.
Yet to be effective, the proof must contain meaningful information about the nature of the computation. Otherwise, it would cease to function as a proof. In the case where the details of the computation are publicly known, including parameter values, the information contained in the proof would not reveal anything new to the client. However, in the case where the computation consists of a proprietary server function with confidential parameter values, the matter is different. Here, a proof to verify the correctness of the computation performed by the third-party processor would reveal additional, potentially sensitive information to the client. Even where advanced zero-knowledge proofs are used, the third-party processor must reveal the model architecture to the client and publish additional information about parameter values, at a minimum.Footnote 23 This demonstrates a general tradeoff between verifiable computing and server function confidentiality. - Circular security of evaluation key is still an open problem. Recall from the section on “What is fully homomorphic encryption?” that the bootstrapping operation requires access to an encryption of the private key under its associated public key to homomorphically “decrypt” the ciphertext to refresh the accumulated noise in it. While the need for such an evaluation key is understandable from a technical perspective, whether, or to what extent, it raises security considerations is less clear. Given the highly confidential nature of the private key, is it safe for a client to release an encrypted version of it to a potentially untrusted third party or does this requirement bring with it additional risks?
How to assess the security of a client’s disclosure of its evaluation key is still an open problem. While currently there are no known attacks that exploit it, there are also no formal proofs that demonstrate its security from existing hardness assumptions in noise-based homomorphic schemes. In the absence of strong evidence either for or against it, the standard approach so far has been to introduce an additional assumption known as “circular security.”Footnote 24 A homomorphic encryption scheme is circularly secure if it is considered “safe” to release an encryption of the private key under its associated public key. Work into the validity of this assumption is an ongoing area of research.
Conclusion
In this blog, we explored various aspects of homomorphic encryption and gained a number of insights into the history, context and nature of it:
- Homomorphic encryption is a special form of encryption whose structure enables mathematical operations to be performed directly on ciphertext, without the need for prior decryption.
- It is the culmination of many important developments in the field of cryptography, including symmetric and asymmetric encryption.
- It is a PET with several benefits:
- It enhances the security and confidentiality of third-party processing.
- It preserves data accuracy.
- Predetermined, bounded-depth computations can be performed efficiently with “levelled fully” homomorphic encryption.
- Current algorithms are considered quantum safe.
- It can be configured to support secure multiparty computations.
- Yet, it also comes with several risks and trade-offs to consider:
- Implementations of bootstrapping are still too inefficient for many use cases.
- Input data cannot be audited for quality control or traceability purposes.
- It may be vulnerable to key-recovery attacks.
- There is generally a tradeoff between verifiable computing and server function confidentiality.
- Circular security of evaluation key is still an open problem.
Based on these insights, it is clear that homomorphic encryption is a fast-developing PET with promising use cases. By creating a new modality of encrypted “in use,” it protects the confidentiality of information while it is being processed to open up a number of new and emerging applications. Yet, the current state of the art as well as the dynamics of how the technology works in practice mean that questions of when, where and how it should be used will continue to be raised in ongoing discussions. To conclude with a quote that summarizes the situation: “FHE [Fully homomorphic encryption] today is kind of like where deep learning was in 2009 / 2010.”Footnote 25