Obfuscation and Verification of Functions

James Olsen

 - 30 min read
Obfuscation and Verification of Functions bannner

This article discusses Functional-Verifiability as a scale, in which all functions are derived from. Perfect verifiability exists as one extreme of this scale, and perfect obfuscation exists as the other extreme. This concept of perfect obfuscation is explored further to conceptualise how it may be possible to construct verifiable computation across obfuscated programs by extending upon existing functions such as One-Way-Hash and Zero-Knowledge functions. In addition, the theory behind - and the properties of - other cryptographic primitives including Homomorphic-Encryption functions are explored in detail.

Contents:

FAQ
Introduction
Functional-Verifiability
  Definition
  Purpose
  Proposed Functions
Homomorphic Encryption
Conclusion

FAQ

What is a Program-Obfuscation function?

Any function that satisfies the properties of Program-Obfuscation is able to perform computation on any object $x$ without revealing the contents of $x$. Therefore a server could perform encrypted calculations on behalf of a user and verify to said user that the calculations were performed correctly without revealing the encryption key.

How can Program-Obfuscation functions be used?

Program-Obfuscation functions allow for both the verification of a user’s knowledge of an object $x$ (i.e. a password) without revealing $x$, and the obfuscation of a user’s computation of $x$ (i.e. allowing use of software without the risk of piracy) once again without revealing $x$.

What are some types of Program-Obfuscation functions?

There are three primary types of Program-Obfuscation functions that have been proposed: Black-Box Obfuscation, Indistinguishability Obfuscation (iO), and Extractability Obfuscation (eO). Black-Box Obfuscation has been proven impossible, however Indistinguishability Obfuscation has been proven to be possible to construct if Program-Obfuscation is possible.

Do Program-Obfuscation functions exist?

While there has been academic resources dedicated to the attempted construction of a Program-Obfuscation function, no such function exists yet as provably cryptographically secure. In some cases this is because malicious actors could decrypt any obfuscation performed, whereas in other cases this is because the amount of computation that must be performed is beyond the capability of nearly any computer.

What is a Homomorphic-Encryption function?

There are numerous types of functions that are valid Homomorphic-Encryption functions, the most significant of which are Fully-Homomorphic functions. Any valid Fully-Homomorphic function allows for the performing of computation on an encrypted object $e(x)$ as if the computation is being performed on $x$. Therefore a server could generate a key for an individual without knowing the password the individual has provided to generate said key.

What is the Functional-Verifiability scale?

Functional-Verifiability is a spectrum between “perfect verifiability” and “perfect obfuscation”, where the closer a function exists on the scale towards “perfect obfuscation” the less information is revealed about an object $x$ that has had computation performed on but also the less sure a verifier can be of the genuine existence of $x$.

What are the properties of a One-Way-Hash function?

First and second pre-image resistance are mandatory of any cryptographically secure One-Way-Hash function. First pre-image resistance means that it must be infeasible to discover the input that generated some output with only knowledge of the output, and second pre-image resistance means that it must be infeasible to an input that generates the same output as another input. Other common properties are collision resistance and avalanche effects.

What are the properties of a Zero-Knowledge function?

Completeness, Soundness, and Zero-Knowledge are the three core properties of any Zero-Knowledge function. Completeness requires that if a user genuinely does know an object $x$ they will be able to prove this. Soundness requires that if a user does not know an object $x$ there is an infeasible chance of acting like they do. Lastly, Zero-Knowledge requires that no information about $x$ is revealed to any users when proving knowledge of $x$.

Introduction

What are functions?

Functions hold a variety of definitions across the disciplines of mathematics and computer science, however, their applications are theoretically boundless as a tool that can be used to construct complex computations from simpler one’s. The existence of a function typically implies the existence of a function of a function, which implies the existence of a function of a function of a function and so forth. By their very nature they are not atomic objects, as a function can always be divided into either smaller functions or other forms of input such as a number. Extending this definition as applicable in computer science, it can be observed that all mathematical objects must either be a function or a number.

While this note may not appear entirely intuitive, if a function $f$ is derived in that $f(x)=x^2$ it will itself be capable of being defined as a function $g$. If it is known that $g(a,b)=a \times b$, then $f$ could instead be defined as $f(x)=g(x,x)$. Multiplication can be derived even further, where $x \times y$ can be calculated as the sum of $x$ across $y$ additions. While none of these derivations are particularly applicable to the the human-usage of mathematics, they emphasise the defining of the above useful generalisation; all mathematical objects are either a function, or a number. Since the first theoretical designs of a Turing-complete computer it has been essential to consider all computations as above, where any given object is either a function that may take an input and generate an output, or an object that can be provided to a function to perform calculations on (i.e. a number). It is important to emphasise for clarity that the former object will always be the latter as well (as computation can be performed on a given function), but the latter will not always be the former.

In an alternative perspective from this sense they can be contrasted with atoms. An atom is a combinations of proton, neutrons, and electrons that allows for the constructions of more complex objects in a similar way to how functions allow for more complex calculations. As is now known, protons, neutrons, and electrons themselves are not atomic by the sense of the original definition “to be indivisible” and are instead complex constructions of sub-atomic particles, which themselves are complex constructions of other sub-atomic particles and so forth. Hence, all matter in existence can either be observed to be some construction of sub-atomic particles (i.e. an atom), or a sub-atomic particle.

Purpose of functions

When considering the scope of computer science, most functions are typically defined by properties that are common to most. Assuming bounded computational resources and knowledge of $f,x,y$ for actors $R,V,P$, the actor $R$ will typically be able to resolve a valid input $x$ from a valid output $y$ in time that is not exponentially more than the time required by $V$ to verify that $y$ is the result of $x$, or the time required by $P$ to prove to $V$ that this is the case. Assuming $f(x)=y$, the actor $R$ will usually be able to construct the inverse function $f^{-1}(y)=x$ to resolve a valid input from a valid output, and $V$ will be able to run $f(x)$ themselves to verify that a valid output was the result of a valid input.

However, not all functions share these general properties. Trapdoor-Functions are a class of functions intended to make it exponentially more difficult for $R$ to resolve a valid input from a valid output. This is achieved theoretically by ensuring that it takes exponentially longer to derive $f^{-1}(y)=x$ then it takes to calculate $f(x)=y$. These class of functions support the existence of a majority of current cryptographic schemes. For instance, an actor $P$ can prove their right to authentication to $V$ by generating $h(p)=c$ where $p$ is the password, and $h$ is a One-Way-Hash algorithm. The verifier $V$ knows $h(p)$ such that they can easily verify that the hashed message $c$ could have only have been generated by the individual who knows $p$. However, for $R$ to be able to discover $p$ themselves given $h(p)$ they must be capable of generating a new function $h^{-1}(c)=p$ which is currently proven to take more computation than is feasibly possible in the Universe.

It can be noted that it is believed by most that it is theoretically impossible to construct a function $f$ such that no inverse function $f^{-1}$ could ever be derived from. As this is the core characteristic of what defines a Trapdoor-Function it is perhaps better conceived that all functions in this class can in fact be defined on a scale. From one extreme of the scale exists most standard functions, and at the other extreme exists functions such as the One-Way-Hash function referred to previously. Such a scale can be denoted the Functional-Trapdoor scale. The further along the scale a function is found the more computation at a linear rate that is required by $P$ to find $f(x)=y$ and by $V$ to confirm this result, and the more computation at an exponential rate that is required by $R$ to find $f^{-1}(y)=x$.

In the future this scale will become significantly more multi-layered than it is currently, as there will be a distinction between functions that the inverse cannot be found for in linear time through quantum computation, and those that can (but cannot through classical computation). Such a multi-layered scale applies to an additional class of functions that will be defined further throughout this article. This class will be denoted by the Functional-Verifiability scale to be illustrated upon through further reading.

Illustration of the Functional-Verifiability scale

Illustration of the Functional-Verifiability scale.

Functional-Verifiability

Definition

If an actor $P$ knows some object $x$ and wants to perform the computation $f$ upon this object where $f(x)=y$, then $y$ serves as a proof that $P$ not only knows $x$, but knows that $f(x)=y$ too. If $P$ was to give $f,x,y$ to some other actor $V$ then this actor could verify that $P$ knows $x$ and $f(x)=y$ by running the function themselves and comparing the results. But in some cases $P$ may not want to entrust $V$ with the value of $x$, as for instance it may be that actor’s password. Is it possible to construct a function $f(x)=y$ where $y$ still serves as a proof of the knowledge of and/or computation upon $x$ without revealing $x$ to the verifier $V$?

It remains an area of ever growing research to uncover what precisely is possible in regards to this question. As with Trapdoor-Functions many believe it is theoretically impossible to produce $f(x)=y$ as stipulated through the prior question that proves both knowledge of and computation upon $x$ without revealing any information about said object. However, a more probabilistic proof is possible; $P$ can prove to a significant probability that they do know an $x$ or a computation upon an $x$, but can never guarantee this fact from the perspective of $V$. Therefore, as with the Functional-Trapdoor scale the Functional-Verifiability scale can be defined, dependent on how much information is revealed, and by extension to what probability $V$ can trust a claim made by $P$. On one extreme once again exists most standard functions, and on the other extreme exists functions such as Program-Obfuscation (which will be explained in greater detail throughout the article). The further along the scale one moves from the former extreme to the latter, the less information is revealed about a given object $x$ while still fulfilling the purpose of a function.

Examples

Generally this becomes a trade-off between the information $V$ knows about $x$ (and to an extent the prior described probability $V$ can be sure about $P$’s claim of $x$), and the amount of computation both $P$ and $V$ must run to prove that $f(x)=y$ given public knowledge of $y$. Assuming that $x$ is some value with a size measured in bytes, the cost of with-holding $x$ may require kilobytes, megabytes, or even gigabytes of data to create a robust proof as replacement (the true amount currently varies greatly between alternative candidate and theoretical solutions). The computational cost of creating a robust proof tends to exponentially increase the further along the scale a function is found, or in other words, if information revealed about $x$ decreases at a linear rate, computation required will tend to increase at an exponential rate.

Brief mention has been made previously of the reasoning as to why a given actor $P$ may not desire to reveal $x$ specifically for a given computation, however, it is opportune to expand on a potential extended list of use cases to illustrate the specific requirements of a Functional-Verifiability scale. While schemes have been developed to confirm ownership of a password without revealing said password (in specific contexts) there is need for much more general schemes. Blockchain networks such as Ethereum have expanded the possibilities of decentralised finance (DeFi) by allowing the construction of smart contracts (functions) that can perform trustless computations, but further possibilities still remain limited. If an individual wanted to create a smart contract capable of moving assets from one of their company’s portfolio’s to another one of the same company’s portfolios automatically and in a manner that could be verified this would potentially cause all verifiers to have access to the information required to access these portfolio’s themselves. This is because to be able to verify that the smart contract ran correctly where said contract is visualised as a function $f(x)=y$, any verifier $V$ must know $x$ (likely a private-key used for accessing the company’s portfolio’s).

Another alternate example may include allowing a verifier $V$ to ascertain if computation performed in a manner that is not trustless (i.e. the verifier is unable to visually ascertain the performing of the computation) was performed genuinely. Even if $V$ is aware of the function $f$ used by the individual $P$ when performing the described computation, there would typically be no way of proving that $P$ really did run $f(x)=y$ and not $g(z)=y$ where it is assumed that $z$ is some invalid input for the function $f$. If $P$ provided an invalid input $z$ in an attempt to access an account they had no right to, and as such constructed an invalid function $g$ to produce a valid output $y$ that proves their right to access said account, there would need to some way for $V$ to verify if the output $y$ was produced by a valid input or not. The computation $f$ may not be performed in a trustless manner as it may require computational resources beyond the constraints of a trustless environment, or performing it in such a manner could reveal the information required for any verifier $V$ to access the account themselves (as in the previous example). It could be feasible that, as indicated by the below diagram, the advantages realised by both smart contracts and standard programs could be interloped to a more significant degree, perhaps supporting the existence of a new class of programs.

Possible dichotomy of malicious actor and developers

Possible dichotomy of malicious actor and developers, providing evidence to support the need for a new class of programs.

Both examples described present the need for functions derived from the Functional-Verifiability scale that have the capability of either proving knowledge of $x$ by $P$, or proving knowledge of $x$ such that some computation upon $x$ can be proved. More general schemes that satisfy these examples and many other instances where verifiability or obfuscation is required depend on the sound construction of existing and novel cryptographic primitives. Following on from this point these existing and potential novel primitives will be explored in greater detail and their purposes in reference to the Functional-Verifiability scale defined. Families of functions will be characterised by their purposes for either verifiability or obfuscation, and observations will be made as to how a variety of these derived functions can be constructed together to form more complex functions.

Purpose

Verifiability

Challenges associated with ensuring secure communications between two or more individuals across a network where an unintended recipient or a malicious actor may exist have been ongoing. Through Kerckhoffs's Principle as well as observations of networking and messaging in computation more generally it is known that no communication can be guaranteed to transmit from a sender to a recipient without having the capability of being read by another actor. Any attempts to obfuscate the workings of a network, or to construct a network that is not publicly advertised, will decrease the chances of interception but not to a level that is cryptographically-sound. If the discovery of any network is of sufficient importance to a malicious actor, the tools to find it are available. It must be assumed that outside of cryptographic assumptions (i.e. the difficulty of solving some mathematically hard problem) there are no assumptions that can ensure said networks are safe. Typically modern forms of encryption such as RSA, where a message is encrypted with the recipient's public-key such that they can decrypt the message with their private-key, are used to guarantee the origins and content of the message. However, all verifiers require the ability to be able to decrypt these messages to individually verify their origins and content themselves, and each verifier that is able to decrypt the message can learn of the contents themselves.

It can be observed that this capability to prove the contents of a message, or better termed, to prove the knowledge and existence of an object, without revealing the knowledge of the object to any verifiers is one of the two challenges that the Functional-Verifiability scale distinguishes between. Such a challenge can be termed the challenge of verifiability. Rewording the previous text, the following can be said: guaranteeing that knowledge of a secret object $s$ can be proven without revealing said object is a subset of the challenges as defined by the Functional-Verifiability scale, where the complementary subset consists of the proof of knowledge of the computation of $x$. The described distinction between the two subsets of challenges may not appear intuitive either, but it is of significant importance. Verifiability allows an actor to prove their knowledge of a mathematical object (i.e. a number) such as $x$, but does not allow an actor to prove their knowledge of an $x$ which proves $f(x)=y$ is valid. Hence, verifiability can be seen itself as a subset of obfuscation - obfuscation being the latter challenge described and explored in further detail alongside Program-Obfuscation later on.

By confronting the former subset of computational problems an efficient network could be created wherein an oracle could authenticate individuals in a verifiable manner without needing to share their knowledge or the knowledge held by the individuals with any verifiers. Verifiability additionally allows for new tools to be applied not possible previously. Proving ownership of an identity in a system where such identity could not be stolen has often be visualised as either an insurmountable obstacle, or a sacrifice in which some entity must be trusted unreservedly. Proving knowledge of an identity without revealing said identity or the means to claim knowledge of that identity is a possible task with a sufficiently robust system of verifiability. It should also be emphasised that the objects discussed here need not necessarily be atomic mathematical objects as in the case of numbers or values, but can also be functions. If $x$ is perceived to be some function that can perform computation on a specified input and produce an output of a specified nature it can be seen that any such solution would allow for the verification of the existence of such a function (but not its input). Numerous proposals that constitute as a form of a solution for these examples and beyond, and importantly a cryptographic solution, have been put forth previously.

Obfuscation

The latter challenge as described previously can be defined as the challenge of obfuscation. Where an actor $P$ desires to prove their knowledge of an object $x$ such that when input into a function $f$ produces the output $y$ without revealing $x$, a cryptographic primitive that provides obfuscation is required. Currently there exists no such primitive of a practical nature to satisfy cryptographic assumptions (primarily those of bounded computational resources for the prover $P$ and the verifier $V$). However, there do exist theoretical constructions and possible candidate constructions that are explored throughout the article. These solutions are primarily derived from the Program-Obfuscation family of functions - likely the strongest form of functions to be potentially derived from the Functional-Verifiability scale.

Where obfuscation can be denoted a superset of verifiability is in that the existence of this assumption implies the existence of verifiability, and implies the expanded possibility of being able to construct any computational system that a Turing-design allows for. If there exists an expression $f(x)=y$ wherein $P$ knows the valid obfuscation function $f$ which is capable of proving $P$’s knowledge of $x$ without revealing the value to any verifier $V$ that knows $y$, there exists an expression $g(f(x))=z$ where $g$ is a valid obfuscation function (and perhaps even equivalent to $f$). Therefore any proof that can be constructed with a verifiability function can be constructed with an obfuscation function, however, as obfuscation reveals less information about $x$ to $V$, the computation required in generally of an exponential degree higher. This is equivalent to saying that obfuscation functions are found further along the Functional-Verifiability scale than verifiability functions. Hence, while obfuscation functions are the most powerful of those to be discussed, they do not serve as an end all replacement for verifiability functions.

In addition it should also be noted that where there exists a $g(f(x))=z$ such that the knowledge of $f$ by the actor $P$ is proven to a verifier, there also must exist a $g(f(x))=z$ such that the knowledge of $x$ by the actor $P$ is proven despite two subsequent computations being performed. This is where the earlier emphasis that all mathematical objects are either functions or numbers, and that any function can be divided further into other functions or numbers, is important to understand especially. If knowledge of the computation of any function or number, and as extended upon previously the knowledge of any function or number, can be proven to a verifier without revealing any information to $V$, than it is possible to construct any computational system that can be verified to exist without any knowledge of said system being shared.

In an interesting intersection, any obfuscation function would satisfy the maximum requirements of the Functional-Trapdoor scale. If computation on objects (i.e. private-keys) could be performed in a verifiable manner without any revealing of said object if and only if the object was truly known by $P$, it would be infeasible to construct a function $f^{-1}$ such that $R$ could maliciously prove their knowledge of the object. While there is significant precedent to conceive these class of functions as a tool to extend upon the Functional-Trapdoor scale, the article will take the approach of defining them in a novel manner to address the challenges presented that require functions derived from the Functional-Verifiability scale.

Proposed functions

One-Way-Hash

The One-Way-Hash function family is identical to the one referred to previously in reference to the Functional-Trapdoor scale. Leverage of the same properties that support Trapdoor-Function schemes (i.e. authentication of passwords) are utilised in similar mechanisms to support the creation of schemes used for verifiability. These functions cannot support obfuscation solely on their own however with respect to the common properties typically defined of One-Way-Hash functions (and specifically those of robust cryptographic assumptions). As such all functions in this family are considered more so on the verifiability end of the Functional-Verifiability scale, and in general exist much closer to one extreme. This extreme indicates that while less information about $x$ can typically be hidden from a verifier $V$, the magnitude of computation required to construct and verify proofs of knowledge of $x$ are significantly reduced. With this particular attribute One-Way-Hash functions are able to serve a specific niche of allowing for low-cost verifiability in systems where trust might not serve as essential a criteria.

To be considered a One-Way-Hash function $h$ must satisfy the following properties:

  • Cost of computation
    • Property is in fact shared by all functions found on the Functional-Verifiability scale, but in particular has been an emphasis of One-Way-Hash functions. The construction of any proof $y$ wherein $h(x)=y$ should be sufficiently computationally affordable.
  • First pre-image resistance
    • It must be infeasible to find an $x$ such that $P$ would then be able to construct the proof $h(x)=y$ given that their intention was to create the proof $y$. In other words, satisfying this property feasibly prevents any actor $R$ from constructing a function $h^{-1}$ such that given a proof $y$ they can calculate $h^{-1}(y)=x$.
  • Second pre-image resistance
    • Given a specific $x$, it must be infeasible to find another object $a$ such that $h(x)=h(a)$. This property ensures that a malicious actor cannot discover an object $a$ that would create a proof that proves said actor’s knowledge of an object $x$ that they do not know that might provide them access to an account, as one example.
  • Collision resistance
    • Given any $x$, it must be infeasible to find another object $a$ such that $h(x)=h(a)$. While similar to the above property, the distinction lies in the difference between a specific $x$ or any $x$. Satisfying collision resistance is generally not necessary for cryptographic systems, but can be important for data such as file checksums where it would be disastrous for any two different files to have the same checksum.
  • Avalanche effect
    • If any bit is changed in the object $x$, the resulting proof $y$ where it is said that $h(x)=y$ must represent this change by having at least half of all the bits in $y$ change. Without this it could be possible to detect similarities between different proofs and discern that the objects they are proving knowledge of are therefore of similar values.

Any One-Way-Hash function that satisfies “second pre-image resistance” but not “collision resistance” can be said to have weak collision resistance, whereas functions that satisfy both are said to have strong collision resistance. However, as noted in the definitions of the properties above, most cryptographic systems only require weak collision resistance. This is because for instance it would typically be considered infeasible in a password-authentication system for any two passwords $x$ and $a$ to be hashed such that their proofs are equal (due to the significantly smaller number of passwords that will ever be stored on a server). Strong collision resistance would entail that given the birthday paradox it can be assumed that if there exists $2^{256}$ possible output proofs for a given hash function, than on average $2^{128}$ hashes would need to be performed before a collision is found.

Maintaining a sufficiently large level of security is necessary for satisfying the properties of a One-Way-Hash functions as partially described with the birthday paradox example. The lower the level of security, the more feasible it becomes for a malicious actor with bounded computational resources to break either of the pre-image resistances or collision resistance. Security of a One-Way-Hash function is signified by the size of the output of the proof produced by the function typically. All functions from this family produce proofs $y$ of a fixed size where $h(x)=y$. If $y$ is $n$ bits in size, then it can be said that the function $h$ has $n$ bits of security. Most proofs produced currently are about 256 bits in size, hence why $2^{256}$ was used in the above example. For any unique object $x$ this means that the proof could be anyone of $2^{256}$ values. By the pigeonhole principle as there are infinite possible values for $x$ and only a finite number of possible output values for $y$ it will on average take $2^{n}$ (where $n$ is the bits of security provided by $h$) to find a collision where the hash of a specific $x$ and any $a$ are equal (i.e. weak collision resistance).

If the level of security of $h$ is sufficiently large to fulfil all the necessary properties of a cryptographic One-Way-Hash function, than it can be said that it would be infeasible that any actor $P$ could prove their knowledge of $x$ to $V$ without having distinct knowledge of $x$. To this extent a One-Way-Hash function can be visualised as a “black-box”. Alongside the additionally useful property of the “avalanche effect” it should appear that any input provided to the function $h$ is treated in a manner to produce an output that has no association to the input (apart from that output always being created whenever said input is provided). Extending on the described “black-box” model, it could serve as an analogy to replace $h$ with a machine that takes the provided input, and one bit at a time calculates what to give back as an output but in a manner that has no apparent pattern that can be inspected. Taking the below illustrated example of a possible hashing of a transaction in Ethereum, it can be seen how this is the case.

Visualisation of a “black-box”

Visualisation of a “black-box” that takes some input and produces an output in a manner that cannot feasibly be reversed - a One-Way-Hash function.

There is a caveat to this design though, and how significantly a One-Way-Hash function can properly provide verifiability in the sense implied by the Functional-Verifiability scale. While the One-Way-Hash family of functions allows for an actor $P$ to prove they have knowledge of some object $x$ to another actor $V$ without sharing $x$, this cannot be achieved unless $V$ also has knowledge of $x$. With no capability to verify $P$’s knowledge of $x$ unless the verifier also knows $x$ severely limits the capacity of these functions to provide verifiability. While there are candidate designs using a One-Way-Hash as a foundation to provide appropriate verifiability, on their own these functions cannot achieve that. This is where the requirement for more sophisticated functions of larger computational necessities come in.

Zero-Knowledge

In contrast, through employment of functions derived from the Zero-Knowledge family of functions, an actor $P$ can verify their knowledge of $x$ without the actor $V$ being aware of the value of $x$. Such a property implies that any actor can be sure that $P$ does have knowledge of $x$ without these verifying actors requiring any pre-requisite knowledge themselves (except that of needing access to the proof $y$ where $z(x)=y$). Zero-Knowledge functions until recently where more or less theoretical, but the same examples that predicate the need for a Functional-Verifiability scale for the purposes of trustless computation, drove the development of this class.

These functions are found much further along the scale and closer to the extreme end that anticipates large computational resources required for proofs and verification. As expected, this is true. One of the primary limitations currently on any function $z$ is is if a network is prepared to handle sizes of proofs produced by $P$ that need to be verified by each other actor $V$. The primary reason for these proof sizes being significantly larger than the aforementioned One-Way-Hash functions is a result of another weakness that increases further along the scale. This weakness is in that now these functions are less and less deterministic such that it is becoming computationally more feasible, gradually, for $P$ to prove knowledge of $x$ without knowing the object. It should be emphasised though that just because a proof is probabilistic does not necessarily mean it is not cryptographically-sound.

To present both the weaknesses, and underlying structure and purpose of Zero-Knowledge functions, the following common analogy is presented. If there existed an actor $P$ that wanted to prove two marbles were different colours to the actor $V$ who cannot see the colour-difference between said marbles, a rudimentary Zero-Knowledge function is required. If $V$ takes the marbles and takes turns at revealing only one marble at a time to $P$, asking them each time if the marbles had been switched since the last turn, $P$ would only be able to say at a probability greater than 50% if the two marbles and in fact distinct. Therefore, if $P$ was correct for every turn for a significantly large enough number of turns $V$ could be sure to a great probability that the two marbles must be distinct without knowing which colour each marble is.

As such for all examples of Zero-Knowledge functions there is a requirement for a step in the proof production process wherein some step has to be conducted a significant number of times to confirm there is a very low probability that $P$ does not actually know $x$. This multi-step process hence requires significant computation on the part of both $P$ and $V$, and computation that must be communicated across a network. While the above analogy described an “interactive” variety of a Zero-Knowledge function where $P$ and $V$ communicate back and forth repeatedly, there are “non-interactive” varieties as well where $P$ constructs a proof and any verifier $V$ can view it at a later date.

It is at this point that further detail should be provided to the properties that entail a given function $z$ is a valid Zero-Knowledge function:

  • Completeness
    • Assuming $P$ really does know an object $x$, they will be able to convince any other actor $V$ of this given $V$ only knows the proof produced $y$ and the Zero-Knowledge function $z$.
  • Soundness
    • Satisfying this property allows a verifier $V$ to be probabilistically sure that any proof $y$ provided by $P$ is valid in proving $P$’s knowledge of a given object $x$. There is always, however, a small chance (how small depends partially on the computational size of the proof $y$) that $P$ is able to construct a proof of knowledge of $x$ without truly knowing $x$.
  • Zero-Knowledge
    • Given that $y$ is a valid proof wherein $z(x)=y$ proves that $P$ knows the value of $x$, any verifier that receives this proof will learn no additional information about $P$ or $x$ apart from the fact that $P$ does know $x$.

The description of the above properties have been simplified so as to avoid direct continued reference to any concept of interactive or non-interactive proofs. All discussion about Zero-Knowledge functions from this point onwards will continue with the described generalised form.

Program-Obfuscation

There is a third family of functions, Program-Obfuscation functions, that allow for an actor $P$ to prove they have knowledge of some object $f(x)$ to another actor $V$ without sharing $x$. To this point in the article there have been a variety of references to the Program-Obfuscation family of functions as they are the primary functions derived for the challenge of obfuscation from the Functional-Verifiability scale. Any construction of these functions allow for the construction of any other functions on this scale, and as noted previously, any function on the Functional-Trapdoor scale (as well as numerous other scales). Computation required for the production of proofs such that $o(f(x))=z$ where $z$ serves as proof that $P$ knows an object $x$ are exponentially larger in scale than either Zero-Knowledge or One-Way-Hash functions. This significant increase in proof size has in fact existed as one of the primary limitations to a practical construction of such a function $o$ - to this date there has been no such construction that is proven to satisfy robust cryptographic assumptions. However, due to the huge potential available in providing functions that truly satisfy the requirements of the challenge of obfuscation, a large degree of academic work has continued to explore this problem.

As can be evidenced by the earlier example of proving knowledge of computation of a given object $x$, Program-Obfuscation functions are in this capacity an expanded form of Zero-Knowledge functions, wherein Zero-Knowledge functions do not allow an individual $P$ to prove knowledge of $f(x)$ in the case where $V$ does not hold any prior knowledge of $x$. It can be noted this is in much the same way as how any function that satisfies the challenge of obfuscation will satisfy the requirements of the challenge of verifiability (as proven earlier). Therefore, the design of a function originating from the Program-Obfuscation family of functions is a logical expansion of what can be computationally performed given the challenges described above. Presenting the following two expressions in the context that $V$ does not know $x$, but knows the values $y$ as some proof, and $z$, and $o$ as some Zero-Knowledge function and Program-Obfuscation function respectively.

$$\begin{align*} z(x)&=y \\ o(f(x))&=y\end{align*}$$

In both these cases, $y$ serves as a proof that the individual $P$ knows $a$ where $z(a)=y$ and $o(a)=y$. While the difference illustrated here may appear subtle as explored previously, it is an important difference. Running computation on $x$ without revealing this value in the form of $f(x)$ allows for the construction of more complex proofs, such as existing and theoretical cryptographic primitives. If $f(x)$ can be computed without revealing $x$, $g(f(x))$ is also equally valid of a construction. These facts and proofs are only repetitions of what has already been explored in this article. There are three formally proposed functions derived from the Program-Obfuscation family of functions: Black-Box Obfuscation, Indistinguishability Obfuscation, and Extractability Obfuscation. Though none have ever been implemented under practical cryptographic assumptions, some theoretical progress has been made in defining these functions and proposing possible implementations.

Black-Box Obfuscation

Given some function $f$ wherein a Black-Box Obfuscation function $o$ can be constructed such that $o(f(x))=y$ where $y$ serves as a proof of $P$’s knowledge of $x$, no knowledge can be known of said function that couldn’t be known without querying inputs and analysing outputs. Often depicted as the simplest and most intuitive definition for any Program-Obfuscation function, it is required that no actor $V$ could feasibly understand the function $f$ beyond seeing which inputs produce which outputs. With successful implementation of this particular type of Program-Obfuscation there would be no information available to $V$ that could possibly derive them any knowledge of $x$ or any property of $x$ apart from that $P$ does know an $x$. However, this has been proven to be impossible as there will always exist some function $g$ that will reveal more information about $f$ than would have been known by only querying inputs.

To understand why it is impossible it must be said that an actor $P$ has constructed two functions:

$$ \begin{align*} f(a,b,x)&=\begin{cases} b, \ x=a \\ 0 \end{cases} \\ g(a,b,x,f)&=\begin{cases} 1, \ f(x)=b \\ 0 \end{cases} \end{align*} $$

In addition, it is defined that there exists a function $i$ such that $i(x)=0$ for all $x$. For an appropriate Black-Box Obfuscation to exist it must be the case that no actor (including $P$) is capable of distinguishing between either $o(f(a,b,x))$ or $o(i(x))$ without attempting all possible inputs on both computations. It is necessary that this is the case, as if $P$ or any actor at all has the capability to differentiate between either function without testing every input, then the function $o$ must be assumed to be leaking further information about said functions than would be otherwise available if the actors were given a black-box. However, assuming $P$ has performed the obfuscation on both $f$ and $i$, as well as $g$, the function $g$ will reveal more information about an obfuscated $f$ than if any actor only had black-box access to $f$. In addition to this, as $P$ has specially set the values of $a$, $b$, and $x$, this actor will be able to discern both the obfuscated versions of $f$ and $i$ only using $g$ without needing to test every single possible input. Hence, there would exist an actor who has more knowledge about the obfuscated function than they would have from black-box access.

By proposing this proof though, it was revealed that while Black-Box Obfuscation is impossible, it is possible that Indistinguishability Obfuscation could be implemented. While a weaker variety of obfuscation, it would still sufficiently satisfy the criteria of a Program-Obfuscation function such that it would allow for the challenge of obfuscation to be confronted.

Indistinguishability Obfuscation

Given two functions $f$ and $g$ that functionally perform identical computation (but on different objects), $f$ and $g$ cannot be discerned or differentiated apart from querying inputs and analysing outputs. While a more specific case of Black-Box Obfuscation, it would still prevent an individual $V$ from knowing $x$ when given $y$ yet verify that $f(x)$ is valid. It has been proven that if any Program-Obfuscation function is possible to construct, Indistinguishability Obfuscation is possible to construct. If it were possible to differentiate two functions that perform identical computation, then it would not be possible to create any two functions that could be obfuscated.

Visualisation of Indistinguishability Obfuscation

Visualisation of Indistinguishability Obfuscation wherein two functions of identical length cannot be feasibly discerned by any actor.

Interestingly, both Indistinguishability Obfuscation and the closely related Extractability Obfuscation rely on similar properties to those relied upon by One-Way-Hash functions - perhaps in part due to the similar purposes for the two classes of functions? To elaborate on these remarks it can be noted that as established previously any obfuscation performed using an Indistinguishability Obfuscation function must allow two functions $f$ and $g$ that perform the same computation to be indistinguishable. For two functions to be indistinguishable it must be said that given two outputs $a$ and $b$ where $o(f(x))=a$ and $o(g(y))=b$, it must be infeasible for a bounded actor $R$ to resolve the the functions that generate each output. This is to say that it must be infeasible for $R$ to construct two new functions $o^{-1}(a)=f(x)$ or $o^{-1}(b)=g(y)$. It can be visualised that if it is said that $o(f)=a$ and it must be infeasible to construct a function $o^{-1}(a)=f$, then this is an identical property to One-Way-Hash functions. To discern, the exact property in question is first pre-image resistance; second pre-image resistance for any form of obfuscation is not typically necessary as it is redundant to require any specific condition $o(f) \neq o(g)$ as it does not undermine the security of either function being obfuscated.

In contrast, Extractability Obfuscation requires the even stronger property of collision resistance which will be detailed further below. However, this distinction does not necessarily imply that Indistinguishability Obfuscation is a weaker cryptographic variant of Extractability Obfuscation. As discussed with One-Way-Hash functions often collision resistance is not required to ensure a robust cryptographic primitive unless a significantly large number of inputs are expected. For any given function to allow for collision resistance it is also the case that it must support first pre-image resistance, meaning any Extractability Obfuscation is an Indistinguishability Obfuscation.

Extractability Obfuscation

Any function $o$ that is a valid Extractability Obfuscation function is necessarily a valid Indistinguishability Obfuscation function, meaning that any two functions $f$ and $g$ that perform the same computation are indistinguishable to any given actor. In addition to this property however, it is also the case that such a valid function will allow two functions $f$ and $g$ to remain indistinguishable unless an actor $P$ can find an input $x$ where $o(f(x)) \neq o(g(x))$. If the prior expression is visualised as $o(f) \neq o(g)$, where the inequality implies they are not equal across every possible value of $x$, then the property of collision resistance can be explored. This new visualisation is also intuitive to state that $f \neq g$ if and only if $o(f) \neq o(g)$. However, implementing such a version of obfuscation is potentially more difficult (in part due to the stronger cryptographic assumptions). It implies that even in the case of two functions that do not perform identical computation, any actor $P$ could only differentiate said functions if they exhausted all known inputs until discovering an input in which the functions differed in output.

function f(x) {
	if (x * x == x * x * x) {
		return x;
	} else {
		return x + 1;
	}
}
function g(x) {
	if (x == 1 || x * x == x + x) {
		return x;
	} else {
		return x + 1;
	}
}

The above examples are simplistic and are not representative of a practical application of Program-Obfuscation, but it does demonstrate the point being made. Forms of obfuscation could be conceived to be valid Indistinguishability Obfuscation functions even in the case that an actor could differentiate the two functions above as they perform computationally unique tasks. However, valid Extractability Obfuscation functions would not allow an actor to discern which function is which (once it has been obfuscated) unless they are able to discover the $x$ in which said functions produce a different output. In the above case $x=2$ is the only input in which a different output is produced. Hence, the actor would need to run $o(f(2))$ and $o(g(2))$ to discover that they are unique functions.

Properties as illustrated have been denoted as similar to the collision resistance property of a One-Way-Hash function as can be seen by the following logical statement. To say, for all $x$ where $o(x)=a$ then for all $y \neq x$ it must be that $o(y) \neq a$ (collision resistance), is logically equivalent to saying, if $y \neq x$ then $o(y) \neq o(x)$. This is identical to the observation made previously where $o(f) \neq o(g)$ only when $f \neq g$ at least at one point in some shared input. Hence, if $f$ and $g$ are truly two separate functions that perform separate computations, it is impossible to have them equal each other at all points, and therefore they are only truly separate functions if there does exist a point they are unequal.

These properties of first pre-image resistance and collision resistance have carried over from the verifiability functions derived from One-Way-Hash and Zero-Knowledge classes, to the obfuscation functions described from the Program-Obfuscation family. As such the link between them is clear, and the mechanisms needed to build any obfuscation functions from a verifiability function can be discerned. However, to truly complete the link a specialised family of functions, Homomorphic-Encryption functions, must be introduced.

Homomorphic-Encryption

Homomorphic-Encryption functions are truly unique in regards to the concept of a Functional-Verifiability scale. They neither exist to ensure verifiability nor obfuscation when given the knowledge or computation of an object $x$. However, under certain cryptographic assumptions, they can be used to construct a valid version of any function that exists on the scale. To such an extent functions from this class are the true cryptographic primitive necessary for the challenges confronted throughout the article. Where it is said that the Program-Obfuscation family of functions present as nearly perfect examples of complete obfuscation at the extreme end of the Functional-Verifiability scale, the Homomorphic-Encryption family of functions sit at an approximately identical position on said spectrum. Currently computational requirements are similar, and currently both are capable of performing approximately identical functionality; the proving of knowledge of an object $x$ and the proving of the computation of $x$.

Homomorphic-Encryption has a variety of possible constructions, all that are viable as a primitive for other Functional-Verifiability functions. However, the strongest possible construction and the one of most use to allow for all forms of computation is Fully-Homomorphic-Encryption. Throughout this article any reference to Homomorphic-Encryption is in regards to Fully-Homomorphic-Encryption in which the properties of said class of functions are described below. All properties are described in perspective of performing computation on an object $x$ without requiring knowledge of $x$ by presenting $e(x)$ to any actor.

  • Semantic security
    • Where there exists an expression $e(x)=y$, it is not possible for any bounded actor $R$ to construct an inverse expression $e^{-1}(y)=x$ to calculate the object being encrypted.
  • Mathematical completeness
    • All mathematical operations $\circ$ as extensions of addition and multiplication must be supported such that $e(x) \circ e(y) = e(x \circ y)$ for all values of $x,y$.

It can be noted that the property of semantic security is logically equivalent to the requirements of first pre-image resistance in One-Way-Hash functions, and the requirements of indistinguishability in Indistinguishability Obfuscation. As such a continued link can be visualised between all the aforementioned Functional-Verifiability functions and Homomorphic-Encryption as a primitive to construct said schemes. It must also be noted that unless the output produced $y$ is of a fixed-length, there will typically be no feasibility of finding an $a \neq x$ such that $e(a)=e(x)$; in other terms, second pre-image resistance can typically be assumed under such assumptions of $y$. If $y$ is of a fixed-length though it is up to each specific implementation of a Homomorphic-Encryption function to discern whether second pre-image resistance would be implemented.

Using Homomorphic-Encryption functions of a variety such that it produces an output of a fixed-length while maintaining the logical equivalent of second pre-image resistance, any One-Way-Hash function can thus be constructed. It can be noted from prior reading in this article that the properties of first and second pre-image resistance are the only necessary properties to ensure a One-Way-Hash function (though perhaps one of unsatisfactory cryptographic assumptions depending on the context). In a similar manner, assuming that any valid Homomorphic-Encryption function $e$ appropriately conceals the value of $x$ where $e(x)=y$ and any actor is only provided the value $y$, the construction of a valid Zero-Knowledge function is possible. As it is known that this property is equivalent to semantic security, where no actor $R$ will ever be able resolve the value of $x$ when given $y$, it is known that any valid function $e$ must therefore allow for the construction of a valid Zero-Knowledge function.

$$ z(e(x))=z(x)=y $$

Where $z$ is a valid Zero-Knowledge function, and $x$ is some object that an actor $P$ desires to prove their knowledge of, the above expression can be seen to hold true. In contrast to the construction of a One-Way-Hash function entirely through the means of $e$ being a valid Homomorphic-Encryption function (and therefore a valid One-Way-Hash function at the same time), Homomorphic-Encryption functions simply support the implementation of Zero-Knowledge functions by their existence.

While there exists no practical implementation as of yet, it is possible to conceive that any solution that would allow a robust version of Homomorphic-Encryption would allow a robust version of Program-Obfuscation, and vice versa (though the latter statement has not been proven). Such an idea can be extended in tangent with the proof introduced above of the relation between One-Way-Hash and Program-Obfuscation functions. Any robust version of Homomorphic-Encryption would allow the construction of a One-Way-Hash function, implying that a One-Way-Hash function that is a valid Homomorphic-Encryption function would allow for a valid Program-Obfuscation function. To further stipulate upon and demonstrate this claim the following expressions copied from above are presented, where $h$ is a valid One-Way-Hash function, $o$ is a valid Program-Obfuscation function, and $x$ is some object that an actor $P$ wants to prove computation of to another actor $V$ without revealing said object.

$$ \begin{align*} f(h(x))=f(x) \rightarrow o(f(h(x)))=o(f(x)) \end{align*} $$

In the case that $h$ is only a valid One-Way-Hash function, and not a valid Homomorphic-Encryption function as well, the above expression is of limited viability. There are however a common set of functions where said viability is available. These such cases assume that both $h(x)$ verifies $P$’s knowledge of the object $x$, as well as that the performing of computation in the form of $f$ on $h(x)$ is equivalent to performing the same computation on $x$. Considering point functions (any functions that return an output in the case where an input equals some specific value) such as the one illustrated below it is possible to generate a proof $y$ where $h(x)=y$ and perform computation on said proof to prove knowledge of the computation of $x$.

function o(y) {
	let Vy = 0x38c2f...; // Previously known value of y by the actor V
	if (Vy == y) {
		return true;
	} else {
		return false;
	}
}

By inspecting the above function $o$ it can be seen that no other actor apart from the constructor of said function (most likely the actor $V$) will feasibly know the value of the object $x$. Hence, it is possible to construct a point function where computation can be performed on an object $x$ without revealing the value of said object in a manner that is verifiable to the constructor of said function (once again likely the actor $V$). As established with One-Way-Hash functions though the computation of $y$ can only be verified by an actor with prior knowledge of $x$ where $h(x)=y$. This limitation prevents the possibility of perfect obfuscation, as all verifiers must know the contents of the object whose knowledge of by the actor $P$ is being proven. An additional limitation is in that no further sophisticated computation can be performed upon $x$ beyond hashing or comparing the value of it to some other value.

Both of these limitations requires the valid One-Way-Hash function $h$ to additionally be a valid Homomorphic-Encryption function, as in that case all computation can be performed on the proof $y$ as if it was being performed on the original object $x$. Such computation may include any mathematical operation, including multiplication, or even other Functional-Verifiability functions (i.e. Zero-Knowledge functions) that would allow for knowledge of $x$ by $P$ to be proven to any verifier without prior knowledge of the object. Hence, what could previously only be achieved to a limited extent with point functions could be expanded to be applicable to the set of all functions - allowing for nearly perfect obfuscation.

Conclusion

Given the above findings in reference to the existence of Homomorphic-Encryption as a primitive to serve as a foundation for all functions derived from the Functional-Verifiability scale, and the knowledge of several of these class of functions and their purposes, more advanced designs can be conceptualised. Primarily for investigation is the notion of using Homomorphic-Encryption to support the creation of a valid Program-Obfuscation function. To potentially satisfy the creation of a Program-Obfuscation functions and as such respond to the challenge of obfuscation in any given network, there are some properties that must briefly be revisited for emphasis. Initially it must be noted that Extractability Obfuscation is the strongest variety of obfuscation available, but Indistinguishability Obfuscation is satisfactory under standard cryptographic assumptions as well. As Indistinguishability Obfuscation has been proven possible to construct in any instance where a Program-Obfuscation function is possible to construct this shall be the scheme that is put forth in the following candidate design. The following properties are therefore necessary of Indistinguishability Obfuscation.

  • Correctness
    • For any given obfuscation function $o$, the expression $o(f(x))=f(x)$ must hold true for all objects $x$ and all functions $f$.
  • First pre-image resistance
    • When any actor $R$ receives the output $y$ where $o(f(x))=y$ it is computationally unfeasible to discover an inverse function $o^{-1}(y)$ such that said actor could resolve the value of $f(x)$.

Previously the link has been made between a valid Program-Obfuscation function and a One-Way-Hash function, especially as second pre-image resistance is either implied by the design of $o$ or unnecessary given the cryptographic assumptions required. Taking a One-Way-Hash function $h$ that is also a valid Homomorphic-Encryption function, it is possible to create an expression that satisfies the above properties and exists as a valid Program-Obfuscation function.

$$ \begin{align*} o(f(h(x)))=f(x) \end{align*} $$

The computation $f$ can therefore be performed on an object $x$ by an actor $P$ without ever revealing the value of $x$ to any verifier $V$. Hence, using One-Way-Hash functions that double as valid Homomorphic-Encryption functions, it is possible to design a function of near-perfect obfuscation on the Functional-Verifiability scale. What this design does not promise however is the capability to avoid the exponentially increasing computation associated with achieving this set goal. The candidate design illustrated also is not guaranteed to be possible. It has not been proven that a Fully-Homomorphic version of a One-Way-Hash function could exist, and it has not yet been proven if such designs could ever become computationally feasible for a prover $P$ to run (a requirement which is often overlooked in cryptographic schemes).

While there has been work committed by the research community to date, investigating schemes such as Homomorphic-Encryption functions that are in addition valid One-Way-Hash functions, as well as the existence of Program-Obfuscation functions separately, not much work has been committed to associating this two together. Currently most exploration of both primitives involve analysing the possibility of either creating cryptographically secure versions of multi-linear mappings or lattice-based cryptography. Neither field has yet yielded the connecting the dots that will make this all possible though, and as such there is continuing merit to investigating if alternative schemes could yield the ideal results desired.

This article concludes here however, predicated on the notion that the most opportune direction forward to developing an appropriate Functional-Verifiability scale is to present the necessary properties to be satisfied to a wider audience. While a long-term solution that may not be eventuated for many years, it has been intended that the purpose to exploring and solving challenges in this field has been made clear.

If you have anything that you think should be added to this summary, or if you would like to collaborate on a future research task, please reach out to research@mycelium.ventures.

If you have learnt something and valued this research, please consider supporting mycelium-research.eth

© Copyright 2022, Mycelium Ventures Pty Ltd, ALL RIGHTS RESERVED.

We use cookies on this site to enhance your user experience.

By continuing to browse, you agree to the storing of cookies on your device to enhance your site experience and for analytical purposes. To learn more about how we use cookies, please see our privacy policy.

Discover how you can make an impact by joining our team.