November 2016

Nominal-knowledge Data Manipulation, Multi-state Cryptography, and the Clay-Balloon Analogy



Now that my deluge of midterms has subsided, the inaugural #NYCyber912 (which I co-organized) is over, and my work as a SysAd has successfully mitigated incursions into SIPA DCG’s infrastructure, I can get back to my normal life. This means I can start thinking in the abstract again–at least until finals roll around.

Let me begin by presenting an analogy. Assume for a moment that you have a ball of pliable clay. Now envision that this mass is inside of an elastic balloon which is inflated. There is a small layer of air between the clay internals and the exterior material–just enough that the balloon obstructs what it encases. It’s possible to push on the outside of the balloon to work with the clay inside, but impossible to see its color, texture, shape, or other superficial attributes. It’s also possible to know that the clay exists, but not observe any of the resultant visual characteristics once the clay has been reshaped.

Recently I’ve been thinking about nominal-knowledge datasets–collections of information that can be aggregated without the collector knowing what the dataset contains. This is not a new concept; it is prevalent in the world around us. Surveys are an easy example: a user submits information and assumes that other users are also submitting similar inputs (within the bounds of what is requested), but the user isn’t exactly sure what the database contains.

What is novel is the continual ability of the user (herein, the “worker”) to work with and change what’s contained in the dataset without having read access to it (nominal-knowledge data manipulation, “NKDM”). I derive part of this concept of nominal-knowledge datasets from ideas introduced in 1985 by Goldwasser, Micali, and Rackoff in “The Knowledge Complexity of Interactive Proof-Systems”. These authors originally introduced the concept of zero-knowledge and zero-knowledge proofs.

However, nominal-knowledge is not to be confused with zero-knowledge as “[a]n interactive proof is said to be zero-knowledge (ZK) if it yields nothing beyond the validity of the assertion being proved” (Rosen 2006). In the case of nominal-knowledge, the worker has some understanding of what the underlying data happens to be (hence, not zero-knowledge).

Zero-knowledge proofs (“ZKP”) work by having a prover convince a verifier that a statement is true (for example, that the prover holds a particular secret) without revealing anything beyond the truth of that statement. By comparison, NKDM works by giving the worker only enough access and information about the dataset so that they are able to consistently work with and help maintain it. However, NKDM does not give the workers enough information to know anything substantive about the effects of their inputs, calculations, or edits. Rather than proving a claim to an untrusted party without disclosing the underlying information (ZKP), NKDM allows workers to affect the data without being privy to all of its attributes (the clay-balloon analogy).

For clarity within this post, the term “user” will be referred to hereinafter as “worker”. This is to distinguish between workers (who work with data) and owners (who own and maintain the data). This is also to avoid conflation with users that are included in the “other” permission group in “owner-group-other” Unix file permissions.

The importance of this concept rests not on cryptographic assurances that data is authentic, but rather on the data being abstracted from those that have access to it.

Obviously, this is a wildly theoretical concept, and one that seems more at home in quantum computing than in binary digital computing.

NKDM is a more realistic quantum concept because the data would have to exist in different states simultaneously (a ball of clay and an obstructed ball of clay). Data would have to exist in plaintext for the owner of the data, but would also have to exist in a separate (and encrypted) state for the worker. Ignoring the obvious public key/quantum issues for a moment, it’s easy to explain NKDM in terms of public key crypto. The owner would have to maintain the data as its prime factors, but each time the data was accessed by the worker, it would have to be presented to them to work with as a very large non-prime integer. Once the data was accessed and manipulated, it would have to exist as a resultant large non-prime integer, but be capable of being simultaneously factored by the owner to a new set of prime factors reflecting the worker’s manipulation.
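The owner/worker split described above can be loosely approximated on ordinary hardware. The sketch below is not the multi-state scheme itself: it swaps in the multiplicative property of textbook RSA, where the owner keeps the prime factors and the worker only ever handles values modulo their product. The primes, the exponent, and the function names (`owner_encrypt`, `worker_multiply`, `owner_decrypt`) are all hypothetical choices for illustration, and textbook RSA with numbers this small offers no real security.

```python
# Toy textbook RSA. All constants and names are illustrative assumptions.
P, Q = 61, 53                       # owner's secret prime factors
N = P * Q                           # large-ish composite the worker may see
E = 17                              # public exponent, coprime to (P-1)*(Q-1)
D = pow(E, -1, (P - 1) * (Q - 1))   # owner's private exponent (Python 3.8+)


def owner_encrypt(message: int) -> int:
    """Turn a plaintext integer into the opaque value handed to the worker."""
    return pow(message, E, N)


def worker_multiply(cipher_a: int, cipher_b: int) -> int:
    """Worker combines two opaque values without learning either plaintext."""
    return (cipher_a * cipher_b) % N


def owner_decrypt(cipher: int) -> int:
    """Only the holder of the prime factors (via D) can recover the result."""
    return pow(cipher, D, N)


opaque_result = worker_multiply(owner_encrypt(7), owner_encrypt(6))
recovered = owner_decrypt(opaque_result)   # the product of the two plaintexts
```

The worker changes the data (here, by multiplying) and the owner can read the outcome, but this only covers one operation and gives the worker no feedback at all, so it falls well short of the full idea.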

To some extent, this concept is already feasible in traditional binary digital systems. In Unix-based environments, for example, a file can have permissions set as -rwx-w---- (where the “worker” from this example takes the place of the “group” in Unix file permissions). However, the group would have to know what the file looks like initially so that it could make changes to it. This does not provide the nominal-knowledge benefits that multi-state crypto (“MSC”) provides in NKDM (MSC being the disposition of the data to exist in different encrypted states at the same time). The group gets no feedback on how the data is manipulated; it simply knows that the data changed because of its actions. One errant edit can ruin the integrity of the entire dataset.
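For reference, the -rwx-w---- permission string corresponds to octal mode 720. A minimal sketch, assuming a POSIX system and using a throwaway temporary file (the helper name `make_write_only_for_group` is hypothetical):

```python
import os
import stat
import tempfile


def make_write_only_for_group(path: str) -> str:
    """Set owner=rwx, group=-w-, other=--- and return the resulting mode string."""
    os.chmod(path, 0o720)
    return stat.filemode(os.stat(path).st_mode)


# Create a scratch file, apply the mode, then clean up.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name

mode_string = make_write_only_for_group(demo_path)
os.remove(demo_path)
print(mode_string)
```

With this mode, members of the group can write to the file but cannot read it back, which is the "no feedback" limitation noted above.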

Unlike in traditional systems, workers in NKDM environments know that the data exists, how it is changed, and what other options they have available to work with it. Using the clay-balloon analogy, the worker knows that the clay has been manipulated; it just doesn’t know the full extent of its new shape or the composition of the clay. Put another way, in NKDM the worker gets some sense of how its changes interact with the data, just not what the comprehensive results happen to be. The benefit here (in addition to information security) is that the data maintains its integrity. Because the worker is getting some type of feedback from the MSC data, it knows whether its edits are useful or detrimental to the data’s overall state (e.g. did the clay break apart into multiple pieces after the last change?).
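The feedback loop can be pictured with a small interface sketch. It models only the shape of the interaction and has no cryptography behind it: the class `OpaqueStore`, its methods, and the "no value may go negative" integrity rule are all assumptions made up for this illustration, and Python does not actually stop a caller from reading `_values`.

```python
from typing import List


class OpaqueStore:
    """Holds values the worker may adjust but is never shown."""

    def __init__(self, values: List[int]) -> None:
        self._values = list(values)

    def adjust(self, index: int, delta: int) -> bool:
        """Worker-facing call: apply an edit, report only whether integrity held."""
        if not 0 <= index < len(self._values):
            return False
        candidate = self._values[index] + delta
        if candidate < 0:           # hypothetical integrity rule: "the clay broke"
            return False            # edit is rejected, data left unchanged
        self._values[index] = candidate
        return True

    def owner_view(self) -> List[int]:
        """Owner-facing call: the full plaintext state."""
        return list(self._values)


store = OpaqueStore([5, 10])
first_edit_ok = store.adjust(0, -3)    # accepted
second_edit_ok = store.adjust(0, -3)   # rejected: would break the integrity rule
```

Even a single bit of feedback like this tells the worker something about the hidden values, which fits the "nominal" rather than "zero" knowledge framing.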

There are numerous benefits to NKDM, but the most important are in the financial, health, telecommunications, cloud computing, and defense & intelligence sectors.

  • Finance and Health: Audit Trails and Access Control. NKDM has a place in the financial and health sectors because it allows for efficient compliance with PCI, SOX, GLBA, and HIPAA, among others, since the data is dynamic but readable only to those that require access. With this innovation, it would be possible to set specific and independent access controls on the data for individual machines like medical devices, and entities like technicians, administrators, and insurance companies.
  • Cloud Computing: Microtransactions and Pay-per-compute Instances. NKDM could revolutionize how server farms, supercomputers, and academic institutions handle computational tasks for customers. Rather than giving services like Amazon Web Services or Google Cloud Platform complete access to data, it would be possible to give them access to abstracted data and let their hourly compute instances work on conducting nominal-knowledge calculations rather than calculations on plaintext datasets. In essence, they would be processing computations; they just wouldn’t know what they are computing.
  • Defense and Intelligence: Reporting and Analysis. NKDM has the ability to add a layer of abstraction between those involved with reports and those involved with analysis. Regardless of organizational policy regarding who should have access to finalized reports, NKDM adds a layer of security to reports drafted abroad in enemy territory. Even in the event that a field post is compromised, security of the reports is maintained because the operator in this environment didn’t have read access to them in the first place.
  • Telecommunications: In-transit comms. NKDM could also be used to manipulate and/or evaluate encrypted data as it’s in transit. For example, it could adjust the level of compression of individual files to facilitate faster transport of end-to-end encrypted communications. Alternatively, NKDM could evaluate the actual use of data that is transported over dedicated connections to dynamically adjust how much customers pay (based on what’s used and not what’s allocated). Obviously, there are defense and intelligence applications here as well.

There are other instances where NKDM could be useful–basically any application where access control is paramount and where dissemination must be limited. Essentially, any instance where data needs to be dynamic/updatable and simultaneously secret would benefit from this concept.

Unfortunately, to reiterate, NKDM is not possible given current advancements in computing. Until quantum becomes standard, the world will have to make do with relying predominantly on the judgement of humans rather than strong crypto to protect information assets. In the interim, it’s fun to ponder unprecedented new ideas such as NKDM and consider what cybersecurity will look like in the future.


Featured Image Credit: David Bleasdale, “letter-sphere-d”. Creative Commons Attribution 2.0 Generic. Modified, Desaturated.

SHA256( dfddd96a8eab4413a6384908d73d84e5a79585068c59e0c7e1740375e8cd22ba