[DG-BSC] New BSC use-case: Privacy-Preserving Data Sharing on Blockchain P2P Nodes
hardjono at mit.edu
Tue Aug 2 09:03:01 CDT 2016
Below is a new BSC use-case I'm championing. It's a first sketch (I'll be revising/improving it).
Name of use case: Privacy-Preserving Data Sharing on Blockchain P2P Nodes
Instead of a centralized data processing architecture, P2P nodes (e.g. in a blockchain) offer the opportunity for data (user data and organizational data) to be stored by these nodes and processed in a privacy-preserving manner.
In this new paradigm of privacy-preserving data sharing, we “move the algorithm to the data”: queries and subqueries are computed by the data repositories (nodes on the P2P network). This means that repositories never release raw data; they perform the algorithm/query computation locally and return aggregate answers only. Moving the algorithm to the data gives data owners and other joint rights-holders the opportunity to exercise control over data release, and thus offers a way forward to provide the highest degree of privacy preservation while still allowing data to be shared effectively.
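As a rough illustration of “moving the algorithm to the data”, the sketch below models a repository that keeps its raw records private and answers only aggregate operations. The class name DataRepository, the min_group_size threshold and the set of allowed operations are assumptions made for this example; they are not part of the use-case.

```python
from statistics import mean


class DataRepository:
    """Hypothetical repository node: holds raw records, releases aggregates only."""

    def __init__(self, records, min_group_size=3):
        self._records = list(records)          # raw data never leaves this object
        self._min_group_size = min_group_size  # assumed privacy threshold

    def run_aggregate(self, field, op):
        """Compute an aggregate locally and return only the aggregate value."""
        ops = {"count": len, "sum": sum, "mean": mean}
        if op not in ops:
            raise ValueError(f"operation {op!r} is not an allowed aggregate")
        values = [r[field] for r in self._records if field in r]
        if len(values) < self._min_group_size:
            # Refuse to answer when the group is too small to hide individuals.
            raise PermissionError("too few records to answer safely")
        return ops[op](values)


repo = DataRepository([{"age": 30}, {"age": 40}, {"age": 50}])
average_age = repo.run_aggregate("age", "mean")
```

The querier receives average_age but never sees the individual records.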
This paradigm requires that queries be decomposed into one or more subqueries, where each subquery is sent to the appropriate data repository (a node on the P2P network) and executed at that repository. This allows each data repository to evaluate the subqueries it receives in terms of “safety” from a privacy and data leakage perspective.
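One possible shape for decomposition and per-repository safety evaluation is sketched below. The SubQuery type, the SAFE_OPS whitelist and the minimum group size are hypothetical; the use-case does not define what makes a subquery “safe”, so the rule here is only a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubQuery:
    repository_id: str  # which repository should execute this subquery
    field: str          # field to aggregate over
    op: str             # requested operation


# Assumed whitelist: only aggregate operations count as safe in this sketch.
SAFE_OPS = {"count", "sum", "mean"}


def decompose(field, op, repository_ids):
    """Split one query into one subquery per target repository."""
    return [SubQuery(rid, field, op) for rid in repository_ids]


def is_safe(subquery, record_count, min_group_size=3):
    """Repository-side check: aggregate op over a sufficiently large group."""
    return subquery.op in SAFE_OPS and record_count >= min_group_size


subqueries = decompose("age", "mean", ["repo-a", "repo-b"])
```

Each repository would call is_safe on the subquery addressed to it, using its own record count, before agreeing to execute it.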
Furthermore, safe queries and subqueries can be expressed in the form of a query smart contract (QSC) that legally binds the querier (person or organization), the data repository and other related entities.
A query smart contract that has been vetted as safe can even be stored on nodes of the P2P network (e.g. blockchain). This allows Queriers not only to search for useful data (as advertised by the metadata in the repositories) but also to search the P2P network for prefabricated safe QSCs that match the intended application. Such a query smart contract will require that identity and authorization requirements be encoded within the contract.
A node on the P2P network may act as a Delegate Node in the completion of a subquery smart contract. A Delegate Node works on a subquery by locating the relevant data repositories, sending the appropriate subquery to each data repository, receiving the individual answers, and collating them for reporting to the (paying) Querier.
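The Delegate Node's send/receive/collate loop might look like the following sketch. Here each repository is modeled as a plain function that returns a partial aggregate (count and sum), and the delegate combines the partials into an overall mean. The function names and the choice of (count, sum) as the partial answer are assumptions for illustration only.

```python
def make_repository(records):
    """Hypothetical repository: answers a subquery with (count, sum) only."""
    def answer(field):
        values = [r[field] for r in records if field in r]
        return len(values), sum(values)
    return answer


def collate_mean(partials):
    """Combine per-repository (count, sum) pairs into one overall mean."""
    total_count = sum(count for count, _ in partials)
    total_sum = sum(subtotal for _, subtotal in partials)
    if total_count == 0:
        raise ValueError("no data returned by any repository")
    return total_sum / total_count


def delegate_run(repositories, field):
    """Send the subquery to every repository, then collate the answers."""
    partials = [answer(field) for answer in repositories.values()]
    return collate_mean(partials)


repositories = {
    "repo-a": make_repository([{"age": 30}, {"age": 50}]),
    "repo-b": make_repository([{"age": 40}]),
}
result = delegate_run(repositories, "age")
```

Combining counts and sums, rather than averaging per-repository means, keeps the result correct when repositories hold different numbers of records.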
A Delegate Node that seeks to fulfil a query smart contract should only do so when all the conditions of the contract have been fulfilled (e.g. the QSC has a valid signature; the identity of the Querier is established; authorization to access APIs at the data repositories has been obtained; payment terms have been agreed, etc.). A hierarchy of Delegate Nodes may be involved in the completion of a given query originating from the Querier entity. The remuneration scheme for all Delegate Nodes and the data repositories involved in a query is outside the scope of the current use-case.
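The precondition check listed above could be sketched as follows. The QuerySmartContract fields are hypothetical, and an HMAC over a shared key stands in for the signature purely to keep the example self-contained; a real QSC would presumably carry a public-key signature, which the use-case does not specify.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class QuerySmartContract:
    body: bytes
    signature: str                   # stand-in: hex HMAC-SHA256 of body
    querier_identity_verified: bool
    api_access_authorized: bool
    payment_terms_agreed: bool


def sign(body, key):
    """Stand-in signature (assumption): HMAC-SHA256 with a shared key."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def unmet_conditions(qsc, key):
    """Return the list of contract conditions that are not yet satisfied."""
    unmet = []
    if not hmac.compare_digest(qsc.signature, sign(qsc.body, key)):
        unmet.append("valid signature")
    if not qsc.querier_identity_verified:
        unmet.append("identity of Querier established")
    if not qsc.api_access_authorized:
        unmet.append("authorization to access repository APIs")
    if not qsc.payment_terms_agreed:
        unmet.append("payment terms agreed")
    return unmet


def may_fulfil(qsc, key):
    """A Delegate Node proceeds only when every condition is met."""
    return not unmet_conditions(qsc, key)


key = b"demo-key"
body = b"mean(age)"
good = QuerySmartContract(body, sign(body, key), True, True, True)
bad = QuerySmartContract(body, "00", True, False, True)
```

Returning the list of unmet conditions, rather than a bare boolean, lets a Delegate Node report to the Querier what is still missing.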
(a) Data repository node: The role of the data repositories is to hold raw data and perform privacy-preserving “safe” query/subquery computations.
(b) Querier: This is the entity that seeks to run a query over data that are distributed across multiple data repositories.
(c) QSC node: These are the nodes that store a signed copy of a query smart contract.
(d) Delegate Node: This is a node on the P2P network that seeks to complete a subquery smart contract.