The healthcare industry is constantly advancing, with innovations ranging from new drugs to surgical equipment. The industry is in the best shape it has ever been, but it still falls short in certain respects. 

The main issues the industry faces are the interoperability of patients’ medical records between different hospitals and information blocking by competing hospitals. The interoperability problem stems from the absence of a universally recognized patient identifier, which would resolve mismatches in a patient’s Electronic Health Record [EHR]. 

Information blocking, although an illegal and unethical practice, has been quite prevalent in the healthcare industry. Certain medical institutions impose constraints on patient data to reduce the chance of losing that patient as a customer and, worse, to make it difficult for them to move to a different hospital. 

Several measures are being taken to tackle the rigid handling of patient medical records, but very little is being done about sharing patients’ genomic data and interoperability between genomic databases. Genomic databases hold essential information about human DNA and genes, allowing researchers to draw biological inferences about why our bodies respond the way they do to particular diseases or medicines. 

Genomic data carries a large computational burden, along with ownership and data privacy issues: high-throughput sequencing [HTS] platforms that process genomic data require an immense amount of computational resources.

Once processed and analysed, this data is stored in central repositories with access control. This centralized structure acts as an unnecessary mediator between data users and the actual data owners. Centralized repositories are also prone to single points of failure, both in data privacy and in service availability. 

So the three main issues we run into here are:

  1. Storing the enormous amount of genomic data once it is processed.
  2. The privacy of individuals who contribute biological material such as DNA, and their right to control access to it.
  3. The time it takes for central repositories to grant or revoke access to this data once a request is initiated.  

These problems can be curbed in a few ways:

  1. A method that allows us to store this data in a distributed fashion so that single points of failure won’t have a significant impact.
  2. Increasing transparency between data users and owners.
  3. Possibly offering incentives for data owners to allow access to their DNA material.
  4. Ease of access and a fast consent mechanism for regulating users’ genomic data, such as a permission layer that gives users more control over their genomic data (sketched below).
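To make that last point concrete, here is a minimal sketch of what such a permission layer could look like, written as plain Python rather than an actual on-chain contract. All class and method names here are hypothetical illustrations, not part of any existing system; a real deployment would record these events on a ledger.

```python
# Minimal sketch of a consent/permission layer for genomic data access.
# Names are illustrative; a real system would append these events to a ledger.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AccessEvent:
    requester: str      # ID of the researcher requesting access
    action: str         # "grant" or "revoke"
    timestamp: datetime

@dataclass
class GenomicRecord:
    owner: str          # the data owner's ID
    data_pointer: str   # e.g. a hash of the encrypted genomic file
    log: list[AccessEvent] = field(default_factory=list)  # append-only audit trail

    def grant(self, requester: str) -> None:
        self.log.append(AccessEvent(requester, "grant", datetime.now(timezone.utc)))

    def revoke(self, requester: str) -> None:
        self.log.append(AccessEvent(requester, "revoke", datetime.now(timezone.utc)))

    def has_access(self, requester: str) -> bool:
        # The most recent grant/revoke event for this requester wins.
        for event in reversed(self.log):
            if event.requester == requester:
                return event.action == "grant"
        return False
```

Because the log is append-only, revoking access is just another recorded event, which keeps the full history transparent to both the owner and the users.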

Hence, researchers are trying to implement genomic databases on a distributed ledger technology [DLT] like the blockchain, since it checks all the boxes above, and to use smart contracts to further improve accessibility. Let’s take a look at a few ways this can be implemented.

Image Source: NCBI- Realizing the potential of blockchain technologies in genomics

This describes how a smart contract can be used to manage access to genomic data, eliminating the need for third-party involvement. Alice publishes an encrypted version of her VCF file [Variant Call Format, a standard file format for storing gene sequence variations]. At first, no other participant or researcher can analyze this file. In the second round, the smart contract accepts bidding transactions for the file. The highest bidder is then granted access to the file through an algorithmic process. 

The bidding could be done in a particular token or cryptocurrency, providing an incentive for the patient/data owner to share their information with researchers.
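As a rough illustration of the flow just described, here is a hedged sketch of the bidding logic, written in plain Python rather than an actual smart-contract language; the class and method names are invented for this example.

```python
# Illustrative sketch of the two-round access flow: Alice publishes a
# pointer to her encrypted VCF file, bids are collected, and the highest
# bidder is granted access. Not an actual on-chain contract.
class GenomicDataAuction:
    def __init__(self, owner: str, encrypted_file_hash: str):
        self.owner = owner
        self.encrypted_file_hash = encrypted_file_hash  # pointer to the encrypted VCF
        self.bids: dict[str, int] = {}                  # bidder -> token amount
        self.closed = False
        self.winner: str | None = None

    def bid(self, bidder: str, amount: int) -> None:
        # Round 2: the contract accepts bidding transactions for the file.
        assert not self.closed, "auction already settled"
        self.bids[bidder] = max(amount, self.bids.get(bidder, 0))

    def settle(self) -> str:
        # The highest bidder wins: their tokens go to the data owner, and
        # they receive the key material needed to decrypt the file.
        assert self.bids, "no bids submitted"
        self.closed = True
        self.winner = max(self.bids, key=self.bids.get)
        return self.winner  # the key exchange itself would happen off-chain
```

On a real chain, the settle step would also transfer the winning bid’s tokens to Alice, which is exactly the incentive mechanism described above.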

The Coinami project uses a volunteer grid computation platform with token rewards to distribute the HTS read-mapping workload to many volunteers (or miners) and then collect and validate the results. It uses a federated structure: a root authority validates and tracks mid-level sub-authority servers, which supply HTS data to the system and check the validity of the alignments, while the third level is composed of the miners themselves. 
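To visualize this three-tier federation, here is a minimal sketch of the hierarchy as plain data structures; the field names are illustrative and not taken from the Coinami white paper.

```python
# Hedged sketch of Coinami's three-tier federated structure.
from dataclasses import dataclass, field

@dataclass
class Miner:
    miner_id: str  # third level: volunteers doing the HTS read mapping

@dataclass
class SubAuthority:
    name: str                  # mid level: supplies HTS data, validates alignments
    certificate: str = ""      # issued by the root authority
    miners: list[Miner] = field(default_factory=list)

@dataclass
class RootAuthority:
    name: str                  # top level: validates and tracks sub-authorities
    subauthorities: list[SubAuthority] = field(default_factory=list)

    def certify(self, sub: SubAuthority) -> None:
        # Certification is what lets a sub-authority distribute HTS jobs.
        sub.certificate = f"cert:{self.name}:{sub.name}"
        self.subauthorities.append(sub)
```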

Image source: Coinami white paper

The root authority issues certificates to research centers, which enables them to distribute HTS jobs to miners. When an HTS job is processed and uploaded successfully, the miner is rewarded with a coinbase transaction signed by the sub-authority. These transactions are included in the underlying blockchain, so every reward and every transaction is public and anyone can inspect the system for suspicious activity. 
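The key property here is that rewards are publicly verifiable. Below is an illustrative sketch of that signing step using Ed25519 signatures from the third-party `cryptography` package; the transaction format is invented for this example and is not Coinami’s actual format.

```python
# A sub-authority signs a miner's reward transaction; any observer can
# verify the signature against the sub-authority's public key.
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Sub-authority key pair (in practice, certified by the root authority).
subauthority_key = Ed25519PrivateKey.generate()
subauthority_pub = subauthority_key.public_key()

# A coinbase-style reward transaction for a miner who completed an HTS job.
reward_tx = json.dumps({
    "miner": "miner-42",
    "job_id": "hts-job-0001",
    "reward": 10,
}).encode()

signature = subauthority_key.sign(reward_tx)

# Anyone inspecting the chain can check that the reward was authorized.
try:
    subauthority_pub.verify(signature, reward_tx)
    print("reward transaction is validly signed")
except InvalidSignature:
    print("suspicious activity: invalid signature")
```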

Instead of relying on large data centres and the cloud for storage, implementing genomic databases on a DLT allows this large amount of data to be stored and accessed efficiently. By utilizing blockchain-based computational grids, the compute-intensive applications of genomic databases can be tackled as well.
