Disease as an information defect


One way to think about disease[1] is as a loss of information[2] from the operating blueprint for human physiology. Broadly speaking, there are three main types of informational defect, depending on the nature of the encoding that is compromised.

The first is genetic: information is lost through corruption of the genetic (and sometimes epigenetic) code, for instance in congenital disease or in cancer.

The second is spatial: architectural information is lost through cumulative changes away from a structural template. This prevents tissue turnover, whether routine or after injury, from maintaining normal function, for instance in degenerative diseases such as osteoarthritis or some cardiomyopathies. These spatial defects can occur at the molecular (e.g. amyloid), cellular (e.g. hypertrophy), or extracellular (e.g. fibrosis) level.

The third is relational (network): network information is lost through departures from the normal operating regime that cannot be resolved through natural feedback mechanisms[3], e.g. in autoimmune or metabolic disease. These are typically diseases with strong environmental factors. Take for instance PTSD, which in the end likely represents a persistent maladaptive response to prior insults, entrenched in unwelcome neural circuitry. Network defects can thus affect areas ranging from hormonal cross-talk to immune regulation to synaptic activity.

Clearly, there is frequent spill-over from one type of informational disruption into the others, as in PTSD, where evidence of epigenetic modification has been accumulating.

Until now, this informational lens has not been a particularly helpful way to describe disease, which is why it is rarely used[4]. However, there are signs that this is about to change, because therapeutic modalities capable of delivering corrected or corrective information where it is needed are maturing. Categorized according to the scheme above, some of these modalities are:

  1. Genetic: Viral vectors capable of delivering DNA payloads to specific targets within the body, or used for ex vivo cellular reprogramming before re-implantation; selective targeting of tumor cells for elimination by the immune system, which is essentially a way to remove corrupted information
  2. Spatial: Bioengineered scaffolds that can be populated by stem cells to form replacements for anatomical structures (trachea, ear); tissue engineering using induced pluripotent stem cells (iPSCs); 3D cellular printing of organ components
  3. Network: This is the area that has been most challenging, but there are early signs of progress, notably in re-educating or resetting immune networks through tolerization maneuvers (and after all, what is immunization if not the delivery of specific information to the immune system?)


What are the building blocks of an informational view of disease?

A necessary (but by no means sufficient) condition for an informational lens to be useful is the ability to characterize what information has been corrupted in the diseased state compared to the original “healthy” configuration. This is least difficult for genetic and spatial information: in the first case by sequencing the healthy and diseased states (although characterizing epigenetic change is more of a challenge), in the second through a combination of imaging and histopathology to uncover defects of cellular, tissue, or other architecture. Of course, this does not explain how to fix the informational defect, but it is at least a guide to the informational content that needs to be cleaned up, restored, or complemented.

Unfortunately, this is much harder for network information defects, which likely represent a large chunk of the human disease burden for which risk is polygenic and multifactorial. In most cases, we have not even identified all the nodes of the physiological network, let alone the nature of their connectivity, in either the healthy or the diseased state. So far, the approach in this domain has largely been based on painstaking identification and analysis of individual interactions. While this reductionist method has been reliably fruitful over the past century or so, there is a broad sense that overreliance on it now limits our understanding of systems whose behavior is driven by complicated networks. In the early 2000s, this realization gave rise to systems biology, a field that has so far not quite lived up to its initial promise in the sphere of human disease, in part because the technology needed to support its aims was still immature.

However, there is now a growing ability to generate and ingest massive streams of health data with many biomarkers across large populations (be they people, cells, or zebrafish). Together with machine learning approaches, this creates an opportunity for a new view of physiological networks and their disturbances in disease. This leads us to a second condition for the usefulness of the informational lens: the creation of models that make it possible to probe in silico the various possible states of the network and the factors that can drive transitions between them. A potential breakthrough technology that may accelerate progress is the use of attention mechanisms in deep neural networks to identify which nodes and interactions are important in physiological networks.
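To make the idea of probing network states in silico a little more concrete, here is a deliberately tiny sketch in Python. It is an illustration only, not a model of any real pathway: the two nodes `A` and `B`, their mutual repression, the external `stress` input, and the synchronous update rule are all assumptions made up for this example. It enumerates the stable states of the toy network and shows how a transient perturbation can push it from one stable state into another, where it then stays.

```python
from itertools import product

# Hypothetical two-node motif (illustrative names, not a real pathway):
# A and B repress each other, and an external stress signal represses A.
def step(state, stress=False):
    """Apply one synchronous update to the (A, B) state."""
    a, b = state
    new_a = (not b) and (not stress)  # A is on unless B or stress represses it
    new_b = not a                     # B is on unless A represses it
    return (new_a, new_b)

def fixed_points():
    """Return the states that map to themselves when no stress is applied."""
    return [s for s in product([False, True], repeat=2) if step(s) == s]

def simulate(state, stress_steps, total_steps):
    """Apply stress for the first `stress_steps` updates, then release it."""
    trajectory = [state]
    for t in range(total_steps):
        state = step(state, stress=(t < stress_steps))
        trajectory.append(state)
    return trajectory

if __name__ == "__main__":
    print("stable states:", fixed_points())
    # Start in the state we arbitrarily label "healthy": A on, B off.
    print("no stress:     ", simulate((True, False), 0, 5))
    print("2-step stress: ", simulate((True, False), 2, 5))
```

In this toy, the network has two stable states. Left alone, it stays in the starting state; after two updates under stress, it settles into the other stable state and remains there once the stress is removed. Getting back would require an outside intervention that knows what the target pattern should be. Real physiological networks are of course far larger, noisier, and only partially mapped, which is exactly why data-driven models are needed.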

What does an informational approach to disease mean for biopharma?

The exhibit below shows a partial list of potential changes for biopharma that would be driven by an informational perspective on therapeutics.


This transition will take a while, spanning not years but a decade to a generation. It will not be uniform, with some therapeutic areas such as cancer and immunity more amenable to an informational approach than others. As usual, the future will belong to those who time it right, i.e. those who invest at the right moment in platform competencies (e.g. data science, AI, systems biology tools, genetic and cellular engineering tools) and in bringing in the right mix of talent with an informational mindset. It is indeed a tricky exercise to continue to operate in the tried-and-true old model while attempting to guess whether the new world is sufficiently mature to commit to it. No doubt there will be trailblazers who end up paving the way only to see others reap the rewards.

In the end, though, change will come down to a question of framing and mindset. One does not need an evolutionary perspective to make contributions in the life sciences, but it sure is helpful in developing insights. The same is true of a number of other lenses, such as those of comparative zoology, energy flows, and embryological development. Given the current rate of progress, the time is approaching when an informational lens will make its contribution to our understanding of diseases and therapeutics, and it may be that at some point, treating a patient will become less like manning the bilge pump of a leaky boat and more like debugging code or doing forensic accounting.

[1] Outside acute externally driven events like trauma or infection

[2] In all that follows, information “loss”, “defect”, and “corruption” are used interchangeably.

[3] Large physiological networks tend to be made up of many smaller regulatory motifs, each of which has multiple stable states (sometimes as on/off switches, sometimes with more complex behavior). Just as, under some conditions, cars in an intersection can lock themselves in an intractable traffic jam, it is possible for networks to become stuck because the connectivity settings have been pushed to a frozen state from which self-extraction is not possible. This is an informational issue because the pattern of normal operation has been lost. In the case of the cars at the intersection, it will take an outsider (e.g. police) to unravel the gridlock based on a view of what the overall pattern should be.

[4] Isolated exceptions exist in cancer, immunology, and neuroscience