All life is based on DNA code. DNA determines your height, sex, hair and eye colour. It also decides whether you are an amoeba, a flower, a tree, a tiger or a human. This information is called the genome. The amount of data needed to describe each individual genome is enormous, yet it is stored in a tiny structure. DNA is also very persistent: scientists have decoded DNA from animals that died thousands of years ago. When the DNA helix splits, two new complete helices are formed, both exact copies of the original. So, given how much data it can store, how it replicates and how durable it is, could DNA be used to store ordinary data?
DNA is a double-helix-shaped molecule, with the two strands of the helix connected by pairs of bases known by their initials: A, T, G and C (adenine, thymine, guanine and cytosine). The DNA code is made up from sequences of these bases. A human DNA helix would be about 2 inches long if it were stretched out, but in fact it is packed very compactly. According to New Scientist magazine, one gram of DNA can potentially hold up to 455 exabytes of data. Other authorities say 2.2 petabytes per gram. Either way, that is a lot of data in a very small space. Scientists have been able to read the DNA code for a long time, using a process called sequencing; synthesis is the equivalent process for writing out DNA chains. DNA is also incredibly stable: scientists have managed to sequence the complete genome of a fossil horse that lived more than 500,000 years ago. One last advantage: storing it does not require much energy.
The downside is that DNA storage is currently expensive and slow. Storage costs running to millions of dollars per gigabyte will put most people off, and data read and write times are still measured in hours.
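As a rough illustration of how ordinary data could be written in the four-letter code A, T, G and C, the sketch below maps every two bits of a file to one letter, so each byte becomes four letters. The particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for this example; real DNA storage schemes use more elaborate encodings, and this is not the method of any group mentioned in this article.

```python
# Assumed mapping for illustration only: each letter stands for two bits.
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as a string of letters, four letters per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Decode a string produced by bytes_to_dna back into the original bytes."""
    bits = "".join(BITS_FOR_BASE[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = bytes_to_dna(b"DNA")
print(strand)                # CACACATGCAAC
print(dna_to_bytes(strand))  # b'DNA'
```

Three bytes of text become twelve letters, and decoding the letters gives back exactly the bytes that went in. Writing those letters as a physical molecule is the synthesis step, and reading them back is the sequencing step.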
A chromosome is a combination of DNA and proteins, which keep the DNA helix stable. Chromosomes contain genes, which are short sequences of DNA and are the basic unit of genetic information. (Which leads to the old joke. Question: "How do you tell the sex of a chromosome?" Answer: "You take its genes off!") Sorry.
In 2017 the Harvard group adopted a DNA-editing technology called CRISPR, which can identify specific DNA sequences with precision and slice into them like a molecular scalpel. This means it is possible to select a target gene and either remove it or replace it with a new sequence. The team used CRISPR to record images of a human hand into the genome of E. coli, and then read the images back with better than 90 percent accuracy.
Researchers at the University of Washington and Microsoft Research have developed a fully automated system for writing, storing and reading data encoded in DNA. In March 2019 they ran a proof-of-concept test and successfully encoded the word 'hello' in short lengths of artificial DNA, then converted it back to digital data, with every step handled by the automated end-to-end system.
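To give a feel for what storing 'hello' in short lengths of DNA might involve, here is a minimal sketch. It is not the University of Washington and Microsoft method. It assumes a simple two-bits-per-letter mapping (A=0, C=1, G=2, T=3) and a hypothetical layout in which each short strand carries a four-letter address followed by one byte of data. The address matters because strands mixed together in a tube have no inherent order.

```python
BASES = "ACGT"   # assumed mapping: A=0, C=1, G=2, T=3 (two bits per letter)
INDEX_BASES = 4  # hypothetical address length: 4 letters cover 256 strands


def to_bases(value: int, width: int) -> str:
    """Write a non-negative integer as a fixed-width base-4 string of letters."""
    digits = []
    for _ in range(width):
        digits.append(BASES[value % 4])
        value //= 4
    return "".join(reversed(digits))


def from_bases(strand: str) -> int:
    """Read a string of letters back as a base-4 integer."""
    value = 0
    for base in strand:
        value = value * 4 + BASES.index(base)
    return value


def encode_fragments(data: bytes) -> list[str]:
    """One short strand per byte: address first, then the byte as four letters.

    Works for up to 256 bytes, the most a four-letter address can number.
    """
    return [to_bases(i, INDEX_BASES) + to_bases(b, 4) for i, b in enumerate(data)]


def decode_fragments(fragments: list[str]) -> bytes:
    """Sort strands by address, then rebuild the bytes from the payloads."""
    ordered = sorted(fragments, key=lambda s: from_bases(s[:INDEX_BASES]))
    return bytes(from_bases(s[INDEX_BASES:]) for s in ordered)


fragments = encode_fragments(b"hello")
print(fragments)  # ['AAAACGGA', 'AAACCGCC', 'AAAGCGTA', 'AAATCGTA', 'AACACGTT']

shuffled = fragments[::-1]         # strands come back in no particular order
print(decode_fragments(shuffled))  # b'hello'
```

Even with the strands read back in reverse order, the addresses let the decoder put the word together again.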
Twist Bioscience states 'We Make DNA'. The company has created a revolutionary silicon-based DNA synthesis platform, which is known for being cheap and scalable. While Twist Bioscience primarily supplies DNA for medical research, it states that 'Twist is also pursuing longer-term opportunities in digital data storage in DNA'.
It seems that there is a lot of active development in DNA storage, some of it focussed on sequencing techniques that will allow billions of DNA sequences to be read easily and simultaneously. This should speed up access to data and bring the price down. However, both speed and cost need to improve by orders of magnitude before DNA storage can compete with electronic storage. Read-back accuracy for written data also needs to be closer to 100 percent before the technology can be considered reliable.