Four years ago, Valerie Arboleda accomplished something few young medical geneticists ever do. She helped discover a rare congenital disease now known as KAT6A syndrome [1]. From the original 10 cases to the more than 100 diagnosed today, KAT6A kids share a single altered gene that causes neurodevelopmental delays, most prominently in learning to walk and talk, plus a spectrum of possible abnormalities involving the head, face, heart, and immune system.
Now, Arboleda wants to accomplish something even more groundbreaking. With a 2017 NIH Director’s Early Independence Award, she will develop ways to mine Big Data (the voluminous amounts of DNA sequence and other biological information now stored in public databases) to unearth new clues to the biology of rare disorders like KAT6A syndrome. If successful, Arboleda’s work could bring greater precision to the diagnosis and, potentially, the treatment of Mendelian disorders, as well as provide greater clarity about the specific challenges that might lie ahead for an affected child.
A Mendelian disorder arises from a change in a single gene and follows the accepted patterns (dominant, recessive, X-linked, or mitochondrial) of genetic inheritance. There are an estimated 7,300 Mendelian disorders, and many are exceedingly rare [2].
Because of their rarity, Mendelian diseases might seem like an unlikely match for Big Data tools. But as researchers have grown more adept at sequencing genomes and finding rare mutations, several recent studies have suggested that genes and pathways involved in many Mendelian disorders may also be involved in more common disorders [3-4].
Arboleda, a researcher at the University of California, Los Angeles (UCLA), is now starting to look at Mendelian disorders through an oligogenic, or multi-gene, lens. While a rare, single-gene mutation may exert its most devastating impact on one biological pathway, she suspects such mutations can also have ripple effects that disrupt genes that control other biological pathways, including ones involved in more common diseases and traits.
To test this hypothesis, Arboleda will study a condition that she knows well: KAT6A syndrome. The syndrome is triggered by a mutation that usually arises spontaneously during early development in the KAT6A gene, which is expressed in almost all cell types. The gene essentially encodes a master control switch: an enzyme that helps unwind certain parts of a chromosome, giving gene-transcribing proteins access to blocks of genes so that their expression levels can be modified as needed.
Arboleda has already collected skin cells from several kids with KAT6A syndrome and, for the first time, is using genomic tools to profile how a mutated KAT6A gene alters transcription across the genome. She’s using a technique called RNA-seq to generate the global gene-expression data and ChIP-seq to monitor the activity of proteins that bind to DNA to influence gene expression. Arboleda and her team will use this information to identify the many biological pathways that appear to be affected by the KAT6A mutation.
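For readers curious about what profiling altered transcription can look like in practice, here is a minimal sketch in Python. It is not Arboleda's pipeline: the gene names, expression values, pseudocount, and cutoff are all made-up placeholders, and real RNA-seq analyses rely on dedicated statistical tools rather than a simple ratio. It only illustrates the basic idea of comparing average expression in patient cells against control cells and flagging genes that differ by a large factor.

```python
import math

def log2_fold_changes(patient, control, pseudocount=1.0):
    """Return log2(patient / control) for each gene present in both samples.

    The pseudocount (an assumed value) avoids dividing by zero for
    genes with no detected expression.
    """
    shared_genes = patient.keys() & control.keys()
    return {
        gene: math.log2((patient[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in shared_genes
    }

def altered_genes(fold_changes, threshold=1.0):
    """List genes whose expression changed at least 2**threshold-fold, up or down."""
    return sorted(g for g, fc in fold_changes.items() if abs(fc) >= threshold)

# Hypothetical mean expression values; these are not real measurements.
patient_mean = {"GENE_A": 7.0, "GENE_B": 15.0, "GENE_C": 0.0}
control_mean = {"GENE_A": 31.0, "GENE_B": 15.0, "GENE_C": 3.0}

fold_changes = log2_fold_changes(patient_mean, control_mean)
print(altered_genes(fold_changes))
```

In this toy data, GENE_A and GENE_C are expressed at one quarter of the control level, while GENE_B is unchanged, so only the first two are flagged.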
Then comes another challenge: using Big Data tools to search for similarities between the genomic profiles of KAT6A kids and those of kids with common disorders that display some features similar to KAT6A. Such disorders include autism spectrum disorder, congenital heart defects, craniofacial defects, and immunodeficiencies.
For these analyses, Arboleda will turn to the rich body of data generated by more than 2,500 genome-wide association studies (GWAS) and by the Genotype-Tissue Expression (GTEx) and Encyclopedia of DNA Elements (ENCODE) projects. She’ll use these resources to search for genetic variations that control gene expression in ways reminiscent of KAT6A syndrome and that are associated with risk for common disorders. Arboleda wants to see which of these variations might also influence some of the symptoms associated with KAT6A syndrome.
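The kind of cross-referencing described above can be pictured as a simple lookup, sketched below in Python. This is an illustration under stated assumptions, not the actual method: the variant IDs, gene names, and trait assignments are invented placeholders, and the two dictionaries merely stand in for what expression resources such as GTEx (which variants influence which genes) and GWAS results (which variants are associated with which traits) might supply. No real database or API is queried.

```python
from collections import defaultdict

def shared_variants(altered, eqtl_targets, gwas_traits):
    """Group variants by trait when they regulate a gene in the altered set.

    altered      -- set of genes whose expression is changed in patient cells
    eqtl_targets -- maps a variant to the gene whose expression it influences
    gwas_traits  -- maps a variant to the traits it has been associated with
    """
    hits = defaultdict(list)
    for variant, gene in eqtl_targets.items():
        if gene in altered and variant in gwas_traits:
            for trait in gwas_traits[variant]:
                hits[trait].append((variant, gene))
    return {trait: sorted(pairs) for trait, pairs in hits.items()}

# Hypothetical inputs; none of these identifiers refer to real variants or genes.
altered = {"GENE_A", "GENE_C"}
eqtl_targets = {"var1": "GENE_A", "var2": "GENE_B", "var3": "GENE_C"}
gwas_traits = {
    "var1": ["congenital heart defect"],
    "var2": ["autism spectrum disorder"],
    "var3": ["autism spectrum disorder", "immunodeficiency"],
}

overlap = shared_variants(altered, eqtl_targets, gwas_traits)
for trait, pairs in sorted(overlap.items()):
    print(trait, pairs)
```

Here var2 is dropped because the gene it regulates is not altered in the patient cells, while var1 and var3 are kept and listed under each trait they have been linked to.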
According to the young UCLA researcher, all of this effort will be worthwhile if she can help the families of kids with KAT6A syndrome, whom she regards as priceless partners in her research endeavors. It’s a sentiment that resonates with me and the many other NIH-funded scientists who are working so hard to find cures for rare diseases, especially as we celebrate the 11th annual Rare Disease Day at NIH today.
[1] De novo nonsense mutations in KAT6A, a lysine acetyl-transferase gene, cause a syndrome including microcephaly and global developmental delay. Arboleda VA, Lee H, Dorrani N, Zadeh N, Willis M, Macmurdo CF, Manning MA, Kwan A, Hudgins L, Barthelemy F, Miceli MC, Quintero-Rivera F, Kantarci S, Strom SP, Deignan JL; UCLA Clinical Genomics Center, Grody WW, Vilain E, Nelson SF. Am J Hum Genet. 2015 Mar 5;96(3):498-506.
[2] Centers for Mendelian Genomics uncovering the genomic basis of hundreds of rare conditions. Benowitz S, National Human Genome Research Institute, August 6, 2015.
[3] A nondegenerate code of deleterious variants in Mendelian loci contributes to complex disease risk. Blair DR, Lyttle CS, Mortensen JM, Bearden CF, Jensen AB, Khiabanian H, Melamed R, Rabadan R, Bernstam EV, Brunak S, Jensen LJ, Nicolae D, Shah NH, Grossman RL, Cox NJ, White KP, Rzhetsky A. Cell. 2013 Sep 26;155(1):70-80.
[4] Properties of human disease genes and the role of genes linked to Mendelian disorders in complex disease aetiology. Spataro N, Rodríguez JA, Navarro A, Bosch E. Hum Mol Genet. 2017 Feb 1;26(3):489-500.
Genome-Wide Association Studies (National Human Genome Research Institute/NIH)
Rare Disease Day at NIH 2018 (National Center for Advancing Translational Sciences/NIH)
Arboleda Project Information (NIH RePORTER)
NIH Director’s Early Independence Award (Common Fund)
NIH Support: Common Fund
Credit: UCLA/Margaret Sison Photography