Dr. Tao Liu from Roswell Park's biostatistics and bioinformatics team has developed a machine-learning-based computational method that may help to improve cancer detection and treatment.

Roswell Park Team Develops New DNA-Mapping Tool

  • New software interprets data about gene ‘switches’ responsible for cancer
  • Computational method outperforms all other algorithms in the field
  • Approach will help researchers identify active regions of the genome

BUFFALO, N.Y. — A scientist at Roswell Park Comprehensive Cancer Center has developed the first dedicated tool for analyzing a DNA-mapping technology known as ATAC-seq, a technique that identifies the gene “switches” responsible for cancer development and progression. Developed in collaboration with the University at Buffalo, the innovative computational method, named HMMRATAC, could vastly improve current methods of cancer detection and treatment.

Each cell in the human body must pack more than 6 feet of DNA into a microscopic space only a few microns wide. To accomplish this enormous task, DNA is precisely arranged and tightly compacted into a structure called chromatin. This dense packaging system segregates DNA into areas that are tucked away and inaccessible and areas that are open and accessible to the cellular machinery that reads and translates genetic information. Although only a small fraction of the genome is exposed at a given point in time, these biologically active regions of DNA perform the essential function of transcription, the first step in protein synthesis.

ATAC-seq (assay for transposase-accessible chromatin using sequencing) is a relatively simple method that helps scientists identify active regions of the genome. A special enzyme called transposase is used to selectively cut and label open chromatin, and the bits of DNA that are captured can then be analyzed to reveal key information about diseases such as cancer. Although ATAC-seq has been widely used in cancer research since 2013, the massive amounts of data it generates are currently analyzed with algorithms originally designed for older sequencing methods. This mismatch inevitably leads to misinterpretation of the results and misses potentially important information contained in DNA fragments unique to ATAC-seq.

HMMRATAC (hidden Markov modeler for ATAC-seq) is the first and only computational tool dedicated to ATAC-seq. Unlike other methods of genomic analysis, HMMRATAC is a machine learning approach that splits the DNA obtained from a single ATAC-seq dataset into accessible and inaccessible regions. The unique chromatin structure around accessible regions is then analyzed in order to predict where other genetically active areas of interest are located across the entire genome.
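
For readers unfamiliar with hidden Markov models, the general idea of labeling a genome as accessible or inaccessible can be illustrated with a toy example. The sketch below is not HMMRATAC's actual model or code: it assumes a hypothetical two-state model ("closed" and "open"), a genome already divided into bins whose read coverage has been reduced to "low" or "high", and made-up probabilities chosen only for illustration. It uses the standard Viterbi algorithm to find the most likely sequence of states.

```python
import math

STATES = ("closed", "open")

# Hypothetical parameters, chosen for illustration only (not taken from HMMRATAC).
START = {"closed": 0.9, "open": 0.1}
TRANS = {
    "closed": {"closed": 0.9, "open": 0.1},
    "open": {"closed": 0.2, "open": 0.8},
}
EMIT = {
    "closed": {"low": 0.8, "high": 0.2},
    "open": {"low": 0.1, "high": 0.9},
}


def viterbi(observations):
    """Return the most likely state path for a sequence of 'low'/'high' bins."""
    if not observations:
        return []
    # Work in log space so long sequences do not underflow.
    first = observations[0]
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][first]) for s in STATES}]
    back = []
    for obs in observations[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for s in STATES:
            # Best previous state from which to enter state s.
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            cur[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][obs])
            ptr[s] = best
        scores.append(cur)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def open_regions(path):
    """Collapse a state path into (start_bin, end_bin_exclusive) open intervals."""
    regions, start = [], None
    for i, s in enumerate(path):
        if s == "open" and start is None:
            start = i
        elif s != "open" and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(path)))
    return regions


if __name__ == "__main__":
    coverage = ["low"] * 5 + ["high"] * 4 + ["low"] * 5
    print(open_regions(viterbi(coverage)))
```

With these illustrative parameters, a sustained run of high-coverage bins is labeled open, while a single isolated high bin is treated as noise and left closed. That smoothing is the main reason to model neighboring bins jointly rather than thresholding each bin on its own.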

“HMMRATAC outperforms all other methods used in the field to identify open and active chromatin, because it takes advantage of the unique features of ATAC-seq to identify chromatin structure more accurately,” says Tao Liu, PhD, senior author on the study and Assistant Professor of Oncology in the Department of Biostatistics and Bioinformatics at Roswell Park. “As HMMRATAC is a cross-platform and user-friendly algorithm dedicated to ATAC-seq, we envision it becoming the standard tool used for ATAC-seq data analysis.”

The software’s algorithm is built upon the idea of decomposition and integration, where many layers of complex genetic information are broken down until patterns emerge that reveal the location of the specific genes and proteins that drive various disease types, including cancer. The ultimate goal is to use this information to personalize medicine by understanding how individual variations in gene expression influence cancer development, prognosis, and treatment.

The study, which was recently published in Nucleic Acids Research, was supported by the University at Buffalo, the Buffalo Blue Sky Award, and the National Cancer Institute, or NCI (project nos. P30CA016056, Roswell Park’s Cancer Center Support Grant from the NCI, and U24CA232979, a grant supporting the Cancer Moonshot℠).


Roswell Park Comprehensive Cancer Center is a community united by the drive to eliminate cancer’s grip on humanity by unlocking its secrets through personalized approaches and unleashing the healing power of hope. Founded by Dr. Roswell Park in 1898, it is the only National Cancer Institute-designated comprehensive cancer center in Upstate New York. Learn more at www.roswellpark.org, or contact us at 1-800-ROSWELL (1-800-767-9355) or ASKRoswell@RoswellPark.org.

Media Contact

Annie Deck-Miller, Senior Media Relations Manager
716-845-8593; annie.deck-miller@roswellpark.org