Mourad et al. Genome Biology (2018) 19:34
https://doi.org/10.1186/s13059-018-1411-7

METHOD Open Access

Predicting double-strand DNA breaks using epigenome marks or DNA at kilobase resolution

Raphaël Mourad1, Krzysztof Ginalski2, Gaëlle Legube3 and Olivier Cuvier1

Abstract

Double-strand breaks (DSBs) result from the attack of both DNA strands by multiple sources, including radiation and chemicals. DSBs can cause the abnormal chromosomal rearrangements associated with cancer. Recent techniques allow the genome-wide mapping of DSBs at high resolution, enabling the comprehensive study of their origins. However, these techniques are costly and challenging. Hence, we devise a computational approach to predict DSBs using the epigenomic and chromatin context, for which public data are readily available from the ENCODE project. We achieve excellent prediction accuracy at high resolution. We identify chromatin accessibility, activity, and long-range contacts as the best predictors.

Keywords: Double-strand breaks, Epigenetics, Chromatin, Machine learning

Background

Double-strand breaks (DSBs) arise when both DNA strands of the double helix are severed. DSBs are caused by the attack of deoxyribose and DNA bases by reactive oxygen species and other electrophilic molecules [1]. DSBs are particularly hazardous to a cell because they can lead to deletions, translocations, and fusions in the DNA, collectively referred to as chromosomal rearrangements [2]. DSBs are most commonly found in cancer cells. Several high-throughput ...

... hypersensitive site sequencing (DNase-seq) data are publicly available for dozens of cell lines and tissues from the ENCODE [7] and Roadmap Epigenomics [8] projects. On the one hand, recent studies have shown that the mapping of regulatory elements such as enhancers and promoters can be accurately predicted using available epigenome and chromatin data [9, 10]. Other studies have shown that the epigenome can be predicted by combinations of DNA motifs and DNA shape [11–14]. On the ...