A multi-institutional team of scientists has developed a free, publicly accessible resource to aid in classification of patient tumor samples based on distinct molecular features identified by The Cancer Genome Atlas (TCGA) Network. The resource comprises classifier models that can accelerate the design of cancer subtype-specific test kits for use in clinical trials and cancer diagnosis. This is an important advance because tumors belonging to different subtypes may vary in their response to cancer therapies.
The resource is the first of its kind to bridge the gap between TCGA's immense data library and clinical implementation. A paper describing the tools was published online today in Cancer Cell.
"TCGA defined molecular subtypes for each major type of cancer. With this resource, we aimed to provide the clinical and scientific communities with the tools to assign a newly diagnosed tumor to one of these established subtypes. Our new resource will be a powerful asset for creating clinical assays based on the diverse molecular variations between cancers," said Peter W. Laird, Ph.D., the Peter and Emajean Cook Endowed Chair in Epigenetics at Van Andel Institute and the study's lead corresponding author.
TCGA was a decade-long, National Cancer Institute-led effort to create detailed molecular maps of 33 cancer types.
Unlike traditional approaches that define cancers based on the organ or tissue in which they arise, TCGA identified nuanced genomic, epigenomic, proteomic and transcriptomic characteristics that more precisely describe cancer subtypes.
Andrew D. Cherniack, Ph.D., of the Broad Institute of MIT and Harvard and Kyle Ellrott, Ph.D., of the Knight Cancer Institute at Oregon Health & Science University also are corresponding authors of the paper, which represents a collaborative effort between scientists from more than a dozen research organizations.
"Since many TCGA molecular subtypes were generated using hundreds or thousands of features from multiple data types, scientists and physicians have asked us for help subtyping their samples," Cherniack said. "Our resource greatly simplifies this process."
The team created the new resource by leveraging data from 8,791 TCGA cancer samples that represented 26 cancer cohorts and 106 cancer subtypes. They then used existing machine learning tools to develop and test nearly half a million models across six categories (gene expression, DNA methylation, miRNA, copy number, mutation calls and multi-omics) and selected those that performed best for inclusion in the online resource. In total, the resource contains 737 ready-to-use models, which represent the top models from each of the 26 cancer cohorts, the five training algorithms and six data types.
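The selection step described above, keeping only the best-performing model for each combination of cancer cohort, data type and training algorithm, can be illustrated with a minimal Python sketch. This is not the team's code: the cohort names, algorithm names, model identifiers and scores are all invented for illustration, and the article does not say which performance metric was used.

```python
# Hypothetical evaluation records: (cohort, data_type, algorithm, model_id, score).
# Every name and number here is made up for illustration only.
RESULTS = [
    ("cohort_a", "gene_expression", "algo_1", "m1", 0.91),
    ("cohort_a", "gene_expression", "algo_1", "m2", 0.88),
    ("cohort_a", "dna_methylation", "algo_2", "m3", 0.84),
    ("cohort_a", "dna_methylation", "algo_2", "m4", 0.86),
    ("cohort_b", "mirna", "algo_1", "m5", 0.79),
]


def select_top_models(results):
    """Keep the highest-scoring model per (cohort, data type, algorithm) group."""
    best = {}
    for cohort, data_type, algorithm, model_id, score in results:
        key = (cohort, data_type, algorithm)
        # Replace the stored model only if this one scores strictly higher.
        if key not in best or score > best[key][1]:
            best[key] = (model_id, score)
    return best


top_models = select_top_models(RESULTS)
for key, (model_id, score) in sorted(top_models.items()):
    print(key, model_id, score)
```

Under these assumptions, five candidate models are reduced to three, one per group. The same idea, applied to nearly half a million candidates, is consistent with the 737 models the article reports.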
"A major element of this effort was working to ensure that these models could be deployed by other groups onto new datasets," Ellrott said. "All too often this type of work is difficult to replicate or apply to new samples."
The resource may be accessed at https://github.com/NCICCGPO/gdan-tmp-models.
Co-first authors of the study include Christopher K. Wong of University of California, Santa Cruz; Christina Yau of University of California, San Francisco, and Buck Institute for Research on Aging; Mauro A. A. Castro of the Federal University of Paraná; Jordan E. Lee of Oregon Health and Science University; Brian J. Karlberg of Oregon Health and Science University; Jasleen K. Grewal of BC Cancer; Vincenzo Lagani of JADBio Gnosis DA and Ilia State University; and Bahar Tercan of the Institute for Systems Biology.
Other authors include Verena Friedl, Vladislav Uzunangelov and Joshua M. Stewart of University of California, Santa Cruz; Toshinori Hinoue of Van Andel Institute; Lindsay Westlake and Xavier Loinaz of the Broad Institute of MIT and Harvard; Ina Felau, Peggy I. Wang, Anab Kemal, Samantha J. Cesar-Johnson and Jean C. Zenklusen of the National Cancer Institute; Ilya Shmulevich of the Institute for Systems Biology; Alexander J. Lazar of the University of Texas MD Anderson Cancer Center; Ioannis Tsamardinos of JADBio Gnosis DA and University of Crete; Katherine A. Hoadley of Lineberger Comprehensive Cancer Center at University of North Carolina at Chapel Hill; The Cancer Genome Atlas Analysis Network; A. Gordon Robertson of BC Cancer; Theo A. Knijnenburg of the Institute for Systems Biology; and Christopher C. Benz of Buck Institute for Research on Aging.
Source: Van Andel Research Institute
Journal reference: Ellrott, K., et al. (2025) Classification of non-TCGA cancer samples to TCGA molecular subtypes using compact feature sets. Cancer Cell. doi.org/10.1016/j.ccell.2024.12.002
Scientists launch public resource to classify cancer subtypes for better diagnosis