Singapore Unveils Major RNA Dataset for Disease Study

Agency for Science, Technology and Research (A*STAR), Singapore

SG-NEx provides an open-access resource for the global research community, accelerating biomarker discovery and precision medicine through:

  • Unprecedented Scale and Benchmarking: One of the world's largest datasets comparing long-read sequencing techniques with conventional short-read methods.
  • Enhanced RNA Insights: Demonstrates long-read sequencing's ability to reveal complex RNA features like cancer-related fusion transcripts that short-read methods often miss.
  • Global Accessibility: The 39TB dataset and analysis tools are freely available via the AWS Open Data Registry, supporting worldwide research collaboration.
  • Advanced Diagnostics Potential: Enables more accurate RNA analysis to discover new biomarkers for neurodegenerative, cardiovascular and infectious diseases, advancing personalised treatment approaches.

Singapore – A team of scientists led by the A*STAR Genome Institute of Singapore (A*STAR GIS) have released one of the world's largest and most comprehensive long-read RNA sequencing datasets, addressing a long-standing bottleneck in disease research. With over 750 million long RNA reads across 14 human cell lines, the Singapore Nanopore Expression (SG-NEx) dataset is designed to help researchers decode RNA complexity with greater precision, laying the groundwork for next-generation diagnostics and therapies. The study was published in Nature Methods in March 2025.

Unlocking the Full Picture of RNA

Despite being the backbone of modern transcriptomics, traditional short-read RNA sequencing often fails to capture full-length RNA molecules and complex variations, such as splicing patterns, fusion transcripts, and specific chemical modifications that influence the progression of diseases like cancer. This limits its utility in detecting clinically relevant biomarkers.

SG-NEx overcomes these limitations using long-read RNA sequencing, which enables the full sequence structure of RNA to be observed directly. This offers deeper biological insights and reduces analytical blind spots, both of which are key to discovering new biomarkers and developing better, more precise treatments.

"Imagine you have a book where each page is torn into fragments," said Chen Ying, Senior Scientist at A*STAR GIS. "That is what happens with short-read RNA sequencing, which forces you to reconstruct the story from scattered sentences. Some details may be lost, and it is easy to make mistakes. In contrast, long-read RNA sequencing enables you to read the book using complete pages or even chapters at once. This makes it easier to uncover the key details hidden in complex RNA molecules linked to diseases."

As the life sciences industry doubles down on precision medicine, researchers and biotech companies need reliable, high-resolution tools to pinpoint new disease markers and therapeutic targets. SG-NEx was purpose-built to fill this gap with its open-access dataset and benchmarking resources, allowing the industry to better analyse the different RNA forms a gene can produce (known as isoforms). It provides a critical foundation for:

  • Academic and translational research in human diseases
  • Biotech and pharma companies developing RNA-based diagnostics or therapeutics
  • Tool developers and bioinformatics teams creating next-gen RNA analysis platforms
  • Healthcare systems and policymakers investing in genomic precision strategies

Driving Innovation Through Collaboration for Global Impact

Started in 2018, the SG-NEx project is the result of close collaboration among experts from A*STAR GIS, Duke-NUS Medical School, the National Cancer Centre Singapore, the Cancer Science Institute of Singapore, the National University Cancer Institute of Singapore, The Walter and Eliza Hall Institute of Medical Research, the Garvan Institute of Medical Research and the Peter MacCallum Cancer Centre.

"From the start, we designed the SG-NEx project with a rapid open access data release strategy to maximise its utility. By making the data publicly available, we are enabling researchers worldwide to develop and test new RNA profiling methods. This is an important step towards accelerating biomedical discoveries and unlocking the potential of long-read sequencing to improve diagnostics and patient care," said Dr Jonathan Göke, Senior Principal Scientist at A*STAR GIS.

Shaping the Future of RNA Research

The SG-NEx team is now working to further amplify its impact by developing AI-powered tools for automated RNA feature detection, broadening global access to the data, and promoting the standardisation of long-read protocols, which are key enablers for clinical adoption.

Dr Wan Yue, Executive Director, A*STAR GIS, said, "By combining large-scale data generation, rigorous benchmarking, and open-access infrastructure, SG-NEx is shaping the future of RNA research. It brings us closer to understanding how RNA influences health and disease, and how we can harness that knowledge to improve lives."
