Stampede3 Supercomputer Fully Operational, Updates for Open Science

A powerful new supercomputer that will enable dynamic open science research projects in the U.S. is in full production at the Texas Advanced Computing Center (TACC) at The University of Texas at Austin.

For more than a decade, the Stampede systems - Stampede (2012), Stampede2 (2017) and now Stampede3 (2024) - have been flagships in the National Science Foundation's scientific supercomputing ecosystem. Made possible by a $10 million award from the NSF, Stampede3 will enable computational and data-driven science and engineering research and education.

"During our pre-production period, users experienced capabilities such as an increase in speed-up for scientific applications due to better memory bandwidth per core provided by the Intel Xeon CPU Max processors," said Tommy Minyard, TACC's director of Advanced Computing Systems and principal investigator of the Stampede3 project. "And for the first time, we are using a storage system with no spinning disk - we are expecting a significant improvement for users in their I/O performance and reliability."

Tommy Minyard, Director of Advanced Computing Systems and Principal Investigator of the Stampede3 project, Texas Advanced Computing Center

TACC continues its partnerships with Dell Technologies and Intel on Stampede3, a nearly 10 petaflop system offering tremendous capability for diverse scientific applications. Stampede3 offers substantial new computing capability, while also re-purposing hardware from previous NSF investments to support high-throughput users.

"Stampede3 brings a significant increase in computational and data capabilities to the science and engineering research community," said Katie Antypas, office director for the NSF's Office of Advanced Cyberinfrastructure. "The new high-bandwidth memory node architecture as well as the all-flash filesystem will accelerate a wide range of applications, and I expect it will be in high demand by the user community."

More than 450 distinct users ran a half million jobs during the pre-production period. The system will enable thousands of researchers nationwide to investigate questions that require advanced computing power, ranging from data analysis in biology to supersonic turbulent flows to atomistic simulations of a wide range of materials.

A few of the early users and projects include:

  • Biology: The Galaxy project provides a unique, freely available, high-performance data analysis environment serving the full range of questions related to biology and other disciplines. The allocation on Stampede3 aims at significantly improving the capability of Galaxy to allow for the full range of data analysis scenarios currently present in modern biology.
  • Fluid Dynamics, Diego Donzis, Texas A&M University: Understanding turbulent flows at both low and high speeds by performing incompressible and compressible simulations at massive scales with unbounded and wall-bounded flows. The allocation on Stampede3 will support the development of a new numerical approach to accurately solve turbulence at a lower computational cost.
  • Industrial Chemistry and Materials Science, Qi Liang, University of Michigan: The allocation on Stampede3 will support the development of a kinetic Monte Carlo simulation method to study the long-time evolution of solute atoms into solute clusters based on surrogate models. These studies provide the means to predict complex defect-solute interactions accurately.

TACC also added an experimental GPU hardware subsystem for artificial intelligence and machine learning, further advancing the University's Year of AI initiative and highlighting the AI data processing capabilities available only at UT.

Stampede3 delivers:

  • A new 4 petaflop capability for high-end simulation: 560 new Intel Xeon CPU Max Series processors with high-bandwidth memory-enabled nodes, adding nearly 63,000 cores for the largest, most performance-intensive computing jobs.
  • A new graphics processing unit/AI subsystem including 20 Dell PowerEdge XE9640 servers adding 80 new Intel® Data Center GPU Max 1550s for AI/ML and other GPU-enabled applications.
  • Reintegration of 224 3rd Gen Intel Xeon Scalable processor nodes for higher memory applications (added to Stampede2 in 2021).
  • Legacy hardware to support throughput computing - more than 1,200 existing Stampede2 2nd Gen Intel Xeon Scalable processor nodes will be incorporated into the new system to support high-throughput computing, interactive workloads, and other smaller workloads.
  • VAST Data - 10PB usable all-flash storage system capable of 50GB/s write, 500GB/s read bandwidth.
  • The new Cornelis Networks CN5000 Omni-Path™ highly scalable 400Gb/s network interconnect to enable low latency, excellent scalability for applications, and high connectivity to the I/O subsystem (to be deployed later in 2024).
  • 2,044 compute nodes with almost 200,000 cores, more than 350 terabytes of RAM, 10 petabytes of new storage, and almost 10 petaflops of peak capability.

Stampede3 will serve the open science community from 2024 through 2029.
