Supercomputers

HPC Vega — Slovenian peta-scale supercomputer powering scientific discovery

The European Technology Platform for High-Performance Computing (ETP4HPC) organized a conference today on the Vega system.

The session on EuroHPC supercomputers, featuring the HPC Vega system, was hosted by IZUM, the Institute of Information Science in Maribor, Slovenia. Aleš Zemljak and Žiga Zebec from IZUM presented on Vega.

Slovenian peta-scale supercomputer
Aleš Zemljak gave an overview of HPC Vega: “HPC Vega — Slovenian Peta-scale Supercomputer”. He touched on the system’s design, architecture, and installation, focusing on the most user-relevant basic concepts of HPC and their relation to HPC Vega.

HPC Vega.

HPC Vega is Slovenia’s peta-scale supercomputer and the most powerful machine in the country. It is the first operational EuroHPC JU system, in production since April 2021. It delivers 6.9 PFLOPS on an Atos Sequana XH2000 platform with 1,020 compute nodes interconnected via 100 Gb/s InfiniBand. Storage comprises an 18 PB large-capacity Ceph system and a 1 PB high-performance Lustre system. The machine consumes less than 1 MW of power, with a PUE below 1.15.
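
As a rough back-of-the-envelope check of those figures (purely illustrative; the real peak is split unevenly between CPU and GPU partitions):

```python
# Back-of-the-envelope check of HPC Vega's quoted figures.
# Purely illustrative; the actual peak splits unevenly across
# CPU and GPU partitions.
peak_pflops = 6.9      # quoted peak performance, PFLOPS
nodes = 1020           # quoted number of compute nodes

per_node_tflops = peak_pflops * 1000 / nodes
print(f"average peak per node: {per_node_tflops:.1f} TFLOPS")  # ~6.8 TFLOPS
```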

App domains
HPC application domains include earth sciences, such as seismology, earthquake simulation and prediction, climate change, weather forecasting, earth temperatures, ocean currents, forest fires, volcano analysis, etc. They also include high-energy physics and space exploration, such as particle physics, the Large Hadron Collider, the ATLAS collider experiment, astronomy, the Large Synoptic Survey Telescope, the Gaia satellite, supernovas, new stars, planets, the sun, the moon, etc.

Other domains are medicine, health, chemistry, and molecular simulation, including diseases, drugs, vaccines, DNA sequencing, bioinformatics, and molecular chemistry; mechanical engineering and computational fluid dynamics; and machine learning, deep learning, AI, etc., covering autonomous driving, walk simulations, speech and face recognition, robotics, language analytics, and more.

HPC Vega has 10 design goals. These are: general-purpose HPC for user communities, HPC compute-intensive CPU/GPU partitions, high-performance data analytics (HPDA) for extreme data processing, AI/ML, compute node WAN connectivity, a hyper-converged network, remote access for job submission, good scalability for massively parallel jobs, fast throughput for large numbers of small jobs, and high sequential and random storage access.

Funded EU projects are: interTwin (exploitation of the HPC Vega environment, with two FTEs at IZUM and JSI); EPICURE, which is just starting; and SMASH (MSCA co-funded), now on-boarding its first postdocs. EUmaster4HPC is preparing an offer for a summer internship.

Supporting (non-funded) projects and activities are: EuroCC SLING, the MaX3 CoE, etc. Others are: the European Digital Infrastructure Consortium (EDIC), with national resources reserved; high-level application support for Leonardo; CASTIEL2; the Container Forum; the MultiXscale CoE; and EVEREST (Experiments for Validation and Enhancement of higher REsolution Simulation Tools).

The future lies in data centers and ‘Project NOO’, part of the Recovery and Resilience Plan (NOO). The goal is to provide archive facilities for research data, space for hosting the equipment of public research institutions and universities, and space for future HPC systems. The project is due to be completed in June 2026. There is EUR 15.2 million for two data centers and the equipment for long-term storage of research data.

We envision two identical facilities or buildings for the two data centers. One will be located at Dravske elektrarne, on Mariborski otok; the acquisition of land has been completed. The other is at the JSI (nuclear research) reactor site at Podgorica near Ljubljana. The ground floor will be used for HPC, the first floor for the research data archive and for Arnes’s and hosted equipment. Slovenia is going to need a new supercomputer by the end of 2026. EuroHPC JU co-funding is expected (this system is not part of ‘Project NOO’).

Powering scientific discovery
Dr. Žiga Zebec presented: “HPC Vega: Powering Scientific Discovery”, focusing on the science conducted on HPC Vega, or “use cases”.

Slovenian research facilities using HPC Vega are: the National Institute of Chemistry (Kemijski inštitut), laboratory for molecular modeling; the University of Ljubljana (Univerza v Ljubljani), cognition modeling lab and the physics department at FMF; the University of Maribor (Univerza v Mariboru), laboratory of physical chemistry; and the Jožef Stefan Institute, with theoretical physics, experimental particle physics, reactor physics, the Centre for Astrophysics and Cosmology, etc.

Major domestic projects include: development of Slovene in a digital environment, whose goal is to meet the needs for computational tools and services in language technologies for Slovene; development of meteorological and oceanographic test models; smart hospital development based on AI, whose goal is to develop AI-based hospitals; and robot textile and fabric inspection and manipulation, which aims to advance the state of the art in perception, inspection, and robotic manipulation of textiles and fabrics, and to bridge the technological gap in this industry.

There is also the Slovenian Genome project, a systematic study of the genomic variability of Slovenians, which can enable faster and more reliable diagnostics of rare genetic diseases.

Scientific projects are running on the Slovenian share of HPC Vega. These include a deep-learning ensemble for sea level and storm tide forecasting, all-atom simulations of cellular senescence (the process of deterioration with age), first-principles catalyst screening, the dynamics of the opioid receptor, visual realism assessment of deepfakes, etc. Scientific projects are also running on the EuroHPC share of HPC Vega, such as understanding skin permeability with molecular dynamics simulations.
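
To give a flavor of what the ensemble forecasting project does, here is a minimal sketch of ensemble averaging (hypothetical numbers and stubbed-in “models”; the actual project uses trained deep-learning models):

```python
import numpy as np

# Minimal sketch of ensemble forecasting: several models predict the next
# sea level, and the ensemble mean and spread are reported. The "models"
# here are stubs with hypothetical error scales, not trained networks.
rng = np.random.default_rng(0)

def member_forecast(history, error_scale):
    """Stand-in for one trained model: persistence plus model-specific error."""
    return history[-1] + rng.normal(0.0, error_scale)

sea_level = np.array([0.21, 0.24, 0.30, 0.28])        # meters, hypothetical
members = [member_forecast(sea_level, s) for s in (0.010, 0.020, 0.015)]

mean = float(np.mean(members))
spread = float(np.std(members))    # spread as a rough uncertainty proxy
print(f"forecast: {mean:.3f} m +/- {spread:.3f} m")
```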

Vega is involved in several international projects, including SMASH, interTwin, EUMaster4HPC, EPICURE, etc.

The key to an energy-efficient world lies in new semiconductor materials: Dr. Sabine Herlitschka, Infineon

Dr. Sabine Herlitschka, CEO, Infineon Technologies Austria AG, presented on decarbonization and digitalization: the key to an energy-efficient world lies in new semiconductor materials!

Global electricity demand will more than double in the period to 2050. Energy efficiency tempers that growth! Microelectronics enables sustainable power consumption. Energy-efficiency technologies have become an important lever for generating, transmitting, and using electrical power.

The key to the next essential step towards an energy-efficient world lies in the use of new materials: silicon (Si), silicon carbide (SiC), and gallium nitride (GaN). GaN opens the door to new opportunities; we now have CoolGaN. Its physical properties are beneficial, but very challenging at the same time!

Europe has demonstrated joint research strength in increasing energy efficiency with the new semiconductor material GaN. The UltimateGaN project reduced chip area and prepared the basis for the next generation with bond-over-active technology.

Infineon is now working on All2GaN, which will provide integrated GaN solutions based on UltimateGaN technology. The project runs from May 2023 to April 2026, with a consortium of 46 partners from 12 European countries. The total cost is about EUR 60 million, of which about EUR 16 million is European funding. At the base of All2GaN is a Smart GaN Integration Toolbox, with significantly increased material and energy efficiency, meeting global energy needs while keeping the CO2 footprint at a minimum.

Infineon has three new facilities for 300 mm wafers and new semiconductor materials. One is in Dresden, announced in 2022: EUR 5 billion for a new analog/mixed-signal and power semiconductor fab on 300 mm. The next, in Villach, has been in operation since 2021; in FY22 it received an investment of EUR 160 million for new semiconductor materials, SiC and GaN.

The third facility is in Kulim, Malaysia, announced in 2022: a EUR 2 billion investment in production capacity for new semiconductor materials. The first wafers are expected to leave the plant in H2 2024.

Decarbonization and digitalization go hand in hand (the twin transition). GaN will enable a significant step forward in energy efficiency, and in size and weight reduction, across a variety of applications. Europe has demonstrated joint research strength here. Further research is needed on new materials and new technologies in order to “green” all applications that can be electrified.

By adding significant manufacturing capacity for power semiconductors, Infineon is strengthening strategic know-how in Europe. The EU Chips Act needs to strengthen Europe’s competitiveness!

What’s the future of computing going to be?
Dr. James Sexton, IBM Fellow, Department Group Manager, Data Centric Systems, IBM Research Europe, presented on the future of computing.

AI, quantum, and hybrid cloud are driving progress in every aspect of computing. We are now accelerating materials discovery. AI, cloud, and robotic labs allow 100x faster synthesis. Here, we can synthesize and validate the most suitable candidates.

Future computing foundations are based on complex workflows on unprecedented quantities of data. We are at an inflection point of the computer energy problem. We are now witnessing the emergence of heterogeneity. There are many approaches and players. Some of these are Intel, AMD, NVIDIA, etc. Among startups, we have Cerebras, Next Silicon, Graphcore, Liqid, GigaIO, etc.

If we look at the industry trends, hyperscalers are recognizing the need for disaggregation and the limitations of existing fabrics. Software-only optimizations for resource utilization have yielded only small improvements over eight years, with most of the gains coming from best-effort jobs. Composability and disaggregated systems are a growing market today.

Composable (disaggregated) infrastructure (CDI) is an emerging approach to deliver flexible infrastructure provisioning by dynamically combining disaggregated compute, memory, storage, and network elements.

CDI has two core elements: a set of disaggregated compute, memory, storage, and network elements that can be assembled to produce virtual compute servers and clusters on demand, and software to provision and manage the underlying physical resources, to assemble servers, and to support and optimize workflow deployment across those physical resources.

From a market perspective, CDI is considered an evolution of the integrated and hyperconverged infrastructure. From a technology perspective, CDI takes advantage of heterogeneous resources within the system (accelerators), advanced fabric (PCIe and CXL), and networking capabilities (SmartNICs, and DPUs/IPUs) to enable dynamic composability of server instances.
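
The composition idea can be sketched in a few lines (hypothetical types and names, not any vendor’s API): resource pools are carved up on demand into virtual server definitions.

```python
from dataclasses import dataclass

# Minimal sketch of CDI-style composition. All types and names are
# hypothetical, not a vendor API.

@dataclass
class ResourcePool:
    cpus: int
    memory_gb: int
    gpus: int

@dataclass
class ComposedServer:
    cpus: int
    memory_gb: int
    gpus: int

def compose(pool: ResourcePool, cpus: int, memory_gb: int, gpus: int) -> ComposedServer:
    """Carve a virtual server out of the disaggregated pool, if capacity allows."""
    if cpus > pool.cpus or memory_gb > pool.memory_gb or gpus > pool.gpus:
        raise RuntimeError("insufficient disaggregated capacity")
    pool.cpus -= cpus
    pool.memory_gb -= memory_gb
    pool.gpus -= gpus
    return ComposedServer(cpus, memory_gb, gpus)

pool = ResourcePool(cpus=256, memory_gb=4096, gpus=16)
server = compose(pool, cpus=32, memory_gb=512, gpus=2)   # assembled on demand
print(server, "| remaining:", pool)
```

The second core element, the management software, would sit behind a call like compose(), tracking the physical fabric paths as well as the logical capacity.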

IDC notes that “it has the potential to completely overhaul a decades-old monolithic systems architecture, which cannot scale in the digital era that is upon us.”

Looking at a sustainable computing future, we are making the current state more sustainable by introducing (digital) accelerators and hardware/software co-design and co-optimization. We also have new computational models (beyond digital).

Future computing challenges and opportunities are immense. Power requirements are growing unsustainably. There is an emergence of specialized hardware for specific tasks: AI devices from Cerebras, NextSilicon, Graphcore, etc. New packaging methods, new memory technologies, and new fabric technologies are coming up, as are new computing models: quantum, wafer-scale, analog, in-memory, etc. No single supplier can deliver all elements. Standards are essential to support integration.

Expect increasing heterogeneity! Even within a single system, one can foresee sub-sections with different capabilities. Expect increasing complexity of workflows. Expect an increasing need for, and opportunity for, composability within a module or system, and across systems. We can also expect increasing use of AI in all aspects of computational analysis: for deployment, execution, and optimization of workflows.

A look into the future of computing in 2040!

SEMI, USA, organized a seminar today on a look into the future of computing in 2040. Ms. Bettina Weiss, Chief of Staff & Corporate Strategy, SEMI, welcomed everyone.

Dr. Alessandro Curioni, IBM Fellow, VP, IBM Research Europe & Africa, and Director, IBM Research Europe – Zurich, talked about the importance of the future-of-computing effort for the industry. There is a ‘Future of Computing’ think tank with scope for a top-down end-user perspective, a bottom-up approach, and considerations for classic computing, computing for AI, and quantum computing.

Jim Sexton.

Progress report
Jim Sexton, IBM Fellow, Department Group Manager, Data Centric Systems, IBM, presented the progress report of the think tank. Industry trends are driving change. Economic and geopolitical trends include increasing dependence on silicon design and fabrication. There was a supply chain crisis following the pandemic. We have had responses, such as the US CHIPS and Science Act, the EU Chips Act, and measures from Japan and Korea.

According to the EU chips survey, chip demand is expected to double between 2022 and 2030, with a significant increase in future demand for leading-edge semiconductors. Companies are establishing new chip fabrication facilities. The current supply crisis is expected to last until 2024, forcing companies to adopt costly mitigation measures.

The US CHIPS and Science Act, worth $54 billion, supports building, expanding, and modernizing US facilities and equipment for semiconductor fabrication, assembly, testing, advanced packaging, etc. The EU Chips Act, worth $43 billion, looks at boosting the EU’s self-reliance in semiconductors.

Computing trends include the cloud model for computing, edge computing, AI and quantum computing, and the increasing importance of security, virtualization, compliance, etc. There is pervasive computing, and neural/brain-computer interfacing.

Cloud is leading to decentralization. We are moving to disaggregated and decentralized clouds. We are seeing the rise of sovereign clouds, latency-sensitive edge apps, etc. We are getting true multi-cloud apps, with the flexibility to choose and combine best-of-breed technologies, and specialization.

Data, AI, quantum, and hybrid cloud are driving progress and change in every aspect of computing. With AI systems and quantum systems, fundamentally new architectures are opening the door to new insights. Advanced microelectronics is leading to new processors, devices, accelerators, interconnects, etc.

At the top level, complex apps do analysis, modeling and simulation, AI, etc. The container platform approach, based on Kubernetes, provides the tools to build and compose, manage, secure, automate, and optimize. We now have high-performance compute for quantum supercomputers, AI supercomputers, modeling-and-simulation supercomputers, and general-purpose compute.
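
As a toy illustration of that declarative, compose-and-manage style (a hypothetical manifest, not IBM’s actual stack), a batch workload can be described as data and handed to Kubernetes to schedule:

```python
import json

# Toy illustration: composing a batch workload declaratively for Kubernetes.
# The image name and resource figures are hypothetical.
job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "md-simulation"},
    "spec": {
        "template": {
            "spec": {
                "containers": [{
                    "name": "worker",
                    "image": "registry.example.com/md-sim:latest",
                    "resources": {"limits": {"cpu": "16", "memory": "64Gi"}},
                }],
                "restartPolicy": "Never",
            }
        }
    },
}
print(json.dumps(job, indent=2))  # kubectl apply -f also accepts JSON
```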

Future computing foundations lead to complex workflows on unprecedented quantities of data. Hybrid cloud technologies can secure, provision, and deploy across multiple locations. Computers can now address all elements of a ‘discovery process’. This is applicable across research, enterprise, and government. We are building the foundation for future computing.

Workflow complexity and provisioning are components of a complex discovery workflow, including AI- and quantum-enriched analysis, knowledge generation and analysis, etc. We are seeing silicon evolution: performance per chip is improving, power per chip remains unchanged, and cost per transistor is increasing.

We are facing the computer energy problem. Data centers are gobbling up a lot of the world’s electricity. AI power consumption doubles every 3 months. CEA-Leti notes that 60 elements are now used in silicon manufacturing, and only 15 percent are recycled. We are looking at the future of sustainable computing.
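
Taking that 3-month doubling figure at face value, the compounding is dramatic:

```python
# If AI power consumption doubles every 3 months, a year compounds to
# 2**(12/3) = 16x, and two years to 256x. (Taking the talk's figure at
# face value, purely to illustrate the growth rate.)
for months in (3, 12, 24):
    print(f"after {months:2d} months: {2 ** (months / 3):.0f}x")
```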

Power requirements are growing unsustainably. We are seeing new hardware for specific tasks, new packaging methods, and new memory and fabric technologies. Standards are essential for supporting future integration.

Deep dive into quantum
James Clarke, Director of Quantum Hardware, Intel, took a deeper dive into quantum computing. Quantum concepts include superposition, entanglement, and fragility. Fragility will require error correction and likely millions of qubits. It is not quite clear how many qubits we need for a fault-tolerant system.
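
The first two concepts can be shown in a tiny statevector sketch (plain NumPy, not a quantum SDK): a Hadamard gate puts one qubit into superposition, and a CNOT entangles it with a second.

```python
import numpy as np

# Tiny statevector illustration of superposition and entanglement.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)      # Hadamard gate
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

zero = np.array([1, 0])
state = np.kron(H @ zero, zero)   # qubit 1 in superposition, qubit 2 in |0>
bell = CNOT @ state               # (|00> + |11>)/sqrt(2): entangled pair
print(np.round(bell, 3))          # amplitudes only on |00> and |11>
```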

James Clarke.

There is broad support within the US government, and from other governments worldwide. There is the National Quantum Initiative Act of 2018. Quantum computing could shape everything. All of this is separate from the Chips Act. The quantum TAM (total addressable market) is too early to call at this stage. Hardware development must come first. We are likely 10 years away from a commercially relevant quantum computer.

There are various physical implementations of qubits. These include: superconducting loops, trapped ions, silicon quantum dots, topological qubits, and photonic qubits (PsiQuantum), etc. There are over a dozen qubit types. All of the above qubits represent closer alignment to the microelectronics infrastructure.

Beyond quantum bits, quantum circuits power the computation. A quantum program can be represented by a sequence of quantum circuits and non-concurrent classical computation. There is industry investment in scaling and system development across trapped ions, superconducting qubits, quantum dots, photons, etc. We have a long way to go. Technology-ecosystem R&D is happening across process, integration, lasers, etc.
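
That interleaving can be sketched schematically (the quantum step is a stub here, not a real SDK call): a classical optimizer repeatedly adjusts the parameters of a quantum circuit based on measured results.

```python
import random

# Schematic hybrid quantum-classical program: classical optimization
# wrapped around (stubbed) quantum circuit executions.
def run_circuit(theta):
    """Stand-in for running a parameterized circuit and measuring a cost."""
    return (theta - 1.3) ** 2 + random.gauss(0, 0.01)   # toy cost landscape

theta, step = 0.0, 0.1
for _ in range(50):                  # classical loop around quantum execution
    grad = (run_circuit(theta + 0.05) - run_circuit(theta - 0.05)) / 0.1
    theta -= step * grad             # classical parameter update
print(f"optimized parameter: {theta:.2f}")   # settles near 1.3
```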

There is the ITRS for QC. We need partnerships between IDMs and industry, and the USG must play a large role: quantum foundry/USG/industry/academia partnerships. Equipment and chemicals suppliers need to look at QC. Transistors drive QC, and not the other way around. We also need the nascent technologies adjacent to quantum/qubits, such as superconducting digital logic, photonics, etc.

Quantum has the potential to be transformative for several classes of algorithms. QC will augment, and not replace classical compute. Progress is needed for scaling and system development.

New frontiers for exascale supercomputers

Recent years have seen rapid advances in advanced computing, including the recent achievement of exascale with Oak Ridge’s Frontier system. These breakthroughs are fueled by innovation and US leadership grounded in long-standing government investments in R&D and STEM education. The Semiconductor Industry Association (SIA), the Task Force on American Innovation (TFAI), and the National Labs examined the state of exascale computing in the USA and where it is headed. Jon Hoganson, Corporate VP, AMD Inc., and Co-chair of TFAI, welcomed the audience.

Doug Kothe, Associate Laboratory Director, Computing and Computational Sciences Directorate, and Director, Exascale Computing Project, Oak Ridge National Laboratory, said we have been talking about exascale since 2009. The Exascale Computing Project (ECP) is focused on accelerating the delivery of a capable exascale computing ecosystem that delivers 50 times more application performance.

DoE’s Exascale Computing Initiative, co-led by the Office of Science and DOE’s National Nuclear Security Administration (NNSA), began in 2016 with the goal of speeding the development of an exascale computing ecosystem.

ECP, a key part of ECI, is a vision that we are achieving. ECP has three basic thrust areas: application development, software technology, and hardware and integration. It is a large, seven-year R&D effort, running since 2016. Around 90 researchers and almost 100 companies are part of it, in 80 R&D teams. The potential impacts will be far-reaching for decades to come; examples include virtual time machines, digital twins, national security, energy and economic security, scientific discoveries, healthcare, etc. Apps will probably incorporate AI/ML techniques. This is a once-in-a-lifetime opportunity for the best and brightest to join.

Ron Bewtra, Director, Leadership Computing, Hewlett Packard Enterprise, said that from a public-private partnership (PPP) perspective, HPE is really proud of the partnerships in this project. HPE was one of the original founding members of the COVID-19 HPC Consortium, joining the other technology leaders. We have a strong set of relationships that we have built with partners. We are seeing transitions of HPC technology to industry. We are also looking at the impact of structural analysis. Innovations drive HPE products and also get transferred to the industry.

Mike Schulte, Senior Fellow for Silicon Engineering, AMD, said that PPPs are central to maintaining US leadership. They have allowed the US to be at the forefront of leading technologies, and they are important for creating new jobs and bringing new technologies to market. A great example of a PPP is Frontier: AMD, in collaboration with the US Department of Energy, Oak Ridge National Laboratory, and Cray Inc., designed the Frontier supercomputer, expected to deliver more than 1.5 exaflops of peak processing power. We are now working on the next scale beyond exascale. This work has also led to advancements in other products, such as laptops and desktops. There are 74 racks in Frontier.

AMD provides the processors and accelerators used in Frontier, which allow high performance and higher efficiency. We made enhancements to the chips to make them highly efficient and high-performing. We need continued improvements in the power efficiency built into the chips. The software designed for Frontier is very effective; it is open source, which is beneficial, as it allows partners to work together.

Kothe added that besides 10¹⁸ FLOPS, or 1 exaFLOP, we also need lots of memory and lots of storage. This combination of memory, storage, and compute makes for higher science work rates. It can deliver better science faster! We can deliver solutions that stakeholders can bank on. There can be an app store for the nation. We have deployed thousands of scientists and researchers. ECP is delivering and deploying tools on Frontier.

There are 24 first-mover apps that span the DoE mission space, with hundreds of apps to follow. There are quantum materials for quantum computing with quantum interconnects. We can also simulate the earth’s climate system, do cancer research, support national security, etc. These are boutique apps, and more are being designed to offer even more solutions.

Bewtra added that federal missions have to stay ahead of requirements. We have to keep pace with national innovation. The US government is going to push that envelope further. Supercomputers aid scientific discovery and national security. We have to deliver the operational capabilities.

Exascale is a tremendous milestone, and Frontier is just the foundation. The science you are going to hear from Oak Ridge is going to be really astounding. This computing also needs to be available inside the agencies organically. We know that China has two large supercomputers. Other countries are not waiting either. Supercomputing capabilities need to continue advancing, and investments in supercomputers need to stay strong. Supercomputers for weather and climate forecasts at the National Oceanic and Atmospheric Administration (NOAA), US Department of Commerce, have just gone live.

Schulte said that the use of data is skyrocketing. We are making supercomputers as energy-efficient as possible; it was important to have an efficient design for the power envelope. Frontier has 150,000 processors and accelerators. There are tremendous challenges relating to power and cooling.

Kothe added that such a tremendous tool needs to be affordable. We have seen incredible technology in this machine, such as advanced packaging. There are 9,400 nodes; one node is very powerful, and five nodes together make a petaflop. The challenge is to exploit the power of even a single node.
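
Those round numbers are self-consistent with the exascale claim, as a quick check shows:

```python
# Quick consistency check on the talk's round numbers for Frontier.
nodes = 9400
petaflops_per_node = 1 / 5    # "five nodes together make a petaflop"

peak_exaflops = nodes * petaflops_per_node / 1000
print(f"~{peak_exaflops:.2f} EFLOPS peak")   # ~1.88 EF, above the 1.5 EF target
```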

Bewtra noted that DoE has great vision, and the software libraries have undergone tremendous work; these are cutting-edge systems. Schulte added that the fundamental technologies developed for such systems can be carried forward.