Supercomputing has long been associated with areas such as physics, engineering, and data science. However, researchers in the humanities at Aarhus University are increasingly turning to supercomputing, allowing them to delve into unexplored territories and discover new insights. From analysing historical archives to simulating ancient civilizations to analysing social media data, supercomputing offers unique opportunities to generate insights and advance knowledge in the humanities.
In this article series, we highlight three cases with humanities researchers from Aarhus University that illustrate the varied ways in which supercomputing is being used in humanities research.
While many studies are based on historical data, the research of Rebekah Baglini, Associate Professor in Linguistics at Interacting Minds Centre, Aarhus University, is an excellent example of supercomputing applied to recent data in the humanities.
She employs supercomputing in her current projects involving the collection, processing, and annotation of large-scale media data from traditional and social media sources. By examining this diverse range of data, Rebekah Baglini investigates causal inference and causal reasoning from a linguistic perspective. Her research involves the application of semantic model theory and computational methods to uncover insights in linguistics.
“I aim to develop computationally assisted methods to identify trends in the discursive and informational landscape around topics concerning media dynamics, public health and science communication, crisis and risk messaging, as well as the emergence of mis- and dis-information.”
Rebekah Baglini, Associate Professor in Linguistics, Aarhus University
In addition to her linguistic investigations, Rebekah Baglini also strives to enhance the existing computational language models for multilingual natural language processing (NLP), with a particular focus on under-resourced languages.
Humanities researchers should know the affordances of High-Performance Computing
Rebekah’s pursuits demonstrate the continuous progress of digital humanities and the ongoing efforts to enhance existing language models, ultimately leading to a deeper understanding in the field of humanities.
“My earlier work involved smaller language corpora and didn’t require HPC resources. However, as my projects grew in scale, involving large corpus creation, the relevance of supercomputing increased. I recognise that not all projects require HPC. However, it is useful for researchers to gain training in the affordances of HPC, parallel compute, and large models so they know what’s possible, and can potentially take on projects of larger scale or make use of state-of-the-art resources for data processing, modelling, and simulation.”
Rebekah Baglini, Associate Professor in Linguistics, Aarhus University
This explains why NLP and Computational Linguistics have become integral to Rebekah Baglini’s teaching, enabling her to offer students practical exposure to working with extensive datasets and large language models, fostering hands-on learning opportunities. She emphasises that there is a significant learning curve when delving into the realm of supercomputing.
“There has definitely been a learning curve involved in the transition from locally maintained clusters to the cloud-based Interactive HPC platform, particularly because it is also a somewhat new service without comprehensive documentation, and my affiliation with Center for Humanities Computing at Aarhus University has been a valuable resource as there is a great deal of collective experience and knowledge to draw on in the community.”
Rebekah Baglini, Associate Professor in Linguistics, Aarhus University
Rebekah has used the DeiC Interactive HPC system for storing and analysing news and social media in the national research project HOPE that monitored Scandinavian user behaviour during Covid-19.
Today she uses the system in her own AUFF Starting Grant Project CROSS: Causal Reasoning and Online Science Scepticism to train language models to identify and analyse emerging narratives that undermine or counteract verified messaging on scientific findings and public health recommendations.
You have just read the third and final case in our series on Interactive HPC usage in humanities. Through these compelling cases it becomes evident that supercomputing in humanities research is transforming traditional approaches, empowering researchers to uncover new insights and deepen our understanding of the field. It opens doors to interdisciplinary collaborations and expands the possibilities for data analysis and modelling, ultimately shaping the future of digital humanities.
What an accomplishment after less than 3 years of running the DeiC Interactive HPC service – there are now more than 7000 users on UCloud!
And not only that – we are seeing an increase in the number of active users per quarter.
Graph: the number of users on UCloud. UCloud has been used as the basis of the national HPC service, DeiC Interactive HPC, since November 2020. DeiC Interactive HPC is provided by a consortium of universities consisting of Aalborg University, Aarhus University and the University of Southern Denmark.
With so many users, we are also experiencing an extremely high utilization of the service. Over the last 6 months the average utilization was above 85%, with a peak utilization during working hours of more than 170% of the total national capacity.
“The DeiC Interactive HPC has been an incredible success among Danish researchers and students, to the extent that it is now one of the most popular HPC services in Europe, despite being a relatively small infrastructure. This innovative DeiC service has succeeded in democratizing HPC among all research disciplines, including humanities and social sciences. It has also been a very popular platform for teachers at universities, who now use it every year as part of their courses.”
Director of the SDU eScience Center, Prof. Claudio Pica.
“The surge in demand is a testament to the growing importance of interactive HPC in the national research ecosystem. It’s not just about capacity, but about making high-performance computing accessible and relevant to every field of study.”
Professor Kristoffer Laigaard Nielbo, head of Center for Humanities Computing, Aarhus University
“We are incredibly proud of DeiC Interactive HPC’s success and we are excited for the future possibilities and perspectives.”
Lars Sørensen, Head of Digitalization, Aalborg University
This year, new hardware will be added to the service to accommodate the growing interest in GPU computing, which is in high demand among DeiC Interactive HPC users. The new machines, which will be added to both the AAU and SDU datacenters, include compute nodes with four NVIDIA H100 GPUs. This will have a significant impact on advancing research in Denmark.
Iza Romanowska is an assistant professor at Aarhus University, working at the Aarhus Institute of Advanced Studies, where she studies complex ancient societies.
To overcome the challenges of limited data from these ancient societies, researchers have started using agent-based models (ABMs), sometimes enabled by supercomputing. ABMs are computational models that simulate the behaviour and interactions of individual entities, known as agents, within a specified environment or system. Each agent in the model is typically programmed with a set of rules or algorithms that control its behaviour, decision-making processes, and interactions with other agents and the environment.
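As a rough illustration of this definition, the minimal sketch below uses plain Python to show agents that follow one simple rule in a shared environment. The `Trader` class, the exchange rule, and all parameter values are hypothetical and are not taken from any of the studies described here.

```python
import random


class Trader:
    """A hypothetical agent: holds some stock and follows one simple rule."""

    def __init__(self, name, stock):
        self.name = name
        self.stock = stock

    def step(self, others, rng):
        # Rule: pick a random partner and pass on one unit of stock
        # if this agent currently holds more than the partner.
        partner = rng.choice(others)
        if self.stock > partner.stock:
            self.stock -= 1
            partner.stock += 1


def run_model(n_agents=10, n_steps=100, seed=0):
    """Run the toy model and return the final stock of every agent."""
    rng = random.Random(seed)  # seeded so that a run can be repeated exactly
    agents = [Trader(f"trader_{i}", rng.randint(0, 20)) for i in range(n_agents)]
    for _ in range(n_steps):
        for agent in agents:
            others = [a for a in agents if a is not agent]
            agent.step(others, rng)
    return [a.stock for a in agents]


if __name__ == "__main__":
    print(run_model())
```

Even in a toy model like this, the pattern of interest (here, how stock spreads across agents) emerges from repeated local interactions rather than from a global equation.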
ABM is a valuable tool in archaeology that allows us to simulate and analyse the behaviours and interactions of individuals or groups in past societies, and the use of ABM allows comparison of the model against real archaeological data.
Assistant Professor Iza Romanowska
In one of Iza Romanowska’s studies, agent-based modelling (ABM) made it possible for her and her colleagues to explore the Roman economy in the context of long-distance trade, using ceramic tableware to understand the distribution patterns and buying strategies of traders in the Eastern Mediterranean between 200 BC and AD 300.
The potential of supercomputing in humanities becomes particularly evident when studying such societies with only limited data as experienced by archaeologists and historians. Iza Romanowska explains that the availability of data is limited in her field compared to other disciplines, stating that while social scientists studying more contemporary populations have access to abundant amounts of data such as the number of traders, transactions, and values, “we have none of this information.” Therefore, the use of HPC has been essential for her research.
ABM as methodological tool necessitates running the simulation many times, and by many, I mean eight hundred thousand times, and that is possible with a laptop… if one plans to be doing their Ph.D. for 500 years. Supercomputing is bigger, faster, better without any qualitative change in terms of the research.
Assistant Professor Iza Romanowska
Using a high-performance computer like the DeiC Interactive HPC system enhances the scalability and speed of ABMs, allowing researchers to gain deeper insights into the behavior and outcomes of complex systems. The DeiC Interactive HPC facility hosts out-of-the-box tools, like NetLogo, for working with ABM. Researchers can also use ABM frameworks for Python or R in one of the many development apps like JupyterLab or Coder.
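Repeated runs of a simulation are independent of each other, which is why they spread well across many cores. The sketch below shows that idea using only the Python standard library. The `simulate` function is a hypothetical placeholder for a real model, and the number of runs is kept small on purpose.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def simulate(seed):
    """Hypothetical stand-in for one simulation run; returns one outcome value."""
    rng = random.Random(seed)
    # Placeholder "model": the mean of 1,000 random draws.
    return statistics.fmean([rng.random() for _ in range(1000)])


def run_replicates(n_runs, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Run n_runs independent simulations, one seed per run.

    ProcessPoolExecutor spreads the runs over CPU cores; ThreadPoolExecutor
    can be passed instead for quick checks on a laptop.
    """
    with executor_cls(max_workers=max_workers) as pool:
        # Each run depends only on its seed, so the order of execution is free.
        return list(pool.map(simulate, range(n_runs)))


if __name__ == "__main__":
    results = run_replicates(200)
    print(len(results), round(statistics.fmean(results), 3))
```

On an HPC system the same pattern is usually scaled up by giving each node or job its own range of seeds, so the number of runs grows with the number of cores available.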
Supercomputing and coding as research tools advance humanities research
While humanities data in general is plentiful and can be analysed effectively, Iza Romanowska finds that there is a gap in understanding the underlying processes that generate the observed patterns, resulting in underdeveloped explanatory frameworks. Her point is that the lack of formal tools for theory building and testing remains a major disciplinary issue.
“Within humanities including archaeology and history, data analysis is well-established. However, there’s a kind of fundamental disciplinary problem in that we don’t have or use many computational tools for theory building and theory testing. Supercomputing as a tool for the humanities can help fill this gap and strengthen theory building, and ultimately it can advance the field of humanities research.”
Assistant Professor Iza Romanowska
Iza Romanowska believes that more people in humanities should learn to code to take advantage of the possibilities offered by their data. She suggests that supercomputing can be a natural progression from this. While many humanities researchers may not feel like they need supercomputing, perhaps they are simply not asking questions that could benefit from high-performance computing (HPC).
I would especially encourage junior researchers in the humanities to embrace supercomputing. It never hurts to acquire a skill, and many of these tools are becoming so easily available that it’s almost a shame to not use them.
You have just read the second of three cases in our series on Interactive HPC usage in humanities. Through these compelling cases it becomes evident that supercomputing in humanities research is transforming traditional approaches, empowering researchers to uncover new insights and deepen our understanding of the field. It opens doors to interdisciplinary collaborations and expands the possibilities for data analysis and modelling, ultimately shaping the future of digital humanities.
Researchers at a Danish university have various options for gaining access to computing power at both Danish and international HPC facilities. Front office personnel, please inform your users that the fall call H1-2024 is now open for applications for access to the e-resources.
Beyond Tradition: Unveiling the Uses of Supercomputing in Humanities
Katrine Frøkjær Baunvig, head of the Grundtvig Center at Aarhus University, has used supercomputing as a methodological approach, and it has led her to non-trivial conclusions that significantly impact our understanding of the 19th-century nation builder and prominent pastor N.F.S. Grundtvig’s vast body of works and his immense influence on Danish culture.
In order to conduct a certain type of text mining, so-called word embeddings, she has created an artificial intelligence of Grundtvig, enabling a comprehensive analysis of his over 1000 works and 8 million words, resulting in unprecedented insights.
This approach has ushered in a completely new era in Grundtvig research, according to Katrine Frøkjær Baunvig. She dismisses the criticism of digital humanities sceptics who argue that word embedding fails to consider the surrounding context of words.
“This type of rejection is prevalent only among researchers who have not taken the time to understand or familiarize themselves with the current state and level of the research. When creating a word embedding, I obtain a vast mapping of a given word’s extensive association structure. Therefore, I can clearly discern different semantic focal points and contexts where the word appears in Grundtvig’s body of work. This is precisely what allows me to gain an overview.”
Katrine Frøkjær Baunvig, Head of the Grundtvig Center at Aarhus University
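To make the idea of a word’s “association structure” more concrete, the sketch below ranks the nearest neighbours of a word by cosine similarity between word vectors. The vectors and vocabulary are invented for illustration and have nothing to do with the actual Grundtvig model. Real embeddings are learned from a corpus and typically have hundreds of dimensions.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
VECTORS = {
    "light": [0.9, 0.1, 0.0],
    "sun": [0.8, 0.2, 0.1],
    "darkness": [-0.7, 0.1, 0.2],
    "people": [0.1, 0.9, 0.3],
}


def cosine(u, v):
    """Cosine similarity: 1.0 for the same direction, -1.0 for opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, vectors, k=2):
    """Return the k words whose vectors are most similar to that of `word`."""
    target = vectors[word]
    scores = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scores[:k]]


if __name__ == "__main__":
    print(nearest("light", VECTORS))
```

Inspecting such neighbour lists for one word across a whole corpus is one simple way to see the different contexts in which the word tends to appear.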
Katrine Frøkjær Baunvig opted to form a research partnership with the Center for Humanities Computing at Aarhus University. Her best advice for other researchers going into supercomputing in the humanities is to team up with the right people.
“Stepping into the world of supercomputing requires an approach to work processes that, in my opinion, represents a new trend in the humanities, namely, interdisciplinary collaborations and team-based publishing. Someone takes care of what is typically called the domain expert area – in this case, knowledge of Grundtvig’s authorship – while others handle the more technical aspects of execution.”
Katrine Frøkjær Baunvig, Head of the Grundtvig Center at Aarhus University
She also emphasises the importance of comprehending the workings of the tools to better harness the power of supercomputing.
“Even if you may not be able to train your algorithm yourself, it can be very practical to devote time and energy to obtain an operational understanding of the steps involved in creating a Grundtvig-artificial intelligence and the various types of applications such an intelligence can be used for.”
Katrine Frøkjær Baunvig, Head of the Grundtvig Center at Aarhus University
With years of experience in using supercomputing in her research, Katrine plans to continue using it and encourages others to do so when it seems fit. Especially in times where humanities research is often dismissed as lacking scientific rigor, Katrine Frøkjær Baunvig sees an opportunity to make an impact. With a keen sense of responsibility to bring her field forward, she is determined to prove that humanities research can be just as methodical and rigorous as research in any other discipline.
“Researchers who have pioneering eagerness should explore supercomputing as it can give them a head start by venturing into ‘blue ocean’ territory.”
Katrine Frøkjær Baunvig, Head of the Grundtvig Center at Aarhus University
Katrine Frøkjær Baunvig has used the DeiC Interactive HPC system for a range of NLP tasks such as linguistic normalisation of historical Danish, semantic representation learning and inference, and finally, historical chatbot development based on a custom large language model for Danish.
You have just read the first of three cases in our series on Interactive HPC usage in humanities. Through these compelling cases it becomes evident that supercomputing in humanities research is transforming traditional approaches, empowering researchers to uncover new insights and deepen our understanding of the field. It opens doors to interdisciplinary collaborations and expands the possibilities for data analysis and modelling, ultimately shaping the future of digital humanities.
Stay tuned for our second and third cases, featuring Iza Romanowska and Rebekah Baglini representing their fields of archaeology and linguistics.
DeiC Interactive HPC, consisting of Aalborg University, Aarhus University, and University of Southern Denmark, has been at the forefront of cloud and interactive high-performance computing in Denmark for several years. Their commitment to information security and compliance with international standards, including their ISO 27001 certifications, has been a key factor in their success.
In fact, DeiC Interactive HPC was the first HPC facility in Denmark to obtain an ISO 27001 certification, which is an internationally recognised standard for information security management systems (ISMS). The certification provides a framework for organisations to manage and protect their information assets, and it requires a rigorous process of assessment and ongoing monitoring to ensure compliance with the standard.
A Commitment to Information Security
Recently, the University of Southern Denmark announced that they have successfully renewed their ISO 27001 certification three years after they first obtained it in 2020. This recertification demonstrates the university’s ongoing commitment to information security and compliance with international standards.
“We are very proud to have received this renewal of our ISO 27001 certification. It is a testament to our ongoing efforts to ensure the highest levels of security and protection for our data and systems.”
Claudio Pica, CEO of DeiC National HPC center.
Both Aalborg University and the University of Southern Denmark have held ISO 27001 certifications for several years, demonstrating their commitment to maintaining the highest levels of security and compliance in their operations. This commitment has been particularly important in light of new regulations such as the General Data Protection Regulation (GDPR), which require organisations to take a proactive approach to data protection and privacy. Increasing threats from cyber attacks and data breaches also highlight the importance of taking proactive steps to protect data and systems from potential attacks.
Read the SDU eScience story for a detailed walk-through of the process of obtaining the ISO 27001 certification.
That’s how long it took to set up the new SSH access to DeiC Interactive HPC applications.
If SSH is of interest to you, you probably know that DeiC Interactive HPC applications have recently experienced limitations in providing a reliable and scalable solution for accessing their services using Secure Shell Protocol (SSH).
The challenges stemmed from the limited number of IP addresses available in the common pool on the platform and from the difficulty of implementing a more scalable solution, since acquiring multiple new IP addresses would not have provided the scalability required. However, DeiC Interactive HPC has now launched a new solution for SSH access that eliminates the need for multiple IP addresses. The new solution is based on ports, which are much more scalable, so users can now access DeiC Interactive HPC applications using SSH with ease.
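The reason ports scale better is that a single IP address offers tens of thousands of TCP ports, so each running application can be given its own port instead of its own address. The sketch below builds the corresponding OpenSSH command line. The user name `ucloud`, the host `ssh.example.org`, and the port numbers are placeholders, not the actual values used by the service.

```python
import shlex


def ssh_command(user, host, port):
    """Build the OpenSSH command line for an application reachable on a given port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # `-p` selects the TCP port; the host (and its IP address) stays the same.
    return ["ssh", "-p", str(port), f"{user}@{host}"]


if __name__ == "__main__":
    # Two hypothetical jobs share one host and differ only by port.
    for job_port in (2201, 2202):
        print(shlex.join(ssh_command("ucloud", "ssh.example.org", job_port)))
```

The connection details for a real job should be taken from the job page on the platform rather than from this example.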
Recently, the DeiC Interactive HPC consortium (which consists of Aarhus University, Aalborg University and the University of Southern Denmark) posted a news story about the user overload on the service. This is an issue arising from the very positive fact that the popularity of the DeiC Interactive HPC service is increasing, but it also has the unfortunate effect that some users are now experiencing longer waiting time on the machines than usual.
Needless to say, however, the DeiC Interactive HPC consortium is immensely proud of the success of the service – and now it can also announce that the UCloud platform, which is used to provide the DeiC Interactive HPC service, has passed 6,000 users. So many users on a service which has only been operational for 2.5 years is a great achievement.
“Overall, we’re seeing an increase in the number of new users at nearly all the 8 universities in Denmark. One significant factor for the past few months is that the number of users from Copenhagen University, which has not been using the facility extensively so far, is now increasing at a steady pace. This is both great news, but also a warning sign for us, as KU is a big university and we will need to be prepared to accommodate even more users in the future.”
Prof. Claudio Pica, director of the SDU eScience Center and DeiC Interactive HPC Consortium representative
Another factor, which causes significant spikes in the number of new users at the start of every semester, is the number of students who log on to the service because the UCloud software is used as part of their courses.
The consortium is working hard to add more hardware to the DeiC Interactive HPC service to alleviate the current periods of overload. In the meantime, you can find a number of tips for how to avoid overload issues here.
While there was never any doubt that DeiC Interactive HPC would be a success, the popularity of the HPC facility has taken the Interactive HPC consortium a little by surprise. The two-year-old system reached no fewer than 5000 users back in December, and while every milestone is celebrated, the rapid success also comes at a price.
We’re seeing an average rise in users of 1000 each quarter and we’re very proud of the success. However, with an average utilisation of 135% of resources for containerised applications, we’re also experiencing issues with user overload recurring more and more frequently as more users join.
DeiC Interactive HPC Consortium representative, Professor Kristoffer Nielbo, Center for Humanities Computing, Aarhus University
Additional hardware would solve most of the issues but adding more hardware is a time-consuming process, and in the meantime the consortium behind Interactive HPC is working on other solutions to ensure the best user experience.
We are currently working on making operational status available to users to allow them to see when user overload is causing issues and plan their work accordingly. The part of Interactive HPC running on SDU already has a solution underway, and the part of the system located at AAU will follow as soon as possible.
Professor Kristoffer Nielbo
However, users can also take active steps to avoid the user overload issues. The DeiC Interactive HPC consortium recommends the following: small users should make sure they only use the resources they need; medium users are asked to consider whether their work could be done on other HPC systems; and large users should apply for resources via the national calls.
Applying for national resources may not fix the problem right now but by doing so researchers indicate that there is a need for additional hardware for Interactive HPC, and this can help speed up the expansion process.
Professor Kristoffer Nielbo
The consortium also recommends using the new DeiC Integration portal when it makes sense. The portal integrates multiple national HPC systems and allows users to shift seamlessly to other facilities, freeing up space for users whose only option is Interactive HPC.
The consortium will continue to work on solving any issues and ensuring that the necessary resources are available, because there’s no doubt that Interactive HPC is here to stay as a favourite HPC resource for researchers.
Since the DeiC HPC services started in November 2020, a consortium of universities (AU, DTU and SDU) has been working hard to finish the ambitious development of the DeiC Integration Portal. The vision of the DeiC Integration Portal is to provide a national solution to access all the DeiC HPC systems and future DeiC services under one common portal. After two years of development, UCloud has now been expanded with new functionality to integrate with the DeiC HPC providers.
Denmark currently has three national HPC services operated and hosted by different consortia of Danish universities and coordinated by the Danish e-Infrastructure Cooperation (DeiC). All researchers in Denmark can apply for resources on the national HPC services, including the Danish part of the European supercomputer LUMI, either through their universities’ Front Office or via national calls.
Along with the establishment of the national HPC services, it was also envisioned that researchers should be able to access the DeiC systems via a common national portal. This portal should ideally make it “as easy to use the national HPC centers as AWS-, Azure- and Google cloud service” (from the DeiC call in 2020). The DeiC Board decided to make a call for expressions of interest for the development of the DeiC Integration Portal, which at the time was also referred to as Project 5.
In 2020, the consortium of universities consisting of AU, DTU and SDU, with SDU as the coordinating body for the consortium, sent the proposal to base this portal on UCloud. This proposal was accepted by the DeiC Board in 2020.
When we answered the DeiC call in 2020, we understood the potential behind the vision of the DeiC Board. At the time the UCloud software platform was maturing into a full-fledged solution for e-research, and it seemed an ideal starting point for the DeiC Project 5 (DeiC Integration Portal).
Claudio Pica, professor at SDU and coordinator of the winning consortium.
Advantages for the users
For the users, there are many advantages to having a common portal to access the national HPC services. Professor Kristoffer Nielbo from the Center for Humanities Computing at Aarhus University explains:
As a researcher (and an infrastructure provider), a common portal brings us closer to the seamless integration of multiple national HPC systems. Such access simplifies my workflows and saves valuable resources otherwise spent on mentally, and sometimes physically, ‘switching’ between platforms. It also makes transitioning from interactive to batch jobs less ‘scary.’ Finally, the portal reduces resources spent on onboarding new researchers in my lab because they only have to learn how to access HPC through the Integration Portal.
Professor Kristoffer Nielbo, Center for Humanities Computing, Aarhus University
Kristoffer Nielbo tested the portal during the project’s pilot phase in Fall 2022, and he was very happy with the result.
I was surprised at how well the portal reproduced the familiar user experience of DeiC Interactive HPC – where UCloud has been used for several years. Even though the mode of running jobs is fundamentally different (although DeiC Interactive HPC can run batch jobs), the project and file management, which are large parts of UCloud, were very similar. I wish more national HPC systems had been available during testing.
Professor Kristoffer Nielbo
A common portal also makes it easier for the DeiC Interactive HPC users to use and transition to other more “traditional” HPC systems, such as the LUMI supercomputer.
Even in my lab, I can see that more researchers that used to use DeiC Interactive HPC are now planning to use DeiC Throughput HPC. Project 5 arrived at the right time for many DeiC Interactive HPC users – we have just started to ‘develop an appetite’ for HPC. That being said, I see the different national HPC systems as complementary, and Project 5 enables more users to benefit from more systems.
Professor Kristoffer Nielbo
Implementation of the design
To better understand how the DeiC Integration Portal has been implemented in UCloud, it may be useful to look at how UCloud used to work. In the figure below, an end-user wants to run an application. Using their laptop, they open UCloud, find the application in the application store and click on the “Start” button. This causes their laptop to send a message to UCloud, containing the user’s command. UCloud then sends a similar message to the “DeiC Interactive HPC” computing resources (in this example the YouGene cluster at SDU).
In Project 5, the consortium developed a component called the UCloud Integration Module (or UCloud/IM) which sits at the service provider and which is controlled by the service provider. The UCloud/IM communicates with UCloud and exposes the computing resources of the provider. The service providers have full control over what the UCloud/IM can do.
At a technical level, UCloud/IM is plugin-based software. This means that, as a provider, you can choose and adapt the IM to fit your environment. We have packed it full of features for controlling authentication and authorization. It has several different implementations for compute, storage, licenses and more.
Dan Sebastian Thrane, team leader for cloud services at the SDU eScience Center
The UCloud/IM was designed to maintain a high level of IT security and the integrity of the individual service providers.
To use an analogy: without the UCloud/IM, sending a message via the DeiC Integration Portal would (from the service providers’ perspective) be like giving the postman the keys to your house to deliver the mail. Instead, the UCloud/IM acts like a “mailbox”, where the postman can leave your mail without entering your house.
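The mailbox idea can be illustrated with a minimal Python sketch. This is not UCloud/IM code: the `Message`, `Mailbox` and `Provider` names, the message fields and the allow-list policy are all hypothetical, chosen only to show that the portal merely deposits requests while the provider decides what happens to them.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """A request left by the portal (hypothetical shape)."""
    user: str
    action: str


@dataclass
class Mailbox:
    """The portal can only drop messages here; it never executes anything."""
    _queue: deque = field(default_factory=deque)

    def deliver(self, message: Message) -> None:
        self._queue.append(message)

    def collect(self) -> list:
        # Hand over all waiting messages and empty the mailbox.
        messages = list(self._queue)
        self._queue.clear()
        return messages


class Provider:
    """The provider empties its own mailbox and applies its own policy."""

    def __init__(self, allowed_actions: set):
        self.allowed_actions = allowed_actions
        self.mailbox = Mailbox()

    def process(self) -> list:
        results = []
        for msg in self.mailbox.collect():
            accepted = msg.action in self.allowed_actions
            results.append((msg, "accepted" if accepted else "rejected"))
        return results


provider = Provider(allowed_actions={"start_job"})
provider.mailbox.deliver(Message(user="alice", action="start_job"))
provider.mailbox.deliver(Message(user="alice", action="delete_all_files"))
outcome = [status for _, status in provider.process()]
```

In this sketch the second request is rejected because the provider's own allow-list does not include it; the sender has no way to bypass that check.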
Design Principles
It has been important for the consortium behind the DeiC Integration Portal to have a transparent design and an inclusive development process. A DeiC Steering Group, which included representatives from all the universities in Denmark, was formed by the DeiC Board. This steering group has discussed the design of the portal throughout the development period and approved the final result.
It has also been important for the consortium and DeiC to stress that the DeiC Integration Portal does not replace or control any functionality which DeiC service providers have. It simply exposes these functionalities in a secure and user-friendly way to all users with a common interface, acting as a secure message brokering system.
The DeiC Integration Portal initiative aims to facilitate access to remote compute resources through a joint portal with multiple backend HPC resources. These backend service providers are at the same time HPC service providers to their home universities and part of the emerging national HPC infrastructure. This mission duality implies that the resource providers, at all times, should be able to maintain full integrity and local control.
Michael Rasmussen, section leader for Research-IT (RIT) at Technical University of Denmark.
Full integrity and local control have been achieved by following a set of design principles:
Zero trust design
Exclusively users local to the service providers
Configurable integration module with no elevated privileges
Local validation and authorization control for all actions following the local policies
‘Never trust, always verify’ (zero trust) has been a guiding principle for the design of the process from initiating a job, submitting the job request to the service provider, queuing and executing the job, and finally reporting back to the portal. Users authenticate with home-institution credentials (via WAYF) on login to the Integration Portal and can from here apply for compute resources. Once the DeiC Front Office of a user’s home-institution approves an application for resources, the local resource provider can authorize access by having a local user account created and associated with the user’s DeiC Integration Portal account.
Michael Rasmussen
If the user does not comply with code-of-conduct, the compute resource provider can disable the user’s connection via the integration module and lock the local user account to prevent re-logins until further notice. This means that only user accounts validated, created and authenticated locally, act on the local resource provider facility, thereby ensuring local integrity and control.
If a DeiC Integration Portal user unknown to the local resource provider facility submits a job, the process of validating and creating the new user account is completely controlled by the resource provider. This ensures that only locally validated users act on the local facility.
Michael Rasmussen
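The local validation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the portal's implementation: the `LocalAccountRegistry` class, its method names and the example identities are hypothetical. The point is that a portal identity only ever acts through a local account the provider has created, and the provider can lock that account at any time.

```python
class LocalAccountRegistry:
    """Provider-side mapping from portal identities to local accounts (hypothetical)."""

    def __init__(self):
        self._local_user = {}  # portal identity -> local username
        self._locked = set()   # portal identities the provider has disabled

    def approve(self, portal_id: str, local_username: str) -> None:
        # Called only by the provider, after its own validation of the user.
        self._local_user[portal_id] = local_username

    def lock(self, portal_id: str) -> None:
        # For example after a code-of-conduct violation.
        self._locked.add(portal_id)

    def authorize(self, portal_id: str) -> str:
        """Return the local username a request runs as, or raise PermissionError."""
        if portal_id not in self._local_user:
            raise PermissionError("unknown user: awaiting local validation")
        if portal_id in self._locked:
            raise PermissionError("account locked by the provider")
        return self._local_user[portal_id]


registry = LocalAccountRegistry()
registry.approve("alice@portal", "alice01")
local_name = registry.authorize("alice@portal")

registry.lock("alice@portal")
try:
    registry.authorize("alice@portal")
    locked_rejected = False
except PermissionError:
    locked_rejected = True
```

Every request is checked again at the provider, so approval at the portal alone is never sufficient to act on the local facility.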
Integration with DeiC Large Memory HPC
Since 19 December 2022, the first of several planned service providers, the DeiC Large Memory HPC system, has been available on the DeiC Integration Portal.
DeiC Large Memory HPC is a traditional HPC system with large-memory nodes (up to 4 TB per node) that uses Slurm as its workload manager. Historically, this kind of system has been used primarily by the natural sciences, such as physics and chemistry, for large-scale simulations of physical and biological systems via non-interactive batch jobs. As such, it is very different from the DeiC Interactive HPC platform.
Traditional HPC users from the natural sciences will also benefit from the new integration.
The DeiC Integration Portal provides project management features previously lacking on the system. From the platform, the project PI (or a project administrator) can manage users in the project themselves. Previously they had to write to the user support whenever a user had to be added to the project. Similarly, users are now able to upload their SSH keys directly, instead of sending them via mail.
Martin Lundquist Hansen, team leader for the infrastructure team at the SDU eScience Center
Martin Lundquist Hansen furthermore explains that:
The integration also allows users to manage their files and Slurm jobs directly from the UCloud platform. This is especially important for users less familiar with traditional text based HPC systems, but even for more experienced users this might be convenient in some cases. It is important to emphasize, however, that the DeiC Integration Portal simply provides an additional method for accessing the system, while traditional SSH access is still possible and unchanged.
Martin Lundquist Hansen
Like Kristoffer Nielbo, Martin Lundquist Hansen stresses that the DeiC Integration Portal may help users of the DeiC Interactive HPC system transition to other DeiC HPC systems:
With the new integration, users can consume resources on the DeiC Large Memory HPC system in the same way they are already consuming resources on UCloud. There is of course a difference in the type of applications that conventionally are used on the two types of systems, but they can now be accessed and executed in a uniform way. As users learn to run jobs on the system via the UCloud platform, the transition to accessing the system via SSH might also become easier, due to familiarity with certain aspects of the system.
Martin Lundquist Hansen
The implementation of the DeiC Integration Portal also offers a new avenue for running interactive jobs on traditional HPC clusters such as the DeiC Large Memory HPC system, something that is not typically done on these types of systems. One popular example is JupyterLab, a web-based application that allows you to work interactively with languages such as Python and R. Thanks to the integration, such applications can be launched as Slurm jobs, and users can then work with the application directly from their browsers.
We are planning to implement more applications of this type in the future, such that the resources are more readily available for non-expert users.
Martin Lundquist Hansen
Currently, JupyterLab and RStudio are available on DeiC Large Memory HPC.
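To give a rough idea of what “launching JupyterLab as a Slurm job” can look like, the Python sketch below generates a small batch script. It is an assumption-laden illustration, not what the portal actually submits: the function name, the resource values and the port are hypothetical, and only standard `#SBATCH` directives and `jupyter lab` options are used.

```python
def build_jupyter_batch_script(job_name: str, cpus: int, mem_gb: int,
                               hours: int, port: int = 8888) -> str:
    """Return the text of a Slurm batch script that starts JupyterLab."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={hours:02d}:00:00",
        "",
        "# Start JupyterLab on the compute node without opening a browser there.",
        f"jupyter lab --no-browser --ip=0.0.0.0 --port={port}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical resource request: 4 cores, 64 GB of memory, 2 hours.
script = build_jupyter_batch_script("jupyterlab", cpus=4, mem_gb=64, hours=2)
print(script)
```

A script like this could be saved to a file and submitted with `sbatch`. How the user's browser then reaches the application on the compute node is handled by the integration and is not covered by this sketch.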
Integration with DTU Sophia
The DTU Sophia HPC cluster, which is part of the DeiC Throughput HPC service, is also available on the DeiC Integration Portal.
The Sophia system is hosted at DTU Campus Risø. The HPC cluster consists of dual-processor AMD EPYC nodes fully connected through a 100G InfiniBand fat-tree topology. The full description can be found in the system documentation.
Currently, the main user groups on Sophia are from DTU Wind and DTU Construct. They typically run heavy-duty numerical simulations such as Computational Fluid Dynamics workloads, using software like Ellipsys, OpenFOAM, PETSc, and WRF. Other common workloads include AI/Machine Learning, Quantum Chemistry (Density Functional Theory), Monte Carlo and Molecular Dynamics codes. Commercial applications such as ABAQUS, COMSOL, Mathematica, and Matlab are also widely used.
Integration with LUMI/Puhuri
The third planned integration is with the LUMI supercomputer. LUMI has its own project management portal called Puhuri, which is used to create projects on the LUMI supercomputer. The consortium has worked with the Puhuri development team to support this functionality from the DeiC Integration Portal. Due to the scope of the Puhuri portal, however, this integration will be limited to project management and resource requests on LUMI. It is not yet possible to run jobs on LUMI directly from the DeiC Integration Portal.
What comes next?
With the DeiC Integration Portal now launched, more DeiC services can be added in the future. The majority of DeiC HPC services are already part of the portal: DeiC Interactive HPC (where hardware is placed both at SDU and AAU), DTU Sophia (part of DeiC Throughput HPC), DeiC Large Memory HPC and LUMI. The remaining DeiC HPC services, which are part of DeiC Throughput HPC, will be added later.
The DeiC Integration Portal will also make it possible to integrate with the upcoming DeiC data management services. A possible integration with DeiC data management services could mean that researchers will be able to use their data across the whole portfolio of DeiC services, for example to analyse data at different DeiC HPC centers.
In collaboration with DeiC, we plan to improve the look and branding of the new DeiC Integration Portal.
Outside of Denmark, the functionality of the new DeiC Integration Portal has already caught the attention of research institutions. These include, for example, the HALRIC consortium, which recently received 11 million euros to build collaborations between companies, hospitals and universities (press release from Lund University). Within Denmark, there has been a dialogue with the Danish Bioimaging Infrastructure (DBI-INFRA) Image Analysis Core Facility, which is also interested in the possibilities offered by the platform.
No doubt, the attention the DeiC Integration Portal has received both nationally and on a European level is an acknowledgement of the skills and competences of the consortium’s developers and the original vision of the DeiC Board from 2020. Surely, this is only the beginning of many future collaborations, which will benefit the research environment in Denmark.