Software is an integral part of scientific research, but its importance often goes unnoticed. From licences and infrastructure to user-friendliness, the multifaceted nature of software in research needs to be highlighted. In a joint article, philosophers and experts from various scientific fields have shed light on the complicated relationship between models, software, and computational science.
We spoke to Markus Diesmann, director of the Institute for Advanced Simulation - Computational and Systems Neuroscience (IAS-6) at Forschungszentrum Jülich, who heads the Computational Neurophysics group and is one of the original developers of NEST - a simulator for neural networks.
In what areas is software used in the natural sciences?
Today, software is used in all areas of natural science. I am not referring to software that we all use to create texts, presentations, or on our phones, but rather software that is used directly for research itself.
For example, simulation has established itself as the third pillar of the scientific method, alongside theory and experiment. We can now create detailed mathematical models of nature thanks to our understanding of the fundamental physical mechanisms and the availability of precise experimental data on the components. These models, however, are so complex that we cannot grasp how the interaction of the various components produces the properties of the overall system being depicted. Here simulation software helps: starting from an initial state, it can calculate the values that certain measured variables will assume after a specific period of time - provided that the model correctly describes nature. Step by step, a computer can thus calculate how an aspect of nature changes over time. Predictions can then be compared in detail with experimental results. Researchers can study the model behaviour and learn to understand it better with simpler mathematical descriptions. Furthermore, a model can be studied by analysing the effects of changes that would be difficult or impossible to realize in nature. Such exact depictions of nature or of a technical system, such as a factory or an aircraft, are referred to as digital twins.
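As a purely generic illustration of this step-by-step calculation - not taken from NEST or any particular simulator - the sketch below advances the state of a toy model, dx/dt = -x/tau, in small time steps from an initial value; the model, parameters, and step size are illustrative assumptions only.

```python
# Generic illustration of step-by-step simulation: starting from an initial
# state, the state is advanced in small time steps according to the model
# equation, here dx/dt = -x / tau (exponential decay).

def simulate(x0, tau=10.0, dt=0.1, t_end=50.0):
    """Forward-Euler integration of dx/dt = -x / tau from t = 0 to t_end."""
    x, t = x0, 0.0
    trajectory = [(t, x)]
    while t < t_end:
        x += dt * (-x / tau)  # update rule derived from the model equation
        t += dt
        trajectory.append((t, x))
    return trajectory

# The predicted final value can then be compared with a measured one.
print(simulate(x0=1.0)[-1])
```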
Another aspect is the operating systems of experimental computers such as quantum computers or neuromorphic computers. These systems are physically implemented in electronics. However, to ensure they can be used by researchers, many layers of software are necessary, which must be designed in such a way that the entire software does not have to be replaced when the hardware is changed. The same applies to the management of large scientific instruments.
In the traditional field of data analysis, which is currently being supplemented with artificial intelligence (AI) methods, more and more software is being developed to cope with the growing volumes of data.
In addition to these prominent areas in which research is currently dependent on scientific software, I would like to draw attention to another area that is not so visible. Researchers today write many small snippets of software in their everyday work to quickly prepare, analyse, and visualize data. This software is not intended for long-term development, but it must still be correct and its use must be documented.
The software we use in our daily lives is often designed for a large user base. It usually has a robust infrastructure, is maintained by professional software developers, and is adapted to current developments. I assume this is not always the case with specialized scientific software?
That's right. For a long time, neither science nor its funding bodies recognized this. There are two aspects here. Firstly, software in science is subject to different requirements and is developed differently than in industry. It starts with the fact that deep knowledge of the subject area is required to create scientific software. It is therefore the researchers themselves, especially doctoral researchers, who create the code - not professional software developers. Furthermore, scientific software does not just have to fulfil one specific task: as scientific progress continues, the task of the software also changes. The architecture must therefore be simple and robust enough for researchers who only work on the software for a limited time and for whom software development is only part of their job, yet flexible enough to adapt to new tasks. Software engineering (SE) is the field of computer science that has developed good methods for efficiently developing and documenting correct software for industry. Research software engineering (RSE) is currently being established to do justice to the growing importance of software in science and to adapt SE methods to the conditions of research.
Secondly, science long assumed that software could be created quickly and cost-effectively by young people, while high-performance computers require large investments and are operated for long periods. In practice, however, the exact opposite is the case. Relevant scientific software has a lifespan of several decades, while computers in research are replaced every five years in order to benefit from the latest hardware. Scientific software therefore needs to be maintained over a long period of time - possibly beyond the active phase of the original developers - in order to adapt it to new hardware and new tasks. In this sense, scientific software is scientific infrastructure and a matter for large-scale research.
What influence does specialized software have on scientific theories and models?
That is a very good question, which is currently being researched by philosophers and sociologists. On the surface, many models today are so complex that we would be unable to calculate any predictions at all without the help of software. Weather models, for example, have to take countless factors into account that could influence the weather. The possibility of exploring the properties of models using simulations is inspiring researchers to come up with new theories. But what is more important is how the availability of a particular software influences the way researchers think and possibly steers them in a particular direction.
It has always been the case that technology has influenced the way researchers have thought about nature. With software, this happens in a subtle way. Take our simulation software NEST, for example. Researchers can use it to create models of neural networks on a computer with just a few commands. One command, for instance, connects a model neuron to exactly one other, while another command creates connections between several neurons with a certain probability. It is conceivable that the commands available, which make the construction of some networks easier than others, influence how researchers think about neural networks in nature. In a study together with philosophers from RWTH Aachen University, we investigated, for a very limited area, which terms researchers use in the literature to describe neural networks, which software is used for this, and how these terms can be defined mathematically. Simulation software can now attempt to map this terminological field. What repercussions this has for research and whether aspects of nature are overlooked as a result needs to be investigated further.
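To give readers unfamiliar with NEST an impression of such commands, here is a minimal sketch using NEST's Python interface (PyNEST, assuming a NEST 3.x installation); the neuron model, population sizes, connection probability, and simulation time are arbitrary illustrative choices, not those used in the study.

```python
import nest  # PyNEST, the Python interface to the NEST simulator

# Two small populations of a standard integrate-and-fire neuron model
# (model name and population sizes are illustrative choices).
pop_a = nest.Create("iaf_psc_alpha", 10)
pop_b = nest.Create("iaf_psc_alpha", 10)

# One command connects each neuron to exactly one other neuron ...
nest.Connect(pop_a, pop_b, "one_to_one")

# ... while another connection rule creates connections between
# pairs of neurons with a given probability.
nest.Connect(pop_a, pop_b, {"rule": "pairwise_bernoulli", "p": 0.1})

# Advance the network state by 100 ms of model time.
nest.Simulate(100.0)
```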
Another example: "My laptop is all I need for the models I'm interested in," some researchers say. This sentence suggests that there is a scientific question and a judgement that a laptop will be sufficient to answer it. But might the researcher be subconsciously limiting themselves? What would researchers say if investigating much more comprehensive models were just as easy as investigating models that fit on a laptop today? This is where software comes into play. It can make the technical differences between computers irrelevant to researchers. For example, our simulation software NEST can be used regardless of whether it is running on a laptop or a supercomputer. If scientific software achieves this and the knowledge gets through to researchers, it can create mental freedom.
But there is also a sociological or cultural aspect. When a researcher uses scientific software that they have not written themselves, they give up part of their autonomy. Today, scientific software is developed jointly by many researchers and used by even more. This is the only way to achieve further progress and address questions that go beyond what a single researcher can achieve. However, some areas of natural science are still characterized by a high degree of individuality. In such areas, the idea of research in large collaborations is harder to establish. This is also the case in neuroscience - my own field. Other areas, such as high-energy physics, have long depended on collaboration in large networks, as this is necessary for the construction of large instruments. Peter Galison has described the pain and resistance witnessed in high-energy physics during this transition. In this context, we can regard scientific software as a scientific instrument, comparable to a particle accelerator, a telescope, or a research vessel.
What measures do you think are necessary to improve scientific software, and what steps should we take moving forward?
Software is changing science. We can now tackle problems that would have been unthinkable just a few years ago. But this also means that scientific progress is dependent on the quality and availability of scientific software. Today, all researchers use software and almost all of them develop at least parts of it. However, science and funding bodies have not yet fully realized the profound changes that are necessary to ensure long-term progress. We have identified two areas where urgent action is needed.
We need to offer expertise in research software engineering (RSE) at universities and research institutions - central points of contact that researchers can turn to for support. In the medium term, we should integrate RSE training into science and engineering degree courses to teach students the basic tools of the trade, as is currently the case in research data management. Traditional software engineering will no doubt also establish chairs dedicated to research into RSE methods, as has happened for other areas of application.
We also have to learn to see relevant scientific software as infrastructure and create the financial conditions for it to be operated for decades if necessary. The fundamental difficulty here is that research funding makes a strict distinction between investment and personnel funds. Conventional infrastructures are made of steel and concrete and are therefore largely treated as investments. A software infrastructure, however, consists almost entirely of personnel costs and therefore cannot be financed from investment funds. It sounds a little silly, but this point does seem to cause considerable difficulties.
Background information
Senk, J., Kriener, B., Djurfeldt, M., Voges, N., Jiang, H.-J., Schüttler, L., Gramelsberger, G., Diesmann, M., Plesser, H. E., van Albada, S. J. (2022) Connectivity concepts in neuronal network modeling. PLoS Computational Biology 18(9):e1010086
Jordan, J., Ippen, T., Helias, M., Kitayama, I., Sato, M., Igarashi, J., Diesmann, M., Kunkel, S. (2018) Extremely Scalable Spiking Neuronal Network Simulation Code: From Laptops to Exascale Computers. Frontiers in Neuroinformatics 12:2
Galison, P. (1997) Image and Logic: A Material Culture of Microphysics. University of Chicago Press
Aimone, J. B., Awile, O., Diesmann, M., Knight, J. C., Nowotny, T., Schürmann, F. (2023) Editorial: Neuroscience, computing, performance, and benchmarks: Why it matters to neuroscience how fast we can compute. Frontiers in Neuroinformatics 17:1157418
Kunkel, S., Diesmann, M. (2024) Entwicklung des Research Software Engineering am Beispiel von NEST: Wissenschaftliche Software ist wissenschaftliche Infrastruktur [Development of research software engineering using the example of NEST: scientific software is scientific infrastructure]. In: RWTH Themenheft Research Software Engineering 2024, 50-53 (in press)
Grunske, L., Lamprecht, A.-L., Hasselbring, W., Rumpe, B. (2024) Research Software Engineering: Forschungssoftware effizient erstellen und dauerhaft erhalten [Research software engineering: creating research software efficiently and preserving it in the long term]. Forschung & Lehre 3/24, 186-188
Original Publication
Hocquet, A., Wieber, F., Gramelsberger, G., Hinsen, K., Diesmann, M., Santos, F. P., Landström, C., Peters, B., Kasprowicz, D., Borrelli, A., Roth, P., Ai Ling Lee, C., Olteanu, A., & Böschen, S. Software in science is ubiquitous yet overlooked. Nature Computational Science (2024). https://doi.org/10.1038/s43588-024-00651-2
Contact Person
Prof. Dr. Markus Diesmann
Director of IAS-6 / INM-10, Group Leader of the Computational Neurophysics group
- Institute for Advanced Simulation (IAS)
- Institute for Advanced Simulation (IAS-6), Computational and Systems Neuroscience