Computer science differs from other experimental sciences, such as biology or physics, in the way experimental results are presented in articles. In those other disciplines, articles always begin with a detailed presentation of the methods employed to produce the results, which often rely on previously described and acknowledged procedures. In computer science, and more particularly in the field of application simulation, only a short description of a (sometimes unavailable) ad-hoc simulation framework is provided. This prevents the reproduction of published results and thus any objective comparison between new research results and the state of the art. To reduce this gap between computer science and other experimental sciences, there is a need for powerful, validated, available and well-advertised tools and methods. The general goal of this project is to provide such an application simulation framework, one that meets the needs of both the High Performance Computing (HPC) and the Large Scale Distributed Computing communities. SimGrid is recognized in the HPC community as one of the most prominent simulation environments, as shown by its large community of users and the number of publications that use it. This project will make it possible to extend SimGrid to target the Large Scale Distributed Computing community, increase simulation realism, and provide useful tools for test campaign management.

Specific aims

The project is organized along three main axes. The first axis aims at improving the models used in SimGrid and follows two directions. The first direction adds new models to the framework to broaden its application area (WP1). In particular, we aim to provide a framework offering several models with different levels of accuracy, allowing users to run simulations at various scales. The second direction establishes tools and methodologies to instantiate the models automatically, so that users can run their simulations on realistic settings (WP2). The second axis, which aims at improving the tools for experimenters, also follows two directions. The first direction helps the experimenter gain insight into the experimental results of a simulation (WP3). This will be done by instrumenting SimGrid and by developing sophisticated aggregation functions so that relevant information can then be displayed with generic visualization tools. Ultimately, platform-level information should be related to application-level information so as to explain performance anomalies. The second direction will provide tools to run and manage large test campaigns (WP4). The third axis aims at increasing simulation scale by parallelizing the simulator (WP5).

Finally, even though SimGrid was initially designed for studying scheduling algorithms on heterogeneous computing platforms, such as grids, it can be used in many other settings as well. In particular, we believe that it could be a very useful tool for researchers in the high performance computing community, as it is possible to plug in precise models specific to the hardware in use (e.g., Myrinet or Quadrics networks). SimGrid would enable them to obtain rough estimates of the performance they could expect from a given architecture before buying or designing it.

We also believe that SimGrid could be of interest to researchers from the distributed algorithm community, whose simulators generally rely on very basic models. Experimenting with more precise models would let them take into account phenomena that are generally ignored (e.g., network contention or locality). Helping users from various communities to experiment with SimGrid, and receiving feedback from them so as to improve the usability of the tool, will be at the heart of all our work.

Originality and Novelty

This project is highly original since, to the best of our knowledge, it is the first attempt to design and build an application simulation framework that targets both the parallel and high performance computing community and the very large scale distributed computing community. Another distinctive feature of this project is the methodological effort to validate the simulation results and to come up with standardized tools that ease the reproduction of scientific results. This is why we plan to work at the same time on validated models and on tools for monitoring existing platforms. This should allow the establishment of platform and workload archives that scientists could use as classical benchmarks. Our work on experimenter tools aims at easing the adoption of the framework by the community.

Targeted Result

In contrast to most application simulation frameworks, we do not aim at producing a tool usable mainly by its developer community. Instead, we aim at producing a scientific instrument directly usable by a large community of academic end-users. SimGrid is almost 10 years old, and the proposed project should pave the way for the next ten years by broadening the targeted audience and by opening the developer community to new members.

Scientific and Technological Bottlenecks

The realization of such a scientific instrument clearly raises technological difficulties. First, since we target end-users, the stability and validity of the tool have to be carefully studied. Second, since we aim at a simulation scale not achieved by any other software, the performance and scalability of the tool must be highly optimized. Moreover, the project is not only a scientific instrument, but also a scientific object in its own right. Several work packages address well-known scientific challenges, such as analytical models of the network in WP1, automatic topology mapping in WP2, inferring the causes of monitored effects in WP3, or efficient distribution of parameter sweep applications in WP4. In addition, some of the technical challenges faced are so novel that they become scientific challenges. For example, the extreme parallelization of the simulator envisioned in WP5 will certainly require a scientific approach.

Full scientific project proposal

More details on the project proposal can be found in the following PDF file: Attach:proposal.pdf
