|
|
|
|
|
Odaka, Tina; Banihirwe, Anderson; Eynard-Bontemps, Guillaume; Ponte, Aurelien; Maze, Guillaume; Paul, Kevin; Baker, Jared; Abernathey, Ryan. |
The Pangeo ecosystem is an interactive computing software stack for HPC and public cloud infrastructures. In this paper, we show benchmarking results of the Pangeo platform on two different HPC systems. Four different geoscience operations were considered in this benchmarking study with varying chunk sizes and chunking schemes. Both strong and weak scaling analyses were performed. Chunk sizes from 64MB to 512MB were considered, with the best scalability obtained for 512MB. Compared to certain manual chunking schemes, the auto chunking scheme scaled well. |
Type: Text |
Keywords: Pangeo; Interactive computing; HPC; Cloud; Benchmarking; Dask; Xarray. |
Year: 2019 |
URL: https://archimer.ifremer.fr/doc/00597/70946/69187.pdf |
| |
|
|
Eynard-Bontemps, Guillaume; Abernathey, Ryan; Hamman, Joseph; Ponte, Aurelien; Rath, Willi. |
Pangeo[1] is a community-driven effort for open-source big data, initially focused on the Earth System Sciences. One of its primary goals is to enable scientists to analyze petascale datasets both on classical high-performance computing (HPC) and on public cloud infrastructure. In only a few years, Pangeo has grown into a very productive community collaborating on the development of open-source analysis tools for science. It provides a set of example deployments based on open-source Scientific Python packages like Jupyter[2], Dask[3], and Xarray[4] that bring together scientists and developers with their actual use cases. In this paper, we first describe the Pangeo ecosystem and community. We then present its impact on the work of scientists from CNES on the... |
Type: Text |
Keywords: Pangeo; Dask; Jupyter; HPC; Cloud; Big Data; Analysis; Open Source. |
Year: 2019 |
URL: https://archimer.ifremer.fr/doc/00503/61441/65160.pdf |
| |
|
|
|