Deploying Performant Parallel Filesystems - Ansible and BeeGFS



BeeGFS is a parallel file system suitable for High Performance Computing (HPC), with a proven track record in the scalable storage space. In this blog, hosted by Verne Global, we explore how the components of BeeGFS fit together and how we have incorporated them into an Ansible role for a seamless storage cluster deployment experience.


With access to a high-performance InfiniBand fabric, Verne Global's hpcDIRECT users can take advantage of BeeGFS's native RDMA support to create high-performance parallel filesystems as part of their HPC deployments, either on dedicated storage resources or hyperconverged with their compute nodes. This is a great way of making scratch space available to hpcDIRECT workloads.

Users looking to optimise the time to science can do this for their deployments through the hpcDIRECT portal. For those looking to get more hands-on, this guide takes you behind the scenes to show how a BeeGFS storage cluster can be configured as part of a cloud-native HPC deployment.

In this post we'll focus on some practical details of how to dynamically provision BeeGFS filesystems and/or clients running in cloud environments. There are actually no dependencies on OpenStack APIs here, although we do like to draw our Ansible inventory from Cluster-as-a-Service infrastructure, and hpcDIRECT makes this possible.

As described here, BeeGFS comprises several components that will be familiar to anyone working in the parallel file system space:

  • Management service: for registering and watching all other services
  • Storage service: for storing the distributed file contents
  • Metadata service: for storing access permissions and striping info
  • Client service: for mounting the file system to access stored data
  • Admon service (optional): for presenting administration and monitoring options through a graphical user interface.

Introducing our Ansible role for BeeGFS

We have an Ansible role published on Ansible Galaxy which handles the end-to-end deployment of BeeGFS. It takes care of everything from deploying the management, storage and metadata servers to setting up client nodes and mounting the filesystem.
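To install it, simply run the following (assuming the role is published under StackHPC's Galaxy namespace as stackhpc.beegfs):

    ansible-galaxy install stackhpc.beegfs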


There is a README that describes the role parameters and example usage.

An Ansible inventory is organised into groups, each representing a different role within the filesystem (or its clients). An example inventory-beegfs file with two hosts, bgfs1 and bgfs2, may look like this (the group names below are illustrative; the role's README documents the exact names it expects):
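    # Illustrative sketch: exact group names are defined by the role's README.
    # A hyperconverged layout: both hosts run storage and client services.
    [beegfs_mgmt]
    bgfs1

    [beegfs_mds]
    bgfs1

    [beegfs_oss]
    bgfs1
    bgfs2

    [beegfs_client]
    bgfs1
    bgfs2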

By controlling the membership of each inventory group, it is possible to support a variety of use cases and configurations: client-only deployments, server-only deployments, or hyperconverged deployments in which the filesystem servers are also clients (as above).

A minimal Ansible playbook to configure the cluster, which we shall refer to as beegfs.yml, may look something like this (a sketch only; the README documents the full set of role variables):
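    ---
    # Illustrative sketch; see the role's README for the full variable set.
    - hosts: all
      become: true
      roles:
        - role: stackhpc.beegfs
          beegfs_state: present
          # Assumed structure: where the storage service keeps its data
          # (a path, or a block device to be formatted).
          beegfs_oss:
            - path: /data/beegfs/oss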

To create a BeeGFS cluster spanning the two nodes defined in the inventory, run a single Ansible playbook. The same playbook handles both the setup and the teardown of the BeeGFS storage cluster components, depending on whether the beegfs_state flag is set to present or absent:
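    # Deploy the cluster:
    ansible-playbook -i inventory-beegfs beegfs.yml -e beegfs_state=present

    # Tear it down again:
    ansible-playbook -i inventory-beegfs beegfs.yml -e beegfs_state=absent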

The playbook is designed to fail if the path specified for the BeeGFS storage service under beegfs_oss is already in use by another service. To override this behaviour, pass the extra option -e beegfs_force_format=yes. Be warned that this can cause data loss: if a block device is specified it will be formatted, and the management and metadata server data of any existing BeeGFS deployment will be erased.
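In full, the override looks like this:

    ansible-playbook -i inventory-beegfs beegfs.yml -e beegfs_state=present -e beegfs_force_format=yes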


Highlights of the Ansible role for BeeGFS

  • The role is idempotent: it will leave state unchanged if the configuration has not changed since the previous deployment.
  • Tuning parameters recommended by the BeeGFS maintainers for optimal storage server performance are set automatically.
  • The role can deploy both storage-as-a-service and hyperconverged architectures purely through how roles are ascribed to hosts in the Ansible inventory. In the hyperconverged case, storage and client services run on the same nodes, while in the disaggregated case the clients run separately from the storage servers (see the sketch after this list).
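For example, a disaggregated, client-only deployment is just a matter of inventory group membership (again with illustrative group names):

    # Hosts appear only in the client group; the filesystem servers
    # live in a separate inventory managed elsewhere.
    [beegfs_client]
    client1
    client2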


One point to be aware of: BeeGFS is sensitive to hostnames, which it expects to be consistent and permanent. If a host's hostname changes, its BeeGFS services will refuse to start, so this is worth bearing in mind during initial cluster setup.
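For example, one way to pin a permanent hostname on a systemd-based distribution before deploying the BeeGFS services (the hostname bgfs1 is illustrative):

    # Set a static hostname that will survive reboots.
    sudo hostnamectl set-hostname bgfs1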


Looking Ahead


The simplicity of BeeGFS deployment and configuration makes it a great fit for automated cloud-native deployments such as hpcDIRECT. We have seen a lot of potential in the performance of BeeGFS, and we hope to publish more details from our tests in a future post. Watch this space!


Written by Stig Telfer (Guest)


Stig is the CTO of StackHPC. He has a background in R&D, working for various prominent technology companies, particularly in HPC and software-defined networking. Stig is also co-chair of the OpenStack Scientific Working Group.

