Ansible playbook-based tools for deploying Slurm and Kubernetes clusters for High Performance Computing, Machine Learning, Deep Learning, and High-Performance Data Analytics

This project is maintained by dellhpc

Parameters in omnia_config.yml

omnia_config.yml contains the configuration parameters listed below, with their default values.

Parameter Name Default Value Additional Information
mariadb_password password Password used to access the Slurm database.
Required Length: 8 characters
The password must not contain any of the characters -, \, ', or "
k8s_version 1.19.3 Kubernetes version.
Accepted values: "1.16.7" or "1.19.3"
k8s_cni calico CNI type used by Kubernetes.
Accepted values: calico, flannel
k8s_pod_network_cidr Kubernetes pod network CIDR
docker_username   Username used to log in to Docker. A Kubernetes secret will be created and patched to the service account in the default namespace.
This value is optional, but setting it is suggested to avoid Docker pull limit issues.
docker_password   Password used to log in to Docker.
This value is mandatory if docker_username is provided.
ansible_config_file_path /etc/ansible Path where the ansible.cfg file can be found.
If Ansible was installed using dnf, the default value is valid. If it was installed using pip, the variable must be set manually.
login_node_required true Boolean indicating whether a login node is required.
domain_name omnia.test Sets the intended domain name
realm_name OMNIA.TEST Sets the intended realm name
directory_manager_password   Password that authenticates admin-level access to the Directory for system management tasks. It will be added to the instance of the directory server created for IPA.
Required Length: 8 characters.
The password must not contain any of the characters -, \, ', or "
kerberos_admin_password   Password for the "admin" user of the IPA server on RockyOS. If LeapOS is in use, it is used as the "kerberos admin" user password for 389-ds.
This field is not relevant to control planes running LeapOS.
enable_secure_login_node false Boolean indicating whether security features are enabled on the login node.
beegfs_support false Boolean value deciding whether to install BeeGFS-client on nodes.
beegfs_rdma_support false Boolean indicating whether the network hardware in use is RDMA capable (for example, InfiniBand).
beegfs_ofed_kernel_modules_path   Path where OFED kernel modules are located for Mellanox Switches.
beegfs_mgmt_server   IP address of the BeeGFS management server.
Required field.
beegfs_mounts /mnt/beegfs Path where the BeeGFS-client filesystem is mounted
beegfs_unmount_client false Boolean indicating whether there’s a pre-existing BeeGFS configuration. Set to true when updating the BeeGFS Client version or mount location.
beegfs_client_version 7.2.6 BeeGFS version to be installed on all the nodes.
Minimum Supported version: 7.2
beegfs_version_change false Boolean indicating whether BeeGFS is to be updated when running omnia.yml. Set to true if there is a pre-existing BeeGFS setup that needs to be updated.
nfs_client_params - { server_ip: , server_share_path: , client_share_path: , client_mount_options: } If NFS client services are to be deployed, enter the required configuration here as a list of JSON-style mappings, one per mount. If left blank, no NFS configuration takes place. Possible values include:
1. Single NFS file system: A single filesystem from a single NFS server is mounted.
Sample value:
- { server_ip: xx.xx.xx.xx, server_share_path: "/mnt/share", client_share_path: "/mnt/client", client_mount_options: "nosuid,rw,sync,hard,intr" }
2. Multiple Mount NFS file system: Multiple filesystems from a single NFS server are mounted.
Sample values:
- { server_ip: xx.xx.xx.xx, server_share_path: "/mnt/server1", client_share_path: "/mnt/client1", client_mount_options: "nosuid,rw,sync,hard,intr" }
- { server_ip: xx.xx.xx.xx, server_share_path: "/mnt/server2", client_share_path: "/mnt/client2", client_mount_options: "nosuid,rw,sync,hard,intr" }
3. Multiple NFS file systems: Multiple filesystems are mounted from multiple servers.
Sample values:
- { server_ip: zz.zz.zz.zz, server_share_path: "/mnt/share1", client_share_path: "/mnt/client1", client_mount_options: "nosuid,rw,sync,hard,intr" }
- { server_ip: xx.xx.xx.xx, server_share_path: "/mnt/share2", client_share_path: "/mnt/client2", client_mount_options: "nosuid,rw,sync,hard,intr" }
- { server_ip: yy.yy.yy.yy, server_share_path: "/mnt/share3", client_share_path: "/mnt/client3", client_mount_options: "nosuid,rw,sync,hard,intr" }
powervault_ip   IP address of the PowerVault connected to the NFS server. Mandatory when the nfs_node group is defined with an IP address and Omnia is required to configure the NFS server.
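
To make the table above easier to apply, here is a minimal sketch of how a few of these parameters might look together in omnia_config.yml. It is an illustration under assumptions, not the shipped file. The quoting style, the ordering, and the nesting of the list under nfs_client_params are guesses. The value "changeme" and the address 192.0.2.10 (from the range reserved for documentation) are placeholders. All other values are the defaults listed above.

```yaml
# Illustrative fragment only; the layout and quoting are assumptions.
mariadb_password: "password"            # default; 8 characters, none of - \ ' "
k8s_version: "1.19.3"                   # accepted: "1.16.7" or "1.19.3"
k8s_cni: "calico"                       # accepted: calico, flannel

docker_username: ""                     # optional
docker_password: ""                     # mandatory only if docker_username is set

ansible_config_file_path: "/etc/ansible"
login_node_required: true
domain_name: "omnia.test"
realm_name: "OMNIA.TEST"
directory_manager_password: "changeme"  # placeholder, 8 characters
enable_secure_login_node: false

beegfs_support: false
beegfs_rdma_support: false
beegfs_mounts: "/mnt/beegfs"
beegfs_client_version: "7.2.6"          # minimum supported version: 7.2

# One list entry per NFS mount; 192.0.2.10 is a placeholder server address.
nfs_client_params:
  - { server_ip: 192.0.2.10, server_share_path: "/mnt/share", client_share_path: "/mnt/client", client_mount_options: "nosuid,rw,sync,hard,intr" }
```

Straight double quotes are used throughout. Inside the braces, an unquoted value such as the mount options would be split at its commas, so quoting those strings keeps each one intact.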