This module provisions one or more PBS execution hosts to run jobs in a PBS Professional cluster. The following additional services are required:
- an existing licensed PBS Professional Server; if provisioned in the cloud, we recommend using the pbspro-server module.
- a shared filesystem mounted across all PBS hosts to facilitate file transfers for jobs and their stdin/stdout logs.
The following example snippet demonstrates use of the execution host module in concert with the pbspro-preinstall, pbspro-server, and filestore modules.
  - id: pbspro_execution
    source: community/modules/compute/pbspro-execution
    use:
    - homefs
    - pbspro_setup
    - pbspro_server
    settings:
      instance_count: 10
      machine_type: c2-standard-16
      name_prefix: pbs-exec
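The `use` list in the snippet refers to three other modules by ID. A minimal sketch of how they might be declared earlier in the same blueprint follows. The module source paths, the `network1` ID, and the `local_mount` value are assumptions made for illustration, and the settings each module requires (such as PBS package locations and licensing details) are deliberately omitted; consult each module's own documentation before relying on this.

```yaml
  # Hypothetical supporting modules; the IDs match the `use` list of pbspro_execution.
  - id: network1                  # assumed ID for a VPC network module
    source: modules/network/vpc   # assumed source path

  - id: homefs                    # shared filesystem for job files and logs
    source: modules/file-system/filestore   # assumed source path
    use:
    - network1
    settings:
      local_mount: /home          # assumed mount point

  - id: pbspro_setup              # pbspro-preinstall module; required settings omitted
    source: community/modules/scripts/pbspro-preinstall   # assumed source path

  - id: pbspro_server             # pbspro-server module; required settings omitted
    source: community/modules/scheduler/pbspro-server     # assumed source path
    use:
    - network1
    - homefs
    - pbspro_setup
```

Declaring the shared filesystem and the server before the execution hosts keeps the dependency order easy to read, although the blueprint resolves `use` references by ID rather than by position.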
More information on GPU support in PBS Pro and other Cluster Toolkit modules can be found at docs/gpu-support.md
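As a hedged illustration, GPU-equipped execution hosts could be requested through the `guest_accelerator` input. The module ID, name prefix, machine type, and accelerator type below are hypothetical choices, and the `type` and `count` subfield names are an assumption based on the `guest_accelerator` block of the Terraform `google_compute_instance` resource.

```yaml
  - id: pbspro_execution_gpu          # hypothetical module ID
    source: community/modules/compute/pbspro-execution
    use:
    - homefs
    - pbspro_setup
    - pbspro_server
    settings:
      instance_count: 2
      machine_type: n1-standard-8     # example machine type that accepts attached GPUs
      name_prefix: pbs-gpu            # hypothetical hostname prefix
      guest_accelerator:
      - type: nvidia-tesla-t4         # assumed subfield names: type, count
        count: 1
```

This fragment only attaches accelerators to the VMs. Driver installation and any PBS-side resource configuration are outside its scope.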
PBS Professional is licensed and supported by Altair. This module is maintained and supported by the Cluster Toolkit team in collaboration with Altair.
Copyright 2022 Google LLC
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Name | Version |
---|---|
terraform | >= 0.14.0 |
No providers.
Name | Source | Version |
---|---|---|
execution_startup_script | github.com/GoogleCloudPlatform/hpc-toolkit//modules/scripts/startup-script | v1.39.0&depth=1 |
pbs_execution | github.com/GoogleCloudPlatform/hpc-toolkit//modules/compute/vm-instance | 09ae2725 |
pbs_install | github.com/GoogleCloudPlatform/hpc-toolkit//community/modules/scripts/pbspro-install | v1.39.0&depth=1 |
No resources.
Name | Description | Type | Default | Required |
---|---|---|---|---|
auto_delete_boot_disk | Controls if boot disk should be auto-deleted when instance is deleted. | bool | true | no |
bandwidth_tier | Tier 1 bandwidth increases the maximum egress bandwidth for VMs. Using the tier_1_enabled setting will enable both gVNIC and TIER_1 higher bandwidth networking. Using the gvnic_enabled setting will only enable gVNIC and will not enable TIER_1. Note that TIER_1 only works with specific machine families & shapes and must be using an image that supports gVNIC. See official docs for more details. | string | "not_enabled" | no |
deployment_name | Cluster Toolkit deployment name. Cloud resource names will include this value. | string | n/a | yes |
disk_size_gb | Size of disk for instances. | number | 200 | no |
disk_type | Disk type for instances. | string | "pd-standard" | no |
enable_oslogin | Enable or Disable OS Login with "ENABLE" or "DISABLE". Set to "INHERIT" to inherit project OS Login setting. | string | "ENABLE" | no |
enable_public_ips | If set to true, instances will have public IPs on the internet. | bool | true | no |
guest_accelerator | List of the type and count of accelerator cards attached to the instance. | list(object({ | null | no |
instance_count | Number of instances | number | 1 | no |
instance_image | Instance image. Expected fields: name (the name of the image; mutually exclusive with family), family (the image family to use; mutually exclusive with name), project (the project where the image is hosted). | map(string) | { | no |
labels | Labels to add to the instances. Key-value pairs. | map(string) | n/a | yes |
local_ssd_count | The number of local SSDs to attach to each VM. See https://cloud.google.com/compute/docs/disks/local-ssd. | number | 0 | no |
local_ssd_interface | Interface to be used with local SSDs. Can be either 'NVME' or 'SCSI'. No effect unless local_ssd_count is also set. | string | "NVME" | no |
machine_type | Machine type to use for the instance creation | string | "c2-standard-60" | no |
metadata | Metadata, provided as a map | map(string) | {} | no |
name_prefix | Name prefix for PBS execution hostnames | string | null | no |
network_interfaces | A list of network interfaces. The options match that of the terraform network_interface block of google_compute_instance. For descriptions of the subfields or more information see the documentation: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance#nested_network_interface _NOTE:_ If network_interfaces are set, network_self_link and subnetwork_self_link will be ignored, even if they are provided through the use field. bandwidth_tier and enable_public_ips also do not apply to network interfaces defined in this variable. Subfields: network (string, required if subnetwork is not supplied); subnetwork (string, required if network is not supplied); subnetwork_project (string, optional); network_ip (string, optional); nic_type (string, optional, choose from ["GVNIC", "VIRTIO_NET"]); stack_type (string, optional, choose from ["IPV4_ONLY", "IPV4_IPV6"]); queue_count (number, optional); access_config (object, optional); ipv6_access_config (object, optional); alias_ip_range (list(object), optional) | list(object({ | [] | no |
network_self_link | The self link of the network to attach the VM. | string | "default" | no |
network_storage | An array of network attached storage mounts to be configured. | list(object({ | [] | no |
on_host_maintenance | Describes maintenance behavior for the instance. If left blank this will default to MIGRATE except for when placement_policy, spot provisioning, or GPUs require it to be TERMINATE. | string | null | no |
pbs_exec | Root path in which to install PBS | string | "/opt/pbs" | no |
pbs_execution_rpm_url | Path to PBS Pro Execution Host RPM file | string | n/a | yes |
pbs_home | PBS working directory | string | "/var/spool/pbs" | no |
pbs_server | IP address or DNS name of PBS server host | string | n/a | yes |
placement_policy | Control where your VM instances are physically located relative to each other within a zone. | object({ | null | no |
project_id | Project in which Google Cloud resources will be created | string | n/a | yes |
region | Default region for creating resources | string | n/a | yes |
service_account | Service account to attach to the instance. See https://www.terraform.io/docs/providers/google/r/compute_instance_template.html#service_account. | object({ | { | no |
spot | Provision VMs using discounted Spot pricing, allowing for preemption | bool | false | no |
startup_script | Startup script used on the instance | string | null | no |
subnetwork_self_link | The self link of the subnetwork to attach the VM. | string | null | no |
tags | Network tags, provided as a list | list(string) | [] | no |
threads_per_core | Sets the number of threads per physical core. By setting threads_per_core to 2, Simultaneous Multithreading (SMT) is enabled extending the total number of virtual cores. For example, a machine of type c2-standard-60 will have 60 virtual cores with threads_per_core equal to 2. With threads_per_core equal to 1 (SMT turned off), only the 30 physical cores will be available on the VM. The default value of "0" will turn off SMT for supported machine types, and will fall back to GCE defaults for unsupported machine types (t2d, shared-core instances, or instances with less than 2 vCPU). Disabling SMT can be more performant in many HPC workloads, therefore it is disabled by default where compatible. Allowed values: null = SMT configuration will use the GCE defaults for the machine type; 0 = SMT will be disabled where compatible (default); 1 = SMT will always be disabled (will fail on incompatible machine types); 2 = SMT will always be enabled (will fail on incompatible machine types). | number | 0 | no |
zone | Default zone for creating resources | string | n/a | yes |
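As an illustration of how several optional inputs from the table combine, the following fragment tunes SMT, networking, and provisioning for a group of execution hosts. The module ID and the chosen values are hypothetical; the setting names and allowed values are taken from the input descriptions.

```yaml
  - id: pbspro_execution_tuned        # hypothetical module ID
    source: community/modules/compute/pbspro-execution
    use:
    - homefs
    - pbspro_setup
    - pbspro_server
    settings:
      instance_count: 4
      machine_type: c2-standard-60
      threads_per_core: 1             # SMT always disabled: 30 physical cores per VM
      bandwidth_tier: gvnic_enabled   # gVNIC only, without TIER_1
      enable_public_ips: false        # execution hosts get no public IPs
      spot: false                     # set to true for Spot pricing with preemption
```

With `enable_public_ips: false`, the hosts need another route, such as Cloud NAT, to reach anything outside the VPC. The required inputs `pbs_server` and `pbs_execution_rpm_url` are presumably supplied through the `use` list, and deployment-wide values such as `project_id`, `region`, and `zone` through blueprint variables; both points are assumptions rather than something this document states.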
No outputs.