Compute Engine supports creating multiple VMs at once with the gcloud compute instances bulk create command.
The following sections describe the process of creating a startup script and deploying it to any number of Compute Engine VMs.
For detailed instructions on creating and connecting to a single VM, see Connect from Compute Engine: single client.
Required permissions
You must have the following IAM role in order to create a Compute Engine VM:
- Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1). For more information, refer to the [Compute Engine documentation][gce-role].
Set environment variables
The following environment variables are used in the example commands in this document:
export SSH_USER="daos-user"
export CLIENT_PREFIX="daos-client-vm"
export NUM_CLIENTS=10
Update these to your desired values.
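Because every later command depends on these values, it can help to check them before creating anything. The following is a minimal sketch, not part of the official procedure: validate_client_settings is a hypothetical helper name, and the fallback defaults simply repeat the example values above.

```shell
# Fall back to the example values when the variables are not already set.
SSH_USER="${SSH_USER:-daos-user}"
CLIENT_PREFIX="${CLIENT_PREFIX:-daos-client-vm}"
NUM_CLIENTS="${NUM_CLIENTS:-10}"

# Hypothetical helper: returns 0 only if the three settings look usable.
validate_client_settings() {
  if [ -z "${SSH_USER}" ] || [ -z "${CLIENT_PREFIX}" ]; then
    echo "SSH_USER and CLIENT_PREFIX must not be empty" >&2
    return 1
  fi
  # NUM_CLIENTS must consist only of digits and be greater than zero.
  case "${NUM_CLIENTS}" in
    ''|*[!0-9]*)
      echo "NUM_CLIENTS must be a positive integer" >&2
      return 1
      ;;
  esac
  if [ "${NUM_CLIENTS}" -le 0 ]; then
    echo "NUM_CLIENTS must be greater than zero" >&2
    return 1
  fi
}

if validate_client_settings; then
  echo "Settings OK: ${NUM_CLIENTS} VMs with prefix ${CLIENT_PREFIX}"
fi
```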
Create an SSH key
Create an SSH key and save it locally so that it can be distributed to the client VMs. The key is associated with the SSH user specified in the environment variables, and that user is created on each VM:
# Generate an SSH key for the specified user
ssh-keygen -t rsa -b 4096 -C "${SSH_USER}" -N '' -f "./id_rsa"
chmod 600 "./id_rsa"

# Create a new file in the format [user]:[public key] user
echo "${SSH_USER}:$(cat "./id_rsa.pub") ${SSH_USER}" > "./keys.txt"
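A malformed keys.txt only shows up later as a failed SSH login, so a quick format check can save a debugging round trip. The sketch below assumes the [user]:[public key] user layout described above; check_keys_file is a hypothetical helper, and the check is deliberately loose (it does not validate the key material itself).

```shell
# Hypothetical helper: succeeds only if every line looks like
# USER:KEYTYPE KEYDATA ... USER, with the same user at both ends.
check_keys_file() {
  awk '
    {
      split($1, head, ":")
      if (NF < 3 || head[1] == "" || head[2] !~ /^ssh-/ || $NF != head[1]) {
        bad = 1
      }
    }
    END { exit (bad || NR == 0) }
  ' "$1"
}

# Only check the file if it has already been generated.
if [ -f ./keys.txt ]; then
  if check_keys_file ./keys.txt; then
    echo "keys.txt looks well formed"
  else
    echo "keys.txt does not match the expected format" >&2
  fi
fi
```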
Get Parallelstore network details
Get the Parallelstore server IP addresses in a format consumable by the daos agent:
export ACCESS_POINTS=$(gcloud beta parallelstore instances describe INSTANCE_NAME \
  --location LOCATION \
  --format "value[delimiter=', '](format("{0}", accessPoints))")
Get the network name associated with the Parallelstore instance:
export NETWORK=$(gcloud beta parallelstore instances describe INSTANCE_NAME \
  --location LOCATION \
  --format "value[delimiter=', '](format("{0}", network))" | awk -F '/' '{print $NF}')
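The awk step keeps only the last path segment of the network resource name. As a small illustration, the following sketch applies the same filter to a made-up resource path; the project and network names are hypothetical, and the real value comes from the describe command.

```shell
# Hypothetical network resource path, used only to show what the awk filter does.
network_path="projects/my-project/global/networks/my-network"

# Split on "/" and print the last field, which is the bare network name.
network_name=$(echo "${network_path}" | awk -F '/' '{print $NF}')
echo "${network_name}"

# The same result with shell parameter expansion (strip everything up to the last "/").
echo "${network_path##*/}"
```

Both commands print my-network for this sample path.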
Create the startup script
The startup script is attached to the VM and is run every time the system starts. The startup script does the following:
- Configures the daos agent
- Installs required libraries
- Mounts your Parallelstore instance to /tmp/parallelstore/ on each VM
The following script works on VMs running HPC Rocky 8.
# Create a startup script that configures the VM
cat > ./startup-script << EOF
sudo tee /etc/yum.repos.d/parallelstore-v2-6-el8.repo << INNEREOF
[parallelstore-v2-6-el8]
name=Parallelstore EL8 v2.6
baseurl=https://us-central1-yum.pkg.dev/projects/parallelstore-packages/v2-6-el8
enabled=1
repo_gpgcheck=0
gpgcheck=0
INNEREOF
sudo dnf makecache

# Install daos-client
dnf install -y epel-release # needed for capstone
dnf install -y daos-client

# Upgrade libfabric
dnf upgrade -y libfabric

systemctl stop daos_agent

mkdir -p /etc/daos
cat > /etc/daos/daos_agent.yml << INNEREOF
access_points: ${ACCESS_POINTS}

transport_config:
  allow_insecure: true

fabric_ifaces:
- numa_node: 0
  devices:
  - iface: eth0
    domain: eth0
INNEREOF

echo -e "Host *\n\tStrictHostKeyChecking no\n\tUserKnownHostsFile /dev/null" > /home/${SSH_USER}/.ssh/config
chmod 600 /home/${SSH_USER}/.ssh/config

usermod -u 2000 ${SSH_USER}
groupmod -g 2000 ${SSH_USER}
chown -R ${SSH_USER}:${SSH_USER} /home/${SSH_USER}

chown -R daos_agent:daos_agent /etc/daos/

systemctl enable daos_agent
systemctl start daos_agent

mkdir -p /tmp/parallelstore
dfuse -m /tmp/parallelstore --pool default-pool --container default-container --disable-wb-cache --thread-count=16 --eq-count=8 --multi-user
chmod 777 /tmp/parallelstore
EOF
For help optimizing the values of --thread-count and --eq-count, see the Thread count and event queue count section of the Performance considerations page.
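The startup script is written with an unquoted heredoc delimiter (EOF), so ${ACCESS_POINTS} and ${SSH_USER} are expanded on your local machine when the file is created, not on the VM at boot. This is why the environment variables must be set before you generate the script. The following sketch illustrates the difference between an unquoted and a quoted delimiter; ACCESS_POINTS_EXAMPLE and its addresses are hypothetical stand-ins for the real value.

```shell
# Hypothetical value; in the procedure above it comes from the gcloud describe command.
ACCESS_POINTS_EXAMPLE="10.0.0.2, 10.0.0.3, 10.0.0.4"

demo_dir=$(mktemp -d)

# Unquoted delimiter: the variable is expanded locally when the file is written.
cat > "${demo_dir}/expanded" <<EOF
access_points: ${ACCESS_POINTS_EXAMPLE}
EOF

# Quoted delimiter: the text is written literally, so the reference stays unresolved.
cat > "${demo_dir}/literal" <<'EOF'
access_points: ${ACCESS_POINTS_EXAMPLE}
EOF

cat "${demo_dir}/expanded"
cat "${demo_dir}/literal"
```

After generating ./startup-script, a quick look at its access_points line (for example with grep access_points ./startup-script) shows whether the addresses were filled in.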
Create the client VMs
The overall performance of your workloads depends on the client machine types.
The following example uses c2-standard-30 VMs; modify the machine-type value to increase performance with faster NICs. See Machine families resource and comparison guide for details of the available machine types.
To create VM instances in bulk, use the gcloud compute instances bulk create command:
gcloud compute instances bulk create \
  --name-pattern="${CLIENT_PREFIX}-####" \
  --zone="LOCATION" \
  --machine-type="c2-standard-30" \
  --network-interface=subnet=${NETWORK},nic-type=GVNIC \
  --network-performance-configs=total-egress-bandwidth-tier=TIER_1 \
  --create-disk=auto-delete=yes,boot=yes,device-name=client-vm1,image=projects/cloud-hpc-image-public/global/images/hpc-rocky-linux-8-v20240126,mode=rw,size=100,type=pd-balanced \
  --metadata=enable-oslogin=FALSE \
  --metadata-from-file=ssh-keys=./keys.txt,startup-script=./startup-script \
  --count ${NUM_CLIENTS}
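In the name pattern, each # is a placeholder for one digit of a sequence number. The sketch below lists the names you would expect for the example values. It assumes numbering starts at 0001 with no matching VMs already present in the zone, which may not hold in your project; expected_vm_names is a hypothetical helper.

```shell
# Fall back to the example values when the variables are not already set.
CLIENT_PREFIX="${CLIENT_PREFIX:-daos-client-vm}"
NUM_CLIENTS="${NUM_CLIENTS:-10}"

# Hypothetical helper: print the names a PREFIX-#### pattern is expected to produce,
# assuming the sequence starts at 1 and is zero-padded to four digits.
expected_vm_names() {
  prefix="$1"
  count="$2"
  i=1
  while [ "${i}" -le "${count}" ]; do
    printf '%s-%04d\n' "${prefix}" "${i}"
    i=$((i + 1))
  done
}

expected_vm_names "${CLIENT_PREFIX}" "${NUM_CLIENTS}"
```

A list like this can be used in a loop to check each VM over SSH once the instances are running.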