v1.15.0: Improvements to Slurm and HTCondor solutions
Key New Features
- Support for HTCondor pools with both On-demand and Spot VMs
- Slurm solution updated to 5.6.0
- Support for custom machine types
- Label exclusive nodes with job ID for cost-tracking
- New zone_target_shape parameter corresponding to bulkInsert targetShape parameter
- FIX: lustre mounting regression introduced in 5.5.0
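For readers unfamiliar with the bulkInsert targetShape parameter that zone_target_shape corresponds to, the sketch below lists the three values the Compute Engine API documents for that field and checks a candidate setting against them. The helper name validate_zone_target_shape and the TARGET_SHAPES dictionary are hypothetical and not part of the Toolkit; confirm the values the module actually accepts against its own documentation.

```python
# Illustrative only: this helper and dictionary are NOT part of the HPC Toolkit.
# The keys are the values documented for the Compute Engine bulkInsert
# locationPolicy.targetShape field; the descriptions are paraphrased.
TARGET_SHAPES = {
    "ANY": "place VMs in whichever allowed zones have capacity",
    "ANY_SINGLE_ZONE": "place all VMs in a single zone",
    "BALANCED": "spread VMs across allowed zones as evenly as possible",
}


def validate_zone_target_shape(value: str) -> str:
    """Return value unchanged if it is a known target shape, else raise ValueError."""
    if value not in TARGET_SHAPES:
        allowed = ", ".join(sorted(TARGET_SHAPES))
        raise ValueError(
            f"zone_target_shape must be one of: {allowed}; got {value!r}"
        )
    return value
```

The comparison is case-sensitive on purpose, since the API values are upper-case enum names.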
Improvements
- filestore module added support for Shared VPCs via the use keyword and the pre-existing-vpc module
- HTCondor modules now use minimally-scoped authentication for each daemon
- HTCondor execute points disable benchmarks to decrease time to join pool
- Improved type alignment across modules, e.g. var.labels aligned to map(string)
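As a rough analogy for what aligning var.labels to map(string) implies, the sketch below reports which entries of a labels mapping are not string-to-string pairs. The function name non_string_labels is hypothetical and this is not the Toolkit's validation code; Terraform itself can convert some primitive values to strings automatically, so the check here is deliberately stricter than Terraform's behavior.

```python
def non_string_labels(labels: dict) -> list:
    """Return the keys whose key or value is not a str, ordered by repr().

    Hypothetical helper for illustration; an empty result means the mapping
    already looks like a map of string to string.
    """
    offending = [
        key
        for key, value in labels.items()
        if not isinstance(key, str) or not isinstance(value, str)
    ]
    # Sort by repr so mixed key types do not raise during comparison.
    return sorted(offending, key=repr)
```

A caller could reject a deployment when the returned list is non-empty, or quote the offending values before passing them on.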
What's Changed
- Rename filestore network_name to network_id to enable shared VPC via use by @nick-stroud in #962
- Improve attribute tracking in HTCondor scheduler by @tpdownes in #965
- Update fluent tutorial to use pre-existing-vpc module and other minor syntax updates by @nick-stroud in #963
- Revert "Rename filestore network_name to network_id to enable shared VPC via use" by @nick-stroud in #967
- Mask sleep/suspend targets on chrome-remote-desktop to prevent shutdown by @nick-stroud in #968
- Update image building example to use Slurm V5 by @mr0re1 in #964
- Improve HTCondor job matchmaking speed by @tpdownes in #971
- Roll-forward: "Rename filestore network_name to network_id to enable shared VPC via use" by @nick-stroud in #969
- Increase reliability of blueprints using DDN Exascaler by @tpdownes in #972
- Further increase speed at which HTCondor daemons update their ClassAds by @tpdownes in #974
- Initial support for Spot VMs within HTCondor pools by @tpdownes in #973
- Convert HTCondor autoscaler to SystemD timer by @tpdownes in #975
- Add validation to prevent usage of variables in backend block. by @mr0re1 in #970
- Making OFE deploy.sh MacOS compatible. Fixes #978 by @ek-nag in #979
- Improve Slurm log capturing by @tpdownes in #980
- Support Spot VMs in HTCondor pools by @tpdownes in #981
- Add utils for parsing and normalizing HCL dtype by @mr0re1 in #977
- Enable depth-first filling of HTCondor pools by @tpdownes in #982
- Escalate to root privileges to fetch Slurm logs by @mr0re1 in #987
- Bump google.golang.org/api from 0.110.0 to 0.111.0 by @dependabot in #984
- Bump github.com/spf13/afero from 1.9.4 to 1.9.5 by @dependabot in #985
- Bump github.com/go-git/go-git/v5 from 5.4.2 to 5.6.0 by @dependabot in #986
- Bump dill from 0.3.4 to 0.3.6 in /community/front-end/ofe by @dependabot in #990
- Bump google-cloud-core from 2.2.2 to 2.3.2 in /community/front-end/ofe by @dependabot in #991
- Bump astroid from 2.9.3 to 2.15.0 in /community/front-end/ofe by @dependabot in #992
- Bump proto-plus from 1.20.1 to 1.22.2 in /community/front-end/ofe by @dependabot in #993
- Bump isort from 5.10.1 to 5.12.0 in /community/front-end/ofe by @dependabot in #994
- Merge main into develop after release v1.14.0 by @mr0re1 in #997
- Bump terraform providers version 4.53.1 -> 4.56.0 by @mr0re1 in #998
- Clean up Filestore regardless of instances presence by @mr0re1 in #999
- Upgrade to slurm-gcp 5.6.0 by @SkylerMalinowski in #995
- Fix nfs-server example to use local_mounts instead of local_mount by @nick-stroud in #1001
- Add missing description for gcs_bucket_path by @nick-stroud in #1002
- Doc fix by @issacg in #1010
- Add mounting of cloud-storage-bucket to Slurm v5 test by @nick-stroud in #1007
- Use DeploymentName getter instead of looking up Vars by @mr0re1 in #1005
- Specify strict type for labels = map(string) by @mr0re1 in #1000
- Pass empty string instead of null to avoid mounting failure in Slurm by @nick-stroud in #1003
- Remove ghpc_role setting from nfs-server example by @nick-stroud in #1008
- Actually check mount instead of just checking dir exists by @nick-stroud in #1004
- Remove hostname test as it is not providing incremental value by @nick-stroud in #1006
- Double length of time for HTCondor integration test to detect job queue by @tpdownes in #1020
- Bump github.com/googleapis/gax-go/v2 from 2.7.0 to 2.7.1 by @dependabot in #1011
- Bump github.com/hashicorp/hcl/v2 from 2.16.1 to 2.16.2 by @dependabot in #1012
- Update slurm v5 readme about local-exec dependencies by @mr0re1 in #1023
- Bump google.golang.org/api from 0.111.0 to 0.112.0 by @dependabot in #1013
- Update OFE Dependabot configuration by @tpdownes in #1055
- Release v1.15.0 by @tpdownes in #1065
New Contributors
Full Changelog: v1.14.1...v1.15.0