Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use sackd for the login nodes #2979

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

jvilarru
Copy link
Contributor

Substitute slurmd for the sackd daemon, this way an x-login partition is not needed.

Submission Checklist

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch #
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines #

Substitute slurmd for the sackd daemon, this way an x-login partition is
not needed.
Copy link
Member

@tpdownes tpdownes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the behavior of sackd when the slurmctld is non responsive? We currently configure the slurmd service unit to restart when configless slurmd fails to retrieve the configuration.

@jvilarru
Copy link
Contributor Author

jvilarru commented Sep 2, 2024

Added new commit so that at reconfigure sackd is restarted, but you have a valid point about the restart on-failure defined in https://github.com/GoogleCloudPlatform/slurm-gcp/blob/master/ansible/roles/slurm/templates/systemd/slurmd_overrides.j2 I will create a PR also on slurm-gcp for that.

@jvilarru
Copy link
Contributor Author

jvilarru commented Sep 2, 2024

Done: GoogleCloudPlatform/slurm-gcp#204

Copy link
Member

@tpdownes tpdownes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this should not be merged until new images are published using

GoogleCloudPlatform/slurm-gcp#204

@tpdownes tpdownes assigned mr0re1 and unassigned jvilarru Sep 3, 2024
@mr0re1 mr0re1 added the do-not-merge Block merging of this PR label Sep 4, 2024
@mr0re1
Copy link
Collaborator

mr0re1 commented Sep 4, 2024

Waiting for images-based on 6.7.0 to be relased. DO_NOT_MERGE until then.

@mr0re1
Copy link
Collaborator

mr0re1 commented Sep 4, 2024

/gcbrun

@mr0re1 mr0re1 added the release-module-improvements Added to release notes under the "Module Improvements" heading. label Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge Block merging of this PR release-module-improvements Added to release notes under the "Module Improvements" heading.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants