Automate running backups and restores w/ fusion #2157

Merged
ibm-ci-bot merged 73 commits into IBM:scripts-dev on Sep 12, 2024

Conversation

bluzarraga
Member

@bluzarraga bluzarraga commented Aug 22, 2024

What this PR does / why we need it:
Creates an automation script to trigger and wait for Fusion backups and restores

Which issue(s) this PR fixes:
Work for https://github.ibm.com/IBMPrivateCloud/roadmap/issues/64245 & https://github.ibm.com/IBMPrivateCloud/roadmap/issues/64247

Special notes for your reviewer:

  1. How is the test done?
    Set up the cluster using the instructions from "Automate setup of Hub and Spoke clusters for backup and restore using Spectrum Fusion" #2142
    Update the variables in the script to match the expected values (e.g. BACKUP_STORAGE_LOCATION_NAME)
    Run auto-br.sh with the appropriate parameters
    Verify that the backup and restore complete

Outstanding items:

  • Usage function needs to be updated
  • Variables at the top of the script need to either be parameterized or updated via env.properties
  • Add prereq checks that the required variables are set before the script runs
  • Update logging statements
  • Clean up commented-out code and remove defaults for values that shouldn't have one
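
For the prereq item above, a minimal sketch of a variable check, not the code from this PR. `require_vars` is a hypothetical helper name; `BACKUP_STORAGE_LOCATION_NAME` is the only variable name taken from this description.

```shell
# Hypothetical helper: report every required variable that is unset or empty.
# The function body runs in a subshell, so its working variables do not leak.
require_vars() (
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    # Names come from the script itself, so eval is acceptable here.
    eval "value=\${${name}:-}"
    if [ -z "$value" ]; then
      echo "ERROR: required variable ${name} is not set" >&2
      missing=1
    fi
  done
  # The function succeeds only if nothing was missing.
  [ "$missing" -eq 0 ]
)
```

It could be called near the top of the script as `require_vars BACKUP_STORAGE_LOCATION_NAME || exit 1`, so a missing value is reported before any resource is applied.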

How to backport this PR to other branch:

  1. Add label to this PR with the target branch name backport <branch-name>
  2. The PR will be automatically created in the target branch after merging this PR
  3. If this PR is already merged, you can still add the label with the target branch name backport <branch-name> and leave a comment /backport to trigger the backport action

Signed-off-by: Ben Luzarraga <[email protected]>
@bluzarraga
Member Author

This PR is a bit messy because I started work on this script before #2142 was merged. The changes specific to this PR start from f9ce24b. I also opened the same PR against my fork to make the difference clearer: https://github.com/bluzarraga/ibm-common-service-operator/pull/4/files

@YCShen1010
Contributor

The backup automation part passed my test.

Result:

./auto-br.sh --backup --backup-name cs-application-backuo-test --cluster-type hub --target-cluster cutie1
All arguments passed into the auto-br.sh: --backup --backup-name cs-application-backuo-test --cluster-type hub --target-cluster cutie1


[✔] oc command available
[✔] yq command available
[✔] oc command logged in as kube:admin
# Creating Spectrum Fusion backup resource for hub cluster.
[INFO] Copying template files...
[INFO] Editing backup yaml...
backup.data-protection.isf.ibm.com/cs-application-backuo-test unchanged
[✔] Backup cs-application-backuo-test successfully applied on hub server https://api.cutie1.cp.fyre.ibm.com:6443 to backup target cluster cutie1
# Waiting for backup cs-application-backuo-test to complete...
Completed && Completed && oc get backup.data-protection.isf.ibm.com cs-application-backuo-test -n ibm-spectrum-fusion-ns -o jsonpath='{.status.phase}'
[INFO] backup cs-application-backuo-test can be further tracked in the UI here: https://console-ibm-spectrum-fusion-ns.apps.cutie1.cp.fyre.ibm.com/backupAndRestore/jobs/backups/cs-application-backuo-test
[✔] backup cs-application-backuo-test completed successfully for cutie1.
[INFO] For more info, see job in the UI (https://console-ibm-spectrum-fusion-ns.apps.cutie1.cp.fyre.ibm.com/backupAndRestore/jobs/backups/cs-application-backuo-test) or use "oc get backup cs-application-backuo-test -n ibm-spectrum-fusion-ns -o yaml | yq '.status'".
[✔] Backup cs-application-backuo-test of cluster cutie1 completed. See results in Fusion UI here: https://console-ibm-spectrum-fusion-ns.apps.cutie1.cp.fyre.ibm.com/backupAndRestore/jobs/backups/cs-application-backuo-test

@qpdpQ
Contributor

qpdpQ commented Sep 6, 2024

it should be good now

@bluzarraga
Member Author

Updated some of the parameters and prereq checks after testing the script. It now runs backups and restores consistently, so I think this is ready to merge and to hand over to the SERT team for further testing and use.

@bluzarraga bluzarraga changed the title [WIP] Automate running backups and restores w/ fusion Automate running backups and restores w/ fusion Sep 11, 2024
@YCShen1010
Contributor

YCShen1010 commented Sep 12, 2024

Backup test passed

./auto-br.sh --backup --backup-name cs-application-backup-test
All arguments passed into the auto-br.sh: --backup --backup-name cs-application-backup-test

[✔] oc command available
[✔] yq command available
[✔] oc command logged in as kube:admin
# Creating Spectrum Fusion backup.data-protection.isf.ibm.com resource for  cluster.
[INFO] Copying template files...
[INFO] Editing backup yaml...
backup.data-protection.isf.ibm.com/cs-application-backup-test created
[✔] Backup cs-application-backup-test successfully applied on hub server https://api.cutie1.cp.fyre.ibm.com:6443 to backup target cluster 
# Waiting for backup.data-protection.isf.ibm.com cs-application-backup-test to complete...
[INFO] backup.data-protection.isf.ibm.com cs-application-backup-test can be further tracked in the UI here: https://console-ibm-spectrum-fusion-ns.apps.cutie1.cp.fyre.ibm.com/backupAndRestore/jobs/backups/cs-application-backup-test
[INFO] Waiting on backup.data-protection.isf.ibm.com cs-application-backup-test to complete. Current status: InventoryInProgress
[INFO] Current sequence status:
...
[INFO] Waiting on backup.data-protection.isf.ibm.com cs-application-backup-test to complete. Current status: RecipeInProgress
[INFO] Waiting on backup.data-protection.isf.ibm.com cs-application-backup-test to complete. Current status: SnapshotInProgress
[INFO] Waiting on backup.data-protection.isf.ibm.com cs-application-backup-test to complete. Current status: SnapshotInProgress
[INFO] Waiting on backup.data-protection.isf.ibm.com cs-application-backup-test to complete. Current status: DataTransferInProgress
[INFO] Waiting on backup.data-protection.isf.ibm.com cs-application-backup-test to complete. Current status: DataTransferInProgress
[INFO] Waiting on backup.data-protection.isf.ibm.com cs-application-backup-test to complete. Current status: DataTransferInProgress
[INFO] Waiting on backup.data-protection.isf.ibm.com cs-application-backup-test to complete. Current status: Completed
[✔] backup.data-protection.isf.ibm.com cs-application-backup-test completed successfully for .
[INFO] For more info, see job in the UI (https://console-ibm-spectrum-fusion-ns.apps.cutie1.cp.fyre.ibm.com/backupAndRestore/jobs/backups/cs-application-backup-test) or use "oc get backup.data-protection.isf.ibm.com cs-application-backup-test -n ibm-spectrum-fusion-ns -o yaml | yq '.status'".
[✔] Backup cs-application-backup-test of cluster  completed. See results in Fusion UI here: https://console-ibm-spectrum-fusion-ns.apps.cutie1.cp.fyre.ibm.com/backupAndRestore/jobs/backups/cs-application-backup-test
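
The status progression in the log above (InventoryInProgress through Completed) is the kind of output a polling loop on `.status.phase` produces. Below is a minimal sketch of such a loop, not the actual auto-br.sh code. The `oc get ... -o jsonpath='{.status.phase}'` call and the namespace come from the logs in this thread. The function names, the default attempt count and interval, and the rule that a failure phase contains "Failed" are assumptions.

```shell
FUSION_NS="ibm-spectrum-fusion-ns"   # namespace seen in the logs in this thread

# Print the current phase of a Fusion backup or restore resource.
# $1 = resource kind (e.g. backup.data-protection.isf.ibm.com), $2 = resource name
get_phase() {
  oc get "$1" "$2" -n "$FUSION_NS" -o jsonpath='{.status.phase}'
}

# Poll until the resource is Completed, reaches a failure phase, or attempts run out.
# Exit status: 0 = completed, 1 = failed, 2 = timed out.
# The body is a subshell, so "exit" only leaves this function.
wait_for_completion() (
  kind=$1
  name=$2
  max_attempts=${3:-120}  # assumed default
  interval=${4:-30}       # seconds between polls, assumed default
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    # If the lookup itself fails, keep polling instead of aborting.
    phase=$(get_phase "$kind" "$name") || phase="Unknown"
    case "$phase" in
      Completed)
        echo "[INFO] $kind $name completed"
        exit 0
        ;;
      *Failed*)  # assumption: failure phases contain the word "Failed"
        echo "[ERROR] $kind $name ended in phase: $phase" >&2
        exit 1
        ;;
    esac
    echo "[INFO] Waiting on $kind $name to complete. Current status: $phase"
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "[ERROR] Timed out waiting on $kind $name" >&2
  exit 2
)
```

A call such as `wait_for_completion backup.data-protection.isf.ibm.com cs-application-backup-test` would then block until the backup finishes. Keeping the lookup in `get_phase` lets the loop be exercised without a cluster by redefining that one function.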

@YCShen1010
Contributor

YCShen1010 commented Sep 12, 2024

I have another question from testing restore: in the template, some operator and services namespaces are hard-coded, such as cs-serv and cs-op. Can we replace these namespaces with our own custom namespaces?
I tested the restore process with my own namespace and the restore failed with this status:
Failed validation There was an error when processing the job in the Backup service

spec:
  backup: cs-application-backup-test
  objectsToRestore:
    RESOURCES:
      - ALL
    v1/persistentvolumeclaim:
      - cs-op/setup-tenant-job-pvc
      - cs-serv/cs-db-backup-pvc
      - cs-serv/zen5-backup-pvc
      - ibm-lsr/lsr-backup-pvc

Signed-off-by: Allen Li <[email protected]>
@qpdpQ
Contributor

qpdpQ commented Sep 12, 2024

Updated the script to modify the PVC namespaces.
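
The change itself is not shown in this thread. As an illustration only, one way to swap the hard-coded `cs-op/` and `cs-serv/` prefixes in the restore spec for custom namespaces is a small `sed` filter. `rewrite_pvc_namespaces` and its two arguments are hypothetical names, and the sketch assumes the entries look like the `objectsToRestore` list quoted in the earlier comment.

```shell
# Hypothetical helper: rewrite the namespace prefix of PVC list entries.
# Reads YAML on stdin and writes the result to stdout.
# $1 = custom operator namespace (replaces cs-op)
# $2 = custom services namespace (replaces cs-serv)
rewrite_pvc_namespaces() (
  operator_ns=$1
  services_ns=$2
  # Only touch list items ("- ns/pvc-name"); all other lines pass through unchanged.
  sed -e "s|^\([[:space:]]*- \)cs-op/|\1${operator_ns}/|" \
      -e "s|^\([[:space:]]*- \)cs-serv/|\1${services_ns}/|"
)
```

For example, `rewrite_pvc_namespaces my-op my-serv < restore.yaml` (with `restore.yaml` standing in for the edited template) would turn `- cs-serv/cs-db-backup-pvc` into `- my-serv/cs-db-backup-pvc` and leave the `ibm-lsr/lsr-backup-pvc` entry as it is. A YAML-aware tool such as yq would be more robust if the template layout changes.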

Signed-off-by: Allen Li <[email protected]>
@YCShen1010
Contributor

Restore process also passed:

./auto-br.sh --restore --backup-name cs-application-backup-test --restore-name cs-application-backup-test-1 --cluster-type spoke --target-cluster apps.cutie1.cp.fyre.ibm.com
All arguments passed into the auto-br.sh: --restore --backup-name cs-application-backup-test --restore-name cs-application-backup-test-1 --cluster-type spoke --target-cluster apps.cutie1.cp.fyre.ibm.com

[✔] oc command available
[✔] yq command available
[✔] oc command logged in as kube:admin
# Creating Spectrum Fusion restore.data-protection.isf.ibm.com resource for spoke cluster.
[INFO] Editing restore yaml...
restore.data-protection.isf.ibm.com/cs-application-backup-test-1 created
[✔] Restore cs-application-backup-test-1 successfully applied on hub server https://api.cutie1.cp.fyre.ibm.com:6443 to restore target cluster apps.cutie1.cp.fyre.ibm.com
# Waiting for restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete...
[INFO] restore.data-protection.isf.ibm.com cs-application-backup-test-1 can be further tracked in the UI here: https://console-ibm-spectrum-fusion-ns.apps.cutie1.cp.fyre.ibm.com/backupAndRestore/jobs/restores/cs-application-backup-test-1
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: InventoryInProgress
[INFO] Current sequence status:
null
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Current sequence status:
...
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestorePvcsInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: RestoreEtcdInProgress
[INFO] Waiting on restore.data-protection.isf.ibm.com cs-application-backup-test-1 to complete. Current status: Completed
[✔] restore.data-protection.isf.ibm.com cs-application-backup-test-1 completed successfully for apps.cutie1.cp.fyre.ibm.com.
[INFO] For more info, see job in the UI (https://console-ibm-spectrum-fusion-ns.apps.cutie1.cp.fyre.ibm.com/backupAndRestore/jobs/restores/cs-application-backup-test-1) or use "oc get restore.data-protection.isf.ibm.com cs-application-backup-test-1 -n ibm-spectrum-fusion-ns -o yaml | yq '.status'".
[✔] Restore cs-application-backup-test-1 to cluster apps.cutie1.cp.fyre.ibm.com completed. See results in Fusion UI here: https://console-ibm-spectrum-fusion-ns.apps.cutie1.cp.fyre.ibm.com/backupAndRestore/jobs/restores/cs-application-backup-test-1
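
The invocations in this thread use the flags `--backup`, `--restore`, `--backup-name`, `--restore-name`, `--cluster-type` and `--target-cluster`. Below is a minimal sketch of how such flags could be parsed, not the parser in auto-br.sh. The variable names and the validation rules (a mode and a backup name are always required, and a restore also needs a restore name) are assumptions.

```shell
# Hypothetical sketch: parse the flags seen in this thread into shell variables.
parse_args() {
  MODE="" BACKUP_NAME="" RESTORE_NAME="" CLUSTER_TYPE="" TARGET_CLUSTER=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --backup)  MODE="backup" ;;
      --restore) MODE="restore" ;;
      --backup-name|--restore-name|--cluster-type|--target-cluster)
        # These flags take a value, so a second argument must exist.
        if [ "$#" -lt 2 ]; then
          echo "ERROR: missing value for $1" >&2
          return 1
        fi
        case "$1" in
          --backup-name)    BACKUP_NAME=$2 ;;
          --restore-name)   RESTORE_NAME=$2 ;;
          --cluster-type)   CLUSTER_TYPE=$2 ;;
          --target-cluster) TARGET_CLUSTER=$2 ;;
        esac
        shift  # drop the flag; the shared shift below drops its value
        ;;
      *)
        echo "ERROR: unknown argument: $1" >&2
        return 1
        ;;
    esac
    shift
  done
  # Assumed validation rules, not confirmed by this PR.
  if [ -z "$MODE" ] || [ -z "$BACKUP_NAME" ]; then
    echo "ERROR: --backup or --restore, and --backup-name, are required" >&2
    return 1
  fi
  if [ "$MODE" = "restore" ] && [ -z "$RESTORE_NAME" ]; then
    echo "ERROR: --restore requires --restore-name" >&2
    return 1
  fi
}
```

Calling `parse_args "$@" || exit 1` at the start of the script would reject a malformed command line before anything is applied to the hub.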

Contributor

@YCShen1010 YCShen1010 left a comment


/lgtm

@ibm-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bluzarraga, YCShen1010

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [YCShen1010,bluzarraga]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ibm-ci-bot ibm-ci-bot merged commit a61e662 into IBM:scripts-dev Sep 12, 2024
2 of 3 checks passed

4 participants