Applicability
This solution is applicable to Nexus instances running version 3.76 or later.
Terminology
| Term | Meaning |
|---|---|
| Original Instance | The tool instance before backup |
| New Instance | The tool instance after restoration |
| Original Namespace | The namespace of the tool instance before backup |
| New Namespace | The namespace of the tool instance after restoration |
| Nexus CR Resource | This resource is used to describe the deployment method of the Nexus instance. The Nexus Operator will deploy the Nexus instance based on this resource. |
Prerequisites
- Deploy MinIO Object Storage: This backup and restore solution relies on object storage to save backup data, so a MinIO instance must be deployed in advance. ACP provides .
- Deploy Velero: Velero is a backup and restore tool. ACP provides Alauda Container Platform Data Backup for Velero, which can be deployed in the Administrator view, under Marketplace -> Cluster Plugins by searching for Velero.
- Install mc Command-Line Tool: mc is MinIO's command-line management tool. For installation instructions, see the MinIO official documentation.
- Install kubectl Command-Line Tool: kubectl is Kubernetes' command-line management tool. For installation instructions, see the Kubernetes official documentation.
- Uninstall the Nexus operator from the cluster:
kubectl get subscription --all-namespaces | grep nexus-ce-operator | awk '{print "kubectl delete subscription "$2" -n "$1}' | sh
# Output:
# subscription.operators.coreos.com "nexus-ce-operator" deleted
Why uninstall the Nexus operator?
During restoration, Velero needs to start backup pods to restore PVC data, which takes a long time. If the Operator is not uninstalled, the following issues may occur:
- The Operator may recreate workloads based on the Nexus CR, causing the restored pods to be restarted or recreated, ultimately leading to restoration interruption or failure.
- Some restored resources may conflict with existing resources, such as Ingress.
Impact of Uninstalling Nexus operator
After uninstalling the Operator, modifications to the Nexus CR will not take effect, such as adjusting resources or storage size.
Uninstalling the Operator will not cause existing instances to malfunction.
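Before continuing, you can quickly confirm the prerequisites are in place. The commands below assume the Velero cluster plugin is installed in the cpaas-system namespace, as in the rest of this document:
# The Nexus operator subscription should be gone (no output expected)
kubectl get subscription --all-namespaces | grep nexus-ce-operator
# The Velero deployment installed by the cluster plugin should be available
kubectl get deployment velero -n cpaas-system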
For convenience in subsequent operations, please set the following environment variables first:
export MINIO_HOST=<MinIO instance access address> # Example: http://192.168.1.100:32008
export MINIO_ACCESS_KEY=<MinIO instance access key> # Example: minioadmin
export MINIO_SECRET_KEY=<MinIO instance secret key> # Example: minioadminpassword
export MINIO_ALIAS_NAME=<MinIO instance alias name> # Example: myminio
export VELERO_BACKUP_BUCKET=<Velero backup bucket name> # Example: backup
export VELERO_BACKUP_REPO_NAME=<Velero backup repo name> # Example: nexus-backup-repo
export NEXUS_NAMESPACE=<namespace of the NEXUS instance to be backed up>
export NEXUS_NAME=<name of the NEXUS instance to be backed up>
Run the following commands to configure the mc tool and test the connection:
mc alias set ${MINIO_ALIAS_NAME} ${MINIO_HOST} ${MINIO_ACCESS_KEY} ${MINIO_SECRET_KEY}
mc ping ${MINIO_ALIAS_NAME}
# Example output:
# 1: http://192.168.131.56:32571 min=98.86ms max=98.86ms average=98.86ms errors=0 roundtrip=98.86ms
# 2: http://192.168.131.56:32571 min=29.57ms max=98.86ms average=64.21ms errors=0 roundtrip=29.57ms
# 3: http://192.168.131.56:32571 min=29.57ms max=98.86ms average=52.77ms errors=0 roundtrip=29.88ms
If you can successfully ping the MinIO instance, it means mc is configured correctly.
Backup
Preparation
Backup preparation consists of two steps:
- Create Bucket
- Configure Velero backup repository
Create Bucket
Run the following command to create a bucket for storing backup data:
mc mb ${MINIO_ALIAS_NAME}/${VELERO_BACKUP_BUCKET}
# Output:
# Bucket created successfully `myminio/backup`.
Configure Velero Backup Repository
Run the following command to create the Velero backup repository:
nexus_backup_repo_credentials_secret_name="nexus-backup-repo-credentials"
minio_nexus_backup_bucket="${VELERO_BACKUP_BUCKET:-backup}"
minio_nexus_backup_bucket_directory="nexus"
kubectl apply -f - <<EOF
apiVersion: v1
stringData:
cloud: |
[default]
aws_access_key_id = ${MINIO_ACCESS_KEY}
aws_secret_access_key = ${MINIO_SECRET_KEY}
kind: Secret
metadata:
labels:
component: velero
cpaas.io/backup-storage-location-repo: ${VELERO_BACKUP_REPO_NAME}
name: ${nexus_backup_repo_credentials_secret_name}
namespace: cpaas-system
type: Opaque
---
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
name: ${VELERO_BACKUP_REPO_NAME}
namespace: cpaas-system
spec:
config:
checksumAlgorithm: ""
insecureSkipTLSVerify: "true"
region: minio
s3ForcePathStyle: "true"
s3Url: ${MINIO_HOST}
credential:
key: cloud
name: ${nexus_backup_repo_credentials_secret_name}
objectStorage:
bucket: ${minio_nexus_backup_bucket}
prefix: ${minio_nexus_backup_bucket_directory}
provider: aws
EOF
# Output
# secret/nexus-backup-repo-credentials created
# backupstoragelocation.velero.io/nexus-backup-repo created
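Optionally, verify that Velero has validated the new backup repository before continuing. This is a minimal check; Velero usually marks the location Available within about a minute:
kubectl get backupstoragelocation ${VELERO_BACKUP_REPO_NAME} -n cpaas-system -o jsonpath='{.status.phase}'
# Expected output: Available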
Backup Execution
Manual backup consists of four steps:
- Remove the nexus instance service to prevent data changes
- Create backup volume policy
- Execute backup
- After successful backup, restore the nexus instance service
Remove the nexus instance service to prevent data changes during backup
Run the following command to remove the nexus instance service:
kubectl delete service -l release=${NEXUS_NAME} -n ${NEXUS_NAMESPACE}
# Output:
# service "nexus-nxrm-ha" deleted
# service "nexus-nxrm-ha-hl" deleted
Create Backup Volume Policy
The volume policy specifies the backup method for PVCs. The default policy is to back up PVCs using the fs-backup method.
export BACKUP_POLICY_NAME=${BACKUP_POLICY_NAME:-nexus-backup}
export VELERO_BACKUP_REPO_NAME=${VELERO_BACKUP_REPO_NAME:-nexus-backup-repo}
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: ${BACKUP_POLICY_NAME}-volume-policy
namespace: cpaas-system
data:
resourcepolicies: |
version: v1
volumePolicies:
- conditions:
csi: {}
action:
type: fs-backup
EOF
# Output
# configmap/nexus-backup-volume-policy created
If the PVCs support snapshots, you can back them up with the snapshot method to speed up the backup. To do so, modify the volume policy ConfigMap so that the relevant storage class uses snapshot backup. For example, if the ceph storage class supports snapshot backup, add the following configuration (place the added condition at the top so it gets higher priority):
How to determine if the PVC supports snapshot?
WARNING
If you use PVC snapshot backup, you cannot change the storage class when restoring. Choose the backup method according to your actual situation.
apiVersion: v1
kind: ConfigMap
metadata:
name: ${BACKUP_POLICY_NAME}-volume-policy
namespace: cpaas-system
data:
resourcepolicies: |
version: v1
volumePolicies:
# If the storage class supports snapshot, you can use snapshot to backup pvc to speed up backup.
# Add the following configuration to the volume policy configmap.
- conditions:
storageClass:
- ceph
action:
type: snapshot
# end of snapshot configuration
- conditions:
csi: {}
action:
type: fs-backup
WARNING
If CSI snapshot support is not enabled in Velero, the backup reports the following error:
Skip action velero.io/csi-pvc-backupper for resource persistentvolumeclaims:xxx, because the CSI feature is not enabled.
To fix this, add the --features=EnableCSI parameter to the velero deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: velero
spec:
template:
spec:
containers:
- args:
- server
- --uploader-type=restic
+ - --features=EnableCSI
- --namespace=cpaas-system
command:
- /velero
name: velero
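As an alternative to editing the Deployment by hand, the flag can be appended with a JSON patch. This is a sketch that assumes velero is the first container in the Deployment and that the flag is not already present:
kubectl patch deployment velero -n cpaas-system --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--features=EnableCSI"}]'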
Execute Backup
Run the following commands to create a backup schedule and trigger a backup job:
export BACKUP_POLICY_NAME=${BACKUP_POLICY_NAME:-nexus-backup}
export VELERO_BACKUP_REPO_NAME=${VELERO_BACKUP_REPO_NAME:-nexus-backup-repo}
kubectl apply -f - <<EOF
apiVersion: velero.io/v1
kind: Schedule
metadata:
name: ${BACKUP_POLICY_NAME}
namespace: cpaas-system
spec:
schedule: '@every 876000h'
template:
defaultVolumesToFsBackup: true
hooks: {}
includedNamespaces:
- ${NEXUS_NAMESPACE}
includedResources:
- '*'
resourcePolicy:
kind: configmap
name: ${BACKUP_POLICY_NAME}-volume-policy
storageLocation: ${VELERO_BACKUP_REPO_NAME}
ttl: 720h0m0s
EOF
kubectl create -f - <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
labels:
velero.io/schedule-name: ${BACKUP_POLICY_NAME}
velero.io/storage-location: ${VELERO_BACKUP_REPO_NAME}
generateName: ${BACKUP_POLICY_NAME}-
namespace: cpaas-system
spec:
csiSnapshotTimeout: 10m0s
defaultVolumesToFsBackup: true
includedNamespaces:
- ${NEXUS_NAMESPACE}
includedResources:
- "*"
itemOperationTimeout: 4h0m0s
resourcePolicy:
kind: configmap
name: ${BACKUP_POLICY_NAME}-volume-policy
snapshotMoveData: false
storageLocation: ${VELERO_BACKUP_REPO_NAME}
ttl: 720h0m0s
EOF
# Output
# schedule.velero.io/nexus-backup created
# backup.velero.io/nexus-backup-r6hht created
View backup logs:
kubectl logs -f -n cpaas-system -l app.kubernetes.io/instance=velero --max-log-requests 100 | grep nexus-backup
# Output
# time="2025-11-03T12:08:00Z" level=info msg="PodVolumeBackup completed" backup=cpaas-system/nexus-backup-n4lxm controller=podvolumebackup logSource="pkg/controller/pod_volume_backup_controller.go:244" pvb=cpaas-system/nexus-backup-n4lxm-5hs4m
Check task progress. If the status is Completed, the backup was successful.
kubectl get backup -n cpaas-system -o jsonpath="{range .items[*]}{.metadata.name}{'\t'}{.status.phase}{'\t'}{.status.startTimestamp}{'\n'}{end}" | grep ${BACKUP_POLICY_NAME}
# Output
# nexus-backup-n4lxm Completed 2025-11-03T12:07:53Z
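If you prefer to block until the backup finishes instead of polling, kubectl 1.23 and later support jsonpath conditions in kubectl wait. A sketch, using the backup name printed when the Backup was created:
kubectl wait backup/<backup-name> -n cpaas-system --for=jsonpath='{.status.phase}'=Completed --timeout=2h
Note that this command only succeeds on Completed; if the backup ends in Failed or PartiallyFailed it will simply time out, so check the phase afterwards.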
After Successful Backup, Restore the nexus instance service
Go to the Administrator view, Marketplace -> OperatorHub page, switch to the target cluster, and redeploy the Alauda Build of Nexus Operator. After redeployment, the Operator recreates the deleted Services according to the Nexus CR.
Restore
Preparation
Restore preparation consists of three steps:
- Select the backup to restore
- Determine the target namespace for restoration
- Uninstall Nexus operator from the cluster.
Select the Backup to Restore
Run the following command to view all successful backup records and select the desired backup based on the start time.
kubectl get backup -n cpaas-system -o jsonpath="{range .items[*]}{.metadata.name}{'\t'}{.status.phase}{'\t'}{.status.startTimestamp}{'\n'}{end}" | grep ${BACKUP_POLICY_NAME} | grep Completed
# Output:
# nexus-backup-n4lxm Completed 2025-11-03T12:07:53Z
Set the BACKUP_NAME environment variable:
export BACKUP_NAME=<selected backup record name>
Determine Target Namespace
It is recommended to restore to a new namespace. Set the following environment variables:
export NEW_NEXUS_NAMESPACE=<new namespace name>
kubectl create namespace ${NEW_NEXUS_NAMESPACE}
Uninstall Nexus operator from the cluster.
Run the following command to uninstall the Nexus operator:
kubectl get subscription --all-namespaces | grep nexus-ce-operator | awk '{print "kubectl delete subscription "$2" -n "$1}' | sh
# Output:
# subscription.operators.coreos.com "nexus-ce-operator" deleted
Why uninstall the Nexus operator?
During restoration, Velero needs to start backup pods to restore PVC data, which takes a long time. If the Operator is not uninstalled, the following issues may occur:
- The Operator may recreate workloads based on the Nexus CR, causing the restored pods to be restarted or recreated, ultimately leading to restoration interruption or failure.
- Some restored resources may conflict with existing resources, such as Ingress.
Impact of Uninstalling Nexus operator
After uninstalling the Operator, modifications to the Nexus CR will not take effect, such as adjusting resources or storage size.
Uninstalling the Operator will not cause existing instances to malfunction.
Restore Operations
Restore operations consist of five steps:
- Create restore configuration file
- Create restore task
- Clean up resources
- Modify Nexus CR resource
- Redeploy the Nexus operator to the cluster
Create Restore Configuration File
Please read the YAML comments carefully and modify as needed (such as changing the storage class) before creating.
# If you need to change the storage class, set the following environment variables
OLD_STORAGE_CLASS_NAME='' # Original storage class name
NEW_STORAGE_CLASS_NAME='' # New storage class name
if [ -n "${OLD_STORAGE_CLASS_NAME}" ] && [ -n "${NEW_STORAGE_CLASS_NAME}" ] && [ "${NEW_STORAGE_CLASS_NAME}" != "${OLD_STORAGE_CLASS_NAME}" ]; then
kubectl apply -f - <<EOF
apiVersion: v1
data:
resourcemodifier: |
version: v1
resourceModifierRules:
- conditions:
groupResource: persistentvolumeclaims
resourceNameRegex: .*
namespaces:
- "*"
patches: &a1
- operation: test
path: /spec/storageClassName
value: ${OLD_STORAGE_CLASS_NAME}
- operation: replace
path: /spec/storageClassName
value: ${NEW_STORAGE_CLASS_NAME}
- conditions:
        groupResource: persistentvolumes
resourceNameRegex: .*
namespaces:
- "*"
patches: *a1
kind: ConfigMap
metadata:
labels:
component: velero
name: nexus-restore-modifier
namespace: cpaas-system
EOF
else
kubectl apply -f - <<EOF
apiVersion: v1
data:
resourcemodifier: |
version: v1
resourceModifierRules:
- conditions:
groupResource: persistentvolumeclaims
resourceNameRegex: .*
namespaces:
- "*"
patches:
- operation: add
path: "/metadata/annotations/meta.helm.sh~1release-namespace"
value: ${NEW_NEXUS_NAMESPACE}
kind: ConfigMap
metadata:
labels:
component: velero
name: nexus-restore-modifier
namespace: cpaas-system
EOF
fi
# Output:
# configmap/nexus-restore-modifier created
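Before creating the restore task, you can double-check which modifier rules were actually written; a minimal check:
kubectl get configmap nexus-restore-modifier -n cpaas-system -o jsonpath='{.data.resourcemodifier}'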
Create Restore Task
Run the following command to create the restore task:
kubectl create -f - <<EOF
apiVersion: velero.io/v1
kind: Restore
metadata:
generateName: ${NEXUS_NAME}-restore-
namespace: cpaas-system
spec:
backupName: ${BACKUP_NAME}
hooks: {}
includedNamespaces:
- ${NEXUS_NAMESPACE}
includedResources:
- persistentvolumeclaims
- persistentvolumes
- configmaps
- secrets
- pods
- nexuses.operator.alaudadevops.io
- volumesnapshots
- volumesnapshotcontents
itemOperationTimeout: 10h0m0s
namespaceMapping:
${NEXUS_NAMESPACE}: ${NEW_NEXUS_NAMESPACE}
resourceModifier:
kind: ConfigMap
name: nexus-restore-modifier
EOF
# Output:
# restore.velero.io/<nexus-name>-restore-m2n8f created
View restore logs:
kubectl logs -f -n cpaas-system -l app.kubernetes.io/instance=velero --max-log-requests 100 | grep ${NEXUS_NAME}-restore
# Output
# time="2025-11-03T12:27:38Z" level=info msg="Async fs restore data path completed" PVR=<nexus-name>-restore-m2n8f-n2tx2 controller=PodVolumeRestore logSource="pkg/controller/pod_volume_restore_controller.go:275" pvr=<nexus-name>-restore-m2n8f-n2tx2
# time="2025-11-03T12:27:38Z" level=info msg="Restore completed" controller=PodVolumeRestore logSource="pkg/controller/pod_volume_restore_controller.go:327" pvr=<nexus-name>-restore-m2n8f-n2tx2
Check task progress. If the status is Completed, the restore was successful.
kubectl get restore -n cpaas-system -o jsonpath="{range .items[*]}{.metadata.name}{'\t'}{.status.phase}{'\t'}{.status.startTimestamp}{'\n'}{end}" | grep ${NEXUS_NAME}-restore
# Output
# <nexus-name>-restore-m2n8f InProgress 2025-11-03T12:27:38Z
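While the restore is InProgress, PVC data is copied back through PodVolumeRestore resources (visible in the logs above). You can also check their progress directly; a rough sketch, reusing the same name filter as above:
kubectl get podvolumerestores -n cpaas-system | grep ${NEXUS_NAME}-restore
# Each PodVolumeRestore should eventually reach the Completed phase.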
Clean Up Resources
Make sure the restore operation is complete before proceeding!
Run the following commands to clean up resources:
kubectl delete configmap,secrets,sa -n ${NEW_NEXUS_NAMESPACE} -l release=${NEXUS_NAME}
kubectl get pod -n ${NEW_NEXUS_NAMESPACE} | grep ${NEXUS_NAME} | awk '{print $1}' | xargs kubectl delete pod -n ${NEW_NEXUS_NAMESPACE}
kubectl get secret -n ${NEW_NEXUS_NAMESPACE} | grep sh.helm.release.v1.${NEXUS_NAME} | awk '{print $1}' | xargs kubectl delete secret,configmap -n ${NEW_NEXUS_NAMESPACE}
kubectl delete configmap ${NEXUS_NAME}-nxrm-ha-logback-tasklogfile-override -n ${NEW_NEXUS_NAMESPACE}
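Before redeploying the Operator, it is worth confirming that the restored PVCs are present and Bound in the new namespace; a quick check:
kubectl get pvc -n ${NEW_NEXUS_NAMESPACE}
# All listed PVCs should be in the Bound state.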
Modify Nexus CR Resource
kubectl edit nexus ${NEXUS_NAME} -n ${NEW_NEXUS_NAMESPACE}
The new instance CR resource may need the following modifications. Please adjust according to your actual situation:
- Domain Name:
- Applicable scenario: The original instance was deployed using a domain name
- Reason: The same domain name in the Ingress resources of the old and new instances will cause conflicts, resulting in the failure to create the Ingress for the new instance
- Recommendation:
- (Recommended) Change the original instance's domain to a temporary domain, keep the new instance unchanged
- Or keep the original instance unchanged, and change the new instance to a new domain
- How to modify: See Configure Instance Network Access
- NodePort:
- Applicable scenario: The original instance was deployed using NodePort
- Reason: The same NodePort in the Service resources of the old and new instances will cause conflicts, resulting in the failure to create the Service for the new instance
- Recommendation:
- (Recommended) Change the original instance's NodePort to a temporary port, keep the new instance unchanged
- Or keep the original instance unchanged, and change the new instance to a new port
- How to modify: See Configure Instance Network Access
- Storage Class:
- Applicable scenario: The storage class of the old and new instances is different (e.g., the original used NFS, the new uses Ceph)
- Reason: If not modified, the Operator will still use the old storage class to create PVCs, which will conflict with the restored PVCs
- Recommendation: Change to the correct storage class
- How to modify: See Configure Instance Storage
Redeploy the Nexus Operator to the Cluster
Go to the Administrator view, Marketplace -> OperatorHub page, and redeploy the Alauda Build of Nexus Operator.
After the Operator is deployed, it will deploy the new instance according to the Nexus CR. You can view the progress on the instance details page.
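If you prefer the command line over the instance details page, a plain pod watch also shows the new instance coming up; pod names depend on the instance name:
kubectl get pod -n ${NEW_NEXUS_NAMESPACE} -w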
After the instance status returns to normal, log in to Nexus and check whether the data has been restored successfully, for example by verifying that repositories, stored components, and user settings are intact.
Q & A
How to determine if the PVC supports snapshot?
To determine whether a PVC supports snapshots, check whether its underlying StorageClass supports snapshot functionality: the PVC supports snapshots if the cluster contains a VolumeSnapshotClass whose driver matches the StorageClass's provisioner.
Example Configuration:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: ceph
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
clusterID: rook-ceph
csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
fsName: cephfs
pool: cephfs-data0
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
name: csi-cephfs-snapshotclass
deletionPolicy: Delete
driver: rook-ceph.cephfs.csi.ceph.com
parameters:
clusterID: rook-ceph
csi.storage.k8s.io/snapshotter-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph
Key Points:
- The StorageClass provisioner must exactly match the VolumeSnapshotClass driver
- Both resources must exist in the cluster
- The CSI driver must support snapshot operations
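A quick way to perform this check from the command line is to list both resources and compare the relevant columns; this sketch assumes the snapshot CRDs are installed in the cluster:
kubectl get storageclass -o custom-columns=NAME:.metadata.name,PROVISIONER:.provisioner
kubectl get volumesnapshotclass -o custom-columns=NAME:.metadata.name,DRIVER:.driver
# A PVC supports snapshots if its StorageClass's PROVISIONER appears as a DRIVER in the second list.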