Harbor Image Synchronization to Cluster Registry
This guide explains how to configure Harbor replication rules to automatically synchronize newly pushed images from Harbor to in-cluster registries.
Prerequisites
- A running Harbor service
- The `podman` client or another image management tool installed locally for pushing images to Harbor
- A Kubernetes cluster with a registry deployed
- `kubectl` installed locally and configured to access the Kubernetes cluster
Overview
Key Configurations in Harbor
- Registry Endpoint: Add target registry information to Harbor
- Replication Rule: Define which images to sync and the synchronization path
Process Overview
At a high level: a new image pushed to Harbor triggers the event-based replication rule, which pushes the image through the configured registry endpoint to the target cluster's registry, where workloads can then pull it.
Configuration Steps
Step 1: Configure Registry Endpoint in Harbor
First, configure the target registry information in Harbor using the Registry Endpoint feature.
- Log into Harbor and navigate to Administration > Registries
- Click NEW ENDPOINT and configure the following:
  - Provider: Select your registry type. For this example, choose `Harbor`
  - Name: Provide a descriptive name for your target registry. In this example, we recommend the format `<cluster-name>-registry` for easy identification
  - Endpoint URL: Enter your registry URL (e.g., `https://cluster1-registry.example.com`)
  - Access ID: Username with `Push/Delete` permissions for the target registry
  - Access Secret: Password for the above user
  - Verify Remote Cert: Determines whether to verify the remote registry's certificate. Uncheck for self-signed or untrusted certificates
- Click TEST CONNECTION to verify configuration and network connectivity
- Click OK to save the configuration
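If you prefer to script this step, the same configuration can be submitted through Harbor's v2.0 REST API. The sketch below is illustrative only: the Harbor host, admin credentials, and target registry values are placeholders for your own environment.

```bash
# Create a registry endpoint via Harbor's v2.0 API (all values are placeholders).
curl -X POST "https://harbor.example.com/api/v2.0/registries" \
  -u "admin:Harbor12345" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "cluster1-registry",
        "type": "harbor",
        "url": "https://cluster1-registry.example.com",
        "credential": {
          "type": "basic",
          "access_key": "pusher",
          "access_secret": "changeme"
        },
        "insecure": false
      }'
```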
Step 2: Configure Replication Rule in Harbor
Next, configure a replication rule to define how Harbor synchronizes images to the target registry.
- Navigate to Administration > Replications
- Click NEW REPLICATION RULE and configure:
Basic Settings
- Name: Use a descriptive name like `<cluster-name>-registry-replication`
- Replication mode: Select `Push-based` to trigger sync when images are pushed to Harbor
Source Resource Filter
Configure filters to identify which artifacts to replicate:
- Name: Filter by resource name. Leave empty or use `**` to match all. For this example, leave empty to sync all repositories
- Tag: Filter by tag/version. Leave empty or use `**` to match all tags. For this example, leave empty to sync all tags
- Label: Filter by artifact label. Leave empty to match all artifacts, as in this example
- Resource: Filter by resource type. For this example, select `ALL` to sync all artifact types
Destination
Define the sync target and path structure:
- Destination registry: Select the registry endpoint configured in Step 1
- Namespace: The namespace into which resources are replicated. If left empty, resources are placed in the same namespace as the source, which is what this example does
- Flattening: Reduce the nested repository structure when copying images. Leave empty to preserve the original image hierarchy
Additional Settings
- Trigger Mode: Decide how to trigger the sync. For this example, select `Event Based` to trigger sync on Harbor push events
- Check "Delete remote resources when locally deleted" if you want deletions in Harbor to propagate to the target registry
- Bandwidth: Set maximum network bandwidth per replication worker (use -1 for unlimited)
- Options:
  - Check `Override` to overwrite existing resources at the destination
  - Check `Enable rule` to activate the replication rule
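Like the endpoint, the rule can also be created through Harbor's v2.0 REST API. The sketch below assumes the endpoint from Step 1 was assigned registry ID 1 (it can be looked up via `GET /api/v2.0/registries`); hostnames and credentials are placeholders.

```bash
# Create a push-based, event-triggered replication rule (policy ID and
# credentials are illustrative). Empty filters and dest_namespace mirror
# the UI choices above: sync everything, preserve the source namespace.
curl -X POST "https://harbor.example.com/api/v2.0/replication/policies" \
  -u "admin:Harbor12345" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "cluster1-registry-replication",
        "dest_registry": {"id": 1},
        "dest_namespace": "",
        "trigger": {"type": "event_based"},
        "filters": [],
        "deletion": false,
        "override": true,
        "enabled": true,
        "speed": -1
      }'
```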
Step 3: Verify Image Synchronization
Push an Image to Harbor
Push an image to Harbor to trigger the sync:
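A minimal sequence with podman; the Harbor host (`harbor.example.com`), project (`demo`), and image tag are illustrative placeholders:

```bash
# Log in to Harbor, tag a local image into a Harbor project, and push it.
podman login harbor.example.com
podman pull docker.io/library/nginx:1.25
podman tag docker.io/library/nginx:1.25 harbor.example.com/demo/nginx:1.25
podman push harbor.example.com/demo/nginx:1.25
```

The push event should immediately trigger the replication rule configured in Step 2.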
Monitor Synchronization Job in Harbor
- In Harbor, navigate to Administration > Replications
- Select your newly created replication rule to see the automatically triggered sync job
- Click on the execution record to view detailed information
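The execution history is also exposed by the API; a sketch assuming the rule was created as policy ID 1:

```bash
# List replication executions for the rule (policy ID 1 is illustrative).
curl -s -u "admin:Harbor12345" \
  "https://harbor.example.com/api/v2.0/replication/executions?policy_id=1"
```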
Verify Synchronization Results
Create a pod in the target cluster to test that the image was successfully synchronized:
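For example, with kubectl; the image path assumes the `nginx` image pushed earlier, an unchanged namespace (`demo`), and an in-cluster registry at `cluster1-registry.example.com`:

```bash
# Create a test pod from the synchronized image in the target cluster.
kubectl run sync-test --image=cluster1-registry.example.com/demo/nginx:1.25

# Wait for it to become Ready; success means the image was pulled
# from the in-cluster registry.
kubectl wait --for=condition=Ready pod/sync-test --timeout=120s
```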
If the pod starts successfully, the synchronization is working correctly.
Summary
This configuration establishes automatic image synchronization from Harbor to cluster registries, maintaining consistent image paths and enabling seamless deployment workflows across your infrastructure. The push-based replication ensures that images are available in target registries immediately after being pushed to Harbor.
FAQ
Can the Replication Function Be Used for Harbor Disaster Recovery?
Harbor's Replication function can only provide a preliminary disaster recovery (DR) solution and comes with certain limitations.
- Firstly, a failover mechanism is required to automatically switch the domain name resolution from the primary instance to the secondary instance in the event of a primary instance failure, allowing users to continue pulling already synchronized images.
- Secondly, this function relies on event-driven synchronization of images and is primarily designed for replication between registries, not as a mature solution tailored for disaster recovery.
The following sections analyze the problems this solution can address and those it cannot.
Problems Addressed
- Automatic Synchronization of Image Data: Container images and Helm charts from the primary instance can be automatically synchronized to the secondary instance. When the "synchronize deletions" option is enabled, images deleted from the primary instance are automatically removed from the secondary instance.
- Automatic Failover: In the event of a primary instance failure, the external domain name resolution can automatically switch to the secondary instance, enabling users to pull already synchronized images (Recovery Time Objective, or RTO, depends primarily on the implemented failover mechanism).
Problems Not Addressed
- Automatic Synchronization of Non-Image Data (Critical!):
- Issue Description: Harbor's Replication function does not synchronize metadata such as user accounts, permission configurations (e.g., OIDC settings), project member groups, or audit logs.
- Impact: Users whose metadata is not synchronized will be unable to access Harbor after failover, significantly undermining the disaster recovery objectives.
- Recovery Point Objective (RPO):
- Issue Description: The Replication function is not designed specifically for disaster recovery. Image replication depends on background task execution, which cannot guarantee that all images in the primary registry are synchronized to the secondary registry at the time of a failure.
- Impact: During a primary registry failure, users will be unable to pull images that have not yet been synchronized.
- Fallback to the Primary Instance:
- Issue Description: The current solution only supports automatic failover to the secondary instance. Fallback to the primary instance requires manual intervention.
- Impact: Before fallback, incremental data from the secondary instance must be synchronized back to the primary instance, or this data will be lost.
Recommendations for a Comprehensive DR Solution
To achieve a robust disaster recovery mechanism, focus on synchronization at the PostgreSQL database (PG) and storage layers. For instance, implement real-time database replication for PostgreSQL and enable storage-layer synchronization (e.g., S3 cross-region replication) to ensure comprehensive consistency of both metadata and all data.
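As one illustration of the storage-layer piece, if Harbor's registry storage is backed by S3, cross-region replication can be enabled on the backing bucket. The bucket names, account ID, and IAM role below are hypothetical:

```bash
# Versioning must be enabled on both buckets before replication can be configured.
aws s3api put-bucket-versioning --bucket harbor-primary \
  --versioning-configuration Status=Enabled

# Replicate the primary Harbor bucket to the secondary region's bucket.
aws s3api put-bucket-replication --bucket harbor-primary \
  --replication-configuration '{
    "Role": "arn:aws:iam::123456789012:role/harbor-replication-role",
    "Rules": [{
      "ID": "harbor-dr",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {},
      "DeleteMarkerReplication": {"Status": "Enabled"},
      "Destination": {"Bucket": "arn:aws:s3:::harbor-secondary"}
    }]
  }'
```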