Harbor Instance Deployment
This document describes how to deploy Harbor instances with the Harbor Operator, after the Operator has been subscribed, and the functionality that such deployments provide.
Harbor does not support deployment in namespaces with SPA (Security Policy Admission) policy set to Restricted due to the following reasons:
- Init Container Requires Root Privileges: Harbor uses init containers to initialize PVC directory permissions, which requires root privileges that are not allowed under the Restricted policy.
Recommendation: Create a dedicated namespace for the Harbor deployment and ensure that its security policy is not set to Restricted. When deploying Harbor with hostPath storage, the namespace security policy must be configured as Privileged.
TOC
Prerequisites
- This document applies to Harbor 2.12 and later versions provided by the platform, which are decoupled from the platform through technologies such as the Operator pattern.
- Ensure that the Harbor Operator has been deployed (subscribed) in the target cluster, that is, the Harbor Operator is ready to create instances.
Deployment Planning
Harbor supports a variety of resource configurations to accommodate different customer scenarios, and the required resources and configuration can vary significantly between them. This section therefore describes what to consider when planning a deployment and the impact of each decision point, so that users can make informed choices during the instance deployments that follow.
Basic Information
- The Harbor Operator provided by the platform is based on the community's official Harbor Operator, with enterprise-grade enhancements such as ARM support and security vulnerability fixes. It is fully compatible with the community version in terms of functionality, and it improves the deployment experience through optional, customizable templates.
- A Harbor instance contains multiple components, such as the `Registry` component responsible for managing image files, the `PostgreSQL` component providing storage for application metadata and user information, and the `Redis` component used for caching. The platform provides dedicated PostgreSQL and Redis Operators, so when deploying Harbor instances, Redis and PostgreSQL resources are not deployed directly; instead, existing instances are accessed by configuring the corresponding access credentials.
Pre-deployment Resource Planning
Pre-deployment resource planning refers to decisions that need to be made before deployment and take effect during deployment. The main contents include:
High Availability
- Harbor supports high availability deployment, with the following main impacts and limitations:
  - Each component will use multiple replicas
  - Network access no longer supports `NodePort`; access must go through domain names configured via `Ingress`
  - Node storage is no longer supported; storage must be provided through a `StorageClass` or `PVC`
Resources
According to community recommendations and practice, a non-high-availability Harbor instance can run with a minimum of 2 CPU cores and 4 Gi of memory, while high availability mode requires at least 8 CPU cores and 16 Gi of memory for stable operation.
Storage
- Common storage methods provided by the platform can be used for Harbor, such as storage classes, persistent volume claims (PVCs), and node storage.
- To intentionally skip persistence for Trivy, leave `spec.helmValues.persistence.persistentVolumeClaim.trivy.storageClass` unset. The vulnerability database then lives on an `emptyDir` volume, so each Trivy pod restart triggers a full database download and blocks vulnerability scans until it completes.
- Node storage is not suitable for high availability mode, as it stores files in a specified path on the host node.
- Additionally, Harbor supports object storage. For configuration instructions, see Using Object Storage as the Registry Storage Backend.
It is not recommended to use NFS as the storage backend for Harbor in production or high-load scenarios (such as large-scale image push/pull operations or high-concurrency image workloads). Due to inherent protocol characteristics, NFS cannot fully meet Harbor Registry's metadata-intensive operation requirements. Under heavy load, this often results in artifact upload failures, and you may see errors such as:
- digest invalid: provided digest did not match uploaded content
- blob upload unknown
- blob upload invalid
If you still want to use NFS as Harbor's storage backend in a testing environment, enabling sync and no_wdelay on the NFS server (consult your storage provider for configuration details) and setting the Registry component's replica count to 1 can help mitigate the above issues.
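The `sync` and `no_wdelay` options mentioned above are set on the NFS server side in its exports configuration. A hypothetical `/etc/exports` entry for illustration (the export path and client range are placeholders; consult your storage provider for the authoritative settings):

```
/exports/harbor 10.0.0.0/24(rw,sync,no_wdelay)
```

After editing the exports file, re-export the shares (for example with `exportfs -ra` on Linux NFS servers) for the change to take effect.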
Network
- The platform provides two mainstream network access methods: `NodePort` and `Ingress`
  - `NodePort` requires specifying an HTTP port and ensuring the port is available. `NodePort` is not suitable for high availability mode
  - `Ingress` requires specifying a domain name and ensuring that the domain name resolves correctly
- The platform supports the HTTPS protocol, which is configured after instance deployment. See Configure HTTPS for details.
Redis
It is recommended to use block storage (e.g., TopoLVM) to achieve higher IOPS and lower latency.
- The Redis component version that Harbor currently depends on is v6. It is recommended to use the Redis Operator provided by the platform to deploy a Redis instance, and then complete the Redis integration by configuring access credentials.
  - Redis access is configured through a `secret` resource with a specific content format. See Configure Redis, PostgreSQL, and Account Access Credentials for details.
- Known issue: Modifying Harbor project permissions prompts "Internal Server Error"
PostgreSQL
It is recommended to use block storage (e.g., TopoLVM) to achieve higher IOPS and lower latency.
- The PostgreSQL component version that Harbor currently depends on is v14. It is recommended to use the PostgreSQL Operator provided by the platform to deploy a PostgreSQL instance, and then complete the PostgreSQL integration by configuring access credentials.
  - PostgreSQL access is configured through a `secret` resource with a specific content format. See Configure Redis, PostgreSQL, and Account Access Credentials for details.
Account Credentials
When initializing a Harbor instance, you need to configure the admin account and its password. This is done by configuring a secret resource. See Configure Redis, PostgreSQL, and Account Access Credentials for details.
Post-deployment Configuration Planning
Post-deployment configuration planning refers to planning that does not require decisions before deployment, but can be changed as needed through standardized operations after deployment. This mainly includes Single Sign-On (SSO), HTTPS configuration, external load balancer configuration, etc. Please refer to Subsequent Operations for details.
Instance Deployment
The Harbor Operator provided by the platform mainly offers two deployment methods: deployment from templates and deployment from YAML.
The platform provides two built-in templates for use: the Harbor Quick Start template and the Harbor High Availability template, while also supporting custom templates to meet specific customer scenarios.
Information about the built-in templates and YAML deployment is as follows:
Deploying from the Harbor Quick Start Template
This template is used to quickly create a lightweight Harbor instance suitable for development and testing scenarios; it is not recommended for production environments.
- Computing resources: CPU 2 cores, Memory 4Gi
- Storage method: Uses local node storage, requires configuring storage node IP and path
- Network access: Uses NodePort method, shares node IP with storage, requires port specification
- Dependent services: Requires configuration of existing Redis and PostgreSQL access credentials
- Other settings: Requires account credentials configuration, SSO functionality is disabled by default
Complete the deployment by filling in the relevant information according to the template prompts.
Deploying from the Harbor High Availability Template
Deploying a high-availability Harbor instance requires higher resource configuration and provides a higher availability standard.
- Computing resources: CPU 16 cores, Memory 16 Gi
- Storage method: Uses storage class resources to store image files, background task logs, and image scanning vulnerability database
- Network access: Uses Ingress method, requires domain name specification
- Dependent services: Requires configuration of existing Redis and PostgreSQL access credentials
- Other settings: Requires account credentials configuration, SSO functionality is disabled by default
To achieve Harbor high availability, external dependencies must meet the following conditions:
- The `Redis` and `PostgreSQL` instances must be highly available
- The network load balancer must be highly available; when using ALB, a VIP must be configured
- The cluster must have more than 2 nodes
Complete the deployment by filling in the relevant information according to the template prompts.
Deploying from the Harbor Object Storage Template
Deploy a Harbor instance based on object storage.
- Compute resources: 8 CPU cores, 16 Gi memory
- Storage: Image files use object storage, and background task logs use database storage
- Network access: Use Ingress to access the service, and specify the domain name
- Dependencies: Configure existing Redis and PostgreSQL access credentials, and separately configure the access credentials for another independent database in the PG instance for the Gitaly component
- Other settings: Configure account credentials, SSO feature is disabled by default
In this template the Trivy scanner does not persist data; it mounts an emptyDir, so every pod restart downloads the vulnerability database again and temporarily blocks new scans until the sync finishes.
Verify that the object storage credentials you supply have the required S3 API permissions; see Object Storage Credentials.
Complete the deployment by filling in the relevant information according to the template prompts.
Deploying from YAML
YAML deployment is the most basic and most powerful deployment method. This section provides YAML snippets for each dimension from the Deployment Planning section, followed by two complete YAML examples for typical scenarios, to help users understand the YAML configuration and adjust it as needed.
High Availability (YAML Snippets)
In high availability mode, Harbor component replicas should be at least 2. The YAML configuration snippet is as follows:
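A minimal sketch of such a snippet, assuming the operator passes the community Harbor Helm chart's replica settings through `spec.helmValues` (the convention the Trivy `storageClass` path cited earlier in this document follows); verify the exact field layout against your operator's CRD:

```yaml
spec:
  helmValues:
    portal:
      replicas: 2
    core:
      replicas: 2
    jobservice:
      replicas: 2
    registry:
      replicas: 2
    trivy:
      replicas: 2
```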
Storage (YAML Snippets)
Harbor data storage mainly includes three parts:
- Registry: Manages and stores container images and artifacts. It handles image upload, download, and storage operations.
- Jobservice: Executes background tasks like image replication between registries, garbage collection, and other scheduled or on-demand jobs.
- Trivy: Performs vulnerability scans on container images to identify security issues and ensure compliance with security policies.
Currently, three storage configuration methods are supported: storage class, PVC, and local node storage. When using a storage class or PVC, the storage must support multi-node read and write (ReadWriteMany).
For Registry, Object Storage (S3) can also be used as the storage backend.
Jobservice supports storing job logs in multiple locations (file, database, stdout). If you do not output the jobservice logs to a file, there is no need to configure a storage backend for jobservice. See Configure Job Log Storage for details.
Storage class configuration snippet:
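A sketch of a storage class snippet, assuming community Harbor Helm chart persistence values under `spec.helmValues`; the storage class name and sizes are placeholders, and the `jobservice.jobLog` nesting matches recent chart versions:

```yaml
spec:
  helmValues:
    persistence:
      enabled: true
      persistentVolumeClaim:
        registry:
          storageClass: "<storage-class-name>"  # must support ReadWriteMany in HA mode
          accessMode: ReadWriteMany
          size: 100Gi
        jobservice:
          jobLog:
            storageClass: "<storage-class-name>"
            size: 5Gi
        trivy:
          storageClass: "<storage-class-name>"
          size: 5Gi
```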
PVC configuration snippet (PVCs need to be created in advance):
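A sketch of a PVC-based snippet under the same `spec.helmValues` assumption; the claim names are hypothetical and the PVCs must exist before deployment:

```yaml
spec:
  helmValues:
    persistence:
      enabled: true
      persistentVolumeClaim:
        registry:
          existingClaim: "harbor-registry-pvc"   # pre-created PVC (ReadWriteMany for HA)
        jobservice:
          jobLog:
            existingClaim: "harbor-joblog-pvc"
        trivy:
          existingClaim: "harbor-trivy-pvc"
```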
Local node storage configuration snippet:
Configure Object Storage (S3) as the Registry storage backend:
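A sketch of an S3 backend snippet, based on the community chart's `persistence.imageChartStorage` values; the region, bucket, and endpoint are placeholders, and in the community chart the referenced secret is expected to carry the `REGISTRY_STORAGE_S3_ACCESSKEY` and `REGISTRY_STORAGE_S3_SECRETKEY` keys:

```yaml
spec:
  helmValues:
    persistence:
      imageChartStorage:
        type: s3
        s3:
          region: us-east-1                              # placeholder region
          bucket: harbor-registry                        # bucket must be created in advance
          regionendpoint: http://minio.example.com:9000  # S3-compatible endpoint, e.g. MinIO/Ceph
          existingSecret: <object-storage-secret>        # secret must be created in advance
```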
- Use Amazon S3 or an S3-compatible service for object storage, such as MinIO or Ceph.
- The object storage bucket must be created in advance.
- The `<object-storage-secret>` secret must be created in advance.

For more details, see the S3 storage driver documentation.
If you want to use Ceph on the platform, please refer to .
Harbor currently only supports configuring the Registry component to use S3 storage. Other components will continue to use PVC or StorageClass for persistent storage.
Network Access (YAML Snippets)
Network access mainly includes two methods: domain name access and NodePort access.
Domain name access configuration snippet:
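A sketch of a domain name (Ingress) snippet, assuming community chart `expose` values under `spec.helmValues`; the domain is a placeholder:

```yaml
spec:
  helmValues:
    expose:
      type: ingress
      ingress:
        hosts:
          core: harbor.example.com        # placeholder domain; must resolve correctly
    externalURL: https://harbor.example.com
```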
NodePort access configuration snippet:
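A sketch of a NodePort snippet under the same assumption; the port and node IP are placeholders:

```yaml
spec:
  helmValues:
    expose:
      type: nodePort
      tls:
        enabled: false
      nodePort:
        ports:
          http:
            nodePort: 30002               # placeholder port; must be free on the nodes
    externalURL: http://192.168.1.10:30002  # placeholder node IP and port
```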
Redis Access Credential Configuration
After the Redis credentials secret resource has been configured, reference those credentials in the Harbor instance with the following snippets:
Standalone example:
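A sketch for a standalone Redis, assuming the community chart's external-Redis values under `spec.helmValues`; the address and secret name are hypothetical, and in the community chart the secret is expected to carry a `REDIS_PASSWORD` key:

```yaml
spec:
  helmValues:
    redis:
      type: external
      external:
        addr: "harbor-redis.harbor-system.svc:6379"  # placeholder Redis address
        existingSecret: "harbor-redis-credential"    # pre-created credentials secret
```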
Sentinel example:
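A sketch for a Sentinel-mode Redis under the same assumption; sentinel addresses, master set name, and secret name are placeholders:

```yaml
spec:
  helmValues:
    redis:
      type: external
      external:
        addr: "sentinel-0:26379,sentinel-1:26379,sentinel-2:26379"  # placeholder sentinels
        sentinelMasterSet: "mymaster"                               # placeholder master set
        existingSecret: "harbor-redis-credential"
```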
PostgreSQL Access Credential Configuration
After the PostgreSQL credentials secret resource has been configured, reference those credentials in the Harbor instance with the following snippet:
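A sketch assuming the community chart's external-database values under `spec.helmValues`; host, database, and secret names are hypothetical, and in the community chart the secret is expected to carry a `password` key:

```yaml
spec:
  helmValues:
    database:
      type: external
      external:
        host: "harbor-pg.harbor-system.svc"    # placeholder PostgreSQL address
        port: "5432"
        username: "harbor"                     # placeholder user
        coreDatabase: "registry"               # placeholder database name
        existingSecret: "harbor-pg-credential" # pre-created credentials secret
```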
Admin Account Configuration
After the account credentials secret resource has been configured, reference it in the Harbor instance with the following snippet:
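A sketch assuming the community chart's admin password values under `spec.helmValues`; the secret name is hypothetical, and `HARBOR_ADMIN_PASSWORD` is the chart's default key:

```yaml
spec:
  helmValues:
    existingSecretAdminPassword: "harbor-admin-credential"  # pre-created secret
    existingSecretAdminPasswordKey: "HARBOR_ADMIN_PASSWORD" # key inside the secret
```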
Complete YAML Example: Single Instance, Node Storage, NodePort Network Access
Complete YAML Example: High Availability, Storage Class, Ingress Network Access
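A complete sketch combining the snippets above, under the standing assumption that the operator accepts community Harbor Helm chart values via `spec.helmValues`; the `apiVersion`/`kind`, names, domains, and secret names are all placeholders to be replaced with the ones defined by your platform's Harbor Operator CRD:

```yaml
apiVersion: operator.example.io/v1alpha1  # placeholder; use your operator's API group/version
kind: Harbor                              # placeholder kind
metadata:
  name: harbor
  namespace: harbor-system
spec:
  helmValues:
    externalURL: https://harbor.example.com
    expose:
      type: ingress
      ingress:
        hosts:
          core: harbor.example.com
    portal:
      replicas: 2
    core:
      replicas: 2
    jobservice:
      replicas: 2
    registry:
      replicas: 2
    trivy:
      replicas: 2
    persistence:
      enabled: true
      persistentVolumeClaim:
        registry:
          storageClass: "<rwx-storage-class>"  # must support ReadWriteMany
          accessMode: ReadWriteMany
          size: 100Gi
        jobservice:
          jobLog:
            storageClass: "<rwx-storage-class>"
            size: 5Gi
        trivy:
          storageClass: "<rwx-storage-class>"
          size: 5Gi
    redis:
      type: external
      external:
        addr: "sentinel-0:26379,sentinel-1:26379,sentinel-2:26379"
        sentinelMasterSet: "mymaster"
        existingSecret: "harbor-redis-credential"
    database:
      type: external
      external:
        host: "harbor-pg.harbor-system.svc"
        port: "5432"
        username: "harbor"
        coreDatabase: "registry"
        existingSecret: "harbor-pg-credential"
    existingSecretAdminPassword: "harbor-admin-credential"
```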
Subsequent Operations
Configuring Single Sign-On (SSO)
You can change the authentication mode from database to OIDC only if no local users have been added to the database. If there is at least one user other than admin in the Harbor database, you cannot change the authentication mode.
For details, refer to Configure OIDC Provider Authentication.
Configuring SSO involves the following steps:
- Register an SSO authentication client in the global cluster
- Prepare SSO authentication configuration
- Configure the Harbor instance to use SSO authentication
Create the following OAuth2Client resource in the global cluster to register the SSO authentication client:
Edit the Harbor instance to add the following configuration:
Configuring HTTPS
After deploying the Harbor instance, HTTPS can be configured as needed.
First, create a TLS certificate secret in the namespace where the instance is located:
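A standard Kubernetes TLS secret works here; the name and namespace below are placeholders, and the certificate and key must be base64-encoded:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: harbor-tls          # placeholder; referenced later when enabling HTTPS
  namespace: harbor-system  # namespace where the Harbor instance lives
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate>
  tls.key: <base64-encoded private key>
```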
Then edit the Harbor instance's YAML configuration to enable HTTPS access:
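A sketch of the HTTPS change, assuming the community chart's `expose.tls` values under `spec.helmValues`; the secret name must match the TLS secret created in the previous step:

```yaml
spec:
  helmValues:
    externalURL: https://harbor.example.com  # switch the external URL to https
    expose:
      tls:
        enabled: true
        certSource: secret
        secret:
          secretName: harbor-tls             # the pre-created TLS secret
```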
Configuring Image Scanning Vulnerability Database Policy
Harbor's image scanning functionality is implemented by the Trivy component. Considering the user's network environment, the default vulnerability database policy for this component is to use the built-in offline vulnerability database. Since the vulnerability database is not updated, it cannot detect new vulnerabilities in a timely manner.
If you want to keep the vulnerability database up-to-date, you can edit the Harbor instance's YAML configuration to enable the online update policy (this policy requires access to GitHub):
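A sketch of the change, assuming the community chart's Trivy values under `spec.helmValues` (where `skipUpdate: false` enables online database downloads):

```yaml
spec:
  helmValues:
    trivy:
      skipUpdate: false   # fetch vulnerability database updates online (requires GitHub access)
      offlineScan: false
```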
After enabling the online update policy, Trivy will determine whether to update the vulnerability database before scanning based on the last update time. Since downloading the vulnerability database takes some time, if you don't need Java vulnerability scanning, you can also edit the Harbor instance's YAML configuration to disable Java vulnerability database updates:
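Under the same assumption, the Java vulnerability database update can be disabled separately:

```yaml
spec:
  helmValues:
    trivy:
      skipJavaDBUpdate: true  # keep main DB updates online, skip the Java DB download
```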
Additional Information
Deploying Harbor in IPv6 Environment
Harbor supports deployment in IPv6 environments, but you need to ensure that your client tools support IPv6. If you encounter an `invalid reference format` error, check whether your client tool version supports IPv6.
Related community issues: