Integrating IBM Storage Ceph with PoINT Archival Gateway for Policy-Based S3-to-Tape Archiving

Introduction

Overview of Archival and Tiered Storage Challenges

As data growth accelerates, organizations increasingly rely on archival and tiered storage to manage capacity, cost, performance and compliance efficiently. These storage models aim to balance frequently accessed “hot” data with rarely used “cold” data by placing them on appropriate storage tiers.

However, implementing and managing archiving and tiering pose several challenges. Storage systems must integrate different storage technologies (flash, disk and tape) so that data is stored on the technology best suited to its importance and use. Tiering and archiving must run automatically on the basis of policies. It is also important that the storage systems support standardized protocols.

The combination of IBM Storage Ceph and PoINT Archival Gateway addresses these challenges. PoINT Archival Gateway integrates tape storage homogeneously into a Ceph cluster via the standardized S3 interface. This connection makes it possible to fulfill archiving and tiering requirements in a single, consistent system.

Introducing PoINT Archival Gateway

Overview and Key Features

PoINT Archival Gateway (PAG) is a high-performance, scalable S3 object store on tape. The software connects S3-capable storage systems such as IBM Storage Ceph with tape libraries as target storage.

The basic functions of PAG include user, data and storage management, as well as access control, logging and monitoring. PAG allows direct writing to tape media. No expensive disk caches are required. Optional integration of an additional disk/flash-based storage class is possible to meet the demands of use cases that require fast data access. Internal tiering using the standardized S3 Lifecycle Policies ensures optimized data and storage management.

Key Features:

Introducing IBM Storage Ceph Object Storage

Overview and Key Features

IBM Storage Ceph is an Enterprise-grade software-defined storage solution built for data-intensive applications. Designed for hybrid cloud, it empowers organizations to modernize infrastructure and reduce costs with flexible deployment in the data center or as a service.

Ceph provides a single, efficient, unified storage platform for object, block, and file storage with Enterprise support and services, certified updates, and service level agreements for production environments.

IBM Storage Ceph installs and runs on industry-standard x86 server hardware from your preferred hardware vendor.

Key Features:

IBM Storage Ceph Object Tiering Capabilities

Ceph offers object storage tiering capabilities to optimize cost and performance by seamlessly moving data between storage classes. These tiers can be configured locally within an on-premises infrastructure or extended to include cloud-based storage classes, providing a flexible and scalable solution for diverse workloads. With policy-based automation, administrators can define lifecycle policies to migrate data between high-performance storage and cost-effective archival tiers, ensuring the right balance of speed, durability, and cost-efficiency.

Benefits of Integrating PAG with IBM Storage Ceph Object Storage

PAG allows a homogeneous integration of a tape storage class into a Ceph cluster. In this way, a multi-tier configuration with tape as an active archive tier can be realized. Ceph supports policy-based data archival and retrieval capabilities that integrate PAG as an S3 tape endpoint for long-term retention, disaster recovery, or cost-optimized cold storage. By leveraging policy-based automation, Ceph ensures that data is moved to PAG and thus to tape according to predefined lifecycle rules. PAG ensures efficient tape integration in Ceph, as no additional disk storage class is required.

The benefits of the combined Ceph and PAG solution are:

PAG and IBM Storage Ceph Integration Workflow

IBM Storage Ceph 8.0 introduced policy-based data retrieval, which marks a significant evolution in its capabilities and is now available as a Tech Preview. This enhancement enables users to retrieve archived objects from S3 Tape endpoints like PAG directly into their on-prem Ceph environment.

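Data can be restored as temporary or permanent objects. A temporary restore keeps the object available in the Ceph cluster for a requested number of days and leaves its storage class unchanged. A permanent restore returns the object to the STANDARD storage class with no expiration date.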

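This retrieval of objects can be done in two different ways: explicitly, through the S3 RestoreObject API, or transparently through read-through retrieval, where a standard S3 GET request on a transitioned object triggers the restore.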

Use Cases for Policy-Based Archive & Retrieval from Tape

Long-Term Regulatory Compliance

Media & Content Archiving

Scientific & HPC Research

Cybersecurity & Ransomware Protection

Multi-Cloud & Hybrid Strategies

Increasing Data Security and Performance by Erasure Coding on Tape

Data security on the tape media is provided by Erasure Coding, which stores blocks of data redundantly on multiple media so that data is not lost even if one medium fails. PoINT Archival Gateway supports the Erasure Code (EC) rates 1/2, 1/3, 1/4, 2/3, 2/4 and 3/4. Data security and redundancy can be further increased by using two, three or four tape media in parallel in the tape storage class. Such a combination of multiple media is called a Protected Volume Array. A Protected Volume Array consisting of N tape media can also extend over N tape libraries.

The EC rates 1/2, 1/3 and 1/4 indicate the automatic creation of copies. For the tape storage class, this means that multiple tape copies can be created (even in different libraries).

In addition to increased redundancy, throughput rates can be significantly increased with EC rates that distribute data across multiple media (EC 2/3, 2/4 and 3/4).
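
To make the rate notation concrete, the following sketch assumes that a rate k/n means each block is encoded into n fragments, one per tape medium, of which any k are enough to read the data. This is the usual reading of a code rate and matches the description above, but the function name and the derived figures are illustrative assumptions, not part of PAG.

```python
from fractions import Fraction

# Rates listed in the text as supported by PoINT Archival Gateway.
SUPPORTED_RATES = ("1/2", "1/3", "1/4", "2/3", "2/4", "3/4")


def ec_profile(rate: str) -> dict:
    """Describe an erasure-code rate written as 'k/n'.

    Assumption: k = fragments needed to read, n = media written in parallel.
    """
    k, n = (int(part) for part in rate.split("/"))
    if not 0 < k < n:
        raise ValueError(f"invalid rate: {rate}")
    return {
        "media": n,                        # size of the Protected Volume Array
        "tolerated_media_failures": n - k,
        "raw_per_usable": Fraction(n, k),  # tape capacity used per unit of data
        "full_copies": k == 1,             # 1/n rates write n complete copies
        "striped": k > 1,                  # data is spread over several media
    }


for rate in SUPPORTED_RATES:
    profile = ec_profile(rate)
    print(rate, profile["media"], profile["tolerated_media_failures"],
          profile["raw_per_usable"])
```

Under this reading, EC 3/4 survives the loss of one medium while consuming 4/3 of the data size in raw tape capacity, whereas EC 1/3 keeps three full copies.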

Deployment Guide (Hands-on Section)

Installing PoINT Archival Gateway on RHEL 9.3

PoINT Archival Gateway (PAG) can be installed on several servers (multi-node installation) in the Enterprise Edition or on one server in the Compact Edition. The following section describes the deployment of the Compact Edition.

Installing the PAG Compact Edition on RHEL 9.3 begins by transferring the installation tarball to your server and extracting its contents. After unarchiving, the next step is to install any required .NET runtimes and configure systemd services so PAG can run in the background.

Here is an example:

After installing the files and dependencies, it is necessary to update the PAG configuration files in order to reflect the correct IP addresses, ports, and license key. The primary changes typically occur in /etc/opt/PoINT/PAG/CGN/pag-cgn.conf for the S3 REST API and /etc/opt/PoINT/PAG/GUI/pag-gui.conf for the administrative GUI. An example edit might look like this:

Likewise, editing the GUI configuration file might involve similar IP updates:

Once the configurations are in place, the services can be enabled and started:

Once you have confirmed that the services are running, you can access the PAG GUI through HTTPS on the configured IP address and port. You can then log in with the default admin credentials, enter your License Key, and activate the software through the “System Management” → “Information” section in the PAG GUI.

After licensing is complete, creating a partition and Object Repository in the PAG interface will prepare the backend for storing objects on tape.

The menu command “Storage Management” → “Storage Partitions” gives you an overview of all created Storage Partitions.

To create a new Storage Partition, click “Create Partition” and fill out the dialog that opens.

The menu command “Storage Management” → “Object Repositories” gives you an overview of all created Object Repositories (Buckets).

To create a new Object Repository (Bucket), click “Create Object Repository” and fill out the dialog that opens.

Setting up a user with HMAC credentials will allow Ceph to authenticate against PAG’s S3 endpoint.

Integrating PAG as a Storage Class within Ceph RGW involves configuring a cloud-tier placement for tape using the standard Ceph CLI. Adding a new point-tape storage class to the default placement looks like this:
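
As a rough sketch of this step, the snippet below assembles the two radosgw-admin calls that the upstream Ceph cloud-transition documentation uses to define a cloud-s3 storage class and attach its tier configuration. The endpoint, credentials and target_path values are placeholders we made up for illustration, and the set of tier-config keys your release accepts should be checked against its documentation. The script only prints the commands so they can be reviewed before being run.

```python
import shlex

# All values below are placeholders (assumptions); substitute your own.
tier_config = {
    "endpoint": "https://pag.example.com:4443",  # hypothetical PAG S3 endpoint
    "access_key": "PAG_ACCESS_KEY",              # HMAC user created in PAG
    "secret": "PAG_SECRET_KEY",
    "target_path": "ceph-archive",               # hypothetical Object Repository
    "retain_head_object": "true",                # keep a stub in Ceph after transition
}

common = [
    "--rgw-zonegroup=default",
    "--placement-id=default-placement",
    "--storage-class=point-tape",
]

commands = [
    # Step 1: declare the storage class as a cloud-s3 tier.
    ["radosgw-admin", "zonegroup", "placement", "add", *common,
     "--tier-type=cloud-s3"],
    # Step 2: attach the remote endpoint settings as key=value pairs.
    ["radosgw-admin", "zonegroup", "placement", "modify", *common,
     "--tier-config=" + ",".join(f"{k}={v}" for k, v in tier_config.items())],
]

for cmd in commands:
    print(shlex.join(cmd))  # review, then run on a node with admin access
```

Keeping the settings in a dictionary makes it easy to add further keys, for example those that control restore behaviour, without retyping a long comma-separated string.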

For a full description of all the configuration parameters available check this link.

We can list our new zonegroup placement configuration with the following command:

NOTE: If you have not done any previous multisite configuration, a default zone and zonegroup are created for you, and changes to the zone/zonegroup will not take effect until the Ceph Object Gateways are restarted. If you have created a realm for multisite, the zone/zonegroup changes take effect once they are committed with 'radosgw-admin period update --commit'.

Next comes the creation of a bucket and the assignment of a lifecycle policy.

This policy will automatically transition objects from the STANDARD tier to point-tape after a specified number of days. We first create a bucket called 'dataset':

The contents of point-tape-lc.json might resemble the following:
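
A minimal sketch of such a file, generated with Python so that the structure is easy to adapt: one enabled rule with an empty prefix filter (so it applies to every object) and a single transition to the point-tape storage class. The rule ID and the 30-day value are arbitrary examples we chose, not values from the original setup.

```python
import json


def tape_transition_policy(days: int, storage_class: str = "point-tape") -> dict:
    """Build an S3 lifecycle configuration with a single transition rule."""
    return {
        "Rules": [
            {
                "ID": "transition-to-tape",   # illustrative rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},     # empty prefix: all objects
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class}
                ],
            }
        ]
    }


policy = tape_transition_policy(days=30)  # 30 is an arbitrary example value

# Write the file that the lifecycle command will reference.
with open("point-tape-lc.json", "w") as fh:
    json.dump(policy, fh, indent=2)

print(json.dumps(policy, indent=2))
```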

To apply the lifecycle configuration to the 'dataset' bucket, you can use the AWS CLI:
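
One way this step could look is sketched below. The helper builds `aws s3api` command lines for applying the policy and reading it back. The endpoint URL is a hypothetical placeholder for your Ceph Object Gateway, and the helper function is ours, not part of the AWS CLI. The commands are printed rather than executed.

```python
import shlex

ENDPOINT = "https://ceph-rgw.example.com"  # hypothetical RGW endpoint


def s3api(operation: str, **options: str) -> list[str]:
    """Return an `aws s3api` command line as an argument list."""
    cmd = ["aws", "--endpoint-url", ENDPOINT, "s3api", operation]
    for name, value in options.items():
        # Python keyword names use underscores; CLI options use dashes.
        cmd += ["--" + name.replace("_", "-"), value]
    return cmd


put_lc = s3api("put-bucket-lifecycle-configuration",
               bucket="dataset",
               lifecycle_configuration="file://point-tape-lc.json")
get_lc = s3api("get-bucket-lifecycle-configuration", bucket="dataset")

for cmd in (put_lc, get_lc):
    print(shlex.join(cmd))
```

Reading the configuration back with the second command is a quick way to confirm that the gateway accepted the rule as written.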

Testing the integrated setup includes verifying that newly uploaded objects transition to the PAG tape tier according to the lifecycle rules. You can upload a file to the bucket and confirm its presence with:

The Ceph lifecycle daemon will run at scheduled intervals. After it completes, you can check whether objects have transitioned. The size of the object in the Ceph bucket will now be 0, as only the stub remains in the local Ceph cluster, and the 'StorageClass' will now be 'point-tape':
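
The two conditions just described can be checked programmatically. The sketch below assumes the JSON shape returned by `aws s3api head-object` (fields ContentLength and StorageClass). The sample response is hypothetical, and the function name is ours.

```python
import json


def is_on_tape(head_object: dict, tape_class: str = "point-tape") -> bool:
    """True if a head-object response describes a stub whose data is on tape."""
    return (head_object.get("StorageClass") == tape_class
            and head_object.get("ContentLength") == 0)


# Hypothetical response, shaped like `aws s3api head-object` JSON output.
sample = json.loads('{"ContentLength": 0, "StorageClass": "point-tape"}')
print(is_on_tape(sample))
```

Using `.get()` keeps the check safe for objects that have not transitioned yet, since S3 responses may omit the StorageClass field for objects in the STANDARD class.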

In the output, 'StorageClass' should change to 'point-tape' for objects that have been migrated to the PAG tier. Validating the actual data in the PAG backend is done by querying the bucket path in PAG via its S3 REST API:

The object retrieval workflow can then be tested by triggering a restore. A restore request can be made with the restore-object API call. We will first test with a temporary restore: the object will be available in our Ceph cluster for three days, and the expiry date is part of the restored object's metadata:
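
The difference between the two restore types comes down to the request body: a Days value requests a temporary restore, and an empty request requests a permanent one. A small sketch, assuming the AWS CLI is the client; the object key 'hosts' is a made-up example and the helper names are ours.

```python
import json
import shlex
from typing import Optional


def restore_request(days: Optional[int] = None) -> dict:
    """Body for RestoreObject: {'Days': n} is temporary, {} is permanent."""
    return {} if days is None else {"Days": days}


def restore_command(bucket: str, key: str, days: Optional[int] = None) -> list:
    """Build the `aws s3api restore-object` command as an argument list."""
    return ["aws", "s3api", "restore-object",
            "--bucket", bucket, "--key", key,
            "--restore-request", json.dumps(restore_request(days))]


# Temporary restore for three days; 'hosts' is a made-up object key.
print(shlex.join(restore_command("dataset", "hosts", days=3)))
# Permanent restore: no Days value in the request.
print(shlex.join(restore_command("dataset", "hosts")))
```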

You can later confirm that the restored object is accessible and listed in Ceph. Because this is a temporary restore, the storage class is not modified: it is still 'point-tape', and the object is not part of LC policies or multisite replication:
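
To read the expiry date of a temporary restore from the object metadata, the sketch below parses a value in the format that S3 uses for its restore header. Whether your gateway reports it in exactly this form is an assumption to verify against real head-object output. The sample string is hypothetical.

```python
import re
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


def restore_expiry(restore_field: str) -> Optional[datetime]:
    """Extract expiry-date from an S3 Restore value; None if absent."""
    match = re.search(r'expiry-date="([^"]+)"', restore_field)
    return parsedate_to_datetime(match.group(1)) if match else None


# Hypothetical value in the format S3 uses for the x-amz-restore header.
sample = 'ongoing-request="false", expiry-date="Fri, 21 Mar 2025 00:00:00 GMT"'
print(restore_expiry(sample))
```

A missing expiry date, together with a storage class other than 'point-tape', would indicate a permanent restore or an object that was never transitioned.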

If we don’t specify the days in the restore request, this will trigger a permanent restore of the object. As an example, we upload a new file into our 'dataset' bucket:

After the LC policy kicks in, the object is transitioned to tape, as we can see from the storage class in the output of the head-object call:

We will now use the RestoreObject API call without specifying the number of days in the restore-request field, which makes the restore permanent:

Because the restore is permanent, the storage class has gone back to 'STANDARD', and there is no expiration date for the restore:

Conclusion

By following the above approach, you can deploy the PoINT Archival Gateway, integrate it into IBM Storage Ceph as a new tape storage tier, and validate the entire lifecycle workflow, from upload and automatic migration to restore and verification. This combined solution reduces storage costs, enhances data protection and compliance, and provides on-premises tape capabilities through a familiar S3 interface.

White Paper