CephFS: Migrating Files Between Pools

When I started with CephFS, I didn't have a good plan for how I wanted subfolders to map to different Ceph pools. I had different kinds of data in the file system, so I knew I wanted some of it to be on fast NVMe storage with simple replication, and other bulk files to be on higher capacity SATA SSD storage with erasure coding to reduce the storage required.

CephFS has the concept of File Layouts. Essentially, these are extended attributes on files or directories that give CephFS hints about how it should handle file storage. One of the available fields indicates which data pool should be used to store a file. If you add that field to a directory, it applies to any files within that directory tree (unless an individual file or subfolder overrides the field).
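For illustration, here is a minimal Python sketch of assigning a pool to a directory by writing the ceph.dir.layout.pool attribute. The mount path and pool name in the usage comment are made-up placeholders, and the setxattr parameter exists only so the function can be tried without a CephFS mount. The pool also has to be attached to the file system as a data pool before CephFS will accept it.

```python
import os


def set_dir_pool(path, pool, setxattr=None):
    """Point new files under `path` at `pool` via the CephFS layout attribute."""
    # os.setxattr is Linux-only; the override is a testing hook (an assumption
    # of this sketch, not part of any Ceph tooling).
    setter = setxattr if setxattr is not None else os.setxattr
    setter(path, "ceph.dir.layout.pool", pool.encode())


# Hypothetical usage on a mounted file system (path and pool are placeholders):
# set_dir_pool("/mnt/cephfs/media", "cephfs_ec_data")
```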

So I did this when I started out, separating my terabytes of media files from my smaller amount of more performance-sensitive container data files.

But I didn't get it right. Oops.

After I had populated the file system with 20+ TB of files, I changed my mind about which pools I wanted to use. It's easy to change those extended attributes anytime you want, but that doesn't affect existing files—only new ones.

To get existing files to move to the newly assigned pools, you essentially have to recreate them so that CephFS sees them as new files and puts them in the right place.

I wanted to migrate files from their current pool to the newly assigned pool, and I didn't want to do it by hand.

After some searching for solutions, I found a piece of Python code written by Peter Woodman that sort of did this, but it didn't work exactly how I wanted. However, it was good inspiration.

I'm not usually a Python programmer, so I turned to ChatGPT, and it helped me create a similar standalone script that systematically processes a directory tree to move existing files to a new pool.

The script I ended up using is published on GitHub: Migrate files in cephfs to a new file layout pool recursively (github.com)

In simple terms, the script does the following:

  • Recursively loops through all files and folders, starting with the current folder

  • For each file, checks if it is already in the desired pool by reading the virtual attribute. If it's already where it is supposed to be, skips it

  • If the file is in the wrong pool and is a regular file, copies it to a scratch folder, then moves the copy back over the original location. Because the copy is created as a new file, CephFS places it in the newly assigned pool

  • Restores file ownership and permissions after the copy/move

  • Handles symlinks and hard links appropriately as well
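The per-file steps above can be sketched roughly as follows. This is not the published script, just a simplified illustration: the function names and the get_pool parameter are my own, the scratch folder is assumed to live on the same CephFS and to inherit the new layout, and symlinks and hard-linked files are simply skipped rather than handled.

```python
import os
import shutil
import stat
import uuid


def current_pool(path):
    """Read the data pool CephFS reports for a file (CephFS mounts on Linux only)."""
    return os.getxattr(path, "ceph.file.layout.pool").decode()


def migrate_file(path, scratch_dir, target_pool, get_pool=current_pool):
    """Rewrite one regular file so it is recreated under the current layout.

    Returns True if the file was rewritten, False if it was skipped.
    """
    st = os.lstat(path)
    # Simplification: skip symlinks, special files, and hard-linked files.
    if not stat.S_ISREG(st.st_mode) or st.st_nlink > 1:
        return False
    if get_pool(path) == target_pool:
        return False  # already where it is supposed to be

    tmp = os.path.join(scratch_dir, uuid.uuid4().hex)
    shutil.copy2(path, tmp)                  # new file: data, mode, timestamps
    os.chown(tmp, st.st_uid, st.st_gid)      # copy2 does not carry ownership over
    os.chmod(tmp, stat.S_IMODE(st.st_mode))  # chown may clear setuid/setgid bits
    os.replace(tmp, path)                    # rename back over the original
    return True
```

The rename at the end only works when the scratch folder and the original are on the same file system, which is also what keeps the move cheap: only the copy step touches the file's data.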

All of this is parallelized so that the Ceph backend can be kept busy, by default processing 4 files simultaneously. Since every file copy essentially reads and rewrites the entire file, this is expensive on I/O. The parallelization helps ensure there is always something being copied, even during brief gaps where metadata is being checked, etc.
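One way to get that kind of parallelism in Python is a thread pool, since the work is I/O-bound. The sketch below is an assumption about the general shape, not the script's actual code: migrate_tree and iter_files are illustrative names, and migrate_one stands in for whatever per-file function does the copy.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def iter_files(root):
    """Yield the path of every file under root, recursively."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def migrate_tree(root, migrate_one, workers=4):
    """Run migrate_one(path) on every file under root using a thread pool.

    Returns how many calls reported that they rewrote a file.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(migrate_one, iter_files(root))
        return sum(1 for rewritten in results if rewritten)
```

Note that pool.map queues every path up front, which is fine for a sketch but holds the whole file list in memory on a very large tree.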

Ultimately, this worked even though it was slow (reading and rewriting 20+ TB of data takes a while). But it was automatic and happened in the background, and I didn't have to manually re-populate my file system from scratch, which is what I wanted to avoid.