linux-fsdevel.vger.kernel.org archive mirror
From: David Disseldorp <ddiss@suse.de>
To: linux-kbuild@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: linux-next@vger.kernel.org, ddiss@suse.de, nsc@kernel.org
Subject: [PATCH v3 7/8] gen_init_cpio: add -a <data_align> as reflink optimization
Date: Tue, 19 Aug 2025 13:05:50 +1000
Message-ID: <20250819032607.28727-8-ddiss@suse.de>
In-Reply-To: <20250819032607.28727-1-ddiss@suse.de>

As described in buffer-format.rst, the existing initramfs.c extraction
logic works fine if the cpio filename field is padded out with trailing
zeros, with a caveat that the padded namesize can't exceed PATH_MAX.

Add filename zero-padding logic to gen_init_cpio, which can be triggered
via the new -a <data_align> parameter. Performance and storage
utilization are improved for Btrfs and XFS workloads, as copy_file_range
can reflink the entire source file into a filesystem block-size aligned
destination offset within the cpio archive.

Btrfs benchmarks run on 6.15.8-1-default (Tumbleweed) x86_64 host:
  > truncate --size=2G /tmp/backing.img
  > /sbin/mkfs.btrfs /tmp/backing.img
  ...
  Sector size:        4096        (CPU page size: 4096)
  ...
  > sudo mount /tmp/backing.img mnt
  > sudo chown $USER mnt
  > cd mnt
  mnt> dd if=/dev/urandom of=foo bs=1M count=20 && cat foo >/dev/null
  ...
  mnt> echo "file /foo foo 0755 0 0" > list
  mnt> perf stat -r 10 gen_init_cpio -o unaligned_btrfs list
  ...
            0.023496 +- 0.000472 seconds time elapsed  ( +-  2.01% )

  mnt> perf stat -r 10 gen_init_cpio -o aligned_btrfs -a 4096 list
  ...
           0.0010010 +- 0.0000565 seconds time elapsed  ( +-  5.65% )

  mnt> /sbin/xfs_io -c "fiemap -v" unaligned_btrfs
  unaligned_btrfs:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..40967]:      695040..736007   40968   0x1
  mnt> /sbin/xfs_io -c "fiemap -v" aligned_btrfs
  aligned_btrfs:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..7]:          26768..26775         8   0x0
     1: [8..40967]:      269056..310015   40960 0x2000
     2: [40968..40975]:  26776..26783         8   0x1
  mnt> /sbin/btrfs fi du unaligned_btrfs aligned_btrfs
       Total   Exclusive  Set shared  Filename
    20.00MiB    20.00MiB       0.00B  unaligned_btrfs
    20.01MiB     8.00KiB    20.00MiB  aligned_btrfs

XFS benchmarks run on same host:
  > sudo umount mnt && rm /tmp/backing.img
  > truncate --size=2G /tmp/backing.img
  > /sbin/mkfs.xfs /tmp/backing.img
  ...
           =                       reflink=1    ...
  data     =                       bsize=4096   blocks=524288, imaxpct=25
  ...
  > sudo mount /tmp/backing.img mnt
  > sudo chown $USER mnt
  > cd mnt
  mnt> dd if=/dev/urandom of=foo bs=1M count=20 && cat foo >/dev/null
  ...
  mnt> echo "file /foo foo 0755 0 0" > list
  mnt> perf stat -r 10 gen_init_cpio -o unaligned_xfs list
  ...
            0.011069 +- 0.000469 seconds time elapsed  ( +-  4.24% )

  mnt> perf stat -r 10 gen_init_cpio -o aligned_xfs -a 4096 list
  ...
            0.001273 +- 0.000288 seconds time elapsed  ( +- 22.60% )

  mnt> /sbin/xfs_io -c "fiemap -v" unaligned_xfs
   unaligned_xfs:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..40967]:      106176..147143   40968   0x0
     1: [40968..65023]:  147144..171199   24056 0x801
  mnt> /sbin/xfs_io -c "fiemap -v" aligned_xfs
   aligned_xfs:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..7]:          120..127             8   0x0
     1: [8..40967]:      192..41151       40960 0x2000
     2: [40968..40975]:  236728..236735       8   0x0
     3: [40976..106495]: 236736..302255   65520 0x801

The alignment is best-effort; a stderr message is printed if alignment
can't be achieved due to PATH_MAX overrun, with fallback to non-padded
filename. This allows it to still be useful for opportunistic alignment,
e.g. on aarch64 Btrfs with 64K block-size. Alignment failure messages
provide an indicator that reordering of the cpio-manifest may be
beneficial.

Archive read performance for reflinked initramfs images may suffer due
to the effects of fragmentation, particularly on spinning disks. To
mitigate excessive fragmentation, files no larger than data_align
aren't padded.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
---
 usr/gen_init_cpio.c | 49 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 11 deletions(-)

diff --git a/usr/gen_init_cpio.c b/usr/gen_init_cpio.c
index 729585342e16e..75e9561ba3139 100644
--- a/usr/gen_init_cpio.c
+++ b/usr/gen_init_cpio.c
@@ -28,13 +28,15 @@
 #define CPIO_TRAILER "TRAILER!!!"
 #define padlen(_off, _align) (((_align) - ((_off) & ((_align) - 1))) % (_align))
 
-static char padding[512];
+/* zero-padding the filename field for data alignment is limited by PATH_MAX */
+static char padding[PATH_MAX];
 static unsigned int offset;
 static unsigned int ino = 721;
 static time_t default_mtime;
 static bool do_file_mtime;
 static bool do_csum = false;
 static int outfd = STDOUT_FILENO;
+static unsigned int dalign;
 
 struct file_handler {
 	const char *type;
@@ -359,7 +361,7 @@ static int cpio_mkfile(const char *name, const char *location,
 	int file, retval, len;
 	int rc = -1;
 	time_t mtime;
-	int namesize;
+	int namesize, namepadlen;
 	unsigned int i;
 	uint32_t csum = 0;
 	ssize_t this_read;
@@ -407,14 +409,27 @@ static int cpio_mkfile(const char *name, const char *location,
 	}
 
 	size = 0;
+	namepadlen = 0;
 	for (i = 1; i <= nlinks; i++) {
-		/* data goes on last link */
-		if (i == nlinks)
-			size = buf.st_size;
-
 		if (name[0] == '/')
 			name++;
 		namesize = strlen(name) + 1;
+
+		/* data goes on last link, after any alignment padding */
+		if (i == nlinks)
+			size = buf.st_size;
+
+		if (dalign && size > dalign) {
+			namepadlen = padlen(offset + CPIO_HDR_LEN + namesize,
+					    dalign);
+			if (namesize + namepadlen > PATH_MAX) {
+				fprintf(stderr,
+					"%s: best-effort alignment %u missed\n",
+					name, dalign);
+				namepadlen = 0;
+			}
+		}
+
 		len = dprintf(outfd, "%s%08X%08X%08lX%08lX%08X%08lX"
 		       "%08lX%08X%08X%08X%08X%08X%08X",
 			do_csum ? "070702" : "070701", /* magic */
@@ -429,13 +444,13 @@ static int cpio_mkfile(const char *name, const char *location,
 			1,			/* minor */
 			0,			/* rmajor */
 			0,			/* rminor */
-			namesize,		/* namesize */
+			namesize + namepadlen,	/* namesize */
 			size ? csum : 0);	/* chksum */
 		offset += len;
 
 		if (len != CPIO_HDR_LEN ||
 		    push_buf(name, namesize) < 0 ||
-		    push_pad(padlen(offset, 4)) < 0)
+		    push_pad(namepadlen ? namepadlen : padlen(offset, 4)) < 0)
 			goto error;
 
 		if (size) {
@@ -552,7 +567,7 @@ static int cpio_mkfile_line(const char *line)
 static void usage(const char *prog)
 {
 	fprintf(stderr, "Usage:\n"
-		"\t%s [-t <timestamp>] [-c] [-o <output_file>] <cpio_list>\n"
+		"\t%s [-t <timestamp>] [-c] [-o <output_file>] [-a <data_align>] <cpio_list>\n"
 		"\n"
 		"<cpio_list> is a file containing newline separated entries that\n"
 		"describe the files to be included in the initramfs archive:\n"
@@ -590,7 +605,10 @@ static void usage(const char *prog)
 		"The default is to use the current time for all files, but\n"
 		"preserve modification time for regular files.\n"
 		"-c: calculate and store 32-bit checksums for file data.\n"
-		"<output_file>: write cpio to this file instead of stdout\n",
+		"<output_file>: write cpio to this file instead of stdout\n"
+		"<data_align>: attempt to align file data by zero-padding the\n"
+		"filename field up to data_align. Must be a multiple of 4.\n"
+		"Alignment is best-effort; PATH_MAX limits filename padding.\n",
 		prog);
 }
 
@@ -632,7 +650,7 @@ int main (int argc, char *argv[])
 
 	default_mtime = time(NULL);
 	while (1) {
-		int opt = getopt(argc, argv, "t:cho:");
+		int opt = getopt(argc, argv, "t:cho:a:");
 		char *invalid;
 
 		if (opt == -1)
@@ -661,6 +679,15 @@ int main (int argc, char *argv[])
 				exit(1);
 			}
 			break;
+		case 'a':
+			dalign = strtoul(optarg, &invalid, 10);
+			if (!*optarg || *invalid || (dalign & 3)) {
+				fprintf(stderr, "Invalid data_align: %s\n",
+						optarg);
+				usage(argv[0]);
+				exit(1);
+			}
+			break;
 		case 'h':
 		case '?':
 			usage(argv[0]);
-- 
2.43.0


Thread overview: 15+ messages
2025-08-19  3:05 [PATCH v3 0/8] gen_init_cpio: add copy_file_range / reflink support David Disseldorp
2025-08-19  3:05 ` [PATCH v3 1/8] gen_init_cpio: write to fd instead of stdout stream David Disseldorp
2025-08-19  3:05 ` [PATCH v3 2/8] gen_init_cpio: support -o <output_file> parameter David Disseldorp
2025-08-19  3:05 ` [PATCH v3 3/8] gen_init_cpio: attempt copy_file_range for file data David Disseldorp
2025-08-19  3:05 ` [PATCH v3 4/8] gen_init_cpio: avoid duplicate strlen calls David Disseldorp
2025-08-19  3:05 ` [PATCH v3 5/8] gen_initramfs.sh: use gen_init_cpio -o parameter David Disseldorp
2025-08-19  3:05 ` [PATCH v3 6/8] docs: initramfs: file data alignment via name padding David Disseldorp
2025-08-19  3:05 ` David Disseldorp [this message]
2025-08-19  3:05 ` [PATCH v3 8/8] initramfs_test: add filename padding test case David Disseldorp
2025-08-19 20:16   ` kernel test robot
2025-08-20  1:13     ` David Disseldorp
2025-08-20 21:02       ` Nicolas Schier
2025-08-21  5:04         ` David Disseldorp
2025-08-21  5:40           ` Nicolas Schier
2025-08-21 19:09 ` [PATCH v3 0/8] gen_init_cpio: add copy_file_range / reflink support Nathan Chancellor
