From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 348D919BBC for ; Thu, 26 Sep 2024 13:29:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727357391; cv=none; b=XnsflpYtNwYcPbs+o51fL2G7F04HZBtr/pgYbkd+S1bjkfd2zWIAYJKGcvCEZ18O0iCX8IZx2Y7AaELGP+TgkJ9xZvhkwiw1Ucb66NNX/6DP2+alXCJM+M8Nl4Bo71nk+hm3Sh5/hHE8x3JCs8J0h5Je7G4B5Rc+sVVcrszoWNc= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727357391; c=relaxed/simple; bh=Foi08EWuW9jYSpb3YwQREn/oEILyic+4FEEeCMb3Q1c=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=dwSdZJcMo2/UYywruAETRZ1gkPVBD+q3dqKD6DMFV8fuF46URjGm0WCL0rFY3FVqR4xdMokoVf9J7g2ha8Iq3TAj/U+9R7ZTJTCwLkgETxvfr4WIyjvRjUQoj7chEuXNABJ3+kJx62DPUQjpp/Lii5NCPmjb8+46IZ1KcegbxCs= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=cKu5TqGt; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="cKu5TqGt" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1727357388; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=KQ6wLyvFf1DC/0sU41emA5coECG/jjMncWe18jRkzK0=; b=cKu5TqGtcOF/vSomE5Ep9DvDfHYsRwkEeeteuJvyfekp08EesU5Kg/ZWnajKmWvpL+x+0q OsZ4jFI4jvKAiQOkRZOPfdMJkRSV6MnCQUvWixsdh+Atya7576yOmaAbESw0wgy0i2s/aP DXbmkeT/SCDw9KxW0xVyG5U8QoyMloY= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-647-FDhl_8GHO6e8WfatpfW1kQ-1; Thu, 26 Sep 2024 09:29:46 -0400 X-MC-Unique: FDhl_8GHO6e8WfatpfW1kQ-1 Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (unknown [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 815331902F77 for ; Thu, 26 Sep 2024 13:29:44 +0000 (UTC) Received: from bfoster (unknown [10.22.32.70]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id DEA2219373D7; Thu, 26 Sep 2024 13:29:43 +0000 (UTC) Date: Thu, 26 Sep 2024 09:30:53 -0400 From: Brian Foster To: Zorro Lang Cc: fstests@vger.kernel.org Subject: Re: [PATCH RFC] fsx: support unshare range fallocate mode Message-ID: References: <20240906185606.136402-1-bfoster@redhat.com> <20240920145839.vphm5uoiptfgnad5@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com> Precedence: bulk X-Mailing-List: fstests@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240920145839.vphm5uoiptfgnad5@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com> X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 On Fri, Sep 20, 2024 at 10:58:39PM +0800, Zorro Lang wrote: > On Fri, Sep 06, 2024 at 02:56:06PM -0400, Brian Foster wrote: > > The fallocate unshare mode flag modifies traditional preallocate > > mode to unshare any shared extents backing the target range. Without > > the unshare flag, preallocate mode simply assures that blocks are > > physically allocated, regardless of whether they might be shared. > > Unshare mode behaves the same as preallocate mode outside of the > > shared extent case. > > > > Since unshare is fundamentally a modifier to preallocate mode, > > enable it via an operation flag. Similar to keep size mode, select > > it randomly for fallocate operations and track it via a flag and > > string combination for operation logging and replay. > > > > Unshare is mainly used for filesystems that support reflink, but the > > operation is equivalent to preallocate mode for non-shared ranges, > > so enable it by default. Filesystems that do not support the > > fallocate flag (such as those that might not support reflink) will > > fail the test operation and disable unshare calls at runtime. Also > > provide a new command line option to explicitly disable unshare > > calls. > > > > Signed-off-by: Brian Foster > > --- > > > > Hi all, > > > > A couple quick notes.. > > > > First, this applies on top of the currently pending EOF pollution > > patches [1] to fsx. > > The patchset [1] has been merged, thanks for doing that. > Great, thanks. > > > > Second, this triggers multiple new test failures in my initial tests > > that I've not fully diagnosed. I'm not necessarily sure that we want to > > hold off on added test coverage for that reason, but I'd like to at > > least be sure this isn't due to this patch doing something blatantly > > wrong. > > > > That said, this has also uncovered one or two unshare related issues, so > > I'm posting this slightly prematurely in order to share it and hopefully > > get some review.. > > > > Brian > > > > [1] https://lore.kernel.org/fstests/20240828181534.41054-1-bfoster@redhat.com/ > > > > ltp/fsx.c | 59 ++++++++++++++++++++++++++++++++++++++++++------------- > > 1 file changed, 45 insertions(+), 14 deletions(-) > > > > diff --git a/ltp/fsx.c b/ltp/fsx.c > > index 1ba1bf65..003990db 100644 > > --- a/ltp/fsx.c ... > > @@ -1879,15 +1887,25 @@ do_copy_range(unsigned offset, unsigned length, unsigned dest) > > #ifdef HAVE_LINUX_FALLOC_H > > /* fallocate is basically a no-op unless extending, then a lot like a truncate */ > > void > > -do_preallocate(unsigned offset, unsigned length, int keep_size) > > +do_preallocate(unsigned offset, unsigned length, int keep_size, int unshare) > > { > > unsigned end_offset; > > + enum opflags opflags = FL_NONE; > > + int mode = 0; > > + > > + if (keep_size) { > > + opflags |= FL_KEEP_SIZE; > > + mode |= FALLOC_FL_KEEP_SIZE; > > + } > > + if (unshare) { > > + opflags |= FL_UNSHARE; > > + mode |= FALLOC_FL_UNSHARE_RANGE; > > The FALLOC_FL_UNSHARE_RANGE is not a new flag, but it might be not old enough. > The first commit about the FALLOC_FL_UNSHARE_RANGE is: > > commit 409332b65d3ed8cfa7a8030f1e9d52f372219642 > Author: Lukas Czerner > Date: Thu Mar 13 19:07:42 2014 +1100 > > fs: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate > > So "#ifdef FALLOC_FL_UNSHARE_RANGE" maybe needed. > Indeed. Not that it matters, but note the above refers to zero range. Unshare range came a couple years later in 71be6b4942dd6 ("vfs: add a FALLOC_FL_UNSHARE mode to fallocate to unshare a range of blocks"). Also when looking at this, I noticed most of the test_fallocate() calls seem to break backward compatibility. I.e., if an FALLOC_FL_* flag is not defined, compilation is going to fail anyways because none of those usages are ifdef'd. This seems a bit confused, but I think the reason for the lack of ifdefs for the test calls is that most of these are enabled by default. IOW with a simple ifdef around the test_fallocate() call we wouldn't actually set unshare_range_calls = 0 in the case where the flag doesn't exist. It would be nice to have a macro that just tested the flag as well and otherwise returned 0, but a quick test suggests that's not allowed (though my macro-fu is weak). For the purpose of this patch, I'm just going to follow precedent and use an ifdef everywhere except for the test_fallocate() call. Unless somebody has a better idea, maybe I'll include a patch 2 that factors out a test_fallocate_calls() helper that does all the relevant ifdefery. Brian > Thanks, > Zorro > > > > + } > > > > if (length == 0) { > > if (!quiet && testcalls > simulatedopcount) > > prt("skipping zero length fallocate\n"); > > - log4(OP_FALLOCATE, offset, length, FL_SKIPPED | > > - (keep_size ? FL_KEEP_SIZE : FL_NONE)); > > + log4(OP_FALLOCATE, offset, length, FL_SKIPPED | opflags); > > return; > > } > > > > @@ -1905,8 +1923,7 @@ do_preallocate(unsigned offset, unsigned length, int keep_size) > > * 1: extending prealloc > > * 2: interior prealloc > > */ > > - log4(OP_FALLOCATE, offset, length, > > - keep_size ? FL_KEEP_SIZE : FL_NONE); > > + log4(OP_FALLOCATE, offset, length, opflags); > > > > if (end_offset > file_size) { > > memset(good_buf + file_size, '\0', end_offset - file_size); > > @@ -1921,7 +1938,7 @@ do_preallocate(unsigned offset, unsigned length, int keep_size) > > end_offset <= monitorend))) > > prt("%lld falloc\tfrom 0x%x to 0x%x (0x%x bytes)\n", testcalls, > > offset, offset + length, length); > > - if (fallocate(fd, keep_size ? FALLOC_FL_KEEP_SIZE : 0, (loff_t)offset, (loff_t)length) == -1) { > > + if (fallocate(fd, mode, (loff_t)offset, (loff_t)length) == -1) { > > prt("fallocate: 0x%x to 0x%x\n", offset, offset + length); > > prterr("do_preallocate: fallocate"); > > report_failure(161); > > @@ -1929,7 +1946,7 @@ do_preallocate(unsigned offset, unsigned length, int keep_size) > > } > > #else > > void > > -do_preallocate(unsigned offset, unsigned length, int keep_size) > > +do_preallocate(unsigned offset, unsigned length, int keep_size, int unshare) > > { > > return; > > } > > @@ -2095,6 +2112,8 @@ read_op(struct log_entry *log_entry) > > log_entry->flags |= FL_KEEP_SIZE; > > else if (strcmp(str, "close_open") == 0) > > log_entry->flags |= FL_CLOSE_OPEN; > > + else if (strcmp(str, "unshare") == 0) > > + log_entry->flags |= FL_UNSHARE; > > else if (strcmp(str, "*") == 0) > > ; /* overlap marker; ignore */ > > else > > @@ -2161,6 +2180,7 @@ test(void) > > unsigned long rv; > > unsigned long op; > > int keep_size = 0; > > + int unshare = 0; > > > > if (simulatedopcount > 0 && testcalls == simulatedopcount) > > writefileimage(); > > @@ -2190,6 +2210,7 @@ test(void) > > offset2 = log_entry.args[2]; > > closeopen = !!(log_entry.flags & FL_CLOSE_OPEN); > > keep_size = !!(log_entry.flags & FL_KEEP_SIZE); > > + unshare = !!(log_entry.flags & FL_UNSHARE); > > goto have_op; > > } > > return 0; > > @@ -2219,8 +2240,12 @@ test(void) > > size = random() % maxfilelen; > > break; > > case OP_FALLOCATE: > > - if (fallocate_calls && size && keep_size_calls) > > - keep_size = random() % 2; > > + if (fallocate_calls && size) { > > + if (keep_size_calls) > > + keep_size = random() % 2; > > + if (unshare_calls) > > + unshare = random() % 2; > > + } > > break; > > case OP_ZERO_RANGE: > > if (zero_range_calls && size && keep_size_calls) > > @@ -2334,7 +2359,7 @@ have_op: > > > > case OP_FALLOCATE: > > TRIM_OFF_LEN(offset, size, maxfilelen); > > - do_preallocate(offset, size, keep_size); > > + do_preallocate(offset, size, keep_size, unshare); > > break; > > > > case OP_PUNCH_HOLE: > > @@ -2469,6 +2494,7 @@ usage(void) > > -r readbdy: 4096 would make reads page aligned (default 1)\n\ > > -s style: 1 gives smaller truncates (default 0)\n\ > > -t truncbdy: 4096 would make truncates page aligned (default 1)\n\ > > + -u Do not use unshare range\n\ > > -w writebdy: 4096 would make writes page aligned (default 1)\n\ > > -x: preallocate file space before starting, XFS only\n\ > > -y: synchronize changes to a file\n" > > @@ -2853,7 +2879,7 @@ main(int argc, char **argv) > > setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */ > > > > while ((ch = getopt_long(argc, argv, > > - "0b:c:de:fg:i:j:kl:m:no:p:qr:s:t:w:xyABD:EFJKHzCILN:OP:RS:UWXZ", > > + "0b:c:de:fg:i:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ", > > longopts, NULL)) != EOF) > > switch (ch) { > > case 'b': > > @@ -2952,6 +2978,9 @@ main(int argc, char **argv) > > if (truncbdy <= 0) > > usage(); > > break; > > + case 'u': > > + unshare_calls = 0; > > + break; > > case 'w': > > writebdy = getnum(optarg, &endp); > > if (writebdy <= 0) > > @@ -3242,6 +3271,8 @@ main(int argc, char **argv) > > fallocate_calls = test_fallocate(0); > > if (keep_size_calls) > > keep_size_calls = test_fallocate(FALLOC_FL_KEEP_SIZE); > > + if (unshare_calls) > > + unshare_calls = test_fallocate(FALLOC_FL_UNSHARE_RANGE); > > if (punch_hole_calls) > > punch_hole_calls = test_fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE); > > if (zero_range_calls) > > -- > > 2.45.0 > > > > >