[PATCH AUTOSEL 5.9 33/33] xfs: don't allow NOWAIT DIO across extent boundaries

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH AUTOSEL 5.9 33/33] xfs: don't allow NOWAIT DIO across extent boundaries
       [not found] <20201125153550.810101-1-sashal@kernel.org>
@ 2020-11-25 15:35 ` Sasha Levin
  2020-11-25 21:52   ` Dave Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: Sasha Levin @ 2020-11-25 15:35 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Dave Chinner, Jens Axboe, Darrick J . Wong, Sasha Levin,
	linux-xfs

From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit 883a790a84401f6f55992887fd7263d808d4d05d ]

Jens has reported a situation where partial direct IOs can be issued
and completed yet still return -EAGAIN. We don't want this to report
a short IO as we want XFS to complete user DIO entirely or not at
all.

This partial IO situation can occur on a write IO that is split
across an allocated extent and a hole, and the second mapping is
returning EAGAIN because allocation would be required.

The trivial reproducer:

$ sudo xfs_io -fdt -c "pwrite 0 4k" -c "pwrite -V 1 -b 8k -N 0 8k" /mnt/scr/foo
wrote 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0.0001 sec (27.509 MiB/sec and 7042.2535 ops/sec)
pwrite: Resource temporarily unavailable
$

The pwritev2(0, 8kB, RWF_NOWAIT) call returns EAGAIN having done
the first 4kB write:

 xfs_file_direct_write: dev 259:1 ino 0x83 size 0x1000 offset 0x0 count 0x2000
 iomap_apply:          dev 259:1 ino 0x83 pos 0 length 8192 flags WRITE|DIRECT|NOWAIT (0x31) ops xfs_direct_write_iomap_ops caller iomap_dio_rw actor iomap_dio_actor
 xfs_ilock_nowait:     dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_ilock_for_iomap
 xfs_iunlock:          dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_direct_write_iomap_begin
 xfs_iomap_found:      dev 259:1 ino 0x83 size 0x1000 offset 0x0 count 8192 fork data startoff 0x0 startblock 24 blockcount 0x1
 iomap_apply_dstmap:   dev 259:1 ino 0x83 bdev 259:1 addr 102400 offset 0 length 4096 type MAPPED flags DIRTY

Here the first iomap loop has mapped the first 4kB of the file and
issued the IO, and we enter the second iomap_apply loop:

 iomap_apply: dev 259:1 ino 0x83 pos 4096 length 4096 flags WRITE|DIRECT|NOWAIT (0x31) ops xfs_direct_write_iomap_ops caller iomap_dio_rw actor iomap_dio_actor
 xfs_ilock_nowait:     dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_ilock_for_iomap
 xfs_iunlock:          dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_direct_write_iomap_begin

And we exit with -EAGAIN out because we hit the allocate case trying
to make the second 4kB block.

Then IO completes on the first 4kB and the original IO context
completes and unlocks the inode, returning -EAGAIN to userspace:

 xfs_end_io_direct_write: dev 259:1 ino 0x83 isize 0x1000 disize 0x1000 offset 0x0 count 4096
 xfs_iunlock:          dev 259:1 ino 0x83 flags IOLOCK_SHARED caller xfs_file_dio_aio_write

There are other vectors to the same problem when we re-enter the
mapping code if we have to make multiple mappinfs under NOWAIT
conditions. e.g. failing trylocks, COW extents being found,
allocation being required, and so on.

Avoid all these potential problems by only allowing IOMAP_NOWAIT IO
to go ahead if the mapping we retrieve for the IO spans an entire
allocated extent. This avoids the possibility of subsequent mappings
to complete the IO from triggering NOWAIT semantics by any means as
NOWAIT IO will now only enter the mapping code once per NOWAIT IO.

Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/xfs/xfs_iomap.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 3abb8b9d6f4c6..7b9ff824e82d4 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -706,6 +706,23 @@ xfs_ilock_for_iomap(
 	return 0;
 }
 
+/*
+ * Check that the imap we are going to return to the caller spans the entire
+ * range that the caller requested for the IO.
+ */
+static bool
+imap_spans_range(
+	struct xfs_bmbt_irec	*imap,
+	xfs_fileoff_t		offset_fsb,
+	xfs_fileoff_t		end_fsb)
+{
+	if (imap->br_startoff > offset_fsb)
+		return false;
+	if (imap->br_startoff + imap->br_blockcount < end_fsb)
+		return false;
+	return true;
+}
+
 static int
 xfs_direct_write_iomap_begin(
 	struct inode		*inode,
@@ -766,6 +783,18 @@ xfs_direct_write_iomap_begin(
 	if (imap_needs_alloc(inode, flags, &imap, nimaps))
 		goto allocate_blocks;
 
+	/*
+	 * NOWAIT IO needs to span the entire requested IO with a single map so
+	 * that we avoid partial IO failures due to the rest of the IO range not
+	 * covered by this map triggering an EAGAIN condition when it is
+	 * subsequently mapped and aborting the IO.
+	 */
+	if ((flags & IOMAP_NOWAIT) &&
+	    !imap_spans_range(&imap, offset_fsb, end_fsb)) {
+		error = -EAGAIN;
+		goto out_unlock;
+	}
+
 	xfs_iunlock(ip, lockmode);
 	trace_xfs_iomap_found(ip, offset, length, XFS_DATA_FORK, &imap);
 	return xfs_bmbt_to_iomap(ip, iomap, &imap, iomap_flags);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH AUTOSEL 5.9 33/33] xfs: don't allow NOWAIT DIO across extent boundaries
  2020-11-25 15:35 ` [PATCH AUTOSEL 5.9 33/33] xfs: don't allow NOWAIT DIO across extent boundaries Sasha Levin
@ 2020-11-25 21:52   ` Dave Chinner
  2020-11-25 23:46     ` Sasha Levin
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2020-11-25 21:52 UTC (permalink / raw)
  To: Sasha Levin
  Cc: linux-kernel, stable, Dave Chinner, Jens Axboe, Darrick J . Wong,
	linux-xfs

On Wed, Nov 25, 2020 at 10:35:50AM -0500, Sasha Levin wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> [ Upstream commit 883a790a84401f6f55992887fd7263d808d4d05d ]
> 
> Jens has reported a situation where partial direct IOs can be issued
> and completed yet still return -EAGAIN. We don't want this to report
> a short IO as we want XFS to complete user DIO entirely or not at
> all.
> 
> This partial IO situation can occur on a write IO that is split
> across an allocated extent and a hole, and the second mapping is
> returning EAGAIN because allocation would be required.
> 
> The trivial reproducer:
> 
> $ sudo xfs_io -fdt -c "pwrite 0 4k" -c "pwrite -V 1 -b 8k -N 0 8k" /mnt/scr/foo
> wrote 4096/4096 bytes at offset 0
> 4 KiB, 1 ops; 0.0001 sec (27.509 MiB/sec and 7042.2535 ops/sec)
> pwrite: Resource temporarily unavailable
> $
> 
> The pwritev2(0, 8kB, RWF_NOWAIT) call returns EAGAIN having done
> the first 4kB write:
> 
>  xfs_file_direct_write: dev 259:1 ino 0x83 size 0x1000 offset 0x0 count 0x2000
>  iomap_apply:          dev 259:1 ino 0x83 pos 0 length 8192 flags WRITE|DIRECT|NOWAIT (0x31) ops xfs_direct_write_iomap_ops caller iomap_dio_rw actor iomap_dio_actor
>  xfs_ilock_nowait:     dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_ilock_for_iomap
>  xfs_iunlock:          dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_direct_write_iomap_begin
>  xfs_iomap_found:      dev 259:1 ino 0x83 size 0x1000 offset 0x0 count 8192 fork data startoff 0x0 startblock 24 blockcount 0x1
>  iomap_apply_dstmap:   dev 259:1 ino 0x83 bdev 259:1 addr 102400 offset 0 length 4096 type MAPPED flags DIRTY
> 
> Here the first iomap loop has mapped the first 4kB of the file and
> issued the IO, and we enter the second iomap_apply loop:
> 
>  iomap_apply: dev 259:1 ino 0x83 pos 4096 length 4096 flags WRITE|DIRECT|NOWAIT (0x31) ops xfs_direct_write_iomap_ops caller iomap_dio_rw actor iomap_dio_actor
>  xfs_ilock_nowait:     dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_ilock_for_iomap
>  xfs_iunlock:          dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_direct_write_iomap_begin
> 
> And we exit with -EAGAIN out because we hit the allocate case trying
> to make the second 4kB block.
> 
> Then IO completes on the first 4kB and the original IO context
> completes and unlocks the inode, returning -EAGAIN to userspace:
> 
>  xfs_end_io_direct_write: dev 259:1 ino 0x83 isize 0x1000 disize 0x1000 offset 0x0 count 4096
>  xfs_iunlock:          dev 259:1 ino 0x83 flags IOLOCK_SHARED caller xfs_file_dio_aio_write
> 
> There are other vectors to the same problem when we re-enter the
> mapping code if we have to make multiple mappinfs under NOWAIT
> conditions. e.g. failing trylocks, COW extents being found,
> allocation being required, and so on.
> 
> Avoid all these potential problems by only allowing IOMAP_NOWAIT IO
> to go ahead if the mapping we retrieve for the IO spans an entire
> allocated extent. This avoids the possibility of subsequent mappings
> to complete the IO from triggering NOWAIT semantics by any means as
> NOWAIT IO will now only enter the mapping code once per NOWAIT IO.
> 
> Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  fs/xfs/xfs_iomap.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)

No, please don't pick this up for stable kernels until at least 5.10
is released. This still needs some time integration testing time to
ensure that we haven't introduced any other performance regressions
with this change.

We've already had one XFS upstream kernel regression in this -rc
cycle propagated to the stable kernels in 5.9.9 because the stable
process picked up a bunch of random XFS fixes within hours of them
being merged by Linus. One of those commits was a result of a
thinko, and despite the fact we found it and reverted it within a
few days, users of stable kernels have been exposed to it for a
couple of weeks. That *should never have happened*. 

This has happened before, and *again* we were lucky this wasn't
worse than it was. We were saved by the flaw being caught by own
internal pre-write corruption verifiers (which exist because we
don't trust our code to be bug-free, let alone the collections of
random, poorly tested backports) so that it only resulted in
corruption shutdowns rather than permanent on-disk damage and data
loss.

Put simply: the stable process is flawed because it shortcuts the
necessary stabilisation testing for new code. It doesn't matter if
the merged commits have a "fixes" tag in them, that tag doesn't mean
the change is ready to be exposed to production systems. We need the
*-rc stabilisation process* to weed out thinkos, brown paper bag
bugs, etc, because we all make mistakes, and bugs in filesystem code
can *lose user data permanently*.

Hence I ask that the stable maintainers only do automated pulls of
iomap and XFS changes from upstream kernels when Linus officially
releases them rather than at random points in time in the -rc cycle.
If there is a critical fix we need to go back to stable kernels
immediately, we will let stable@kernel.org know directly that we
want this done.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH AUTOSEL 5.9 33/33] xfs: don't allow NOWAIT DIO across extent boundaries
  2020-11-25 21:52   ` Dave Chinner
@ 2020-11-25 23:46     ` Sasha Levin
  2020-11-26  7:13       ` Dave Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: Sasha Levin @ 2020-11-25 23:46 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-kernel, stable, Dave Chinner, Jens Axboe, Darrick J . Wong,
	linux-xfs

On Thu, Nov 26, 2020 at 08:52:47AM +1100, Dave Chinner wrote:
>We've already had one XFS upstream kernel regression in this -rc
>cycle propagated to the stable kernels in 5.9.9 because the stable
>process picked up a bunch of random XFS fixes within hours of them
>being merged by Linus. One of those commits was a result of a
>thinko, and despite the fact we found it and reverted it within a
>few days, users of stable kernels have been exposed to it for a
>couple of weeks. That *should never have happened*.

No, what shouldn't have happened is a commit that never went out for a review
on the public mailing lists nor spending any time in linux-next ending
up in Linus's tree.

It's ridiculous that you see a failure in the maintainership workflow of
XFS and turn around to blame it somehow on the stable process.

>This has happened before, and *again* we were lucky this wasn't
>worse than it was. We were saved by the flaw being caught by own
>internal pre-write corruption verifiers (which exist because we
>don't trust our code to be bug-free, let alone the collections of
>random, poorly tested backports) so that it only resulted in
>corruption shutdowns rather than permanent on-disk damage and data
>loss.
>
>Put simply: the stable process is flawed because it shortcuts the
>necessary stabilisation testing for new code. It doesn't matter if

The stable process assumes that commits that ended up upstream were
reviewed and tested; the stable process doesn't offer much in the way of
in-depth review of specific patches but mostly focuses on testing the
product of backporting hundreds of patches into each stable branch.

Release candidate cycles are here to squash the bugs that went in during
the merge window, not to introduce new "thinkos" in the way of pulling
patches out of your hip in the middle of the release cycle.

>the merged commits have a "fixes" tag in them, that tag doesn't mean
>the change is ready to be exposed to production systems. We need the
>*-rc stabilisation process* to weed out thinkos, brown paper bag
>bugs, etc, because we all make mistakes, and bugs in filesystem code
>can *lose user data permanently*.

What needed to happen here is that XFS's internal testing story would
run *before* this patch was merged anywhere and catch this bug. Why
didn't it happen?

>Hence I ask that the stable maintainers only do automated pulls of
>iomap and XFS changes from upstream kernels when Linus officially
>releases them rather than at random points in time in the -rc cycle.
>If there is a critical fix we need to go back to stable kernels
>immediately, we will let stable@kernel.org know directly that we
>want this done.

I'll happily switch back to a model where we look only for stable tags
from XFS, but sadly this happened only *once* in the past year. How is
this helping to prevent the dangerous bugs that may cause users to lose
their data permanently?

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH AUTOSEL 5.9 33/33] xfs: don't allow NOWAIT DIO across extent boundaries
  2020-11-25 23:46     ` Sasha Levin
@ 2020-11-26  7:13       ` Dave Chinner
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Chinner @ 2020-11-26  7:13 UTC (permalink / raw)
  To: Sasha Levin
  Cc: linux-kernel, stable, Dave Chinner, Jens Axboe, Darrick J . Wong,
	linux-xfs

On Wed, Nov 25, 2020 at 06:46:54PM -0500, Sasha Levin wrote:
> On Thu, Nov 26, 2020 at 08:52:47AM +1100, Dave Chinner wrote:
> > We've already had one XFS upstream kernel regression in this -rc
> > cycle propagated to the stable kernels in 5.9.9 because the stable
> > process picked up a bunch of random XFS fixes within hours of them
> > being merged by Linus. One of those commits was a result of a
> > thinko, and despite the fact we found it and reverted it within a
> > few days, users of stable kernels have been exposed to it for a
> > couple of weeks. That *should never have happened*.
> 
> No, what shouldn't have happened is a commit that never went out for a review
> on the public mailing lists nor spending any time in linux-next ending
> up in Linus's tree.

<sigh>

I think you've got your wires crossed somewhere, Sasha, because none
of that happened here.  From the public record, the patch was first
posted here by Darrick:

https://lore.kernel.org/linux-xfs/160494584816.772693.2490433010759557816.stgit@magnolia/

on Nov 9, and was reviewed by Christoph a day later.  It was merged
into the XFS tree on Nov 10, with the rvb tag:

https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git/commit/?h=for-next&id=6ff646b2ceb0eec916101877f38da0b73e3a5b7f

Which means it should have been in linux-next on Nov 11, 12 and 13,
when Darrick sent the pull request:

https://lore.kernel.org/linux-xfs/20201113231738.GX9695@magnolia/

It was merged into Linus's tree an hour later.

So, in contrast to your claims, the evidence is that the patch was,
in fact, publicly posted, reviewed, and spent time in linux-next
before ending up in Linus's tree.

FWIW, on November 17, GregKH sent the patch to lkml for stable review
after being selected by the stable process for a stable backport.
This was not cc'd to the XFS list, and it was committed without
comment into the  5.9.x tree is on November 18.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/fs/xfs?h=linux-5.9.y&id=0ca9a072112b18efc9ba9d3a9b77e9dae60f93ac

IOWs, the XFS developers didn't ask for it to be backported to
stable kernels - the commit did not contain a "cc:
stable@kernel.org", nor was the original patch posting cc'd to the
stable list.

The fact is that entire decision to backport this commit was made by
stable maintainers and/or their tools, and the stable maintainers
themselves chose not to tell the XFS list they had selected it for
backport.

Hence:

> It's ridiculous that you see a failure in the maintainership workflow of
> XFS and turn around to blame it somehow on the stable process.

.... I think you really need to have another look at the evidence
before you dig yourself a deeper hole and waste more of my time....

> > This has happened before, and *again* we were lucky this wasn't
> > worse than it was. We were saved by the flaw being caught by own
> > internal pre-write corruption verifiers (which exist because we
> > don't trust our code to be bug-free, let alone the collections of
> > random, poorly tested backports) so that it only resulted in
> > corruption shutdowns rather than permanent on-disk damage and data
> > loss.
> > 
> > Put simply: the stable process is flawed because it shortcuts the
> > necessary stabilisation testing for new code. It doesn't matter if
> 
> The stable process assumes that commits that ended up upstream were
> reviewed and tested; the stable process doesn't offer much in the way of
> in-depth review of specific patches but mostly focuses on testing the
> product of backporting hundreds of patches into each stable branch.

And I've lost count of the number of times I've told the stable
maintainers that this is an invalid assumption.

Yet here we are again.

How many times do we have to make the same mistake before we learn
from it?

> Release candidate cycles are here to squash the bugs that went in during
> the merge window, not to introduce new "thinkos" in the way of pulling
> patches out of your hip in the middle of the release cycle.

"pulling patches out of your hip"

Nice insult. Avoids profanity filters and everything. But I don't
know why you're trying to insult me over something I played no part
in.

Seriously, merging critical fixes discovered in the -rc cycle
happens *all the time* and has been done for as long as we've had
-rc cycles.  Even Linus himself does this.

The fact is that the -rc process is intended to accomodate merging
fixes quickly whilst still allowing sufficient testing time to be
confident that no regressions were introduced or have been found and
addressed before release.

And that's the whole point of having an iterative integration
testing phase in the release cycle - it can be adapted in duration
to the current state of the code base and the fixes that are being
made late in the cycle.

You *should* know all this Sasha, so I'm not sure why you are
claiming that long standing, well founded software engineering
practices are suddenly a problem...

> > the merged commits have a "fixes" tag in them, that tag doesn't mean
> > the change is ready to be exposed to production systems. We need the
> > *-rc stabilisation process* to weed out thinkos, brown paper bag
> > bugs, etc, because we all make mistakes, and bugs in filesystem code
> > can *lose user data permanently*.
> 
> What needed to happen here is that XFS's internal testing story would
> run *before* this patch was merged anywhere and catch this bug. Why
> didn't it happen?

It did. It's simply that kernel unit testing didn't discover the
bug. That happens all the time, not just in XFS.  In fact, it was
XFS userspace unit testing that found the bug.

The kernel by itself didn't trigger inconsistencies because
everything it created matches what it expected, and an old userspace
checking a new kernel would ignore the bits changed by the bad
commit in the kernel.  IOWs, the typical kernel unit testing
configuration of "new kernel, stable userspace" didn't see anything
wrong.

It wasn't until a new userspace was run with an old kernel that the
problem was found. The new userspace expected bits in the keys
on disk to be set that an old kernel didn't set and that's when
inconsistencies were flagged and the problem uncovered. This was
found by the userspace XFS maintainer as when testing the
changes merged from the kernel code. Unit testing userspace is
the opposite of the kernel - "stable kernel, new userspace" - and
that's the combination that triggered the errors that made us aware
of the problem.

So, yes, our normal testing processes found the bug, it's just a the
filesystem is *much more than just the kernel code* and sometimes
bugs in the core code are found on the userspace side before they
are found in the kernel.

However, testing variations in kernel and/or userspace tool versions
is not generally part of the unit tests developers run.  It is,
however, something that we cover as the new code rolls out to
testers and developers code as part of the integration testing
process. That's when version mismatch and upgrade/downgrade bugs
tend to show up as that's when the variety of installations the code
is tested on rapidly expands.  IOWs, we caught the problem exactly
when hindsight tells us we should expect to catch such a problem.

The reality is that a single developer cannot do this sort of
testing, hence we do not expect a single developer to do this. We
don't even expect the maintainer to be able to cover such a huge
testing scope before merging commits. It's simply not realistic to
cover everything before code changes are merged. Perfection is the
enemy of good.

I'll repeat the lesson to be learned here: merging a commit into
Linus's tree does not mean it is fully tested and production ready.
It just means it's passed a wide range of unit tests without
regressions and so is considered ready for wider integration testing
by a larger population of developers and testers.

> > Hence I ask that the stable maintainers only do automated pulls of
> > iomap and XFS changes from upstream kernels when Linus officially
> > releases them rather than at random points in time in the -rc cycle.
> > If there is a critical fix we need to go back to stable kernels
> > immediately, we will let stable@kernel.org know directly that we
> > want this done.
> 
> I'll happily switch back to a model where we look only for stable tags
> from XFS, but sadly this happened only *once* in the past year. How is
> this helping to prevent the dangerous bugs that may cause users to lose
> their data permanently?

Given that the only dangerous bug that XFS users have been directly
exposed to in recent times has been caused by an automated stable
kernel backport short-circuiting our normal commit-test-release
cycle, I can't see how XFS users will be worse off by turning off
automated backports. :/

But that is not what I asked you to do or consider, it's just a
strawman you constructed.  What I want is for users to benefit from
the overall stable process, but I don't want them exposed to the
potential catastrophic risks that the current stable process exposes
them to.

As developers, the stable process gives us no margin for error. We
need at least some margin to prevent users for being exposed to
avoidable regressions in stable kernels.  So what I'm asking you to
do is back off the bots a bit to provide that margin because, as we
all known, bugs and mistakes happen.

An acceptible compromise from our perspective would be to run
automated scans on released kernels that have already run the full
test cycle.  This way the users get the benefits of both full
integration testing of the patches that get backported, and all the
changes that the stable maintainers think their kernels should be
getting end up in the stable kernel.

For the sorts of changes your process is typically backporting from
XFS, an extra couple of weeks time lag is going to make no
difference to users at all. But that extra margin means that
situations like this recent stable kernel regression do not occur,
resulting in stable kernels having a much lower risk profile for
it's users.

That seems like a win-win-win scenario to me, so rather than throw
shade and insults at the messenger, how about listening, learning
from mistakes and trying to improve the process in a way that
benefits everyone?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-11-26  7:13 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20201125153550.810101-1-sashal@kernel.org>
2020-11-25 15:35 ` [PATCH AUTOSEL 5.9 33/33] xfs: don't allow NOWAIT DIO across extent boundaries Sasha Levin
2020-11-25 21:52   ` Dave Chinner
2020-11-25 23:46     ` Sasha Levin
2020-11-26  7:13       ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox