* Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF
From: Darrick J. Wong @ 2018-11-30 20:35 UTC
To: Sasha Levin
Cc: Greg KH, Dave Chinner, stable, linux-kernel, Dave Chinner, linux-fsdevel, xfs

On Fri, Nov 30, 2018 at 05:14:41AM -0500, Sasha Levin wrote:
> On Fri, Nov 30, 2018 at 09:22:03AM +0100, Greg KH wrote:
> > On Fri, Nov 30, 2018 at 09:40:19AM +1100, Dave Chinner wrote:
> > > I stopped my tests at 5 billion ops yesterday (i.e. 20 billion ops
> > > aggregate) to focus on testing the copy_file_range() changes, but
> > > Darrick's tests are still ongoing and have passed 40 billion ops in
> > > aggregate over the past few days.
> > >
> > > The reason we are running these so long is that we've seen fsx data
> > > corruption failures after 12+ hours of runtime and hundreds of
> > > millions of ops. Hence the testing for backported fixes will need to
> > > replicate these test runs across multiple configurations for
> > > multiple days before we have any confidence that we've actually
> > > fixed the data corruptions and not introduced any new ones.
> > >
> > > If you pull only a small subset of the fixes, the fsx will still
> > > fail and we have no real way of actually verifying that there have
> > > been no regressions introduced by the backport. IOWs, there's a
> > > /massive/ amount of QA needed for ensuring that these backports work
> > > correctly.
> > >
> > > Right now the XFS developers don't have the time or resources
> > > available to validate stable backports are correct and regression
> > > free because we are focussed on ensuring the upstream fixes we've
> > > already made (and are still writing) are solid and reliable.

I feel the need to contribute my own interpretation of what's been going
on the last four months:

What you're seeing is not the usual level of reluctance to backport
fixes to LTS kernels, it's our own frustrations at the kernel
community's systemic inability to QA new fs features properly.

Four months ago (prior to 4.19) Zorro started digging into periodic test
failures with shared/010, which resulted in some fixes to the btrfs
dedupe and clone range ioctl implementations. He then saw the same
failures on XFS.

Dave and I stared at the btrfs patches for a while, then started looking
at the xfs counterparts, and realized that nobody had ever added those
commands to the fstests stressor programs, nor had anyone ever encoded
into a test the side effects of a file remap (mtime update, removal of
suid). Nor were there any tests to ensure that these ioctls couldn't be
abused to violate system security and stability constraints.

That's why I refactored a whole ton of vfs file remap code for 4.20, and
(with the help of Dave and Brian and others) worked on fixing all the
problems where fsx and fsstress demonstrate file corruption problems.

Then we started asking the same questions of the copy_file_range system
call, and discovered that yes, we have all of the same problems. We
also discovered several failure cases that aren't mentioned in any
documentation, which has complicated the generation of automatable
tests. Worse yet, the stressor programs fell over even sooner with the
fallback splice implementation.

TLDR: New features show up in the vfs without a lot of design
documentation, incomplete userspace interface manuals, and not much
beyond trivial testing.

So the problem I'm facing here is that the XFS team are singlehandedly
trying to pay off years of accumulated technical debt in the vfs. We
definitely had a role in adding to that debt, so we're fixing it.

Dave is now refactoring the copy_file_range backend to implement all the
necessary security and stability checks, and I'm still QAing all the
stuff we've added to 4.20.

We're not finished, where "finished" means that we can get /one/ kernel
tree to go ~100 billion fsxops without burping up failures, and we've
written fstests to check that said kernel can handle correctly all the
weird side cases.

Until all those fstests go upstream, I don't want to spread out into
backporting and testing LTS kernels, even with test automation. By the
time we're done with all our upstream work you ought to be able to
autosel backport the whole mess into the LTS kernels /and/ fstests will
be able to tell you if the autosel has succeeded without causing any
obvious regressions.

> > Ok, that's fine, so users of XFS should wait until the 4.20 release
> > before relying on it? :)

At the rate we're going, we're not going to finish until 4.21, but yes,
let's wait until 4.20 is closer to release to start in on porting all of
its fixes to 4.14/4.19.

> It's getting to the point that with the amount of known issues with XFS
> on LTS kernels it makes sense to mark it as CONFIG_BROKEN.

These aren't all issues specific to XFS; some plague every fs in subtle
weird ways that only show up with extreme testing. We need the extreme
testing to flush out as many bugs as we can before enabling the feature
by default. XFS reflink is not enabled by default and, due to all this,
is not likely to be enabled by default any time soon.

(That copy_file_range syscall should have been rigorously tested before
it was turned on in the kernel...)

> > I understand your reluctance to want to backport anything, but it really
> > feels like you are not even allowing for fixes that are "obviously
> > right" to be backported either, even after they pass testing. Which
> > isn't ok for your users.
>
> Do the XFS maintainers expect users to always use the latest upstream
> kernel?

For features that are EXPERIMENTAL or aren't enabled by default, yes,
they should be.

--D

>
> --
> Thanks,
> Sasha
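The two remap side effects named in the message above (mtime update, removal of suid) can be expressed as a small check. The following is only a minimal sketch, not an fstests test: the function name is hypothetical, it compares two stat snapshots of the destination file taken before and after a remap, and it assumes an unprivileged caller and a timestamp granularity fine enough for the mtime to advance.

```python
import stat
from types import SimpleNamespace


def remap_side_effect_violations(before, after):
    """Compare stat snapshots of a remap destination file.

    `before` and `after` are objects with `st_mode` and `st_mtime_ns`
    attributes (for example, results of os.stat()). Returns a list of
    violated expectations; an empty list means both side effects
    mentioned in the thread were observed.
    """
    problems = []
    # Assumption: the filesystem timestamp is fine-grained enough that a
    # remap always moves mtime strictly forward.
    if after.st_mtime_ns <= before.st_mtime_ns:
        problems.append("mtime was not updated")
    # Assumption: the caller is unprivileged, so setuid should be cleared.
    if after.st_mode & stat.S_ISUID:
        problems.append("setuid bit was not removed")
    return problems


# Illustrative snapshots only; a real test would stat the file around
# an actual clone, dedupe, or copy_file_range call.
before = SimpleNamespace(st_mode=0o104755, st_mtime_ns=1000)
after_good = SimpleNamespace(st_mode=0o100755, st_mtime_ns=2000)
after_bad = SimpleNamespace(st_mode=0o104755, st_mtime_ns=1000)

print(remap_side_effect_violations(before, after_good))
print(remap_side_effect_violations(before, after_bad))
```

Keeping the check separate from the syscall that performs the remap means the same comparison could be reused for each remap interface discussed in the thread.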
* Re: XFS patches for stable
From: Amir Goldstein @ 2018-12-01 9:09 UTC
To: sashal
Cc: Dave Chinner, Greg KH, stable, linux-kernel, Dave Chinner, Darrick J. Wong, linux-fsdevel, linux-xfs, Luis R. Chamberlain

> >> It's getting to the point that with the amount of known issues with XFS
> >> on LTS kernels it makes sense to mark it as CONFIG_BROKEN.
> >
> >Really? Where are the bug reports?
>
> In 'git log'! You report these every time you fix something in upstream
> xfs but don't backport it to stable trees:
>
> $ git log --oneline v4.18-rc1..v4.18 fs/xfs
> d4a34e165557 xfs: properly handle free inodes in extent hint validators
> 9991274fddb9 xfs: Initialize variables in xfs_alloc_get_rec before using them
> d8cb5e423789 xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation
> e53c4b598372 xfs: ensure post-EOF zeroing happens after zeroing part of a file
> a3a374bf1889 xfs: fix off-by-one error in xfs_rtalloc_query_range
> 232d0a24b0fc xfs: fix uninitialized field in rtbitmap fsmap backend
> 5bd88d153998 xfs: recheck reflink state after grabbing ILOCK_SHARED for a write
> f62cb48e4319 xfs: don't allow insert-range to shift extents past the maximum offset
> aafe12cee0b1 xfs: don't trip over negative free space in xfs_reserve_blocks
> 10ee25268e1f xfs: allow empty transactions while frozen
> e53946dbd31a xfs: xfs_iflush_abort() can be called twice on cluster writeback failure
> 23fcb3340d03 xfs: More robust inode extent count validation
> e2ac836307e3 xfs: simplify xfs_bmap_punch_delalloc_range
>
> Since I'm assuming that at least some of them are based on actual issues
> users hit, and some of those apply to stable kernels, why would users
> want to use an XFS version which is knowingly buggy?
>

Sasha,

There is one more point to consider.
Until v4.16, reflink and rmapbt features were experimental:
76883f7988e6 xfs: remove experimental tag for reverse mapping
1e369b0e199b xfs: remove experimental tag for reflinks

And MANY of the bug fixes flowing in through XFS tree to master
are related to those new XFS features and also to vfs functionality
that depends on them (e.g. clone/dedupe), so there MAY be no
bug reports at all for XFS in stable trees.

IMO users should NOT be expecting XFS to be stable with those
features enabled (they are still disabled by default)
when running on stable kernels below v4.16.

Allow me to act as a self-appointed mediator here and say:
There is obviously some bad blood between xfs developers and stable
tree maintainers.
The conflicts are caused by long standing frustration on both sides.
We would all be better off looking forward at how to improve the
situation instead of dwelling on past mistakes.
This issue was on the agenda at the XFS team meeting at the last LSF/MM.
The path towards compliance has been laid out by xfs maintainers.
Luis, Sasha and myself have been working to improve the filesystem
test coverage for stable tree candidate patches.
We still have some way to go.

The stable candidate patches that triggered the recent flames
were outside of the fs/xfs subsystem, which AUTOSEL already knows
to stay away from, so nobody had any intention to stir things up.

At the end of the day, most xfs developers work for companies that
ship enterprise distros and need to maintain stable trees, so I would
hope that it is in the best interest of everyone involved to cooperate
on the goal of a better stable-xfs ecosystem.

On my part, I would be happy if AUTOSEL could point me at
candidate patch *series* for review instead of single patches.

For that matter, it sure wouldn't hurt if an xfs developer sending
out a patch series would cc:stable on the cover letter, and if a developer
would be kind enough to add some backporting hints to the cover letter
text, that would be very helpful indeed.

Thanks,
Amir.
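The `git log --oneline` listing quoted in the message above can be compared mechanically against a stable branch. The sketch below is a rough heuristic only: it assumes a backport keeps the upstream subject line unchanged, the function names are hypothetical, and it works on captured log text instead of invoking git. A subject match says nothing about whether a fix applies to, or was tested on, the stable tree.

```python
def parse_oneline(log_text):
    """Parse `git log --oneline` output into (sha, subject) pairs."""
    commits = []
    for line in log_text.splitlines():
        line = line.strip()
        # Skip blank lines and an echoed shell prompt such as "$ git log ...".
        if not line or line.startswith("$"):
            continue
        sha, _, subject = line.partition(" ")
        commits.append((sha, subject))
    return commits


def missing_from_stable(upstream_log, stable_log):
    """Return upstream commits whose subject does not appear in stable_log."""
    stable_subjects = {subject for _, subject in parse_oneline(stable_log)}
    return [
        (sha, subject)
        for sha, subject in parse_oneline(upstream_log)
        if subject not in stable_subjects
    ]


# Two entries copied from the listing quoted in the thread.
upstream = """\
$ git log --oneline v4.18-rc1..v4.18 fs/xfs
a3a374bf1889 xfs: fix off-by-one error in xfs_rtalloc_query_range
10ee25268e1f xfs: allow empty transactions while frozen
"""

# Hypothetical stable-branch log; the sha here is made up for illustration.
stable = "0123456789ab xfs: allow empty transactions while frozen\n"

for sha, subject in missing_from_stable(upstream, stable):
    print(sha, subject)
```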
* Re: XFS patches for stable
From: Sasha Levin @ 2018-12-02 15:25 UTC
To: Amir Goldstein
Cc: Dave Chinner, Greg KH, stable, linux-kernel, Dave Chinner, Darrick J. Wong, linux-fsdevel, linux-xfs, Luis R. Chamberlain

On Sat, Dec 01, 2018 at 11:09:05AM +0200, Amir Goldstein wrote:
>> >> It's getting to the point that with the amount of known issues with XFS
>> >> on LTS kernels it makes sense to mark it as CONFIG_BROKEN.
>> >
>> >Really? Where are the bug reports?
>>
>> In 'git log'! You report these every time you fix something in upstream
>> xfs but don't backport it to stable trees:
>>
>> $ git log --oneline v4.18-rc1..v4.18 fs/xfs
>> d4a34e165557 xfs: properly handle free inodes in extent hint validators
>> 9991274fddb9 xfs: Initialize variables in xfs_alloc_get_rec before using them
>> d8cb5e423789 xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation
>> e53c4b598372 xfs: ensure post-EOF zeroing happens after zeroing part of a file
>> a3a374bf1889 xfs: fix off-by-one error in xfs_rtalloc_query_range
>> 232d0a24b0fc xfs: fix uninitialized field in rtbitmap fsmap backend
>> 5bd88d153998 xfs: recheck reflink state after grabbing ILOCK_SHARED for a write
>> f62cb48e4319 xfs: don't allow insert-range to shift extents past the maximum offset
>> aafe12cee0b1 xfs: don't trip over negative free space in xfs_reserve_blocks
>> 10ee25268e1f xfs: allow empty transactions while frozen
>> e53946dbd31a xfs: xfs_iflush_abort() can be called twice on cluster writeback failure
>> 23fcb3340d03 xfs: More robust inode extent count validation
>> e2ac836307e3 xfs: simplify xfs_bmap_punch_delalloc_range
>>
>> Since I'm assuming that at least some of them are based on actual issues
>> users hit, and some of those apply to stable kernels, why would users
>> want to use an XFS version which is knowingly buggy?
>>
>
>Sasha,
>
>There is one more point to consider.
>Until v4.16, reflink and rmapbt features were experimental:
>76883f7988e6 xfs: remove experimental tag for reverse mapping
>1e369b0e199b xfs: remove experimental tag for reflinks
>
>And MANY of the bug fixes flowing in through XFS tree to master
>are related to those new XFS features and also to vfs functionality
>that depends on them (e.g. clone/dedupe), so there MAY be no
>bug reports at all for XFS in stable trees.
>
>IMO users should NOT be expecting XFS to be stable with those
>features enabled (they are still disabled by default)
>when running on stable kernels below v4.16.
>
>Allow me to act as a self-appointed mediator here and say:
>There is obviously some bad blood between xfs developers and stable
>tree maintainers.
>The conflicts are caused by long standing frustration on both sides.
>We would all be better off looking forward at how to improve the
>situation instead of dwelling on past mistakes.
>This issue was on the agenda at the XFS team meeting at the last LSF/MM.
>The path towards compliance has been laid out by xfs maintainers.
>Luis, Sasha and myself have been working to improve the filesystem
>test coverage for stable tree candidate patches.
>We still have some way to go.
>
>The stable candidate patches that triggered the recent flames
>were outside of the fs/xfs subsystem, which AUTOSEL already knows
>to stay away from, so nobody had any intention to stir things up.
>
>At the end of the day, most xfs developers work for companies that
>ship enterprise distros and need to maintain stable trees, so I would
>hope that it is in the best interest of everyone involved to cooperate
>on the goal of a better stable-xfs ecosystem.
>
>On my part, I would be happy if AUTOSEL could point me at
>candidate patch *series* for review instead of single patches.

I'm afraid it's not smart enough to do that :( I can grab an entire
series if it selects a single patch in a series, but from my experience
it's usually the wrong thing to do.

>For that matter, it sure wouldn't hurt if an xfs developer sending
>out a patch series would cc:stable on the cover letter, and if a developer
>would be kind enough to add some backporting hints to the cover letter
>text, that would be very helpful indeed.

Given that we have folks (Luis, Amir, etc) working on it already, maybe
a step in the right direction would be having the XFS folks tag fixes
some other way ("#wants-a-backport"?), as a hint that the fix should be
backported after sufficient testing?

We wouldn't pick these commits for stable ourselves; they would go in
only after the XFS maintainers are satisfied that the commit was
sufficiently tested on LTS trees?

--
Thanks,
Sasha
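The tagging idea floated in the message above could look roughly like the following. This is a sketch under stated assumptions: "#wants-a-backport" is only the tag suggested in the thread, not an existing kernel convention, and the function name, the input shape, and the `maintainer_approved` set are hypothetical stand-ins for whatever sign-off mechanism the XFS maintainers would actually use.

```python
# Tag suggested in the thread; hypothetical, not an established convention.
BACKPORT_TAG = "#wants-a-backport"


def backport_queue(commits, maintainer_approved, tag=BACKPORT_TAG):
    """Select commits that are both tagged and approved for stable.

    commits: iterable of (sha, full_commit_message) pairs.
    maintainer_approved: set of shas the subsystem maintainers consider
        sufficiently tested on LTS trees (hypothetical input).
    """
    queue = []
    for sha, message in commits:
        tagged = any(tag in line for line in message.splitlines())
        # The tag alone is only a hint; approval is what gates the backport.
        if tagged and sha in maintainer_approved:
            queue.append(sha)
    return queue


# Made-up shas and messages, for illustration only.
example_commits = [
    ("aaaaaaaaaaaa", "xfs: example fix one\n\nDetails.\n#wants-a-backport\n"),
    ("bbbbbbbbbbbb", "xfs: example fix two\n\n#wants-a-backport\n"),
    ("cccccccccccc", "xfs: example cleanup\n\nNo tag here.\n"),
]

print(backport_queue(example_commits, {"aaaaaaaaaaaa", "cccccccccccc"}))
```

Treating the tag as necessary but not sufficient mirrors the proposal: tagged commits wait until the maintainers are satisfied with the testing.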
* Re: XFS patches for stable
From: Christoph Hellwig @ 2018-12-02 16:10 UTC
To: Sasha Levin
Cc: Amir Goldstein, Dave Chinner, Greg KH, stable, linux-kernel, Dave Chinner, Darrick J. Wong, linux-fsdevel, linux-xfs, Luis R. Chamberlain

As someone who has done xfs stable backports for a while I really don't
think the autoselection is helpful at all.

Someone who is vaguely familiar with the code needs to manually select
the commits and QA them, which takes a fair amount of time, but just
needs some manual help if it should work ok.

I think we are about ready to have a new xfs stable maintainer lined up
if everything works well fortunately.
* Re: XFS patches for stable
From: Greg KH @ 2018-12-02 20:08 UTC
To: Christoph Hellwig
Cc: Sasha Levin, Amir Goldstein, Dave Chinner, stable, linux-kernel, Dave Chinner, Darrick J. Wong, linux-fsdevel, linux-xfs, Luis R. Chamberlain

On Sun, Dec 02, 2018 at 08:10:16AM -0800, Christoph Hellwig wrote:
> As someone who has done xfs stable backports for a while I really don't
> think the autoselection is helpful at all.

autoselection for xfs patches has been turned off for a while, what
triggered this email thread was a core vfs patch that was backported
that was not obvious it was created by the xfs developers due to a
problem they had found.

> Someone who is vaguely familiar with the code needs to manually select
> the commits and QA them, which takes a fair amount of time, but just
> needs some manual help if it should work ok.
>
> I think we are about ready to have a new xfs stable maintainer lined up
> if everything works well fortunately.

That would be wonderful news.

thanks,

greg k-h
* Re: XFS patches for stable
From: Richard Weinberger @ 2018-12-03 14:41 UTC
To: Greg KH
Cc: Christoph Hellwig, sashal, amir73il, Dave Chinner, stable, LKML, dchinner, darrick.wong, linux-fsdevel, linux-xfs, mcgrof, linux-mtd, boris.brezillon

On Sun, Dec 2, 2018 at 9:09 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Sun, Dec 02, 2018 at 08:10:16AM -0800, Christoph Hellwig wrote:
> > As someone who has done xfs stable backports for a while I really don't
> > think the autoselection is helpful at all.
>
> autoselection for xfs patches has been turned off for a while, what
> triggered this email thread was a core vfs patch that was backported
> that was not obvious it was created by the xfs developers due to a
> problem they had found.

Sorry for hijacking this thread.
Can you please also disable autoselection for MTD, UBI and UBIFS?

fs/ubifs/
drivers/mtd/
include/linux/mtd/
include/uapi/mtd/

--
Thanks,
//richard
* Re: XFS patches for stable
From: Sasha Levin @ 2018-12-03 16:56 UTC
To: Richard Weinberger
Cc: Greg KH, Christoph Hellwig, amir73il, Dave Chinner, stable, LKML, dchinner, darrick.wong, linux-fsdevel, linux-xfs, mcgrof, linux-mtd, boris.brezillon

On Mon, Dec 03, 2018 at 03:41:27PM +0100, Richard Weinberger wrote:
>On Sun, Dec 2, 2018 at 9:09 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>>
>> On Sun, Dec 02, 2018 at 08:10:16AM -0800, Christoph Hellwig wrote:
>> > As someone who has done xfs stable backports for a while I really don't
>> > think the autoselection is helpful at all.
>>
>> autoselection for xfs patches has been turned off for a while, what
>> triggered this email thread was a core vfs patch that was backported
>> that was not obvious it was created by the xfs developers due to a
>> problem they had found.
>
>Sorry for hijacking this thread.
>Can you please also disable autoselection for MTD, UBI and UBIFS?
>
>fs/ubifs/
>drivers/mtd/
>include/linux/mtd/
>include/uapi/mtd/

Sure, done!

--
Thanks,
Sasha
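A path-based exclusion like the one requested above might be sketched as follows. How AUTOSEL actually stores or applies its exclusions is not described in this thread, so the names and structure here are assumptions; the four MTD/UBI/UBIFS prefixes come from the request, and `fs/xfs/` is included only because the thread says autoselection for xfs patches was already turned off.

```python
# Assumed exclusion list; only the prefixes themselves come from the thread.
EXCLUDED_PREFIXES = (
    "fs/xfs/",
    "fs/ubifs/",
    "drivers/mtd/",
    "include/linux/mtd/",
    "include/uapi/mtd/",
)


def autosel_allowed(changed_paths, excluded=EXCLUDED_PREFIXES):
    """Return False if any changed path falls under an excluded prefix."""
    prefixes = tuple(excluded)  # str.startswith accepts a tuple of prefixes
    return not any(path.startswith(prefixes) for path in changed_paths)


# File names below are illustrative examples of paths under each tree.
print(autosel_allowed(["fs/ubifs/super.c"]))
print(autosel_allowed(["fs/xfs/xfs_file.c", "fs/iomap.c"]))
print(autosel_allowed(["fs/iomap.c"]))
```

The last call illustrates the limitation raised earlier in the thread: a core vfs patch written by the xfs developers touches no excluded path, so a prefix filter alone would still let it through.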