From: Mike Snitzer <snitzer@redhat.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
Shaohua Li <shli@fb.com>,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
axboe@fb.com, sitsofe@yahoo.com, Kernel-team@fb.com
Subject: Re: [PATCH V2] block: correctly fallback for zeroout
Date: Tue, 14 Jun 2016 14:36:12 -0400
Message-ID: <20160614183612.GB26196@redhat.com>
In-Reply-To: <20160613082001.GA27508@infradead.org>

On Mon, Jun 13 2016 at 4:20am -0400,
Christoph Hellwig <hch@infradead.org> wrote:
> On Fri, Jun 10, 2016 at 09:49:44PM -0400, Martin K. Petersen wrote:
> > >> What does the extra io_err buy us? Just have this function return an
> > >> error. And then in blkdev_issue_discard if you get -EOPNOTSUPP you
> > >> special case it there.
> >
> > Shaohua> The __blkdev_issue_discard returns -EOPNOTSUPP if disk doesn't
> > Shaohua> support discard. in that case, blkdev_issue_discard doesn't
> > Shaohua> return 0. blkdev_issue_discard only returns 0 if IO error is
> > Shaohua> -EOPNOTSUPP.
> >
> > Oh, I see. The sanity checks are now in __blkdev_issue_discard() so
> > there is no way to distinguish between -EOPNOTSUPP and the other
> > -EOPNOTSUPP. *sigh*
>
> We can move the sanity checks out. Or even better get rid of the
> stupid behavior of ignoring the late -EOPNOTSUPP in this low level
> helper and instead leaving it to the caller(s) that care.

I'm not on board with blkdev_issue_discard() no longer masking the late
return of -EOPNOTSUPP.

I'd be fine with moving the early -EOPNOTSUPP checks and the masking of
late -EOPNOTSUPP out to blkdev_issue_discard(). But to be clear,
the masking of the late -EOPNOTSUPP return is there for stacking drivers
like MD and DM. So long as the upper-level ioctl code, filesystems, etc.
make use of blkdev_issue_discard(), they'll still get the benefit
of that masking.
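
A minimal userspace sketch of that split, assuming the early checks and the
late masking both sit in the synchronous wrapper while the async helper just
reports what the I/O returned. Everything named mock_* is hypothetical and
only models the control flow; it is not the code in block/blk-lib.c.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a block device; not a kernel structure. */
struct mock_bdev {
	bool supports_discard;	/* what the device advertises up front */
	int late_error;		/* 0 or a negative errno seen at I/O completion */
};

/* Models an async helper with no policy: it reports the I/O result as-is. */
static int mock_issue_discard_async(const struct mock_bdev *bdev)
{
	return bdev->late_error;
}

/* Models the synchronous wrapper: early checks and late masking live here. */
static int mock_issue_discard(const struct mock_bdev *bdev)
{
	int ret;

	/* Early check: the device never advertised discard, so say so. */
	if (!bdev->supports_discard)
		return -EOPNOTSUPP;

	ret = mock_issue_discard_async(bdev);

	/* Late -EOPNOTSUPP (e.g. from part of a stacked device) is masked. */
	if (ret == -EOPNOTSUPP)
		ret = 0;

	return ret;
}
```

In this model a caller of mock_issue_discard() can still tell "device has no
discard support" (early -EOPNOTSUPP) apart from a partial lack of support in
a stacked device (masked to 0), while other errors such as -EIO pass through.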

drivers/md/dm-thin.c is now using the new async __blkdev_issue_discard()
and it'll only ever do so to a device it knows supports discards -- BUT
it could be that the DM thin-pool's data device is itself a stacked
device that doesn't uniformly support discards throughout its entire
logical address space. So it could issue a discard to a portion of the
stacked data device that will return -EOPNOTSUPP. Long story short:
making this change to remove this so-called "stupid behaviour" will
require code like drivers/md/dm-thin.c:issue_discard() to check the
return from __blkdev_issue_discard() and, if it is -EOPNOTSUPP,
return 0.
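
To illustrate the caller-side check, here is a small userspace model of a
stacked device whose segments do not all support discard. The mock_segment
layout and both mock_* functions are assumptions made for the example; they
do not reflect the real dm-thin or block-layer code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical segment of a stacked device; not a kernel or DM structure. */
struct mock_segment {
	unsigned long long start;	/* first sector of the segment */
	unsigned long long len;		/* length in sectors */
	bool supports_discard;
};

/* Models an async helper that no longer masks: returns the lower verdict. */
static int mock_async_discard(const struct mock_segment *segs, size_t nr,
			      unsigned long long sector)
{
	for (size_t i = 0; i < nr; i++) {
		if (sector >= segs[i].start &&
		    sector < segs[i].start + segs[i].len)
			return segs[i].supports_discard ? 0 : -EOPNOTSUPP;
	}
	return -EINVAL;	/* sector falls outside the modelled device */
}

/* Models the caller-side check: a late -EOPNOTSUPP is treated as success. */
static int mock_thin_issue_discard(const struct mock_segment *segs, size_t nr,
				   unsigned long long sector)
{
	int ret = mock_async_discard(segs, nr, sector);

	return ret == -EOPNOTSUPP ? 0 : ret;
}
```

With two segments where only the first supports discard, a discard aimed at
the second segment makes mock_async_discard() return -EOPNOTSUPP, while
mock_thin_issue_discard() returns 0 and still passes other errors through.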
> So far the DM test suite seems to be the only one that does.

The device-mapper-test-suite was only ever relying on
blkdev_issue_discard()'s early return of -EOPNOTSUPP.

> > I am OK with your patch as a stable fix but this really needs to be
> > fixed up properly.
>
> And I'd much prefer to get this right now. It's not like this is
> recently introduced behavior.

We need to sequence the fixes such that stable kernels get the zeroout
fallback fixed. Right? Not sure if that is a goal of shli's, though.

In 4.7-rc, where you introduced __blkdev_issue_discard() and I made
dm-thin.c consume it, I'm fine with seeing __blkdev_issue_discard() stop
masking -EOPNOTSUPP... but at the same time that change is made,
dm-thin.c would need to be fixed (in the same commit as the interface
change). Though I'm now missing what lifting the -EOPNOTSUPP behavior
into blkdev_issue_discard() buys us... maybe purity of the new async
__blkdev_issue_discard()?

Mike