linux-mmc.vger.kernel.org archive mirror
From: Ulf Hansson <ulf.hansson@linaro.org>
To: Bryan Gurney <bgurney@redhat.com>
Cc: Paolo Valente <paolo.valente@linaro.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Damien.LeMoal@wdc.com, Artem Bityutskiy <dedekind1@gmail.com>,
	Jens Axboe <axboe@kernel.dk>,
	linux-block <linux-block@vger.kernel.org>,
	linux-mmc <linux-mmc@vger.kernel.org>,
	linux-mtd@lists.infradead.org, Pavel Machek <pavel@ucw.cz>,
	Richard Weinberger <richard@nod.at>,
	Adrian Hunter <adrian.hunter@intel.com>, Jan Kara <jack@suse.cz>,
	aherrmann@suse.com, mgorman@suse.com,
	Chunyan Zhang <zhang.chunyan@linaro.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	bfq-iosched@googlegroups.com, oleksandr@natalenko.name,
	Mark Brown <broonie@kernel.org>
Subject: Re: [PATCH] block: BFQ default for single queue devices
Date: Thu, 4 Oct 2018 11:56:17 +0200
Message-ID: <CAPDyKFqJMJkmumxS3vET6kvjnGGMrHaDpUT20LDAgzygbFmZHw@mail.gmail.com>
In-Reply-To: <CAHhmqcRbz+UsgDsHzJzKDWqZ_JMNp1UYq3M2rAf1vqfmO+wRyg@mail.gmail.com>

On 3 October 2018 at 19:34, Bryan Gurney <bgurney@redhat.com> wrote:
> On Wed, Oct 3, 2018 at 11:53 AM, Paolo Valente <paolo.valente@linaro.org> wrote:
>>
>>
>>> On 3 Oct 2018, at 10:28, Linus Walleij <linus.walleij@linaro.org> wrote:
>>>
>>> On Wed, Oct 3, 2018 at 9:42 AM Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
>>>
>>>> There is another class of outliers: host-managed SMR disks (SATA and SCSI,
>>>> definitely single hw queue). For these, using mq-deadline is mandatory in many
>>>> cases in order to guarantee sequential write command delivery to the device
>>>> driver. Having the default changed to bfq, which as far as I know is not SMR
>>>> friendly (can sequential writes within a single zone be reordered?), is asking
>>>> for trouble (unaligned write errors showing up).
>>>
>>> Ah, that is interesting.
>>>
>>> Which device driver files are we talking about here, specifically?
>>> I'd like to take a look.
>>>
>>> I guess you are not looking for deadline scheduling per se (as in,
>>> deadline scheduling is nice); what you want is the zone locking
>>> semantics in that scheduler, is that right?
>>>
>>> I.e. this business:
>>> blk_queue_is_zoned(q)
>>> blk_req_zone_write_lock(rq);
>>> blk_req_zone_write_unlock(rq);
>>> and mq-deadline solves this with a spinlock.
>>>
>>> I will augment the patch to enforce mq-deadline
>>> if blk_queue_is_zoned(q) is true, as it is clear that
>>> any device with that characteristic must use mq-deadline.
>>>
>>> Paolo might be interested in looking into whether BFQ could
>>> also handle zoned devices in the future; I have no idea how
>>> hard that would be.
>>>
>>
>> Absolutely, as I already wrote in my reply to Damien.
>>
>> In the meantime, Linus, augmenting your patch as you propose seems
>> a clean and effective solution to me.
>>
>> Thanks,
>> Paolo
>>
>>> The zoned business seems a bit fragile. Should it even be
>>> possible to select any scheduler other than deadline on these
>>> devices? Presenting all compiled-in schedulers in
>>> /sys/block/<device>/queue/scheduler sounds like just giving
>>> sysadmins too much rope.
>>>
>>> Yours,
>>> Linus Walleij
>>
>
> Right now, users of host-managed SMR drives should be using "deadline"
> or "mq-deadline", to avoid out-of-order writes in sequential-only
> zones.
>
> I'm running into a situation right now on a test system (Fedora 28,
> 4.18.7 kernel) where I copied test data onto an F2FS filesystem, but I
> accidentally forgot to add my "udev rule" file:
>
> # cat /etc/udev/rules.d/99-zoned-block-devices.rules
> ACTION=="add|change", KERNEL=="sd[a-z]", ATTRS{queue/zoned}=="host-managed", ATTR{queue/scheduler}="deadline"
>
> ...and now, I see these messages when that specific SMR drive is mounted:
>
> kernel: F2FS-fs (sdc): IO Block Size:        4 KB
> kernel: F2FS-fs (sdc): Found nat_bits in checkpoint
> kernel: F2FS-fs (sdc): Mounted with checkpoint version = 212216ab
> kernel: mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
> kernel: mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
> kernel: scsi_io_completion: 20 callbacks suppressed
> kernel: sd 7:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> kernel: sd 7:0:0:0: [sdb] tag#0 Sense Key : Aborted Command [current]
> kernel: sd 7:0:0:0: [sdb] tag#0 Add. Sense: No additional sense information
> kernel: sd 7:0:0:0: [sdb] tag#0 CDB: Write(16) 8a 00 00 00 00 00 3d d4 ec 99 00 00 00 80 00 00
>
> I was also running into problems with creating new directories on this
> F2FS filesystem.  However, "fsck.f2fs" reports no problems.  So at
> this point, I created a new F2FS filesystem on a second SMR drive, and
> am currently copying the data from the "bad" F2FS filesystem to the
> "good" one.
>
> I wouldn't call zoned block devices "fragile"; they simply have I/O
> rules that didn't previously exist: all writes to sequential-only
> zones must be sequential.  And one of the things that schedulers do is
> reorder writes.  After 4.16, sd stopped being the "gatekeeper" of
> ensuring sequential writes, but the only "zoned-aware" schedulers were
> deadline and mq-deadline.  Since my test system defaulted to "cfq", I
> ran into problems.
>
> So I welcome any changes that make it impossible for the user to
> "accidentally use the wrong scheduler".

I fully agree.
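
To make the failure mode concrete, here is a toy userspace model of the
rule Bryan describes: a write to a sequential-only zone is accepted only
if it starts exactly at the zone's write pointer. This is only a sketch,
not kernel or drive firmware code; struct toy_zone and toy_zone_write()
are made-up names for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one sequential-write-required zone (hypothetical names). */
struct toy_zone {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone length in sectors */
	uint64_t wp;	/* write pointer: next sector that may be written */
};

/*
 * Accept a write only if it starts at the write pointer and stays
 * inside the zone. On success the write pointer advances by nr sectors.
 */
bool toy_zone_write(struct toy_zone *z, uint64_t sector, uint64_t nr)
{
	if (sector != z->wp)
		return false;	/* out of order: the write is rejected */
	if (nr > z->start + z->len - z->wp)
		return false;	/* would cross the end of the zone */
	z->wp += nr;
	return true;
}
```

With a 256-sector zone, a write at sector 128 that is dispatched before
the write at sector 0 is rejected, which is the kind of reordering a
scheduler that is not zone-aware can introduce.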

>
> At least this time, I didn't "brick" my test system's BIOS, like I did
> back in May of this year [1].

It sounds to me like the kernel isn't doing its job. In particular,
the kernel has the information it needs to select the proper
I/O scheduler (the block layer could just check
BLK_ZONE_TYPE_SEQWRITE_REQ/ZBC_ZONE_TYPE_SEQWRITE_REQ). Instead it
relies on userspace to do the right thing, which can't be right.
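
As a rough illustration of the policy discussed above (not actual block
layer code), the choice could be modelled like this in plain C. struct
toy_queue and the toy_* functions are hypothetical names; returning
"none" for multi-queue devices is an assumption of the sketch, and the
"bfq" case is simply the default proposed in this patch.

```c
#include <stdbool.h>
#include <string.h>

/* Minimal stand-in for the queue properties that matter here. */
struct toy_queue {
	bool zoned;			/* models blk_queue_is_zoned(q) */
	unsigned int nr_hw_queues;	/* number of hardware queues */
};

/* Pick a default scheduler name from what the kernel already knows. */
const char *toy_default_elevator(const struct toy_queue *q)
{
	if (q->zoned)
		return "mq-deadline";	/* zone write locking is required */
	if (q->nr_hw_queues == 1)
		return "bfq";		/* default proposed by the patch */
	return "none";			/* assumption for multi-queue devices */
}

/* Refuse schedulers that are not zone-aware on zoned queues. */
bool toy_elevator_allowed(const struct toy_queue *q, const char *name)
{
	if (!q->zoned)
		return true;
	return strcmp(name, "mq-deadline") == 0 ||
	       strcmp(name, "deadline") == 0;
}
```

The second helper corresponds to Linus' question of whether other
schedulers should be selectable at all on zoned devices; here a request
for "bfq" on a zoned queue would simply be refused.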

Kind regards
Uffe

