* [LSF/MM TOPIC] Two blk-mq related topics
@ 2018-01-29 15:46 Ming Lei
2018-01-29 20:40 ` Mike Snitzer
2018-01-29 20:56 ` James Bottomley
0 siblings, 2 replies; 13+ messages in thread
From: Ming Lei @ 2018-01-29 15:46 UTC (permalink / raw)
To: lsf-pc, Linux-scsi, linux-block, linux-nvme
Hi guys,
Two blk-mq related topics
1. blk-mq vs. CPU hotplug & IRQ vectors spread on CPUs
We have made three big changes in this area so far; each time some issues
were fixed, but new ones were introduced:
1) freeze all queues in the CPU hotplug handler
- issues: queue dependencies, such as loop-mq/dm vs. underlying queues and the
NVMe admin queue vs. namespace queues, mean that freezing all these queues in
the CPU hotplug handler may cause IO hangs.
2) IRQ vectors spread across all present CPUs
- fixes the issues in 1)
- new issues introduced: physical CPU hotplug isn't supported, and blk-mq
warnings are triggered during dispatch
3) IRQ vectors spread across all possible CPUs
- physical CPU hotplug is supported
- the warning in __blk_mq_run_hw_queue() may still be triggered if a CPU
goes offline/online between blk_mq_hctx_next_cpu() and running
__blk_mq_run_hw_queue()
- new issues introduced: the queue mapping may be distorted completely; a
patch has been sent out (https://marc.info/?t=151603230900002&r=1&w=2), but
the approach may need further discussion. Drivers (such as NVMe) may need
to pass 'num_possible_cpus()' as the max vector count when allocating IRQ
vectors, and some drivers (NVMe) use hard-coded hw queue indexes directly,
which becomes very fragile since a hw queue may be inactive from the
beginning.
Also, starting from 2), another issue is that an IO completion may never be
delivered: IO may be dispatched to a hw queue just before (or after) all
CPUs mapped to that hctx go offline, after which the IRQ vector of the hw
queue can be shut down. For now we seem to depend on the timeout handler to
deal with this situation; is there a better way to solve this issue?
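To make the inactive-hctx hazard above concrete, here is a plain userspace
sketch (Python, not kernel code; the round-robin policy is a simplification
of what blk_mq_map_queues() does, and the helper names are illustrative):

```python
def map_queues(possible_cpus, nr_hw_queues):
    """Round-robin each possible CPU onto a hardware queue index
    (a simplified stand-in for the kernel's queue mapping)."""
    return {cpu: cpu % nr_hw_queues for cpu in possible_cpus}

def inactive_hctxs(mapping, online_cpus, nr_hw_queues):
    """Hardware queues whose mapped CPUs are all offline."""
    active = {mapping[cpu] for cpu in online_cpus}
    return sorted(set(range(nr_hw_queues)) - active)

# 8 possible CPUs, queue count sized to num_possible_cpus(),
# but only CPUs 0-3 are physically present/online.
possible = list(range(8))
online = [0, 1, 2, 3]
mapping = map_queues(possible, nr_hw_queues=8)

# Queues 4-7 are mapped only to offline CPUs: any IO dispatched to
# them has no CPU available to receive its completion interrupt.
print(inactive_hctxs(mapping, online, 8))  # -> [4, 5, 6, 7]
```

This is why a driver that indexes hw queues by a hard-coded number can end
up on a queue that is inactive from the beginning.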
2. When to enable SCSI_MQ by default again?
SCSI_MQ was first enabled in v3.17, but disabled by default. In v4.13-rc1 it
was enabled by default, but the patch was reverted in v4.13-rc7 and it became
disabled by default again.
Now both the originally reported PM issue (actually SCSI quiesce) and the
sequential IO performance issue have been addressed, and the MQ IO schedulers
are ready for traditional disks too. Are there other issues to be addressed
before enabling SCSI_MQ by default? When can we do that again?
Last time, the two issues were reported during the v4.13 dev cycle just after
it was enabled by default; it seems that unless SCSI_MQ is enabled by default,
it won't be exercised and tested completely.
So if we keep it disabled by default, it may never be exposed to full
test/production environments.
Thanks,
Ming
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-29 15:46 [LSF/MM TOPIC] Two blk-mq related topics Ming Lei
@ 2018-01-29 20:40 ` Mike Snitzer
2018-01-30 1:27 ` [Lsf-pc] " Ming Lei
2018-01-29 20:56 ` James Bottomley
1 sibling, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2018-01-29 20:40 UTC (permalink / raw)
To: Ming Lei; +Cc: lsf-pc, Linux-scsi, linux-block, linux-nvme
On Mon, Jan 29 2018 at 10:46am -0500,
Ming Lei <ming.lei@redhat.com> wrote:
> 2. When to enable SCSI_MQ by default again?
>
> SCSI_MQ was first enabled in v3.17, but disabled by default. In v4.13-rc1 it
> was enabled by default, but the patch was reverted in v4.13-rc7 and it became
> disabled by default again.
>
> Now both the originally reported PM issue (actually SCSI quiesce) and the
> sequential IO performance issue have been addressed, and the MQ IO schedulers
> are ready for traditional disks too. Are there other issues to be addressed
> before enabling SCSI_MQ by default? When can we do that again?
>
> Last time, the two issues were reported during the v4.13 dev cycle just after
> it was enabled by default; it seems that unless SCSI_MQ is enabled by default,
> it won't be exercised and tested completely.
>
> So if we keep it disabled by default, it may never be exposed to full
> test/production environments.
I was going to propose revisiting this as well.
I'd really like to see all the old .request_fn block core code removed.
But maybe we take a first step of enabling:
CONFIG_SCSI_MQ_DEFAULT=Y
CONFIG_DM_MQ_DEFAULT=Y
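For what it's worth, the same behavior can also be selected per boot without
a rebuild; a sketch of the usual knobs (names as of the 4.x era, verify
against your kernel):

```shell
# Kernel command line: opt in to the mq paths for one boot.
#   scsi_mod.use_blk_mq=1 dm_mod.use_blk_mq=1

# Check whether scsi-mq is currently active:
cat /sys/module/scsi_mod/parameters/use_blk_mq

# On an mq device the scheduler list shows mq schedulers
# (e.g. "[mq-deadline] kyber none" rather than "cfq deadline noop"):
cat /sys/block/sda/queue/scheduler
```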
Thanks,
Mike
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-29 15:46 [LSF/MM TOPIC] Two blk-mq related topics Ming Lei
2018-01-29 20:40 ` Mike Snitzer
@ 2018-01-29 20:56 ` James Bottomley
2018-01-29 21:00 ` Jens Axboe
2018-01-30 1:24 ` Ming Lei
1 sibling, 2 replies; 13+ messages in thread
From: James Bottomley @ 2018-01-29 20:56 UTC (permalink / raw)
To: Ming Lei, lsf-pc, Linux-scsi, linux-block, linux-nvme
On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote:
[...]
> 2. When to enable SCSI_MQ by default again?
I'm not sure there's much to discuss ... I think the basic answer is as
soon as Christoph wants to try it again.
> SCSI_MQ was first enabled in v3.17, but disabled by default. In
> v4.13-rc1 it was enabled by default, but the patch was reverted in
> v4.13-rc7 and it became disabled by default again.
>
> Now both the originally reported PM issue (actually SCSI quiesce) and
> the sequential IO performance issue have been addressed.
Is the blocker bug just not closed because no-one thought to do it:
https://bugzilla.kernel.org/show_bug.cgi?id=178381
(we have confirmed that this issue is now fixed with the original
reporter?)
And did the Huawei guy (Jonathan Cameron) confirm his performance issue
was fixed (I don't think I saw email that he did)?
James
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-29 20:56 ` James Bottomley
@ 2018-01-29 21:00 ` Jens Axboe
2018-01-29 23:46 ` James Bottomley
2018-01-30 10:08 ` Johannes Thumshirn
2018-01-30 1:24 ` Ming Lei
1 sibling, 2 replies; 13+ messages in thread
From: Jens Axboe @ 2018-01-29 21:00 UTC (permalink / raw)
To: James Bottomley, Ming Lei, lsf-pc, Linux-scsi, linux-block,
linux-nvme
On 1/29/18 1:56 PM, James Bottomley wrote:
> On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote:
> [...]
>> 2. When to enable SCSI_MQ by default again?
>
> I'm not sure there's much to discuss ... I think the basic answer is as
> soon as Christoph wants to try it again.
FWIW, internally I've been running various IO intensive workloads on
what is essentially 4.12 upstream with scsi-mq the default (with
mq-deadline as the scheduler) and comparing IO workloads with a
previous 4.6 kernel (without scsi-mq), and things are looking
great.
We're never going to iron out the last kinks with it being off
by default, I think we should attempt to flip the switch again
for 4.16.
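(Not the exact workloads referred to above, but for anyone who wants to run
a comparable A/B test, a minimal fio job of this shape, run once with
scsi_mod.use_blk_mq=0 and once with =1, is a reasonable starting point; the
target device is a placeholder:)

```ini
; Hypothetical fio job for comparing the legacy path vs. scsi-mq on a
; single device; /dev/sdX is a placeholder for a scratch disk.
[global]
ioengine=libaio
direct=1
runtime=60
time_based=1
group_reporting=1
filename=/dev/sdX

[seq-read]
rw=read
bs=128k
iodepth=32
numjobs=4
```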
--
Jens Axboe
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-29 21:00 ` Jens Axboe
@ 2018-01-29 23:46 ` James Bottomley
2018-01-30 1:47 ` Jens Axboe
2018-01-30 10:08 ` Johannes Thumshirn
1 sibling, 1 reply; 13+ messages in thread
From: James Bottomley @ 2018-01-29 23:46 UTC (permalink / raw)
To: Jens Axboe, Ming Lei, lsf-pc, Linux-scsi, linux-block, linux-nvme
On Mon, 2018-01-29 at 14:00 -0700, Jens Axboe wrote:
> On 1/29/18 1:56 PM, James Bottomley wrote:
> >
> > On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote:
> > [...]
> > >
> > > 2. When to enable SCSI_MQ by default again?
> >
> > I'm not sure there's much to discuss ... I think the basic answer
> > is as soon as Christoph wants to try it again.
>
> FWIW, internally I've been running various IO intensive workloads on
> what is essentially 4.12 upstream with scsi-mq the default (with
> mq-deadline as the scheduler) and comparing IO workloads with a
> previous 4.6 kernel (without scsi-mq), and things are looking
> great.
>
> We're never going to iron out the last kinks with it being off
> by default, I think we should attempt to flip the switch again
> for 4.16.
Absolutely, I agree we turn it on ASAP. I just don't want to be on the
receiving end of Linus' flamethrower because a bug we already had
reported against scsi-mq caused problems. Get confirmation from the
original reporters (or as close to it as you can) that their problems
are fixed and we're good to go; he won't kick us nearly as hard for new
bugs that turn up.
James
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-29 20:56 ` James Bottomley
2018-01-29 21:00 ` Jens Axboe
@ 2018-01-30 1:24 ` Ming Lei
2018-01-30 8:33 ` [Lsf-pc] " Martin Steigerwald
2018-01-30 10:33 ` John Garry
1 sibling, 2 replies; 13+ messages in thread
From: Ming Lei @ 2018-01-30 1:24 UTC (permalink / raw)
To: James Bottomley, John Garry; +Cc: lsf-pc, Linux-scsi, linux-block, linux-nvme
On Mon, Jan 29, 2018 at 12:56:30PM -0800, James Bottomley wrote:
> On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote:
> [...]
> > 2. When to enable SCSI_MQ by default again?
>
> I'm not sure there's much to discuss ... I think the basic answer is as
> soon as Christoph wants to try it again.
I guess Christoph still needs to evaluate whether there are existing issues
or blockers before trying it again. And more input may come from F2F
discussion, IMHO.
>
> > SCSI_MQ was first enabled in v3.17, but disabled by default. In
> > v4.13-rc1 it was enabled by default, but the patch was reverted in
> > v4.13-rc7 and it became disabled by default again.
> >
> > Now both the originally reported PM issue (actually SCSI quiesce) and
> > the sequential IO performance issue have been addressed.
>
> Is the blocker bug just not closed because no-one thought to do it:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=178381
>
> (we have confirmed that this issue is now fixed with the original
> reporter?)
From a developer's view, this issue was fixed by the following commit:
3a0a52997 ("block, scsi: Make SCSI quiesce and resume work reliably"),
and the fix has been verified by the reporter on the kernel list.
>
> And did the Huawei guy (Jonathan Cameron) confirm his performance issue
> was fixed (I don't think I saw email that he did)?
Last time I talked with John Garry about the issue, the merged .get_budget
based patch improved IO performance a lot, but there is still a small gap
compared with the legacy path. It seems to be a driver-specific issue; I
remember that removing a lock in the driver improved performance considerably.
Garry, could you provide a further update on this issue?
Thanks,
Ming
* Re: [Lsf-pc] [LSF/MM TOPIC] Two blk-mq related topics
2018-01-29 20:40 ` Mike Snitzer
@ 2018-01-30 1:27 ` Ming Lei
0 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2018-01-30 1:27 UTC (permalink / raw)
To: Mike Snitzer; +Cc: linux-block, lsf-pc, linux-nvme, Linux-scsi
On Mon, Jan 29, 2018 at 03:40:31PM -0500, Mike Snitzer wrote:
> On Mon, Jan 29 2018 at 10:46am -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
>
> > 2. When to enable SCSI_MQ by default again?
> >
> > SCSI_MQ was first enabled in v3.17, but disabled by default. In v4.13-rc1 it
> > was enabled by default, but the patch was reverted in v4.13-rc7 and it became
> > disabled by default again.
> >
> > Now both the originally reported PM issue (actually SCSI quiesce) and the
> > sequential IO performance issue have been addressed, and the MQ IO schedulers
> > are ready for traditional disks too. Are there other issues to be addressed
> > before enabling SCSI_MQ by default? When can we do that again?
> >
> > Last time, the two issues were reported during the v4.13 dev cycle just after
> > it was enabled by default; it seems that unless SCSI_MQ is enabled by default,
> > it won't be exercised and tested completely.
> >
> > So if we keep it disabled by default, it may never be exposed to full
> > test/production environments.
>
> I was going to propose revisiting this as well.
>
> I'd really like to see all the old .request_fn block core code removed.
Yeah, that should be the final goal, but it may take a while.
>
> But maybe we take a first step of enabling:
> CONFIG_SCSI_MQ_DEFAULT=Y
> CONFIG_DM_MQ_DEFAULT=Y
Maybe you could remove the legacy path from DM_RQ first, and take your
original approach of allowing DM/MQ over legacy underlying drivers;
it seems we discussed this topic before. :-)
Thanks,
Ming
_______________________________________________
Lsf-pc mailing list
Lsf-pc@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lsf-pc
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-29 23:46 ` James Bottomley
@ 2018-01-30 1:47 ` Jens Axboe
0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2018-01-30 1:47 UTC (permalink / raw)
To: James Bottomley, Ming Lei, lsf-pc, Linux-scsi, linux-block,
linux-nvme
On 1/29/18 4:46 PM, James Bottomley wrote:
> On Mon, 2018-01-29 at 14:00 -0700, Jens Axboe wrote:
>> On 1/29/18 1:56 PM, James Bottomley wrote:
>>>
>>> On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote:
>>> [...]
>>>>
>>>> 2. When to enable SCSI_MQ by default again?
>>>
>>> I'm not sure there's much to discuss ... I think the basic answer
>>> is as soon as Christoph wants to try it again.
>>
>> FWIW, internally I've been running various IO intensive workloads on
>> what is essentially 4.12 upstream with scsi-mq the default (with
>> mq-deadline as the scheduler) and comparing IO workloads with a
>> previous 4.6 kernel (without scsi-mq), and things are looking
>> great.
>>
>> We're never going to iron out the last kinks with it being off
>> by default, I think we should attempt to flip the switch again
>> for 4.16.
>
> Absolutely, I agree we turn it on ASAP. I just don't want to be on the
> receiving end of Linus' flamethrower because a bug we already had
> reported against scsi-mq caused problems. Get confirmation from the
> original reporters (or as close to it as you can) that their problems
> are fixed and we're good to go; he won't kick us nearly as hard for new
> bugs that turn up.
I agree, the functional issues definitely have to be verified to be
resolved. Various performance hitches we can dive into if they
crop up, but reintroducing some random suspend regression is not
acceptable.
--
Jens Axboe
* Re: [Lsf-pc] [LSF/MM TOPIC] Two blk-mq related topics
2018-01-30 1:24 ` Ming Lei
@ 2018-01-30 8:33 ` Martin Steigerwald
2018-01-30 10:33 ` John Garry
1 sibling, 0 replies; 13+ messages in thread
From: Martin Steigerwald @ 2018-01-30 8:33 UTC (permalink / raw)
To: Ming Lei
Cc: linux-block, Linux-scsi, John Garry, linux-nvme, James Bottomley,
lsf-pc
Ming Lei - 30.01.18, 02:24:
> > > SCSI_MQ was first enabled in v3.17, but disabled by default. In
> > > v4.13-rc1 it was enabled by default, but the patch was reverted in
> > > v4.13-rc7 and it became disabled by default again.
> > >
> > > Now both the originally reported PM issue (actually SCSI quiesce) and
> > > the sequential IO performance issue have been addressed.
> >
> > Is the blocker bug just not closed because no-one thought to do it:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=178381
> >
> > (we have confirmed that this issue is now fixed with the original
> > reporter?)
>
> From a developer's view, this issue was fixed by the following commit:
> 3a0a52997 ("block, scsi: Make SCSI quiesce and resume work reliably"),
> and the fix has been verified by the reporter on the kernel list.

I have never seen any suspend/hibernate related issues with blk-mq + bfq
since then, using a heavily utilized BTRFS dual-SSD RAID 1.

% egrep "MQ|BFQ" /boot/config-4.15.0-tp520-btrfstrim+
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_BLK_WBT_MQ=y
CONFIG_BLK_MQ_PCI=y
CONFIG_BLK_MQ_VIRTIO=y
CONFIG_MQ_IOSCHED_DEADLINE=m
CONFIG_MQ_IOSCHED_KYBER=m
CONFIG_IOSCHED_BFQ=m
CONFIG_BFQ_GROUP_IOSCHED=y
CONFIG_NET_SCH_MQPRIO=m
# CONFIG_SCSI_MQ_DEFAULT is not set
# CONFIG_DM_MQ_DEFAULT is not set
CONFIG_DM_CACHE_SMQ=m

% cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.15.0-tp520-btrfstrim+ root=UUID=[…] ro
rootflags=subvol=debian resume=/dev/mapper/sata-swap init=/bin/systemd
thinkpad_acpi.fan_control=1 systemd.restore_state=0 scsi_mod.use_blk_mq=1

% cat /sys/block/sda/queue/scheduler
[bfq] none

Thanks,
--
Martin
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-29 21:00 ` Jens Axboe
2018-01-29 23:46 ` James Bottomley
@ 2018-01-30 10:08 ` Johannes Thumshirn
2018-01-30 10:50 ` Mel Gorman
1 sibling, 1 reply; 13+ messages in thread
From: Johannes Thumshirn @ 2018-01-30 10:08 UTC (permalink / raw)
To: Jens Axboe
Cc: James Bottomley, Ming Lei, lsf-pc, Linux-scsi, linux-block,
linux-nvme, Mel Gorman
[+Cc Mel]
Jens Axboe <axboe@kernel.dk> writes:
> On 1/29/18 1:56 PM, James Bottomley wrote:
>> On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote:
>> [...]
>>> 2. When to enable SCSI_MQ by default again?
>>
>> I'm not sure there's much to discuss ... I think the basic answer is as
>> soon as Christoph wants to try it again.
>
> FWIW, internally I've been running various IO intensive workloads on
> what is essentially 4.12 upstream with scsi-mq the default (with
> mq-deadline as the scheduler) and comparing IO workloads with a
> previous 4.6 kernel (without scsi-mq), and things are looking
> great.
>
> We're never going to iron out the last kinks with it being off
> by default, I think we should attempt to flip the switch again
> for 4.16.
The 4.12 results sound interesting. I remember Mel ran some tests with 4.12
as we were considering flipping the config option for SLES, and it showed
several roadblocks.
I'm not sure whether he re-evaluated 4.13/4.14 on his grid though.
But I'm definitely interested in this discussion and could possibly
share some benchmark results we did in our FC lab.
Byte,
Johannes
-- 
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-30 1:24 ` Ming Lei
2018-01-30 8:33 ` [Lsf-pc] " Martin Steigerwald
@ 2018-01-30 10:33 ` John Garry
2018-02-07 10:55 ` John Garry
1 sibling, 1 reply; 13+ messages in thread
From: John Garry @ 2018-01-30 10:33 UTC (permalink / raw)
To: Ming Lei, James Bottomley
Cc: lsf-pc, Linux-scsi, linux-block, linux-nvme, Linuxarm
On 30/01/2018 01:24, Ming Lei wrote:
> On Mon, Jan 29, 2018 at 12:56:30PM -0800, James Bottomley wrote:
>> On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote:
>> [...]
>>> 2. When to enable SCSI_MQ by default again?
>>
>> I'm not sure there's much to discuss ... I think the basic answer is as
>> soon as Christoph wants to try it again.
>
> I guess Christoph still needs to evaluate whether there are existing issues
> or blockers before trying it again. And more input may come from F2F
> discussion, IMHO.
>
>>
>>> SCSI_MQ was first enabled in v3.17, but disabled by default. In
>>> v4.13-rc1 it was enabled by default, but the patch was reverted in
>>> v4.13-rc7 and it became disabled by default again.
>>>
>>> Now both the originally reported PM issue (actually SCSI quiesce) and
>>> the sequential IO performance issue have been addressed.
>>
>> Is the blocker bug just not closed because no-one thought to do it:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=178381
>>
>> (we have confirmed that this issue is now fixed with the original
>> reporter?)
>
> From a developer's view, this issue was fixed by the following commit:
> 3a0a52997 ("block, scsi: Make SCSI quiesce and resume work reliably"),
> and the fix has been verified by the reporter on the kernel list.
>
>>
>> And did the Huawei guy (Jonathan Cameron) confirm his performance issue
>> was fixed (I don't think I saw email that he did)?
>
> Last time I talked with John Garry about the issue, the merged .get_budget
> based patch improved IO performance a lot, but there is still a small gap
> compared with the legacy path. It seems to be a driver-specific issue; I
> remember that removing a lock in the driver improved performance considerably.
>
> Garry, could you provide a further update on this issue?
Hi Ming,
From our testing with experimental changes to our driver to support
SCSI MQ, we were getting almost on-par performance with the legacy path.
But without these changes, MQ was hurting performance (and I would not
necessarily say it was a driver issue).
We can retest from today's mainline and see where we are.
BTW, have you got performance figures for many other single-queue HBAs
with and without CONFIG_SCSI_MQ_DEFAULT=Y?
Thanks,
John
>
> Thanks,
> Ming
>
> .
>
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-30 10:08 ` Johannes Thumshirn
@ 2018-01-30 10:50 ` Mel Gorman
0 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2018-01-30 10:50 UTC (permalink / raw)
To: Johannes Thumshirn
Cc: Jens Axboe, James Bottomley, Ming Lei, lsf-pc, Linux-scsi,
linux-block, linux-nvme
On Tue, Jan 30, 2018 at 11:08:28AM +0100, Johannes Thumshirn wrote:
> [+Cc Mel]
> Jens Axboe <axboe@kernel.dk> writes:
> > On 1/29/18 1:56 PM, James Bottomley wrote:
> >> On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote:
> >> [...]
> >>> 2. When to enable SCSI_MQ by default again?
> >>
> >> I'm not sure there's much to discuss ... I think the basic answer is as
> >> soon as Christoph wants to try it again.
> >
> > FWIW, internally I've been running various IO intensive workloads on
> > what is essentially 4.12 upstream with scsi-mq the default (with
> > mq-deadline as the scheduler) and comparing IO workloads with a
> > previous 4.6 kernel (without scsi-mq), and things are looking
> > great.
> >
> > We're never going to iron out the last kinks with it being off
> > by default, I think we should attempt to flip the switch again
> > for 4.16.
>
> The 4.12 results sound interesting. I remember Mel ran some tests with 4.12
> as we were considering flipping the config option for SLES, and it showed
> several roadblocks.
>
Mostly due to slow storage and BFQ where mq-deadline was not a universal
win as an alternative default. I don't have current data and I archived
what I had, but it was based on 4.13-rc7 at the time and BFQ has changed
a lot since so it would need to be redone.
> I'm not sure whether he re-evaluated 4.13/4.14 on his grid though.
>
No, it hasn't. Grid time for performance testing has been tight during
the last few months to say the least.
> But I'm definitely interested in this discussion and could possibly
> share some benchmark results we did in our FC lab.
>
If you remind me, I may be able to re-execute the tests in a 4.16-rcX
before LSF/MM so you have other data to work with. Unfortunately, I'll
not be able to make LSF/MM this time due to personal commitments that
conflict and are unmovable.
--
Mel Gorman
SUSE Labs
* Re: [LSF/MM TOPIC] Two blk-mq related topics
2018-01-30 10:33 ` John Garry
@ 2018-02-07 10:55 ` John Garry
0 siblings, 0 replies; 13+ messages in thread
From: John Garry @ 2018-02-07 10:55 UTC (permalink / raw)
To: Ming Lei, James Bottomley
Cc: linux-block, lsf-pc, linux-nvme, Linux-scsi, Linuxarm
On 30/01/2018 10:33, John Garry wrote:
> On 30/01/2018 01:24, Ming Lei wrote:
>> On Mon, Jan 29, 2018 at 12:56:30PM -0800, James Bottomley wrote:
>>> On Mon, 2018-01-29 at 23:46 +0800, Ming Lei wrote:
>>> [...]
>>>> 2. When to enable SCSI_MQ by default again?
>>>
>>> I'm not sure there's much to discuss ... I think the basic answer is as
>>> soon as Christoph wants to try it again.
>>
>> I guess Christoph still needs to evaluate whether there are existing issues
>> or blockers before trying it again. And more input may come from F2F
>> discussion, IMHO.
>>
>>>
>>>> SCSI_MQ was first enabled in v3.17, but disabled by default. In
>>>> v4.13-rc1 it was enabled by default, but the patch was reverted in
>>>> v4.13-rc7 and it became disabled by default again.
>>>>
>>>> Now both the originally reported PM issue (actually SCSI quiesce) and
>>>> the sequential IO performance issue have been addressed.
>>>
>>> Is the blocker bug just not closed because no-one thought to do it:
>>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=178381
>>>
>>> (we have confirmed that this issue is now fixed with the original
>>> reporter?)
>>
>> From a developer's view, this issue was fixed by the following commit:
>> 3a0a52997 ("block, scsi: Make SCSI quiesce and resume work reliably"),
>> and the fix has been verified by the reporter on the kernel list.
>>
>>>
>>> And did the Huawei guy (Jonathan Cameron) confirm his performance issue
>>> was fixed (I don't think I saw email that he did)?
>>
>> Last time I talked with John Garry about the issue, the merged .get_budget
>> based patch improved IO performance a lot, but there is still a small gap
>> compared with the legacy path. It seems to be a driver-specific issue; I
>> remember that removing a lock in the driver improved performance considerably.
>>
>> Garry, could you provide a further update on this issue?
>
> Hi Ming,
>
> From our testing with experimental changes to our driver to support
> SCSI MQ, we were getting almost on-par performance with the legacy path.
> But without these changes, MQ was hurting performance (and I would not
> necessarily say it was a driver issue).
>
> We can retest from today's mainline and see where we are.
>
> BTW, have you got performance figures for many other single-queue HBAs
> with and without CONFIG_SCSI_MQ_DEFAULT=Y?
We finally got around to retesting this (on the hisi_sas controller). The
results are generally OK, in that we are no longer seeing such big
performance drops on our hardware from enabling SCSI MQ - in some
scenarios the performance is even better (generally the fio rw modes do
better). Anyway, for what it's worth, it's a green light from us to turn
SCSI MQ on by default.
John
>
> Thanks,
> John
>
>>
>> Thanks,
>> Ming
>>
>> .
>>
>
>
> _______________________________________________
> Linuxarm mailing list
> Linuxarm@huawei.com
> http://hulk.huawei.com/mailman/listinfo/linuxarm
>
> .
>
Thread overview: 13+ messages
2018-01-29 15:46 [LSF/MM TOPIC] Two blk-mq related topics Ming Lei
2018-01-29 20:40 ` Mike Snitzer
2018-01-30 1:27 ` [Lsf-pc] " Ming Lei
2018-01-29 20:56 ` James Bottomley
2018-01-29 21:00 ` Jens Axboe
2018-01-29 23:46 ` James Bottomley
2018-01-30 1:47 ` Jens Axboe
2018-01-30 10:08 ` Johannes Thumshirn
2018-01-30 10:50 ` Mel Gorman
2018-01-30 1:24 ` Ming Lei
2018-01-30 8:33 ` [Lsf-pc] " Martin Steigerwald
2018-01-30 10:33 ` John Garry
2018-02-07 10:55 ` John Garry