* Re: [Lsf] [LSF/MM TOPIC] block-mq issues with FC
2016-04-08 17:40 ` Matthew Wilcox
@ 2016-04-08 18:00 ` James Bottomley
2016-04-08 18:08 ` Christoph Hellwig
2016-04-08 18:06 ` Keith Busch
` (3 subsequent siblings)
4 siblings, 1 reply; 16+ messages in thread
From: James Bottomley @ 2016-04-08 18:00 UTC (permalink / raw)
To: Matthew Wilcox, Hannes Reinecke
Cc: lsf, linux-block@vger.kernel.org, Jens Axboe, Christoph Hellwig,
SCSI Mailing List
On Fri, 2016-04-08 at 13:40 -0400, Matthew Wilcox wrote:
> On Fri, Apr 08, 2016 at 01:29:26PM +0200, Hannes Reinecke wrote:
> > I'd like to propose a topic on block-mq issues with FC.
> > During my performance testing using block/scsi-mq with FC I've hit
> > several issues I'd like to discuss:
>
> If there's a general block-mq bitching session, I have some ideas :-)
"Block mq bitching session" is going to look a bit bad on the public
schedule, what about "Block MQ implementor feedback"?
> - Inability to use all queues supported by a device. Intel's P3700
> supports 31 queues, but block-mq insists on assigning an even multiple
> of CPUs to each queue. So if you have 48 CPUs, it will use 24 queues.
> If you have 128 CPUs, it will only use 16 of the queues.
>
> - Interrupt steering needs to be controlled by block-mq instead of
> the driver. It's pointless to have each driver implement its own
> policies on interrupt steering, irqbalanced remains a source of
> end-user frustration, and block-mq can change the queue<->cpu mapping
> without the driver's knowledge.
>
> (thanks to Keith for his input on the first and suggestion of the second).
OK, what about two sessions, one for general bitching (the feedback
sessions) and one for concrete proposals for improvements ... so rather
than just complaining about the problem, if you have concrete ideas
about fixing it, that would go into the second session.
James
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Lsf] [LSF/MM TOPIC] block-mq issues with FC
2016-04-08 18:00 ` James Bottomley
@ 2016-04-08 18:08 ` Christoph Hellwig
2016-04-08 18:24 ` James Bottomley
0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2016-04-08 18:08 UTC (permalink / raw)
To: James Bottomley
Cc: Matthew Wilcox, Hannes Reinecke, lsf, linux-block@vger.kernel.org,
Jens Axboe, Christoph Hellwig, SCSI Mailing List
On Fri, Apr 08, 2016 at 11:00:51AM -0700, James Bottomley wrote:
> > - Inability to use all queues supported by a device. Intel's P3700
> > supports 31 queues, but block-mq insists on assigning an even multiple
> > of CPUs to each queue. So if you have 48 CPUs, it will use 24 queues.
> > If you have 128 CPUs, it will only use 16 of the queues.
> >
> > - Interrupt steering needs to be controlled by block-mq instead of
> > the driver. It's pointless to have each driver implement its own
> > policies on interrupt steering, irqbalanced remains a source of
> > end-user frustration, and block-mq can change the queue<->cpu mapping
> > without the driver's knowledge.
> >
> > (thanks to Keith for his input on the first and suggestion of the second).
>
> OK, what about two sessions, one for general bitching (the feedback
> sessions) and one for concrete proposals for improvements ... so rather
> than just complaining about the problem, if you have concrete ideas
> about fixing it, that would go into the second session.
We already have the blk-mq interrupt assignment session on the schedule,
which is about willy's item. And my work in progress code to address
the issue also mostly addresses his item number 1, so I think we can
just keep the schedule mostly as is and just rename "multiqueue interrupt
assignment" into "multiqueue interrupt and queue assignment".
No need to blow it up into three slots.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Lsf] [LSF/MM TOPIC] block-mq issues with FC
2016-04-08 18:08 ` Christoph Hellwig
@ 2016-04-08 18:24 ` James Bottomley
0 siblings, 0 replies; 16+ messages in thread
From: James Bottomley @ 2016-04-08 18:24 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block@vger.kernel.org, SCSI Mailing List, lsf,
Hannes Reinecke, Matthew Wilcox
On Fri, 2016-04-08 at 20:08 +0200, Christoph Hellwig wrote:
> On Fri, Apr 08, 2016 at 11:00:51AM -0700, James Bottomley wrote:
> > > - Inability to use all queues supported by a device. Intel's P3700
> > > supports 31 queues, but block-mq insists on assigning an even multiple
> > > of CPUs to each queue. So if you have 48 CPUs, it will use 24 queues.
> > > If you have 128 CPUs, it will only use 16 of the queues.
> > >
> > > - Interrupt steering needs to be controlled by block-mq instead of
> > > the driver. It's pointless to have each driver implement its own
> > > policies on interrupt steering, irqbalanced remains a source of
> > > end-user frustration, and block-mq can change the queue<->cpu mapping
> > > without the driver's knowledge.
> > >
> > > (thanks to Keith for his input on the first and suggestion of the second).
> >
> > OK, what about two sessions, one for general bitching (the feedback
> > sessions) and one for concrete proposals for improvements ... so rather
> > than just complaining about the problem, if you have concrete ideas
> > about fixing it, that would go into the second session.
>
> We already have the blk-mq interrupt assignment session on the schedule,
> which is about willy's item. And my work in progress code to address
> the issue also mostly addresses his item number 1, so I think we can
> just keep the schedule mostly as is and just rename "multiqueue interrupt
> assignment" into "multiqueue interrupt and queue assignment".
>
> No need to blow it up into three slots.
Agreed; I made the adjustments.
James
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Lsf] [LSF/MM TOPIC] block-mq issues with FC
2016-04-08 17:40 ` Matthew Wilcox
2016-04-08 18:00 ` James Bottomley
@ 2016-04-08 18:06 ` Keith Busch
2016-04-12 19:16 ` Jens Axboe
2016-04-08 18:14 ` Bart Van Assche
` (2 subsequent siblings)
4 siblings, 1 reply; 16+ messages in thread
From: Keith Busch @ 2016-04-08 18:06 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Hannes Reinecke, lsf, linux-block@vger.kernel.org, Jens Axboe,
SCSI Mailing List, Christoph Hellwig
On Fri, Apr 08, 2016 at 01:40:06PM -0400, Matthew Wilcox wrote:
> - Inability to use all queues supported by a device. Intel's P3700
> supports 31 queues, but block-mq insists on assigning an even multiple
> of CPUs to each queue. So if you have 48 CPUs, it will use 24 queues.
> If you have 128 CPUs, it will only use 16 of the queues.
While it'd be better to use all the available h/w resources, that's
actually not the worst part.
The real problems occur when there are more physical/unique CPUs than
h/w queues since blk-mq does not consider CPU topology beyond thread
siblings. With 128 CPUs, blk-mq may use all 31 queues P3700 supports,
but many CPU groups won't share a last-level-cache.
Smarter assignment would reclaim some untapped performance, and we can
share such code prior to the session.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Lsf] [LSF/MM TOPIC] block-mq issues with FC
2016-04-08 18:06 ` Keith Busch
@ 2016-04-12 19:16 ` Jens Axboe
0 siblings, 0 replies; 16+ messages in thread
From: Jens Axboe @ 2016-04-12 19:16 UTC (permalink / raw)
To: Keith Busch, Matthew Wilcox
Cc: Hannes Reinecke, lsf, linux-block@vger.kernel.org,
SCSI Mailing List, Christoph Hellwig
On 04/08/2016 12:06 PM, Keith Busch wrote:
> On Fri, Apr 08, 2016 at 01:40:06PM -0400, Matthew Wilcox wrote:
>> - Inability to use all queues supported by a device. Intel's P3700
>> supports 31 queues, but block-mq insists on assigning an even multiple
>> of CPUs to each queue. So if you have 48 CPUs, it will use 24 queues.
>> If you have 128 CPUs, it will only use 16 of the queues.
>
> While it'd be better to use all the available h/w resources, that's
> actually not the worst part.
>
> The real problems occur when there are more physical/unique CPUs than
> h/w queues since blk-mq does not consider CPU topology beyond thread
> siblings. With 128 CPUs, blk-mq may use all 31 queues P3700 supports,
> but many CPU groups won't share a last-level-cache.
>
> Smarter assignment would reclaim some untapped performance, and we can
> share such code prior to the session.
There's definitely room for improvement in the cpu mapping code.
However, on the original complaint, it's by design (or, working as
intended) - this was done to keep the layout symmetrical. It's been
discussed on the mailing lists before. We can have a discussion whether
we should change this or not, of course.
--
Jens Axboe
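
For reference, the 24-queue and 16-queue figures quoted in this subthread
are consistent with a simple model of the "symmetrical layout" constraint:
every queue in use must serve the same number of CPUs, so the queue count
has to divide the CPU count evenly. The sketch below is only that assumed
model, not the kernel's mapping code, and usable_queues() is a hypothetical
name chosen for illustration.

```python
def usable_queues(nr_cpus: int, hw_queues: int) -> int:
    """Largest queue count <= hw_queues that divides nr_cpus evenly.

    Assumed model only: each queue in use serves the same number of CPUs.
    This is not taken from the blk-mq source.
    """
    for queues in range(min(nr_cpus, hw_queues), 0, -1):
        if nr_cpus % queues == 0:
            return queues
    return 1  # only reached for nr_cpus < 1


if __name__ == "__main__":
    # 31 is the P3700 queue count mentioned in the thread.
    for cpus in (48, 128):
        q = usable_queues(cpus, 31)
        print(f"{cpus} CPUs, 31 hw queues -> {q} queues, {cpus // q} CPUs each")
```

Under this model 48 CPUs end up on 24 queues with 2 CPUs each, and 128 CPUs
on 16 queues with 8 CPUs each, which matches the numbers in the original
complaint. Relaxing the equal-share requirement is what would let all 31
queues be used, at the cost of an uneven layout.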
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Lsf] [LSF/MM TOPIC] block-mq issues with FC
2016-04-08 17:40 ` Matthew Wilcox
2016-04-08 18:00 ` James Bottomley
2016-04-08 18:06 ` Keith Busch
@ 2016-04-08 18:14 ` Bart Van Assche
2016-04-08 19:22 ` Waskiewicz, PJ
2016-04-10 19:02 ` Sagi Grimberg
4 siblings, 0 replies; 16+ messages in thread
From: Bart Van Assche @ 2016-04-08 18:14 UTC (permalink / raw)
To: Matthew Wilcox, Hannes Reinecke
Cc: lsf@lists.linux-foundation.org, linux-block@vger.kernel.org,
Jens Axboe, SCSI Mailing List, Christoph Hellwig
On 04/08/2016 10:40 AM, Matthew Wilcox wrote:
> - Interrupt steering needs to be controlled by block-mq instead of
> the driver. It's pointless to have each driver implement its own
> policies on interrupt steering, irqbalanced remains a source of
> end-user frustration, and block-mq can change the queue<->cpu mapping
> without the driver's knowledge.
I'm looking forward to the day that I will be able to drop my script for
spreading interrupts manually (see also the fifth attachment of
http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/21312/focus=98409).
Bart.
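
As a rough illustration of what spreading interrupts manually involves
(this is not the script referenced above), the sketch below assigns a list
of IRQ numbers to CPUs round-robin and formats each one-CPU mask as the
comma-separated 32-bit hex words that /proc/irq/<irq>/smp_affinity accepts.
The IRQ numbers and the function names are made up for the example, and the
code only computes the masks; it does not write them anywhere.

```python
def affinity_mask(cpu: int) -> str:
    """Format a single-CPU bitmask as comma-separated 32-bit hex words."""
    mask = 1 << cpu
    words = []
    while True:
        words.append(f"{mask & 0xFFFFFFFF:08x}")  # lowest 32 bits first
        mask >>= 32
        if mask == 0:
            break
    # The most significant word is written first.
    return ",".join(reversed(words))


def spread_irqs(irqs, nr_cpus):
    """Assign IRQs to CPUs round-robin; returns {irq: mask string}."""
    return {irq: affinity_mask(i % nr_cpus) for i, irq in enumerate(irqs)}


if __name__ == "__main__":
    # Hypothetical IRQ numbers for a device with four vectors.
    for irq, mask in spread_irqs([40, 41, 42, 43], nr_cpus=2).items():
        print(f"/proc/irq/{irq}/smp_affinity <- {mask}")
```

Applying the masks needs root, and irqbalance may move the vectors again
unless it is stopped or told to leave those IRQs alone, which is part of
the frustration described in this thread.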
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Lsf] [LSF/MM TOPIC] block-mq issues with FC
2016-04-08 17:40 ` Matthew Wilcox
` (2 preceding siblings ...)
2016-04-08 18:14 ` Bart Van Assche
@ 2016-04-08 19:22 ` Waskiewicz, PJ
2016-04-10 19:02 ` Sagi Grimberg
4 siblings, 0 replies; 16+ messages in thread
From: Waskiewicz, PJ @ 2016-04-08 19:22 UTC (permalink / raw)
To: willy@linux.intel.com, hare@suse.de
Cc: lsf@lists.linux-foundation.org, linux-scsi@vger.kernel.org,
hch@lst.de, linux-block@vger.kernel.org, axboe@kernel.dk
On Fri, 2016-04-08 at 13:40 -0400, Matthew Wilcox wrote:
> - Interrupt steering needs to be controlled by block-mq instead of
> the driver. It's pointless to have each driver implement its own
> policies on interrupt steering, irqbalanced remains a source of
> end-user frustration, and block-mq can change the queue<->cpu mapping
> without the driver's knowledge.
This is the same problem in the networking space as well. When I added
affinity_hint to the irq_desc, and then that support into irqbalance,
my original approach was to allow the driver to assign affinities.
This was shot down because a driver was influencing policy, versus
allowing userspace to do so. Meh.
If there's something actionable out of this discussion that makes
interrupt steering better, I'd like to see us drive it into the
networking world as well. That would also let me rip out the
affinity_hint stuff overall from irqbalance...
-PJ
--
PJ Waskiewicz
Principal Engineer, NetApp
e: pj.waskiewicz@netapp.com
d: 503.961.3705
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Lsf] [LSF/MM TOPIC] block-mq issues with FC
2016-04-08 17:40 ` Matthew Wilcox
` (3 preceding siblings ...)
2016-04-08 19:22 ` Waskiewicz, PJ
@ 2016-04-10 19:02 ` Sagi Grimberg
2016-04-12 19:04 ` Quinn Tran
4 siblings, 1 reply; 16+ messages in thread
From: Sagi Grimberg @ 2016-04-10 19:02 UTC (permalink / raw)
To: Matthew Wilcox, Hannes Reinecke
Cc: lsf, linux-block@vger.kernel.org, Jens Axboe, SCSI Mailing List,
Christoph Hellwig
Hey Willy,
> - Interrupt steering needs to be controlled by block-mq instead of
> the driver. It's pointless to have each driver implement its own
> policies on interrupt steering, irqbalanced remains a source of
> end-user frustration, and block-mq can change the queue<->cpu mapping
> without the driver's knowledge.
I honestly don't think that block-mq is the right place to
*assign* interrupt steering. Not all HW devices are dedicated
to storage, take RDMA for example, a RNIC is shared by block
storage, networking and even user-space workloads so obviously
block-mq can't understand how a user wants to steer interrupts.
I think that block-mq needs to ask the device driver:
"what is the optimal queue index for cpu X?" and use it
while *someone* will be responsible for optimum interrupt
steering (can be the driver itself or user-space).
From some discussions I had with HCH I think he intends to
use the cpu reverse-mapping API to try and do what's described
above (if I'm not mistaken).
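
The "ask the driver" idea above can be sketched as a callback interface:
the block layer builds its cpu -> queue map by querying a driver-supplied
function instead of computing the map itself. Everything named below
(build_queue_map, example_driver_hint, the two-CPUs-per-vector policy) is
hypothetical and is not an existing blk-mq or driver API; it only shows
the shape of the query.

```python
from typing import Callable, List


def build_queue_map(nr_cpus: int, nr_queues: int,
                    queue_for_cpu: Callable[[int], int]) -> List[int]:
    """Build a cpu -> queue map by asking the driver callback per CPU."""
    queue_map = []
    for cpu in range(nr_cpus):
        queue = queue_for_cpu(cpu)
        # Reject answers outside the queues the device actually exposes.
        if not 0 <= queue < nr_queues:
            raise ValueError(f"driver returned queue {queue} for cpu {cpu}")
        queue_map.append(queue)
    return queue_map


def example_driver_hint(cpu: int) -> int:
    """Hypothetical driver policy: the vector for queue q is steered to
    CPUs 2q and 2q+1, so those CPUs should submit on queue q."""
    return cpu // 2


if __name__ == "__main__":
    print(build_queue_map(8, 4, example_driver_hint))
```

In this shape the steering decision stays with whoever owns the interrupt
(the driver or user space), and the block layer only consumes the answer.
If steering changes at runtime, the map has to be rebuilt.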
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Lsf] [LSF/MM TOPIC] block-mq issues with FC
2016-04-10 19:02 ` Sagi Grimberg
@ 2016-04-12 19:04 ` Quinn Tran
0 siblings, 0 replies; 16+ messages in thread
From: Quinn Tran @ 2016-04-12 19:04 UTC (permalink / raw)
To: Sagi Grimberg, Matthew Wilcox, Hannes Reinecke
Cc: lsf@lists.linux-foundation.org, linux-block@vger.kernel.org,
Jens Axboe, linux-scsi, Christoph Hellwig
>Hey Willy,
>
>> - Interrupt steering needs to be controlled by block-mq instead of
>> the driver. It's pointless to have each driver implement its own
>> policies on interrupt steering, irqbalanced remains a source of
>> end-user frustration, and block-mq can change the queue<->cpu mapping
>> without the driver's knowledge.
>
>I honestly don't think that block-mq is the right place to
>*assign* interrupt steering. Not all HW devices are dedicated
>to storage, take RDMA for example, a RNIC is shared by block
>storage, networking and even user-space workloads so obviously
>block-mq can't understand how a user wants to steer interrupts.
>
>I think that block-mq needs to ask the device driver:
>"what is the optimal queue index for cpu X?" and use it
>while *someone* will be responsible for optimum interrupt
>steering (can be the driver itself or user-space).
+0.5 on block-mq asking the lower layer where to place the queue. However,
I think it is better for the lower layer to push the data up rather than
for block-mq to ask for it. The user can change, or irqbalance can
relocate, the interrupt vector(s) at runtime.
The QLogic adapter can act in both initiator and target modes at the same
time. Certain target vendors might not want the initiator side to hold
this knob.
>
> From some discussions I had with HCH I think he intends to
>use the cpu reverse-mapping API to try and do what's described
>above (if I'm not mistaken).
^ permalink raw reply [flat|nested] 16+ messages in thread