From: Yu Chen <yu.c.chen@intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org, Ingo Molnar <mingo@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>, Rui Zhang <rui.zhang@intel.com>,
LKML <linux-kernel@vger.kernel.org>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>,
Len Brown <lenb@kernel.org>,
Dan Williams <dan.j.williams@intel.com>,
Christoph Hellwig <hch@lst.de>,
Peter Zijlstra <peterz@infradead.org>,
Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Subject: Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU
Date: Thu, 7 Sep 2017 16:34:06 +0800
Message-ID: <20170907083405.GA24450@localhost.localdomain>
In-Reply-To: <alpine.DEB.2.20.1709070731110.2433@nanos>

On Thu, Sep 07, 2017 at 07:54:09AM +0200, Thomas Gleixner wrote:
> On Thu, 7 Sep 2017, Yu Chen wrote:
> > On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote:
> > > Can you please apply the debug patch below, boot the machine and right
> > > after login provide the output of
> > >
> > > # cat /sys/kernel/debug/tracing/trace
> > >
> > kworker/0:2-303 [000] .... 9.135467: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 34
> > kworker/0:2-303 [000] .... 9.135476: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 35
> > kworker/0:2-303 [000] .... 9.135484: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 36
>
> <SNIP>
>
> > kworker/0:2-303 [000] .... 9.762268: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 331
> > kworker/0:2-303 [000] .... 9.762278: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 332
> > kworker/0:2-303 [000] .... 9.762288: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 333
>
> That's 300 vectors.
>
> > bb:00.[0-3] Ethernet controller: Intel Corporation Device 37d0 (rev 03)
> >
> > -+-[0000:b2]-+-00.0-[b3-bc]----00.0-[b4-bc]--+-00.0-[b5-b6]----00.0
> > | | +-01.0-[b7-b8]----00.0
> > | | +-02.0-[b9-ba]----00.0
> > | | \-03.0-[bb-bc]--+-00.0
> > | | +-00.1
> > | | +-00.2
> > | | \-00.3
> >
> > and they are using i40e driver, the vectors should be reserved by:
> > i40e_probe() ->
> > i40e_init_interrupt_scheme() ->
> > i40e_init_msix() ->
> > i40e_reserve_msix_vectors() ->
> > pci_enable_msix_range()
> >
> > # ls /sys/kernel/debug/irq/irqs
> > 0 10 11 13 142 184 217 259 292 31 33
> > 337 339 340 342 344 346 348 350 352 354 356
> > 358 360 362 364 366 368 370 372 374 376 378
> > 380 382 384 386 388 390 392 394 4 6 7 9
> > 1 109 12 14 15 2 24 26 3 32 335
> > 338 34 341 343 345 347 349 351 353 355 357
> > 359 361 363 365 367 369 371 373 375 377 379
> > 381 383 385 387 389 391 393 395 5 67 8
>
> Out of these 300 interrupts exactly 8 randomly selected ones are actively
> used. And the other 292 interrupts are just there because it might need
> them in the future when the 32 CPU machine gets magically upgraded to 4096
> cores at runtime?
>
Hmm, do the 292 vectors remain disabled because the network devices
have not been brought up (say, ifconfig up has not been invoked), so
request_irq() has never been called for these vectors? My impression is
that when I once borrowed some fiber cables to connect the platform, the
number of active IRQs from i40e rose a lot, although I don't have those
expensive cables now...

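The allocated-versus-requested split can be checked from the trace
itself. Below is a minimal userspace sketch, assuming the line format
printed by the debug patch as quoted above; the helper names
parse_alloc_line() and count_allocated_vectors() are made up for
illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one trace line of the form quoted above, e.g.
 *   "... msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 34"
 * On a match, store the PCI address, nvec and virq and return 1.
 * Return 0 for any line that does not match.
 */
int parse_alloc_line(const char *line, char dev[16], int *nvec, int *virq)
{
	static const char tag[] = "msi_domain_alloc_irqs: dev: ";
	const char *p = strstr(line, tag);

	if (!p)
		return 0;
	p += strlen(tag);
	return sscanf(p, "%15s nvec %d virq %d", dev, nvec, virq) == 3;
}

/* Sum nvec over all matching lines read from fp. */
int count_allocated_vectors(FILE *fp)
{
	char line[512], dev[16];
	int nvec, virq, total = 0;

	while (fgets(line, sizeof(line), fp))
		if (parse_alloc_line(line, dev, &nvec, &virq))
			total += nvec;
	return total;
}
```

Feeding /sys/kernel/debug/tracing/trace to count_allocated_vectors()
gives the number of allocated vectors, which can then be compared with
the interrupts that actually have a handler in /proc/interrupts.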
> Can the i40e people @intel please fix this waste of resources and sanitize
> their interrupt allocation scheme?
>
> Please switch it over to managed interrupts so the affinity spreading
> happens in a sane way and the interrupts are properly managed on CPU
> hotplug.

OK. I think the i40e driver currently reserves its vectors through
pci_enable_msix_range() and does not pass an affinity hint down to the
low-level IRQ code, so managed interrupts are not enabled there
(although later the driver uses irq_set_affinity_hint() to spread the
IRQs).

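For comparison, a minimal sketch of what a managed allocation could look
like, assuming the generic pci_alloc_irq_vectors() interface with
PCI_IRQ_AFFINITY. The function example_alloc_managed_vectors() and its
parameters are made up for illustration and are not taken from i40e;
the fragment only builds inside a kernel tree.

```c
#include <linux/pci.h>

/*
 * Hypothetical helper, not i40e code: allocate between min_vecs and
 * max_vecs MSI-X vectors and let the core spread their affinity.
 */
static int example_alloc_managed_vectors(struct pci_dev *pdev,
					 unsigned int min_vecs,
					 unsigned int max_vecs)
{
	int nvec;

	/*
	 * PCI_IRQ_AFFINITY asks the PCI/IRQ core to spread the vectors
	 * over the CPUs and to treat them as managed interrupts.
	 */
	nvec = pci_alloc_irq_vectors(pdev, min_vecs, max_vecs,
				     PCI_IRQ_MSIX | PCI_IRQ_AFFINITY);
	if (nvec < 0)
		return nvec;	/* negative errno, nothing was allocated */

	/*
	 * pci_irq_vector(pdev, i) for 0 <= i < nvec yields the Linux IRQ
	 * number to pass to request_irq() later.
	 */
	return nvec;
}
```

The matching teardown would be pci_free_irq_vectors(). With such a
scheme the driver would likely no longer need its own
irq_set_affinity_hint() calls for these vectors, but whether that fits
the i40e queue setup is not something this sketch answers.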
Thanks,
Yu
>
> Thanks,
>
> tglx