From: Yu Chen <yu.c.chen@intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org, Ingo Molnar <mingo@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Rui Zhang <rui.zhang@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Len Brown <lenb@kernel.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@lst.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Subject: Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU
Date: Thu, 7 Sep 2017 16:34:06 +0800
Message-ID: <20170907083405.GA24450@localhost.localdomain>
In-Reply-To: <alpine.DEB.2.20.1709070731110.2433@nanos>

On Thu, Sep 07, 2017 at 07:54:09AM +0200, Thomas Gleixner wrote:
> On Thu, 7 Sep 2017, Yu Chen wrote:
> > On Wed, Sep 06, 2017 at 10:03:58AM +0200, Thomas Gleixner wrote:
> > > Can you please apply the debug patch below, boot the machine and right
> > > after login provide the output of
> > > 
> > > # cat /sys/kernel/debug/tracing/trace
> > >
> >      kworker/0:2-303   [000] ....     9.135467: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 34
> >      kworker/0:2-303   [000] ....     9.135476: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 35
> >      kworker/0:2-303   [000] ....     9.135484: msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 36
> 
> <SNIP>
> 
> >      kworker/0:2-303   [000] ....     9.762268: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 331
> >      kworker/0:2-303   [000] ....     9.762278: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 332
> >      kworker/0:2-303   [000] ....     9.762288: msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 333
> 
> That's 300 vectors.
> 
> >  bb:00.[0-3] Ethernet controller: Intel Corporation Device 37d0 (rev 03)
> > 
> > -+-[0000:b2]-+-00.0-[b3-bc]----00.0-[b4-bc]--+-00.0-[b5-b6]----00.0
> >  |           |                               +-01.0-[b7-b8]----00.0
> >  |           |                               +-02.0-[b9-ba]----00.0
> >  |           |                               \-03.0-[bb-bc]--+-00.0
> >  |           |                                               +-00.1
> >  |           |                                               +-00.2
> >  |           |                                               \-00.3
> > 
> > and they are using i40e driver, the vectors should be reserved by:
> > i40e_probe() ->
> >   i40e_init_interrupt_scheme() ->
> >     i40e_init_msix() ->
> >       i40e_reserve_msix_vectors() ->
> >         pci_enable_msix_range()
> > 
> > # ls /sys/kernel/debug/irq/irqs
> > 0  10   11  13  142  184  217  259  292  31  33
> > 337  339  340  342  344  346  348  350  352  354  356
> > 358  360  362  364  366  368  370  372  374  376  378
> > 380  382  384  386  388  390  392  394  4  6   7  9
> > 1  109  12  14  15   2    24   26   3    32  335
> > 338  34   341  343  345  347  349  351  353  355  357
> > 359  361  363  365  367  369  371  373  375  377  379
> > 381  383  385  387  389  391  393  395  5  67  8
> 
> Out of these 300 interrupts exactly 8 randomly selected ones are actively
> used. And the other 292 interrupts are just there because it might need
> them in the future when the 32 CPU machine gets magically upgraded to 4096
> cores at runtime?
>
Hmm, do the 292 vectors remain disabled because the network devices have
not been brought up (e.g. ifconfig up has not been invoked), so
request_irq() has not been called for these vectors? My impression is
that when I once borrowed some fiber cables to connect the platform, the
number of active IRQs from i40e rose a lot, although I don't have those
expensive cables now...

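
As a rough cross-check of the numbers above, here is a minimal sketch that
parses the first and last msi_domain_alloc_irqs trace lines quoted earlier
and derives the allocated and idle vector counts. It assumes that every
snipped line also has nvec 1 and that the virq numbers are consecutive; the
helper names parse_alloc() and vectors_in_range() are made up for
illustration, and the figure of 8 active interrupts is simply taken from the
discussion.

```python
import re

# Matches the payload of an msi_domain_alloc_irqs trace line as quoted above.
TRACE_RE = re.compile(r"msi_domain_alloc_irqs: dev: (\S+) nvec (\d+) virq (\d+)")


def parse_alloc(line):
    """Return (dev, nvec, virq) for one trace line, or None if it does not match."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))


def vectors_in_range(first_virq, last_virq):
    """Count vectors, assuming one vector per virq and no gaps in between."""
    return last_virq - first_virq + 1


first = parse_alloc(
    "kworker/0:2-303   [000] ....     9.135467: "
    "msi_domain_alloc_irqs: dev: 0000:bb:00.0 nvec 1 virq 34"
)
last = parse_alloc(
    "kworker/0:2-303   [000] ....     9.762288: "
    "msi_domain_alloc_irqs: dev: 0000:bb:00.3 nvec 1 virq 333"
)

allocated = vectors_in_range(first[2], last[2])
active = 8  # number of actively used interrupts quoted in the thread
print(allocated, allocated - active)  # prints "300 292"
```
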
> Can the i40e people @intel please fix this waste of resources and sanitize
> their interrupt allocation scheme?
> 
> Please switch it over to managed interrupts so the affinity spreading
> happens in a sane way and the interrupts are properly managed on CPU
> hotplug.

OK, I think the i40e driver currently reserves its vectors through
pci_enable_msix_range() and does not pass any affinity hint down to the
low-level IRQ code, so managed interrupts are not enabled there (although
later on the i40e driver uses irq_set_affinity_hint() to spread the IRQs).
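
To illustrate what "affinity spreading" means here, below is a toy model,
not the kernel's algorithm: it splits the CPU ids evenly into one contiguous
group per vector and ignores NUMA topology and CPU hotplug. In the kernel
the managed path is requested through pci_alloc_irq_vectors() with the
PCI_IRQ_AFFINITY flag; the function spread_cpus() and the choice of 8
vectors on 32 CPUs are assumptions made only for this example.

```python
def spread_cpus(ncpus, nvecs):
    """Split CPU ids 0..ncpus-1 into nvecs contiguous groups of near-equal size."""
    if not 0 < nvecs <= ncpus:
        raise ValueError("toy model expects 1 <= nvecs <= ncpus")
    base, extra = divmod(ncpus, nvecs)
    masks = []
    cpu = 0
    for vec in range(nvecs):
        # The first `extra` vectors take one additional CPU each.
        size = base + (1 if vec < extra else 0)
        masks.append(list(range(cpu, cpu + size)))
        cpu += size
    return masks


# Hypothetical setup: 32 CPUs as mentioned in the thread, 8 vectors.
masks = spread_cpus(32, 8)
for vec, cpus in enumerate(masks):
    print(f"vector {vec}: CPUs {cpus[0]}-{cpus[-1]}")
```

With one vector per group of CPUs, the number of vectors a device needs is
bounded by the CPU count rather than by what the hardware could support.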

Thanks,
	Yu
> 
> Thanks,
> 
> 	tglx
