From: David Daney <ddaney@caviumnetworks.com>
To: Stephen Hemminger <shemminger@vyatta.com>
Cc: Chetan Loke <chetanloke@gmail.com>,
	Chris Friesen <cfriesen@nortel.com>,
	netdev@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mips <linux-mips@linux-mips.org>
Subject: Re: Irq architecture for multi-core network driver.
Date: Wed, 16 Dec 2009 15:26:02 -0800
Message-ID: <4B296C8A.6000900@caviumnetworks.com>
In-Reply-To: <20091216150051.63b6e31c@nehalam>

Stephen Hemminger wrote:
> On Wed, 16 Dec 2009 14:30:36 -0800
> David Daney <ddaney@caviumnetworks.com> wrote:
> 
>> Chetan Loke wrote:
>>>>> Does your hardware do flow-based queues?  In this model you have
>>>>> multiple rx queues and the hardware hashes incoming packets to a single
>>>>> queue based on the addresses, ports, etc. This ensures that all the
>>>>> packets of a single connection always get processed in the order they
>>>>> arrived at the net device.
>>>>>
>>>> Indeed, this is exactly what we have.
>>>>
>>>>
>>>>> Typically in this model you have as many interrupts as queues
>>>>> (presumably 16 in your case).  Each queue is assigned an interrupt and
>>>>> that interrupt is affined to a single core.
>>>> Certainly this is one mode of operation that should be supported, but I
>>>> would also like to be able to go for raw throughput and have as many cores
>>>> as possible reading from a single queue (like I currently have).
>>>>
>>> Well, you could let the NIC firmware (f/w) handle this.  The f/w
>>> would know which interrupt was injected most recently.  In other
>>> words, it would have a history of which CPUs would be available.  So
>>> if some previously interrupted CPU isn't making good progress, the
>>> firmware should route the incoming response packets to a different
>>> queue.  This way some other CPU will pick it up.
>>>
>>
>> It isn't a NIC.  There is no firmware.  The system interrupt hardware
>> is what it is and cannot be changed.
>>
>> My current implementation still has a single input queue configured and 
>> I get a maskable interrupt on a single CPU when packets are available. 
>> If the queue depth increases above a given threshold, I optionally send 
>> an IPI to another CPU to enable NAPI polling on that CPU.
>>
>> Currently I have a module parameter that controls the maximum number of 
>> CPUs that will have NAPI polling enabled.
>>
>> This allows me to get multiple CPUs doing receive processing without 
>> having to hack into the lower levels of the system's interrupt 
>> processing code to try to do interrupt steering.  Since all the
>> interrupt service routine was doing was calling netif_rx_schedule(),
>> I can simply do this via smp_call_function_single().
> 
> Better to look into receive packet steering patches that are still
> under review (rather than reinventing it just for your driver)
> 

Indeed.  It turns out, though, that I can do packet steering in hardware
across up to 16 queues, each with its own IRQ and thus a dedicated CPU,
so it is unclear to me whether the receive packet steering patches offer
much benefit on this hardware.
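
Purely as illustration, a minimal sketch of what that per-queue mode
could look like in a driver.  This is not the actual Octeon code; the
names (oct_rxq, OCT_RXQ_IRQ_BASE, "oct-rx") are hypothetical, and
irq_set_affinity_hint() is a later genirq helper -- the same effect can
be had from userspace by writing /proc/irq/<n>/smp_affinity:

/*
 * Hypothetical sketch: 16 RX queues, one IRQ per queue, each pinned
 * to its own CPU so the hardware flow hash selects the servicing CPU.
 */
#include <linux/interrupt.h>
#include <linux/netdevice.h>

#define NR_RXQ 16
#define OCT_RXQ_IRQ_BASE 40		/* hypothetical IRQ numbering */

struct oct_rxq {
	struct napi_struct napi;
	int irq;
};

static struct oct_rxq rxq[NR_RXQ];

static irqreturn_t oct_rx_isr(int irq, void *dev_id)
{
	struct oct_rxq *q = dev_id;

	/* Hand the work to NAPI on this CPU; poll/complete re-arms. */
	napi_schedule(&q->napi);
	return IRQ_HANDLED;
}

static int oct_setup_rx_irqs(struct net_device *dev,
			     int (*poll)(struct napi_struct *, int))
{
	int i, err;

	for (i = 0; i < NR_RXQ; i++) {
		netif_napi_add(dev, &rxq[i].napi, poll, 64);
		napi_enable(&rxq[i].napi);
		rxq[i].irq = OCT_RXQ_IRQ_BASE + i;
		err = request_irq(rxq[i].irq, oct_rx_isr, 0, "oct-rx",
				  &rxq[i]);
		if (err)
			return err;
		/* Pin queue i's interrupt to CPU i. */
		irq_set_affinity_hint(rxq[i].irq, cpumask_of(i));
	}
	return 0;
}

With each queue's interrupt affined to a different CPU, the hardware
hash ends up doing in silicon much of the steering that the RPS patches
do in software.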

One concern is the ability to forward as many packets as possible from a 
very low number of flows (between 1 and 4).  Since it is an artificial 
benchmark, we can arbitrarily say that packet reordering is allowed. 
The simple hack to do NAPI polling on all CPUs from a single queue gives 
good results.  There is no need to remind me that packet reordering 
should be avoided, I already know this.
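
To make the shape of that hack concrete, here is a rough sketch under
stated assumptions -- not the actual driver.  oct_rx_queue_depth(),
OCT_RX_HIWAT, and oct_rx_process() are hypothetical stand-ins for the
hardware accessors, and the per-CPU netif_napi_add()/napi_enable()
setup is omitted.  The ISR always polls on its own CPU and, when the
queue backs up, recruits additional CPUs via smp_call_function_single(),
bounded by a module parameter:

#include <linux/interrupt.h>
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/percpu.h>
#include <linux/smp.h>

static int rx_napi_cpus = 1;
module_param(rx_napi_cpus, int, 0444);
MODULE_PARM_DESC(rx_napi_cpus, "Max CPUs allowed to NAPI-poll the RX queue");

static DEFINE_PER_CPU(struct napi_struct, oct_napi);
static atomic_t pollers = ATOMIC_INIT(0);

/* Hypothetical stand-ins for the real driver's hardware accessors. */
extern u32 oct_rx_queue_depth(void);
extern int oct_rx_process(int budget);
#define OCT_RX_HIWAT 128

/* Runs locally from the ISR, or on a target CPU in IPI context. */
static void oct_start_napi(void *unused)
{
	if (napi_schedule_prep(this_cpu_ptr(&oct_napi))) {
		atomic_inc(&pollers);
		__napi_schedule(this_cpu_ptr(&oct_napi));
	}
}

static irqreturn_t oct_rx_isr(int irq, void *dev_id)
{
	/* Start polling on the CPU that took the interrupt. */
	oct_start_napi(NULL);

	/* Queue backing up?  Recruit one more CPU, up to the limit. */
	if (oct_rx_queue_depth() > OCT_RX_HIWAT &&
	    atomic_read(&pollers) < rx_napi_cpus) {
		int cpu = cpumask_next(smp_processor_id(), cpu_online_mask);

		if (cpu >= nr_cpu_ids)
			cpu = cpumask_first(cpu_online_mask);
		smp_call_function_single(cpu, oct_start_napi, NULL, 0);
	}
	return IRQ_HANDLED;
}

static int oct_poll(struct napi_struct *napi, int budget)
{
	int work = oct_rx_process(budget);

	if (work < budget) {
		/* Queue drained: this CPU stops polling. */
		napi_complete(napi);
		atomic_dec(&pollers);
	}
	return work;
}

The key point is that the IPI only ever invokes the NAPI scheduling
hook, so nothing below the generic interrupt layer needs to change.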

David Daney


Thread overview: 13+ messages
2009-10-22 21:40 Irq architecture for multi-core network driver David Daney
2009-10-22 22:05 ` Chris Friesen
2009-10-22 22:24   ` David Daney
2009-10-23  7:59     ` Eric W. Biederman
2009-10-23 17:28       ` Jesse Brandeburg
2009-10-23 23:22         ` Eric W. Biederman
2009-10-24 13:26           ` David Miller
2009-10-24  3:19         ` David Miller
2009-10-24 13:23     ` David Miller
2009-12-16 22:08     ` Chetan Loke
2009-12-16 22:30       ` David Daney
2009-12-16 23:00         ` Stephen Hemminger
2009-12-16 23:26           ` David Daney [this message]
