From: Eric Dumazet <eric.dumazet@gmail.com>
To: Changli Gao <xiaosuo@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Tom Herbert <therbert@google.com>,
Linux Netdev List <netdev@vger.kernel.org>
Subject: Re: [PATCH net-next-2.6] net: Xmit Packet Steering (XPS)
Date: Fri, 20 Nov 2009 05:58:36 +0100
Message-ID: <4B0621FC.6060004@gmail.com>
In-Reply-To: <412e6f7f0911191812uf0abc61w2f0d44f4d71bd55@mail.gmail.com>
Changli Gao wrote:
> On Fri, Nov 20, 2009 at 7:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 9977288..9e134f6 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -2000,6 +2001,7 @@ gso:
>> */
>> rcu_read_lock_bh();
>>
>> + skb->sending_cpu = cpu = smp_processor_id();
>> txq = dev_pick_tx(dev, skb);
>> q = rcu_dereference(txq->qdisc);
>
> I think assigning cpu to skb->sending_cpu just before calling
> hard_start_xmit is better, because the CPU which dequeues the skb will
> be another one.
I want to record the application CPU, because I want the application CPU
to call sock_wfree(), not the CPU that happened to dequeue the skb and
transmit it in case of txq contention.
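
To make the distinction concrete, here is a minimal userspace sketch, not the patch itself. struct pkt, pkt_submit() and pkt_completion_target() are hypothetical names standing in for the skb and the submit/completion paths; only the sending_cpu field name comes from the diff above. The point is only that the CPU is recorded once at submit time and is not overwritten by whichever CPU later dequeues the packet.

```c
#include <assert.h>

/* Hypothetical stand-in for struct sk_buff, reduced to the one field
 * discussed here. */
struct pkt {
    int sending_cpu;    /* CPU the application was running on at submit time */
};

/* Submit path: record the caller's CPU before the packet is queued.
 * In the patch this corresponds to setting skb->sending_cpu before
 * dev_pick_tx(). */
static void pkt_submit(struct pkt *p, int current_cpu)
{
    assert(current_cpu >= 0);
    p->sending_cpu = current_cpu;
}

/* Completion path: the free is routed to the recorded CPU, whatever CPU
 * actually dequeued and transmitted the packet. */
static int pkt_completion_target(const struct pkt *p, int dequeuing_cpu)
{
    (void)dequeuing_cpu;    /* deliberately ignored */
    return p->sending_cpu;
}
```

If the CPU were recorded just before hard_start_xmit instead, pkt_completion_target() would return the dequeuing CPU under contention, which is the case I want to avoid.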
>
>> @@ -2024,8 +2026,6 @@ gso:
>> Either shot noqueue qdisc, it is even simpler 8)
>> */
>> if (dev->flags & IFF_UP) {
>> - int cpu = smp_processor_id(); /* ok because BHs are off */
>> -
>> if (txq->xmit_lock_owner != cpu) {
>>
>> HARD_TX_LOCK(dev, txq, cpu);
>> @@ -2967,7 +2967,7 @@ static void net_rx_action(struct softirq_action *h)
>> }
>> out:
>> local_irq_enable();
>> -
>> + xps_flush();
>
> If there aren't any new skbs, the memory will be held forever. I know
> you want to eliminate unnecessary IPIs; how about sending an IPI only when
> the remote xps_pcpu_queues change from empty to nonempty?
I don't understand your remark, and I don't see the problem yet.
I send an IPI only to CPUs for which I know I have at least one skb queued.
For each CPU taking TX completion interrupts I have:
One bitmask (xps_cpus) of the CPUs I will eventually send an IPI to at the end of net_rx_action()
One array of skb lists, one per remote CPU, allocated on CPU node memory, thanks
to __alloc_percpu() at boot time.
I say _eventually_ because the algorithm is:
+ if (cpu_online(cpu)) {
+ spin_lock(&q->list.lock);
+ prevlen = skb_queue_len(&q->list);
+ skb_queue_splice_init(&head[cpu], &q->list);
+ spin_unlock(&q->list.lock);
+ /*
+ * We hope remote cpu will be fast enough to transfert
+ * this list to its completion queue before our
+ * next xps_flush() call
+ */
+ if (!prevlen)
+ __smp_call_function_single(cpu, &q->csd, 0);
+ continue;
So I send an IPI only if needed, once for the whole skb list.
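
The same logic can be modelled in userspace as a rough sketch, under several assumptions: counters replace the skb lists, a "kicks" counter replaces the IPI, there is no locking, and all names (local_state, remote_queue, defer_free, flush, drain, NR_CPUS) are hypothetical rather than taken from the patch. It only illustrates the "splice the whole batch, signal only on an empty to non-empty transition" rule.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4   /* assumption: small fixed CPU count for the sketch */

/* State owned by one remote CPU. */
struct remote_queue {
    size_t len;         /* packets waiting to be freed on that CPU */
    unsigned kicks;     /* notifications received (an IPI in the kernel) */
};

/* State owned by the CPU taking TX completion interrupts. */
struct local_state {
    size_t pending[NR_CPUS];    /* packets batched locally per target CPU */
    unsigned long cpu_mask;     /* bit n set: pending[n] is non-empty */
};

/* Batch one completed packet for the CPU that submitted it. */
static void defer_free(struct local_state *ls, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    ls->pending[cpu]++;
    ls->cpu_mask |= 1UL << cpu;
}

/* End-of-softirq flush: hand each batch over in one go and notify the
 * target only if its queue was empty before the splice. */
static void flush(struct local_state *ls, struct remote_queue rq[NR_CPUS])
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!(ls->cpu_mask & (1UL << cpu)))
            continue;
        size_t prevlen = rq[cpu].len;       /* kernel: read under the list lock */
        rq[cpu].len += ls->pending[cpu];    /* splice the whole batch */
        ls->pending[cpu] = 0;
        if (prevlen == 0)
            rq[cpu].kicks++;                /* one notification per batch */
    }
    ls->cpu_mask = 0;
}

/* Remote side: take everything off the queue; returns the count taken. */
static size_t drain(struct remote_queue *q)
{
    size_t n = q->len;
    q->len = 0;
    return n;
}
```

In this model, flushing a second batch before the remote side has drained adds to its queue without a second notification; a new notification is sent only after the queue has gone back to empty.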
In my pktgen tests (no skb cloning setup), with
ethtool -C eth3 tx-usecs 1000 tx-frames 100
I really saw batches of 100 frames handed from CPU X (NIC interrupts) to CPU Y (pktgen CPU).
What memory is held forever?