From: Eric Dumazet <dada1@cosmosbay.com>
To: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: netdev@vger.kernel.org, dale@farnsworth.org
Subject: Re: [PATCH 2/2] mv643xx_eth: hook up skb recycling
Date: Thu, 04 Sep 2008 06:50:22 +0200
Message-ID: <48BF690E.7090501@cosmosbay.com>
In-Reply-To: <20080904042005.GA27272@xi.wantstofly.org>
Lennert Buytenhek wrote:
> On Wed, Sep 03, 2008 at 04:25:34PM +0200, Eric Dumazet wrote:
>
>>> This increases the maximum loss-free packet forwarding rate in
>>> routing workloads by typically about 25%.
>>>
>>> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
>> Interesting...
>>
>>> refilled = 0;
>>> while (refilled < budget && rxq->rx_desc_count < rxq->rx_ring_size) {
>>> struct sk_buff *skb;
>>> int unaligned;
>>> int rx;
>>>
>>> - skb = dev_alloc_skb(skb_size + dma_get_cache_alignment() - 1);
>>> + skb = __skb_dequeue(&mp->rx_recycle);
>> Here you take one skb at the head of queue
>>
>>> + if (skb == NULL)
>>> + skb = dev_alloc_skb(mp->skb_size +
>>> + dma_get_cache_alignment() - 1);
>>> +
>>> if (skb == NULL) {
>>> mp->work_rx_oom |= 1 << rxq->index;
>>> goto oom;
>>> @@ -600,8 +591,8 @@ static int rxq_refill(struct rx_queue *rxq, int budget)
>>> rxq->rx_used_desc = 0;
>>>
>>> rxq->rx_desc_area[rx].buf_ptr = dma_map_single(NULL, skb->data,
>>> - skb_size, DMA_FROM_DEVICE);
>>> - rxq->rx_desc_area[rx].buf_size = skb_size;
>>> + mp->skb_size, DMA_FROM_DEVICE);
>>> + rxq->rx_desc_area[rx].buf_size = mp->skb_size;
>>> rxq->rx_skb[rx] = skb;
>>> wmb();
>>> rxq->rx_desc_area[rx].cmd_sts = BUFFER_OWNED_BY_DMA |
>>> @@ -905,8 +896,13 @@ static int txq_reclaim(struct tx_queue *txq, int budget, int force)
>>> else
>>> dma_unmap_page(NULL, addr, count, DMA_TO_DEVICE);
>>>
>>> - if (skb)
>>> - dev_kfree_skb(skb);
>>> + if (skb != NULL) {
>>> + if (skb_queue_len(&mp->rx_recycle) < 1000 &&
>>> + skb_recycle_check(skb, mp->skb_size))
>>> + __skb_queue_tail(&mp->rx_recycle, skb);
>>> + else
>>> + dev_kfree_skb(skb);
>>> + }
>> Here you put a skb at the head of queue. So you use a FIFO mode.
Here, I meant "tail of queue", you obviously already corrected this :)
>>
>> To have best performance (cpu cache hot), you might try to use a LIFO mode
>> (use __skb_queue_head()) ?
>
> That sounds like a good idea. I'll try that, thanks.
>
>
>> Could you give us your actual bench results (number of packets received per
>> second, number of packets transmitted per second), and your machine setup?
>
> mv643xx_eth isn't your typical PCI network adapter, it's a silicon
> block that is found in PPC/MIPS northbridges and in ARM System-on-Chips
> (SoC = CPU + peripherals integrated in one chip).
>
> The particular platform I did these tests on is a wireless access
> point. It has an ARM SoC running at 1.2 GHz, with relatively small
> (16K/16K) L1 caches, 256K of L2 cache, and DDR2-400 memory, and a
> hardware switch chip. Networking is hooked up as follows:
>
> +-----------+ +-----------+
> | | | |
> | | | +------ 1000baseT MDI ("WAN")
> | | RGMII | 6-port +------ 1000baseT MDI ("LAN1")
> | CPU +-------+ ethernet +------ 1000baseT MDI ("LAN2")
> | | | switch +------ 1000baseT MDI ("LAN3")
> | | | w/5 PHYs +------ 1000baseT MDI ("LAN4")
> | | | |
> +-----------+ +-----------+
>
> The protocol that the ethernet switch speaks is called DSA
> ("Distributed Switch Architecture"), which is basically just ethernet
> with a header that's inserted between the ethernet header and the data
> (just like 802.1q VLAN tags) telling the switch what to do with the
> packet. (I hope to submit the DSA driver I am writing soon.) But for
> the purposes of this test, the switch chip is in pass-through mode,
> where DSA tagging is not used and the switch behaves like an ordinary
> 6-port ethernet chip.
>
> The network benchmarks are done with a Smartbits 600B traffic
> generator/measurement device. What it does is a bisection search of
> sending traffic at different packet-per-second rates to pin down the
> maximum loss-free forwarding rate, i.e. the maximum packet rate at
> which there is still no packet loss.
>
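For readers unfamiliar with this kind of measurement, the bisection search described above can be sketched roughly as follows. This is only an illustration, not how the Smartbits is actually programmed: max_lossfree_rate(), trial_fn and simulated_trial() are hypothetical names, and the sketch assumes that loss is monotonic in the offered rate (if a rate is loss-free, every lower rate is too).

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for one measurement run: offer traffic at
 * 'pps' packets per second and report whether every packet came back.
 */
typedef bool (*trial_fn)(unsigned long pps, void *ctx);

/*
 * Bisection search for the highest loss-free rate in [lo, hi].
 * Assumes lo >= 1 and that loss is monotonic in the offered rate.
 * Returns 0 if even 'lo' loses packets.
 */
unsigned long max_lossfree_rate(unsigned long lo, unsigned long hi,
                                trial_fn trial, void *ctx)
{
    unsigned long best = 0;

    while (lo <= hi) {
        unsigned long mid = lo + (hi - lo) / 2;

        if (trial(mid, ctx)) {
            best = mid;        /* no loss: try a higher rate */
            lo = mid + 1;
        } else {
            hi = mid - 1;      /* loss seen: back off */
        }
    }
    return best;
}

/* Simulated device under test: forwards losslessly up to *ctx pps. */
bool simulated_trial(unsigned long pps, void *ctx)
{
    unsigned long capacity = *(unsigned long *)ctx;

    return pps <= capacity;
}
```

Calling max_lossfree_rate(1, 1488095, simulated_trial, &capacity) with a simulated capacity converges on that capacity in about 21 trials, which is why a bisection is practical even when each trial takes several seconds.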
> My notes say that before recycling (i.e. with all the mv643xx_eth
> patches I posted yesterday), the typical rate was 191718 pps, and
> after, 240385 pps. The 2.6.27 version of the driver gets ~130kpps.
> (The different injection rates are achieved by varying the inter-packet
> gap at byte granularities, so you don't get nice round numbers.)
>
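A small worked example of why the rates are not round numbers. The sketch assumes a 1 Gbit/s line rate (125,000,000 bytes per second on the wire), which the text above does not state explicitly, and treats each packet as occupying a "slot" of preamble + frame + inter-packet gap bytes. The helper names pps_for_slot() and percent_gain() are made up for this illustration. Under that assumption, the two quoted rates correspond to slots of 652 and 520 bytes respectively.

```c
/* Assumption: gigabit ethernet, i.e. 125,000,000 bytes/s on the wire. */
#define GIGE_BYTES_PER_SEC 125000000UL

/*
 * Packet rate when each packet occupies 'slot_bytes' on the wire
 * (preamble + frame + inter-packet gap), rounded to the nearest pps.
 */
unsigned long pps_for_slot(unsigned long slot_bytes)
{
    return (GIGE_BYTES_PER_SEC + slot_bytes / 2) / slot_bytes;
}

/* Relative improvement from 'before' to 'after', in percent. */
double percent_gain(unsigned long before, unsigned long after)
{
    return 100.0 * ((double)after - (double)before) / (double)before;
}
```

With these helpers, pps_for_slot(652) gives 191718 and pps_for_slot(520) gives 240385, and percent_gain(191718, 240385) is about 25.4, consistent with the "about 25%" in the patch description. For reference, pps_for_slot(84) gives 1488095, the usual figure for minimum-size frames (64-byte frame, 8-byte preamble, 12-byte gap).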
> Those measurements were made more than a week ago, though, and my
> mv643xx_eth patch stack has seen a lot of splitting and reordering and
> recombining and rewriting since then, so I'm not sure if those numbers
> are accurate anymore. I'll do some more benchmarks when I get access
> to the smartbits again. Also, I'll get TX vs. RX curves if you care
> about those.
>
> (The same hardware has been seen to do ~300 kpps or ~380 kpps or ~850
> kpps depending on how much of the networking stack you bypass, but I'm
> trying to find ways to optimise the routing throughput without
> bypassing the stack, i.e. while retaining full functionality.)
Thanks a lot for this detailed information, definitely useful!
As a side note, you have an arbitrarily long limit on the rx_recycle queue length (1000);
maybe you could use rx_ring_size instead.
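
To make the two suggestions (LIFO order, and a limit tied to the ring size) concrete, here is a minimal userspace model. It is not driver code and does not use the sk_buff API: struct buf, struct recycle_pool, pool_init(), pool_put() and pool_get() are hypothetical names chosen for this sketch. In the driver, the equivalent change would presumably be __skb_queue_head() in txq_reclaim() and a comparison against the ring size instead of 1000.

```c
#include <stddef.h>

/* Hypothetical stand-in for a receive buffer. */
struct buf {
    struct buf *next;
};

/* LIFO pool of recycled buffers, bounded by 'limit'. */
struct recycle_pool {
    struct buf *head;      /* most recently recycled buffer */
    unsigned int len;      /* buffers currently in the pool */
    unsigned int limit;    /* e.g. the RX ring size */
};

void pool_init(struct recycle_pool *p, unsigned int limit)
{
    p->head = NULL;
    p->len = 0;
    p->limit = limit;
}

/*
 * Called from the TX reclaim path. Returns 1 if the buffer was kept
 * for reuse, 0 if the pool is full and the caller should free it.
 */
int pool_put(struct recycle_pool *p, struct buf *b)
{
    if (p->len >= p->limit)
        return 0;
    b->next = p->head;     /* push at the head: LIFO */
    p->head = b;
    p->len++;
    return 1;
}

/*
 * Called from the RX refill path. Returns the most recently recycled
 * buffer (the one most likely to still be cache hot), or NULL if the
 * pool is empty and the caller must allocate a fresh buffer.
 */
struct buf *pool_get(struct recycle_pool *p)
{
    struct buf *b = p->head;

    if (b != NULL) {
        p->head = b->next;
        b->next = NULL;
        p->len--;
    }
    return b;
}
```

The refill path never needs more buffers than there are ring slots, so a pool larger than the ring mostly pins memory that has long since left the cache.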
Thread overview: 7+ messages
2008-09-03 13:54 [PATCH,RFC 0/2] skb recycling (and example implementation for mv643xx_eth) Lennert Buytenhek
2008-09-03 13:55 ` [PATCH 1/2] [NET] add skb_recycle_check() to enable netdriver skb recycling Lennert Buytenhek
2008-09-03 13:55 ` [PATCH 2/2] mv643xx_eth: hook up " Lennert Buytenhek
2008-09-03 14:25 ` Eric Dumazet
2008-09-04 4:20 ` Lennert Buytenhek
2008-09-04 4:50 ` Eric Dumazet [this message]
2008-09-14 19:30 ` Lennert Buytenhek