From: Andrew Vagin <avagin@parallels.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev@vger.kernel.org, vvs@parallels.com,
"Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Subject: Re: Slow speed of tcp connections in a network namespace
Date: Sun, 30 Dec 2012 01:15:09 +0400
Message-ID: <20121229211508.GB4350@paralelels.com>
In-Reply-To: <1356807516.4102.4.camel@edumazet-laptop>
On Sat, Dec 29, 2012 at 07:58:36PM +0100, Eric Dumazet wrote:
> On Saturday, 29 December 2012 at 09:40 -0800, Eric Dumazet wrote:
>
> >
> > Please post your new tcpdump then ;)
> >
> > also post "netstat -s" from root and test ns after your wgets
>
> Also try the following bnx2 patch.
>
> It should help GRO / TCP coalescing.
>
> bnx2 should be the last driver not using skb head_frag
>
This patch breaks nothing, but I don't know what kind of benefit I should
expect from it :)

FYI: I forgot to mention that I disabled GRO before collecting the tcpdump,
because that makes the captures from veth and eth0 easier to compare.
>
> diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
> index a1adfaf..08a2d40 100644
> --- a/drivers/net/ethernet/broadcom/bnx2.c
> +++ b/drivers/net/ethernet/broadcom/bnx2.c
> @@ -2726,6 +2726,14 @@ bnx2_free_rx_page(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index)
>  	rx_pg->page = NULL;
>  }
>
> +static void bnx2_frag_free(const struct bnx2 *bp, void *data)
> +{
> +	if (bp->rx_frag_size)
> +		put_page(virt_to_head_page(data));
> +	else
> +		kfree(data);
> +}
> +
>  static inline int
>  bnx2_alloc_rx_data(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index, gfp_t gfp)
>  {
> @@ -2735,7 +2743,10 @@ bnx2_alloc_rx_data(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index, gf
>  	struct bnx2_rx_bd *rxbd =
>  		&rxr->rx_desc_ring[BNX2_RX_RING(index)][BNX2_RX_IDX(index)];
>
> -	data = kmalloc(bp->rx_buf_size, gfp);
> +	if (bp->rx_frag_size)
> +		data = netdev_alloc_frag(bp->rx_frag_size);
> +	else
> +		data = kmalloc(bp->rx_buf_size, gfp);
>  	if (!data)
>  		return -ENOMEM;
>
> @@ -2744,7 +2755,7 @@ bnx2_alloc_rx_data(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u16 index, gf
>  				 bp->rx_buf_use_size,
>  				 PCI_DMA_FROMDEVICE);
>  	if (dma_mapping_error(&bp->pdev->dev, mapping)) {
> -		kfree(data);
> +		bnx2_frag_free(bp, data);
>  		return -EIO;
>  	}
>
> @@ -3014,9 +3025,9 @@ error:
>
>  	dma_unmap_single(&bp->pdev->dev, dma_addr, bp->rx_buf_use_size,
>  			 PCI_DMA_FROMDEVICE);
> -	skb = build_skb(data, 0);
> +	skb = build_skb(data, bp->rx_frag_size);
>  	if (!skb) {
> -		kfree(data);
> +		bnx2_frag_free(bp, data);
>  		goto error;
>  	}
>  	skb_reserve(skb, ((u8 *)get_l2_fhdr(data) - data) + BNX2_RX_OFFSET);
> @@ -5358,6 +5369,10 @@ bnx2_set_rx_ring_size(struct bnx2 *bp, u32 size)
>  	/* hw alignment + build_skb() overhead*/
>  	bp->rx_buf_size = SKB_DATA_ALIGN(bp->rx_buf_use_size + BNX2_RX_ALIGN) +
>  		NET_SKB_PAD + SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> +	if (bp->rx_buf_size <= PAGE_SIZE)
> +		bp->rx_frag_size = bp->rx_buf_size;
> +	else
> +		bp->rx_frag_size = 0;
>  	bp->rx_jumbo_thresh = rx_size - BNX2_RX_OFFSET;
>  	bp->rx_ring_size = size;
>  	bp->rx_max_ring = bnx2_find_max_ring(size, BNX2_MAX_RX_RINGS);
> @@ -5436,7 +5451,7 @@ bnx2_free_rx_skbs(struct bnx2 *bp)
>
>  			rx_buf->data = NULL;
>
> -			kfree(data);
> +			bnx2_frag_free(bp, data);
>  		}
>  		for (j = 0; j < bp->rx_max_pg_ring_idx; j++)
>  			bnx2_free_rx_page(bp, rxr, j);
> diff --git a/drivers/net/ethernet/broadcom/bnx2.h b/drivers/net/ethernet/broadcom/bnx2.h
> index 172efbe..11f5dee 100644
> --- a/drivers/net/ethernet/broadcom/bnx2.h
> +++ b/drivers/net/ethernet/broadcom/bnx2.h
> @@ -6804,6 +6804,7 @@ struct bnx2 {
>
>  	u32 rx_buf_use_size; /* useable size */
>  	u32 rx_buf_size; /* with alignment */
> +	u32 rx_frag_size; /* 0 if kmalloced(), or rx_buf_size */
>  	u32 rx_copy_thresh;
>  	u32 rx_jumbo_thresh;
>  	u32 rx_max_ring_idx;
>
>
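For readers following the quoted patch, here is a minimal userspace sketch of
the rule it adds in bnx2_set_rx_ring_size(): the page-fragment allocator is
used only when the whole receive buffer fits in one page, and a value of 0
means "fall back to kmalloc()". The names sketch_rx_frag_size(),
sketch_alloc_kind() and SKETCH_PAGE_SIZE are hypothetical and not part of the
driver, and the 4096-byte page size is an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for illustration; the kernel's PAGE_SIZE depends on
 * the architecture and configuration. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical labels for the two allocation paths in the patch. */
enum sketch_alloc_kind {
	SKETCH_ALLOC_KMALLOC,   /* kmalloc() on allocation, kfree() on release */
	SKETCH_ALLOC_PAGE_FRAG  /* netdev_alloc_frag() / put_page() */
};

/*
 * Same comparison as the patch: return the buffer size when it fits in
 * one page, otherwise 0 to signal the kmalloc() fallback.
 */
static uint32_t sketch_rx_frag_size(uint32_t rx_buf_size, uint32_t page_size)
{
	assert(page_size > 0);
	return rx_buf_size <= page_size ? rx_buf_size : 0;
}

/*
 * The allocation and free paths must agree, which is why the patch routes
 * every release through one helper keyed on rx_frag_size.
 */
static enum sketch_alloc_kind sketch_alloc_kind(uint32_t rx_frag_size)
{
	return rx_frag_size ? SKETCH_ALLOC_PAGE_FRAG : SKETCH_ALLOC_KMALLOC;
}
```

With these assumptions, a 2048-byte buffer selects the page-fragment path,
while a 9216-byte buffer yields 0 and stays on kmalloc().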