Netdev List
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: Richard Cochran <richardcochran@gmail.com>,
	David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org
Subject: Re: [RFC] net: remove erroneous sk null assignment in timestamping
Date: Sat, 08 Oct 2011 10:57:18 +0200	[thread overview]
Message-ID: <1318064238.5276.2.camel@edumazet-laptop> (raw)
In-Reply-To: <1318061808.3991.12.camel@jlt3.sipsolutions.net>

Le samedi 08 octobre 2011 à 10:16 +0200, Johannes Berg a écrit :

> I'm not terribly familiar with struct sock. Looking at it, I'm a bit
> confused by skb_orphan() -- it doesn't put the sock reference. So are
> sockets not refcounted for skbs in this way? They seem to use
> sock_wfree() which does a bit more than this it seems, and I don't see
> it using sk_refcnt anywhere so I'm a bit confused now.

Check following commit changelog to get some information on this.

commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Thu Jun 11 02:55:43 2009 -0700

    net: No more expensive sock_hold()/sock_put() on each tx
    
    One of the problem with sock memory accounting is it uses
    a pair of sock_hold()/sock_put() for each transmitted packet.
    
    This slows down bidirectional flows because the receive path
    also needs to take a refcount on socket and might use a different
    cpu than transmit path or transmit completion path. So these
    two atomic operations also trigger cache line bounces.
    
    We can see this in tx or tx/rx workloads (media gateways for example),
    where sock_wfree() can be in top five functions in profiles.
    
    We use this sock_hold()/sock_put() so that sock freeing
    is delayed until all tx packets are completed.
    
    As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
    by one unit at init time, until sk_free() is called.
    Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc)
    to decrement initial offset and atomicaly check if any packets
    are in flight.
    
    skb_set_owner_w() doesnt call sock_hold() anymore
    
    sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc
    reached 0 to perform the final freeing.
    
    Drawback is that a skb->truesize error could lead to unfreeable sockets, or
    even worse, prematurely calling __sk_free() on a live socket.
    
    Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s
    on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt
    contention point. 5 % speedup on a UDP transmit workload (depends
    on number of flows), lowering TX completion cpu usage.

  reply	other threads:[~2011-10-08  8:57 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-07 17:11 [RFC] net: remove erroneous sk null assignment in timestamping Johannes Berg
2011-10-07 17:33 ` David Miller
2011-10-07 17:40   ` Johannes Berg
2011-10-07 17:47     ` Johannes Berg
2011-10-07 17:53       ` Johannes Berg
2011-10-07 18:42     ` Johannes Berg
2011-10-08  7:59       ` Richard Cochran
2011-10-08  7:57   ` Richard Cochran
2011-10-08  8:16     ` Johannes Berg
2011-10-08  8:57       ` Eric Dumazet [this message]
2011-10-08 10:32         ` Johannes Berg
2011-10-11 13:34           ` Richard Cochran
2011-10-08 10:35         ` Richard Cochran
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
2011-10-12 19:25   ` Eric Dumazet
2011-10-12 19:27   ` Johannes Berg
2011-10-12 19:52     ` Eric Dumazet
2011-10-13  8:54       ` Johannes Berg
2011-10-13  4:51     ` Richard Cochran
2011-10-13  9:46   ` [PATCH 0/3] net: time stamping fixes Richard Cochran
2011-10-13  9:46     ` [PATCH 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
2011-10-19  4:42       ` Eric Dumazet
2011-10-13  9:46     ` [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
2011-10-19  4:47       ` Eric Dumazet
2011-10-13  9:46     ` [PATCH 3/3] dp83640: free packet queues on remove Richard Cochran
2011-10-19  4:48       ` Eric Dumazet
2011-10-19  4:16     ` [PATCH 0/3] net: time stamping fixes David Miller
2011-10-19  5:15       ` Johannes Berg
2011-10-19 11:50         ` Richard Cochran
2011-10-19 12:33           ` Eric Dumazet
2011-10-19 12:38           ` Eric Dumazet
2011-10-19 12:58             ` Johannes Berg
2011-10-19 13:09               ` Johannes Berg
2011-10-19 13:25                 ` Eric Dumazet
2011-10-19 13:35                   ` Johannes Berg
2011-10-19 13:44                     ` Eric Dumazet
2011-10-19 13:57                       ` Johannes Berg
2011-10-19 14:08                         ` Eric Dumazet
2011-10-19 14:24                           ` Johannes Berg
2011-10-19 14:27                             ` Richard Cochran
2011-10-19 14:33                               ` Eric Dumazet
2011-10-19 13:21               ` Eric Dumazet
2011-10-19 13:25                 ` Johannes Berg
2011-10-19 13:27                   ` Eric Dumazet
2011-10-19 13:32                     ` Johannes Berg
2011-10-19 14:25                       ` Richard Cochran
2011-10-21 10:49   ` [PATCH v2 " Richard Cochran
2011-10-21 10:49     ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
2011-10-21 11:31       ` Eric Dumazet
2011-10-24  6:55         ` David Miller
2011-10-21 11:44       ` Johannes Berg
2011-10-21 10:49     ` [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
2011-10-24  6:55       ` David Miller
2011-10-24 17:47         ` Richard Cochran
2011-10-24 23:16           ` David Miller
2011-10-21 10:49     ` [PATCH v2 3/3] dp83640: free packet queues on remove Richard Cochran

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1318064238.5276.2.camel@edumazet-laptop \
    --to=eric.dumazet@gmail.com \
    --cc=davem@davemloft.net \
    --cc=johannes@sipsolutions.net \
    --cc=netdev@vger.kernel.org \
    --cc=richardcochran@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox