From: Alexander Duyck <alexander.h.duyck@intel.com>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: netdev@vger.kernel.org, linux-wireless@vger.kernel.org,
	davem@davemloft.net, eric.dumazet@gmail.com,
	linville@tuxdriver.com
Subject: Re: [PATCH net-next 2/2] mac80211: Resolve sk_refcnt/sk_wmem_alloc issue in wifi ack path
Date: Thu, 11 Sep 2014 08:21:19 -0700
Message-ID: <5411BDEF.7070105@intel.com>
In-Reply-To: <1410419198.1825.5.camel@jlt4.sipsolutions.net>

On 09/11/2014 12:06 AM, Johannes Berg wrote:
> On Wed, 2014-09-10 at 18:05 -0400, Alexander Duyck wrote:
>> There is a possible issue with the use, or lack thereof of sk_refcnt and
>> sk_wmem_alloc in the wifi ack status functionality.
>>
>> Specifically if a socket were to request acknowledgements, and the socket
>> were to have sk_refcnt drop to 0 resulting in it waiting on sk_wmem_alloc
>> to reach 0 it would be possible to have sock_queue_err_skb orphan the last
>> buffer, resulting in __sk_free being called on the socket.  After this the
>> buffer is enqueued on sk_error_queue, however the queue has already been
>> flushed resulting in at least a memory leak, if not a data corruption.
> 
> Oh. Thanks :-)
> 
>> +	/* take a reference to prevent skb_orphan() from freeing the socket */
>> +	sock_hold(sk);
>> +
>>  	err = sock_queue_err_skb(sk, skb);
>>  	if (err)
>>  		kfree_skb(skb);
>> +
>> +	sock_put(sk);
>>  }
>>  EXPORT_SYMBOL_GPL(skb_complete_wifi_ack);
> 
> Here I'm not sure it matters *for this function*? Wouldn't it be freed
> then in sock_put(), which has the same net effect on this function
> overall? It doesn't use it after sock_queue_err_skb().

The significant piece is that we are calling sock_put *after* the call
to sock_queue_err_skb.  So if we are dropping the last reference, the
buffer is already on the sk_error_queue and will be purged when
__sk_free is called.
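
As a rough illustration of that ordering, here is a toy userspace model
(not kernel code).  The toy_* names and the struct are hypothetical, and
the model assumes the buffer being queued pins the socket through
sk_refcnt, so that orphaning it inside the queueing step can drop the
last reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct sock; every name here is hypothetical. */
struct toy_sock {
	int refcnt;        /* models sk_refcnt */
	int wmem_alloc;    /* models sk_wmem_alloc, which starts at 1 */
	int err_queue_len; /* buffers currently on the error queue */
	int purged;        /* buffers freed by the final purge */
	int leaked;        /* buffers queued after the purge already ran */
	bool freed;        /* set once the __sk_free stand-in has run */
};

/* Models __sk_free: purge the error queue, then the socket is gone. */
static void toy_sk_free_final(struct toy_sock *sk)
{
	sk->purged += sk->err_queue_len;
	sk->err_queue_len = 0;
	sk->freed = true;
}

static void toy_sock_hold(struct toy_sock *sk)
{
	assert(sk->refcnt > 0); /* a count that hit 0 cannot be revived */
	sk->refcnt++;
}

/* Models sock_put: drop sk_refcnt, then the initial sk_wmem_alloc bias. */
static void toy_sock_put(struct toy_sock *sk)
{
	if (--sk->refcnt == 0 && --sk->wmem_alloc == 0)
		toy_sk_free_final(sk);
}

/* Models sock_queue_err_skb: orphan first, then touch the queue. */
static void toy_queue_err_skb(struct toy_sock *sk)
{
	toy_sock_put(sk); /* orphan: the buffer gives up its reference */
	if (sk->freed)
		sk->leaked++; /* queue was already flushed */
	else
		sk->err_queue_len++;
}

/* Run the completion path with or without the extra hold/put pair. */
static struct toy_sock toy_complete_ack(bool hold)
{
	/* The buffer owns the only sk_refcnt reference at this point. */
	struct toy_sock sk = { .refcnt = 1, .wmem_alloc = 1 };

	if (hold)
		toy_sock_hold(&sk);
	toy_queue_err_skb(&sk);
	if (hold)
		toy_sock_put(&sk);
	return sk;
}
```

In this model, toy_complete_ack(false) ends with one leaked buffer,
while toy_complete_ack(true) queues the buffer first and lets the final
put purge it.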

> Seems like maybe this should be in sock_queue_err_skb() itself, since it
> does the orphaning first and then looks at the socket. Or the
> documentation for that function should state that it has to be held, but
> there are plenty of callers?

The problem is that there are a number of cases where the sock_hold/put
pair is not needed.  For example, if we were to clone the skb and
immediately send the clone up the sk_error_queue then we don't need it.
We only need it if orphaning the buffer being queued could result in the
destructor calling __sk_free.

>>  			spin_lock_irqsave(&local->ack_status_lock, flags);
>> -			id = idr_alloc(&local->ack_status_frames, orig_skb,
>> +			id = idr_alloc(&local->ack_status_frames, ack_skb,
>>  				       1, 0x10000, GFP_ATOMIC);
>>  			spin_unlock_irqrestore(&local->ack_status_lock, flags);
>>  
>>  			if (id >= 0) {
>>  				info_id = id;
>>  				info_flags |= IEEE80211_TX_CTL_REQ_TX_STATUS;
>> -			} else if (skb_shared(skb)) {
>> -				kfree_skb(orig_skb);
>>  			} else {
>> -				kfree_skb(skb);
>> -				skb = orig_skb;
>> +				kfree_skb(ack_skb);
>>  			}
> 
> So you're removing this part, but can't we really not reuse the clone_sk
> copy? The difference is that it's charged, but that's fine for the
> purposes here, no? Or am I misunderstanding that?
> 
> johannes

The copy being held cannot really be used for transmit.  The problem is
that it is holding the wrong kind of reference.

The problem lies in the order things are released.  The sock_put
function does a dec_and_test on sk_refcnt; once that reaches 0 it does a
dec_and_test on sk_wmem_alloc to see if it should call __sk_free.  Until
sk_refcnt reaches 0, sk_wmem_alloc cannot reach 0.  Once either of these
drops to 0 we cannot bring the value back up from there.  So if I were
to transmit the clone, it could let sk_refcnt drop to 0, in which case
any later calls to sock_hold are invalid.
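
The two-stage release described above can be sketched with a small
userspace model (again not kernel code; the model_* names are
hypothetical).  It assumes sk_wmem_alloc starts with a bias of 1 that is
only dropped once sk_refcnt reaches 0, and that each in-flight transmit
buffer adds one more charge.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two counters; all names are hypothetical. */
struct model_sock {
	int refcnt;     /* models sk_refcnt */
	int wmem_alloc; /* models sk_wmem_alloc, biased by 1 */
	bool freed;     /* set once the __sk_free stand-in has run */
};

/* A transmit buffer charged to the socket bumps the wmem count. */
static void model_charge_tx(struct model_sock *sk)
{
	sk->wmem_alloc++;
}

/* Dropping a wmem charge; the last one frees the socket. */
static void model_wmem_release(struct model_sock *sk)
{
	assert(sk->wmem_alloc > 0);
	if (--sk->wmem_alloc == 0)
		sk->freed = true;
}

/* Models sock_put: only the last sk_refcnt drop touches the wmem bias. */
static void model_sock_put(struct model_sock *sk)
{
	assert(sk->refcnt > 0);
	if (--sk->refcnt == 0)
		model_wmem_release(sk);
}

/* A hold is only valid while sk_refcnt is still nonzero. */
static bool model_hold_not_zero(struct model_sock *sk)
{
	if (sk->refcnt == 0)
		return false;
	sk->refcnt++;
	return true;
}
```

With one reference and one charged buffer, model_sock_put() leaves
refcnt at 0 and wmem_alloc at 1: the socket is not yet freed, but
model_hold_not_zero() now fails, and the socket goes away as soon as the
buffer's charge is released.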

I would need to somehow hold the reference based on sk_wmem_alloc if we
want to transmit the clone.  Many of the hardware timestamping drivers
seem to just clone the original skb, queue that clone onto the
sk_error_queue, and then free the original after completing the call.  I
suppose we could change it to something like that, but you are still
looking at possibly 2 clones in that case anyway.

Thanks,

Alex
