netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] net: remove erroneous sk null assignment in timestamping
@ 2011-10-07 17:11 Johannes Berg
  2011-10-07 17:33 ` David Miller
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
  0 siblings, 2 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-07 17:11 UTC (permalink / raw)
  To: netdev; +Cc: Richard Cochran

From: Johannes Berg <johannes.berg@intel.com>

skb->sk is obviously required to be non-NULL
when we get into skb_complete_tx_timestamp().
sock_queue_err_skb() will call skb_orphan()
first thing which sets skb->sk = NULL itself.
This may crash if the skb is still charged to
the socket (skb->destructor is sk_wfree).

The assignment here thus seems to not only be
pointless (due to the skb_orphan() call) but
also dangerous (due to the crash).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 net/core/timestamping.c |    1 -
 1 file changed, 1 deletion(-)

--- a/net/core/timestamping.c	2011-10-07 18:59:12.000000000 +0200
+++ b/net/core/timestamping.c	2011-10-07 19:07:06.000000000 +0200
@@ -85,7 +85,6 @@ void skb_complete_tx_timestamp(struct sk
 	memset(serr, 0, sizeof(*serr));
 	serr->ee.ee_errno = ENOMSG;
 	serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
-	skb->sk = NULL;
 	err = sock_queue_err_skb(sk, skb);
 	if (err)
 		kfree_skb(skb);

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-07 17:11 [RFC] net: remove erroneous sk null assignment in timestamping Johannes Berg
@ 2011-10-07 17:33 ` David Miller
  2011-10-07 17:40   ` Johannes Berg
  2011-10-08  7:57   ` Richard Cochran
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
  1 sibling, 2 replies; 56+ messages in thread
From: David Miller @ 2011-10-07 17:33 UTC (permalink / raw)
  To: johannes; +Cc: netdev, richardcochran

From: Johannes Berg <johannes@sipsolutions.net>
Date: Fri, 07 Oct 2011 19:11:41 +0200

> From: Johannes Berg <johannes.berg@intel.com>
> 
> skb->sk is obviously required to be non-NULL
> when we get into skb_complete_tx_timestamp().
> sock_queue_err_skb() will call skb_orphan()
> first thing which sets skb->sk = NULL itself.
> This may crash if the skb is still charged to
> the socket (skb->destructor is sk_wfree).
> 
> The assignment here thus seems to not only be
> pointless (due to the skb_orphan() call) but
> also dangerous (due to the crash).
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>

It looks like skb_clone_tx_timestamp() sets clone->sk without any
proper refcounting, so I bet this NULL'ing it out is working
around that bug.

The TX side of this infrastructure seems very poorly tested.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-07 17:33 ` David Miller
@ 2011-10-07 17:40   ` Johannes Berg
  2011-10-07 17:47     ` Johannes Berg
  2011-10-07 18:42     ` Johannes Berg
  2011-10-08  7:57   ` Richard Cochran
  1 sibling, 2 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-07 17:40 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, richardcochran

On Fri, 2011-10-07 at 13:33 -0400, David Miller wrote:
> From: Johannes Berg <johannes@sipsolutions.net>
> Date: Fri, 07 Oct 2011 19:11:41 +0200
> 
> > From: Johannes Berg <johannes.berg@intel.com>
> > 
> > skb->sk is obviously required to be non-NULL
> > when we get into skb_complete_tx_timestamp().
> > sock_queue_err_skb() will call skb_orphan()
> > first thing which sets skb->sk = NULL itself.
> > This may crash if the skb is still charged to
> > the socket (skb->destructor is sk_wfree).
> > 
> > The assignment here thus seems to not only be
> > pointless (due to the skb_orphan() call) but
> > also dangerous (due to the crash).
> > 
> > Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> 
> It looks like skb_clone_tx_timestamp() sets clone->sk without any
> proper refcounting, so I bet this NULL'ing it out is working
> around that bug.

You're right. But this bug can actually trigger another way: The only
user of this is dp83640_txtstamp() which might do kfree_skb() on this
skb, in which case that'll likely crash trying to clean up the sk.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-07 17:40   ` Johannes Berg
@ 2011-10-07 17:47     ` Johannes Berg
  2011-10-07 17:53       ` Johannes Berg
  2011-10-07 18:42     ` Johannes Berg
  1 sibling, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-07 17:47 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, richardcochran

On Fri, 2011-10-07 at 19:40 +0200, Johannes Berg wrote:

> > It looks like skb_clone_tx_timestamp() sets clone->sk without any
> > proper refcounting, so I bet this NULL'ing it out is working
> > around that bug.
> 
> You're right. But this bug can actually trigger another way: The only
> user of this is dp83640_txtstamp() which might do kfree_skb() on this
> skb, in which case that'll likely crash trying to clean up the sk.

Not only that -- I don't even see what causes the socket to not go away
while we're waiting for the ts_work that dp83640_txtstamp() fires off to
run.

And actually, no, this is all wrong. It shouldn't kick off ts_work
anyway, ts_work only handles *RX* timestamps, never *TX*. The *TX*
timestamps are handled by decode_txts() which is called when we receive
a TXTS frame from the device... *that* dequeues the skb from tx_queue
there.

I'm afraid as is this is terminally broken. I don't have a device with a
dp83640 PHY, but I bet you can cause kernel crashes by doing something
like

while (1)
	fd = open()
	enable tx timestamping on fd;
	send(fd, frame)
	close(fd);

I don't even think any of this is privileged.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-07 17:47     ` Johannes Berg
@ 2011-10-07 17:53       ` Johannes Berg
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-07 17:53 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, richardcochran

On Fri, 2011-10-07 at 19:47 +0200, Johannes Berg wrote:

> I'm afraid as is this is terminally broken. I don't have a device with a
> dp83640 PHY, but I bet you can cause kernel crashes by doing something
> like
> 
> while (1)
> 	fd = open()
> 	enable tx timestamping on fd;
> 	send(fd, frame)
> 	close(fd);

It's possible that it doesn't crash *if* (and only if!) the ethernet
driver is guaranteed to process the TXTS RX packet before freeing the
original SKB off the TX queue, and also never orphans or whatever the
original TX SKB. In that case the original TX SKB will hang on to the
socket for long enough I guess.

But that's not something I'd want to rely on. All it means that the
above code isn't an instant kernel crash. The code is still broken.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-07 17:40   ` Johannes Berg
  2011-10-07 17:47     ` Johannes Berg
@ 2011-10-07 18:42     ` Johannes Berg
  2011-10-08  7:59       ` Richard Cochran
  1 sibling, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-07 18:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, richardcochran

On Fri, 2011-10-07 at 19:40 +0200, Johannes Berg wrote:

> > It looks like skb_clone_tx_timestamp() sets clone->sk without any
> > proper refcounting, so I bet this NULL'ing it out is working
> > around that bug.
> 
> You're right. But this bug can actually trigger another way: The only
> user of this is dp83640_txtstamp() which might do kfree_skb() on this
> skb, in which case that'll likely crash trying to clean up the sk.

Maybe that's how you can trigger it: have one thread turn on and off
timestamping all the time, and another thread send frames all the time,
then eventually you'll probably run into the kfree_skb() case there. If
you ever manage to run into that case, it'll crash either when freeing
this skb or when freeing the original.

Anyway, it's broken. I'll stop thinking about it. You (Richard) should
fix it quickly though otherwise I think we should revert/delete this
code.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-07 17:33 ` David Miller
  2011-10-07 17:40   ` Johannes Berg
@ 2011-10-08  7:57   ` Richard Cochran
  2011-10-08  8:16     ` Johannes Berg
  1 sibling, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-08  7:57 UTC (permalink / raw)
  To: David Miller; +Cc: johannes, netdev

On Fri, Oct 07, 2011 at 01:33:56PM -0400, David Miller wrote:
> It looks like skb_clone_tx_timestamp() sets clone->sk without any
> proper refcounting, so I bet this NULL'ing it out is working
> around that bug.

I don't remember why I put it that way, but I took a look at the
problem, and I am not sure how to solve it. The other callers of
sock_queue_err_skb all create or clone the error skb immediately
before queueing it:

  net/core/skbuff.c:       skb_tstamp_tx
  net/ipv4/ip_sockglue.c:  ip_icmp_error, ip_local_error
  net/ipv6/datagram.c:     ipv6_icmp_error, ipv6_local_error

So I need to prevent the socket from disappearing between
skb_clone_tx_timestamp and skb_complete_tx_timestamp:

  skb_clone_tx_timestamp
	clone = skb_clone(skb, GFP_ATOMIC);
	sock_hold
  skb_complete_tx_timestamp
	sock_queue_err_skb(sk, skb);
	sock_put

What do you think?

BTW, while looking for a good pattern to follow, I found that the can
driver also sets skb->sk after clone with no special treatment, like
so:

  drivers/net/can/dev.c:285
	can_put_echo_skb
		struct sock *srcsk = skb->sk;
		skb = skb_clone(old_skb, GFP_ATOMIC);
		skb->sk = srcsk;

> The TX side of this infrastructure seems very poorly tested.

In fact, we do have the phyter driver used in an extensive automated
test farm, but the applications just don't do the kinds of things
suggested to trigger the problem. The normal pattern is, send event
packet, get tx timestamp, and so we haven't seen the bug at all.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-07 18:42     ` Johannes Berg
@ 2011-10-08  7:59       ` Richard Cochran
  0 siblings, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-08  7:59 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David Miller, netdev

On Fri, Oct 07, 2011 at 08:42:12PM +0200, Johannes Berg wrote:
> Maybe that's how you can trigger it: have one thread turn on and off
> timestamping all the time, and another thread send frames all the time,
> then eventually you'll probably run into the kfree_skb() case there. If
> you ever manage to run into that case, it'll crash either when freeing
> this skb or when freeing the original.

Thats one weird app, but I get the point, and thanks for your
attention to my code.

Richard
 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-08  7:57   ` Richard Cochran
@ 2011-10-08  8:16     ` Johannes Berg
  2011-10-08  8:57       ` Eric Dumazet
  0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-08  8:16 UTC (permalink / raw)
  To: Richard Cochran; +Cc: David Miller, netdev

On Sat, 2011-10-08 at 09:57 +0200, Richard Cochran wrote:

> I don't remember why I put it that way, but I took a look at the
> problem, and I am not sure how to solve it. The other callers of
> sock_queue_err_skb all create or clone the error skb immediately
> before queueing it:
> 
>   net/core/skbuff.c:       skb_tstamp_tx
>   net/ipv4/ip_sockglue.c:  ip_icmp_error, ip_local_error
>   net/ipv6/datagram.c:     ipv6_icmp_error, ipv6_local_error

Yeah, I noticed that too. That's also the reason they pass the socket
externally I believe, since it's not a properly refcounted socket (the
reference they use is still from the original skb).

The thing that makes it work is that
 a) they don't release the original SKB before sock_queue_err_skb() and
 b) skb->sk is NULL for them

Since this is just a single function, they can guarantee that -- in the
case we found here it's scattered across the code and won't always be
guaranteed -- e.g. the kfree_skb() case in the PHY driver potentially
violates b).

> So I need to prevent the socket from disappearing between
> skb_clone_tx_timestamp and skb_complete_tx_timestamp:
> 
>   skb_clone_tx_timestamp
> 	clone = skb_clone(skb, GFP_ATOMIC);
> 	sock_hold
>   skb_complete_tx_timestamp
> 	sock_queue_err_skb(sk, skb);
> 	sock_put
> 
> What do you think?

I'm not terribly familiar with struct sock. Looking at it, I'm a bit
confused by skb_orphan() -- it doesn't put the sock reference. So are
sockets not refcounted for skbs in this way? They seem to use
sock_wfree() which does a bit more than this it seems, and I don't see
it using sk_refcnt anywhere so I'm a bit confused now.

> BTW, while looking for a good pattern to follow, I found that the can
> driver also sets skb->sk after clone with no special treatment, like
> so:
> 
>   drivers/net/can/dev.c:285
> 	can_put_echo_skb
> 		struct sock *srcsk = skb->sk;
> 		skb = skb_clone(old_skb, GFP_ATOMIC);
> 		skb->sk = srcsk;

Yeah that looks fishy too. But to me it looks a bit like it should
charge to the socket instead of refcounting it -- though of course
that's not really the correct thing to do from a socket buffer point of
view, but it seems the sk_refcnt and sk_wmem_alloc are two separate
mechanisms of refcounting the socket -- I just haven't figured out yet
how they interact.

> > The TX side of this infrastructure seems very poorly tested.
> 
> In fact, we do have the phyter driver used in an extensive automated
> test farm, but the applications just don't do the kinds of things
> suggested to trigger the problem. The normal pattern is, send event
> packet, get tx timestamp, and so we haven't seen the bug at all.

Makes sense, you never wrote an application trying to crash it :-)

> > Maybe that's how you can trigger it: have one thread turn on and off
> > timestamping all the time, and another thread send frames all the time,
> > then eventually you'll probably run into the kfree_skb() case there. If
> > you ever manage to run into that case, it'll crash either when freeing
> > this skb or when freeing the original.
> 
> Thats one weird app, but I get the point, and thanks for your
> attention to my code.

Agree, it's obviously a specifically devised app to try to make it
crash. It serves no other practical purpose.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-08  8:16     ` Johannes Berg
@ 2011-10-08  8:57       ` Eric Dumazet
  2011-10-08 10:32         ` Johannes Berg
  2011-10-08 10:35         ` Richard Cochran
  0 siblings, 2 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-08  8:57 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev

Le samedi 08 octobre 2011 à 10:16 +0200, Johannes Berg a écrit :

> I'm not terribly familiar with struct sock. Looking at it, I'm a bit
> confused by skb_orphan() -- it doesn't put the sock reference. So are
> sockets not refcounted for skbs in this way? They seem to use
> sock_wfree() which does a bit more than this it seems, and I don't see
> it using sk_refcnt anywhere so I'm a bit confused now.

Check following commit changelog to get some information on this.

commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Thu Jun 11 02:55:43 2009 -0700

    net: No more expensive sock_hold()/sock_put() on each tx
    
    One of the problem with sock memory accounting is it uses
    a pair of sock_hold()/sock_put() for each transmitted packet.
    
    This slows down bidirectional flows because the receive path
    also needs to take a refcount on socket and might use a different
    cpu than transmit path or transmit completion path. So these
    two atomic operations also trigger cache line bounces.
    
    We can see this in tx or tx/rx workloads (media gateways for example),
    where sock_wfree() can be in top five functions in profiles.
    
    We use this sock_hold()/sock_put() so that sock freeing
    is delayed until all tx packets are completed.
    
    As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
    by one unit at init time, until sk_free() is called.
    Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc)
    to decrement initial offset and atomicaly check if any packets
    are in flight.
    
    skb_set_owner_w() doesnt call sock_hold() anymore
    
    sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc
    reached 0 to perform the final freeing.
    
    Drawback is that a skb->truesize error could lead to unfreeable sockets, or
    even worse, prematurely calling __sk_free() on a live socket.
    
    Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s
    on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt
    contention point. 5 % speedup on a UDP transmit workload (depends
    on number of flows), lowering TX completion cpu usage.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-08  8:57       ` Eric Dumazet
@ 2011-10-08 10:32         ` Johannes Berg
  2011-10-11 13:34           ` Richard Cochran
  2011-10-08 10:35         ` Richard Cochran
  1 sibling, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-08 10:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev

On Sat, 2011-10-08 at 10:57 +0200, Eric Dumazet wrote:
> Le samedi 08 octobre 2011 à 10:16 +0200, Johannes Berg a écrit :
> 
> > I'm not terribly familiar with struct sock. Looking at it, I'm a bit
> > confused by skb_orphan() -- it doesn't put the sock reference. So are
> > sockets not refcounted for skbs in this way? They seem to use
> > sock_wfree() which does a bit more than this it seems, and I don't see
> > it using sk_refcnt anywhere so I'm a bit confused now.
> 
> Check following commit changelog to get some information on this.
> 
> commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
> Author: Eric Dumazet <eric.dumazet@gmail.com>
> Date:   Thu Jun 11 02:55:43 2009 -0700
> 
>     net: No more expensive sock_hold()/sock_put() on each tx

Aha, thanks for the pointer!

>     As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
>     by one unit at init time, until sk_free() is called.

This is the trick! Neat. I see it now, now sk_free() makes sense to
me :-)

There's one thing I still miss though: It seems to me that if you have a
reference to a socket that has been sk_free()'ed (which is possible
since it might still have sk_wmem_alloc > 0) you can't sock_hold() that
socket. That feels a bit unexpected -- and might happen in the code
Richard just suggested.

Basically, while you can bump a reference you own via sock_hold() by
sk_wmem_alloc, as soon as sk_refcnt reaches 0 it must never go > 0 again
because that will have released the sk_wmem_alloc offset.

That can be fixed, but I'm not exactly sure what would be an efficient
way of doing it. Maybe by adding some flag that says whether or not the
sk_wmem_alloc offset is present, but that flag would have to be atomic
since I guess even this could race.

>     Drawback is that a skb->truesize error could lead to unfreeable sockets, or
>     even worse, prematurely calling __sk_free() on a live socket.

I was thinking about this yesterday as well. What we could do is wrap
all the skb truesize operations in inlines -- the regular ones like
skb_truesize_set()/add()/sub() would (depending on a debug Kconfig)
check that the skb isn't charged to a socket, and common sequences like
changing truesize and updating the sock could be wrapped into another
set of inlines that do both. Or something like that.

I was actually thinking about this for other reasons but then realised
that with the truesize bug check gone (which was really checking
something else) I didn't really need to worry any more.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-08  8:57       ` Eric Dumazet
  2011-10-08 10:32         ` Johannes Berg
@ 2011-10-08 10:35         ` Richard Cochran
  1 sibling, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-08 10:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Johannes Berg, David Miller, netdev

On Sat, Oct 08, 2011 at 10:57:18AM +0200, Eric Dumazet wrote:
> Le samedi 08 octobre 2011 à 10:16 +0200, Johannes Berg a écrit :
> 
> > I'm not terribly familiar with struct sock. Looking at it, I'm a bit
> > confused by skb_orphan() -- it doesn't put the sock reference. So are
> > sockets not refcounted for skbs in this way? They seem to use
> > sock_wfree() which does a bit more than this it seems, and I don't see
> > it using sk_refcnt anywhere so I'm a bit confused now.

Me, too.

> Check following commit changelog to get some information on this.

Thanks, Eric, that does help explain.

>     We use this sock_hold()/sock_put() so that sock freeing
>     is delayed until all tx packets are completed.
>     
>     As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
>     by one unit at init time, until sk_free() is called.
>     Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc)
>     to decrement initial offset and atomicaly check if any packets
>     are in flight.
>     
>     skb_set_owner_w() doesnt call sock_hold() anymore

So, if I understand, then I can solve my particular problem by doing:

* skb_clone_tx_timestamp
	clone = skb_clone(skb, GFP_ATOMIC);
	skb_set_owner_w(clone, sk)
// instead of
//	clone->sk = sk;
	phydev->drv->txtstamp(phydev, clone, type);

* skb_complete_tx_timestamp
	serr->ee.ee_errno = ENOMSG;
	serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
// just remove this:
//	skb->sk = NULL;
	err = sock_queue_err_skb(sk, skb);

The only problem(?) I see is that it violates the rules from sock.h,
quoted below. The cloned tx skb destined for the error queue would be
budgeted to sk_wmem_alloc while wait for the time stamp. But maybe we
can allow this?

from sock.h:
/*
 * Socket reference counting postulates.
 *
 * * Each user of socket SHOULD hold a reference count.
 * * Each access point to socket (an hash table bucket, reference from a list,
 *   running timer, skb in flight MUST hold a reference count.
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                   /
BTW, no longer true.

 * * Packets, delivered from outside (from network or from another process)
 *   and enqueued on receive/error queues SHOULD NOT grab reference count,
 *   when they sit in queue. ~~~~~~~~~~~~~~~~~~~~~~~

Want to break/bend this rule.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC] net: remove erroneous sk null assignment in timestamping
  2011-10-08 10:32         ` Johannes Berg
@ 2011-10-11 13:34           ` Richard Cochran
  0 siblings, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-11 13:34 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Eric Dumazet, David Miller, netdev

On Sat, Oct 08, 2011 at 12:32:15PM +0200, Johannes Berg wrote:
> On Sat, 2011-10-08 at 10:57 +0200, Eric Dumazet wrote:
> > Check following commit changelog to get some information on this.
> > 
> > commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
> > Author: Eric Dumazet <eric.dumazet@gmail.com>
> > Date:   Thu Jun 11 02:55:43 2009 -0700
> > 
> >     net: No more expensive sock_hold()/sock_put() on each tx
...
> There's one thing I still miss though: It seems to me that if you have a
> reference to a socket that has been sk_free()'ed (which is possible
> since it might still have sk_wmem_alloc > 0) you can't sock_hold() that
> socket. That feels a bit unexpected -- and might happen in the code
> Richard just suggested.

Yes, I have been trying to see how to solve this, but it looks like I
am out of luck. 

Even if I use skb_set_owner_w() in skb_clone_tx_timestamp(), still the
sock might go away during skb_orphan() in sock_queue_err_skb().

It is no good to take sock_hold() in skb_complete_tx_timestamp(),
since, as you point out, it might not be safe to call.

So, I wonder, when is it safe to call sock_hold?

Are the 101 odd callers protected against the situation where the last
sock_out() has already happened?

Thanks,

Richard

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH 1/1] net: hold sock reference while processing tx timestamps
  2011-10-07 17:11 [RFC] net: remove erroneous sk null assignment in timestamping Johannes Berg
  2011-10-07 17:33 ` David Miller
@ 2011-10-12 18:36 ` Richard Cochran
  2011-10-12 19:25   ` Eric Dumazet
                     ` (9 more replies)
  1 sibling, 10 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-12 18:36 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable

The pair of functions,

 * skb_clone_tx_timestamp()
 * skb_complete_tx_timestamp()

were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.

As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these function are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.

These functions first appeared in v2.6.36.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@kernel.org>
---
 include/linux/phy.h     |    2 +-
 include/linux/skbuff.h  |    7 ++++++-
 net/core/timestamping.c |    7 ++++++-
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 54fc413..79f337c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -420,7 +420,7 @@ struct phy_driver {
 
 	/*
 	 * Requests a Tx timestamp for 'skb'. The phy driver promises
-	 * to deliver it to the socket's error queue as soon as a
+	 * to deliver it using skb_complete_tx_timestamp() as soon as a
 	 * timestamp becomes available. One of the PTP_CLASS_ values
 	 * is passed in 'type'.
 	 */
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8bd383c..0f96646 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2020,8 +2020,13 @@ static inline bool skb_defer_rx_timestamp(struct sk_buff *skb)
 /**
  * skb_complete_tx_timestamp() - deliver cloned skb with tx timestamps
  *
+ * PHY drivers may accept clones of transmitted packets for
+ * timestamping via their phy_driver.txtstamp method. These drivers
+ * must call this function to return the skb back to the stack, with
+ * or without a timestamp.
+ *
  * @skb: clone of the the original outgoing packet
- * @hwtstamps: hardware time stamps
+ * @hwtstamps: hardware time stamps, may be NULL if not available
  *
  */
 void skb_complete_tx_timestamp(struct sk_buff *skb,
diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 98a5264..29e59d6 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -60,6 +60,7 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
 			clone = skb_clone(skb, GFP_ATOMIC);
 			if (!clone)
 				return;
+			sock_hold(sk);
 			clone->sk = sk;
 			phydev->drv->txtstamp(phydev, clone, type);
 		}
@@ -77,8 +78,11 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
 	struct sock_exterr_skb *serr;
 	int err;
 
-	if (!hwtstamps)
+	if (!hwtstamps) {
+		sock_put(sk);
+		kfree_skb(skb);
 		return;
+	}
 
 	*skb_hwtstamps(skb) = *hwtstamps;
 	serr = SKB_EXT_ERR(skb);
@@ -87,6 +91,7 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
 	serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
 	skb->sk = NULL;
 	err = sock_queue_err_skb(sk, skb);
+	sock_put(sk);
 	if (err)
 		kfree_skb(skb);
 }
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-12 19:25   ` Eric Dumazet
  2011-10-12 19:27   ` Johannes Berg
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-12 19:25 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, David Miller, Johannes Berg, stable

Le mercredi 12 octobre 2011 à 20:36 +0200, Richard Cochran a écrit :
> The pair of functions,
> 
>  * skb_clone_tx_timestamp()
>  * skb_complete_tx_timestamp()
> 
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
> 
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.
> 
> These functions first appeared in v2.6.36.
> 
> Reported-by: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@kernel.org>
> ---

Seems fine to me, thanks !

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
  2011-10-12 19:25   ` Eric Dumazet
@ 2011-10-12 19:27   ` Johannes Berg
  2011-10-12 19:52     ` Eric Dumazet
  2011-10-13  4:51     ` Richard Cochran
  2011-10-13  9:46   ` [PATCH 0/3] net: time stamping fixes Richard Cochran
                     ` (7 subsequent siblings)
  9 siblings, 2 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-12 19:27 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, David Miller, Eric Dumazet

On Wed, 2011-10-12 at 20:36 +0200, Richard Cochran wrote:
> The pair of functions,
> 
>  * skb_clone_tx_timestamp()
>  * skb_complete_tx_timestamp()
> 
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
> 
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.

This still needs a fix to the PHY driver right? It has a case that can
kfree_skb() the skb instead of passing it back to
complete_tx_timestamp().

> -	if (!hwtstamps)
> +	if (!hwtstamps) {
> +		sock_put(sk);
> +		kfree_skb(skb);
>  		return;
> +	}

Is that right w/o skb->sk = NULL?


The other thing I was wondering -- what if we just set truesize to 1 (we
don't have any truesize checks) and account the skb to the socket
normally. Not really a good way either though.

FWIW I just decided to do it the other way around in mac80211 -- keep
the original SKB that's charged to the socket for the error queue, and
use a clone to actually do the TX.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
  2011-10-12 19:27   ` Johannes Berg
@ 2011-10-12 19:52     ` Eric Dumazet
  2011-10-13  8:54       ` Johannes Berg
  2011-10-13  4:51     ` Richard Cochran
  1 sibling, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-12 19:52 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Richard Cochran, netdev, David Miller

Le mercredi 12 octobre 2011 à 21:27 +0200, Johannes Berg a écrit :
> On Wed, 2011-10-12 at 20:36 +0200, Richard Cochran wrote:
> > The pair of functions,
> > 
> >  * skb_clone_tx_timestamp()
> >  * skb_complete_tx_timestamp()
> > 
> > were designed to allow timestamping in PHY devices. The first
> > function, called during the MAC driver's hard_xmit method, identifies
> > PTP protocol packets, clones them, and gives them to the PHY device
> > driver. The PHY driver may hold onto the packet and deliver it at a
> > later time using the second function, which adds the packet to the
> > socket's error queue.
> > 
> > As pointed out by Johannes, nothing prevents the socket from
> > disappearing while the cloned packet is sitting in the PHY driver
> > awaiting a timestamp. This patch fixes the issue by taking a reference
> > on the socket for each such packet. In addition, the comments
> > regarding the usage of these function are expanded to highlight the
> > rule that PHY drivers must use skb_complete_tx_timestamp() to release
> > the packet, in order to release the socket reference, too.
> 
> This still needs a fix to the PHY driver right? It has a case that can
> kfree_skb() the skb instead of passing it back to
> complete_tx_timestamp().
> 
> > -	if (!hwtstamps)
> > +	if (!hwtstamps) {
> > +		sock_put(sk);
> > +		kfree_skb(skb);
> >  		return;
> > +	}
> 
> Is that right w/o skb->sk = NULL?
> 
> 
> The other thing I was wondering -- what if we just set truesize to 1 (we
> don't have any truesize checks) and account the skb to the socket
> normally. Not really a good way either though.
> 

Changing truesize is not allowed here see below.

> FWIW I just decided to do it the other way around in mac80211 -- keep
> the original SKB that's charged to the socket for the error queue, and
> use a clone to actually do the TX.

Hmm, please take a look at IFF_SKB_TX_SHARING stuff, added in commit
d88733150 (net: add IFF_SKB_TX_SHARED flag to priv_flags)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
  2011-10-12 19:27   ` Johannes Berg
  2011-10-12 19:52     ` Eric Dumazet
@ 2011-10-13  4:51     ` Richard Cochran
  1 sibling, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-13  4:51 UTC (permalink / raw)
  To: Johannes Berg; +Cc: netdev, David Miller, Eric Dumazet

On Wed, Oct 12, 2011 at 09:27:53PM +0200, Johannes Berg wrote:
> On Wed, 2011-10-12 at 20:36 +0200, Richard Cochran wrote:
> > The pair of functions,
> > 
> >  * skb_clone_tx_timestamp()
> >  * skb_complete_tx_timestamp()
...
> 
> This still needs a fix to the PHY driver right? It has a case that can
> kfree_skb() the skb instead of passing it back to
> complete_tx_timestamp().

Yes, right, will do.

> > -	if (!hwtstamps)
> > +	if (!hwtstamps) {
> > +		sock_put(sk);
> > +		kfree_skb(skb);
> >  		return;
> > +	}
> 
> Is that right w/o skb->sk = NULL?

I think it is okay. In clone/complete, we use skb->sk in a special
way, just to remember the socket. Within kfree_skb, only the
skb->destructor would possibly use skb->sk, and that function pointer
is NULL (after the clone in the Tx path).

> FWIW I just decided to do it the other way around in mac80211 -- keep
> the original SKB that's charged to the socket for the error queue, and
> use a clone to actually do the TX.

In this case, we must clone, I think. The Ethernet MAC driver will
call the clone_tx_timetstamp hook just before freeing the original
skb.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
  2011-10-12 19:52     ` Eric Dumazet
@ 2011-10-13  8:54       ` Johannes Berg
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-13  8:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Richard Cochran, netdev, David Miller

On Wed, 2011-10-12 at 21:52 +0200, Eric Dumazet wrote:

> > The other thing I was wondering -- what if we just set truesize to 1 (we
> > don't have any truesize checks) and account the skb to the socket
> > normally. Not really a good way either though.
> > 
> 
> Changing truesize is not allowed here see below.

Oh, good point.

> > FWIW I just decided to do it the other way around in mac80211 -- keep
> > the original SKB that's charged to the socket for the error queue, and
> > use a clone to actually do the TX.
> 
> Hmm, please take a look at IFF_SKB_TX_SHARING stuff, added in commit
> d88733150 (net: add IFF_SKB_TX_SHARED flag to priv_flags)

mac80211 got that flag cleared -- and right now I see no way to change
that. 11ac/ad may require work in this area to be able to do this
better, and I have a number of ideas on how to do that, but right now I
don't see how we can do that.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH 0/3] net: time stamping fixes
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
  2011-10-12 19:25   ` Eric Dumazet
  2011-10-12 19:27   ` Johannes Berg
@ 2011-10-13  9:46   ` Richard Cochran
  2011-10-19  4:16     ` David Miller
  2011-10-13  9:46   ` [PATCH 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
                     ` (6 subsequent siblings)
  9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-13  9:46 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg

The first patch fixes a bug in the time stamping code introduced in
v2.6.36 of the kernel.

The other two patches depend on the first patch and fix two bugs in a
PTP Hardware Clock driver. This driver was first introduced in Linux
version 3.0.

Richard Cochran (3):
  net: hold sock reference while processing tx timestamps
  dp83640: use proper function to free transmit time stamping packets
  dp83640: free packet queues on remove

 drivers/net/phy/dp83640.c |   11 +++++++++--
 include/linux/phy.h       |    2 +-
 include/linux/skbuff.h    |    7 ++++++-
 net/core/timestamping.c   |    7 ++++++-
 4 files changed, 22 insertions(+), 5 deletions(-)

-- 
1.7.2.5

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH 1/3] net: hold sock reference while processing tx timestamps
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
                     ` (2 preceding siblings ...)
  2011-10-13  9:46   ` [PATCH 0/3] net: time stamping fixes Richard Cochran
@ 2011-10-13  9:46   ` Richard Cochran
  2011-10-19  4:42     ` Eric Dumazet
  2011-10-13  9:46   ` [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
                     ` (5 subsequent siblings)
  9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-13  9:46 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable

The pair of functions,

 * skb_clone_tx_timestamp()
 * skb_complete_tx_timestamp()

were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.

As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these function are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.

These functions first appeared in v2.6.36.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@kernel.org>
---
 include/linux/phy.h     |    2 +-
 include/linux/skbuff.h  |    7 ++++++-
 net/core/timestamping.c |    7 ++++++-
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 54fc413..79f337c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -420,7 +420,7 @@ struct phy_driver {
 
 	/*
 	 * Requests a Tx timestamp for 'skb'. The phy driver promises
-	 * to deliver it to the socket's error queue as soon as a
+	 * to deliver it using skb_complete_tx_timestamp() as soon as a
 	 * timestamp becomes available. One of the PTP_CLASS_ values
 	 * is passed in 'type'.
 	 */
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8bd383c..0f96646 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2020,8 +2020,13 @@ static inline bool skb_defer_rx_timestamp(struct sk_buff *skb)
 /**
  * skb_complete_tx_timestamp() - deliver cloned skb with tx timestamps
  *
+ * PHY drivers may accept clones of transmitted packets for
+ * timestamping via their phy_driver.txtstamp method. These drivers
+ * must call this function to return the skb back to the stack, with
+ * or without a timestamp.
+ *
  * @skb: clone of the the original outgoing packet
- * @hwtstamps: hardware time stamps
+ * @hwtstamps: hardware time stamps, may be NULL if not available
  *
  */
 void skb_complete_tx_timestamp(struct sk_buff *skb,
diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 98a5264..29e59d6 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -60,6 +60,7 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
 			clone = skb_clone(skb, GFP_ATOMIC);
 			if (!clone)
 				return;
+			sock_hold(sk);
 			clone->sk = sk;
 			phydev->drv->txtstamp(phydev, clone, type);
 		}
@@ -77,8 +78,11 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
 	struct sock_exterr_skb *serr;
 	int err;
 
-	if (!hwtstamps)
+	if (!hwtstamps) {
+		sock_put(sk);
+		kfree_skb(skb);
 		return;
+	}
 
 	*skb_hwtstamps(skb) = *hwtstamps;
 	serr = SKB_EXT_ERR(skb);
@@ -87,6 +91,7 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
 	serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
 	skb->sk = NULL;
 	err = sock_queue_err_skb(sk, skb);
+	sock_put(sk);
 	if (err)
 		kfree_skb(skb);
 }
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
                     ` (3 preceding siblings ...)
  2011-10-13  9:46   ` [PATCH 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-13  9:46   ` Richard Cochran
  2011-10-19  4:47     ` Eric Dumazet
  2011-10-13  9:46   ` [PATCH 3/3] dp83640: free packet queues on remove Richard Cochran
                     ` (4 subsequent siblings)
  9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-13  9:46 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable

The previous commit enforces a new rule for handling the cloned packets
for transmit time stamping. These packets must not be freed using any other
function than skb_complete_tx_timestamp. This commit fixes the one and only
driver using this API.

The driver first appeared in v3.0.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@kernel.org>
---
 drivers/net/phy/dp83640.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index c588a16..13e5713 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1192,7 +1192,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
 
 	case HWTSTAMP_TX_ONESTEP_SYNC:
 		if (is_sync(skb, type)) {
-			kfree_skb(skb);
+			skb_complete_tx_timestamp(skb, NULL);
 			return;
 		}
 		/* fall through */
@@ -1203,7 +1203,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
 
 	case HWTSTAMP_TX_OFF:
 	default:
-		kfree_skb(skb);
+		skb_complete_tx_timestamp(skb, NULL);
 		break;
 	}
 }
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 3/3] dp83640: free packet queues on remove
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
                     ` (4 preceding siblings ...)
  2011-10-13  9:46   ` [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
@ 2011-10-13  9:46   ` Richard Cochran
  2011-10-19  4:48     ` Eric Dumazet
  2011-10-21 10:49   ` [PATCH v2 0/3] net: time stamping fixes Richard Cochran
                     ` (3 subsequent siblings)
  9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-13  9:46 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable

If the PHY should disappear (for example, on an USB Ethernet MAC), then
the driver would leak any undelivered time stamp packets. This commit
fixes the issue by calling the appropriate functions to free any packets
left in the transmit and receive queues.

The driver first appeared in v3.0.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@kernel.org>
---
 drivers/net/phy/dp83640.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index 13e5713..9663e0b 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1007,6 +1007,7 @@ static void dp83640_remove(struct phy_device *phydev)
 	struct dp83640_clock *clock;
 	struct list_head *this, *next;
 	struct dp83640_private *tmp, *dp83640 = phydev->priv;
+	struct sk_buff *skb;
 
 	if (phydev->addr == BROADCAST_ADDR)
 		return;
@@ -1014,6 +1015,12 @@ static void dp83640_remove(struct phy_device *phydev)
 	enable_status_frames(phydev, false);
 	cancel_work_sync(&dp83640->ts_work);
 
+	while ((skb = skb_dequeue(&dp83640->rx_queue)) != NULL)
+		kfree_skb(skb);
+
+	while ((skb = skb_dequeue(&dp83640->tx_queue)) != NULL)
+		skb_complete_tx_timestamp(skb, NULL);
+
 	clock = dp83640_clock_get(dp83640->clock);
 
 	if (dp83640 == clock->chosen) {
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-13  9:46   ` [PATCH 0/3] net: time stamping fixes Richard Cochran
@ 2011-10-19  4:16     ` David Miller
  2011-10-19  5:15       ` Johannes Berg
  0 siblings, 1 reply; 56+ messages in thread
From: David Miller @ 2011-10-19  4:16 UTC (permalink / raw)
  To: richardcochran; +Cc: netdev, eric.dumazet, johannes

From: Richard Cochran <richardcochran@gmail.com>
Date: Thu, 13 Oct 2011 11:46:26 +0200

> The first patch fixes a bug in the time stamping code introduced in
> v2.6.36 of the kernel.
> 
> The other two patches depend on the first patch and fix two bugs in a
> PTP Hardware Clock driver. This driver was first introduced in Linux
> version 3.0.
> 
> Richard Cochran (3):
>   net: hold sock reference while processing tx timestamps
>   dp83640: use proper function to free transmit time stamping packets
>   dp83640: free packet queues on remove

Johannes and Eric, please help review Richard's fixes here.

Thanks!

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 1/3] net: hold sock reference while processing tx timestamps
  2011-10-13  9:46   ` [PATCH 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-19  4:42     ` Eric Dumazet
  0 siblings, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19  4:42 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, David Miller, Johannes Berg, stable

Le jeudi 13 octobre 2011 à 11:46 +0200, Richard Cochran a écrit :
> The pair of functions,
> 
>  * skb_clone_tx_timestamp()
>  * skb_complete_tx_timestamp()
> 
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
> 
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.
> 
> These functions first appeared in v2.6.36.
> 
> Reported-by: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@kernel.org>

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets
  2011-10-13  9:46   ` [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
@ 2011-10-19  4:47     ` Eric Dumazet
  0 siblings, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19  4:47 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, David Miller, Johannes Berg, stable

Le jeudi 13 octobre 2011 à 11:46 +0200, Richard Cochran a écrit :
> The previous commit enforces a new rule for handling the cloned packets
> for transmit time stamping. These packets must not be freed using any other
> function than skb_complete_tx_timestamp. This commit fixes the one and only
> driver using this API.
> 
> The driver first appeared in v3.0.
> 
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@kernel.org>

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 3/3] dp83640: free packet queues on remove
  2011-10-13  9:46   ` [PATCH 3/3] dp83640: free packet queues on remove Richard Cochran
@ 2011-10-19  4:48     ` Eric Dumazet
  0 siblings, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19  4:48 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, David Miller, Johannes Berg, stable

Le jeudi 13 octobre 2011 à 11:46 +0200, Richard Cochran a écrit :
> If the PHY should disappear (for example, on an USB Ethernet MAC), then
> the driver would leak any undelivered time stamp packets. This commit
> fixes the issue by calling the appropriate functions to free any packets
> left in the transmit and receive queues.
> 
> The driver first appeared in v3.0.
> 
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@kernel.org>

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19  4:16     ` David Miller
@ 2011-10-19  5:15       ` Johannes Berg
  2011-10-19 11:50         ` Richard Cochran
  0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19  5:15 UTC (permalink / raw)
  To: David Miller; +Cc: richardcochran, netdev, eric.dumazet

On Wed, 2011-10-19 at 00:16 -0400, David Miller wrote:
> From: Richard Cochran <richardcochran@gmail.com>
> Date: Thu, 13 Oct 2011 11:46:26 +0200
> 
> > The first patch fixes a bug in the time stamping code introduced in
> > v2.6.36 of the kernel.
> > 
> > The other two patches depend on the first patch and fix two bugs in a
> > PTP Hardware Clock driver. This driver was first introduced in Linux
> > version 3.0.
> > 
> > Richard Cochran (3):
> >   net: hold sock reference while processing tx timestamps
> >   dp83640: use proper function to free transmit time stamping packets
> >   dp83640: free packet queues on remove
> 
> Johannes and Eric, please help review Richard's fixes here.

The only thing I'm not completely sure about is whether or not it is
permissible to sock_hold() at that point. I'm probably just missing
something, but: if sk_free() was called before hard_start_xmit() which
will call skb_clone_tx_timestamp(), can we really call sock_hold()?

The reason I ask is that sock_wfree() doesn't check sk_refcnt, so if it
is possible for sk_free() to have been called before hard_start_xmit(),
maybe because the packet was stuck on the qdisc for a while, the socket
won't be released (sk_free checks sk_wmem_alloc) but the sk_wfree() when
the original skb is freed will actually free the socket, invalidating
the clone's sk pointer *even though* we called sock_hold() right after
making the clone.

So what guarantees that sk_refcnt is still non-zero when we make the
clone? Alternatively, should sock_wfree() check sk_refcnt?

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19  5:15       ` Johannes Berg
@ 2011-10-19 11:50         ` Richard Cochran
  2011-10-19 12:33           ` Eric Dumazet
  2011-10-19 12:38           ` Eric Dumazet
  0 siblings, 2 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-19 11:50 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David Miller, netdev, eric.dumazet

On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> The only thing I'm not completely sure about is whether or not it is
> permissible to sock_hold() at that point. I'm probably just missing
> something, but: if sk_free() was called before hard_start_xmit() which
> will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> 
> The reason I ask is that sock_wfree() doesn't check sk_refcnt, so if it
> is possible for sk_free() to have been called before hard_start_xmit(),
> maybe because the packet was stuck on the qdisc for a while, the socket
> won't be released (sk_free checks sk_wmem_alloc) but the sk_wfree() when
> the original skb is freed will actually free the socket, invalidating
> the clone's sk pointer *even though* we called sock_hold() right after
> making the clone.
> 
> So what guarantees that sk_refcnt is still non-zero when we make the
> clone?

In the non-qdisc path, the kernel is in a send() call, so the initial
reference taken in socket() is held.

I really don't know the qdisc code, whether it is somehow holding the
skb->sk indirectly or not.

Eric? David?

> Alternatively, should sock_wfree() check sk_refcnt?

That would presumably spoil the performance enhancement gained in
commit 2b85a34e.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 11:50         ` Richard Cochran
@ 2011-10-19 12:33           ` Eric Dumazet
  2011-10-19 12:38           ` Eric Dumazet
  1 sibling, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 12:33 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Johannes Berg, David Miller, netdev

Le mercredi 19 octobre 2011 à 13:50 +0200, Richard Cochran a écrit :
> On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > The only thing I'm not completely sure about is whether or not it is
> > permissible to sock_hold() at that point. I'm probably just missing
> > something, but: if sk_free() was called before hard_start_xmit() which
> > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> > 

This is not possible, or something is really broken.


> > The reason I ask is that sock_wfree() doesn't check sk_refcnt, so if it
> > is possible for sk_free() to have been called before hard_start_xmit(),
> > maybe because the packet was stuck on the qdisc for a while, the socket
> > won't be released (sk_free checks sk_wmem_alloc) but the sk_wfree() when
> > the original skb is freed will actually free the socket, invalidating
> > the clone's sk pointer *even though* we called sock_hold() right after
> > making the clone.
> > 
> > So what guarantees that sk_refcnt is still non-zero when we make the
> > clone?
> 
> In the non-qdisc path, the kernel is in a send() call, so the initial
> reference taken in socket() is held.
> 
> I really don't know the qdisc code, whether it is somehow holding the
> skb->sk indirectly or not.
> 
> Eric? David?

I dont really understand what's the problem, since sk_free() doesnt care
at all about sk_refcnt, but sk_wmem_alloc.

If one skb is in flight, and still linked to a socket, then this socket
cannot disappear, because this skb->truesize was accounted into
sk->sk_wmem_alloc

Of course, this point is valid as long as skb had not been orphaned.

And this is true 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 11:50         ` Richard Cochran
  2011-10-19 12:33           ` Eric Dumazet
@ 2011-10-19 12:38           ` Eric Dumazet
  2011-10-19 12:58             ` Johannes Berg
  1 sibling, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 12:38 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Johannes Berg, David Miller, netdev

Le mercredi 19 octobre 2011 à 13:50 +0200, Richard Cochran a écrit :
> On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > The only thing I'm not completely sure about is whether or not it is
> > permissible to sock_hold() at that point. I'm probably just missing
> > something, but: if sk_free() was called before hard_start_xmit() which
> > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> > 

This is not possible, or something is really broken. We specifically
dont skb_orphan(skb) if we know tx timestamping is enabled for this skb.

/*
 * Try to orphan skb early, right before transmission by the device.
 * We cannot orphan skb if tx timestamp is requested or the sk-reference
 * is needed on driver level for other reasons, e.g. see net/can/raw.c
 */
static inline void skb_orphan_try(struct sk_buff *skb)
{
        struct sock *sk = skb->sk;

        if (sk && !skb_shinfo(skb)->tx_flags) {
                /* skb_tx_hash() wont be able to get sk.
                 * We copy sk_hash into skb->rxhash
                 */
                if (!skb->rxhash)
                        skb->rxhash = sk->sk_hash;
                skb_orphan(skb);
        }
}


> > The reason I ask is that sock_wfree() doesn't check sk_refcnt, so if it
> > is possible for sk_free() to have been called before hard_start_xmit(),
> > maybe because the packet was stuck on the qdisc for a while, the socket
> > won't be released (sk_free checks sk_wmem_alloc) but the sk_wfree() when
> > the original skb is freed will actually free the socket, invalidating
> > the clone's sk pointer *even though* we called sock_hold() right after
> > making the clone.
> > 
> > So what guarantees that sk_refcnt is still non-zero when we make the
> > clone?
> 
> In the non-qdisc path, the kernel is in a send() call, so the initial
> reference taken in socket() is held.
> 
> I really don't know the qdisc code, whether it is somehow holding the
> skb->sk indirectly or not.
> 
> Eric? David?

I dont really understand what's the concern, since sk_free() doesnt care
at all about sk_refcnt, but sk_wmem_alloc.

void sk_free(struct sock *sk)
{
        /*
         * We subtract one from sk_wmem_alloc and can know if
         * some packets are still in some tx queue.
         * If not null, sock_wfree() will call __sk_free(sk) later
         */
        if (atomic_dec_and_test(&sk->sk_wmem_alloc))
                __sk_free(sk);
}
EXPORT_SYMBOL(sk_free);


If one skb is in flight, and still linked to a socket, then this socket
cannot disappear, because this skb->truesize was accounted into
sk->sk_wmem_alloc

Of course, this point is valid as long as skb had not been orphaned.

sk_refcnt can be 0, if user closed the socket, but socket wont disappear
as long as sk_wmem_alloc is not 0.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 12:38           ` Eric Dumazet
@ 2011-10-19 12:58             ` Johannes Berg
  2011-10-19 13:09               ` Johannes Berg
  2011-10-19 13:21               ` Eric Dumazet
  0 siblings, 2 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 12:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev

On Wed, 2011-10-19 at 14:38 +0200, Eric Dumazet wrote:
> Le mercredi 19 octobre 2011 à 13:50 +0200, Richard Cochran a écrit :
> > On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > > The only thing I'm not completely sure about is whether or not it is
> > > permissible to sock_hold() at that point. I'm probably just missing
> > > something, but: if sk_free() was called before hard_start_xmit() which
> > > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> > > 
> 
> This is not possible, or something is really broken. We specifically
> dont skb_orphan(skb) if we know tx timestamping is enabled for this skb.

Why can't sk_free() have been called? I'm not thinking of sock_wfree()
which can't have been called -- so the socket surely still exists
because skb->truesize is still accounted to it -- but what says
sk_refcnt hasn't reached 0 yet?

> /*
>  * Try to orphan skb early, right before transmission by the device.
>  * We cannot orphan skb if tx timestamp is requested or the sk-reference
>  * is needed on driver level for other reasons, e.g. see net/can/raw.c
>  */
> static inline void skb_orphan_try(struct sk_buff *skb)
> {
>         struct sock *sk = skb->sk;
> 
>         if (sk && !skb_shinfo(skb)->tx_flags) {
>                 /* skb_tx_hash() wont be able to get sk.
>                  * We copy sk_hash into skb->rxhash
>                  */
>                 if (!skb->rxhash)
>                         skb->rxhash = sk->sk_hash;
>                 skb_orphan(skb);
>         }
> }

Right.

> I dont really understand what's the concern, since sk_free() doesnt care
> at all about sk_refcnt, but sk_wmem_alloc.

Right.

> void sk_free(struct sock *sk)
[snip]

> If one skb is in flight, and still linked to a socket, then this socket
> cannot disappear, because this skb->truesize was accounted into
> sk->sk_wmem_alloc

This is undoubtedly true, I'm not disputing this.

> Of course, this point is valid as long as skb had not been orphaned.
> 
> sk_refcnt can be 0, if user closed the socket, but socket wont disappear
> as long as sk_wmem_alloc is not 0.

Not disputing this either. But you said sk_refcnt can be 0, so why can't
the following happen:

/* skb; skb->sk = sk; skb->destructor = sock_wfree; */

/* skb is on qdisc, some time passes */

sk_free(sk); /* user closed socket,
                sk->sk_refcnt reaches 0,
		sk->sk_wmem_alloc == skb->truesize,
		__sk_free not called, socket still lives,
		but no more +1 in sk_wmem_alloc */

/* some more time passes */

/* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
skb2 = skb_clone(skb);
skb2->sk = skb->sk;
sock_hold(skb->sk);

/* ethernet TX completion calls skb_free(skb) */
skb_free(skb):
  sock_wfree(skb); /* sk_wmem_alloc reaches 0,
                      __sk_free called DESPITE sk_refcnt > 0 */

/* later, in skb_complete_tx_timestamp() */
sock_put(sk);	/* KABOOM */


I just want to understand why this can't happen :-)

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 12:58             ` Johannes Berg
@ 2011-10-19 13:09               ` Johannes Berg
  2011-10-19 13:25                 ` Eric Dumazet
  2011-10-19 13:21               ` Eric Dumazet
  1 sibling, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev

On Wed, 2011-10-19 at 14:58 +0200, Johannes Berg wrote:

> Not disputing this either. But you said sk_refcnt can be 0, so why can't
> the following happen:
> 
> /* skb; skb->sk = sk; skb->destructor = sock_wfree; */
> 
> /* skb is on qdisc, some time passes */
> 
> sk_free(sk); /* user closed socket,
>                 sk->sk_refcnt reaches 0,
> 		sk->sk_wmem_alloc == skb->truesize,
> 		__sk_free not called, socket still lives,
> 		but no more +1 in sk_wmem_alloc */
> 
> /* some more time passes */
> 
> /* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
> skb2 = skb_clone(skb);
> skb2->sk = skb->sk;
> sock_hold(skb->sk);
> 
> /* ethernet TX completion calls skb_free(skb) */
> skb_free(skb):
>   sock_wfree(skb); /* sk_wmem_alloc reaches 0,
>                       __sk_free called DESPITE sk_refcnt > 0 */
> 
> /* later, in skb_complete_tx_timestamp() */
> sock_put(sk);	/* KABOOM */


Given the complexity of all this, I'm not sure we shouldn't do something
like this, but I have no idea what the cost would be:

--- wireless-testing.orig/include/net/sock.h	2011-10-18 22:28:41.000000000 +0200
+++ wireless-testing/include/net/sock.h	2011-10-19 15:08:45.000000000 +0200
@@ -434,7 +434,10 @@ static __inline__ int __sk_del_node_init
 
 static inline void sock_hold(struct sock *sk)
 {
-	atomic_inc(&sk->sk_refcnt);
+	if (atomic_inc_return(&sk->sk_refcnt) == 1) {
+		/* was zero -- we must've gotten an sk_wmem_alloc reference */
+		atomic_inc(&sk->sk_wmem_alloc);
+	}
 }
 
 /* Ungrab socket in the context, which assumes that socket refcnt


johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 12:58             ` Johannes Berg
  2011-10-19 13:09               ` Johannes Berg
@ 2011-10-19 13:21               ` Eric Dumazet
  2011-10-19 13:25                 ` Johannes Berg
  1 sibling, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 13:21 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev

Le mercredi 19 octobre 2011 à 14:58 +0200, Johannes Berg a écrit :
> On Wed, 2011-10-19 at 14:38 +0200, Eric Dumazet wrote:
> > Le mercredi 19 octobre 2011 à 13:50 +0200, Richard Cochran a écrit :
> > > On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > > > The only thing I'm not completely sure about is whether or not it is
> > > > permissible to sock_hold() at that point. I'm probably just missing
> > > > something, but: if sk_free() was called before hard_start_xmit() which
> > > > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> > > > 
> > 
> > This is not possible, or something is really broken. We specifically
> > dont skb_orphan(skb) if we know tx timestamping is enabled for this skb.
> 
> Why can't sk_free() have been called? I'm not thinking of sock_wfree()
> which can't have been called -- so the socket surely still exists
> because skb->truesize is still accounted to it -- but what says
> sk_refcnt hasn't reached 0 yet?
> 
> > /*
> >  * Try to orphan skb early, right before transmission by the device.
> >  * We cannot orphan skb if tx timestamp is requested or the sk-reference
> >  * is needed on driver level for other reasons, e.g. see net/can/raw.c
> >  */
> > static inline void skb_orphan_try(struct sk_buff *skb)
> > {
> >         struct sock *sk = skb->sk;
> > 
> >         if (sk && !skb_shinfo(skb)->tx_flags) {
> >                 /* skb_tx_hash() wont be able to get sk.
> >                  * We copy sk_hash into skb->rxhash
> >                  */
> >                 if (!skb->rxhash)
> >                         skb->rxhash = sk->sk_hash;
> >                 skb_orphan(skb);
> >         }
> > }
> 
> Right.
> 
> > I dont really understand what's the concern, since sk_free() doesnt care
> > at all about sk_refcnt, but sk_wmem_alloc.
> 
> Right.
> 
> > void sk_free(struct sock *sk)
> [snip]
> 
> > If one skb is in flight, and still linked to a socket, then this socket
> > cannot disappear, because this skb->truesize was accounted into
> > sk->sk_wmem_alloc
> 
> This is undoubtedly true, I'm not disputing this.
> 
> > Of course, this point is valid as long as skb had not been orphaned.
> > 
> > sk_refcnt can be 0, if user closed the socket, but socket wont disappear
> > as long as sk_wmem_alloc is not 0.
> 
> Not disputing this either. But you said sk_refcnt can be 0, so why can't
> the following happen:
> 
> /* skb; skb->sk = sk; skb->destructor = sock_wfree; */
> 
> /* skb is on qdisc, some time passes */
> 
> sk_free(sk); /* user closed socket,
>                 sk->sk_refcnt reaches 0,
> 		sk->sk_wmem_alloc == skb->truesize,
> 		__sk_free not called, socket still lives,
> 		but no more +1 in sk_wmem_alloc */
> 
> /* some more time passes */
> 
> /* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
> skb2 = skb_clone(skb);
> skb2->sk = skb->sk;
> sock_hold(skb->sk);
> 
> /* ethernet TX completion calls skb_free(skb) */
> skb_free(skb):
>   sock_wfree(skb); /* sk_wmem_alloc reaches 0,
>                       __sk_free called DESPITE sk_refcnt > 0 */
> 
> /* later, in skb_complete_tx_timestamp() */
> sock_put(sk);	/* KABOOM */
> 
> 
> I just want to understand why this can't happen :-)

Since you answer your own question :)

Hmm, oh well, sk_refcnt is/should not be changed if a xmit packet is
duped, but sk_wmem_alloc should be, exactly paired with skb->truesize

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 13:21               ` Eric Dumazet
@ 2011-10-19 13:25                 ` Johannes Berg
  2011-10-19 13:27                   ` Eric Dumazet
  0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev

On Wed, 2011-10-19 at 15:21 +0200, Eric Dumazet wrote:

> > Not disputing this either. But you said sk_refcnt can be 0, so why can't
> > the following happen:
> > 
> > /* skb; skb->sk = sk; skb->destructor = sock_wfree; */
> > 
> > /* skb is on qdisc, some time passes */
> > 
> > sk_free(sk); /* user closed socket,
> >                 sk->sk_refcnt reaches 0,
> > 		sk->sk_wmem_alloc == skb->truesize,
> > 		__sk_free not called, socket still lives,
> > 		but no more +1 in sk_wmem_alloc */
> > 
> > /* some more time passes */
> > 
> > /* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
> > skb2 = skb_clone(skb);
> > skb2->sk = skb->sk;
> > sock_hold(skb->sk);
> > 
> > /* ethernet TX completion calls skb_free(skb) */
> > skb_free(skb):
> >   sock_wfree(skb); /* sk_wmem_alloc reaches 0,
> >                       __sk_free called DESPITE sk_refcnt > 0 */
> > 
> > /* later, in skb_complete_tx_timestamp() */
> > sock_put(sk);	/* KABOOM */
> > 
> > 
> > I just want to understand why this can't happen :-)
> 
> Since you answer your own question :)

Did I?

> Hmm, oh well, sk_refcnt is/should not be changed if a xmit packet is
> duped, but sk_wmem_alloc should be, exactly paired with skb->truesize

Well, yeah, ok, but Richard's patches do modify sk_refcnt, and can't
modify sk_wmem_alloc as discussed upthread.

I'll admit that I don't really trust the whole thing anyway -- it seems
rather error prone to forbid you from doing sock_hold() on a socket you
obtained from a TX packet.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 13:09               ` Johannes Berg
@ 2011-10-19 13:25                 ` Eric Dumazet
  2011-10-19 13:35                   ` Johannes Berg
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 13:25 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev

Le mercredi 19 octobre 2011 à 15:09 +0200, Johannes Berg a écrit :
> On Wed, 2011-10-19 at 14:58 +0200, Johannes Berg wrote:
> 
> > Not disputing this either. But you said sk_refcnt can be 0, so why can't
> > the following happen:
> > 
> > /* skb; skb->sk = sk; skb->destructor = sock_wfree; */
> > 
> > /* skb is on qdisc, some time passes */
> > 
> > sk_free(sk); /* user closed socket,
> >                 sk->sk_refcnt reaches 0,
> > 		sk->sk_wmem_alloc == skb->truesize,
> > 		__sk_free not called, socket still lives,
> > 		but no more +1 in sk_wmem_alloc */
> > 
> > /* some more time passes */
> > 
> > /* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
> > skb2 = skb_clone(skb);
> > skb2->sk = skb->sk;
> > sock_hold(skb->sk);
> > 
> > /* ethernet TX completion calls skb_free(skb) */
> > skb_free(skb):
> >   sock_wfree(skb); /* sk_wmem_alloc reaches 0,
> >                       __sk_free called DESPITE sk_refcnt > 0 */
> > 
> > /* later, in skb_complete_tx_timestamp() */
> > sock_put(sk);	/* KABOOM */
> 
> 
> Given the complexity of all this, I'm not sure we shouldn't do something
> like this, but I have no idea what the cost would be:
> 
> --- wireless-testing.orig/include/net/sock.h	2011-10-18 22:28:41.000000000 +0200
> +++ wireless-testing/include/net/sock.h	2011-10-19 15:08:45.000000000 +0200
> @@ -434,7 +434,10 @@ static __inline__ int __sk_del_node_init
>  
>  static inline void sock_hold(struct sock *sk)
>  {
> -	atomic_inc(&sk->sk_refcnt);
> +	if (atomic_inc_return(&sk->sk_refcnt) == 1) {
> +		/* was zero -- we must've gotten an sk_wmem_alloc reference */
> +		atomic_inc(&sk->sk_wmem_alloc);
> +	}
>  }
>  

Hmm, it will be difficult to handle two atomics without adding races,
and add quite expensive atomic_inc_return() on some arches.

I would just change the skb tx cloning to take a normal reference on
sk_wmem_alloc

	atomic_add(skb->truesize, &sk->sk_wmem_alloc);
instead of
	sock_hold(sk);

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 13:25                 ` Johannes Berg
@ 2011-10-19 13:27                   ` Eric Dumazet
  2011-10-19 13:32                     ` Johannes Berg
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 13:27 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev

Le mercredi 19 octobre 2011 à 15:25 +0200, Johannes Berg a écrit :

> Well, yeah, ok, but Richard's patches do modify sk_refcnt, and can't
> modify sk_wmem_alloc as discussed upthread.
> 
> I'll admit that I don't really trust the whole thing anyway -- it seems
> rather error prone to forbid you from doing sock_hold() on a socket you
> obtained from a TX packet.
> 

I probably missed why sk_wmem_alloc can not be modified ?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 13:27                   ` Eric Dumazet
@ 2011-10-19 13:32                     ` Johannes Berg
  2011-10-19 14:25                       ` Richard Cochran
  0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev

On Wed, 2011-10-19 at 15:27 +0200, Eric Dumazet wrote:
> Le mercredi 19 octobre 2011 à 15:25 +0200, Johannes Berg a écrit :
> 
> > Well, yeah, ok, but Richard's patches do modify sk_refcnt, and can't
> > modify sk_wmem_alloc as discussed upthread.
> > 
> > I'll admit that I don't really trust the whole thing anyway -- it seems
> > rather error prone to forbid you from doing sock_hold() on a socket you
> > obtained from a TX packet.
> > 
> 
> I probably missed why sk_wmem_alloc can not be modified ?

I think I misremember -- I was suggesting to set skb2->truesize to 1 and
then charge it so it's not double-charged.

OTOH, could skb_clone_tx_timestamp() orphan the original skb after
adding the skb2?

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 13:25                 ` Eric Dumazet
@ 2011-10-19 13:35                   ` Johannes Berg
  2011-10-19 13:44                     ` Eric Dumazet
  0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev

On Wed, 2011-10-19 at 15:25 +0200, Eric Dumazet wrote:

> > Given the complexity of all this, I'm not sure we shouldn't do something
> > like this, but I have no idea what the cost would be:
> > 
> > --- wireless-testing.orig/include/net/sock.h	2011-10-18 22:28:41.000000000 +0200
> > +++ wireless-testing/include/net/sock.h	2011-10-19 15:08:45.000000000 +0200
> > @@ -434,7 +434,10 @@ static __inline__ int __sk_del_node_init
> >  
> >  static inline void sock_hold(struct sock *sk)
> >  {
> > -	atomic_inc(&sk->sk_refcnt);
> > +	if (atomic_inc_return(&sk->sk_refcnt) == 1) {
> > +		/* was zero -- we must've gotten an sk_wmem_alloc reference */
> > +		atomic_inc(&sk->sk_wmem_alloc);
> > +	}
> >  }
> >  
> 
> Hmm, it will be difficult to handle two atomics without adding races,

Where do you see a race? If you do sock_hold() while you have a 'proper
sk_refcnt reference', this does nothing but increase sk_refcnt. If you
do sock_hold() while you have an 'sk_wmem_alloc reference', this will
actually increase sk_wmem_alloc by 1; then when the original
'sk_wmem_alloc reference' you had when calling sock_hold() is released,
sk_wmem_alloc will still have 1 matching a non-zero sk_refcnt.


> and add quite expensive atomic_inc_return() on some arches.

Yes I was afraid of that.

> I would just change the skb tx cloning to take a normal reference on
> sk_wmem_alloc
> 
> 	atomic_add(skb->truesize, &sk->sk_wmem_alloc);
> instead of
> 	sock_hold(sk);

Even with that fixed I'm not really convinced of it all -- need to
really really really make sure that no skb->sk that was owned by a TX
skb is ever passed to sock_hold(). Can we really guarantee that?

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 13:35                   ` Johannes Berg
@ 2011-10-19 13:44                     ` Eric Dumazet
  2011-10-19 13:57                       ` Johannes Berg
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 13:44 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev

Le mercredi 19 octobre 2011 à 15:35 +0200, Johannes Berg a écrit :

> Even with that fixed I'm not really convinced of it all -- need to
> really really really make sure that no skb->sk that was owned by a TX
> skb is ever passed to sock_hold(). Can we really guarantee that?

Either we can guarantee that, or kernel is a piece of crap, all bets are
off.

Yes, we need to make an audit, since we assumed sock_hold() was the
right thing to do in all contexts.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 13:44                     ` Eric Dumazet
@ 2011-10-19 13:57                       ` Johannes Berg
  2011-10-19 14:08                         ` Eric Dumazet
  0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:57 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev

On Wed, 2011-10-19 at 15:44 +0200, Eric Dumazet wrote:
> Le mercredi 19 octobre 2011 à 15:35 +0200, Johannes Berg a écrit :
> 
> > Even with that fixed I'm not really convinced of it all -- need to
> > really really really make sure that no skb->sk that was owned by a TX
> > skb is ever passed to sock_hold(). Can we really guarantee that?
> 
> Either we can guarantee that, or kernel is a piece of crap, all bets are
> off.

Fair enough :-)

> Yes, we need to make an audit, since we assumed sock_hold() was the
> right thing to do in all contexts.

Ok.

Anyway, I guess you agree that the patches as-is aren't actually the
right solution since we can't sock_hold() a TX skb socket reference?

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 13:57                       ` Johannes Berg
@ 2011-10-19 14:08                         ` Eric Dumazet
  2011-10-19 14:24                           ` Johannes Berg
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 14:08 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev

Le mercredi 19 octobre 2011 à 15:57 +0200, Johannes Berg a écrit :
> On Wed, 2011-10-19 at 15:44 +0200, Eric Dumazet wrote:
> > Le mercredi 19 octobre 2011 à 15:35 +0200, Johannes Berg a écrit :
> > 
> > > Even with that fixed I'm not really convinced of it all -- need to
> > > really really really make sure that no skb->sk that was owned by a TX
> > > skb is ever passed to sock_hold(). Can we really guarantee that?
> > 
> > Either we can guarantee that, or kernel is a piece of crap, all bets are
> > off.
> 
> Fair enough :-)
> 
> > Yes, we need to make an audit, since we assumed sock_hold() was the
> > right thing to do in all contexts.
> 
> Ok.
> 
> Anyway, I guess you agree that the patches as-is aren't actually the
> right solution since we can't sock_hold() a TX skb socket reference?

Yes, the sock_hold() could be changed by an atomic_inc_not_zero()

What about doing this ?

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 54fc413..79f337c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -420,7 +420,7 @@ struct phy_driver {
 
 	/*
 	 * Requests a Tx timestamp for 'skb'. The phy driver promises
-	 * to deliver it to the socket's error queue as soon as a
+	 * to deliver it using skb_complete_tx_timestamp() as soon as a
 	 * timestamp becomes available. One of the PTP_CLASS_ values
 	 * is passed in 'type'.
 	 */
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8bd383c..0f96646 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2020,8 +2020,13 @@ static inline bool skb_defer_rx_timestamp(struct sk_buff *skb)
 /**
  * skb_complete_tx_timestamp() - deliver cloned skb with tx timestamps
  *
+ * PHY drivers may accept clones of transmitted packets for
+ * timestamping via their phy_driver.txtstamp method. These drivers
+ * must call this function to return the skb back to the stack, with
+ * or without a timestamp.
+ *
  * @skb: clone of the the original outgoing packet
- * @hwtstamps: hardware time stamps
+ * @hwtstamps: hardware time stamps, may be NULL if not available
  *
  */
 void skb_complete_tx_timestamp(struct sk_buff *skb,
diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 98a5264..82fb288 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -57,9 +57,13 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
 	case PTP_CLASS_V2_VLAN:
 		phydev = skb->dev->phydev;
 		if (likely(phydev->drv->txtstamp)) {
+			if (!atomic_inc_not_zero(&sk->sk_refcnt))
+				return;
 			clone = skb_clone(skb, GFP_ATOMIC);
-			if (!clone)
+			if (!clone) {
+				sock_put(sk);
 				return;
+			}
 			clone->sk = sk;
 			phydev->drv->txtstamp(phydev, clone, type);
 		}
@@ -77,8 +81,11 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
 	struct sock_exterr_skb *serr;
 	int err;
 
-	if (!hwtstamps)
+	if (!hwtstamps) {
+		sock_put(sk);
+		kfree_skb(skb);
 		return;
+	}
 
 	*skb_hwtstamps(skb) = *hwtstamps;
 	serr = SKB_EXT_ERR(skb);
@@ -87,6 +94,7 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
 	serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
 	skb->sk = NULL;
 	err = sock_queue_err_skb(sk, skb);
+	sock_put(sk);
 	if (err)
 		kfree_skb(skb);
 }

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 14:08                         ` Eric Dumazet
@ 2011-10-19 14:24                           ` Johannes Berg
  2011-10-19 14:27                             ` Richard Cochran
  0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 14:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev

On Wed, 2011-10-19 at 16:08 +0200, Eric Dumazet wrote:

> > Anyway, I guess you agree that the patches as-is aren't actually the
> > right solution since we can't sock_hold() a TX skb socket reference?
> 
> Yes, the sock_hold() could be changed by an atomic_inc_not_zero()
> 
> What about doing this ?

>  		if (likely(phydev->drv->txtstamp)) {
> +			if (!atomic_inc_not_zero(&sk->sk_refcnt))
> +				return;

Yeah that seems like it works and just drops the timestamp in case we
don't still have a live socket, which is perfectly fine.

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 13:32                     ` Johannes Berg
@ 2011-10-19 14:25                       ` Richard Cochran
  0 siblings, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-19 14:25 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Eric Dumazet, David Miller, netdev

On Wed, Oct 19, 2011 at 03:32:53PM +0200, Johannes Berg wrote:
> 
> OTOH, could skb_clone_tx_timestamp() orphan the original skb after
> adding the skb2?

The Ethernet MAC driver decides what to do with the original skb. In
the MAC driver, it goes like:

1. skb_tx_timestamp(skb) -> skb_clone_tx_timestamp(skb);
2. give skb to hardware;
3. ...
4. kfree_skb, recycle, etc

Richard

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 14:24                           ` Johannes Berg
@ 2011-10-19 14:27                             ` Richard Cochran
  2011-10-19 14:33                               ` Eric Dumazet
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-19 14:27 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Eric Dumazet, David Miller, netdev

On Wed, Oct 19, 2011 at 04:24:08PM +0200, Johannes Berg wrote:
> On Wed, 2011-10-19 at 16:08 +0200, Eric Dumazet wrote:
> 
> > > Anyway, I guess you agree that the patches as-is aren't actually the
> > > right solution since we can't sock_hold() a TX skb socket reference?
> > 
> > Yes, the sock_hold() could be changed by an atomic_inc_not_zero()
> > 
> > What about doing this ?
> 
> >  		if (likely(phydev->drv->txtstamp)) {
> > +			if (!atomic_inc_not_zero(&sk->sk_refcnt))
> > +				return;
> 
> Yeah that seems like it works and just drops the timestamp in case we
> don't still have a live socket, which is perfectly fine.

Yes, I think it resolves any doubt. I will resubmit with this
solution.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 0/3] net: time stamping fixes
  2011-10-19 14:27                             ` Richard Cochran
@ 2011-10-19 14:33                               ` Eric Dumazet
  0 siblings, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 14:33 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Johannes Berg, David Miller, netdev

Le mercredi 19 octobre 2011 à 16:27 +0200, Richard Cochran a écrit :
> On Wed, Oct 19, 2011 at 04:24:08PM +0200, Johannes Berg wrote:
> > On Wed, 2011-10-19 at 16:08 +0200, Eric Dumazet wrote:
> > 
> > > > Anyway, I guess you agree that the patches as-is aren't actually the
> > > > right solution since we can't sock_hold() a TX skb socket reference?
> > > 
> > > Yes, the sock_hold() could be changed by an atomic_inc_not_zero()
> > > 
> > > What about doing this ?
> > 
> > >  		if (likely(phydev->drv->txtstamp)) {
> > > +			if (!atomic_inc_not_zero(&sk->sk_refcnt))
> > > +				return;
> > 
> > Yeah that seems like it works and just drops the timestamp in case we
> > don't still have a live socket, which is perfectly fine.
> 
> Yes, I think it resolves any doubt. I will resubmit with this
> solution.
> 

Thanks guys !

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v2 0/3]  net: time stamping fixes
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
                     ` (5 preceding siblings ...)
  2011-10-13  9:46   ` [PATCH 3/3] dp83640: free packet queues on remove Richard Cochran
@ 2011-10-21 10:49   ` Richard Cochran
  2011-10-21 10:49   ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-21 10:49 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg

[ Changes in v2: use atomic_inc_not_zero per Eric's suggestion. ]

The first patch fixes a bug in the time stamping code introduced in
v2.6.36 of the kernel.

The other two patches depend on the first patch and fix two bugs in a
PTP Hardware Clock driver. This driver was first introduced in Linux
version 3.0.

Richard Cochran (3):
  net: hold sock reference while processing tx timestamps
  dp83640: use proper function to free transmit time stamping packets
  dp83640: free packet queues on remove

 drivers/net/phy/dp83640.c |   11 +++++++++--
 include/linux/phy.h       |    2 +-
 include/linux/skbuff.h    |    7 ++++++-
 net/core/timestamping.c   |   12 ++++++++++--
 4 files changed, 26 insertions(+), 6 deletions(-)

-- 
1.7.2.5

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v2 1/3] net: hold sock reference while processing tx timestamps
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
                     ` (6 preceding siblings ...)
  2011-10-21 10:49   ` [PATCH v2 0/3] net: time stamping fixes Richard Cochran
@ 2011-10-21 10:49   ` Richard Cochran
  2011-10-21 11:31     ` Eric Dumazet
  2011-10-21 11:44     ` Johannes Berg
  2011-10-21 10:49   ` [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
  2011-10-21 10:49   ` [PATCH v2 3/3] dp83640: free packet queues on remove Richard Cochran
  9 siblings, 2 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-21 10:49 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable

The pair of functions,

 * skb_clone_tx_timestamp()
 * skb_complete_tx_timestamp()

were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.

As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these function are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.

These functions first appeared in v2.6.36.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@vger.kernel.org>
---
 include/linux/phy.h     |    2 +-
 include/linux/skbuff.h  |    7 ++++++-
 net/core/timestamping.c |   12 ++++++++++--
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 54fc413..79f337c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -420,7 +420,7 @@ struct phy_driver {
 
 	/*
 	 * Requests a Tx timestamp for 'skb'. The phy driver promises
-	 * to deliver it to the socket's error queue as soon as a
+	 * to deliver it using skb_complete_tx_timestamp() as soon as a
 	 * timestamp becomes available. One of the PTP_CLASS_ values
 	 * is passed in 'type'.
 	 */
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8bd383c..0f96646 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2020,8 +2020,13 @@ static inline bool skb_defer_rx_timestamp(struct sk_buff *skb)
 /**
  * skb_complete_tx_timestamp() - deliver cloned skb with tx timestamps
  *
+ * PHY drivers may accept clones of transmitted packets for
+ * timestamping via their phy_driver.txtstamp method. These drivers
+ * must call this function to return the skb back to the stack, with
+ * or without a timestamp.
+ *
  * @skb: clone of the the original outgoing packet
- * @hwtstamps: hardware time stamps
+ * @hwtstamps: hardware time stamps, may be NULL if not available
  *
  */
 void skb_complete_tx_timestamp(struct sk_buff *skb,
diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 98a5264..82fb288 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -57,9 +57,13 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
 	case PTP_CLASS_V2_VLAN:
 		phydev = skb->dev->phydev;
 		if (likely(phydev->drv->txtstamp)) {
+			if (!atomic_inc_not_zero(&sk->sk_refcnt))
+				return;
 			clone = skb_clone(skb, GFP_ATOMIC);
-			if (!clone)
+			if (!clone) {
+				sock_put(sk);
 				return;
+			}
 			clone->sk = sk;
 			phydev->drv->txtstamp(phydev, clone, type);
 		}
@@ -77,8 +81,11 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
 	struct sock_exterr_skb *serr;
 	int err;
 
-	if (!hwtstamps)
+	if (!hwtstamps) {
+		sock_put(sk);
+		kfree_skb(skb);
 		return;
+	}
 
 	*skb_hwtstamps(skb) = *hwtstamps;
 	serr = SKB_EXT_ERR(skb);
@@ -87,6 +94,7 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
 	serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
 	skb->sk = NULL;
 	err = sock_queue_err_skb(sk, skb);
+	sock_put(sk);
 	if (err)
 		kfree_skb(skb);
 }
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
                     ` (7 preceding siblings ...)
  2011-10-21 10:49   ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-21 10:49   ` Richard Cochran
  2011-10-24  6:55     ` David Miller
  2011-10-21 10:49   ` [PATCH v2 3/3] dp83640: free packet queues on remove Richard Cochran
  9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-21 10:49 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable

The previous commit enforces a new rule for handling the cloned packets
for transmit time stamping. These packets must not be freed using any other
function than skb_complete_tx_timestamp. This commit fixes the one and only
driver using this API.

The driver first appeared in v3.0.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@vger.kernel.org>
---
 drivers/net/phy/dp83640.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index c588a16..13e5713 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1192,7 +1192,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
 
 	case HWTSTAMP_TX_ONESTEP_SYNC:
 		if (is_sync(skb, type)) {
-			kfree_skb(skb);
+			skb_complete_tx_timestamp(skb, NULL);
 			return;
 		}
 		/* fall through */
@@ -1203,7 +1203,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
 
 	case HWTSTAMP_TX_OFF:
 	default:
-		kfree_skb(skb);
+		skb_complete_tx_timestamp(skb, NULL);
 		break;
 	}
 }
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 3/3] dp83640: free packet queues on remove
  2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
                     ` (8 preceding siblings ...)
  2011-10-21 10:49   ` [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
@ 2011-10-21 10:49   ` Richard Cochran
  9 siblings, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-21 10:49 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable

If the PHY should disappear (for example, on an USB Ethernet MAC), then
the driver would leak any undelivered time stamp packets. This commit
fixes the issue by calling the appropriate functions to free any packets
left in the transmit and receive queues.

The driver first appeared in v3.0.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@vger.kernel.org>
---
 drivers/net/phy/dp83640.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index 13e5713..9663e0b 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1007,6 +1007,7 @@ static void dp83640_remove(struct phy_device *phydev)
 	struct dp83640_clock *clock;
 	struct list_head *this, *next;
 	struct dp83640_private *tmp, *dp83640 = phydev->priv;
+	struct sk_buff *skb;
 
 	if (phydev->addr == BROADCAST_ADDR)
 		return;
@@ -1014,6 +1015,12 @@ static void dp83640_remove(struct phy_device *phydev)
 	enable_status_frames(phydev, false);
 	cancel_work_sync(&dp83640->ts_work);
 
+	while ((skb = skb_dequeue(&dp83640->rx_queue)) != NULL)
+		kfree_skb(skb);
+
+	while ((skb = skb_dequeue(&dp83640->tx_queue)) != NULL)
+		skb_complete_tx_timestamp(skb, NULL);
+
 	clock = dp83640_clock_get(dp83640->clock);
 
 	if (dp83640 == clock->chosen) {
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 1/3] net: hold sock reference while processing tx timestamps
  2011-10-21 10:49   ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-21 11:31     ` Eric Dumazet
  2011-10-24  6:55       ` David Miller
  2011-10-21 11:44     ` Johannes Berg
  1 sibling, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-21 11:31 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, David Miller, Johannes Berg, stable

Le vendredi 21 octobre 2011 à 12:49 +0200, Richard Cochran a écrit :
> The pair of functions,
> 
>  * skb_clone_tx_timestamp()
>  * skb_complete_tx_timestamp()
> 
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
> 
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.
> 
> These functions first appeared in v2.6.36.
> 
> Reported-by: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@vger.kernel.org>
> ---

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 1/3] net: hold sock reference while processing tx timestamps
  2011-10-21 10:49   ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
  2011-10-21 11:31     ` Eric Dumazet
@ 2011-10-21 11:44     ` Johannes Berg
  1 sibling, 0 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-21 11:44 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, David Miller, Eric Dumazet

On Fri, 2011-10-21 at 12:49 +0200, Richard Cochran wrote:
> The pair of functions,
> 
>  * skb_clone_tx_timestamp()
>  * skb_complete_tx_timestamp()
> 
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
> 
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.

It just now occurred to me that technically I think you could use a
destructor function that just points to something like this:

void tstamp_clone_free(struct sk_buff *skb)
{
	sock_put(skb->sk);
}

but just forcing drivers to use the right API seems equally good to me.

> These functions first appeared in v2.6.36.
> 
> Reported-by: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@vger.kernel.org>


Thanks all for the discussions! :-)

Reviewed-by: Johannes Berg <johannes@sipsolutions.net>

johannes

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 1/3] net: hold sock reference while processing tx timestamps
  2011-10-21 11:31     ` Eric Dumazet
@ 2011-10-24  6:55       ` David Miller
  0 siblings, 0 replies; 56+ messages in thread
From: David Miller @ 2011-10-24  6:55 UTC (permalink / raw)
  To: eric.dumazet; +Cc: richardcochran, netdev, johannes, stable

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 21 Oct 2011 13:31:50 +0200

> Le vendredi 21 octobre 2011 à 12:49 +0200, Richard Cochran a écrit :
>> The pair of functions,
>> 
>>  * skb_clone_tx_timestamp()
>>  * skb_complete_tx_timestamp()
>> 
>> were designed to allow timestamping in PHY devices. The first
>> function, called during the MAC driver's hard_xmit method, identifies
>> PTP protocol packets, clones them, and gives them to the PHY device
>> driver. The PHY driver may hold onto the packet and deliver it at a
>> later time using the second function, which adds the packet to the
>> socket's error queue.
>> 
>> As pointed out by Johannes, nothing prevents the socket from
>> disappearing while the cloned packet is sitting in the PHY driver
>> awaiting a timestamp. This patch fixes the issue by taking a reference
>> on the socket for each such packet. In addition, the comments
>> regarding the usage of these function are expanded to highlight the
>> rule that PHY drivers must use skb_complete_tx_timestamp() to release
>> the packet, in order to release the socket reference, too.
>> 
>> These functions first appeared in v2.6.36.
>> 
>> Reported-by: Johannes Berg <johannes@sipsolutions.net>
>> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
>> Cc: <stable@vger.kernel.org>
>> ---
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets
  2011-10-21 10:49   ` [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
@ 2011-10-24  6:55     ` David Miller
  2011-10-24 17:47       ` Richard Cochran
  0 siblings, 1 reply; 56+ messages in thread
From: David Miller @ 2011-10-24  6:55 UTC (permalink / raw)
  To: richardcochran; +Cc: netdev, eric.dumazet, johannes, stable

From: Richard Cochran <richardcochran@gmail.com>
Date: Fri, 21 Oct 2011 12:49:16 +0200

> The previous commit enforces a new rule for handling the cloned packets
> for transmit time stamping. These packets must not be freed using any other
> function than skb_complete_tx_timestamp. This commit fixes the one and only
> driver using this API.
> 
> The driver first appeared in v3.0.
> 
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: <stable@vger.kernel.org>

In the 'net' tree, which is where you should be targetting these dp83640
driver patches, the code looks nothing like what you're patching against.

Please respin patches #2 and #3 against current sources.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets
  2011-10-24  6:55     ` David Miller
@ 2011-10-24 17:47       ` Richard Cochran
  2011-10-24 23:16         ` David Miller
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-24 17:47 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, eric.dumazet, johannes, stable

On Mon, Oct 24, 2011 at 02:55:55AM -0400, David Miller wrote:
> From: Richard Cochran <richardcochran@gmail.com>
> Date: Fri, 21 Oct 2011 12:49:16 +0200
> 
> > The previous commit enforces a new rule for handling the cloned packets
> > for transmit time stamping. These packets must not be freed using any other
> > function than skb_complete_tx_timestamp. This commit fixes the one and only
> > driver using this API.
> > 
> > The driver first appeared in v3.0.
> > 
> > Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> > Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: <stable@vger.kernel.org>
> 
> In the 'net' tree, which is where you should be targetting these dp83640
> driver patches, the code looks nothing like what you're patching against.
> 
> Please respin patches #2 and #3 against current sources.

Okay, but #2 will conflict with 

    dccaa9e0 dp83640: add time stamp insertion for sync messages

in net-next. Should I also submit a fix for that one?

Thanks,
Richard

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets
  2011-10-24 17:47       ` Richard Cochran
@ 2011-10-24 23:16         ` David Miller
  0 siblings, 0 replies; 56+ messages in thread
From: David Miller @ 2011-10-24 23:16 UTC (permalink / raw)
  To: richardcochran; +Cc: netdev, eric.dumazet, johannes, stable

From: Richard Cochran <richardcochran@gmail.com>
Date: Mon, 24 Oct 2011 19:47:25 +0200

> On Mon, Oct 24, 2011 at 02:55:55AM -0400, David Miller wrote:
>> From: Richard Cochran <richardcochran@gmail.com>
>> Date: Fri, 21 Oct 2011 12:49:16 +0200
>> 
>> > The previous commit enforces a new rule for handling the cloned packets
>> > for transmit time stamping. These packets must not be freed using any other
>> > function than skb_complete_tx_timestamp. This commit fixes the one and only
>> > driver using this API.
>> > 
>> > The driver first appeared in v3.0.
>> > 
>> > Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
>> > Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
>> > Cc: <stable@vger.kernel.org>
>> 
>> In the 'net' tree, which is where you should be targetting these dp83640
>> driver patches, the code looks nothing like what you're patching against.
>> 
>> Please respin patches #2 and #3 against current sources.
> 
> Okay, but #2 will conflict with 
> 
>     dccaa9e0 dp83640: add time stamp insertion for sync messages
> 
> in net-next. Should I also submit a fix for that one?

Ok.  I'll just apply those original 2 patches to net-next since Linus
release 3.1 today already.

Thanks.

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2011-10-24 23:17 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-07 17:11 [RFC] net: remove erroneous sk null assignment in timestamping Johannes Berg
2011-10-07 17:33 ` David Miller
2011-10-07 17:40   ` Johannes Berg
2011-10-07 17:47     ` Johannes Berg
2011-10-07 17:53       ` Johannes Berg
2011-10-07 18:42     ` Johannes Berg
2011-10-08  7:59       ` Richard Cochran
2011-10-08  7:57   ` Richard Cochran
2011-10-08  8:16     ` Johannes Berg
2011-10-08  8:57       ` Eric Dumazet
2011-10-08 10:32         ` Johannes Berg
2011-10-11 13:34           ` Richard Cochran
2011-10-08 10:35         ` Richard Cochran
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
2011-10-12 19:25   ` Eric Dumazet
2011-10-12 19:27   ` Johannes Berg
2011-10-12 19:52     ` Eric Dumazet
2011-10-13  8:54       ` Johannes Berg
2011-10-13  4:51     ` Richard Cochran
2011-10-13  9:46   ` [PATCH 0/3] net: time stamping fixes Richard Cochran
2011-10-19  4:16     ` David Miller
2011-10-19  5:15       ` Johannes Berg
2011-10-19 11:50         ` Richard Cochran
2011-10-19 12:33           ` Eric Dumazet
2011-10-19 12:38           ` Eric Dumazet
2011-10-19 12:58             ` Johannes Berg
2011-10-19 13:09               ` Johannes Berg
2011-10-19 13:25                 ` Eric Dumazet
2011-10-19 13:35                   ` Johannes Berg
2011-10-19 13:44                     ` Eric Dumazet
2011-10-19 13:57                       ` Johannes Berg
2011-10-19 14:08                         ` Eric Dumazet
2011-10-19 14:24                           ` Johannes Berg
2011-10-19 14:27                             ` Richard Cochran
2011-10-19 14:33                               ` Eric Dumazet
2011-10-19 13:21               ` Eric Dumazet
2011-10-19 13:25                 ` Johannes Berg
2011-10-19 13:27                   ` Eric Dumazet
2011-10-19 13:32                     ` Johannes Berg
2011-10-19 14:25                       ` Richard Cochran
2011-10-13  9:46   ` [PATCH 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
2011-10-19  4:42     ` Eric Dumazet
2011-10-13  9:46   ` [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
2011-10-19  4:47     ` Eric Dumazet
2011-10-13  9:46   ` [PATCH 3/3] dp83640: free packet queues on remove Richard Cochran
2011-10-19  4:48     ` Eric Dumazet
2011-10-21 10:49   ` [PATCH v2 0/3] net: time stamping fixes Richard Cochran
2011-10-21 10:49   ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
2011-10-21 11:31     ` Eric Dumazet
2011-10-24  6:55       ` David Miller
2011-10-21 11:44     ` Johannes Berg
2011-10-21 10:49   ` [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
2011-10-24  6:55     ` David Miller
2011-10-24 17:47       ` Richard Cochran
2011-10-24 23:16         ` David Miller
2011-10-21 10:49   ` [PATCH v2 3/3] dp83640: free packet queues on remove Richard Cochran

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).