* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-12 19:25 ` Eric Dumazet
2011-10-12 19:27 ` Johannes Berg
` (8 subsequent siblings)
9 siblings, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-12 19:25 UTC (permalink / raw)
To: Richard Cochran; +Cc: netdev, David Miller, Johannes Berg, stable
Le mercredi 12 octobre 2011 à 20:36 +0200, Richard Cochran a écrit :
> The pair of functions,
>
> * skb_clone_tx_timestamp()
> * skb_complete_tx_timestamp()
>
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
>
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.
>
> These functions first appeared in v2.6.36.
>
> Reported-by: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@kernel.org>
> ---
Seems fine to me, thanks !
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
2011-10-12 19:25 ` Eric Dumazet
@ 2011-10-12 19:27 ` Johannes Berg
2011-10-12 19:52 ` Eric Dumazet
2011-10-13 4:51 ` Richard Cochran
2011-10-13 9:46 ` [PATCH 0/3] net: time stamping fixes Richard Cochran
` (7 subsequent siblings)
9 siblings, 2 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-12 19:27 UTC (permalink / raw)
To: Richard Cochran; +Cc: netdev, David Miller, Eric Dumazet
On Wed, 2011-10-12 at 20:36 +0200, Richard Cochran wrote:
> The pair of functions,
>
> * skb_clone_tx_timestamp()
> * skb_complete_tx_timestamp()
>
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
>
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.
This still needs a fix to the PHY driver right? It has a case that can
kfree_skb() the skb instead of passing it back to
complete_tx_timestamp().
> - if (!hwtstamps)
> + if (!hwtstamps) {
> + sock_put(sk);
> + kfree_skb(skb);
> return;
> + }
Is that right w/o skb->sk = NULL?
The other thing I was wondering -- what if we just set truesize to 1 (we
don't have any truesize checks) and account the skb to the socket
normally. Not really a good way either though.
FWIW I just decided to do it the other way around in mac80211 -- keep
the original SKB that's charged to the socket for the error queue, and
use a clone to actually do the TX.
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
2011-10-12 19:27 ` Johannes Berg
@ 2011-10-12 19:52 ` Eric Dumazet
2011-10-13 8:54 ` Johannes Berg
2011-10-13 4:51 ` Richard Cochran
1 sibling, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-12 19:52 UTC (permalink / raw)
To: Johannes Berg; +Cc: Richard Cochran, netdev, David Miller
Le mercredi 12 octobre 2011 à 21:27 +0200, Johannes Berg a écrit :
> On Wed, 2011-10-12 at 20:36 +0200, Richard Cochran wrote:
> > The pair of functions,
> >
> > * skb_clone_tx_timestamp()
> > * skb_complete_tx_timestamp()
> >
> > were designed to allow timestamping in PHY devices. The first
> > function, called during the MAC driver's hard_xmit method, identifies
> > PTP protocol packets, clones them, and gives them to the PHY device
> > driver. The PHY driver may hold onto the packet and deliver it at a
> > later time using the second function, which adds the packet to the
> > socket's error queue.
> >
> > As pointed out by Johannes, nothing prevents the socket from
> > disappearing while the cloned packet is sitting in the PHY driver
> > awaiting a timestamp. This patch fixes the issue by taking a reference
> > on the socket for each such packet. In addition, the comments
> > regarding the usage of these function are expanded to highlight the
> > rule that PHY drivers must use skb_complete_tx_timestamp() to release
> > the packet, in order to release the socket reference, too.
>
> This still needs a fix to the PHY driver right? It has a case that can
> kfree_skb() the skb instead of passing it back to
> complete_tx_timestamp().
>
> > - if (!hwtstamps)
> > + if (!hwtstamps) {
> > + sock_put(sk);
> > + kfree_skb(skb);
> > return;
> > + }
>
> Is that right w/o skb->sk = NULL?
>
>
> The other thing I was wondering -- what if we just set truesize to 1 (we
> don't have any truesize checks) and account the skb to the socket
> normally. Not really a good way either though.
>
Changing truesize is not allowed here see below.
> FWIW I just decided to do it the other way around in mac80211 -- keep
> the original SKB that's charged to the socket for the error queue, and
> use a clone to actually do the TX.
Hmm, please take a look at IFF_SKB_TX_SHARING stuff, added in commit
d88733150 (net: add IFF_SKB_TX_SHARED flag to priv_flags)
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
2011-10-12 19:52 ` Eric Dumazet
@ 2011-10-13 8:54 ` Johannes Berg
0 siblings, 0 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-13 8:54 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Richard Cochran, netdev, David Miller
On Wed, 2011-10-12 at 21:52 +0200, Eric Dumazet wrote:
> > The other thing I was wondering -- what if we just set truesize to 1 (we
> > don't have any truesize checks) and account the skb to the socket
> > normally. Not really a good way either though.
> >
>
> Changing truesize is not allowed here see below.
Oh, good point.
> > FWIW I just decided to do it the other way around in mac80211 -- keep
> > the original SKB that's charged to the socket for the error queue, and
> > use a clone to actually do the TX.
>
> Hmm, please take a look at IFF_SKB_TX_SHARING stuff, added in commit
> d88733150 (net: add IFF_SKB_TX_SHARED flag to priv_flags)
mac80211 got that flag cleared -- and right now I see no way to change
that. 11ac/ad may require work in this area to be able to do this
better, and I have a number of ideas on how to do that, but right now I
don't see how we can do that.
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 1/1] net: hold sock reference while processing tx timestamps
2011-10-12 19:27 ` Johannes Berg
2011-10-12 19:52 ` Eric Dumazet
@ 2011-10-13 4:51 ` Richard Cochran
1 sibling, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-13 4:51 UTC (permalink / raw)
To: Johannes Berg; +Cc: netdev, David Miller, Eric Dumazet
On Wed, Oct 12, 2011 at 09:27:53PM +0200, Johannes Berg wrote:
> On Wed, 2011-10-12 at 20:36 +0200, Richard Cochran wrote:
> > The pair of functions,
> >
> > * skb_clone_tx_timestamp()
> > * skb_complete_tx_timestamp()
...
>
> This still needs a fix to the PHY driver right? It has a case that can
> kfree_skb() the skb instead of passing it back to
> complete_tx_timestamp().
Yes, right, will do.
> > - if (!hwtstamps)
> > + if (!hwtstamps) {
> > + sock_put(sk);
> > + kfree_skb(skb);
> > return;
> > + }
>
> Is that right w/o skb->sk = NULL?
I think it is okay. In clone/complete, we use skb->sk in a special
way, just to remember the socket. Within kfree_skb, only the
skb->destructor would possibly use skb->sk, and that function pointer
is NULL (after the clone in the Tx path).
> FWIW I just decided to do it the other way around in mac80211 -- keep
> the original SKB that's charged to the socket for the error queue, and
> use a clone to actually do the TX.
In this case, we must clone, I think. The Ethernet MAC driver will
call the clone_tx_timetstamp hook just before freeing the original
skb.
Thanks,
Richard
^ permalink raw reply [flat|nested] 56+ messages in thread
* [PATCH 0/3] net: time stamping fixes
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
2011-10-12 19:25 ` Eric Dumazet
2011-10-12 19:27 ` Johannes Berg
@ 2011-10-13 9:46 ` Richard Cochran
2011-10-19 4:16 ` David Miller
2011-10-13 9:46 ` [PATCH 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
` (6 subsequent siblings)
9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-13 9:46 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg
The first patch fixes a bug in the time stamping code introduced in
v2.6.36 of the kernel.
The other two patches depend on the first patch and fix two bugs in a
PTP Hardware Clock driver. This driver was first introduced in Linux
version 3.0.
Richard Cochran (3):
net: hold sock reference while processing tx timestamps
dp83640: use proper function to free transmit time stamping packets
dp83640: free packet queues on remove
drivers/net/phy/dp83640.c | 11 +++++++++--
include/linux/phy.h | 2 +-
include/linux/skbuff.h | 7 ++++++-
net/core/timestamping.c | 7 ++++++-
4 files changed, 22 insertions(+), 5 deletions(-)
--
1.7.2.5
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-13 9:46 ` [PATCH 0/3] net: time stamping fixes Richard Cochran
@ 2011-10-19 4:16 ` David Miller
2011-10-19 5:15 ` Johannes Berg
0 siblings, 1 reply; 56+ messages in thread
From: David Miller @ 2011-10-19 4:16 UTC (permalink / raw)
To: richardcochran; +Cc: netdev, eric.dumazet, johannes
From: Richard Cochran <richardcochran@gmail.com>
Date: Thu, 13 Oct 2011 11:46:26 +0200
> The first patch fixes a bug in the time stamping code introduced in
> v2.6.36 of the kernel.
>
> The other two patches depend on the first patch and fix two bugs in a
> PTP Hardware Clock driver. This driver was first introduced in Linux
> version 3.0.
>
> Richard Cochran (3):
> net: hold sock reference while processing tx timestamps
> dp83640: use proper function to free transmit time stamping packets
> dp83640: free packet queues on remove
Johannes and Eric, please help review Richard's fixes here.
Thanks!
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 4:16 ` David Miller
@ 2011-10-19 5:15 ` Johannes Berg
2011-10-19 11:50 ` Richard Cochran
0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 5:15 UTC (permalink / raw)
To: David Miller; +Cc: richardcochran, netdev, eric.dumazet
On Wed, 2011-10-19 at 00:16 -0400, David Miller wrote:
> From: Richard Cochran <richardcochran@gmail.com>
> Date: Thu, 13 Oct 2011 11:46:26 +0200
>
> > The first patch fixes a bug in the time stamping code introduced in
> > v2.6.36 of the kernel.
> >
> > The other two patches depend on the first patch and fix two bugs in a
> > PTP Hardware Clock driver. This driver was first introduced in Linux
> > version 3.0.
> >
> > Richard Cochran (3):
> > net: hold sock reference while processing tx timestamps
> > dp83640: use proper function to free transmit time stamping packets
> > dp83640: free packet queues on remove
>
> Johannes and Eric, please help review Richard's fixes here.
The only thing I'm not completely sure about is whether or not it is
permissible to sock_hold() at that point. I'm probably just missing
something, but: if sk_free() was called before hard_start_xmit() which
will call skb_clone_tx_timestamp(), can we really call sock_hold()?
The reason I ask is that sock_wfree() doesn't check sk_refcnt, so if it
is possible for sk_free() to have been called before hard_start_xmit(),
maybe because the packet was stuck on the qdisc for a while, the socket
won't be released (sk_free checks sk_wmem_alloc) but the sk_wfree() when
the original skb is freed will actually free the socket, invalidating
the clone's sk pointer *even though* we called sock_hold() right after
making the clone.
So what guarantees that sk_refcnt is still non-zero when we make the
clone? Alternatively, should sock_wfree() check sk_refcnt?
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 5:15 ` Johannes Berg
@ 2011-10-19 11:50 ` Richard Cochran
2011-10-19 12:33 ` Eric Dumazet
2011-10-19 12:38 ` Eric Dumazet
0 siblings, 2 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-19 11:50 UTC (permalink / raw)
To: Johannes Berg; +Cc: David Miller, netdev, eric.dumazet
On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> The only thing I'm not completely sure about is whether or not it is
> permissible to sock_hold() at that point. I'm probably just missing
> something, but: if sk_free() was called before hard_start_xmit() which
> will call skb_clone_tx_timestamp(), can we really call sock_hold()?
>
> The reason I ask is that sock_wfree() doesn't check sk_refcnt, so if it
> is possible for sk_free() to have been called before hard_start_xmit(),
> maybe because the packet was stuck on the qdisc for a while, the socket
> won't be released (sk_free checks sk_wmem_alloc) but the sk_wfree() when
> the original skb is freed will actually free the socket, invalidating
> the clone's sk pointer *even though* we called sock_hold() right after
> making the clone.
>
> So what guarantees that sk_refcnt is still non-zero when we make the
> clone?
In the non-qdisc path, the kernel is in a send() call, so the initial
reference taken in socket() is held.
I really don't know the qdisc code, whether it is somehow holding the
skb->sk indirectly or not.
Eric? David?
> Alternatively, should sock_wfree() check sk_refcnt?
That would presumably spoil the performance enhancement gained in
commit 2b85a34e.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 11:50 ` Richard Cochran
@ 2011-10-19 12:33 ` Eric Dumazet
2011-10-19 12:38 ` Eric Dumazet
1 sibling, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 12:33 UTC (permalink / raw)
To: Richard Cochran; +Cc: Johannes Berg, David Miller, netdev
Le mercredi 19 octobre 2011 à 13:50 +0200, Richard Cochran a écrit :
> On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > The only thing I'm not completely sure about is whether or not it is
> > permissible to sock_hold() at that point. I'm probably just missing
> > something, but: if sk_free() was called before hard_start_xmit() which
> > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> >
This is not possible, or something is really broken.
> > The reason I ask is that sock_wfree() doesn't check sk_refcnt, so if it
> > is possible for sk_free() to have been called before hard_start_xmit(),
> > maybe because the packet was stuck on the qdisc for a while, the socket
> > won't be released (sk_free checks sk_wmem_alloc) but the sk_wfree() when
> > the original skb is freed will actually free the socket, invalidating
> > the clone's sk pointer *even though* we called sock_hold() right after
> > making the clone.
> >
> > So what guarantees that sk_refcnt is still non-zero when we make the
> > clone?
>
> In the non-qdisc path, the kernel is in a send() call, so the initial
> reference taken in socket() is held.
>
> I really don't know the qdisc code, whether it is somehow holding the
> skb->sk indirectly or not.
>
> Eric? David?
I dont really understand what's the problem, since sk_free() doesnt care
at all about sk_refcnt, but sk_wmem_alloc.
If one skb is in flight, and still linked to a socket, then this socket
cannot disappear, because this skb->truesize was accounted into
sk->sk_wmem_alloc
Of course, this point is valid as long as skb had not been orphaned.
And this is true
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 11:50 ` Richard Cochran
2011-10-19 12:33 ` Eric Dumazet
@ 2011-10-19 12:38 ` Eric Dumazet
2011-10-19 12:58 ` Johannes Berg
1 sibling, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 12:38 UTC (permalink / raw)
To: Richard Cochran; +Cc: Johannes Berg, David Miller, netdev
Le mercredi 19 octobre 2011 à 13:50 +0200, Richard Cochran a écrit :
> On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > The only thing I'm not completely sure about is whether or not it is
> > permissible to sock_hold() at that point. I'm probably just missing
> > something, but: if sk_free() was called before hard_start_xmit() which
> > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> >
This is not possible, or something is really broken. We specifically
dont skb_orphan(skb) if we know tx timestamping is enabled for this skb.
/*
* Try to orphan skb early, right before transmission by the device.
* We cannot orphan skb if tx timestamp is requested or the sk-reference
* is needed on driver level for other reasons, e.g. see net/can/raw.c
*/
static inline void skb_orphan_try(struct sk_buff *skb)
{
struct sock *sk = skb->sk;
if (sk && !skb_shinfo(skb)->tx_flags) {
/* skb_tx_hash() wont be able to get sk.
* We copy sk_hash into skb->rxhash
*/
if (!skb->rxhash)
skb->rxhash = sk->sk_hash;
skb_orphan(skb);
}
}
> > The reason I ask is that sock_wfree() doesn't check sk_refcnt, so if it
> > is possible for sk_free() to have been called before hard_start_xmit(),
> > maybe because the packet was stuck on the qdisc for a while, the socket
> > won't be released (sk_free checks sk_wmem_alloc) but the sk_wfree() when
> > the original skb is freed will actually free the socket, invalidating
> > the clone's sk pointer *even though* we called sock_hold() right after
> > making the clone.
> >
> > So what guarantees that sk_refcnt is still non-zero when we make the
> > clone?
>
> In the non-qdisc path, the kernel is in a send() call, so the initial
> reference taken in socket() is held.
>
> I really don't know the qdisc code, whether it is somehow holding the
> skb->sk indirectly or not.
>
> Eric? David?
I dont really understand what's the concern, since sk_free() doesnt care
at all about sk_refcnt, but sk_wmem_alloc.
void sk_free(struct sock *sk)
{
/*
* We subtract one from sk_wmem_alloc and can know if
* some packets are still in some tx queue.
* If not null, sock_wfree() will call __sk_free(sk) later
*/
if (atomic_dec_and_test(&sk->sk_wmem_alloc))
__sk_free(sk);
}
EXPORT_SYMBOL(sk_free);
If one skb is in flight, and still linked to a socket, then this socket
cannot disappear, because this skb->truesize was accounted into
sk->sk_wmem_alloc
Of course, this point is valid as long as skb had not been orphaned.
sk_refcnt can be 0, if user closed the socket, but socket wont disappear
as long as sk_wmem_alloc is not 0.
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 12:38 ` Eric Dumazet
@ 2011-10-19 12:58 ` Johannes Berg
2011-10-19 13:09 ` Johannes Berg
2011-10-19 13:21 ` Eric Dumazet
0 siblings, 2 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 12:58 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev
On Wed, 2011-10-19 at 14:38 +0200, Eric Dumazet wrote:
> Le mercredi 19 octobre 2011 à 13:50 +0200, Richard Cochran a écrit :
> > On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > > The only thing I'm not completely sure about is whether or not it is
> > > permissible to sock_hold() at that point. I'm probably just missing
> > > something, but: if sk_free() was called before hard_start_xmit() which
> > > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> > >
>
> This is not possible, or something is really broken. We specifically
> dont skb_orphan(skb) if we know tx timestamping is enabled for this skb.
Why can't sk_free() have been called? I'm not thinking of sock_wfree()
which can't have been called -- so the socket surely still exists
because skb->truesize is still accounted to it -- but what says
sk_refcnt hasn't reached 0 yet?
> /*
> * Try to orphan skb early, right before transmission by the device.
> * We cannot orphan skb if tx timestamp is requested or the sk-reference
> * is needed on driver level for other reasons, e.g. see net/can/raw.c
> */
> static inline void skb_orphan_try(struct sk_buff *skb)
> {
> struct sock *sk = skb->sk;
>
> if (sk && !skb_shinfo(skb)->tx_flags) {
> /* skb_tx_hash() wont be able to get sk.
> * We copy sk_hash into skb->rxhash
> */
> if (!skb->rxhash)
> skb->rxhash = sk->sk_hash;
> skb_orphan(skb);
> }
> }
Right.
> I dont really understand what's the concern, since sk_free() doesnt care
> at all about sk_refcnt, but sk_wmem_alloc.
Right.
> void sk_free(struct sock *sk)
[snip]
> If one skb is in flight, and still linked to a socket, then this socket
> cannot disappear, because this skb->truesize was accounted into
> sk->sk_wmem_alloc
This is undoubtedly true, I'm not disputing this.
> Of course, this point is valid as long as skb had not been orphaned.
>
> sk_refcnt can be 0, if user closed the socket, but socket wont disappear
> as long as sk_wmem_alloc is not 0.
Not disputing this either. But you said sk_refcnt can be 0, so why can't
the following happen:
/* skb; skb->sk = sk; skb->destructor = sock_wfree; */
/* skb is on qdisc, some time passes */
sk_free(sk); /* user closed socket,
sk->sk_refcnt reaches 0,
sk->sk_wmem_alloc == skb->truesize,
__sk_free not called, socket still lives,
but no more +1 in sk_wmem_alloc */
/* some more time passes */
/* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
skb2 = skb_clone(skb);
skb2->sk = skb->sk;
sock_hold(skb->sk);
/* ethernet TX completion calls skb_free(skb) */
skb_free(skb):
sock_wfree(skb); /* sk_wmem_alloc reaches 0,
__sk_free called DESPITE sk_refcnt > 0 */
/* later, in skb_complete_tx_timestamp() */
sock_put(sk); /* KABOOM */
I just want to understand why this can't happen :-)
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 12:58 ` Johannes Berg
@ 2011-10-19 13:09 ` Johannes Berg
2011-10-19 13:25 ` Eric Dumazet
2011-10-19 13:21 ` Eric Dumazet
1 sibling, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:09 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev
On Wed, 2011-10-19 at 14:58 +0200, Johannes Berg wrote:
> Not disputing this either. But you said sk_refcnt can be 0, so why can't
> the following happen:
>
> /* skb; skb->sk = sk; skb->destructor = sock_wfree; */
>
> /* skb is on qdisc, some time passes */
>
> sk_free(sk); /* user closed socket,
> sk->sk_refcnt reaches 0,
> sk->sk_wmem_alloc == skb->truesize,
> __sk_free not called, socket still lives,
> but no more +1 in sk_wmem_alloc */
>
> /* some more time passes */
>
> /* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
> skb2 = skb_clone(skb);
> skb2->sk = skb->sk;
> sock_hold(skb->sk);
>
> /* ethernet TX completion calls skb_free(skb) */
> skb_free(skb):
> sock_wfree(skb); /* sk_wmem_alloc reaches 0,
> __sk_free called DESPITE sk_refcnt > 0 */
>
> /* later, in skb_complete_tx_timestamp() */
> sock_put(sk); /* KABOOM */
Given the complexity of all this, I'm not sure we shouldn't do something
like this, but I have no idea what the cost would be:
--- wireless-testing.orig/include/net/sock.h 2011-10-18 22:28:41.000000000 +0200
+++ wireless-testing/include/net/sock.h 2011-10-19 15:08:45.000000000 +0200
@@ -434,7 +434,10 @@ static __inline__ int __sk_del_node_init
static inline void sock_hold(struct sock *sk)
{
- atomic_inc(&sk->sk_refcnt);
+ if (atomic_inc_return(&sk->sk_refcnt) == 1) {
+ /* was zero -- we must've gotten an sk_wmem_alloc reference */
+ atomic_inc(&sk->sk_wmem_alloc);
+ }
}
/* Ungrab socket in the context, which assumes that socket refcnt
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 13:09 ` Johannes Berg
@ 2011-10-19 13:25 ` Eric Dumazet
2011-10-19 13:35 ` Johannes Berg
0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 13:25 UTC (permalink / raw)
To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev
Le mercredi 19 octobre 2011 à 15:09 +0200, Johannes Berg a écrit :
> On Wed, 2011-10-19 at 14:58 +0200, Johannes Berg wrote:
>
> > Not disputing this either. But you said sk_refcnt can be 0, so why can't
> > the following happen:
> >
> > /* skb; skb->sk = sk; skb->destructor = sock_wfree; */
> >
> > /* skb is on qdisc, some time passes */
> >
> > sk_free(sk); /* user closed socket,
> > sk->sk_refcnt reaches 0,
> > sk->sk_wmem_alloc == skb->truesize,
> > __sk_free not called, socket still lives,
> > but no more +1 in sk_wmem_alloc */
> >
> > /* some more time passes */
> >
> > /* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
> > skb2 = skb_clone(skb);
> > skb2->sk = skb->sk;
> > sock_hold(skb->sk);
> >
> > /* ethernet TX completion calls skb_free(skb) */
> > skb_free(skb):
> > sock_wfree(skb); /* sk_wmem_alloc reaches 0,
> > __sk_free called DESPITE sk_refcnt > 0 */
> >
> > /* later, in skb_complete_tx_timestamp() */
> > sock_put(sk); /* KABOOM */
>
>
> Given the complexity of all this, I'm not sure we shouldn't do something
> like this, but I have no idea what the cost would be:
>
> --- wireless-testing.orig/include/net/sock.h 2011-10-18 22:28:41.000000000 +0200
> +++ wireless-testing/include/net/sock.h 2011-10-19 15:08:45.000000000 +0200
> @@ -434,7 +434,10 @@ static __inline__ int __sk_del_node_init
>
> static inline void sock_hold(struct sock *sk)
> {
> - atomic_inc(&sk->sk_refcnt);
> + if (atomic_inc_return(&sk->sk_refcnt) == 1) {
> + /* was zero -- we must've gotten an sk_wmem_alloc reference */
> + atomic_inc(&sk->sk_wmem_alloc);
> + }
> }
>
Hmm, it will be difficult to handle two atomics without adding races,
and add quite expensive atomic_inc_return() on some arches.
I would just change the skb tx cloning to take a normal reference on
sk_wmem_alloc
atomic_add(skb->truesize, &sk->sk_wmem_alloc);
instead of
sock_hold(sk);
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 13:25 ` Eric Dumazet
@ 2011-10-19 13:35 ` Johannes Berg
2011-10-19 13:44 ` Eric Dumazet
0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:35 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev
On Wed, 2011-10-19 at 15:25 +0200, Eric Dumazet wrote:
> > Given the complexity of all this, I'm not sure we shouldn't do something
> > like this, but I have no idea what the cost would be:
> >
> > --- wireless-testing.orig/include/net/sock.h 2011-10-18 22:28:41.000000000 +0200
> > +++ wireless-testing/include/net/sock.h 2011-10-19 15:08:45.000000000 +0200
> > @@ -434,7 +434,10 @@ static __inline__ int __sk_del_node_init
> >
> > static inline void sock_hold(struct sock *sk)
> > {
> > - atomic_inc(&sk->sk_refcnt);
> > + if (atomic_inc_return(&sk->sk_refcnt) == 1) {
> > + /* was zero -- we must've gotten an sk_wmem_alloc reference */
> > + atomic_inc(&sk->sk_wmem_alloc);
> > + }
> > }
> >
>
> Hmm, it will be difficult to handle two atomics without adding races,
Where do you see a race? If you do sock_hold() while you have a 'proper
sk_refcnt reference', this does nothing but increase sk_refcnt. If you
do sock_hold() while you have an 'sk_wmem_alloc reference', this will
actually increase sk_wmem_alloc by 1; then when the original
'sk_wmem_alloc reference' you had when calling sock_hold() is released,
sk_wmem_alloc will still have 1 matching a non-zero sk_refcnt.
> and add quite expensive atomic_inc_return() on some arches.
Yes I was afraid of that.
> I would just change the skb tx cloning to take a normal reference on
> sk_wmem_alloc
>
> atomic_add(skb->truesize, &sk->sk_wmem_alloc);
> instead of
> sock_hold(sk);
Even with that fixed I'm not really convinced of it all -- need to
really really really make sure that no skb->sk that was owned by a TX
skb is ever passed to sock_hold(). Can we really guarantee that?
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 13:35 ` Johannes Berg
@ 2011-10-19 13:44 ` Eric Dumazet
2011-10-19 13:57 ` Johannes Berg
0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 13:44 UTC (permalink / raw)
To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev
Le mercredi 19 octobre 2011 à 15:35 +0200, Johannes Berg a écrit :
> Even with that fixed I'm not really convinced of it all -- need to
> really really really make sure that no skb->sk that was owned by a TX
> skb is ever passed to sock_hold(). Can we really guarantee that?
Either we can guarantee that, or kernel is a piece of crap, all bets are
off.
Yes, we need to make an audit, since we assumed sock_hold() was the
right thing to do in all contexts.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 13:44 ` Eric Dumazet
@ 2011-10-19 13:57 ` Johannes Berg
2011-10-19 14:08 ` Eric Dumazet
0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:57 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev
On Wed, 2011-10-19 at 15:44 +0200, Eric Dumazet wrote:
> Le mercredi 19 octobre 2011 à 15:35 +0200, Johannes Berg a écrit :
>
> > Even with that fixed I'm not really convinced of it all -- need to
> > really really really make sure that no skb->sk that was owned by a TX
> > skb is ever passed to sock_hold(). Can we really guarantee that?
>
> Either we can guarantee that, or kernel is a piece of crap, all bets are
> off.
Fair enough :-)
> Yes, we need to make an audit, since we assumed sock_hold() was the
> right thing to do in all contexts.
Ok.
Anyway, I guess you agree that the patches as-is aren't actually the
right solution since we can't sock_hold() a TX skb socket reference?
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 13:57 ` Johannes Berg
@ 2011-10-19 14:08 ` Eric Dumazet
2011-10-19 14:24 ` Johannes Berg
0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 14:08 UTC (permalink / raw)
To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev
Le mercredi 19 octobre 2011 à 15:57 +0200, Johannes Berg a écrit :
> On Wed, 2011-10-19 at 15:44 +0200, Eric Dumazet wrote:
> > Le mercredi 19 octobre 2011 à 15:35 +0200, Johannes Berg a écrit :
> >
> > > Even with that fixed I'm not really convinced of it all -- need to
> > > really really really make sure that no skb->sk that was owned by a TX
> > > skb is ever passed to sock_hold(). Can we really guarantee that?
> >
> > Either we can guarantee that, or kernel is a piece of crap, all bets are
> > off.
>
> Fair enough :-)
>
> > Yes, we need to make an audit, since we assumed sock_hold() was the
> > right thing to do in all contexts.
>
> Ok.
>
> Anyway, I guess you agree that the patches as-is aren't actually the
> right solution since we can't sock_hold() a TX skb socket reference?
Yes, the sock_hold() could be changed by an atomic_inc_not_zero()
What about doing this ?
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 54fc413..79f337c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -420,7 +420,7 @@ struct phy_driver {
/*
* Requests a Tx timestamp for 'skb'. The phy driver promises
- * to deliver it to the socket's error queue as soon as a
+ * to deliver it using skb_complete_tx_timestamp() as soon as a
* timestamp becomes available. One of the PTP_CLASS_ values
* is passed in 'type'.
*/
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8bd383c..0f96646 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2020,8 +2020,13 @@ static inline bool skb_defer_rx_timestamp(struct sk_buff *skb)
/**
* skb_complete_tx_timestamp() - deliver cloned skb with tx timestamps
*
+ * PHY drivers may accept clones of transmitted packets for
+ * timestamping via their phy_driver.txtstamp method. These drivers
+ * must call this function to return the skb back to the stack, with
+ * or without a timestamp.
+ *
* @skb: clone of the the original outgoing packet
- * @hwtstamps: hardware time stamps
+ * @hwtstamps: hardware time stamps, may be NULL if not available
*
*/
void skb_complete_tx_timestamp(struct sk_buff *skb,
diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 98a5264..82fb288 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -57,9 +57,13 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
case PTP_CLASS_V2_VLAN:
phydev = skb->dev->phydev;
if (likely(phydev->drv->txtstamp)) {
+ if (!atomic_inc_not_zero(&sk->sk_refcnt))
+ return;
clone = skb_clone(skb, GFP_ATOMIC);
- if (!clone)
+ if (!clone) {
+ sock_put(sk);
return;
+ }
clone->sk = sk;
phydev->drv->txtstamp(phydev, clone, type);
}
@@ -77,8 +81,11 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
struct sock_exterr_skb *serr;
int err;
- if (!hwtstamps)
+ if (!hwtstamps) {
+ sock_put(sk);
+ kfree_skb(skb);
return;
+ }
*skb_hwtstamps(skb) = *hwtstamps;
serr = SKB_EXT_ERR(skb);
@@ -87,6 +94,7 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
skb->sk = NULL;
err = sock_queue_err_skb(sk, skb);
+ sock_put(sk);
if (err)
kfree_skb(skb);
}
^ permalink raw reply related [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 14:08 ` Eric Dumazet
@ 2011-10-19 14:24 ` Johannes Berg
2011-10-19 14:27 ` Richard Cochran
0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 14:24 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev
On Wed, 2011-10-19 at 16:08 +0200, Eric Dumazet wrote:
> > Anyway, I guess you agree that the patches as-is aren't actually the
> > right solution since we can't sock_hold() a TX skb socket reference?
>
> Yes, the sock_hold() could be changed by an atomic_inc_not_zero()
>
> What about doing this ?
> if (likely(phydev->drv->txtstamp)) {
> + if (!atomic_inc_not_zero(&sk->sk_refcnt))
> + return;
Yeah that seems like it works and just drops the timestamp in case we
don't still have a live socket, which is perfectly fine.
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 14:24 ` Johannes Berg
@ 2011-10-19 14:27 ` Richard Cochran
2011-10-19 14:33 ` Eric Dumazet
0 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-19 14:27 UTC (permalink / raw)
To: Johannes Berg; +Cc: Eric Dumazet, David Miller, netdev
On Wed, Oct 19, 2011 at 04:24:08PM +0200, Johannes Berg wrote:
> On Wed, 2011-10-19 at 16:08 +0200, Eric Dumazet wrote:
>
> > > Anyway, I guess you agree that the patches as-is aren't actually the
> > > right solution since we can't sock_hold() a TX skb socket reference?
> >
> > Yes, the sock_hold() could be changed by an atomic_inc_not_zero()
> >
> > What about doing this ?
>
> > if (likely(phydev->drv->txtstamp)) {
> > + if (!atomic_inc_not_zero(&sk->sk_refcnt))
> > + return;
>
> Yeah that seems like it works and just drops the timestamp in case we
> don't still have a live socket, which is perfectly fine.
Yes, I think it resolves any doubt. I will resubmit with this
solution.
Thanks,
Richard
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 14:27 ` Richard Cochran
@ 2011-10-19 14:33 ` Eric Dumazet
0 siblings, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 14:33 UTC (permalink / raw)
To: Richard Cochran; +Cc: Johannes Berg, David Miller, netdev
Le mercredi 19 octobre 2011 à 16:27 +0200, Richard Cochran a écrit :
> On Wed, Oct 19, 2011 at 04:24:08PM +0200, Johannes Berg wrote:
> > On Wed, 2011-10-19 at 16:08 +0200, Eric Dumazet wrote:
> >
> > > > Anyway, I guess you agree that the patches as-is aren't actually the
> > > > right solution since we can't sock_hold() a TX skb socket reference?
> > >
> > > Yes, the sock_hold() could be changed by an atomic_inc_not_zero()
> > >
> > > What about doing this ?
> >
> > > if (likely(phydev->drv->txtstamp)) {
> > > + if (!atomic_inc_not_zero(&sk->sk_refcnt))
> > > + return;
> >
> > Yeah that seems like it works and just drops the timestamp in case we
> > don't still have a live socket, which is perfectly fine.
>
> Yes, I think it resolves any doubt. I will resubmit with this
> solution.
>
Thanks guys !
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 12:58 ` Johannes Berg
2011-10-19 13:09 ` Johannes Berg
@ 2011-10-19 13:21 ` Eric Dumazet
2011-10-19 13:25 ` Johannes Berg
1 sibling, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 13:21 UTC (permalink / raw)
To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev
Le mercredi 19 octobre 2011 à 14:58 +0200, Johannes Berg a écrit :
> On Wed, 2011-10-19 at 14:38 +0200, Eric Dumazet wrote:
> > Le mercredi 19 octobre 2011 à 13:50 +0200, Richard Cochran a écrit :
> > > On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > > > The only thing I'm not completely sure about is whether or not it is
> > > > permissible to sock_hold() at that point. I'm probably just missing
> > > > something, but: if sk_free() was called before hard_start_xmit() which
> > > > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> > > >
> >
> > This is not possible, or something is really broken. We specifically
> > dont skb_orphan(skb) if we know tx timestamping is enabled for this skb.
>
> Why can't sk_free() have been called? I'm not thinking of sock_wfree()
> which can't have been called -- so the socket surely still exists
> because skb->truesize is still accounted to it -- but what says
> sk_refcnt hasn't reached 0 yet?
>
> > /*
> > * Try to orphan skb early, right before transmission by the device.
> > * We cannot orphan skb if tx timestamp is requested or the sk-reference
> > * is needed on driver level for other reasons, e.g. see net/can/raw.c
> > */
> > static inline void skb_orphan_try(struct sk_buff *skb)
> > {
> > struct sock *sk = skb->sk;
> >
> > if (sk && !skb_shinfo(skb)->tx_flags) {
> > /* skb_tx_hash() wont be able to get sk.
> > * We copy sk_hash into skb->rxhash
> > */
> > if (!skb->rxhash)
> > skb->rxhash = sk->sk_hash;
> > skb_orphan(skb);
> > }
> > }
>
> Right.
>
> > I dont really understand what's the concern, since sk_free() doesnt care
> > at all about sk_refcnt, but sk_wmem_alloc.
>
> Right.
>
> > void sk_free(struct sock *sk)
> [snip]
>
> > If one skb is in flight, and still linked to a socket, then this socket
> > cannot disappear, because this skb->truesize was accounted into
> > sk->sk_wmem_alloc
>
> This is undoubtedly true, I'm not disputing this.
>
> > Of course, this point is valid as long as skb had not been orphaned.
> >
> > sk_refcnt can be 0, if user closed the socket, but socket wont disappear
> > as long as sk_wmem_alloc is not 0.
>
> Not disputing this either. But you said sk_refcnt can be 0, so why can't
> the following happen:
>
> /* skb; skb->sk = sk; skb->destructor = sock_wfree; */
>
> /* skb is on qdisc, some time passes */
>
> sk_free(sk); /* user closed socket,
> sk->sk_refcnt reaches 0,
> sk->sk_wmem_alloc == skb->truesize,
> __sk_free not called, socket still lives,
> but no more +1 in sk_wmem_alloc */
>
> /* some more time passes */
>
> /* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
> skb2 = skb_clone(skb);
> skb2->sk = skb->sk;
> sock_hold(skb->sk);
>
> /* ethernet TX completion calls skb_free(skb) */
> skb_free(skb):
> sock_wfree(skb); /* sk_wmem_alloc reaches 0,
> __sk_free called DESPITE sk_refcnt > 0 */
>
> /* later, in skb_complete_tx_timestamp() */
> sock_put(sk); /* KABOOM */
>
>
> I just want to understand why this can't happen :-)
Since you answer your own question :)
Hmm, oh well, sk_refcnt is/should not be changed if a xmit packet is
duped, but sk_wmem_alloc should be, exactly paired with skb->truesize
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 13:21 ` Eric Dumazet
@ 2011-10-19 13:25 ` Johannes Berg
2011-10-19 13:27 ` Eric Dumazet
0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:25 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev
On Wed, 2011-10-19 at 15:21 +0200, Eric Dumazet wrote:
> > Not disputing this either. But you said sk_refcnt can be 0, so why can't
> > the following happen:
> >
> > /* skb; skb->sk = sk; skb->destructor = sock_wfree; */
> >
> > /* skb is on qdisc, some time passes */
> >
> > sk_free(sk); /* user closed socket,
> > sk->sk_refcnt reaches 0,
> > sk->sk_wmem_alloc == skb->truesize,
> > __sk_free not called, socket still lives,
> > but no more +1 in sk_wmem_alloc */
> >
> > /* some more time passes */
> >
> > /* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
> > skb2 = skb_clone(skb);
> > skb2->sk = skb->sk;
> > sock_hold(skb->sk);
> >
> > /* ethernet TX completion calls skb_free(skb) */
> > skb_free(skb):
> > sock_wfree(skb); /* sk_wmem_alloc reaches 0,
> > __sk_free called DESPITE sk_refcnt > 0 */
> >
> > /* later, in skb_complete_tx_timestamp() */
> > sock_put(sk); /* KABOOM */
> >
> >
> > I just want to understand why this can't happen :-)
>
> Since you answer your own question :)
Did I?
> Hmm, oh well, sk_refcnt is/should not be changed if a xmit packet is
> duped, but sk_wmem_alloc should be, exactly paired with skb->truesize
Well, yeah, ok, but Richard's patches do modify sk_refcnt, and can't
modify sk_wmem_alloc as discussed upthread.
I'll admit that I don't really trust the whole thing anyway -- it seems
rather error prone to forbid you from doing sock_hold() on a socket you
obtained from a TX packet.
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 13:25 ` Johannes Berg
@ 2011-10-19 13:27 ` Eric Dumazet
2011-10-19 13:32 ` Johannes Berg
0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 13:27 UTC (permalink / raw)
To: Johannes Berg; +Cc: Richard Cochran, David Miller, netdev
Le mercredi 19 octobre 2011 à 15:25 +0200, Johannes Berg a écrit :
> Well, yeah, ok, but Richard's patches do modify sk_refcnt, and can't
> modify sk_wmem_alloc as discussed upthread.
>
> I'll admit that I don't really trust the whole thing anyway -- it seems
> rather error prone to forbid you from doing sock_hold() on a socket you
> obtained from a TX packet.
>
I probably missed why sk_wmem_alloc can not be modified ?
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 13:27 ` Eric Dumazet
@ 2011-10-19 13:32 ` Johannes Berg
2011-10-19 14:25 ` Richard Cochran
0 siblings, 1 reply; 56+ messages in thread
From: Johannes Berg @ 2011-10-19 13:32 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Richard Cochran, David Miller, netdev
On Wed, 2011-10-19 at 15:27 +0200, Eric Dumazet wrote:
> Le mercredi 19 octobre 2011 à 15:25 +0200, Johannes Berg a écrit :
>
> > Well, yeah, ok, but Richard's patches do modify sk_refcnt, and can't
> > modify sk_wmem_alloc as discussed upthread.
> >
> > I'll admit that I don't really trust the whole thing anyway -- it seems
> > rather error prone to forbid you from doing sock_hold() on a socket you
> > obtained from a TX packet.
> >
>
> I probably missed why sk_wmem_alloc can not be modified ?
I think I misremember -- I was suggesting to set skb2->truesize to 1 and
then charge it so it's not double-charged.
OTOH, could skb_clone_tx_timestamp() orphan the original skb after
adding the skb2?
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH 0/3] net: time stamping fixes
2011-10-19 13:32 ` Johannes Berg
@ 2011-10-19 14:25 ` Richard Cochran
0 siblings, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-19 14:25 UTC (permalink / raw)
To: Johannes Berg; +Cc: Eric Dumazet, David Miller, netdev
On Wed, Oct 19, 2011 at 03:32:53PM +0200, Johannes Berg wrote:
>
> OTOH, could skb_clone_tx_timestamp() orphan the original skb after
> adding the skb2?
The Ethernet MAC driver decides what to do with the original skb. In
the MAC driver, it goes like:
1. skb_tx_timestamp(skb) -> skb_clone_tx_timestamp(skb);
2. give skb to hardware;
3. ...
4. kfree_skb, recycle, etc
Richard
^ permalink raw reply [flat|nested] 56+ messages in thread
* [PATCH 1/3] net: hold sock reference while processing tx timestamps
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
` (2 preceding siblings ...)
2011-10-13 9:46 ` [PATCH 0/3] net: time stamping fixes Richard Cochran
@ 2011-10-13 9:46 ` Richard Cochran
2011-10-19 4:42 ` Eric Dumazet
2011-10-13 9:46 ` [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
` (5 subsequent siblings)
9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-13 9:46 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable
The pair of functions,
* skb_clone_tx_timestamp()
* skb_complete_tx_timestamp()
were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.
As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these function are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.
These functions first appeared in v2.6.36.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@kernel.org>
---
include/linux/phy.h | 2 +-
include/linux/skbuff.h | 7 ++++++-
net/core/timestamping.c | 7 ++++++-
3 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 54fc413..79f337c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -420,7 +420,7 @@ struct phy_driver {
/*
* Requests a Tx timestamp for 'skb'. The phy driver promises
- * to deliver it to the socket's error queue as soon as a
+ * to deliver it using skb_complete_tx_timestamp() as soon as a
* timestamp becomes available. One of the PTP_CLASS_ values
* is passed in 'type'.
*/
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8bd383c..0f96646 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2020,8 +2020,13 @@ static inline bool skb_defer_rx_timestamp(struct sk_buff *skb)
/**
* skb_complete_tx_timestamp() - deliver cloned skb with tx timestamps
*
+ * PHY drivers may accept clones of transmitted packets for
+ * timestamping via their phy_driver.txtstamp method. These drivers
+ * must call this function to return the skb back to the stack, with
+ * or without a timestamp.
+ *
* @skb: clone of the the original outgoing packet
- * @hwtstamps: hardware time stamps
+ * @hwtstamps: hardware time stamps, may be NULL if not available
*
*/
void skb_complete_tx_timestamp(struct sk_buff *skb,
diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 98a5264..29e59d6 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -60,6 +60,7 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
clone = skb_clone(skb, GFP_ATOMIC);
if (!clone)
return;
+ sock_hold(sk);
clone->sk = sk;
phydev->drv->txtstamp(phydev, clone, type);
}
@@ -77,8 +78,11 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
struct sock_exterr_skb *serr;
int err;
- if (!hwtstamps)
+ if (!hwtstamps) {
+ sock_put(sk);
+ kfree_skb(skb);
return;
+ }
*skb_hwtstamps(skb) = *hwtstamps;
serr = SKB_EXT_ERR(skb);
@@ -87,6 +91,7 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
skb->sk = NULL;
err = sock_queue_err_skb(sk, skb);
+ sock_put(sk);
if (err)
kfree_skb(skb);
}
--
1.7.2.5
^ permalink raw reply related [flat|nested] 56+ messages in thread* Re: [PATCH 1/3] net: hold sock reference while processing tx timestamps
2011-10-13 9:46 ` [PATCH 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-19 4:42 ` Eric Dumazet
0 siblings, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 4:42 UTC (permalink / raw)
To: Richard Cochran; +Cc: netdev, David Miller, Johannes Berg, stable
Le jeudi 13 octobre 2011 à 11:46 +0200, Richard Cochran a écrit :
> The pair of functions,
>
> * skb_clone_tx_timestamp()
> * skb_complete_tx_timestamp()
>
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
>
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.
>
> These functions first appeared in v2.6.36.
>
> Reported-by: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@kernel.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
^ permalink raw reply [flat|nested] 56+ messages in thread
* [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
` (3 preceding siblings ...)
2011-10-13 9:46 ` [PATCH 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-13 9:46 ` Richard Cochran
2011-10-19 4:47 ` Eric Dumazet
2011-10-13 9:46 ` [PATCH 3/3] dp83640: free packet queues on remove Richard Cochran
` (4 subsequent siblings)
9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-13 9:46 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable
The previous commit enforces a new rule for handling the cloned packets
for transmit time stamping. These packets must not be freed using any other
function than skb_complete_tx_timestamp. This commit fixes the one and only
driver using this API.
The driver first appeared in v3.0.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@kernel.org>
---
drivers/net/phy/dp83640.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index c588a16..13e5713 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1192,7 +1192,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
case HWTSTAMP_TX_ONESTEP_SYNC:
if (is_sync(skb, type)) {
- kfree_skb(skb);
+ skb_complete_tx_timestamp(skb, NULL);
return;
}
/* fall through */
@@ -1203,7 +1203,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
case HWTSTAMP_TX_OFF:
default:
- kfree_skb(skb);
+ skb_complete_tx_timestamp(skb, NULL);
break;
}
}
--
1.7.2.5
^ permalink raw reply related [flat|nested] 56+ messages in thread* [PATCH 3/3] dp83640: free packet queues on remove
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
` (4 preceding siblings ...)
2011-10-13 9:46 ` [PATCH 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
@ 2011-10-13 9:46 ` Richard Cochran
2011-10-19 4:48 ` Eric Dumazet
2011-10-21 10:49 ` [PATCH v2 0/3] net: time stamping fixes Richard Cochran
` (3 subsequent siblings)
9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-13 9:46 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable
If the PHY should disappear (for example, on an USB Ethernet MAC), then
the driver would leak any undelivered time stamp packets. This commit
fixes the issue by calling the appropriate functions to free any packets
left in the transmit and receive queues.
The driver first appeared in v3.0.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@kernel.org>
---
drivers/net/phy/dp83640.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index 13e5713..9663e0b 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1007,6 +1007,7 @@ static void dp83640_remove(struct phy_device *phydev)
struct dp83640_clock *clock;
struct list_head *this, *next;
struct dp83640_private *tmp, *dp83640 = phydev->priv;
+ struct sk_buff *skb;
if (phydev->addr == BROADCAST_ADDR)
return;
@@ -1014,6 +1015,12 @@ static void dp83640_remove(struct phy_device *phydev)
enable_status_frames(phydev, false);
cancel_work_sync(&dp83640->ts_work);
+ while ((skb = skb_dequeue(&dp83640->rx_queue)) != NULL)
+ kfree_skb(skb);
+
+ while ((skb = skb_dequeue(&dp83640->tx_queue)) != NULL)
+ skb_complete_tx_timestamp(skb, NULL);
+
clock = dp83640_clock_get(dp83640->clock);
if (dp83640 == clock->chosen) {
--
1.7.2.5
^ permalink raw reply related [flat|nested] 56+ messages in thread* Re: [PATCH 3/3] dp83640: free packet queues on remove
2011-10-13 9:46 ` [PATCH 3/3] dp83640: free packet queues on remove Richard Cochran
@ 2011-10-19 4:48 ` Eric Dumazet
0 siblings, 0 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-10-19 4:48 UTC (permalink / raw)
To: Richard Cochran; +Cc: netdev, David Miller, Johannes Berg, stable
Le jeudi 13 octobre 2011 à 11:46 +0200, Richard Cochran a écrit :
> If the PHY should disappear (for example, on an USB Ethernet MAC), then
> the driver would leak any undelivered time stamp packets. This commit
> fixes the issue by calling the appropriate functions to free any packets
> left in the transmit and receive queues.
>
> The driver first appeared in v3.0.
>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@kernel.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
^ permalink raw reply [flat|nested] 56+ messages in thread
* [PATCH v2 0/3] net: time stamping fixes
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
` (5 preceding siblings ...)
2011-10-13 9:46 ` [PATCH 3/3] dp83640: free packet queues on remove Richard Cochran
@ 2011-10-21 10:49 ` Richard Cochran
2011-10-21 10:49 ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
` (2 subsequent siblings)
9 siblings, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-21 10:49 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg
[ Changes in v2: use atomic_inc_not_zero per Eric's suggestion. ]
The first patch fixes a bug in the time stamping code introduced in
v2.6.36 of the kernel.
The other two patches depend on the first patch and fix two bugs in a
PTP Hardware Clock driver. This driver was first introduced in Linux
version 3.0.
Richard Cochran (3):
net: hold sock reference while processing tx timestamps
dp83640: use proper function to free transmit time stamping packets
dp83640: free packet queues on remove
drivers/net/phy/dp83640.c | 11 +++++++++--
include/linux/phy.h | 2 +-
include/linux/skbuff.h | 7 ++++++-
net/core/timestamping.c | 12 ++++++++++--
4 files changed, 26 insertions(+), 6 deletions(-)
--
1.7.2.5
^ permalink raw reply [flat|nested] 56+ messages in thread* [PATCH v2 1/3] net: hold sock reference while processing tx timestamps
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
` (6 preceding siblings ...)
2011-10-21 10:49 ` [PATCH v2 0/3] net: time stamping fixes Richard Cochran
@ 2011-10-21 10:49 ` Richard Cochran
2011-10-21 11:31 ` Eric Dumazet
2011-10-21 11:44 ` Johannes Berg
2011-10-21 10:49 ` [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
2011-10-21 10:49 ` [PATCH v2 3/3] dp83640: free packet queues on remove Richard Cochran
9 siblings, 2 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-21 10:49 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable
The pair of functions,
* skb_clone_tx_timestamp()
* skb_complete_tx_timestamp()
were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.
As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these function are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.
These functions first appeared in v2.6.36.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@vger.kernel.org>
---
include/linux/phy.h | 2 +-
include/linux/skbuff.h | 7 ++++++-
net/core/timestamping.c | 12 ++++++++++--
3 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 54fc413..79f337c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -420,7 +420,7 @@ struct phy_driver {
/*
* Requests a Tx timestamp for 'skb'. The phy driver promises
- * to deliver it to the socket's error queue as soon as a
+ * to deliver it using skb_complete_tx_timestamp() as soon as a
* timestamp becomes available. One of the PTP_CLASS_ values
* is passed in 'type'.
*/
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8bd383c..0f96646 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2020,8 +2020,13 @@ static inline bool skb_defer_rx_timestamp(struct sk_buff *skb)
/**
* skb_complete_tx_timestamp() - deliver cloned skb with tx timestamps
*
+ * PHY drivers may accept clones of transmitted packets for
+ * timestamping via their phy_driver.txtstamp method. These drivers
+ * must call this function to return the skb back to the stack, with
+ * or without a timestamp.
+ *
* @skb: clone of the the original outgoing packet
- * @hwtstamps: hardware time stamps
+ * @hwtstamps: hardware time stamps, may be NULL if not available
*
*/
void skb_complete_tx_timestamp(struct sk_buff *skb,
diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 98a5264..82fb288 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -57,9 +57,13 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
case PTP_CLASS_V2_VLAN:
phydev = skb->dev->phydev;
if (likely(phydev->drv->txtstamp)) {
+ if (!atomic_inc_not_zero(&sk->sk_refcnt))
+ return;
clone = skb_clone(skb, GFP_ATOMIC);
- if (!clone)
+ if (!clone) {
+ sock_put(sk);
return;
+ }
clone->sk = sk;
phydev->drv->txtstamp(phydev, clone, type);
}
@@ -77,8 +81,11 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
struct sock_exterr_skb *serr;
int err;
- if (!hwtstamps)
+ if (!hwtstamps) {
+ sock_put(sk);
+ kfree_skb(skb);
return;
+ }
*skb_hwtstamps(skb) = *hwtstamps;
serr = SKB_EXT_ERR(skb);
@@ -87,6 +94,7 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
skb->sk = NULL;
err = sock_queue_err_skb(sk, skb);
+ sock_put(sk);
if (err)
kfree_skb(skb);
}
--
1.7.2.5
^ permalink raw reply related [flat|nested] 56+ messages in thread* Re: [PATCH v2 1/3] net: hold sock reference while processing tx timestamps
2011-10-21 10:49 ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-21 11:31 ` Eric Dumazet
2011-10-24 6:55 ` David Miller
2011-10-21 11:44 ` Johannes Berg
1 sibling, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-10-21 11:31 UTC (permalink / raw)
To: Richard Cochran; +Cc: netdev, David Miller, Johannes Berg, stable
Le vendredi 21 octobre 2011 à 12:49 +0200, Richard Cochran a écrit :
> The pair of functions,
>
> * skb_clone_tx_timestamp()
> * skb_complete_tx_timestamp()
>
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
>
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.
>
> These functions first appeared in v2.6.36.
>
> Reported-by: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@vger.kernel.org>
> ---
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH v2 1/3] net: hold sock reference while processing tx timestamps
2011-10-21 11:31 ` Eric Dumazet
@ 2011-10-24 6:55 ` David Miller
0 siblings, 0 replies; 56+ messages in thread
From: David Miller @ 2011-10-24 6:55 UTC (permalink / raw)
To: eric.dumazet; +Cc: richardcochran, netdev, johannes, stable
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 21 Oct 2011 13:31:50 +0200
> Le vendredi 21 octobre 2011 à 12:49 +0200, Richard Cochran a écrit :
>> The pair of functions,
>>
>> * skb_clone_tx_timestamp()
>> * skb_complete_tx_timestamp()
>>
>> were designed to allow timestamping in PHY devices. The first
>> function, called during the MAC driver's hard_xmit method, identifies
>> PTP protocol packets, clones them, and gives them to the PHY device
>> driver. The PHY driver may hold onto the packet and deliver it at a
>> later time using the second function, which adds the packet to the
>> socket's error queue.
>>
>> As pointed out by Johannes, nothing prevents the socket from
>> disappearing while the cloned packet is sitting in the PHY driver
>> awaiting a timestamp. This patch fixes the issue by taking a reference
>> on the socket for each such packet. In addition, the comments
>> regarding the usage of these function are expanded to highlight the
>> rule that PHY drivers must use skb_complete_tx_timestamp() to release
>> the packet, in order to release the socket reference, too.
>>
>> These functions first appeared in v2.6.36.
>>
>> Reported-by: Johannes Berg <johannes@sipsolutions.net>
>> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
>> Cc: <stable@vger.kernel.org>
>> ---
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH v2 1/3] net: hold sock reference while processing tx timestamps
2011-10-21 10:49 ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
2011-10-21 11:31 ` Eric Dumazet
@ 2011-10-21 11:44 ` Johannes Berg
1 sibling, 0 replies; 56+ messages in thread
From: Johannes Berg @ 2011-10-21 11:44 UTC (permalink / raw)
To: Richard Cochran; +Cc: netdev, David Miller, Eric Dumazet
On Fri, 2011-10-21 at 12:49 +0200, Richard Cochran wrote:
> The pair of functions,
>
> * skb_clone_tx_timestamp()
> * skb_complete_tx_timestamp()
>
> were designed to allow timestamping in PHY devices. The first
> function, called during the MAC driver's hard_xmit method, identifies
> PTP protocol packets, clones them, and gives them to the PHY device
> driver. The PHY driver may hold onto the packet and deliver it at a
> later time using the second function, which adds the packet to the
> socket's error queue.
>
> As pointed out by Johannes, nothing prevents the socket from
> disappearing while the cloned packet is sitting in the PHY driver
> awaiting a timestamp. This patch fixes the issue by taking a reference
> on the socket for each such packet. In addition, the comments
> regarding the usage of these function are expanded to highlight the
> rule that PHY drivers must use skb_complete_tx_timestamp() to release
> the packet, in order to release the socket reference, too.
It just now occurred to me that technically I think you could use a
destructor function that just points to something like this:
void tstamp_clone_free(struct sk_buff *skb)
{
sock_put(skb->sk);
}
but just forcing drivers to use the right API seems equally good to me.
> These functions first appeared in v2.6.36.
>
> Reported-by: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Cc: <stable@vger.kernel.org>
Thanks all for the discussions! :-)
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
johannes
^ permalink raw reply [flat|nested] 56+ messages in thread
* [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
` (7 preceding siblings ...)
2011-10-21 10:49 ` [PATCH v2 1/3] net: hold sock reference while processing tx timestamps Richard Cochran
@ 2011-10-21 10:49 ` Richard Cochran
2011-10-24 6:55 ` David Miller
2011-10-21 10:49 ` [PATCH v2 3/3] dp83640: free packet queues on remove Richard Cochran
9 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-21 10:49 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable
The previous commit enforces a new rule for handling the cloned packets
for transmit time stamping. These packets must not be freed using any other
function than skb_complete_tx_timestamp. This commit fixes the one and only
driver using this API.
The driver first appeared in v3.0.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@vger.kernel.org>
---
drivers/net/phy/dp83640.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index c588a16..13e5713 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1192,7 +1192,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
case HWTSTAMP_TX_ONESTEP_SYNC:
if (is_sync(skb, type)) {
- kfree_skb(skb);
+ skb_complete_tx_timestamp(skb, NULL);
return;
}
/* fall through */
@@ -1203,7 +1203,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
case HWTSTAMP_TX_OFF:
default:
- kfree_skb(skb);
+ skb_complete_tx_timestamp(skb, NULL);
break;
}
}
--
1.7.2.5
^ permalink raw reply related [flat|nested] 56+ messages in thread* Re: [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets
2011-10-21 10:49 ` [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
@ 2011-10-24 6:55 ` David Miller
2011-10-24 17:47 ` Richard Cochran
0 siblings, 1 reply; 56+ messages in thread
From: David Miller @ 2011-10-24 6:55 UTC (permalink / raw)
To: richardcochran; +Cc: netdev, eric.dumazet, johannes, stable
From: Richard Cochran <richardcochran@gmail.com>
Date: Fri, 21 Oct 2011 12:49:16 +0200
> The previous commit enforces a new rule for handling the cloned packets
> for transmit time stamping. These packets must not be freed using any other
> function than skb_complete_tx_timestamp. This commit fixes the one and only
> driver using this API.
>
> The driver first appeared in v3.0.
>
> Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: <stable@vger.kernel.org>
In the 'net' tree, which is where you should be targetting these dp83640
driver patches, the code looks nothing like what you're patching against.
Please respin patches #2 and #3 against current sources.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets
2011-10-24 6:55 ` David Miller
@ 2011-10-24 17:47 ` Richard Cochran
2011-10-24 23:16 ` David Miller
0 siblings, 1 reply; 56+ messages in thread
From: Richard Cochran @ 2011-10-24 17:47 UTC (permalink / raw)
To: David Miller; +Cc: netdev, eric.dumazet, johannes, stable
On Mon, Oct 24, 2011 at 02:55:55AM -0400, David Miller wrote:
> From: Richard Cochran <richardcochran@gmail.com>
> Date: Fri, 21 Oct 2011 12:49:16 +0200
>
> > The previous commit enforces a new rule for handling the cloned packets
> > for transmit time stamping. These packets must not be freed using any other
> > function than skb_complete_tx_timestamp. This commit fixes the one and only
> > driver using this API.
> >
> > The driver first appeared in v3.0.
> >
> > Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
> > Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: <stable@vger.kernel.org>
>
> In the 'net' tree, which is where you should be targetting these dp83640
> driver patches, the code looks nothing like what you're patching against.
>
> Please respin patches #2 and #3 against current sources.
Okay, but #2 will conflict with
dccaa9e0 dp83640: add time stamp insertion for sync messages
in net-next. Should I also submit a fix for that one?
Thanks,
Richard
^ permalink raw reply [flat|nested] 56+ messages in thread* Re: [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets
2011-10-24 17:47 ` Richard Cochran
@ 2011-10-24 23:16 ` David Miller
0 siblings, 0 replies; 56+ messages in thread
From: David Miller @ 2011-10-24 23:16 UTC (permalink / raw)
To: richardcochran; +Cc: netdev, eric.dumazet, johannes, stable
From: Richard Cochran <richardcochran@gmail.com>
Date: Mon, 24 Oct 2011 19:47:25 +0200
> On Mon, Oct 24, 2011 at 02:55:55AM -0400, David Miller wrote:
>> From: Richard Cochran <richardcochran@gmail.com>
>> Date: Fri, 21 Oct 2011 12:49:16 +0200
>>
>> > The previous commit enforces a new rule for handling the cloned packets
>> > for transmit time stamping. These packets must not be freed using any other
>> > function than skb_complete_tx_timestamp. This commit fixes the one and only
>> > driver using this API.
>> >
>> > The driver first appeared in v3.0.
>> >
>> > Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
>> > Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
>> > Cc: <stable@vger.kernel.org>
>>
>> In the 'net' tree, which is where you should be targetting these dp83640
>> driver patches, the code looks nothing like what you're patching against.
>>
>> Please respin patches #2 and #3 against current sources.
>
> Okay, but #2 will conflict with
>
> dccaa9e0 dp83640: add time stamp insertion for sync messages
>
> in net-next. Should I also submit a fix for that one?
Ok. I'll just apply those original 2 patches to net-next since Linus
release 3.1 today already.
Thanks.
^ permalink raw reply [flat|nested] 56+ messages in thread
* [PATCH v2 3/3] dp83640: free packet queues on remove
2011-10-12 18:36 ` [PATCH 1/1] net: hold sock reference while processing tx timestamps Richard Cochran
` (8 preceding siblings ...)
2011-10-21 10:49 ` [PATCH v2 2/3] dp83640: use proper function to free transmit time stamping packets Richard Cochran
@ 2011-10-21 10:49 ` Richard Cochran
9 siblings, 0 replies; 56+ messages in thread
From: Richard Cochran @ 2011-10-21 10:49 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Eric Dumazet, Johannes Berg, stable
If the PHY should disappear (for example, on an USB Ethernet MAC), then
the driver would leak any undelivered time stamp packets. This commit
fixes the issue by calling the appropriate functions to free any packets
left in the transmit and receive queues.
The driver first appeared in v3.0.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@vger.kernel.org>
---
drivers/net/phy/dp83640.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index 13e5713..9663e0b 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1007,6 +1007,7 @@ static void dp83640_remove(struct phy_device *phydev)
struct dp83640_clock *clock;
struct list_head *this, *next;
struct dp83640_private *tmp, *dp83640 = phydev->priv;
+ struct sk_buff *skb;
if (phydev->addr == BROADCAST_ADDR)
return;
@@ -1014,6 +1015,12 @@ static void dp83640_remove(struct phy_device *phydev)
enable_status_frames(phydev, false);
cancel_work_sync(&dp83640->ts_work);
+ while ((skb = skb_dequeue(&dp83640->rx_queue)) != NULL)
+ kfree_skb(skb);
+
+ while ((skb = skb_dequeue(&dp83640->tx_queue)) != NULL)
+ skb_complete_tx_timestamp(skb, NULL);
+
clock = dp83640_clock_get(dp83640->clock);
if (dp83640 == clock->chosen) {
--
1.7.2.5
^ permalink raw reply related [flat|nested] 56+ messages in thread