* [RFC] Could we avoid touching dst->refcount in some cases ?
@ 2008-11-24 8:57 Eric Dumazet
2008-11-24 9:42 ` Andi Kleen
0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2008-11-24 8:57 UTC (permalink / raw)
To: Linux Netdev List
[-- Attachment #1: Type: text/plain, Size: 484 bytes --]
tbench has hard time incrementing decrementing the route cache refcount
shared by all communications on localhost.
On real world, we also have this problem on RTP servers sending many UDP
frames to mediagateways, especially big ones handling thousand of streams.
Given that route entries are using RCU, we probably can avoid incrementing
their refcount in case of connected sockets ?
Here is a (untested and probably not working at all) patch on UDP part to
illustrate the idea :
[-- Attachment #2: avoid_touching_refcount.patch --]
[-- Type: text/plain, Size: 1271 bytes --]
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index da869ce..c385f13 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -553,6 +553,7 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
int ulen = len;
struct ipcm_cookie ipc;
struct rtable *rt = NULL;
+ int rt_release = 0;
int free = 0;
int connected = 0;
__be32 daddr, faddr, saddr;
@@ -656,8 +657,9 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
connected = 0;
}
+ rcu_read_lock();
if (connected)
- rt = (struct rtable*)sk_dst_check(sk, 0);
+ rt = (struct rtable *)__sk_dst_check(sk, 0);
if (rt == NULL) {
struct flowi fl = { .oif = ipc.oif,
@@ -681,11 +683,14 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
}
err = -EACCES;
+ rt_release = 1;
if ((rt->rt_flags & RTCF_BROADCAST) &&
!sock_flag(sk, SOCK_BROADCAST))
goto out;
- if (connected)
- sk_dst_set(sk, dst_clone(&rt->u.dst));
+ if (connected) {
+ sk_dst_set(sk, &rt->u.dst);
+ rt_release = 0;
+ }
}
if (msg->msg_flags&MSG_CONFIRM)
@@ -730,7 +735,9 @@ do_append_data:
release_sock(sk);
out:
- ip_rt_put(rt);
+ if (rt_release)
+ ip_rt_put(rt);
+ rcu_read_unlock();
if (free)
kfree(ipc.opt);
if (!err)
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [RFC] Could we avoid touching dst->refcount in some cases ?
2008-11-24 8:57 [RFC] Could we avoid touching dst->refcount in some cases ? Eric Dumazet
@ 2008-11-24 9:42 ` Andi Kleen
2008-11-24 10:14 ` Eric Dumazet
0 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2008-11-24 9:42 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Linux Netdev List
Eric Dumazet <dada1@cosmosbay.com> writes:
> tbench has hard time incrementing decrementing the route cache refcount
> shared by all communications on localhost.
iirc there was a patch some time ago to use per CPU loopback devices to
avoid this, but it was considered too much a benchmark hack.
As core counts increase it might stop being that though.
>
> On real world, we also have this problem on RTP servers sending many UDP
> frames to mediagateways, especially big ones handling thousand of streams.
>
> Given that route entries are using RCU, we probably can avoid incrementing
> their refcount in case of connected sockets ?
Normally they can be hold over sleeps or queuing of skbs too, and RCU
doesn't handle that. To make it handle that you would need to define a
custom RCU period designed for this case, but this would be probably
tricky and fragile: especially I'm not sure even if you had a "any
packet queued" RCU method it be guaranteed to always finish
because there is no fixed upper livetime of a packet.
The other issue is that on preemptible kernels you would need to
disable preemption all the time such a routing entry is hold, which
could be potentially quite long.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Could we avoid touching dst->refcount in some cases ?
2008-11-24 9:42 ` Andi Kleen
@ 2008-11-24 10:14 ` Eric Dumazet
2008-11-24 11:24 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() Eric Dumazet
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Eric Dumazet @ 2008-11-24 10:14 UTC (permalink / raw)
To: Andi Kleen; +Cc: Linux Netdev List
Andi Kleen a écrit :
> Eric Dumazet <dada1@cosmosbay.com> writes:
>
>> tbench has hard time incrementing decrementing the route cache refcount
>> shared by all communications on localhost.
>
> iirc there was a patch some time ago to use per CPU loopback devices to
> avoid this, but it was considered too much a benchmark hack.
> As core counts increase it might stop being that though.
Well, you probably mention Stephen patch to avoid dirtying other contended
cache lines (one napi structure per cpu)
Having multiple loopback dev would really be a hack I agree.
>
>> On real world, we also have this problem on RTP servers sending many UDP
>> frames to mediagateways, especially big ones handling thousand of streams.
>>
>> Given that route entries are using RCU, we probably can avoid incrementing
>> their refcount in case of connected sockets ?
>
> Normally they can be hold over sleeps or queuing of skbs too, and RCU
> doesn't handle that. To make it handle that you would need to define a
> custom RCU period designed for this case, but this would be probably
> tricky and fragile: especially I'm not sure even if you had a "any
> packet queued" RCU method it be guaranteed to always finish
> because there is no fixed upper livetime of a packet.
>
> The other issue is that on preemptible kernels you would need to
> disable preemption all the time such a routing entry is hold, which
> could be potentially quite long.
>
Well, in case of UDP, we call ip_push_pending_frames() and this one
does the increment of refcount (again). I was not considering
avoiding the refcount hold we do when queing a skb in transmit
queue, only during a short period of time. Oh well, ip_append_data()
might sleep, so this cannot work...
I agree avoiding one refcount increment/decrement is probably
not a huge gain, considering we *have* to do the increment,
but when many cpus are using UDP send/receive in //, this might
show a gain somehow.
So maybe we could make ip_append_data() (or its callers) a
litle bit smarter, avoiding increment/decrement if possible.
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data()
2008-11-24 10:14 ` Eric Dumazet
@ 2008-11-24 11:24 ` Eric Dumazet
2008-11-24 13:59 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_push_pending_frames() Eric Dumazet
2008-11-24 23:55 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() David Miller
2008-11-24 11:27 ` [RFC] Could we avoid touching dst->refcount in some cases ? Andi Kleen
2008-11-24 23:39 ` David Miller
2 siblings, 2 replies; 22+ messages in thread
From: Eric Dumazet @ 2008-11-24 11:24 UTC (permalink / raw)
To: David S. Miller
Cc: Andi Kleen, Linux Netdev List, Corey Minyard, Christian Bell
[-- Attachment #1: Type: text/plain, Size: 3199 bytes --]
Eric Dumazet a écrit :
> Andi Kleen a écrit :
>> Eric Dumazet <dada1@cosmosbay.com> writes:
>>
>>> tbench has hard time incrementing decrementing the route cache refcount
>>> shared by all communications on localhost.
>>
>> iirc there was a patch some time ago to use per CPU loopback devices
>> to avoid this, but it was considered too much a benchmark hack.
>> As core counts increase it might stop being that though.
>
> Well, you probably mention Stephen patch to avoid dirtying other contended
> cache lines (one napi structure per cpu)
>
> Having multiple loopback dev would really be a hack I agree.
>
>>
>>> On real world, we also have this problem on RTP servers sending many UDP
>>> frames to mediagateways, especially big ones handling thousand of
>>> streams.
>>>
>>> Given that route entries are using RCU, we probably can avoid
>>> incrementing
>>> their refcount in case of connected sockets ?
>>
>> Normally they can be hold over sleeps or queuing of skbs too, and RCU
>> doesn't handle that. To make it handle that you would need to define a
>> custom RCU period designed for this case, but this would be probably
>> tricky and fragile: especially I'm not sure even if you had a "any
>> packet queued" RCU method it be guaranteed to always finish because
>> there is no fixed upper livetime of a packet.
>>
>> The other issue is that on preemptible kernels you would need to
>> disable preemption all the time such a routing entry is hold, which
>> could be potentially quite long.
>>
>
> Well, in case of UDP, we call ip_push_pending_frames() and this one
> does the increment of refcount (again). I was not considering
> avoiding the refcount hold we do when queing a skb in transmit
> queue, only during a short period of time. Oh well, ip_append_data()
> might sleep, so this cannot work...
>
> I agree avoiding one refcount increment/decrement is probably
> not a huge gain, considering we *have* to do the increment,
> but when many cpus are using UDP send/receive in //, this might
> show a gain somehow.
>
> So maybe we could make ip_append_data() (or its callers) a
> litle bit smarter, avoiding increment/decrement if possible.
Here is a patch to remove one dst_hold()/dst_release() pair
in UDP/RAW transmit path.
[PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data()
We can reduce pressure on dst entry refcount that slowdown UDP transmit
path on SMP machines. This pressure is visible on RTP servers when
delivering content to mediagateways, especially big ones, handling
thousand of streams. Several cpus send UDP frames to the same
destination, hence use the same dst entry.
This patch makes ip_append_data() eventually steal the refcount its
callers had to take on the dst entry.
This doesnt avoid all refcounting, but still gives speedups on SMP,
on UDP/RAW transmit path
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
include/net/ip.h | 2 +-
net/ipv4/icmp.c | 8 ++++----
net/ipv4/ip_output.c | 11 ++++++++---
net/ipv4/raw.c | 2 +-
net/ipv4/udp.c | 2 +-
5 files changed, 15 insertions(+), 10 deletions(-)
[-- Attachment #2: ip_append_data.patch --]
[-- Type: text/plain, Size: 4256 bytes --]
diff --git a/include/net/ip.h b/include/net/ip.h
index bc026ec..ddef10c 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -110,7 +110,7 @@ extern int ip_append_data(struct sock *sk,
int odd, struct sk_buff *skb),
void *from, int len, int protolen,
struct ipcm_cookie *ipc,
- struct rtable *rt,
+ struct rtable **rt,
unsigned int flags);
extern int ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb);
extern ssize_t ip_append_page(struct sock *sk, struct page *page,
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 21e497e..7b88be9 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -321,12 +321,12 @@ static int icmp_glue_bits(void *from, char *to, int offset, int len, int odd,
}
static void icmp_push_reply(struct icmp_bxm *icmp_param,
- struct ipcm_cookie *ipc, struct rtable *rt)
+ struct ipcm_cookie *ipc, struct rtable **rt)
{
struct sock *sk;
struct sk_buff *skb;
- sk = icmp_sk(dev_net(rt->u.dst.dev));
+ sk = icmp_sk(dev_net((*rt)->u.dst.dev));
if (ip_append_data(sk, icmp_glue_bits, icmp_param,
icmp_param->data_len+icmp_param->head_len,
icmp_param->head_len,
@@ -392,7 +392,7 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb)
}
if (icmpv4_xrlim_allow(net, rt, icmp_param->data.icmph.type,
icmp_param->data.icmph.code))
- icmp_push_reply(icmp_param, &ipc, rt);
+ icmp_push_reply(icmp_param, &ipc, &rt);
ip_rt_put(rt);
out_unlock:
icmp_xmit_unlock(sk);
@@ -635,7 +635,7 @@ route_done:
icmp_param.data_len = room;
icmp_param.head_len = sizeof(struct icmphdr);
- icmp_push_reply(&icmp_param, &ipc, rt);
+ icmp_push_reply(&icmp_param, &ipc, &rt);
ende:
ip_rt_put(rt);
out_unlock:
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 46d7be2..da9b819 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -778,7 +778,7 @@ int ip_append_data(struct sock *sk,
int getfrag(void *from, char *to, int offset, int len,
int odd, struct sk_buff *skb),
void *from, int length, int transhdrlen,
- struct ipcm_cookie *ipc, struct rtable *rt,
+ struct ipcm_cookie *ipc, struct rtable **rtp,
unsigned int flags)
{
struct inet_sock *inet = inet_sk(sk);
@@ -793,6 +793,7 @@ int ip_append_data(struct sock *sk,
int offset = 0;
unsigned int maxfraglen, fragheaderlen;
int csummode = CHECKSUM_NONE;
+ struct rtable *rt;
if (flags&MSG_PROBE)
return 0;
@@ -812,7 +813,11 @@ int ip_append_data(struct sock *sk,
inet->cork.flags |= IPCORK_OPT;
inet->cork.addr = ipc->addr;
}
- dst_hold(&rt->u.dst);
+ rt = *rtp;
+ /*
+ * We steal reference to this route, caller should not release it
+ */
+ *rtp = NULL;
inet->cork.fragsize = mtu = inet->pmtudisc == IP_PMTUDISC_PROBE ?
rt->u.dst.dev->mtu :
dst_mtu(rt->u.dst.path);
@@ -1391,7 +1396,7 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *ar
sk->sk_protocol = ip_hdr(skb)->protocol;
sk->sk_bound_dev_if = arg->bound_dev_if;
ip_append_data(sk, ip_reply_glue_bits, arg->iov->iov_base, len, 0,
- &ipc, rt, MSG_DONTWAIT);
+ &ipc, &rt, MSG_DONTWAIT);
if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) {
if (arg->csumoffset >= 0)
*((__sum16 *)skb_transport_header(skb) +
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 998fcff..dff8bc4 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -572,7 +572,7 @@ back_from_confirm:
ipc.addr = rt->rt_dst;
lock_sock(sk);
err = ip_append_data(sk, ip_generic_getfrag, msg->msg_iov, len, 0,
- &ipc, rt, msg->msg_flags);
+ &ipc, &rt, msg->msg_flags);
if (err)
ip_flush_pending_frames(sk);
else if (!(msg->msg_flags & MSG_MORE))
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index da869ce..5491144 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -719,7 +719,7 @@ do_append_data:
up->len += ulen;
getfrag = is_udplite ? udplite_getfrag : ip_generic_getfrag;
err = ip_append_data(sk, getfrag, msg->msg_iov, ulen,
- sizeof(struct udphdr), &ipc, rt,
+ sizeof(struct udphdr), &ipc, &rt,
corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags);
if (err)
udp_flush_pending_frames(sk);
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [RFC] Could we avoid touching dst->refcount in some cases ?
2008-11-24 10:14 ` Eric Dumazet
2008-11-24 11:24 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() Eric Dumazet
@ 2008-11-24 11:27 ` Andi Kleen
2008-11-24 23:36 ` David Miller
2008-11-24 23:39 ` David Miller
2 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2008-11-24 11:27 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Andi Kleen, Linux Netdev List
On Mon, Nov 24, 2008 at 11:14:29AM +0100, Eric Dumazet wrote:
> Andi Kleen a écrit :
> >Eric Dumazet <dada1@cosmosbay.com> writes:
> >
> >>tbench has hard time incrementing decrementing the route cache refcount
> >>shared by all communications on localhost.
> >
> >iirc there was a patch some time ago to use per CPU loopback devices to
> >avoid this, but it was considered too much a benchmark hack.
> >As core counts increase it might stop being that though.
>
> Well, you probably mention Stephen patch to avoid dirtying other contended
> cache lines (one napi structure per cpu)
No that patch wasn't from Stephen. iirc it was from someone at SGI.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_push_pending_frames()
2008-11-24 11:24 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() Eric Dumazet
@ 2008-11-24 13:59 ` Eric Dumazet
2008-11-25 0:07 ` David Miller
2008-11-24 23:55 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() David Miller
1 sibling, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2008-11-24 13:59 UTC (permalink / raw)
To: David S. Miller
Cc: Andi Kleen, Linux Netdev List, Corey Minyard, Christian Bell
[-- Attachment #1: Type: text/plain, Size: 662 bytes --]
We can reduce pressure on dst entry refcount that slowdown UDP transmit
path on SMP machines. This pressure is visible on RTP servers when
delivering content to mediagateways, especially big ones, handling
thousand of streams. Several cpus send UDP frames to the same
destination, hence use the same dst entry.
This patch makes ip_push_pending_frames() steal the refcount its
callers had to take when filling inet->cork.dst.
This doesnt avoid all refcounting, but still gives speedups on SMP,
on UDP/RAW transmit path.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
net/ipv4/ip_output.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletion(-)
[-- Attachment #2: ip_push_pending_frames.patch --]
[-- Type: text/plain, Size: 541 bytes --]
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 46d7be2..89bc1b9 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1279,7 +1279,12 @@ int ip_push_pending_frames(struct sock *sk)
skb->priority = sk->sk_priority;
skb->mark = sk->sk_mark;
- skb->dst = dst_clone(&rt->u.dst);
+ /*
+ * Steal rt from cork.dst to avoid a pair of atomic_inc/atomic_dec
+ * on dst refcount
+ */
+ inet->cork.dst = NULL;
+ skb->dst = &rt->u.dst;
if (iph->protocol == IPPROTO_ICMP)
icmp_out_count(net, ((struct icmphdr *)
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [RFC] Could we avoid touching dst->refcount in some cases ?
2008-11-24 11:27 ` [RFC] Could we avoid touching dst->refcount in some cases ? Andi Kleen
@ 2008-11-24 23:36 ` David Miller
0 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2008-11-24 23:36 UTC (permalink / raw)
To: andi; +Cc: dada1, netdev
From: Andi Kleen <andi@firstfloor.org>
Date: Mon, 24 Nov 2008 12:27:09 +0100
> On Mon, Nov 24, 2008 at 11:14:29AM +0100, Eric Dumazet wrote:
> > Andi Kleen a écrit :
> > >Eric Dumazet <dada1@cosmosbay.com> writes:
> > >
> > >>tbench has hard time incrementing decrementing the route cache refcount
> > >>shared by all communications on localhost.
> > >
> > >iirc there was a patch some time ago to use per CPU loopback devices to
> > >avoid this, but it was considered too much a benchmark hack.
> > >As core counts increase it might stop being that though.
> >
> > Well, you probably mention Stephen patch to avoid dirtying other contended
> > cache lines (one napi structure per cpu)
>
> No that patch wasn't from Stephen. iirc it was from someone at SGI.
That's how I remember it too.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Could we avoid touching dst->refcount in some cases ?
2008-11-24 10:14 ` Eric Dumazet
2008-11-24 11:24 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() Eric Dumazet
2008-11-24 11:27 ` [RFC] Could we avoid touching dst->refcount in some cases ? Andi Kleen
@ 2008-11-24 23:39 ` David Miller
2008-11-25 4:43 ` Eric Dumazet
2 siblings, 1 reply; 22+ messages in thread
From: David Miller @ 2008-11-24 23:39 UTC (permalink / raw)
To: dada1; +Cc: andi, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Mon, 24 Nov 2008 11:14:29 +0100
> So maybe we could make ip_append_data() (or its callers) a
> litle bit smarter, avoiding increment/decrement if possible.
These ideas are interesting but hard to make work.
I think the receive path has more chance of getting gains
from this, to be honest.
One third (effectively) of TCP stream packets are ACKs and
freed immediately. This means that the looked up route does
not escape the packet receive path. So we could elide the
counter increment in that case.
In fact, once we queue even TCP data, there is no need for
that cached skb->dst route any longer.
So pretty much all TCP packets could avoid the dst refcounting
on receive.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data()
2008-11-24 11:24 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() Eric Dumazet
2008-11-24 13:59 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_push_pending_frames() Eric Dumazet
@ 2008-11-24 23:55 ` David Miller
2008-11-25 2:22 ` Andi Kleen
1 sibling, 1 reply; 22+ messages in thread
From: David Miller @ 2008-11-24 23:55 UTC (permalink / raw)
To: dada1; +Cc: andi, netdev, minyard, christian
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Mon, 24 Nov 2008 12:24:53 +0100
> [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data()
>
> We can reduce pressure on dst entry refcount that slowdown UDP transmit
> path on SMP machines. This pressure is visible on RTP servers when
> delivering content to mediagateways, especially big ones, handling
> thousand of streams. Several cpus send UDP frames to the same
> destination, hence use the same dst entry.
>
> This patch makes ip_append_data() eventually steal the refcount its
> callers had to take on the dst entry.
>
> This doesnt avoid all refcounting, but still gives speedups on SMP,
> on UDP/RAW transmit path
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Ok, this looks fine to me, thanks Eric. Although as you know
I'm not a big fan of pass by reference arguments :-)
Thinking more I believe we can do similar tricks for all TCP
transmit traffic.
Packets bound to sockets never outlive those sockets (and thus
their cached routes) unless we skb_orphan().
The only not covered case is where the socket cached route
is reset or changed. We could defer the dst put until the
transmit queue reaches a certain point, kind of like a retransmit
queue RCU :-)
Just some ideas...
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_push_pending_frames()
2008-11-24 13:59 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_push_pending_frames() Eric Dumazet
@ 2008-11-25 0:07 ` David Miller
0 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2008-11-25 0:07 UTC (permalink / raw)
To: dada1; +Cc: andi, netdev, minyard, christian
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Mon, 24 Nov 2008 14:59:51 +0100
> We can reduce pressure on dst entry refcount that slowdown UDP transmit
> path on SMP machines. This pressure is visible on RTP servers when
> delivering content to mediagateways, especially big ones, handling
> thousand of streams. Several cpus send UDP frames to the same
> destination, hence use the same dst entry.
>
> This patch makes ip_push_pending_frames() steal the refcount its
> callers had to take when filling inet->cork.dst.
>
> This doesnt avoid all refcounting, but still gives speedups on SMP,
> on UDP/RAW transmit path.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Applied.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data()
2008-11-24 23:55 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() David Miller
@ 2008-11-25 2:22 ` Andi Kleen
0 siblings, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2008-11-25 2:22 UTC (permalink / raw)
To: David Miller; +Cc: dada1, andi, netdev, minyard, christian
> Thinking more I believe we can do similar tricks for all TCP
> transmit traffic.
Sounds reasonable.
>
> Packets bound to sockets never outlive those sockets (and thus
> their cached routes) unless we skb_orphan().
>
> The only not covered case is where the socket cached route
> is reset or changed. We could defer the dst put until the
> transmit queue reaches a certain point, kind of like a retransmit
> queue RCU :-)
>
> Just some ideas...
netfilter makes it somewhat tricky, for compatibility you would
need to reclone the route on the fly.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Could we avoid touching dst->refcount in some cases ?
2008-11-24 23:39 ` David Miller
@ 2008-11-25 4:43 ` Eric Dumazet
2008-11-25 5:00 ` David Miller
0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2008-11-25 4:43 UTC (permalink / raw)
To: David Miller; +Cc: andi, netdev
David Miller a écrit :
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Mon, 24 Nov 2008 11:14:29 +0100
>
>> So maybe we could make ip_append_data() (or its callers) a
>> litle bit smarter, avoiding increment/decrement if possible.
>
> These ideas are interesting but hard to make work.
>
> I think the receive path has more chance of getting gains
> from this, to be honest.
>
> One third (effectively) of TCP stream packets are ACKs and
> freed immediately. This means that the looked up route does
> not escape the packet receive path. So we could elide the
> counter increment in that case.
>
> In fact, once we queue even TCP data, there is no need for
> that cached skb->dst route any longer.
>
> So pretty much all TCP packets could avoid the dst refcounting
> on receive.
Very interesting. So we could try the following path :
1) First try to release dst when queueing skb to various queues
(UDP, TCP, ...) while its hot. Reader wont have to release it
while its cold.
2) Check if we can handle the input path without any refcount
dirtying ?
To make the transition easy, we could use a bit on skb to mark
dst being not refcounted (ie no dst_release() should be done on it)
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Could we avoid touching dst->refcount in some cases ?
2008-11-25 4:43 ` Eric Dumazet
@ 2008-11-25 5:00 ` David Miller
2008-11-26 0:00 ` [PATCH] net: release skb->dst in sock_queue_rcv_skb() Eric Dumazet
0 siblings, 1 reply; 22+ messages in thread
From: David Miller @ 2008-11-25 5:00 UTC (permalink / raw)
To: dada1; +Cc: andi, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 25 Nov 2008 05:43:32 +0100
> Very interesting. So we could try the following path :
>
> 1) First try to release dst when queueing skb to various queues
> (UDP, TCP, ...) while its hot. Reader wont have to release it
> while its cold.
>
> 2) Check if we can handle the input path without any refcount
> dirtying ?
>
> To make the transition easy, we could use a bit on skb to mark
> dst being not refcounted (ie no dst_release() should be done on it)
It is possible to make this self-auditing. For example, by
using the usual trick where we encode a pointer in an
unsigned long and use the low bits for states.
In the first step, make each skb->dst access go through some
accessor inline function.
Next, audit the paths where skb->dst's can "escape" the pure
packet input path. Add annotations, in the form of a
inline function call, for these locations.
Also, audit the other locations where we enqueue into a socket
queue and no longer care about the skb->dst, and annotate
those with another inline function.
Finally, the initial skb->dst assignment in the input path doesn't
grab a reference, but sets the low bit ("refcount pending") in
the encoded skb->dst pointer. The skb->dst "escape" inline
function performs the deferred refcount grab. And kfree_skb()
is taught to not dst_release() on skb->dst's which have the
low bit set.
Anyways, something like that.
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] net: release skb->dst in sock_queue_rcv_skb()
2008-11-25 5:00 ` David Miller
@ 2008-11-26 0:00 ` Eric Dumazet
2008-11-26 0:23 ` David Miller
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Eric Dumazet @ 2008-11-26 0:00 UTC (permalink / raw)
To: David Miller; +Cc: andi, netdev
[-- Attachment #1: Type: text/plain, Size: 2867 bytes --]
David Miller a écrit :
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Tue, 25 Nov 2008 05:43:32 +0100
>
>> Very interesting. So we could try the following path :
>>
>> 1) First try to release dst when queueing skb to various queues
>> (UDP, TCP, ...) while its hot. Reader wont have to release it
>> while its cold.
>>
>> 2) Check if we can handle the input path without any refcount
>> dirtying ?
>>
>> To make the transition easy, we could use a bit on skb to mark
>> dst being not refcounted (ie no dst_release() should be done on it)
>
> It is possible to make this self-auditing. For example, by
> using the usual trick where we encode a pointer in an
> unsigned long and use the low bits for states.
>
> In the first step, make each skb->dst access go through some
> accessor inline function.
>
> Next, audit the paths where skb->dst's can "escape" the pure
> packet input path. Add annotations, in the form of a
> inline function call, for these locations.
>
> Also, audit the other locations where we enqueue into a socket
> queue and no longer care about the skb->dst, and annotate
> those with another inline function.
>
> Finally, the initial skb->dst assignment in the input path doesn't
> grab a reference, but sets the low bit ("refcount pending") in
> the encoded skb->dst pointer. The skb->dst "escape" inline
> function performs the deferred refcount grab. And kfree_skb()
> is taught to not dst_release() on skb->dst's which have the
> low bit set.
>
> Anyways, something like that.
I looked this stuff and found it would be difficult to not grab a
reference (and more important not writing to dst) in input path.
ip_rcv_finish() calls ip_route_input()
and ip_route_input() calls dst_use(&rth->u.dst, jiffies);
static inline void dst_use(struct dst_entry *dst, unsigned long time)
{
dst_hold(dst);
dst->__use++;
dst->lastuse = time;
}
Even if we avoid the refcount increment, I guess we need the lastuse
assignement in order to keep dst in cache. Not sure about the role of
__use field. Hum... for a tcp connection, dst refcount should already
be pinned by a sk->sk_dst_cache. Maybe test refcount value, and if this
value is > 1, dont take a reference. (given rcu_read_lock() is done
before calling ip_rcv_finish())
In the meantime, what do you think of the following patch ?
[PATCH] net: release skb->dst in sock_queue_rcv_skb()
When queuing a skb to sk->sk_receive_queue, we can release its dst, not
anymore needed.
Since current cpu did the dst_hold(), refcount is probably still hot
int this cpu caches.
This avoids readers to access the original dst to decrement its refcount,
possibly a long time after packet reception. This should speedup UDP
and RAW receive path.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: sock_queue_rcv_skb.patch --]
[-- Type: text/plain, Size: 534 bytes --]
diff --git a/net/core/sock.c b/net/core/sock.c
index a4e840e..b287645 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -289,7 +289,11 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
skb->dev = NULL;
skb_set_owner_r(skb, sk);
-
+ /*
+ * release dst right now while its hot
+ */
+ dst_release(skb->dst);
+ skb->dst = NULL;
/* Cache the SKB length before we tack it onto the receive
* queue. Once it is added it no longer belongs to us and
* may be freed by other threads of control pulling packets
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()
2008-11-26 0:00 ` [PATCH] net: release skb->dst in sock_queue_rcv_skb() Eric Dumazet
@ 2008-11-26 0:23 ` David Miller
2008-11-26 2:04 ` David Miller
2008-12-17 11:25 ` net-next: broken IP_PKTINFO [was Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()] Mark McLoughlin
2 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2008-11-26 0:23 UTC (permalink / raw)
To: dada1; +Cc: andi, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 26 Nov 2008 01:00:30 +0100
> Hum... for a tcp connection, dst refcount should already be pinned
> by a sk->sk_dst_cache. Maybe test refcount value, and if this value
> is > 1, dont take a reference. (given rcu_read_lock() is done before
> calling ip_rcv_finish())
Input route is different from output route. sk->sk_dst_cache holds
the output route. So I can't see how this can help here.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()
2008-11-26 0:00 ` [PATCH] net: release skb->dst in sock_queue_rcv_skb() Eric Dumazet
2008-11-26 0:23 ` David Miller
@ 2008-11-26 2:04 ` David Miller
2008-11-26 7:39 ` Eric Dumazet
2008-12-17 11:25 ` net-next: broken IP_PKTINFO [was Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()] Mark McLoughlin
2 siblings, 1 reply; 22+ messages in thread
From: David Miller @ 2008-11-26 2:04 UTC (permalink / raw)
To: dada1; +Cc: andi, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 26 Nov 2008 01:00:30 +0100
> In the meantime, what do you think of the following patch ?
>
> [PATCH] net: release skb->dst in sock_queue_rcv_skb()
>
> When queuing a skb to sk->sk_receive_queue, we can release its dst, not
> anymore needed.
> Since current cpu did the dst_hold(), refcount is probably still hot
> int this cpu caches.
>
> This avoids readers to access the original dst to decrement its refcount,
> possibly a long time after packet reception. This should speedup UDP
> and RAW receive path.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
I guess the idea is that if we release quickly we'll not have
to reget the cacheline in owned state.
I wonder if this might actually slightly hurt loads like tbench where
we are banging on the refcnt constantly on every cpu anyways.
Can you do a quick check?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()
2008-11-26 2:04 ` David Miller
@ 2008-11-26 7:39 ` Eric Dumazet
2008-11-26 9:08 ` David Miller
0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2008-11-26 7:39 UTC (permalink / raw)
To: David Miller; +Cc: andi, netdev
David Miller a écrit :
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Wed, 26 Nov 2008 01:00:30 +0100
>
>> In the meantime, what do you think of the following patch ?
>>
>> [PATCH] net: release skb->dst in sock_queue_rcv_skb()
>>
>> When queuing a skb to sk->sk_receive_queue, we can release its dst, not
>> anymore needed.
>> Since current cpu did the dst_hold(), refcount is probably still hot
>> int this cpu caches.
>>
>> This avoids readers to access the original dst to decrement its refcount,
>> possibly a long time after packet reception. This should speedup UDP
>> and RAW receive path.
>>
>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>
> I guess the idea is that if we release quickly we'll not have
> to reget the cacheline in owned state.
Yes, this is the idea.
>
> I wonder if this might actually slightly hurt loads like tbench where
> we are banging on the refcnt constantly on every cpu anyways.
Yes, the only way to reduce the load on this case is not moving the
increments/decrements : It could help on some machines, and hurt on others.
In the long term, only reducing the number of dirtying can help the average.
>
> Can you do a quick check?
>
>
Sure I can do a check :)
No impact at all on tbench, unless this bench can run in UDP mode :)
CPU: Core 2, speed 3000.1 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
mask of 0x00 (Unhalted core cycles) count 100000
samples cum. samples % cum. % symbol name
599983 599983 11.2948 11.2948 copy_from_user
549349 1149332 10.3416 21.6364 ipt_do_table
245697 1395029 4.6253 26.2617 copy_to_user
221663 1616692 4.1729 30.4346 schedule
144888 1761580 2.7275 33.1621 tcp_sendmsg
136967 1898547 2.5784 35.7406 tcp_ack
115513 2014060 2.1746 37.9151 tcp_transmit_skb
99517 2113577 1.8734 39.7885 sysenter_past_esp
99438 2213015 1.8719 41.6605 ip_queue_xmit
98171 2311186 1.8481 43.5086 tcp_recvmsg
80763 2391949 1.5204 45.0290 __switch_to
79621 2471570 1.4989 46.5278 tcp_v4_rcv
77210 2548780 1.4535 47.9813 dst_release
69387 2618167 1.3062 49.2876 tcp_rcv_established
64709 2682876 1.2182 50.5057 __tcp_push_pending_frames
55223 2738099 1.0396 51.5453 lock_sock_nested
53754 2791853 1.0119 52.5572 sys_socketcall
50092 2841945 0.9430 53.5002 netif_receive_skb
49499 2891444 0.9318 54.4321 release_sock
47796 2939240 0.8998 55.3318 __inet_lookup_established
45162 2984402 0.8502 56.1820 update_curr
44895 3029297 0.8452 57.0272 ip_rcv
42945 3072242 0.8084 57.8356 dev_queue_xmit
42892 3115134 0.8075 58.6431 tcp_event_data_recv
42768 3157902 0.8051 59.4482 local_bh_enable
41555 3199457 0.7823 60.2305 netif_rx
38613 3238070 0.7269 60.9574 __alloc_skb
38016 3276086 0.7157 61.6730 ip_finish_output
36867 3312953 0.6940 62.3671 tcp_current_mss
36759 3349712 0.6920 63.0591 skb_release_data
35560 3385272 0.6694 63.7285 local_bh_enable_ip
34100 3419372 0.6419 64.3704 sock_recvmsg
33829 3453201 0.6368 65.0073 __kfree_skb
32949 3486150 0.6203 65.6275 sched_clock_cpu
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()
2008-11-26 7:39 ` Eric Dumazet
@ 2008-11-26 9:08 ` David Miller
0 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2008-11-26 9:08 UTC (permalink / raw)
To: dada1; +Cc: andi, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 26 Nov 2008 08:39:30 +0100
> David Miller a écrit :
> > From: Eric Dumazet <dada1@cosmosbay.com>
> > Date: Wed, 26 Nov 2008 01:00:30 +0100
> >
> > Can you do a quick check?
> >
>
> Sure I can do a check :)
>
> No impact at all on tbench, unless this bench can run in UDP mode :)
Durrr, of course!
Patch applied, thanks Eric.
^ permalink raw reply [flat|nested] 22+ messages in thread
* net-next: broken IP_PKTINFO [was Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()]
2008-11-26 0:00 ` [PATCH] net: release skb->dst in sock_queue_rcv_skb() Eric Dumazet
2008-11-26 0:23 ` David Miller
2008-11-26 2:04 ` David Miller
@ 2008-12-17 11:25 ` Mark McLoughlin
2008-12-18 3:34 ` net-next: broken IP_PKTINFO David Miller
2 siblings, 1 reply; 22+ messages in thread
From: Mark McLoughlin @ 2008-12-17 11:25 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, andi, netdev
Hi,
On Wed, 2008-11-26 at 01:00 +0100, Eric Dumazet wrote:
> [PATCH] net: release skb->dst in sock_queue_rcv_skb()
>
> When queuing a skb to sk->sk_receive_queue, we can release its dst, not
> anymore needed.
> Since current cpu did the dst_hold(), refcount is probably still hot
> int this cpu caches.
>
> This avoids readers to access the original dst to decrement its refcount,
> possibly a long time after packet reception. This should speedup UDP
> and RAW receive path.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> plain text document attachment (sock_queue_rcv_skb.patch)
> diff --git a/net/core/sock.c b/net/core/sock.c
> index a4e840e..b287645 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -289,7 +289,11 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>
> skb->dev = NULL;
> skb_set_owner_r(skb, sk);
> -
> + /*
> + * release dst right now while its hot
> + */
> + dst_release(skb->dst);
> + skb->dst = NULL;
IP_PKTINFO cmsg data is one post-queueing user:
static void ip_cmsg_recv_pktinfo(struct msghdr *msg, struct sk_buff *skb)
{
struct in_pktinfo info;
struct rtable *rt = skb->rtable;
info.ipi_addr.s_addr = ip_hdr(skb)->daddr;
if (rt) {
info.ipi_ifindex = rt->rt_iif;
info.ipi_spec_dst.s_addr = rt->rt_spec_dst;
} else {
info.ipi_ifindex = 0;
info.ipi_spec_dst.s_addr = 0;
}
put_cmsg(msg, SOL_IP, IP_PKTINFO, sizeof(info), &info);
}
(i.e. skb->rtable is NULL at this point)
I'm seeing dnsmasq not working on net-next-2.6 because of this and
reverting commit 7035560 makes things work as expected again.
Cheers,
Mark.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: net-next: broken IP_PKTINFO
2008-12-17 11:25 ` net-next: broken IP_PKTINFO [was Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()] Mark McLoughlin
@ 2008-12-18 3:34 ` David Miller
2008-12-18 5:59 ` Eric Dumazet
0 siblings, 1 reply; 22+ messages in thread
From: David Miller @ 2008-12-18 3:34 UTC (permalink / raw)
To: markmc; +Cc: dada1, andi, netdev
From: Mark McLoughlin <markmc@redhat.com>
Date: Wed, 17 Dec 2008 11:25:01 +0000
> On Wed, 2008-11-26 at 01:00 +0100, Eric Dumazet wrote:
>
> > [PATCH] net: release skb->dst in sock_queue_rcv_skb()
...
> IP_PKTINFO cmsg data is one post-queueing user:
Eric, we'll need to rever this change I think.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: net-next: broken IP_PKTINFO
2008-12-18 3:34 ` net-next: broken IP_PKTINFO David Miller
@ 2008-12-18 5:59 ` Eric Dumazet
2008-12-18 6:17 ` David Miller
0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2008-12-18 5:59 UTC (permalink / raw)
To: David Miller; +Cc: markmc, andi, netdev
David Miller a écrit :
> From: Mark McLoughlin <markmc@redhat.com>
> Date: Wed, 17 Dec 2008 11:25:01 +0000
>
>> On Wed, 2008-11-26 at 01:00 +0100, Eric Dumazet wrote:
>>
>>> [PATCH] net: release skb->dst in sock_queue_rcv_skb()
> ...
>> IP_PKTINFO cmsg data is one post-queueing user:
>
> Eric, we'll need to rever this change I think.
I am afraid we have to revert it, yes.
META_COLLECTOR(int_rtiif) & META_COLLECTOR(int_rtclassid)
in net/sched/em_meta.c also need rtable, I am not sure how it is used.
About ip_cmsg_recv_pktinfo() :
iif can be found in skb->iif instead of rt->rt_iif, but I am not sure
about rt_spec_dst : Shouldnt we find it in ip_hdr(skb)->saddr ?
Do you know if we really need rtable in ip_cmsg_recv_pktinfo() ?
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 43c0585..e854893 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -64,8 +64,8 @@ static void ip_cmsg_recv_pktinfo(struct msghdr *msg, struct sk_buff *skb)
info.ipi_ifindex = rt->rt_iif;
info.ipi_spec_dst.s_addr = rt->rt_spec_dst;
} else {
- info.ipi_ifindex = 0;
- info.ipi_spec_dst.s_addr = 0;
+ info.ipi_ifindex = skb->iif;
+ info.ipi_spec_dst.s_addr = ip_hdr(skb)->saddr;
}
put_cmsg(msg, SOL_IP, IP_PKTINFO, sizeof(info), &info);
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: net-next: broken IP_PKTINFO
2008-12-18 5:59 ` Eric Dumazet
@ 2008-12-18 6:17 ` David Miller
0 siblings, 0 replies; 22+ messages in thread
From: David Miller @ 2008-12-18 6:17 UTC (permalink / raw)
To: dada1; +Cc: markmc, andi, netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Thu, 18 Dec 2008 06:59:26 +0100
> David Miller a écrit :
> > Eric, we'll need to rever this change I think.
>
> I am afraid we have to revert it, yes.
Done.
> About ip_cmsg_recv_pktinfo() :
>
> iif can be found in skb->iif instead of rt->rt_iif, but I am not sure
> about rt_spec_dst : Shouldnt we find it in ip_hdr(skb)->saddr ?
>
> Do you know if we really need rtable in ip_cmsg_recv_pktinfo() ?
I think we might, as these are routing attributes, which not
necessarily match up with the values found in the SKB.
In fact, I do remember that "specific destination" has a very
exact definition in the routing RFCs and it has to do with the
matched route.
Conversely, in what situations (other than with the reverted patch
applied, hehe) would the route not be attached here?
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2008-12-18 6:17 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-24 8:57 [RFC] Could we avoid touching dst->refcount in some cases ? Eric Dumazet
2008-11-24 9:42 ` Andi Kleen
2008-11-24 10:14 ` Eric Dumazet
2008-11-24 11:24 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() Eric Dumazet
2008-11-24 13:59 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_push_pending_frames() Eric Dumazet
2008-11-25 0:07 ` David Miller
2008-11-24 23:55 ` [PATCH] net: avoid a pair of dst_hold()/dst_release() in ip_append_data() David Miller
2008-11-25 2:22 ` Andi Kleen
2008-11-24 11:27 ` [RFC] Could we avoid touching dst->refcount in some cases ? Andi Kleen
2008-11-24 23:36 ` David Miller
2008-11-24 23:39 ` David Miller
2008-11-25 4:43 ` Eric Dumazet
2008-11-25 5:00 ` David Miller
2008-11-26 0:00 ` [PATCH] net: release skb->dst in sock_queue_rcv_skb() Eric Dumazet
2008-11-26 0:23 ` David Miller
2008-11-26 2:04 ` David Miller
2008-11-26 7:39 ` Eric Dumazet
2008-11-26 9:08 ` David Miller
2008-12-17 11:25 ` net-next: broken IP_PKTINFO [was Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()] Mark McLoughlin
2008-12-18 3:34 ` net-next: broken IP_PKTINFO David Miller
2008-12-18 5:59 ` Eric Dumazet
2008-12-18 6:17 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).