From: Eric Dumazet <dada1@cosmosbay.com>
To: Corey Minyard <minyard@acm.org>, David Miller <davem@davemloft.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>,
paulmck@linux.vnet.ibm.com, benny+usenet@amorsen.dk,
netdev@vger.kernel.org,
Christoph Lameter <cl@linux-foundation.org>,
a.p.zijlstra@chello.nl, johnpol@2ka.mipt.ru,
Christian Bell <christian@myri.com>,
Pavel Emelyanov <xemul@openvz.org>
Subject: Re: [PATCH] udp: Introduce special NULL pointers for hlist termination
Date: Fri, 31 Oct 2008 15:37:50 +0100
Message-ID: <490B183E.3010707@cosmosbay.com>
In-Reply-To: <4909E0A4.7060009@acm.org>
[-- Attachment #1: Type: text/plain, Size: 1297 bytes --]
Corey Minyard wrote:
>
> It is annoying that it doesn't help the performance for multicast.
> However, I think the current patch will solve the DOS issue for
> multicast, since it switches to a normal spinlock and has a per-list lock.
As for multicast, it should be possible to do something about it, if it turns
out to be an issue.
That is, do a lockless lookup and accumulate pointers to the matching sockets
in a table (increment each socket's refcount if it is not zero, check the key,
then add the socket to a local stack).
If the lookup must be restarted, forget all accumulated sockets (sock_put()
each pointer) and goto begin.
Then send the (cloned) packet to all accumulated sockets, and
sock_put() them to release the refcount.
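The steps above could be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not kernel code: struct mock_sock, mock_hold_unless_zero(), mock_put(), mock_collect() and mock_deliver() are all hypothetical names, the "chain" field stands in for whatever test tells the reader that its lookup has strayed onto another hash chain, and stopping once the table is full is an arbitrary choice made for the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; only what the sketch needs. */
struct mock_sock {
	atomic_int refcnt;	/* 0 means the socket is being freed */
	unsigned int key;	/* stands in for the port/address match */
	unsigned int chain;	/* hash chain the socket currently sits on */
	int delivered;		/* packets handed to this socket */
	struct mock_sock *next;
};

/* Take a reference unless the count has already dropped to zero. */
static bool mock_hold_unless_zero(struct mock_sock *sk)
{
	int old = atomic_load(&sk->refcnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&sk->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

static void mock_put(struct mock_sock *sk)
{
	atomic_fetch_sub(&sk->refcnt, 1);
}

/*
 * Walk the chain without a lock and collect up to max matching sockets,
 * each with a reference held. Returns the number collected.
 * Assumption: the sketch simply stops collecting when the table is full.
 */
static size_t mock_collect(struct mock_sock *head, unsigned int chain,
			   unsigned int key, struct mock_sock **out,
			   size_t max)
{
	struct mock_sock *sk;
	size_t n;

	assert(out != NULL && max > 0);
begin:
	n = 0;
	for (sk = head; sk && n < max; sk = sk->next) {
		if (sk->chain != chain) {
			/* Lookup must be restarted: forget everything. */
			while (n > 0)
				mock_put(out[--n]);
			goto begin;
		}
		if (!mock_hold_unless_zero(sk))
			continue;	/* socket is going away */
		if (sk->key != key) {
			mock_put(sk);	/* not a match, drop the ref */
			continue;
		}
		out[n++] = sk;
	}
	return n;
}

/* Deliver to every collected socket, then drop the references. */
static void mock_deliver(struct mock_sock **socks, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		socks[i]->delivered++;	/* real code would queue a cloned skb */
		mock_put(socks[i]);
	}
}
```

A caller would run mock_collect() into a small on-stack array and pass the result to mock_deliver(). The sketch uses plain loads for key and chain; a real lockless reader would need RCU protection and proper memory ordering, which are left out here.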
Well, looking at the current implementation, I found that udp_v4_mcast_next()
doesn't take 'struct net *net' into account, so we have a bug here...
udp_v6_mcast_next() is buggy too (or at least its caller is).
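To illustrate why the namespace has to be part of the match, here is a small userspace sketch. All names in it (struct ns_net, struct ns_sock, ns_net_eq(), ns_mcast_next(), ns_mcast_count()) are hypothetical and only mimic the shape of the kernel helpers; matching is reduced to the namespace and the port hash.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net. */
struct ns_net {
	int id;
};

/* Hypothetical stand-in for struct sock. */
struct ns_sock {
	const struct ns_net *net;	/* namespace that owns the socket */
	unsigned short hash;		/* local port, in the role of sk_hash */
	struct ns_sock *next;
};

/* Two namespaces are equal only if they are the same object. */
static bool ns_net_eq(const struct ns_net *a, const struct ns_net *b)
{
	return a == b;
}

/* Return the first socket at or after sk matching both net and hnum. */
static struct ns_sock *ns_mcast_next(const struct ns_net *net,
				     struct ns_sock *sk, unsigned short hnum)
{
	for (; sk; sk = sk->next) {
		if (!ns_net_eq(sk->net, net) || sk->hash != hnum)
			continue;
		return sk;
	}
	return NULL;
}

/* Count how many sockets on the chain would receive the packet. */
static int ns_mcast_count(const struct ns_net *net, struct ns_sock *head,
			  unsigned short hnum)
{
	struct ns_sock *sk = ns_mcast_next(net, head, hnum);
	int n = 0;

	while (sk) {
		n++;
		sk = ns_mcast_next(net, sk->next, hnum);
	}
	return n;
}
```

Without the ns_net_eq() test, sockets bound to the same port in two different namespaces would both match and both receive the packet, since they can share a hash chain.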
David, please find a patch against net-2.6
Thanks
[PATCH] udp: multicast packets need to check namespace
Current UDP multicast delivery is not namespace aware.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
net/ipv4/udp.c | 14 ++++++++------
net/ipv6/udp.c | 8 ++++----
2 files changed, 12 insertions(+), 10 deletions(-)
[-- Attachment #2: udp_multi.patch --]
[-- Type: text/plain, Size: 3014 bytes --]
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 2095abc..76e3cc5 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -284,7 +284,7 @@ struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
}
EXPORT_SYMBOL_GPL(udp4_lib_lookup);
-static inline struct sock *udp_v4_mcast_next(struct sock *sk,
+static inline struct sock *udp_v4_mcast_next(struct net *net, struct sock *sk,
__be16 loc_port, __be32 loc_addr,
__be16 rmt_port, __be32 rmt_addr,
int dif)
@@ -295,8 +295,9 @@ static inline struct sock *udp_v4_mcast_next(struct sock *sk,
sk_for_each_from(s, node) {
struct inet_sock *inet = inet_sk(s);
-
- if (s->sk_hash != hnum ||
+
+ if (!net_eq(sock_net(s), net) ||
+ s->sk_hash != hnum ||
(inet->daddr && inet->daddr != rmt_addr) ||
(inet->dport != rmt_port && inet->dport) ||
(inet->rcv_saddr && inet->rcv_saddr != loc_addr) ||
@@ -1079,15 +1080,16 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
read_lock(&udp_hash_lock);
sk = sk_head(&udptable[udp_hashfn(net, ntohs(uh->dest))]);
dif = skb->dev->ifindex;
- sk = udp_v4_mcast_next(sk, uh->dest, daddr, uh->source, saddr, dif);
+ sk = udp_v4_mcast_next(net, sk, uh->dest, daddr, uh->source, saddr, dif);
if (sk) {
struct sock *sknext = NULL;
do {
struct sk_buff *skb1 = skb;
- sknext = udp_v4_mcast_next(sk_next(sk), uh->dest, daddr,
- uh->source, saddr, dif);
+ sknext = udp_v4_mcast_next(net, sk_next(sk), uh->dest,
+ daddr, uh->source, saddr,
+ dif);
if (sknext)
skb1 = skb_clone(skb, GFP_ATOMIC);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index e51da8c..71e259e 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -328,7 +328,7 @@ drop:
return -1;
}
-static struct sock *udp_v6_mcast_next(struct sock *sk,
+static struct sock *udp_v6_mcast_next(struct net *net, struct sock *sk,
__be16 loc_port, struct in6_addr *loc_addr,
__be16 rmt_port, struct in6_addr *rmt_addr,
int dif)
@@ -340,7 +340,7 @@ static struct sock *udp_v6_mcast_next(struct sock *sk,
sk_for_each_from(s, node) {
struct inet_sock *inet = inet_sk(s);
- if (sock_net(s) != sock_net(sk))
+ if (!net_eq(sock_net(s), net))
continue;
if (s->sk_hash == num && s->sk_family == PF_INET6) {
@@ -383,14 +383,14 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
read_lock(&udp_hash_lock);
sk = sk_head(&udptable[udp_hashfn(net, ntohs(uh->dest))]);
dif = inet6_iif(skb);
- sk = udp_v6_mcast_next(sk, uh->dest, daddr, uh->source, saddr, dif);
+ sk = udp_v6_mcast_next(net, sk, uh->dest, daddr, uh->source, saddr, dif);
if (!sk) {
kfree_skb(skb);
goto out;
}
sk2 = sk;
- while ((sk2 = udp_v6_mcast_next(sk_next(sk2), uh->dest, daddr,
+ while ((sk2 = udp_v6_mcast_next(net, sk_next(sk2), uh->dest, daddr,
uh->source, saddr, dif))) {
struct sk_buff *buff = skb_clone(skb, GFP_ATOMIC);
if (buff) {