* [PATCH V3 2/8] tcp: use limited socket backlog [not found] <1267761707-15605-1-git-send-email-yi.zhu@intel.com> @ 2010-03-05 4:01 ` Zhu Yi 2010-03-05 4:01 ` [PATCH V3 3/8] udp: " Zhu Yi 2010-03-05 6:19 ` [PATCH V3 2/8] tcp: " Eric Dumazet 2010-03-05 13:00 ` [PATCH V3 1/8] net: add limit for " Arnaldo Carvalho de Melo 1 sibling, 2 replies; 22+ messages in thread From: Zhu Yi @ 2010-03-05 4:01 UTC (permalink / raw) To: davem Cc: netdev, Zhu Yi, Alexey Kuznetsov, Pekka Savola (ipv6), Patrick McHardy Make tcp adapt to the limited socket backlog change. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Zhu Yi <yi.zhu@intel.com> --- net/ipv4/tcp_ipv4.c | 6 ++++-- net/ipv6/tcp_ipv6.c | 6 ++++-- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index c3588b4..4baf194 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1682,8 +1682,10 @@ process: if (!tcp_prequeue(sk, skb)) ret = tcp_v4_do_rcv(sk, skb); } - } else - sk_add_backlog(sk, skb); + } else if (sk_add_backlog_limited(sk, skb)) { + bh_unlock_sock(sk); + goto discard_and_relse; + } bh_unlock_sock(sk); sock_put(sk); diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 6963a6b..c4ea9d5 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1740,8 +1740,10 @@ process: if (!tcp_prequeue(sk, skb)) ret = tcp_v6_do_rcv(sk, skb); } - } else - sk_add_backlog(sk, skb); + } else if (sk_add_backlog_limited(sk, skb)) { + bh_unlock_sock(sk); + goto discard_and_relse; + } bh_unlock_sock(sk); sock_put(sk); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH V3 3/8] udp: use limited socket backlog 2010-03-05 4:01 ` [PATCH V3 2/8] tcp: use limited socket backlog Zhu Yi @ 2010-03-05 4:01 ` Zhu Yi 2010-03-05 4:01 ` [PATCH V3 4/8] llc: " Zhu Yi 2010-03-05 6:21 ` [PATCH V3 3/8] udp: " Eric Dumazet 2010-03-05 6:19 ` [PATCH V3 2/8] tcp: " Eric Dumazet 1 sibling, 2 replies; 22+ messages in thread From: Zhu Yi @ 2010-03-05 4:01 UTC (permalink / raw) To: davem Cc: netdev, Zhu Yi, Alexey Kuznetsov, Pekka Savola (ipv6), Patrick McHardy Make udp adapt to the limited socket backlog change. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Zhu Yi <yi.zhu@intel.com> --- net/ipv4/udp.c | 6 ++++-- net/ipv6/udp.c | 28 ++++++++++++++++++---------- 2 files changed, 22 insertions(+), 12 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 608a544..e7eb47f 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1371,8 +1371,10 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) bh_lock_sock(sk); if (!sock_owned_by_user(sk)) rc = __udp_queue_rcv_skb(sk, skb); - else - sk_add_backlog(sk, skb); + else if (sk_add_backlog_limited(sk, skb)) { + bh_unlock_sock(sk); + goto drop; + } bh_unlock_sock(sk); return rc; diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 52b8347..6480491 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -583,16 +583,20 @@ static void flush_stack(struct sock **stack, unsigned int count, bh_lock_sock(sk); if (!sock_owned_by_user(sk)) udpv6_queue_rcv_skb(sk, skb1); - else - sk_add_backlog(sk, skb1); + else if (sk_add_backlog_limited(sk, skb1)) { + kfree_skb(skb1); + bh_unlock_sock(sk); + goto drop; + } bh_unlock_sock(sk); - } else { - atomic_inc(&sk->sk_drops); - UDP6_INC_STATS_BH(sock_net(sk), - UDP_MIB_RCVBUFERRORS, IS_UDPLITE(sk)); - UDP6_INC_STATS_BH(sock_net(sk), - UDP_MIB_INERRORS, IS_UDPLITE(sk)); + continue; } +drop: + atomic_inc(&sk->sk_drops); + 
UDP6_INC_STATS_BH(sock_net(sk), + UDP_MIB_RCVBUFERRORS, IS_UDPLITE(sk)); + UDP6_INC_STATS_BH(sock_net(sk), + UDP_MIB_INERRORS, IS_UDPLITE(sk)); } } /* @@ -754,8 +758,12 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable, bh_lock_sock(sk); if (!sock_owned_by_user(sk)) udpv6_queue_rcv_skb(sk, skb); - else - sk_add_backlog(sk, skb); + else if (sk_add_backlog_limited(sk, skb)) { + atomic_inc(&sk->sk_drops); + bh_unlock_sock(sk); + sock_put(sk); + goto discard; + } bh_unlock_sock(sk); sock_put(sk); return 0; -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH V3 4/8] llc: use limited socket backlog 2010-03-05 4:01 ` [PATCH V3 3/8] udp: " Zhu Yi @ 2010-03-05 4:01 ` Zhu Yi 2010-03-05 4:01 ` [PATCH V3 5/8] sctp: " Zhu Yi ` (2 more replies) 2010-03-05 6:21 ` [PATCH V3 3/8] udp: " Eric Dumazet 1 sibling, 3 replies; 22+ messages in thread From: Zhu Yi @ 2010-03-05 4:01 UTC (permalink / raw) To: davem; +Cc: netdev, Zhu Yi, Arnaldo Carvalho de Melo Make llc adapt to the limited socket backlog change. Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Zhu Yi <yi.zhu@intel.com> --- net/llc/llc_conn.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/net/llc/llc_conn.c b/net/llc/llc_conn.c index a8dde9b..c0539ff 100644 --- a/net/llc/llc_conn.c +++ b/net/llc/llc_conn.c @@ -827,7 +827,8 @@ void llc_conn_handler(struct llc_sap *sap, struct sk_buff *skb) else { dprintk("%s: adding to backlog...\n", __func__); llc_set_backlog_type(skb, LLC_PACKET); - sk_add_backlog(sk, skb); + if (sk_add_backlog_limited(sk, skb)) + goto drop_unlock; } out: bh_unlock_sock(sk); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH V3 5/8] sctp: use limited socket backlog 2010-03-05 4:01 ` [PATCH V3 4/8] llc: " Zhu Yi @ 2010-03-05 4:01 ` Zhu Yi 2010-03-05 6:28 ` Eric Dumazet [not found] ` <1267761707-15605-6-git-send-email-yi.zhu@intel.com> 2010-03-05 6:22 ` [PATCH V3 4/8] llc: " Eric Dumazet 2010-03-05 13:00 ` Arnaldo Carvalho de Melo 2 siblings, 2 replies; 22+ messages in thread From: Zhu Yi @ 2010-03-05 4:01 UTC (permalink / raw) To: davem; +Cc: netdev, Zhu Yi, Vlad Yasevich, Sridhar Samudrala Make sctp adapt to the limited socket backlog change. Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Zhu Yi <yi.zhu@intel.com> --- net/sctp/input.c | 42 +++++++++++++++++++++++++++--------------- net/sctp/socket.c | 3 +++ 2 files changed, 30 insertions(+), 15 deletions(-) diff --git a/net/sctp/input.c b/net/sctp/input.c index c0c973e..cbc0636 100644 --- a/net/sctp/input.c +++ b/net/sctp/input.c @@ -75,7 +75,7 @@ static struct sctp_association *__sctp_lookup_association( const union sctp_addr *peer, struct sctp_transport **pt); -static void sctp_add_backlog(struct sock *sk, struct sk_buff *skb); +static int sctp_add_backlog(struct sock *sk, struct sk_buff *skb); /* Calculate the SCTP checksum of an SCTP packet. 
*/ @@ -265,8 +265,13 @@ int sctp_rcv(struct sk_buff *skb) } if (sock_owned_by_user(sk)) { + if (sctp_add_backlog(sk, skb)) { + sctp_bh_unlock_sock(sk); + sctp_chunk_free(chunk); + skb = NULL; /* sctp_chunk_free already freed the skb */ + goto discard_release; + } SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_BACKLOG); - sctp_add_backlog(sk, skb); } else { SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_SOFTIRQ); sctp_inq_push(&chunk->rcvr->inqueue, chunk); @@ -336,8 +341,10 @@ int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb) sctp_bh_lock_sock(sk); if (sock_owned_by_user(sk)) { - sk_add_backlog(sk, skb); - backloged = 1; + if (sk_add_backlog_limited(sk, skb)) + sctp_chunk_free(chunk); + else + backloged = 1; } else sctp_inq_push(inqueue, chunk); @@ -362,22 +369,27 @@ done: return 0; } -static void sctp_add_backlog(struct sock *sk, struct sk_buff *skb) +static int sctp_add_backlog(struct sock *sk, struct sk_buff *skb) { struct sctp_chunk *chunk = SCTP_INPUT_CB(skb)->chunk; struct sctp_ep_common *rcvr = chunk->rcvr; + int ret; - /* Hold the assoc/ep while hanging on the backlog queue. - * This way, we know structures we need will not disappear from us - */ - if (SCTP_EP_TYPE_ASSOCIATION == rcvr->type) - sctp_association_hold(sctp_assoc(rcvr)); - else if (SCTP_EP_TYPE_SOCKET == rcvr->type) - sctp_endpoint_hold(sctp_ep(rcvr)); - else - BUG(); + ret = sk_add_backlog_limited(sk, skb); + if (!ret) { + /* Hold the assoc/ep while hanging on the backlog queue. + * This way, we know structures we need will not disappear + * from us + */ + if (SCTP_EP_TYPE_ASSOCIATION == rcvr->type) + sctp_association_hold(sctp_assoc(rcvr)); + else if (SCTP_EP_TYPE_SOCKET == rcvr->type) + sctp_endpoint_hold(sctp_ep(rcvr)); + else + BUG(); + } + return ret; - sk_add_backlog(sk, skb); } /* Handle icmp frag needed error. 
*/ diff --git a/net/sctp/socket.c b/net/sctp/socket.c index f6d1e59..dfc5c12 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -3720,6 +3720,9 @@ SCTP_STATIC int sctp_init_sock(struct sock *sk) SCTP_DBG_OBJCNT_INC(sock); percpu_counter_inc(&sctp_sockets_allocated); + /* Set socket backlog limit. */ + sk->sk_backlog.limit = sysctl_sctp_rmem[1]; + local_bh_disable(); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); local_bh_enable(); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH V3 5/8] sctp: use limited socket backlog 2010-03-05 4:01 ` [PATCH V3 5/8] sctp: " Zhu Yi @ 2010-03-05 6:28 ` Eric Dumazet 2010-03-05 11:05 ` Zhu, Yi [not found] ` <1267761707-15605-6-git-send-email-yi.zhu@intel.com> 1 sibling, 1 reply; 22+ messages in thread From: Eric Dumazet @ 2010-03-05 6:28 UTC (permalink / raw) To: Zhu Yi; +Cc: davem, netdev, Vlad Yasevich, Sridhar Samudrala Le vendredi 05 mars 2010 à 12:01 +0800, Zhu Yi a écrit : > Make sctp adapt to the limited socket backlog change. > > Cc: Vlad Yasevich <vladislav.yasevich@hp.com> > Cc: Sridhar Samudrala <sri@us.ibm.com> > Signed-off-by: Zhu Yi <yi.zhu@intel.com> This patch looks wrong. > -static void sctp_add_backlog(struct sock *sk, struct sk_buff *skb) > +static int sctp_add_backlog(struct sock *sk, struct sk_buff *skb) > { > struct sctp_chunk *chunk = SCTP_INPUT_CB(skb)->chunk; > struct sctp_ep_common *rcvr = chunk->rcvr; > + int ret; > > - /* Hold the assoc/ep while hanging on the backlog queue. > - * This way, we know structures we need will not disappear from us > - */ > - if (SCTP_EP_TYPE_ASSOCIATION == rcvr->type) > - sctp_association_hold(sctp_assoc(rcvr)); > - else if (SCTP_EP_TYPE_SOCKET == rcvr->type) > - sctp_endpoint_hold(sctp_ep(rcvr)); > - else > - BUG(); > + ret = sk_add_backlog_limited(sk, skb); > + if (!ret) { > + /* Hold the assoc/ep while hanging on the backlog queue. > + * This way, we know structures we need will not disappear > + * from us > + */ > + if (SCTP_EP_TYPE_ASSOCIATION == rcvr->type) > + sctp_association_hold(sctp_assoc(rcvr)); > + else if (SCTP_EP_TYPE_SOCKET == rcvr->type) > + sctp_endpoint_hold(sctp_ep(rcvr)); > + else > + BUG(); > + } > + return ret; > > - sk_add_backlog(sk, skb); > } > As advertized by comment, we should hold the association *before* accessing backlog queue. If order is not important, comment should be relaxed somehow ? ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [PATCH V3 5/8] sctp: use limited socket backlog
  2010-03-05  6:28 ` Eric Dumazet
@ 2010-03-05 11:05   ` Zhu, Yi
  2010-03-05 13:24     ` Eric Dumazet
  2010-03-05 13:30     ` Vlad Yasevich
  0 siblings, 2 replies; 22+ messages in thread
From: Zhu, Yi @ 2010-03-05 11:05 UTC (permalink / raw)
To: Eric Dumazet
Cc: davem@davemloft.net, netdev@vger.kernel.org, Vlad Yasevich,
	Sridhar Samudrala

Eric Dumazet <eric.dumazet@gmail.com> wrote:

> As advertised by the comment, we should hold the association *before*
> accessing the backlog queue.
>
> If order is not important, comment should be relaxed somehow ?

I don't see how the order is important here. We are under the socket
lock, thus nobody will race with us. IMHO, the comment means that if a
packet is queued into the backlog, we need to increase the assoc/ep
reference count. Otherwise the assoc/ep might disappear by the time we
process it (via sctp_backlog_rcv) sometime later.

Thanks,
-yi

^ permalink raw reply	[flat|nested] 22+ messages in thread
* RE: [PATCH V3 5/8] sctp: use limited socket backlog
  2010-03-05 11:05 ` Zhu, Yi
@ 2010-03-05 13:24   ` Eric Dumazet
  2010-03-05 13:30   ` Vlad Yasevich
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Dumazet @ 2010-03-05 13:24 UTC (permalink / raw)
To: Zhu, Yi
Cc: davem@davemloft.net, netdev@vger.kernel.org, Vlad Yasevich,
	Sridhar Samudrala

On Friday, March 5, 2010 at 19:05 +0800, Zhu, Yi wrote:

> I don't see how the order is important here. We are under the socket
> lock, thus nobody will race with us. IMHO, the comment means that if a
> packet is queued into the backlog, we need to increase the assoc/ep
> reference count. Otherwise the assoc/ep might disappear by the time we
> process it (via sctp_backlog_rcv) sometime later.

OK then.

It's strange this protocol has to increase a refcount for each queued
frame in its backlog, but this is unrelated to your changes anyway.

^ permalink raw reply	[flat|nested] 22+ messages in thread
* Re: [PATCH V3 5/8] sctp: use limited socket backlog
  2010-03-05 11:05 ` Zhu, Yi
  2010-03-05 13:24   ` Eric Dumazet
@ 2010-03-05 13:30   ` Vlad Yasevich
  1 sibling, 0 replies; 22+ messages in thread
From: Vlad Yasevich @ 2010-03-05 13:30 UTC (permalink / raw)
To: Zhu, Yi
Cc: Eric Dumazet, davem@davemloft.net, netdev@vger.kernel.org,
	Sridhar Samudrala

Zhu, Yi wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> As advertised by the comment, we should hold the association *before*
>> accessing the backlog queue.
>
>> If order is not important, comment should be relaxed somehow ?
>
> I don't see how the order is important here. We are under the socket
> lock, thus nobody will race with us. IMHO, the comment means that if a
> packet is queued into the backlog, we need to increase the assoc/ep
> reference count. Otherwise the assoc/ep might disappear by the time we
> process it (via sctp_backlog_rcv) sometime later.
>
> Thanks,
> -yi

Yes, that's correct. The order is not really important, since we are
under the lock and are actually already holding a ref. However, the ref
will be dropped once we exit the function, so the function takes an
additional ref that is held while the packet is backlogged.

You could get rid of the extra nesting, though, by returning early if
the backlog add failed.

-vlad

^ permalink raw reply	[flat|nested] 22+ messages in thread
[parent not found: <1267761707-15605-6-git-send-email-yi.zhu@intel.com>]
* [PATCH V3 7/8] x25: use limited socket backlog [not found] ` <1267761707-15605-6-git-send-email-yi.zhu@intel.com> @ 2010-03-05 4:01 ` Zhu Yi 2010-03-05 4:01 ` [PATCH V3 8/8] net: backlog functions rename Zhu Yi 2010-03-05 6:30 ` [PATCH V3 7/8] x25: use limited socket backlog Eric Dumazet 2010-03-05 6:30 ` [PATCH V3 6/8] tipc: " Eric Dumazet 2010-03-05 20:48 ` Stephens, Allan 2 siblings, 2 replies; 22+ messages in thread From: Zhu Yi @ 2010-03-05 4:01 UTC (permalink / raw) To: davem; +Cc: netdev, Zhu Yi, Andrew Hendry Make x25 adapt to the limited socket backlog change. Cc: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: Zhu Yi <yi.zhu@intel.com> --- net/x25/x25_dev.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/net/x25/x25_dev.c b/net/x25/x25_dev.c index 3e1efe5..a9da0dc 100644 --- a/net/x25/x25_dev.c +++ b/net/x25/x25_dev.c @@ -53,7 +53,7 @@ static int x25_receive_data(struct sk_buff *skb, struct x25_neigh *nb) if (!sock_owned_by_user(sk)) { queued = x25_process_rx_frame(sk, skb); } else { - sk_add_backlog(sk, skb); + queued = !sk_add_backlog_limited(sk, skb); } bh_unlock_sock(sk); sock_put(sk); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH V3 8/8] net: backlog functions rename 2010-03-05 4:01 ` [PATCH V3 7/8] x25: " Zhu Yi @ 2010-03-05 4:01 ` Zhu Yi 2010-03-05 6:32 ` Eric Dumazet 2010-03-05 6:30 ` [PATCH V3 7/8] x25: use limited socket backlog Eric Dumazet 1 sibling, 1 reply; 22+ messages in thread From: Zhu Yi @ 2010-03-05 4:01 UTC (permalink / raw) To: davem; +Cc: netdev, Zhu Yi sk_add_backlog -> __sk_add_backlog sk_add_backlog_limited -> sk_add_backlog Signed-off-by: Zhu Yi <yi.zhu@intel.com> --- include/net/sock.h | 6 +++--- net/core/sock.c | 2 +- net/dccp/minisocks.c | 2 +- net/ipv4/tcp_ipv4.c | 2 +- net/ipv4/tcp_minisocks.c | 2 +- net/ipv4/udp.c | 2 +- net/ipv6/tcp_ipv6.c | 2 +- net/ipv6/udp.c | 4 ++-- net/llc/llc_c_ac.c | 2 +- net/llc/llc_conn.c | 2 +- net/sctp/input.c | 4 ++-- net/tipc/socket.c | 2 +- net/x25/x25_dev.c | 2 +- 13 files changed, 17 insertions(+), 17 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 2516d76..170353d 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -592,7 +592,7 @@ static inline int sk_stream_memory_free(struct sock *sk) } /* OOB backlog add */ -static inline void sk_add_backlog(struct sock *sk, struct sk_buff *skb) +static inline void __sk_add_backlog(struct sock *sk, struct sk_buff *skb) { if (!sk->sk_backlog.tail) { sk->sk_backlog.head = sk->sk_backlog.tail = skb; @@ -604,12 +604,12 @@ static inline void sk_add_backlog(struct sock *sk, struct sk_buff *skb) } /* The per-socket spinlock must be held here. 
*/ -static inline int sk_add_backlog_limited(struct sock *sk, struct sk_buff *skb) +static inline int sk_add_backlog(struct sock *sk, struct sk_buff *skb) { if (sk->sk_backlog.len >= max(sk->sk_backlog.limit, sk->sk_rcvbuf << 1)) return -ENOBUFS; - sk_add_backlog(sk, skb); + __sk_add_backlog(sk, skb); sk->sk_backlog.len += skb->truesize; return 0; } diff --git a/net/core/sock.c b/net/core/sock.c index 6e22dc9..61a65a2 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -340,7 +340,7 @@ int sk_receive_skb(struct sock *sk, struct sk_buff *skb, const int nested) rc = sk_backlog_rcv(sk, skb); mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_); - } else if (sk_add_backlog_limited(sk, skb)) { + } else if (sk_add_backlog(sk, skb)) { bh_unlock_sock(sk); atomic_inc(&sk->sk_drops); goto discard_and_relse; diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c index af226a0..0d508c3 100644 --- a/net/dccp/minisocks.c +++ b/net/dccp/minisocks.c @@ -254,7 +254,7 @@ int dccp_child_process(struct sock *parent, struct sock *child, * in main socket hash table and lock on listening * socket does not protect us more. */ - sk_add_backlog(child, skb); + __sk_add_backlog(child, skb); } bh_unlock_sock(child); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 4baf194..1915f7d 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1682,7 +1682,7 @@ process: if (!tcp_prequeue(sk, skb)) ret = tcp_v4_do_rcv(sk, skb); } - } else if (sk_add_backlog_limited(sk, skb)) { + } else if (sk_add_backlog(sk, skb)) { bh_unlock_sock(sk); goto discard_and_relse; } diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index f206ee5..4199bc6 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -728,7 +728,7 @@ int tcp_child_process(struct sock *parent, struct sock *child, * in main socket hash table and lock on listening * socket does not protect us more. 
*/ - sk_add_backlog(child, skb); + __sk_add_backlog(child, skb); } bh_unlock_sock(child); diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index e7eb47f..7af756d 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1371,7 +1371,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) bh_lock_sock(sk); if (!sock_owned_by_user(sk)) rc = __udp_queue_rcv_skb(sk, skb); - else if (sk_add_backlog_limited(sk, skb)) { + else if (sk_add_backlog(sk, skb)) { bh_unlock_sock(sk); goto drop; } diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index c4ea9d5..2c378b1 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1740,7 +1740,7 @@ process: if (!tcp_prequeue(sk, skb)) ret = tcp_v6_do_rcv(sk, skb); } - } else if (sk_add_backlog_limited(sk, skb)) { + } else if (sk_add_backlog(sk, skb)) { bh_unlock_sock(sk); goto discard_and_relse; } diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 6480491..3c0c9c7 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -583,7 +583,7 @@ static void flush_stack(struct sock **stack, unsigned int count, bh_lock_sock(sk); if (!sock_owned_by_user(sk)) udpv6_queue_rcv_skb(sk, skb1); - else if (sk_add_backlog_limited(sk, skb1)) { + else if (sk_add_backlog(sk, skb1)) { kfree_skb(skb1); bh_unlock_sock(sk); goto drop; @@ -758,7 +758,7 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable, bh_lock_sock(sk); if (!sock_owned_by_user(sk)) udpv6_queue_rcv_skb(sk, skb); - else if (sk_add_backlog_limited(sk, skb)) { + else if (sk_add_backlog(sk, skb)) { atomic_inc(&sk->sk_drops); bh_unlock_sock(sk); sock_put(sk); diff --git a/net/llc/llc_c_ac.c b/net/llc/llc_c_ac.c index 019c780..86d6985 100644 --- a/net/llc/llc_c_ac.c +++ b/net/llc/llc_c_ac.c @@ -1437,7 +1437,7 @@ static void llc_process_tmr_ev(struct sock *sk, struct sk_buff *skb) llc_conn_state_process(sk, skb); else { llc_set_backlog_type(skb, LLC_EVENT); - sk_add_backlog(sk, skb); + __sk_add_backlog(sk, skb); } } } diff --git a/net/llc/llc_conn.c b/net/llc/llc_conn.c index 
c0539ff..a12144d 100644 --- a/net/llc/llc_conn.c +++ b/net/llc/llc_conn.c @@ -827,7 +827,7 @@ void llc_conn_handler(struct llc_sap *sap, struct sk_buff *skb) else { dprintk("%s: adding to backlog...\n", __func__); llc_set_backlog_type(skb, LLC_PACKET); - if (sk_add_backlog_limited(sk, skb)) + if (sk_add_backlog(sk, skb)) goto drop_unlock; } out: diff --git a/net/sctp/input.c b/net/sctp/input.c index cbc0636..3d74b26 100644 --- a/net/sctp/input.c +++ b/net/sctp/input.c @@ -341,7 +341,7 @@ int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb) sctp_bh_lock_sock(sk); if (sock_owned_by_user(sk)) { - if (sk_add_backlog_limited(sk, skb)) + if (sk_add_backlog(sk, skb)) sctp_chunk_free(chunk); else backloged = 1; @@ -375,7 +375,7 @@ static int sctp_add_backlog(struct sock *sk, struct sk_buff *skb) struct sctp_ep_common *rcvr = chunk->rcvr; int ret; - ret = sk_add_backlog_limited(sk, skb); + ret = sk_add_backlog(sk, skb); if (!ret) { /* Hold the assoc/ep while hanging on the backlog queue. * This way, we know structures we need will not disappear diff --git a/net/tipc/socket.c b/net/tipc/socket.c index 22bfbc3..4b235fc 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -1322,7 +1322,7 @@ static u32 dispatch(struct tipc_port *tport, struct sk_buff *buf) if (!sock_owned_by_user(sk)) { res = filter_rcv(sk, buf); } else { - if (sk_add_backlog_limited(sk, buf)) + if (sk_add_backlog(sk, buf)) res = TIPC_ERR_OVERLOAD; else res = TIPC_OK; diff --git a/net/x25/x25_dev.c b/net/x25/x25_dev.c index a9da0dc..52e3042 100644 --- a/net/x25/x25_dev.c +++ b/net/x25/x25_dev.c @@ -53,7 +53,7 @@ static int x25_receive_data(struct sk_buff *skb, struct x25_neigh *nb) if (!sock_owned_by_user(sk)) { queued = x25_process_rx_frame(sk, skb); } else { - queued = !sk_add_backlog_limited(sk, skb); + queued = !sk_add_backlog(sk, skb); } bh_unlock_sock(sk); sock_put(sk); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH V3 8/8] net: backlog functions rename 2010-03-05 4:01 ` [PATCH V3 8/8] net: backlog functions rename Zhu Yi @ 2010-03-05 6:32 ` Eric Dumazet 2010-03-05 21:36 ` David Miller 0 siblings, 1 reply; 22+ messages in thread From: Eric Dumazet @ 2010-03-05 6:32 UTC (permalink / raw) To: Zhu Yi; +Cc: davem, netdev Le vendredi 05 mars 2010 à 12:01 +0800, Zhu Yi a écrit : > sk_add_backlog -> __sk_add_backlog > sk_add_backlog_limited -> sk_add_backlog > > Signed-off-by: Zhu Yi <yi.zhu@intel.com> > --- > include/net/sock.h | 6 +++--- > net/core/sock.c | 2 +- > net/dccp/minisocks.c | 2 +- > net/ipv4/tcp_ipv4.c | 2 +- > net/ipv4/tcp_minisocks.c | 2 +- > net/ipv4/udp.c | 2 +- > net/ipv6/tcp_ipv6.c | 2 +- > net/ipv6/udp.c | 4 ++-- > net/llc/llc_c_ac.c | 2 +- > net/llc/llc_conn.c | 2 +- > net/sctp/input.c | 4 ++-- > net/tipc/socket.c | 2 +- > net/x25/x25_dev.c | 2 +- > 13 files changed, 17 insertions(+), 17 deletions(-) > Acked-by: Eric Dumazet <eric.dumazet@gmail.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH V3 8/8] net: backlog functions rename 2010-03-05 6:32 ` Eric Dumazet @ 2010-03-05 21:36 ` David Miller 0 siblings, 0 replies; 22+ messages in thread From: David Miller @ 2010-03-05 21:36 UTC (permalink / raw) To: eric.dumazet; +Cc: yi.zhu, netdev From: Eric Dumazet <eric.dumazet@gmail.com> Date: Fri, 05 Mar 2010 07:32:15 +0100 > Le vendredi 05 mars 2010 à 12:01 +0800, Zhu Yi a écrit : >> sk_add_backlog -> __sk_add_backlog >> sk_add_backlog_limited -> sk_add_backlog >> >> Signed-off-by: Zhu Yi <yi.zhu@intel.com> >> --- >> include/net/sock.h | 6 +++--- >> net/core/sock.c | 2 +- >> net/dccp/minisocks.c | 2 +- >> net/ipv4/tcp_ipv4.c | 2 +- >> net/ipv4/tcp_minisocks.c | 2 +- >> net/ipv4/udp.c | 2 +- >> net/ipv6/tcp_ipv6.c | 2 +- >> net/ipv6/udp.c | 4 ++-- >> net/llc/llc_c_ac.c | 2 +- >> net/llc/llc_conn.c | 2 +- >> net/sctp/input.c | 4 ++-- >> net/tipc/socket.c | 2 +- >> net/x25/x25_dev.c | 2 +- >> 13 files changed, 17 insertions(+), 17 deletions(-) >> > > Acked-by: Eric Dumazet <eric.dumazet@gmail.com> All 8 patches applied to net-2.6, thanks Zhu! Feel free to send me a patch which adds the "__must_check" tag to sk_add_backlog() so that any failure to check and drop packets will be at least warned about. Thanks! ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH V3 7/8] x25: use limited socket backlog 2010-03-05 4:01 ` [PATCH V3 7/8] x25: " Zhu Yi 2010-03-05 4:01 ` [PATCH V3 8/8] net: backlog functions rename Zhu Yi @ 2010-03-05 6:30 ` Eric Dumazet 1 sibling, 0 replies; 22+ messages in thread From: Eric Dumazet @ 2010-03-05 6:30 UTC (permalink / raw) To: Zhu Yi; +Cc: davem, netdev, Andrew Hendry Le vendredi 05 mars 2010 à 12:01 +0800, Zhu Yi a écrit : > Make x25 adapt to the limited socket backlog change. > > Cc: Andrew Hendry <andrew.hendry@gmail.com> > Signed-off-by: Zhu Yi <yi.zhu@intel.com> > --- Acked-by: Eric Dumazet <eric.dumazet@gmail.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH V3 6/8] tipc: use limited socket backlog [not found] ` <1267761707-15605-6-git-send-email-yi.zhu@intel.com> 2010-03-05 4:01 ` [PATCH V3 7/8] x25: " Zhu Yi @ 2010-03-05 6:30 ` Eric Dumazet 2010-03-05 20:48 ` Stephens, Allan 2 siblings, 0 replies; 22+ messages in thread From: Eric Dumazet @ 2010-03-05 6:30 UTC (permalink / raw) To: Zhu Yi; +Cc: davem, netdev, Jon Maloy, Allan Stephens Le vendredi 05 mars 2010 à 12:01 +0800, Zhu Yi a écrit : > Make tipc adapt to the limited socket backlog change. > > Cc: Jon Maloy <jon.maloy@ericsson.com> > Cc: Allan Stephens <allan.stephens@windriver.com> > Signed-off-by: Zhu Yi <yi.zhu@intel.com> > --- > net/tipc/socket.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) Acked-by: Eric Dumazet <eric.dumazet@gmail.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [PATCH V3 6/8] tipc: use limited socket backlog
  [not found] ` <1267761707-15605-6-git-send-email-yi.zhu@intel.com>
  2010-03-05  4:01   ` [PATCH V3 7/8] x25: " Zhu Yi
  2010-03-05  6:30   ` [PATCH V3 6/8] tipc: " Eric Dumazet
@ 2010-03-05 20:48   ` Stephens, Allan
  2 siblings, 0 replies; 22+ messages in thread
From: Stephens, Allan @ 2010-03-05 20:48 UTC (permalink / raw)
To: Zhu Yi
Cc: netdev, Jon Maloy, davem, Eric Dumazet

> Make tipc adapt to the limited socket backlog change.
>
> Cc: Jon Maloy <jon.maloy@ericsson.com>
> Cc: Allan Stephens <allan.stephens@windriver.com>
> Signed-off-by: Zhu Yi <yi.zhu@intel.com>

From visual inspection and basic sanity testing:

Acked-by: Allan Stephens <allan.stephens@windriver.com>

Nice work!

-- Al

^ permalink raw reply	[flat|nested] 22+ messages in thread
* Re: [PATCH V3 4/8] llc: use limited socket backlog 2010-03-05 4:01 ` [PATCH V3 4/8] llc: " Zhu Yi 2010-03-05 4:01 ` [PATCH V3 5/8] sctp: " Zhu Yi @ 2010-03-05 6:22 ` Eric Dumazet 2010-03-05 13:00 ` Arnaldo Carvalho de Melo 2 siblings, 0 replies; 22+ messages in thread From: Eric Dumazet @ 2010-03-05 6:22 UTC (permalink / raw) To: Zhu Yi; +Cc: davem, netdev, Arnaldo Carvalho de Melo Le vendredi 05 mars 2010 à 12:01 +0800, Zhu Yi a écrit : > Make llc adapt to the limited socket backlog change. > > Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> > Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH V3 4/8] llc: use limited socket backlog
  2010-03-05  4:01 ` [PATCH V3 4/8] llc: " Zhu Yi
  2010-03-05  4:01   ` [PATCH V3 5/8] sctp: " Zhu Yi
  2010-03-05  6:22   ` [PATCH V3 4/8] llc: " Eric Dumazet
@ 2010-03-05 13:00   ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 22+ messages in thread
From: Arnaldo Carvalho de Melo @ 2010-03-05 13:00 UTC (permalink / raw)
To: Zhu Yi
Cc: davem, netdev

On Fri, Mar 05, 2010 at 12:01:43PM +0800, Zhu Yi wrote:
> Make llc adapt to the limited socket backlog change.
>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Signed-off-by: Zhu Yi <yi.zhu@intel.com>

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread
* Re: [PATCH V3 3/8] udp: use limited socket backlog 2010-03-05 4:01 ` [PATCH V3 3/8] udp: " Zhu Yi 2010-03-05 4:01 ` [PATCH V3 4/8] llc: " Zhu Yi @ 2010-03-05 6:21 ` Eric Dumazet 1 sibling, 0 replies; 22+ messages in thread From: Eric Dumazet @ 2010-03-05 6:21 UTC (permalink / raw) To: Zhu Yi Cc: davem, netdev, Alexey Kuznetsov, Pekka Savola (ipv6), Patrick McHardy Le vendredi 05 mars 2010 à 12:01 +0800, Zhu Yi a écrit : > Make udp adapt to the limited socket backlog change. > > Cc: "David S. Miller" <davem@davemloft.net> > Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> > Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> > Cc: Patrick McHardy <kaber@trash.net> > Signed-off-by: Zhu Yi <yi.zhu@intel.com> > --- > net/ipv4/udp.c | 6 ++++-- > net/ipv6/udp.c | 28 ++++++++++++++++++---------- > 2 files changed, 22 insertions(+), 12 deletions(-) Acked-by: Eric Dumazet <eric.dumazet@gmail.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH V3 2/8] tcp: use limited socket backlog 2010-03-05 4:01 ` [PATCH V3 2/8] tcp: use limited socket backlog Zhu Yi 2010-03-05 4:01 ` [PATCH V3 3/8] udp: " Zhu Yi @ 2010-03-05 6:19 ` Eric Dumazet 2010-03-08 9:21 ` Eric Dumazet 1 sibling, 1 reply; 22+ messages in thread From: Eric Dumazet @ 2010-03-05 6:19 UTC (permalink / raw) To: Zhu Yi Cc: davem, netdev, Alexey Kuznetsov, Pekka Savola (ipv6), Patrick McHardy Le vendredi 05 mars 2010 à 12:01 +0800, Zhu Yi a écrit : > Make tcp adapt to the limited socket backlog change. > > Cc: "David S. Miller" <davem@davemloft.net> > Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> > Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> > Cc: Patrick McHardy <kaber@trash.net> > Signed-off-by: Zhu Yi <yi.zhu@intel.com> > --- Acked-by: Eric Dumazet <eric.dumazet@gmail.com> I'll submit a followup patch to add a MIB counter if your patch gets in. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH V3 2/8] tcp: use limited socket backlog
From: Eric Dumazet @ 2010-03-08  9:21 UTC (permalink / raw)
To: Zhu Yi, David Miller; +Cc: netdev

On Friday, 5 March 2010 at 07:19 +0100, Eric Dumazet wrote:

> I'll submit a followup patch to add a MIB counter if your patch gets in.

As promised, here it is.

[PATCH] tcp: Add SNMP counters for backlog and min_ttl drops

Commit 6b03a53a (tcp: use limited socket backlog) added the possibility
of dropping frames when the backlog queue is full.

Commit d218d111 (tcp: Generalized TTL Security Mechanism) added the
possibility of dropping frames when the TTL is under a given limit.

This patch adds new SNMP MIB entries, named TCPBacklogDrop and
TCPMinTTLDrop, published in /proc/net/netstat on the TcpExt: line.

netstat -s | egrep "TCPBacklogDrop|TCPMinTTLDrop"
    TCPBacklogDrop: 0
    TCPMinTTLDrop: 0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/snmp.h |    2 ++
 net/ipv4/proc.c      |    2 ++
 net/ipv4/tcp_ipv4.c  |    7 +++++--
 net/ipv6/tcp_ipv6.c  |    3 ++-
 4 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/linux/snmp.h b/include/linux/snmp.h
index e28f5a0..4435d10 100644
--- a/include/linux/snmp.h
+++ b/include/linux/snmp.h
@@ -225,6 +225,8 @@ enum
 	LINUX_MIB_SACKSHIFTED,
 	LINUX_MIB_SACKMERGED,
 	LINUX_MIB_SACKSHIFTFALLBACK,
+	LINUX_MIB_TCPBACKLOGDROP,
+	LINUX_MIB_TCPMINTTLDROP, /* RFC 5082 */
 	__LINUX_MIB_MAX
 };

diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 242ed23..4f1f337 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -249,6 +249,8 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TCPSackShifted", LINUX_MIB_SACKSHIFTED),
 	SNMP_MIB_ITEM("TCPSackMerged", LINUX_MIB_SACKMERGED),
 	SNMP_MIB_ITEM("TCPSackShiftFallback", LINUX_MIB_SACKSHIFTFALLBACK),
+	SNMP_MIB_ITEM("TCPBacklogDrop", LINUX_MIB_TCPBACKLOGDROP),
+	SNMP_MIB_ITEM("TCPMinTTLDrop", LINUX_MIB_TCPMINTTLDROP),
 	SNMP_MIB_SENTINEL
 };

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1915f7d..8d51d39 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1651,8 +1651,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	if (!sk)
 		goto no_tcp_socket;

-	if (iph->ttl < inet_sk(sk)->min_ttl)
+	if (unlikely(iph->ttl < inet_sk(sk)->min_ttl)) {
+		NET_INC_STATS_BH(net, LINUX_MIB_TCPMINTTLDROP);
 		goto discard_and_relse;
+	}

 process:
 	if (sk->sk_state == TCP_TIME_WAIT)
@@ -1682,8 +1684,9 @@ process:
 			if (!tcp_prequeue(sk, skb))
 				ret = tcp_v4_do_rcv(sk, skb);
 		}
-	} else if (sk_add_backlog(sk, skb)) {
+	} else if (unlikely(sk_add_backlog(sk, skb))) {
 		bh_unlock_sock(sk);
+		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
 		goto discard_and_relse;
 	}
 	bh_unlock_sock(sk);

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 2c378b1..9b6dbba 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1740,8 +1740,9 @@ process:
 			if (!tcp_prequeue(sk, skb))
 				ret = tcp_v6_do_rcv(sk, skb);
 		}
-	} else if (sk_add_backlog(sk, skb)) {
+	} else if (unlikely(sk_add_backlog(sk, skb))) {
 		bh_unlock_sock(sk);
+		NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
 		goto discard_and_relse;
 	}
 	bh_unlock_sock(sk);
* Re: [PATCH V3 2/8] tcp: use limited socket backlog
From: David Miller @ 2010-03-08 18:46 UTC (permalink / raw)
To: eric.dumazet; +Cc: yi.zhu, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 08 Mar 2010 10:21:57 +0100

> On Friday, 5 March 2010 at 07:19 +0100, Eric Dumazet wrote:
>
>> I'll submit a followup patch to add a MIB counter if your patch gets in.
>
> As promised, here it is.
>
> [PATCH] tcp: Add SNMP counters for backlog and min_ttl drops
>
> Commit 6b03a53a (tcp: use limited socket backlog) added the possibility
> of dropping frames when the backlog queue is full.
>
> Commit d218d111 (tcp: Generalized TTL Security Mechanism) added the
> possibility of dropping frames when the TTL is under a given limit.
>
> This patch adds new SNMP MIB entries, named TCPBacklogDrop and
> TCPMinTTLDrop, published in /proc/net/netstat on the TcpExt: line.
>
> netstat -s | egrep "TCPBacklogDrop|TCPMinTTLDrop"
>     TCPBacklogDrop: 0
>     TCPMinTTLDrop: 0
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.
* Re: [PATCH V3 1/8] net: add limit for socket backlog
From: Arnaldo Carvalho de Melo @ 2010-03-05 13:00 UTC (permalink / raw)
To: Zhu Yi
Cc: davem, netdev, Pekka Savola (ipv6), Patrick McHardy, Vlad Yasevich,
    Sridhar Samudrala, Jon Maloy, Allan Stephens, Andrew Hendry, Eric Dumazet

On Fri, Mar 05, 2010 at 12:01:40PM +0800, Zhu Yi wrote:
> We hit a system OOM while running some UDP netperf tests on the loopback
> device. In this case, multiple senders sent streams of UDP packets to a
> single receiver via loopback on the local host. Naturally, the receiver
> could not handle all the packets in time. But we were surprised to find
> that these packets were not discarded by the receiver's sk->sk_rcvbuf
> limit. Instead, they kept queuing to sk->sk_backlog and finally ate up
> all the memory. We believe this is a security hole through which an
> unprivileged user can crash the system.
>
> The root cause of this problem is that while the receiver is in
> __release_sock() (i.e. after a userspace recv: kernel udp_recvmsg ->
> skb_free_datagram_locked -> release_sock), it moves skbs from the
> backlog to sk_receive_queue with softirqs enabled. In the above case,
> multiple busy senders turn this into an almost endless loop, and the
> skbs in the backlog end up eating all the system memory.
>
> The issue is not limited to UDP. Any protocol that uses the socket
> backlog is potentially affected. This patch adds a limit for the socket
> backlog so that the backlog size cannot grow endlessly.

From visual inspection (no testing):

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>

- Arnaldo