* [PATCH 0/2] Tracepoints for queueing skb to rcvbuf
@ 2011-06-17 21:56 Satoru Moriya
2011-06-17 21:58 ` [PATCH 1/2] udp: add tracepoints " Satoru Moriya
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Satoru Moriya @ 2011-06-17 21:56 UTC
To: netdev@vger.kernel.org
Cc: nhorman@tuxdriver.com, davem@davemloft.net,
dle-develop@lists.sourceforge.net, Seiji Aguchi

Hi,

The kernel may drop packets when it queues them to a socket receive buffer.
Currently we can detect packet drop events, and know when and where they
happened, via the kfree_skb tracepoint. But it is difficult to determine the
detailed reason because there are several possibilities.

In the UDP case, the core function for queueing an skb to the socket rcvbuf is
__udp_queue_rcv_skb, and its call chain is as follows:

__udp_queue_rcv_skb
  ip_queue_rcv_skb
    sock_queue_rcv_skb
      sk_rmem_schedule
        __sk_mem_schedule

We can catch a packet drop event in __udp_queue_rcv_skb, which means that
ip_queue_rcv_skb/sock_queue_rcv_skb returned a negative value.
In sock_queue_rcv_skb there are 3 possibilities (*) where it returns a
negative value, but we can't tell them apart. Moreover, sock_queue_rcv_skb
calls __sk_mem_schedule, which has several if statements that decide whether
the kernel should drop the packet.

To separate these reasons, this patchset adds 3 tracepoints.

The 1st one is added to __udp_queue_rcv_skb to get the return value of
ip_queue_rcv_skb. Analyzing it, we can separate the 3 possibilities (*) above.

The 2nd and 3rd ones provide more detailed information. We can collect the
status of the socket receive queue and related parameters (some of them are
sysctl knobs, e.g. /proc/sys/net/ipv4/udp_mem for UDP) and then tune kernel
behavior easily.

Any comments and feedback are welcome.

Satoru Moriya (2):
  udp: add tracepoint for queueing skb to rcvbuf
  core: add tracepoints for queueing skb to rcvbuf

 include/trace/events/sock.h |   68 +++++++++++++++++++++++++++++++++++++++++++
 include/trace/events/udp.h  |   32 ++++++++++++++++++++
 net/core/net-traces.c       |    2 +
 net/core/sock.c             |    5 +++
 net/ipv4/udp.c              |    2 +
 5 files changed, 109 insertions(+), 0 deletions(-)
 create mode 100644 include/trace/events/sock.h
 create mode 100644 include/trace/events/udp.h

* [PATCH 1/2] udp: add tracepoints for queueing skb to rcvbuf
2011-06-17 21:56 [PATCH 0/2] Tracepoints for queueing skb to rcvbuf Satoru Moriya
@ 2011-06-17 21:58 ` Satoru Moriya
2011-06-21 10:47 ` Neil Horman
2011-06-17 22:00 ` [PATCH 2/2] core: " Satoru Moriya
2011-06-21 23:07 ` [PATCH 0/2] Tracepoints " David Miller
2 siblings, 1 reply; 10+ messages in thread
From: Satoru Moriya @ 2011-06-17 21:58 UTC
To: netdev@vger.kernel.org
Cc: nhorman@tuxdriver.com, davem@davemloft.net,
dle-develop@lists.sourceforge.net, Seiji Aguchi

This patch adds a tracepoint to __udp_queue_rcv_skb to get the
return value of ip_queue_rcv_skb. It indicates why kernel drops
a packet at this point.

ip_queue_rcv_skb returns following values in the packet drop case:

rcvbuf is full                 : -ENOMEM
sk_filter returns error        : -EINVAL, -EACCES, -ENOMEM, etc.
__sk_mem_schedule returns error: -ENOBUFS

Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
---
 include/trace/events/udp.h |   32 ++++++++++++++++++++++++++++++++
 net/core/net-traces.c      |    1 +
 net/ipv4/udp.c             |    2 ++
 3 files changed, 35 insertions(+), 0 deletions(-)
 create mode 100644 include/trace/events/udp.h

diff --git a/include/trace/events/udp.h b/include/trace/events/udp.h
new file mode 100644
index 0000000..a664bb9
--- /dev/null
+++ b/include/trace/events/udp.h
@@ -0,0 +1,32 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM udp
+
+#if !defined(_TRACE_UDP_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_UDP_H
+
+#include <linux/udp.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(udp_fail_queue_rcv_skb,
+
+	TP_PROTO(int rc, struct sock *sk),
+
+	TP_ARGS(rc, sk),
+
+	TP_STRUCT__entry(
+		__field(int, rc)
+		__field(__u16, lport)
+	),
+
+	TP_fast_assign(
+		__entry->rc = rc;
+		__entry->lport = inet_sk(sk)->inet_num;
+	),
+
+	TP_printk("rc=%d port=%hu", __entry->rc, __entry->lport)
+);
+
+#endif /* _TRACE_UDP_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/net/core/net-traces.c b/net/core/net-traces.c
index 7f1bb2a..13aab64 100644
--- a/net/core/net-traces.c
+++ b/net/core/net-traces.c
@@ -28,6 +28,7 @@
 #include <trace/events/skb.h>
 #include <trace/events/net.h>
 #include <trace/events/napi.h>
+#include <trace/events/udp.h>
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index abca870..37aa9bf 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -105,6 +105,7 @@
 #include <net/route.h>
 #include <net/checksum.h>
 #include <net/xfrm.h>
+#include <trace/events/udp.h>
 #include "udp_impl.h"
 
 struct udp_table udp_table __read_mostly;
@@ -1363,6 +1364,7 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 						 is_udplite);
 		UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
 		kfree_skb(skb);
+		trace_udp_fail_queue_rcv_skb(rc, sk);
 		return -1;
 	}
 
-- 
1.7.1
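
The return-value table in the patch description above can be turned into a small userspace decoder. What follows is only a sketch, not part of the patch: it assumes the event payload has the form given by TP_printk ("rc=%d port=%hu"), and the struct, enum, and function names are hypothetical. Note that -ENOMEM stays ambiguous here, since both a full rcvbuf and an sk_filter failure can produce it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one udp_fail_queue_rcv_skb event. */
struct udp_drop_event {
	int rc;               /* return value of ip_queue_rcv_skb */
	unsigned short lport; /* local port of the socket */
};

/* Hypothetical classification, following the table in the patch description. */
enum udp_drop_cause {
	DROP_NONE,                  /* rc >= 0: not a drop */
	DROP_RCVBUF_FULL_OR_FILTER, /* -ENOMEM: rcvbuf full, or sk_filter ran out of memory */
	DROP_MEM_SCHEDULE,          /* -ENOBUFS: __sk_mem_schedule refused the charge */
	DROP_FILTER                 /* any other negative value: sk_filter error */
};

/*
 * Find the "rc=... port=..." payload in a trace line and parse it.
 * Returns 0 on success, -1 if the line does not contain the payload.
 */
static int parse_udp_drop(const char *line, struct udp_drop_event *ev)
{
	const char *p;

	assert(line != NULL && ev != NULL);
	p = strstr(line, "rc=");
	if (p == NULL)
		return -1;
	return sscanf(p, "rc=%d port=%hu", &ev->rc, &ev->lport) == 2 ? 0 : -1;
}

/* Map the traced return value to the drop causes listed in the patch. */
static enum udp_drop_cause classify_udp_drop(int rc)
{
	if (rc >= 0)
		return DROP_NONE;
	if (rc == -ENOMEM)
		return DROP_RCVBUF_FULL_OR_FILTER;
	if (rc == -ENOBUFS)
		return DROP_MEM_SCHEDULE;
	return DROP_FILTER;
}
```

A caller would feed each line read from the trace output to parse_udp_drop() and then pass ev.rc to classify_udp_drop(). Separating the two -ENOMEM cases needs the sock_rcvqueue_full event from patch 2/2.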

* Re: [PATCH 1/2] udp: add tracepoints for queueing skb to rcvbuf
2011-06-17 21:58 ` [PATCH 1/2] udp: add tracepoints " Satoru Moriya
@ 2011-06-21 10:47 ` Neil Horman
2011-06-21 11:58 ` Hagen Paul Pfeifer
0 siblings, 1 reply; 10+ messages in thread
From: Neil Horman @ 2011-06-21 10:47 UTC
To: Satoru Moriya
Cc: netdev@vger.kernel.org, davem@davemloft.net,
dle-develop@lists.sourceforge.net, Seiji Aguchi

On Fri, Jun 17, 2011 at 05:58:39PM -0400, Satoru Moriya wrote:
> This patch adds a tracepoint to __udp_queue_rcv_skb to get the
> return value of ip_queue_rcv_skb. It indicates why kernel drops
> a packet at this point.
>
> ip_queue_rcv_skb returns following values in the packet drop case:
>
> rcvbuf is full                 : -ENOMEM
> sk_filter returns error        : -EINVAL, -EACCES, -ENOMEM, etc.
> __sk_mem_schedule returns error: -ENOBUFS
>
>
> Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
> ---
>  include/trace/events/udp.h |   32 ++++++++++++++++++++++++++++++++
>  net/core/net-traces.c      |    1 +
>  net/ipv4/udp.c             |    2 ++
>  3 files changed, 35 insertions(+), 0 deletions(-)
>  create mode 100644 include/trace/events/udp.h
>
> diff --git a/include/trace/events/udp.h b/include/trace/events/udp.h
> new file mode 100644
> index 0000000..a664bb9
> --- /dev/null
> +++ b/include/trace/events/udp.h
> @@ -0,0 +1,32 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM udp
> +
> +#if !defined(_TRACE_UDP_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_UDP_H
> +
> +#include <linux/udp.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(udp_fail_queue_rcv_skb,
> +
> +	TP_PROTO(int rc, struct sock *sk),
> +
> +	TP_ARGS(rc, sk),
> +
> +	TP_STRUCT__entry(
> +		__field(int, rc)
> +		__field(__u16, lport)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->rc = rc;
> +		__entry->lport = inet_sk(sk)->inet_num;
> +	),
> +
> +	TP_printk("rc=%d port=%hu", __entry->rc, __entry->lport)
> +);
> +
> +#endif /* _TRACE_UDP_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/net/core/net-traces.c b/net/core/net-traces.c
> index 7f1bb2a..13aab64 100644
> --- a/net/core/net-traces.c
> +++ b/net/core/net-traces.c
> @@ -28,6 +28,7 @@
>  #include <trace/events/skb.h>
>  #include <trace/events/net.h>
>  #include <trace/events/napi.h>
> +#include <trace/events/udp.h>
>
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index abca870..37aa9bf 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -105,6 +105,7 @@
>  #include <net/route.h>
>  #include <net/checksum.h>
>  #include <net/xfrm.h>
> +#include <trace/events/udp.h>
>  #include "udp_impl.h"
>
>  struct udp_table udp_table __read_mostly;
> @@ -1363,6 +1364,7 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>  						 is_udplite);
>  		UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
>  		kfree_skb(skb);
> +		trace_udp_fail_queue_rcv_skb(rc, sk);
>  		return -1;
>  	}
>
> --
> 1.7.1
>
I was thinking you could just trace callers of __sk_mem_schedule, but looking at
it this works as well

Acked-by: Neil Horman <nhorman@tuxdriver.com>

* Re: [PATCH 1/2] udp: add tracepoints for queueing skb to rcvbuf
2011-06-21 10:47 ` Neil Horman
@ 2011-06-21 11:58 ` Hagen Paul Pfeifer
2011-06-21 13:50 ` Neil Horman
0 siblings, 1 reply; 10+ messages in thread
From: Hagen Paul Pfeifer @ 2011-06-21 11:58 UTC
To: Neil Horman; +Cc: Satoru Moriya, netdev, Seiji Aguchi

On Tue, 21 Jun 2011 06:47:43 -0400, Neil Horman wrote:
> I was thinking you could just trace callers of __sk_mem_schedule, but
> looking at
> it this works as well
> Acked-by: Neil Horman <nhorman@tuxdriver.com>

Hey Neil,

since you acked the patch do you have any plans to migrate dropwatch to
use perf infrastructure and skip the netlink transport? Should be
practicable now. No kernel patch required to run dropwatch ;-)

HGN

* Re: [PATCH 1/2] udp: add tracepoints for queueing skb to rcvbuf
2011-06-21 11:58 ` Hagen Paul Pfeifer
@ 2011-06-21 13:50 ` Neil Horman
2011-06-21 14:48 ` Hagen Paul Pfeifer
0 siblings, 1 reply; 10+ messages in thread
From: Neil Horman @ 2011-06-21 13:50 UTC
To: Hagen Paul Pfeifer; +Cc: Satoru Moriya, netdev, Seiji Aguchi

On Tue, Jun 21, 2011 at 01:58:27PM +0200, Hagen Paul Pfeifer wrote:
> On Tue, 21 Jun 2011 06:47:43 -0400, Neil Horman wrote:
>
> > I was thinking you could just trace callers of __sk_mem_schedule, but
> > looking at
> > it this works as well
> > Acked-by: Neil Horman <nhorman@tuxdriver.com>
>
> Hey Neil,
>
> since you acked the patch do you have any plans to migrate dropwatch to
> use perf infrastructure and skip the netlink transport? Should be
> practicable now. No kernel patch required to run dropwatch ;-)
>
I hadn't really thought about that much, but yes, I suppose I could migrate
dropwatch to export kfree_skb data via perf. Admittedly I don't know much about
the perf api. Do you have any pointers on its use (to save me time in figuring
out how it all works)? If so I'll start looking into it.

Neil

> HGN

* Re: [PATCH 1/2] udp: add tracepoints for queueing skb to rcvbuf
2011-06-21 13:50 ` Neil Horman
@ 2011-06-21 14:48 ` Hagen Paul Pfeifer
2011-06-21 17:14 ` Steven Rostedt
0 siblings, 1 reply; 10+ messages in thread
From: Hagen Paul Pfeifer @ 2011-06-21 14:48 UTC
To: Neil Horman; +Cc: Satoru Moriya, netdev, Seiji Aguchi, Steven Rostedt

On Tue, 21 Jun 2011 09:50:09 -0400, Neil Horman wrote:
> I hadn't really thought about that much, but yes, I suppose I could migrate
> dropwatch to export kfree_skb data via perf. Admittedly I don't know much
> about the perf api. Do you have any pointers on its use (to save me time in
> figuring out how it all works)? If so I'll start looking into it.

http://git.kernel.org/?p=status/powertop/powertop.git;a=tree;f=perf;hb=HEAD

is probably a good starting point. Especially
perf_bundle.cpp:handle_trace_point(). But I am not sure if this is the most
clever way. The direct use of the perf api is somewhat dodgy (not sure if
the ABI will change). IIRC Steven Rostedt wrote about a user space library
(I CC'ed Steven). BUT: tracing via /sys/kernel/debug/tracing/* may be
enough, eventually there is no need for perf at all. Then trace-cmd may
provide some nice ideas how to wrap the /sys/kernel/debug/tracing interface
programmatically.

The idea behind dropwatch is great! There is currently too much
unconsolidated information. It takes a genius to understand where and later
why packets are dropped. A userspace tool where no kernel patch is required
is a big plus! ;-)

Hagen
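
As a rough illustration of the /sys/kernel/debug/tracing route mentioned above, the sketch below switches a single trace event on or off by writing to its "enable" file. It assumes debugfs is mounted at /sys/kernel/debug and that the event appears under events/<system>/<event>/enable, as ftrace events normally do. The helper names are hypothetical, and writing the file usually requires root.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed debugfs mount point; it may differ on a given system. */
#define TRACING_ROOT "/sys/kernel/debug/tracing"

/*
 * Hypothetical helper: build "<root>/events/<system>/<event>/enable".
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int trace_event_enable_path(char *buf, size_t len, const char *root,
				   const char *system, const char *event)
{
	int n;

	assert(buf != NULL && root != NULL && system != NULL && event != NULL);
	n = snprintf(buf, len, "%s/events/%s/%s/enable", root, system, event);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: switch one trace event on (non-zero) or off (0).
 * Returns 0 on success, -1 on any failure (path too long, file missing,
 * no permission, write error).
 */
static int set_trace_event(const char *root, const char *system,
			   const char *event, int on)
{
	char path[256];
	FILE *f;
	int ok;

	if (trace_event_enable_path(path, sizeof(path), root, system, event) != 0)
		return -1;
	f = fopen(path, "w");
	if (f == NULL)
		return -1;
	ok = fputs(on ? "1" : "0", f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

With the patches from this series applied, a call such as set_trace_event(TRACING_ROOT, "udp", "udp_fail_queue_rcv_skb", 1) would enable the new UDP event, after which its records can be read from the trace or trace_pipe file under the same directory.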

* Re: [PATCH 1/2] udp: add tracepoints for queueing skb to rcvbuf
2011-06-21 14:48 ` Hagen Paul Pfeifer
@ 2011-06-21 17:14 ` Steven Rostedt
0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2011-06-21 17:14 UTC
To: Hagen Paul Pfeifer
Cc: Neil Horman, Satoru Moriya, netdev, Seiji Aguchi, Ingo Molnar

On Tue, 2011-06-21 at 16:48 +0200, Hagen Paul Pfeifer wrote:
> On Tue, 21 Jun 2011 09:50:09 -0400, Neil Horman wrote:
>
> > I hadn't really thought about that much, but yes, I suppose I could migrate
> > dropwatch to export kfree_skb data via perf. Admittedly I don't know much
> > about the perf api. Do you have any pointers on its use (to save me time in
> > figuring out how it all works)? If so I'll start looking into it.
>
> http://git.kernel.org/?p=status/powertop/powertop.git;a=tree;f=perf;hb=HEAD

Please please do not copy this code and reuse it. You will end up
forcing us to this ABI forever! Please read this for background:

http://lwn.net/Articles/442113/

I plan on doing a libperf.so that will allow any tool to interact with
trace events in the kernel the proper way. Yes trace-cmd currently has
its own library that deals with this properly, but the library is not
shipped with distros. I plan on taking the trace-cmd library (which perf
even uses - an older version) and make it into the libperf.so that
distros will ship.

But unfortunately, my work has gotten in the way (the work that actually
pays me) and I'm doing other things at the moment.

-- Steve

>
> is probably a good starting point. Especially
> perf_bundle.cpp:handle_trace_point(). But I am not sure if this is the most
> clever way. The direct use of the perf api is somewhat dodgy (not sure if
> the ABI will change). IIRC Steven Rostedt wrote about a user space library
> (I CC'ed Steven). BUT: tracing via /sys/kernel/debug/tracing/* may be
> enough, eventually there is no need for perf at all. Then trace-cmd may
> provide some nice ideas how to wrap the /sys/kernel/debug/tracing interface
> programmatically.
>
> The idea behind dropwatch is great! There is currently too much
> unconsolidated information. It takes a genius to understand where and later
> why packets are dropped. A userspace tool where no kernel patch is required
> is a big plus! ;-)
>
> Hagen

* [PATCH 2/2] core: add tracepoints for queueing skb to rcvbuf
2011-06-17 21:56 [PATCH 0/2] Tracepoints for queueing skb to rcvbuf Satoru Moriya
2011-06-17 21:58 ` [PATCH 1/2] udp: add tracepoints " Satoru Moriya
@ 2011-06-17 22:00 ` Satoru Moriya
2011-06-21 10:48 ` Neil Horman
2011-06-21 23:07 ` [PATCH 0/2] Tracepoints " David Miller
2 siblings, 1 reply; 10+ messages in thread
From: Satoru Moriya @ 2011-06-17 22:00 UTC
To: netdev@vger.kernel.org
Cc: nhorman@tuxdriver.com, davem@davemloft.net,
dle-develop@lists.sourceforge.net, Seiji Aguchi

This patch adds 2 tracepoints to get a status of a socket receive queue
and related parameter.

One tracepoint is added to sock_queue_rcv_skb. It records rcvbuf size
and its usage. The other tracepoint is added to __sk_mem_schedule and
it records limitations of memory for sockets and current usage.

By using these tracepoints we're able to know detailed reason why kernel
drop the packet.

Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
---
 include/trace/events/sock.h |   68 +++++++++++++++++++++++++++++++++++++++++++
 net/core/net-traces.c       |    1 +
 net/core/sock.c             |    5 +++
 3 files changed, 74 insertions(+), 0 deletions(-)
 create mode 100644 include/trace/events/sock.h

diff --git a/include/trace/events/sock.h b/include/trace/events/sock.h
new file mode 100644
index 0000000..779abb9
--- /dev/null
+++ b/include/trace/events/sock.h
@@ -0,0 +1,68 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM sock
+
+#if !defined(_TRACE_SOCK_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_SOCK_H
+
+#include <net/sock.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(sock_rcvqueue_full,
+
+	TP_PROTO(struct sock *sk, struct sk_buff *skb),
+
+	TP_ARGS(sk, skb),
+
+	TP_STRUCT__entry(
+		__field(int, rmem_alloc)
+		__field(unsigned int, truesize)
+		__field(int, sk_rcvbuf)
+	),
+
+	TP_fast_assign(
+		__entry->rmem_alloc = atomic_read(&sk->sk_rmem_alloc);
+		__entry->truesize   = skb->truesize;
+		__entry->sk_rcvbuf  = sk->sk_rcvbuf;
+	),
+
+	TP_printk("rmem_alloc=%d truesize=%u sk_rcvbuf=%d",
+		__entry->rmem_alloc, __entry->truesize, __entry->sk_rcvbuf)
+);
+
+TRACE_EVENT(sock_exceed_buf_limit,
+
+	TP_PROTO(struct sock *sk, struct proto *prot, long allocated),
+
+	TP_ARGS(sk, prot, allocated),
+
+	TP_STRUCT__entry(
+		__array(char, name, 32)
+		__field(long *, sysctl_mem)
+		__field(long, allocated)
+		__field(int, sysctl_rmem)
+		__field(int, rmem_alloc)
+	),
+
+	TP_fast_assign(
+		strncpy(__entry->name, prot->name, 32);
+		__entry->sysctl_mem = prot->sysctl_mem;
+		__entry->allocated = allocated;
+		__entry->sysctl_rmem = prot->sysctl_rmem[0];
+		__entry->rmem_alloc = atomic_read(&sk->sk_rmem_alloc);
+	),
+
+	TP_printk("proto:%s sysctl_mem=%ld,%ld,%ld allocated=%ld "
+		"sysctl_rmem=%d rmem_alloc=%d",
+		__entry->name,
+		__entry->sysctl_mem[0],
+		__entry->sysctl_mem[1],
+		__entry->sysctl_mem[2],
+		__entry->allocated,
+		__entry->sysctl_rmem,
+		__entry->rmem_alloc)
+);
+
+#endif /* _TRACE_SOCK_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/net/core/net-traces.c b/net/core/net-traces.c
index 13aab64..52380b1 100644
--- a/net/core/net-traces.c
+++ b/net/core/net-traces.c
@@ -28,6 +28,7 @@
 #include <trace/events/skb.h>
 #include <trace/events/net.h>
 #include <trace/events/napi.h>
+#include <trace/events/sock.h>
 #include <trace/events/udp.h>
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);
diff --git a/net/core/sock.c b/net/core/sock.c
index 6e81978..76c4031 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -128,6 +128,8 @@
 
 #include <linux/filter.h>
 
+#include <trace/events/sock.h>
+
 #ifdef CONFIG_INET
 #include <net/tcp.h>
 #endif
@@ -292,6 +294,7 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >=
 	    (unsigned)sk->sk_rcvbuf) {
 		atomic_inc(&sk->sk_drops);
+		trace_sock_rcvqueue_full(sk, skb);
 		return -ENOMEM;
 	}
 
@@ -1736,6 +1739,8 @@ suppress_allocation:
 			return 1;
 	}
 
+	trace_sock_exceed_buf_limit(sk, prot, allocated);
+
 	/* Alas. Undo changes. */
 	sk->sk_forward_alloc -= amt * SK_MEM_QUANTUM;
 	atomic_long_sub(amt, prot->memory_allocated);
-- 
1.7.1
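
To show how the sock_rcvqueue_full record might be consumed, here is a userspace sketch that parses the three fields printed by TP_printk and re-evaluates the same comparison that guards the tracepoint in sock_queue_rcv_skb. It is an illustration under assumptions, not part of the patch: the struct and function names are hypothetical, and the sample values in the usage note are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one sock_rcvqueue_full event. */
struct rcvqueue_full_event {
	int rmem_alloc;        /* bytes already charged to the receive queue */
	unsigned int truesize; /* true size of the skb being queued */
	int sk_rcvbuf;         /* receive buffer limit of the socket */
};

/*
 * Find the "rmem_alloc=... truesize=... sk_rcvbuf=..." payload in a trace
 * line and parse it. Returns 0 on success, -1 otherwise.
 */
static int parse_rcvqueue_full(const char *line, struct rcvqueue_full_event *ev)
{
	const char *p;

	assert(line != NULL && ev != NULL);
	p = strstr(line, "rmem_alloc=");
	if (p == NULL)
		return -1;
	return sscanf(p, "rmem_alloc=%d truesize=%u sk_rcvbuf=%d",
		      &ev->rmem_alloc, &ev->truesize, &ev->sk_rcvbuf) == 3 ? 0 : -1;
}

/*
 * Same unsigned comparison as in sock_queue_rcv_skb:
 * rmem_alloc + truesize >= (unsigned)sk_rcvbuf means the skb is dropped.
 */
static int rcvbuf_would_overflow(const struct rcvqueue_full_event *ev)
{
	return (unsigned int)ev->rmem_alloc + ev->truesize >=
	       (unsigned int)ev->sk_rcvbuf;
}

/* How far rmem_alloc + truesize lies above sk_rcvbuf (0 means exactly at it). */
static long rcvbuf_shortfall(const struct rcvqueue_full_event *ev)
{
	return (long)ev->rmem_alloc + (long)ev->truesize - (long)ev->sk_rcvbuf;
}
```

For a made-up record with rmem_alloc=1000, truesize=256 and sk_rcvbuf=1024, rcvbuf_would_overflow() is true and rcvbuf_shortfall() is 232, which hints at how much larger the socket's receive buffer would have needed to be for that packet.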

* Re: [PATCH 2/2] core: add tracepoints for queueing skb to rcvbuf
2011-06-17 22:00 ` [PATCH 2/2] core: " Satoru Moriya
@ 2011-06-21 10:48 ` Neil Horman
0 siblings, 0 replies; 10+ messages in thread
From: Neil Horman @ 2011-06-21 10:48 UTC
To: Satoru Moriya
Cc: netdev@vger.kernel.org, davem@davemloft.net,
dle-develop@lists.sourceforge.net, Seiji Aguchi

On Fri, Jun 17, 2011 at 06:00:03PM -0400, Satoru Moriya wrote:
> This patch adds 2 tracepoints to get a status of a socket receive queue
> and related parameter.
>
> One tracepoint is added to sock_queue_rcv_skb. It records rcvbuf size
> and its usage. The other tracepoint is added to __sk_mem_schedule and
> it records limitations of memory for sockets and current usage.
>
> By using these tracepoints we're able to know detailed reason why kernel
> drop the packet.
>
> Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
> ---
>  include/trace/events/sock.h |   68 +++++++++++++++++++++++++++++++++++++++++++
>  net/core/net-traces.c       |    1 +
>  net/core/sock.c             |    5 +++
>  3 files changed, 74 insertions(+), 0 deletions(-)
>  create mode 100644 include/trace/events/sock.h
>
> diff --git a/include/trace/events/sock.h b/include/trace/events/sock.h
> new file mode 100644
> index 0000000..779abb9
> --- /dev/null
> +++ b/include/trace/events/sock.h
> @@ -0,0 +1,68 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM sock
> +
> +#if !defined(_TRACE_SOCK_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_SOCK_H
> +
> +#include <net/sock.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(sock_rcvqueue_full,
> +
> +	TP_PROTO(struct sock *sk, struct sk_buff *skb),
> +
> +	TP_ARGS(sk, skb),
> +
> +	TP_STRUCT__entry(
> +		__field(int, rmem_alloc)
> +		__field(unsigned int, truesize)
> +		__field(int, sk_rcvbuf)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->rmem_alloc = atomic_read(&sk->sk_rmem_alloc);
> +		__entry->truesize   = skb->truesize;
> +		__entry->sk_rcvbuf  = sk->sk_rcvbuf;
> +	),
> +
> +	TP_printk("rmem_alloc=%d truesize=%u sk_rcvbuf=%d",
> +		__entry->rmem_alloc, __entry->truesize, __entry->sk_rcvbuf)
> +);
> +
> +TRACE_EVENT(sock_exceed_buf_limit,
> +
> +	TP_PROTO(struct sock *sk, struct proto *prot, long allocated),
> +
> +	TP_ARGS(sk, prot, allocated),
> +
> +	TP_STRUCT__entry(
> +		__array(char, name, 32)
> +		__field(long *, sysctl_mem)
> +		__field(long, allocated)
> +		__field(int, sysctl_rmem)
> +		__field(int, rmem_alloc)
> +	),
> +
> +	TP_fast_assign(
> +		strncpy(__entry->name, prot->name, 32);
> +		__entry->sysctl_mem = prot->sysctl_mem;
> +		__entry->allocated = allocated;
> +		__entry->sysctl_rmem = prot->sysctl_rmem[0];
> +		__entry->rmem_alloc = atomic_read(&sk->sk_rmem_alloc);
> +	),
> +
> +	TP_printk("proto:%s sysctl_mem=%ld,%ld,%ld allocated=%ld "
> +		"sysctl_rmem=%d rmem_alloc=%d",
> +		__entry->name,
> +		__entry->sysctl_mem[0],
> +		__entry->sysctl_mem[1],
> +		__entry->sysctl_mem[2],
> +		__entry->allocated,
> +		__entry->sysctl_rmem,
> +		__entry->rmem_alloc)
> +);
> +
> +#endif /* _TRACE_SOCK_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/net/core/net-traces.c b/net/core/net-traces.c
> index 13aab64..52380b1 100644
> --- a/net/core/net-traces.c
> +++ b/net/core/net-traces.c
> @@ -28,6 +28,7 @@
>  #include <trace/events/skb.h>
>  #include <trace/events/net.h>
>  #include <trace/events/napi.h>
> +#include <trace/events/sock.h>
>  #include <trace/events/udp.h>
>
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 6e81978..76c4031 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -128,6 +128,8 @@
>
>  #include <linux/filter.h>
>
> +#include <trace/events/sock.h>
> +
>  #ifdef CONFIG_INET
>  #include <net/tcp.h>
>  #endif
> @@ -292,6 +294,7 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>  	if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >=
>  	    (unsigned)sk->sk_rcvbuf) {
>  		atomic_inc(&sk->sk_drops);
> +		trace_sock_rcvqueue_full(sk, skb);
>  		return -ENOMEM;
>  	}
>
> @@ -1736,6 +1739,8 @@ suppress_allocation:
>  			return 1;
>  	}
>
> +	trace_sock_exceed_buf_limit(sk, prot, allocated);
> +
>  	/* Alas. Undo changes. */
>  	sk->sk_forward_alloc -= amt * SK_MEM_QUANTUM;
>  	atomic_long_sub(amt, prot->memory_allocated);
> --
> 1.7.1
>
Acked-by: Neil Horman <nhorman@tuxdriver.com>

* Re: [PATCH 0/2] Tracepoints for queueing skb to rcvbuf
2011-06-17 21:56 [PATCH 0/2] Tracepoints for queueing skb to rcvbuf Satoru Moriya
2011-06-17 21:58 ` [PATCH 1/2] udp: add tracepoints " Satoru Moriya
2011-06-17 22:00 ` [PATCH 2/2] core: " Satoru Moriya
@ 2011-06-21 23:07 ` David Miller
2 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2011-06-21 23:07 UTC
To: satoru.moriya; +Cc: netdev, nhorman, dle-develop, seiji.aguchi

From: Satoru Moriya <satoru.moriya@hds.com>
Date: Fri, 17 Jun 2011 17:56:55 -0400

> To separate these reasons, this patchset adds 3 tracepoints.
>
> 1st one is added to __udp_queue_rcv_skb to get return value of
> ip_queue_rcv_skb. Analyzing it we can separate above (*) (3 possibilities).
>
> 2nd and 3rd one are to get more detailed information. We can collect status
> of socket receive queue and related parameters(some of them are sysctl knob
> e.g. /proc/sys/net/ipv4/udp_mem, etc. for UDP) and then we can tune kernel
> behavior easily.

Both patches applied, thanks.