* Re: systemtap networking tapsets was: Re: [RFC]: field name identifier conventions
2007-10-22 16:31 systemtap networking tapsets was: Re: [RFC]: field name identifier Arnaldo Carvalho de Melo
@ 2007-10-24 12:11 ` Gerrit Renker
2007-10-24 13:32 ` systemtap networking tapsets was: Re: [RFC]: field name Arnaldo Carvalho de Melo
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Gerrit Renker @ 2007-10-24 12:11 UTC (permalink / raw)
To: dccp
| > | perhaps one that could understand types and then could allow developers
| > | to ask questions like "show me all the places where the field foo of
| > | type bar appears"
| > Hopefully in the next generation of such things may be possible? I was
^
|
Indeed! Did you notice the missing word + ... I meant to write `dwarves' :)
and wrote `next generation' since there are apparently already 7 in this generation.
| Ah, I'm working on some systemtap tapsets, i.e. libraries of probe
| routines, for networking, starting with TCP, but organized in a way
| that can be easily used with DCCP and other net protocols too.
If you could give a shout on the mailing list once it is ready for testing/deployment,
that would be good. Last year you had a nice tool which automatically inserted kprobes
at entry/exit points, it was apparently meant to replace an older tool. I tried it a
few times but then lost track of the revisions. It is frustrating to test stuff which is
in the middle of a migration to something else.
The output looks great and once that is ready, I think it can be of much help to answer
long pending questions of e.g. how well the packet scheduler really works.
| And will probably convert net/dccp/dccpprobe.c and tcpprobe to be
| just systemtap scripts and not part of the build process, etc.
I think that dccpprobe.c is the wrong name ... it should really be called ccid3_probe.c ...
I have been working on printing entries for CCID2, since in ccid2.c there is no probe support,
and instead ccid2_pr_debug is used for the same purpose all over the place.
* Re: systemtap networking tapsets was: Re: [RFC]: field name
From: Arnaldo Carvalho de Melo @ 2007-10-24 13:32 UTC (permalink / raw)
To: dccp
On Wed, Oct 24, 2007 at 01:11:18PM +0100, Gerrit Renker wrote:
> | > | perhaps one that could understand types and then could allow developers
> | > | to ask questions like "show me all the places where the field foo of
> | > | type bar appears"
> | > Hopefully in the next generation of such things may be possible? I was
> ^
> |
> Indeed! Did you notice the missing word + ... I meant to write `dwarves' :)
> and wrote `next generation' since there are apparently already 7 in this generation.
:)
> | Ah, I'm working on some systemtap tapsets, i.e. libraries of probe
> | routines, for networking, starting with TCP, but organized in a way
> | that can be easily used with DCCP and other net protocols too.
> If you could give a shout on the mailing list once it is ready for testing/deployment,
> that would be good. Last year you had a nice tool which automatically inserted kprobes
> at entry/exit points, it was apparently meant to replace an older tool. I tried it a
> few times but then lost track of the revisions. It is frustrating to test stuff which is
> in the middle of a migration to something else.
I will. What you are talking about is ctracer, which generates kprobes at
function entry/exit. I'll move to a third revision that generates
systemtap scripts instead of kprobes, leveraging the systemtap safety
nets.
Yesterday I stopped using _stp_gettimeofday_ns() for the timestamp and
switched to get_cycles_sync(). There was no performance drop when
using lnlat.stp (the local network latency measurement tool), so it
indeed looks promising.
> The output looks great and once that is ready, I think it can be of much help to answer
> long pending questions of e.g. how well the packet scheduler really works.
Exactly, I want to have a clear picture of where packets sit, and the
packet scheduler will be one of the next tapsets I'll be working on.
> | And will probably convert net/dccp/dccpprobe.c and tcpprobe to be
> | just systemtap scripts and not part of the build process, etc.
> I think that dccpprobe.c is the wrong name ... it should really be called ccid3_probe.c ...
> I have been working on printing entries for CCID2, since in ccid2.c there is no probe support,
> and instead ccid2_pr_debug is used for the same purpose all over the place.
Indeed, lemme try converting it right now...
- Arnaldo
* Re: systemtap networking tapsets was: Re: [RFC]: field name identifier conventions
From: Gerrit Renker @ 2007-10-24 13:35 UTC (permalink / raw)
To: dccp
Quoting Arnaldo Carvalho de Melo:
| > I think that dccpprobe.c is the wrong name ... it should really be called ccid3_probe.c ...
| > I have been working on printing entries for CCID2, since in ccid2.c there is no probe support,
| > and instead ccid2_pr_debug is used for the same purpose all over the place.
|
| Indeed, lemme try converting it right now...
|
This is how far I came (only for info, applies only on test tree):
--- a/net/dccp/probe.c
+++ b/net/dccp/probe.c
@@ -34,6 +34,7 @@
XXX enough for this week, remove to re-enable
#include "dccp.h"
#include "ccid.h"
+#include "ccids/ccid2.h"
#include "ccids/ccid3.h"
static int port;
@@ -80,22 +81,37 @@ static int jdccp_sendmsg(struct kiocb *i
struct msghdr *msg, size_t size)
{
const struct inet_sock *inet = inet_sk(sk);
- struct ccid3_hc_tx_sock *hctx = NULL;
+ struct ccid2_hc_tx_sock *tx2 = NULL;
+ struct ccid3_hc_tx_sock *tx3 = NULL;
- if (ccid_get_current_id(dccp_sk(sk), false) == DCCPC_CCID3)
- hctx = ccid3_hc_tx_sk(sk);
+ switch (ccid_get_current_id(dccp_sk(sk), false)) {
+ case DCCPC_CCID2:
+ tx2 = ccid2_hc_tx_sk(sk);
+ break;
+ case DCCPC_CCID3:
+ tx3 = ccid3_hc_tx_sk(sk);
+ }
if (port == 0 || ntohs(inet->dport) == port ||
ntohs(inet->sport) == port) {
- if (hctx)
+ if (tx3)
printl("%d.%d.%d.%d:%u %d.%d.%d.%d:%u %d %d %d %d %u "
"%llu %llu %d\n",
NIPQUAD(inet->saddr), ntohs(inet->sport),
NIPQUAD(inet->daddr), ntohs(inet->dport), size,
- hctx->ccid3hctx_s, hctx->ccid3hctx_rtt,
- hctx->ccid3hctx_p, hctx->ccid3hctx_x_calc,
- hctx->ccid3hctx_x_recv >> 6,
- hctx->ccid3hctx_x >> 6, hctx->ccid3hctx_t_ipi);
+ tx3->ccid3hctx_s, tx3->ccid3hctx_rtt,
+ tx3->ccid3hctx_p, tx3->ccid3hctx_x_calc,
+ tx3->ccid3hctx_x_recv >> 6,
+ tx3->ccid3hctx_x >> 6, tx3->ccid3hctx_t_ipi);
+ else if (tx2)
+ printl("%d.%d.%d.%d:%u %d.%d.%d.%d:%u %d %d %d %u %u %u\n",
+ NIPQUAD(inet->saddr), ntohs(inet->sport),
+ NIPQUAD(inet->daddr), ntohs(inet->dport), size,
+ tx2->ccid2hctx_srtt>>3,
+ tx2->ccid2hctx_rttvar>>2,
+ tx2->ccid2hctx_pipe,
+ tx2->ccid2hctx_cwnd,
+ tx2->ccid2hctx_ssthresh);
else
printl("%d.%d.%d.%d:%u %d.%d.%d.%d:%u %d\n",
NIPQUAD(inet->saddr), ntohs(inet->sport),
--- a/net/dccp/ccids/ccid2.c
+++ b/net/dccp/ccids/ccid2.c
@@ -129,9 +129,6 @@ static int ccid2_hc_tx_send_packet(struc
{
struct ccid2_hc_tx_sock *hctx = ccid2_hc_tx_sk(sk);
- ccid2_pr_debug("pipe=%d cwnd=%d\n", hctx->ccid2hctx_pipe,
- hctx->ccid2hctx_cwnd);
-
if (hctx->ccid2hctx_pipe < hctx->ccid2hctx_cwnd)
return 0;
@@ -259,9 +256,6 @@ static void ccid2_hc_tx_packet_sent(stru
}
hctx->ccid2hctx_seqh = next;
- ccid2_pr_debug("cwnd=%d pipe=%d\n", hctx->ccid2hctx_cwnd,
- hctx->ccid2hctx_pipe);
-
/*
* FIXME: The code below is broken and the variables have been removed
* from the socket struct. The `ackloss' variable was always set to 0,
@@ -316,7 +310,6 @@ static void ccid2_hc_tx_packet_sent(stru
ccid2_start_rto_timer(sk);
#ifdef CONFIG_IP_DCCP_CCID2_DEBUG
- ccid2_pr_debug("pipe=%d\n", hctx->ccid2hctx_pipe);
ccid2_pr_debug("Sent: seq=%llu\n", (unsigned long long)dp->dccps_gss);
do {
struct ccid2_seq *seqp = hctx->ccid2hctx_seqt;
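
For readers following the patch without a test tree, the dispatch it adds to jdccp_sendmsg() can be sketched in user space. This is only an illustration in Python, not kernel code: the function name `probe_line`, the dict keys, and the pre-formatted `endpoints` string are made up for the example. The port filter and the shifts (`>> 3`, `>> 2`, `>> 6`) are copied from the patch above.

```python
from typing import Optional

DCCPC_CCID2 = 2  # CCID identifiers as used by DCCP
DCCPC_CCID3 = 3


def probe_line(endpoints: str, size: int, ccid: int, tx: dict,
               sport: int, dport: int, port: int = 0) -> Optional[str]:
    """Return the line the probe would log, or None if it is filtered out.

    `endpoints` stands in for the "saddr:sport daddr:dport" prefix and
    `tx` for the per-CCID sender socket fields (hypothetical dict keys).
    """
    # port == 0 disables filtering; otherwise either end must match.
    if not (port == 0 or dport == port or sport == port):
        return None
    if ccid == DCCPC_CCID3:
        fields = [tx["s"], tx["rtt"], tx["p"], tx["x_calc"],
                  tx["x_recv"] >> 6, tx["x"] >> 6, tx["t_ipi"]]
    elif ccid == DCCPC_CCID2:
        fields = [tx["srtt"] >> 3, tx["rttvar"] >> 2,
                  tx["pipe"], tx["cwnd"], tx["ssthresh"]]
    else:
        # Unknown CCID: only the addresses and the size are logged.
        fields = []
    return " ".join([endpoints, str(size)] + [str(f) for f in fields])
```

Keeping one branch per CCID means a further CCID only needs another case and another format, which is the same shape the switch statement in the patch has.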
* Re: systemtap networking tapsets was: Re: [RFC]: field name
From: Arnaldo Carvalho de Melo @ 2007-10-24 15:43 UTC (permalink / raw)
To: dccp
On Wed, Oct 24, 2007 at 02:35:42PM +0100, Gerrit Renker wrote:
> Quoting Arnaldo Carvalho de Melo:
> | > I think that dccpprobe.c is the wrong name ... it should really be called ccid3_probe.c ...
> | > I have been working on printing entries for CCID2, since in ccid2.c there is no probe support,
> | > and instead ccid2_pr_debug is used for the same purpose all over the place.
> |
> | Indeed, lemme try converting it right now...
> |
> This is how far I came (only for info, applies only on test tree):
Cool, but what I meant to say was that I was going to try converting
dccpprobe into a systemtap script, and this is how far I got. Forget
about x_calc for now, something is fishy there, but take a look at
dccp_ccid3_probe.stp and its output. The full sources for the tapsets
used (for things like ccid3_hc_tx_sk_rtt(), etc) are at:
http://oops.ghostprotocols.net:81/acme/nettaps.tar.bz2
Still a bit hackish, as we don't expose the ccids header files in
include/net/ for systemtap to use and I tried this on a kernel packaged
as an RPM (kernel-debuginfo, etc), perhaps something to consider
changing. But I used include/linux/tfrc.h, that is exposed and had what
I want.
Ah, and this was over gigabit ethernet. Over wi-fi it slows to a crawl
after a while; I have to check with your experimental tree, as you have
probably already fixed this issue.
[root@mica nettaps]# cat dccp_ccid3_probe.stp
#!/usr/bin/stap
global filter_dport = 5001
global rtts
global x_calcs
global ipis
probe dccp_user_out = module("dccp").function("dccp_sendmsg")
{
        dport = inet_sk_dport($sk)
        x_calc = ccid3_hc_tx_sk_x_calc($sk)
        rtt = ccid3_hc_tx_sk_rtt($sk)
        ipi = ccid3_hc_tx_sk_ipi($sk)
}
probe dccp_user_out
{
        if (dport != filter_dport)
                next
        rtts <<< rtt
        x_calcs <<< x_calc
        ipis <<< ipi
}
probe end
{
        printf("rtt: count: %d, min: %d, max: %d, avg: %d\n",
               @count(rtts), @min(rtts), @max(rtts), @avg(rtts))
        print(@hist_linear(rtts, 0, 1000, 20))
        printf("x_calc: count: %d, min: %d, max: %d, avg: %d\n",
               @count(x_calcs), @min(x_calcs), @max(x_calcs), @avg(x_calcs))
        print(@hist_linear(x_calcs, 0, 100000, 2000))
        printf("ipi: count: %d, min: %d, max: %d, avg: %d\n",
               @count(ipis), @min(ipis), @max(ipis), @avg(ipis))
        print(@hist_linear(ipis, 0, 600, 10))
}
[root@mica nettaps]#
[root@mica nettaps]# stap -I tapset/ dccp_ccid3_probe.stp # to finish it press control+C
rtt: count: 38746, min: 0, max: 390, avg: 286
value |-------------------------------------------------- count
0 | 1
20 | 0
40 | 0
60 | 0
80 | 0
100 | 0
120 | 0
140 | 0
160 | 17
180 | 19
200 | 21
220 |@ 248
240 |@@@@@@@@@@@@@@@@@@@@@@@@ 5510
260 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 10143
280 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 11290
300 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 9095
320 |@@@@@@ 1476
340 |@@ 463
360 |@ 370
380 | 93
400 | 0
420 | 0
x_calc: count: 38746, min: 0, max: 0, avg: 0
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 38746
2000 | 0
4000 | 0
ipi: count: 38746, min: 0, max: 7473, avg: 162
value |-------------------------------------------------- count
0 | 1
10 |@@@@ 654
20 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 7255
30 |@@@@@@@@@@@@@@@@@@@ 2876
40 |@@@@@@@@@@@@@@@@@@@@@@ 3232
50 |@@@@@@@@@@@@@@@@@@@@@@@@@ 3678
60 |@@@@@@@@@@@@@@@@@@@ 2775
70 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 4188
80 |@@@@@@@@@@@@@@ 2186
90 |@@@@@@@@@@@@@ 1986
100 |@@@@@@ 940
110 |@@@@@ 813
120 |@@@@@@@@@@ 1579
130 |@@@@@ 738
140 |@ 226
150 |@@@@@@@@ 1235
160 |@@ 399
170 |@@@ 452
180 | 0
190 |@ 289
200 | 0
210 | 116
220 | 0
230 |@@ 311
240 | 0
250 |@@ 414
260 | 0
270 | 119
280 | 0
290 | 0
300 |@@@@@@@@ 1184
310 |@@ 412
320 | 0
330 | 0
340 | 0
350 | 0
360 | 0
370 | 0
380 | 0
390 | 0
400 | 0
410 | 0
420 | 0
430 | 0
440 | 0
450 | 0
460 | 0
470 | 0
480 | 0
490 | 0
500 | 0
510 | 0
520 | 0
530 | 0
540 | 0
550 | 0
560 | 0
570 | 0
580 | 0
590 |@@@@ 688
[root@mica nettaps]#
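
As a reading aid for the output above, here is a rough Python model of what the summary lines and the linear histograms compute. It is a sketch under assumptions, not systemtap's implementation: the function names are made up, and the clamping of out-of-range samples into the first or last bucket is inferred from the ipi output (max 7473, yet the top bucket shown is 590), not taken from the systemtap sources.

```python
def hist_linear(values, low, high, width):
    """Count values into buckets of `width` covering low..high.

    Assumption: samples outside the range are clamped into the first
    or last bucket rather than dropped.
    """
    nbuckets = (high - low) // width
    counts = [0] * nbuckets
    for v in values:
        idx = (v - low) // width
        idx = max(0, min(idx, nbuckets - 1))  # clamp out-of-range samples
        counts[idx] += 1
    # Key each bucket by its lower bound, as in the "value" column.
    return {low + i * width: c for i, c in enumerate(counts)}


def summary(values):
    """Return (count, min, max, avg), with an integer average."""
    return len(values), min(values), max(values), sum(values) // len(values)
```

With `hist_linear(samples, 0, 600, 10)` this yields 60 buckets keyed 0 through 590, matching the rows printed for ipi.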
* Re: systemtap networking tapsets was: Re: [RFC]: field name identifier conventions
From: Gerrit Renker @ 2007-10-25 14:03 UTC (permalink / raw)
To: dccp
| The full sources for the tapsets used (for things like ccid3_hc_tx_sk_rtt(), etc) are at:
|
| http://oops.ghostprotocols.net:81/acme/nettaps.tar.bz2
|
This is awesome. Is all that is needed to run these a new systemtap binary? Mine is 0.5.9
and too old for that, and I need to do some rtfm in the stap manuals.
It would be great to upload a few standard scripts and continually use them for regression
tests - it seems the stuff can again be wrapped into bash, python, etc, so that some test
runs could be automated.
| Still a bit hackish, as we don't expose the ccids header files in
| include/net/ for systemtap to use and I tried this on a kernel packaged
| as an RPM (kernel-debuginfo, etc), perhaps something to consider
| changing.
When doing this for CCID2, the same problem arises. Maybe the struct_tfrc can be converted
into a generic (lean) info structure (suitable for multiple CCIDs), afaik DCCP is the only
user of include/linux/dccp.h, while the real TFRC work is done in net/dccp/ccids/lib.
* Re: systemtap networking tapsets was: Re: [RFC]: field name
From: Arnaldo Carvalho de Melo @ 2007-10-25 17:13 UTC (permalink / raw)
To: dccp
On Thu, Oct 25, 2007 at 03:03:55PM +0100, Gerrit Renker wrote:
> | The full sources for the tapsets used (for things like ccid3_hc_tx_sk_rtt(), etc) are at:
> |
> | http://oops.ghostprotocols.net:81/acme/nettaps.tar.bz2
> |
> This is awesome. Is all that is needed to run these a new systemtap binary? Mine is 0.5.9
> and too old for that, and I need to do some rtfm in the stap manuals.
Here I have 0.5.13, so you are not that far off:
stap_fa52dacd83be5dbb4b8bcbb890f82bba_4257: systemtap: 0.5.13, base: ffffffff88365000, memory: 68795+40885+53408+164320+31157464 data+text+ctx+io+glob, probes: 4
> It would be great to upload a few standard scripts and continually use them for regression
> tests - it seems the stuff can again be wrapped into bash, python, etc, so that some test
> runs could be automated.
Indeed, having some scripts and using them for regression testing seems
like an excellent idea. We can use systemtap to mangle a packet on its way
to some of our routines to check whether they handle some specific
failures, etc. Ultimately we could end up with a nice set of scripts
that would test each and every condition we have to handle. Coverage
analysis could then be done using lcov; that would really be nirvana 8)
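
One way such a wrapper could start, sketched in Python: parse the summary lines that dccp_ccid3_probe.stp prints and turn them into pass/fail checks. Everything here is hypothetical (the function names, the choice of check); the only thing taken from this thread is the "name: count: N, min: N, max: N, avg: N" line format. Actually running stap and capturing its output is left out.

```python
import re

# Matches lines such as "rtt: count: 38746, min: 0, max: 390, avg: 286".
SUMMARY_RE = re.compile(
    r"^(?P<name>\w+): count: (?P<count>\d+), min: (?P<min>\d+), "
    r"max: (?P<max>\d+), avg: (?P<avg>\d+)$")


def parse_summaries(output: str) -> dict:
    """Map each statistic name to its count/min/max/avg as integers."""
    stats = {}
    for line in output.splitlines():
        m = SUMMARY_RE.match(line.strip())
        if m:  # histogram rows and shell prompts simply do not match
            stats[m["name"]] = {k: int(m[k])
                                for k in ("count", "min", "max", "avg")}
    return stats


def check_x_calc_nonzero(stats: dict) -> bool:
    """Hypothetical regression check: x_calc should not be stuck at 0."""
    return stats["x_calc"]["max"] > 0
```

Run against the output posted earlier in this thread, the x_calc check would fail, which is the kind of thing a regression run should flag.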
> | Still a bit hackish, as we don't expose the ccids header files in
> | include/net/ for systemtap to use and I tried this on a kernel packaged
> | as an RPM (kernel-debuginfo, etc), perhaps something to consider
> | changing.
> When doing this for CCID2, the same problem arises. Maybe the struct_tfrc can be converted
> into a generic (lean) info structure (suitable for multiple CCIDs), afaik DCCP is the only
> user of include/linux/dccp.h, while the real TFRC work is done in net/dccp/ccids/lib.
Yeah, making things easier for systemtap is part of my plans for the dccp
file layout. I'll even advocate for these scripts to be shipped with
the kernel sources.
Ok, back to TCP misuse investigations for a living, hope to have time to
merge some more patches later today. :-)
- Arnaldo