* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 [not found] <1102380430.6103.6.camel@buffy> @ 2004-12-07 6:35 ` Patrick McHardy 2004-12-07 6:44 ` Andrew Morton 1 sibling, 0 replies; 24+ messages in thread From: Patrick McHardy @ 2004-12-07 6:35 UTC (permalink / raw) To: Thomas Cataldo; +Cc: linux-kernel, netdev Thomas Cataldo wrote: >Hi, > >Tonight I upgraded to 2.6.10-rc3. Everything was fine until I started >wondershaper to setup my Qos rules : > >wondershaper eth0 255 16 > >And the machine freezed hard. No magic sysrq working, no oops in my >logs. > > Please try to find out which line causes the lockup. Are you using the same config options as with 2.6.9 ? Regards Patrick ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 [not found] <1102380430.6103.6.camel@buffy> 2004-12-07 6:35 ` Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Patrick McHardy @ 2004-12-07 6:44 ` Andrew Morton 2004-12-07 12:29 ` jamal 1 sibling, 1 reply; 24+ messages in thread From: Andrew Morton @ 2004-12-07 6:44 UTC (permalink / raw) To: Thomas Cataldo; +Cc: linux-kernel, netdev, tgraf, David S. Miller Thomas Cataldo <tomc@compaqnet.fr> wrote: > > Tonight I upgraded to 2.6.10-rc3. Everything was fine until I started > wondershaper to setup my Qos rules : > > wondershaper eth0 255 16 > > And the machine freezed hard. No magic sysrq working, no oops in my > logs. > > The computer is an x86 smp (dual p3) > > wondershaper was working fine with 2.6.9. Me too, with your .config: Using http://lartc.org/wondershaper/wondershaper-1.1a.tar.gz vmm:/home/akpm/wondershaper-1.1a# ./wshaper eth0 255 16 u32 classifier Perfomance counters on input device check on Actions configured Unable to handle kernel NULL pointer dereference at virtual address 00000000 printing eip: c0290b58 *pde = 00000000 Oops: 0002 [#1] SMP Modules linked in: police sch_ingress cls_u32 sch_sfq sch_cbq usbcore CPU: 1 EIP: 0060:[<c0290b58>] Not tainted VLI EFLAGS: 00010206 (2.6.10-rc3) EIP is at _spin_lock_bh+0x10/0x20 eax: cf690000 ebx: ce0d4d80 ecx: 00000008 edx: 00000000 esi: cf691ba0 edi: 00000000 ebp: 00000000 esp: cf691b48 ds: 007b es: 007b ss: 0068 Process tc (pid: 2743, threadinfo=cf690000 task=cf07b520) Stack: c02372ab ce0d4d80 cd4d8800 cebd1c40 ce0d4d80 c0237346 ce0d4d80 00000008 00000000 00000000 00000000 cf691ba0 cf691ba0 c02478d6 ce0d4d80 00000008 00000000 cf691ba0 ce0d4d80 cf6a6070 30960094 cec44380 00000000 00019000 Call Trace: [<c02372ab>] gnet_stats_start_copy_compat+0x1b/0x98 [<c0237346>] gnet_stats_start_copy+0x1e/0x24 [<c02478d6>] tcf_action_copy_stats+0x26/0xa0 [<c02473e2>] tcf_action_dump_old+0x36/0x3c [<d0807174>] u32_dump+0x2c8/0x344 [cls_u32] [<d08071a6>] u32_dump+0x2fa/0x344 [cls_u32] [<c0246cf9>] tcf_fill_node+0x11d/0x170 [<c0246d9c>] tfilter_notify+0x50/0xa0 [<c0246bae>] tc_ctl_tfilter+0x542/0x570 [<c0240705>] rtnetlink_rcv+0x23d/0x360 [<c0249f08>] netlink_data_ready+0x1c/0x54 [<c02495c5>] netlink_sendskb+0x21/0x40 [<c024970b>] netlink_unicast+0xe3/0xec [<c0249d04>] netlink_sendmsg+0x27c/0x28c [<c02306d9>] sock_sendmsg+0xd5/0xf8 [<c02306d9>] sock_sendmsg+0xd5/0xf8 [<c01c0a70>] copy_from_user+0x30/0x60 [<c01c0a70>] copy_from_user+0x30/0x60 [<c0127cfc>] autoremove_wake_function+0x0/0x40 [<c0231db7>] sys_sendmsg+0x18f/0x1f4 [<c013c068>] handle_mm_fault+0x80/0x11c [<c010fc3b>] do_page_fault+0x1a3/0x554 [<c01c0a70>] copy_from_user+0x30/0x60 [<c02321bc>] sys_socketcall+0x1d8/0x1f4 [<c01021cd>] sysenter_past_esp+0x52/0x71 Code: 3a 00 7e f9 fa eb e9 c3 8d 76 00 fa f0 fe 08 79 09 f3 90 80 38 00 7e f9 eb f2 c3 89 c2 b8 00 e0 ff ff 21 e0 81 <0>Kernel panic - not syncing: Fatal exception in interrupt Somehow I don't think this is because "Performance" was misspelled ;) tcf_act_hdr.stats_lock is NULL in tcf_action_copy_stats() ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-07 6:44 ` Andrew Morton @ 2004-12-07 12:29 ` jamal 2004-12-07 16:59 ` Patrick McHardy 0 siblings, 1 reply; 24+ messages in thread From: jamal @ 2004-12-07 12:29 UTC (permalink / raw) To: Andrew Morton Cc: Thomas Cataldo, linux-kernel, netdev, tgraf, David S. Miller Can you do a: tc -V This seems to point to probably be a backward compat issue which was overlooked in the stats. cheers, jamal On Tue, 2004-12-07 at 01:44, Andrew Morton wrote: > Thomas Cataldo <tomc@compaqnet.fr> wrote: > > > > Tonight I upgraded to 2.6.10-rc3. Everything was fine until I started > > wondershaper to setup my Qos rules : > > > > wondershaper eth0 255 16 > > > > And the machine freezed hard. No magic sysrq working, no oops in my > > logs. > > > > The computer is an x86 smp (dual p3) > > > > wondershaper was working fine with 2.6.9. > > Me too, with your .config: > > Using http://lartc.org/wondershaper/wondershaper-1.1a.tar.gz > > vmm:/home/akpm/wondershaper-1.1a# ./wshaper eth0 255 16 > > > u32 classifier > Perfomance counters on > input device check on > Actions configured > Unable to handle kernel NULL pointer dereference at virtual address 00000000 > printing eip: > c0290b58 > *pde = 00000000 > Oops: 0002 [#1] > SMP > Modules linked in: police sch_ingress cls_u32 sch_sfq sch_cbq usbcore > CPU: 1 > EIP: 0060:[<c0290b58>] Not tainted VLI > EFLAGS: 00010206 (2.6.10-rc3) > EIP is at _spin_lock_bh+0x10/0x20 > eax: cf690000 ebx: ce0d4d80 ecx: 00000008 edx: 00000000 > esi: cf691ba0 edi: 00000000 ebp: 00000000 esp: cf691b48 > ds: 007b es: 007b ss: 0068 > Process tc (pid: 2743, threadinfo=cf690000 task=cf07b520) > Stack: c02372ab ce0d4d80 cd4d8800 cebd1c40 ce0d4d80 c0237346 ce0d4d80 00000008 > 00000000 00000000 00000000 cf691ba0 cf691ba0 c02478d6 ce0d4d80 00000008 > 00000000 cf691ba0 ce0d4d80 cf6a6070 30960094 cec44380 00000000 00019000 > Call Trace: > [<c02372ab>] gnet_stats_start_copy_compat+0x1b/0x98 > [<c0237346>] gnet_stats_start_copy+0x1e/0x24 > [<c02478d6>] tcf_action_copy_stats+0x26/0xa0 > [<c02473e2>] tcf_action_dump_old+0x36/0x3c > [<d0807174>] u32_dump+0x2c8/0x344 [cls_u32] > [<d08071a6>] u32_dump+0x2fa/0x344 [cls_u32] > [<c0246cf9>] tcf_fill_node+0x11d/0x170 > [<c0246d9c>] tfilter_notify+0x50/0xa0 > [<c0246bae>] tc_ctl_tfilter+0x542/0x570 > [<c0240705>] rtnetlink_rcv+0x23d/0x360 > [<c0249f08>] netlink_data_ready+0x1c/0x54 > [<c02495c5>] netlink_sendskb+0x21/0x40 > [<c024970b>] netlink_unicast+0xe3/0xec > [<c0249d04>] netlink_sendmsg+0x27c/0x28c > [<c02306d9>] sock_sendmsg+0xd5/0xf8 > [<c02306d9>] sock_sendmsg+0xd5/0xf8 > [<c01c0a70>] copy_from_user+0x30/0x60 > [<c01c0a70>] copy_from_user+0x30/0x60 > [<c0127cfc>] autoremove_wake_function+0x0/0x40 > [<c0231db7>] sys_sendmsg+0x18f/0x1f4 > [<c013c068>] handle_mm_fault+0x80/0x11c > [<c010fc3b>] do_page_fault+0x1a3/0x554 > [<c01c0a70>] copy_from_user+0x30/0x60 > [<c02321bc>] sys_socketcall+0x1d8/0x1f4 > [<c01021cd>] sysenter_past_esp+0x52/0x71 > Code: 3a 00 7e f9 fa eb e9 c3 8d 76 00 fa f0 fe 08 79 09 f3 90 80 38 00 7e f9 eb f2 c3 89 c2 b8 00 e0 ff ff 21 e0 81 > <0>Kernel panic - not syncing: Fatal exception in interrupt > Somehow I don't think this is because "Performance" was misspelled ;) > > tcf_act_hdr.stats_lock is NULL in tcf_action_copy_stats() > > > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-07 12:29 ` jamal @ 2004-12-07 16:59 ` Patrick McHardy 2004-12-07 17:07 ` Thomas Graf 0 siblings, 1 reply; 24+ messages in thread From: Patrick McHardy @ 2004-12-07 16:59 UTC (permalink / raw) To: hadi Cc: Andrew Morton, Thomas Cataldo, linux-kernel, netdev, tgraf, David S. Miller [-- Attachment #1: Type: text/plain, Size: 435 bytes --] jamal wrote: >Can you do a: >tc -V > >This seems to point to probably be a backward compat issue which was >overlooked in the stats. > That's also what I thought at first. But the problem is in tcf_action_copy_stats, it assumes a->priv has the same layout as struct tcf_act_hdr, which is not true for struct tcf_police. This patch rearranges struct tcf_police to match tcf_act_hdr. Signed-off-by: Patrick McHardy <kaber@trash.net> [-- Attachment #2: x --] [-- Type: text/plain, Size: 1479 bytes --] ===== include/net/act_api.h 1.4 vs edited ===== --- 1.4/include/net/act_api.h 2004-11-06 01:33:12 +01:00 +++ edited/include/net/act_api.h 2004-12-07 17:53:37 +01:00 @@ -8,15 +8,23 @@ #include <net/sch_generic.h> #include <net/pkt_sched.h> +#define tca_gen(name) \ +struct tcf_##name *next; \ + u32 index; \ + int refcnt; \ + int bindcnt; \ + u32 capab; \ + int action; \ + struct tcf_t tm; \ + struct gnet_stats_basic bstats; \ + struct gnet_stats_queue qstats; \ + struct gnet_stats_rate_est rate_est; \ + spinlock_t *stats_lock; \ + spinlock_t lock + struct tcf_police { - struct tcf_police *next; - int refcnt; -#ifdef CONFIG_NET_CLS_ACT - int bindcnt; -#endif - u32 index; - int action; + tca_gen(police); int result; u32 ewma_rate; u32 burst; @@ -24,33 +32,14 @@ u32 toks; u32 ptoks; psched_time_t t_c; - spinlock_t lock; struct qdisc_rate_table *R_tab; struct qdisc_rate_table *P_tab; - - struct gnet_stats_basic bstats; - struct gnet_stats_queue qstats; - struct gnet_stats_rate_est rate_est; - spinlock_t *stats_lock; }; #ifdef CONFIG_NET_CLS_ACT #define ACT_P_CREATED 1 #define ACT_P_DELETED 1 -#define tca_gen(name) \ -struct tcf_##name *next; \ - u32 index; \ - int refcnt; \ - int bindcnt; \ - u32 capab; \ - int action; \ - struct tcf_t tm; \ - struct gnet_stats_basic bstats; \ - struct gnet_stats_queue qstats; \ - struct gnet_stats_rate_est rate_est; \ - spinlock_t *stats_lock; \ - spinlock_t lock struct tcf_act_hdr { ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-07 16:59 ` Patrick McHardy @ 2004-12-07 17:07 ` Thomas Graf 2004-12-07 17:23 ` Patrick McHardy 0 siblings, 1 reply; 24+ messages in thread From: Thomas Graf @ 2004-12-07 17:07 UTC (permalink / raw) To: Patrick McHardy Cc: hadi, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller * Patrick McHardy <41B5E188.5050800@trash.net> 2004-12-07 17:59 > jamal wrote: > > >Can you do a: > >tc -V > > > >This seems to point to probably be a backward compat issue which was > >overlooked in the stats. > > > That's also what I thought at first. But the problem is in > tcf_action_copy_stats, it assumes a->priv has the same layout as > struct tcf_act_hdr, which is not true for struct tcf_police. This > patch rearranges struct tcf_police to match tcf_act_hdr. Hehe, see my post a few minutes back. I came up with nearly the same solution ;-> The only difference to my patch is that I don't touch tcf_police if the action code isn't compiled. --- linux-2.6.10-rc2-bk13.orig/include/net/act_api.h 2004-11-30 14:01:11.000000000 +0100 +++ linux-2.6.10-rc2-bk13/include/net/act_api.h 2004-12-07 17:49:50.000000000 +0100 @@ -8,15 +8,42 @@ #include <net/sch_generic.h> #include <net/pkt_sched.h> +#ifdef CONFIG_NET_CLS_ACT + +#define ACT_P_CREATED 1 +#define ACT_P_DELETED 1 +#define tca_gen(name) \ +struct tcf_##name *next; \ + u32 index; \ + int refcnt; \ + int bindcnt; \ + u32 capab; \ + int action; \ + struct tcf_t tm; \ + struct gnet_stats_basic bstats; \ + struct gnet_stats_queue qstats; \ + struct gnet_stats_rate_est rate_est; \ + spinlock_t *stats_lock; \ + spinlock_t lock + +#endif + struct tcf_police { +#ifdef CONFIG_NET_CLS_ACT + tca_gen(police); +#else struct tcf_police *next; int refcnt; -#ifdef CONFIG_NET_CLS_ACT - int bindcnt; -#endif u32 index; int action; + spinlock_t lock; + struct gnet_stats_basic bstats; + struct gnet_stats_queue qstats; + struct gnet_stats_rate_est rate_est; + spinlock_t *stats_lock; +#endif + int result; u32 ewma_rate; u32 burst; @@ -24,34 +51,12 @@ u32 toks; u32 ptoks; psched_time_t t_c; - spinlock_t lock; struct qdisc_rate_table *R_tab; struct qdisc_rate_table *P_tab; - - struct gnet_stats_basic bstats; - struct gnet_stats_queue qstats; - struct gnet_stats_rate_est rate_est; - spinlock_t *stats_lock; }; #ifdef CONFIG_NET_CLS_ACT -#define ACT_P_CREATED 1 -#define ACT_P_DELETED 1 -#define tca_gen(name) \ -struct tcf_##name *next; \ - u32 index; \ - int refcnt; \ - int bindcnt; \ - u32 capab; \ - int action; \ - struct tcf_t tm; \ - struct gnet_stats_basic bstats; \ - struct gnet_stats_queue qstats; \ - struct gnet_stats_rate_est rate_est; \ - spinlock_t *stats_lock; \ - spinlock_t lock - struct tcf_act_hdr { tca_gen(act_hdr); ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-07 17:07 ` Thomas Graf @ 2004-12-07 17:23 ` Patrick McHardy 2004-12-08 4:27 ` jamal 2004-12-08 5:30 ` David S. Miller 0 siblings, 2 replies; 24+ messages in thread From: Patrick McHardy @ 2004-12-07 17:23 UTC (permalink / raw) To: Thomas Graf Cc: hadi, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller Thomas Graf wrote: >* Patrick McHardy <41B5E188.5050800@trash.net> 2004-12-07 17:59 > > >>That's also what I thought at first. But the problem is in >>tcf_action_copy_stats, it assumes a->priv has the same layout as >>struct tcf_act_hdr, which is not true for struct tcf_police. This >>patch rearranges struct tcf_police to match tcf_act_hdr. >> >> > >Hehe, see my post a few minutes back. I came up with nearly the same >solution ;-> The only difference to my patch is that I don't touch >tcf_police if the action code isn't compiled. > > Either one is fine with me, although I would prefer to see the number of ifdefs in this area going down, not up :) Regards Patrick ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-07 17:23 ` Patrick McHardy @ 2004-12-08 4:27 ` jamal 2004-12-08 4:41 ` Jamal Hadi Salim 2004-12-08 5:30 ` David S. Miller 1 sibling, 1 reply; 24+ messages in thread From: jamal @ 2004-12-08 4:27 UTC (permalink / raw) To: Patrick McHardy Cc: Thomas Graf, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller On Tue, 2004-12-07 at 12:23, Patrick McHardy wrote: > Either one is fine with me, although I would prefer to see > the number of ifdefs in this area going down, not up :) You guys pick one or other or a mix. I run 4 base testcases for the policer typically: 1) Old kernel, uptodate TC - MUST pass 2) old kernel, old tc (trivial - expected to pass). 3) New Kernel, uptodate TC - MUST pass 4) New Kernel, uptodate TC - MUST pass (although trivial) Try both setting, dumping then deleting policies. If these tests pass, please push patch to Dave. cheers, jamal ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 4:27 ` jamal @ 2004-12-08 4:41 ` Jamal Hadi Salim 2004-12-08 5:17 ` Patrick McHardy 0 siblings, 1 reply; 24+ messages in thread From: Jamal Hadi Salim @ 2004-12-08 4:41 UTC (permalink / raw) To: Patrick McHardy Cc: Thomas Graf, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller BTW, old kernel in this case implies one that does not support tc actions at all. So pick something like 2.4.28. New is whatever 2.6.x with patch. Old tc is something that for example ships with redhat new tc is whatever one is patched. Supplementary tests are: in 2.6.x to compile the policer in two different ways a) via tc actions and b) using the old scheme which is understood by "old" tc. Repeat the tests i described earlier with b) pretending to be "old" kernel. Infact come to think of it i would also prefer to have the suplementary tests run as well. If you guys have no cycles, please pass the patch to me and i will test on the weekend. cheers, jamal On Tue, 2004-12-07 at 23:27, jamal wrote: > On Tue, 2004-12-07 at 12:23, Patrick McHardy wrote: > > > Either one is fine with me, although I would prefer to see > > the number of ifdefs in this area going down, not up :) > > You guys pick one or other or a mix. I run 4 base testcases for the > policer typically: > > 1) Old kernel, uptodate TC - MUST pass > 2) old kernel, old tc (trivial - expected to pass). > 3) New Kernel, uptodate TC - MUST pass > 4) New Kernel, uptodate TC - MUST pass (although trivial) > > Try both setting, dumping then deleting policies. > > If these tests pass, please push patch to Dave. > > cheers, > jamal > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 4:41 ` Jamal Hadi Salim @ 2004-12-08 5:17 ` Patrick McHardy 2004-12-08 12:31 ` jamal 0 siblings, 1 reply; 24+ messages in thread From: Patrick McHardy @ 2004-12-08 5:17 UTC (permalink / raw) To: hadi Cc: Thomas Graf, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller [-- Attachment #1: Type: text/plain, Size: 1343 bytes --] Jamal Hadi Salim wrote: >BTW, old kernel in this case implies one that does not support tc >actions at all. So pick something like 2.4.28. >New is whatever 2.6.x with patch. >Old tc is something that for example ships with redhat >new tc is whatever one is patched. > >Supplementary tests are: in 2.6.x to compile the policer >in two different ways a) via tc actions and b) using the old scheme >which is understood by "old" tc. Repeat the tests i described earlier >with b) pretending to be "old" kernel. > >Infact come to think of it i would also prefer to have the suplementary >tests run as well. >If you guys have no cycles, please pass the patch to me and i will test >on the weekend. > > I think these tests are a waste of time. struct tcf_police is not userspace-visible, so it's highly unlikely that the tc version matters. Why an old kernel needs to be tested is beyond me. For possible in-kernel breakage caused by the restructuring, without CONFIG_NET_CLS_ACT, struct tcf_police is only used in police.c, without any casts or assumptions about layout, so I can't see what could break. With CONFIG_NET_CLS_ACT, the only place where it is used outside of police.c is tcf_action_copy_stats, and this is exactly what this patch (tested) fixes. If you still want to do these test, please use the attached patch. Regards Patrick [-- Attachment #2: x --] [-- Type: text/plain, Size: 1479 bytes --] ===== include/net/act_api.h 1.4 vs edited ===== --- 1.4/include/net/act_api.h 2004-11-06 01:33:12 +01:00 +++ edited/include/net/act_api.h 2004-12-07 17:53:37 +01:00 @@ -8,15 +8,23 @@ #include <net/sch_generic.h> #include <net/pkt_sched.h> +#define tca_gen(name) \ +struct tcf_##name *next; \ + u32 index; \ + int refcnt; \ + int bindcnt; \ + u32 capab; \ + int action; \ + struct tcf_t tm; \ + struct gnet_stats_basic bstats; \ + struct gnet_stats_queue qstats; \ + struct gnet_stats_rate_est rate_est; \ + spinlock_t *stats_lock; \ + spinlock_t lock + struct tcf_police { - struct tcf_police *next; - int refcnt; -#ifdef CONFIG_NET_CLS_ACT - int bindcnt; -#endif - u32 index; - int action; + tca_gen(police); int result; u32 ewma_rate; u32 burst; @@ -24,33 +32,14 @@ u32 toks; u32 ptoks; psched_time_t t_c; - spinlock_t lock; struct qdisc_rate_table *R_tab; struct qdisc_rate_table *P_tab; - - struct gnet_stats_basic bstats; - struct gnet_stats_queue qstats; - struct gnet_stats_rate_est rate_est; - spinlock_t *stats_lock; }; #ifdef CONFIG_NET_CLS_ACT #define ACT_P_CREATED 1 #define ACT_P_DELETED 1 -#define tca_gen(name) \ -struct tcf_##name *next; \ - u32 index; \ - int refcnt; \ - int bindcnt; \ - u32 capab; \ - int action; \ - struct tcf_t tm; \ - struct gnet_stats_basic bstats; \ - struct gnet_stats_queue qstats; \ - struct gnet_stats_rate_est rate_est; \ - spinlock_t *stats_lock; \ - spinlock_t lock struct tcf_act_hdr { ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 5:17 ` Patrick McHardy @ 2004-12-08 12:31 ` jamal 2004-12-08 14:32 ` Thomas Graf 0 siblings, 1 reply; 24+ messages in thread From: jamal @ 2004-12-08 12:31 UTC (permalink / raw) To: Patrick McHardy Cc: Thomas Graf, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller On Wed, 2004-12-08 at 00:17, Patrick McHardy wrote: > I think these tests are a waste of time. struct tcf_police is not > userspace-visible, so it's highly unlikely that the tc version matters. > Why an old kernel needs to be tested is beyond me. Regression testing. You need both backward and forward compatibility. Old kernels must continue to work with new tc for the policer using the old syntax. new kernels must continue to work with old tc for policer management using old syntax. Policer existed before any tc action code was written and has a very different layout of the structure. User tools and classifiers (accessed from user tools) do touch that code. These kind of tests constitute about 50% or more of my testing. > For possible in-kernel > breakage caused by the restructuring, without CONFIG_NET_CLS_ACT, > struct tcf_police is only used in police.c, without any casts or > assumptions about layout, so I can't see what could break. With > CONFIG_NET_CLS_ACT, the only place where it is used outside of > police.c is tcf_action_copy_stats, and this is exactly what this patch > (tested) fixes. > > If you still want to do these test, please use the attached patch. No rush now that its in (I also dont have time or equipment at the moment). Lets hope no more freezes reported. When i get time i will look into it. cheers, jamal ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 12:31 ` jamal @ 2004-12-08 14:32 ` Thomas Graf 2004-12-08 14:59 ` Thomas Graf 2004-12-08 15:05 ` jamal 0 siblings, 2 replies; 24+ messages in thread From: Thomas Graf @ 2004-12-08 14:32 UTC (permalink / raw) To: jamal Cc: Patrick McHardy, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller * jamal <1102509111.1051.54.camel@jzny.localdomain> 2004-12-08 07:31 > On Wed, 2004-12-08 at 00:17, Patrick McHardy wrote: > > > I think these tests are a waste of time. struct tcf_police is not > > userspace-visible, so it's highly unlikely that the tc version matters. > > Why an old kernel needs to be tested is beyond me. > > Regression testing. > You need both backward and forward compatibility. > Old kernels must continue to work with new tc for the policer using the > old syntax. > new kernels must continue to work with old tc for policer management > using old syntax. > Policer existed before any tc action code was written and has a very > different layout of the structure. User tools and classifiers (accessed > from user tools) do touch that code. > These kind of tests constitute about 50% or more of my testing. I invested some time to ease testing since this was primarly my fault by overlooking the special case of tcf_police. I've put together a small testsuite allowing to easly run tests for multiple versions of iproute2. It can be found at: http://people.suug.ch/~tgr/iproute2/tc-testsuite.tar.gz One simply extracts various iproute2 versions into iproute2/ and sets KERNEL_INCLUDE if needed for older versions. 'make compile' on the top level compiles all the versions. The tests are defined in tests/ and are simple shell scripts and get invoked for every iproute2 verison in iproute2 with $TC and $IP set to the version currently being tested. The output of every test run is stored in results/$TEST.$IPVERSION.out respectively .dmesg. 'make clean' removes all the results again. 'make liststests' lists all the available tests. 'make alltests' runs all the tests. I've run all the tests on my patch with the following kernels and iproute2 versions: - 2.6.10-rc2-bk13 (actions compiled in) - 2.6.10-rc2-bk13-no-act (old policer compiled in) - 2.4.28-rc1-bk1 - iproute2-2.6.9-tgr (with all my patches in) - iproute2-2.4.7 iproute-2.6.9 was sucessful with all kernels. I couldn't test with the old 2.4.7 iproute2 yet since the syntax has changed and I need to adopt the tests first. I will create better tests and run it on patrick's patch when I get home. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 14:32 ` Thomas Graf @ 2004-12-08 14:59 ` Thomas Graf 2004-12-08 15:06 ` jamal 2004-12-08 15:05 ` jamal 1 sibling, 1 reply; 24+ messages in thread From: Thomas Graf @ 2004-12-08 14:59 UTC (permalink / raw) To: jamal Cc: Patrick McHardy, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller * Thomas Graf <20041208143212.GL1371@postel.suug.ch> 2004-12-08 15:32 > iproute-2.6.9 was sucessful with all kernels. I couldn't test with the > old 2.4.7 iproute2 yet since the syntax has changed and I need to adopt > the tests first. I will create better tests and run it on patrick's > patch when I get home. I've updated the tests and all were successful: tc-2.6.9-tgr tc-2.4.7 2.6.10-rc2-bk13* Y Y 2.6.10-rc2-bk13-no-act* Y Y 2.4.28-rc1-bk1 Y Y * including tcf_police patch ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 14:59 ` Thomas Graf @ 2004-12-08 15:06 ` jamal 0 siblings, 0 replies; 24+ messages in thread From: jamal @ 2004-12-08 15:06 UTC (permalink / raw) To: Thomas Graf Cc: Patrick McHardy, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller Thanks for your efforts Thomas. cheers, jamal On Wed, 2004-12-08 at 09:59, Thomas Graf wrote: > * Thomas Graf <20041208143212.GL1371@postel.suug.ch> 2004-12-08 15:32 > > iproute-2.6.9 was sucessful with all kernels. I couldn't test with the > > old 2.4.7 iproute2 yet since the syntax has changed and I need to adopt > > the tests first. I will create better tests and run it on patrick's > > patch when I get home. > > I've updated the tests and all were successful: > > tc-2.6.9-tgr tc-2.4.7 > 2.6.10-rc2-bk13* Y Y > 2.6.10-rc2-bk13-no-act* Y Y > 2.4.28-rc1-bk1 Y Y > > * including tcf_police patch > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 14:32 ` Thomas Graf 2004-12-08 14:59 ` Thomas Graf @ 2004-12-08 15:05 ` jamal 2004-12-08 17:30 ` Thomas Graf 1 sibling, 1 reply; 24+ messages in thread From: jamal @ 2004-12-08 15:05 UTC (permalink / raw) To: Thomas Graf Cc: Patrick McHardy, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller On Wed, 2004-12-08 at 09:32, Thomas Graf wrote: > I've put together a small testsuite allowing to easly run tests for > multiple versions of iproute2. It can be found at: > http://people.suug.ch/~tgr/iproute2/tc-testsuite.tar.gz > Good stuff. Hopefully we can run these tests everytime we make changes going forward. We cant compromise quality by handwaving on instinct. Famous last words: "that couldnt have possibly caused a bug down there". I will take a look at what you have and integrate my 20-30 testcases if they are not covered over time - or may be adpat what you have to follow how i do things or maybe i can send them to you and you can integrate them. > iproute-2.6.9 was sucessful with all kernels. I couldn't test with the > old 2.4.7 iproute2 yet since the syntax has changed and I need to adopt > the tests first. I will create better tests and run it on patrick's > patch when I get home. That would be appreaciated. cheers, jamal ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 15:05 ` jamal @ 2004-12-08 17:30 ` Thomas Graf 0 siblings, 0 replies; 24+ messages in thread From: Thomas Graf @ 2004-12-08 17:30 UTC (permalink / raw) To: jamal Cc: Patrick McHardy, Andrew Morton, Thomas Cataldo, linux-kernel, netdev, David S. Miller * jamal <1102518304.1023.6.camel@jzny.localdomain> 2004-12-08 10:05 > On Wed, 2004-12-08 at 09:32, Thomas Graf wrote: > > > I've put together a small testsuite allowing to easly run tests for > > multiple versions of iproute2. It can be found at: > > http://people.suug.ch/~tgr/iproute2/tc-testsuite.tar.gz > > > > Good stuff. Hopefully we can run these tests everytime we make changes > going forward. We cant compromise quality by handwaving on instinct. > Famous last words: "that couldnt have possibly caused a bug down there". > I will take a look at what you have and integrate my 20-30 testcases if > they are not covered over time - or may be adpat what you have to follow > how i do things or maybe i can send them to you and you can integrate > them. I've only put a small cbq test case and the policer test in there for now. I'd be happy to integrate your test cases if you send them to me. I wil add more tests in the next days given time. Our sysadmin had some cycles and set up a UML capable of booting any kernel image by script so we can easly test all changes to iproute2 or the kernel on as many branches as we want. I've put in the latest 2.4/2.6 main releases, the latest bk snapshots of both and earlier releases of both to ensure we keep some backward compatibility. All 2.6 branches have 2 configs, one with action code and the other with the old policer. I'm not quite sure which versions of iproute2 are being used by the distributions so I've put the latest bk version and the ones of red hat, suse and debian into it. That's 36 kernel<->iproute2 combinations per test, given we add a few dozen tests makes it hard to tell if something went wrong. I'll probably need to add some more logic to it besides just checking if the test script has written anything to stderr. I'll be happy to put patches into the test cycle once they're posted to netdev or alternatively add dave's bk tree to the list of branches whichever is more reasonable. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-07 17:23 ` Patrick McHardy 2004-12-08 4:27 ` jamal @ 2004-12-08 5:30 ` David S. Miller 2004-12-08 12:04 ` jamal 1 sibling, 1 reply; 24+ messages in thread From: David S. Miller @ 2004-12-08 5:30 UTC (permalink / raw) To: Patrick McHardy; +Cc: tgraf, hadi, akpm, tomc, linux-kernel, netdev On Tue, 07 Dec 2004 18:23:46 +0100 Patrick McHardy <kaber@trash.net> wrote: > Either one is fine with me, although I would prefer to see > the number of ifdefs in this area going down, not up :) I agree, therefore I applied Patrick's patch. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 5:30 ` David S. Miller @ 2004-12-08 12:04 ` jamal 2004-12-08 16:57 ` Patrick McHardy 0 siblings, 1 reply; 24+ messages in thread From: jamal @ 2004-12-08 12:04 UTC (permalink / raw) To: David S. Miller; +Cc: Patrick McHardy, tgraf, akpm, tomc, linux-kernel, netdev I can almost guarantee that one or more of those tests i outlined would fail. So i would suggest a revert until the testing has been done. cheers, jamal On Wed, 2004-12-08 at 00:30, David S. Miller wrote: > On Tue, 07 Dec 2004 18:23:46 +0100 > Patrick McHardy <kaber@trash.net> wrote: > > > Either one is fine with me, although I would prefer to see > > the number of ifdefs in this area going down, not up :) > > I agree, therefore I applied Patrick's patch. > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 12:04 ` jamal @ 2004-12-08 16:57 ` Patrick McHardy 2004-12-08 19:30 ` Stephen Hemminger 2004-12-08 19:44 ` jamal 0 siblings, 2 replies; 24+ messages in thread From: Patrick McHardy @ 2004-12-08 16:57 UTC (permalink / raw) To: hadi; +Cc: David S. Miller, tgraf, akpm, tomc, linux-kernel, netdev jamal wrote: >I can almost guarantee that one or more of those tests i outlined would >fail. So i would suggest a revert until the testing has been done. > Please be more specific than an "almost guarantee" that "one or more tests" may fail when asking to revert a patch that fixes an easily triggerable crash. For example, point to the code that makes you think it might fail. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 16:57 ` Patrick McHardy @ 2004-12-08 19:30 ` Stephen Hemminger 2004-12-08 19:49 ` jamal 2004-12-08 21:21 ` Patrick McHardy 2004-12-08 19:44 ` jamal 1 sibling, 2 replies; 24+ messages in thread From: Stephen Hemminger @ 2004-12-08 19:30 UTC (permalink / raw) To: Patrick McHardy Cc: hadi, David S. Miller, tgraf, akpm, tomc, linux-kernel, netdev I don't know if this is related but something broke netem after 2.6.10-rc2 Now a simple request to delay 10ms ends up taking 1000ms. Still narrowing down the changeset, but it isn't a change in netem or tc utils since these didn't change. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 19:30 ` Stephen Hemminger @ 2004-12-08 19:49 ` jamal 2004-12-08 21:21 ` Patrick McHardy 1 sibling, 0 replies; 24+ messages in thread From: jamal @ 2004-12-08 19:49 UTC (permalink / raw) To: Stephen Hemminger Cc: Patrick McHardy, David S. Miller, tgraf, akpm, tomc, linux-kernel, netdev Stephen, Can we incoporate the tests from Thomas as part of iproute2? I have about 20 or so i could add in. Maybe you could add some for netem as well. cheers, jamal On Wed, 2004-12-08 at 14:30, Stephen Hemminger wrote: > I don't know if this is related but something broke netem after 2.6.10-rc2 > Now a simple request to delay 10ms ends up taking 1000ms. > > Still narrowing down the changeset, but it isn't a change in netem > or tc utils since these didn't change. > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 19:30 ` Stephen Hemminger 2004-12-08 19:49 ` jamal @ 2004-12-08 21:21 ` Patrick McHardy 2004-12-08 22:07 ` Stephen Hemminger 1 sibling, 1 reply; 24+ messages in thread From: Patrick McHardy @ 2004-12-08 21:21 UTC (permalink / raw) To: Stephen Hemminger; +Cc: netdev Stephen Hemminger wrote: >I don't know if this is related but something broke netem after 2.6.10-rc2 >Now a simple request to delay 10ms ends up taking 1000ms. > >Still narrowing down the changeset, but it isn't a change in netem >or tc utils since these didn't change. > > > I can't find anything in the Changelog that looks related. Can you give me a testcase ? Regards Patrick ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 21:21 ` Patrick McHardy @ 2004-12-08 22:07 ` Stephen Hemminger 2004-12-08 22:26 ` Patrick McHardy 0 siblings, 1 reply; 24+ messages in thread From: Stephen Hemminger @ 2004-12-08 22:07 UTC (permalink / raw) To: Patrick McHardy; +Cc: netdev On Wed, 08 Dec 2004 22:21:54 +0100 Patrick McHardy <kaber@trash.net> wrote: > Stephen Hemminger wrote: > > >I don't know if this is related but something broke netem after 2.6.10-rc2 > >Now a simple request to delay 10ms ends up taking 1000ms. > > > >Still narrowing down the changeset, but it isn't a change in netem > >or tc utils since these didn't change. > > > > > > > I can't find anything in the Changelog that looks related. > Can you give me a testcase ? I think the problem was a missing poke (qdisc_restart) in the netem timer routine, it probably worked earlier for me on other hardware because I was using different hardware that was waking up and checking tx in response to received packets. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 22:07 ` Stephen Hemminger @ 2004-12-08 22:26 ` Patrick McHardy 0 siblings, 0 replies; 24+ messages in thread From: Patrick McHardy @ 2004-12-08 22:26 UTC (permalink / raw) To: Stephen Hemminger; +Cc: netdev Stephen Hemminger wrote: >I think the problem was a missing poke (qdisc_restart) in the netem timer >routine, it probably worked earlier for me on other hardware because I was >using different hardware that was waking up and checking tx in response >to received packets. > > Yes, that seems likely. Since netem doesn't account for the delayed packets in sch->q.qlen it will only be woken up while non-delayed packets are queued. Regards Patrick ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 2004-12-08 16:57 ` Patrick McHardy 2004-12-08 19:30 ` Stephen Hemminger @ 2004-12-08 19:44 ` jamal 1 sibling, 0 replies; 24+ messages in thread From: jamal @ 2004-12-08 19:44 UTC (permalink / raw) To: Patrick McHardy; +Cc: David S. Miller, tgraf, akpm, tomc, linux-kernel, netdev On Wed, 2004-12-08 at 11:57, Patrick McHardy wrote: > jamal wrote: > > >I can almost guarantee that one or more of those tests i outlined would > >fail. So i would suggest a revert until the testing has been done. > > > Please be more specific than an "almost guarantee" that > "one or more tests" may fail when asking to revert a patch > that fixes an easily triggerable crash. For example, point > to the code that makes you think it might fail. I hope this is clear after you read the last email exchange i had with Thomas and that you are not intentionaly trying to be annoying. cheers, jamal ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2004-12-08 22:26 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1102380430.6103.6.camel@buffy>
2004-12-07 6:35 ` Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Patrick McHardy
2004-12-07 6:44 ` Andrew Morton
2004-12-07 12:29 ` jamal
2004-12-07 16:59 ` Patrick McHardy
2004-12-07 17:07 ` Thomas Graf
2004-12-07 17:23 ` Patrick McHardy
2004-12-08 4:27 ` jamal
2004-12-08 4:41 ` Jamal Hadi Salim
2004-12-08 5:17 ` Patrick McHardy
2004-12-08 12:31 ` jamal
2004-12-08 14:32 ` Thomas Graf
2004-12-08 14:59 ` Thomas Graf
2004-12-08 15:06 ` jamal
2004-12-08 15:05 ` jamal
2004-12-08 17:30 ` Thomas Graf
2004-12-08 5:30 ` David S. Miller
2004-12-08 12:04 ` jamal
2004-12-08 16:57 ` Patrick McHardy
2004-12-08 19:30 ` Stephen Hemminger
2004-12-08 19:49 ` jamal
2004-12-08 21:21 ` Patrick McHardy
2004-12-08 22:07 ` Stephen Hemminger
2004-12-08 22:26 ` Patrick McHardy
2004-12-08 19:44 ` jamal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).