* netconsole hangs w/ alt-sysrq-t @ 2004-04-22 15:29 Jeff Moyer 2004-04-25 19:15 ` Matt Mackall 0 siblings, 1 reply; 18+ messages in thread From: Jeff Moyer @ 2004-04-22 15:29 UTC (permalink / raw) To: linux-kernel If netconsole is enabled, and you hit Alt-Sysrq-t, then it will print a small amount of output to the console(s) and then hang the system. In this case, I'm using the e100 driver, and we end up exhausting the available cbs. Since we are in interrupt context, the driver's poll routine is never run, and we loop infinitely waiting for resources to free up that never will. Kernel version is 2.6.5. No easy solutions jump out at me, unfortunately. Comments? Regards, Jeff ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-04-22 15:29 netconsole hangs w/ alt-sysrq-t Jeff Moyer @ 2004-04-25 19:15 ` Matt Mackall 2004-04-28 12:44 ` Jeff Moyer 0 siblings, 1 reply; 18+ messages in thread From: Matt Mackall @ 2004-04-25 19:15 UTC (permalink / raw) To: Jeff Moyer; +Cc: linux-kernel On Thu, Apr 22, 2004 at 11:29:33AM -0400, Jeff Moyer wrote: > If netconsole is enabled, and you hit Alt-Sysrq-t, then it will print a > small amount of output to the console(s) and then hang the system. In this > case, I'm using the e100 driver, and we end up exhausting the available > cbs. Since we are in interrupt context, the driver's poll routine is never > run, and we loop infinitely waiting for resources to free up that never > will. Kernel version is 2.6.5. Can you try 2.6.6-rc2? It has a fix to congestion handling that should address this. -- Matt Mackall : http://www.selenic.com : Linux development and consulting ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-04-25 19:15 ` Matt Mackall @ 2004-04-28 12:44 ` Jeff Moyer 2004-04-28 14:03 ` Matt Mackall 0 siblings, 1 reply; 18+ messages in thread From: Jeff Moyer @ 2004-04-28 12:44 UTC (permalink / raw) To: Matt Mackall; +Cc: linux-kernel ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: mpm> On Thu, Apr 22, 2004 at 11:29:33AM -0400, Jeff Moyer wrote: >> If netconsole is enabled, and you hit Alt-Sysrq-t, then it will print a >> small amount of output to the console(s) and then hang the system. In >> this case, I'm using the e100 driver, and we end up exhausting the >> available cbs. Since we are in interrupt context, the driver's poll >> routine is never run, and we loop infinitely waiting for resources to >> free up that never will. Kernel version is 2.6.5. mpm> Can you try 2.6.6-rc2? It has a fix to congestion handling that should mpm> address this. Is the attached patch the change you are referring to? If so, I don't see how this would fix the problem. I ended up deferring netpoll writes to process context, which has been working fine for me. Have I missed something? -Jeff # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/04/13 14:07:53-04:00 mpm@selenic.com # [PATCH] netpoll transmit busy bugfix # # Fix for handling of full transmit queue when netpoll trap is enabled. # # From Stelian Pop <stelian@popies.net> # # net/core/netpoll.c # 2004/04/10 16:19:31-04:00 mpm@selenic.com +3 -9 # netpoll transmit busy bugfix # diff -Nru a/net/core/netpoll.c b/net/core/netpoll.c --- a/net/core/netpoll.c Wed Apr 28 05:43:08 2004 +++ b/net/core/netpoll.c Wed Apr 28 05:43:08 2004 @@ -163,21 +163,15 @@ spin_lock(&np->dev->xmit_lock); np->dev->xmit_lock_owner = smp_processor_id(); - if (netif_queue_stopped(np->dev)) { - np->dev->xmit_lock_owner = -1; - spin_unlock(&np->dev->xmit_lock); - - netpoll_poll(np); - goto repeat; - } - status = np->dev->hard_start_xmit(skb, np->dev); np->dev->xmit_lock_owner = -1; spin_unlock(&np->dev->xmit_lock); /* transmit busy */ - if(status) + if(status) { + netpoll_poll(np); goto repeat; + } } void netpoll_send_udp(struct netpoll *np, const char *msg, int len) ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-04-28 12:44 ` Jeff Moyer @ 2004-04-28 14:03 ` Matt Mackall 2004-04-28 14:07 ` Jeff Moyer 0 siblings, 1 reply; 18+ messages in thread From: Matt Mackall @ 2004-04-28 14:03 UTC (permalink / raw) To: Jeff Moyer; +Cc: linux-kernel On Wed, Apr 28, 2004 at 08:44:47AM -0400, Jeff Moyer wrote: > ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: > > mpm> On Thu, Apr 22, 2004 at 11:29:33AM -0400, Jeff Moyer wrote: > >> If netconsole is enabled, and you hit Alt-Sysrq-t, then it will print a > >> small amount of output to the console(s) and then hang the system. In > >> this case, I'm using the e100 driver, and we end up exhausting the > >> available cbs. Since we are in interrupt context, the driver's poll > >> routine is never run, and we loop infinitely waiting for resources to > >> free up that never will. Kernel version is 2.6.5. > > mpm> Can you try 2.6.6-rc2? It has a fix to congestion handling that should > mpm> address this. > > Is the attached patch the change you are referring to? If so, I don't see > how this would fix the problem. I ended up deferring netpoll writes to > process context, which has been working fine for me. Have I missed > something? Well process context defeats the purpose. Ok, I've more closely read your report and if I understand correctly, you're using the NAPI version of e100? There's some magic NAPI bits in netpoll_poll that might help here: if(trapped && np->dev->poll && test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) np->dev->poll(np->dev, &budget); Perhaps we need to pull the trapped test out of there. Then with any luck, dev->hard_start_xmit will return non-zero in netpoll_send_skb, we'll call netpoll_poll to pump the card, and we'll be able to flush it. -- Matt Mackall : http://www.selenic.com : Linux development and consulting ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-04-28 14:03 ` Matt Mackall @ 2004-04-28 14:07 ` Jeff Moyer 2004-04-28 14:27 ` Matt Mackall 0 siblings, 1 reply; 18+ messages in thread From: Jeff Moyer @ 2004-04-28 14:07 UTC (permalink / raw) To: Matt Mackall; +Cc: linux-kernel ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: mpm> On Wed, Apr 28, 2004 at 08:44:47AM -0400, Jeff Moyer wrote: >> ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall >> <mpm@selenic.com> adds: >> mpm> On Thu, Apr 22, 2004 at 11:29:33AM -0400, Jeff Moyer wrote: >> >> If netconsole is enabled, and you hit Alt-Sysrq-t, then it will print >> a >> small amount of output to the console(s) and then hang the system. >> In >> this case, I'm using the e100 driver, and we end up exhausting the >> >> available cbs. Since we are in interrupt context, the driver's poll >> >> routine is never run, and we loop infinitely waiting for resources to >> >> free up that never will. Kernel version is 2.6.5. >> mpm> Can you try 2.6.6-rc2? It has a fix to congestion handling that should mpm> address this. >> Is the attached patch the change you are referring to? If so, I don't >> see how this would fix the problem. I ended up deferring netpoll writes >> to process context, which has been working fine for me. Have I missed >> something? mpm> Well process context defeats the purpose. Ok, I've more closely read mpm> your report and if I understand correctly, you're using the NAPI mpm> version of e100? There's some magic NAPI bits in netpoll_poll that mpm> might help here: Yes, sorry I didn't specify that earlier. mpm> if(trapped && np->dev->poll && test_bit(__LINK_STATE_RX_SCHED, mpm> &np->dev->state)) np-> dev->poll(np->dev, &budget); mpm> Perhaps we need to pull the trapped test out of there. Then with any mpm> luck, dev->hard_start_xmit will return non-zero in netpoll_send_skb, mpm> we'll call netpoll_poll to pump the card, and we'll be able to flush mpm> it. I don't think so. You can end up in code running in interrupt context that is not designed to (ip routing code, etc). I've been down that path already. I only defer to process context if irqs_disabled(). Other suggestions? -Jeff ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-04-28 14:07 ` Jeff Moyer @ 2004-04-28 14:27 ` Matt Mackall 2004-04-28 18:23 ` Jeff Moyer 0 siblings, 1 reply; 18+ messages in thread From: Matt Mackall @ 2004-04-28 14:27 UTC (permalink / raw) To: Jeff Moyer; +Cc: linux-kernel On Wed, Apr 28, 2004 at 10:07:17AM -0400, Jeff Moyer wrote: > ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: > > mpm> On Wed, Apr 28, 2004 at 08:44:47AM -0400, Jeff Moyer wrote: > >> ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall > >> <mpm@selenic.com> adds: > >> > mpm> On Thu, Apr 22, 2004 at 11:29:33AM -0400, Jeff Moyer wrote: > >> >> If netconsole is enabled, and you hit Alt-Sysrq-t, then it will print > >> a >> small amount of output to the console(s) and then hang the system. > >> In >> this case, I'm using the e100 driver, and we end up exhausting the > >> >> available cbs. Since we are in interrupt context, the driver's poll > >> >> routine is never run, and we loop infinitely waiting for resources to > >> >> free up that never will. Kernel version is 2.6.5. > >> > mpm> Can you try 2.6.6-rc2? It has a fix to congestion handling that should > mpm> address this. > >> Is the attached patch the change you are referring to? If so, I don't > >> see how this would fix the problem. I ended up deferring netpoll writes > >> to process context, which has been working fine for me. Have I missed > >> something? > > mpm> Well process context defeats the purpose. Ok, I've more closely read > mpm> your report and if I understand correctly, you're using the NAPI > mpm> version of e100? There's some magic NAPI bits in netpoll_poll that > mpm> might help here: > > Yes, sorry I didn't specify that earlier. > > mpm> if(trapped && np->dev->poll && test_bit(__LINK_STATE_RX_SCHED, > mpm> &np->dev->state)) > np-> dev->poll(np->dev, &budget); > > mpm> Perhaps we need to pull the trapped test out of there. Then with any > mpm> luck, dev->hard_start_xmit will return non-zero in netpoll_send_skb, > mpm> we'll call netpoll_poll to pump the card, and we'll be able to flush > mpm> it. > > I don't think so. You can end up in code running in interrupt context that > is not designed to (ip routing code, etc). I've been down that path > already. I only defer to process context if irqs_disabled(). Fair enough. Turning on trapped basically short circuits the rest of the NAPI code so that stuff doesn't hit the stack when we call ->poll. Could you try doing a netpoll_set_trap(1)/(0) around the call to ->poll and see if that actually lets the thing work? Then we can try to figure out the right way to do this. My point about deferring still stands, as when we're dumping an oops, we can't really expect that we'll ever get to a process context. -- Matt Mackall : http://www.selenic.com : Linux development and consulting ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-04-28 14:27 ` Matt Mackall @ 2004-04-28 18:23 ` Jeff Moyer 2004-05-20 2:20 ` klogd, netconsole without /dev/console? Tom Oehser 2004-06-25 15:50 ` netconsole hangs w/ alt-sysrq-t Jeff Moyer 0 siblings, 2 replies; 18+ messages in thread From: Jeff Moyer @ 2004-04-28 18:23 UTC (permalink / raw) To: Matt Mackall; +Cc: linux-kernel ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: [snip] mpm> Well process context defeats the purpose. Ok, I've more closely read mpm> your report and if I understand correctly, you're using the NAPI mpm> version of e100? There's some magic NAPI bits in netpoll_poll that mpm> might help here: >> Yes, sorry I didn't specify that earlier. >> mpm> if(trapped && np->dev->poll && test_bit(__LINK_STATE_RX_SCHED, mpm> &np->dev->state)) np-> dev->poll(np->dev, &budget); >> mpm> Perhaps we need to pull the trapped test out of there. Then with any mpm> luck, dev->hard_start_xmit will return non-zero in netpoll_send_skb, mpm> we'll call netpoll_poll to pump the card, and we'll be able to flush mpm> it. >> I don't think so. You can end up in code running in interrupt context >> that is not designed to (ip routing code, etc). I've been down that >> path already. I only defer to process context if irqs_disabled(). mpm> Fair enough. Turning on trapped basically short circuits the rest of mpm> the NAPI code so that stuff doesn't hit the stack when we call ->poll. mpm> Could you try doing a netpoll_set_trap(1)/(0) around the call to mpm> ->poll and see if that actually lets the thing work? Then we can try mpm> to figure out the right way to do this. Okay, I tried that before and I still ran into problems. Just to be sure, I grabbed 2.6.6-rc3 and tried your suggestion. The resulting stack trace is at the end of this message. I can't explain how we get from netif_receive_skb to anywhere up the stack, though. Perhaps you'll have better luck with it. Note that this is alt-sysrq-t output, so the first bits of stack you see are not relevant. What counts is everything after the Badness in local_bh_enable. mpm> My point about deferring still stands, as when we're dumping an oops, mpm> we can't really expect that we'll ever get to a process context. Sure, but you can explicitly flush the queue in such a circumstance, though this is still not an ideal solution. -Jeff pdflush S 00000000 0 9 4 10 8 (L-TLB) cfe03f7c 00000046 00000000 00000000 00000000 00000000 cfe03f48 c011fa9c 5ec54655 123c321a 00000000 00000002 cff725d0 cff736b0 cff736d0 c1244ce0 0000119b 4318ade1 00000006 cfe12650 cfe12808 00000046 00000000 00000003 Call Trace: [<c011fa9c>] recalc_task_prio+0x8c/0x1a0 [<c014bb80>] pdflush+0x0/0x20 [<c014b918>] __pdflush+0xb8/0x320 [<c014bb9a>] pdflush+0x1a/0x20 [<c014bb80>] pdflush+0x0/0x20 [<c013b09c>] Badness in local_bh_enable at kernel/softirq.c:136 Call Trace: [<c012ab64>] local_bh_enable+0x84/0x90 [<c02b638e>] neigh_lookup+0x7e/0xb0 [<c02f3260>] arp_process+0x200/0x530 [<c0120e9b>] rebalance_tick+0x3b/0xf0 [<c02b1363>] netif_receive_skb+0x1d3/0x280 [<d08ff2d7>] e100_poll+0x6e7/0x760 [e100] [<d08fe4b4>] e100_xmit_frame+0x284/0x400 [e100] [<d08febf0>] e100_poll+0x0/0x760 [e100] [<c02bd13e>] netpoll_poll+0x5e/0x60 [<c02bd413>] netpoll_send_skb+0xc3/0x120 [<d086a058>] write_msg+0x58/0x60 [netconsole] [<d086a000>] write_msg+0x0/0x60 [netconsole] [<c01263c6>] __call_console_drivers+0x56/0x60 [<c0126927>] release_console_sem+0x77/0x140 [<c0126790>] printk+0x1c0/0x270 [<c014bb80>] pdflush+0x0/0x20 [<c013b09c>] kthread+0x9c/0xb0 [<c0107f35>] show_trace+0xa5/0xc0 [<c013b09c>] kthread+0x9c/0xb0 [<c0107fcb>] show_stack+0x7b/0x90 [<c012280c>] show_state+0x5c/0x90 [<c022042e>] __handle_sysrq_nolock+0x6e/0xe6 [<c02203a0>] handle_sysrq+0x40/0x60 [<c021a662>] kbd_event+0x32/0x70 [<c0294ab5>] input_event+0xf5/0x400 [<c0297cbf>] atkbd_report_key+0x2f/0x80 [<c0297f33>] atkbd_interrupt+0x223/0x530 [<c029bb46>] serio_interrupt+0x66/0x70 [<c029c464>] i8042_interrupt+0xf4/0x190 [<c0109560>] handle_IRQ_event+0x30/0x60 [<c01099f1>] do_IRQ+0x121/0x290 ======================= [<c0107ba4>] common_interrupt+0x18/0x20 [<c011007b>] transmeta_identify+0x2b/0x60 [<c011856f>] apm_bios_call_simple+0x9f/0xe0 [<c01186d8>] apm_do_idle+0x18/0x70 [<c01187dc>] apm_cpu_idle+0x7c/0x150 [<c0105030>] default_idle+0x0/0x40 [<c01050f6>] cpu_idle+0x46/0x50 [<c03dd8ff>] start_kernel+0x19f/0x200 [<c03dd480>] unknown_bootoption+0x0/0x130 ^ permalink raw reply [flat|nested] 18+ messages in thread
* klogd, netconsole without /dev/console? 2004-04-28 18:23 ` Jeff Moyer @ 2004-05-20 2:20 ` Tom Oehser 2004-06-25 15:50 ` netconsole hangs w/ alt-sysrq-t Jeff Moyer 1 sibling, 0 replies; 18+ messages in thread From: Tom Oehser @ 2004-05-20 2:20 UTC (permalink / raw) Cc: linux-kernel If I start klogd with a -c, it turns off _both_ netconsole _and_ /dev/console. If I start klogd without -c, I get both of them on. Shouldn't there be a way to have klogd and netconsole but no /dev/console? -Tom ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-04-28 18:23 ` Jeff Moyer 2004-05-20 2:20 ` klogd, netconsole without /dev/console? Tom Oehser @ 2004-06-25 15:50 ` Jeff Moyer 2004-06-25 23:27 ` Matt Mackall 1 sibling, 1 reply; 18+ messages in thread From: Jeff Moyer @ 2004-06-25 15:50 UTC (permalink / raw) To: Matt Mackall, linux-kernel ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Jeff Moyer <jmoyer@redhat.com> adds: ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: jmoyer> [snip] mpm> Well process context defeats the purpose. Ok, I've more closely read mpm> your report and if I understand correctly, you're using the NAPI mpm> version of e100? There's some magic NAPI bits in netpoll_poll that mpm> might help here: >>> Yes, sorry I didn't specify that earlier. >>> mpm> if(trapped && np->dev->poll && test_bit(__LINK_STATE_RX_SCHED, mpm> &np->dev->state)) np-> dev->poll(np->dev, &budget); >>> mpm> Perhaps we need to pull the trapped test out of there. Then with any mpm> luck, dev->hard_start_xmit will return non-zero in netpoll_send_skb, mpm> we'll call netpoll_poll to pump the card, and we'll be able to flush mpm> it. >>> I don't think so. You can end up in code running in interrupt context >>> that is not designed to (ip routing code, etc). I've been down that >>> path already. I only defer to process context if irqs_disabled(). mpm> Fair enough. Turning on trapped basically short circuits the rest of mpm> the NAPI code so that stuff doesn't hit the stack when we call ->poll. mpm> Could you try doing a netpoll_set_trap(1)/(0) around the call to mpm> ->poll and see if that actually lets the thing work? Then we can try mpm> to figure out the right way to do this. You will still hit the stack in the case of netconsole, since it doesn't register an rx hook and netif_receive_skb has this: #ifdef CONFIG_NETPOLL_RX if (skb->dev->netpoll_rx && skb->dev->poll && netpoll_rx(skb)) { kfree_skb(skb); return NET_RX_DROP; } #endif So we fall through and end up calling code which doesn't want to run in interrupt context. One solution that I've come up with, but may not be in the spirit of netpoll, is to change that test above to the following: if (unlikely(netpoll_trap())) { if (skb->dev->netpoll_rx && skb->dev->poll) netpoll_rx(skb); kfree_skb(skb); return NET_RX_DROP; } This changes semantics, in that the rx routine will only be called if we've decided to "trap" the network stack. Note also that this will cause us to drop packets while doing our logging. Matt, any thoughts on this? Regards, Jeff ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-06-25 15:50 ` netconsole hangs w/ alt-sysrq-t Jeff Moyer @ 2004-06-25 23:27 ` Matt Mackall 2004-06-28 14:48 ` Jeff Moyer 0 siblings, 1 reply; 18+ messages in thread From: Matt Mackall @ 2004-06-25 23:27 UTC (permalink / raw) To: Jeff Moyer; +Cc: linux-kernel On Fri, Jun 25, 2004 at 11:50:25AM -0400, Jeff Moyer wrote: > ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Jeff Moyer <jmoyer@redhat.com> adds: > > ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: > jmoyer> [snip] > > mpm> Well process context defeats the purpose. Ok, I've more closely read > mpm> your report and if I understand correctly, you're using the NAPI > mpm> version of e100? There's some magic NAPI bits in netpoll_poll that > mpm> might help here: > >>> Yes, sorry I didn't specify that earlier. > >>> > mpm> if(trapped && np->dev->poll && test_bit(__LINK_STATE_RX_SCHED, > mpm> &np->dev->state)) > np-> dev->poll(np->dev, &budget); > >>> > mpm> Perhaps we need to pull the trapped test out of there. Then with any > mpm> luck, dev->hard_start_xmit will return non-zero in netpoll_send_skb, > mpm> we'll call netpoll_poll to pump the card, and we'll be able to flush > mpm> it. > >>> I don't think so. You can end up in code running in interrupt context > >>> that is not designed to (ip routing code, etc). I've been down that > >>> path already. I only defer to process context if irqs_disabled(). > > mpm> Fair enough. Turning on trapped basically short circuits the rest of > mpm> the NAPI code so that stuff doesn't hit the stack when we call ->poll. > mpm> Could you try doing a netpoll_set_trap(1)/(0) around the call to > mpm> ->poll and see if that actually lets the thing work? Then we can try > mpm> to figure out the right way to do this. > > You will still hit the stack in the case of netconsole, since it doesn't > register an rx hook and netif_receive_skb has this: > > #ifdef CONFIG_NETPOLL_RX > if (skb->dev->netpoll_rx && skb->dev->poll && netpoll_rx(skb)) { > kfree_skb(skb); > return NET_RX_DROP; > } > #endif Ok, that is indeed a problem. > So we fall through and end up calling code which doesn't want to run in > interrupt context. > > One solution that I've come up with, but may not be in the spirit of > netpoll, is to change that test above to the following: > > if (unlikely(netpoll_trap())) { > if (skb->dev->netpoll_rx && skb->dev->poll) > netpoll_rx(skb); > kfree_skb(skb); > return NET_RX_DROP; > } > > This changes semantics, in that the rx routine will only be called if we've > decided to "trap" the network stack. Note also that this will cause us to > drop packets while doing our logging. Huh. I'm not sure if that covers everything. We want to: a) push stuff through netpoll_rx() iff rx is enabled b) drop packets when netpoll_rx() says to c) drop packets when we're trapped for debugging/netdump d) drop packets when we manually call dev->poll e) keep as little overhead as possible The above breaks (a) when we're not trapped (consider breaking in with debugger) and (b) in any case. My current thought is we want to get this down to a single test in the fastpath by replacing netpoll_rx in skb->dev with a flag that is equal to netpoll_trapped() | dev->netpoll_rx, which we manipulate when call dev->poll within netpoll. How's that sound? -- Mathematics is the supreme nostalgia of our time. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-06-25 23:27 ` Matt Mackall @ 2004-06-28 14:48 ` Jeff Moyer 2004-06-28 15:19 ` Matt Mackall 0 siblings, 1 reply; 18+ messages in thread From: Jeff Moyer @ 2004-06-28 14:48 UTC (permalink / raw) To: Matt Mackall; +Cc: linux-kernel ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: [snip] mpm> Huh. I'm not sure if that covers everything. mpm> We want to: mpm> a) push stuff through netpoll_rx() iff rx is enabled b) drop packets mpm> when netpoll_rx() says to c) drop packets when we're trapped for mpm> debugging/netdump d) drop packets when we manually call dev->poll e) mpm> keep as little overhead as possible mpm> The above breaks (a) when we're not trapped (consider breaking in with mpm> debugger) and (b) in any case. mpm> My current thought is we want to get this down to a single test in the mpm> fastpath by replacing netpoll_rx in skb->dev with a flag that is equal mpm> to netpoll_trapped() | dev->netpoll_rx, which we manipulate when call dev-> poll within netpoll. mpm> How's that sound? That sounds fine. So, in the netif_receive_skb function, we'd now have something like: if (skb->dev->netpoll_rx && netpoll_rx(skb)) { kfree_skb(skb); return NET_RX_DROP; } I removed the test for skb->dev->poll, since this is the NAPI code path, I think it should always be present, right? And then at the top of the netpoll_rx routine: if (!(skb->dev->netpoll_rx & NETPOLL_RX_ENABLED)) goto out; The routine, as before, returns the value of trapped. And now, netpoll_poll would do this: if(np->dev->poll && test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) { np->dev->netpoll_rx |= NETPOLL_RX_TRAPPED; if (trapped) np->dev->poll(np->dev, &budget); else { trapped = 1; np->dev->poll(np->dev, &budget); trapped = 0; } np->dev->netpoll_rx &= ~NETPOLL_RX_TRAPPED; } Is this more in line with what you had in mind? One thing I would consider changing, here, is having netpoll_set_trap take a struct netpoll argument. That way, we could have netpoll_set_trap manipulate the NETPOLL_RX_TRAPPED bit. This would make the trap specific to each interface, but I think that may be desirable. -Jeff ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-06-28 14:48 ` Jeff Moyer @ 2004-06-28 15:19 ` Matt Mackall 2004-06-28 16:58 ` Jeff Moyer 0 siblings, 1 reply; 18+ messages in thread From: Matt Mackall @ 2004-06-28 15:19 UTC (permalink / raw) To: Jeff Moyer; +Cc: linux-kernel On Mon, Jun 28, 2004 at 10:48:41AM -0400, Jeff Moyer wrote: > ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: > > [snip] > > mpm> Huh. I'm not sure if that covers everything. > > mpm> We want to: > > mpm> a) push stuff through netpoll_rx() iff rx is enabled b) drop packets > mpm> when netpoll_rx() says to c) drop packets when we're trapped for > mpm> debugging/netdump d) drop packets when we manually call dev->poll e) > mpm> keep as little overhead as possible > > mpm> The above breaks (a) when we're not trapped (consider breaking in with > mpm> debugger) and (b) in any case. > > mpm> My current thought is we want to get this down to a single test in the > mpm> fastpath by replacing netpoll_rx in skb->dev with a flag that is equal > mpm> to netpoll_trapped() | dev->netpoll_rx, which we manipulate when call > dev-> poll within netpoll. > > mpm> How's that sound? > > That sounds fine. So, in the netif_receive_skb function, we'd now have > something like: > > if (skb->dev->netpoll_rx && netpoll_rx(skb)) { > kfree_skb(skb); > return NET_RX_DROP; > } > > I removed the test for skb->dev->poll, since this is the NAPI code path, I > think it should always be present, right? No, I think we get to this on the non-NAPI route as well. The ->poll check keeps us from filtering twice. > And then at the top of the netpoll_rx routine: > > if (!(skb->dev->netpoll_rx & NETPOLL_RX_ENABLED)) > goto out; Let's explicitly return 1. > The routine, as before, returns the value of trapped. And now, > netpoll_poll would do this: > > if(np->dev->poll && test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) { > np->dev->netpoll_rx |= NETPOLL_RX_TRAPPED; > if (trapped) > np->dev->poll(np->dev, &budget); > else { > trapped = 1; > np->dev->poll(np->dev, &budget); > trapped = 0; > } > np->dev->netpoll_rx &= ~NETPOLL_RX_TRAPPED; > } > > Is this more in line with what you had in mind? Yes. Let's make that NETPOLL_RX_DROP to disambiguate the trapped/dropped-for-polling. And we can do a simple increment/decrement on trapped itself - but do we still need it here now that we've shorted out the NAPI path with RX_DROP? There might be some new atomicity or memory barrier issues here, be mindful. > One thing I would consider changing, here, is having > netpoll_set_trap take a struct netpoll argument. That way, we could > have netpoll_set_trap manipulate the NETPOLL_RX_TRAPPED bit. This > would make the trap specific to each interface, but I think that may > be desirable. It doesn't really make sense for trapped to be non-global - it means the machine is in polling-only mode, eg debugger or the like. It's rather misusing it to kill the NAPI receive side for netpoll_poll. -- Mathematics is the supreme nostalgia of our time. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-06-28 15:19 ` Matt Mackall @ 2004-06-28 16:58 ` Jeff Moyer 2004-06-29 12:36 ` Jeff Moyer 2004-07-01 16:52 ` Matt Mackall 0 siblings, 2 replies; 18+ messages in thread From: Jeff Moyer @ 2004-06-28 16:58 UTC (permalink / raw) To: Matt Mackall; +Cc: linux-kernel ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: mpm> On Mon, Jun 28, 2004 at 10:48:41AM -0400, Jeff Moyer wrote: >> ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall >> <mpm@selenic.com> adds: >> >> [snip] >> mpm> Huh. I'm not sure if that covers everything. >> mpm> We want to: >> mpm> a) push stuff through netpoll_rx() iff rx is enabled b) drop packets mpm> when netpoll_rx() says to c) drop packets when we're trapped for mpm> debugging/netdump d) drop packets when we manually call dev->poll e) mpm> keep as little overhead as possible >> mpm> The above breaks (a) when we're not trapped (consider breaking in with mpm> debugger) and (b) in any case. >> mpm> My current thought is we want to get this down to a single test in the mpm> fastpath by replacing netpoll_rx in skb->dev with a flag that is equal mpm> to netpoll_trapped() | dev->netpoll_rx, which we manipulate when call dev-> poll within netpoll. >> mpm> How's that sound? >> That sounds fine. So, in the netif_receive_skb function, we'd now have >> something like: >> >> if (skb->dev->netpoll_rx && netpoll_rx(skb)) { kfree_skb(skb); return >> NET_RX_DROP; } >> >> I removed the test for skb->dev->poll, since this is the NAPI code path, >> I think it should always be present, right? mpm> No, I think we get to this on the non-NAPI route as well. The ->poll mpm> check keeps us from filtering twice. I'll have to take your word for it on this one, as I can't find a way into this code for the non-napi case. Would anyone care to give an authoritative answer on this? >> And then at the top of the netpoll_rx routine: >> >> if (!(skb->dev->netpoll_rx & NETPOLL_RX_ENABLED)) goto out; mpm> Let's explicitly return 1. Ahh, yes, that makes things much more clear. >> The routine, as before, returns the value of trapped. And now, >> netpoll_poll would do this: >> >> if(np->dev->poll && test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) { np-> dev->netpoll_rx |= NETPOLL_RX_TRAPPED; >> if (trapped) np-> dev->poll(np->dev, &budget); >> else { trapped = 1; np-> dev->poll(np->dev, &budget); >> trapped = 0; } np-> dev->netpoll_rx &= ~NETPOLL_RX_TRAPPED; >> } >> >> Is this more in line with what you had in mind? mpm> Yes. Let's make that NETPOLL_RX_DROP to disambiguate the mpm> trapped/dropped-for-polling. And we can do a simple mpm> increment/decrement on trapped itself - but do we still need it here mpm> now that we've shorted out the NAPI path with RX_DROP? Right. The code becomes much cleaner with this change (we don't need to increment/decrement trapped). mpm> There might be some new atomicity or memory barrier issues here, be mpm> mindful. I'll think about this some more. >> One thing I would consider changing, here, is having netpoll_set_trap >> take a struct netpoll argument. That way, we could have netpoll_set_trap >> manipulate the NETPOLL_RX_TRAPPED bit. This would make the trap specific >> to each interface, but I think that may be desirable. mpm> It doesn't really make sense for trapped to be non-global - it means mpm> the machine is in polling-only mode, eg debugger or the like. It's mpm> rather misusing it to kill the NAPI receive side for netpoll_poll. Fair enough. Attached is a patch with these changes. I've removed CONFIG_NETPOLL_RX as it no longer seems useful. Let me know what you think. One other thing we might consider is adding a call to touch_nmi_watchdog() to zap_completion_queue. Thanks! Jeff --- linux-2.6.7/arch/sparc64/defconfig.orig 2004-06-28 12:37:24.000000000 -0400 +++ linux-2.6.7/arch/sparc64/defconfig 2004-06-28 12:37:27.000000000 -0400 @@ -705,7 +705,6 @@ # CONFIG_NET_PKTGEN=m CONFIG_NETPOLL=y -# CONFIG_NETPOLL_RX is not set # CONFIG_NETPOLL_TRAP is not set CONFIG_NET_POLL_CONTROLLER=y CONFIG_HAMRADIO=y --- linux-2.6.7/arch/ia64/configs/sn2_defconfig.orig 2004-06-28 12:37:50.000000000 -0400 +++ linux-2.6.7/arch/ia64/configs/sn2_defconfig 2004-06-28 12:37:53.000000000 -0400 @@ -636,7 +636,6 @@ # # CONFIG_BT is not set CONFIG_NETPOLL=y -# CONFIG_NETPOLL_RX is not set # CONFIG_NETPOLL_TRAP is not set CONFIG_NET_POLL_CONTROLLER=y --- linux-2.6.7/arch/ppc64/configs/pSeries_defconfig.orig 2004-06-28 12:39:00.000000000 -0400 +++ linux-2.6.7/arch/ppc64/configs/pSeries_defconfig 2004-06-28 12:39:03.000000000 -0400 @@ -544,7 +544,6 @@ # # CONFIG_BT is not set CONFIG_NETPOLL=y -CONFIG_NETPOLL_RX=y CONFIG_NETPOLL_TRAP=y CONFIG_NET_POLL_CONTROLLER=y --- linux-2.6.7/arch/ppc64/configs/iSeries_defconfig.orig 2004-06-28 12:39:19.000000000 -0400 +++ linux-2.6.7/arch/ppc64/configs/iSeries_defconfig 2004-06-28 12:39:23.000000000 -0400 @@ -358,7 +358,6 @@ # # CONFIG_NET_PKTGEN is not set CONFIG_NETPOLL=y -CONFIG_NETPOLL_RX=y CONFIG_NETPOLL_TRAP=y CONFIG_NET_POLL_CONTROLLER=y # CONFIG_HAMRADIO is not set --- linux-2.6.7/arch/ppc64/defconfig.orig 2004-06-28 12:40:06.000000000 -0400 +++ linux-2.6.7/arch/ppc64/defconfig 2004-06-28 12:40:08.000000000 -0400 @@ -544,8 +544,7 @@ # # CONFIG_BT is not set CONFIG_NETPOLL=y -CONFIG_NETPOLL_RX=y CONFIG_NETPOLL_TRAP=y CONFIG_NET_POLL_CONTROLLER=y --- linux-2.6.7/arch/x86_64/defconfig.orig 2004-06-28 12:40:21.000000000 -0400 +++ linux-2.6.7/arch/x86_64/defconfig 2004-06-28 12:40:24.000000000 -0400 @@ -399,7 +399,6 @@ # # CONFIG_NET_PKTGEN is not set CONFIG_NETPOLL=y -# CONFIG_NETPOLL_RX is not set # CONFIG_NETPOLL_TRAP is not set CONFIG_NET_POLL_CONTROLLER=y # CONFIG_HAMRADIO is not set --- linux-2.6.7/include/linux/netdevice.h.orig 2004-06-28 12:05:36.000000000 -0400 +++ linux-2.6.7/include/linux/netdevice.h 2004-06-28 12:13:36.000000000 -0400 @@ -459,7 +459,7 @@ unsigned char *haddr); int (*neigh_setup)(struct net_device *dev, struct neigh_parms *); int (*accept_fastpath)(struct net_device *, struct dst_entry*); -#ifdef CONFIG_NETPOLL_RX +#ifdef CONFIG_NETPOLL int netpoll_rx; #endif #ifdef CONFIG_NET_POLL_CONTROLLER --- linux-2.6.7/net/core/dev.c.orig 2004-06-28 12:05:22.000000000 -0400 +++ linux-2.6.7/net/core/dev.c 2004-06-28 12:40:52.000000000 -0400 @@ -1575,7 +1575,7 @@ struct softnet_data *queue; unsigned long flags; -#ifdef CONFIG_NETPOLL_RX +#ifdef CONFIG_NETPOLL if (skb->dev->netpoll_rx && netpoll_rx(skb)) { kfree_skb(skb); return NET_RX_DROP; @@ -1737,7 +1737,7 @@ int ret = NET_RX_DROP; unsigned short type; -#ifdef CONFIG_NETPOLL_RX +#ifdef CONFIG_NETPOLL if (skb->dev->netpoll_rx && skb->dev->poll && netpoll_rx(skb)) { kfree_skb(skb); return NET_RX_DROP; --- linux-2.6.7/net/core/netpoll.c.orig 2004-06-28 12:05:28.000000000 -0400 +++ linux-2.6.7/net/core/netpoll.c 2004-06-28 12:11:45.000000000 -0400 @@ -29,6 +29,9 @@ #define MAX_SKBS 32 #define MAX_UDP_CHUNK 1460 +#define NETPOLL_RX_ENABLED 1 +#define NETPOLL_RX_DROP 2 + static spinlock_t skb_list_lock = SPIN_LOCK_UNLOCKED; static int nr_skbs; static struct sk_buff *skbs; @@ -70,9 +73,12 @@ np->dev->poll_controller(np->dev); /* If scheduling is stopped, tickle NAPI bits */ - if(trapped && np->dev->poll && - test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) + if (np->dev->poll && + test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) { + np->dev->netpoll_rx |= NETPOLL_RX_DROP; np->dev->poll(np->dev, &budget); + np->dev->netpoll_rx &= ~NETPOLL_RX_DROP; + } zap_completion_queue(); } @@ -333,6 +339,9 @@ struct list_head *p; unsigned long flags; + if (!(skb->dev->netpoll_rx & NETPOLL_RX_ENABLED)) + return 1; + if (skb->dev->type != ARPHRD_ETHER) goto out; @@ -591,10 +600,7 @@ if(np->rx_hook) { unsigned long flags; -#ifdef CONFIG_NETPOLL_RX - np->dev->netpoll_rx = 1; -#endif - + np->dev->netpoll_rx = NETPOLL_RX_ENABLED; spin_lock_irqsave(&rx_list_lock, flags); list_add(&np->rx_list, &rx_list); spin_unlock_irqrestore(&rx_list_lock, flags); @@ -613,9 +619,7 @@ spin_lock_irqsave(&rx_list_lock, flags); list_del(&np->rx_list); -#ifdef CONFIG_NETPOLL_RX - np->dev->netpoll_rx = 0; -#endif + np->dev->netpoll_rx &= ~NETPOLL_RX_ENABLED; spin_unlock_irqrestore(&rx_list_lock, flags); } ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-06-28 16:58 ` Jeff Moyer @ 2004-06-29 12:36 ` Jeff Moyer 2004-07-01 16:52 ` Matt Mackall 1 sibling, 0 replies; 18+ messages in thread From: Jeff Moyer @ 2004-06-29 12:36 UTC (permalink / raw) To: Matt Mackall, linux-kernel [snip] @@ -70,9 +73,12 @@ np->dev->poll_controller(np->dev); /* If scheduling is stopped, tickle NAPI bits */ - if(trapped && np->dev->poll && - test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) + if (np->dev->poll && + test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) { + np->dev->netpoll_rx |= NETPOLL_RX_DROP; np->dev->poll(np->dev, &budget); + np->dev->netpoll_rx &= ~NETPOLL_RX_DROP; + } zap_completion_queue(); } Silly me. This still needs to set trapped (and still needs more thinking about smp issues). -Jeff ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-06-28 16:58 ` Jeff Moyer 2004-06-29 12:36 ` Jeff Moyer @ 2004-07-01 16:52 ` Matt Mackall 2004-07-01 16:54 ` Jeff Moyer 1 sibling, 1 reply; 18+ messages in thread From: Matt Mackall @ 2004-07-01 16:52 UTC (permalink / raw) To: Jeff Moyer; +Cc: linux-kernel On Mon, Jun 28, 2004 at 12:58:22PM -0400, Jeff Moyer wrote: > ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: > mpm> No, I think we get to this on the non-NAPI route as well. The ->poll > mpm> check keeps us from filtering twice. > > I'll have to take your word for it on this one, as I can't find a way into > this code for the non-napi case. Would anyone care to give an > authoritative answer on this? I could be mistaken, but that's my recollection from last summer. Hopefully I'll have some spare cycles for this next week. > One other thing we might consider is adding a call to touch_nmi_watchdog() > to zap_completion_queue. Not sure how I feel about that yet. If we can't get out of the guts of netpoll in a timely fashion, then perhaps that's an indication that the watchdog should indeed fire. Describe the scenario where you think we should do this, please. -- Mathematics is the supreme nostalgia of our time. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-07-01 16:52 ` Matt Mackall @ 2004-07-01 16:54 ` Jeff Moyer 2004-07-01 17:05 ` Matt Mackall 0 siblings, 1 reply; 18+ messages in thread From: Jeff Moyer @ 2004-07-01 16:54 UTC (permalink / raw) To: Matt Mackall; +Cc: linux-kernel ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: mpm> On Mon, Jun 28, 2004 at 12:58:22PM -0400, Jeff Moyer wrote: >> ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall >> <mpm@selenic.com> adds: mpm> No, I think we get to this on the non-NAPI route as well. The ->poll mpm> check keeps us from filtering twice. >> I'll have to take your word for it on this one, as I can't find a way >> into this code for the non-napi case. Would anyone care to give an >> authoritative answer on this? mpm> I could be mistaken, but that's my recollection from last summer. mpm> Hopefully I'll have some spare cycles for this next week. >> One other thing we might consider is adding a call to >> touch_nmi_watchdog() to zap_completion_queue. mpm> Not sure how I feel about that yet. If we can't get out of the guts of mpm> netpoll in a timely fashion, then perhaps that's an indication that mpm> the watchdog should indeed fire. Describe the scenario where you think mpm> we should do this, please. Alt-Sysrq-t from the keyboard on a heavily-loaded UP system. -Jeff ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-07-01 16:54 ` Jeff Moyer @ 2004-07-01 17:05 ` Matt Mackall 2004-07-01 17:53 ` Jeff Moyer 0 siblings, 1 reply; 18+ messages in thread From: Matt Mackall @ 2004-07-01 17:05 UTC (permalink / raw) To: Jeff Moyer; +Cc: linux-kernel On Thu, Jul 01, 2004 at 12:54:50PM -0400, Jeff Moyer wrote: > ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: > > mpm> On Mon, Jun 28, 2004 at 12:58:22PM -0400, Jeff Moyer wrote: > >> ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall > >> <mpm@selenic.com> adds: > mpm> No, I think we get to this on the non-NAPI route as well. The ->poll > mpm> check keeps us from filtering twice. > >> I'll have to take your word for it on this one, as I can't find a way > >> into this code for the non-napi case. Would anyone care to give an > >> authoritative answer on this? > > mpm> I could be mistaken, but that's my recollection from last summer. > mpm> Hopefully I'll have some spare cycles for this next week. > > >> One other thing we might consider is adding a call to > >> touch_nmi_watchdog() to zap_completion_queue. > > mpm> Not sure how I feel about that yet. If we can't get out of the guts of > mpm> netpoll in a timely fashion, then perhaps that's an indication that > mpm> the watchdog should indeed fire. Describe the scenario where you think > mpm> we should do this, please. > > Alt-Sysrq-t from the keyboard on a heavily-loaded UP system. Is it because we're spinning for too long waiting for skbs or is it just because sysrq-t takes that long to dump its guts? How does the watchdog get tickled in the sysrq-t over slow serial console case? In other words, would this better be done between dumps in the sysrq-t loop where we actually know we're making forward progress? -- Mathematics is the supreme nostalgia of our time. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: netconsole hangs w/ alt-sysrq-t 2004-07-01 17:05 ` Matt Mackall @ 2004-07-01 17:53 ` Jeff Moyer 0 siblings, 0 replies; 18+ messages in thread From: Jeff Moyer @ 2004-07-01 17:53 UTC (permalink / raw) To: Matt Mackall; +Cc: linux-kernel ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall <mpm@selenic.com> adds: mpm> On Thu, Jul 01, 2004 at 12:54:50PM -0400, Jeff Moyer wrote: >> ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall >> <mpm@selenic.com> adds: >> mpm> On Mon, Jun 28, 2004 at 12:58:22PM -0400, Jeff Moyer wrote: >> >> ==> Regarding Re: netconsole hangs w/ alt-sysrq-t; Matt Mackall >> >> <mpm@selenic.com> adds: mpm> No, I think we get to this on the non-NAPI route as well. The ->poll mpm> check keeps us from filtering twice. >> >> I'll have to take your word for it on this one, as I can't find a way >> >> into this code for the non-napi case. Would anyone care to give an >> >> authoritative answer on this? >> mpm> I could be mistaken, but that's my recollection from last summer. mpm> Hopefully I'll have some spare cycles for this next week. >> >> One other thing we might consider is adding a call to >> >> touch_nmi_watchdog() to zap_completion_queue. >> mpm> Not sure how I feel about that yet. If we can't get out of the guts of mpm> netpoll in a timely fashion, then perhaps that's an indication that mpm> the watchdog should indeed fire. Describe the scenario where you think mpm> we should do this, please. >> Alt-Sysrq-t from the keyboard on a heavily-loaded UP system. mpm> Is it because we're spinning for too long waiting for skbs or is it mpm> just because sysrq-t takes that long to dump its guts? How does the mpm> watchdog get tickled in the sysrq-t over slow serial console case? mpm> In other words, would this better be done between dumps in the sysrq-t mpm> loop where we actually know we're making forward progress? Good point. I tend to center my thinking too much around netdump. You're right, if we're waiting too long for skb's, we probably should trigger the NMI watchdog. It also appears that show_state includes a call to touch_nmi_watchdog in its loop. -Jeff ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2004-07-01 17:55 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2004-04-22 15:29 netconsole hangs w/ alt-sysrq-t Jeff Moyer 2004-04-25 19:15 ` Matt Mackall 2004-04-28 12:44 ` Jeff Moyer 2004-04-28 14:03 ` Matt Mackall 2004-04-28 14:07 ` Jeff Moyer 2004-04-28 14:27 ` Matt Mackall 2004-04-28 18:23 ` Jeff Moyer 2004-05-20 2:20 ` klogd, netconsole without /dev/console? Tom Oehser 2004-06-25 15:50 ` netconsole hangs w/ alt-sysrq-t Jeff Moyer 2004-06-25 23:27 ` Matt Mackall 2004-06-28 14:48 ` Jeff Moyer 2004-06-28 15:19 ` Matt Mackall 2004-06-28 16:58 ` Jeff Moyer 2004-06-29 12:36 ` Jeff Moyer 2004-07-01 16:52 ` Matt Mackall 2004-07-01 16:54 ` Jeff Moyer 2004-07-01 17:05 ` Matt Mackall 2004-07-01 17:53 ` Jeff Moyer
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.