* [BUG] Something goes wrong with timer statistics.
@ 2007-05-29 21:38 Ian Kumlien
2007-05-29 22:37 ` David Miller
` (2 more replies)
0 siblings, 3 replies; 25+ messages in thread
From: Ian Kumlien @ 2007-05-29 21:38 UTC (permalink / raw)
To: Linux-kernel
Hi,
As the daystar sets, i try to play some with my new would be
firewall/server, but since this will be running for quite some time i
have been experimenting with powertop to find out what i can do to limit
it's power usage.
But, if i run powertop for too long or a few times to many... this
happens:
http://pomac.netswarm.net/pics/kernel_panic.jpg
If i don't run powertop, it is rock solid... Compiling for hours,
running memtest for hours etc etc...
--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net
^ permalink raw reply [flat|nested] 25+ messages in thread* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 21:38 [BUG] Something goes wrong with timer statistics Ian Kumlien @ 2007-05-29 22:37 ` David Miller 2007-05-29 22:42 ` Thomas Gleixner [not found] ` <6bffcb0e0705291541q7e6d6faen5cd6345074c4c92a@mail.gmail.com> [not found] ` <6bffcb0e0705291524y2752d646p94b0bf6ca87af68f@mail.gmail.com> 2007-05-30 0:13 ` Stephane Casset 2 siblings, 2 replies; 25+ messages in thread From: David Miller @ 2007-05-29 22:37 UTC (permalink / raw) To: iank; +Cc: Linux-kernel From: Ian Kumlien <iank@bredband.net> Date: Tue, 29 May 2007 23:38:48 +0200 > As the daystar sets, i try to play some with my new would be > firewall/server, but since this will be running for quite some time i > have been experimenting with powertop to find out what i can do to limit > it's power usage. > > But, if i run powertop for too long or a few times to many... this > happens: > http://pomac.netswarm.net/pics/kernel_panic.jpg > > If i don't run powertop, it is rock solid... Compiling for hours, > running memtest for hours etc etc... I see this same exact problem on sparc64. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 22:37 ` David Miller @ 2007-05-29 22:42 ` Thomas Gleixner 2007-05-29 22:44 ` David Miller [not found] ` <6bffcb0e0705291541q7e6d6faen5cd6345074c4c92a@mail.gmail.com> 1 sibling, 1 reply; 25+ messages in thread From: Thomas Gleixner @ 2007-05-29 22:42 UTC (permalink / raw) To: David Miller; +Cc: iank, Linux-kernel, Ingo Molnar, Arjan van de Ven On Tue, 2007-05-29 at 15:37 -0700, David Miller wrote: > From: Ian Kumlien <iank@bredband.net> > Date: Tue, 29 May 2007 23:38:48 +0200 > > > As the daystar sets, i try to play some with my new would be > > firewall/server, but since this will be running for quite some time i > > have been experimenting with powertop to find out what i can do to limit > > it's power usage. > > > > But, if i run powertop for too long or a few times to many... this > > happens: > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > If i don't run powertop, it is rock solid... Compiling for hours, > > running memtest for hours etc etc... > > I see this same exact problem on sparc64. 100% reproducible ? tglx ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 22:42 ` Thomas Gleixner @ 2007-05-29 22:44 ` David Miller 2007-05-29 22:53 ` Thomas Gleixner 0 siblings, 1 reply; 25+ messages in thread From: David Miller @ 2007-05-29 22:44 UTC (permalink / raw) To: tglx; +Cc: iank, Linux-kernel, mingo, arjan From: Thomas Gleixner <tglx@linutronix.de> Date: Wed, 30 May 2007 00:42:12 +0200 > On Tue, 2007-05-29 at 15:37 -0700, David Miller wrote: > > I see this same exact problem on sparc64. > > 100% reproducible ? I think so, I'll try to find time to tinker with it a bit more. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 22:44 ` David Miller @ 2007-05-29 22:53 ` Thomas Gleixner 0 siblings, 0 replies; 25+ messages in thread From: Thomas Gleixner @ 2007-05-29 22:53 UTC (permalink / raw) To: David Miller; +Cc: iank, Linux-kernel, mingo, arjan On Tue, 2007-05-29 at 15:44 -0700, David Miller wrote: > From: Thomas Gleixner <tglx@linutronix.de> > Date: Wed, 30 May 2007 00:42:12 +0200 > > > On Tue, 2007-05-29 at 15:37 -0700, David Miller wrote: > > > I see this same exact problem on sparc64. > > > > 100% reproducible ? > > I think so, I'll try to find time to tinker with it a bit > more. It seems to be a subtle bug in the timer stats code, but I did not spot it. tglx ^ permalink raw reply [flat|nested] 25+ messages in thread
[parent not found: <6bffcb0e0705291541q7e6d6faen5cd6345074c4c92a@mail.gmail.com>]
* Re: [BUG] Something goes wrong with timer statistics. [not found] ` <6bffcb0e0705291541q7e6d6faen5cd6345074c4c92a@mail.gmail.com> @ 2007-05-29 22:44 ` David Miller 2007-05-29 22:51 ` Ian Kumlien 0 siblings, 1 reply; 25+ messages in thread From: David Miller @ 2007-05-29 22:44 UTC (permalink / raw) To: michal.k.k.piotrowski; +Cc: iank, Linux-kernel From: "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> Date: Wed, 30 May 2007 00:41:46 +0200 > On 30/05/07, David Miller <davem@davemloft.net> wrote: > > From: Ian Kumlien <iank@bredband.net> > > Date: Tue, 29 May 2007 23:38:48 +0200 > > > > > As the daystar sets, i try to play some with my new would be > > > firewall/server, but since this will be running for quite some time i > > > have been experimenting with powertop to find out what i can do to limit > > > it's power usage. > > > > > > But, if i run powertop for too long or a few times to many... this > > > happens: > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > > > If i don't run powertop, it is rock solid... Compiling for hours, > > > running memtest for hours etc etc... > > > > I see this same exact problem on sparc64. > > Have you tried this patch? > http://lkml.org/lkml/2007/5/29/392 Of course, I worked with Thomas on the fix :-) ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 22:44 ` David Miller @ 2007-05-29 22:51 ` Ian Kumlien 2007-05-29 23:03 ` David Miller 0 siblings, 1 reply; 25+ messages in thread From: Ian Kumlien @ 2007-05-29 22:51 UTC (permalink / raw) To: David Miller; +Cc: michal.k.k.piotrowski, Linux-kernel On tis, 2007-05-29 at 15:44 -0700, David Miller wrote: > From: "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> > Date: Wed, 30 May 2007 00:41:46 +0200 > > > On 30/05/07, David Miller <davem@davemloft.net> wrote: > > > From: Ian Kumlien <iank@bredband.net> > > > Date: Tue, 29 May 2007 23:38:48 +0200 > > > > > > > As the daystar sets, i try to play some with my new would be > > > > firewall/server, but since this will be running for quite some time i > > > > have been experimenting with powertop to find out what i can do to limit > > > > it's power usage. > > > > > > > > But, if i run powertop for too long or a few times to many... this > > > > happens: > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > > > > > If i don't run powertop, it is rock solid... Compiling for hours, > > > > running memtest for hours etc etc... > > > > > > I see this same exact problem on sparc64. > > > > Have you tried this patch? > > http://lkml.org/lkml/2007/5/29/392 > > Of course, I worked with Thomas on the fix :-) I dunno if this applies to me, i run a 32 bit userspace... -- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 22:51 ` Ian Kumlien @ 2007-05-29 23:03 ` David Miller 2007-05-30 0:04 ` Ian Kumlien 0 siblings, 1 reply; 25+ messages in thread From: David Miller @ 2007-05-29 23:03 UTC (permalink / raw) To: iank; +Cc: michal.k.k.piotrowski, Linux-kernel From: Ian Kumlien <iank@bredband.net> Date: Wed, 30 May 2007 00:51:52 +0200 > On tis, 2007-05-29 at 15:44 -0700, David Miller wrote: > > From: "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> > > Date: Wed, 30 May 2007 00:41:46 +0200 > > > > > On 30/05/07, David Miller <davem@davemloft.net> wrote: > > > > From: Ian Kumlien <iank@bredband.net> > > > > Date: Tue, 29 May 2007 23:38:48 +0200 > > > > > > > > > As the daystar sets, i try to play some with my new would be > > > > > firewall/server, but since this will be running for quite some time i > > > > > have been experimenting with powertop to find out what i can do to limit > > > > > it's power usage. > > > > > > > > > > But, if i run powertop for too long or a few times to many... this > > > > > happens: > > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > > > > > > > If i don't run powertop, it is rock solid... Compiling for hours, > > > > > running memtest for hours etc etc... > > > > > > > > I see this same exact problem on sparc64. > > > > > > Have you tried this patch? > > > http://lkml.org/lkml/2007/5/29/392 > > > > Of course, I worked with Thomas on the fix :-) > > I dunno if this applies to me, i run a 32 bit userspace... That patch fixes a problem on 64-bit kernels, regardless of userspace, when NOHZ is enabled. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 23:03 ` David Miller @ 2007-05-30 0:04 ` Ian Kumlien 0 siblings, 0 replies; 25+ messages in thread From: Ian Kumlien @ 2007-05-30 0:04 UTC (permalink / raw) To: David Miller; +Cc: michal.k.k.piotrowski, Linux-kernel On tis, 2007-05-29 at 16:03 -0700, David Miller wrote: > From: Ian Kumlien <iank@bredband.net> > Date: Wed, 30 May 2007 00:51:52 +0200 > > > On tis, 2007-05-29 at 15:44 -0700, David Miller wrote: > > > From: "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> > > > Date: Wed, 30 May 2007 00:41:46 +0200 > > > > > > > On 30/05/07, David Miller <davem@davemloft.net> wrote: > > > > > From: Ian Kumlien <iank@bredband.net> > > > > > Date: Tue, 29 May 2007 23:38:48 +0200 > > > > > > > > > > > As the daystar sets, i try to play some with my new would be > > > > > > firewall/server, but since this will be running for quite some time i > > > > > > have been experimenting with powertop to find out what i can do to limit > > > > > > it's power usage. > > > > > > > > > > > > But, if i run powertop for too long or a few times to many... this > > > > > > happens: > > > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > > > > > > > > > If i don't run powertop, it is rock solid... Compiling for hours, > > > > > > running memtest for hours etc etc... > > > > > > > > > > I see this same exact problem on sparc64. > > > > > > > > Have you tried this patch? > > > > http://lkml.org/lkml/2007/5/29/392 > > > > > > Of course, I worked with Thomas on the fix :-) > > > > I dunno if this applies to me, i run a 32 bit userspace... > > That patch fixes a problem on 64-bit kernels, regardless of > userspace, when NOHZ is enabled. Sorry, i run all 32 bit, i was on the phone while typing that, so a wee bit distracted. Anyways, i run a nohz 32 bit kernel. I dunno how good a intel T7200 is when it comes to 64 bit kernel/userspace.. (Haven't really read up on it) -- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net ^ permalink raw reply [flat|nested] 25+ messages in thread
[parent not found: <6bffcb0e0705291524y2752d646p94b0bf6ca87af68f@mail.gmail.com>]
* Re: [BUG] Something goes wrong with timer statistics. [not found] ` <6bffcb0e0705291524y2752d646p94b0bf6ca87af68f@mail.gmail.com> @ 2007-05-29 22:35 ` Ian Kumlien 2007-05-29 22:38 ` Thomas Gleixner 1 sibling, 0 replies; 25+ messages in thread From: Ian Kumlien @ 2007-05-29 22:35 UTC (permalink / raw) To: Michal Piotrowski; +Cc: Thomas Gleixner, Linux-kernel On ons, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote: > Hi Ian, > > (Thomas "The Wizard of Time" added to CC :)) > > On 29/05/07, Ian Kumlien <iank@bredband.net> wrote: > > Hi, > > > > As the daystar sets, i try to play some with my new would be > > firewall/server, but since this will be running for quite some time i > > have been experimenting with powertop to find out what i can do to limit > > it's power usage. > > > > But, if i run powertop for too long or a few times to many... this > > happens: > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > If i don't run powertop, it is rock solid... Compiling for hours, > > running memtest for hours etc etc... > > (Where is my magic ball?) (Lost? =)) > Let me guess - Pentium 4 with HT? Core 2 Duo, without HT =) > Also, please try the latest hrt tree > http://tglx.de/projects/hrtimers/2.6.22-rc3/ If you still think i should try that, i will, this box is not in production yet for a while so... =) I switched to using 64 bit hpet after the last crash, will this influence the results? > Regards, > Michal > -- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. [not found] ` <6bffcb0e0705291524y2752d646p94b0bf6ca87af68f@mail.gmail.com> 2007-05-29 22:35 ` Ian Kumlien @ 2007-05-29 22:38 ` Thomas Gleixner 2007-05-29 22:50 ` Ian Kumlien ` (2 more replies) 1 sibling, 3 replies; 25+ messages in thread From: Thomas Gleixner @ 2007-05-29 22:38 UTC (permalink / raw) To: Michal Piotrowski Cc: Ian Kumlien, Linux-kernel, Ingo Molnar, Arjan van de Ven On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote: > Hi Ian, > > (Thomas "The Wizard of Time" added to CC :)) Added more wizards :) > On 29/05/07, Ian Kumlien <iank@bredband.net> wrote: > > Hi, > > > > As the daystar sets, i try to play some with my new would be > > firewall/server, but since this will be running for quite some time i > > have been experimenting with powertop to find out what i can do to limit > > it's power usage. > > > > But, if i run powertop for too long or a few times to many... this > > happens: > > http://pomac.netswarm.net/pics/kernel_panic.jpg This was reported before. It's incredibly hard to reproduce. Ian, is this happening reproducible ? > > If i don't run powertop, it is rock solid... Compiling for hours, > > running memtest for hours etc etc... > > (Where is my magic ball?) > > Let me guess - Pentium 4 with HT? I might be wrong, but it's a newer core duo thingy. tglx ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 22:38 ` Thomas Gleixner @ 2007-05-29 22:50 ` Ian Kumlien 2007-05-30 6:52 ` Björn Steinbrink 2007-05-30 11:38 ` Björn Steinbrink 2 siblings, 0 replies; 25+ messages in thread From: Ian Kumlien @ 2007-05-29 22:50 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Piotrowski, Linux-kernel, Ingo Molnar, Arjan van de Ven On ons, 2007-05-30 at 00:38 +0200, Thomas Gleixner wrote: > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote: > > Hi Ian, > > > > (Thomas "The Wizard of Time" added to CC :)) > > Added more wizards :) > > > On 29/05/07, Ian Kumlien <iank@bredband.net> wrote: > > > Hi, > > > > > > As the daystar sets, i try to play some with my new would be > > > firewall/server, but since this will be running for quite some time i > > > have been experimenting with powertop to find out what i can do to limit > > > it's power usage. > > > > > > But, if i run powertop for too long or a few times to many... this > > > happens: > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > This was reported before. It's incredibly hard to reproduce. > > Ian, is this happening reproducible ? Seems like it, i haven't actually TRIED to crash yet... But i can try if i don't have to crash in the future =) (The machine is just a motherboard lying on the floor with some disks attached, so i have to bring out my pocket knife and short circuit the right pins =)) > > > If i don't run powertop, it is rock solid... Compiling for hours, > > > running memtest for hours etc etc... > > > > (Where is my magic ball?) > > > > Let me guess - Pentium 4 with HT? > > I might be wrong, but it's a newer core duo thingy. > > tglx > -- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 22:38 ` Thomas Gleixner 2007-05-29 22:50 ` Ian Kumlien @ 2007-05-30 6:52 ` Björn Steinbrink 2007-05-30 7:10 ` Björn Steinbrink 2007-05-30 11:38 ` Björn Steinbrink 2 siblings, 1 reply; 25+ messages in thread From: Björn Steinbrink @ 2007-05-30 6:52 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Piotrowski, Ian Kumlien, Linux-kernel, Ingo Molnar, Arjan van de Ven On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote: > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote: > > Hi Ian, > > > > (Thomas "The Wizard of Time" added to CC :)) > > Added more wizards :) > > > On 29/05/07, Ian Kumlien <iank@bredband.net> wrote: > > > Hi, > > > > > > As the daystar sets, i try to play some with my new would be > > > firewall/server, but since this will be running for quite some time i > > > have been experimenting with powertop to find out what i can do to limit > > > it's power usage. > > > > > > But, if i run powertop for too long or a few times to many... this > > > happens: > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > This was reported before. It's incredibly hard to reproduce. Hm, tstats_write() only grabs show_mutex, but it calls reset_entries() which happily messes with the entries table, shouldn't the table_lock also be held in that case? Björn ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-30 6:52 ` Björn Steinbrink @ 2007-05-30 7:10 ` Björn Steinbrink 0 siblings, 0 replies; 25+ messages in thread From: Björn Steinbrink @ 2007-05-30 7:10 UTC (permalink / raw) To: Thomas Gleixner, Michal Piotrowski, Ian Kumlien, Linux-kernel, Ingo Molnar, Arjan van de Ven On 2007.05.30 08:52:04 +0200, Björn Steinbrink wrote: > On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote: > > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote: > > > Hi Ian, > > > > > > (Thomas "The Wizard of Time" added to CC :)) > > > > Added more wizards :) > > > > > On 29/05/07, Ian Kumlien <iank@bredband.net> wrote: > > > > Hi, > > > > > > > > As the daystar sets, i try to play some with my new would be > > > > firewall/server, but since this will be running for quite some time i > > > > have been experimenting with powertop to find out what i can do to limit > > > > it's power usage. > > > > > > > > But, if i run powertop for too long or a few times to many... this > > > > happens: > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > This was reported before. It's incredibly hard to reproduce. > > Hm, tstats_write() only grabs show_mutex, but it calls reset_entries() > which happily messes with the entries table, shouldn't the table_lock > also be held in that case? Hrmpf, nevermind, I missed the active flag... Sorry, Björn ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 22:38 ` Thomas Gleixner 2007-05-29 22:50 ` Ian Kumlien 2007-05-30 6:52 ` Björn Steinbrink @ 2007-05-30 11:38 ` Björn Steinbrink 2007-05-30 12:44 ` Eric Dumazet 2 siblings, 1 reply; 25+ messages in thread From: Björn Steinbrink @ 2007-05-30 11:38 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Piotrowski, Ian Kumlien, Linux-kernel, Ingo Molnar, Arjan van de Ven On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote: > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote: > > Hi Ian, > > > > (Thomas "The Wizard of Time" added to CC :)) > > Added more wizards :) > > > On 29/05/07, Ian Kumlien <iank@bredband.net> wrote: > > > Hi, > > > > > > As the daystar sets, i try to play some with my new would be > > > firewall/server, but since this will be running for quite some time i > > > have been experimenting with powertop to find out what i can do to limit > > > it's power usage. > > > > > > But, if i run powertop for too long or a few times to many... this > > > happens: > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > This was reported before. It's incredibly hard to reproduce. OK, second try, this time with a patch. In timer_stats_update_stats, input is allocated on the stack, so it is uninitialized, in particular the "next" field is random. Now in tstat_lookup, the new entry "curr" is initialized with the values from "input" (passed as "entry") and "next" is set to NULL _after_ it is added to the list, so if a second CPU is running the fastpath lookup while we're inserting the new element, it might get the garbage value in "next". The patch below fixes that. Björn Initialize the "next" field of a timer stats entry before it is inserted into the list to avoid a race with the fastpath lookup. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> --- diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c index 868f1bc..5bc8f91 100644 --- a/kernel/time/timer_stats.c +++ b/kernel/time/timer_stats.c @@ -202,12 +202,12 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm) if (curr) { *curr = *entry; curr->count = 0; + curr->next = NULL; memcpy(curr->comm, comm, TASK_COMM_LEN); if (prev) prev->next = curr; else *head = curr; - curr->next = NULL; } out_unlock: spin_unlock(&table_lock); ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-30 11:38 ` Björn Steinbrink @ 2007-05-30 12:44 ` Eric Dumazet 2007-05-30 13:14 ` Björn Steinbrink 0 siblings, 1 reply; 25+ messages in thread From: Eric Dumazet @ 2007-05-30 12:44 UTC (permalink / raw) To: Björn Steinbrink Cc: Thomas Gleixner, Michal Piotrowski, Ian Kumlien, Linux-kernel, Ingo Molnar, Arjan van de Ven On Wed, 30 May 2007 13:38:25 +0200 Björn Steinbrink <B.Steinbrink@gmx.de> wrote: > On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote: > > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote: > > > Hi Ian, > > > > > > (Thomas "The Wizard of Time" added to CC :)) > > > > Added more wizards :) > > > > > On 29/05/07, Ian Kumlien <iank@bredband.net> wrote: > > > > Hi, > > > > > > > > As the daystar sets, i try to play some with my new would be > > > > firewall/server, but since this will be running for quite some time i > > > > have been experimenting with powertop to find out what i can do to limit > > > > it's power usage. > > > > > > > > But, if i run powertop for too long or a few times to many... this > > > > happens: > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > This was reported before. It's incredibly hard to reproduce. > > OK, second try, this time with a patch. In timer_stats_update_stats, > input is allocated on the stack, so it is uninitialized, in particular > the "next" field is random. Now in tstat_lookup, the new entry "curr" is > initialized with the values from "input" (passed as "entry") and "next" > is set to NULL _after_ it is added to the list, so if a second CPU is > running the fastpath lookup while we're inserting the new element, it > might get the garbage value in "next". The patch below fixes that. > > Björn > > > > Initialize the "next" field of a timer stats entry before it is inserted > into the list to avoid a race with the fastpath lookup. > > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> > --- > diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c > index 868f1bc..5bc8f91 100644 > --- a/kernel/time/timer_stats.c > +++ b/kernel/time/timer_stats.c > @@ -202,12 +202,12 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm) > if (curr) { > *curr = *entry; > curr->count = 0; > + curr->next = NULL; > memcpy(curr->comm, comm, TASK_COMM_LEN); > if (prev) > prev->next = curr; > else > *head = curr; > - curr->next = NULL; > } > out_unlock: > spin_unlock(&table_lock); > Your analysis might be right, not the fix. You *cannot* assume curr->next = NULL will be done before insert. You probably also need a memory barrier. curr->next = NULL; memcpy(curr->comm, comm, TASK_COMM_LEN); smp_mb(); /* commit all writes to curr before insert */ if (prev) prev->next = curr; else *head = curr; ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-30 12:44 ` Eric Dumazet @ 2007-05-30 13:14 ` Björn Steinbrink [not found] ` <20070531102047.GG19272@pomac.netswarm.net> 0 siblings, 1 reply; 25+ messages in thread From: Björn Steinbrink @ 2007-05-30 13:14 UTC (permalink / raw) To: Eric Dumazet Cc: Thomas Gleixner, Michal Piotrowski, Ian Kumlien, Linux-kernel, Ingo Molnar, Arjan van de Ven On 2007.05.30 14:44:49 +0200, Eric Dumazet wrote: > On Wed, 30 May 2007 13:38:25 +0200 > Björn Steinbrink <B.Steinbrink@gmx.de> wrote: > > > On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote: > > > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote: > > > > On 29/05/07, Ian Kumlien <iank@bredband.net> wrote: > > > > > Hi, > > > > > > > > > > As the daystar sets, i try to play some with my new would be > > > > > firewall/server, but since this will be running for quite some time i > > > > > have been experimenting with powertop to find out what i can do to limit > > > > > it's power usage. > > > > > > > > > > But, if i run powertop for too long or a few times to many... this > > > > > happens: > > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > > > This was reported before. It's incredibly hard to reproduce. > > > > OK, second try, this time with a patch. In timer_stats_update_stats, > > input is allocated on the stack, so it is uninitialized, in particular > > the "next" field is random. Now in tstat_lookup, the new entry "curr" is > > initialized with the values from "input" (passed as "entry") and "next" > > is set to NULL _after_ it is added to the list, so if a second CPU is > > running the fastpath lookup while we're inserting the new element, it > > might get the garbage value in "next". The patch below fixes that. > > > > Björn > > > > > > > > Initialize the "next" field of a timer stats entry before it is inserted > > into the list to avoid a race with the fastpath lookup. > > > > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> > > --- > > diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c > > index 868f1bc..5bc8f91 100644 > > --- a/kernel/time/timer_stats.c > > +++ b/kernel/time/timer_stats.c > > @@ -202,12 +202,12 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm) > > if (curr) { > > *curr = *entry; > > curr->count = 0; > > + curr->next = NULL; > > memcpy(curr->comm, comm, TASK_COMM_LEN); > > if (prev) > > prev->next = curr; > > else > > *head = curr; > > - curr->next = NULL; > > } > > out_unlock: > > spin_unlock(&table_lock); > > > > Your analysis might be right, not the fix. > > You *cannot* assume curr->next = NULL will be done before insert. > > You probably also need a memory barrier. Ehrm, right. I somehow thought of the spinlock being enough to satisfy that constraint, but if that was the case, the whole problem wouldn't exist in the first place. D'oh! Thanks, Björn Initialize the "next" field of a timer stats entry before it is inserted into the list to avoid a race with the fastpath lookup. Thanks to Eric Dumazet for reminding me of the memory barrier. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> --- diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c index 868f1bc..ab0ba6c 100644 --- a/kernel/time/timer_stats.c +++ b/kernel/time/timer_stats.c @@ -202,12 +202,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm) if (curr) { *curr = *entry; curr->count = 0; + curr->next = NULL; memcpy(curr->comm, comm, TASK_COMM_LEN); + + smp_mb(); /* Ensure that curr is initialized before insert */ + if (prev) prev->next = curr; else *head = curr; - curr->next = NULL; } out_unlock: spin_unlock(&table_lock); ^ permalink raw reply related [flat|nested] 25+ messages in thread
[parent not found: <20070531102047.GG19272@pomac.netswarm.net>]
* Re: [BUG] Something goes wrong with timer statistics. [not found] ` <20070531102047.GG19272@pomac.netswarm.net> @ 2007-05-31 14:25 ` Björn Steinbrink 2007-05-31 15:10 ` Eric Dumazet 0 siblings, 1 reply; 25+ messages in thread From: Björn Steinbrink @ 2007-05-31 14:25 UTC (permalink / raw) To: iank Cc: Eric Dumazet, Thomas Gleixner, Michal Piotrowski, Ingo Molnar, Arjan van de Ven, linux-kernel On 2007.05.31 12:20:47 +0200, iank@bredband.net wrote: > On Wed, May 30, 2007 at 03:14:58PM +0200, Björn Steinbrink wrote: > > Initialize the "next" field of a timer stats entry before it is inserted > > into the list to avoid a race with the fastpath lookup. > > Sorry to say... but this does not fix my problem, however, i can't reach > that machine at all atm, but i will post the oops later today. Ok, three times is a charm, right? ;-) The previous patch did fix a race, but there's still one left. The hash table is never cleared, so it can still point into the entries array. As long as we do not race with the insertion, that's not a directly visible problem as all fields are 0, so we get no match and the lookup finishes right away. We probably _can_ get weird results though, as the all-zero entry isn't marked as used, but is used as "head" and "prev" in this case and even multiple hash entries might point to it in some cases. That would eventually lead to lost entries because the "next" field gets overwritten when alloc_entry() finally claims the entry. But when we race, we can still end up between "*curr = *entry" (makes "next" contain garbage) and "curr->next = NULL", while we're doing the fastpath lookup, because the insertion basically already took place! Replacing "curr->next = NULL" with "entry->next = NULL" and moving it up is pointless, as "*curr = *entry" isn't atomic, so instead, we have to simply clean the hash table when the entries are cleared. Björn Fix two races in the timer stats lookup code. One by ensuring that the initialization of a new entry is finished upon insertion of that entry. The other by cleaning up the hash table when the entries array is cleared, so that we don't have "pre-inserted" entries. Thanks to Eric Dumazet for reminding me of the memory barrier. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> --- diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c index 868f1bc..611b844 100644 --- a/kernel/time/timer_stats.c +++ b/kernel/time/timer_stats.c @@ -117,21 +117,6 @@ static struct entry entries[MAX_ENTRIES]; static atomic_t overflow_count; -static void reset_entries(void) -{ - nr_entries = 0; - memset(entries, 0, sizeof(entries)); - atomic_set(&overflow_count, 0); -} - -static struct entry *alloc_entry(void) -{ - if (nr_entries >= MAX_ENTRIES) - return NULL; - - return entries + nr_entries++; -} - /* * The entries are in a hash-table, for fast lookup: */ @@ -149,6 +134,22 @@ static struct entry *alloc_entry(void) static struct entry *tstat_hash_table[TSTAT_HASH_SIZE] __read_mostly; +static void reset_entries(void) +{ + nr_entries = 0; + memset(entries, 0, sizeof(entries)); + memset(tstat_hash_table, 0, sizeof(tstat_hash_table)); + atomic_set(&overflow_count, 0); +} + +static struct entry *alloc_entry(void) +{ + if (nr_entries >= MAX_ENTRIES) + return NULL; + + return entries + nr_entries++; +} + static int match_entries(struct entry *entry1, struct entry *entry2) { return entry1->timer == entry2->timer && @@ -202,12 +203,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm) if (curr) { *curr = *entry; curr->count = 0; + curr->next = NULL; memcpy(curr->comm, comm, TASK_COMM_LEN); + + smp_mb(); /* Ensure that curr is initialized before insert */ + if (prev) prev->next = curr; else *head = curr; - curr->next = NULL; } out_unlock: spin_unlock(&table_lock); ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-31 14:25 ` Björn Steinbrink @ 2007-05-31 15:10 ` Eric Dumazet 2007-05-31 15:27 ` Björn Steinbrink 0 siblings, 1 reply; 25+ messages in thread From: Eric Dumazet @ 2007-05-31 15:10 UTC (permalink / raw) To: Björn Steinbrink Cc: iank, Thomas Gleixner, Michal Piotrowski, Ingo Molnar, Arjan van de Ven, linux-kernel On Thu, 31 May 2007 16:25:22 +0200 Björn Steinbrink <B.Steinbrink@gmx.de> wrote: > On 2007.05.31 12:20:47 +0200, iank@bredband.net wrote: > > On Wed, May 30, 2007 at 03:14:58PM +0200, Björn Steinbrink wrote: > > > Initialize the "next" field of a timer stats entry before it is inserted > > > into the list to avoid a race with the fastpath lookup. > > > > Sorry to say... but this does not fix my problem, however, i can't reach > > that machine at all atm, but i will post the oops later today. > > Ok, three times is a charm, right? ;-) The previous patch did fix a > race, but there's still one left. The hash table is never cleared, so it > can still point into the entries array. > > As long as we do not race with the insertion, that's not a directly > visible problem as all fields are 0, so we get no match and the lookup > finishes right away. We probably _can_ get weird results though, as the > all-zero entry isn't marked as used, but is used as "head" and "prev" in > this case and even multiple hash entries might point to it in some > cases. That would eventually lead to lost entries because the "next" > field gets overwritten when alloc_entry() finally claims the entry. > > But when we race, we can still end up between "*curr = *entry" (makes > "next" contain garbage) and "curr->next = NULL", while we're doing the > fastpath lookup, because the insertion basically already took place! > Replacing "curr->next = NULL" with "entry->next = NULL" and moving it up > is pointless, as "*curr = *entry" isn't atomic, so instead, we have to > simply clean the hash table when the entries are cleared. > > Björn > > > Fix two races in the timer stats lookup code. One by ensuring that the > initialization of a new entry is finished upon insertion of that entry. > The other by cleaning up the hash table when the entries array is > cleared, so that we don't have "pre-inserted" entries. > > Thanks to Eric Dumazet for reminding me of the memory barrier. > > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> > --- > diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c > index 868f1bc..611b844 100644 > --- a/kernel/time/timer_stats.c > +++ b/kernel/time/timer_stats.c > @@ -117,21 +117,6 @@ static struct entry entries[MAX_ENTRIES]; > > static atomic_t overflow_count; > > -static void reset_entries(void) > -{ > - nr_entries = 0; > - memset(entries, 0, sizeof(entries)); > - atomic_set(&overflow_count, 0); > -} > - > -static struct entry *alloc_entry(void) > -{ > - if (nr_entries >= MAX_ENTRIES) > - return NULL; > - > - return entries + nr_entries++; > -} > - > /* > * The entries are in a hash-table, for fast lookup: > */ > @@ -149,6 +134,22 @@ static struct entry *alloc_entry(void) > > static struct entry *tstat_hash_table[TSTAT_HASH_SIZE] __read_mostly; > > +static void reset_entries(void) > +{ > + nr_entries = 0; > + memset(entries, 0, sizeof(entries)); > + memset(tstat_hash_table, 0, sizeof(tstat_hash_table)); > + atomic_set(&overflow_count, 0); > +} > + > +static struct entry *alloc_entry(void) > +{ > + if (nr_entries >= MAX_ENTRIES) > + return NULL; > + > + return entries + nr_entries++; > +} > + > static int match_entries(struct entry *entry1, struct entry *entry2) > { > return entry1->timer == entry2->timer && > @@ -202,12 +203,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm) > if (curr) { > *curr = *entry; > curr->count = 0; > + curr->next = NULL; > memcpy(curr->comm, comm, TASK_COMM_LEN); > + > + smp_mb(); /* Ensure that curr is initialized before insert */ > + > if (prev) > prev->next = curr; > else > *head = curr; > - curr->next = NULL; > } > out_unlock: > spin_unlock(&table_lock); > - Well... :) , there is still a memory barrier missing it seems. Another cpu might see a bad value if 'active=1' is set before tstat_hash_table is really cleared. diff -urp linux/kernel/time/timer_stats.c.orig linux/kernel/time/timer_stats.c --- kernel/time/timer_stats.c.orig +++ kernel/time/timer_stats.c @@ -361,6 +361,7 @@ static ssize_t tstats_write(struct file if (!active) { reset_entries(); time_start = ktime_get(); + smp_mb(); active = 1; } break; ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-31 15:10 ` Eric Dumazet @ 2007-05-31 15:27 ` Björn Steinbrink 2007-05-31 22:59 ` Ian Kumlien 0 siblings, 1 reply; 25+ messages in thread From: Björn Steinbrink @ 2007-05-31 15:27 UTC (permalink / raw) To: Eric Dumazet Cc: iank, Thomas Gleixner, Michal Piotrowski, Ingo Molnar, Arjan van de Ven, linux-kernel On 2007.05.31 17:10:07 +0200, Eric Dumazet wrote: > Well... :) , there is still a memory barrier missing it seems. > > Another cpu might see a bad value if 'active=1' is set before tstat_hash_table is really cleared. Hm, that even makes the assumption in my first try valid ;-) Just for the record, this time I thought that the barrier from the spinlock in timer_stats_update_stats (right before the check for active) would be enough, but that's obviously running on the wrong cpu if we race... *sigh* Thanks, Björn Fix two races in the timer stats lookup code. One by ensuring that the initialization of a new entry is finished upon insertion of that entry. The other by cleaning up the hash table when the entries array is cleared, so that we don't have "pre-inserted" entries. Thanks to Eric Dumazet for reminding me of the memory barriers. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> --- diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c index 868f1bc..63d00f9 100644 --- a/kernel/time/timer_stats.c +++ b/kernel/time/timer_stats.c @@ -117,21 +117,6 @@ static struct entry entries[MAX_ENTRIES]; static atomic_t overflow_count; -static void reset_entries(void) -{ - nr_entries = 0; - memset(entries, 0, sizeof(entries)); - atomic_set(&overflow_count, 0); -} - -static struct entry *alloc_entry(void) -{ - if (nr_entries >= MAX_ENTRIES) - return NULL; - - return entries + nr_entries++; -} - /* * The entries are in a hash-table, for fast lookup: */ @@ -149,6 +134,22 @@ static struct entry *alloc_entry(void) static struct entry *tstat_hash_table[TSTAT_HASH_SIZE] __read_mostly; +static void reset_entries(void) +{ + nr_entries = 0; + memset(entries, 0, sizeof(entries)); + memset(tstat_hash_table, 0, sizeof(tstat_hash_table)); + atomic_set(&overflow_count, 0); +} + +static struct entry *alloc_entry(void) +{ + if (nr_entries >= MAX_ENTRIES) + return NULL; + + return entries + nr_entries++; +} + static int match_entries(struct entry *entry1, struct entry *entry2) { return entry1->timer == entry2->timer && @@ -202,12 +203,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm) if (curr) { *curr = *entry; curr->count = 0; + curr->next = NULL; memcpy(curr->comm, comm, TASK_COMM_LEN); + + smp_mb(); /* Ensure that curr is initialized before insert */ + if (prev) prev->next = curr; else *head = curr; - curr->next = NULL; } out_unlock: spin_unlock(&table_lock); @@ -360,6 +364,7 @@ static ssize_t tstats_write(struct file *file, const char __user *buf, if (!active) { reset_entries(); time_start = ktime_get(); + smb_mb(); active = 1; } break; ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-31 15:27 ` Björn Steinbrink @ 2007-05-31 22:59 ` Ian Kumlien 2007-05-31 23:09 ` Björn Steinbrink 0 siblings, 1 reply; 25+ messages in thread From: Ian Kumlien @ 2007-05-31 22:59 UTC (permalink / raw) To: Björn Steinbrink Cc: Eric Dumazet, Thomas Gleixner, Michal Piotrowski, Ingo Molnar, Arjan van de Ven, linux-kernel On tor, 2007-05-31 at 17:27 +0200, Björn Steinbrink wrote: > On 2007.05.31 17:10:07 +0200, Eric Dumazet wrote: > > Well... :) , there is still a memory barrier missing it seems. > > > > Another cpu might see a bad value if 'active=1' is set before tstat_hash_table is really cleared. > > Hm, that even makes the assumption in my first try valid ;-) > Just for the record, this time I thought that the barrier from the > spinlock in timer_stats_update_stats (right before the check for active) > would be enough, but that's obviously running on the wrong cpu if we > race... *sigh* > > Thanks, > Björn > > > > Fix two races in the timer stats lookup code. One by ensuring that the > initialization of a new entry is finished upon insertion of that entry. > The other by cleaning up the hash table when the entries array is > cleared, so that we don't have "pre-inserted" entries. > > Thanks to Eric Dumazet for reminding me of the memory barriers. Fix the comment below and you can add: Signed-off-by: Ian Kumlien <pomac@vapor.com> It's currently been running for the longest period ever, ie, 11 minutes =) I'm gonna leave it running during the night and send a status update when the evil daystar reaches it's peak CET. (i haven't been able to stop since linus mentioned it... damn it... =)) > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> > --- > @@ -360,6 +364,7 @@ static ssize_t tstats_write(struct file *file, const char __user *buf, > if (!active) { > reset_entries(); > time_start = ktime_get(); > + smb_mb(); smb? you mean smp =) > active = 1; > } > break; -- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-31 22:59 ` Ian Kumlien @ 2007-05-31 23:09 ` Björn Steinbrink 2007-06-01 6:21 ` Ingo Molnar 0 siblings, 1 reply; 25+ messages in thread From: Björn Steinbrink @ 2007-05-31 23:09 UTC (permalink / raw) To: Ian Kumlien Cc: Eric Dumazet, Thomas Gleixner, Michal Piotrowski, Ingo Molnar, Arjan van de Ven, linux-kernel On 2007.06.01 00:59:11 +0200, Ian Kumlien wrote: > On tor, 2007-05-31 at 17:27 +0200, Björn Steinbrink wrote: > > On 2007.05.31 17:10:07 +0200, Eric Dumazet wrote: > > > Well... :) , there is still a memory barrier missing it seems. > > > > > > Another cpu might see a bad value if 'active=1' is set before tstat_hash_table is really cleared. > > > > Hm, that even makes the assumption in my first try valid ;-) > > Just for the record, this time I thought that the barrier from the > > spinlock in timer_stats_update_stats (right before the check for active) > > would be enough, but that's obviously running on the wrong cpu if we > > race... *sigh* > > > > Fix the comment below and you can add: > Signed-off-by: Ian Kumlien <pomac@vapor.com> > > It's currently been running for the longest period ever, ie, 11 minutes > =) Finally! :-) > I'm gonna leave it running during the night and send a status update > when the evil daystar reaches it's peak CET. (i haven't been able to > stop since linus mentioned it... damn it... =)) > > > @@ -360,6 +364,7 @@ static ssize_t tstats_write(struct file *file, const char __user *buf, > > if (!active) { > > reset_entries(); > > time_start = ktime_get(); > > + smb_mb(); > > smb? you mean smp =) No idea how I managed to break that after test compiling it... :-/ Thanks for catching it :-) So unless I messed up something again, here's the final patch. Thanks, Björn Fix two races in the timer stats lookup code. One by ensuring that the initialization of a new entry is finished upon insertion of that entry. The other by cleaning up the hash table when the entries array is cleared, so that we don't have any "pre-inserted" entries. Thanks to Eric Dumazet for reminding me of the memory barriers. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Signed-off-by: Ian Kumlien <pomac@vapor.com> --- diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c index 868f1bc..fa3d380 100644 --- a/kernel/time/timer_stats.c +++ b/kernel/time/timer_stats.c @@ -117,21 +117,6 @@ static struct entry entries[MAX_ENTRIES]; static atomic_t overflow_count; -static void reset_entries(void) -{ - nr_entries = 0; - memset(entries, 0, sizeof(entries)); - atomic_set(&overflow_count, 0); -} - -static struct entry *alloc_entry(void) -{ - if (nr_entries >= MAX_ENTRIES) - return NULL; - - return entries + nr_entries++; -} - /* * The entries are in a hash-table, for fast lookup: */ @@ -149,6 +134,22 @@ static struct entry *alloc_entry(void) static struct entry *tstat_hash_table[TSTAT_HASH_SIZE] __read_mostly; +static void reset_entries(void) +{ + nr_entries = 0; + memset(entries, 0, sizeof(entries)); + memset(tstat_hash_table, 0, sizeof(tstat_hash_table)); + atomic_set(&overflow_count, 0); +} + +static struct entry *alloc_entry(void) +{ + if (nr_entries >= MAX_ENTRIES) + return NULL; + + return entries + nr_entries++; +} + static int match_entries(struct entry *entry1, struct entry *entry2) { return entry1->timer == entry2->timer && @@ -202,12 +203,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm) if (curr) { *curr = *entry; curr->count = 0; + curr->next = NULL; memcpy(curr->comm, comm, TASK_COMM_LEN); + + smp_mb(); /* Ensure that curr is initialized before insert */ + if (prev) prev->next = curr; else *head = curr; - curr->next = NULL; } out_unlock: spin_unlock(&table_lock); @@ -360,6 +364,7 @@ static ssize_t tstats_write(struct file *file, const char __user *buf, if (!active) { reset_entries(); time_start = ktime_get(); + smp_mb(); active = 1; } break; ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-31 23:09 ` Björn Steinbrink @ 2007-06-01 6:21 ` Ingo Molnar 0 siblings, 0 replies; 25+ messages in thread From: Ingo Molnar @ 2007-06-01 6:21 UTC (permalink / raw) To: Björn Steinbrink, Ian Kumlien, Eric Dumazet, Thomas Gleixner, Michal Piotrowski, Arjan van de Ven, linux-kernel Cc: Arjan van de Ven * Björn Steinbrink <B.Steinbrink@gmx.de> wrote: > > Fix the comment below and you can add: > > Signed-off-by: Ian Kumlien <pomac@vapor.com> > > > > It's currently been running for the longest period ever, ie, 11 minutes > > =) > > Finally! :-) cool! Great detective work Björn! > Thanks to Eric Dumazet for reminding me of the memory barriers. > > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> > Signed-off-by: Ian Kumlien <pomac@vapor.com> Acked-by: Ingo Molnar <mingo@elte.hu> Ingo ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-29 21:38 [BUG] Something goes wrong with timer statistics Ian Kumlien 2007-05-29 22:37 ` David Miller [not found] ` <6bffcb0e0705291524y2752d646p94b0bf6ca87af68f@mail.gmail.com> @ 2007-05-30 0:13 ` Stephane Casset 2007-05-31 23:53 ` Björn Steinbrink 2 siblings, 1 reply; 25+ messages in thread From: Stephane Casset @ 2007-05-30 0:13 UTC (permalink / raw) To: Ian Kumlien; +Cc: Linux-kernel Le Tue, May 29, 2007 at 11:38:48PM +0200, Ian Kumlien écrivait : > Hi, > > As the daystar sets, i try to play some with my new would be > firewall/server, but since this will be running for quite some time i > have been experimenting with powertop to find out what i can do to limit > it's power usage. > > But, if i run powertop for too long or a few times to many... this > happens: > http://pomac.netswarm.net/pics/kernel_panic.jpg > > If i don't run powertop, it is rock solid... Compiling for hours, > running memtest for hours etc etc... Same here, P4 HT, if I run powertop for 5-10 minutes the system crash :( It happens with 2.6.22-rc{2,3} at least :( I didn't try before. I don't have a oops since the last crash was in X, but next time I will try to catch it. If there is any patch around to fix this I am willing to test. Regards -- Stéphane Casset LOGIDÉE sàrl Se faire plaisir d'apprendre 1a, rue Pasteur Tel : +33 388 23 69 77 casset@logidee.com F-67540 OSTWALD Fax : +33 388 23 69 77 http://logidee.com ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [BUG] Something goes wrong with timer statistics. 2007-05-30 0:13 ` Stephane Casset @ 2007-05-31 23:53 ` Björn Steinbrink 0 siblings, 0 replies; 25+ messages in thread From: Björn Steinbrink @ 2007-05-31 23:53 UTC (permalink / raw) To: Stephane Casset; +Cc: linux-kernel On 2007.05.30 02:13:50 +0200, Stephane Casset wrote: > Le Tue, May 29, 2007 at 11:38:48PM +0200, Ian Kumlien écrivait : > > Hi, > > > > As the daystar sets, i try to play some with my new would be > > firewall/server, but since this will be running for quite some time i > > have been experimenting with powertop to find out what i can do to limit > > it's power usage. > > > > But, if i run powertop for too long or a few times to many... this > > happens: > > http://pomac.netswarm.net/pics/kernel_panic.jpg > > > > If i don't run powertop, it is rock solid... Compiling for hours, > > running memtest for hours etc etc... > > Same here, P4 HT, if I run powertop for 5-10 minutes the system crash :( > It happens with 2.6.22-rc{2,3} at least :( I didn't try before. > > I don't have a oops since the last crash was in X, but next time I will > try to catch it. If there is any patch around to fix this I am willing > to test. There is finally one that seems to work: http://lkml.org/lkml/2007/5/31/394 I'd be glad if you could give it a try. Thanks, Björn ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2007-06-01 6:23 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-29 21:38 [BUG] Something goes wrong with timer statistics Ian Kumlien
2007-05-29 22:37 ` David Miller
2007-05-29 22:42 ` Thomas Gleixner
2007-05-29 22:44 ` David Miller
2007-05-29 22:53 ` Thomas Gleixner
[not found] ` <6bffcb0e0705291541q7e6d6faen5cd6345074c4c92a@mail.gmail.com>
2007-05-29 22:44 ` David Miller
2007-05-29 22:51 ` Ian Kumlien
2007-05-29 23:03 ` David Miller
2007-05-30 0:04 ` Ian Kumlien
[not found] ` <6bffcb0e0705291524y2752d646p94b0bf6ca87af68f@mail.gmail.com>
2007-05-29 22:35 ` Ian Kumlien
2007-05-29 22:38 ` Thomas Gleixner
2007-05-29 22:50 ` Ian Kumlien
2007-05-30 6:52 ` Björn Steinbrink
2007-05-30 7:10 ` Björn Steinbrink
2007-05-30 11:38 ` Björn Steinbrink
2007-05-30 12:44 ` Eric Dumazet
2007-05-30 13:14 ` Björn Steinbrink
[not found] ` <20070531102047.GG19272@pomac.netswarm.net>
2007-05-31 14:25 ` Björn Steinbrink
2007-05-31 15:10 ` Eric Dumazet
2007-05-31 15:27 ` Björn Steinbrink
2007-05-31 22:59 ` Ian Kumlien
2007-05-31 23:09 ` Björn Steinbrink
2007-06-01 6:21 ` Ingo Molnar
2007-05-30 0:13 ` Stephane Casset
2007-05-31 23:53 ` Björn Steinbrink
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox