public inbox for linux-kernel@vger.kernel.org
* Top kernel oopses/warnings this week
@ 2007-12-14 18:46 Arjan van de Ven
  2007-12-14 18:58 ` Dave Jones
                   ` (4 more replies)
  0 siblings, 5 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-14 18:46 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton, Linus Torvalds, protasnb

The http://www.kerneloops.org website collects kernel oops and warning reports from various mailing lists and bugzillas; below is a top 10 
list of the oopses collected in the last 7 days. (Reports prior to 2.6.23 have been omitted in collecting the top 10)

This is the first such report that I'm posting; Please let me know if this is useful or not.

hid_output_report warning
	Warning at drivers/hid/hid-core.c:784 implement()
	16 times last week
	<no specific version information available>
	More Info: http://www.kerneloops.org/search.php?search=implement

softlockup in tick_broadcast_oneshot_control
	3 times last week
	Only seen in 2.6.24-rc4 so far
	More Info: http://www.kerneloops.org/oops.php?number=2409
	
hiddev_ioctl crash
	3 times last week
	Only seen in 2.6.24-rc3 so far
	More Info: http://www.kerneloops.org/oops.php?number=2428
	
shrink_dcache_for_umount_subtree crash
	BUG at fs/dcache.c:595
	2 times last week
	Has been seen as far back as 2.6.18
	More Info: http://www.kerneloops.org/oops.php?number=2365
	More Info: http://www.kerneloops.org/search.php?search=shrink_dcache_for_umount_subtree
	
cpufreq_remove_dev crash
	BUG at drivers/cpufreq/cpufreq.c:1060
	2 times last week
	Has been reported only for 2.6.24-rc4
	More Info: http://www.kerneloops.org/search.php?search=cpufreq_remove_dev
	More Info: http://www.kerneloops.org/oops.php?number=2458
	
journal_dirty_data crash (tainted)
	BUG at fs/jbd/transaction.c:983	
	2 times last week
	Has been reported only in 2.6.23.9
	More Info: http://www.kerneloops.org/search.php?search=journal_dirty_data
	
tcp_fastretrans_alert
	WARNING at net/ipv4/tcp_input.c:2533 tcp_fastretrans_alert()
	2 times last week
	Has been reported in 2.6.24-rc4 and -rc5
	More Info: http://www.kerneloops.org/search.php?search=tcp_fastretrans_alert	
	
tcp_sacktag_one
	WARNING at net/ipv4/tcp_input.c:1280 tcp_sacktag_one()	
	Reported once
	Has only been seen in -rc5 so far
	More Info: http://www.kerneloops.org/search.php?search=tcp_sacktag_one
	
simple_map_write (MTD)
	kernel crash
	Reported once this week on 2.6.24-rc5
	Has been seen as far back as 2.6.17
	More Info: http://www.kerneloops.org/search.php?search=simple_map_write
	
tcp_sacktag_walk	
	WARNING at net/ipv4/tcp_input.c:1280
	Reported once on 2.6.24-rc5
	Has been seen only on 2.6.24-rc5
	More Info: http://www.kerneloops.org/search.php?search=tcp_sacktag_walk

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-14 18:46 Top kernel oopses/warnings this week Arjan van de Ven
@ 2007-12-14 18:58 ` Dave Jones
  2007-12-14 21:57 ` Andrew Morton
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 35+ messages in thread
From: Dave Jones @ 2007-12-14 18:58 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, Andrew Morton, Linus Torvalds, protasnb

On Fri, Dec 14, 2007 at 10:46:36AM -0800, Arjan van de Ven wrote:
 > The http://www.kerneloops.org website collects kernel oops and warning reports from various mailing lists and bugzillas; below is a top 10 
 > list of the oopses collected in the last 7 days. (Reports prior to 2.6.23 have been omitted in collecting the top 10)
 > 
 > This is the first such report that I'm posting; Please let me know if this is useful or not.

I like!   Good work.

 > cpufreq_remove_dev crash
 > 	BUG at drivers/cpufreq/cpufreq.c:1060
 > 	2 times last week
 > 	Has been reported only for 2.6.24-rc4
 > 	More Info: http://www.kerneloops.org/search.php?search=cpufreq_remove_dev
 > 	More Info: http://www.kerneloops.org/oops.php?number=2458

Patch pending.  Already in -mm. Also sitting in Linus' inbox. 

	Dave

-- 
http://www.codemonkey.org.uk

* Re: Top kernel oopses/warnings this week
  2007-12-14 18:46 Top kernel oopses/warnings this week Arjan van de Ven
  2007-12-14 18:58 ` Dave Jones
@ 2007-12-14 21:57 ` Andrew Morton
  2007-12-14 22:25   ` Natalie Protasevich
  2007-12-15  0:38   ` Arjan van de Ven
  2007-12-14 22:12 ` Jon Masters
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 35+ messages in thread
From: Andrew Morton @ 2007-12-14 21:57 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, Linus Torvalds, protasnb

On Fri, 14 Dec 2007 10:46:36 -0800 Arjan van de Ven <arjan@linux.intel.com> wrote:

> The http://www.kerneloops.org website collects kernel oops and warning
> reports from various mailing lists and bugzillas

Well that would have been fun to write.  Does it watch
https://lists.linux-foundation.org/mailman/listinfo/bugme-new ?

* Re: Top kernel oopses/warnings this week
  2007-12-14 18:46 Top kernel oopses/warnings this week Arjan van de Ven
  2007-12-14 18:58 ` Dave Jones
  2007-12-14 21:57 ` Andrew Morton
@ 2007-12-14 22:12 ` Jon Masters
  2007-12-15 15:49 ` Stefan Richter
  2007-12-17 17:23 ` Ingo Molnar
  4 siblings, 0 replies; 35+ messages in thread
From: Jon Masters @ 2007-12-14 22:12 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Linux Kernel Mailing List


On Fri, 2007-12-14 at 10:46 -0800, Arjan van de Ven wrote:

> The http://www.kerneloops.org website collects kernel oops and warning reports from various mailing lists and bugzillas; below is a top 10 
> list of the oopses collected in the last 7 days. (Reports prior to 2.6.23 have been omitted in collecting the top 10)
> 
> This is the first such report that I'm posting; Please let me know if this is useful or not.

FWIW I think this is incredibly useful, Arjan. Hoping we'll get the
kerneloops tools into Fedora soon too.

Jon.



* Re: Top kernel oopses/warnings this week
  2007-12-14 21:57 ` Andrew Morton
@ 2007-12-14 22:25   ` Natalie Protasevich
  2007-12-15  0:38   ` Arjan van de Ven
  1 sibling, 0 replies; 35+ messages in thread
From: Natalie Protasevich @ 2007-12-14 22:25 UTC (permalink / raw)
  To: Arjan van de Ven, Andrew Morton; +Cc: Linux Kernel Mailing List, Linus Torvalds

On Dec 14, 2007 1:57 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Fri, 14 Dec 2007 10:46:36 -0800 Arjan van de Ven <arjan@linux.intel.com> wrote:
>
> > The http://www.kerneloops.org website collects kernel oops and warning
> > reports from various mailing lists and bugzillas
>
> Well that would have been fun to write.  Does it watch
> https://lists.linux-foundation.org/mailman/listinfo/bugme-new ?
>

This looks great! I'd like to install and try this package on
bugzilla... It looks like it can do all kinds of searches.

* Re: Top kernel oopses/warnings this week
  2007-12-14 21:57 ` Andrew Morton
  2007-12-14 22:25   ` Natalie Protasevich
@ 2007-12-15  0:38   ` Arjan van de Ven
  1 sibling, 0 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-15  0:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Linus Torvalds, protasnb

Andrew Morton wrote:
> On Fri, 14 Dec 2007 10:46:36 -0800 Arjan van de Ven <arjan@linux.intel.com> wrote:
> 
>> The http://www.kerneloops.org website collects kernel oops and warning
>> reports from various mailing lists and bugzillas
> 
> Well that would have been fun to write.  Does it watch
> https://lists.linux-foundation.org/mailman/listinfo/bugme-new ?

yes it does; Martin pointed me at that recently....
What doesn't work yet (I now realize) is the link from the oops to the bugzilla URL; I'll be working on that shortly.


* Re: Top kernel oopses/warnings this week
  2007-12-14 18:46 Top kernel oopses/warnings this week Arjan van de Ven
                   ` (2 preceding siblings ...)
  2007-12-14 22:12 ` Jon Masters
@ 2007-12-15 15:49 ` Stefan Richter
  2007-12-15 18:21   ` Arjan van de Ven
  2007-12-17  2:51   ` Dave Jones
  2007-12-17 17:23 ` Ingo Molnar
  4 siblings, 2 replies; 35+ messages in thread
From: Stefan Richter @ 2007-12-15 15:49 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, Andrew Morton, Linus Torvalds, protasnb

Arjan van de Ven wrote:
> The http://www.kerneloops.org website collects kernel oops and warning
> reports from various mailing lists and bugzillas;

A few comments:

Report counts may be too high due to duplicate recognition of the very
same report.¹

Reports against 2.6.X-rcY-mmZ are listed in the same category as reports
against 2.6.X-rcY.  To distinguish -mm reports from vanilla reports, one
has to look into the details of each bug entry.¹

A general weakness is that it is ultimately impossible to know whether a
report was against an unpatched kernel, unless one drills down to the
individual mailinglist threads.

Reports about tainted kernels have arguably less value.  It would be
good to hide such reports until a report of the same oops in an
untainted kernel was found.

¹) example: http://www.kerneloops.org/oops.php?number=2335
-- 
Stefan Richter
-=====-=-=== ==-- -====
http://arcgraph.de/sr/

* Re: Top kernel oopses/warnings this week
  2007-12-15 15:49 ` Stefan Richter
@ 2007-12-15 18:21   ` Arjan van de Ven
  2007-12-15 19:44     ` Stefan Richter
  2007-12-17 18:25     ` Zach Brown
  2007-12-17  2:51   ` Dave Jones
  1 sibling, 2 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-15 18:21 UTC (permalink / raw)
  To: Stefan Richter; +Cc: linux-kernel, Andrew Morton, protasnb

Stefan Richter wrote:
> Arjan van de Ven wrote:
>> The http://www.kerneloops.org website collects kernel oops and warning
>> reports from various mailing lists and bugzillas;
> 
> A few comments:
> 
> Report counts may be too high due to duplicate recognition of the very
> same report.¹

this is true however it's .. a hard issue. It's really hard to distinguish a duplicate report from
two reports of the same bug.

> 
> Reports against 2.6.X-rcY-mmZ are listed in the same category as reports
> against 2.6.X-rcY.  To distinguish -mm reports from vanilla reports, one
> has to look into the details of each bug entry.¹

finding what exact kernel version an oops is from is... surprisingly hard.
And to be honest, bugs against -mm are still very interesting, since they'll be
the next mainline after all

> 
> A general weakness is that it is ultimately impossible to know whether a
> report was against an unpatched kernel, unless one drills down to the
> individual mailinglist threads.

for the same reason patched kernels are relevant. And if someone has a super weirdo kernel,
well, as long as we get enough bug data it'll be way down in the noise.


> Reports about tainted kernels have arguably less value.  It would be
> good to hide such reports until a report of the same oops in an
> untainted kernel was found.
That's half of what is done right now; they're not hidden though, just very clearly marked.

* Re: Top kernel oopses/warnings this week
  2007-12-15 18:21   ` Arjan van de Ven
@ 2007-12-15 19:44     ` Stefan Richter
  2007-12-17 18:25     ` Zach Brown
  1 sibling, 0 replies; 35+ messages in thread
From: Stefan Richter @ 2007-12-15 19:44 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, Andrew Morton, protasnb

Arjan van de Ven wrote:
> Stefan Richter wrote:
>> Report counts may be too high due to duplicate recognition of the very
>> same report.
> 
> this is true however it's .. a hard issue. It's really hard to
> distinguish a duplicate report from two reports of the same bug.

Would be nice though to try to find duplicates like the example I gave.
(The actual report and a reply was listed.  The reply just had a full
quote of the oops, with "> " prepended and perhaps lines wrapped.)
Because if an oops is independently reported twice or more, this too
says something about the issue.  E.g. flaky RAM and such is pretty much
eliminated as a possible cause.
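
To illustrate the kind of duplicate described above (a reply that quotes
the oops with "> " prepended), here is a minimal userspace sketch in C.
It is only an assumption about how a collector could normalize lines
before comparing them, not how kerneloops.org actually works; the names
strip_quote_prefix() and same_oops_line() are made up for this example,
and re-joining wrapped lines is left out.

```c
#include <string.h>

/*
 * Hypothetical helper: return a pointer past any mail quote prefix
 * such as "> ", ">> " or " > ". Lines that carry no '>' prefix are
 * returned unchanged, including their indentation.
 */
static const char *strip_quote_prefix(const char *line)
{
	const char *p = line;
	const char *after = line;	/* position just past the last '>' */

	for (;;) {
		while (*p == ' ' || *p == '\t')
			p++;
		if (*p != '>')
			break;
		p++;
		after = p;
	}
	if (after == line)
		return line;		/* not a quoted line */
	if (*after == ' ')
		after++;		/* drop the single space after '>' */
	return after;
}

/* Two lines count as the same oops line if they match once unquoted. */
static int same_oops_line(const char *a, const char *b)
{
	return strcmp(strip_quote_prefix(a), strip_quote_prefix(b)) == 0;
}
```

A report and its quoted copy then compare equal line by line, while two
independent reports still differ wherever their register or stack
contents differ.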

Anyway, someone who is actually interested in a particular oops and
looks at the posts in your links quickly notices any duplicates.
But it would be helpful to people who only have a quick glance at the
bar graphs if you add a note of caution that the figures are not
accurate and not representative, e.g. because of occasional duplicates.

For the same reason, please don't write headings like "Oops statistics
for kernel 2.6.23-release".  Unless you mean "statistics" in a narrower
sense like they do statistics in medicine and economics. ;-)
Simply write "Oops reports for kernel...".

>> Reports against 2.6.X-rcY-mmZ are listed in the same category as reports
>> against 2.6.X-rcY.  To distinguish -mm reports from vanilla reports, one
>> has to look into the details of each bug entry.¹
> 
> finding what exact kernel version an oops is from is... surprisingly hard.
> And to be honest, bugs against -mm are still very interesting, since
> they'll be the next mainline after all

Yes, they definitely are interesting.  And it's the same as with the
above issue:  People who are genuinely interested in an oops find the
necessary information at the details page.  Separating them from
mainline oopses would be a service though for people who want to
  - have a quick look at what's urgent and what's not so urgent,
  - draw conclusions about the state of the release candidates.
So this is not that important.
-- 
Stefan Richter
-=====-=-=== ==-- -====
http://arcgraph.de/sr/

* Re: Top kernel oopses/warnings this week
  2007-12-15 15:49 ` Stefan Richter
  2007-12-15 18:21   ` Arjan van de Ven
@ 2007-12-17  2:51   ` Dave Jones
  2007-12-17 12:33     ` Jon Masters
  1 sibling, 1 reply; 35+ messages in thread
From: Dave Jones @ 2007-12-17  2:51 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Arjan van de Ven, linux-kernel, Andrew Morton, Linus Torvalds,
	protasnb

On Sat, Dec 15, 2007 at 04:49:05PM +0100, Stefan Richter wrote:
 
 > Reports about tainted kernels have arguably less value.  It would be
 > good to hide such reports until a report of the same oops in an
 > untainted kernel was found.
 
I disagree with this.  It's useful to have a "we've seen this before,
and every time, it was tainted with xyz module" datapoint, especially
if no untainted copies of that oops turn up.

	Dave

-- 
http://www.codemonkey.org.uk

* Re: Top kernel oopses/warnings this week
  2007-12-17  2:51   ` Dave Jones
@ 2007-12-17 12:33     ` Jon Masters
  2007-12-17 13:13       ` Stefan Richter
  0 siblings, 1 reply; 35+ messages in thread
From: Jon Masters @ 2007-12-17 12:33 UTC (permalink / raw)
  To: Dave Jones
  Cc: Stefan Richter, Arjan van de Ven, linux-kernel, Andrew Morton,
	Linus Torvalds, protasnb


On Sun, 2007-12-16 at 21:51 -0500, Dave Jones wrote:
> On Sat, Dec 15, 2007 at 04:49:05PM +0100, Stefan Richter wrote:
>  
>  > Reports about tainted kernels have arguably less value.  It would be
>  > good to hide such reports until a report of the same oops in an
>  > untainted kernel was found.
>  
> I disagree with this.  It's useful to have a "we've seen this before,
> and every time, it was tainted with xyz module" datapoint, especially
> if no untainted copies of that oops turn up.

+1

In fact, that's even more useful in many cases, if it helps demonstrate
that the oops is associated with a particular buggy binary driver. I can
see a lot of potentially interesting statistics coming from that too.

Jon.



* Re: Top kernel oopses/warnings this week
  2007-12-17 12:33     ` Jon Masters
@ 2007-12-17 13:13       ` Stefan Richter
  2007-12-17 16:40         ` Arjan van de Ven
  0 siblings, 1 reply; 35+ messages in thread
From: Stefan Richter @ 2007-12-17 13:13 UTC (permalink / raw)
  To: Jon Masters
  Cc: Dave Jones, Arjan van de Ven, linux-kernel, Andrew Morton,
	protasnb

Jon Masters wrote:
> On Sun, 2007-12-16 at 21:51 -0500, Dave Jones wrote:
>> On Sat, Dec 15, 2007 at 04:49:05PM +0100, Stefan Richter wrote:
>>  
>>  > Reports about tainted kernels have arguably less value.  It would be
>>  > good to hide such reports until a report of the same oops in an
>>  > untainted kernel was found.
>>  
>> I disagree with this.  It's useful to have a "we've seen this before,
>> and every time, it was tainted with xyz module" datapoint, especially
>> if no untainted copies of that oops turn up.
> 
> +1
> 
> In fact, that's even more useful in many cases, if it helps demonstrate
> that the oops is associated with a particular buggy binary driver. I can
> see a lot of potentially interesting statistics coming from that too.

-1  :-)

I don't care at all what this xyz module does or does not do by and in
itself.

(Of course since at least two people care and since this makes life
easier for Arjan, just keep listing reports about tainted kernels like
you do now.  It just so happens that different people are interested in
different things.)
-- 
Stefan Richter
-=====-=-=== ==-- =---=
http://arcgraph.de/sr/

* Re: Top kernel oopses/warnings this week
  2007-12-17 13:13       ` Stefan Richter
@ 2007-12-17 16:40         ` Arjan van de Ven
  0 siblings, 0 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-17 16:40 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Jon Masters, Dave Jones, linux-kernel, Andrew Morton, protasnb

Stefan Richter wrote:
> Jon Masters wrote:
>> On Sun, 2007-12-16 at 21:51 -0500, Dave Jones wrote:
>>> On Sat, Dec 15, 2007 at 04:49:05PM +0100, Stefan Richter wrote:
>>>  
>>>  > Reports about tainted kernels have arguably less value.  It would be
>>>  > good to hide such reports until a report of the same oops in an
>>>  > untainted kernel was found.
>>>  
>>> I disagree with this.  It's useful to have a "we've seen this before,
>>> and every time, it was tainted with xyz module" datapoint, especially
>>> if no untainted copies of that oops turn up.
>> +1
>>
>> In fact, that's even more useful in many cases, if it helps demonstrate
>> that the oops is associated with a particular buggy binary driver. I can
>> see a lot of potentially interesting statistics coming from that too.
> 
> -1  :-)
> 
> I don't care at all what this xyz module does or does not do by and in
> itself.
> 

the thing is this: The goal of kerneloops.org is to allow developers to focus their effort on the really
important cases. Part of that is knowing which cases to dismiss/not spend time on because of their
relation with one or more binary drivers.... so imo keeping track of this and showing the "don't bother"
flag with it is very much worthwhile; it allows us developers to know what to ignore.


* Re: Top kernel oopses/warnings this week
  2007-12-14 18:46 Top kernel oopses/warnings this week Arjan van de Ven
                   ` (3 preceding siblings ...)
  2007-12-15 15:49 ` Stefan Richter
@ 2007-12-17 17:23 ` Ingo Molnar
  2007-12-17 21:36   ` Arjan van de Ven
  4 siblings, 1 reply; 35+ messages in thread
From: Ingo Molnar @ 2007-12-17 17:23 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, Andrew Morton, Linus Torvalds, protasnb


* Arjan van de Ven <arjan@linux.intel.com> wrote:

> The http://www.kerneloops.org website collects kernel oops and warning 
> reports from various mailing lists and bugzillas; below is a top 10 
> list of the oopses collected in the last 7 days. (Reports prior to 
> 2.6.23 have been omitted in collecting the top 10)

cool stuff! I cannot over-emphasise how useful this will be.

Let us know if you need any additional WARN_ON()s or other dmesg 
annotations to make parsing easier / more intelligent. At least as far 
as arch/x86 and the scheduler is related it's going to be applied to the 
fast-track queue ;-)

	Ingo

* Re: Top kernel oopses/warnings this week
  2007-12-15 18:21   ` Arjan van de Ven
  2007-12-15 19:44     ` Stefan Richter
@ 2007-12-17 18:25     ` Zach Brown
  2007-12-17 18:41       ` Arjan van de Ven
  1 sibling, 1 reply; 35+ messages in thread
From: Zach Brown @ 2007-12-17 18:25 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Stefan Richter, linux-kernel, Andrew Morton, protasnb


>> Report counts may be too high due to duplicate recognition of the very
>> same report.¹
> 
> this is true however it's .. a hard issue. It's really hard to
> distinguish a duplicate report from
> two reports of the same bug.

Can we hack some data in to oops output to help?  Say a giant per-boot
anonymous random number (yeah, I know, harder than it sounds) and then
an incrementing oops counter.  That'd also let you discover that the
latter oopses in a chain of oopses might be fall-out from the head of
the chain.
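
A rough sketch of how a collector might use such a tag, in userspace C.
Everything here is an assumption for illustration: the struct layout,
the names oops_tag, same_instance() and maybe_fallout(), and the choice
that the counter starts at 1 for the first oops after boot. Nothing like
this is printed in the oops output at this point in the thread.

```c
#include <string.h>

/* Hypothetical tag parsed from an oops: per-boot ID plus oops counter. */
struct oops_tag {
	char boot_id[37];	/* 36-character UUID string plus NUL */
	unsigned int seq;	/* assumed: 1 for the first oops after boot */
};

/* Same boot and same counter: the very same oops, reported twice. */
static int same_instance(const struct oops_tag *a, const struct oops_tag *b)
{
	return a->seq == b->seq && strcmp(a->boot_id, b->boot_id) == 0;
}

/* Any oops after the first in a boot may be fall-out from the first. */
static int maybe_fallout(const struct oops_tag *t)
{
	return t->seq > 1;
}
```

Two reports with identical backtraces but different boot IDs would then
count as independent sightings of the same bug rather than duplicates.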

- z

* Re: Top kernel oopses/warnings this week
  2007-12-17 18:25     ` Zach Brown
@ 2007-12-17 18:41       ` Arjan van de Ven
  0 siblings, 0 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-17 18:41 UTC (permalink / raw)
  To: Zach Brown; +Cc: Stefan Richter, linux-kernel, Andrew Morton, protasnb

Zach Brown wrote:
>>> Report counts may be too high due to duplicate recognition of the very
>>> same report.¹
>> this is true however it's .. a hard issue. It's really hard to
>> distinguish a duplicate report from
>> two reports of the same bug.
> 
> Can we hack some data in to oops output to help?  Say a giant per-boot
> anonymous random number (yeah, I know, harder than it sounds) and then
> an incrementing oops counter.

there already is a per-boot UUID afaik, just a matter of printing that..
I'll look into that, but it does add extra info to the oops print

>  That'd also let you discover that the
> latter oopses in a chain of oopses might be fall-out from the head of
> the chain.

this is there already and taken care of ;)

* Re: Top kernel oopses/warnings this week
  2007-12-17 17:23 ` Ingo Molnar
@ 2007-12-17 21:36   ` Arjan van de Ven
  2007-12-17 21:58     ` Theodore Tso
                       ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-17 21:36 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andrew Morton, Linus Torvalds, protasnb, tytso

On Mon, 17 Dec 2007 18:23:31 +0100
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Arjan van de Ven <arjan@linux.intel.com> wrote:
> 
> > The http://www.kerneloops.org website collects kernel oops and
> > warning reports from various mailing lists and bugzillas; below is
> > a top 10 list of the oopses collected in the last 7 days. (Reports
> > prior to 2.6.23 have been omitted in collecting the top 10)
> 
> cool stuff! I cannot over-emphasise how useful this will be.
> 
> Let us know if you need any additional WARN_ON()s or other dmesg 
> annotations to make parsing easier / more intelligent. At least as
> far as arch/x86 and the scheduler is related it's going to be applied
> to the fast-track queue ;-)
>

the following patch would help a lot; it adds a very nice parsable end-marker
to oopses, as well as printing the boot UUID as part of the oops, which 
makes it easier to de-dupe oopses. The UUID is just a random number and not
privacy-traceable to any system.
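
For the parsing side, a minimal userspace sketch in C of how a tool
could pick the UUID out of the end-marker line. The marker text is the
one printed by the patch below ("---[ end of trace <uuid> ]---"); the
function name extract_trace_id() and its interface are assumptions made
for this example, not part of any existing tool.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical parser: if 'line' contains the end-of-oops marker, copy
 * the ID between the marker head and tail into 'out' and return 1.
 * Returns 0 if there is no marker, the ID is empty, or it does not fit.
 */
static int extract_trace_id(const char *line, char *out, size_t outlen)
{
	static const char head[] = "---[ end of trace ";
	static const char tail[] = " ]---";
	const char *start, *end;
	size_t n;

	start = strstr(line, head);
	if (!start)
		return 0;
	start += sizeof(head) - 1;	/* skip past the marker head */

	end = strstr(start, tail);
	if (!end)
		return 0;

	n = (size_t)(end - start);
	if (n == 0 || n >= outlen)
		return 0;

	memcpy(out, start, n);
	out[n] = '\0';
	return 1;
}
```

Searching with strstr() rather than matching at the start of the line
lets the same code cope with dmesg timestamps or mail quote prefixes in
front of the marker.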

--

Subject: [patch] terminate the oops printing with a defined string/uuid
From: Arjan van de Ven <arjan@linux.intel.com>

Right now, it's hard for automated tools to determine when an oops has
ended; there's no clear marker for this. In addition, there's no good
way to find out if an oops is unique. Sometimes it's the same oops
just reported multiple times, while other times it's a different
instance of the crash with the same signature. Printing the boot UUID
as part of the end string resolves this ambiguity.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
CC: Ted Ts'o <tytso@thunk.org>

---
 drivers/char/random.c  |   34 +++++++++++++++++++++++++++++++++-
 include/linux/random.h |    1 +
 kernel/panic.c         |    2 ++
 3 files changed, 36 insertions(+), 1 deletion(-)

Index: linux-2.6.24-rc5/drivers/char/random.c
===================================================================
--- linux-2.6.24-rc5.orig/drivers/char/random.c
+++ linux-2.6.24-rc5/drivers/char/random.c
@@ -1176,8 +1176,40 @@ static int max_read_thresh = INPUT_POOL_
 static int max_write_thresh = INPUT_POOL_WORDS * 32;
 static char sysctl_bootid[16];
 
+/**
+ *	get_boot_uuid	- return a string pointer to a system wide boot UUID
+ *
+ *	Returns a pointer to the boot UUID. This UUID is unique per system
+ *	boot but persistent for one boot session.
+ *
+ *	The memory returned via the return pointer is statically allocated and
+ *	owned by the random.c driver; this should not be kfree()'d.
+ *
+ *	Locking: none
+ */
+char *get_boot_uuid(void)
+{
+	static char target[80];
+	unsigned char *uuid;
+
+	if (sysctl_bootid[8] == 0)
+		generate_random_uuid(sysctl_bootid);
+	/* sysctl_bootid is signed, to print we need unsigned .. */
+	uuid = (unsigned char *)sysctl_bootid;
+
+	if (target[0] == 0) {
+		sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
+			"%02x%02x%02x%02x%02x%02x",
+		uuid[0],  uuid[1],  uuid[2],  uuid[3],  uuid[4],
+		uuid[5],  uuid[6],  uuid[7],  uuid[8],  uuid[9],
+		uuid[10], uuid[11], uuid[12], uuid[13], uuid[14],
+		uuid[15]);
+	}
+	return target;
+}
+
 /*
- * These functions is used to return both the bootid UUID, and random
+ * These functions are used to return both the bootid UUID, and random
  * UUID.  The difference is in whether table->data is NULL; if it is,
  * then a new UUID is generated and returned to the user.
  *
Index: linux-2.6.24-rc5/include/linux/random.h
===================================================================
--- linux-2.6.24-rc5.orig/include/linux/random.h
+++ linux-2.6.24-rc5/include/linux/random.h
@@ -71,6 +71,7 @@ unsigned long randomize_range(unsigned l
 
 u32 random32(void);
 void srandom32(u32 seed);
+char *get_boot_uuid(void);
 
 #endif /* __KERNEL___ */
 
Index: linux-2.6.24-rc5/kernel/panic.c
===================================================================
--- linux-2.6.24-rc5.orig/kernel/panic.c
+++ linux-2.6.24-rc5/kernel/panic.c
@@ -19,6 +19,7 @@
 #include <linux/nmi.h>
 #include <linux/kexec.h>
 #include <linux/debug_locks.h>
+#include <linux/random.h>
 
 int panic_on_oops;
 int tainted;
@@ -272,6 +273,7 @@ void oops_enter(void)
 void oops_exit(void)
 {
 	do_oops_enter_exit();
+	printk("---[ end of trace %s ]---\n", get_boot_uuid());
 }
 
 #ifdef CONFIG_CC_STACKPROTECTOR

* Re: Top kernel oopses/warnings this week
  2007-12-17 21:36   ` Arjan van de Ven
@ 2007-12-17 21:58     ` Theodore Tso
  2007-12-17 22:58     ` Tony Luck
  2007-12-18 17:48     ` Matt Mackall
  2 siblings, 0 replies; 35+ messages in thread
From: Theodore Tso @ 2007-12-17 21:58 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, linux-kernel, Andrew Morton, Linus Torvalds,
	protasnb

On Mon, Dec 17, 2007 at 01:36:31PM -0800, Arjan van de Ven wrote:
> Subject: [patch] terminate the oops printing with a defined string/uuid
> From: Arjan van de Ven <arjan@linux.intel.com>
> 
> Right now, it's hard for automated tools to determine when an oops has
> ended; there's no clear marker for this. In addition, there's no good
> way to find out if an oops is unique. Sometimes it's the same oops
> just reported multiple times, while other times it's a different
> instance of the crash with the same signature. Printing the boot UUID
> as part of the end string resolves this ambiguity.
> 
> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
> CC: Ted Ts'o <tytso@thunk.org>

Looks good to me!

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

						- Ted

* Re: Top kernel oopses/warnings this week
  2007-12-17 21:36   ` Arjan van de Ven
  2007-12-17 21:58     ` Theodore Tso
@ 2007-12-17 22:58     ` Tony Luck
  2007-12-17 23:17       ` Arjan van de Ven
  2007-12-18 17:48     ` Matt Mackall
  2 siblings, 1 reply; 35+ messages in thread
From: Tony Luck @ 2007-12-17 22:58 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, linux-kernel, Andrew Morton, Linus Torvalds,
	protasnb, tytso

> +       static char target[80];
 ...
> +               sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> +                       "%02x%02x%02x%02x%02x%02x",

[80] is overkill ... [37] bytes should be enough (unless I went
cross-eyed counting the "%02x" :-)

-Tony

* Re: Top kernel oopses/warnings this week
  2007-12-17 22:58     ` Tony Luck
@ 2007-12-17 23:17       ` Arjan van de Ven
  2007-12-17 23:26         ` Tony Luck
  0 siblings, 1 reply; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-17 23:17 UTC (permalink / raw)
  To: Tony Luck
  Cc: Ingo Molnar, linux-kernel, Andrew Morton, Linus Torvalds,
	protasnb, tytso

Tony Luck wrote:
>> +       static char target[80];
>  ...
>> +               sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
>> +                       "%02x%02x%02x%02x%02x%02x",
> 
> [80] is overkill ... [37] bytes should be enough (unless I went
> cross-eyed counting the "%02x" :-)
> 

%02x doesn't guarantee that it's at most 2, but at LEAST 2...

* Re: Top kernel oopses/warnings this week
  2007-12-17 23:17       ` Arjan van de Ven
@ 2007-12-17 23:26         ` Tony Luck
  2007-12-17 23:47           ` Arjan van de Ven
  0 siblings, 1 reply; 35+ messages in thread
From: Tony Luck @ 2007-12-17 23:26 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, linux-kernel, Andrew Morton, Linus Torvalds,
	protasnb, tytso

On Dec 17, 2007 3:17 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
>
> Tony Luck wrote:
> >> +       static char target[80];
> >  ...
> >> +               sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> >> +                       "%02x%02x%02x%02x%02x%02x",
> >
> > [80] is overkill ... [37] bytes should be enough (unless I went
> > cross-eyed counting the "%02x" :-)
> >
>
> %02x doesn't guarantee that it's at most 2, but at LEAST 2...

How will you fit a number that requires >2 hex digits into an
"unsigned char"?

Alternatively ... if %02x may spew more than 2 characters, can
you be sure that [80] is enough?

-Tony

* Re: Top kernel oopses/warnings this week
  2007-12-17 23:26         ` Tony Luck
@ 2007-12-17 23:47           ` Arjan van de Ven
  2007-12-18  0:21             ` Linus Torvalds
  0 siblings, 1 reply; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-17 23:47 UTC (permalink / raw)
  To: Tony Luck
  Cc: Ingo Molnar, linux-kernel, Andrew Morton, Linus Torvalds,
	protasnb, tytso

On Mon, 17 Dec 2007 15:26:46 -0800
"Tony Luck" <tony.luck@intel.com> wrote:

> On Dec 17, 2007 3:17 PM, Arjan van de Ven <arjan@linux.intel.com>
> wrote:
> >
> > Tony Luck wrote:
> > >> +       static char target[80];
> > >  ...
> > >> +               sprintf(target,
> > >> "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> > >> +                       "%02x%02x%02x%02x%02x%02x",
> > >
> > > [80] is overkill ... [37] bytes should be enough (unless I went
> > > cross-eyed counting the "%02x" :-)
> > >
> >
> > %02x doesn't guarantee that it's at most 2, but at LEAST 2...
> 
> How will you fit a number that requires >2 hex digits into an
> "unsigned char"?

eh eh because at first it was a signed char but I fixed that bug later

updated patch attached; using 38 to have a hard 0 at the end in case sprintf does
something weird and 2 cpus race over oopsing (I don't want to add locking to the oops codepath
if I can avoid it; the worst case with 38 is a truncated UUID string)
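
To illustrate the signed char problem mentioned above: a byte with the high bit set is sign-extended when it is promoted for a varargs call, so %02x prints more than two digits. The following is a userspace sketch only, assuming a 32-bit int; hex_signed() and hex_unsigned() are hypothetical names, not functions from the patch.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration; neither exists in the patch. */

/*
 * Format a byte held in a signed char. The value is sign-extended when
 * it is widened; the cast makes that widening explicit and well defined.
 * Returns the number of characters produced.
 */
static int hex_signed(char *out, size_t len, signed char c)
{
	return snprintf(out, len, "%02x", (unsigned int)c);
}

/* Same, for an unsigned char: the value stays in the range 0..255. */
static int hex_unsigned(char *out, size_t len, unsigned char c)
{
	return snprintf(out, len, "%02x", (unsigned int)c);
}
```

With a 32-bit int, hex_signed() on the value -128 gives "ffffff80" (8 characters), while hex_unsigned() on 0x80 gives "80". That is why the 2-characters-per-byte count only holds once the pointer is unsigned.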


Subject: [patch] terminate the oops printing with a defined string/uuid
From: Arjan van de Ven <arjan@linux.intel.com>

Right now, it's hard for automated tools to determine when an oops has
ended; there's no clear marker for this. In addition, there's no good
way to find out if an oops is unique. Sometimes it's the same oops
just reported multiple times, while other times it's a different
instance of the crash with the same signature. Printing the boot UUID
as part of the end string resolves this ambiguity.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

---
 drivers/char/random.c  |   34 +++++++++++++++++++++++++++++++++-
 include/linux/random.h |    1 +
 kernel/panic.c         |    2 ++
 3 files changed, 36 insertions(+), 1 deletion(-)

Index: linux-2.6.24-rc5/drivers/char/random.c
===================================================================
--- linux-2.6.24-rc5.orig/drivers/char/random.c
+++ linux-2.6.24-rc5/drivers/char/random.c
@@ -1176,8 +1176,40 @@ static int max_read_thresh = INPUT_POOL_
 static int max_write_thresh = INPUT_POOL_WORDS * 32;
 static char sysctl_bootid[16];
 
+/**
+ *	get_boot_uuid	- return a string pointer to a system wide boot UUID
+ *
+ *	Returns a pointer to the boot UUID. This UUID is unique per system
+ *	boot but persistent for one boot session.
+ *
+ *	The memory returned via the return pointer is statically allocated and
+ *	owned by the random.c driver; this should not be kfree()'d.
+ *
+ *	Locking: none
+ */
+char *get_boot_uuid(void)
+{
+	static char target[38];
+	unsigned char *uuid;
+
+	if (sysctl_bootid[8] == 0)
+		generate_random_uuid(sysctl_bootid);
+	/* sysctl_bootid is signed, to print we need unsigned .. */
+	uuid = sysctl_bootid;
+
+	if (target[0] == 0) {
+		sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
+			"%02x%02x%02x%02x%02x%02x",
+		uuid[0],  uuid[1],  uuid[2],  uuid[3],  uuid[4],
+		uuid[5],  uuid[6],  uuid[7],  uuid[8],  uuid[9],
+		uuid[10], uuid[11], uuid[12], uuid[13], uuid[14],
+		uuid[15]);
+	}
+	return target;
+}
+
 /*
- * These functions is used to return both the bootid UUID, and random
+ * These functions are used to return both the bootid UUID, and random
  * UUID.  The difference is in whether table->data is NULL; if it is,
  * then a new UUID is generated and returned to the user.
  *
Index: linux-2.6.24-rc5/include/linux/random.h
===================================================================
--- linux-2.6.24-rc5.orig/include/linux/random.h
+++ linux-2.6.24-rc5/include/linux/random.h
@@ -71,6 +71,7 @@ unsigned long randomize_range(unsigned l
 
 u32 random32(void);
 void srandom32(u32 seed);
+char *get_boot_uuid(void);
 
 #endif /* __KERNEL___ */
 
Index: linux-2.6.24-rc5/kernel/panic.c
===================================================================
--- linux-2.6.24-rc5.orig/kernel/panic.c
+++ linux-2.6.24-rc5/kernel/panic.c
@@ -19,6 +19,7 @@
 #include <linux/nmi.h>
 #include <linux/kexec.h>
 #include <linux/debug_locks.h>
+#include <linux/random.h>
 
 int panic_on_oops;
 int tainted;
@@ -272,6 +273,7 @@ void oops_enter(void)
 void oops_exit(void)
 {
 	do_oops_enter_exit();
+	printk("---[ end of trace %s ]---\n", get_boot_uuid());
 }
 
 #ifdef CONFIG_CC_STACKPROTECTOR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-17 23:47           ` Arjan van de Ven
@ 2007-12-18  0:21             ` Linus Torvalds
  2007-12-18  0:39               ` Arjan van de Ven
                                 ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Linus Torvalds @ 2007-12-18  0:21 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Tony Luck, Ingo Molnar, linux-kernel, Andrew Morton, protasnb,
	tytso



On Mon, 17 Dec 2007, Arjan van de Ven wrote:
>
> +char *get_boot_uuid(void)
> +{
> +	static char target[38];
> +	unsigned char *uuid;
> +
> +	if (sysctl_bootid[8] == 0)
> +		generate_random_uuid(sysctl_bootid);
> +	/* sysctl_bootid is signed, to print we need unsigned .. */
> +	uuid = sysctl_bootid;
> +
> +	if (target[0] == 0) {
> +		sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> +			"%02x%02x%02x%02x%02x%02x",

Why isn't *everything* inside that "if (target[0] == 0" check?

IOW, that function should look something like

	const char *get_boot_uuid(void)
	{
		static char target[38];

		if (!target[0])
			fill_boot_uid(target);
		return target;
	}

which also allows you to clean it up a bit.

I'd _also_ suggest that you'd actually try to avoid that horrid sequence 
of "%02x..", and instead just make sure that sysctl_bootid[] is 4-byte 
aligned, and then you can do

	sprintf(target, "%08x-%04x-%04x-%04x-%04x%08x",
		ntohl(0[(u32 *)uuid]),
		ntohs(2[(u16 *)uuid]),
		ntohs(3[(u16 *)uuid]),
		ntohs(4[(u16 *)uuid]),
		ntohs(5[(u16 *)uuid]),
		ntohl(3[(u32 *)uuid]));

which also gets bonus points for being totally unreadable, and thus 100% 
in the spirit of uuid's.

		Linus
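
The equivalence of the two formats can be checked in userspace. This is a sketch under stated assumptions rather than kernel code: it uses the POSIX ntohl()/ntohs() from <arpa/inet.h>, and it copies the words out with memcpy() instead of casting the pointer, so it does not depend on the buffer being 4-byte aligned. All helper names are hypothetical.

```c
#include <arpa/inet.h>	/* ntohl(), ntohs(); POSIX, userspace only */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers: read a big-endian word without alignment needs. */
static uint32_t be32_at(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return ntohl(v);
}

static uint16_t be16_at(const unsigned char *p)
{
	uint16_t v;

	memcpy(&v, p, sizeof(v));
	return ntohs(v);
}

/* Byte-at-a-time formatting, as in the existing random.c code. */
static void uuid_fmt_bytes(char *out, size_t len, const unsigned char *u)
{
	snprintf(out, len,
		"%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
		"%02x%02x%02x%02x%02x%02x",
		u[0], u[1], u[2],  u[3],  u[4],  u[5],  u[6],  u[7],
		u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
}

/* Word-at-a-time formatting, following the suggestion above. */
static void uuid_fmt_words(char *out, size_t len, const unsigned char *u)
{
	snprintf(out, len, "%08x-%04x-%04x-%04x-%04x%08x",
		(unsigned int)be32_at(u),
		(unsigned int)be16_at(u + 4),
		(unsigned int)be16_at(u + 6),
		(unsigned int)be16_at(u + 8),
		(unsigned int)be16_at(u + 10),
		(unsigned int)be32_at(u + 12));
}
```

Both functions should produce the same 36-character string for any input, since the byte-order conversion makes the words print in memory order.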

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18  0:21             ` Linus Torvalds
@ 2007-12-18  0:39               ` Arjan van de Ven
  2007-12-18  2:31               ` Theodore Tso
  2007-12-18 18:06               ` Arjan van de Ven
  2 siblings, 0 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-18  0:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tony Luck, Ingo Molnar, linux-kernel, Andrew Morton, protasnb,
	tytso

Linus Torvalds wrote:
> 
> On Mon, 17 Dec 2007, Arjan van de Ven wrote:
>> +char *get_boot_uuid(void)
>> +{
>> +	static char target[38];
>> +	unsigned char *uuid;
>> +
>> +	if (sysctl_bootid[8] == 0)
>> +		generate_random_uuid(sysctl_bootid);
>> +	/* sysctl_bootid is signed, to print we need unsigned .. */
>> +	uuid = sysctl_bootid;
>> +
>> +	if (target[0] == 0) {
>> +		sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
>> +			"%02x%02x%02x%02x%02x%02x",
> 
> Why isn't *everything* inside that "if (target[0] == 0" check?

the sysctl_bootid is shared with the /proc exposed bootid, so I need to generate it the same way

> I'd _also_ suggest that you'd actually try to avoid that horrid sequence 
> of "%02x..", and instead just make sure that sysctl_bootid[] is 4-byte 
> aligned, and then you can do
> 
> 	sprintf("%08x-%04x-%04x-%04x-%04x%08x",
> 		ntohl(0[(u32 *)uuid]),
> 		ntohs(2[(u16 *)uuid]),
> 		ntohs(3[(u16 *)uuid]),
> 		ntohs(4[(u16 *)uuid]),
> 		ntohs(5[(u16 *)uuid]),
> 		ntohl(3[(u32 *)uuid]));
> 
> which also gets bonus points for being totally unreadable, and thus 100% 
> in the spirit of uuid's.

again.. this is for compatibility with /proc/sys/kernel/random/boot_id .. the code 10 lines below my patch is identical and does
the %02x stuff... I didn't make that up, I just copied that to get the same output.
I can deviate for cleanup... but I can see some value of being the same format and same data.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18  0:21             ` Linus Torvalds
  2007-12-18  0:39               ` Arjan van de Ven
@ 2007-12-18  2:31               ` Theodore Tso
  2007-12-18  6:58                 ` Arjan van de Ven
  2007-12-18 10:11                 ` Jon Masters
  2007-12-18 18:06               ` Arjan van de Ven
  2 siblings, 2 replies; 35+ messages in thread
From: Theodore Tso @ 2007-12-18  2:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arjan van de Ven, Tony Luck, Ingo Molnar, linux-kernel,
	Andrew Morton, protasnb

On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
> which also gets bonus points for being totally unreadable, and thus 100% 
> in the spirit of uuid's.

Heh.  UUID's don't have to be readable; just universally unique.  Code
on the other hand should be readable.   :-)

If you want something more readable, you could print the MAC address
and boot time.  Of course some crazy people seem to think leaking the
MAC address will somehow be a privacy violation.  And printing a
random UUID is a lot simpler....

						- Ted

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18  2:31               ` Theodore Tso
@ 2007-12-18  6:58                 ` Arjan van de Ven
  2007-12-18 17:53                   ` Matt Mackall
  2007-12-18 18:28                   ` Theodore Tso
  2007-12-18 10:11                 ` Jon Masters
  1 sibling, 2 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-18  6:58 UTC (permalink / raw)
  To: Theodore Tso, Linus Torvalds, Arjan van de Ven, Tony Luck,
	Ingo Molnar, linux-kernel, Andrew Morton, protasnb

Theodore Tso wrote:
> On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
>> which also gets bonus points for being totally unreadable, and thus 100% 
>> in the spirit of uuid's.
> 
> Heh.  UUID's don't have to be readable; just universally unique.  Code
> on the other hand should be readable.   :-)

Linus' suggested... improvement should either be done in all 3 places or none ;)
Since you're the maintainer... what's your suggestion?

> 
> If you want something more readable, you could print the MAC address
> and boot time.  Of course some crazy people seem to think leaking the
> MAC address will somehow be a privacy violation.  And printing a
> random UUID is a lot simpler....

boot UUID is nice in that it's different each boot, so that an oops that happens twice will have a
different UUID even if it's the same machine, while repeat-reports of the same oops will have
the same UUID. So I very much like to use some form of UUID; since the boot UUID has the
same properties I was happy to share this; if it gets too ugly or evil code wise I can always
pick something else ;-)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18  2:31               ` Theodore Tso
  2007-12-18  6:58                 ` Arjan van de Ven
@ 2007-12-18 10:11                 ` Jon Masters
  1 sibling, 0 replies; 35+ messages in thread
From: Jon Masters @ 2007-12-18 10:11 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Linus Torvalds, Arjan van de Ven, Tony Luck, Ingo Molnar,
	linux-kernel, Andrew Morton, protasnb


On Mon, 2007-12-17 at 21:31 -0500, Theodore Tso wrote:
> On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
> > which also gets bonus points for being totally unreadable, and thus 100% 
> > in the spirit of uuid's.
> 
> Heh.  UUID's don't have to be readable; just universally unique.  Code
> on the other hand should be readable.   :-)
> 
> If you want something more readable, you could print the MAC address
> and boot time.  Of course some crazy people seem to think leaking the
> MAC address will somehow be a privacy violation.  And printing a
> random UUID is a lot simpler....

Printing a random UUID is necessary, for now anyway, because you cannot
assume every machine is going to have a MAC address, even if it is
deemed appropriate to print this on oops.

The Network is the Computer!

Jon.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-17 21:36   ` Arjan van de Ven
  2007-12-17 21:58     ` Theodore Tso
  2007-12-17 22:58     ` Tony Luck
@ 2007-12-18 17:48     ` Matt Mackall
  2007-12-18 23:37       ` Arjan van de Ven
  2 siblings, 1 reply; 35+ messages in thread
From: Matt Mackall @ 2007-12-18 17:48 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, linux-kernel, Andrew Morton, Linus Torvalds,
	protasnb, tytso

On Mon, Dec 17, 2007 at 01:36:31PM -0800, Arjan van de Ven wrote:
> On Mon, 17 Dec 2007 18:23:31 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > * Arjan van de Ven <arjan@linux.intel.com> wrote:
> > 
> > > The http://www.kerneloops.org website collects kernel oops and
> > > warning reports from various mailing lists and bugzillas; below is
> > > a top 10 list of the oopses collected in the last 7 days. (Reports
> > > prior to 2.6.23 have been omitted in collecting the top 10)
> > 
> > cool stuff! I cannot over-emphasise how useful this will be.
> > 
> > Let us know if you need any additional WARN_ON()s or other dmesg 
> > annotations to make parsing easier / more intelligent. At least as
> > far as arch/x86 and the scheduler is related it's going to be applied
> > to the fast-track queue ;-)
> >
> 
> the following patch would help a lot; it adds a very nice parsable end-marker
> to oopses, as well as printing the boot UUID as part of the oops, which 
> makes it easier to de-dupe oopses. The UUID is just a random number and not
> privacy-traceable to any system.
> 
> --
> 
> Subject: [patch] terminate the oops printing with a defined string/uuid
> From: Arjan van de Ven <arjan@linux.intel.com>
> 
> Right now, it's hard for automated tools to determine when an oops has
> ended; there's no clear marker for this. In addition, there's no good
> way to find out if an oops is unique. Sometimes it's the same oops
> just reported multiple times, while other times it's a different
> instance of the crash with the same signature. Printing the boot UUID
> as part of the end string resolves this ambiguity.
> 
> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
> CC: Ted Ts'o <tytso@thunk.org>
> 
> ---
>  drivers/char/random.c  |   35 ++++++++++++++++++++++++++++++++++-
>  include/linux/random.h |    1 +
>  kernel/panic.c         |    2 ++
>  3 files changed, 37 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6.24-rc5/drivers/char/random.c
> ===================================================================
> --- linux-2.6.24-rc5.orig/drivers/char/random.c
> +++ linux-2.6.24-rc5/drivers/char/random.c
> @@ -1176,8 +1176,41 @@ static int max_read_thresh = INPUT_POOL_
>  static int max_write_thresh = INPUT_POOL_WORDS * 32;
>  static char sysctl_bootid[16];
>  
> +/**
> + *	get_boot_uuid	- return a string pointer to a system wide boot UUID
> + *
> + *	Returns a pointer to the boot UUID. This UUID is unique per system
> + *	boot but persistent for one boot session.
> + *
> + *	The memory returned via the return pointer is static allocated and
> + *	owned by the random.c driver; this should not be kfree()'d.
> + *
> + *	Locking: none
> + */
> + */
> +char *get_boot_uuid(void)
> +{
> +	static char target[80];
> +	unsigned char *uuid;
> +
> +	if (sysctl_bootid[8] == 0)
> +		generate_random_uuid(sysctl_bootid);
> +	/* sysctl_bootid is signed, to print we need unsigned .. */
> +	uuid = sysctl_bootid;
> +
> +	if (target[0] == 0) {
> +		sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> +			"%02x%02x%02x%02x%02x%02x",
> +		uuid[0],  uuid[1],  uuid[2],  uuid[3],  uuid[4],
> +		uuid[5],  uuid[6],  uuid[7],  uuid[8],  uuid[9],
> +		uuid[10], uuid[11], uuid[12], uuid[13], uuid[14],
> +		uuid[15]);

Blech. Invoking the random pool machinery at oops time is moderately
safe, but not very shiny. Going through all the sprintf ugliness to
format it to an irrelevant UUID standard is not very shiny either. At
least refactor it so it's not duplicating code.

And I'd much rather the static variable lived with its user, as
random.c is already too miscellaneous:

> --- linux-2.6.24-rc5.orig/kernel/panic.c
> +++ linux-2.6.24-rc5/kernel/panic.c
...
> +	printk("---[ end of trace %s ]---\n", get_boot_uuid());

Also, please cc: me on any future patches to random.c.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18  6:58                 ` Arjan van de Ven
@ 2007-12-18 17:53                   ` Matt Mackall
  2007-12-18 18:28                   ` Theodore Tso
  1 sibling, 0 replies; 35+ messages in thread
From: Matt Mackall @ 2007-12-18 17:53 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Theodore Tso, Linus Torvalds, Tony Luck, Ingo Molnar,
	linux-kernel, Andrew Morton, protasnb

On Mon, Dec 17, 2007 at 10:58:54PM -0800, Arjan van de Ven wrote:
> Theodore Tso wrote:
> >On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
> >>which also gets bonus points for being totally unreadable, and thus 100% 
> >>in the spirit of uuid's.
> >
> >Heh.  UUID's don't have to be readable; just universally unique.  Code
> >on the other hand should be readable.   :-)
> 
> Linus' suggested... improvement should either be done in all 3 places or 
> none ;)
> Since you're the maintainer... what's your suggestion?

For the record:

RANDOM NUMBER DRIVER
P:      Matt Mackall
M:      mpm@selenic.com
S:      Maintained

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18  0:21             ` Linus Torvalds
  2007-12-18  0:39               ` Arjan van de Ven
  2007-12-18  2:31               ` Theodore Tso
@ 2007-12-18 18:06               ` Arjan van de Ven
  2007-12-18 18:13                 ` Matt Mackall
  2 siblings, 1 reply; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-18 18:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tony Luck, Ingo Molnar, linux-kernel, Andrew Morton, protasnb,
	tytso, mpm

[-- Attachment #1: Type: text/plain, Size: 1831 bytes --]

Linus Torvalds wrote:
> 
> On Mon, 17 Dec 2007, Arjan van de Ven wrote:
>> +char *get_boot_uuid(void)
>> +{
>> +	static char target[38];
>> +	unsigned char *uuid;
>> +
>> +	if (sysctl_bootid[8] == 0)
>> +		generate_random_uuid(sysctl_bootid);
>> +	/* sysctl_bootid is signed, to print we need unsigned .. */
>> +	uuid = sysctl_bootid;
>> +
>> +	if (target[0] == 0) {
>> +		sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
>> +			"%02x%02x%02x%02x%02x%02x",
> 
> Why isn't *everything* inside that "if (target[0] == 0" check?
> 
> IOW, that function should look something like


ok so this got a lot more involved than I was hoping for;
something like below will help me (and kerneloops.org ;) for the short term,
while I'll see what I can do for random.c in a few dead moments soon, for a 2.6.25
enhancement...


Subject: [patch] terminate the oops printing with a defined string/uuid
From: Arjan van de Ven <arjan@linux.intel.com>

Right now, it's hard for automated tools to determine when an oops has
ended; there's no clear marker for this. For later kernels I would also
like a UUID to be printed here, but for short term I've put all zeros there
since printing a UUID seems to involve cleaning up/rewriting quite a chunk
of random.c and that's more involved -> later patch.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

---
  kernel/panic.c         |    1 +
  1 files changed, 1 insertion(+), 0 deletions(-)

Index: linux-2.6.24-rc5/kernel/panic.c
===================================================================
--- linux-2.6.24-rc5.orig/kernel/panic.c
+++ linux-2.6.24-rc5/kernel/panic.c
@@ -272,6 +273,7 @@ void oops_enter(void)
  void oops_exit(void)
  {
  	do_oops_enter_exit();
+	printk("---[ end of trace 00000000-0000-0000-0000-000000000000 ]---\n");
  }

  #ifdef CONFIG_CC_STACKPROTECTOR


[-- Attachment #2: oopsend.patch --]
[-- Type: text/x-patch, Size: 981 bytes --]

Subject: [patch] terminate the oops printing with a defined string/uuid
From: Arjan van de Ven <arjan@linux.intel.com>

Right now, it's hard for automated tools to determine when an oops has
ended; there's no clear marker for this. For later kernels I would also
like a UUID to be printed here, but for short term I've put all zeros there
since printing a UUID seems to involve cleaning up/rewriting quite a chunk
of random.c and that's more involved -> later patch.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

---
 kernel/panic.c         |    1 +
 1 files changed, 1 insertion(+), 0 deletions(-)

Index: linux-2.6.24-rc5/kernel/panic.c
===================================================================
--- linux-2.6.24-rc5.orig/kernel/panic.c
+++ linux-2.6.24-rc5/kernel/panic.c
@@ -272,6 +273,7 @@ void oops_enter(void)
 void oops_exit(void)
 {
 	do_oops_enter_exit();
+	printk("---[ end of trace 0000-00-00-00-000000 ]---\n");
 }
 
 #ifdef CONFIG_CC_STACKPROTECTOR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18 18:06               ` Arjan van de Ven
@ 2007-12-18 18:13                 ` Matt Mackall
  2007-12-18 18:19                   ` Arjan van de Ven
  0 siblings, 1 reply; 35+ messages in thread
From: Matt Mackall @ 2007-12-18 18:13 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linus Torvalds, Tony Luck, Ingo Molnar, linux-kernel,
	Andrew Morton, protasnb, tytso

On Tue, Dec 18, 2007 at 10:06:14AM -0800, Arjan van de Ven wrote:
> Linus Torvalds wrote:
> >
> >On Mon, 17 Dec 2007, Arjan van de Ven wrote:
> >>+char *get_boot_uuid(void)
> >>+{
> >>+	static char target[38];
> >>+	unsigned char *uuid;
> >>+
> >>+	if (sysctl_bootid[8] == 0)
> >>+		generate_random_uuid(sysctl_bootid);
> >>+	/* sysctl_bootid is signed, to print we need unsigned .. */
> >>+	uuid = sysctl_bootid;
> >>+
> >>+	if (target[0] == 0) {
> >>+		sprintf(target, 
> >>"%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> >>+			"%02x%02x%02x%02x%02x%02x",
> >
> >Why isn't *everything* inside that "if (target[0] == 0" check?
> >
> >IOW, that function should look something like
> 
> 
> ok so this got a lot more involved than I was hoping for;
> something like below will help me (and kerneloops.org ;) for the short term,
> while I'll see what I can do for random.c in a few dead moments soon, for a 
> 2.6.25
> enhancement...

Might as well leave out the null UUID, no sense in claiming to have
one when you don't. It's easy for a parser to cut on "^---["

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18 18:13                 ` Matt Mackall
@ 2007-12-18 18:19                   ` Arjan van de Ven
  0 siblings, 0 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-18 18:19 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Tony Luck, Ingo Molnar, linux-kernel,
	Andrew Morton, protasnb, tytso

Matt Mackall wrote:
> Might as well leave out the null UUID, no sense in claiming to have
> one when you don't. It's easy for a parser to cut on "^---["

one can't cut on that since that's also the start marker.
Yes it's possible to leave it out entirely, and thus have 2 different terminators over time.
No I don't think it's a good idea.
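
To make the parser side concrete: a tool that wants a single, stable terminator can match the full marker and pull out the UUID field, rather than cutting on "---[" alone. The following is a rough userspace sketch, not code from kerneloops.org; parse_oops_end() is a hypothetical name, and the marker wording ("end of trace") is assumed from the patch under discussion in this subthread.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical parser helper. Returns 1 and copies the 36-character
 * UUID into uuid[] if the line contains an end-of-oops marker of the
 * assumed form "---[ end of trace <uuid> ]---"; returns 0 otherwise.
 */
static int parse_oops_end(const char *line, char uuid[37])
{
	static const char prefix[] = "---[ end of trace ";
	static const char suffix[] = " ]---";
	const char *p;

	/* strstr() rather than a prefix match: log lines may carry a
	 * loglevel or timestamp in front of the marker. */
	p = strstr(line, prefix);
	if (!p)
		return 0;
	p += sizeof(prefix) - 1;

	/* Need 36 UUID characters followed by the closing bracket. */
	if (strlen(p) < 36 + sizeof(suffix) - 1)
		return 0;
	if (strncmp(p + 36, suffix, sizeof(suffix) - 1) != 0)
		return 0;

	memcpy(uuid, p, 36);
	uuid[36] = '\0';
	return 1;
}
```

Two reports whose markers carry the same UUID and the same oops signature can then be treated as one report; a line such as "------------[ cut here ]------------" is not matched.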

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18  6:58                 ` Arjan van de Ven
  2007-12-18 17:53                   ` Matt Mackall
@ 2007-12-18 18:28                   ` Theodore Tso
  2007-12-18 18:45                     ` Linus Torvalds
  1 sibling, 1 reply; 35+ messages in thread
From: Theodore Tso @ 2007-12-18 18:28 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Linus Torvalds, Tony Luck, Ingo Molnar, linux-kernel,
	Andrew Morton, protasnb

On Mon, Dec 17, 2007 at 10:58:54PM -0800, Arjan van de Ven wrote:
> Theodore Tso wrote:
>> On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
>>> which also gets bonus points for being totally unreadable, and thus 100% 
>>> in the spirit of uuid's.
>> Heh.  UUID's don't have to be readable; just universally unique.  Code
>> on the other hand should be readable.   :-)
>
> Linus' suggested... improvement should either be done in all 3 places or 
> none ;)
> Since you're the maintainer... what's your suggestion?

Well, Matt took over maintenance of the /dev/random driver, but my
take on it is that code readability is more important than saving a
few bytes of generated code or speed; the code paths are only executed
once, so it's hardly a fast path.

      	      				- Ted

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18 18:28                   ` Theodore Tso
@ 2007-12-18 18:45                     ` Linus Torvalds
  0 siblings, 0 replies; 35+ messages in thread
From: Linus Torvalds @ 2007-12-18 18:45 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Arjan van de Ven, Tony Luck, Ingo Molnar, linux-kernel,
	Andrew Morton, protasnb



On Tue, 18 Dec 2007, Theodore Tso wrote:
> 
> Well, Matt took over maintenance of the /dev/random driver, but my
> take on it is that code readability is more important than saving a
> few bytes of generated code or speed; the code paths are only executed
> once, so it's hardly a fast path.

Quite frankly, I'd argue that while my suggested code wasn't exactly 
readable, it was more so than the horror it tried to replace.

BAD CODE is never readable. At least my suggestion was good code.

			Linus

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: Top kernel oopses/warnings this week
  2007-12-18 17:48     ` Matt Mackall
@ 2007-12-18 23:37       ` Arjan van de Ven
  0 siblings, 0 replies; 35+ messages in thread
From: Arjan van de Ven @ 2007-12-18 23:37 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Ingo Molnar, linux-kernel, Andrew Morton, Linus Torvalds,
	protasnb, tytso

[-- Attachment #1: Type: text/plain, Size: 5513 bytes --]

Matt Mackall wrote:
> 
> Blech. Invoking the random pool machinery at oops time is moderately
> safe, but not very shiny. Going through all the sprintf ugliness to
> format it to an irrelevant UUID standard is not very shiny either. At
> least refactor it so it's not duplicating code.
> 
> And I'd much rather the static variable lived with its user, as
> random.c is already too miscellaneous:

ok so something like this?


From: Arjan van de Ven <arjan@linux.intel.com>
Subject: [patch] Print end-of-oops marker with UUID

Right now, it's nearly impossible for parsers to detect the end-of-oops
condition; for example this is a problem for www.kerneloops.org.
In addition, it's not currently possible to detect whether or not
2 oopses that look alike are actually the same oops reported twice,
or truly 2 unique oopses.

This patch factors out the "sprintf a UUID into a string" code from
random.c into a separate function (using snprintf as suggested by
Randy). So far I left the %02x in place instead of using Linus'
"improvement"; if someone really hates the %02x's he/she can do that
later.

It also reduces the stack footprint of proc_do_uuid(); it
was using 64 bytes for the string where 37 is sufficient.
With these random.c changes, the oops_exit() function can print an
end-of-oops marker that includes the UUID.

Normally, the UUID used for oopses is calculated as late_initcall
(in the hope that at that time there is enough entropy to get a
unique enough UUID); however for early oopses the oops_exit() function
needs to generate the UUID on the fly.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
CC: Matt
CC: Ted
CC: Randy

--- linux-2.6.24-rc5/drivers/char/random.c.org	2007-12-18 11:37:22.000000000 -0800
+++ linux-2.6.24-rc5/drivers/char/random.c	2007-12-18 12:20:48.000000000 -0800
@@ -1176,8 +1175,34 @@ static int max_read_thresh = INPUT_POOL_
  static int max_write_thresh = INPUT_POOL_WORDS * 32;
  static char sysctl_bootid[16];

+
+/**
+ * snprintf_uuid - Convert a 16 byte UUID into string format
+ * @string: buffer to store the UUID into
+ * @len:    size of @string
+ * @uuid:   the UUID to convert
+ *
+ * This function converts a 16 byte binary UUID into canonical
+ * ASCII form. This ASCII form needs 37 bytes of storage space,
+ * allocated and provided by the caller.
+ *
+ * Returns: pointer to @string
+ *
+ * Locking: none
+ */
+const char *snprintf_uuid(char *string, int len, unsigned char *uuid)
+{
+	snprintf(string, len,  "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
+		"%02x%02x-%02x%02x%02x%02x%02x%02x",
+		uuid[0],  uuid[1],  uuid[2],  uuid[3],
+		uuid[4],  uuid[5],  uuid[6],  uuid[7],
+		uuid[8],  uuid[9],  uuid[10], uuid[11],
+		uuid[12], uuid[13], uuid[14], uuid[15]);
+	return string;
+}
+
  /*
- * These functions is used to return both the bootid UUID, and random
+ * These functions are used to return both the bootid UUID, and random
   * UUID.  The difference is in whether table->data is NULL; if it is,
   * then a new UUID is generated and returned to the user.
   *
@@ -1189,7 +1214,7 @@ static int proc_do_uuid(ctl_table *table
  			void __user *buffer, size_t *lenp, loff_t *ppos)
  {
  	ctl_table fake_table;
-	unsigned char buf[64], tmp_uuid[16], *uuid;
+	unsigned char buf[37], tmp_uuid[16], *uuid;

  	uuid = table->data;
  	if (!uuid) {
@@ -1199,12 +1224,7 @@ static int proc_do_uuid(ctl_table *table
  	if (uuid[8] == 0)
  		generate_random_uuid(uuid);

-	sprintf(buf, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
-		"%02x%02x%02x%02x%02x%02x",
-		uuid[0],  uuid[1],  uuid[2],  uuid[3],
-		uuid[4],  uuid[5],  uuid[6],  uuid[7],
-		uuid[8],  uuid[9],  uuid[10], uuid[11],
-		uuid[12], uuid[13], uuid[14], uuid[15]);
+	snprintf_uuid(buf, sizeof(buf), uuid);
  	fake_table.data = buf;
  	fake_table.maxlen = sizeof(buf);

--- linux-2.6.24-rc5/include/linux/random.h.org	2007-12-18 12:22:49.000000000 -0800
+++ linux-2.6.24-rc5/include/linux/random.h	2007-12-18 12:22:57.000000000 -0800
@@ -71,6 +71,7 @@ unsigned long randomize_range(unsigned l

  u32 random32(void);
  void srandom32(u32 seed);
+const char *snprintf_uuid(char *string, int len, unsigned char *uuid);

  #endif /* __KERNEL___ */

--- linux-2.6.24-rc5/kernel/panic.c.org	2007-12-18 12:23:19.000000000 -0800
+++ linux-2.6.24-rc5/kernel/panic.c	2007-12-18 12:35:46.000000000 -0800
@@ -19,6 +19,7 @@
  #include <linux/nmi.h>
  #include <linux/kexec.h>
  #include <linux/debug_locks.h>
+#include <linux/random.h>

  int panic_on_oops;
  int tainted;
@@ -32,6 +33,8 @@ ATOMIC_NOTIFIER_HEAD(panic_notifier_list

  EXPORT_SYMBOL(panic_notifier_list);

+static unsigned char oops_uuid[16];
+
  static int __init panic_setup(char *str)
  {
  	panic_timeout = simple_strtoul(str, NULL, 0);
@@ -265,15 +268,32 @@ void oops_enter(void)
  	do_oops_enter_exit();
  }

+static int prime_oops_uuid(void)
+{
+	if (oops_uuid[8] == 0)
+		generate_random_uuid(oops_uuid);
+	return 0;
+}
+
  /*
   * Called when the architecture exits its oops handler, after printing
   * everything.
   */
  void oops_exit(void)
  {
+	char uuid_string[37];
  	do_oops_enter_exit();
+
+	/*
+	 * normally the oops_uuid is already calculated, but if we oops during
+	 * really early boot, it may not be. In that case, calculate it here.
+	 */
+	prime_oops_uuid();
+	printk("---[ end trace %s ]---\n",
+		snprintf_uuid(uuid_string, sizeof(uuid_string), oops_uuid));
  }

+late_initcall(prime_oops_uuid);
  #ifdef CONFIG_CC_STACKPROTECTOR
  /*
   * Called when gcc's -fstack-protector feature is used, and

[-- Attachment #2: oopsend2.patch --]
[-- Type: text/x-patch, Size: 5063 bytes --]

From: Arjan van de Ven <arjan@linux.intel.com>
Subject: [patch] Print end-of-oops marker with UUID

Right now, it's nearly impossible for parsers to detect the end-of-oops
condition; for example this is a problem for www.kerneloops.org.
In addition, it's not currently possible to detect whether or not
2 oopses that look alike are actually the same oops reported twice,
or truly 2 unique oopses.

This patch factors out the "sprintf a UUID into a string" code from
random.c into a separate function (using snprintf as suggested by
Randy). So far I left the %02x in place instead of using Linus' 
"improvement"; if someone really hates the %02x's he/she can do that
later.

It also reduces the stack footprint of proc_do_uuid(); it
was using 64 bytes for the string where 37 is sufficient.
With these random.c changes, the oops_exit() function can print an
end-of-oops marker that includes the UUID.
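
To make the 37 concrete: the canonical form is 32 hex digits plus 4
dashes, which is 36 characters, plus the terminating NUL. The following
user-space sketch mirrors the format string used by snprintf_uuid() in
this patch; format_uuid() is a hypothetical stand-in name, and the
const qualifier and size_t length are choices made for the sketch, not
the prototype the patch adds.

```c
#include <assert.h>
#include <stdio.h>

#define UUID_STRING_LEN 37	/* 32 hex digits + 4 dashes + NUL */

/*
 * User-space stand-in for the patch's snprintf_uuid(): format a
 * 16-byte binary UUID as 8-4-4-4-12 lowercase hex.  As with any
 * snprintf() call, the result is truncated if @len is too small.
 */
static const char *format_uuid(char *string, size_t len,
			       const unsigned char *uuid)
{
	assert(string && uuid);

	snprintf(string, len, "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
		 "%02x%02x-%02x%02x%02x%02x%02x%02x",
		 uuid[0],  uuid[1],  uuid[2],  uuid[3],
		 uuid[4],  uuid[5],  uuid[6],  uuid[7],
		 uuid[8],  uuid[9],  uuid[10], uuid[11],
		 uuid[12], uuid[13], uuid[14], uuid[15]);
	return string;
}
```

With a buffer of UUID_STRING_LEN bytes the whole string always fits,
which is why both proc_do_uuid() and oops_exit() size their buffers at
37 in this patch.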

Normally, the UUID used for oopses is calculated in a late_initcall
(in the hope that by that time there is enough entropy to get a
unique enough UUID); however for early oopses the oops_exit() function
needs to generate the UUID on the fly.
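
The "generate once, then reuse" logic relies on byte 8 of the UUID
doubling as a "not yet generated" flag. The user-space sketch below
shows why that works, on the assumption that the generator sets the
RFC 4122 variant bits in byte 8 the way generate_random_uuid() does, so
a generated UUID never has a zero there. fake_generate_random_uuid()
and generated_count are hypothetical stand-ins for illustration, and
rand() is of course not a substitute for the kernel's entropy pool.

```c
#include <assert.h>
#include <stdlib.h>

/* zero-initialised, so oops_uuid[8] == 0 means "not generated yet" */
static unsigned char oops_uuid[16];
static int generated_count;	/* illustration only: counts generations */

/* Hypothetical stand-in for the kernel's generate_random_uuid(). */
static void fake_generate_random_uuid(unsigned char uuid[16])
{
	int i;

	assert(uuid);
	for (i = 0; i < 16; i++)
		uuid[i] = (unsigned char)(rand() & 0xff);
	uuid[6] = (uuid[6] & 0x0f) | 0x40;	/* version 4 (random) */
	uuid[8] = (uuid[8] & 0x3f) | 0x80;	/* variant bits: never 0 */
	generated_count++;
}

/* Same shape as prime_oops_uuid() in the patch: safe to call twice. */
static int prime_oops_uuid(void)
{
	if (oops_uuid[8] == 0)
		fake_generate_random_uuid(oops_uuid);
	return 0;
}
```

Calling prime_oops_uuid() from both the late_initcall and oops_exit()
is therefore harmless: whichever call runs first generates the UUID and
every later call leaves it untouched.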

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
CC: Matt
CC: Ted
CC: Randy

--- linux-2.6.24-rc5/drivers/char/random.c.org	2007-12-18 11:37:22.000000000 -0800
+++ linux-2.6.24-rc5/drivers/char/random.c	2007-12-18 12:20:48.000000000 -0800
@@ -1176,8 +1175,34 @@ static int max_read_thresh = INPUT_POOL_
 static int max_write_thresh = INPUT_POOL_WORDS * 32;
 static char sysctl_bootid[16];
 
+
+/**
+ * snprintf_uuid - Convert a 16 byte UUID into string format
+ * @string: buffer to store the UUID into
+ * @len:    size of @string
+ * @uuid:   the UUID to convert
+ *
+ * This function converts a 16 byte binary UUID into canonical
+ * ASCII form. This ASCII form needs 37 bytes of storage space,
+ * allocated and provided by the caller.
+ *
+ * Returns: pointer to @string
+ *
+ * Locking: none
+ */
+const char *snprintf_uuid(char *string, int len, unsigned char *uuid)
+{
+	snprintf(string, len,  "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
+		"%02x%02x-%02x%02x%02x%02x%02x%02x",
+		uuid[0],  uuid[1],  uuid[2],  uuid[3],
+		uuid[4],  uuid[5],  uuid[6],  uuid[7],
+		uuid[8],  uuid[9],  uuid[10], uuid[11],
+		uuid[12], uuid[13], uuid[14], uuid[15]);
+	return string;
+}
+
 /*
- * These functions is used to return both the bootid UUID, and random
+ * These functions are used to return both the bootid UUID, and random
  * UUID.  The difference is in whether table->data is NULL; if it is,
  * then a new UUID is generated and returned to the user.
  *
@@ -1189,7 +1214,7 @@ static int proc_do_uuid(ctl_table *table
 			void __user *buffer, size_t *lenp, loff_t *ppos)
 {
 	ctl_table fake_table;
-	unsigned char buf[64], tmp_uuid[16], *uuid;
+	unsigned char buf[37], tmp_uuid[16], *uuid;
 
 	uuid = table->data;
 	if (!uuid) {
@@ -1199,12 +1224,7 @@ static int proc_do_uuid(ctl_table *table
 	if (uuid[8] == 0)
 		generate_random_uuid(uuid);
 
-	sprintf(buf, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
-		"%02x%02x%02x%02x%02x%02x",
-		uuid[0],  uuid[1],  uuid[2],  uuid[3],
-		uuid[4],  uuid[5],  uuid[6],  uuid[7],
-		uuid[8],  uuid[9],  uuid[10], uuid[11],
-		uuid[12], uuid[13], uuid[14], uuid[15]);
+	snprintf_uuid(buf, sizeof(buf), uuid);
 	fake_table.data = buf;
 	fake_table.maxlen = sizeof(buf);
 
--- linux-2.6.24-rc5/include/linux/random.h.org	2007-12-18 12:22:49.000000000 -0800
+++ linux-2.6.24-rc5/include/linux/random.h	2007-12-18 12:22:57.000000000 -0800
@@ -71,6 +71,7 @@ unsigned long randomize_range(unsigned l
 
 u32 random32(void);
 void srandom32(u32 seed);
+const char *snprintf_uuid(char *string, int len, unsigned char *uuid);
 
 #endif /* __KERNEL___ */
 
--- linux-2.6.24-rc5/kernel/panic.c.org	2007-12-18 12:23:19.000000000 -0800
+++ linux-2.6.24-rc5/kernel/panic.c	2007-12-18 12:35:46.000000000 -0800
@@ -19,6 +19,7 @@
 #include <linux/nmi.h>
 #include <linux/kexec.h>
 #include <linux/debug_locks.h>
+#include <linux/random.h>
 
 int panic_on_oops;
 int tainted;
@@ -32,6 +33,8 @@ ATOMIC_NOTIFIER_HEAD(panic_notifier_list
 
 EXPORT_SYMBOL(panic_notifier_list);
 
+static unsigned char oops_uuid[16];
+
 static int __init panic_setup(char *str)
 {
 	panic_timeout = simple_strtoul(str, NULL, 0);
@@ -265,15 +268,32 @@ void oops_enter(void)
 	do_oops_enter_exit();
 }
 
+static int prime_oops_uuid(void)
+{
+	if (oops_uuid[8] == 0)
+		generate_random_uuid(oops_uuid);
+	return 0;
+}
+
 /*
  * Called when the architecture exits its oops handler, after printing
  * everything.
  */
 void oops_exit(void)
 {
+	char uuid_string[37];
 	do_oops_enter_exit();
+
+	/*
+	 * normally the oops_uuid is already calculated, but if we oops during
+	 * really early boot, it may not be. In that case, calculate it here.
+	 */
+	prime_oops_uuid();
+	printk("---[ end trace %s ]---\n",
+		snprintf_uuid(uuid_string, sizeof(uuid_string), oops_uuid));
 }
 
+late_initcall(prime_oops_uuid);
 #ifdef CONFIG_CC_STACKPROTECTOR
 /*
  * Called when gcc's -fstack-protector feature is used, and

end of thread, other threads:[~2007-12-18 23:42 UTC | newest]

Thread overview: 35+ messages
2007-12-14 18:46 Top kernel oopses/warnings this week Arjan van de Ven
2007-12-14 18:58 ` Dave Jones
2007-12-14 21:57 ` Andrew Morton
2007-12-14 22:25   ` Natalie Protasevich
2007-12-15  0:38   ` Arjan van de Ven
2007-12-14 22:12 ` Jon Masters
2007-12-15 15:49 ` Stefan Richter
2007-12-15 18:21   ` Arjan van de Ven
2007-12-15 19:44     ` Stefan Richter
2007-12-17 18:25     ` Zach Brown
2007-12-17 18:41       ` Arjan van de Ven
2007-12-17  2:51   ` Dave Jones
2007-12-17 12:33     ` Jon Masters
2007-12-17 13:13       ` Stefan Richter
2007-12-17 16:40         ` Arjan van de Ven
2007-12-17 17:23 ` Ingo Molnar
2007-12-17 21:36   ` Arjan van de Ven
2007-12-17 21:58     ` Theodore Tso
2007-12-17 22:58     ` Tony Luck
2007-12-17 23:17       ` Arjan van de Ven
2007-12-17 23:26         ` Tony Luck
2007-12-17 23:47           ` Arjan van de Ven
2007-12-18  0:21             ` Linus Torvalds
2007-12-18  0:39               ` Arjan van de Ven
2007-12-18  2:31               ` Theodore Tso
2007-12-18  6:58                 ` Arjan van de Ven
2007-12-18 17:53                   ` Matt Mackall
2007-12-18 18:28                   ` Theodore Tso
2007-12-18 18:45                     ` Linus Torvalds
2007-12-18 10:11                 ` Jon Masters
2007-12-18 18:06               ` Arjan van de Ven
2007-12-18 18:13                 ` Matt Mackall
2007-12-18 18:19                   ` Arjan van de Ven
2007-12-18 17:48     ` Matt Mackall
2007-12-18 23:37       ` Arjan van de Ven
