* Top 10 bugs/warnings for the week of March 23rd, 2008
@ 2008-05-23 16:19 Arjan van de Ven
2008-05-23 16:23 ` Top 10 bugs/warnings for the week of May " Arjan van de Ven
` (5 more replies)
0 siblings, 6 replies; 29+ messages in thread
From: Arjan van de Ven @ 2008-05-23 16:19 UTC (permalink / raw)
To: Linux Kernel Mailing List; +Cc: Linus Torvalds, Greg KH, Andrew Morton
The http://www.kerneloops.org website collects kernel oops and
warning reports from various mailing lists and bugzillas, as well as
from a client that users can install to auto-submit oopses.
Below is a top 10 list of the oopses collected in the last 7 days.
(Reports prior to 2.6.23 have been omitted in collecting the top 10)
This week, a total of 3192 oopses and warnings have been reported,
compared to 1503 reports in the previous week.
Per file statistics
[I'd love to borrow Linus' gitstat stuff for this to get a nicer presentation of the per file/directory data]
1267 kernel/sysctl.c
283 fs/sysfs/dir.c
266 fs/buffer.c
111 lib/iomap.c
81 kernel/spinlock.c
51 mm/page_alloc.c
49 net/mac80211/main.c
38 kernel/irq/manage.c
38 security/selinux/hooks.c
34 net/core/sock.c
Rank 1: __register_sysctl_paths
Reported 1260 times (2491 total reports)
[tainted] Duplicate /proc registration. Bug in the madwifi driver
(Occasionally seen in the parport driver)
This oops was last seen in version 2.6.25.4, and first seen in 2.6.25-rc3.
More info: http://www.kerneloops.org/searchweek.php?search=__register_sysctl_paths
Rank 2: sysfs_add_one
Reported 264 times (528 total reports)
Duplicate sysfs registration... mostly a GregKH bug ...
This oops was last seen in version 2.6.26-rc3, and first seen in 2.6.24-rc6.
More info: http://www.kerneloops.org/searchweek.php?search=sysfs_add_one
Rank 3: mark_buffer_dirty
Reported 253 times (547 total reports)
EXT3 bug while hot-removing a USB device
This oops was last seen in version 2.6.25.3, and first seen in 2.6.24-rc6.
More info: http://www.kerneloops.org/searchweek.php?search=mark_buffer_dirty
Rank 4: bad_io_access
Reported 111 times (153 total reports)
Bug with pata_isapnp causing another ata driver to go SPLAT
Patch available in http://bugzilla.kernel.org/show_bug.cgi?id=10752
This oops was last seen in version 2.6.25.3, and first seen in 2.6.24.
More info: http://www.kerneloops.org/searchweek.php?search=bad_io_access
Rank 5: _spin_unlock_irqrestore
Reported 80 times (207 total reports)
Seems to happen mostly around the idle loop.. weak voltage regulators?
This oops was last seen in version 2.6.25.3, and first seen in 2.6.22-rc1.
More info: http://www.kerneloops.org/searchweek.php?search=_spin_unlock_irqrestore
Rank 6: ieee80211_stop_tx_ba_session
Reported 49 times (104 total reports)
A bug in the iwl4965 driver
This oops was last seen in version 2.6.25.3, and first seen in 2.6.25-rc7-git6.
More info: http://www.kerneloops.org/searchweek.php?search=ieee80211_stop_tx_ba_session
Rank 7: __alloc_pages
Reported 49 times (79 total reports)
This oops was last seen in version 2.6.25.3, and first seen in 2.6.18-rc1.
Bugs in the sata_nv and velocity drivers.
sata_nv patch available at http://lkml.org/lkml/2008/5/20/604
More info: http://www.kerneloops.org/searchweek.php?search=__alloc_pages
Rank 8: set_irq_wake
Reported 38 times (43 total reports)
[fixed] Bug in serial_core.c where disable_irq_wake/enable_irq_wake were unbalanced
Fix available at http://lkml.org/lkml/2008/5/20/218 (and in -mm)
This oops was last seen in version 2.6.25.3, and first seen in 2.6.25-rc9-git1.
More info: http://www.kerneloops.org/searchweek.php?search=set_irq_wake
Rank 9: task_has_capability
Reported 34 times
[tainted] Bug in the proprietary firegl driver
Oops only shows up in tainted kernels
This oops was last seen in version 2.6.25.3, and first seen in 2.6.25.
More info: http://www.kerneloops.org/searchweek.php?search=task_has_capability
Rank 10: sk_free
Reported 29 times (109 total reports)
[tainted] VMWare driver bug
Oops only shows up in tainted kernels
This oops was last seen in version 2.6.25.4, and first seen in 2.6.23.9.
More info: http://www.kerneloops.org/searchweek.php?search=sk_free
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Top 10 bugs/warnings for the week of May 23rd, 2008
2008-05-23 16:19 Top 10 bugs/warnings for the week of March 23rd, 2008 Arjan van de Ven
@ 2008-05-23 16:23 ` Arjan van de Ven
2008-05-23 16:42 ` Top 10 bugs/warnings for the week of March " Linus Torvalds
` (4 subsequent siblings)
5 siblings, 0 replies; 29+ messages in thread
From: Arjan van de Ven @ 2008-05-23 16:23 UTC (permalink / raw)
To: Linux Kernel Mailing List; +Cc: Linus Torvalds, Greg KH, Andrew Morton
Arjan van de Ven wrote:
eh make that May 23rd
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-23 16:19 Top 10 bugs/warnings for the week of March 23rd, 2008 Arjan van de Ven
2008-05-23 16:23 ` Top 10 bugs/warnings for the week of May " Arjan van de Ven
@ 2008-05-23 16:42 ` Linus Torvalds
2008-05-23 17:35 ` Arjan van de Ven
2008-05-23 19:31 ` Alan Cox
` (3 subsequent siblings)
5 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2008-05-23 16:42 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Linux Kernel Mailing List, Greg KH, Andrew Morton
On Fri, 23 May 2008, Arjan van de Ven wrote:
>
> Per file statistics
> [I'd love to borrow Linus' gitstat stuff for this to get a nicer presentation
> of the per file/directory data]
The algorithm is very simple. Just sort your filenames alphabetically, and
then you can do it with a simple 40-line recursive function and a trivial
data structure. See the git sources, diff.c: gather_dirstat().
Or just do "git show 7df7c019c2a46672c12a11a45600cdc698e03029" in git to
show the commit that introduces --dirstat.
(In fact, much of the dirstat code is the thing that turns it into
percentages, so it has some setup code that first calculates the total
number of changes, and the printout code spends effort in generating the
percentage (well, permille) and not showing insignificant stuff - whether
you'd want/need that for this is debatable)
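A minimal sketch of that approach, applied to the per-file counts from the report. It is written in Python rather than the C of git's diff.c, and it is not a port of gather_dirstat(): the function names, the tuple layout and the output format are assumptions made for illustration. It relies on the same property, namely that once the paths are sorted, everything below one directory is adjacent.

```python
def gather_dirstat(entries, pos, prefix, out):
    """Total the counts of all entries under `prefix`, starting at entries[pos].

    `entries` is a list of (path, count) pairs sorted by path.
    Appends (directory, total) to `out` and returns (total, next_pos).
    """
    total = 0
    while pos < len(entries):
        path, count = entries[pos]
        if not path.startswith(prefix):
            break  # left this directory; the caller takes over
        slash = path.find("/", len(prefix))
        if slash == -1:
            total += count  # a file directly inside this directory
            pos += 1
        else:
            # First entry of a subdirectory: recurse to total its subtree.
            subtotal, pos = gather_dirstat(entries, pos, path[:slash + 1], out)
            total += subtotal
    if prefix:
        out.append((prefix, total))
    return total, pos


def dirstat(counts):
    """Return (directory, reports, permille) for every directory seen."""
    entries = sorted(counts.items())
    out = []
    grand_total, _ = gather_dirstat(entries, 0, "", out)
    # Integer permille of the grand total; no cut-off for small entries here.
    return [(d, n, n * 1000 // grand_total if grand_total else 0)
            for d, n in out]


# Per-file counts taken from the weekly report above.
PER_FILE = {
    "kernel/sysctl.c": 1267, "fs/sysfs/dir.c": 283, "fs/buffer.c": 266,
    "lib/iomap.c": 111, "kernel/spinlock.c": 81, "mm/page_alloc.c": 51,
    "net/mac80211/main.c": 49, "kernel/irq/manage.c": 38,
    "security/selinux/hooks.c": 38, "net/core/sock.c": 34,
}

for directory, reports, permille in dirstat(PER_FILE):
    print(f"{permille / 10:5.1f}% {reports:5d} {directory}")
```

Subdirectories are emitted before their parent because each total is only known once the recursion returns. A threshold for hiding insignificant directories could be added as a filter on the permille value.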
> Rank 1: __register_sysctl_paths
> Reported 1260 times (2491 total reports)
> [tainted] Duplicate /proc registration. Bug in the madwifi driver
> (Occasionally seen in the parport driver)
> This oops was last seen in version 2.6.25.4, and first seen in 2.6.25-rc3.
> More info:
> http://www.kerneloops.org/searchweek.php?search=__register_sysctl_paths
Btw, can you try to call these warnings, not oopses? It's not an oops, and
it's not even reported as an oops in the overviews on the top-level things
on the web-site, so your scripts do know it's not an oops - but then in
this summary and in the "detailed information" reports it's called an oops
again.
It's a WARN_ON, and yeah, while they can be bad, it's still different
from an actual oops.
Linus
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-23 16:42 ` Top 10 bugs/warnings for the week of March " Linus Torvalds
@ 2008-05-23 17:35 ` Arjan van de Ven
0 siblings, 0 replies; 29+ messages in thread
From: Arjan van de Ven @ 2008-05-23 17:35 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Greg KH, Andrew Morton
Linus Torvalds wrote:
>
> On Fri, 23 May 2008, Arjan van de Ven wrote:
>> Per file statistics
>> [I'd love to borrow Linus' gitstat stuff for this to get a nicer presentation
>> of the per file/directory data]
>
> The algorithm is very simple. Just sort your filenames alphabetically, and
> then you can do it with a simple 40-line recursive function and a trivial
> data structure. See the git sources, diff.c: gather_dirstat().
>
> Or just do "git show 7df7c019c2a46672c12a11a45600cdc698e03029" in git to
> show the commit that introduces --dirstat.
>
> (In fact, much of the dirstat code is the thing that turns it into
> percentages, so it has some setup code that first calculates the total
> number of changes, and the printout code spends effort in generating the
> percentage (well, permille) and not showing insignificant stuff - whether
> you'd want/need that for this is debatable)
ok thanks I'll take a look at this; I might have to convert it to php (sigh ;)
>
>> Rank 1: __register_sysctl_paths
>> Reported 1260 times (2491 total reports)
>> [tainted] Duplicate /proc registration. Bug in the madwifi driver
>> (Occasionally seen in the parport driver)
>> This oops was last seen in version 2.6.25.4, and first seen in 2.6.25-rc3.
>> More info:
>> http://www.kerneloops.org/searchweek.php?search=__register_sysctl_paths
>
> Btw, can you try to call these warnings, not oopses? It's not an oops, and
> it's not even reported as an oops in the overviews on the top-level things
> on the web-site, so your scripts do know it's not an oops - but then in
> this summary and in the "detailed information" reports it's called an oops
> again.
>
> It's a WARN_ON, and yeah, while they can be bad, it's still different
> from an actual oops.
>
I'll see how to do this; the complex case of "some are oopses some are warn_ons" I probably can just deal with
by prioritization.
(but I do track the "class" so the info is there in the database)
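One way the "prioritization" could work, as a sketch only: when the reports grouped under one function are a mix of classes, the most severe class present names the group. The class names and their ordering below are assumptions, not the actual values stored in the kerneloops.org database.

```python
from collections import Counter

# Assumed ordering: a higher number wins when a group mixes classes.
SEVERITY = {"warning": 0, "bug": 1, "oops": 2}


def group_label(classes):
    """Pick one label for a group of reports that share a function name."""
    return max(classes, key=SEVERITY.__getitem__)


def describe(function, classes):
    """One summary line: the chosen label plus the per-class breakdown."""
    tally = Counter(classes)
    detail = ", ".join(f"{n} {c}" for c, n in sorted(tally.items()))
    return f"{function}: {group_label(classes)} ({detail})"
```

With this rule a group made up only of WARN_ON reports is called a warning, and the breakdown keeps the mixed case visible instead of hiding it behind a single word.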
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-23 16:19 Top 10 bugs/warnings for the week of March 23rd, 2008 Arjan van de Ven
2008-05-23 16:23 ` Top 10 bugs/warnings for the week of May " Arjan van de Ven
2008-05-23 16:42 ` Top 10 bugs/warnings for the week of March " Linus Torvalds
@ 2008-05-23 19:31 ` Alan Cox
2008-05-24 0:15 ` Chris Wright
` (2 subsequent siblings)
5 siblings, 0 replies; 29+ messages in thread
From: Alan Cox @ 2008-05-23 19:31 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
> Rank 4: bad_io_access
> Reported 111 times (153 total reports)
> Bug with pata_isapnp causing another ata driver to go SPLAT
Bug in libata-sff and one that really proved the value of the oops work
you did. Nobody ever reported it in bugzilla in a way that made it
apparent what was up (I guess as it was an install crash) and probably
99% of users hitting it wouldn't even have thought their soundcard had an
unused, and possibly unwired, IDE port.
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-23 16:19 Top 10 bugs/warnings for the week of March 23rd, 2008 Arjan van de Ven
` (2 preceding siblings ...)
2008-05-23 19:31 ` Alan Cox
@ 2008-05-24 0:15 ` Chris Wright
2008-05-24 5:07 ` Arjan van de Ven
2008-05-24 5:32 ` Greg KH
2008-05-24 22:23 ` Jan Kara
5 siblings, 1 reply; 29+ messages in thread
From: Chris Wright @ 2008-05-24 0:15 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
* Arjan van de Ven (arjan@linux.intel.com) wrote:
> Rank 9: task_has_capability
> Reported 34 times
> [tainted] Bug in the proprietary firegl driver
> Oops only shows up in tainted kernels
> This oops was last seen in version 2.6.25.3, and first seen in 2.6.25.
> More info: http://www.kerneloops.org/searchweek.php?search=task_has_capability
looking at first one: http://www.kerneloops.org/raw.php?rawid=13598&msgid=
OK, aside of the obvious (their problem):
Tainted: P
EIP is at task_has_capability+0x48/0x76
Code: ... <0f> 0b
^^^^^^^
BUG()
This should be listed under the BUG/BUG_ON category as opposed to oops, no?
Also, I think the raw data is missing some bit. Where is the:
kernel BUG at...
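The check done by hand above can be sketched as a small classifier. The "kernel BUG at" banner and the bracketed <0f> 0b bytes (the x86 ud2 instruction) come from this report; the "WARNING: at" marker and the fallback to "oops" are assumptions about what the raw reports contain, and the function name is hypothetical.

```python
import re

# The byte at the faulting EIP is bracketed in the "Code:" line; 0f 0b is
# the x86 ud2 instruction, which is what Chris reads as BUG() above.
UD2_AT_EIP = re.compile(r"^Code:.*<0f> 0b", re.MULTILINE)


def classify(raw):
    """Guess the class of one raw report from the markers it contains."""
    if "kernel BUG at" in raw or UD2_AT_EIP.search(raw):
        return "bug"      # also catches reports that lost the banner line
    if "WARNING: at" in raw:
        return "warning"  # assumed banner of a WARN_ON dump
    return "oops"
```

Checking the opcode as well as the banner means a report like the one linked above would still land in the BUG category even though its "kernel BUG at" line is missing.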
At any rate, they have a bug in their proprietary module (news at 11).
So, I don't think this should make the top ten. Do you have a way to
sort tainted vs non-tainted, and only produce the top ten for untainted?
And one last question re: the stats. Is there a way to tell if the 41
times this was reported are from 41 distinct users? Is there any unique
cookie you receive with the raw oops report that can help filter out
duplicates (by duplicate I mean a user w/ this proprietary driver and
rebooting is likely to reproduce the same info on each boot)? You don't
want to drop dups, but at least let that inform the stats or something.
For the record, that bug triggers:
printk(KERN_ERR "SELinux: out of range capability %d\n", cap);
BUG();
meaning they are passing in a capability that's > 63 (2.6.25 introduced
64 bit caps).
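As a rough illustration of that range check, assuming it splits the capability number into 32-bit words the way the kernel's CAP_TO_INDEX() macro does (a right shift by 5). The Python helpers are hypothetical stand-ins, not the SELinux code itself.

```python
def cap_to_index(cap):
    """Which 32-bit word a capability number falls into (assumed: cap >> 5)."""
    return cap >> 5


def capability_in_range(cap):
    """True for the 64 capabilities that fit in two 32-bit words (0..63)."""
    return cap_to_index(cap) in (0, 1)
```

Anything outside 0..63, including a negative value or something the size of an address, fails the check; in the kernel that is the path that ends in the printk() and BUG() quoted above.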
BTW, EAX: 00000030 (48)...that suggests their capability they passed in
was quite large, likely an address or smth.
"<3>SELinux: out of range capability \n" <-- 38 chars
that leaves 10 for %d, which is > 999,999,999 ;-)
thanks,
-chris
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-24 0:15 ` Chris Wright
@ 2008-05-24 5:07 ` Arjan van de Ven
2008-05-26 9:36 ` Ingo Molnar
0 siblings, 1 reply; 29+ messages in thread
From: Arjan van de Ven @ 2008-05-24 5:07 UTC (permalink / raw)
To: Chris Wright
Cc: Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
Chris Wright wrote:
> * Arjan van de Ven (arjan@linux.intel.com) wrote:
>> Rank 9: task_has_capability
>> Reported 34 times
>> [tainted] Bug in the proprietary firegl driver
^^^^^^^^^
>> Oops only shows up in tainted kernels
^^^^^^^^
>> This oops was last seen in version 2.6.25.3, and first seen in 2.6.25.
>> More info: http://www.kerneloops.org/searchweek.php?search=task_has_capability
>
> looking at first one: http://www.kerneloops.org/raw.php?rawid=13598&msgid=
>
> OK, aside of the obvious (their problem):
>
> Tainted: P
> EIP is at task_has_capability+0x48/0x76
> Code: ... <0f> 0b
> ^^^^^^^
> BUG()
>
> This should be listed under the BUG/BUG_ON category as opposed to oops, no?
yeah it should; Linus pointed that out and I've since fixed my report generator script
> Also, I think the raw data is missing some bit. Where is the:
>
> kernel BUG at...
>
hmm it ought to be there.
> At any rate, they have a bug in their proprietary module (news at 11).
>
> So, I don't think this should make the top ten. Do you have a way to
> sort tainted vs non-tainted, and only produce the top ten for untainted?
yes absolutely; this is a question I'll have for the customers of the data...
do people want to see "only-tainted" in these top 10s? Right now I mark them as such
but leave them in. It's trivial for me to just leave them out instead (the info is there,
just a matter of not counting)
>
> And one last question re: the stats. Is there a way to tell if the 41
> times this was reported are from 41 distinct users? Is there any unique
> cookie you receive with the raw oops report that can help filter out
> duplicates (by duplicate I mean a user w/ this proprietary driver and
> rebooting is likely to reproduce the same info on each boot)? You don't
> want to drop dups, but at least let that inform the stats or something.
There is no unique per-system ID yet; I'm working with the SMOLT guys to get this added potentially.
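If such an ID were added, the statistics could report distinct systems next to the raw count without dropping duplicates. A sketch under that assumption; the "system_id" field is hypothetical and does not exist in the current reports.

```python
def count_reports(reports):
    """Return (total reports, distinct known systems, reports without an ID)."""
    known = {r["system_id"] for r in reports if r.get("system_id")}
    anonymous = sum(1 for r in reports if not r.get("system_id"))
    return len(reports), len(known), anonymous
```

Reports without an ID are counted separately rather than guessed at, so older submissions would not distort the distinct-system figure.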
>
> For the record, that bug triggers:
>
> printk(KERN_ERR "SELinux: out of range capability %d\n", cap);
> BUG();
>
> meaning they are passing in a capability that's > 63 (2.6.25 introduced
> 64 bit caps).
>
> BTW, EAX: 00000030 (48)...that suggests their capability they passed in
> was quite large, likely an address or smth.
yeah iirc the AMD graphics driver gives the user process full root caps for some time...
Annoying things these non-root linux users ;)
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-23 16:19 Top 10 bugs/warnings for the week of March 23rd, 2008 Arjan van de Ven
` (3 preceding siblings ...)
2008-05-24 0:15 ` Chris Wright
@ 2008-05-24 5:32 ` Greg KH
2008-05-24 22:23 ` Jan Kara
5 siblings, 0 replies; 29+ messages in thread
From: Greg KH @ 2008-05-24 5:32 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Linux Kernel Mailing List, Linus Torvalds, Andrew Morton
On Fri, May 23, 2008 at 09:19:24AM -0700, Arjan van de Ven wrote:
> Rank 2: sysfs_add_one
> Reported 264 times (528 total reports)
> Duplicate sysfs registration... mostly a GregKH bug ...
> This oops was last seen in version 2.6.26-rc3, and first seen in
> 2.6.24-rc6.
> More info: http://www.kerneloops.org/searchweek.php?search=sysfs_add_one
Note, only the older 2.6.24 and possibly .25 versions of this are my
"fault", as those come from the usb core. That's a bug that I thought
we fixed, and I can not duplicate it myself at all.
Other instances of this are real problems in other areas of the kernel
(like the pci hotplug drivers), and are not my fault :)
If anyone knows of a way to consistently reproduce this issue with a USB
device, please contact me and I'll work to resolve this.
Arjan, thanks a lot for these reports, I find them very useful.
thanks,
greg k-h
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-23 16:19 Top 10 bugs/warnings for the week of March 23rd, 2008 Arjan van de Ven
` (4 preceding siblings ...)
2008-05-24 5:32 ` Greg KH
@ 2008-05-24 22:23 ` Jan Kara
2008-05-24 22:30 ` Arjan van de Ven
5 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2008-05-24 22:23 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
Hello,
> Rank 3: mark_buffer_dirty
> Reported 253 times (547 total reports)
> EXT3 bug while hot-removing a USB device
> This oops was last seen in version 2.6.25.3, and first seen in
> 2.6.24-rc6.
> More info:
> http://www.kerneloops.org/searchweek.php?search=mark_buffer_dirty
Is someone looking into this? It could be somehow connected with commit
1be62dc190ebaca331038962c873e7967de6cc4b where we add smp_mb() to
mark_buffer_dirty() under some circumstances. But I don't really see
how. The WARN_ON() being triggered is !buffer_uptodate(bh) but that
seems ridiculous for call paths like ext2_sync_super() ->
mark_buffer_dirty() or journal_destroy() -> journal_update_superblock() ->
mark_buffer_dirty() which are in oopses. Also this warning started
appearing only recently while ext2 and JBD didn't change those areas
recently. Also interesting may be that both ext2_sync_super() and
journal_update_superblock() call sync_dirty_buffer() just after calling
mark_buffer_dirty()...
Arjan, do we have some more info for these oopses (like hw config,
what was the machine doing while the WARN_ON has been triggered etc.)?
Thanks.
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-24 22:23 ` Jan Kara
@ 2008-05-24 22:30 ` Arjan van de Ven
2008-05-24 22:45 ` Theodore Tso
0 siblings, 1 reply; 29+ messages in thread
From: Arjan van de Ven @ 2008-05-24 22:30 UTC (permalink / raw)
To: Jan Kara
Cc: Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton,
tytso
On Sun, 25 May 2008 00:23:04 +0200
Jan Kara <jack@suse.cz> wrote:
> Hello,
>
> > Rank 3: mark_buffer_dirty
> > Reported 253 times (547 total reports)
> > EXT3 bug while hot-removing a USB device
> > This oops was last seen in version 2.6.25.3, and first seen
> > in 2.6.24-rc6.
> > More info:
> > http://www.kerneloops.org/searchweek.php?search=mark_buffer_dirty
> Is someone looking into this? It could be somehow connected with
> commit 1be62dc190ebaca331038962c873e7967de6cc4b where we add smp_mb()
> to mark_buffer_dirty() under some circumstances. But I don't really
> see how. The WARN_ON() being triggered is !buffer_uptodate(bh) but
> that seems ridiculous for call paths like ext2_sync_super() ->
> mark_buffer_dirty() or journal_destroy() ->
> journal_update_superblock() -> mark_buffer_dirty() which are in
> oopses. Also this warning started appearing only recently while ext2
> and JBD didn't change those areas recently. Also interesting may be
> that both ext2_sync_super() and journal_update_superblock() call
> sync_dirty_buffer() just after calling mark_buffer_dirty()...
> Arjan, do we have some more info for these oopses (like hw config,
> what was the machine doing while the WARN_ON has been triggered etc.)?
> Thanks.
>
Ted looked at these during the LF summit, and his conclusion was that
they're all media errors (eg USB unplug) that ext3 then did not handle
well at all. Maybe Ted has an update on this?
the 2.6.24-rc is a red herring btw.. that's just the first kernel
version that actually printed its version as part of WARN_ON().
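To illustrate why the "first seen" value is only a lower bound: a sketch of how first and last seen versions might be derived, assuming each report carries either a version string or nothing at all. The regular expression covers only the version shapes that appear in this report (x.y.z, x.y.z.s, -rcN, -gitN); the function names are hypothetical.

```python
import re

VERSION = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?(?:-git(\d+))?$")
FINAL = 10**6  # sorts a final release after all of its -rc kernels


def version_key(version):
    """Sort key that places 2.6.25-rc3 < 2.6.25 < 2.6.25.4 < 2.6.26-rc3."""
    m = VERSION.match(version)
    if m is None:
        raise ValueError(f"unrecognised kernel version: {version!r}")
    major, minor, patch, stable, rc, git = m.groups()
    return (int(major), int(minor), int(patch),
            int(rc) if rc else FINAL,
            int(stable or 0), int(git or 0))


def seen_range(versions):
    """Return (first seen, last seen), ignoring reports with no version."""
    known = [v for v in versions if v is not None]
    if not known:
        return None
    return min(known, key=version_key), max(known, key=version_key)
```

Reports from kernels that did not print a version contribute nothing to the minimum, so the earliest version shown is simply the earliest one that identified itself.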
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-24 22:30 ` Arjan van de Ven
@ 2008-05-24 22:45 ` Theodore Tso
2008-05-25 11:58 ` Stefan Richter
2008-05-26 9:39 ` Ingo Molnar
0 siblings, 2 replies; 29+ messages in thread
From: Theodore Tso @ 2008-05-24 22:45 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Jan Kara, Linux Kernel Mailing List, Linus Torvalds, Greg KH,
Andrew Morton
On Sat, May 24, 2008 at 03:30:20PM -0700, Arjan van de Ven wrote:
>
> Ted looked at these during the LF summit, and his conclusion was that
> they're all media errors (eg USB unplug) that ext3 then did not handle
> well at all. Maybe Ted has an update on this?
>
Not really. It's on my todo list but fixing a bug caused by users
doing something stupid (pulling a mounted USB stick) has been lower
than a number of other fires burning on my plate. I'll try to get to
it but a lot of other things I need to worry about have deadlines
associated with them....
- Ted
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-24 22:45 ` Theodore Tso
@ 2008-05-25 11:58 ` Stefan Richter
2008-05-26 9:39 ` Ingo Molnar
1 sibling, 0 replies; 29+ messages in thread
From: Stefan Richter @ 2008-05-25 11:58 UTC (permalink / raw)
To: Theodore Tso, Arjan van de Ven, Jan Kara,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
Theodore Tso wrote:
> On Sat, May 24, 2008 at 03:30:20PM -0700, Arjan van de Ven wrote:
>> Ted looked at these during the LF summit, and his conclusion was that
>> they're all media errors (eg USB unplug) that ext3 then did not handle
>> well at all. Maybe Ted has an update on this?
>>
>
> Not really. It's on my todo list but fixing a bug caused by users
> doing something stupid (pulling a mounted USB stick) has been lower
> than a number of other fires burning on my plate. I'll try to get to
> it but a lot of other things I need to worry about have deadlines
> associated with them....
There are more reasons for connection loss than "user doing something
stupid". Firmware flaws for example. Or SBP-2 re-login failing on a
crowded FireWire bus.
Hot-removal capability is a fundamental requirement for a filesystem,
just like for the block drivers and transport drivers. (E.g. don't
corrupt the kernel in case of unrecoverable IO failures.)
BTW, hfs+ is also buggy WRT connection loss.
--
Stefan Richter
-=====-==--- -=-= ==--=
http://arcgraph.de/sr/
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-24 5:07 ` Arjan van de Ven
@ 2008-05-26 9:36 ` Ingo Molnar
0 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2008-05-26 9:36 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Chris Wright, Linux Kernel Mailing List, Linus Torvalds, Greg KH,
Andrew Morton
* Arjan van de Ven <arjan@linux.intel.com> wrote:
>> At any rate, they have a bug in their proprietary module (news at
>> 11).
>>
>> So, I don't think this should make the top ten. Do you have a way to
>> sort tainted vs non-tainted, and only produce the top ten for
>> untainted?
>
> yes absolutely; this is a question I'll have for the customers of the
> data... do people want to see "only-tainted" in these top 10s? Right
> now I mark them as such but leave them in. It's trivial for me to just
> leave them out instead (the info is there, just a matter of not
> counting)
i think they should be included, but perhaps abbreviated [into 1-2
lines] so that they do not hold us up.
We must not whitewash our bug statistics by intentionally excluding the
harm that bin-only modules do to our users - but we can make them less
visually intrusive, so that we can work on fixing the bugs we can fix.
Perhaps make sure it's top 10 of _our_ bugs, with the bin-only data
mixed in as well (in a visually unintrusive way) to make the picture
complete. I.e. list 20 bugs if 10 of them are bin-only modules.
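That layout could be generated along these lines. This is a sketch, not the actual report script: the dictionary keys and the exact line format are assumptions. Tainted entries stay in rank order but shrink to one line and do not count toward the ten.

```python
def build_report(ranked, our_bugs=10):
    """Format a ranked list so that `our_bugs` untainted entries are shown.

    `ranked` is a list of dicts with the (assumed) keys "function",
    "reports", "summary" and "tainted", sorted by report count.
    """
    lines = []
    listed = 0
    for bug in ranked:
        if listed == our_bugs:
            break
        if bug["tainted"]:
            # Abbreviated to a single line and not counted toward the total.
            lines.append(
                f"[tainted] {bug['function']}: reported {bug['reports']} times")
        else:
            listed += 1
            lines.append(f"Rank {listed}: {bug['function']}")
            lines.append(f"    Reported {bug['reports']} times")
            lines.append(f"    {bug['summary']}")
    return lines
```

If ten of the entries come from bin-only modules, the report grows to twenty entries, which matches the example in the message above.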
Ingo
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-24 22:45 ` Theodore Tso
2008-05-25 11:58 ` Stefan Richter
@ 2008-05-26 9:39 ` Ingo Molnar
2008-05-26 10:16 ` Theodore Tso
1 sibling, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2008-05-26 9:39 UTC (permalink / raw)
To: Theodore Tso, Arjan van de Ven, Jan Kara,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
* Theodore Tso <tytso@MIT.EDU> wrote:
> On Sat, May 24, 2008 at 03:30:20PM -0700, Arjan van de Ven wrote:
> >
> > Ted looked at these during the LF summit, and his conclusion was
> > that they're all media errors (eg USB unplug) that ext3 then did not
> > handle well at all. Maybe Ted has an update on this?
>
> Not really. It's on my todo list but fixing a bug caused by users
> doing something stupid (pulling a mounted USB stick) has been lower
> than a number of other fires burning on my plate. I'll try to get to
> it but a lot of other things I need to worry about have deadlines
> associated with them....
Exactly why is pulling a USB stick considered "stupid"? Last i checked
there was no physical lock preventing users from doing that.
Sure, pulling a mounted USB stick is inconvenient ... for _us_ kernel
developers. But the user really doesn't care and shouldn't care.
Ingo
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 9:39 ` Ingo Molnar
@ 2008-05-26 10:16 ` Theodore Tso
2008-05-26 10:48 ` Ingo Molnar
2008-05-26 14:52 ` Stefan Richter
0 siblings, 2 replies; 29+ messages in thread
From: Theodore Tso @ 2008-05-26 10:16 UTC (permalink / raw)
To: Ingo Molnar
Cc: Arjan van de Ven, Jan Kara, Linux Kernel Mailing List,
Linus Torvalds, Greg KH, Andrew Morton
On Mon, May 26, 2008 at 11:39:13AM +0200, Ingo Molnar wrote:
> Exactly why is pulling a USB stick considered "stupid"? Last i checked
> there was no physical lock preventing users from doing that.
>
> Sure, pulling a mounted USB stick is inconvenient ... for _us_ kernel
> developers. But the user really doesn't care and shouldn't care.
Because they could lose data? Because if the kernel wakes up and
tries writing to the USB stick right as they pull it out, it could
physically damage the flash format? I know, stupid reason... :-)
I know, I know, it's an issue. If someone wants to look at it, great.
It is on my todo list, but I've been doing a lot of travelling, and
when you have only one laptop, and pulling USB sticks isn't the sort
of thing you can do using UML or KVM, it tends to fall to the bottom
of your list.
- Ted
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 10:16 ` Theodore Tso
@ 2008-05-26 10:48 ` Ingo Molnar
2008-05-26 16:20 ` Jan Kara
2008-05-27 11:40 ` Pavel Machek
2008-05-26 14:52 ` Stefan Richter
1 sibling, 2 replies; 29+ messages in thread
From: Ingo Molnar @ 2008-05-26 10:48 UTC (permalink / raw)
To: Theodore Tso, Arjan van de Ven, Jan Kara,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
* Theodore Tso <tytso@MIT.EDU> wrote:
> On Mon, May 26, 2008 at 11:39:13AM +0200, Ingo Molnar wrote:
> > Exactly why is pulling a USB stick considered "stupid"? Last i checked
> > there was no physical lock preventing users from doing that.
> >
> > Sure, pulling a mounted USB stick is inconvenient ... for _us_
> > kernel developers. But the user really doesn't care and shouldn't
> > care.
>
> Because they could lose data? Because if the kernel wakes up and
> tries writing to the USB stick right as they pull it out, it could
> physically damage the flash format? I know, stupid reason... :-)
user can lose data in many other ways, that's not the issue - the issue
here is something very crucial: the kernel gets confused about a _very_
common user-triggerable condition.
That confusion must not happen in a modern OS and the kernel should be
resilient and cope with such external events. And we must not
deprioritize it with an incorrect "user did something stupid" tag...
That argument might have been valid 15 years ago when floppies could be
locked and you needed a needle to force-eject it but it is rather lame
today when unplugging a USB stick is as easy as moving the mouse.
If there's something stupid here it's the kernel not dealing with that
condition properly. Yes, the "user action" here looks "trivial" to the
user but what happens below is indeed very hard technically, but who
said that writing an OS from scratch would be an easy task? ;-)
Ingo
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 10:16 ` Theodore Tso
2008-05-26 10:48 ` Ingo Molnar
@ 2008-05-26 14:52 ` Stefan Richter
1 sibling, 0 replies; 29+ messages in thread
From: Stefan Richter @ 2008-05-26 14:52 UTC (permalink / raw)
To: Theodore Tso
Cc: Ingo Molnar, Arjan van de Ven, Jan Kara,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
Theodore Tso wrote:
> It is on my todo list, but I've been doing a lot of travelling, and
> when you have only one laptop, and pulling USB sticks isn't the sort
> of thing you can do using UML or KVM, it tends to fall to the bottom
> of your list.
Maybe add fault injection code which simply lets a SCSI LLD's
queuecommand fail everything when a sysfs switch was flipped. There may
be more convenient points to fail IO higher up, I don't know.
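The idea can be modelled in user space before touching any driver. The sketch below is a toy, not kernel code: the class, the fail_io attribute (standing in for the sysfs switch) and the caller are all hypothetical. It only shows the shape of such a test: flip the switch, let every I/O fail, and check that the caller notices.

```python
import errno


class FaultyDevice:
    """Toy block device whose I/O can be made to fail on demand."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.fail_io = False  # stands in for the sysfs switch

    def write(self, offset, payload):
        if self.fail_io:
            # Fail everything, as if the device had been unplugged.
            raise OSError(errno.EIO, "injected I/O failure")
        self.data[offset:offset + len(payload)] = payload


def sync_superblock(dev, payload):
    """Caller under test: must report the failure instead of carrying on."""
    try:
        dev.write(0, payload)
    except OSError as err:
        if err.errno != errno.EIO:
            raise
        return False
    return True
```

A test then writes once with the switch off, flips it, writes again, and asserts both that the second call reports failure and that the stored data is unchanged.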
--
Stefan Richter
-=====-==--- -=-= ==-=-
http://arcgraph.de/sr/
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 10:48 ` Ingo Molnar
@ 2008-05-26 16:20 ` Jan Kara
2008-05-26 16:48 ` Ingo Molnar
2008-05-27 11:40 ` Pavel Machek
1 sibling, 1 reply; 29+ messages in thread
From: Jan Kara @ 2008-05-26 16:20 UTC (permalink / raw)
To: Ingo Molnar
Cc: Theodore Tso, Arjan van de Ven, Linux Kernel Mailing List,
Linus Torvalds, Greg KH, Andrew Morton
On Mon 26-05-08 12:48:32, Ingo Molnar wrote:
>
> * Theodore Tso <tytso@MIT.EDU> wrote:
>
> > On Mon, May 26, 2008 at 11:39:13AM +0200, Ingo Molnar wrote:
> > > Exactly why is pulling a USB stick considered "stupid"? Last i checked
> > > there was no physical lock preventing users from doing that.
> > >
> > > Sure, pulling a mounted USB stick is inconvenient ... for _us_
> > > kernel developers. But the user really doesn't care and shouldn't
> > > care.
> >
> > Because they could lose data? Because if the kernel wakes up and
> > tries writing to the USB stick right as they pull it out, it could
> > physically damage the flash format? I know, stupid reason... :-)
>
> user can lose data in many other ways, that's not the issue - the issue
> here is something very crucial: the kernel gets confused about a _very_
> common user-triggerable condition.
>
> That confusion must not happen in a modern OS and the kernel should be
> resilient and cope with such external events. And we must not
> deprioritize it with an incorrect "user did something stupid" tag...
> That argument might have been valid 15 years ago when floppies could be
> locked and you needed a needle to force-eject it but it is rather lame
> today when unplugging a USB stick is as easy as moving the mouse.
>
> If there's something stupid here it's the kernel not dealing with that
> condition properly. Yes, the "user action" here looks "trivial" to the
> user but what happens below is indeed very hard technically, but who
> said that writing an OS from scratch would be an easy task? ;-)
Well, Ingo, I don't know if you've noticed, but the machine continues to run
just fine, as far as I understand. It only spits out a dangerous-looking
warning and that's it. I agree we shouldn't be doing this, but I don't really
find this a critical problem.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 16:20 ` Jan Kara
@ 2008-05-26 16:48 ` Ingo Molnar
2008-05-26 17:01 ` Theodore Tso
0 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2008-05-26 16:48 UTC (permalink / raw)
To: Jan Kara
Cc: Theodore Tso, Arjan van de Ven, Linux Kernel Mailing List,
Linus Torvalds, Greg KH, Andrew Morton
* Jan Kara <jack@suse.cz> wrote:
> just fine, as far as I understand. It only spits out a dangerous-looking
> warning and that's it. I agree we shouldn't be doing this, but I don't
> really find this a critical problem.
yeah i know it's "just" a warning, i checked the stack dump on
kerneloops.org before i wrote the mail.
Pulling removable media out while it might still be mounted is a fact of
life and comes not from the stupidity of the user but from the lack of
physical barriers on the device side.
What if the USB stick was pulled mistakenly, the user notices her
mistake later on and plugs the USB stick back in and expects all the
data to not be corrupted?
What if the user puts the stick back in and it won't be mounted or will
be critically damaged?
How do we even know whether these cases all work 100% robustly if the
attitude is that removing a mounted device is a stupid thing to do and a
bug related to it gets deprioritized? [starting with the issue of why
the user should even care about such a relatively low-level abstraction
as a "mounted filesystem".]
And any such problems do come up in the enterprise space as well, in
terms of multipath IO issues - and Linux still does quite poorly in that
area.
Ingo
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 16:48 ` Ingo Molnar
@ 2008-05-26 17:01 ` Theodore Tso
2008-05-26 17:09 ` Oliver Neukum
2008-05-27 3:49 ` Greg KH
0 siblings, 2 replies; 29+ messages in thread
From: Theodore Tso @ 2008-05-26 17:01 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jan Kara, Arjan van de Ven, Linux Kernel Mailing List,
Linus Torvalds, Greg KH, Andrew Morton
On Mon, May 26, 2008 at 06:48:58PM +0200, Ingo Molnar wrote:
>
> What if the USB stick was pulled mistakenly, the user notices her
> mistake later on and plugs the USB stick back in and expects all the
> data to not be corrupted?
If the USB stack folks would like to work on how to recognize that
it's the same USB stick that had been previously pulled, so that it
gets the same block device, and we can decide for how long we should
keep dirty buffers around associated with a pulled USB stick, we can
certainly have that conversation. :-)
> And any such problems do come up in the enterprise space as well, in
> terms of multipath IO issues - and Linux still does quite poorly in that
> area.
We definitely have problems here, I agree. But at least with a
multipath I/O device we have something that sticks around even when
the last I/O path is pulled. We could talk about setting up
dm-multipath with USB, I suppose --- that would have the benefit of
getting the dm-multipath code much more widely exercised without
requiring exotic (and expensive) hardware.
- Ted
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 17:01 ` Theodore Tso
@ 2008-05-26 17:09 ` Oliver Neukum
2008-05-26 17:28 ` Bart Van Assche
2008-05-27 11:41 ` Pavel Machek
2008-05-27 3:49 ` Greg KH
1 sibling, 2 replies; 29+ messages in thread
From: Oliver Neukum @ 2008-05-26 17:09 UTC (permalink / raw)
To: Theodore Tso
Cc: Ingo Molnar, Jan Kara, Arjan van de Ven,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
Am Montag 26 Mai 2008 19:01:48 schrieb Theodore Tso:
> If the USB stack folks would like to work on how to recognize that
> it's the same USB stick that had been previously pulled, so that it
> gets the same block device, and we can decide for how long we should
> keep dirty buffers around associated with a pulled USB stick, we can
> certainly have that conversation. :-)
Even if we could tell whether the device has remained the same, how
would we know the medium wasn't exchanged?
Regards
Oliver
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 17:09 ` Oliver Neukum
@ 2008-05-26 17:28 ` Bart Van Assche
2008-05-26 17:38 ` Jan Kara
2008-05-27 6:12 ` Oliver Neukum
2008-05-27 11:41 ` Pavel Machek
1 sibling, 2 replies; 29+ messages in thread
From: Bart Van Assche @ 2008-05-26 17:28 UTC (permalink / raw)
To: Oliver Neukum
Cc: Theodore Tso, Ingo Molnar, Jan Kara, Arjan van de Ven,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
On Mon, May 26, 2008 at 7:09 PM, Oliver Neukum <oliver@neukum.org> wrote:
> Am Montag 26 Mai 2008 19:01:48 schrieb Theodore Tso:
>> If the USB stack folks would like to work on how to recognize that
>> it's the same USB stick that had been previously pulled, so that it
>> gets the same block device, and we can decide for how long we should
>> keep dirty buffers around associated with a pulled USB stick, we can
>> certainly have that conversation. :-)
>
> Even if we could tell whether the device has remained the same, how
> would we know the medium wasn't exchanged?
Looking at the filesystem UUID could help -- this is an ID that is
present as data on the disk, and that is even independent of the bus
type. See also /dev/disk/by-uuid.
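To illustrate the /dev/disk/by-uuid lookup mentioned above, here is a minimal
userspace sketch in Python. It assumes the usual udev layout, in which each
entry in that directory is a symlink named after the filesystem UUID and
pointing at the device node. The function name `resolve_by_uuid` and its
`by_uuid_dir` parameter are made up for this example.

```python
import os


def resolve_by_uuid(fs_uuid, by_uuid_dir="/dev/disk/by-uuid"):
    """Return the device node a filesystem UUID currently maps to, or None.

    Assumes the udev convention that every entry in by_uuid_dir is a
    symlink whose name is the UUID and whose target is the block device.
    """
    link = os.path.join(by_uuid_dir, fs_uuid)
    if not os.path.islink(link):
        # No filesystem with this UUID is currently visible.
        return None
    # Follow the symlink to the real device node (e.g. /dev/sdb1).
    return os.path.realpath(link)
```

A caller could remember the UUID of a mounted stick and, after a re-plug,
check whether the same UUID has reappeared, possibly under a different device
node. This says nothing about whether the contents changed in between.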
For the journaling filesystems I am familiar with the default value
for the commit parameter is 5 seconds. Would it be a good idea to
leave the default to 5s for non-removable devices, and to change this
default to 1s for removable devices ?
Bart.
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 17:28 ` Bart Van Assche
@ 2008-05-26 17:38 ` Jan Kara
2008-05-26 17:50 ` Theodore Tso
2008-05-27 6:12 ` Oliver Neukum
1 sibling, 1 reply; 29+ messages in thread
From: Jan Kara @ 2008-05-26 17:38 UTC (permalink / raw)
To: Bart Van Assche
Cc: Oliver Neukum, Theodore Tso, Ingo Molnar, Jan Kara,
Arjan van de Ven, Linux Kernel Mailing List, Linus Torvalds,
Greg KH, Andrew Morton
On Mon 26-05-08 19:28:23, Bart Van Assche wrote:
> On Mon, May 26, 2008 at 7:09 PM, Oliver Neukum <oliver@neukum.org> wrote:
> > Am Montag 26 Mai 2008 19:01:48 schrieb Theodore Tso:
> >> If the USB stack folks would like to work on how to recognize that
> >> it's the same USB stick that had been previously pulled, so that it
> >> gets the same block device, and we can decide for how long we should
> >> keep dirty buffers around associated with a pulled USB stick, we can
> >> certainly have that conversation. :-)
> >
> > Even if we could tell whether the device has remained the same, how
> > would we know the medium wasn't exchanged?
>
> Looking at the filesystem UUID could help -- this is an ID that is
> present as data on the disk, and that is even independent of the bus
> type. See also /dev/disk/by-uuid.
Yes, but as Oliver wrote, if someone modified the filesystem in the
meantime, you won't notice it - the UUID doesn't help here.
> For the journaling filesystems I am familiar with the default value
> for the commit parameter is 5 seconds. Would it be a good idea to
> leave the default to 5s for non-removable devices, and to change this
> default to 1s for removable devices ?
I don't think it's a good idea:
1) You'd stress the wear-leveling of USB flash drives even more - btw, given
their sizes it does not make much sense to use ext3 on your USB stick. I
still use VFAT/ext2 there.
2) You'd probably notice a performance decrease because of more journaling
overhead.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 17:38 ` Jan Kara
@ 2008-05-26 17:50 ` Theodore Tso
2008-05-26 18:23 ` Ingo Molnar
0 siblings, 1 reply; 29+ messages in thread
From: Theodore Tso @ 2008-05-26 17:50 UTC (permalink / raw)
To: Jan Kara
Cc: Bart Van Assche, Oliver Neukum, Ingo Molnar, Arjan van de Ven,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
On Mon, May 26, 2008 at 07:38:48PM +0200, Jan Kara wrote:
> > Looking at the filesystem UUID could help -- this is an ID that is
> > present as data on the disk, and that is even independent of the bus
> > type. See also /dev/disk/by-uuid.
> Yes, but as Oliver wrote, if someone modified the filesystem in the
> meantime, you won't notice it - the UUID doesn't help here.
That part you could figure out in userspace, by looking at the last
mount and last modified time in the superblock. But the problem is
it's too late. If you had buffers which had been "in flight" at the
time when the USB stick was pulled, the kernel isn't going to be able
to send them to the new instantiation of the device for the freshly
installed USB stick. And I don't think we want to put
filesystem-specific UUID and superblock parsing code in the generic
USB layer!
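As a rough illustration of the userspace check described above, the following
Python sketch reads the last-mount time, last-write time and UUID from an
ext2/ext3 superblock. The offsets are my reading of the ext2 on-disk layout:
the superblock sits 1024 bytes into the device, with s_mtime at byte 44,
s_wtime at 48, s_magic at 56 and s_uuid at 104. The function names are made up
for this example.

```python
import struct
import uuid

SUPERBLOCK_OFFSET = 1024  # ext2/ext3 superblock starts 1024 bytes into the device
EXT_MAGIC = 0xEF53        # expected value of s_magic


def parse_ext_superblock(image):
    """Extract last-mount time, last-write time and UUID from raw bytes.

    `image` must hold at least the first 2048 bytes of the device or image.
    """
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 120:
        raise ValueError("image too short to contain a superblock")
    # s_mtime and s_wtime are consecutive little-endian 32-bit fields.
    last_mount, last_write = struct.unpack_from("<II", sb, 44)
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/ext3 superblock")
    fs_uuid = uuid.UUID(bytes=bytes(sb[104:120]))
    return {"last_mount": last_mount, "last_write": last_write, "uuid": fs_uuid}


def looks_untouched(before, after):
    """True if UUID and both timestamps match between two snapshots."""
    return all(before[k] == after[k] for k in ("uuid", "last_mount", "last_write"))
```

Matching timestamps only show that no ext-aware system mounted or wrote the
filesystem in between. Raw writes to the block device would not update them,
and none of this helps with buffers that were already in flight.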
I suspect that if we want to do this, the only way would be with
massive amounts of userspace help, and with the dm layer interposing
between the filesystem and the device. So when the USB stick gets
pulled, from the dm-multipath side it looks like the last I/O path has
been pulled, and it role-plays accordingly (with some kind of
intelligence where it holds dirty buffers for some reasonable amount
of time --- where reasonable is not easy to define) and then when
someone re-inserts a USB stick, userspace will have to figure out that
it was the same filesystem, and that it apparently hasn't been
tampered with, cross its fingers, and then associate the (possibly
different) USB device with the dm-multipath device.
If we have a super-bright student who needs humbling, it might make
for an interesting GSOC project. :-)
- Ted
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 17:50 ` Theodore Tso
@ 2008-05-26 18:23 ` Ingo Molnar
0 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2008-05-26 18:23 UTC (permalink / raw)
To: Theodore Tso, Jan Kara, Bart Van Assche, Oliver Neukum,
Arjan van de Ven, Linux Kernel Mailing List, Linus Torvalds,
Greg KH, Andrew Morton
* Theodore Tso <tytso@mit.edu> wrote:
> > > Looking at the filesystem UUID could help -- this is an ID that is
> > > present as data on the disk, and that is even independent of the
> > > bus type. See also /dev/disk/by-uuid.
> > Yes, but as Oliver wrote, if someone modified the filesystem in the
> > meantime, you won't notice it - the UUID doesn't help here.
>
> That part you could figure out in userspace, by looking at the last
> mount and last modified time in the superblock. But the problem is
> it's too late. If you had buffers which had been "in flight" at the
> time when the USB stick was pulled, the kernel isn't going to be able
> to send them to the new instantiation of the device for the freshly
> installed USB stick. And I don't think we want to put
> filesystem-specific UUID and superblock parsing code in the generic
> USB layer!
yeah, i agree it's all ugly - but it's really of our making, not the
user's ;-)
i think we could and should go to quite some length to properly support
a rather benign-appearing usecase such as the user removing stuff from a
modern computer (stuff that is not specifically bolted down that is).
Violating a few artificial abstraction layers within the kernel is a lot
better than losing user data, IMHO.
Ingo
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 17:01 ` Theodore Tso
2008-05-26 17:09 ` Oliver Neukum
@ 2008-05-27 3:49 ` Greg KH
1 sibling, 0 replies; 29+ messages in thread
From: Greg KH @ 2008-05-27 3:49 UTC (permalink / raw)
To: Theodore Tso, Ingo Molnar, Jan Kara, Arjan van de Ven,
Linux Kernel Mailing List, Linus Torvalds, Andrew Morton
On Mon, May 26, 2008 at 01:01:48PM -0400, Theodore Tso wrote:
> On Mon, May 26, 2008 at 06:48:58PM +0200, Ingo Molnar wrote:
> >
> > What if the USB stick was pulled mistakenly, the user notices her
> > mistake later on and plugs the USB stick back in and expects all the
> > data to not be corrupted?
>
> If the USB stack folks would like to work on how to recognize that
> it's the same USB stick that had been previously pulled, so that it
> gets the same block device, and we can decide for how long we should
> keep dirty buffers around associated with a pulled USB stick, we can
> certainly have that conversation. :-)
We do that already on suspend/resume, so it should not be hard to move
that to a disconnect/connect model as well; it's the same code path :)
Bring it up on the linux-usb list if you are interested in pursuing
this.
In the meantime, getting rid of the warning would be a good idea.
thanks,
greg k-h
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 17:28 ` Bart Van Assche
2008-05-26 17:38 ` Jan Kara
@ 2008-05-27 6:12 ` Oliver Neukum
1 sibling, 0 replies; 29+ messages in thread
From: Oliver Neukum @ 2008-05-27 6:12 UTC (permalink / raw)
To: Bart Van Assche
Cc: Theodore Tso, Ingo Molnar, Jan Kara, Arjan van de Ven,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
Am Montag 26 Mai 2008 19:28:23 schrieb Bart Van Assche:
> > Even if we could tell whether the device has remained the same, how
> > would we know the medium wasn't exchanged?
>
> Looking at the filesystem UUID could help -- this is an ID that is
> present as data on the disk, and that is even independent of the bus
> type. See also /dev/disk/by-uuid.
The medium may or may not hold a filesystem that has a UUID.
In addition you can clone filesystems with dd but reuse them as
independent filesystems.
Yes, you could checksum parts of the filesystem and say that at some
point the user should suffer the consequences of his stupidity, but
such things don't belong in the kernel, nor are they specific
to USB.
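For illustration only, a userspace version of "checksum parts of the
filesystem" could look like the Python sketch below. It hashes the first 64 KiB
of a device node or image file. The function names and the 64 KiB default are
arbitrary choices for this example, not something proposed in the thread.

```python
import hashlib


def fingerprint(path, length=64 * 1024):
    """SHA-256 over the first `length` bytes of a device node or image file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        digest.update(f.read(length))
    return digest.hexdigest()


def probably_same_medium(saved_fingerprint, path, length=64 * 1024):
    """Compare a fingerprint taken before removal with the re-inserted medium."""
    return saved_fingerprint == fingerprint(path, length)
```

Such a check is only a heuristic. Changes beyond the hashed region go
unnoticed, and a dd clone of the medium produces the same fingerprint.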
Regards
Oliver
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 10:48 ` Ingo Molnar
2008-05-26 16:20 ` Jan Kara
@ 2008-05-27 11:40 ` Pavel Machek
1 sibling, 0 replies; 29+ messages in thread
From: Pavel Machek @ 2008-05-27 11:40 UTC (permalink / raw)
To: Ingo Molnar
Cc: Theodore Tso, Arjan van de Ven, Jan Kara,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
On Mon 2008-05-26 12:48:32, Ingo Molnar wrote:
>
> * Theodore Tso <tytso@MIT.EDU> wrote:
>
> > On Mon, May 26, 2008 at 11:39:13AM +0200, Ingo Molnar wrote:
> > > Exactly why is pulling a USB stick considered "stupid"? Last i checked
> > > there was no physical lock preventing users from doing that.
> > >
> > > Sure, pulling a mounted USB stick is inconvenient ... for _us_
> > > kernel developers. But the user really doesn't care and shouldn't
> > > care.
> >
> > Because they could lose data? Because if the kernel wakes up and
> > tries writing to the USB stick right as they pull it out, it could
> > physically damage the flash format? I know, stupid reason... :-)
>
> user can lose data in many other ways, that's not the issue - the issue
> here is something very crucial: the kernel gets confused about a _very_
> common user-triggerable condition.
>
> That confusion must not happen in a modern OS and the kernel should be
> resilient and cope with such external events. And we must not
> deprioritize it with an incorrect "user did something stupid" tag...
Surely the unavoidable "data corruption happens when the user pulls the
stick without unmounting" has lower priority than, for example, "USB
stick stops working randomly when used overnight"?
And we still have bugs of the second class, so yes, unexpected pulls do
have lower priority.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: Top 10 bugs/warnings for the week of March 23rd, 2008
2008-05-26 17:09 ` Oliver Neukum
2008-05-26 17:28 ` Bart Van Assche
@ 2008-05-27 11:41 ` Pavel Machek
1 sibling, 0 replies; 29+ messages in thread
From: Pavel Machek @ 2008-05-27 11:41 UTC (permalink / raw)
To: Oliver Neukum
Cc: Theodore Tso, Ingo Molnar, Jan Kara, Arjan van de Ven,
Linux Kernel Mailing List, Linus Torvalds, Greg KH, Andrew Morton
On Mon 2008-05-26 19:09:57, Oliver Neukum wrote:
> Am Montag 26 Mai 2008 19:01:48 schrieb Theodore Tso:
> > If the USB stack folks would like to work on how to recognize that
> > it's the same USB stick that had been previously pulled, so that it
> > gets the same block device, and we can decide for how long we should
> > keep dirty buffers around associated with a pulled USB stick, we can
> > certainly have that conversation. :-)
>
> Even if we could tell whether the device has remained the same, how
> would we know the medium wasn't exchanged?
We need filesystem support for that, but I guess we should just bite
the bullet and do it for the most common filesystems... it is very
useful for suspend/resume, too, and I have heard about (misdesigned)
machines where it will be basically mandatory...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html