* Recent mm changes leading to filesystem corruption?
@ 2006-12-16 15:50 Martin Michlmayr
  2006-12-16 18:20   ` Hugh Dickins
  2006-12-16 20:55   ` Peter Zijlstra
  0 siblings, 2 replies; 11+ messages in thread
From: Martin Michlmayr @ 2006-12-16 15:50 UTC
  To: Peter Zijlstra, Hugh Dickins, linux-kernel, debian-kernel

Debian recently applied a number of mm changes that went into 2.6.19
to their 2.6.18 kernel for LSB 3.1 compliance (msync() had problems
before).  Since then, some filesystem corruption has been observed
which can be traced back to these mm changes.  Is anyone aware of
problems with these patches?

The patches that were applied are:

   - mm: tracking shared dirty pages
   - mm: balance dirty pages
   - mm: optimize the new mprotect() code a bit
   - mm: small cleanup of install_page()
   - mm: fixup do_wp_page()
   - mm: msync() cleanup

With these applied to 2.6.18, the Debian installer on a slow ARM
system fails because a program segfaults due to filesystem corruption:
http://bugs.debian.org/401980  This problem also occurs if you only
apply the "mm: tracking shared dirty pages" patch to 2.6.18 from the
series of 5 patches listed above.

Another problem has been reported related to libtorrent: according to
http://bugs.debian.org/402707 someone also saw this with non-Debian
2.6.19 but obviously it's hard to say whether the bugs are really
related.
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=394392;msg=24 shows
some dmesg messages but again it's not 100% clear it's the same bug.

Has anyone else seen problems or is aware of a fix to the patches
listed above that I'm unaware of?  It's possible the problem only
shows up on slow systems. (The corruption is reproducible on a slow
NSLU2 ARM system with 32 MB RAM, but it doesn't happen on a faster ARM
box with more RAM.)
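
For anyone trying to exercise the code path these patches touch, here
is a minimal sketch in C.  It is not the test used in any of the bug
reports above; the helper name check_shared_mapping() and the byte
pattern are made up for illustration.  It dirties a file through a
MAP_SHARED mapping, calls msync(), and reads the file back with
pread().  Note that pread() is served from the page cache, so this only
checks that the mapping and the file view agree; on its own it is
unlikely to reproduce the reported corruption, which appears to need a
slow, memory-constrained system.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the bug reports): write a byte pattern
 * through a shared mapping, msync() it, then read the file back and
 * compare.  Returns 0 if the data matches, 1 on a mismatch, and -1 if
 * a system call fails or len is 0.
 */
int check_shared_mapping(const char *path, size_t len)
{
    unsigned char buf[4096];
    unsigned char *map;
    size_t off = 0;
    int result = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (len == 0 || ftruncate(fd, (off_t)len) != 0)
        goto out_close;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    /* Dirty every page of the mapping with a position-dependent pattern. */
    for (size_t i = 0; i < len; i++)
        map[i] = (unsigned char)(i * 31 + 7);

    /* Ask the kernel to write the dirty pages back before unmapping. */
    if (msync(map, len, MS_SYNC) != 0) {
        munmap(map, len);
        goto out_close;
    }
    munmap(map, len);

    /* Read back through the file descriptor and verify the pattern. */
    result = 0;
    while (off < len && result == 0) {
        size_t want = len - off < sizeof buf ? len - off : sizeof buf;
        ssize_t got = pread(fd, buf, want, (off_t)off);

        if (got <= 0) {
            result = -1;
            break;
        }
        for (size_t i = 0; i < (size_t)got; i++) {
            if (buf[i] != (unsigned char)((off + i) * 31 + 7)) {
                result = 1;
                break;
            }
        }
        off += (size_t)got;
    }

out_close:
    close(fd);
    return result;
}
```

Calling check_shared_mapping() with a path on the filesystem under test
and a length spanning several pages should return 0 on a healthy
kernel.  The file is left in place so it can be inspected afterwards.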
-- 
Martin Michlmayr
http://www.cyrius.com/

* Re: Recent mm changes leading to filesystem corruption?
  2006-12-16 15:50 Recent mm changes leading to filesystem corruption? Martin Michlmayr
@ 2006-12-16 18:20   ` Hugh Dickins
  2006-12-16 20:55   ` Peter Zijlstra
  1 sibling, 0 replies; 11+ messages in thread
From: Hugh Dickins @ 2006-12-16 18:20 UTC
  To: Martin Michlmayr; +Cc: Peter Zijlstra, linux-mm, linux-kernel, debian-kernel

On Sat, 16 Dec 2006, Martin Michlmayr wrote:

> Debian recently applied a number of mm changes that went into 2.6.19
> to their 2.6.18 kernel for LSB 3.1 compliance (msync() had problems
> before).  Since then, some filesystem corruption has been observed
> which can be traced back to these mm changes.  Is anyone aware of
> problems with these patches?

Very disturbing.  I'm not aware of any problem with them, and we
surely wouldn't have released 2.6.19 with any known-corrupting patches
in.  There's some doubts about 2.6.19 itself in the links below: were
it not for those, I'd suspect a mismerge of the pieces into 2.6.18,
perhaps a hidden dependency on something else.  I'll ponder a little,
but let's CC linux-mm in case someone there has an idea.

Hugh

> 
> The patches that were applied are:
> 
>    - mm: tracking shared dirty pages
>    - mm: balance dirty pages
>    - mm: optimize the new mprotect() code a bit
>    - mm: small cleanup of install_page()
>    - mm: fixup do_wp_page()
>    - mm: msync() cleanup
> 
> With these applied to 2.6.18, the Debian installer on a slow ARM
> system fails because a program segfaults due to filesystem corruption:
> http://bugs.debian.org/401980  This problem also occurs if you only
> apply the "mm: tracking shared dirty pages" patch to 2.6.18 from the
> series of 5 patches listed above.
> 
> Another problem has been reported related to libtorrent: according to
> http://bugs.debian.org/402707 someone also saw this with non-Debian
> 2.6.19 but obviously it's hard to say whether the bugs are really
> related.
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=394392;msg=24 shows
> some dmesg messages but again it's not 100% clear it's the same bug.
> 
> Has anyone else seen problems or is aware of a fix to the patches
> listed above that I'm unaware of?  It's possible the problem only
> shows up on slow systems. (The corruption is reproducible on a slow
> NSLU2 ARM system with 32 MB RAM, but it doesn't happen on a faster ARM
> box with more RAM.)
> -- 
> Martin Michlmayr
> http://www.cyrius.com/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: Recent mm changes leading to filesystem corruption?
  2006-12-16 18:20   ` Hugh Dickins
@ 2006-12-16 18:44     ` Martin Michlmayr
  -1 siblings, 0 replies; 11+ messages in thread
From: Martin Michlmayr @ 2006-12-16 18:44 UTC
  To: Hugh Dickins; +Cc: Peter Zijlstra, linux-mm, linux-kernel, debian-kernel

* Hugh Dickins <hugh@veritas.com> [2006-12-16 18:20]:
> Very disturbing.  I'm not aware of any problem with them, and we
> surely wouldn't have released 2.6.19 with any known-corrupting patches
> in.  There's some doubts about 2.6.19 itself in the links below: were
> it not for those, I'd suspect a mismerge of the pieces into 2.6.18,
> perhaps a hidden dependency on something else.  I'll ponder a little,
> but let's CC linux-mm in case someone there has an idea.

Do you think http://article.gmane.org/gmane.linux.kernel/473710 might
be related?
-- 
Martin Michlmayr
http://www.cyrius.com/

* Re: Recent mm changes leading to filesystem corruption?
  2006-12-16 18:44     ` Martin Michlmayr
@ 2006-12-16 19:07       ` Hugh Dickins
  -1 siblings, 0 replies; 11+ messages in thread
From: Hugh Dickins @ 2006-12-16 19:07 UTC
  To: Martin Michlmayr
  Cc: Peter Zijlstra, Jan Kara, linux-mm, linux-kernel, debian-kernel

On Sat, 16 Dec 2006, Martin Michlmayr wrote:
> * Hugh Dickins <hugh@veritas.com> [2006-12-16 18:20]:
> > Very disturbing.  I'm not aware of any problem with them, and we
> > surely wouldn't have released 2.6.19 with any known-corrupting patches
> > in.  There's some doubts about 2.6.19 itself in the links below: were
> > it not for those, I'd suspect a mismerge of the pieces into 2.6.18,
> > perhaps a hidden dependency on something else.  I'll ponder a little,
> > but let's CC linux-mm in case someone there has an idea.
> 
> Do you think http://article.gmane.org/gmane.linux.kernel/473710 might
> be related?

Sounds like it.  Let's CC Jan Kara on your other thread,
he seems to have delved into it a little.

Hugh

* Re: Recent mm changes leading to filesystem corruption?
  2006-12-16 15:50 Recent mm changes leading to filesystem corruption? Martin Michlmayr
@ 2006-12-16 20:55   ` Peter Zijlstra
  2006-12-16 20:55   ` Peter Zijlstra
  1 sibling, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2006-12-16 20:55 UTC
  To: Martin Michlmayr
  Cc: Hugh Dickins, linux-kernel, debian-kernel, linux-mm, David Miller

On Sat, 2006-12-16 at 16:50 +0100, Martin Michlmayr wrote:
> Debian recently applied a number of mm changes that went into 2.6.19
> to their 2.6.18 kernel for LSB 3.1 compliance (msync() had problems
> before).  Since then, some filesystem corruption has been observed
> which can be traced back to these mm changes.  Is anyone aware of
> problems with these patches?

As said by Hugh, no we were not.

> The patches that were applied are:
> 
>    - mm: tracking shared dirty pages
>    - mm: balance dirty pages
>    - mm: optimize the new mprotect() code a bit
>    - mm: small cleanup of install_page()
>    - mm: fixup do_wp_page()
>    - mm: msync() cleanup
> 
> With these applied to 2.6.18, the Debian installer on a slow ARM
> system fails because a program segfaults due to filesystem corruption:
> http://bugs.debian.org/401980  This problem also occurs if you only
> apply the "mm: tracking shared dirty pages" patch to 2.6.18 from the
> series of 5 patches listed above.

This made me think of a blog entry by DaveM from some time ago:
  http://vger.kernel.org/~davem/cgi-bin/blog.cgi/2006/06/09

> Another problem has been reported related to libtorrent: according to
> http://bugs.debian.org/402707 someone also saw this with non-Debian
> 2.6.19 but obviously it's hard to say whether the bugs are really
> related.
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=394392;msg=24 shows
> some dmesg messages but again it's not 100% clear it's the same bug.
> 
> Has anyone else seen problems or is aware of a fix to the patches
> listed above that I'm unaware of?  It's possible the problem only
> shows up on slow systems. (The corruption is reproducible on a slow
> NSLU2 ARM system with 32 MB RAM, but it doesn't happen on a faster ARM
> box with more RAM.)

What is not clear from all these reports is what architectures this is
seen on. I suspect some of them are i686, which together with the
explicit mention of ARM make it a cross platform issue.

* Re: Recent mm changes leading to filesystem corruption?
  2006-12-16 20:55   ` Peter Zijlstra
@ 2006-12-16 21:23     ` Martin Michlmayr
  -1 siblings, 0 replies; 11+ messages in thread
From: Martin Michlmayr @ 2006-12-16 21:23 UTC
  To: Peter Zijlstra
  Cc: Hugh Dickins, linux-kernel, debian-kernel, linux-mm, David Miller

* Peter Zijlstra <a.p.zijlstra@chello.nl> [2006-12-16 21:55]:
> What is not clear from all these reports is what architectures this is
> seen on. I suspect some of them are i686, which together with the
> explicit mention of ARM make it a cross platform issue.

Problems have been seen at least on x86, x86_64 and arm.
-- 
Martin Michlmayr
tbm@cyrius.com
