* Bug in error recovery in fs/buffer.c::__block_prepare_write()
@ 2005-06-10 13:01 Anton Altaparmakov
2005-06-10 13:05 ` Anton Altaparmakov
0 siblings, 1 reply; 3+ messages in thread
From: Anton Altaparmakov @ 2005-06-10 13:01 UTC (permalink / raw)
To: Andrew Morton, Al Viro; +Cc: fsdevel, lkml
Hi,
fs/buffer.c::__block_prepare_write() has broken error recovery. It
calls the get_block() callback with "create = 1" and if that succeeds it
immediately clears buffer_new on the just allocated buffer (recognised
by virtue of having buffer_new set).
It then zeroes out the region _outside_ the write, thus if the buffer is
fully being overwritten nothing is zeroed at all in this buffer.
It then repeats this for all buffers in the page.
The bug is that if an error occurs and get_block() returns != 0, we
break from this loop and go into recovery code. This code has this
comment:
/* Error case: */
/*
* Zero out any newly allocated blocks to avoid exposing stale
* data. If BH_New is set, we know that the block was newly
* allocated in the above loop.
*/
So the intent is obviously good in that it wants to clear just allocated
and hence not zeroed buffers. However the code recognises allocated
buffers by checking for buffer_new being set.
Unfortunately __block_prepare_write() as discussed above already cleared
buffer_new on all allocated buffers hence not a single buffer will be
cleared here and old data will be leaked. Here is the relevant code
from the error recovery path which at present _never_ triggers:
if (buffer_new(bh)) {
void *kaddr;
clear_buffer_new(bh);
kaddr = kmap_atomic(page, KM_USER0);
memset(kaddr+block_start, 0, bh->b_size);
kunmap_atomic(kaddr, KM_USER0);
set_buffer_uptodate(bh);
mark_buffer_dirty(bh);
}
So how do we fix this?
A) The simplest way I can see to fix this is to make the current
recovery code work by _not_ clearing buffer_new after calling
get_block() in __block_prepare_write(). The patch is simple and is
below.
We should be safe to leave the buffer_new set in the non-error case. No
code should care other than __block_prepare_write() and this clears
buffer_new before it calls get_block() so that is fine. I had a quick
look at buffer_new() users and I think it is safe to just leave it
set... Anyone know otherwise?
B) If we cannot safely allow buffer_new buffers to "leak out" of
__block_prepare_write(), then we simply would need to run a quick loop
over the buffers clearing buffer_new on each of them if it is set just
before returning "success" to the caller of __block_prepare_write().
The patch for this is simple, too (sent in separate email).
Andrew/Linus, I would suggest that you apply at least A and perhaps B if
you deem it necessary or want to be on the safe side.
Having had a look at the code it would seem perfectly safe to leave
buffer_new() set and ignore patch B but I may be wrong which is why I
did both.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
ps. I am sending this from evolution so please let me know if it mangles
the whitespace. (It shouldn't as I told it to use "Preformat" style but
you never know.)
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
--- linux-2.6.git/fs/buffer.c.old 2005-06-10 13:33:07.000000000 +0100
+++ linux-2.6.git/fs/buffer.c 2005-06-10 13:34:11.000000000 +0100
@@ -1951,7 +1951,6 @@ static int __block_prepare_write(struct
if (err)
break;
if (buffer_new(bh)) {
- clear_buffer_new(bh);
unmap_underlying_metadata(bh->b_bdev,
bh->b_blocknr);
if (PageUptodate(page)) {
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: Bug in error recovery in fs/buffer.c::__block_prepare_write()
2005-06-10 13:01 Bug in error recovery in fs/buffer.c::__block_prepare_write() Anton Altaparmakov
@ 2005-06-10 13:05 ` Anton Altaparmakov
2005-06-10 14:15 ` Anton Altaparmakov
0 siblings, 1 reply; 3+ messages in thread
From: Anton Altaparmakov @ 2005-06-10 13:05 UTC (permalink / raw)
To: Al Viro, Linus Torvalds, Andrew Morton; +Cc: fsdevel, lkml
Here is the second patch (patch B).
On Fri, 2005-06-10 at 14:01 +0100, Anton Altaparmakov wrote:
[snip]
> B) If we cannot safely allow buffer_new buffers to "leak out" of
> __block_prepare_write(), then we simply would need to run a quick loop
> over the buffers clearing buffer_new on each of them if it is set just
> before returning "success" to the caller of __block_prepare_write().
[snip]
The patch for this is simple, too (it is below).
> Andrew/Linus, I would suggest that you apply at least A and perhaps B if
> you deem it necessary or want to be on the safe side.
>
> Having had a look at the code it would seem perfectly safe to leave
> buffer_new() set and ignore patch B but I may be wrong which is why I
> did both.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
--- linux-2.6.git/fs/buffer.c.old2 2005-06-10 13:35:03.000000000 +0100
+++ linux-2.6.git/fs/buffer.c 2005-06-10 13:38:14.000000000 +0100
@@ -1992,9 +1992,14 @@ static int __block_prepare_write(struct
if (!buffer_uptodate(*wait_bh))
err = -EIO;
}
- if (!err)
+ if (!err) {
+ bh = head;
+ do {
+ if (buffer_new(bh))
+ clear_buffer_new(bh);
+ } while ((bh = bh->b_this_page) != head);
return err;
-
+ }
/* Error case: */
/*
* Zero out any newly allocated blocks to avoid exposing stale
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: Bug in error recovery in fs/buffer.c::__block_prepare_write()
2005-06-10 13:05 ` Anton Altaparmakov
@ 2005-06-10 14:15 ` Anton Altaparmakov
0 siblings, 0 replies; 3+ messages in thread
From: Anton Altaparmakov @ 2005-06-10 14:15 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, Andrew Morton, fsdevel, lkml
On Fri, 2005-06-10 at 14:05 +0100, Anton Altaparmakov wrote:
> Here is the second patch (patch B).
>
> On Fri, 2005-06-10 at 14:01 +0100, Anton Altaparmakov wrote:
> [snip]
> > B) If we cannot safely allow buffer_new buffers to "leak out" of
> > __block_prepare_write(), then we simply would need to run a quick loop
> > over the buffers clearing buffer_new on each of them if it is set just
> > before returning "success" to the caller of __block_prepare_write().
> [snip]
>
> The patch for this is simple, too (it is below).
>
> > Andrew/Linus, I would suggest that you apply at least A and perhaps B if
> > you deem it necessary or want to be on the safe side.
> >
> > Having had a look at the code it would seem perfectly safe to leave
> > buffer_new() set and ignore patch B but I may be wrong which is why I
> > did both.
I have changed my mind having had a look at the code some more. Patch B
is definitely needed.
Otherwise you can get a situation where an allocating write followed by
an erring write to the same location would cause uptodate and already
written out buffers to be zeroed & written out thus overwritting
existing data with zero.
This could be worked around differently (e.g. by being more picky which
buffer_new buffers get zeroed&written in the error handling) it is
definitely safest to just apply patch B.
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2005-06-10 14:15 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-06-10 13:01 Bug in error recovery in fs/buffer.c::__block_prepare_write() Anton Altaparmakov
2005-06-10 13:05 ` Anton Altaparmakov
2005-06-10 14:15 ` Anton Altaparmakov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).