* [BUG] garbage instead of zeroes in UFS
@ 2006-12-20 11:02 Tomasz Kvarsin
  2006-12-20 11:04 ` Tomasz Kvarsin
  0 siblings, 1 reply; 8+ messages in thread
From: Tomasz Kvarsin @ 2006-12-20 11:02 UTC (permalink / raw)
To: Andrew Morton, Alexander Viro, linux-kernel

I have some problems with write support in UFS.
Here is a script which demonstrates the problem:

#create image
mkdir /tmp/ufs-expirements && cd /tmp/ufs-expirements/
for ((i=0; i<1024*1024*2; ++i)); do printf "z"; done > image

#build ufs tools
wget 'http://heanet.dl.sourceforge.net/sourceforge/ufs-linux/ufs-tools-0.1.tar.bz2' \
  && tar xjf ufs-tools-0.1.tar.bz2 && cd ufs-tools-0.1
wget http://lkml.org/lkml/diff/2006/5/20/48/1 -O build.patch
patch -p1 < build.patch && make

#create UFS file system on image
./mkufs -O 1 -b 16384 -f 2048 ../image
cd .. && mkdir root
mount -t ufs image root -o loop,ufstype=44bsd
cd root/
touch a.txt
echo "END" > end.txt
dd if=./end.txt of=./a.txt bs=16384 seek=1

At the end, the content of "a.txt" is not only "END" and zeroes:
"a.txt" also contains "z".

The real situation happened when I deleted a big file and created
a new one with holes. This script is just an easy way to reproduce
the bug.

^ permalink raw reply [flat|nested] 8+ messages in thread
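The hole check at the end of the script can be made explicit. This is an editorial sketch, not part of the original report: it runs the same `dd seek=1` pattern against a scratch tmpdir (an ordinary filesystem, not the UFS loop mount above) and counts non-NUL bytes in the hole region. A correct filesystem prints 0; the bug described above would make the same probe on UFS report leftover "z" bytes.

```shell
# Probe the hole created by "dd seek=1" for non-NUL bytes.
# The tmpdir path is illustrative; to test UFS, run the equivalent
# commands inside the mounted root/ directory from the script above.
tmp=$(mktemp -d)
echo "END" > "$tmp/end.txt"
touch "$tmp/a.txt"
dd if="$tmp/end.txt" of="$tmp/a.txt" bs=16384 seek=1 2>/dev/null
# Bytes 0..16383 of a.txt are a hole and must read back as NULs.
nonzero=$(dd if="$tmp/a.txt" bs=16384 count=1 2>/dev/null | tr -d '\0' | wc -c)
echo "non-NUL bytes in hole: $nonzero"
rm -rf "$tmp"
```

On a filesystem with correct hole semantics the count is 0; any other value means stale data is visible through the hole.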
* Re: [BUG] garbage instead of zeroes in UFS
  2006-12-20 11:02 [BUG] garbage instead of zeroes in UFS Tomasz Kvarsin
@ 2006-12-20 11:04 ` Tomasz Kvarsin
  2006-12-20 11:09   ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Tomasz Kvarsin @ 2006-12-20 11:04 UTC (permalink / raw)
To: Andrew Morton, Alexander Viro, linux-kernel

Forgot to say I use linux-2.6.20-rc1-mm1

On 12/20/06, Tomasz Kvarsin <kvarsin@gmail.com> wrote:
> I have some problems with write support of UFS.
> Here is script which demonstrate problem:
> [...]
> The real situation happened when I deleted big file,
> and create new one with holes. This script just easy way to reproduce bug.
* Re: [BUG] garbage instead of zeroes in UFS
  2006-12-20 11:04 ` Tomasz Kvarsin
@ 2006-12-20 11:09   ` Andrew Morton
  2006-12-20 11:41     ` Tomasz Kvarsin
  2006-12-20 14:54     ` [BUG] [PATCH] [RFC] " Evgeniy Dushistov
  0 siblings, 2 replies; 8+ messages in thread
From: Andrew Morton @ 2006-12-20 11:09 UTC (permalink / raw)
To: Tomasz Kvarsin; +Cc: Alexander Viro, linux-kernel, Evgeniy Dushistov

On Wed, 20 Dec 2006 14:04:06 +0300 "Tomasz Kvarsin" <kvarsin@gmail.com> wrote:

> Forgot to say I use linux-2.6.20-rc1-mm1
>
> On 12/20/06, Tomasz Kvarsin <kvarsin@gmail.com> wrote:
> > I have some problems with write support of UFS.
> > Here is script which demonstrate problem:
> > [...]

Does 2.6.20-rc1 have the same problem?
* Re: [BUG] garbage instead of zeroes in UFS
  2006-12-20 11:09 ` Andrew Morton
@ 2006-12-20 11:41   ` Tomasz Kvarsin
  2006-12-20 14:54   ` [BUG] [PATCH] [RFC] " Evgeniy Dushistov
  1 sibling, 0 replies; 8+ messages in thread
From: Tomasz Kvarsin @ 2006-12-20 11:41 UTC (permalink / raw)
To: Andrew Morton; +Cc: Alexander Viro, linux-kernel, Evgeniy Dushistov

On 12/20/06, Andrew Morton <akpm@osdl.org> wrote:
> [...]
> Does 2.6.20-rc1 have the same problem?

Yes. Actually, if it is important: I started my search for stable UFS
write support from 2.6.10 + RH patches. It just hung from time to time,
so I tried 2.6.19; it is pretty stable, but it has this bug, and
2.6.20-rc1 has it too.
* Re: [BUG] [PATCH] [RFC] garbage instead of zeroes in UFS
  2006-12-20 11:09 ` Andrew Morton
  2006-12-20 11:41   ` Tomasz Kvarsin
@ 2006-12-20 14:54   ` Evgeniy Dushistov
  2006-12-20 16:52     ` Tomasz Kvarsin
  2006-12-21  5:35     ` Andrew Morton
  1 sibling, 2 replies; 8+ messages in thread
From: Evgeniy Dushistov @ 2006-12-20 14:54 UTC (permalink / raw)
To: Andrew Morton; +Cc: Tomasz Kvarsin, Alexander Viro, linux-kernel

On Wed, Dec 20, 2006 at 03:09:55AM -0800, Andrew Morton wrote:
> On Wed, 20 Dec 2006 14:04:06 +0300 "Tomasz Kvarsin" <kvarsin@gmail.com> wrote:
> > Forgot to say I use linux-2.6.20-rc1-mm1
> > [...]
> Does 2.6.20-rc1 have the same problem?
Looks like this is the problem which Al Viro pointed out some time ago:
when we allocate a block (in this situation 16K), we mark only one
fragment (in this situation 2K) as new.

I don't see a _right_ way to do nullification of the whole block:
if we use the inode page cache, some pages may be outside of the inode
limits (inode size) and will be lost;
if we use the blockdev page cache, it is possible to zeroize real data
if the inode page cache is used later.

The simplest way, as far as I can see, is to use the block device page
cache, but not only mark the buffers dirty - also sync them during
"nullification".

The patch below properly handles Tomasz's test case for me (Tomasz, can
you try it?), but I am not sure whether this is the _right_ solution.
Any ideas?

---

Index: linux-2.6.20-rc1-mm1/fs/ufs/inode.c
===================================================================
--- linux-2.6.20-rc1-mm1.orig/fs/ufs/inode.c
+++ linux-2.6.20-rc1-mm1/fs/ufs/inode.c
@@ -156,36 +156,6 @@ out:
 	return ret;
 }
 
-static void ufs_clear_frag(struct inode *inode, struct buffer_head *bh)
-{
-	lock_buffer(bh);
-	memset(bh->b_data, 0, inode->i_sb->s_blocksize);
-	set_buffer_uptodate(bh);
-	mark_buffer_dirty(bh);
-	unlock_buffer(bh);
-	if (IS_SYNC(inode))
-		sync_dirty_buffer(bh);
-}
-
-static struct buffer_head *
-ufs_clear_frags(struct inode *inode, sector_t beg,
-		unsigned int n, sector_t want)
-{
-	struct buffer_head *res = NULL, *bh;
-	sector_t end = beg + n;
-
-	for (; beg < end; ++beg) {
-		bh = sb_getblk(inode->i_sb, beg);
-		ufs_clear_frag(inode, bh);
-		if (want != beg)
-			brelse(bh);
-		else
-			res = bh;
-	}
-	BUG_ON(!res);
-	return res;
-}
-
 /**
  * ufs_inode_getfrag() - allocate new fragment(s)
  * @inode - pointer to inode
@@ -302,7 +272,7 @@ repeat:
 	}
 
 	if (!phys) {
-		result = ufs_clear_frags(inode, tmp, required, tmp + blockoff);
+		result = sb_getblk(sb, tmp + blockoff);
 	} else {
 		*phys = tmp + blockoff;
 		result = NULL;
@@ -403,8 +373,7 @@ repeat:
 
 
 	if (!phys) {
-		result = ufs_clear_frags(inode, tmp, uspi->s_fpb,
-					 tmp + blockoff);
+		result = sb_getblk(sb, tmp + blockoff);
 	} else {
 		*phys = tmp + blockoff;
 		*new = 1;
@@ -471,13 +440,13 @@ int ufs_getfrag_block(struct inode *inod
 #define GET_INODE_DATABLOCK(x) \
 	ufs_inode_getfrag(inode, x, fragment, 1, &err, &phys, &new, bh_result->b_page)
 #define GET_INODE_PTR(x) \
-	ufs_inode_getfrag(inode, x, fragment, uspi->s_fpb, &err, NULL, NULL, bh_result->b_page)
+	ufs_inode_getfrag(inode, x, fragment, uspi->s_fpb, &err, NULL, NULL, NULL)
 #define GET_INDIRECT_DATABLOCK(x) \
 	ufs_inode_getblock(inode, bh, x, fragment,	\
-			   &err, &phys, &new, bh_result->b_page);
+			   &err, &phys, &new, bh_result->b_page)
 #define GET_INDIRECT_PTR(x) \
 	ufs_inode_getblock(inode, bh, x, fragment,	\
-			   &err, NULL, NULL, bh_result->b_page);
+			   &err, NULL, NULL, NULL)
 
 	if (ptr < UFS_NDIR_FRAGMENT) {
 		bh = GET_INODE_DATABLOCK(ptr);
Index: linux-2.6.20-rc1-mm1/fs/ufs/balloc.c
===================================================================
--- linux-2.6.20-rc1-mm1.orig/fs/ufs/balloc.c
+++ linux-2.6.20-rc1-mm1/fs/ufs/balloc.c
@@ -275,6 +275,25 @@ static void ufs_change_blocknr(struct in
 	UFSD("EXIT\n");
 }
 
+static void ufs_clear_frags(struct inode *inode, sector_t beg, unsigned int n,
+			    int sync)
+{
+	struct buffer_head *bh;
+	sector_t end = beg + n;
+
+	for (; beg < end; ++beg) {
+		bh = sb_getblk(inode->i_sb, beg);
+		lock_buffer(bh);
+		memset(bh->b_data, 0, inode->i_sb->s_blocksize);
+		set_buffer_uptodate(bh);
+		mark_buffer_dirty(bh);
+		unlock_buffer(bh);
+		if (IS_SYNC(inode) || sync)
+			sync_dirty_buffer(bh);
+		brelse(bh);
+	}
+}
+
 unsigned ufs_new_fragments(struct inode * inode, __fs32 * p, unsigned fragment,
 			   unsigned goal, unsigned count, int * err, struct page *locked_page)
 {
@@ -350,6 +369,8 @@ unsigned ufs_new_fragments(struct inode
 		*p = cpu_to_fs32(sb, result);
 		*err = 0;
 		UFS_I(inode)->i_lastfrag = max_t(u32, UFS_I(inode)->i_lastfrag, fragment + count);
+		ufs_clear_frags(inode, result + oldcount, newcount - oldcount,
+				locked_page != NULL);
 	}
 	unlock_super(sb);
 	UFSD("EXIT, result %u\n", result);
@@ -363,6 +384,8 @@ unsigned ufs_new_fragments(struct inode
 	if (result) {
 		*err = 0;
 		UFS_I(inode)->i_lastfrag = max_t(u32, UFS_I(inode)->i_lastfrag, fragment + count);
+		ufs_clear_frags(inode, result + oldcount, newcount - oldcount,
+				locked_page != NULL);
 		unlock_super(sb);
 		UFSD("EXIT, result %u\n", result);
 		return result;
@@ -398,6 +421,8 @@ unsigned ufs_new_fragments(struct inode
 		*p = cpu_to_fs32(sb, result);
 		*err = 0;
 		UFS_I(inode)->i_lastfrag = max_t(u32, UFS_I(inode)->i_lastfrag, fragment + count);
+		ufs_clear_frags(inode, result + oldcount, newcount - oldcount,
+				locked_page != NULL);
 		unlock_super(sb);
 		if (newcount < request)
 			ufs_free_fragments (inode, result + newcount, request - newcount);

-- 
/Evgeniy
* Re: [BUG] [PATCH] [RFC] garbage instead of zeroes in UFS
  2006-12-20 14:54 ` [BUG] [PATCH] [RFC] " Evgeniy Dushistov
@ 2006-12-20 16:52   ` Tomasz Kvarsin
  2006-12-21  5:35   ` Andrew Morton
  1 sibling, 0 replies; 8+ messages in thread
From: Tomasz Kvarsin @ 2006-12-20 16:52 UTC (permalink / raw)
To: Andrew Morton, Tomasz Kvarsin, Alexander Viro, linux-kernel

Works for me. At least now I cannot reproduce this bug.

On 12/20/06, Evgeniy Dushistov <dushistov@mail.ru> wrote:
> [analysis and patch quoted in full - see the previous message]
* Re: [BUG] [PATCH] [RFC] garbage instead of zeroes in UFS
  2006-12-20 14:54 ` [BUG] [PATCH] [RFC] " Evgeniy Dushistov
  2006-12-20 16:52   ` Tomasz Kvarsin
@ 2006-12-21  5:35   ` Andrew Morton
  2006-12-28 16:04     ` Evgeniy Dushistov
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2006-12-21 5:35 UTC (permalink / raw)
To: Evgeniy Dushistov; +Cc: Tomasz Kvarsin, Alexander Viro, linux-kernel

On Wed, 20 Dec 2006 17:54:12 +0300 Evgeniy Dushistov <dushistov@mail.ru> wrote:

> [...]

I know nothing of UFS, but here goes..
> Looks like this is the problem, which point Al Viro some time ago:
> when we allocate block(in this situation 16K) we mark as new
> only one fragment(in this situation 2K),

Do you mean:

ufs's get_block callback allocates 16k of disk at a time, and links that
entire 16k into the file's metadata.  But because get_block is called for
only a single buffer_head (a 2k buffer_head in this case?) we are only
able to tell the VFS that this 2k is buffer_new().

So when ufs_getfrag_block() is later called to map some more data in the
file, and when that data resides within the remaining 14k of this
fragment, ufs_getfrag_block() will incorrectly return a !buffer_new()
buffer_head.

If that is correct, it seems like a fairly easy-to-trigger bug.  Perhaps
it only happens when we're filling in file holes for some reason?

Or perhaps this bad data can be accessed simply by extending the file with
ftruncate and then reading shortly beyond the previous end-of-file?

> I don't see _right_ way to do nullification of whole block,
> if use inode page cache, some pages may be outside of inode limits
> (inode size),
> and will be lost;

Using the per-inode pagecache is certainly more efficient than performing
synchronous writes via blockdev pagecache and then reading the blocks back
into the inode pagecache!

It'd be fairly straightforward to do this for blocks which are inside
i_size (ie: filling in file holes): populate pagecache, zero it, mark it
dirty.

For pages which are presently outside i_size things are a bit more risky -
they're not really legal and can't in theory be written out.  Or might not
be written out.  Although from a quick look at the writeback code, it might
all just work.

However a cleaner solution might be to remember, on a per-inode basis, what
the filesystem's view is of the current file size.  Then implement
inode_operations.setattr() and if someone extends the file, we know that
this will expose the uninitialised blocks outside the old file-size to
reads, so now is the time to instantiate that dirty, zero-filled pagecache
to cover the tail of the last fragment.

Is all a bit tricky.

> if use blockdev page cache it is possible to zeroize real data,
> if later inode page cache will be used.
>
> The simpliest way, as can I see usage of block device page cache,
> but not only mark dirty, but also sync it during "nullification".

That'll work.  How bad is this change from a performance point-of-view?
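The mismatch described above (a whole block allocated on disk, but only one fragment reported as new) can be modeled in a few lines of user-space code. This is an editorial sketch with made-up names, not kernel code: the "VFS" here zeroes only the fragments reported as new, so stale bytes from a deleted file stay readable through the rest of the block.

```python
FRAG = 2048   # UFS fragment size from the report (2K)
FPB = 8       # fragments per block: one 16K block holds eight 2K fragments

# "Disk" block still holding 'z' bytes left behind by a deleted file.
disk = bytearray(b"z" * FRAG * FPB)

def allocate_block(written_frag, mark_whole_block_new):
    """Toy model of get_block: the filesystem allocates all FPB fragments
    at once, but reports either the whole block or only the written
    fragment as new; the caller (the 'VFS') zeroes only what was
    reported new."""
    new = range(FPB) if mark_whole_block_new else [written_frag]
    for i in new:
        disk[i * FRAG:(i + 1) * FRAG] = b"\0" * FRAG

# Buggy behaviour: the write lands in the last fragment, so only that
# fragment gets zeroed and the first 14K of the "hole" keeps stale data.
allocate_block(written_frag=FPB - 1, mark_whole_block_new=False)
stale = disk[:FRAG].count(ord("z"))
print(stale)  # 2048: every byte of the first fragment is still 'z'
```

Re-running with `mark_whole_block_new=True` (the behavior the patch restores by zeroing at allocation time in `ufs_new_fragments`) leaves no stale bytes anywhere in the block.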
* Re: [BUG] [PATCH] [RFC] garbage instead of zeroes in UFS
  2006-12-21  5:35 ` Andrew Morton
@ 2006-12-28 16:04   ` Evgeniy Dushistov
  0 siblings, 0 replies; 8+ messages in thread
From: Evgeniy Dushistov @ 2006-12-28 16:04 UTC (permalink / raw)
To: Andrew Morton; +Cc: Tomasz Kvarsin, Alexander Viro, linux-kernel

[sorry for the delay with the answer]

On Wed, Dec 20, 2006 at 09:35:55PM -0800, Andrew Morton wrote:
> I know nothing of UFS, but here goes..
>
> Do you mean:
>
> ufs's get_block callback allocates 16k of disk at a time, and links that
> entire 16k into the file's metadata.  But because get_block is called for
> only a single buffer_head (a 2k buffer_head in this case?) we are only
> able to tell the VFS that this 2k is buffer_new().
>
> So when ufs_getfrag_block() is later called to map some more data in the
> file, and when that data resides within the remaining 14k of this
> fragment, ufs_getfrag_block() will incorrectly return a !buffer_new()
> buffer_head.

Yes.

> If that is correct, it seems like a fairly easy-to-trigger bug.  Perhaps
> it only happens when we're filling in file holes for some reason?

It happens when we fill a file hole of size >= 16K (in this case), or
extend a big enough file (big meaning that we use indirect, double
indirect and so on blocks).

> Or perhaps this bad data can be accessed simply by extending the file with
> ftruncate and then reading shortly beyond the previous end-of-file?
>
> Using the per-inode pagecache is certainly more efficient than performing
> synchronous writes via blockdev pagecache and then reading the blocks back
> into the inode pagecache!
>
> It'd be fairly straightforward to do this for blocks which are inside
> i_size (ie: filling in file holes): populate pagecache, zero it, mark it
> dirty.
>
> [...]
>
> However a cleaner solution might be to remember, on a per-inode basis, what
> the filesystem's view is of the current file size.  Then implement
> inode_operations.setattr() and if someone extends the file, we know that
> this will expose the uninitialised blocks outside the old file-size to
> reads, so now is the time to instantiate that dirty, zero-filled pagecache
> to cover the tail of the last fragment.
>
> Is all a bit tricky.

I see two places, inode_ops.setattr and
address_space_operations.commit_write, where the inode size can be
changed and we should say "hey, we have pages outside of the inode,
let's fill them with zeroes".

But there is one problem, as far as I can see (maybe I missed
something?): functions from the ufs address_space_operations, such as
prepare_write and writepage, are called with the page in locked state,
so a deadlock may appear when, for example, msync is called for page
"0", another msync is called for page "1", and prepare_write for "0"
tries to nullify pages "0", "1", "2" and "3".  (Similar code - populate
the cache and modify pages - is used for reallocating fragments, but
due to the nature of ufs such a situation is impossible there.)

So it would be no fun to implement, debug and use such code.

> > if use blockdev page cache it is possible to zeroize real data,
> > if later inode page cache will be used.
> >
> > The simpliest way, as can I see usage of block device page cache,
> > but not only mark dirty, but also sync it during "nullification".
>
> That'll work.  How bad is this change from a performance point-of-view?

Do you suggest any particular benchmark?

I use my simple test collection, which I used to check that create,
open, write, read and close work on ufs, and I see that this patch
makes the ufs code 18% slower than before.

-- 
/Evgeniy
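On the benchmark question, one rough harness (an editorial sketch, not Evgeniy's actual test collection; the 18% figure above comes from his tests, not from this) is to time a batch of hole-creating writes once per kernel. The loop is shown against a scratch tmpdir so the harness itself can be sanity-checked; to measure the patch, point `dir` at the UFS loop mount (e.g. `./root` from the original script).

```shell
# Hypothetical micro-benchmark: many small writes that each seek past
# EOF, the pattern that triggers block allocation with holes and thus
# the new synchronous ufs_clear_frags() path.
dir=$(mktemp -d)                # replace with the UFS mount to compare kernels
start=$(date +%s%N)             # GNU date, nanosecond resolution
for i in $(seq 1 100); do
    dd if=/dev/zero of="$dir/f$i" bs=16384 seek=3 count=1 2>/dev/null
done
elapsed_ms=$(( ($(date +%s%N) - start) / 1000000 ))
created=$(ls "$dir" | wc -l)
echo "created $created sparse files in ${elapsed_ms} ms"
rm -rf "$dir"
```

Comparing `elapsed_ms` for the same loop on patched and unpatched kernels gives a crude estimate of the synchronous-zeroing overhead.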
end of thread, other threads: [~2006-12-28 15:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2006-12-20 11:02 [BUG] garbage instead of zeroes in UFS Tomasz Kvarsin
2006-12-20 11:04 ` Tomasz Kvarsin
2006-12-20 11:09   ` Andrew Morton
2006-12-20 11:41     ` Tomasz Kvarsin
2006-12-20 14:54     ` [BUG] [PATCH] [RFC] " Evgeniy Dushistov
2006-12-20 16:52       ` Tomasz Kvarsin
2006-12-21  5:35       ` Andrew Morton
2006-12-28 16:04         ` Evgeniy Dushistov