* [2.6.29-rc5][BUG] swapon on vfat file gets stuck on inode lock @ 2009-03-11 13:57 Laurent GUERBY 2009-03-11 15:53 ` Linus Torvalds 0 siblings, 1 reply; 7+ messages in thread From: Laurent GUERBY @ 2009-03-11 13:57 UTC (permalink / raw) To: linux-kernel; +Cc: OGAWA Hirofumi, Linus Torvalds Hi, With 2.6.29-rc5 (on an ARM platform but I don't think it's significant) when I try to enable swap on a file which is on an USB mounted vfat partition the swapon syscall gets stuck: swapon D c03780d4 0 22361 1 [<c0377db0>] (schedule+0x0/0x3ac) from [<c0378e50>] (__mutex_lock_slowpath+0x94/0xf4) [<c0378dbc>] (__mutex_lock_slowpath+0x0/0xf4) from [<c0378c40>] (mutex_lock+0x20/0x24) r6:df49e808 r5:00000000 r4:00000000 [<c0378c20>] (mutex_lock+0x0/0x24) from [<c013bb2c>] (_fat_bmap+0x28/0x68) [<c013bb04>] (_fat_bmap+0x0/0x68) from [<c00af910>] (bmap+0x2c/0x40) r6:0005ffff r5:00000000 r4:00000000 [<c00af8e4>] (bmap+0x0/0x40) from [<c0094b84>] (sys_swapon+0x630/0xcbc) r5:c0541510 r4:00300000 [<c0094554>] (sys_swapon+0x0/0xcbc) from [<c0029960>] (ret_fast_syscall+0x0/0x2c) Looking around in the kernel sources it looks like the inode mutex is taken both by mm/swapfile.c and by fs/fat/inode.c, the latter one since this patch from november 2008: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fa93ca18a8b0da4e26bd9491ad144cd14d22f8ec When I use a loopback device on the very same file the swapon call works so there's an easy workaround for this issue. Sincerely, Laurent http://guerby.org/blog ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [2.6.29-rc5][BUG] swapon on vfat file gets stuck on inode lock 2009-03-11 13:57 [2.6.29-rc5][BUG] swapon on vfat file gets stuck on inode lock Laurent GUERBY @ 2009-03-11 15:53 ` Linus Torvalds 2009-03-11 16:45 ` Laurent GUERBY ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Linus Torvalds @ 2009-03-11 15:53 UTC (permalink / raw) To: Laurent GUERBY; +Cc: linux-kernel, OGAWA Hirofumi On Wed, 11 Mar 2009, Laurent GUERBY wrote: > > With 2.6.29-rc5 (on an ARM platform but I don't think it's significant) > when I try to enable swap on a file which is on an USB mounted vfat > partition the swapon syscall gets stuck: Oh wow. Are you _sure_ you want to do that? That's crazy. But: > swapon D c03780d4 0 22361 1 > [<c0377db0>] (schedule+0x0/0x3ac) from [<c0378e50>] (__mutex_lock_slowpath+0x94/0xf4) > [<c0378dbc>] (__mutex_lock_slowpath+0x0/0xf4) from [<c0378c40>] (mutex_lock+0x20/0x24) r6:df49e808 r5:00000000 r4:00000000 > [<c0378c20>] (mutex_lock+0x0/0x24) from [<c013bb2c>] (_fat_bmap+0x28/0x68) > [<c013bb04>] (_fat_bmap+0x0/0x68) from [<c00af910>] (bmap+0x2c/0x40) r6:0005ffff r5:00000000 r4:00000000 > [<c00af8e4>] (bmap+0x0/0x40) from [<c0094b84>] (sys_swapon+0x630/0xcbc) r5:c0541510 r4:00300000 > [<c0094554>] (sys_swapon+0x0/0xcbc) from [<c0029960>] (ret_fast_syscall+0x0/0x2c) > > Looking around in the kernel sources it looks like the inode mutex is > taken both by mm/swapfile.c and by fs/fat/inode.c, the latter one since > this patch from november 2008: > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fa93ca18a8b0da4e26bd9491ad144cd14d22f8ec Yes, clearly there's a deadlock there which was hidden before. And FAT technically does the locking right, so that bmap() doesn't race with somebody changing the file. That said, other filesystems don't have this problem, simply because they just ignore the race, knowing that bmap is inherently racy in that situation _anyway_ (ie the value we return is clearly going to race _after_ we release the lock even if we do the lookup with the lock held!). So the right thing to do would appear to be to just remove the silly locking in fat_bmap. It's not helping, and it's clearly hurting your (crazy) case. In the _normal_ paths (ie a regular read/write) we handle locking on a per-page basis anyway. I dunno. No other filesystem has _any_ locking in their bmap that I can see, so I strongly suspect that fat doesn't need it either. IOW, I'm almost 100% sure that the right fix is this trivial one, but I'd like somebody else to double-check my thinking. Linus --- fs/fat/inode.c | 9 +-------- 1 files changed, 1 insertions(+), 8 deletions(-) diff --git a/fs/fat/inode.c b/fs/fat/inode.c index 6b74d09..2fcb269 100644 --- a/fs/fat/inode.c +++ b/fs/fat/inode.c @@ -199,14 +199,7 @@ static ssize_t fat_direct_IO(int rw, struct kiocb *iocb, static sector_t _fat_bmap(struct address_space *mapping, sector_t block) { - sector_t blocknr; - - /* fat_get_cluster() assumes the requested blocknr isn't truncated. */ - mutex_lock(&mapping->host->i_mutex); - blocknr = generic_block_bmap(mapping, block, fat_get_block); - mutex_unlock(&mapping->host->i_mutex); - - return blocknr; + return generic_block_bmap(mapping, block, fat_get_block); } static const struct address_space_operations fat_aops = { ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [2.6.29-rc5][BUG] swapon on vfat file gets stuck on inode lock 2009-03-11 15:53 ` Linus Torvalds @ 2009-03-11 16:45 ` Laurent GUERBY 2009-03-11 17:03 ` OGAWA Hirofumi 2009-03-16 8:34 ` Christoph Hellwig 2 siblings, 0 replies; 7+ messages in thread From: Laurent GUERBY @ 2009-03-11 16:45 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel, OGAWA Hirofumi On Wed, 2009-03-11 at 08:53 -0700, Linus Torvalds wrote: > On Wed, 11 Mar 2009, Laurent GUERBY wrote: > > With 2.6.29-rc5 (on an ARM platform but I don't think it's significant) > > when I try to enable swap on a file which is on an USB mounted vfat > > partition the swapon syscall gets stuck: > > Oh wow. Are you _sure_ you want to do that? That's crazy. I was testing a device with no swap (root over NFS) and GCC bootstrap failed because of lack of RAM. So I first tried to add swap on file over NFS, judging from various warnings on the net and that the system froze after about 4 hours it was in the "crazy" category. After that I grabbed an USB HDD which happened to be vfat formated, hence second freeze but on swapon process this time. I promise next time I'll stick to local ext3 on x86 :). Laurent ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [2.6.29-rc5][BUG] swapon on vfat file gets stuck on inode lock 2009-03-11 15:53 ` Linus Torvalds 2009-03-11 16:45 ` Laurent GUERBY @ 2009-03-11 17:03 ` OGAWA Hirofumi 2009-03-11 18:13 ` OGAWA Hirofumi 2009-03-16 8:34 ` Christoph Hellwig 2 siblings, 1 reply; 7+ messages in thread From: OGAWA Hirofumi @ 2009-03-11 17:03 UTC (permalink / raw) To: Linus Torvalds; +Cc: Laurent GUERBY, linux-kernel Linus Torvalds <torvalds@linux-foundation.org> writes: > Yes, clearly there's a deadlock there which was hidden before. And FAT > technically does the locking right, so that bmap() doesn't race with > somebody changing the file. > > That said, other filesystems don't have this problem, simply because they > just ignore the race, knowing that bmap is inherently racy in that > situation _anyway_ (ie the value we return is clearly going to race > _after_ we release the lock even if we do the lookup with the lock held!). > > So the right thing to do would appear to be to just remove the silly > locking in fat_bmap. It's not helping, and it's clearly hurting your > (crazy) case. In the _normal_ paths (ie a regular read/write) we handle > locking on a per-page basis anyway. > > I dunno. No other filesystem has _any_ locking in their bmap that I can > see, so I strongly suspect that fat doesn't need it either. > > IOW, I'm almost 100% sure that the right fix is this trivial one, but I'd > like somebody else to double-check my thinking. I'm sure that path touch the metadata without locking (so, reused entry can not be for that inode anymore). However, I guess the result doesn't become any fs corruption, so and other fs is ignoring the possibly wrong result of bmap(). I'm thinking to use this patch instead of removing. [PATCH] Fix _fat_bmap() locking On swapon() path, it has already i_mutex. So, this uses i_alloc_sem instead of it. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> --- fs/fat/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff -puN fs/fat/inode.c~fat_bmap-locking-fix fs/fat/inode.c --- linux-2.6/fs/fat/inode.c~fat_bmap-locking-fix 2009-03-12 00:47:15.000000000 +0900 +++ linux-2.6-hirofumi/fs/fat/inode.c 2009-03-12 00:47:42.000000000 +0900 @@ -202,9 +202,9 @@ static sector_t _fat_bmap(struct address sector_t blocknr; /* fat_get_cluster() assumes the requested blocknr isn't truncated. */ - mutex_lock(&mapping->host->i_mutex); + down_read(&mapping->host->i_alloc_sem); blocknr = generic_block_bmap(mapping, block, fat_get_block); - mutex_unlock(&mapping->host->i_mutex); + up_read(&mapping->host->i_alloc_sem); return blocknr; } _ -- OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [2.6.29-rc5][BUG] swapon on vfat file gets stuck on inode lock 2009-03-11 17:03 ` OGAWA Hirofumi @ 2009-03-11 18:13 ` OGAWA Hirofumi 2009-03-11 18:19 ` Linus Torvalds 0 siblings, 1 reply; 7+ messages in thread From: OGAWA Hirofumi @ 2009-03-11 18:13 UTC (permalink / raw) To: Linus Torvalds; +Cc: Laurent GUERBY, linux-kernel OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> writes: > I'm sure that path touch the metadata without locking (so, reused entry > can not be for that inode anymore). However, I guess the result doesn't > become any fs corruption, so and other fs is ignoring the possibly wrong > result of bmap(). > > I'm thinking to use this patch instead of removing. Ok. Simple stress test was done. BTW, since fat caches the metadata on that path, fat would be sensitive to race more than other fs. Anyway, if you are ok, please apply. Otherwise, I'm going to prepare for this patch to next merge window. Thanks. > [PATCH] Fix _fat_bmap() locking > > On swapon() path, it has already i_mutex. So, this uses i_alloc_sem > instead of it. > > Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> > --- > > fs/fat/inode.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff -puN fs/fat/inode.c~fat_bmap-locking-fix fs/fat/inode.c > --- linux-2.6/fs/fat/inode.c~fat_bmap-locking-fix 2009-03-12 00:47:15.000000000 +0900 > +++ linux-2.6-hirofumi/fs/fat/inode.c 2009-03-12 00:47:42.000000000 +0900 > @@ -202,9 +202,9 @@ static sector_t _fat_bmap(struct address > sector_t blocknr; > > /* fat_get_cluster() assumes the requested blocknr isn't truncated. */ > - mutex_lock(&mapping->host->i_mutex); > + down_read(&mapping->host->i_alloc_sem); > blocknr = generic_block_bmap(mapping, block, fat_get_block); > - mutex_unlock(&mapping->host->i_mutex); > + up_read(&mapping->host->i_alloc_sem); > > return blocknr; > } > _ -- OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [2.6.29-rc5][BUG] swapon on vfat file gets stuck on inode lock 2009-03-11 18:13 ` OGAWA Hirofumi @ 2009-03-11 18:19 ` Linus Torvalds 0 siblings, 0 replies; 7+ messages in thread From: Linus Torvalds @ 2009-03-11 18:19 UTC (permalink / raw) To: OGAWA Hirofumi; +Cc: Laurent GUERBY, linux-kernel On Thu, 12 Mar 2009, OGAWA Hirofumi wrote: > > Ok. Simple stress test was done. BTW, since fat caches the metadata on > that path, fat would be sensitive to race more than other fs. > > Anyway, if you are ok, please apply. Otherwise, I'm going to prepare for > this patch to next merge window. It looks good to me, and obviously safer than my version. Will apply. Thanks, Linus ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [2.6.29-rc5][BUG] swapon on vfat file gets stuck on inode lock 2009-03-11 15:53 ` Linus Torvalds 2009-03-11 16:45 ` Laurent GUERBY 2009-03-11 17:03 ` OGAWA Hirofumi @ 2009-03-16 8:34 ` Christoph Hellwig 2 siblings, 0 replies; 7+ messages in thread From: Christoph Hellwig @ 2009-03-16 8:34 UTC (permalink / raw) To: Linus Torvalds; +Cc: Laurent GUERBY, linux-kernel, OGAWA Hirofumi On Wed, Mar 11, 2009 at 08:53:14AM -0700, Linus Torvalds wrote: > I dunno. No other filesystem has _any_ locking in their bmap that I can > see, so I strongly suspect that fat doesn't need it either. XFS has proper locking using it's own locks. Then again Peter had patchset somewhere to implement proper aops for swap instead of abusing -bmap to implement swap over NFS, which should be able to fix this for fat, too. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-03-16 8:34 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-03-11 13:57 [2.6.29-rc5][BUG] swapon on vfat file gets stuck on inode lock Laurent GUERBY 2009-03-11 15:53 ` Linus Torvalds 2009-03-11 16:45 ` Laurent GUERBY 2009-03-11 17:03 ` OGAWA Hirofumi 2009-03-11 18:13 ` OGAWA Hirofumi 2009-03-11 18:19 ` Linus Torvalds 2009-03-16 8:34 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox