Date: Mon, 7 Apr 2008 10:28:41 +0200
From: Jens Axboe
To: Jörn Engel
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mtd@lists.infradead.org, Nick Piggin, David Woodhouse
Subject: Re: [patch 0/15] LogFS take five
Message-ID: <20080407082841.GL12774@kernel.dk>
References: <20080401181308.512473173@logfs.org> <20080404114600.GD29686@kernel.dk> <20080407082235.GB22431@logfs.org>
In-Reply-To: <20080407082235.GB22431@logfs.org>

On Mon, Apr 07 2008, Jörn Engel wrote:
> On Fri, 4 April 2008 13:46:00 +0200, Jens Axboe wrote:
> > On Tue, Apr 01 2008, joern@logfs.org wrote:
> > > And it is currently reasonably simple to run into a deadlock when
> > > using logfs on a block device.  The problem appears to be the block
> > > layer allocating memory for its cache without GFP_NOFS, so that under
> > > memory pressure logfs writes through the block layer may recurse back
> > > to logfs writes.
> >
> > So you mean for writes through the page cache, you are seeing pages
> > allocated with __GFP_FS set?
>
> It sure looks like it.  On top of that, the patch at the bottom seems to
> solve the deadlock.  I'm just not certain it is the right fix for the
> problem.
>
> > > Not entirely sure who is to blame for this bug and how to
> > > solve it.
> >
> > A good starting point would be doing a stack trace dump in logfs if you
> > see such back recursion into the fs.  A quick guess would be a missing
> > setting of the mapping gfp mask?
>
> Sorry, should have sent that right along.
>
> [] elv_insert+0x156/0x219
> [] __mutex_lock_slowpath+0x57/0x81
> [] mutex_lock+0xd/0xf
> [] logfs_get_wblocks+0x33/0x54
> [] logfs_write_buf+0x3d/0x322
> [] __logfs_writepage+0x24/0x67
> [] logfs_writepage+0xd8/0xe3
> [] shrink_page_list+0x2ee/0x514
> [] isolate_lru_pages+0x6c/0x1ff
> [] shrink_zone+0x60b/0x85b
> [] generic_make_request+0x329/0x364
> [] mempool_alloc_slab+0x11/0x13
> [] up_read+0x9/0xb
> [] shrink_slab+0x13f/0x151
> [] try_to_free_pages+0x111/0x209
> [] __alloc_pages+0x1b1/0x2f5
> [] read_cache_page_async+0x7e/0x15c
> [] blkdev_readpage+0x0/0x15
> [] read_cache_page+0xe/0x46
> [] bdev_read+0x61/0xee
> [] __logfs_gc_pass+0x219/0x7dc
> [] logfs_gc_pass+0x17/0x19
> [] logfs_flush_dirty+0x7d/0x99
> [] logfs_get_wblocks+0x4c/0x54
> [] logfs_write_buf+0x3d/0x322
> [] logfs_commit_write+0x77/0x7d
> [] generic_file_buffered_write+0x49d/0x62c
> [] file_update_time+0x7f/0xad
> [] __generic_file_aio_write_nolock+0x354/0x3be
> [] atomic_notifier_call_chain+0xf/0x11
> [] filemap_fault+0x1b4/0x320
> [] generic_file_aio_write+0x64/0xc0
> [] do_sync_write+0xe2/0x126
> [] release_console_sem+0x1a0/0x1a9
> [] autoremove_wake_function+0x0/0x38
> [] tty_write+0x1f2/0x20d
> [] write_chan+0x0/0x334
> [] vfs_write+0xae/0x137
> [] sys_write+0x47/0x6f
> [] ia32_sysret+0x0/0xa
>
> Jörn
>
> --
> Joern's library part 10:
> http://blogs.msdn.com/David_Gristwood/archive/2004/06/24/164849.aspx
>
> Signed-off-by: Joern Engel
>
>  fs/block_dev.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- linux-2.6.24logfs/fs/block_dev.c~blockdev_nofs	2008-04-07 10:19:08.627413077 +0200
> +++ linux-2.6.24logfs/fs/block_dev.c	2008-04-07 10:20:56.927117162 +0200
> @@ -586,7 +586,7 @@ struct block_device *bdget(dev_t dev)
>  		inode->i_rdev = dev;
>  		inode->i_bdev = bdev;
>  		inode->i_data.a_ops = &def_blk_aops;
> -		mapping_set_gfp_mask(&inode->i_data, GFP_USER);
> +		mapping_set_gfp_mask(&inode->i_data, GFP_USER & ~__GFP_FS);
>  		inode->i_data.backing_dev_info = &default_backing_dev_info;
>  		spin_lock(&bdev_lock);
>  		list_add(&bdev->bd_list, &all_bdevs);

It's not the right fix; generally GFP_FS is fine here.  So do that
locally in logfs, where you cannot traverse back into the fs, e.g.
mapping_gfp_mask(mapping) & ~__GFP_FS.

--
Jens Axboe
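
[Editor's note: a minimal sketch of the local fix Jens suggests, assuming
logfs clears __GFP_FS on the block device mapping it performs medium I/O
through.  The helper name logfs_setup_bdev_mapping() is hypothetical;
mapping_gfp_mask() and mapping_set_gfp_mask() are the real
<linux/pagemap.h> accessors.]

	#include <linux/fs.h>
	#include <linux/pagemap.h>

	/*
	 * Clear __GFP_FS on the mapping logfs reads the medium through,
	 * so page cache allocations in bdev_read() cannot enter
	 * filesystem reclaim and recurse into logfs_writepage() while
	 * logfs_get_wblocks() holds the write mutex.
	 */
	static void logfs_setup_bdev_mapping(struct block_device *bdev)
	{
		struct address_space *mapping = bdev->bd_inode->i_mapping;

		mapping_set_gfp_mask(mapping,
				     mapping_gfp_mask(mapping) & ~__GFP_FS);
	}

[Called once at mount time, this confines the stricter mask to the device
logfs sits on, rather than dropping __GFP_FS for every block device as
the bdget() patch above does.]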