From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kara Subject: Re: [PATCH 1/2] [PATCH 1/2] direct-io: implement generic deferred AIO completions Date: Thu, 6 Dec 2012 20:05:37 +0100 Message-ID: <20121206190537.GD21029@quack.suse.cz> References: <20121123075502.307482760@bombadil.infradead.org> <20121123075916.182694435@bombadil.infradead.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: linux-fsdevel@vger.kernel.org, Jan Kara , Jeff Moyer , "Darrick J. Wong" To: Christoph Hellwig Return-path: Received: from cantor2.suse.de ([195.135.220.15]:43721 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1424515Ab2LFTFk (ORCPT ); Thu, 6 Dec 2012 14:05:40 -0500 Content-Disposition: inline In-Reply-To: <20121123075916.182694435@bombadil.infradead.org> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Fri 23-11-12 02:55:03, Christoph Hellwig wrote: > Add support to the core direct-io code to defer AIO completions to user > context using a workqueue. This replaces opencoded and less efficient > code in XFS and ext4 and will be needed to properly support O_(D)SYNC > for AIO. > > The communication between the filesystem and the direct I/O code requires > a new buffer head flag, which is a bit ugly but not avoidable until the > direct I/O code stops abusing the buffer_head structure for communicating > with the filesystems. > > Currently this creates a per-superblock unbnound workqueue for these > completions, which is taken from an earlier patch by Jan Kara. I'm > not really convince about this use and would prefer a "normal" global > workqueue with a high concurrently limit, but this needs further discussion. > > Signed-off-by: Christoph Hellwig I like this, but this patch already breaks ext4, doesn't it? > Index: linux-2.6/fs/ext4/inode.c > =================================================================== > --- linux-2.6.orig/fs/ext4/inode.c 2012-11-06 13:33:43.483326794 +0100 > +++ linux-2.6/fs/ext4/inode.c 2012-11-21 21:19:35.227136042 +0100 > @@ -2876,15 +2876,13 @@ static int ext4_get_block_write_nolock(s > } > > static void ext4_end_io_dio(struct kiocb *iocb, loff_t offset, > - ssize_t size, void *private, int ret, > - bool is_async) > + ssize_t size, void *private) > { > - struct inode *inode = iocb->ki_filp->f_path.dentry->d_inode; > ext4_io_end_t *io_end = iocb->private; > > /* if not async direct IO or dio with 0 bytes write, just return */ > if (!io_end || !size) > - goto out; > + return; > > ext_debug("ext4_end_io_dio(): io_end 0x%p " > "for inode %lu, iocb 0x%p, offset %llu, size %zd\n", > @@ -2896,19 +2894,11 @@ static void ext4_end_io_dio(struct kiocb > /* if not aio dio with unwritten extents, just free io and return */ > if (!(io_end->flag & EXT4_IO_END_UNWRITTEN)) { > ext4_free_io_end(io_end); > -out: > - if (is_async) > - aio_complete(iocb, ret, 0); > - inode_dio_done(inode); > return; > } > > io_end->offset = offset; > io_end->size = size; > - if (is_async) { > - io_end->iocb = iocb; > - io_end->result = ret; > - } > > ext4_add_complete_io(io_end); ^^^ Here ext4 offloads IO completion to a worker thread. So you now complete AIO / DIO before ext4_end_io() runs which is a bug (ext4_end_io() is responsible for example for calling end_page_writeback()). I'll modify these patches to work for ext4 tomorrow I hope... Honza -- Jan Kara SUSE Labs, CR