From mboxrd@z Thu Jan 1 00:00:00 1970 From: Felix Blyakher Subject: Re: [PATCH/RFC] pass dio_complete proper offset from finished_one_bio Date: Sun, 26 Nov 2006 10:18:29 -0600 Message-ID: <4569BE55.3020806@sgi.com> References: <45691753.60500@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.org, xfs@oss.sgi.com, rtc@gmx.de Return-path: Received: from omx2-ext.sgi.com ([192.48.171.19]:52162 "EHLO omx2.sgi.com") by vger.kernel.org with ESMTP id S935428AbWKZQR7 (ORCPT ); Sun, 26 Nov 2006 11:17:59 -0500 To: Eric Sandeen In-Reply-To: <45691753.60500@redhat.com> Sender: linux-fsdevel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org Good find, Eric. Looks good to me. I guess, that can explain some problems I saw on SLES10, and never had time to dig in it. Seems to me, it's been that way for quite a while, though. This code was introduced in 2.6.12. Prior to that, the code in finished_one_bio() did following: dio_complete(dio, dio->block_in_file << dio->blkbits, dio->result); where block_in_file is "current offset into the underlying file in dio_block units", which is still the post-IO position, not the IO start offset, isn't it? - Felix Eric Sandeen wrote: > We saw problems w/ xfs doing AIO+DIO into a sparse file. > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217098 > > It seemed that xfs was doing "extent conversion" at the wrong offsets, so > written regions came up as unwritten (zeros) and stale data was exposed > in the region after the write. Thanks to Peter Backes for the very > nice testcase. > > This also broke xen with blktap over xfs. > > Here's what I found: > finished_one_bio calls dio_complete calls xfs_end_io_direct with an > offset, but: > > offset = dio->iocb->ki_pos; > > so this is the -current- post-IO position, not the IO start point that > dio_complete expects. So, xfs converts the the wrong region. Ouch! > > XFS seems to be the only filesystem that uses the offset passed to the > end_io function, so only it is affected by this as near as I can tell. > > However, the "short read" case is probably also wrong, as it is checking: > > if ((dio->rw == READ) && > ((offset + transferred) > dio->i_size)) > transferred = dio->i_size - offset; > > but offset is the ending IO position in this case too. > > This patch seems to fix it up. > > Comments? > > Thanks, > -Eric > > Signed-off-by: Eric Sandeen > > Index: linux-2.6.18/fs/direct-io.c > =================================================================== > --- linux-2.6.18.orig/fs/direct-io.c > +++ linux-2.6.18/fs/direct-io.c > @@ -244,12 +244,12 @@ static void finished_one_bio(struct dio > */ > spin_unlock_irqrestore(&dio->bio_lock, flags); > > - /* Check for short read case */ > transferred = dio->result; > - offset = dio->iocb->ki_pos; > + offset = dio->iocb->ki_pos - transferred; > > + /* Check for short read case */ > if ((dio->rw == READ) && > - ((offset + transferred) > dio->i_size)) > + (dio->iocb->ki_pos > dio->i_size)) > transferred = dio->i_size - offset; > > /* check for error in completion path */ > > >