From: Felix Blyakher <felixb@sgi.com>
To: Eric Sandeen <sandeen@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.org,
xfs@oss.sgi.com, rtc@gmx.de
Subject: Re: [PATCH/RFC] pass dio_complete proper offset from finished_one_bio
Date: Sun, 26 Nov 2006 10:18:29 -0600 [thread overview]
Message-ID: <4569BE55.3020806@sgi.com> (raw)
In-Reply-To: <45691753.60500@redhat.com>
Good find, Eric.
Looks good to me. I guess, that can explain some problems I saw
on SLES10, and never had time to dig in it.
Seems to me, it's been that way for quite a while, though. This code was
introduced in 2.6.12. Prior to that, the code in finished_one_bio() did
following:
dio_complete(dio, dio->block_in_file <<
dio->blkbits,
dio->result);
where block_in_file is "current offset into the underlying file in
dio_block units", which is still the post-IO position, not the IO start
offset, isn't it?
- Felix
Eric Sandeen wrote:
> We saw problems w/ xfs doing AIO+DIO into a sparse file.
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217098
>
> It seemed that xfs was doing "extent conversion" at the wrong offsets, so
> written regions came up as unwritten (zeros) and stale data was exposed
> in the region after the write. Thanks to Peter Backes for the very
> nice testcase.
>
> This also broke xen with blktap over xfs.
>
> Here's what I found:
> finished_one_bio calls dio_complete calls xfs_end_io_direct with an
> offset, but:
>
> offset = dio->iocb->ki_pos;
>
> so this is the -current- post-IO position, not the IO start point that
> dio_complete expects. So, xfs converts the the wrong region. Ouch!
>
> XFS seems to be the only filesystem that uses the offset passed to the
> end_io function, so only it is affected by this as near as I can tell.
>
> However, the "short read" case is probably also wrong, as it is checking:
>
> if ((dio->rw == READ) &&
> ((offset + transferred) > dio->i_size))
> transferred = dio->i_size - offset;
>
> but offset is the ending IO position in this case too.
>
> This patch seems to fix it up.
>
> Comments?
>
> Thanks,
> -Eric
>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>
> Index: linux-2.6.18/fs/direct-io.c
> ===================================================================
> --- linux-2.6.18.orig/fs/direct-io.c
> +++ linux-2.6.18/fs/direct-io.c
> @@ -244,12 +244,12 @@ static void finished_one_bio(struct dio
> */
> spin_unlock_irqrestore(&dio->bio_lock, flags);
>
> - /* Check for short read case */
> transferred = dio->result;
> - offset = dio->iocb->ki_pos;
> + offset = dio->iocb->ki_pos - transferred;
>
> + /* Check for short read case */
> if ((dio->rw == READ) &&
> - ((offset + transferred) > dio->i_size))
> + (dio->iocb->ki_pos > dio->i_size))
> transferred = dio->i_size - offset;
>
> /* check for error in completion path */
>
>
>
next prev parent reply other threads:[~2006-11-26 16:17 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-11-26 4:25 [PATCH/RFC] pass dio_complete proper offset from finished_one_bio Eric Sandeen
2006-11-26 16:18 ` Felix Blyakher [this message]
2006-11-28 14:09 ` Eric Sandeen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4569BE55.3020806@sgi.com \
--to=felixb@sgi.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.org \
--cc=rtc@gmx.de \
--cc=sandeen@redhat.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).