From: Dave Chinner <david@fromorbit.com>
To: Michael Monnerie <michael.monnerie@is.it-management.at>
Cc: Christoph Hellwig <hch@infradead.org>,
Jeff Liu <jeff.liu@oracle.com>,
xfs@oss.sgi.com
Subject: Re: SEEK_DATA/SEEK_HOLE support
Date: Wed, 5 Oct 2011 20:36:15 +1100 [thread overview]
Message-ID: <20111005093615.GQ3159@dastard> (raw)
In-Reply-To: <201110050934.28021@zmi.at>
On Wed, Oct 05, 2011 at 09:34:26AM +0200, Michael Monnerie wrote:
> On Mittwoch, 5. Oktober 2011 Dave Chinner wrote:
> > That will only work if you can prevent concurrent unwritten extent
> > conversion from happening while we do the separate tag lookups on
> > the range because it requires two radix tree tag lookups rather than
> > just one index lookup. i.e. miss the dirty page because it went
> > dirty->writeback during the dirty tag search, and miss the same page
> > when doing the writeback lookup because it went writeback->clean
> > very quickly due to IO completion.
> >
> > So to stop that from happening, it requires that filesystems can
> > exclude unwritten extent conversion from happening while a
> > SEEK_HOLE/SEEK_DATA operation is in progress, and that the
> > filesystem can safely do mapping tree lookups while providing that
> > extent tree exclusion. I know that XFS has no problems here, but
> > filesystems that use i_mutex for everything might be in trouble.
> >
> > Besides, if we just look for pages in the cache over unwritten
> > extents (i.e. someone has treated it as data already), then it can
> > be done locklessly without having to worry about page state changes
> > occurring concurrently...
>
> I'd like to understand why it's important to care about locking here. As
> I understand it, SEEK_* is used for example to copy a file efficiently.
> If that is performed on a file that is currently being written to, the
> resulting copy will probably be bogus anyway, even without SEEK_* usage.
The problem is that copying a file with dirty data hot in the cache
concurrently when writeback to that file is occurring. The
application is not modifying the file, and the state changes that
could lead SEEK_DATA missing data are driven entirely by background
kernel threads. The application leaves a file in this state:
dirty page
+------D--------+
unwritten extent
SEEK_DATA must stop at that dirty page over the unwritten extent.
If we do a dirty page lookup, but the flusher thread has started IO
on it, it will have transitioned to this state:
writeback page
+------W--------+
unwritten extent
And the dirty page lookup will not find it. If we allow unwritten
extent conversion to also run while we are doing the writeback page
cache lookup, then IO completion and conversion can occur before we
find the writeback page in the cache. i.e we see this state:
clean page
+------C--------+
unwritten extent
And we don't find the data page over the unwritten extent at all
because we haven't looked up clean pages in the cache.
At that point, our SEEK_DATA call has skipped over a valid data
page, the copy utility fails to copy the data over the unwritten
extent, thereby silently creating a corrupt copy....
> There might be a case where it is important, but I can't see that atm.
It's a data corruption problem, pure and simple....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2011-10-05 9:36 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-10-02 15:04 SEEK_DATA/SEEK_HOLE support Jeff Liu
2011-10-02 15:42 ` Christoph Hellwig
2011-10-02 16:06 ` Jeff Liu
2011-10-02 17:59 ` Christoph Hellwig
2011-10-02 19:11 ` Andi Kleen
2011-10-03 4:04 ` Jeff Liu
2011-10-03 23:43 ` Dave Chinner
2011-10-04 13:02 ` Christoph Hellwig
2011-10-05 4:36 ` Dave Chinner
2011-10-05 5:32 ` Jeff Liu
2011-10-05 9:23 ` Dave Chinner
2011-10-05 13:56 ` Jeff Liu
2011-10-05 7:34 ` Michael Monnerie
2011-10-05 9:36 ` Dave Chinner [this message]
2011-10-05 18:22 ` Michael Monnerie
2011-10-06 0:32 ` Dave Chinner
2011-11-14 10:24 ` Christoph Hellwig
2011-11-14 12:47 ` Jeff Liu
2011-11-14 12:50 ` Christoph Hellwig
2011-11-19 8:29 ` XFS SEEK_DATA/SEEK_HOLE support V1 Jeff Liu
2011-11-19 8:34 ` Jeff Liu
2011-11-19 8:37 ` [PATCH] Introduce SEEK_DATA/SEEK_HOLE support to XFS V1 Jeff Liu
2011-11-19 19:11 ` Christoph Hellwig
2011-11-20 13:15 ` Jeff Liu
2011-11-20 0:30 ` Dave Chinner
2011-11-20 13:59 ` Jeff Liu
2011-11-20 15:30 ` Christoph Hellwig
2011-11-20 22:34 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111005093615.GQ3159@dastard \
--to=david@fromorbit.com \
--cc=hch@infradead.org \
--cc=jeff.liu@oracle.com \
--cc=michael.monnerie@is.it-management.at \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox