linux-mm.kvack.org archive mirror
From: Liu Bo <obuil.liubo@gmail.com>
To: Chris Mason <clm@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	bugzilla-daemon@bugzilla.kernel.org,
	bugzilla.kernel.org@plan9.de, linux-btrfs@vger.kernel.org,
	linux-mm@kvack.org, Jan Kara <jack@suse.cz>
Subject: Re: [Bug 199931] New: systemd/rtorrent file data corruption when using echo 3 >/proc/sys/vm/drop_caches
Date: Wed, 6 Jun 2018 21:55:16 +0800
Message-ID: <CANQeFDAe5LOaTief5KBxtWb9ewJmRjmHvSnkdY_LrCX7-1rdkw@mail.gmail.com>
In-Reply-To: <0909E1D8-D024-4667-A0E8-C1CF40E77683@fb.com>

On Wed, Jun 6, 2018 at 9:44 PM, Chris Mason <clm@fb.com> wrote:
> On 6 Jun 2018, at 9:38, Liu Bo wrote:
>
>> On Wed, Jun 6, 2018 at 8:18 AM, Chris Mason <clm@fb.com> wrote:
>>>
>>>
>>>
>>> On 5 Jun 2018, at 16:03, Andrew Morton wrote:
>>>
>>>> (switched to email.  Please respond via emailed reply-to-all, not via
>>>> the bugzilla web interface).
>>>>
>>>> On Tue, 05 Jun 2018 18:01:36 +0000 bugzilla-daemon@bugzilla.kernel.org
>>>> wrote:
>>>>
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=199931
>>>>>
>>>>>             Bug ID: 199931
>>>>>            Summary: systemd/rtorrent file data corruption when using
>>>>>                     echo 3 >/proc/sys/vm/drop_caches
>>>>
>>>>
>>>>
>>>> A long tale of woe here.  Chris, do you think the pagecache corruption
>>>> is a general thing, or is it possible that btrfs is contributing?
>>>>
>>>> Also, that 4.4 oom-killer regression sounds very serious.
>>>
>>>
>>>
>>> This week I found a bug in btrfs file write with how we handle stable
>>> pages.
>>> Basically it works like this:
>>>
>>> write(fd, some bytes less than a page)
>>> write(fd, some bytes into the same page)
>>>     btrfs prefaults the userland page
>>>     lock_and_cleanup_extent_if_need()   <- stable pages
>>>                 wait for writeback()
>>>                 clear_page_dirty_for_io()
>>>
>>> At this point we have a page that was dirty and is now clean.  That's
>>> normally fine, unless our prefaulted page isn't in ram anymore.
>>>
>>>         iov_iter_copy_from_user_atomic() <--- uh oh
>>>
>>> If the copy_from_user fails, we drop all our locks and retry.  But along
>>> the way, we completely lost the dirty bit on the page.  If the page is
>>> dropped by drop_caches, the writes are lost.  We'll just read back the
>>> stale contents of that page during the retry loop.  This won't result in
>>> crc errors because the bytes we lost were never crc'd.
>>>
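
To make the sequence above concrete, here is a toy userspace model in
Python.  It is only a sketch, not btrfs code: ToyFile, Page, the fault
flag and demo_lost_write() are made-up names, and the model assumes that
drop_caches can only evict clean pages and that a never-written range
reads back as zeros.

```python
class Page:
    def __init__(self, size):
        self.data = bytearray(size)   # in-memory contents of the page
        self.dirty = False


class ToyFile:
    """One cached page in front of one page of 'disk' (hypothetical model)."""

    def __init__(self, size=8):
        self.size = size
        self.disk = bytes(size)       # never written: reads back as zeros
        self.page = None              # cached page, or None once evicted

    def _get_page(self):
        if self.page is None:         # cache miss: fill the page from disk
            self.page = Page(self.size)
            self.page.data[:] = self.disk
        return self.page

    def write(self, offset, buf, fault=False):
        page = self._get_page()
        page.dirty = False            # stands in for clear_page_dirty_for_io()
        if fault:
            return 0                  # atomic copy failed; caller must retry
        page.data[offset:offset + len(buf)] = buf
        page.dirty = True
        return len(buf)

    def drop_caches(self):
        # Only clean pages are evicted; dirty ones stay in the cache.
        if self.page is not None and not self.page.dirty:
            self.page = None

    def read(self):
        return bytes(self._get_page().data)


def demo_lost_write():
    f = ToyFile()
    f.write(0, b"AAAA")               # first short write, page is dirty
    f.write(4, b"BBBB", fault=True)   # second write faults after the clean
    f.drop_caches()                   # page looks clean, so it is evicted
    f.write(4, b"BBBB")               # retry re-reads the page from disk
    return f.read()
```

In this model demo_lost_write() returns four zero bytes followed by
b"BBBB": the first write is gone, which matches the "zeros in the file"
symptom described below.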
>>
>> So we're going to carefully redirty the page under the page lock, right?
>
>
> I don't think we actually need to clean it.  We have the page locked,
> writeback won't start until we unlock.
>

My concern is that the buggy code is similar to the compression path,
where we also use the trick of clear_page_dirty_for_io and
redirty_pages to keep any faults from wandering in and changing pages
underneath us, but it seems that here we're fine if pages get changed
in between.
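
A rough sketch of the clean-then-redirty pattern I have in mind, as a toy
Python model rather than kernel code.  Page, copy_with_redirty() and
can_evict() are made-up names, and the fault flag just stands in for the
atomic user copy failing.

```python
class Page:
    def __init__(self, data):
        self.data = bytearray(data)
        self.dirty = True             # holds data not yet written back


def copy_with_redirty(page, offset, buf, fault=False):
    """Clean the page, attempt the copy, restore the dirty bit on failure."""
    was_dirty = page.dirty
    page.dirty = False                # stands in for clear_page_dirty_for_io()
    if fault:
        page.dirty = was_dirty        # redirty before unlocking: nothing is lost
        return 0
    page.data[offset:offset + len(buf)] = buf
    page.dirty = True
    return len(buf)


def can_evict(page):
    # Assumption of the model: only clean pages may be dropped.
    return not page.dirty
```

With this shape a failed copy leaves the page dirty, so it cannot be
dropped as clean before the retry.  Not cleaning the page at all, as
suggested above, would give the same result in this model.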

>>
>>> It could result in zeros in the file because we're basically reading a
>>> hole, and those zeros could move around in the page depending on which
>>> part of the page was dirty when the writes were lost.
>>>
>>
>> I have a question: when re-reading this page, wouldn't it read
>> old/stale on-disk data?
>
>
> If it was never written we should be treating it like a hole, but I'll
> double check.
>

Okay, I think this would also happen in the overwrite case, where
stale data lies on disk.
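
The two cases side by side, as a tiny Python sketch.  reread_page() is a
made-up helper, and the model assumes a never-written range reads as
zeros while an overwritten range reads whatever last reached the disk.

```python
def reread_page(on_disk, size):
    """Return what a re-read sees once the cached copy has been dropped.

    on_disk is None for a range that was never written (a hole),
    otherwise the bytes that were last written back to disk.
    """
    if on_disk is None:
        return bytes(size)                        # hole: reads as zeros
    return bytes(on_disk[:size]).ljust(size, b"\0")  # overwrite: old data
```

So reread_page(None, 4) gives four zero bytes, while
reread_page(b"old!", 4) gives b"old!", i.e. the stale on-disk contents
rather than the lost new write.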

thanks,
liubo

Thread overview: 12+ messages
     [not found] <bug-199931-27@https.bugzilla.kernel.org/>
2018-06-05 20:03 ` [Bug 199931] New: systemd/rtorrent file data corruption when using echo 3 >/proc/sys/vm/drop_caches Andrew Morton
2018-06-05 21:22   ` Tetsuo Handa
2018-06-05 21:38     ` Andrew Morton
2018-06-05 21:52   ` james harvey
2018-06-06 19:06     ` Marc Lehmann
2018-06-06 20:33       ` james harvey
2018-06-08  7:18       ` Duncan
2018-06-06  0:18   ` Chris Mason
2018-06-06 13:38     ` Liu Bo
2018-06-06 13:44       ` Chris Mason
2018-06-06 13:55         ` Liu Bo [this message]
2018-06-06  8:45   ` Michal Hocko
