From: Joseph Qi <joseph.qi@linux.alibaba.com>
To: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org, ocfs2-devel@oss.oracle.com
Subject: Re: [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix data corruption on truncate
Date: Tue, 2 Nov 2021 10:36:42 +0800
Message-ID: <beb87697-e851-9681-bcc4-0669eb210703@linux.alibaba.com>
In-Reply-To: <20211101113100.GA18487@quack2.suse.cz>
On 11/1/21 7:31 PM, Jan Kara wrote:
> On Thu 28-10-21 15:09:08, Joseph Qi wrote:
>> Hi Jan,
>>
>> On 10/25/21 11:13 PM, Jan Kara wrote:
>>> ocfs2_truncate_file() did unmap and invalidate page cache pages before
>>> zeroing partial tail cluster and setting i_size. Thus some pages could
>>> be left (and likely were left if the cluster zeroing happened) in the
>>> page cache beyond i_size after truncate finished, letting user possibly
>>> see stale data once the file was extended again. Also the tail cluster
>>
>> I don't quite understand the case.
>> truncate_inode_pages() will truncate pages from new_i_size to i_size,
>> and the following ocfs2_orphan_for_truncate() will zero range and then
>> update i_size for inode as well as dinode.
>> So once truncate has finished, how does stale data get exposed? Or do
>> you mean a race between the above two steps?
>
> Sorry, I was not quite accurate in the above paragraph. There are several
> ways in which stale pages in the pagecache can cause problems.
>
> 1) Because i_size is reduced after truncating page cache, a page fault can
> happen after truncating page cache and zeroing pages but before reducing
> i_size. This will allow the user to arbitrarily modify pages that are used
> for writing zeroes into the cluster tail, and after file extension this
> data will become visible.
>
> 2) The tail cluster zeroing in ocfs2_orphan_for_truncate() can actually try
> to write zeroed pages above i_size (e.g. if we have 4k blocksize, 64k
> clustersize, and do truncate(f, 4k) on a 4k file). This will cause exactly
> the same problems as already described in commit 5314454ea3f "ocfs2: fix
> data corruption after conversion from inline format".
>
> Hope it is clearer now.
>
So the core reason is that ocfs2_zero_range_for_truncate() grabs pages and
then zeroes them, right?
I think an alternative way is using zeroout instead of zeroing pages, which
won't grab pages again.
Anyway, I'm also fine with your way since it is simple.
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>