linux-fsdevel.vger.kernel.org archive mirror
From: Peng Zhang <zhangpeng362@huawei.com>
To: <linux-mm@kvack.org>, <linux-fsdevel@vger.kernel.org>,
	<netdev@vger.kernel.org>
Cc: <willy@infradead.org>, <akpm@linux-foundation.org>,
	<edumazet@google.com>, <davem@davemloft.net>,
	<dsahern@kernel.org>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<arjunroy@google.com>, <wangkefeng.wang@huawei.com>
Subject: [RFC PATCH] filemap: add mapping_mapped check in filemap_unaccount_folio()
Date: Fri, 19 Jan 2024 17:20:24 +0800
Message-ID: <20240119092024.193066-1-zhangpeng362@huawei.com>

From: ZhangPeng <zhangpeng362@huawei.com>

Recently, we discovered a syzkaller issue that triggers the
VM_BUG_ON_FOLIO() in filemap_unaccount_folio() when CONFIG_DEBUG_VM is
enabled, or a "Bad page" report when it is not.

The specific scenario is as follows:
(1) mmap: Use a TCP socket fd to create a TCP VMA.
(2) open(O_CREAT) + fallocate + sendfile: Read the ext4 file and create
the page cache. The mapping of the page cache is the ext4
inode->i_mapping. Send the ext4 page cache to the socket fd through
sendfile().
(3) getsockopt(TCP_ZEROCOPY_RECEIVE): Receive the ext4 page cache and
use vm_insert_pages() to insert it into the TCP VMA. In this case, the
mapcount changes from -1 to 0. The page cache mapping is still the ext4
inode->i_mapping, but the VMA mapping the pages is the TCP VMA, and
folio->mapping->i_mmap is empty.
(4) open(O_TRUNC): Delete the ext4 page cache. In this case, the pages
are still in the xarray of mapping->i_pages and should also be deleted.
However, folio->mapping->i_mmap is empty, so
truncate_cleanup_folio()->unmap_mapping_folio() cannot unmap them via
the i_mmap tree. In filemap_unaccount_folio(), the folio's mapcount is
0 (i.e. it is still mapped once), triggering the BUG.

Syz log that can be used to reproduce the issue:
r3 = socket$inet_tcp(0x2, 0x1, 0x0)
mmap(&(0x7f0000ff9000/0x4000)=nil, 0x4000, 0x0, 0x12, r3, 0x0)
r4 = socket$inet_tcp(0x2, 0x1, 0x0)
bind$inet(r4, &(0x7f0000000000)={0x2, 0x4e24, @multicast1}, 0x10)
connect$inet(r4, &(0x7f00000006c0)={0x2, 0x4e24, @empty}, 0x10)
r5 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
0x181e42, 0x0)
fallocate(r5, 0x0, 0x0, 0x85b8)
sendfile(r4, r5, 0x0, 0x8ba0)
getsockopt$inet_tcp_TCP_ZEROCOPY_RECEIVE(r4, 0x6, 0x23,
&(0x7f00000001c0)={&(0x7f0000ffb000/0x3000)=nil, 0x3000, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0}, &(0x7f0000000440)=0x40)
r6 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
0x181e42, 0x0)

In the current TCP zerocopy scenario, the folio is released normally
when the process exits. However, if the page cache is truncated before
the process exits, the BUG_ON or Bad page report fires, which is not
the expected behavior.
To fix this issue, add a mapping_mapped() check to
filemap_unaccount_folio(). In addition, to reduce the impact on
performance, no lock is taken when mapping_mapped() is checked.
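For context, mapping_mapped() is a lockless peek at the i_mmap
interval-tree root, which is why skipping the lock is cheap. A sketch
of the helper as it appears in include/linux/fs.h on recent kernels
(consult your tree for the exact definition):

```c
/* Tests whether any VMA is linked into this mapping's i_mmap
 * interval tree, without taking i_mmap_rwsem. */
static inline int mapping_mapped(struct address_space *mapping)
{
	return !RB_EMPTY_ROOT(&mapping->i_mmap.rb_root);
}
```

In the zerocopy case above, the pages are mapped only through the TCP
VMA, so the ext4 mapping's i_mmap tree is empty and this check returns
false, suppressing the spurious BUG.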

Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
---
 mm/filemap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index ea49677c6338..6a669eb24816 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -148,10 +148,11 @@ static void page_cache_delete(struct address_space *mapping,
 static void filemap_unaccount_folio(struct address_space *mapping,
 		struct folio *folio)
 {
+	bool mapped = folio_mapped(folio) && mapping_mapped(mapping);
 	long nr;
 
-	VM_BUG_ON_FOLIO(folio_mapped(folio), folio);
-	if (!IS_ENABLED(CONFIG_DEBUG_VM) && unlikely(folio_mapped(folio))) {
+	VM_BUG_ON_FOLIO(mapped, folio);
+	if (!IS_ENABLED(CONFIG_DEBUG_VM) && unlikely(mapped)) {
 		pr_alert("BUG: Bad page cache in process %s  pfn:%05lx\n",
 			 current->comm, folio_pfn(folio));
 		dump_page(&folio->page, "still mapped when deleted");
-- 
2.25.1


Thread overview: 13+ messages
2024-01-19  9:20 Peng Zhang [this message]
2024-01-19 13:40 ` [RFC PATCH] filemap: add mapping_mapped check in filemap_unaccount_folio() Matthew Wilcox
2024-01-20  6:46   ` zhangpeng (AS)
2024-01-22 16:04     ` SECURITY PROBLEM: Any user can crash the kernel with TCP ZEROCOPY Matthew Wilcox
2024-01-22 16:30       ` Eric Dumazet
2024-01-22 17:12         ` Matthew Wilcox
2024-01-22 17:39           ` Eric Dumazet
2024-01-24  9:30             ` zhangpeng (AS)
2024-01-24 10:11               ` Eric Dumazet
2024-01-25  2:18                 ` zhangpeng (AS)
2024-01-25  8:57                   ` Eric Dumazet
2024-01-25  9:22                     ` zhangpeng (AS)
2024-01-25 10:31                       ` Eric Dumazet
