Date: Mon, 22 Jan 2024 16:04:31 +0000
From: Matthew Wilcox
To: "zhangpeng (AS)"
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 netdev@vger.kernel.org, akpm@linux-foundation.org, edumazet@google.com,
 davem@davemloft.net, dsahern@kernel.org, kuba@kernel.org,
 pabeni@redhat.com, arjunroy@google.com, wangkefeng.wang@huawei.com
Subject: SECURITY PROBLEM: Any user can crash the kernel with TCP ZEROCOPY
References:
 <20240119092024.193066-1-zhangpeng362@huawei.com>
 <5106a58e-04da-372a-b836-9d3d0bd2507b@huawei.com>
In-Reply-To: <5106a58e-04da-372a-b836-9d3d0bd2507b@huawei.com>

I'm disappointed to have no reaction from netdev so far.  Let's see if a
more exciting subject line evinces some interest.

On Sat, Jan 20, 2024 at 02:46:49PM +0800, zhangpeng (AS) wrote:
> On 2024/1/19 21:40, Matthew Wilcox wrote:
>
> > On Fri, Jan 19, 2024 at 05:20:24PM +0800, Peng Zhang wrote:
> > > Recently, we discovered a syzkaller issue that triggers
> > > VM_BUG_ON_FOLIO in filemap_unaccount_folio() with CONFIG_DEBUG_VM
> > > enabled, or bad page without CONFIG_DEBUG_VM.
> > >
> > > The specific scenarios are as follows:
> > > (1) mmap: Use socket fd to create a TCP VMA.
> > > (2) open(O_CREAT) + fallocate + sendfile: Read the ext4 file and create
> > > the page cache. The mapping of the page cache is ext4 inode->i_mapping.
> > > Send the ext4 page cache to the socket fd through sendfile.
> > > (3) getsockopt TCP_ZEROCOPY_RECEIVE: Receive the ext4 page cache and use
> > > vm_insert_pages() to insert the ext4 page cache to the TCP VMA. In this
> > > case, mapcount changes from -1 to 0. The page cache mapping is ext4
> > > inode->i_mapping, but the VMA of the page cache is the TCP VMA and
> > > folio->mapping->i_mmap is empty.
> > I think this is the bug.  We shouldn't be incrementing the mapcount
> > in this scenario.  Assuming we want to support doing this at all and
> > we don't want to include something like ...
> >
> > 	if (folio->mapping) {
> > 		if (folio->mapping != vma->vm_file->f_mapping)
> > 			return -EINVAL;
> > 		if (page_to_pgoff(page) != linear_page_index(vma, address))
> > 			return -EINVAL;
> > 	}
> >
> > But maybe there's a reason for networking needing to map pages in this
> > scenario?
>
> Agreed, and I'm also curious why.
>
> > > (4) open(O_TRUNC): Deletes the ext4 page cache. In this case, the page
> > > cache is still in the xarray tree of mapping->i_pages and these page
> > > cache should also be deleted. However, folio->mapping->i_mmap is empty.
> > > Therefore, truncate_cleanup_folio()->unmap_mapping_folio() can't unmap
> > > i_mmap tree. In filemap_unaccount_folio(), the mapcount of the folio is
> > > 0, causing BUG ON.
> > >
> > > Syz log that can be used to reproduce the issue:
> > > r3 = socket$inet_tcp(0x2, 0x1, 0x0)
> > > mmap(&(0x7f0000ff9000/0x4000)=nil, 0x4000, 0x0, 0x12, r3, 0x0)
> > > r4 = socket$inet_tcp(0x2, 0x1, 0x0)
> > > bind$inet(r4, &(0x7f0000000000)={0x2, 0x4e24, @multicast1}, 0x10)
> > > connect$inet(r4, &(0x7f00000006c0)={0x2, 0x4e24, @empty}, 0x10)
> > > r5 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
> > > 0x181e42, 0x0)
> > > fallocate(r5, 0x0, 0x0, 0x85b8)
> > > sendfile(r4, r5, 0x0, 0x8ba0)
> > > getsockopt$inet_tcp_TCP_ZEROCOPY_RECEIVE(r4, 0x6, 0x23,
> > > &(0x7f00000001c0)={&(0x7f0000ffb000/0x3000)=nil, 0x3000, 0x0, 0x0, 0x0,
> > > 0x0, 0x0, 0x0, 0x0}, &(0x7f0000000440)=0x40)
> > > r6 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
> > > 0x181e42, 0x0)
> > >
> > > In the current TCP zerocopy scenario, folio will be released normally.
> > > When the process exits, if the page cache is truncated before the
> > > process exits, BUG ON or Bad page occurs, which does not meet the
> > > expectation.
> > > To fix this issue, the mapping_mapped() check is added to
> > > filemap_unaccount_folio(). In addition, to reduce the impact on
> > > performance, no lock is added when mapping_mapped() is checked.
> > NAK this patch, you're just preventing the assertion from firing.
> > I think there's a deeper problem here.
>
> --
> Best Regards,
> Peng
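
For illustration only, the check quoted above can be modelled in userspace.
This is a sketch, not kernel code and not a proposed patch: the model_*
structures, fields and function names are hypothetical stand-ins for
struct folio, struct vm_area_struct, page_to_pgoff() and
linear_page_index(), and a 4 KiB page size is assumed.  It only shows what
the check would reject: a folio that already belongs to one file's page
cache being inserted into a VMA backed by a different mapping, or at a
mismatched offset.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical userspace stand-ins; these are not the kernel's structures. */
struct model_mapping {
	int id;
};

struct model_vma {
	unsigned long vm_start;			/* first address covered by the VMA */
	unsigned long vm_pgoff;			/* file offset of vm_start, in pages */
	const struct model_mapping *f_mapping;	/* models vma->vm_file->f_mapping */
};

struct model_folio {
	const struct model_mapping *mapping;	/* NULL if not in any page cache */
	unsigned long index;			/* page offset within the file */
};

/* Models linear_page_index() for the ordinary (non-hugetlb) case. */
static unsigned long model_linear_page_index(const struct model_vma *vma,
					     unsigned long address)
{
	return ((address - vma->vm_start) >> MODEL_PAGE_SHIFT) + vma->vm_pgoff;
}

/* Returns 0 if the insertion would be allowed, -EINVAL otherwise. */
static int model_check_insert(const struct model_folio *folio,
			      const struct model_vma *vma,
			      unsigned long address)
{
	if (folio->mapping) {
		/* The folio belongs to some other file's page cache. */
		if (folio->mapping != vma->f_mapping)
			return -EINVAL;
		/* Same file, but not at the offset this address maps. */
		if (folio->index != model_linear_page_index(vma, address))
			return -EINVAL;
	}
	return 0;
}
```

In this model, the folio from step (3) (mapping set to the ext4 file's
mapping, VMA backed by the TCP socket) is rejected with -EINVAL, while a
folio with no mapping is still accepted.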