From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 8 May 2025 18:23:56 +0300
From: Pantelis Antoniou
To: David Hildenbrand
CC: Andrew Morton, Artem Krupotkin, Charles Briere, Wade Farnsworth, Peter Xu
Subject: Re: [PATCH 1/1] Fix zero copy I/O on __get_user_pages allocated pages
Message-ID: <20250508182356.45dbfd40@sarc.samsung.com>
In-Reply-To: <99ed92b7-c1b2-4e12-a7ee-776a7f890b47@redhat.com>
References: <20250507154105.763088-1-p.antoniou@partner.samsung.com>
 <20250507154105.763088-2-p.antoniou@partner.samsung.com>
 <99ed92b7-c1b2-4e12-a7ee-776a7f890b47@redhat.com>
Organization: SARC

On Thu, 8 May 2025 17:03:46 +0200
David Hildenbrand wrote:

Hi there,

> On 07.05.25 17:41, Pantelis Antoniou wrote:
>
> Hi,
>
> > Recent updates to net filesystems enabled zero copy operations,
> > which require getting a user space page pinned.
> >
> > This does not work for pages that were allocated via
> > __get_user_pages and then mapped to user-space via remap_pfn_range.
>
> Right.
> Because the struct page of a VM_PFNMAP *must not be touched*.
> It has to be treated like it doesn't exist.
>

Well, that's not exactly the case. For pages mapped to user space via
remap_pfn_range() the VM_PFNMAP bit is set even though the pages do have
a struct page.

The details of how it happens are in the cover letter of this patch, but
let me paste the relevant bits here.

"In our emulation environment we have noticed failing writes when
performing I/O from a userspace mapped DRM GEM buffer object.
The platform does not use VRAM, all graphics memory is regular DRAM
memory, allocated via __get_free_pages().

The same write was successful from a heap allocated bounce buffer.

The sequence of events is as follows.

1. A BO (Buffer Object) is created, and its backing memory is allocated
   via __get_user_pages().

2. Userspace mmaps a BO (Buffer Object) via a mmap call on the opened
   file handle of a DRM driver. The mapping is done via the
   drm_gem_mmap_obj() call.

3. Userspace issues a write to a file copying the contents of the BO.

  3a. If the file is located on a regular filesystem (like ext4), the
      write completes successfully.

  3b. If the file is located on a network filesystem, like 9p, the
      write fails.

The write fails because v9fs_file_write_iter() will call
netfs_unbuffered_write_iter(), netfs_unbuffered_write_iter_locked() which
will call netfs_extract_user_iter().

netfs_extract_user_iter() will in turn call iov_iter_extract_pages() which
for a user backed iterator will call iov_iter_extract_user_pages() which
will call pin_user_pages_fast() which finally will call
__gup_longterm_locked().

__gup_longterm_locked() will call __get_user_pages_locked() which will fail
because the VMA is marked with the VM_IO and VM_PFNMAP flags."

> >
> > remap_pfn_range_internal() will turn on VM_IO | VM_PFNMAP vma bits.
> > VM_PFNMAP in particular marks the pages as not having a struct page
> > associated with them, which is not the case for __get_user_pages()
> >
> > This in turn makes any attempt to lock a page fail, and breaks
> > I/O from that address range.
> >
> > This patch addresses it by special casing pages in those VMAs and not
> > calling vm_normal_page() for them.
> >
> > Signed-off-by: Pantelis Antoniou
> > ---
> >  mm/gup.c | 22 ++++++++++++++++++----
> >  1 file changed, 18 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 84461d384ae2..e185c18c0c81 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -833,6 +833,20 @@ static inline bool can_follow_write_pte(pte_t pte, struct page *page,
> >  	return !userfaultfd_pte_wp(vma, pte);
> >  }
> >
> > +static struct page *gup_normal_page(struct vm_area_struct *vma,
> > +		unsigned long address, pte_t pte)
> > +{
> > +	unsigned long pfn;
> > +
> > +	if (vma->vm_flags & (VM_MIXEDMAP | VM_PFNMAP)) {
> > +		pfn = pte_pfn(pte);
> > +		if (!pfn_valid(pfn) || is_zero_pfn(pfn) || pfn > highest_memmap_pfn)
> > +			return NULL;
> > +		return pfn_to_page(pfn);
> > +	}
> > +	return vm_normal_page(vma, address, pte);
>
> I enjoy seeing vm_normal_page() checks in GUP code.
>
> I don't enjoy seeing what you added before that :)
>
> If vm_normal_page() tells you "this is not a normal", then we should
> not touch it. There is one exception: the shared zeropage.
>
>
> So, unfortunately, this is wrong.
>

Well, let's talk about a proper fix then for the previously mentioned
user-space regression.

Regards

-- Pantelis