From: Mina Almasry <almasrymina@google.com>
Date: Wed, 16 Jul 2025 12:41:04 -0700
Subject: Re: [PATCH net-next v10 02/12] netmem: use netmem_desc instead of page to access ->pp in __netmem_get_pp()
In-Reply-To: <20250716045124.GB12760@system.software.com>
References: <20250714120047.35901-1-byungchul@sk.com> <20250714120047.35901-3-byungchul@sk.com> <20250715013626.GA49874@system.software.com> <20250716045124.GB12760@system.software.com>
To: Byungchul Park, "Lobakin, Aleksander"
Cc: willy@infradead.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, kernel_team@skhynix.com, ilias.apalodimas@linaro.org,
 harry.yoo@oracle.com, akpm@linux-foundation.org, andrew+netdev@lunn.ch,
 asml.silence@gmail.com, toke@redhat.com, david@redhat.com,
 Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
 mhocko@suse.com, linux-rdma@vger.kernel.org, bpf@vger.kernel.org,
 vishal.moola@gmail.com, hannes@cmpxchg.org, ziy@nvidia.com,
 jackmanb@google.com, wei.fang@nxp.com, shenwei.wang@nxp.com,
 xiaoning.wang@nxp.com, davem@davemloft.net, edumazet@google.com,
 kuba@kernel.org, pabeni@redhat.com, anthony.l.nguyen@intel.com,
 przemyslaw.kitszel@intel.com, sgoutham@marvell.com, gakula@marvell.com,
 sbhatta@marvell.com, hkelam@marvell.com, bbhushan2@marvell.com,
 tariqt@nvidia.com, ast@kernel.org, daniel@iogearbox.net, hawk@kernel.org,
 john.fastabend@gmail.com, sdf@fomichev.me, saeedm@nvidia.com,
 leon@kernel.org, mbloch@nvidia.com, danishanwar@ti.com, rogerq@kernel.org,
 nbd@nbd.name, lorenzo@kernel.org, ryder.lee@mediatek.com,
 shayne.chen@mediatek.com, sean.wang@mediatek.com, matthias.bgg@gmail.com,
 angelogioacchino.delregno@collabora.com, horms@kernel.org, m-malladi@ti.com,
 krzysztof.kozlowski@linaro.org, matthias.schiffer@ew.tq-group.com,
 robh@kernel.org, imx@lists.linux.dev, intel-wired-lan@lists.osuosl.org,
 linux-arm-kernel@lists.infradead.org, linux-wireless@vger.kernel.org,
 linux-mediatek@lists.infradead.org
On Tue, Jul 15, 2025 at 9:51 PM Byungchul Park wrote:
>
> On Tue, Jul 15, 2025 at 12:09:34PM -0700, Mina Almasry wrote:
> > On Mon, Jul 14, 2025 at 6:36 PM Byungchul Park wrote:
> > >
> > > On Mon, Jul 14, 2025 at 12:58:15PM -0700, Mina Almasry wrote:
> > > > On Mon, Jul 14, 2025 at 12:37 PM Mina Almasry wrote:
> > > > >
> > > > >
> > > > > On Mon, Jul 14, 2025 at 5:01 AM Byungchul Park wrote:
> > > > > >
> > > > > > To eliminate the use of the page pool fields in struct page, the page
> > > > > > pool code should use netmem descriptor and APIs instead.
> > > > > >
> > > > > > However, __netmem_get_pp() still accesses ->pp via struct page. So
> > > > > > change it to use struct netmem_desc instead, since ->pp no longer will
> > > > > > be available in struct page.
> > > > > >
> > > > > > While at it, add a helper, pp_page_to_nmdesc(), that can be used to
> > > > > > extract netmem_desc from page only if it's pp page. For now that
> > > > > > netmem_desc overlays on page, it can be achieved by just casting.
> > > > > >
> > > > > > Signed-off-by: Byungchul Park
> > > > > > ---
> > > > > >  include/net/netmem.h | 13 ++++++++++++-
> > > > > >  1 file changed, 12 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/include/net/netmem.h b/include/net/netmem.h
> > > > > > index 535cf17b9134..2b8a7b51ac99 100644
> > > > > > --- a/include/net/netmem.h
> > > > > > +++ b/include/net/netmem.h
> > > > > > @@ -267,6 +267,17 @@ static inline struct net_iov *__netmem_clear_lsb(netmem_ref netmem)
> > > > > >  	return (struct net_iov *)((__force unsigned long)netmem & ~NET_IOV);
> > > > > >  }
> > > > > >
> > > > > > +static inline struct netmem_desc *pp_page_to_nmdesc(struct page *page)
> > > > > > +{
> > > > > > +	DEBUG_NET_WARN_ON_ONCE(!page_pool_page_is_pp(page));
> > > > > > +
> > > > > > +	/* XXX: How to extract netmem_desc from page must be changed,
> > > > > > +	 * once netmem_desc no longer overlays on page and will be
> > > > > > +	 * allocated through slab.
> > > > > > +	 */
> > > > > > +	return (struct netmem_desc *)page;
> > > > > > +}
> > > > > > +
> > > > >
> > > > > Same thing. Do not create a generic looking pp_page_to_nmdesc helper
> > > > > which does not check that the page is the correct type. The
> > > > > DEBUG_NET... is not good enough.
> > > > >
> > > > > You don't need to add a generic helper here. There is only one call
> > > > > site. Open code this in the callsite. The one callsite is marked as
> > > > > unsafe, only called by code that knows that the netmem is specifically
> > > > > a pp page. Open code this in the unsafe callsite, instead of creating
> > > > > a generic looking unsafe helper and not even documenting it's unsafe.
> > > > >
> > > >
> > > > On second read through the series, I actually now think this is a
> > > > great idea :-) Adding this helper has simplified the series greatly. I
> > > > did not realize you were converting entire drivers to netmem just to
> > > > get rid of page->pp accesses. Adding a pp_page_to_nmdesc helper makes
> > > > the entire series simpler.
> > > >
> > > > You're also calling it only from code paths like drivers that already
> > > > assumed that the page is a pp page and did page->pp dereference without
> > > > a check, so this should be safe.
> > > >
> > > > Only thing I would change is add a comment explaining that the calling
> > > > code needs to check the page is pp page or know it's a pp page (like a
> > > > driver that supports pp).
> > > >
> > > > > >
> > > > > > /**
> > > > > >  * __netmem_get_pp - unsafely get pointer to the &page_pool backing @netmem
> > > > > >  * @netmem: netmem reference to get the pointer from
> > > > > > @@ -280,7 +291,7 @@ static inline struct net_iov *__netmem_clear_lsb(netmem_ref netmem)
> > > > > >  */
> > > > > > static inline struct page_pool *__netmem_get_pp(netmem_ref netmem)
> > > > > > {
> > > > > > -	return __netmem_to_page(netmem)->pp;
> > > > > > +	return pp_page_to_nmdesc(__netmem_to_page(netmem))->pp;
> > > > > > }
> > > > >
> > > > > This makes me very sad. Casting from netmem -> page -> nmdesc...
> > > > >
> > > > > Instead, we should be able to go from netmem directly to nmdesc.
> > > > > I
> > > > > would suggest rename __netmem_clear_lsb to netmem_to_nmdesc and have
> > > > > it return netmem_desc instead of net_iov. Then use it here.
> > > > >
> > > > > We could have an unsafe version of netmem_to_nmdesc which converts the
> > > > > netmem to netmem_desc without clearing the lsb and mark it unsafe.
> > > > >
> > > >
> > > > This, I think, we should address to keep some sanity in the code and
> > > > reduce the casts and make it a bit more maintainable.
> > >
> > > I will reflect your suggestions. To summarize:
> > >
> > >   1) The current implementation of pp_page_to_nmdesc() is good enough
> > >      to keep, but add a comment on it like "Check if the page is a pp
> > >      page before calling this function or know it's a pp page.".
> > >
> >
> > Yes please.
> >
> > >   2) Introduce the unsafe version, __netmem_to_nmdesc(), and use it in
> > >      __netmem_get_pp().
> > >
> >
> > No need following Pavel's feedback. We can just delete
> > __netmem_get_pp. If we do find a need in the future to extract the
> > netmem_desc from a netmem_ref, I would rather we do a straight cast
> > from netmem_ref to netmem_desc rather than netmem_ref -> pages/net_iov
> > -> netmem_desc.
> >
> > But that seems unnecessary for this series.
>
> No. The series should remove accessing ->pp through page.
>
> I will kill __netmem_get_pp() as you and I prefer. However,
> __netmem_get_pp() users, i.e. libeth_xdp_return_va() and
> libeth_xdp_tx_fill_buf(), should be altered. I will modify the code like:
>
>   as is: __netmem_get_pp(netmem)
>   to be: __netmem_nmdesc(netmem)->pp
>
> Is it okay with you?
>

When Pavel and I were saying 'remove __netmem_get_pp', I think we meant
to remove the entire concept of unsafe netmem -> page conversions. I
think we both don't like them. From this perspective,
__netmem_nmdesc(netmem)->pp is just as bad as __netmem_get_pp(netmem).
I think since the unsafe netmem-to-page casts are already in mainline,
let's assume they should stay there until someone feels strongly enough
to remove them. The logic in Olek's patch is sound:

https://lore.kernel.org/all/20241203173733.3181246-8-aleksander.lobakin@intel.com/

Header buffer page pools do always use pages and will likely remain so
for a long time, so I guess let's continue to support them rather than
try to remove them in this series. A followup series could try to
remove them.

> > >   3) Rename __netmem_clear_lsb() to netmem_to_nmdesc(), and return
> > >      netmem_desc, and use it in all users of __netmem_clear_lsb().
> > >
> >
> > Following Pavel's comment, this I think also is not necessary for this
> > series. Cleaning up the return value of __netmem_clear_lsb is good
> > work I think, but we're already on v10 of this and I think it would be
> > unnecessary to ask for added cleanups. We can do the cleanup on top.
>
> However, I still need to include 'introduce __netmem_nmdesc() helper'

Yes.

> in this series since it should be used to remove __netmem_get_pp() as I

Let's keep __netmem_get_pp, which does a `return
__netmem_nmdesc(netmem)->pp;`

In general we avoid allowing the driver to do any netmem casts in the
driver code, and we do any casting in core.

> described above. I think I'd better add netmem_nmdesc() too while at it.
>

Yes. netmem_nmdesc should replace __netmem_clear_lsb.

> I assume __netmem_nmdesc() is an unsafe version not clearing lsb. The

Yes.

> safe version, netmem_nmdesc() needs an additional operation clearing lsb.

Yes.

--
Thanks,
Mina