From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 13 May 2026 10:26:13 +0100
From: Pedro Falcato
To: "Vlastimil Babka (SUSE)"
Cc: Dragos Tatulea, Byungchul Park, linux-mm@kvack.org,
	akpm@linux-foundation.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel_team@skhynix.com,
	harry.yoo@oracle.com, ast@kernel.org, daniel@iogearbox.net,
	davem@davemloft.net, kuba@kernel.org, hawk@kernel.org,
	john.fastabend@gmail.com, sdf@fomichev.me, saeedm@nvidia.com,
	leon@kernel.org, tariqt@nvidia.com, mbloch@nvidia.com,
	andrew+netdev@lunn.ch, edumazet@google.com, pabeni@redhat.com,
	david@redhat.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com, mhocko@suse.com,
	horms@kernel.org, jackmanb@google.com, hannes@cmpxchg.org,
	ziy@nvidia.com, ilias.apalodimas@linaro.org, willy@infradead.org,
	brauner@kernel.org, kas@kernel.org, yuzhao@google.com,
	usamaarif642@gmail.com, baolin.wang@linux.alibaba.com,
	almasrymina@google.com, toke@redhat.com, asml.silence@gmail.com,
	bpf@vger.kernel.org, linux-rdma@vger.kernel.org, sfr@canb.auug.org.au,
	dw@davidwei.uk, ap420073@gmail.com
Subject: Re: [PATCH v4] mm: introduce a new page type for page pool in page
	type
References: <20260224051347.19621-1-byungchul@sk.com>
	<982b9bc1-0a0a-4fc5-8e3a-3672db2b29a1@nvidia.com>
	<4af19eda-c29c-4302-92d5-c0915267fc0c@kernel.org>
In-Reply-To: <4af19eda-c29c-4302-92d5-c0915267fc0c@kernel.org>
X-Mailing-List: bpf@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Wed, May 13, 2026 at 11:12:43AM +0200, Vlastimil
Babka (SUSE) wrote:
> On 5/13/26 11:00, Dragos Tatulea wrote:
> >
> >
> > On 24.02.26 06:13, Byungchul Park wrote:
> >> Currently, the condition 'page->pp_magic == PP_SIGNATURE' is used to
> >> determine if a page belongs to a page pool.  However, with the planned
> >> removal of @pp_magic, we should instead leverage the page_type in struct
> >> page, such as PGTY_netpp, for this purpose.
> >>
> >> Introduce and use the page type APIs e.g. PageNetpp(), __SetPageNetpp(),
> >> and __ClearPageNetpp() instead, and remove the existing APIs accessing
> >> @pp_magic e.g. page_pool_page_is_pp(), netmem_or_pp_magic(), and
> >> netmem_clear_pp_magic().
> >>
> >> Plus, add @page_type to struct net_iov at the same offset as struct page
> >> so as to use the page_type APIs for struct net_iov as well.  While at it,
> >> reorder @type and @owner in struct net_iov to avoid a hole and
> >> increasing the struct size.
> >>
> >> This work was inspired by the following link:
> >>
> >>   https://lore.kernel.org/all/582f41c0-2742-4400-9c81-0d46bf4e8314@gmail.com/
> >>
> >> While at it, move the sanity check for page pool to on the free path.
> >>
> >> Suggested-by: David Hildenbrand
> >> Co-developed-by: Pavel Begunkov
> >> Signed-off-by: Pavel Begunkov
> >> Signed-off-by: Byungchul Park
> >> Acked-by: David Hildenbrand
> >> Acked-by: Zi Yan
> >> Acked-by: Vlastimil Babka
> >> Reviewed-by: Toke Høiland-Jørgensen
> >> ---
> >
> > Seems like this patch broke tcp_mmap because
> > validate_page_before_insert() returns -EINVAL due
> > to a page having a type. Here's the full flow:
> >
> > getsockopt(TCP_ZEROCOPY_RECEIVE) returns -EINVAL because of the
> > below flow in the kernel:
> >
> > tcp_zerocopy_receive()
> > -> tcp_zerocopy_vm_insert_batch()
> > -> vm_insert_pages()
> > -> insert_pages()
> > -> insert_page_in_batch_locked()
> > -> validate_page_before_insert() returns -EINVAL
> >    because page_has_type(page) is now true.
> >
> > The patch below fixes the issue. But is this a valid fix?
>
> Hmm the check traces back to commit 0ee930e6cafa0 "mm/memory.c: prevent
> mapping typed pages to userspace"
>
> > Pages which use page_type must never be mapped to userspace as it would
> > destroy their page type.  Add an explicit check for this instead of
> > assuming that kernel drivers always get this right.
>
> So uh, this doesn't look good I think.

Yep, you fundamentally can't map a page with a type as page type aliases
with mapcount. Even with the given diff, just mapping it will increment the
mapcount and wreak havoc. I think we need to revert this patch for now.

I'm not sure what the long term plan for this would be. If page types are
moved to memdesc types, then the two stop colliding and that could work.
I don't know if that's Willy's plan, however.

(then there's the other question: are page pool pages really folios? not
really. they are mappable, but they aren't part of the page cache, or anon,
nor are they in the LRU or have rmap capabilities. perhaps we need a
different memdesc for those. we're one step away from reinventing class
polymorphism from first principles ;)

>
> > diff --git a/mm/memory.c b/mm/memory.c
> > index ea6568571131..4cb12673f450 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2326,7 +2326,7 @@ static int validate_page_before_insert(struct vm_area_struct *vma,
> >  			return -EINVAL;
> >  		return 0;
> >  	}
> > -	if (folio_test_anon(folio) || page_has_type(page))
> > +	if (folio_test_anon(folio) || (page_has_type(page) && !PageNetpp(page)))
> >  		return -EINVAL;
> >  	flush_dcache_folio(folio);
> >  	return 0;
> >
> > Thanks,
> > Dragos
> >
> >

-- 
Pedro
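
For illustration only, here is a minimal userspace sketch of the collision
described above: the page type and the mapcount live in the same word, so
mapping a typed page modifies the word that holds the type, and clearing the
type later throws away the mapping count. This is a toy model, not the
kernel's struct page. The union layout, the tag in the top byte, the value
0xf5 and every toy_* name are assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Toy model, NOT the kernel's struct page: one 32-bit word that is read
 * either as a mapcount or as a page type.
 */
union toy_page_word {
	int32_t  mapcount;	/* -1 means "not mapped" in this model */
	uint32_t page_type;	/* type tag in the top byte (assumed layout) */
};

#define TOY_TYPE_SHIFT	24
#define TOY_TYPE_NETPP	0xf5u	/* hypothetical tag value */

/* All ones: no type set, and the same bits read as mapcount == -1. */
static void toy_init(union toy_page_word *w)
{
	w->page_type = UINT32_MAX;
}

/* Setting a type overwrites the whole word, mapcount included. */
static void toy_set_type(union toy_page_word *w, uint32_t type)
{
	w->page_type = type << TOY_TYPE_SHIFT;
}

/* Clearing the type resets the word, losing any mappings counted so far. */
static void toy_clear_type(union toy_page_word *w)
{
	w->page_type = UINT32_MAX;
}

/* "Mapping" the page bumps the mapcount, i.e. the very same word. */
static void toy_map(union toy_page_word *w)
{
	w->mapcount++;
}

/* Walk through the sequence: type the page, map it, then clear the type. */
void toy_demo(void)
{
	union toy_page_word w;

	toy_init(&w);
	assert(w.mapcount == -1);

	toy_set_type(&w, TOY_TYPE_NETPP);
	toy_map(&w);
	printf("after map:   page_type=0x%08x mapcount=%d\n",
	       (unsigned int)w.page_type, (int)w.mapcount);

	toy_clear_type(&w);
	printf("after clear: page_type=0x%08x mapcount=%d\n",
	       (unsigned int)w.page_type, (int)w.mapcount);
}
```

Calling toy_demo() from a main() shows that after one mapping the word no
longer holds the value the type setter wrote, the mapcount view of it is a
large negative number rather than a count of one, and clearing the type
resets the mapcount view to -1 even though the page was mapped.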