Date: Tue, 16 Dec 2025 14:58:03 -0500
From: Peter Xu
To: Jason Gunthorpe
Cc: kvm@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Nico Pache, Zi Yan, Alex Mastro, David Hildenbrand, Alex Williamson,
 Zhi Wang, David Laight, Yi Liu, Ankit Agrawal, Kevin Tian, Andrew Morton
Subject: Re:
 [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
References: <20251204151003.171039-1-peterx@redhat.com>
 <20251204151003.171039-5-peterx@redhat.com>
 <20251216144224.GE6079@nvidia.com>
 <20251216190131.GI6079@nvidia.com>
In-Reply-To: <20251216190131.GI6079@nvidia.com>

On Tue, Dec 16, 2025 at 03:01:31PM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 16, 2025 at 11:01:00AM -0500, Peter Xu wrote:
> > Do we have any function that we can fetch the best mapping lower than a
> > specific order?
>
> I'm not aware of anything

Maybe I can introduce a per-arch helper for it, then. I'll see if I can
cover some tests from ARM side, or I'll enable x86_64 first so we can do it
in two steps.

>
> > > None of this logic should be in drivers.
> >
> > I still think it's the driver's decision to have its own macro controlling
> > the huge pfnmap behavior. I agree with you core mm can have it, I don't
> > see it blocks the driver not returning huge order if huge pfnmap is turned
> > off. VFIO-PCI currently indeed only depends directly on global THP
> > configs, but I don't see why it's strictly needed. So I think it's fine if
> > a driver (even if global THP enabled for pmd/pud) deselect huge pfnmap for
> > other reasons, then here the order returned can still always be PSIZE for
> > the driver. It's really not a huge deal to me.
>
> All these APIs should be around the idea that the driver just returns
> what it has and the core mm places it into ptes. There is not a good
> reason drivers should be overriding this logic or doing their own
> thing.

I'll make sure the driver will not need to consider size of mapping that
arch would support.

>
> > > Drivers shouldn't implement this alignment function without also
> > > implementing huge fault, it is pointless. Don't see a reason to add
> > > extra complexity.
> >
> > It's not implementing the order hint without huge fault. It's when both
> > are turned off in a kernel config.. then the order hint (even from driver
> > POV) shouldn't need to be reported.
>
> No, it should still all be the same the core code just won't call the
> function.
>
> > I don't know why you have so strong feeling on having a config check in
> > vfio-pci drivers is bad.
>
> It is leaking MM details into drivers that should not be in drivers.

To me it still makes perfect sense here to pair with huge_fault(), and it's
driver knowledge alone. It has nothing to do with leaking mm details.

I think I get your point above, maybe when the core mm fallback paths not
available yet we can mix things together. I'll see what I can do when
repost.

-- 
Peter Xu
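
For illustration only, the split discussed in this thread (the driver
reports the order it has, and a per-arch helper picks the best mapping
order not exceeding it) could be sketched in userspace C roughly as below.
Every name here (sketch_arch_caps, sketch_best_order_below,
sketch_mapping_order) and the order constants are hypothetical. They assume
4 KiB base pages with 2 MiB PMD and 1 GiB PUD mappings, and none of this is
an existing kernel API or the interface the series will end up with.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mapping orders, assuming 4 KiB base pages. */
#define SKETCH_PTE_ORDER 0   /* 4 KiB */
#define SKETCH_PMD_ORDER 9   /* 2 MiB */
#define SKETCH_PUD_ORDER 18  /* 1 GiB */

/* Hypothetical per-arch capabilities for huge pfnmaps. */
struct sketch_arch_caps {
	bool pmd_pfnmap;
	bool pud_pfnmap;
};

/*
 * Hypothetical per-arch helper: return the largest mapping order the
 * architecture can install that does not exceed max_order.
 */
int sketch_best_order_below(const struct sketch_arch_caps *caps, int max_order)
{
	assert(max_order >= 0);

	if (caps->pud_pfnmap && max_order >= SKETCH_PUD_ORDER)
		return SKETCH_PUD_ORDER;
	if (caps->pmd_pfnmap && max_order >= SKETCH_PMD_ORDER)
		return SKETCH_PMD_ORDER;
	return SKETCH_PTE_ORDER;
}

/*
 * Hypothetical core-side step: the driver only reports what it has (for
 * example the order of a BAR's size) and the core clamps it to what the
 * arch supports, so the driver never reasons about mapping sizes.
 */
int sketch_mapping_order(const struct sketch_arch_caps *caps, int driver_order)
{
	return sketch_best_order_below(caps, driver_order);
}
```

With this shape, a kernel built without huge pfnmap support would simply
report both capabilities as false and every hint would fall back to order
0, without any config check in the driver.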