Date: Wed, 12 Apr 2023 12:44:09 -0700
From: Jakub Kicinski
To: Leon Romanovsky
Cc: Brett Creeley, Brett Creeley, davem@davemloft.net, netdev@vger.kernel.org, drivers@pensando.io,
    shannon.nelson@amd.com, neel.patel@amd.com
Subject: Re: [PATCH net] ionic: Fix allocation of q/cq info structures from device local node
Message-ID: <20230412124409.7c2d73cc@kernel.org>
In-Reply-To: <20230412165816.GB182481@unreal>
References: <20230407233645.35561-1-brett.creeley@amd.com> <20230409105242.GR14869@unreal> <20230411124704.GX182481@unreal> <20230411124945.527b0ee4@kernel.org> <20230412165816.GB182481@unreal>

On Wed, 12 Apr 2023 19:58:16 +0300 Leon Romanovsky wrote:
> > > I'm not sure about it as you are running kernel thread which is
> > > triggered directly by device and most likely will run on same node as
> > > PCI device.
> >
> > Isn't that true only for bus-side probing?
> > If you bind/unbind via sysfs does it still try to move to the right
> > node? Same for resources allocated during ifup?
>
> Kernel threads are more interesting case, as they are not controlled
> through mempolicy (maybe it is not true in 2023, I'm not sure).
>
> User triggered threads are subjected to mempolicy and all allocations
> are expected to follow it. So users, who wants specific memory behaviour
> should use it.
>
> https://docs.kernel.org/6.1/admin-guide/mm/numa_memory_policy.html
>
> There is a huge chance that fallback mechanisms proposed here in ionic
> and implemented in ENA are "break" this interface.

Ack, that's what I would have answered while working for a vendor
myself, 5 years ago. Now, after seeing how NICs get configured in
practice, and all the random tools which may decide to tweak some
random param and forget to pin themselves - I'm not as sure.

Having a policy configured per netdev and maybe netdev helpers for
memory allocation could be an option. We already link netdev to
the struct device.

> > > vzalloc_node() doesn't do fallback, but vzalloc will find the right node
> > > for you.
> >
> > Sounds like we may want a vzalloc_node_with_fallback or some GFP flag?
> > All the _node() helpers which don't fall back lead to unpleasant code
> > in the users.
>
> I would challenge the whole idea of having *_node() allocations in
> driver code at the first place. Even in RDMA, where we super focused
> on performance and allocation of memory in right place is super
> critical, we rely on general kzalloc().
>
> There is one exception in RDMA world (hfi1), but it is more because of
> legacy implementation and not because of specific need, at least Intel
> folks didn't success to convince me with real data.

Yes, but RDMA is much more heavy on the application side, much more
tightly integrated in general.
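
For illustration only, the "try the device's node, then fall back" pattern
behind the vzalloc_node_with_fallback idea quoted above could be sketched
in userspace C roughly as follows. Everything here is an assumption made
for the sake of the example: alloc_on_node() and alloc_any_node() are
hypothetical stand-ins for vzalloc_node() and vzalloc() (modelling the
no-fallback behaviour as it is described in this thread),
zalloc_node_with_fallback() is a made-up name rather than an existing
kernel helper, and the node_exhausted[] table merely simulates a node
running out of memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NO_NODE   (-1)	/* plays the role of NUMA_NO_NODE in this sketch */
#define MAX_NODES 4

/* Simulated per-node state: true means the node has no free memory. */
static bool node_exhausted[MAX_NODES];

/* Hypothetical stand-in for vzalloc_node(): strict, returns NULL
 * instead of trying another node. */
static void *alloc_on_node(size_t size, int node)
{
	if (node < 0 || node >= MAX_NODES || node_exhausted[node])
		return NULL;
	return calloc(1, size);
}

/* Hypothetical stand-in for vzalloc(): any node will do. */
static void *alloc_any_node(size_t size)
{
	return calloc(1, size);
}

/*
 * Made-up helper: prefer the given node, fall back to any node.
 * *got_local is set to true only if the preferred node was used,
 * so the caller can still tell when placement was not honoured.
 */
static void *zalloc_node_with_fallback(size_t size, int node, bool *got_local)
{
	void *p = NULL;

	*got_local = false;
	if (node != NO_NODE)
		p = alloc_on_node(size, node);
	if (p) {
		*got_local = true;
		return p;
	}
	return alloc_any_node(size);
}
```

The point of such a helper would be to keep the retry logic in one place
instead of open-coding it in every driver. Whether the fallback should
also respect the caller's mempolicy is exactly the open question raised
above, and this sketch does not attempt to answer it.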