From: Leon Romanovsky <leon@kernel.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Brett Creeley <bcreeley@amd.com>,
Brett Creeley <brett.creeley@amd.com>,
davem@davemloft.net, netdev@vger.kernel.org, drivers@pensando.io,
shannon.nelson@amd.com, neel.patel@amd.com
Subject: Re: [PATCH net] ionic: Fix allocation of q/cq info structures from device local node
Date: Thu, 13 Apr 2023 09:43:13 +0300
Message-ID: <20230413064313.GD182481@unreal>
In-Reply-To: <20230412124409.7c2d73cc@kernel.org>
On Wed, Apr 12, 2023 at 12:44:09PM -0700, Jakub Kicinski wrote:
> On Wed, 12 Apr 2023 19:58:16 +0300 Leon Romanovsky wrote:
> > > > I'm not sure about it as you are running kernel thread which is
> > > > triggered directly by device and most likely will run on same node as
> > > > PCI device.
> > >
> > > Isn't that true only for bus-side probing?
> > > If you bind/unbind via sysfs does it still try to move to the right
> > > node? Same for resources allocated during ifup?
> >
> > Kernel threads are a more interesting case, as they are not controlled
> > through mempolicy (maybe it is not true in 2023, I'm not sure).
> >
> > User-triggered threads are subject to mempolicy, and all allocations
> > are expected to follow it. So users who want specific memory behaviour
> > should use it.
> >
> > https://docs.kernel.org/6.1/admin-guide/mm/numa_memory_policy.html
> >
> > There is a huge chance that the fallback mechanisms proposed here in
> > ionic and implemented in ENA "break" this interface.
>
> Ack, that's what I would have answered while working for a vendor
> myself, 5 years ago. Now, after seeing how NICs get configured in
> practice, and all the random tools which may decide to tweak some
> random param and forget to pin themselves - I'm not as sure.
I would like to separate tweaks to driver internals from general
kernel core functionality. Everything that falls under the latter category
should be avoided in drivers and, to some extent, in subsystems too.
NUMA, IRQ, etc. are among such general features.
>
> Having a policy configured per netdev and maybe netdev helpers for
> memory allocation could be an option. We already link netdev to
> the struct device.
I don't think that it is really needed. I personally never saw real data
which supports the claim that the system default policy doesn't work for
NICs. I saw a lot of synthetic testing results where allocations were
forced to be taken from a far node, but even in this case the performance
difference wasn't huge.

From reading the NUMA Locality docs, I can imagine that NICs already get
the right NUMA node from the beginning.
https://docs.kernel.org/6.1/admin-guide/mm/numaperf.html
>
> > > > vzalloc_node() doesn't do fallback, but vzalloc will find the right node
> > > > for you.
> > >
> > > Sounds like we may want a vzalloc_node_with_fallback or some GFP flag?
> > > All the _node() helpers which don't fall back lead to unpleasant code
> > > in the users.
> >
> > I would challenge the whole idea of having *_node() allocations in
> > driver code in the first place. Even in RDMA, where we are super
> > focused on performance and allocation of memory in the right place is
> > super critical, we rely on general kzalloc().
> >
> > There is one exception in the RDMA world (hfi1), but it is more because
> > of legacy implementation and not because of a specific need; at least
> > the Intel folks didn't succeed in convincing me with real data.
>
> Yes, but RDMA is much more heavy on the application side, much more
> tightly integrated in general.
Yes and no, we have a vast number of in-kernel RDMA users (NVMe, RDS, NFS,
etc.) who care about performance.
Thanks
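
For readers following the vzalloc_node() fallback question above, here is a
minimal userspace sketch of the "try the device-local node, then fall back
to any node" pattern. It is a model only, not kernel code and not the ionic
or ENA implementation. All names (model_zalloc_node(), model_zalloc(),
model_zalloc_node_with_fallback(), model_node_has_memory) are hypothetical
stand-ins, and it assumes, as stated in the thread, that the node-specific
call fails instead of trying another node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NR_NODES 2

/* Hypothetical model state: node 0 has free memory, node 1 is exhausted. */
static bool model_node_has_memory[MODEL_NR_NODES] = { true, false };

/*
 * Stand-in for a strict node-local allocator. By assumption it returns
 * NULL instead of trying another node.
 */
static void *model_zalloc_node(size_t size, int node)
{
	if (node < 0 || node >= MODEL_NR_NODES || !model_node_has_memory[node])
		return NULL;
	return calloc(1, size);
}

/* Stand-in for an allocator that lets the system pick any node. */
static void *model_zalloc(size_t size)
{
	return calloc(1, size);
}

/*
 * Prefer the requested node and fall back to "any node" on failure.
 * *was_local reports whether the preferred node satisfied the request.
 */
static void *model_zalloc_node_with_fallback(size_t size, int node,
					     bool *was_local)
{
	void *p;

	assert(size > 0);

	p = model_zalloc_node(size, node);
	*was_local = (p != NULL);
	if (!p)
		p = model_zalloc(size);
	return p;
}
```

In this model a request for node 0 is served locally, while a request for
node 1 still succeeds but is served through the fallback path. Whether such
a fallback belongs in a driver at all, rather than in a common helper or the
default policy, is exactly the open question in this thread.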
Thread overview: 8+ messages
2023-04-07 23:36 [PATCH net] ionic: Fix allocation of q/cq info structures from device local node Brett Creeley
2023-04-09 10:52 ` Leon Romanovsky
2023-04-10 18:16 ` Brett Creeley
2023-04-11 12:47 ` Leon Romanovsky
2023-04-11 19:49 ` Jakub Kicinski
2023-04-12 16:58 ` Leon Romanovsky
2023-04-12 19:44 ` Jakub Kicinski
2023-04-13 6:43 ` Leon Romanovsky [this message]