From: Neil Horman <nhorman@tuxdriver.com>
To: netdev@vger.kernel.org
Cc: nhorman@tuxdriver.com
Subject: [RFC] Idea about increasing efficiency of skb allocation in network devices
Date: Sun, 26 Jul 2009 20:36:09 -0400
Message-ID: <20090727003609.GA30438@localhost.localdomain>
Hey all-
I've been thinking about an idea lately, and I'm starting to tinker with an
implementation, so before I go too far down any one path, I'd like to
solicit comments on it, to avoid early design errors and the like.
Please find my proposal below. Feel free to openly ridicule it if you think it's
completely off base or pointless. Any and all criticism welcome. Thanks!
Problem Statement:
Currently the networking stack receive path consists of a set of
producers (the network drivers, which allocate skbs to receive on-the-wire data
into), and a set of consumers (user space applications and other networking
devices, which free those skbs when their use is finished). These consumers and
producers are dynamic (additional consumers and producers can be added almost
at will within the system). Currently, there exists a potential inefficiency
in this receive path when using NUMA systems. Given that allocation of skb data
buffers is done with only minimal regard to the NUMA node on which a producer
exists (following standard vm policy, in which we try to allocate on the local
node first), it is entirely possible that a consumer of this frame data will
exist on a different NUMA node than the node on which it was allocated. This
disparity leads to slower copying when an application attempts to copy this data
from the kernel, as it must cross a greater number of memory bridges.
Proposed solution:
Since network devices DMA their data into a provided DMA buffer (which
can usually be at an arbitrary location, as they must potentially cross several
PCI buses to reach any memory location), I'm postulating that it would increase
our receive path efficiency to provide a hint to the driver layer as to which
node to allocate an skb data buffer on. This hint would be determined by a
feedback mechanism. I was thinking that we could provide a callback function
via the skb that accepts the skb and the originating net_device. This
callback can track statistics on which NUMA nodes consume (read: copy data from)
skbs that were produced by specific net devices. Then, when in the future that
net device allocates a new skb (perhaps via netdev_alloc_skb), we can use that
statistical profile to determine if the data buffer should be allocated on the
local node, or on a remote node instead. Ideally, this 'consumer based
allocation bias' would allow us to reduce the amount of time it takes to
transfer received buffers to user space and make the overall receive path more
efficient. I see lots of opportunity here to develop tools to measure the
speedup this might provide (perhaps via ftrace plugins), as well as various
algorithms to better predict how to allocate skbs on various nodes.
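To make the feedback mechanism a bit more concrete, here is a minimal userspace
sketch of the bookkeeping involved. It is not kernel code and not a proposed
patch. Every name in it (rx_node_profile, rx_profile_note_consume,
rx_profile_pick_node, MAX_NUMA_NODES) is hypothetical, and the min_samples and
bias_pct knobs are assumptions made purely for illustration. The idea is only
that a per-device counter array is bumped when a consumer copies data out, and
the allocation side consults it before choosing a node.

```c
#include <assert.h>

#define MAX_NUMA_NODES 8	/* hypothetical bound, for this sketch only */

/* Hypothetical per-net-device profile of which nodes consumed its buffers. */
struct rx_node_profile {
	unsigned long consumed[MAX_NUMA_NODES];
	unsigned long total;
};

/*
 * Stand-in for the proposed callback: called when a consumer running on
 * consumer_node copies data out of a buffer this device produced.
 */
void rx_profile_note_consume(struct rx_node_profile *p, int consumer_node)
{
	assert(consumer_node >= 0 && consumer_node < MAX_NUMA_NODES);
	p->consumed[consumer_node]++;
	p->total++;
}

/*
 * Choose the node for the next data buffer. Stay on local_node unless
 * at least min_samples consumptions have been recorded and one remote
 * node accounts for at least bias_pct percent of them.
 */
int rx_profile_pick_node(const struct rx_node_profile *p, int local_node,
			 unsigned long min_samples, unsigned int bias_pct)
{
	int best = local_node;
	int node;

	assert(local_node >= 0 && local_node < MAX_NUMA_NODES);

	if (p->total < min_samples)
		return local_node;	/* not enough data, keep the default */

	for (node = 0; node < MAX_NUMA_NODES; node++)
		if (p->consumed[node] > p->consumed[best])
			best = node;

	/* Only leave the local node if the remote node clearly dominates. */
	if (best != local_node &&
	    p->consumed[best] * 100 >= p->total * bias_pct)
		return best;

	return local_node;
}
```

In a real implementation the counters would presumably need to be per-cpu or
otherwise cheap to update, and would want some form of decay so that an old
profile does not pin allocations to a node the consumer has since left. Both
of those are left out here.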
Obviously, the code is going to do the talking here, but I wanted to get
the idea out there so that anyone who wanted to could point out anything
obvious that would lead to the conclusion that I was nuts. Feel free to tear it
all apart, or, on the off chance that this has legs, offer suggestions for
improvements/features that you might like.
Thanks!
Neil