From: Jacob Keller <jacob.e.keller@intel.com>
To: netdev@vger.kernel.org
Cc: Jacob Keller <jacob.e.keller@intel.com>
Subject: [RFC PATCH] net: limit maximum number of packets to mark with xmit_more
Date: Fri, 25 Aug 2017 08:24:49 -0700 [thread overview]
Message-ID: <20170825152449.29790-1-jacob.e.keller@intel.com> (raw)
Under some circumstances, such as with many stacked devices, it is
possible that dev_hard_start_xmit will bundle many packets together, and
mark them all with xmit_more.
Most drivers respond to xmit_more by skipping tail bumps on packet
rings, or similar behavior as long as xmit_more is set. This is
a performance win since it means drivers can avoid notifying hardware of
new packets repeat daily, and thus avoid wasting unnecessary PCIe or other
bandwidth.
This use of xmit_more comes with a trade off because bundling too many
packets can increase latency of the Tx packets. To avoid this, we should
limit the maximum number of packets with xmit_more.
Driver authors could modify their drivers to check for some determined
limit, but this requires all drivers to be modified in order to gain
advantage.
Instead, add a sysctl "xmit_more_max" which can be used to configure the
maximum number of xmit_more skbs to send in a sequence. This ensures
that all drivers benefit, and allows system administrators the option to
tune the value to their environment.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
Stray thoughts and further questions....
Is this the right approach? Did I miss any other places where we should
limit? Does the limit make sense? Should it instead be a per-device
tuning nob instead of a global? Is 32 a good default?
Documentation/sysctl/net.txt | 6 ++++++
include/linux/netdevice.h | 2 ++
net/core/dev.c | 10 +++++++++-
net/core/sysctl_net_core.c | 7 +++++++
4 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
index b67044a2575f..3d995e8f4448 100644
--- a/Documentation/sysctl/net.txt
+++ b/Documentation/sysctl/net.txt
@@ -230,6 +230,12 @@ netdev_max_backlog
Maximum number of packets, queued on the INPUT side, when the interface
receives packets faster than kernel can process them.
+xmit_more_max
+-------------
+
+Maximum number of packets in a row to mark with skb->xmit_more. A value of zero
+indicates no limit.
+
netdev_rss_key
--------------
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c5475b37a631..6341452aed09 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3321,6 +3321,8 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev);
extern int netdev_budget;
extern unsigned int netdev_budget_usecs;
+extern unsigned int sysctl_xmit_more_max;
+
/* Called by rtnetlink.c:rtnl_unlock() */
void netdev_run_todo(void);
diff --git a/net/core/dev.c b/net/core/dev.c
index 270b54754821..d9946d29c3a5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2983,12 +2983,19 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *first, struct net_device *de
{
struct sk_buff *skb = first;
int rc = NETDEV_TX_OK;
+ int xmit_count = 0;
+ bool more = true;
while (skb) {
struct sk_buff *next = skb->next;
+ if (sysctl_xmit_more_max)
+ more = xmit_count++ < sysctl_xmit_more_max;
+ if (!more)
+ xmit_count = 0;
+
skb->next = NULL;
- rc = xmit_one(skb, dev, txq, next != NULL);
+ rc = xmit_one(skb, dev, txq, more && next != NULL);
if (unlikely(!dev_xmit_complete(rc))) {
skb->next = next;
goto out;
@@ -3523,6 +3530,7 @@ EXPORT_SYMBOL(netdev_max_backlog);
int netdev_tstamp_prequeue __read_mostly = 1;
int netdev_budget __read_mostly = 300;
unsigned int __read_mostly netdev_budget_usecs = 2000;
+unsigned int __read_mostly sysctl_xmit_more_max = 32;
int weight_p __read_mostly = 64; /* old backlog weight */
int dev_weight_rx_bias __read_mostly = 1; /* bias for backlog weight */
int dev_weight_tx_bias __read_mostly = 1; /* bias for output_queue quota */
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index b7cd9aafe99e..6950e702e101 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -460,6 +460,13 @@ static struct ctl_table net_core_table[] = {
.proc_handler = proc_dointvec_minmax,
.extra1 = &zero,
},
+ {
+ .procname = "xmit_more_max",
+ .data = &sysctl_xmit_more_max,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .prox_handler = proc_dointvec
+ },
{ }
};
--
2.14.1.436.g33e61a4f0239
next reply other threads:[~2017-08-25 15:24 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-25 15:24 Jacob Keller [this message]
2017-08-25 15:36 ` [RFC PATCH] net: limit maximum number of packets to mark with xmit_more Waskiewicz Jr, Peter
2017-08-25 15:58 ` Stephen Hemminger
2017-08-25 16:24 ` Keller, Jacob E
2017-08-25 22:33 ` Alexander Duyck
2017-08-28 20:46 ` Keller, Jacob E
2017-08-25 19:34 ` Jakub Kicinski
2017-08-28 20:56 ` Keller, Jacob E
2017-08-29 13:35 ` David Laight
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170825152449.29790-1-jacob.e.keller@intel.com \
--to=jacob.e.keller@intel.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox