From: Stephen Hemminger <shemminger@vyatta.com>
To: "Michał Mirosław" <mirqus@gmail.com>
Cc: Tom Herbert <therbert@google.com>,
davem@davemloft.net, netdev@vger.kernel.org
Subject: Re: [PATCH v2] net: Allow no-cache copy from user on transmit
Date: Wed, 23 Mar 2011 12:25:44 -0700
Message-ID: <20110323122544.032fb543@nehalam>
In-Reply-To: <AANLkTi=Tg4jW7hZZ43E71GNi+d8JWMJQcv3ExZA6zJQW@mail.gmail.com>
On Wed, 23 Mar 2011 19:42:20 +0100
Michał Mirosław <mirqus@gmail.com> wrote:
> 2011/3/23 Tom Herbert <therbert@google.com>:
> > This patch uses __copy_from_user_nocache (from skb_copy_to_page)
> > on transmit to bypass data cache for a performance improvement.
> > This functionality is configurable per device using ethtool, the
> > device must also be doing TX csum offload to enable. It seems
> > reasonable to set this when the netdevice does not copy or
> > otherwise touch the data.
> >
> > This patch was tested using 200 instances of netperf TCP_RR with
> > 1400 byte request and one byte reply. Platform is 16 core AMD x86.
> >
> > No-cache copy disabled:
> > 672703 tps, 97.13% utilization
> > 50/90/99% latency 244.31 484.205 1028.41
> >
> > No-cache copy enabled:
> > 702113 tps, 96.16% utilization
> > 50/90/99% latency 238.56 467.56 956.955
> >
> > Using 14000 byte request and response sizes demonstrates the
> > effects more dramatically:
> >
> > No-cache copy disabled:
> > 79571 tps, 34.34% utilization
> > 50/90/95% latency 1584.46 2319.59 5001.76
> >
> > No-cache copy enabled:
> > 83856 tps, 34.81% utilization
> > 50/90/95% latency 2508.42 2622.62 2735.88
> >
> > Note especially the effect on tail latency (95th percentile).
> >
> > This seems to provide a nice performance improvement and is
> > consistent in the tests I ran. Presumably, this would provide
> > the greatest benefits in the presence of an application workload
> > stressing the cache and a lot of transmit data happening. I don't
> > yet see a downside to using this.
> >
> > Signed-off-by: Tom Herbert <therbert@google.com>
> > ---
> > include/linux/netdevice.h | 10 ++++++++--
> > include/net/sock.h | 5 +++++
> > net/core/dev.c | 2 +-
> > net/core/ethtool.c | 2 +-
> > 4 files changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 5eeb2cd..52d444f 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -1066,6 +1066,7 @@ struct net_device {
> > #define NETIF_F_NTUPLE (1 << 27) /* N-tuple filters supported */
> > #define NETIF_F_RXHASH (1 << 28) /* Receive hashing offload */
> > #define NETIF_F_RXCSUM (1 << 29) /* Receive checksumming offload */
> > +#define NETIF_F_NOCACHE_COPY (1 << 30) /* Use no-cache copyfromuser */
> >
> > /* Segmentation offload features */
> > #define NETIF_F_GSO_SHIFT 16
> > @@ -1081,7 +1082,7 @@ struct net_device {
> > /* = all defined minus driver/device-class-related */
> > #define NETIF_F_NEVER_CHANGE (NETIF_F_HIGHDMA | NETIF_F_VLAN_CHALLENGED | \
> > NETIF_F_LLTX | NETIF_F_NETNS_LOCAL)
> > -#define NETIF_F_ETHTOOL_BITS (0x3f3fffff & ~NETIF_F_NEVER_CHANGE)
> > +#define NETIF_F_ETHTOOL_BITS (0x7f3fffff & ~NETIF_F_NEVER_CHANGE)
> >
> > /* List of features with software fallbacks. */
> > #define NETIF_F_GSO_SOFTWARE (NETIF_F_TSO | NETIF_F_TSO_ECN | \
> > @@ -1108,7 +1109,12 @@ struct net_device {
> > NETIF_F_FRAGLIST)
> >
> > /* changeable features with no special hardware requirements */
> > -#define NETIF_F_SOFT_FEATURES (NETIF_F_GSO | NETIF_F_GRO)
> > +#define NETIF_F_SOFT_FEATURES (NETIF_F_GSO | NETIF_F_GRO | \
> > + NETIF_F_NOCACHE_COPY)
> > +
> > + /* soft features automatically enabled */
> > +#define NETIF_F_SOFT_FEAT_ENAB (NETIF_F_GSO | NETIF_F_GRO)
> > +
> >
> > /* Interface index. Unique device identifier */
> > int ifindex;
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index da0534d..74ce586 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -1401,6 +1401,11 @@ static inline int skb_copy_to_page(struct sock *sk, char __user *from,
> > if (err)
> > return err;
> > skb->csum = csum_block_add(skb->csum, csum, skb->len);
> > + } else if (sk->sk_route_caps & NETIF_F_NOCACHE_COPY) {
> > + if (!access_ok(VERIFY_READ, from, copy) ||
> > + __copy_from_user_nocache(page_address(page) + off,
> > + from, copy))
> > + return -EFAULT;
> > } else if (copy_from_user(page_address(page) + off, from, copy))
> > return -EFAULT;
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 0b88eba..c3ed95e 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -5435,7 +5435,7 @@ int register_netdevice(struct net_device *dev)
> > * software offloads (GSO and GRO).
> > */
> > dev->hw_features |= NETIF_F_SOFT_FEATURES;
> > - dev->features |= NETIF_F_SOFT_FEATURES;
> > + dev->features |= NETIF_F_SOFT_FEAT_ENAB;
> > dev->wanted_features = dev->features & dev->hw_features;
> >
> > /* Avoid warning from netdev_fix_features() for GSO without SG */
> > diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> > index c1a71bb..40b6fe0 100644
> > --- a/net/core/ethtool.c
> > +++ b/net/core/ethtool.c
> > @@ -344,7 +344,7 @@ static const char netdev_features_strings[ETHTOOL_DEV_FEATURE_WORDS * 32][ETH_GS
> > /* NETIF_F_NTUPLE */ "rx-ntuple-filter",
> > /* NETIF_F_RXHASH */ "rx-hashing",
> > /* NETIF_F_RXCSUM */ "rx-checksum",
> > - "",
> > + /* NETIF_F_NOCACHE_COPY */ "tx-nocache-copy"
> > "",
> > };
>
> I would rather see it enabled by default, including "hacks" in
> register_netdev() like for GSO. Otherwise not many people will test
> this. There should also be constraints for it in
> netdev_fix_features().
>
> BTW, what happens if this is used on e.g. a bridge device or veth and
> the packet later ends up going to a device which needs to do
> checksumming in software?
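
For reference, the two points raised above (a constraint in
netdev_fix_features() and the copy path chosen in skb_copy_to_page())
can be modelled in userspace roughly as below. This is only a sketch
under stated assumptions, not kernel code: the two bit positions and
the two masks are copied from the quoted patch, while F_TX_CSUM,
sketch_fix_features(), sketch_pick_copy() and enum copy_path are
hypothetical names invented for illustration. The "needs software
checksum" flag stands in for whatever condition guards the first
branch, which the diff context does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions and masks as they appear in the quoted patch. */
#define F_RXCSUM          (1u << 29)
#define F_NOCACHE_COPY    (1u << 30)
#define ETHTOOL_MASK_OLD  0x3f3fffffu
#define ETHTOOL_MASK_NEW  0x7f3fffffu

/* Hypothetical stand-in for "device does TX checksum offload". */
#define F_TX_CSUM         (1u << 3)

/* The mask change in the patch adds exactly bit 30 and nothing else. */
static_assert((ETHTOOL_MASK_OLD | F_NOCACHE_COPY) == ETHTOOL_MASK_NEW,
              "new ethtool mask is old mask plus NOCACHE_COPY");

/*
 * Sketch of the suggested constraint: drop the no-cache flag when the
 * device cannot offload TX checksums, since the stack would then have
 * to touch the data anyway.
 */
static uint32_t sketch_fix_features(uint32_t features)
{
	if ((features & F_NOCACHE_COPY) && !(features & F_TX_CSUM))
		features &= ~F_NOCACHE_COPY;
	return features;
}

enum copy_path { COPY_AND_CSUM, COPY_NOCACHE, COPY_PLAIN };

/*
 * Mirrors the branch order in the patched skb_copy_to_page():
 * software checksumming wins, then the no-cache copy, then the
 * ordinary cached copy.
 */
static enum copy_path sketch_pick_copy(int needs_sw_csum, uint32_t route_caps)
{
	if (needs_sw_csum)
		return COPY_AND_CSUM;
	if (route_caps & F_NOCACHE_COPY)
		return COPY_NOCACHE;
	return COPY_PLAIN;
}
```

In this model, a packet that needs a software checksum never takes the
no-cache path, regardless of the feature bit. The model only covers the
copy itself, so it does not answer what happens when such a packet is
later forwarded through a bridge or veth to a device without checksum
offload.
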

Configuring this per device via ethtool seems problematic for general
use in a distro. It is nice for testing, but it does not really match
the architectural issues.

Isn't nocache DMA a function of the I/O architecture, not a function
of the device driver? Shouldn't it be handled at the PCI level somehow,
with consideration of CPU arch and quirks? Doesn't it make sense for
non-network traffic as well?

Hate to hold up a good optimization while waiting for a general
solution, but committing to an API prematurely would be bad as well.
--
Thread overview (5+ messages):
2011-03-23 17:10 [PATCH v2] net: Allow no-cache copy from user on transmit Tom Herbert
2011-03-23 18:42 ` Michał Mirosław
2011-03-23 19:25 ` Stephen Hemminger [this message]
2011-03-23 19:48 ` Tom Herbert
2011-03-23 18:51 ` Rick Jones