From: Stephen Hemminger <shemminger@vyatta.com>
To: "Michał Mirosław" <mirqus@gmail.com>
Cc: Tom Herbert <therbert@google.com>,
	davem@davemloft.net, netdev@vger.kernel.org
Subject: Re: [PATCH v2] net: Allow no-cache copy from user on transmit
Date: Wed, 23 Mar 2011 12:25:44 -0700
Message-ID: <20110323122544.032fb543@nehalam>
In-Reply-To: <AANLkTi=Tg4jW7hZZ43E71GNi+d8JWMJQcv3ExZA6zJQW@mail.gmail.com>

On Wed, 23 Mar 2011 19:42:20 +0100
Michał Mirosław <mirqus@gmail.com> wrote:

> 2011/3/23 Tom Herbert <therbert@google.com>:
> > This patch uses __copy_from_user_nocache (from skb_copy_to_page)
> > on transmit to bypass data cache for a performance improvement.
> > This functionality is configurable per device using ethtool, the
> > device must also be doing TX csum offload to enable.  It seems
> > reasonable to set this when the netdevice does not copy or
> > otherwise touch the data.
> >
> > This patch was tested using 200 instances of netperf TCP_RR with
> > 1400 byte request and one byte reply.  Platform is 16 core AMD x86.
> >
> > No-cache copy disabled:
> >   672703 tps, 97.13% utilization
> >   50/90/99% latency 244.31 484.205 1028.41
> >
> > No-cache copy enabled:
> >   702113 tps, 96.16% utilization,
> >   50/90/99% latency 238.56 467.56 956.955
> >
> > Using 14000 byte request and response sizes demonstrates the
> > effects more dramatically:
> >
> > No-cache copy disabled:
> >   79571 tps, 34.34% utilization
> >   50/90/95% latency 1584.46 2319.59 5001.76
> >
> > No-cache copy enabled:
> >   83856 tps, 34.81% utilization
> >   50/90/95% latency 2508.42 2622.62 2735.88
> >
> > Note especially the effect on tail latency (95th percentile).
> >
> > This seems to provide a nice performance improvement and is
> > consistent in the tests I ran.  Presumably, this would provide
> > the greatest benefits in the presence of an application workload
> > stressing the cache and a lot of transmit data happening.  I don't
> > yet see a downside to using this.
> >
> > Signed-off-by: Tom Herbert <therbert@google.com>
> > ---
> >  include/linux/netdevice.h |   10 ++++++++--
> >  include/net/sock.h        |    5 +++++
> >  net/core/dev.c            |    2 +-
> >  net/core/ethtool.c        |    2 +-
> >  4 files changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 5eeb2cd..52d444f 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -1066,6 +1066,7 @@ struct net_device {
> >  #define NETIF_F_NTUPLE         (1 << 27) /* N-tuple filters supported */
> >  #define NETIF_F_RXHASH         (1 << 28) /* Receive hashing offload */
> >  #define NETIF_F_RXCSUM         (1 << 29) /* Receive checksumming offload */
> > +#define NETIF_F_NOCACHE_COPY   (1 << 30) /* Use no-cache copyfromuser */
> >
> >        /* Segmentation offload features */
> >  #define NETIF_F_GSO_SHIFT      16
> > @@ -1081,7 +1082,7 @@ struct net_device {
> >        /* = all defined minus driver/device-class-related */
> >  #define NETIF_F_NEVER_CHANGE   (NETIF_F_HIGHDMA | NETIF_F_VLAN_CHALLENGED | \
> >                                  NETIF_F_LLTX | NETIF_F_NETNS_LOCAL)
> > -#define NETIF_F_ETHTOOL_BITS   (0x3f3fffff & ~NETIF_F_NEVER_CHANGE)
> > +#define NETIF_F_ETHTOOL_BITS   (0x7f3fffff & ~NETIF_F_NEVER_CHANGE)
> >
> >        /* List of features with software fallbacks. */
> >  #define NETIF_F_GSO_SOFTWARE   (NETIF_F_TSO | NETIF_F_TSO_ECN | \
> > @@ -1108,7 +1109,12 @@ struct net_device {
> >                                 NETIF_F_FRAGLIST)
> >
> >        /* changeable features with no special hardware requirements */
> > -#define NETIF_F_SOFT_FEATURES  (NETIF_F_GSO | NETIF_F_GRO)
> > +#define NETIF_F_SOFT_FEATURES  (NETIF_F_GSO | NETIF_F_GRO |    \
> > +                                NETIF_F_NOCACHE_COPY)
> > +
> > +       /* soft features automatically enabled */
> > +#define NETIF_F_SOFT_FEAT_ENAB (NETIF_F_GSO | NETIF_F_GRO)
> > +
> >
> >        /* Interface index. Unique device identifier    */
> >        int                     ifindex;
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index da0534d..74ce586 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -1401,6 +1401,11 @@ static inline int skb_copy_to_page(struct sock *sk, char __user *from,
> >                if (err)
> >                        return err;
> >                skb->csum = csum_block_add(skb->csum, csum, skb->len);
> > +       } else if (sk->sk_route_caps & NETIF_F_NOCACHE_COPY) {
> > +               if (!access_ok(VERIFY_READ, from, copy) ||
> > +                   __copy_from_user_nocache(page_address(page) + off,
> > +                                               from, copy))
> > +                       return -EFAULT;
> >        } else if (copy_from_user(page_address(page) + off, from, copy))
> >                return -EFAULT;
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 0b88eba..c3ed95e 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -5435,7 +5435,7 @@ int register_netdevice(struct net_device *dev)
> >         * software offloads (GSO and GRO).
> >         */
> >        dev->hw_features |= NETIF_F_SOFT_FEATURES;
> > -       dev->features |= NETIF_F_SOFT_FEATURES;
> > +       dev->features |= NETIF_F_SOFT_FEAT_ENAB;
> >        dev->wanted_features = dev->features & dev->hw_features;
> >
> >        /* Avoid warning from netdev_fix_features() for GSO without SG */
> > diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> > index c1a71bb..40b6fe0 100644
> > --- a/net/core/ethtool.c
> > +++ b/net/core/ethtool.c
> > @@ -344,7 +344,7 @@ static const char netdev_features_strings[ETHTOOL_DEV_FEATURE_WORDS * 32][ETH_GS
> >        /* NETIF_F_NTUPLE */          "rx-ntuple-filter",
> >        /* NETIF_F_RXHASH */          "rx-hashing",
> >        /* NETIF_F_RXCSUM */          "rx-checksum",
> > -       "",
> > +       /* NETIF_F_NOCACHE_COPY */    "tx-nocache-copy",
> >        "",
> >  };
> 
> I would rather see it enabled by default, including "hacks" in
> register_netdev() like for GSO. Otherwise not many people will test
> this. There should also be constraints for it in
> netdev_fix_features().
> 
> BTW, what happens if this is used on, e.g., a bridge device or veth and
> the packet later ends up going to a device which needs to do checksumming
> in software?

The configuration via device and ethtool seems problematic for general use
in a distro. Nice for testing, but not really matching the architecture
issues.

Isn't nocache DMA a function of the I/O architecture, not a function
of the device driver? Shouldn't it be handled at PCI level somehow
with considerations of CPU arch and quirks? Doesn't it make sense for
non-network traffic as well?
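
For readers unfamiliar with what a no-cache copy does at the CPU level, the
following is a rough userspace sketch, assuming an x86 CPU with SSE2. It is
not the kernel's __copy_from_user_nocache; copy_nocache() is a hypothetical
helper that uses non-temporal stores where the compiler advertises SSE2 and
falls back to plain memcpy() elsewhere. That fallback is one way to see why
the behaviour is tied to the CPU architecture rather than to any one driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper: copy len bytes from src to dst. With SSE2, the
 * 16-byte aligned middle of dst is written with non-temporal stores,
 * which hint the CPU not to keep the written lines in the data cache.
 * Without SSE2 the whole copy is an ordinary memcpy().
 */
static void copy_nocache(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
	/* Copy leading bytes normally until d is 16-byte aligned. */
	size_t head = (size_t)(-(uintptr_t)d & 15u);
	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	/* Unaligned load from the source, non-temporal store to the target. */
	while (len >= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);
		_mm_stream_si128((__m128i *)d, v);
		d += 16;
		s += 16;
		len -= 16;
	}
	/* Order the streaming stores before any later stores. */
	_mm_sfence();
#endif
	/* Remaining tail (or everything, without SSE2) goes through memcpy. */
	memcpy(d, s, len);
}
```

The result in memory is the same as memcpy(); only the cache footprint of
the destination differs, and only on CPUs that honour the hint.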

Hate to hold up a good optimization while waiting for a general
solution, but committing to an API prematurely would be bad as well.



-- 


Thread overview: 5+ messages
2011-03-23 17:10 [PATCH v2] net: Allow no-cache copy from user on transmit Tom Herbert
2011-03-23 18:42 ` Michał Mirosław
2011-03-23 19:25   ` Stephen Hemminger [this message]
2011-03-23 19:48     ` Tom Herbert
2011-03-23 18:51 ` Rick Jones
