From: Randy Dunlap <rdunlap@xenotime.net>
To: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: netdev@vger.kernel.org, Ben Greear <greearb@candelatech.com>,
	Stephen Hemminger <shemminger@vyatta.com>,
	Ben Hutchings <bhutchings@solarflare.com>,
	Donald Skidmore <donald.c.skidmore@intel.com>,
	Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH] net: Add documentation for netdev features handling
Date: Tue, 12 Jul 2011 12:19:02 -0700
Message-ID: <20110712121902.2c8b98cf.rdunlap@xenotime.net>
In-Reply-To: <f95d92bfa1d8760db6ae0dbff67bb0e85eb77a2d.1310496866.git.mirq-linux@rere.qmqm.pl>

On Tue, 12 Jul 2011 21:01:30 +0200 (CEST) Michał Mirosław wrote:

> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> ---
> 
> Please comment if something is unclear!
> Apply otherwise. ;)
> 
> ---
>  Documentation/networking/netdev-features.txt |  155 ++++++++++++++++++++++++++
>  1 files changed, 155 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/networking/netdev-features.txt b/Documentation/networking/netdev-features.txt
> new file mode 100644
> index 0000000..9c209e6
> --- /dev/null
> +++ b/Documentation/networking/netdev-features.txt
> @@ -0,0 +1,155 @@
> +Netdev features mess and how to get out from it alive
> +=====================================================
> +
> +Author:
> +	Michał Mirosław <mirq-linux@rere.qmqm.pl>
> +
> +
> +
> + Part I: Feature sets
> +======================
> +
> +Long gone are days, when a network card would just take and give packets

   Long gone are the days when

> +verbatim.  Todays devices add multiple features and bugs (read: offloads)

              Today's

> +that relieves OS of various tasks like generating and checking checksums,

        relieve an OS

> +splitting packets, classifying them.  Those capabilities and their state
> +is commonly referred to as netdev features in Linux kernel world.

   are

> +
> +There are currently three sets of features relevant to the driver, and
> +one used internally by network core:
> +
> + 1. netdev->hw_features set contains features whose state may possibly
> +    be changed (enabled or disabled) for a particular device by user's
> +    request.  This set should be initialized in ndo_init callback and not
> +    changed later.
> +
> + 2. netdev->features set contains features which are currently enabled
> +    for a device.  This should be changed only by network core or in
> +    error paths of ndo_set_features callback.
> +
> + 3. netdev->vlan_features set contains features whose state is inherited
> +    by child VLAN devices (limits netdev->features set).  This is currently
> +    used for all VLAN devices whether tags are stripped or inserted in
> +    hardware or software.
> +
> + 4. netdev->wanted_features set contains feature set requested by user.
> +    This set is filtered by ndo_fix_features callback whenever it or
> +    some device-specific conditions change. This set is internal to
> +    networking core and should not be referenced in drivers.
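
The "limits netdev->features set" wording in item 3 can be read as a
simple mask. A minimal userspace sketch of that reading follows; it is
not part of the patch. The model_* names and the bit values are
hypothetical, and the real VLAN code may apply further restrictions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: model_* names and bit values are hypothetical. */
typedef uint32_t feat_t;

#define MODEL_F_SG      (1u << 0)
#define MODEL_F_IP_CSUM (1u << 1)
#define MODEL_F_TSO     (1u << 2)

struct model_dev {
	feat_t features;      /* currently enabled on the parent */
	feat_t vlan_features; /* features child VLAN devices may inherit */
};

/*
 * Assumed reading of item 3: a child VLAN device gets only those
 * features that are both enabled on the parent and listed in the
 * parent's vlan_features.
 */
feat_t model_vlan_child_features(const struct model_dev *parent)
{
	assert(parent != NULL);
	return parent->features & parent->vlan_features;
}
```

With features = SG | TSO and vlan_features = SG | IP_CSUM on the parent,
the child in this model ends up with SG only.
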
> +
> +
> +
> + Part II: Controlling enabled features
> +=======================================
> +
> +When current feature set (netdev->features) is to be changed, new set
> +is calculated and filtered by calling ndo_fix_features callback
> +and netdev_fix_features(). If the resulting set differs from current
> +set, it is passed to ndo_set_features callback and (if the callback
> +returns success) replaces value stored in netdev->features.
> +NETDEV_FEAT_CHANGE notification is issued after that whenever current
> +set might have changed.
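
The recalculation described in this paragraph can be sketched as a small
userspace model. This is an illustration under stated assumptions, not
the networking core's code: the model_* names, the bit values, and the
return convention of model_update_features() are hypothetical, the
core-side filtering done by netdev_fix_features() is left out, and the
NETDEV_FEAT_CHANGE notification is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: model_* names and bit values are hypothetical. */
typedef uint32_t feat_t;

#define MODEL_F_SG     (1u << 0)
#define MODEL_F_TSO    (1u << 1)
#define MODEL_F_RXCSUM (1u << 2)

struct model_dev {
	feat_t hw_features;     /* user-changeable features */
	feat_t features;        /* currently enabled features */
	feat_t wanted_features; /* what the user asked for */
	feat_t (*fix_features)(const struct model_dev *dev, feat_t features);
	int (*set_features)(struct model_dev *dev, feat_t features);
};

/* Example fix callback: drop TSO when SG is not requested. */
feat_t model_demo_fix(const struct model_dev *dev, feat_t features)
{
	(void)dev;
	if (!(features & MODEL_F_SG))
		features &= ~MODEL_F_TSO;
	return features;
}

/* Example set callback: pretend the hardware accepted everything. */
int model_demo_set(struct model_dev *dev, feat_t features)
{
	(void)dev;
	(void)features;
	return 0;
}

/*
 * Assumed convention for this sketch: returns 1 if a change was
 * attempted, 0 if nothing needed to change, <0 on a callback error.
 */
int model_update_features(struct model_dev *dev)
{
	feat_t features;
	int err = 0;

	assert(dev != NULL);

	/* Bits outside hw_features keep their state; the rest follow the user. */
	features = (dev->features & ~dev->hw_features) |
		   (dev->wanted_features & dev->hw_features);

	/* A missing callback is treated as always succeeding. */
	if (dev->fix_features)
		features = dev->fix_features(dev, features);

	if (features == dev->features)
		return 0;

	if (dev->set_features)
		err = dev->set_features(dev, features);
	if (err < 0)
		return err;

	/* Only a zero return replaces the stored set. */
	if (err == 0)
		dev->features = features;
	return 1;
}
```

In this model, a request for TSO without SG is filtered back to the
current set and no set callback runs; asking for both enables both.
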
> +
> +Following events trigger recalculation:

   The following events ...

> + 1. device's registration, after ndo_init returned success
> + 2. user requested changes in features state
> + 3. netdev_update_features() is called
> +
> +ndo_*_features callbacks are called with rtnl_lock held. Missing callbacks
> +are treated as always returning success.
> +
> +Driver wanting to trigger recalculation must do so by calling

   A driver that wants to trigger ...

> +netdev_update_features() while holding rtnl_lock. This should not be done
> +from ndo_*_features callbacks. netdev->features should not be modified by
> +driver except by means of ndo_fix_features callback.
> +
> +
> +
> + Part III: Implementation hints
> +================================
> +
> + * ndo_fix_features:
> +
> +All dependencies between features should be resolved here. The resulting
> +set can be reduced further by networking core imposed limitations (as coded
> +in netdev_fix_features()). For this reason its safer to disable a feature

                                              it is

> +when its dependencies are not met instead of forcing the dependency on.
> +
> +This callback should not modify hardware nor driver state (should be
> +stateless).  It can be called multiple times between successive
> +ndo_set_features calls.
> +
> +Callback must not alter features contained in NETIF_F_SOFT_FEATURES or
> +NETIF_F_NEVER_CHANGE sets. The exception is NETIF_F_VLAN_CHALLENGED but
> +care must be taken as the change won't affect already configured VLANs.
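
A stateless fix callback of the kind described above might look roughly
like the following userspace sketch. The device constraints (no RX
checksumming above a 1500-byte MTU, TSO depending on TX checksumming)
are invented for illustration, as are the model_* names and bit values;
they do not describe any real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: model_* names and bit values are hypothetical. */
typedef uint32_t feat_t;

#define MODEL_F_SG      (1u << 0)
#define MODEL_F_IP_CSUM (1u << 1)
#define MODEL_F_TSO     (1u << 2)
#define MODEL_F_RXCSUM  (1u << 3)

struct model_dev {
	unsigned int mtu;
};

/*
 * Pure function of (device state, requested set): it reads dev but
 * never writes to it, so it can be called any number of times.
 */
feat_t model_fix_features(const struct model_dev *dev, feat_t features)
{
	assert(dev != NULL);

	/* Invented hardware limit: no RX checksumming on large frames. */
	if (dev->mtu > 1500)
		features &= ~MODEL_F_RXCSUM;

	/*
	 * Invented dependency: TSO needs TX checksumming. The dependent
	 * feature is dropped instead of forcing the dependency on.
	 */
	if (!(features & MODEL_F_IP_CSUM))
		features &= ~MODEL_F_TSO;

	return features;
}
```

Dropping TSO here, instead of turning IP_CSUM on, follows the advice in
the quoted text: a later filter may still remove the dependency, which
would leave a forced-on combination inconsistent.
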
> +
> + * ndo_set_features:
> +
> +Hardware should be reconfigured to match passed feature set. The should not

                                                                The <what> should not

> +be altered unless some error condition happens that can't be reliably
> +detected in ndo_fix_features. In this case, the callback should update
> +netdev->features to match resulting hardware state. Errors returned are
> +not (and cannot be) propagated anywhere except dmesg. (Note: successful
> +return is zero, >0 is silent error.) 
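
The error path described here can be sketched in userspace as follows.
The fake "hardware" (hw_state, hw_broken) and all model_* names are
hypothetical; the sketch only illustrates the rule that the callback
leaves the features field alone on success and rewrites it to the real
hardware state when it has to report a silent (>0) error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: model_* names and bit values are hypothetical. */
typedef uint32_t feat_t;

#define MODEL_F_SG  (1u << 0)
#define MODEL_F_TSO (1u << 1)

struct model_dev {
	feat_t features;  /* what the core believes is enabled */
	feat_t hw_state;  /* stand-in for the hardware configuration */
	feat_t hw_broken; /* bits the fake hardware refuses to toggle */
};

/* Returns 0 on success, >0 if the hardware ended up in another state. */
int model_set_features(struct model_dev *dev, feat_t features)
{
	assert(dev != NULL);

	/* "Program" the hardware; broken bits keep their old value. */
	dev->hw_state = (dev->hw_state & dev->hw_broken) |
			(features & ~dev->hw_broken);

	if (dev->hw_state != features) {
		/* Error path: make the stored set match the hardware. */
		dev->features = dev->hw_state;
		return 1;
	}

	/* Success: the caller stores the new set, not this callback. */
	return 0;
}
```
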
> +
> +
> +
> + Part IV: Features
> +===================
> +
> +For current list of features, see include/linux/netdev_features.h.
> +This section describes semantics of some of them.
> +
> + * Transmit checksumming
> +
> +For complete description, see comments near the top of include/linux/skbuff.h.
> +
> +Note: NETIF_F_HW_CSUM is a superset of NETIF_F_IP_CSUM + NETIF_F_IPV6_CSUM.
> +It means that device can fill TCP/UDP-like checksum anywhere in the packets
> +whatever headers there might be.
> +
> + * Transmit TCP segmentation offload
> +
> +NETIF_F_TSO_ECN means that hardware can properly split packets with CWR bit
> +set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).
> +
> + * Transmit DMA from high memory
> +
> +On platforms where this is relevant, NETIF_F_HIGHDMA signals that
> +ndo_start_xmit can handle skbs with frags in high memory.
> +
> + * Transmit scatter-gather
> +
> +Those features say that ndo_start_xmit can handle fragmented skbs:
> +NETIF_F_SG --- paged skbs (skb_shinfo()->frags), NETIF_F_FRAGLIST ---
> +chained skbs (skb->next/prev list).
> +
> + * Software features
> +
> +Features contained in NETIF_F_SOFT_FEATURES are a features of networking

                                                   ^ drop "a"

> +stack. Driver should not change behaviour based on them.
> +
> + * LLTX driver (deprecated for hardware drivers)
> +
> +NETIF_F_LLTX should be set in drivers that implement their own locking in
> +transmit path or don't need locking at all (e.g. software tunnels).
> +In ndo_start_xmit, it is recommended to use a try_lock and return
> +NETDEV_TX_LOCKED when the spin lock fails.  The locking should also properly
> +protect against other callbacks (the rules you need to find out).
> +
> +Don't use it for new drivers.
> +
> + * netns-local device
> +
> +NETIF_F_NETNS_LOCAL is set for devices that are not allowed to move between
> +network namespaces (e.g. loopback).
> +
> +Don't use it in drivers.
> +
> + * VLAN challenged
> +
> +NETIF_F_VLAN_CHALLENGED should be set for devices which can't cope with VLAN
> +headers. Some drivers set this because the cards can't handle the bigger MTU.
> +[FIXME: Those cases could be fixed in VLAN code by allowing only reduced-MTU
> +VLANs. This may be not usefull, though.]

                          useful

> +
> -- 


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

