* [PATCH v5 net-next 4/7] net: add new control message for incoming HW-timestamped packets
From: Miroslav Lichvar @ 2017-05-18 14:07 UTC (permalink / raw)
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170518140738.19617-1-mlichvar@redhat.com>
Add SOF_TIMESTAMPING_OPT_PKTINFO option to request a new control message
for incoming packets with hardware timestamps. It contains the index of
the real interface which received the packet and the length of the
packet at layer 2.
The index is useful with bonding, bridges and other interfaces, where
IP_PKTINFO doesn't allow applications to determine which PHC made the
timestamp. With the L2 length (and link speed) it is possible to
transpose preamble timestamps to trailer timestamps, which are used in
the NTP protocol.
While this information could be provided by two new socket options
independently from timestamping, it doesn't look like they would be very
useful. With this option any performance impact is limited to hardware
timestamping.
Use dev_get_by_napi_id() to get the device and its index. On kernels
with disabled CONFIG_NET_RX_BUSY_POLL or drivers not using NAPI, a zero
index will be returned in the control message.
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
Documentation/networking/timestamping.txt | 9 +++++++++
include/uapi/asm-generic/socket.h | 2 ++
include/uapi/linux/net_tstamp.h | 11 ++++++++++-
net/socket.c | 27 ++++++++++++++++++++++++++-
4 files changed, 47 insertions(+), 2 deletions(-)
diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index 96f5069..600c6bf 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -193,6 +193,15 @@ SOF_TIMESTAMPING_OPT_STATS:
the transmit timestamps, such as how long a certain block of
data was limited by peer's receiver window.
+SOF_TIMESTAMPING_OPT_PKTINFO:
+
+ Enable the SCM_TIMESTAMPING_PKTINFO control message for incoming
+ packets with hardware timestamps. The message contains struct
+ scm_ts_pktinfo, which supplies the index of the real interface which
+ received the packet and its length at layer 2. A valid (non-zero)
+ interface index will be returned only if CONFIG_NET_RX_BUSY_POLL is
+ enabled and the driver is using NAPI.
+
New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
regardless of the setting of sysctl net.core.tstamp_allow_data.
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 2b48856..a5f6e81 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -100,4 +100,6 @@
#define SO_COOKIE 57
+#define SCM_TIMESTAMPING_PKTINFO 58
+
#endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index 0749fb1..dee74d3 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -9,6 +9,7 @@
#ifndef _NET_TIMESTAMPING_H
#define _NET_TIMESTAMPING_H
+#include <linux/types.h>
#include <linux/socket.h> /* for SO_TIMESTAMPING */
/* SO_TIMESTAMPING gets an integer bit field comprised of these values */
@@ -26,8 +27,9 @@ enum {
SOF_TIMESTAMPING_OPT_CMSG = (1<<10),
SOF_TIMESTAMPING_OPT_TSONLY = (1<<11),
SOF_TIMESTAMPING_OPT_STATS = (1<<12),
+ SOF_TIMESTAMPING_OPT_PKTINFO = (1<<13),
- SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_STATS,
+ SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_PKTINFO,
SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
SOF_TIMESTAMPING_LAST
};
@@ -130,4 +132,11 @@ enum hwtstamp_rx_filters {
HWTSTAMP_FILTER_NTP_ALL,
};
+/* SCM_TIMESTAMPING_PKTINFO control message */
+struct scm_ts_pktinfo {
+ __u32 if_index;
+ __u32 pkt_length;
+ __u32 reserved[2];
+};
+
#endif /* _NET_TIMESTAMPING_H */
diff --git a/net/socket.c b/net/socket.c
index c2564eb..67db7d8 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -662,6 +662,27 @@ static bool skb_is_err_queue(const struct sk_buff *skb)
return skb->pkt_type == PACKET_OUTGOING;
}
+static void put_ts_pktinfo(struct msghdr *msg, struct sk_buff *skb)
+{
+ struct scm_ts_pktinfo ts_pktinfo;
+ struct net_device *orig_dev;
+
+ if (!skb_mac_header_was_set(skb))
+ return;
+
+ memset(&ts_pktinfo, 0, sizeof(ts_pktinfo));
+
+ rcu_read_lock();
+ orig_dev = dev_get_by_napi_id(skb_napi_id(skb));
+ if (orig_dev)
+ ts_pktinfo.if_index = orig_dev->ifindex;
+ rcu_read_unlock();
+
+ ts_pktinfo.pkt_length = skb->len - skb_mac_offset(skb);
+ put_cmsg(msg, SOL_SOCKET, SCM_TIMESTAMPING_PKTINFO,
+ sizeof(ts_pktinfo), &ts_pktinfo);
+}
+
/*
* called from sock_recv_timestamp() if sock_flag(sk, SOCK_RCVTSTAMP)
*/
@@ -699,8 +720,12 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
empty = 0;
if (shhwtstamps &&
(sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
- ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2))
+ ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
empty = 0;
+ if ((sk->sk_tsflags & SOF_TIMESTAMPING_OPT_PKTINFO) &&
+ !skb_is_err_queue(skb))
+ put_ts_pktinfo(msg, skb);
+ }
if (!empty) {
put_cmsg(msg, SOL_SOCKET,
SCM_TIMESTAMPING, sizeof(tss), &tss);
--
2.9.3
^ permalink raw reply related
* [PATCH v5 net-next 0/7] Extend socket timestamping API
From: Miroslav Lichvar @ 2017-05-18 14:07 UTC (permalink / raw)
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
Changes v4->v5:
- fix initialization of reserved fields in struct scm_ts_pktinfo
Changes v3->v4:
- added reserved fields to struct scm_ts_pktinfo
- replaced patch fixing false SW timestamps with a documentation fix
- updated OPT_TX_SWHW patch to handle false SW timestamps
Changes v2->v3:
- modified struct scm_ts_pktinfo to use fixed-width integer types
- added WARN_ON_ONCE for missing RCU lock in dev_get_by_napi_id()
- modified dev_get_by_napi_id() to not return dev in unexpected branch
- modified recv to return SCM_TIMESTAMPING_PKTINFO even if the interface
index is unknown
Changes v1->v2:
- added separate patch for new NAPI functions
- split code from __sock_recv_timestamp() for better readability
- fixed RCU locking
- fixed compiler warning (missing case in switch in first patch)
- inline sw_tx_timestamp() in its only user
Changes RFC->v1:
- reworked SOF_TIMESTAMPING_OPT_PKTINFO patch to not add new fields to
skb shared info (net device is now looked up by napi_id), not require
any changes in drivers, and restrict the cmsg to incoming packets
- renamed SOF_TIMESTAMPING_OPT_MULTIMSG to SOF_TIMESTAMPING_OPT_TX_SWHW
and fixed its description
- moved struct scm_ts_pktinfo from errqueue.h to net_tstamp.h as it
can't be received from the error queue anymore
- improved commit descriptions and removed incorrect comment
This patchset adds new options to the timestamping API that will be
useful for NTP implementations and possibly other applications.
The first patch specifies a timestamp filter for NTP packets. The second
patch updates drivers that can timestamp all packets, or need to list
the filter as unsupported. There is no attempt to add the support to the
phyter driver.
The third patch adds two helper functions working with NAPI ID, which is
needed by the next patch. The fourth patch adds a new option to get a
new control message with the L2 length and interface index for incoming
packets with hardware timestamps.
The fifth patch fixes documentation on number of non-zero fields in
scm_timestamping and warns about false software timestamps when
SO_TIMESTAMP(NS) is combined with SCM_TIMESTAMPING.
The sixth patch adds a new option to request both software and hardware
timestamps for outgoing packets. The seventh patch updates drivers that
assumed software timestamping cannot be used together with hardware
timestamping.
The patches have been tested on x86_64 machines with igb and e1000e
drivers.
Miroslav Lichvar (7):
net: define receive timestamp filter for NTP
net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALL
net: add function to retrieve original skb device using NAPI ID
net: add new control message for incoming HW-timestamped packets
net: fix documentation of struct scm_timestamping
net: allow simultaneous SW and HW transmit timestamping
net: ethernet: update drivers to make both SW and HW TX timestamps
Documentation/networking/timestamping.txt | 25 ++++++++++-
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 4 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 1 +
drivers/net/ethernet/cavium/liquidio/lio_main.c | 1 +
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 1 +
drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 1 +
drivers/net/ethernet/intel/e1000e/netdev.c | 5 ++-
drivers/net/ethernet/intel/i40e/i40e_ptp.c | 1 +
drivers/net/ethernet/intel/igb/igb_ptp.c | 1 +
drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c | 1 +
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_clock.c | 1 +
drivers/net/ethernet/neterion/vxge/vxge-main.c | 1 +
drivers/net/ethernet/qlogic/qede/qede_ptp.c | 1 +
drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c | 3 +-
drivers/net/ethernet/sfc/ef10.c | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 ++-
drivers/net/ethernet/ti/cpsw.c | 1 +
drivers/net/ethernet/tile/tilegx.c | 1 +
include/linux/netdevice.h | 1 +
include/linux/skbuff.h | 19 ++++----
include/uapi/asm-generic/socket.h | 2 +
include/uapi/linux/net_tstamp.h | 15 ++++++-
net/core/dev.c | 26 +++++++++++
net/core/dev_ioctl.c | 1 +
net/core/skbuff.c | 4 ++
net/socket.c | 50 ++++++++++++++++++++--
27 files changed, 153 insertions(+), 23 deletions(-)
--
2.9.3
^ permalink raw reply
* [PATCH v5 net-next 7/7] net: ethernet: update drivers to make both SW and HW TX timestamps
From: Miroslav Lichvar @ 2017-05-18 14:07 UTC (permalink / raw)
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170518140738.19617-1-mlichvar@redhat.com>
Some drivers were calling the skb_tx_timestamp() function only when
a hardware timestamp was not requested. Now that applications can use
the SOF_TIMESTAMPING_OPT_TX_SWHW option to request both software and
hardware timestamps, the drivers need to be modified to unconditionally
call skb_tx_timestamp().
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 3 +--
drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c | 3 +--
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 ++----
4 files changed, 6 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index 89b21d7..5a2ad9c 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1391,8 +1391,7 @@ static void xgbe_prep_tx_tstamp(struct xgbe_prv_data *pdata,
spin_unlock_irqrestore(&pdata->tstamp_lock, flags);
}
- if (!XGMAC_GET_BITS(packet->attributes, TX_PACKET_ATTRIBUTES, PTP))
- skb_tx_timestamp(skb);
+ skb_tx_timestamp(skb);
}
static void xgbe_prep_vlan(struct sk_buff *skb, struct xgbe_packet_data *packet)
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 0ff9295..6ed3bc4 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5868,10 +5868,10 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
adapter->tx_hwtstamp_skb = skb_get(skb);
adapter->tx_hwtstamp_start = jiffies;
schedule_work(&adapter->tx_hwtstamp_work);
- } else {
- skb_tx_timestamp(skb);
}
+ skb_tx_timestamp(skb);
+
netdev_sent_queue(netdev, skb->len);
e1000_tx_queue(tx_ring, tx_flags, count);
/* Make sure there is space in the ring for the next send. */
diff --git a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
index 1e59435..89831ad 100644
--- a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
+++ b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
@@ -1418,8 +1418,7 @@ static netdev_tx_t sxgbe_xmit(struct sk_buff *skb, struct net_device *dev)
priv->hw->desc->tx_enable_tstamp(first_desc);
}
- if (!tqueue->hwts_tx_en)
- skb_tx_timestamp(skb);
+ skb_tx_timestamp(skb);
priv->hw->dma->enable_dma_transmission(priv->ioaddr, txq_index);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index cce862b..27c12e7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2880,8 +2880,7 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
priv->xstats.tx_set_ic_bit++;
}
- if (!priv->hwts_tx_en)
- skb_tx_timestamp(skb);
+ skb_tx_timestamp(skb);
if (unlikely((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
priv->hwts_tx_en)) {
@@ -3084,8 +3083,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
priv->xstats.tx_set_ic_bit++;
}
- if (!priv->hwts_tx_en)
- skb_tx_timestamp(skb);
+ skb_tx_timestamp(skb);
/* Ready to fill the first descriptor and set the OWN bit w/o any
* problems because all the descriptors are actually ready to be
--
2.9.3
^ permalink raw reply related
* [PATCH v5 net-next 6/7] net: allow simultaneous SW and HW transmit timestamping
From: Miroslav Lichvar @ 2017-05-18 14:07 UTC (permalink / raw)
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170518140738.19617-1-mlichvar@redhat.com>
Add SOF_TIMESTAMPING_OPT_TX_SWHW option to allow an outgoing packet to
be looped to the socket's error queue with a software timestamp even
when a hardware transmit timestamp is expected to be provided by the
driver.
Applications using this option will receive two separate messages from
the error queue, one with a software timestamp and the other with a
hardware timestamp. As the hardware timestamp is saved to the shared skb
info, which may happen before the first message with software timestamp
is received by the application, the hardware timestamp is copied to the
SCM_TIMESTAMPING control message only when the skb has no software
timestamp or it is an incoming packet.
While changing sw_tx_timestamp(), inline it in skb_tx_timestamp() as
there are no other users.
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
Documentation/networking/timestamping.txt | 8 ++++++++
include/linux/skbuff.h | 10 ++--------
include/uapi/linux/net_tstamp.h | 3 ++-
net/core/skbuff.c | 4 ++++
net/socket.c | 23 +++++++++++++++++++++--
5 files changed, 37 insertions(+), 11 deletions(-)
diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index 74b7c61..65be2d9 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -202,6 +202,14 @@ SOF_TIMESTAMPING_OPT_PKTINFO:
interface index will be returned only if CONFIG_NET_RX_BUSY_POLL is
enabled and the driver is using NAPI.
+SOF_TIMESTAMPING_OPT_TX_SWHW:
+
+ Request both hardware and software timestamps for outgoing packets
+ when SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE
+ are enabled at the same time. If both timestamps are generated,
+ two separate messages will be looped to the socket's error queue,
+ each containing just one timestamp.
+
New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
regardless of the setting of sysctl net.core.tstamp_allow_data.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 1f8028c..3b2e284 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3254,13 +3254,6 @@ void __skb_tstamp_tx(struct sk_buff *orig_skb,
void skb_tstamp_tx(struct sk_buff *orig_skb,
struct skb_shared_hwtstamps *hwtstamps);
-static inline void sw_tx_timestamp(struct sk_buff *skb)
-{
- if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
- !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
- skb_tstamp_tx(skb, NULL);
-}
-
/**
* skb_tx_timestamp() - Driver hook for transmit timestamping
*
@@ -3276,7 +3269,8 @@ static inline void sw_tx_timestamp(struct sk_buff *skb)
static inline void skb_tx_timestamp(struct sk_buff *skb)
{
skb_clone_tx_timestamp(skb);
- sw_tx_timestamp(skb);
+ if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP)
+ skb_tstamp_tx(skb, NULL);
}
/**
diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index dee74d3..3d421d9 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -28,8 +28,9 @@ enum {
SOF_TIMESTAMPING_OPT_TSONLY = (1<<11),
SOF_TIMESTAMPING_OPT_STATS = (1<<12),
SOF_TIMESTAMPING_OPT_PKTINFO = (1<<13),
+ SOF_TIMESTAMPING_OPT_TX_SWHW = (1<<14),
- SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_PKTINFO,
+ SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_TX_SWHW,
SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
SOF_TIMESTAMPING_LAST
};
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 346d3e8..68c02df 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3875,6 +3875,10 @@ void __skb_tstamp_tx(struct sk_buff *orig_skb,
if (!sk)
return;
+ if (!hwtstamps && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW) &&
+ skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)
+ return;
+
tsonly = sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TSONLY;
if (!skb_may_tx_timestamp(sk, tsonly))
return;
diff --git a/net/socket.c b/net/socket.c
index 67db7d8..0b5758a 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -662,6 +662,22 @@ static bool skb_is_err_queue(const struct sk_buff *skb)
return skb->pkt_type == PACKET_OUTGOING;
}
+/* On transmit, software and hardware timestamps are returned independently.
+ * As the two skb clones share the hardware timestamp, which may be updated
+ * before the software timestamp is received, a hardware TX timestamp may be
+ * returned only if there is no software TX timestamp. A false software
+ * timestamp made for SOCK_RCVTSTAMP when a real timestamp is missing must
+ * be ignored.
+ */
+static bool skb_is_swtx_tstamp(const struct sk_buff *skb,
+ const struct sock *sk, int false_tstamp)
+{
+ if (false_tstamp && sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW)
+ return 0;
+
+ return skb->tstamp && skb_is_err_queue(skb);
+}
+
static void put_ts_pktinfo(struct msghdr *msg, struct sk_buff *skb)
{
struct scm_ts_pktinfo ts_pktinfo;
@@ -691,14 +707,16 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
{
int need_software_tstamp = sock_flag(sk, SOCK_RCVTSTAMP);
struct scm_timestamping tss;
- int empty = 1;
+ int empty = 1, false_tstamp = 0;
struct skb_shared_hwtstamps *shhwtstamps =
skb_hwtstamps(skb);
/* Race occurred between timestamp enabling and packet
receiving. Fill in the current time for now. */
- if (need_software_tstamp && skb->tstamp == 0)
+ if (need_software_tstamp && skb->tstamp == 0) {
__net_timestamp(skb);
+ false_tstamp = 1;
+ }
if (need_software_tstamp) {
if (!sock_flag(sk, SOCK_RCVTSTAMPNS)) {
@@ -720,6 +738,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
empty = 0;
if (shhwtstamps &&
(sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
+ !skb_is_swtx_tstamp(skb, sk, false_tstamp) &&
ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
empty = 0;
if ((sk->sk_tsflags & SOF_TIMESTAMPING_OPT_PKTINFO) &&
--
2.9.3
^ permalink raw reply related
* [PATCH v5 net-next 2/7] net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALL
From: Miroslav Lichvar @ 2017-05-18 14:07 UTC (permalink / raw)
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170518140738.19617-1-mlichvar@redhat.com>
Include HWTSTAMP_FILTER_NTP_ALL in net_hwtstamp_validate() as a valid
filter and update drivers which can timestamp all packets, or which
explicitly list unsupported filters instead of using a default case, to
handle the filter.
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 1 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 1 +
drivers/net/ethernet/cavium/liquidio/lio_main.c | 1 +
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 1 +
drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 1 +
drivers/net/ethernet/intel/e1000e/netdev.c | 1 +
drivers/net/ethernet/intel/i40e/i40e_ptp.c | 1 +
drivers/net/ethernet/intel/igb/igb_ptp.c | 1 +
drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c | 1 +
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_clock.c | 1 +
drivers/net/ethernet/neterion/vxge/vxge-main.c | 1 +
drivers/net/ethernet/qlogic/qede/qede_ptp.c | 1 +
drivers/net/ethernet/sfc/ef10.c | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 +
drivers/net/ethernet/ti/cpsw.c | 1 +
drivers/net/ethernet/tile/tilegx.c | 1 +
net/core/dev_ioctl.c | 3 +--
18 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index c772420..89b21d7 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1268,6 +1268,7 @@ static int xgbe_set_hwtstamp_settings(struct xgbe_prv_data *pdata,
case HWTSTAMP_FILTER_NONE:
break;
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSENALL, 1);
XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSENA, 1);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 7414ffd..14c236e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -15351,6 +15351,7 @@ int bnx2x_configure_ptp_filters(struct bnx2x *bp)
break;
case HWTSTAMP_FILTER_ALL:
case HWTSTAMP_FILTER_SOME:
+ case HWTSTAMP_FILTER_NTP_ALL:
bp->rx_filter = HWTSTAMP_FILTER_NONE;
break;
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index 649f2aa..ba01242 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -3024,6 +3024,7 @@ static int hwtstamp_ioctl(struct net_device *netdev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
conf.rx_filter = HWTSTAMP_FILTER_ALL;
break;
default:
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
index d51c8d8..31d737c 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
@@ -2085,6 +2085,7 @@ static int hwtstamp_ioctl(struct net_device *netdev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
conf.rx_filter = HWTSTAMP_FILTER_ALL;
break;
default:
diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index a213868..2887bca 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -755,6 +755,7 @@ static int octeon_mgmt_ioctl_hwtstamp(struct net_device *netdev,
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
p->has_rx_tstamp = have_hw_timestamps;
config.rx_filter = HWTSTAMP_FILTER_ALL;
if (p->has_rx_tstamp) {
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index b367972..0ff9295 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3680,6 +3680,7 @@ static int e1000e_config_hwtstamp(struct e1000_adapter *adapter,
* Delay Request messages but not both so fall-through to
* time stamp all packets.
*/
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
is_l2 = true;
is_l4 = true;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ptp.c b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
index 18c1cc0..0efff18 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ptp.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
@@ -562,6 +562,7 @@ static int i40e_ptp_set_timestamp_mode(struct i40e_pf *pf,
config->rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT;
}
break;
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
default:
return -ERANGE;
diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index 7a3fd4d..d333d6d 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -941,6 +941,7 @@ static int igb_ptp_set_timestamp_mode(struct igb_adapter *adapter,
is_l4 = true;
break;
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
/* 82576 cannot timestamp all packets, which it needs to do to
* support both V1 Sync and Delay_Req messages
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
index ef0635e..d44c728 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
@@ -883,6 +883,7 @@ static int ixgbe_ptp_set_timestamp_mode(struct ixgbe_adapter *adapter,
IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
break;
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
/* The X550 controller is capable of timestamping all packets,
* which allows it to accept any filter.
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 94fab20..8243674 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2375,6 +2375,7 @@ static int mlx4_en_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
config.rx_filter = HWTSTAMP_FILTER_ALL;
break;
default:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
index e706a87..e294944 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
@@ -128,6 +128,7 @@ int mlx5e_hwstamp_set(struct net_device *dev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
/* Disable CQE compression */
netdev_warn(dev, "Disabling cqe compression");
err = mlx5e_modify_rx_cqe_compression_locked(priv, false);
diff --git a/drivers/net/ethernet/neterion/vxge/vxge-main.c b/drivers/net/ethernet/neterion/vxge/vxge-main.c
index 6a4310a..50ea69d 100644
--- a/drivers/net/ethernet/neterion/vxge/vxge-main.c
+++ b/drivers/net/ethernet/neterion/vxge/vxge-main.c
@@ -3218,6 +3218,7 @@ static int vxge_hwtstamp_set(struct vxgedev *vdev, void __user *data)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
if (vdev->devh->config.hwts_en != VXGE_HW_HWTS_ENABLE)
return -EFAULT;
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ptp.c b/drivers/net/ethernet/qlogic/qede/qede_ptp.c
index 24f06e2..9b2280b 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ptp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ptp.c
@@ -244,6 +244,7 @@ static int qede_ptp_cfg_filters(struct qede_dev *edev)
break;
case HWTSTAMP_FILTER_ALL:
case HWTSTAMP_FILTER_SOME:
+ case HWTSTAMP_FILTER_NTP_ALL:
ptp->rx_filter = HWTSTAMP_FILTER_NONE;
rx_filter = QED_PTP_FILTER_ALL;
break;
diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 78efb28..ad9c4de 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -6068,6 +6068,7 @@ static int efx_ef10_ptp_set_ts_config(struct efx_nic *efx,
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
init->rx_filter = HWTSTAMP_FILTER_ALL;
rc = efx_ptp_change_mode(efx, true, 0);
if (!rc)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index a74c481..cce862b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -644,6 +644,7 @@ static int stmmac_hwtstamp_ioctl(struct net_device *dev, struct ifreq *ifr)
ptp_over_ethernet = PTP_TCR_TSIPENA;
break;
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
/* time stamp any incoming packet */
config.rx_filter = HWTSTAMP_FILTER_ALL;
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index f4d7aec..37fc165 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1734,6 +1734,7 @@ static int cpsw_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
return -ERANGE;
case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
diff --git a/drivers/net/ethernet/tile/tilegx.c b/drivers/net/ethernet/tile/tilegx.c
index 7c634bc..aec9538 100644
--- a/drivers/net/ethernet/tile/tilegx.c
+++ b/drivers/net/ethernet/tile/tilegx.c
@@ -512,6 +512,7 @@ static int tile_hwtstamp_set(struct net_device *dev, struct ifreq *rq)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
config.rx_filter = HWTSTAMP_FILTER_ALL;
break;
default:
diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index 8f036a7..77f04e7 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -225,9 +225,8 @@ static int net_hwtstamp_validate(struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
- rx_filter_valid = 1;
- break;
case HWTSTAMP_FILTER_NTP_ALL:
+ rx_filter_valid = 1;
break;
}
--
2.9.3
^ permalink raw reply related
* [PATCH] net: sched: fix a use-after-free error on chain on the error exit path
From: Colin King @ 2017-05-18 14:07 UTC (permalink / raw)
To: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller, netdev
Cc: kernel-janitors, linux-kernel
From: Colin Ian King <colin.king@canonical.com>
Set chain to null after the call to tcf_chain_destroy so that we don't
call tcf_chain_put on the error exit path, thus avoiding a use-after-free
error.
Detected by CoverityScan, CID#1436357 ("Use after free")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
net/sched/cls_api.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 4020b8d932a1..82ebdc3fcb2e 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -511,6 +511,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
if (n->nlmsg_type == RTM_DELTFILTER && prio == 0) {
tfilter_notify_chain(net, skb, n, chain, RTM_DELTFILTER);
tcf_chain_destroy(chain);
+ chain = NULL;
err = 0;
goto errout;
}
--
2.11.0
^ permalink raw reply related
* Re: [PATCH net v3] net: x25: fix one potential use-after-free issue
From: David Miller @ 2017-05-18 14:06 UTC (permalink / raw)
To: xiaolou4617; +Cc: andrew.hendry, linux-x25, netdev, linux-kernel
In-Reply-To: <1494993907-22727-1-git-send-email-xiaolou4617@gmail.com>
From: linzhang <xiaolou4617@gmail.com>
Date: Wed, 17 May 2017 12:05:07 +0800
> The function x25_init is not properly unregister related resources
> on error handler.It is will result in kernel oops if x25_init init
> failed, so add properly unregister call on error handler.
>
> Also, i adjust the coding style and make x25_register_sysctl properly
> return failure.
>
> Signed-off-by: linzhang <xiaolou4617@gmail.com>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH net-next 3/6] net: bridge: break if __br_mdb_del fails
From: Nikolay Aleksandrov @ 2017-05-18 13:45 UTC (permalink / raw)
To: Vivien Didelot, netdev
Cc: moderated list:ETHERNET BRIDGE, kernel, linux-kernel,
David S. Miller
In-Reply-To: <20170517212709.6473-4-vivien.didelot@savoirfairelinux.com>
On 5/18/17 12:27 AM, Vivien Didelot wrote:
> Be symmetric with br_mdb_add and break if __br_mdb_del returns an error.
>
> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
> ---
> net/bridge/br_mdb.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
> index d20a01622b20..24eeeefb4179 100644
> --- a/net/bridge/br_mdb.c
> +++ b/net/bridge/br_mdb.c
> @@ -688,8 +688,9 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
> list_for_each_entry(v, &vg->vlan_list, vlist) {
> entry->vid = v->vid;
> err = __br_mdb_del(br, entry);
> - if (!err)
> - __br_mdb_notify(dev, p, entry, RTM_DELMDB);
> + if (err)
> + break;
> + __br_mdb_notify(dev, p, entry, RTM_DELMDB);
> }
> } else {
> err = __br_mdb_del(br, entry);
>
This can potentially break user-space scripts that rely on the best-effort
behaviour, this is the normal "delete without vid & enabled vlan filtering".
You can check the fdb delete code which does the same, this was intentional.
You can add an mdb entry without a vid to all vlans, add a vlan and then try
to remove it from all vlans where it is present - with this patch obviously
that will fail at the new vlan.
^ permalink raw reply
* [PATCH net-next v3 7/7] sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY}
From: Davide Caratti @ 2017-05-18 13:44 UTC (permalink / raw)
To: linux-sctp, netdev
Cc: Alexander Duyck, Tom Herbert, David Laight, David S. Miller,
Marcelo Ricardo Leitner
In-Reply-To: <cover.1495112301.git.dcaratti@redhat.com>
Add FCoE to the list of protocols that can set CHECKSUM_UNNECESSARY; add a
note to CHECKSUM_COMPLETE section to specify that it does not apply to SCTP
and FCoE protocols.
Suggested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Tom Herbert <tom@herbertland.com>
---
include/linux/skbuff.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 43d7ca07..1713e4b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -109,6 +109,7 @@
* may perform further validation in this case.
* GRE: only if the checksum is present in the header.
* SCTP: indicates the CRC in SCTP header has been validated.
+ * FCOE: indicates the CRC in FC frame has been validated.
*
* skb->csum_level indicates the number of consecutive checksums found in
* the packet minus one that have been verified as CHECKSUM_UNNECESSARY.
@@ -126,8 +127,10 @@
* packet as seen by netif_rx() and fills out in skb->csum. Meaning, the
* hardware doesn't need to parse L3/L4 headers to implement this.
*
- * Note: Even if device supports only some protocols, but is able to produce
- * skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
+ * Notes:
+ * - Even if device supports only some protocols, but is able to produce
+ * skb->csum, it MUST use CHECKSUM_COMPLETE, not CHECKSUM_UNNECESSARY.
+ * - CHECKSUM_COMPLETE is not applicable to SCTP and FCoE protocols.
*
* CHECKSUM_PARTIAL:
*
--
2.7.4
^ permalink raw reply related
* [PATCH net-next v3 6/7] openvswitch: more accurate checksumming in queue_userspace_packet()
From: Davide Caratti @ 2017-05-18 13:44 UTC (permalink / raw)
To: linux-sctp, netdev
Cc: Alexander Duyck, Tom Herbert, David Laight, David S. Miller,
Marcelo Ricardo Leitner
In-Reply-To: <cover.1495112301.git.dcaratti@redhat.com>
if skb carries an SCTP packet and ip_summed is CHECKSUM_PARTIAL, it needs
CRC32c in place of Internet Checksum: use skb_csum_hwoffload_help to avoid
corrupting such packets while queueing them towards userspace.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
net/openvswitch/datapath.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 7b17da9..9ddc9f8 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -453,7 +453,7 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
/* Complete checksum if needed */
if (skb->ip_summed == CHECKSUM_PARTIAL &&
- (err = skb_checksum_help(skb)))
+ (err = skb_csum_hwoffload_help(skb, 0)))
goto out;
/* Older versions of OVS user space enforce alignment of the last
--
2.7.4
^ permalink raw reply related
* [PATCH net-next v3 5/7] net: more accurate checksumming in validate_xmit_skb()
From: Davide Caratti @ 2017-05-18 13:44 UTC (permalink / raw)
To: linux-sctp, netdev
Cc: Alexander Duyck, Tom Herbert, David Laight, David S. Miller,
Marcelo Ricardo Leitner
In-Reply-To: <cover.1495112301.git.dcaratti@redhat.com>
skb_csum_hwoffload_help() uses netdev features and skb->csum_not_inet to
determine if skb needs software computation of Internet Checksum or crc32c
(or nothing, if this computation can be done by the hardware). Use it in
place of skb_checksum_help() in validate_xmit_skb() to avoid corruption
of non-GSO SCTP packets having skb->ip_summed equal to CHECKSUM_PARTIAL.
While at it, remove references to skb_csum_off_chk* functions, since they
are not present anymore in Linux _ see commit cf53b1da73bd ("Revert
"net: Add driver helper functions to determine checksum offloadability"").
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
Documentation/networking/checksum-offloads.txt | 11 +++++++----
include/linux/netdevice.h | 3 +++
include/linux/skbuff.h | 13 +++++--------
net/core/dev.c | 14 ++++++++++++--
4 files changed, 27 insertions(+), 14 deletions(-)
diff --git a/Documentation/networking/checksum-offloads.txt b/Documentation/networking/checksum-offloads.txt
index 56e3686..d52d191 100644
--- a/Documentation/networking/checksum-offloads.txt
+++ b/Documentation/networking/checksum-offloads.txt
@@ -35,6 +35,9 @@ This interface only allows a single checksum to be offloaded. Where
encapsulation is used, the packet may have multiple checksum fields in
different header layers, and the rest will have to be handled by another
mechanism such as LCO or RCO.
+CRC32c can also be offloaded using this interface, by means of filling
+ skb->csum_start and skb->csum_offset as described above, and setting
+ skb->csum_not_inet: see skbuff.h comment (section 'D') for more details.
No offloading of the IP header checksum is performed; it is always done in
software. This is OK because when we build the IP header, we obviously
have it in cache, so summing it isn't expensive. It's also rather short.
@@ -49,9 +52,9 @@ A driver declares its offload capabilities in netdev->hw_features; see
and csum_offset given in the SKB; if it tries to deduce these itself in
hardware (as some NICs do) the driver should check that the values in the
SKB match those which the hardware will deduce, and if not, fall back to
- checksumming in software instead (with skb_checksum_help or one of the
- skb_csum_off_chk* functions as mentioned in include/linux/skbuff.h). This
- is a pain, but that's what you get when hardware tries to be clever.
+ checksumming in software instead (with skb_csum_hwoffload_help() or one of
+ the skb_checksum_help() / skb_crc32c_csum_help functions, as mentioned in
+ include/linux/skbuff.h).
The stack should, for the most part, assume that checksum offload is
supported by the underlying device. The only place that should check is
@@ -60,7 +63,7 @@ The stack should, for the most part, assume that checksum offload is
may include other offloads besides TX Checksum Offload) and, if they are
not supported or enabled on the device (determined by netdev->features),
performs the corresponding offload in software. In the case of TX
- Checksum Offload, that means calling skb_checksum_help(skb).
+ Checksum Offload, that means calling skb_csum_hwoffload_help(skb, features).
LCO: Local Checksum Offload
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 970a255..4980f83 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3930,6 +3930,9 @@ void netdev_rss_key_fill(void *buffer, size_t len);
int dev_get_nest_level(struct net_device *dev);
int skb_checksum_help(struct sk_buff *skb);
int skb_crc32c_csum_help(struct sk_buff *skb);
+int skb_csum_hwoffload_help(struct sk_buff *skb,
+ const netdev_features_t features);
+
struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
netdev_features_t features, bool tx_path);
struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a43d208..43d7ca07 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -162,14 +162,11 @@
*
* NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated in favor of
* NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to indicate
- * checksum offload capability. If a device has limited checksum capabilities
- * (for instance can only perform NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM as
- * described above) a helper function can be called to resolve
- * CHECKSUM_PARTIAL. The helper functions are skb_csum_off_chk*. The helper
- * function takes a spec argument that describes the protocol layer that is
- * supported for checksum offload and can be called for each packet. If a
- * packet does not match the specification for offload, skb_checksum_help
- * is called to resolve the checksum.
+ * checksum offload capability.
+ * skb_csum_hwoffload_help() can be called to resolve CHECKSUM_PARTIAL based
+ * on network device checksumming capabilities: if a packet does not match
+ * them, skb_checksum_help or skb_crc32c_help (depending on the value of
+ * csum_not_inet, see item D.) is called to resolve the checksum.
*
* CHECKSUM_NONE:
*
diff --git a/net/core/dev.c b/net/core/dev.c
index 71107d1..bb136f7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2996,6 +2996,17 @@ static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
return skb;
}
+int skb_csum_hwoffload_help(struct sk_buff *skb,
+ const netdev_features_t features)
+{
+ if (unlikely(skb->csum_not_inet))
+ return !!(features & NETIF_F_SCTP_CRC) ? 0 :
+ skb_crc32c_csum_help(skb);
+
+ return !!(features & NETIF_F_CSUM_MASK) ? 0 : skb_checksum_help(skb);
+}
+EXPORT_SYMBOL(skb_csum_hwoffload_help);
+
static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
{
netdev_features_t features;
@@ -3034,8 +3045,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device
else
skb_set_transport_header(skb,
skb_checksum_start_offset(skb));
- if (!(features & NETIF_F_CSUM_MASK) &&
- skb_checksum_help(skb))
+ if (skb_csum_hwoffload_help(skb, features))
goto out_kfree_skb;
}
}
--
2.7.4
^ permalink raw reply related
* [PATCH net-next v3 4/7] net: use skb->csum_not_inet to identify packets needing crc32c
From: Davide Caratti @ 2017-05-18 13:44 UTC (permalink / raw)
To: linux-sctp, netdev
Cc: Alexander Duyck, Tom Herbert, David Laight, David S. Miller,
Marcelo Ricardo Leitner
In-Reply-To: <cover.1495112301.git.dcaratti@redhat.com>
skb->csum_not_inet carries the indication on which algorithm is needed to
compute checksum on skb in the transmit path, when skb->ip_summed is equal
to CHECKSUM_PARTIAL. If skb carries a SCTP packet and crc32c hasn't been
yet written in L4 header, skb->csum_not_inet is assigned to 1; otherwise,
assume Internet Checksum is needed and thus set skb->csum_not_inet to 0.
Suggested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Tom Herbert <tom@herbertland.com>
---
include/linux/skbuff.h | 16 +++++++++-------
net/core/dev.c | 1 +
net/sched/act_csum.c | 1 +
net/sctp/offload.c | 1 +
net/sctp/output.c | 1 +
5 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index c38f890..a43d208 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -189,12 +189,13 @@
*
* NETIF_F_SCTP_CRC - This feature indicates that a device is capable of
* offloading the SCTP CRC in a packet. To perform this offload the stack
- * will set ip_summed to CHECKSUM_PARTIAL and set csum_start and csum_offset
- * accordingly. Note the there is no indication in the skbuff that the
- * CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
- * both IP checksum offload and SCTP CRC offload must verify which offload
- * is configured for a packet presumably by inspecting packet headers; in
- * case, skb_crc32c_csum_help is provided to compute CRC on SCTP packets.
+ * will set set csum_start and csum_offset accordingly, set ip_summed to
+ * CHECKSUM_PARTIAL and set csum_not_inet to 1, to provide an indication in
+ * the skbuff that the CHECKSUM_PARTIAL refers to CRC32c.
+ * A driver that supports both IP checksum offload and SCTP CRC32c offload
+ * must verify which offload is configured for a packet by testing the
+ * value of skb->csum_not_inet; skb_crc32c_csum_help is provided to resolve
+ * CHECKSUM_PARTIAL on skbs where csum_not_inet is set to 1.
*
* NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
* offloading the FCOE CRC in a packet. To perform this offload the stack
@@ -557,6 +558,7 @@ typedef unsigned char *sk_buff_data_t;
* @wifi_acked_valid: wifi_acked was set
* @wifi_acked: whether frame was acked on wifi or not
* @no_fcs: Request NIC to treat last 4 bytes as Ethernet FCS
+ * @csum_not_inet: use CRC32c to resolve CHECKSUM_PARTIAL
* @dst_pending_confirm: need to confirm neighbour
* @napi_id: id of the NAPI struct this skb came from
* @secmark: security marking
@@ -685,7 +687,7 @@ struct sk_buff {
__u8 csum_valid:1;
__u8 csum_complete_sw:1;
__u8 csum_level:2;
- __u8 __csum_bad_unused:1; /* one bit hole */
+ __u8 csum_not_inet:1;
__u8 dst_pending_confirm:1;
#ifdef CONFIG_IPV6_NDISC_NODETYPE
diff --git a/net/core/dev.c b/net/core/dev.c
index f0281ff..71107d1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2649,6 +2649,7 @@ int skb_crc32c_csum_help(struct sk_buff *skb)
crc32c_csum_stub));
*(__le32 *)(skb->data + offset) = crc32c_csum;
skb->ip_summed = CHECKSUM_NONE;
+ skb->csum_not_inet = 0;
out:
return ret;
}
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index ab6fdbd..3317a2f 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -350,6 +350,7 @@ static int tcf_csum_sctp(struct sk_buff *skb, unsigned int ihl,
sctph->checksum = sctp_compute_cksum(skb,
skb_network_offset(skb) + ihl);
skb->ip_summed = CHECKSUM_NONE;
+ skb->csum_not_inet = 0;
return 1;
}
diff --git a/net/sctp/offload.c b/net/sctp/offload.c
index b671984..275925b 100644
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -35,6 +35,7 @@
static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
{
skb->ip_summed = CHECKSUM_NONE;
+ skb->csum_not_inet = 0;
return sctp_compute_cksum(skb, skb_transport_offset(skb));
}
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 1409a87..e2edf2e 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -538,6 +538,7 @@ static int sctp_packet_pack(struct sctp_packet *packet,
} else {
chksum:
head->ip_summed = CHECKSUM_PARTIAL;
+ head->csum_not_inet = 1;
head->csum_start = skb_transport_header(head) - head->head;
head->csum_offset = offsetof(struct sctphdr, checksum);
}
--
2.7.4
^ permalink raw reply related
* [PATCH net-next v3 2/7] net: introduce skb_crc32c_csum_help
From: Davide Caratti @ 2017-05-18 13:44 UTC (permalink / raw)
To: linux-sctp, netdev
Cc: Alexander Duyck, Tom Herbert, David Laight, David S. Miller,
Marcelo Ricardo Leitner
In-Reply-To: <cover.1495112301.git.dcaratti@redhat.com>
skb_crc32c_csum_help is like skb_checksum_help, but it is designed for
checksumming SCTP packets using crc32c (see RFC3309), provided that
libcrc32c.ko has been loaded before. In case libcrc32c is not loaded,
invoking skb_crc32c_csum_help on a skb results in one the following
printouts:
warn_crc32c_csum_update: attempt to compute crc32c without libcrc32c.ko
warn_crc32c_csum_combine: attempt to compute crc32c without libcrc32c.ko
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
include/linux/netdevice.h | 1 +
include/linux/skbuff.h | 3 ++-
net/core/dev.c | 41 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 44 insertions(+), 1 deletion(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3f39d27..442650b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3931,6 +3931,7 @@ void netdev_rss_key_fill(void *buffer, size_t len);
int dev_get_nest_level(struct net_device *dev);
int skb_checksum_help(struct sk_buff *skb);
+int skb_crc32c_csum_help(struct sk_buff *skb);
struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
netdev_features_t features, bool tx_path);
struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index b1f46a0..62d6296 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -193,7 +193,8 @@
* accordingly. Note the there is no indication in the skbuff that the
* CHECKSUM_PARTIAL refers to an SCTP checksum, a driver that supports
* both IP checksum offload and SCTP CRC offload must verify which offload
- * is configured for a packet presumably by inspecting packet headers.
+ * is configured for a packet presumably by inspecting packet headers; in
+ * case, skb_crc32c_csum_help is provided to compute CRC on SCTP packets.
*
* NETIF_F_FCOE_CRC - This feature indicates that a device is capable of
* offloading the FCOE CRC in a packet. To perform this offload the stack
diff --git a/net/core/dev.c b/net/core/dev.c
index acd594c..8356d5f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -143,6 +143,7 @@
#include <linux/hrtimer.h>
#include <linux/netfilter_ingress.h>
#include <linux/crash_dump.h>
+#include <linux/sctp.h>
#include "net-sysfs.h"
@@ -2612,6 +2613,46 @@ int skb_checksum_help(struct sk_buff *skb)
}
EXPORT_SYMBOL(skb_checksum_help);
+int skb_crc32c_csum_help(struct sk_buff *skb)
+{
+ __le32 crc32c_csum;
+ int ret = 0, offset, start;
+
+ if (skb->ip_summed != CHECKSUM_PARTIAL)
+ goto out;
+
+ if (unlikely(skb_is_gso(skb)))
+ goto out;
+
+ /* Before computing a checksum, we should make sure no frag could
+ * be modified by an external entity : checksum could be wrong.
+ */
+ if (unlikely(skb_has_shared_frag(skb))) {
+ ret = __skb_linearize(skb);
+ if (ret)
+ goto out;
+ }
+ start = skb_checksum_start_offset(skb);
+ offset = start + offsetof(struct sctphdr, checksum);
+ if (WARN_ON_ONCE(offset >= skb_headlen(skb))) {
+ ret = -EINVAL;
+ goto out;
+ }
+ if (skb_cloned(skb) &&
+ !skb_clone_writable(skb, offset + sizeof(__le32))) {
+ ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+ if (ret)
+ goto out;
+ }
+ crc32c_csum = cpu_to_le32(~__skb_checksum(skb, start,
+ skb->len - start, ~(__u32)0,
+ crc32c_csum_stub));
+ *(__le32 *)(skb->data + offset) = crc32c_csum;
+ skb->ip_summed = CHECKSUM_NONE;
+out:
+ return ret;
+}
+
__be16 skb_network_protocol(struct sk_buff *skb, int *depth)
{
__be16 type = skb->protocol;
--
2.7.4
^ permalink raw reply related
* [PATCH net-next v3 3/7] sk_buff: remove support for csum_bad in sk_buff
From: Davide Caratti @ 2017-05-18 13:44 UTC (permalink / raw)
To: linux-sctp, netdev
Cc: Alexander Duyck, Tom Herbert, David Laight, David S. Miller,
Marcelo Ricardo Leitner
In-Reply-To: <cover.1495112301.git.dcaratti@redhat.com>
This bit was introduced with commit 5a21232983aa ("net: Support for
csum_bad in skbuff") to reduce the stack workload when processing RX
packets carrying a wrong Internet Checksum. Up to now, only one driver and
GRO core are setting it.
Suggested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 2 +-
include/linux/netdevice.h | 4 +---
include/linux/skbuff.h | 23 ++---------------------
net/bridge/netfilter/nft_reject_bridge.c | 5 +----
net/core/dev.c | 3 ---
net/ipv4/netfilter/nf_reject_ipv4.c | 2 +-
net/ipv6/netfilter/nf_reject_ipv6.c | 3 ---
7 files changed, 6 insertions(+), 36 deletions(-)
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
index 3a8a4aa..9a08179 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -223,7 +223,7 @@ int aq_ring_rx_clean(struct aq_ring_s *self, int *work_done, int budget)
skb->protocol = eth_type_trans(skb, ndev);
if (unlikely(buff->is_cso_err)) {
++self->stats.rx.errors;
- __skb_mark_checksum_bad(skb);
+ skb->ip_summed = CHECKSUM_NONE;
} else {
if (buff->is_ip_cso) {
__skb_incr_checksum_unnecessary(skb);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 442650b..970a255 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2573,9 +2573,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
if (__skb_gro_checksum_validate_needed(skb, zero_okay, check)) \
__ret = __skb_gro_checksum_validate_complete(skb, \
compute_pseudo(skb, proto)); \
- if (__ret) \
- __skb_mark_checksum_bad(skb); \
- else \
+ if (!__ret) \
skb_gro_incr_csum_unnecessary(skb); \
__ret; \
})
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 62d6296..c38f890 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -685,7 +685,7 @@ struct sk_buff {
__u8 csum_valid:1;
__u8 csum_complete_sw:1;
__u8 csum_level:2;
- __u8 csum_bad:1;
+ __u8 __csum_bad_unused:1; /* one bit hole */
__u8 dst_pending_confirm:1;
#ifdef CONFIG_IPV6_NDISC_NODETYPE
@@ -3336,21 +3336,6 @@ static inline void __skb_incr_checksum_unnecessary(struct sk_buff *skb)
}
}
-static inline void __skb_mark_checksum_bad(struct sk_buff *skb)
-{
- /* Mark current checksum as bad (typically called from GRO
- * path). In the case that ip_summed is CHECKSUM_NONE
- * this must be the first checksum encountered in the packet.
- * When ip_summed is CHECKSUM_UNNECESSARY, this is the first
- * checksum after the last one validated. For UDP, a zero
- * checksum can not be marked as bad.
- */
-
- if (skb->ip_summed == CHECKSUM_NONE ||
- skb->ip_summed == CHECKSUM_UNNECESSARY)
- skb->csum_bad = 1;
-}
-
/* Check if we need to perform checksum complete validation.
*
* Returns true if checksum complete is needed, false otherwise
@@ -3404,9 +3389,6 @@ static inline __sum16 __skb_checksum_validate_complete(struct sk_buff *skb,
skb->csum_valid = 1;
return 0;
}
- } else if (skb->csum_bad) {
- /* ip_summed == CHECKSUM_NONE in this case */
- return (__force __sum16)1;
}
skb->csum = psum;
@@ -3466,8 +3448,7 @@ static inline __wsum null_compute_pseudo(struct sk_buff *skb, int proto)
static inline bool __skb_checksum_convert_check(struct sk_buff *skb)
{
- return (skb->ip_summed == CHECKSUM_NONE &&
- skb->csum_valid && !skb->csum_bad);
+ return (skb->ip_summed == CHECKSUM_NONE && skb->csum_valid);
}
static inline void __skb_checksum_convert(struct sk_buff *skb,
diff --git a/net/bridge/netfilter/nft_reject_bridge.c b/net/bridge/netfilter/nft_reject_bridge.c
index 346ef6b..c16dd3a 100644
--- a/net/bridge/netfilter/nft_reject_bridge.c
+++ b/net/bridge/netfilter/nft_reject_bridge.c
@@ -111,7 +111,7 @@ static void nft_reject_br_send_v4_unreach(struct net *net,
__wsum csum;
u8 proto;
- if (oldskb->csum_bad || !nft_bridge_iphdr_validate(oldskb))
+ if (!nft_bridge_iphdr_validate(oldskb))
return;
/* IP header checks: fragment. */
@@ -226,9 +226,6 @@ static bool reject6_br_csum_ok(struct sk_buff *skb, int hook)
__be16 fo;
u8 proto = ip6h->nexthdr;
- if (skb->csum_bad)
- return false;
-
if (skb_csum_unnecessary(skb))
return true;
diff --git a/net/core/dev.c b/net/core/dev.c
index 8356d5f..f0281ff 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4678,9 +4678,6 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
if (netif_elide_gro(skb->dev))
goto normal;
- if (skb->csum_bad)
- goto normal;
-
gro_list_prepare(napi, skb);
rcu_read_lock();
diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
index 7cd8d0d..6f8d9e5 100644
--- a/net/ipv4/netfilter/nf_reject_ipv4.c
+++ b/net/ipv4/netfilter/nf_reject_ipv4.c
@@ -172,7 +172,7 @@ void nf_send_unreach(struct sk_buff *skb_in, int code, int hook)
struct iphdr *iph = ip_hdr(skb_in);
u8 proto;
- if (skb_in->csum_bad || iph->frag_off & htons(IP_OFFSET))
+ if (iph->frag_off & htons(IP_OFFSET))
return;
if (skb_csum_unnecessary(skb_in)) {
diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c b/net/ipv6/netfilter/nf_reject_ipv6.c
index eedee5d..f63b18e 100644
--- a/net/ipv6/netfilter/nf_reject_ipv6.c
+++ b/net/ipv6/netfilter/nf_reject_ipv6.c
@@ -220,9 +220,6 @@ static bool reject6_csum_ok(struct sk_buff *skb, int hook)
__be16 fo;
u8 proto;
- if (skb->csum_bad)
- return false;
-
if (skb_csum_unnecessary(skb))
return true;
--
2.7.4
^ permalink raw reply related
* [PATCH net-next v3 1/7] skbuff: add stub to help computing crc32c on SCTP packets
From: Davide Caratti @ 2017-05-18 13:44 UTC (permalink / raw)
To: linux-sctp, netdev
Cc: Alexander Duyck, Tom Herbert, David Laight, David S. Miller,
Marcelo Ricardo Leitner
In-Reply-To: <cover.1495112301.git.dcaratti@redhat.com>
sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so
it can't be used in net core. Like it has been done previously with other
symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops
to allow computation of crc32c checksum in net core after sctp.ko (and thus
libcrc32c) has been loaded.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
include/linux/skbuff.h | 2 ++
net/core/skbuff.c | 26 ++++++++++++++++++++++++++
net/sctp/offload.c | 6 ++++++
3 files changed, 34 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 7c0cb2c..b1f46a0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3076,6 +3076,8 @@ struct skb_checksum_ops {
__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
};
+extern const struct skb_checksum_ops *crc32c_csum_stub __read_mostly;
+
__wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
__wsum csum, const struct skb_checksum_ops *ops);
__wsum skb_checksum(const struct sk_buff *skb, int offset, int len,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 346d3e8..d5c9811 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2243,6 +2243,32 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
}
EXPORT_SYMBOL(skb_copy_and_csum_bits);
+static __wsum warn_crc32c_csum_update(const void *buff, int len, __wsum sum)
+{
+ net_warn_ratelimited(
+ "%s: attempt to compute crc32c without libcrc32c.ko\n",
+ __func__);
+ return 0;
+}
+
+static __wsum warn_crc32c_csum_combine(__wsum csum, __wsum csum2,
+ int offset, int len)
+{
+ net_warn_ratelimited(
+ "%s: attempt to compute crc32c without libcrc32c.ko\n",
+ __func__);
+ return 0;
+}
+
+static const struct skb_checksum_ops default_crc32c_ops = {
+ .update = warn_crc32c_csum_update,
+ .combine = warn_crc32c_csum_combine,
+};
+
+const struct skb_checksum_ops *crc32c_csum_stub __read_mostly =
+ &default_crc32c_ops;
+EXPORT_SYMBOL(crc32c_csum_stub);
+
/**
* skb_zerocopy_headlen - Calculate headroom needed for skb_zerocopy()
* @from: source buffer
diff --git a/net/sctp/offload.c b/net/sctp/offload.c
index 4f5a2b5..b671984 100644
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -98,6 +98,11 @@ static const struct net_offload sctp6_offload = {
},
};
+static const struct skb_checksum_ops crc32c_csum_ops = {
+ .update = sctp_csum_update,
+ .combine = sctp_csum_combine,
+};
+
int __init sctp_offload_init(void)
{
int ret;
@@ -110,6 +115,7 @@ int __init sctp_offload_init(void)
if (ret)
goto ipv4;
+ crc32c_csum_stub = &crc32c_csum_ops;
return ret;
ipv4:
--
2.7.4
^ permalink raw reply related
* [PATCH net-next v3 0/7] fix CRC32c in the forwarding path
From: Davide Caratti @ 2017-05-18 13:44 UTC (permalink / raw)
To: linux-sctp, netdev
Cc: Alexander Duyck, Tom Herbert, David Laight, David S. Miller,
Marcelo Ricardo Leitner
Current kernel allows offloading CRC32c computation when SCTP packets
are generated, setting skb->ip_summed to CHECKSUM_PARTIAL, if the
underlying device features have NETIF_F_SCTP_CRC set. However, after these
packets are forwarded, they may land on a device where CRC32c offloading is
not available: as a consequence, transmission is done with wrong CRC32c.
It's not possible to use sctp_compte_cksum() in the forwarding path
and in most drivers, because it needs symbols exported by libcrc32c module.
Patch 1 and 2 of this series try to solve this problem, introducing a new
helper function, namely skb_crc32c_csum_help(), that can be used to resolve
CHECKSUM_PARTIAL when crc32c is needed instead of Internet Checksum.
Currently, we need to parse the packet headers to understand what algorithm
is needed to resolve CHECKSUM_PARTIAL. We can speedup things by storing
this information in the skb metadata, and use it to call an appropriate
helper (skb_checksum_help or skb_crc32c_csum_help), or leave the packet
unmodified when the NIC is able to offload the checksum computation.
Patch 3 deprecates skb->csum_bad to free one bit in skb metadata; patch 4
introduces skb->csum_not_inet, providing skb with an indication on the
algorithm needed to resolve CHECKSUM_PARTIAL.
Patch 5 and 6 fix the kernel forwarding path and openvswitch datapath,
where skb_checksum_help was unconditionally called to resolve CHECKSUM_PARTIAL,
thus generating wrong CRC32c in forwarded SCTP packets.
Finally, patch 7 updates documentation to provide a better description of
possible values of skb->ip_summed.
Some further work is still possible:
* drivers that parse the packet header to correctly resolve CHECKSUM_PARTIAL
(e.g. ixgbe_tx_csum()) can benefit from testing skb->csum_not_inet to avoid
calling ip_hdr(skb)->protocol or ixgbe_ipv6_csum_is_sctp(skb).
* drivers that call skb_checksum_help() to resolve CHECKSUM_PARTIAL can
call skb_csum_hwoffload_help to avoid corrupting SCTP packets.
Changes v2->v3:
- patch 1/7: more standard declaration of stub variables
Changes v1->v2:
- none
Changes RFCv4->v1:
- patch 2/7: use WARN_ON_ONCE() instead of BUG_ON(), and avoid computing
CRC32c on the error path.
- patch 3/7: don't invert tests on the values of same_flow and
NAPI_GRO_CB(skb)->flush in dev_gro_receive(), it's useless and it breaks
GRO functionality as reported by kernel test robot.
Davide Caratti (7):
skbuff: add stub to help computing crc32c on SCTP packets
net: introduce skb_crc32c_csum_help
sk_buff: remove support for csum_bad in sk_buff
net: use skb->csum_not_inet to identify packets needing crc32c
net: more accurate checksumming in validate_xmit_skb()
openvswitch: more accurate checksumming in queue_userspace_packet()
sk_buff.h: improve description of CHECKSUM_{COMPLETE,UNNECESSARY}
Documentation/networking/checksum-offloads.txt | 11 +++--
drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 2 +-
include/linux/netdevice.h | 8 ++--
include/linux/skbuff.h | 58 +++++++++--------------
net/bridge/netfilter/nft_reject_bridge.c | 5 +-
net/core/dev.c | 59 ++++++++++++++++++++++--
net/core/skbuff.c | 26 +++++++++++
net/ipv4/netfilter/nf_reject_ipv4.c | 2 +-
net/ipv6/netfilter/nf_reject_ipv6.c | 3 --
net/openvswitch/datapath.c | 2 +-
net/sched/act_csum.c | 1 +
net/sctp/offload.c | 7 +++
net/sctp/output.c | 1 +
13 files changed, 127 insertions(+), 58 deletions(-)
--
2.7.4
^ permalink raw reply
* Re: wlcore: use memdup_user
From: Kalle Valo @ 2017-05-18 13:41 UTC (permalink / raw)
To: Geliang Tang
Cc: Colin Ian King, Geliang Tang, linux-wireless, netdev,
linux-kernel
In-Reply-To: <69124563157776eb4187fa35b0161dc8675ebeaf.1493781907.git.geliangtang@gmail.com>
Geliang Tang <geliangtang@gmail.com> wrote:
> Use memdup_user() helper instead of open-coding to simplify the code.
>
> Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Patch applied to wireless-drivers-next.git, thanks.
6a01d48d47c8 wlcore: use memdup_user
--
https://patchwork.kernel.org/patch/9715005/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
^ permalink raw reply
* Re: [PATCH] ravb: add wake-on-lan support via magic packet
From: Simon Horman @ 2017-05-18 13:40 UTC (permalink / raw)
To: Niklas Söderlund
Cc: Geert Uytterhoeven, Sergei Shtylyov, netdev@vger.kernel.org,
Linux-Renesas
In-Reply-To: <20170516121604.GF15597@bigcity.dyn.berto.se>
On Tue, May 16, 2017 at 02:16:04PM +0200, Niklas Söderlund wrote:
> Hi,
>
> On 2017-05-16 13:36:21 +0200, Geert Uytterhoeven wrote:
> > Hi Simon,
> >
> > On Tue, May 16, 2017 at 1:01 PM, Simon Horman <horms@verge.net.au> wrote:
> > > On Tue, May 16, 2017 at 11:07:34AM +0200, Geert Uytterhoeven wrote:
> > >> On Tue, May 16, 2017 at 11:02 AM, Niklas Söderlund
> > >> <niklas.soderlund@ragnatech.se> wrote:
> > >> >> > Whit all this being said I still like to withdraw this patch as I found
> > >> >> > another fault with it, ravb_wol_restore() will unconditionally be called
> > >> >> > while ravb_wol_setup() will only be called if netif_running(ndev). This
> > >> >> > is en easy fix and I will send out a v2 once we figure out what to do
> > >> >> > about the clock.
> > >> >>
> > >> >> The clock issue is external to the ravb driver. If it works with
> > >> >> s2idle, it should
> > >> >> be OK.
> > >> >
> > >> > Do you think it's a good idea to move ahead with a v2 of the ravb WoL
> > >> > patch to fix the unrelated issue and aim for it to be picked up prior to
> > >> > suspend/resume support is added to the CPG/MSSR?
> > >>
> > >> Sure.
> > >>
> > >> It can still be used on R-Car Gen2, where we're not s*d by mandatory PSCI.
> > >
> > > Is there some way for - e.g. the driver - to not enable WoL on Gen3 SoCs
> > > until the clock issues is sorted out? I'm quite happy to enable features
> >
> > "priv->chip_id != RCAR_GEN3". However, as we don't have RAVB enabled
> > on any R-Car Gen2 board, its use is limited.
>
> I agree that it's not so useful to enable this on Gen2 only.
Yeah, agreed. I had forgotten that RAVB only enabled on Gen3.
...
^ permalink raw reply
* [PATCH net 1/3] vlan: Fix tcp checksums offloads for Q-in-Q vlan.
From: Vladislav Yasevich @ 2017-05-18 13:31 UTC (permalink / raw)
To: netdev; +Cc: Vladislav Yasevich, Toshiaki Makita
In-Reply-To: <1495114265-23368-1-git-send-email-vyasevic@redhat.com>
It appears that since commit 8cb65d000, Q-in-Q vlans have been
broken. The series that commit is part of enabled TSO and checksum
offloading on Q-in-Q vlans. However, most HW we support can't handle
it. To work around the issue, the above commit added a function that
turns off offloads on Q-in-Q devices, but it left the checksum offload.
That will cause issues with most older devices that supprort very basic
checksum offload capabilities as well as some newer devices (we've
reproduced te problem with both be2net and bnx).
To solve this for everyone, turn off checksum offloading feature
by default when sending Q-in-Q traffic. Devices that are proven to
work can provided a corrected ndo_features_check implemetation.
Fixes: 8cb65d000 ("net: Move check for multiple vlans to drivers")
CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
include/linux/if_vlan.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 8d5fcd6..ae537f0 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -619,7 +619,6 @@ static inline netdev_features_t vlan_features_check(const struct sk_buff *skb,
NETIF_F_SG |
NETIF_F_HIGHDMA |
NETIF_F_FRAGLIST |
- NETIF_F_HW_CSUM |
NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_STAG_TX);
--
2.7.4
^ permalink raw reply related
* [PATCH net 0/3] vlan: Offload fixes for Q-in-Q vlans
From: Vladislav Yasevich @ 2017-05-18 13:31 UTC (permalink / raw)
To: netdev; +Cc: Vladislav Yasevich, Toshiaki Makita
Since the following commit:
commit 8cb65d00086bfba22bac87ff18b751432fc74003
Author: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Date: Fri Mar 27 14:31:12 2015 +0900
net: Move check for multiple vlans to drivers
offloads were broken in Q-in-Q vlans. The biggest problem was
that tcp checksum offload was enabled and used for all hardware,
while most of the supported HW (mostly old HW and some newer ones)
don't support it. As a result, tcp connections could not be
established when using Q-in-Q vlans.
This short series corrects this by disabling checksum offload
support on packets sent through Q-in-Q vlans. It is up to individual
drivers to enable it properly through ndo_features_check if they
have some support for Q-in-Q vlans.
Vladislav Yasevich (3):
vlan: Fix Q-in-Q vlan support.
be2net: Fix offload features for Q-in-Q packets
virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
drivers/net/ethernet/emulex/benet/be_main.c | 4 +++-
drivers/net/virtio_net.c | 1 +
include/linux/if_vlan.h | 1 -
3 files changed, 4 insertions(+), 2 deletions(-)
CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
--
2.7.4
^ permalink raw reply
* [PATCH net 2/3] be2net: Fix offload features for Q-in-Q packets
From: Vladislav Yasevich @ 2017-05-18 13:31 UTC (permalink / raw)
To: netdev
Cc: Vladislav Yasevich, Sathya Perla, Ajit Khaparde,
Sriharsha Basavapatna, Somnath Kotur
In-Reply-To: <1495114265-23368-1-git-send-email-vyasevic@redhat.com>
At least some of the be2net cards do not seem to be capabled
of performing checksum offload computions on Q-in-Q packets.
In these case, the recevied checksum on the remote is invalid
and TCP syn packets are dropped.
This patch adds a call to check disbled acceleration features
on Q-in-Q tagged traffic.
CC: Sathya Perla <sathya.perla@broadcom.com>
CC: Ajit Khaparde <ajit.khaparde@broadcom.com
CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
CC: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
drivers/net/ethernet/emulex/benet/be_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index f3a09ab..4eee18c 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -5078,9 +5078,11 @@ static netdev_features_t be_features_check(struct sk_buff *skb,
struct be_adapter *adapter = netdev_priv(dev);
u8 l4_hdr = 0;
- /* The code below restricts offload features for some tunneled packets.
+ /* The code below restricts offload features for some tunneled and
+ * Q-in-Q packets.
* Offload features for normal (non tunnel) packets are unchanged.
*/
+ features = vlan_features_check(skb, features);
if (!skb->encapsulation ||
!(adapter->flags & BE_FLAGS_VXLAN_OFFLOADS))
return features;
--
2.7.4
^ permalink raw reply related
* [PATCH net 3/3] virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
From: Vladislav Yasevich @ 2017-05-18 13:31 UTC (permalink / raw)
To: netdev; +Cc: virtualization, Michael S. Tsirkin
In-Reply-To: <1495114265-23368-1-git-send-email-vyasevic@redhat.com>
Since virtio does not provide it's own ndo_features_check handler,
TSO, and now checksum offload, are disabled for stacked vlans.
Re-enable the support and let the host take care of it. This
restores/improves Guest-to-Guest performance over Q-in-Q vlans.
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Jason Wang <jasowang@redhat.com>
CC: virtualization@lists.linux-foundation.org
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
---
drivers/net/virtio_net.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8324a5e..341fb96 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2028,6 +2028,7 @@ static const struct net_device_ops virtnet_netdev = {
.ndo_poll_controller = virtnet_netpoll,
#endif
.ndo_xdp = virtnet_xdp,
+ .ndo_features_check = passthru_features_check,
};
static void virtnet_config_changed_work(struct work_struct *work)
--
2.7.4
^ permalink raw reply related
* Re: [PATCH v4 net-next 0/7] Extend socket timestamping API
From: Miroslav Lichvar @ 2017-05-18 13:20 UTC (permalink / raw)
To: netdev, Richard Cochran, Willem de Bruijn
In-Reply-To: <20170518130656.24163-1-mlichvar@redhat.com>
On Thu, May 18, 2017 at 02:58:23PM +0200, Miroslav Lichvar wrote:
> Changes v3->v4:
> - added reserved fields to struct scm_ts_pktinfo
Of course, I forgot to initialize the new reserved fields. I'll send a
new series.
--
Miroslav Lichvar
^ permalink raw reply
* Re: [PATCH] net: ieee802154: fix net_device reference release too early
From: Stefan Schmidt @ 2017-05-18 13:14 UTC (permalink / raw)
To: linzhang; +Cc: aar, stefan, davem, linux-wpan, netdev, linux-kernel
In-Reply-To: <1495093807-11000-1-git-send-email-xiaolou4617@gmail.com>
Hello.
On Thu, 2017-05-18 at 15:50, linzhang wrote:
> This patch fixes the kernel oops when release net_device reference in
> advance. In function raw_sendmsg(i think the dgram_sendmsg has the same
> problem), there is a race condition between dev_put and dev_queue_xmit
> when the device is gong that maybe lead to dev_queue_ximt to see
> an illegal net_device pointer.
>
You have a test case to reproduce this oops? I fear I have not seen
one.
> So i think that dev_put should be behind of the dev_queue_xmit.
>
> Also, explicit set skb->sk is needless, sock_alloc_send_skb is
> already set it.
You could have put this fixup in a different patch.
> Signed-off-by: linzhang <xiaolou4617@gmail.com>
This looks more like a username instead of a real name. If you have Lin
Zhang as you English real name that would be better here. :)
> ---
> net/ieee802154/socket.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
> index eedba76..a60658c 100644
> --- a/net/ieee802154/socket.c
> +++ b/net/ieee802154/socket.c
> @@ -301,15 +301,14 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> goto out_skb;
>
> skb->dev = dev;
> - skb->sk = sk;
> skb->protocol = htons(ETH_P_IEEE802154);
>
> - dev_put(dev);
> -
> err = dev_queue_xmit(skb);
> if (err > 0)
> err = net_xmit_errno(err);
>
> + dev_put(dev);
> +
> return err ?: size;
>
> out_skb:
> @@ -690,15 +689,14 @@ static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> goto out_skb;
>
> skb->dev = dev;
> - skb->sk = sk;
> skb->protocol = htons(ETH_P_IEEE802154);
>
> - dev_put(dev);
> -
> err = dev_queue_xmit(skb);
> if (err > 0)
> err = net_xmit_errno(err);
>
> + dev_put(dev);
> +
> return err ?: size;
Going to give this a test ride here now.
regards
Stefan Schmidt
^ permalink raw reply
* Re: [PATCH 2/2] sh_eth: Do not print an error message for probe deferral
From: Laurent Pinchart @ 2017-05-18 13:08 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Sergei Shtylyov, David S . Miller, Laurent Pinchart, netdev,
linux-renesas-soc
In-Reply-To: <1495112495-25086-2-git-send-email-geert+renesas@glider.be>
Hi Geert,
Thank you for the patch.
On Thursday 18 May 2017 15:01:35 Geert Uytterhoeven wrote:
> EPROBE_DEFER is not an error, hence printing an error message like
>
> sh-eth ee700000.ethernet: failed to initialise MDIO
>
> may confuse the user.
>
> To fix this, suppress the error message in case of probe deferral.
> While at it, shorten the message, and add the actual error code.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
> drivers/net/ethernet/renesas/sh_eth.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c
> b/drivers/net/ethernet/renesas/sh_eth.c index
> c85222b02754f5a7..2d686ccf971b1b65 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -3220,7 +3220,8 @@ static int sh_eth_drv_probe(struct platform_device
> *pdev) /* MDIO bus init */
> ret = sh_mdio_init(mdp, pd);
> if (ret) {
> - dev_err(&pdev->dev, "failed to initialise MDIO\n");
> + if (ret != -EPROBE_DEFER)
> + dev_err(&pdev->dev, "MDIO init failed: %d\n", ret);
> goto out_release;
> }
--
Regards,
Laurent Pinchart
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox