* Re: [PATCH v5 net-next 5/7] net: fix documentation of struct scm_timestamping
From: Willem de Bruijn @ 2017-05-19 15:23 UTC
To: Miroslav Lichvar; +Cc: Network Development, Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519101118.GD21003@localhost>
On Fri, May 19, 2017 at 6:11 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Thu, May 18, 2017 at 03:38:30PM -0400, Willem de Bruijn wrote:
>> On Thu, May 18, 2017 at 10:07 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>> > +Note that if the SO_TIMESTAMP or SO_TIMESTAMPNS option is enabled
>> > +together with SO_TIMESTAMPING using SOF_TIMESTAMPING_SOFTWARE, a false
>> > +software timestamp will be generated in the recvmsg() call and passed
>> > +in ts[0] when a real software timestamp is missing.
>>
>> With receive software timestamping this is expected behavior? I would make
>> explicit that this happens even on tx timestamps.
>
> How about adding ", e.g. when receive timestamping is enabled
> between receiving the message and the recvmsg() call, or it is a
> message with a hardware transmit timestamp." ?
Perhaps even more brief "This happens also on hardware tx timestamps."
>> > For this reason it
>> > +is not recommended to combine SO_TIMESTAMP(NS) with SO_TIMESTAMPING.
>>
>> And I'd remove this. The extra timestamp is harmless, and we may be missing
>> other reasons why someone would want to enable both on the same socket.
>
> Ok. I'm just concerned people will inadvertently use the timestamp as
> a real timestamp and then wonder why SW TX timestamping is so bad. I
> have fallen into this trap.
So have I. It is equally surprising when only enabling SO_TIMESTAMP and
observing out of order timestamps. It is certainly worth calling out.
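To make the trap discussed above concrete, here is a minimal userspace sketch of how a receiver of SCM_TIMESTAMPING data might choose between ts[0] (software) and ts[2] (raw hardware). The names ts_triplet, ts_source and pick_timestamp are hypothetical; ts_triplet only mirrors the three-timespec layout of struct scm_timestamping. Preferring ts[2] avoids mistaking a generated ts[0] for a real software timestamp on hardware tx timestamps. It cannot detect a false ts[0] when no hardware timestamp is present at all.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for struct scm_timestamping:
 * ts[0] = software, ts[1] = deprecated, ts[2] = raw hardware. */
struct ts_triplet {
	struct timespec ts[3];
};

enum ts_source { TS_NONE, TS_SOFTWARE, TS_HARDWARE };

/* A field counts as "set" when it is not all zero. */
static bool ts_is_set(const struct timespec *ts)
{
	return ts->tv_sec != 0 || ts->tv_nsec != 0;
}

/* Prefer the hardware timestamp; fall back to ts[0] only when ts[2]
 * is empty. With SO_TIMESTAMP(NS) also enabled, ts[0] may have been
 * generated in recvmsg() and is then not a real software timestamp. */
static enum ts_source pick_timestamp(const struct ts_triplet *t,
				     struct timespec *out)
{
	if (ts_is_set(&t->ts[2])) {
		*out = t->ts[2];
		return TS_HARDWARE;
	}
	if (ts_is_set(&t->ts[0])) {
		*out = t->ts[0];
		return TS_SOFTWARE;
	}
	return TS_NONE;
}
```

An application would fill the triplet from the SCM_TIMESTAMPING control message and call pick_timestamp() once per message.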
* [PATCH net] sctp: fix ICMP processing if skb is non-linear
From: Davide Caratti @ 2017-05-19 15:34 UTC
To: netdev, linux-sctp; +Cc: David S. Miller, Xin Long
When the ICMP packet is carried by a paged skb, sctp_err_lookup() may fail
validation even if the payload contents match an open socket. As a
consequence, ICMPs are sometimes wrongly ignored. Use skb_header_pointer()
to retrieve the encapsulated SCTP headers, so that ICMP payloads are
validated correctly even when skbs are non-linear.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
include/net/sctp/sctp.h | 2 +-
net/sctp/input.c | 29 +++++++++++++++++++----------
net/sctp/ipv6.c | 2 +-
3 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 069582e..1b8c16b 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -152,7 +152,7 @@ void sctp_v4_err(struct sk_buff *skb, u32 info);
void sctp_hash_endpoint(struct sctp_endpoint *);
void sctp_unhash_endpoint(struct sctp_endpoint *);
struct sock *sctp_err_lookup(struct net *net, int family, struct sk_buff *,
- struct sctphdr *, struct sctp_association **,
+ struct sctp_association **,
struct sctp_transport **);
void sctp_err_finish(struct sock *, struct sctp_transport *);
void sctp_icmp_frag_needed(struct sock *, struct sctp_association *,
diff --git a/net/sctp/input.c b/net/sctp/input.c
index 0e06a27..7f3f983 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -469,19 +469,19 @@ void sctp_icmp_proto_unreachable(struct sock *sk,
/* Common lookup code for icmp/icmpv6 error handler. */
struct sock *sctp_err_lookup(struct net *net, int family, struct sk_buff *skb,
- struct sctphdr *sctphdr,
struct sctp_association **app,
struct sctp_transport **tpp)
{
+ struct sctp_init_chunk _chunkhdr, *chunkhdr;
+ struct sctphdr _sctphdr, *sctphdr;
union sctp_addr saddr;
union sctp_addr daddr;
struct sctp_af *af;
struct sock *sk = NULL;
struct sctp_association *asoc;
struct sctp_transport *transport = NULL;
- struct sctp_init_chunk *chunkhdr;
- __u32 vtag = ntohl(sctphdr->vtag);
- int len = skb->len - ((void *)sctphdr - (void *)skb->data);
+ int offset;
+ __u32 vtag;
*app = NULL; *tpp = NULL;
@@ -515,14 +515,23 @@ struct sock *sctp_err_lookup(struct net *net, int family, struct sk_buff *skb,
* or the chunk type or the Initiate Tag does not match, silently
* discard the packet.
*/
+ offset = skb_transport_offset(skb);
+ sctphdr = skb_header_pointer(skb, offset, sizeof(_sctphdr), &_sctphdr);
+ if (unlikely(!sctphdr))
+ goto out;
+
+ vtag = ntohl(sctphdr->vtag);
if (vtag == 0) {
- chunkhdr = (void *)sctphdr + sizeof(struct sctphdr);
- if (len < sizeof(struct sctphdr) + sizeof(sctp_chunkhdr_t)
- + sizeof(__be32) ||
+ offset += sizeof(_sctphdr);
+ /* chunk header + first 4 octets of init header */
+ chunkhdr = skb_header_pointer(skb, offset,
+ sizeof(struct sctp_chunkhdr) +
+ sizeof(__be32), &_chunkhdr);
+ if (!chunkhdr ||
chunkhdr->chunk_hdr.type != SCTP_CID_INIT ||
- ntohl(chunkhdr->init_hdr.init_tag) != asoc->c.my_vtag) {
+ ntohl(chunkhdr->init_hdr.init_tag) != asoc->c.my_vtag)
goto out;
- }
+
} else if (vtag != asoc->c.peer_vtag) {
goto out;
}
@@ -585,7 +594,7 @@ void sctp_v4_err(struct sk_buff *skb, __u32 info)
savesctp = skb->transport_header;
skb_reset_network_header(skb);
skb_set_transport_header(skb, ihlen);
- sk = sctp_err_lookup(net, AF_INET, skb, sctp_hdr(skb), &asoc, &transport);
+ sk = sctp_err_lookup(net, AF_INET, skb, &asoc, &transport);
/* Put back, the original values. */
skb->network_header = saveip;
skb->transport_header = savesctp;
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index 142b70e..d72c8d5 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -157,7 +157,7 @@ static void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
savesctp = skb->transport_header;
skb_reset_network_header(skb);
skb_set_transport_header(skb, offset);
- sk = sctp_err_lookup(net, AF_INET6, skb, sctp_hdr(skb), &asoc, &transport);
+ sk = sctp_err_lookup(net, AF_INET6, skb, &asoc, &transport);
/* Put back, the original pointers. */
skb->network_header = saveip;
skb->transport_header = savesctp;
--
2.7.4
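For readers unfamiliar with skb_header_pointer(), the behaviour the patch relies on can be modelled in userspace roughly as follows: return a direct pointer when the requested range lies in the linear area, copy into a caller-provided buffer when it spills into paged data, and return NULL when the range is out of bounds. This is only a toy sketch; struct toy_buf and toy_header_pointer() are hypothetical names, and a real skb can have many fragments rather than one.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a non-linear buffer: a linear head plus one fragment. */
struct toy_buf {
	const unsigned char *head;
	size_t head_len;
	const unsigned char *frag;
	size_t frag_len;
};

/* Return a pointer to len bytes starting at offset. The pointer refers
 * to the linear head when possible, otherwise to scratch after copying.
 * Returns NULL when the range does not fit in the buffer. */
static void *toy_header_pointer(const struct toy_buf *b, size_t offset,
				size_t len, void *scratch)
{
	size_t total = b->head_len + b->frag_len;
	unsigned char *dst = scratch;
	size_t from_head;

	if (offset > total || len > total - offset)
		return NULL;

	/* Fast path: the whole range is in the linear area. */
	if (offset + len <= b->head_len)
		return (void *)(b->head + offset);

	/* Slow path: gather the bytes from head and fragment. */
	from_head = offset < b->head_len ? b->head_len - offset : 0;
	if (from_head)
		memcpy(dst, b->head + offset, from_head);
	memcpy(dst + from_head,
	       b->frag + (offset + from_head - b->head_len),
	       len - from_head);
	return scratch;
}
```

Dereferencing a header through a plain pointer into the head, as the old code did, corresponds to skipping the slow path and reading past head_len.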
* Re: [PATCH net] bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
From: Nikolay Aleksandrov @ 2017-05-19 15:41 UTC
To: Xin Long, network dev
Cc: davem, cera, Satish Ashok, Stephen Hemminger, bridge
In-Reply-To: <6e78bfea3969ca1a937b37e28e796a3047b82615.1495203629.git.lucien.xin@gmail.com>
On 5/19/17 5:20 PM, Xin Long wrote:
> Since commit 76b91c32dd86 ("bridge: stp: when using userspace stp stop
> kernel hello and hold timers"), the bridge does not start hello_timer
> in br_dev_open if stp_enabled is not KERNEL_STP.
>
> The problem is that even if users later set stp_enabled to KERNEL_STP,
> the timer is still not started, so KERNEL_STP does not actually work.
> Users have to bring the bridge down and up again to work around this.
>
> This patch fixes it by starting br->hello_timer when enabling
> KERNEL_STP in br_stp_start.
>
> As an improvement, it also re-arms hello_timer in
> br_hello_timer_expired only when br->stp_enabled is KERNEL_STP; there
> is no reason to re-arm the timer when it is NO_STP.
>
> Fixes: 76b91c32dd86 ("bridge: stp: when using userspace stp stop kernel hello and hold timers")
> Reported-by: Haidong Li <haili@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
> net/bridge/br_stp_if.c | 1 +
> net/bridge/br_stp_timer.c | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 08341d2..0db8102 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -179,6 +179,7 @@ static void br_stp_start(struct net_bridge *br)
> br_debug(br, "using kernel STP\n");
>
> /* To start timers on any ports left in blocking */
> + mod_timer(&br->hello_timer, jiffies + br->hello_time);
> br_port_state_selection(br);
> }
>
> diff --git a/net/bridge/br_stp_timer.c b/net/bridge/br_stp_timer.c
> index c98b3e5..60b6fe2 100644
> --- a/net/bridge/br_stp_timer.c
> +++ b/net/bridge/br_stp_timer.c
> @@ -40,7 +40,7 @@ static void br_hello_timer_expired(unsigned long arg)
> if (br->dev->flags & IFF_UP) {
> br_config_bpdu_generation(br);
>
> - if (br->stp_enabled != BR_USER_STP)
> + if (br->stp_enabled == BR_KERNEL_STP)
> mod_timer(&br->hello_timer,
> round_jiffies(jiffies + br->hello_time));
> }
>
+CC Bridge maintainers & fixes commit author
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
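The timer behaviour described in the quoted patch can be illustrated with a small state model. This is a sketch under stated assumptions, not kernel code: struct toy_bridge and the toy_* functions are hypothetical, and a boolean stands in for the real timer.

```c
#include <stdbool.h>

/* Mirrors the three STP modes named in the patch. */
enum stp_mode { NO_STP, KERNEL_STP, USER_STP };

/* Hypothetical reduction of the bridge state that matters here. */
struct toy_bridge {
	enum stp_mode stp_enabled;
	bool dev_up;
	bool hello_timer_armed;
};

/* Models br_stp_start() choosing kernel STP: with the fix, the
 * hello timer is armed at this point. */
static void toy_stp_start_kernel(struct toy_bridge *br)
{
	br->stp_enabled = KERNEL_STP;
	br->hello_timer_armed = true;
}

/* Models br_hello_timer_expired(): the timer has fired, and it is
 * re-armed only while the device is up and kernel STP is in use. */
static void toy_hello_timer_expired(struct toy_bridge *br)
{
	br->hello_timer_armed = false;
	if (br->dev_up && br->stp_enabled == KERNEL_STP)
		br->hello_timer_armed = true;
}
```

In this model, a bridge opened with NO_STP has no armed timer, and only toy_stp_start_kernel() can arm it afterwards, which is the step that was missing before the patch.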
* Re: [RFC net-next PATCH 1/5] samples/bpf: xdp_tx_iptunnel make use of map_data[]
From: Daniel Borkmann @ 2017-05-19 15:45 UTC
To: Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
Cc: John Fastabend, netdev
In-Reply-To: <149512209298.14733.14668513619424960672.stgit@firesoul>
On 05/18/2017 05:41 PM, Jesper Dangaard Brouer wrote:
> There is no reason to use a compile time constant MAX_IPTNL_ENTRIES
> shared between the _user.c and _kern.c, when map_data[].def.max_entries
> can tell us dynamically the max_entries of the ELF map that the bpf
> loader created.
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Previous code was perhaps a bit more robust in the sense that the
order of the map wouldn't matter due to MAX_IPTNL_ENTRIES being
shared. Now you rely on it being in slot 1 (map_data[1].def.max_entries)
from "maps" section in ELF.
> samples/bpf/xdp_tx_iptunnel_common.h | 2 --
> samples/bpf/xdp_tx_iptunnel_kern.c | 2 +-
> samples/bpf/xdp_tx_iptunnel_user.c | 14 +++++++++-----
> 3 files changed, 10 insertions(+), 8 deletions(-)
Not sure it's worth it given this actually adds more code and makes
it more fragile at the same time. Only point I could see is to demo
usage of map_data[1].def.max_entries for sample code.
Perhaps at the very minimum add a warning comment to xdp_tx_iptunnel_kern.c
noting that, should the code be extended with additional maps, the
ordering of struct bpf_map_def entries really matters here so as not to
break the _user.c part.
Other than that, this should be sent as stand-alone "cleanup" ...
* Re: [RFC net-next PATCH 2/5] mlx5: fix bug reading rss_hash_type from CQE
From: Daniel Borkmann @ 2017-05-19 15:47 UTC
To: Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
Cc: John Fastabend, netdev, saeedm
In-Reply-To: <149512209807.14733.12774004120782832472.stgit@firesoul>
On 05/18/2017 05:41 PM, Jesper Dangaard Brouer wrote:
> Masks for extracting part of the Completion Queue Entry (CQE)
> field rss_hash_type were swapped, namely CQE_RSS_HTYPE_IP and
> CQE_RSS_HTYPE_L4.
>
> The bug resulted in setting skb->l4_hash even though the
> rss_hash_type indicated that the hash was NOT computed over the
> L4 (UDP or TCP) part of the packet.
>
> Added comments from the datasheet to make it clearer what
> these masks are selecting.
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
Stand-alone fix for -net tree?
> include/linux/mlx5/device.h | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
> index dd9a263ed368..a940ec6a046c 100644
> --- a/include/linux/mlx5/device.h
> +++ b/include/linux/mlx5/device.h
> @@ -787,8 +787,14 @@ enum {
> };
>
> enum {
> - CQE_RSS_HTYPE_IP = 0x3 << 6,
> - CQE_RSS_HTYPE_L4 = 0x3 << 2,
> + CQE_RSS_HTYPE_IP = 0x3 << 2,
> + /* cqe->rss_hash_type[3:2] - IP destination selected for hash
> + * (00 = none, 01 = IPv4, 10 = IPv6, 11 = Reserved)
> + */
> + CQE_RSS_HTYPE_L4 = 0x3 << 6,
> + /* cqe->rss_hash_type[7:6] - L4 destination selected for hash
> + * (00 = none, 01 = TCP, 10 = UDP, 11 = IPSEC.SPI)
> + */
> };
>
> enum {
>
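As an illustration of what the corrected masks select, the following standalone sketch decodes an rss_hash_type byte using the bit positions given in the quoted comments (bits [3:2] for IP, bits [7:6] for L4). The helper names are hypothetical and not part of the mlx5 driver.

```c
#include <stdint.h>

/* Mask values as corrected by the patch. */
enum {
	CQE_RSS_HTYPE_IP = 0x3 << 2,	/* rss_hash_type[3:2] */
	CQE_RSS_HTYPE_L4 = 0x3 << 6,	/* rss_hash_type[7:6] */
};

/* 0 = none, 1 = IPv4, 2 = IPv6, 3 = reserved */
static unsigned int rss_ip_type(uint8_t rss_hash_type)
{
	return (rss_hash_type & CQE_RSS_HTYPE_IP) >> 2;
}

/* 0 = none, 1 = TCP, 2 = UDP, 3 = IPSEC.SPI */
static unsigned int rss_l4_type(uint8_t rss_hash_type)
{
	return (rss_hash_type & CQE_RSS_HTYPE_L4) >> 6;
}

/* Non-zero when the hash covered the L4 part of the packet. */
static int rss_hash_covers_l4(uint8_t rss_hash_type)
{
	return rss_l4_type(rss_hash_type) != 0;
}
```

With the old, swapped masks a value such as 0x04 (IPv4 only, no L4) would have tested as non-zero under CQE_RSS_HTYPE_L4, which matches the symptom described in the commit message.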
* [PATCH v6 net-next 0/7] Extend socket timestamping API
From: Miroslav Lichvar @ 2017-05-19 15:52 UTC
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
Changes v5->v6:
- fixed skb_is_swtx_tstamp() when OPT_TX_SWHW is disabled and improved
its description
- improved OPT_PKTINFO documentation
- improved scm_timestamping documentation
Changes v4->v5:
- fixed initialization of reserved fields in struct scm_ts_pktinfo
Changes v3->v4:
- added reserved fields to struct scm_ts_pktinfo
- replaced patch fixing false SW timestamps with a documentation fix
- updated OPT_TX_SWHW patch to handle false SW timestamps
Changes v2->v3:
- modified struct scm_ts_pktinfo to use fixed-width integer types
- added WARN_ON_ONCE for missing RCU lock in dev_get_by_napi_id()
- modified dev_get_by_napi_id() to not return dev in unexpected branch
- modified recv to return SCM_TIMESTAMPING_PKTINFO even if the interface
index is unknown
Changes v1->v2:
- added separate patch for new NAPI functions
- split code from __sock_recv_timestamp() for better readability
- fixed RCU locking
- fixed compiler warning (missing case in switch in first patch)
- inline sw_tx_timestamp() in its only user
Changes RFC->v1:
- reworked SOF_TIMESTAMPING_OPT_PKTINFO patch to not add new fields to
skb shared info (net device is now looked up by napi_id), not require
any changes in drivers, and restrict the cmsg to incoming packets
- renamed SOF_TIMESTAMPING_OPT_MULTIMSG to SOF_TIMESTAMPING_OPT_TX_SWHW
and fixed its description
- moved struct scm_ts_pktinfo from errqueue.h to net_tstamp.h as it
can't be received from the error queue anymore
- improved commit descriptions and removed incorrect comment
This patchset adds new options to the timestamping API that will be
useful for NTP implementations and possibly other applications.
The first patch specifies a timestamp filter for NTP packets. The second
patch updates drivers that can timestamp all packets, or need to list
the filter as unsupported. There is no attempt to add the support to the
phyter driver.
The third patch adds two helper functions working with NAPI ID, which is
needed by the next patch. The fourth patch adds a new option to get a
new control message with the L2 length and interface index for incoming
packets with hardware timestamps.
The fifth patch fixes documentation on number of non-zero fields in
scm_timestamping and warns about false software timestamps when
SO_TIMESTAMP(NS) is combined with SO_TIMESTAMPING.
The sixth patch adds a new option to request both software and hardware
timestamps for outgoing packets. The seventh patch updates drivers that
assumed software timestamping cannot be used together with hardware
timestamping.
The patches have been tested on x86_64 machines with igb and e1000e
drivers.
Miroslav Lichvar (7):
net: define receive timestamp filter for NTP
net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALL
net: add function to retrieve original skb device using NAPI ID
net: add new control message for incoming HW-timestamped packets
net: fix documentation of struct scm_timestamping
net: allow simultaneous SW and HW transmit timestamping
net: ethernet: update drivers to make both SW and HW TX timestamps
Documentation/networking/timestamping.txt | 26 +++++++++++-
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 4 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 1 +
drivers/net/ethernet/cavium/liquidio/lio_main.c | 1 +
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 1 +
drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 1 +
drivers/net/ethernet/intel/e1000e/netdev.c | 5 ++-
drivers/net/ethernet/intel/i40e/i40e_ptp.c | 1 +
drivers/net/ethernet/intel/igb/igb_ptp.c | 1 +
drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c | 1 +
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_clock.c | 1 +
drivers/net/ethernet/neterion/vxge/vxge-main.c | 1 +
drivers/net/ethernet/qlogic/qede/qede_ptp.c | 1 +
drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c | 3 +-
drivers/net/ethernet/sfc/ef10.c | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 7 ++--
drivers/net/ethernet/ti/cpsw.c | 1 +
drivers/net/ethernet/tile/tilegx.c | 1 +
include/linux/netdevice.h | 1 +
include/linux/skbuff.h | 19 +++++----
include/uapi/asm-generic/socket.h | 2 +
include/uapi/linux/net_tstamp.h | 15 ++++++-
net/core/dev.c | 26 ++++++++++++
net/core/dev_ioctl.c | 1 +
net/core/skbuff.c | 4 ++
net/socket.c | 47 ++++++++++++++++++++--
27 files changed, 151 insertions(+), 23 deletions(-)
--
2.9.3
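One use of the L2 length reported by the new control message is shifting a timestamp taken at the start of a frame to the end of the frame. A rough sketch of that arithmetic follows, assuming the link speed is known in Mb/s and the timestamp is held as nanoseconds. The function names are hypothetical, and whether the reported length already includes the 4-byte FCS is left to the caller through fcs_bytes.

```c
#include <stdint.h>

/* Time in nanoseconds needed to put frame_bytes on a link running at
 * link_mbps. bits / (Mbit/s) gives microseconds, hence the factor of
 * 1000. Returns 0 when the speed is unknown (0). */
static uint64_t frame_duration_ns(uint32_t frame_bytes, uint32_t link_mbps)
{
	if (link_mbps == 0)
		return 0;
	return (uint64_t)frame_bytes * 8u * 1000u / link_mbps;
}

/* Shift a start-of-frame timestamp to the end of the frame. */
static uint64_t preamble_to_trailer_ns(uint64_t preamble_ts_ns,
				       uint32_t l2_len, uint32_t fcs_bytes,
				       uint32_t link_mbps)
{
	return preamble_ts_ns + frame_duration_ns(l2_len + fcs_bytes, link_mbps);
}
```

For example, a 100-byte frame on a 1000 Mb/s link lasts 800 ns, so the trailer timestamp is the preamble timestamp plus 800 ns.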
* [PATCH v6 net-next 1/7] net: define receive timestamp filter for NTP
From: Miroslav Lichvar @ 2017-05-19 15:52 UTC
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519155241.15817-1-mlichvar@redhat.com>
Add HWTSTAMP_FILTER_NTP_ALL to the hwtstamp_rx_filters enum for
timestamping of NTP packets. There is currently only one driver
(phyter) that could support it directly.
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
include/uapi/linux/net_tstamp.h | 3 +++
net/core/dev_ioctl.c | 2 ++
2 files changed, 5 insertions(+)
diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index 464dcca..0749fb1 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -125,6 +125,9 @@ enum hwtstamp_rx_filters {
HWTSTAMP_FILTER_PTP_V2_SYNC,
/* PTP v2/802.AS1, any layer, Delay_req packet */
HWTSTAMP_FILTER_PTP_V2_DELAY_REQ,
+
+ /* NTP, UDP, all versions and packet modes */
+ HWTSTAMP_FILTER_NTP_ALL,
};
#endif /* _NET_TIMESTAMPING_H */
diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index b94b1d2..8f036a7 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -227,6 +227,8 @@ static int net_hwtstamp_validate(struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
rx_filter_valid = 1;
break;
+ case HWTSTAMP_FILTER_NTP_ALL:
+ break;
}
if (!tx_type_valid || !rx_filter_valid)
--
2.9.3
* [PATCH v6 net-next 2/7] net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALL
From: Miroslav Lichvar @ 2017-05-19 15:52 UTC
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519155241.15817-1-mlichvar@redhat.com>
Include HWTSTAMP_FILTER_NTP_ALL in net_hwtstamp_validate() as a valid
filter and update drivers which can timestamp all packets, or which
explicitly list unsupported filters instead of using a default case, to
handle the filter.
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 1 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 1 +
drivers/net/ethernet/cavium/liquidio/lio_main.c | 1 +
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 1 +
drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 1 +
drivers/net/ethernet/intel/e1000e/netdev.c | 1 +
drivers/net/ethernet/intel/i40e/i40e_ptp.c | 1 +
drivers/net/ethernet/intel/igb/igb_ptp.c | 1 +
drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c | 1 +
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_clock.c | 1 +
drivers/net/ethernet/neterion/vxge/vxge-main.c | 1 +
drivers/net/ethernet/qlogic/qede/qede_ptp.c | 1 +
drivers/net/ethernet/sfc/ef10.c | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 +
drivers/net/ethernet/ti/cpsw.c | 1 +
drivers/net/ethernet/tile/tilegx.c | 1 +
net/core/dev_ioctl.c | 3 +--
18 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index c772420..89b21d7 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1268,6 +1268,7 @@ static int xgbe_set_hwtstamp_settings(struct xgbe_prv_data *pdata,
case HWTSTAMP_FILTER_NONE:
break;
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSENALL, 1);
XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSENA, 1);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 7414ffd..14c236e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -15351,6 +15351,7 @@ int bnx2x_configure_ptp_filters(struct bnx2x *bp)
break;
case HWTSTAMP_FILTER_ALL:
case HWTSTAMP_FILTER_SOME:
+ case HWTSTAMP_FILTER_NTP_ALL:
bp->rx_filter = HWTSTAMP_FILTER_NONE;
break;
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index 649f2aa..ba01242 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -3024,6 +3024,7 @@ static int hwtstamp_ioctl(struct net_device *netdev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
conf.rx_filter = HWTSTAMP_FILTER_ALL;
break;
default:
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
index d51c8d8..31d737c 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
@@ -2085,6 +2085,7 @@ static int hwtstamp_ioctl(struct net_device *netdev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
conf.rx_filter = HWTSTAMP_FILTER_ALL;
break;
default:
diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index a213868..2887bca 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -755,6 +755,7 @@ static int octeon_mgmt_ioctl_hwtstamp(struct net_device *netdev,
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
p->has_rx_tstamp = have_hw_timestamps;
config.rx_filter = HWTSTAMP_FILTER_ALL;
if (p->has_rx_tstamp) {
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index b367972..0ff9295 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3680,6 +3680,7 @@ static int e1000e_config_hwtstamp(struct e1000_adapter *adapter,
* Delay Request messages but not both so fall-through to
* time stamp all packets.
*/
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
is_l2 = true;
is_l4 = true;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ptp.c b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
index 18c1cc0..0efff18 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ptp.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
@@ -562,6 +562,7 @@ static int i40e_ptp_set_timestamp_mode(struct i40e_pf *pf,
config->rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT;
}
break;
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
default:
return -ERANGE;
diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index 7a3fd4d..d333d6d 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -941,6 +941,7 @@ static int igb_ptp_set_timestamp_mode(struct igb_adapter *adapter,
is_l4 = true;
break;
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
/* 82576 cannot timestamp all packets, which it needs to do to
* support both V1 Sync and Delay_Req messages
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
index ef0635e..d44c728 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
@@ -883,6 +883,7 @@ static int ixgbe_ptp_set_timestamp_mode(struct ixgbe_adapter *adapter,
IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
break;
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
/* The X550 controller is capable of timestamping all packets,
* which allows it to accept any filter.
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 94fab20..8243674 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2375,6 +2375,7 @@ static int mlx4_en_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
config.rx_filter = HWTSTAMP_FILTER_ALL;
break;
default:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
index e706a87..e294944 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
@@ -128,6 +128,7 @@ int mlx5e_hwstamp_set(struct net_device *dev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
/* Disable CQE compression */
netdev_warn(dev, "Disabling cqe compression");
err = mlx5e_modify_rx_cqe_compression_locked(priv, false);
diff --git a/drivers/net/ethernet/neterion/vxge/vxge-main.c b/drivers/net/ethernet/neterion/vxge/vxge-main.c
index 6a4310a..50ea69d 100644
--- a/drivers/net/ethernet/neterion/vxge/vxge-main.c
+++ b/drivers/net/ethernet/neterion/vxge/vxge-main.c
@@ -3218,6 +3218,7 @@ static int vxge_hwtstamp_set(struct vxgedev *vdev, void __user *data)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
if (vdev->devh->config.hwts_en != VXGE_HW_HWTS_ENABLE)
return -EFAULT;
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ptp.c b/drivers/net/ethernet/qlogic/qede/qede_ptp.c
index 24f06e2..9b2280b 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ptp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ptp.c
@@ -244,6 +244,7 @@ static int qede_ptp_cfg_filters(struct qede_dev *edev)
break;
case HWTSTAMP_FILTER_ALL:
case HWTSTAMP_FILTER_SOME:
+ case HWTSTAMP_FILTER_NTP_ALL:
ptp->rx_filter = HWTSTAMP_FILTER_NONE;
rx_filter = QED_PTP_FILTER_ALL;
break;
diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 78efb28..ad9c4de 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -6068,6 +6068,7 @@ static int efx_ef10_ptp_set_ts_config(struct efx_nic *efx,
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
init->rx_filter = HWTSTAMP_FILTER_ALL;
rc = efx_ptp_change_mode(efx, true, 0);
if (!rc)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index a74c481..cce862b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -644,6 +644,7 @@ static int stmmac_hwtstamp_ioctl(struct net_device *dev, struct ifreq *ifr)
ptp_over_ethernet = PTP_TCR_TSIPENA;
break;
+ case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
/* time stamp any incoming packet */
config.rx_filter = HWTSTAMP_FILTER_ALL;
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index f4d7aec..37fc165 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1734,6 +1734,7 @@ static int cpsw_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
return -ERANGE;
case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
diff --git a/drivers/net/ethernet/tile/tilegx.c b/drivers/net/ethernet/tile/tilegx.c
index 7c634bc..aec9538 100644
--- a/drivers/net/ethernet/tile/tilegx.c
+++ b/drivers/net/ethernet/tile/tilegx.c
@@ -512,6 +512,7 @@ static int tile_hwtstamp_set(struct net_device *dev, struct ifreq *rq)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
config.rx_filter = HWTSTAMP_FILTER_ALL;
break;
default:
diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index 8f036a7..77f04e7 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -225,9 +225,8 @@ static int net_hwtstamp_validate(struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
- rx_filter_valid = 1;
- break;
case HWTSTAMP_FILTER_NTP_ALL:
+ rx_filter_valid = 1;
break;
}
--
2.9.3
* [PATCH v6 net-next 3/7] net: add function to retrieve original skb device using NAPI ID
From: Miroslav Lichvar @ 2017-05-19 15:52 UTC
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519155241.15817-1-mlichvar@redhat.com>
Since commit b68581778cd0 ("net: Make skb->skb_iif always track
skb->dev") skbs don't have the original index of the interface which
received the packet. This information is now needed for a new control
message related to hardware timestamping.
Instead of adding a new field to skb, we can find the device by the NAPI
ID if it is available, i.e. CONFIG_NET_RX_BUSY_POLL is enabled and the
driver is using NAPI. Add dev_get_by_napi_id() and also skb_napi_id() to
hide the CONFIG_NET_RX_BUSY_POLL ifdef.
CC: Richard Cochran <richardcochran@gmail.com>
Suggested-by: Willem de Bruijn <willemb@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
include/linux/netdevice.h | 1 +
include/linux/skbuff.h | 9 +++++++++
net/core/dev.c | 26 ++++++++++++++++++++++++++
3 files changed, 36 insertions(+)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3f39d27..b6c36d5 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2456,6 +2456,7 @@ static inline int dev_recursion_level(void)
struct net_device *dev_get_by_index(struct net *net, int ifindex);
struct net_device *__dev_get_by_index(struct net *net, int ifindex);
struct net_device *dev_get_by_index_rcu(struct net *net, int ifindex);
+struct net_device *dev_get_by_napi_id(unsigned int napi_id);
int netdev_get_name(struct net *net, char *name, int ifindex);
int dev_restart(struct net_device *dev);
int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 7c0cb2c..1f8028c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -855,6 +855,15 @@ static inline bool skb_pkt_type_ok(u32 ptype)
return ptype <= PACKET_OTHERHOST;
}
+static inline unsigned int skb_napi_id(const struct sk_buff *skb)
+{
+#ifdef CONFIG_NET_RX_BUSY_POLL
+ return skb->napi_id;
+#else
+ return 0;
+#endif
+}
+
void kfree_skb(struct sk_buff *skb);
void kfree_skb_list(struct sk_buff *segs);
void skb_tx_error(struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index acd594c..6d3c452 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -162,6 +162,7 @@ static int netif_rx_internal(struct sk_buff *skb);
static int call_netdevice_notifiers_info(unsigned long val,
struct net_device *dev,
struct netdev_notifier_info *info);
+static struct napi_struct *napi_by_id(unsigned int napi_id);
/*
* The @dev_base_head list is protected by @dev_base_lock and the rtnl
@@ -866,6 +867,31 @@ struct net_device *dev_get_by_index(struct net *net, int ifindex)
EXPORT_SYMBOL(dev_get_by_index);
/**
+ * dev_get_by_napi_id - find a device by napi_id
+ * @napi_id: ID of the NAPI struct
+ *
+ * Search for an interface by NAPI ID. Returns %NULL if the device
+ * is not found or a pointer to the device. The device has not had
+ * its reference counter increased so the caller must be careful
+ * about locking. The caller must hold RCU lock.
+ */
+
+struct net_device *dev_get_by_napi_id(unsigned int napi_id)
+{
+ struct napi_struct *napi;
+
+ WARN_ON_ONCE(!rcu_read_lock_held());
+
+ if (napi_id < MIN_NAPI_ID)
+ return NULL;
+
+ napi = napi_by_id(napi_id);
+
+ return napi ? napi->dev : NULL;
+}
+EXPORT_SYMBOL(dev_get_by_napi_id);
+
+/**
* netdev_get_name - get a netdevice name, knowing its ifindex.
* @net: network namespace
* @name: a pointer to the buffer where the name will be stored.
--
2.9.3
^ permalink raw reply related
* [PATCH v6 net-next 4/7] net: add new control message for incoming HW-timestamped packets
From: Miroslav Lichvar @ 2017-05-19 15:52 UTC (permalink / raw)
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519155241.15817-1-mlichvar@redhat.com>
Add SOF_TIMESTAMPING_OPT_PKTINFO option to request a new control message
for incoming packets with hardware timestamps. It contains the index of
the real interface which received the packet and the length of the
packet at layer 2.
The index is useful with bonding, bridges and other interfaces, where
IP_PKTINFO doesn't allow applications to determine which PHC made the
timestamp. With the L2 length (and link speed) it is possible to
transpose preamble timestamps to trailer timestamps, which are used in
the NTP protocol.
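The transposition mentioned above is simple arithmetic. A minimal sketch, assuming the hardware timestamp marks the start of the frame and that the length passed in covers everything that follows on the wire (whether the reported L2 length includes the FCS may depend on the driver). The function names are hypothetical and not part of this patch:

```c
#include <assert.h>
#include <stdint.h>

/* Time in nanoseconds needed to put l2_len bytes on the wire at
 * speed_mbps Mb/s. One bit takes 1000 / speed_mbps nanoseconds.
 * Integer division truncates any sub-nanosecond remainder. */
static uint64_t frame_duration_ns(uint32_t l2_len, uint32_t speed_mbps)
{
	assert(speed_mbps > 0);
	return (uint64_t)l2_len * 8 * 1000 / speed_mbps;
}

/* Shift a timestamp taken at the start of the frame to the end of it. */
static uint64_t preamble_to_trailer_ns(uint64_t preamble_ts_ns,
				       uint32_t l2_len, uint32_t speed_mbps)
{
	return preamble_ts_ns + frame_duration_ns(l2_len, speed_mbps);
}
```

For example, a 64-byte frame at 1000 Mb/s occupies the wire for 512 ns, so the trailer timestamp is the preamble timestamp plus 512 ns.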
While this information could be provided by two new socket options
independently from timestamping, it doesn't look like they would be very
useful. With this option any performance impact is limited to hardware
timestamping.
Use dev_get_by_napi_id() to get the device and its index. On kernels
with disabled CONFIG_NET_RX_BUSY_POLL or drivers not using NAPI, a zero
index will be returned in the control message.
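For illustration, an application could pick the new control message out of a received msghdr roughly as follows. This is only a sketch: the mirror struct and find_ts_pktinfo() are hypothetical names, the fallback value 58 is the asm-generic value from this patch (other architectures may differ), and real code would use struct scm_ts_pktinfo from <linux/net_tstamp.h> once the headers provide it:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SCM_TIMESTAMPING_PKTINFO
#define SCM_TIMESTAMPING_PKTINFO 58	/* asm-generic value from this patch */
#endif

/* Local mirror of struct scm_ts_pktinfo as defined in this patch. */
struct ts_pktinfo_mirror {
	uint32_t if_index;
	uint32_t pkt_length;
	uint32_t reserved[2];
};

/* Returns 1 and fills *out if msg carries the control message, else 0. */
static int find_ts_pktinfo(struct msghdr *msg, struct ts_pktinfo_mirror *out)
{
	struct cmsghdr *cmsg;

	assert(msg && out);

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level != SOL_SOCKET ||
		    cmsg->cmsg_type != SCM_TIMESTAMPING_PKTINFO)
			continue;
		/* Skip messages too short to hold the struct. */
		if (cmsg->cmsg_len < CMSG_LEN(sizeof(*out)))
			continue;
		memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
		return 1;
	}
	return 0;
}
```

A caller should treat if_index == 0 as "interface unknown", which is what the kernel reports without CONFIG_NET_RX_BUSY_POLL or with a non-NAPI driver.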
CC: Richard Cochran <richardcochran@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
Documentation/networking/timestamping.txt | 10 ++++++++++
include/uapi/asm-generic/socket.h | 2 ++
include/uapi/linux/net_tstamp.h | 11 ++++++++++-
net/socket.c | 27 ++++++++++++++++++++++++++-
4 files changed, 48 insertions(+), 2 deletions(-)
diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index 96f5069..ce11e3a 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -193,6 +193,16 @@ SOF_TIMESTAMPING_OPT_STATS:
the transmit timestamps, such as how long a certain block of
data was limited by peer's receiver window.
+SOF_TIMESTAMPING_OPT_PKTINFO:
+
+ Enable the SCM_TIMESTAMPING_PKTINFO control message for incoming
+ packets with hardware timestamps. The message contains struct
+ scm_ts_pktinfo, which supplies the index of the real interface which
+ received the packet and its length at layer 2. A valid (non-zero)
+ interface index will be returned only if CONFIG_NET_RX_BUSY_POLL is
+ enabled and the driver is using NAPI. The struct contains also two
+ other fields, but they are reserved and undefined.
+
New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
regardless of the setting of sysctl net.core.tstamp_allow_data.
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 2b48856..a5f6e81 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -100,4 +100,6 @@
#define SO_COOKIE 57
+#define SCM_TIMESTAMPING_PKTINFO 58
+
#endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index 0749fb1..dee74d3 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -9,6 +9,7 @@
#ifndef _NET_TIMESTAMPING_H
#define _NET_TIMESTAMPING_H
+#include <linux/types.h>
#include <linux/socket.h> /* for SO_TIMESTAMPING */
/* SO_TIMESTAMPING gets an integer bit field comprised of these values */
@@ -26,8 +27,9 @@ enum {
SOF_TIMESTAMPING_OPT_CMSG = (1<<10),
SOF_TIMESTAMPING_OPT_TSONLY = (1<<11),
SOF_TIMESTAMPING_OPT_STATS = (1<<12),
+ SOF_TIMESTAMPING_OPT_PKTINFO = (1<<13),
- SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_STATS,
+ SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_PKTINFO,
SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
SOF_TIMESTAMPING_LAST
};
@@ -130,4 +132,11 @@ enum hwtstamp_rx_filters {
HWTSTAMP_FILTER_NTP_ALL,
};
+/* SCM_TIMESTAMPING_PKTINFO control message */
+struct scm_ts_pktinfo {
+ __u32 if_index;
+ __u32 pkt_length;
+ __u32 reserved[2];
+};
+
#endif /* _NET_TIMESTAMPING_H */
diff --git a/net/socket.c b/net/socket.c
index c2564eb..67db7d8 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -662,6 +662,27 @@ static bool skb_is_err_queue(const struct sk_buff *skb)
return skb->pkt_type == PACKET_OUTGOING;
}
+static void put_ts_pktinfo(struct msghdr *msg, struct sk_buff *skb)
+{
+ struct scm_ts_pktinfo ts_pktinfo;
+ struct net_device *orig_dev;
+
+ if (!skb_mac_header_was_set(skb))
+ return;
+
+ memset(&ts_pktinfo, 0, sizeof(ts_pktinfo));
+
+ rcu_read_lock();
+ orig_dev = dev_get_by_napi_id(skb_napi_id(skb));
+ if (orig_dev)
+ ts_pktinfo.if_index = orig_dev->ifindex;
+ rcu_read_unlock();
+
+ ts_pktinfo.pkt_length = skb->len - skb_mac_offset(skb);
+ put_cmsg(msg, SOL_SOCKET, SCM_TIMESTAMPING_PKTINFO,
+ sizeof(ts_pktinfo), &ts_pktinfo);
+}
+
/*
* called from sock_recv_timestamp() if sock_flag(sk, SOCK_RCVTSTAMP)
*/
@@ -699,8 +720,12 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
empty = 0;
if (shhwtstamps &&
(sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
- ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2))
+ ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
empty = 0;
+ if ((sk->sk_tsflags & SOF_TIMESTAMPING_OPT_PKTINFO) &&
+ !skb_is_err_queue(skb))
+ put_ts_pktinfo(msg, skb);
+ }
if (!empty) {
put_cmsg(msg, SOL_SOCKET,
SCM_TIMESTAMPING, sizeof(tss), &tss);
--
2.9.3
^ permalink raw reply related
* [PATCH v6 net-next 5/7] net: fix documentation of struct scm_timestamping
From: Miroslav Lichvar @ 2017-05-19 15:52 UTC (permalink / raw)
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519155241.15817-1-mlichvar@redhat.com>
The scm_timestamping struct may return multiple non-zero fields, e.g.
when both software and hardware RX timestamping is enabled, or when the
SO_TIMESTAMP(NS) option is combined with SCM_TIMESTAMPING and a false
software timestamp is generated in the recvmsg() call in order to always
return a SCM_TIMESTAMP(NS) message.
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
Documentation/networking/timestamping.txt | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index ce11e3a..50eb0e5 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -322,7 +322,7 @@ struct scm_timestamping {
};
The structure can return up to three timestamps. This is a legacy
-feature. Only one field is non-zero at any time. Most timestamps
+feature. At least one field is non-zero at any time. Most timestamps
are passed in ts[0]. Hardware timestamps are passed in ts[2].
ts[1] used to hold hardware timestamps converted to system time.
@@ -331,6 +331,12 @@ a HW PTP clock source, to allow time conversion in userspace and
optionally synchronize system time with a userspace PTP stack such
as linuxptp. For the PTP clock API, see Documentation/ptp/ptp.txt.
+Note that if the SO_TIMESTAMP or SO_TIMESTAMPNS option is enabled
+together with SO_TIMESTAMPING using SOF_TIMESTAMPING_SOFTWARE, a false
+software timestamp will be generated in the recvmsg() call and passed
+in ts[0] when a real software timestamp is missing. This happens also
+on hardware transmit timestamps.
+
2.1.1 Transmit timestamps with MSG_ERRQUEUE
For transmit timestamps the outgoing packet is looped back to the
--
2.9.3
^ permalink raw reply related
* [PATCH v6 net-next 6/7] net: allow simultaneous SW and HW transmit timestamping
From: Miroslav Lichvar @ 2017-05-19 15:52 UTC (permalink / raw)
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519155241.15817-1-mlichvar@redhat.com>
Add SOF_TIMESTAMPING_OPT_TX_SWHW option to allow an outgoing packet to
be looped to the socket's error queue with a software timestamp even
when a hardware transmit timestamp is expected to be provided by the
driver.
Applications using this option will receive two separate messages from
the error queue, one with a software timestamp and the other with a
hardware timestamp. As the hardware timestamp is saved to the shared skb
info, which may happen before the first message with software timestamp
is received by the application, the hardware timestamp is copied to the
SCM_TIMESTAMPING control message only when the skb has no software
timestamp or it is an incoming packet.
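From the application's side, the two error-queue messages can be told apart by which field of the returned struct is set. A minimal sketch, assuming a local mirror of struct scm_timestamping (real code would take the struct from <linux/errqueue.h>); the enum and function names are hypothetical. ts[2] is checked first because, with SO_TIMESTAMP(NS) enabled, a message carrying a hardware timestamp may also carry a false software timestamp in ts[0]:

```c
#include <assert.h>
#include <time.h>

/* Local mirror of struct scm_timestamping (three timestamps). */
struct scm_ts_mirror {
	struct timespec ts[3];
};

enum tx_ts_kind { TX_TS_NONE, TX_TS_SOFTWARE, TX_TS_HARDWARE };

static int ts_is_set(const struct timespec *ts)
{
	return ts->tv_sec != 0 || ts->tv_nsec != 0;
}

/* Classify one message read from the error queue. */
static enum tx_ts_kind classify_tx_timestamp(const struct scm_ts_mirror *tss)
{
	assert(tss);

	if (ts_is_set(&tss->ts[2]))	/* raw hardware timestamp */
		return TX_TS_HARDWARE;
	if (ts_is_set(&tss->ts[0]))	/* software timestamp */
		return TX_TS_SOFTWARE;
	return TX_TS_NONE;
}
```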
While changing sw_tx_timestamp(), inline it in skb_tx_timestamp() as
there are no other users.
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
Documentation/networking/timestamping.txt | 8 ++++++++
include/linux/skbuff.h | 10 ++--------
include/uapi/linux/net_tstamp.h | 3 ++-
net/core/skbuff.c | 4 ++++
net/socket.c | 20 ++++++++++++++++++--
5 files changed, 34 insertions(+), 11 deletions(-)
diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index 50eb0e5..196ba17 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -203,6 +203,14 @@ SOF_TIMESTAMPING_OPT_PKTINFO:
enabled and the driver is using NAPI. The struct contains also two
other fields, but they are reserved and undefined.
+SOF_TIMESTAMPING_OPT_TX_SWHW:
+
+ Request both hardware and software timestamps for outgoing packets
+ when SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE
+ are enabled at the same time. If both timestamps are generated,
+ two separate messages will be looped to the socket's error queue,
+ each containing just one timestamp.
+
New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
regardless of the setting of sysctl net.core.tstamp_allow_data.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 1f8028c..3b2e284 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3254,13 +3254,6 @@ void __skb_tstamp_tx(struct sk_buff *orig_skb,
void skb_tstamp_tx(struct sk_buff *orig_skb,
struct skb_shared_hwtstamps *hwtstamps);
-static inline void sw_tx_timestamp(struct sk_buff *skb)
-{
- if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
- !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
- skb_tstamp_tx(skb, NULL);
-}
-
/**
* skb_tx_timestamp() - Driver hook for transmit timestamping
*
@@ -3276,7 +3269,8 @@ static inline void sw_tx_timestamp(struct sk_buff *skb)
static inline void skb_tx_timestamp(struct sk_buff *skb)
{
skb_clone_tx_timestamp(skb);
- sw_tx_timestamp(skb);
+ if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP)
+ skb_tstamp_tx(skb, NULL);
}
/**
diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index dee74d3..3d421d9 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -28,8 +28,9 @@ enum {
SOF_TIMESTAMPING_OPT_TSONLY = (1<<11),
SOF_TIMESTAMPING_OPT_STATS = (1<<12),
SOF_TIMESTAMPING_OPT_PKTINFO = (1<<13),
+ SOF_TIMESTAMPING_OPT_TX_SWHW = (1<<14),
- SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_PKTINFO,
+ SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_TX_SWHW,
SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
SOF_TIMESTAMPING_LAST
};
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 346d3e8..68c02df 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3875,6 +3875,10 @@ void __skb_tstamp_tx(struct sk_buff *orig_skb,
if (!sk)
return;
+ if (!hwtstamps && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW) &&
+ skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)
+ return;
+
tsonly = sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TSONLY;
if (!skb_may_tx_timestamp(sk, tsonly))
return;
diff --git a/net/socket.c b/net/socket.c
index 67db7d8..cb355a7 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -662,6 +662,19 @@ static bool skb_is_err_queue(const struct sk_buff *skb)
return skb->pkt_type == PACKET_OUTGOING;
}
+/* On transmit, software and hardware timestamps are returned independently.
+ * As the two skb clones share the hardware timestamp, which may be updated
+ * before the software timestamp is received, a hardware TX timestamp may be
+ * returned only if there is no software TX timestamp. Ignore false software
+ * timestamps, which may be made in the __sock_recv_timestamp() call when the
+ * option SO_TIMESTAMP(NS) is enabled on the socket, even when the skb has a
+ * hardware timestamp.
+ */
+static bool skb_is_swtx_tstamp(const struct sk_buff *skb, int false_tstamp)
+{
+ return skb->tstamp && !false_tstamp && skb_is_err_queue(skb);
+}
+
static void put_ts_pktinfo(struct msghdr *msg, struct sk_buff *skb)
{
struct scm_ts_pktinfo ts_pktinfo;
@@ -691,14 +704,16 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
{
int need_software_tstamp = sock_flag(sk, SOCK_RCVTSTAMP);
struct scm_timestamping tss;
- int empty = 1;
+ int empty = 1, false_tstamp = 0;
struct skb_shared_hwtstamps *shhwtstamps =
skb_hwtstamps(skb);
/* Race occurred between timestamp enabling and packet
receiving. Fill in the current time for now. */
- if (need_software_tstamp && skb->tstamp == 0)
+ if (need_software_tstamp && skb->tstamp == 0) {
__net_timestamp(skb);
+ false_tstamp = 1;
+ }
if (need_software_tstamp) {
if (!sock_flag(sk, SOCK_RCVTSTAMPNS)) {
@@ -720,6 +735,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
empty = 0;
if (shhwtstamps &&
(sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
+ !skb_is_swtx_tstamp(skb, false_tstamp) &&
ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
empty = 0;
if ((sk->sk_tsflags & SOF_TIMESTAMPING_OPT_PKTINFO) &&
--
2.9.3
^ permalink raw reply related
* [PATCH v6 net-next 7/7] net: ethernet: update drivers to make both SW and HW TX timestamps
From: Miroslav Lichvar @ 2017-05-19 15:52 UTC (permalink / raw)
To: netdev; +Cc: Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519155241.15817-1-mlichvar@redhat.com>
Some drivers were calling the skb_tx_timestamp() function only when
a hardware timestamp was not requested. Now that applications can use
the SOF_TIMESTAMPING_OPT_TX_SWHW option to request both software and
hardware timestamps, the drivers need to be modified to unconditionally
call skb_tx_timestamp().
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 3 +--
drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c | 3 +--
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 ++----
4 files changed, 6 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index 89b21d7..5a2ad9c 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1391,8 +1391,7 @@ static void xgbe_prep_tx_tstamp(struct xgbe_prv_data *pdata,
spin_unlock_irqrestore(&pdata->tstamp_lock, flags);
}
- if (!XGMAC_GET_BITS(packet->attributes, TX_PACKET_ATTRIBUTES, PTP))
- skb_tx_timestamp(skb);
+ skb_tx_timestamp(skb);
}
static void xgbe_prep_vlan(struct sk_buff *skb, struct xgbe_packet_data *packet)
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 0ff9295..6ed3bc4 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5868,10 +5868,10 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
adapter->tx_hwtstamp_skb = skb_get(skb);
adapter->tx_hwtstamp_start = jiffies;
schedule_work(&adapter->tx_hwtstamp_work);
- } else {
- skb_tx_timestamp(skb);
}
+ skb_tx_timestamp(skb);
+
netdev_sent_queue(netdev, skb->len);
e1000_tx_queue(tx_ring, tx_flags, count);
/* Make sure there is space in the ring for the next send. */
diff --git a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
index 1e59435..89831ad 100644
--- a/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
+++ b/drivers/net/ethernet/samsung/sxgbe/sxgbe_main.c
@@ -1418,8 +1418,7 @@ static netdev_tx_t sxgbe_xmit(struct sk_buff *skb, struct net_device *dev)
priv->hw->desc->tx_enable_tstamp(first_desc);
}
- if (!tqueue->hwts_tx_en)
- skb_tx_timestamp(skb);
+ skb_tx_timestamp(skb);
priv->hw->dma->enable_dma_transmission(priv->ioaddr, txq_index);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index cce862b..27c12e7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2880,8 +2880,7 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
priv->xstats.tx_set_ic_bit++;
}
- if (!priv->hwts_tx_en)
- skb_tx_timestamp(skb);
+ skb_tx_timestamp(skb);
if (unlikely((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
priv->hwts_tx_en)) {
@@ -3084,8 +3083,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
priv->xstats.tx_set_ic_bit++;
}
- if (!priv->hwts_tx_en)
- skb_tx_timestamp(skb);
+ skb_tx_timestamp(skb);
/* Ready to fill the first descriptor and set the OWN bit w/o any
* problems because all the descriptors are actually ready to be
--
2.9.3
^ permalink raw reply related
* Re: [PATCH v6 net-next 5/7] net: fix documentation of struct scm_timestamping
From: Willem de Bruijn @ 2017-05-19 15:59 UTC (permalink / raw)
To: Miroslav Lichvar; +Cc: Network Development, Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519155241.15817-6-mlichvar@redhat.com>
On Fri, May 19, 2017 at 11:52 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> The scm_timestamping struct may return multiple non-zero fields, e.g.
> when both software and hardware RX timestamping is enabled, or when the
> SO_TIMESTAMP(NS) option is combined with SCM_TIMESTAMPING and a false
> software timestamp is generated in the recvmsg() call in order to always
> return a SCM_TIMESTAMP(NS) message.
>
> CC: Richard Cochran <richardcochran@gmail.com>
> CC: Willem de Bruijn <willemb@google.com>
> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply
* Re: [PATCH v6 net-next 6/7] net: allow simultaneous SW and HW transmit timestamping
From: Willem de Bruijn @ 2017-05-19 16:00 UTC (permalink / raw)
To: Miroslav Lichvar; +Cc: Network Development, Richard Cochran, Willem de Bruijn
In-Reply-To: <20170519155241.15817-7-mlichvar@redhat.com>
On Fri, May 19, 2017 at 11:52 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> Add SOF_TIMESTAMPING_OPT_TX_SWHW option to allow an outgoing packet to
> be looped to the socket's error queue with a software timestamp even
> when a hardware transmit timestamp is expected to be provided by the
> driver.
>
> Applications using this option will receive two separate messages from
> the error queue, one with a software timestamp and the other with a
> hardware timestamp. As the hardware timestamp is saved to the shared skb
> info, which may happen before the first message with software timestamp
> is received by the application, the hardware timestamp is copied to the
> SCM_TIMESTAMPING control message only when the skb has no software
> timestamp or it is an incoming packet.
>
> While changing sw_tx_timestamp(), inline it in skb_tx_timestamp() as
> there are no other users.
>
> CC: Richard Cochran <richardcochran@gmail.com>
> CC: Willem de Bruijn <willemb@google.com>
> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
^ permalink raw reply
* [PATCH net] bridge: fix hello and hold timers starting/stopping
From: Ivan Vecera @ 2017-05-19 16:25 UTC (permalink / raw)
To: netdev; +Cc: davem, sashok, stephen, bridge, lucien.xin, nikolay
Current bridge code incorrectly handles starting/stopping of hello and
hold timers during STP enable/disable.
1. Timers are stopped in br_stp_start() during the NO_STP->USER_STP
transition. This is not correct as the timers are already stopped in the
NO_STP state.
2. Timers are started in br_stp_stop() during the USER_STP->NO_STP transition.
This is also not correct as the timers should be stopped in the NO_STP
state.
3. Timers are NOT stopped in br_stp_stop() during the KERNEL_STP->NO_STP
transition. They should be stopped as they are running in the KERNEL_STP
state and should not run in the NO_STP state.
The patch is a follow-up for "bridge: start hello_timer when enabling
KERNEL_STP in br_stp_start" patch from Xin Long.
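The intended behaviour can be summarised as a single invariant: the timers run only in the KERNEL_STP state. The following is a toy user-space model of that invariant, not the bridge code itself. All names are hypothetical, and "start" is a simplification (in the kernel only the hello timer is armed when kernel STP is enabled, per Xin Long's patch):

```c
#include <assert.h>

enum stp_mode { MODE_NO_STP, MODE_KERNEL_STP, MODE_USER_STP };
enum timer_action { TIMERS_UNCHANGED, TIMERS_START, TIMERS_STOP };

/* Invariant from the commit message: timers run only under kernel STP. */
static int timers_should_run(enum stp_mode mode)
{
	return mode == MODE_KERNEL_STP;
}

/* What a mode change has to do to keep the invariant. */
static enum timer_action timer_action_for(enum stp_mode from, enum stp_mode to)
{
	int was_running = timers_should_run(from);
	int must_run = timers_should_run(to);

	if (was_running == must_run)
		return TIMERS_UNCHANGED;
	return must_run ? TIMERS_START : TIMERS_STOP;
}

/* The three cases listed above, expressed as checks. */
static void check_listed_cases(void)
{
	assert(timer_action_for(MODE_NO_STP, MODE_USER_STP) == TIMERS_UNCHANGED);
	assert(timer_action_for(MODE_USER_STP, MODE_NO_STP) == TIMERS_UNCHANGED);
	assert(timer_action_for(MODE_KERNEL_STP, MODE_NO_STP) == TIMERS_STOP);
}
```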
Cc: davem@davemloft.net
Cc: sashok@cumulusnetworks.com
Cc: stephen@networkplumber.org
Cc: bridge@lists.linux-foundation.org
Cc: lucien.xin@gmail.com
Cc: nikolay@cumulusnetworks.com
Signed-off-by: Ivan Vecera <cera@cera.cz>
---
net/bridge/br_stp_if.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 0db8102995a5..f137ebf27755 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -150,7 +150,6 @@ static int br_stp_call_user(struct net_bridge *br, char *arg)
static void br_stp_start(struct net_bridge *br)
{
- struct net_bridge_port *p;
int err = -ENOENT;
if (net_eq(dev_net(br->dev), &init_net))
@@ -169,11 +168,6 @@ static void br_stp_start(struct net_bridge *br)
if (!err) {
br->stp_enabled = BR_USER_STP;
br_debug(br, "userspace STP started\n");
-
- /* Stop hello and hold timers */
- del_timer(&br->hello_timer);
- list_for_each_entry(p, &br->port_list, list)
- del_timer(&p->hold_timer);
} else {
br->stp_enabled = BR_KERNEL_STP;
br_debug(br, "using kernel STP\n");
@@ -197,13 +191,14 @@ static void br_stp_stop(struct net_bridge *br)
br_err(br, "failed to stop userspace STP (%d)\n", err);
/* To start timers on any ports left in blocking */
- mod_timer(&br->hello_timer, jiffies + br->hello_time);
- list_for_each_entry(p, &br->port_list, list)
- mod_timer(&p->hold_timer,
- round_jiffies(jiffies + BR_HOLD_TIME));
spin_lock_bh(&br->lock);
br_port_state_selection(br);
spin_unlock_bh(&br->lock);
+ } else {
+ /* BR_KERNEL_STP - stop hello and hold timers */
+ del_timer(&br->hello_timer);
+ list_for_each_entry(p, &br->port_list, list)
+ del_timer(&p->hold_timer);
}
br->stp_enabled = BR_NO_STP;
--
2.13.0
^ permalink raw reply related
* Re: [PATCH net] bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
From: Ivan Vecera @ 2017-05-19 16:29 UTC (permalink / raw)
To: Xin Long; +Cc: network dev, David Miller, Nikolay Aleksandrov
In-Reply-To: <6e78bfea3969ca1a937b37e28e796a3047b82615.1495203629.git.lucien.xin@gmail.com>
2017-05-19 16:20 GMT+02:00 Xin Long <lucien.xin@gmail.com>:
> Since commit 76b91c32dd86 ("bridge: stp: when using userspace stp stop
> kernel hello and hold timers"), bridge would not start hello_timer if
> stp_enabled is not KERNEL_STP when br_dev_open.
>
> The problem is that even if users later set stp_enabled to KERNEL_STP,
> the timer will still not be started, so KERNEL_STP cannot really
> work. Users have to re-ifup the bridge to work around this.
>
> This patch fixes it by starting br->hello_timer when enabling
> KERNEL_STP in br_stp_start.
>
> As an improvement, it also restarts hello_timer only when
> br->stp_enabled is KERNEL_STP in br_hello_timer_expired; there is
> no reason to start the timer again when it's NO_STP.
>
> Fixes: 76b91c32dd86 ("bridge: stp: when using userspace stp stop kernel hello and hold timers")
> Reported-by: Haidong Li <haili@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
> net/bridge/br_stp_if.c | 1 +
> net/bridge/br_stp_timer.c | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 08341d2..0db8102 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -179,6 +179,7 @@ static void br_stp_start(struct net_bridge *br)
> br_debug(br, "using kernel STP\n");
>
> /* To start timers on any ports left in blocking */
> + mod_timer(&br->hello_timer, jiffies + br->hello_time);
> br_port_state_selection(br);
> }
>
> diff --git a/net/bridge/br_stp_timer.c b/net/bridge/br_stp_timer.c
> index c98b3e5..60b6fe2 100644
> --- a/net/bridge/br_stp_timer.c
> +++ b/net/bridge/br_stp_timer.c
> @@ -40,7 +40,7 @@ static void br_hello_timer_expired(unsigned long arg)
> if (br->dev->flags & IFF_UP) {
> br_config_bpdu_generation(br);
>
> - if (br->stp_enabled != BR_USER_STP)
> + if (br->stp_enabled == BR_KERNEL_STP)
> mod_timer(&br->hello_timer,
> round_jiffies(jiffies + br->hello_time));
> }
> --
> 2.1.0
>
Reviewed-by: Ivan Vecera <cera@cera.cz>
^ permalink raw reply
* Re: [PATCH net-next V5 0/9] vhost_net rx batch dequeuing
From: Michael S. Tsirkin @ 2017-05-19 16:36 UTC (permalink / raw)
To: Jason Wang; +Cc: netdev, linux-kernel
In-Reply-To: <0d1dbf31-32c8-34b4-d8e8-48d04f2fc205@redhat.com>
On Fri, May 19, 2017 at 02:27:16PM +0800, Jason Wang wrote:
>
>
> On 2017年05月18日 04:59, Michael S. Tsirkin wrote:
> > On Wed, May 17, 2017 at 12:14:36PM +0800, Jason Wang wrote:
> > > This series tries to implement rx batching for vhost-net. This is done
> > > by batching the dequeuing from the skb_array which was exported by
> > > the underlying socket and passing the skb back through msg_control to finish
> > > userspace copying. This is also a requirement for more batching
> > > implementation on the rx path.
> > >
> > > Tests show at most a 7.56% improvement in rx pps on top of batch
> > > zeroing and no obvious changes for TCP_STREAM/TCP_RR result.
> > >
> > > Please review.
> > >
> > > Thanks
> > A surprisingly large gain for such a simple change. It would be nice
> > to understand better why this helps - in particular, does the optimal
> > batch size change if ring is bigger or smaller?
>
> Will test, just want to confirm. You mean virtio ring not tx_queue_len here?
>
> Thanks
Exactly.
Thanks,
MST
^ permalink raw reply
* Re: [PATCH net] bridge: fix hello and hold timers starting/stopping
From: Stephen Hemminger @ 2017-05-19 16:38 UTC (permalink / raw)
To: Ivan Vecera; +Cc: lucien.xin, nikolay, netdev, bridge, davem
In-Reply-To: <20170519162543.31670-1-cera@cera.cz>
On Fri, 19 May 2017 18:25:43 +0200
Ivan Vecera <cera@cera.cz> wrote:
> Current bridge code incorrectly handles starting/stopping of hello and
> hold timers during STP enable/disable.
>
> 1. Timers are stopped in br_stp_start() during NO_STP->USER_STP
> transition. This is not correct as the timers are stopped in NO_STP
> case.
>
> 2. Timers are started in br_stp_stop() during USER_STP->NO_STP transition.
> This is also not correct as the timers should be stopped in NO_STP
> state.
>
> 3. Timers are NOT stopped in br_stp_stop() during KERNEL_STP->NO_STP
> transition. They should be stopped as they are running in KERNEL_STP
> state and should not run in NO_STP case.
>
> The patch is a follow-up for "bridge: start hello_timer when enabling
> KERNEL_STP in br_stp_start" patch from Xin Long.
>
> Cc: davem@davemloft.net
> Cc: sashok@cumulusnetworks.com
> Cc: stephen@networkplumber.org
> Cc: bridge@lists.linux-foundation.org
> Cc: lucien.xin@gmail.com
> Cc: nikolay@cumulusnetworks.com
> Signed-off-by: Ivan Vecera <cera@cera.cz>
Overall, this looks correct but the wording of commit message
is too terse.
It would be better to add a more complete description of the impact
of this from a user's point of view. I am concerned that this
might have other side effects.
For example, what is the sequence of commands to validate this?
What is the impact? Should this go to stable?
^ permalink raw reply
* Re: [PATCH net-next] net: sched: provide stubs for tcf_chain_{get,put} for CONFIG_NET_CLS=n
From: Cong Wang @ 2017-05-19 16:50 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim
In-Reply-To: <01ab2339375fd161310945aa4eb77c3ecb46b26a.1495124042.git.sd@queasysnail.net>
On Thu, May 18, 2017 at 9:24 AM, Sabrina Dubroca <sd@queasysnail.net> wrote:
> This also changes tcf_chain_get() to return an error pointer instead of
> NULL, so that tcf_action_goto_chain_init() can differentiate memory
> allocation failure from lack of support.
>
> Fixes: 5bc1701881e3 ("net: sched: introduce multichain support for filters")
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> ---
> I'm not sure this EOPNOTSUPP is really necessary, ie if we can really
> reach the tcf_action_goto_chain_init() call when CONFIG_NET_CLS=n.
> If not, a simpler patch would add a tcf_chain_get() stub that just
> returns NULL, as we wouldn't have to care about returning an incorrect
> error code from tcf_action_goto_chain_init().
I wonder if we should just make CONFIG_NET_CLS_ACT depend
on CONFIG_NET_CLS. Without filters we can still have
standalone actions, but no one can use them.
^ permalink raw reply
* Re: [PATCH net] bridge: fix hello and hold timers starting/stopping
From: Xin Long @ 2017-05-19 16:51 UTC (permalink / raw)
To: Ivan Vecera
Cc: network dev, davem, sashok, Stephen Hemminger, bridge,
Nikolay Aleksandrov
In-Reply-To: <20170519162543.31670-1-cera@cera.cz>
On Sat, May 20, 2017 at 12:25 AM, Ivan Vecera <cera@cera.cz> wrote:
> Current bridge code incorrectly handles starting/stopping of hello and
> hold timers during STP enable/disable.
>
> 1. Timers are stopped in br_stp_start() during NO_STP->USER_STP
> transition. This is not correct as the timers are stopped in NO_STP
> case.
>
> 2. Timers are started in br_stp_stop() during USER_STP->NO_STP transition.
> This is also not correct as the timers should be stopped in NO_STP
> state.
>
> 3. Timers are NOT stopped in br_stp_stop() during KERNEL_STP->NO_STP
> transition. They should be stopped as they are running in KERNEL_STP
> state and should not run in NO_STP case.
>
> The patch is a follow-up for "bridge: start hello_timer when enabling
> KERNEL_STP in br_stp_start" patch from Xin Long.
>
> Cc: davem@davemloft.net
> Cc: sashok@cumulusnetworks.com
> Cc: stephen@networkplumber.org
> Cc: bridge@lists.linux-foundation.org
> Cc: lucien.xin@gmail.com
> Cc: nikolay@cumulusnetworks.com
> Signed-off-by: Ivan Vecera <cera@cera.cz>
> ---
> net/bridge/br_stp_if.c | 15 +++++----------
> 1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 0db8102995a5..f137ebf27755 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -150,7 +150,6 @@ static int br_stp_call_user(struct net_bridge *br, char *arg)
>
> static void br_stp_start(struct net_bridge *br)
> {
> - struct net_bridge_port *p;
> int err = -ENOENT;
>
> if (net_eq(dev_net(br->dev), &init_net))
> @@ -169,11 +168,6 @@ static void br_stp_start(struct net_bridge *br)
> if (!err) {
> br->stp_enabled = BR_USER_STP;
> br_debug(br, "userspace STP started\n");
> -
> - /* Stop hello and hold timers */
> - del_timer(&br->hello_timer);
> - list_for_each_entry(p, &br->port_list, list)
> - del_timer(&p->hold_timer);
> } else {
> br->stp_enabled = BR_KERNEL_STP;
> br_debug(br, "using kernel STP\n");
> @@ -197,13 +191,14 @@ static void br_stp_stop(struct net_bridge *br)
> br_err(br, "failed to stop userspace STP (%d)\n", err);
>
> /* To start timers on any ports left in blocking */
> - mod_timer(&br->hello_timer, jiffies + br->hello_time);
> - list_for_each_entry(p, &br->port_list, list)
> - mod_timer(&p->hold_timer,
> - round_jiffies(jiffies + BR_HOLD_TIME));
> spin_lock_bh(&br->lock);
> br_port_state_selection(br);
> spin_unlock_bh(&br->lock);
> + } else {
> + /* BR_KERNEL_STP - stop hello and hold timers */
> + del_timer(&br->hello_timer);
> + list_for_each_entry(p, &br->port_list, list)
> + del_timer(&p->hold_timer);
I'm wondering: what if the timers are running when we delete them?
del_timer() may fail to delete a timer whose handler is executing, so the
timer would still have to stop itself the next time it fires, when
br->stp_enabled == BR_NO_STP.
So do you think it's better to do nothing here and just leave it to
stop by itself when br_hello_timer_expired checks br->stp_enabled?
> }
>
> br->stp_enabled = BR_NO_STP;
> --
> 2.13.0
>
^ permalink raw reply
* Re: [PATCH net] bridge: fix hello and hold timers starting/stopping
From: Nikolay Aleksandrov @ 2017-05-19 16:55 UTC (permalink / raw)
To: Ivan Vecera, netdev; +Cc: davem, sashok, stephen, bridge, lucien.xin
In-Reply-To: <20170519162543.31670-1-cera@cera.cz>
On 5/19/17 7:25 PM, Ivan Vecera wrote:
> Current bridge code incorrectly handles starting/stopping of hello and
> hold timers during STP enable/disable.
>
> 1. Timers are stopped in br_stp_start() during NO_STP->USER_STP
> transition. This is not correct as the timers are stopped in NO_STP
> case.
This really is a noop, but ok.
>
> 2. Timers are started in br_stp_stop() during USER_STP->NO_STP transition.
> This is not also correct as the timers should be stopped in NO_STP
> state.
Indeed, but the end result is almost the same as if they were stopped, because
the timer paths specifically check whether STP == KERNEL_STP (see
br_transmit_config()), and the hold_timers will simply expire and not rearm
in any other mode. The only real problem is the hello_timer, which continues
to rearm itself, but Xin's earlier patch takes care of that too.
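To make the point above concrete, here is a minimal user-space sketch (not the
bridge code itself) of timers that die out on their own once the mode changes.
All names (model_bridge, model_hello_expired, model_hold_expired, model_tick)
are hypothetical, and it assumes the hello handler rearms only in KERNEL_STP,
as described for Xin's patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the bridge STP modes discussed in this thread. */
enum model_stp_mode { MODEL_NO_STP, MODEL_KERNEL_STP, MODEL_USER_STP };

struct model_bridge {
	enum model_stp_mode stp_enabled;
	bool hello_pending;	/* models a pending br->hello_timer */
	bool hold_pending;	/* models one port's pending hold_timer */
	int bpdus_sent;
};

/* Hello handler: assumed to rearm only while kernel STP is active. */
static void model_hello_expired(struct model_bridge *br)
{
	br->hello_pending = false;		/* the timer has fired */
	if (br->stp_enabled == MODEL_KERNEL_STP) {
		br->bpdus_sent++;		/* would generate config BPDUs */
		br->hello_pending = true;	/* rearm for the next interval */
	}
}

/* Hold handler: expires and does not rearm itself. */
static void model_hold_expired(struct model_bridge *br)
{
	br->hold_pending = false;
}

/* Fire each pending timer once; return how many are pending afterwards. */
static int model_tick(struct model_bridge *br)
{
	if (br->hello_pending)
		model_hello_expired(br);
	if (br->hold_pending)
		model_hold_expired(br);
	return br->hello_pending + br->hold_pending;
}
```

In this model, switching stp_enabled to MODEL_NO_STP without touching the
timers leaves nothing pending after one more expiry, which is why the change
reads as a cleanup rather than a fix.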
>
> 3. Timers are NOT stopped in br_stp_stop() during KERNEL_STP->NO_STP
> transition. They should be stopped as they are running in KERNEL_STP
> state and should not run in NO_STP case.
Same comment as for point 2.
>
> The patch is a follow-up for "bridge: start hello_timer when enabling
> KERNEL_STP in br_stp_start" patch from Xin Long.
>
I'd say this is more of a cleanup/improvement after Xin's patch and thus would
suggest targeting net-next. The only real issue is fixed by his patch.
> Cc: davem@davemloft.net
> Cc: sashok@cumulusnetworks.com
> Cc: stephen@networkplumber.org
> Cc: bridge@lists.linux-foundation.org
> Cc: lucien.xin@gmail.com
> Cc: nikolay@cumulusnetworks.com
> Signed-off-by: Ivan Vecera <cera@cera.cz>
> ---
> net/bridge/br_stp_if.c | 15 +++++----------
> 1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 0db8102995a5..f137ebf27755 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -150,7 +150,6 @@ static int br_stp_call_user(struct net_bridge *br, char *arg)
>
> static void br_stp_start(struct net_bridge *br)
> {
> - struct net_bridge_port *p;
> int err = -ENOENT;
>
> if (net_eq(dev_net(br->dev), &init_net))
> @@ -169,11 +168,6 @@ static void br_stp_start(struct net_bridge *br)
> if (!err) {
> br->stp_enabled = BR_USER_STP;
> br_debug(br, "userspace STP started\n");
> -
> - /* Stop hello and hold timers */
> - del_timer(&br->hello_timer);
> - list_for_each_entry(p, &br->port_list, list)
> - del_timer(&p->hold_timer);
> } else {
> br->stp_enabled = BR_KERNEL_STP;
> br_debug(br, "using kernel STP\n");
> @@ -197,13 +191,14 @@ static void br_stp_stop(struct net_bridge *br)
> br_err(br, "failed to stop userspace STP (%d)\n", err);
>
> /* To start timers on any ports left in blocking */
> - mod_timer(&br->hello_timer, jiffies + br->hello_time);
> - list_for_each_entry(p, &br->port_list, list)
> - mod_timer(&p->hold_timer,
> - round_jiffies(jiffies + BR_HOLD_TIME));
> spin_lock_bh(&br->lock);
> br_port_state_selection(br);
> spin_unlock_bh(&br->lock);
> + } else {
> + /* BR_KERNEL_STP - stop hello and hold timers */
> + del_timer(&br->hello_timer);
> + list_for_each_entry(p, &br->port_list, list)
> + del_timer(&p->hold_timer);
> }
>
> br->stp_enabled = BR_NO_STP;
>
^ permalink raw reply
* Re: [PATCH net] bridge: fix hello and hold timers starting/stopping
From: Ivan Vecera @ 2017-05-19 16:55 UTC (permalink / raw)
To: Xin Long
Cc: network dev, davem, sashok, Stephen Hemminger, bridge,
Nikolay Aleksandrov
In-Reply-To: <CADvbK_eRRy5bXb1RN6p-DTp9cKp6MvCZGH8oL=gfhTrJ0URFqQ@mail.gmail.com>
2017-05-19 18:51 GMT+02:00 Xin Long <lucien.xin@gmail.com>:
> On Sat, May 20, 2017 at 12:25 AM, Ivan Vecera <cera@cera.cz> wrote:
>> Current bridge code incorrectly handles starting/stopping of hello and
>> hold timers during STP enable/disable.
>>
>> 1. Timers are stopped in br_stp_start() during NO_STP->USER_STP
>> transition. This is not correct as the timers are stopped in NO_STP
>> case.
>>
>> 2. Timers are started in br_stp_stop() during USER_STP->NO_STP transition.
>> This is not also correct as the timers should be stopped in NO_STP
>> state.
>>
>> 3. Timers are NOT stopped in br_stp_stop() during KERNEL_STP->NO_STP
>> transition. They should be stopped as they are running in KERNEL_STP
>> state and should not run in NO_STP case.
>>
>> The patch is a follow-up for "bridge: start hello_timer when enabling
>> KERNEL_STP in br_stp_start" patch from Xin Long.
>>
>> Cc: davem@davemloft.net
>> Cc: sashok@cumulusnetworks.com
>> Cc: stephen@networkplumber.org
>> Cc: bridge@lists.linux-foundation.org
>> Cc: lucien.xin@gmail.com
>> Cc: nikolay@cumulusnetworks.com
>> Signed-off-by: Ivan Vecera <cera@cera.cz>
>> ---
>> net/bridge/br_stp_if.c | 15 +++++----------
>> 1 file changed, 5 insertions(+), 10 deletions(-)
>>
>> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
>> index 0db8102995a5..f137ebf27755 100644
>> --- a/net/bridge/br_stp_if.c
>> +++ b/net/bridge/br_stp_if.c
>> @@ -150,7 +150,6 @@ static int br_stp_call_user(struct net_bridge *br, char *arg)
>>
>> static void br_stp_start(struct net_bridge *br)
>> {
>> - struct net_bridge_port *p;
>> int err = -ENOENT;
>>
>> if (net_eq(dev_net(br->dev), &init_net))
>> @@ -169,11 +168,6 @@ static void br_stp_start(struct net_bridge *br)
>> if (!err) {
>> br->stp_enabled = BR_USER_STP;
>> br_debug(br, "userspace STP started\n");
>> -
>> - /* Stop hello and hold timers */
>> - del_timer(&br->hello_timer);
>> - list_for_each_entry(p, &br->port_list, list)
>> - del_timer(&p->hold_timer);
>> } else {
>> br->stp_enabled = BR_KERNEL_STP;
>> br_debug(br, "using kernel STP\n");
>> @@ -197,13 +191,14 @@ static void br_stp_stop(struct net_bridge *br)
>> br_err(br, "failed to stop userspace STP (%d)\n", err);
>>
>> /* To start timers on any ports left in blocking */
>> - mod_timer(&br->hello_timer, jiffies + br->hello_time);
>> - list_for_each_entry(p, &br->port_list, list)
>> - mod_timer(&p->hold_timer,
>> - round_jiffies(jiffies + BR_HOLD_TIME));
>> spin_lock_bh(&br->lock);
>> br_port_state_selection(br);
>> spin_unlock_bh(&br->lock);
>> + } else {
>> + /* BR_KERNEL_STP - stop hello and hold timers */
>> + del_timer(&br->hello_timer);
>> + list_for_each_entry(p, &br->port_list, list)
>> + del_timer(&p->hold_timer);
> I'm thinking, what if the timers are running when deleting them ?
> del_timer may not be going to delete it, and still have to stop itself
> next time when br->stp_enabled = BR_NO_STP.
>
> So do you think it's better to do nothing here and just leave it to be
> stopped by itself when checking br->stp_enabled in
> br_hello_timer_expired ?
Yes, this kind of "lazy stopping" could be safer.
I.
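For readers following along, the race Xin describes can be sketched with a
small user-space model. This is only an illustration under stated assumptions:
sketch_timer, sketch_del_timer and the handler helpers are hypothetical names,
and the model keeps just one property of del_timer(), namely that it
deactivates a pending timer (returning 1) but neither waits for nor cancels a
handler that is already running (returning 0).

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_mode { SKETCH_NO_STP, SKETCH_KERNEL_STP };

struct sketch_timer {
	bool pending;	/* queued, handler not yet invoked */
	bool running;	/* handler currently executing */
};

/* Models del_timer(): only a pending timer is deactivated. */
static int sketch_del_timer(struct sketch_timer *t)
{
	int was_pending = t->pending;

	t->pending = false;
	return was_pending;
}

/* The timer core picks the timer up: it stops being pending and runs. */
static void sketch_handler_begin(struct sketch_timer *t)
{
	t->pending = false;
	t->running = true;
}

/*
 * The handler finishes. Without the lazy check it always rearms; with it,
 * it rearms only while kernel STP is still enabled.
 */
static void sketch_handler_end(struct sketch_timer *t, enum sketch_mode mode,
			       bool lazy_check)
{
	if (!lazy_check || mode == SKETCH_KERNEL_STP)
		t->pending = true;
	t->running = false;
}
```

If sketch_del_timer() is called between sketch_handler_begin() and
sketch_handler_end(), it returns 0 and an unconditional rearm leaves the timer
pending again. With the mode check in the handler, the timer stays idle
whether or not the delete raced with it.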
^ permalink raw reply
* [PATCH net-next 0/9] warning cleanups
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
This series addresses a number of warnings in common networking
code that are visible when the kernel is built with W=1.
The DCB patch needs review from John Fastabend, the original author
of the netlink interface, since it adds missing checks.
Stephen Hemminger (9):
dcb: enforce minimum length on IEEE_APPS attribute
ila: propagate error code in ila_output
udp: make local function static
inet: fix warning about missing prototype
tcpnv: do not export local function
xfrm: make xfrm_dev_register static
fou: make local function static
ipv6: drop unused variables in seg6_genl_dumphac
ipv6: remove unused variables in esp6
 net/dcb/dcbnl.c                 | 11 ++++--
 net/ipv4/fou.c                  | 82 ++++++++++++++++++++---------------------
 net/ipv4/inet_connection_sock.c |  1 +
 net/ipv4/tcp_nv.c               |  5 +--
 net/ipv4/udp.c                  |  2 +-
 net/ipv6/esp6.c                 |  5 ---
 net/ipv6/fou6.c                 | 14 +++----
 net/ipv6/ila/ila_lwt.c          |  2 +-
 net/ipv6/seg6.c                 |  4 --
 net/ipv6/udp.c                  |  2 +-
 net/xfrm/xfrm_device.c          |  2 +-
 11 files changed, 61 insertions(+), 69 deletions(-)
--
2.11.0
^ permalink raw reply