Netdev List


* Re: [PATCH net] vxlan: update skb dst pmtu on tx path
From: Xin Long @ 2017-12-19 17:05 UTC (permalink / raw)
  To: David Miller; +Cc: network dev, Jiri Benc
In-Reply-To: <20171219.111207.1046172818461753483.davem@davemloft.net>

On Wed, Dec 20, 2017 at 12:12 AM, David Miller <davem@davemloft.net> wrote:
> From: Xin Long <lucien.xin@gmail.com>
> Date: Mon, 18 Dec 2017 14:20:56 +0800
>
>> Unlike ip tunnels, now vxlan doesn't do any pmtu update for
>> upper dst pmtu, even if it doesn't match the lower dst pmtu
>> any more.
>>
>> The problem can be reproduced when reducing the vxlan lower
>> dev's pmtu when running netperf. In jianlin's testing, the
>> performance went to 1/7 of the previous.
>>
>> This patch is to update the upper dst pmtu to match the lower
>> dst pmtu on tx path so that packets can be sent out even when
>> lower dev's pmtu has been changed.
>>
>> It also works for metadata dst.
>>
>> Note that this patch doesn't process any pmtu icmp packets.
>> But even if udp tunnels gain support for processing pmtu icmp
>> packets in the future, that support will also need this.
>>
>> The same thing will be done for geneve in another patch.
>>
>> Signed-off-by: Xin Long <lucien.xin@gmail.com>
>
> Yikes...
>
> You're going to have to find a way to fix this without
> invoking ->update_pmtu() on every single transmit.  That's
> really excessive, especially for an operation which is
> going to be a NOP 99.9999% of the time.
Understood. I couldn't find a better way, and all ip tunnels
do it this way.

Or would it be acceptable to use unlikely() here?

                if (unlikely(skb_dst(skb) && mtu < dst_mtu(skb_dst(skb))))
                        skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
                                                       skb, mtu);



>
> We need some way, instead, for the MTU change event to propagate
> properly.  I know this might be hard, but doing this in the transmit
> handler on every packet to deal with it is not the way to go.
How about doing it in vxlan_get_route():
@@ -1896,6 +1896,13 @@ static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan, struct net_device
                *saddr = fl4.saddr;
                if (use_cache)
                        dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
+
+               if (skb_dst(skb)) {
+                       int mtu = dst_mtu(ndst) - VXLAN_HEADROOM;
+
+                       skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
+                                                      skb, mtu);
+               }


This would only do it when there is no dst_cache hit and a real route
lookup is needed.

Note that even when update_pmtu is called, it will mostly do nothing and
just return, since the new mtu is usually >= skb_dst(skb)'s pmtu.
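
To make the trade-off concrete, here is a minimal user-space sketch of the
gating idea discussed above: the update hook is only invoked when the lower
MTU minus the tunnel headroom is smaller than the upper dst's recorded pmtu.
Everything named sketch_* or SKETCH_*, the struct layout, and the 50-byte
headroom are assumptions made for illustration; this is not the vxlan code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the upper (inner) dst entry. */
struct upper_dst {
	unsigned int pmtu;         /* path MTU currently recorded */
	unsigned int update_calls; /* how often the update hook ran */
};

/* Assumed encapsulation overhead for this sketch only. */
#define SKETCH_HEADROOM 50u

/* Stand-in for dst->ops->update_pmtu(): only ever lowers the value. */
static void sketch_update_pmtu(struct upper_dst *dst, unsigned int mtu)
{
	dst->update_calls++;
	if (mtu < dst->pmtu)
		dst->pmtu = mtu;
}

/*
 * Run the update hook only when the lower device's MTU, minus the
 * headroom, has dropped below the upper dst's pmtu.
 * Returns true if the hook was invoked.
 */
static bool sketch_tx_check(struct upper_dst *dst, unsigned int lower_mtu)
{
	unsigned int mtu;

	/* Guard against unsigned underflow on absurdly small MTUs. */
	if (lower_mtu <= SKETCH_HEADROOM)
		return false;

	mtu = lower_mtu - SKETCH_HEADROOM;
	if (mtu < dst->pmtu) {
		sketch_update_pmtu(dst, mtu);
		return true;
	}
	return false;
}
```

In this model the comparison still runs on every call, but the hook itself
only runs on the first packet after the lower MTU shrinks; later packets see
matching values and skip it.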


>
> Thanks.
>


* [PATCH net-next 2/4] sfc: support variable number of MAC stats
From: Edward Cree @ 2017-12-19 17:05 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev
In-Reply-To: <62aabe1f-6148-922c-03fb-afec440b4a0e@solarflare.com>

Medford2 NICs support more than MC_CMD_MAC_NSTATS stats, and report the new
 count in a field of MC_CMD_GET_CAPABILITIES_V4.  This also means that the
 end generation count moves (it is, as before, the last 64 bits of the DMA
 buffer, but that is no longer MC_CMD_MAC_GENERATION_END).
So read num_mac_stats from the GET_CAPABILITIES response, if present;
 otherwise assume MC_CMD_MAC_NSTATS; and always use num_mac_stats - 1 rather
 than MC_CMD_MAC_GENERATION_END.
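
As an illustration of the indexing described above, the following
stand-alone sketch derives the DMA buffer length and the position of the
end generation count from a runtime stat count. The sketch_* helpers, the
default of 96 and the all-ones sentinel are placeholders chosen for the
example and are not taken from the driver headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder for the compile-time default stat count. */
#define SKETCH_DEFAULT_NSTATS 96u
/* Placeholder sentinel meaning "no complete snapshot yet". */
#define SKETCH_GENERATION_INVALID UINT64_MAX

/* Use the firmware-reported count if there is one, else the default. */
static uint16_t sketch_num_mac_stats(int fw_reported, uint16_t fw_count)
{
	return fw_reported ? fw_count : (uint16_t)SKETCH_DEFAULT_NSTATS;
}

/* The end generation count is the last 64-bit word of the buffer. */
static size_t sketch_generation_end_index(uint16_t num_mac_stats)
{
	return (size_t)num_mac_stats - 1;
}

/* Buffer length in bytes for the given stat count. */
static size_t sketch_dma_len(uint16_t num_mac_stats)
{
	return (size_t)num_mac_stats * sizeof(uint64_t);
}

/* A snapshot is usable once the end word no longer holds the sentinel. */
static int sketch_snapshot_ready(const uint64_t *dma_stats,
				 uint16_t num_mac_stats)
{
	size_t end = sketch_generation_end_index(num_mac_stats);

	return dma_stats[end] != SKETCH_GENERATION_INVALID;
}
```

The point of the sketch is only that both the buffer size and the end index
follow from one runtime value, so they cannot drift apart.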

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c       | 23 ++++++++++++++++++-----
 drivers/net/ethernet/sfc/efx.c        |  2 ++
 drivers/net/ethernet/sfc/mcdi_port.c  | 10 +++++-----
 drivers/net/ethernet/sfc/net_driver.h |  3 +++
 drivers/net/ethernet/sfc/siena.c      |  2 +-
 5 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 1f64c7f60943..27b981e7e786 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -233,7 +233,7 @@ static int efx_ef10_get_vf_index(struct efx_nic *efx)
 
 static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
 {
-	MCDI_DECLARE_BUF(outbuf, MC_CMD_GET_CAPABILITIES_V3_OUT_LEN);
+	MCDI_DECLARE_BUF(outbuf, MC_CMD_GET_CAPABILITIES_V4_OUT_LEN);
 	struct efx_ef10_nic_data *nic_data = efx->nic_data;
 	size_t outlen;
 	int rc;
@@ -306,6 +306,19 @@ static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
 			  efx->vi_stride);
 	}
 
+	if (outlen >= MC_CMD_GET_CAPABILITIES_V4_OUT_LEN) {
+		efx->num_mac_stats = MCDI_WORD(outbuf,
+				GET_CAPABILITIES_V4_OUT_MAC_STATS_NUM_STATS);
+		netif_dbg(efx, probe, efx->net_dev,
+			  "firmware reports num_mac_stats = %u\n",
+			  efx->num_mac_stats);
+	} else {
+		/* leave num_mac_stats as the default value, MC_CMD_MAC_NSTATS */
+		netif_dbg(efx, probe, efx->net_dev,
+			  "firmware did not report num_mac_stats, assuming %u\n",
+			  efx->num_mac_stats);
+	}
+
 	return 0;
 }
 
@@ -1850,7 +1863,7 @@ static int efx_ef10_try_update_nic_stats_pf(struct efx_nic *efx)
 
 	dma_stats = efx->stats_buffer.addr;
 
-	generation_end = dma_stats[MC_CMD_MAC_GENERATION_END];
+	generation_end = dma_stats[efx->num_mac_stats - 1];
 	if (generation_end == EFX_MC_STATS_GENERATION_INVALID)
 		return 0;
 	rmb();
@@ -1898,7 +1911,7 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
 	DECLARE_BITMAP(mask, EF10_STAT_COUNT);
 	__le64 generation_start, generation_end;
 	u64 *stats = nic_data->stats;
-	u32 dma_len = MC_CMD_MAC_NSTATS * sizeof(u64);
+	u32 dma_len = efx->num_mac_stats * sizeof(u64);
 	struct efx_buffer stats_buf;
 	__le64 *dma_stats;
 	int rc;
@@ -1923,7 +1936,7 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
 	}
 
 	dma_stats = stats_buf.addr;
-	dma_stats[MC_CMD_MAC_GENERATION_END] = EFX_MC_STATS_GENERATION_INVALID;
+	dma_stats[efx->num_mac_stats - 1] = EFX_MC_STATS_GENERATION_INVALID;
 
 	MCDI_SET_QWORD(inbuf, MAC_STATS_IN_DMA_ADDR, stats_buf.dma_addr);
 	MCDI_POPULATE_DWORD_1(inbuf, MAC_STATS_IN_CMD,
@@ -1942,7 +1955,7 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
 		goto out;
 	}
 
-	generation_end = dma_stats[MC_CMD_MAC_GENERATION_END];
+	generation_end = dma_stats[efx->num_mac_stats - 1];
 	if (generation_end == EFX_MC_STATS_GENERATION_INVALID) {
 		WARN_ON_ONCE(1);
 		goto out;
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 7bcbedce07a5..3780161de5a1 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -2983,6 +2983,8 @@ static int efx_init_struct(struct efx_nic *efx,
 		efx->type->rx_ts_offset - efx->type->rx_prefix_size;
 	spin_lock_init(&efx->stats_lock);
 	efx->vi_stride = EFX_DEFAULT_VI_STRIDE;
+	efx->num_mac_stats = MC_CMD_MAC_NSTATS;
+	BUILD_BUG_ON(MC_CMD_MAC_NSTATS - 1 != MC_CMD_MAC_GENERATION_END);
 	mutex_init(&efx->mac_lock);
 	efx->phy_op = &efx_dummy_phy_operations;
 	efx->mdio.dev = net_dev;
diff --git a/drivers/net/ethernet/sfc/mcdi_port.c b/drivers/net/ethernet/sfc/mcdi_port.c
index 6e1f282b2976..65ee1a468170 100644
--- a/drivers/net/ethernet/sfc/mcdi_port.c
+++ b/drivers/net/ethernet/sfc/mcdi_port.c
@@ -1087,7 +1087,7 @@ static int efx_mcdi_mac_stats(struct efx_nic *efx,
 	int period = action == EFX_STATS_ENABLE ? 1000 : 0;
 	dma_addr_t dma_addr = efx->stats_buffer.dma_addr;
 	u32 dma_len = action != EFX_STATS_DISABLE ?
-		MC_CMD_MAC_NSTATS * sizeof(u64) : 0;
+		efx->num_mac_stats * sizeof(u64) : 0;
 
 	BUILD_BUG_ON(MC_CMD_MAC_STATS_OUT_DMA_LEN != 0);
 
@@ -1121,7 +1121,7 @@ void efx_mcdi_mac_start_stats(struct efx_nic *efx)
 {
 	__le64 *dma_stats = efx->stats_buffer.addr;
 
-	dma_stats[MC_CMD_MAC_GENERATION_END] = EFX_MC_STATS_GENERATION_INVALID;
+	dma_stats[efx->num_mac_stats - 1] = EFX_MC_STATS_GENERATION_INVALID;
 
 	efx_mcdi_mac_stats(efx, EFX_STATS_ENABLE, 0);
 }
@@ -1139,10 +1139,10 @@ void efx_mcdi_mac_pull_stats(struct efx_nic *efx)
 	__le64 *dma_stats = efx->stats_buffer.addr;
 	int attempts = EFX_MAC_STATS_WAIT_ATTEMPTS;
 
-	dma_stats[MC_CMD_MAC_GENERATION_END] = EFX_MC_STATS_GENERATION_INVALID;
+	dma_stats[efx->num_mac_stats - 1] = EFX_MC_STATS_GENERATION_INVALID;
 	efx_mcdi_mac_stats(efx, EFX_STATS_PULL, 0);
 
-	while (dma_stats[MC_CMD_MAC_GENERATION_END] ==
+	while (dma_stats[efx->num_mac_stats - 1] ==
 				EFX_MC_STATS_GENERATION_INVALID &&
 			attempts-- != 0)
 		udelay(EFX_MAC_STATS_WAIT_US);
@@ -1167,7 +1167,7 @@ int efx_mcdi_port_probe(struct efx_nic *efx)
 
 	/* Allocate buffer for stats */
 	rc = efx_nic_alloc_buffer(efx, &efx->stats_buffer,
-				  MC_CMD_MAC_NSTATS * sizeof(u64), GFP_KERNEL);
+				  efx->num_mac_stats * sizeof(u64), GFP_KERNEL);
 	if (rc)
 		return rc;
 	netif_dbg(efx, probe, efx->net_dev,
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 6b8730a24513..4cedc5c4c6d9 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -774,6 +774,8 @@ struct vfdi_status;
  * @port_initialized: Port initialized?
  * @net_dev: Operating system network device. Consider holding the rtnl lock
  * @fixed_features: Features which cannot be turned off
+ * @num_mac_stats: Number of MAC stats reported by firmware (MAC_STATS_NUM_STATS
+ *	field of %MC_CMD_GET_CAPABILITIES_V4 response, or %MC_CMD_MAC_NSTATS)
  * @stats_buffer: DMA buffer for statistics
  * @phy_type: PHY type
  * @phy_op: PHY interface
@@ -922,6 +924,7 @@ struct efx_nic {
 
 	netdev_features_t fixed_features;
 
+	u16 num_mac_stats;
 	struct efx_buffer stats_buffer;
 	u64 rx_nodesc_drops_total;
 	u64 rx_nodesc_drops_while_down;
diff --git a/drivers/net/ethernet/sfc/siena.c b/drivers/net/ethernet/sfc/siena.c
index 22d49ebb347c..ae8645ae4492 100644
--- a/drivers/net/ethernet/sfc/siena.c
+++ b/drivers/net/ethernet/sfc/siena.c
@@ -555,7 +555,7 @@ static int siena_try_update_nic_stats(struct efx_nic *efx)
 
 	dma_stats = efx->stats_buffer.addr;
 
-	generation_end = dma_stats[MC_CMD_MAC_GENERATION_END];
+	generation_end = dma_stats[efx->num_mac_stats - 1];
 	if (generation_end == EFX_MC_STATS_GENERATION_INVALID)
 		return 0;
 	rmb();


* [PATCH net-next 3/4] sfc: expose FEC stats on Medford2
From: Edward Cree @ 2017-12-19 17:06 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev
In-Reply-To: <62aabe1f-6148-922c-03fb-afec440b4a0e@solarflare.com>

There's no explicit capability bit, so we just condition them on having
 efx->num_mac_stats >= MC_CMD_MAC_NSTATS_V2.
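
A small stand-alone sketch of that gating follows. Stats with an index of
64 or more live in the second u64 of the mask, at bit (index - 64), and the
FEC bits are only ORed in when the reported stat count reaches the V2
threshold. The enum values, the threshold of 104 and all sketch_* names are
hypothetical; the sketch also enables the V1 high bits unconditionally,
which the driver does not.

```c
#include <stdint.h>

/* Hypothetical stat indices; the real enum is much longer. */
enum sketch_stat {
	SKETCH_STAT_V1_COUNT = 70,
	SKETCH_STAT_fec_uncorrected = SKETCH_STAT_V1_COUNT,
	SKETCH_STAT_fec_corrected,
	SKETCH_STAT_COUNT
};

/* Placeholder for the stat count that implies V2 support. */
#define SKETCH_NSTATS_V2 104u

/* Bits for the FEC stats within the second 64-bit mask word. */
#define SKETCH_FEC_MASK (					\
	(1ULL << (SKETCH_STAT_fec_uncorrected - 64)) |		\
	(1ULL << (SKETCH_STAT_fec_corrected - 64)))

/* Build the second mask word for a given firmware stat count. */
static uint64_t sketch_high_mask(unsigned int num_mac_stats)
{
	/* All V1 stats with index >= 64. */
	uint64_t mask = (1ULL << (SKETCH_STAT_V1_COUNT - 64)) - 1;

	/* Only show FEC stats when the count reaches the V2 threshold. */
	if (num_mac_stats >= SKETCH_NSTATS_V2)
		mask |= SKETCH_FEC_MASK;
	return mask;
}

/* Test whether a stat with index >= 64 is enabled in the mask word. */
static int sketch_stat_visible(uint64_t high_mask, enum sketch_stat stat)
{
	return (int)((high_mask >> (stat - 64)) & 1);
}
```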

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c | 24 +++++++++++++++++++++++-
 drivers/net/ethernet/sfc/nic.h  |  7 +++++++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 27b981e7e786..352ca43a7395 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1643,6 +1643,12 @@ static const struct efx_hw_stat_desc efx_ef10_stat_desc[EF10_STAT_COUNT] = {
 	EF10_DMA_STAT(tx_bad, VADAPTER_TX_BAD_PACKETS),
 	EF10_DMA_STAT(tx_bad_bytes, VADAPTER_TX_BAD_BYTES),
 	EF10_DMA_STAT(tx_overflow, VADAPTER_TX_OVERFLOW),
+	EF10_DMA_STAT(fec_uncorrected_errors, FEC_UNCORRECTED_ERRORS),
+	EF10_DMA_STAT(fec_corrected_errors, FEC_CORRECTED_ERRORS),
+	EF10_DMA_STAT(fec_corrected_symbols_lane0, FEC_CORRECTED_SYMBOLS_LANE0),
+	EF10_DMA_STAT(fec_corrected_symbols_lane1, FEC_CORRECTED_SYMBOLS_LANE1),
+	EF10_DMA_STAT(fec_corrected_symbols_lane2, FEC_CORRECTED_SYMBOLS_LANE2),
+	EF10_DMA_STAT(fec_corrected_symbols_lane3, FEC_CORRECTED_SYMBOLS_LANE3),
 };
 
 #define HUNT_COMMON_STAT_MASK ((1ULL << EF10_STAT_port_tx_bytes) |	\
@@ -1718,6 +1724,19 @@ static const struct efx_hw_stat_desc efx_ef10_stat_desc[EF10_STAT_COUNT] = {
 	(1ULL << EF10_STAT_port_rx_dp_hlb_fetch) |			\
 	(1ULL << EF10_STAT_port_rx_dp_hlb_wait))
 
+/* These statistics are only provided if the NIC supports MC_CMD_MAC_STATS_V2,
+ * indicated by returning a value >= MC_CMD_MAC_NSTATS_V2 in
+ * MC_CMD_GET_CAPABILITIES_V4_OUT_MAC_STATS_NUM_STATS.
+ * These bits are in the second u64 of the raw mask.
+ */
+#define EF10_FEC_STAT_MASK (						\
+	(1ULL << (EF10_STAT_fec_uncorrected_errors - 64)) |		\
+	(1ULL << (EF10_STAT_fec_corrected_errors - 64)) |		\
+	(1ULL << (EF10_STAT_fec_corrected_symbols_lane0 - 64)) |	\
+	(1ULL << (EF10_STAT_fec_corrected_symbols_lane1 - 64)) |	\
+	(1ULL << (EF10_STAT_fec_corrected_symbols_lane2 - 64)) |	\
+	(1ULL << (EF10_STAT_fec_corrected_symbols_lane3 - 64)))
+
 static u64 efx_ef10_raw_stat_mask(struct efx_nic *efx)
 {
 	u64 raw_mask = HUNT_COMMON_STAT_MASK;
@@ -1756,10 +1775,13 @@ static void efx_ef10_get_stat_mask(struct efx_nic *efx, unsigned long *mask)
 	if (nic_data->datapath_caps &
 	    (1 << MC_CMD_GET_CAPABILITIES_OUT_EVB_LBN)) {
 		raw_mask[0] |= ~((1ULL << EF10_STAT_rx_unicast) - 1);
-		raw_mask[1] = (1ULL << (EF10_STAT_COUNT - 63)) - 1;
+		raw_mask[1] = (1ULL << (EF10_STAT_V1_COUNT - 64)) - 1;
 	} else {
 		raw_mask[1] = 0;
 	}
+	/* Only show FEC stats when NIC supports MC_CMD_MAC_STATS_V2 */
+	if (efx->num_mac_stats >= MC_CMD_MAC_NSTATS_V2)
+		raw_mask[1] |= EF10_FEC_STAT_MASK;
 
 #if BITS_PER_LONG == 64
 	BUILD_BUG_ON(BITS_TO_LONGS(EF10_STAT_COUNT) != 2);
diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 7b51b6371724..e39e7b399252 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -325,6 +325,13 @@ enum {
 	EF10_STAT_tx_bad,
 	EF10_STAT_tx_bad_bytes,
 	EF10_STAT_tx_overflow,
+	EF10_STAT_V1_COUNT,
+	EF10_STAT_fec_uncorrected_errors = EF10_STAT_V1_COUNT,
+	EF10_STAT_fec_corrected_errors,
+	EF10_STAT_fec_corrected_symbols_lane0,
+	EF10_STAT_fec_corrected_symbols_lane1,
+	EF10_STAT_fec_corrected_symbols_lane2,
+	EF10_STAT_fec_corrected_symbols_lane3,
 	EF10_STAT_COUNT
 };
 


* [PATCH net-next 4/4] sfc: expose CTPIO stats on NICs that support them
From: Edward Cree @ 2017-12-19 17:06 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev
In-Reply-To: <62aabe1f-6148-922c-03fb-afec440b4a0e@solarflare.com>

From: Bert Kenward <bkenward@solarflare.com>

While the Linux driver doesn't use CTPIO ('cut-through programmed I/O'),
 other drivers on the same port might, so if we're responsible for
 reporting per-port stats we need to include the CTPIO stats.

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c | 50 +++++++++++++++++++++++++++++++++++++++++
 drivers/net/ethernet/sfc/nic.h  | 17 ++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 352ca43a7395..8ae467db9162 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1649,6 +1649,23 @@ static const struct efx_hw_stat_desc efx_ef10_stat_desc[EF10_STAT_COUNT] = {
 	EF10_DMA_STAT(fec_corrected_symbols_lane1, FEC_CORRECTED_SYMBOLS_LANE1),
 	EF10_DMA_STAT(fec_corrected_symbols_lane2, FEC_CORRECTED_SYMBOLS_LANE2),
 	EF10_DMA_STAT(fec_corrected_symbols_lane3, FEC_CORRECTED_SYMBOLS_LANE3),
+	EF10_DMA_STAT(ctpio_dmabuf_start, CTPIO_DMABUF_START),
+	EF10_DMA_STAT(ctpio_vi_busy_fallback, CTPIO_VI_BUSY_FALLBACK),
+	EF10_DMA_STAT(ctpio_long_write_success, CTPIO_LONG_WRITE_SUCCESS),
+	EF10_DMA_STAT(ctpio_missing_dbell_fail, CTPIO_MISSING_DBELL_FAIL),
+	EF10_DMA_STAT(ctpio_overflow_fail, CTPIO_OVERFLOW_FAIL),
+	EF10_DMA_STAT(ctpio_underflow_fail, CTPIO_UNDERFLOW_FAIL),
+	EF10_DMA_STAT(ctpio_timeout_fail, CTPIO_TIMEOUT_FAIL),
+	EF10_DMA_STAT(ctpio_noncontig_wr_fail, CTPIO_NONCONTIG_WR_FAIL),
+	EF10_DMA_STAT(ctpio_frm_clobber_fail, CTPIO_FRM_CLOBBER_FAIL),
+	EF10_DMA_STAT(ctpio_invalid_wr_fail, CTPIO_INVALID_WR_FAIL),
+	EF10_DMA_STAT(ctpio_vi_clobber_fallback, CTPIO_VI_CLOBBER_FALLBACK),
+	EF10_DMA_STAT(ctpio_unqualified_fallback, CTPIO_UNQUALIFIED_FALLBACK),
+	EF10_DMA_STAT(ctpio_runt_fallback, CTPIO_RUNT_FALLBACK),
+	EF10_DMA_STAT(ctpio_success, CTPIO_SUCCESS),
+	EF10_DMA_STAT(ctpio_fallback, CTPIO_FALLBACK),
+	EF10_DMA_STAT(ctpio_poison, CTPIO_POISON),
+	EF10_DMA_STAT(ctpio_erase, CTPIO_ERASE),
 };
 
 #define HUNT_COMMON_STAT_MASK ((1ULL << EF10_STAT_port_tx_bytes) |	\
@@ -1737,6 +1754,30 @@ static const struct efx_hw_stat_desc efx_ef10_stat_desc[EF10_STAT_COUNT] = {
 	(1ULL << (EF10_STAT_fec_corrected_symbols_lane2 - 64)) |	\
 	(1ULL << (EF10_STAT_fec_corrected_symbols_lane3 - 64)))
 
+/* These statistics are only provided if the NIC supports MC_CMD_MAC_STATS_V3,
+ * indicated by returning a value >= MC_CMD_MAC_NSTATS_V3 in
+ * MC_CMD_GET_CAPABILITIES_V4_OUT_MAC_STATS_NUM_STATS.
+ * These bits are in the second u64 of the raw mask.
+ */
+#define EF10_CTPIO_STAT_MASK (						\
+	(1ULL << (EF10_STAT_ctpio_dmabuf_start - 64)) |			\
+	(1ULL << (EF10_STAT_ctpio_vi_busy_fallback - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_long_write_success - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_missing_dbell_fail - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_overflow_fail - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_underflow_fail - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_timeout_fail - 64)) |			\
+	(1ULL << (EF10_STAT_ctpio_noncontig_wr_fail - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_frm_clobber_fail - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_invalid_wr_fail - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_vi_clobber_fallback - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_unqualified_fallback - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_runt_fallback - 64)) |		\
+	(1ULL << (EF10_STAT_ctpio_success - 64)) |			\
+	(1ULL << (EF10_STAT_ctpio_fallback - 64)) |			\
+	(1ULL << (EF10_STAT_ctpio_poison - 64)) |			\
+	(1ULL << (EF10_STAT_ctpio_erase - 64)))
+
 static u64 efx_ef10_raw_stat_mask(struct efx_nic *efx)
 {
 	u64 raw_mask = HUNT_COMMON_STAT_MASK;
@@ -1783,6 +1824,15 @@ static void efx_ef10_get_stat_mask(struct efx_nic *efx, unsigned long *mask)
 	if (efx->num_mac_stats >= MC_CMD_MAC_NSTATS_V2)
 		raw_mask[1] |= EF10_FEC_STAT_MASK;
 
+	/* CTPIO stats appear in V3. Only show them on devices that actually
+	 * support CTPIO. Although this driver doesn't use CTPIO others might,
+	 * and we may be reporting the stats for the underlying port.
+	 */
+	if (efx->num_mac_stats >= MC_CMD_MAC_NSTATS_V3 &&
+	    (nic_data->datapath_caps2 &
+	     (1 << MC_CMD_GET_CAPABILITIES_V4_OUT_CTPIO_LBN)))
+		raw_mask[1] |= EF10_CTPIO_STAT_MASK;
+
 #if BITS_PER_LONG == 64
 	BUILD_BUG_ON(BITS_TO_LONGS(EF10_STAT_COUNT) != 2);
 	mask[0] = raw_mask[0];
diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index e39e7b399252..763052214525 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -332,6 +332,23 @@ enum {
 	EF10_STAT_fec_corrected_symbols_lane1,
 	EF10_STAT_fec_corrected_symbols_lane2,
 	EF10_STAT_fec_corrected_symbols_lane3,
+	EF10_STAT_ctpio_dmabuf_start,
+	EF10_STAT_ctpio_vi_busy_fallback,
+	EF10_STAT_ctpio_long_write_success,
+	EF10_STAT_ctpio_missing_dbell_fail,
+	EF10_STAT_ctpio_overflow_fail,
+	EF10_STAT_ctpio_underflow_fail,
+	EF10_STAT_ctpio_timeout_fail,
+	EF10_STAT_ctpio_noncontig_wr_fail,
+	EF10_STAT_ctpio_frm_clobber_fail,
+	EF10_STAT_ctpio_invalid_wr_fail,
+	EF10_STAT_ctpio_vi_clobber_fallback,
+	EF10_STAT_ctpio_unqualified_fallback,
+	EF10_STAT_ctpio_runt_fallback,
+	EF10_STAT_ctpio_success,
+	EF10_STAT_ctpio_fallback,
+	EF10_STAT_ctpio_poison,
+	EF10_STAT_ctpio_erase,
 	EF10_STAT_COUNT
 };
 


* Re: [PATCH net-next v5 1/4] phylib: Add device reset delay support
From: Rob Herring @ 2017-12-19 17:12 UTC (permalink / raw)
  To: Richard Leitner
  Cc: Richard Leitner, Mark Rutland, Fugang Duan, Andrew Lunn,
	Florian Fainelli, Frank Rowand, David Miller, Geert Uytterhoeven,
	Sergei Shtylyov, Baruch Siach, David Wu, Lukasz Majewski, netdev,
	devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1b230e6b-8e8c-28a4-7524-6ce60aa71591-WcANXNA0UjBBDgjK7y7TUQ@public.gmane.org>

On Mon, Dec 18, 2017 at 1:30 AM, Richard Leitner
<richard.leitner-WcANXNA0UjBBDgjK7y7TUQ@public.gmane.org> wrote:
> Hi Rob,
>
> On 12/15/2017 11:17 PM, Rob Herring wrote:
>> On Mon, Dec 11, 2017 at 01:16:57PM +0100, Richard Leitner wrote:
>>> From: Richard Leitner <richard.leitner-WcANXNA0UjBBDgjK7y7TUQ@public.gmane.org>
>>>
>>> Some PHYs need a minimum time after the reset gpio was asserted and/or
>>> deasserted. To ensure we meet these timing requirements add two new
>>> optional devicetree parameters for the phy: reset-delay-us and
>>> reset-post-delay-us.
>>>
>>> Signed-off-by: Richard Leitner <richard.leitner-WcANXNA0UjBBDgjK7y7TUQ@public.gmane.org>
>>> Reviewed-by: Geert Uytterhoeven <geert+renesas-gXvu3+zWzMSzQB+pC5nmwQ@public.gmane.org>
>>> ---
>>>  Documentation/devicetree/bindings/net/phy.txt | 10 ++++++++++
>>>  drivers/net/phy/mdio_device.c                 | 13 +++++++++++--
>>>  drivers/of/of_mdio.c                          |  4 ++++
>>>  include/linux/mdio.h                          |  2 ++
>>>  4 files changed, 27 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/Documentation/devicetree/bindings/net/phy.txt b/Documentation/devicetree/bindings/net/phy.txt
>>> index c05479f5ac7c..72860ce7f610 100644
>>> --- a/Documentation/devicetree/bindings/net/phy.txt
>>> +++ b/Documentation/devicetree/bindings/net/phy.txt
>>> @@ -55,6 +55,12 @@ Optional Properties:
>>>
>>>  - reset-gpios: The GPIO phandle and specifier for the PHY reset signal.
>>>
>>> +- reset-delay-us: Delay after the reset was asserted in microseconds.
>>> +  If this property is missing the delay will be skipped.
>>> +
>>> +- reset-post-delay-us: Delay after the reset was deasserted in microseconds.
>>> +  If this property is missing the delay will be skipped.
>>
>> I think these names could be clearer as to exactly what they mean.
>> Looking at existing properties with "reset-delay" there's a mixture of
>> definitions whether it is the assert time or the time after deassert.
>>
>> So I'd call these "reset-assert-us" and "reset-deassert-us".
>
> Ok, that would be fine with me, but are you sure that we should omit the
> "-delay" term completely?

These are just minimum assertion and deassertion times. I think that's
clear enough.

> What would be the best approach to post this change (as the patchset was
> already merged to net-next)? A separate patch or a v6 of the complete
> patchset?

A patch on top of this.

Rob


* Re: [PATCH v5 0/6] enable creating [k,u]probe with perf_event_open
From: Song Liu @ 2017-12-19 17:24 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Peter Zijlstra, Steven Rostedt, mingo@redhat.com, David Miller,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Kernel Team
In-Reply-To: <530f908c-9c34-2db9-1d98-db1c790fa81b@iogearbox.net>


> On Dec 8, 2017, at 11:57 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> 
> On 12/06/2017 11:45 PM, Song Liu wrote:
>> Changes PATCH v4 to PATCH v5:
>>  Remove PERF_PROBE_CONFIG_IS_RETPROBE from uapi, use PMU_FORMAT_ATTR
>>  instead.
>> 
>> Changes PATCH v3 to PATCH v4:
>>  Remove uapi define MAX_PROBE_FUNC_NAME_LEN, use KSYM_NAME_LEN instead.
>>  Add flag PERF_PROBE_CONFIG_IS_RETPROBE for config field of [k,u]probe.
>>  Optimize ifdef's of CONFIG_KPROBE_EVENTS and CONFIG_UPROBE_EVENTS.
>>  Optimize checks in perf_event_is_tracing().
>>  Optimize perf_tp_register().
>> 
>> Changes PATCH v2 to PATCH v3:
>>  Remove fixed type PERF_TYPE_KPROBE and PERF_TYPE_UPROBE, use dynamic
>>  type instead.
>>  Update userspace (samples/bpf, bcc) to look up type from sysfs.
>>  Change License info in test_many_kprobe_user.c as Philippe Ombredanne
>>  suggested.
>> 
>> Changes PATCH v1 to PATCH v2:
>>  Split PERF_TYPE_PROBE into PERF_TYPE_KPROBE and PERF_TYPE_UPROBE.
>>  Split perf_probe into perf_kprobe and perf_uprobe.
>>  Remove struct probe_desc, use config1 and config2 instead.
>> 
>> Changes RFC v2 to PATCH v1:
>>  Check type PERF_TYPE_PROBE in perf_event_set_filter().
>>  Rebase on to tip perf/core.
>> 
>> Changes RFC v1 to RFC v2:
>>  Fix build issue reported by kbuild test bot by adding ifdef of
>>  CONFIG_KPROBE_EVENTS, and CONFIG_UPROBE_EVENTS.
>> 
>> RFC v1 cover letter:
>> 
>> This is to follow up the discussion over "new kprobe api" at Linux
>> Plumbers 2017:
>> 
>> https://www.linuxplumbersconf.org/2017/ocw/proposals/4808
>> 
>> With the current kernel, user space tools can only create/destroy
>> [k,u]probes with a text-based API (kprobe_events and uprobe_events in
>> tracefs). This approach relies on user space to clean up the [k,u]probes
>> after using them. However, it is not easy for user space to do so properly.
>> 
>> To solve this problem, we introduce a file descriptor based API.
>> Specifically, we extended perf_event_open to create [k,u]probe, and attach
>> this [k,u]probe to the file descriptor created by perf_event_open. These
>> [k,u]probe are associated with this file descriptor, so they are not
>> available in tracefs.
>> 
>> We reuse a large portion of the existing trace_kprobe and trace_uprobe code.
>> Currently, the file descriptor API does not support arguments as the
>> text-based API does. This should not be a problem, as users of the file
>> descriptor based API read data through other methods (bpf, etc.).
>> 
>> I also include a patch to bcc, and a patch to the perf_event_open man page.
>> Please see the list below. A fork of bcc with this patch is also available
>> on github:
>> 
>>  https://github.com/liu-song-6/bcc/tree/perf_event_open
> 
> Peter / Stephen, I presume this will be routed through one of you, if
> not please yell.
> 
> Thanks,
> Daniel
> 

Dear Peter, 

Do you have any further feedback or comments on these patches? If not, could
you please confirm they are ready to merge?

Thanks and Best Regards,
Song


* [PATCH] net: Fix double free and memory corruption in get_net_ns_by_id()
From: Eric W. Biederman @ 2017-12-19 17:27 UTC (permalink / raw)
  To: netdev
  Cc: David Miller, ktkhai, security, secalert, eric.dumazet, stephen,
	Nicolas Dichtel
In-Reply-To: <20171218.152320.827340066666299976.davem@davemloft.net>


(I can trivially verify that that idr_remove in cleanup_net happens
 after the network namespace count has dropped to zero --EWB)

Function get_net_ns_by_id() does not check for net::count
after it has found a peer in netns_ids idr.

It may dereference a peer after its count has already been
finally decremented. This leads to a double free and memory
corruption:

put_net(peer)                                   rtnl_lock()
atomic_dec_and_test(&peer->count) [count=0]     ...
__put_net(peer)                                 get_net_ns_by_id(net, id)
  spin_lock(&cleanup_list_lock)
  list_add(&net->cleanup_list, &cleanup_list)
  spin_unlock(&cleanup_list_lock)
queue_work()                                      peer = idr_find(&net->netns_ids, id)
  |                                               get_net(peer) [count=1]
  |                                               ...
  |                                               (use after final put)
  v                                               ...
  cleanup_net()                                   ...
    spin_lock(&cleanup_list_lock)                 ...
    list_replace_init(&cleanup_list, ..)          ...
    spin_unlock(&cleanup_list_lock)               ...
    ...                                           ...
    ...                                           put_net(peer)
    ...                                             atomic_dec_and_test(&peer->count) [count=0]
    ...                                               spin_lock(&cleanup_list_lock)
    ...                                               list_add(&net->cleanup_list, &cleanup_list)
    ...                                               spin_unlock(&cleanup_list_lock)
    ...                                             queue_work()
    ...                                           rtnl_unlock()
    rtnl_lock()                                   ...
    for_each_net(tmp) {                           ...
      id = __peernet2id(tmp, peer)                ...
      spin_lock_irq(&tmp->nsid_lock)              ...
      idr_remove(&tmp->netns_ids, id)             ...
      ...                                         ...
      net_drop_ns()                               ...
        net_free(peer)                            ...
    }                                             ...
  |
  v
  cleanup_net()
    ...
    (Second free of peer)

Also, put_net() on the right cpu may reorder with the left cpu's
list_replace_init(&cleanup_list, ..), and then cleanup_list
will be corrupted.

Since cleanup_net() is executed in a worker thread, while
put_net(peer) can happen anywhere, there should be
enough time for a concurrent get_net_ns_by_id() to pick
the peer up, so the race does not seem unlikely.
The patch fixes the problem in the standard way.

(Also, there is a possible problem in peernet2id_alloc(), which requires
a check of net::count under nsid_lock and maybe_get_net(peer), but
in the current stable kernel it's used under rtnl_lock() and it has to be
safe. Openswitch has begun to use peernet2id_alloc(), and possibly it should
be fixed too. As this is not in the stable kernel yet, I'll send
a separate message to netdev@ later).
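
For readers unfamiliar with the pattern, the "standard way" referred to
above is an increment-unless-zero on the reference count: a lookup may only
take a reference if the count has not already reached zero. A minimal
user-space sketch with C11 atomics follows; the sketch_* names and the
struct are illustrative stand-ins, not the kernel helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative object with a reference count. */
struct sketch_net {
	atomic_int count;
};

/* Increment *count unless it is already zero. Returns true on success. */
static bool sketch_inc_not_zero(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return true;
	}
	return false;
}

/* Take a reference only if the object is still live, else return NULL. */
static struct sketch_net *sketch_maybe_get(struct sketch_net *net)
{
	return sketch_inc_not_zero(&net->count) ? net : NULL;
}

/* Drop a reference. Returns true if this was the final put. */
static bool sketch_put(struct sketch_net *net)
{
	return atomic_fetch_sub(&net->count, 1) == 1;
}
```

With an unconditional increment, a lookup racing with the final put would
raise the count from 0 back to 1 and later trigger a second final put.
With the conditional form the lookup sees NULL instead.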

Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Fixes: 0c7aecd4bde4 "netns: add rtnl cmd to add and get peer netns ids"
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 net/core/net_namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index b797832565d3..60a71be75aea 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -267,7 +267,7 @@ struct net *get_net_ns_by_id(struct net *net, int id)
 	spin_lock_bh(&net->nsid_lock);
 	peer = idr_find(&net->netns_ids, id);
 	if (peer)
-		get_net(peer);
+		peer = maybe_get_net(peer);
 	spin_unlock_bh(&net->nsid_lock);
 	rcu_read_unlock();
 
-- 
2.14.1


* Re: [PATCH net] ipv4: Fix use-after-free when flushing FIB tables
From: Alexander Duyck @ 2017-12-19 17:34 UTC (permalink / raw)
  To: David Miller
  Cc: Ido Schimmel, Netdev, Duyck, Alexander H, Fengguang Wu,
	David Ahern, mlxsw
In-Reply-To: <20171219.113221.332118764426444996.davem@davemloft.net>

On Tue, Dec 19, 2017 at 8:32 AM, David Miller <davem@davemloft.net> wrote:
> From: Ido Schimmel <idosch@mellanox.com>
> Date: Mon, 18 Dec 2017 10:13:20 +0200
>
>> Since commit 0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse") the
>> local table uses the same trie allocated for the main table when custom
>> rules are not in use.
>>
>> When a net namespace is dismantled, the main table is flushed and freed
>> (via an RCU callback) before the local table. In case the callback is
>> invoked before the local table is iterated, a use-after-free can occur.
>>
>> Fix this by iterating over the FIB tables in reverse order, so that the
>> main table is always freed after the local table.
>>
>> Fixes: 0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse")
>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
>
> This is really too clever of a fix I think :-)
>
> I would prefer if we fixed things more explicitly.
>
> In struct fib_table you can add a "data_ref" integer.  Any pointer
> reference created to fib_table->__data increases this counter.  It is
> always done inside of RTNL locking, so should be doable without
> atomics or extra locking.
>
> For a non-aliased fib_table we go:
>
>         if (!--fib_table->data_ref)
>                 kfree(fib_table);
>
> And for aliased ones we do something like:
>
>         if (fib_table->tb_data != fib_table->__data) {
>                 void *data = fib_table->tb_data;
>                 struct fib_table *alias;
>
>                 alias = container_of(data, struct fib_table, __data[0]);
>                 if (!--alias->data_ref)
>                         kfree(alias);
>                 kfree(fib_table);
>         }
>
> Something like that.

That seems like unneeded complexity when the issue is just the order
that these were created in versus the order they are freed in. As long
as we always destroy the one containing the alias before the one that
has the actual data we don't need to have a reference count. Basically
the issue is the bring-up and the tear-down order. It isn't something
that really needs a reference count since it would always be either 1
or 2. My preference would be to just add a comment explaining that
local must always be destroyed before the main trie in order to
guarantee that there are no external references to the data contained
in main when it is freed.

The one question I have in all this is whether I did the bring-up in
the right order in the first place. I'm wondering if local should be
where the combined trie lives instead of main. Local is currently
destroyed after main anyway, and from what I can tell we add rules for
local before we do so for main, so it might have been better if
everything lived in local. That patch would be more complex though,
since it would need to be much larger and touch multiple files.

- Alex
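
To make the ordering argument concrete, here is a small userspace
sketch, assuming a heavily simplified table layout. struct table,
table_new_owner(), table_new_alias() and teardown() are hypothetical
names, not the kernel's fib_table API. The alias borrows the owner's
data, so freeing the alias first means nothing can reach the data
once the owner releases it, and no counter is needed.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct fib_table. */
struct table {
	int *data;       /* trie data; may belong to another table */
	bool owns_data;  /* true only for the table that allocated data */
};

/* The "main" role: allocates and owns the shared data. */
static struct table *table_new_owner(void)
{
	struct table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->data = calloc(1, sizeof(*t->data));
	if (!t->data) {
		free(t);
		return NULL;
	}
	t->owns_data = true;
	return t;
}

/* The "local" role: points at the owner's data without owning it. */
static struct table *table_new_alias(struct table *owner)
{
	struct table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->data = owner->data;
	t->owns_data = false;
	return t;
}

static void table_free(struct table *t)
{
	if (t->owns_data)
		free(t->data);
	free(t);
}

/* Alias first, owner last: the alias never observes freed data. */
static void teardown(struct table *alias, struct table *owner)
{
	table_free(alias);
	table_free(owner);
}
```

Swapping the two calls in teardown() would leave the alias holding a
dangling pointer for the time between them, which is the shape of the
use-after-free described in the patch.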

^ permalink raw reply

* Re: [RFC PATCH] virtio_net: Extend virtio to use VF datapath when available
From: Samudrala, Sridhar @ 2017-12-19 17:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: stephen, netdev, virtualization, alexander.duyck,
	Brandeburg, Jesse
In-Reply-To: <20171219174259-mutt-send-email-mst@kernel.org>


On 12/19/2017 7:47 AM, Michael S. Tsirkin wrote:
> I'll need to look at this more, in particular the feature
> bit is missing here. For now one question:
>
> On Mon, Dec 18, 2017 at 04:40:36PM -0800, Sridhar Samudrala wrote:
>> @@ -56,6 +58,8 @@ module_param(napi_tx, bool, 0644);
>>    */
>>   DECLARE_EWMA(pkt_len, 0, 64)
>>   
>> +#define VF_TAKEOVER_INT	(HZ / 10)
>> +
>>   #define VIRTNET_DRIVER_VERSION "1.0.0"
>>   
>>   static const unsigned long guest_offloads[] = {
> Why is this delay necessary? And why by 100ms?

This is based on netvsc implementation and here is the commit that
added this delay.  Not sure if this needs to be 100ms.

commit 6123c66854c174e4982f98195100c1d990f9e5e6
Author: stephen hemminger <stephen@networkplumber.org>
Date:   Wed Aug 9 17:46:03 2017 -0700

     netvsc: delay setup of VF device

     When VF device is discovered, delay bring it automatically up in
     order to allow userspace to some simple changes (like renaming).

^ permalink raw reply

* Re: [RFC PATCH] virtio_net: Extend virtio to use VF datapath when available
From: Stephen Hemminger @ 2017-12-19 17:55 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: Michael S. Tsirkin, netdev, virtualization, alexander.duyck,
	Brandeburg, Jesse
In-Reply-To: <ca709329-a47a-2e76-b1fc-38fc665d7c0b@intel.com>

On Tue, 19 Dec 2017 09:41:39 -0800
"Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:

> On 12/19/2017 7:47 AM, Michael S. Tsirkin wrote:
> > I'll need to look at this more, in particular the feature
> > bit is missing here. For now one question:
> >
> > On Mon, Dec 18, 2017 at 04:40:36PM -0800, Sridhar Samudrala wrote:  
> >> @@ -56,6 +58,8 @@ module_param(napi_tx, bool, 0644);
> >>    */
> >>   DECLARE_EWMA(pkt_len, 0, 64)
> >>   
> >> +#define VF_TAKEOVER_INT	(HZ / 10)
> >> +
> >>   #define VIRTNET_DRIVER_VERSION "1.0.0"
> >>   
> >>   static const unsigned long guest_offloads[] = {  
> > Why is this delay necessary? And why by 100ms?  
> 
> This is based on netvsc implementation and here is the commit that
> added this delay.  Not sure if this needs to be 100ms.
> 
> commit 6123c66854c174e4982f98195100c1d990f9e5e6
> Author: stephen hemminger <stephen@networkplumber.org>
> Date:   Wed Aug 9 17:46:03 2017 -0700
> 
>      netvsc: delay setup of VF device
> 
>      When VF device is discovered, delay bring it automatically up in
>      order to allow userspace to some simple changes (like renaming).
> 
> 
> 

could be 10ms, just enough to let udev do its renaming
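
For reference, the arithmetic behind the numbers in this thread: HZ is
the number of jiffies per second, so HZ / 10 jiffies is about 100 ms.
The sketch below is plain userspace C with hypothetical helper names.
It shows that HZ / 100 is not exactly 10 ms for every tick rate
because of integer division, and that a round-up conversion (the
kernel's msecs_to_jiffies() rounds up as well, as far as I know)
avoids coming up short.

```c
/* Milliseconds covered by 'ticks' jiffies at a tick rate of 'hz'. */
static unsigned int ticks_to_ms(unsigned int ticks, unsigned int hz)
{
	return ticks * 1000u / hz;
}

/* Smallest number of jiffies that covers at least 'ms' milliseconds. */
static unsigned int ms_to_ticks_round_up(unsigned int ms, unsigned int hz)
{
	return (ms * hz + 999u) / 1000u;
}
```

For example, with a tick rate of 250, 250 / 100 truncates to 2 jiffies
(8 ms), while the round-up helper yields 3 jiffies (12 ms).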

^ permalink raw reply

* Re: [PATCH -tip v3 0/6] net: tcp: sctp: dccp: Replace jprobe usage with trace events
From: Alexei Starovoitov @ 2017-12-19 18:01 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Stephen Hemminger, Steven Rostedt, Peter Zijlstra,
	Thomas Gleixner, LKML, David S . Miller, netdev
In-Reply-To: <151367381480.32364.2659143894655716709.stgit@devbox>

On Tue, Dec 19, 2017 at 05:56:55PM +0900, Masami Hiramatsu wrote:
>  include/trace/events/sctp.h |   98 ++++++++++++++
>  include/trace/events/tcp.h  |   80 +++++++++++
>  net/Kconfig                 |   17 --
>  net/dccp/Kconfig            |   17 --
>  net/dccp/Makefile           |    2 
>  net/dccp/probe.c            |  203 -----------------------------
>  net/dccp/proto.c            |    5 +
>  net/dccp/trace.h            |  105 +++++++++++++++
>  net/ipv4/Makefile           |    1 
>  net/ipv4/tcp_input.c        |    3 
>  net/ipv4/tcp_probe.c        |  301 -------------------------------------------
>  net/sctp/Kconfig            |   12 --
>  net/sctp/Makefile           |    3 
>  net/sctp/probe.c            |  244 -----------------------------------
>  net/sctp/sm_statefuns.c     |    5 +
>  15 files changed, 296 insertions(+), 800 deletions(-)

You need to target net-next tree for this patch set.

^ permalink raw reply

* [PATCH v1] net: amd-xgbe: Get rid of custom hex_dump_to_buffer()
From: Andy Shevchenko @ 2017-12-19 18:02 UTC (permalink / raw)
  To: Tom Lendacky, David S. Miller, netdev; +Cc: Andy Shevchenko

Get rid of yet another custom hex_dump_to_buffer().

The output is slightly changed, i.e. each byte followed by white space.

Note, we don't use print_hex_dump() here since the original code uses
netdev_dbg().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index a74a8fbad53a..fc58dc43a5bd 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -2930,9 +2930,8 @@ void xgbe_dump_rx_desc(struct xgbe_prv_data *pdata, struct xgbe_ring *ring,
 void xgbe_print_pkt(struct net_device *netdev, struct sk_buff *skb, bool tx_rx)
 {
 	struct ethhdr *eth = (struct ethhdr *)skb->data;
-	unsigned char *buf = skb->data;
 	unsigned char buffer[128];
-	unsigned int i, j;
+	unsigned int i;
 
 	netdev_dbg(netdev, "\n************** SKB dump ****************\n");
 
@@ -2943,22 +2942,12 @@ void xgbe_print_pkt(struct net_device *netdev, struct sk_buff *skb, bool tx_rx)
 	netdev_dbg(netdev, "Src MAC addr: %pM\n", eth->h_source);
 	netdev_dbg(netdev, "Protocol: %#06hx\n", ntohs(eth->h_proto));
 
-	for (i = 0, j = 0; i < skb->len;) {
-		j += snprintf(buffer + j, sizeof(buffer) - j, "%02hhx",
-			      buf[i++]);
-
-		if ((i % 32) == 0) {
-			netdev_dbg(netdev, "  %#06x: %s\n", i - 32, buffer);
-			j = 0;
-		} else if ((i % 16) == 0) {
-			buffer[j++] = ' ';
-			buffer[j++] = ' ';
-		} else if ((i % 4) == 0) {
-			buffer[j++] = ' ';
-		}
+	for (i = 0; i < skb->len; i += 32) {
+		unsigned int len = min(skb->len - i, 32U);
+
+		hex_dump_to_buffer(&skb->data[i], len, 32, 1, buffer, 128, false);
+		netdev_dbg(netdev, "  %#06x: %s\n", i, buffer);
 	}
-	if (i % 32)
-		netdev_dbg(netdev, "  %#06x: %s\n", i - (i % 32), buffer);
 
 	netdev_dbg(netdev, "\n************** SKB dump ****************\n");
 }
-- 
2.15.1
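
A note on the new line format and the buffer size: with a group size
of 1 and no ASCII column, each byte takes two hex digits plus one
separating space, so 32 bytes need 95 characters plus the terminator,
which fits the 128-byte buffer. The function below is a userspace
approximation of that format only. hex_line() is a hypothetical name
and not the kernel's hex_dump_to_buffer().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write 'len' bytes of 'src' as space-separated, two-digit lowercase
 * hex into 'dst'. Returns the number of characters written, not
 * counting the terminating NUL. Stops early if 'dst' is too small.
 */
static size_t hex_line(const unsigned char *src, size_t len,
		       char *dst, size_t dstlen)
{
	size_t i, pos = 0;

	if (dstlen == 0)
		return 0;
	dst[0] = '\0';

	for (i = 0; i < len; i++) {
		/* Worst case per byte: space, two digits, NUL. */
		if (pos + 4 > dstlen)
			break;
		pos += (size_t)snprintf(dst + pos, dstlen - pos, "%s%02x",
					i ? " " : "", (unsigned int)src[i]);
	}
	return pos;
}
```

Calling this once per 32-byte chunk, with the chunk offset printed in
front, gives lines of the same shape as the loop in the patch.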

^ permalink raw reply related

* [PATCH v1] net: dwc-xlgmac: Get rid of custom hex_dump_to_buffer()
From: Andy Shevchenko @ 2017-12-19 18:03 UTC (permalink / raw)
  To: Jie Deng, Jose Abreu, David S. Miller, netdev; +Cc: Andy Shevchenko

Get rid of yet another custom hex_dump_to_buffer().

The output is slightly changed, i.e. each byte followed by white space.

Note, we don't use print_hex_dump() here since the original code uses
netdev_dbg().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/net/ethernet/synopsys/dwc-xlgmac-common.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/synopsys/dwc-xlgmac-common.c b/drivers/net/ethernet/synopsys/dwc-xlgmac-common.c
index d655a4261e98..6fbab6c30fd4 100644
--- a/drivers/net/ethernet/synopsys/dwc-xlgmac-common.c
+++ b/drivers/net/ethernet/synopsys/dwc-xlgmac-common.c
@@ -333,9 +333,8 @@ void xlgmac_print_pkt(struct net_device *netdev,
 		      struct sk_buff *skb, bool tx_rx)
 {
 	struct ethhdr *eth = (struct ethhdr *)skb->data;
-	unsigned char *buf = skb->data;
 	unsigned char buffer[128];
-	unsigned int i, j;
+	unsigned int i;
 
 	netdev_dbg(netdev, "\n************** SKB dump ****************\n");
 
@@ -346,22 +345,12 @@ void xlgmac_print_pkt(struct net_device *netdev,
 	netdev_dbg(netdev, "Src MAC addr: %pM\n", eth->h_source);
 	netdev_dbg(netdev, "Protocol: %#06hx\n", ntohs(eth->h_proto));
 
-	for (i = 0, j = 0; i < skb->len;) {
-		j += snprintf(buffer + j, sizeof(buffer) - j, "%02hhx",
-			      buf[i++]);
-
-		if ((i % 32) == 0) {
-			netdev_dbg(netdev, "  %#06x: %s\n", i - 32, buffer);
-			j = 0;
-		} else if ((i % 16) == 0) {
-			buffer[j++] = ' ';
-			buffer[j++] = ' ';
-		} else if ((i % 4) == 0) {
-			buffer[j++] = ' ';
-		}
+	for (i = 0; i < skb->len; i += 32) {
+		unsigned int len = min(skb->len - i, 32U);
+
+		hex_dump_to_buffer(&skb->data[i], len, 32, 1, buffer, 128, false);
+		netdev_dbg(netdev, "  %#06x: %s\n", i, buffer);
 	}
-	if (i % 32)
-		netdev_dbg(netdev, "  %#06x: %s\n", i - (i % 32), buffer);
 
 	netdev_dbg(netdev, "\n************** SKB dump ****************\n");
 }
-- 
2.15.1

^ permalink raw reply related

* Re: [RFC PATCH] virtio_net: Extend virtio to use VF datapath when available
From: Michael S. Tsirkin @ 2017-12-19 18:07 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Samudrala, Sridhar, netdev, virtualization, alexander.duyck,
	Brandeburg, Jesse
In-Reply-To: <20171219095548.61d70d10@xeon-e3>

On Tue, Dec 19, 2017 at 09:55:48AM -0800, Stephen Hemminger wrote:
> On Tue, 19 Dec 2017 09:41:39 -0800
> "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
> 
> > On 12/19/2017 7:47 AM, Michael S. Tsirkin wrote:
> > > I'll need to look at this more, in particular the feature
> > > bit is missing here. For now one question:
> > >
> > > On Mon, Dec 18, 2017 at 04:40:36PM -0800, Sridhar Samudrala wrote:  
> > >> @@ -56,6 +58,8 @@ module_param(napi_tx, bool, 0644);
> > >>    */
> > >>   DECLARE_EWMA(pkt_len, 0, 64)
> > >>   
> > >> +#define VF_TAKEOVER_INT	(HZ / 10)
> > >> +
> > >>   #define VIRTNET_DRIVER_VERSION "1.0.0"
> > >>   
> > >>   static const unsigned long guest_offloads[] = {  
> > > Why is this delay necessary? And why by 100ms?  
> > 
> > This is based on netvsc implementation and here is the commit that
> > added this delay.  Not sure if this needs to be 100ms.
> > 
> > commit 6123c66854c174e4982f98195100c1d990f9e5e6
> > Author: stephen hemminger <stephen@networkplumber.org>
> > Date:   Wed Aug 9 17:46:03 2017 -0700
> > 
> >      netvsc: delay setup of VF device
> > 
> >      When VF device is discovered, delay bring it automatically up in
> >      order to allow userspace to some simple changes (like renaming).
> > 
> > 
> > 
> 
> could be 10ms, just enough to let udev do its renaming

Isn't there a way not to depend on udev completing its thing within a given timeframe?

^ permalink raw reply

* Re: dst refcount is -1
From: Wei Wang @ 2017-12-19 18:09 UTC (permalink / raw)
  To: Ortwin Glück
  Cc: linux-kernel@vger.kernel.org, Linux Kernel Network Developers
In-Reply-To: <f53fd35b-5a63-18af-4093-f996aae503bc@odi.ch>

On Tue, Dec 19, 2017 at 2:56 AM, Ortwin Glück <odi@odi.ch> wrote:
> Hi,
>
> On 4.14.6 I just got this (on a busy firewall):
> [Tue Dec 19 11:15:59 2017] dst_release: dst:ffff9bb7aca0d6c0 refcnt:-1
>
> Are you sure the refcounting is now correct?
>
> Ortwin

Could you give more details about the circumstances under which it happened?
What kind of traffic is running? IPv4? IPv6? Or both? Do you use xfrm?

^ permalink raw reply

* [PATCH v1] bridge: Use helpers to handle MAC address
From: Andy Shevchenko @ 2017-12-19 18:10 UTC (permalink / raw)
  To: Stephen Hemminger, David S. Miller, bridge, netdev; +Cc: Andy Shevchenko

Use
	%pM to print MAC
	mac_pton() to convert it from ASCII to binary format, and
	ether_addr_copy() to copy.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 net/bridge/br_sysfs_br.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/net/bridge/br_sysfs_br.c b/net/bridge/br_sysfs_br.c
index 723f25eed8ea..b1be0dcfba6b 100644
--- a/net/bridge/br_sysfs_br.c
+++ b/net/bridge/br_sysfs_br.c
@@ -272,10 +272,7 @@ static ssize_t group_addr_show(struct device *d,
 			       struct device_attribute *attr, char *buf)
 {
 	struct net_bridge *br = to_bridge(d);
-	return sprintf(buf, "%x:%x:%x:%x:%x:%x\n",
-		       br->group_addr[0], br->group_addr[1],
-		       br->group_addr[2], br->group_addr[3],
-		       br->group_addr[4], br->group_addr[5]);
+	return sprintf(buf, "%pM\n", br->group_addr);
 }
 
 static ssize_t group_addr_store(struct device *d,
@@ -284,14 +281,11 @@ static ssize_t group_addr_store(struct device *d,
 {
 	struct net_bridge *br = to_bridge(d);
 	u8 new_addr[6];
-	int i;
 
 	if (!ns_capable(dev_net(br->dev)->user_ns, CAP_NET_ADMIN))
 		return -EPERM;
 
-	if (sscanf(buf, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
-		   &new_addr[0], &new_addr[1], &new_addr[2],
-		   &new_addr[3], &new_addr[4], &new_addr[5]) != 6)
+	if (!mac_pton(buf, new_addr))
 		return -EINVAL;
 
 	if (!is_link_local_ether_addr(new_addr))
@@ -306,8 +300,7 @@ static ssize_t group_addr_store(struct device *d,
 		return restart_syscall();
 
 	spin_lock_bh(&br->lock);
-	for (i = 0; i < 6; i++)
-		br->group_addr[i] = new_addr[i];
+	ether_addr_copy(br->group_addr, new_addr);
 	spin_unlock_bh(&br->lock);
 
 	br->group_addr_set = true;
-- 
2.15.1

^ permalink raw reply related

* Re: [RFC PATCH] virtio_net: Extend virtio to use VF datapath when available
From: Stephen Hemminger @ 2017-12-19 18:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Samudrala, Sridhar, netdev, virtualization, alexander.duyck,
	Brandeburg, Jesse
In-Reply-To: <20171219200611-mutt-send-email-mst@kernel.org>

On Tue, 19 Dec 2017 20:07:01 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Dec 19, 2017 at 09:55:48AM -0800, Stephen Hemminger wrote:
> > On Tue, 19 Dec 2017 09:41:39 -0800
> > "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
> >   
> > > On 12/19/2017 7:47 AM, Michael S. Tsirkin wrote:  
> > > > I'll need to look at this more, in particular the feature
> > > > bit is missing here. For now one question:
> > > >
> > > > On Mon, Dec 18, 2017 at 04:40:36PM -0800, Sridhar Samudrala wrote:    
> > > >> @@ -56,6 +58,8 @@ module_param(napi_tx, bool, 0644);
> > > >>    */
> > > >>   DECLARE_EWMA(pkt_len, 0, 64)
> > > >>   
> > > >> +#define VF_TAKEOVER_INT	(HZ / 10)
> > > >> +
> > > >>   #define VIRTNET_DRIVER_VERSION "1.0.0"
> > > >>   
> > > >>   static const unsigned long guest_offloads[] = {    
> > > > Why is this delay necessary? And why by 100ms?    
> > > 
> > > This is based on netvsc implementation and here is the commit that
> > > added this delay.  Not sure if this needs to be 100ms.
> > > 
> > > commit 6123c66854c174e4982f98195100c1d990f9e5e6
> > > Author: stephen hemminger <stephen@networkplumber.org>
> > > Date:   Wed Aug 9 17:46:03 2017 -0700
> > > 
> > >      netvsc: delay setup of VF device
> > > 
> > >      When VF device is discovered, delay bring it automatically up in
> > >      order to allow userspace to some simple changes (like renaming).
> > > 
> > > 
> > >   
> > 
> > could be 10ms, just enough to let udev do its renaming  
> 
> Isn't there a way not to depend on udev completing its thing within a given timeframe?

Not that I know of. The path is quite indirect.

^ permalink raw reply

* [-next PATCH 0/4] sysfs and DEVICE_ATTR_<foo>
From: Joe Perches @ 2017-12-19 18:15 UTC (permalink / raw)
  To: linux-arm-kernel, linux-acpi, openipmi-developer, intel-gfx,
	linuxppc-dev, netdev, linux-nvme, platform-driver-x86, linux-s390,
	esc.storagedev, linux-scsi, linux-pm, linux-serial, linux-usb,
	linux-kernel, alsa-devel, linux-omap
  Cc: devel, linux-fbdev, linux-sh, dri-devel, linux-input, linux-media

Joe Perches (4):
  sysfs.h: Use octal permissions
  treewide: Use DEVICE_ATTR_RW
  treewide: Use DEVICE_ATTR_RO
  treewide: Use DEVICE_ATTR_WO

 arch/arm/mach-pxa/sharpsl_pm.c                     |  4 +-
 arch/s390/kernel/smp.c                             |  2 +-
 arch/s390/kernel/topology.c                        |  3 +-
 arch/sh/drivers/push-switch.c                      |  2 +-
 arch/tile/kernel/sysfs.c                           | 12 ++--
 arch/x86/kernel/cpu/microcode/core.c               |  2 +-
 drivers/acpi/device_sysfs.c                        |  6 +-
 drivers/char/ipmi/ipmi_msghandler.c                | 17 +++---
 drivers/gpu/drm/i915/i915_sysfs.c                  | 12 ++--
 drivers/input/touchscreen/elants_i2c.c             |  2 +-
 drivers/net/ethernet/ibm/ibmvnic.c                 |  2 +-
 drivers/net/wimax/i2400m/sysfs.c                   |  3 +-
 drivers/nvme/host/core.c                           | 10 ++--
 drivers/platform/x86/compal-laptop.c               | 18 ++----
 drivers/s390/cio/css.c                             |  8 +--
 drivers/s390/cio/device.c                          | 10 ++--
 drivers/s390/crypto/ap_card.c                      |  2 +-
 drivers/scsi/hpsa.c                                | 10 ++--
 drivers/scsi/lpfc/lpfc_attr.c                      | 64 ++++++++--------------
 .../staging/media/atomisp/pci/atomisp2/hmm/hmm.c   |  8 +--
 drivers/thermal/thermal_sysfs.c                    | 17 +++---
 drivers/tty/serial/sh-sci.c                        |  2 +-
 drivers/usb/host/xhci-dbgcap.c                     |  2 +-
 drivers/usb/phy/phy-tahvo.c                        |  2 +-
 drivers/video/fbdev/auo_k190x.c                    |  4 +-
 drivers/video/fbdev/w100fb.c                       |  4 +-
 include/linux/sysfs.h                              | 14 ++---
 lib/test_firmware.c                                | 14 ++---
 lib/test_kmod.c                                    | 14 ++---
 sound/soc/omap/mcbsp.c                             |  4 +-
 sound/soc/soc-core.c                               |  2 +-
 sound/soc/soc-dapm.c                               |  2 +-
 32 files changed, 120 insertions(+), 158 deletions(-)

-- 
2.15.0

^ permalink raw reply

* [-next PATCH 4/4] treewide: Use DEVICE_ATTR_WO
From: Joe Perches @ 2017-12-19 18:15 UTC (permalink / raw)
  To: Borislav Petkov, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Thomas Falcon, John Allen, Inaky Perez-Gonzalez,
	linux-wimax, James Smart, Dick Kennedy, Zhang Rui,
	Eduardo Valentin
  Cc: Martin Schwidefsky, Heiko Carstens, Thomas Gleixner,
	H. Peter Anvin, x86, Dmitry Torokhov, James E.J. Bottomley,
	Martin K. Petersen, linux-s390, linux-kernel, linux-input, netdev,
	linuxppc-dev, linux-scsi, linux-pm
In-Reply-To: <cover.1513706701.git.joe@perches.com>

Convert DEVICE_ATTR uses to DEVICE_ATTR_WO where possible.

Done with perl script:

$ git grep -w --name-only DEVICE_ATTR | \
  xargs perl -i -e 'local $/; while (<>) { s/\bDEVICE_ATTR\s*\(\s*(\w+)\s*,\s*\(?(?:\s*S_IWUSR\s*|\s*0200\s*)\)?\s*,\s*NULL\s*,\s*\1_store\s*\)/DEVICE_ATTR_WO(\1)/g; print;}'

Signed-off-by: Joe Perches <joe@perches.com>
---
 arch/s390/kernel/smp.c                 | 2 +-
 arch/x86/kernel/cpu/microcode/core.c   | 2 +-
 drivers/input/touchscreen/elants_i2c.c | 2 +-
 drivers/net/ethernet/ibm/ibmvnic.c     | 2 +-
 drivers/net/wimax/i2400m/sysfs.c       | 3 +--
 drivers/scsi/lpfc/lpfc_attr.c          | 3 +--
 drivers/thermal/thermal_sysfs.c        | 2 +-
 7 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index b8c1a85bcf2d..a919b2f0141d 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -1151,7 +1151,7 @@ static ssize_t __ref rescan_store(struct device *dev,
 	rc = smp_rescan_cpus();
 	return rc ? rc : count;
 }
-static DEVICE_ATTR(rescan, 0200, NULL, rescan_store);
+static DEVICE_ATTR_WO(rescan);
 #endif /* CONFIG_HOTPLUG_CPU */
 
 static int __init s390_smp_init(void)
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index c4fa4a85d4cb..09c74b0560dd 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -560,7 +560,7 @@ static ssize_t pf_show(struct device *dev,
 	return sprintf(buf, "0x%x\n", uci->cpu_sig.pf);
 }
 
-static DEVICE_ATTR(reload, 0200, NULL, reload_store);
+static DEVICE_ATTR_WO(reload);
 static DEVICE_ATTR(version, 0400, version_show, NULL);
 static DEVICE_ATTR(processor_flags, 0400, pf_show, NULL);
 
diff --git a/drivers/input/touchscreen/elants_i2c.c b/drivers/input/touchscreen/elants_i2c.c
index a458e5ec9e41..819213e88f32 100644
--- a/drivers/input/touchscreen/elants_i2c.c
+++ b/drivers/input/touchscreen/elants_i2c.c
@@ -1000,7 +1000,7 @@ static ssize_t show_iap_mode(struct device *dev,
 				"Normal" : "Recovery");
 }
 
-static DEVICE_ATTR(calibrate, S_IWUSR, NULL, calibrate_store);
+static DEVICE_ATTR_WO(calibrate);
 static DEVICE_ATTR(iap_mode, S_IRUGO, show_iap_mode, NULL);
 static DEVICE_ATTR(update_fw, S_IWUSR, NULL, write_update_fw);
 
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 1dc4aef37d3a..42b96e1a1b13 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -4411,7 +4411,7 @@ static ssize_t failover_store(struct device *dev, struct device_attribute *attr,
 	return count;
 }
 
-static DEVICE_ATTR(failover, 0200, NULL, failover_store);
+static DEVICE_ATTR_WO(failover);
 
 static unsigned long ibmvnic_get_desired_dma(struct vio_dev *vdev)
 {
diff --git a/drivers/net/wimax/i2400m/sysfs.c b/drivers/net/wimax/i2400m/sysfs.c
index 1237109f251a..8c67df11105c 100644
--- a/drivers/net/wimax/i2400m/sysfs.c
+++ b/drivers/net/wimax/i2400m/sysfs.c
@@ -65,8 +65,7 @@ ssize_t i2400m_idle_timeout_store(struct device *dev,
 }
 
 static
-DEVICE_ATTR(i2400m_idle_timeout, S_IWUSR,
-	    NULL, i2400m_idle_timeout_store);
+DEVICE_ATTR_WO(i2400m_idle_timeout);
 
 static
 struct attribute *i2400m_dev_attrs[] = {
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 517ff203cfde..6ddaf51a23f6 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -2418,8 +2418,7 @@ lpfc_soft_wwn_enable_store(struct device *dev, struct device_attribute *attr,
 
 	return count;
 }
-static DEVICE_ATTR(lpfc_soft_wwn_enable, S_IWUSR, NULL,
-		   lpfc_soft_wwn_enable_store);
+static DEVICE_ATTR_WO(lpfc_soft_wwn_enable);
 
 /**
  * lpfc_soft_wwpn_show - Return the cfg soft ww port name of the adapter
diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index 2bc964392924..ba81c9080f6e 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -317,7 +317,7 @@ emul_temp_store(struct device *dev, struct device_attribute *attr,
 
 	return ret ? ret : count;
 }
-static DEVICE_ATTR(emul_temp, S_IWUSR, NULL, emul_temp_store);
+static DEVICE_ATTR_WO(emul_temp);
 #endif
 
 static ssize_t
-- 
2.15.0

^ permalink raw reply related

* Re: INFO: task hung in cleanup_net
From: David Ahern @ 2017-12-19 18:17 UTC (permalink / raw)
  To: Dmitry Vyukov, syzbot
  Cc: LKML, Ingo Molnar, Peter Zijlstra, syzkaller-bugs, David Miller,
	dsahern, Florian Westphal, Daniel Borkmann, Xin Long,
	jakub.kicinski, mschiffer, Vladislav Yasevich, Jiri Benc, netdev
In-Reply-To: <CACT4Y+aubwO_Xs5viN_P8WzA_WM5Pj7XYVy+M_vxhA7_ZNaz1g@mail.gmail.com>

On 12/19/17 5:47 AM, Dmitry Vyukov wrote:
> On Tue, Dec 19, 2017 at 1:33 PM, syzbot
> <bot+b17f10c8a8c693b40723d40d6553fbc54d197679@syzkaller.appspotmail.com>
> wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> e40fd8d6b4d9f59b160faa1736f78fc07533ff37
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>>
>> Unfortunately, I don't have any reproducer for this bug yet.
>>
>>
>> sctp: sctp_transport_update_pmtu: Reported pmtu 508 too low, using default
>> minimum of 512
>> sctp: sctp_transport_update_pmtu: Reported pmtu 508 too low, using default
>> minimum of 512
>> sctp: sctp_transport_update_pmtu: Reported pmtu 508 too low, using default
>> minimum of 512
>> INFO: task kworker/u4:11:3785 blocked for more than 120 seconds.
>>       Not tainted 4.15.0-rc2-next-20171207+ #61
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> kworker/u4:11   D16448  3785      2 0x80000000
>> Workqueue: netns cleanup_net
>> Call Trace:
>>  context_switch kernel/sched/core.c:2800 [inline]
>>  __schedule+0x8eb/0x2060 kernel/sched/core.c:3376
>>  schedule+0xf5/0x430 kernel/sched/core.c:3435
>>  schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3493
>>  __mutex_lock_common kernel/locking/mutex.c:833 [inline]
>>  __mutex_lock+0xaad/0x1a80 kernel/locking/mutex.c:893
>>  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>>  rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
>>  cleanup_net+0x24c/0xb60 net/core/net_namespace.c:453
>>  process_one_work+0xbfd/0x1bc0 kernel/workqueue.c:2113
>>  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
>>  kthread+0x37a/0x440 kernel/kthread.c:238
>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:524
>>
>> Showing all locks held in the system:
>> 2 locks held by khungtaskd/672:
>>  #0:  (rcu_read_lock){....}, at: [<00000000b9bef8ab>]
>> check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
>>  #0:  (rcu_read_lock){....}, at: [<00000000b9bef8ab>] watchdog+0x1c5/0xd60
>> kernel/hung_task.c:249
>>  #1:  (tasklist_lock){.+.+}, at: [<000000006d6acf8b>]
>> debug_show_all_locks+0xd3/0x400 kernel/locking/lockdep.c:4554
>> 1 lock held by rsyslogd/2970:
>>  #0:  (&f->f_pos_lock){+.+.}, at: [<000000001cede688>]
>> __fdget_pos+0x131/0x1a0 fs/file.c:765
>> 2 locks held by getty/3052:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<000000004e0245ba>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000ff863b18>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3053:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<000000004e0245ba>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000ff863b18>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3054:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<000000004e0245ba>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000ff863b18>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3055:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<000000004e0245ba>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000ff863b18>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3056:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<000000004e0245ba>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000ff863b18>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3057:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<000000004e0245ba>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000ff863b18>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3058:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<000000004e0245ba>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000ff863b18>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 1 lock held by syz-executor2/3320:
>>  #0:  (rtnl_mutex){+.+.}, at: [<0000000063c47ced>] rtnl_lock+0x17/0x20
>> net/core/rtnetlink.c:74
>> 4 locks held by kworker/u4:11/3785:
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<0000000087d81703>]
>> __write_once_size include/linux/compiler.h:212 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<0000000087d81703>]
>> atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<0000000087d81703>]
>> atomic_long_set include/asm-generic/atomic-long.h:57 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<0000000087d81703>]
>> set_work_data kernel/workqueue.c:619 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<0000000087d81703>]
>> set_work_pool_and_clear_pending kernel/workqueue.c:646 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<0000000087d81703>]
>> process_one_work+0xad4/0x1bc0 kernel/workqueue.c:2084
>>  #1:  (net_cleanup_work){+.+.}, at: [<000000008a817b5c>]
>> process_one_work+0xb2f/0x1bc0 kernel/workqueue.c:2088
>>  #2:  (net_mutex){+.+.}, at: [<0000000079150fd8>] cleanup_net+0x247/0xb60
>> net/core/net_namespace.c:450
>>  #3:  (rtnl_mutex){+.+.}, at: [<0000000063c47ced>] rtnl_lock+0x17/0x20
>> net/core/rtnetlink.c:74
>> 3 locks held by kworker/0:6/14978:
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<0000000087d81703>]
>> __write_once_size include/linux/compiler.h:212 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<0000000087d81703>]
>> atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<0000000087d81703>]
>> atomic_long_set include/asm-generic/atomic-long.h:57 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<0000000087d81703>]
>> set_work_data kernel/workqueue.c:619 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<0000000087d81703>]
>> set_work_pool_and_clear_pending kernel/workqueue.c:646 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<0000000087d81703>]
>> process_one_work+0xad4/0x1bc0 kernel/workqueue.c:2084
>>  #1:  ((addr_chk_work).work){+.+.}, at: [<000000008a817b5c>]
>> process_one_work+0xb2f/0x1bc0 kernel/workqueue.c:2088
>>  #2:  (rtnl_mutex){+.+.}, at: [<0000000063c47ced>] rtnl_lock+0x17/0x20
>> net/core/rtnetlink.c:74
>> 1 lock held by syz-executor2/15852:
>>  #0:  (rtnl_mutex){+.+.}, at: [<0000000063c47ced>] rtnl_lock+0x17/0x20
>> net/core/rtnetlink.c:74
>> 1 lock held by syz-executor7/15958:
>>  #0:  (rtnl_mutex){+.+.}, at: [<0000000063c47ced>] rtnl_lock+0x17/0x20
>> net/core/rtnetlink.c:74
>> 1 lock held by syz-executor7/15961:
>>  #0:  (rtnl_mutex){+.+.}, at: [<0000000063c47ced>] rtnl_lock+0x17/0x20
>> net/core/rtnetlink.c:74
>>
>> =============================================
>>
>> NMI backtrace for cpu 1
>> CPU: 1 PID: 672 Comm: khungtaskd Not tainted 4.15.0-rc2-next-20171207+ #61
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:17 [inline]
>>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>>  nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103
>>  nmi_trigger_cpumask_backtrace+0x122/0x180 lib/nmi_backtrace.c:62
>>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>>  trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
>>  check_hung_task kernel/hung_task.c:132 [inline]
>>  check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline]
>>  watchdog+0x90c/0xd60 kernel/hung_task.c:249
>>  kthread+0x37a/0x440 kernel/kthread.c:238
>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:524
>> Sending NMI from CPU 1 to CPUs 0:
>> NMI backtrace for cpu 0
>> CPU: 0 PID: 15938 Comm: syz-executor7 Not tainted 4.15.0-rc2-next-20171207+
>> #61
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> RIP: 0010:inb arch/x86/include/asm/io.h:348 [inline]
>> RIP: 0010:io_serial_in+0x6b/0x90 drivers/tty/serial/8250/8250_port.c:434
>> RSP: 0018:ffff8801c1c173e8 EFLAGS: 00000002
>> RAX: dffffc0000000000 RBX: 00000000000003fd RCX: 0000000000000000
>> RDX: 00000000000003fd RSI: ffffc90001c88000 RDI: ffffffff880fa9c0
>> RBP: ffff8801c1c173f8 R08: 0000000000000002 R09: 000000000000000c
>> R10: 0000000000000000 R11: ffffffff87896d20 R12: ffffffff880fa980
>> R13: 0000000000000020 R14: fffffbfff101f577 R15: fffffbfff101f53a
>> FS:  00007f1ef8fad700(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007fbea91b6000 CR3: 00000001d0aaa000 CR4: 00000000001406f0
>> DR0: 0000000020001000 DR1: 0000000020001000 DR2: 0000000020001020
>> DR3: 0000000020000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
>> Call Trace:
>>  serial_in drivers/tty/serial/8250/8250.h:111 [inline]
>>  wait_for_xmitr+0x93/0x1e0 drivers/tty/serial/8250/8250_port.c:2033
>>  serial8250_console_putchar+0x1f/0x60
>> drivers/tty/serial/8250/8250_port.c:3170
>>  uart_console_write+0xac/0xe0 drivers/tty/serial/serial_core.c:1858
>>  serial8250_console_write+0x647/0xa20
>> drivers/tty/serial/8250/8250_port.c:3236
>>  univ8250_console_write+0x5f/0x70 drivers/tty/serial/8250/8250_core.c:590
>>  call_console_drivers kernel/printk/printk.c:1574 [inline]
>>  console_unlock+0x788/0xd70 kernel/printk/printk.c:2233
>>  vprintk_emit+0x4ad/0x590 kernel/printk/printk.c:1757
>>  vprintk_default+0x28/0x30 kernel/printk/printk.c:1796
>>  vprintk_func+0x57/0xc0 kernel/printk/printk_safe.c:379
>>  printk+0xaa/0xca kernel/printk/printk.c:1829
>>  __dev_set_promiscuity+0x2a4/0x630 net/core/dev.c:6609
>>  __dev_change_flags+0x559/0x990 net/core/dev.c:6826
>>  dev_change_flags+0x88/0x140 net/core/dev.c:6886
>>  dev_ifsioc+0x60d/0x9b0 net/core/dev_ioctl.c:257
>>  dev_ioctl+0x2c2/0xf90 net/core/dev_ioctl.c:566
>>  sock_do_ioctl+0x94/0xb0 net/socket.c:971
>>  sock_ioctl+0x2c2/0x440 net/socket.c:1061
>>  vfs_ioctl fs/ioctl.c:46 [inline]
>>  do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:686
>>  SYSC_ioctl fs/ioctl.c:701 [inline]
>>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>>  entry_SYSCALL_64_fastpath+0x1f/0x96
>> RIP: 0033:0x452a39
>> RSP: 002b:00007f1ef8facc58 EFLAGS: 00000212 ORIG_RAX: 0000000000000010
>> RAX: ffffffffffffffda RBX: 00007f1ef8fad700 RCX: 0000000000452a39
>> RDX: 00000000208a3fe0 RSI: 0000000000008914 RDI: 0000000000000019
>> RBP: 0000000000a6f880 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
>> R13: 0000000000a6f7ff R14: 00007f1ef8fad9c0 R15: 0000000000000002
>> Code: 24 d9 00 00 00 49 8d 7c 24 40 48 b8 00 00 00 00 00 fc ff df 48 89 fa
>> 48 c1 ea 03 d3 e3 80 3c 02 00 75 17 41 03 5c 24 40 89 da ec <5b> 0f b6 c0 41
>> 5c 5d c3 e8 88 ad 18 ff eb c2 e8 e1 ad 18 ff eb
> 
> 
> This looks like +rtnetlink issue.

Perhaps related to / fixed by: http://patchwork.ozlabs.org/patch/850957/


* Re: INFO: task hung in bpf_exit_net
From: David Ahern @ 2017-12-19 18:20 UTC (permalink / raw)
  To: Dmitry Vyukov, syzbot
  Cc: LKML, Ingo Molnar, Peter Zijlstra, syzkaller-bugs, David Miller,
	Florian Westphal, Daniel Borkmann, Xin Long, jakub.kicinski,
	mschiffer, Vladislav Yasevich, Jiri Benc, netdev
In-Reply-To: <CACT4Y+Ze8t3ipFyhfy9XfmvUw4tob+dAUiVMQ-xHimYDXpSiRw@mail.gmail.com>

On 12/19/17 5:47 AM, Dmitry Vyukov wrote:
> On Tue, Dec 19, 2017 at 1:36 PM, syzbot
> <bot+21b498fc12cf2041655f8e1eeae0733807d794b3@syzkaller.appspotmail.com>
> wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> 7ceb97a071e80f1b5e4cd5a36de135612a836388
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>>
>> Unfortunately, I don't have any reproducer for this bug yet.
>>
>>
>> sctp: sctp_transport_update_pmtu: Reported pmtu 508 too low, using default
>> minimum of 512
>> INFO: task kworker/u4:0:5 blocked for more than 120 seconds.
>>       Not tainted 4.15.0-rc2-next-20171205+ #59
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> kworker/u4:0    D15808     5      2 0x80000000
>> Workqueue: netns cleanup_net
>> Call Trace:
>>  context_switch kernel/sched/core.c:2800 [inline]
>>  __schedule+0x8eb/0x2060 kernel/sched/core.c:3376
>>  schedule+0xf5/0x430 kernel/sched/core.c:3435
>>  schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3493
>>  __mutex_lock_common kernel/locking/mutex.c:833 [inline]
>>  __mutex_lock+0xaad/0x1a80 kernel/locking/mutex.c:893
>>  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>>  rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
>>  tc_action_net_exit include/net/act_api.h:125 [inline]
>>  bpf_exit_net+0x1a2/0x340 net/sched/act_bpf.c:408
>>  ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:142
>>  cleanup_net+0x5c7/0xb60 net/core/net_namespace.c:484
>>  process_one_work+0xbfd/0x1bc0 kernel/workqueue.c:2113
>>  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
>>  kthread+0x37a/0x440 kernel/kthread.c:238
>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:517
>>
>> Showing all locks held in the system:
>> 4 locks held by kworker/u4:0/5:
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000b9f061a2>]
>> __write_once_size include/linux/compiler.h:212 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000b9f061a2>]
>> atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000b9f061a2>]
>> atomic_long_set include/asm-generic/atomic-long.h:57 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000b9f061a2>]
>> set_work_data kernel/workqueue.c:619 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000b9f061a2>]
>> set_work_pool_and_clear_pending kernel/workqueue.c:646 [inline]
>>  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000b9f061a2>]
>> process_one_work+0xad4/0x1bc0 kernel/workqueue.c:2084
>>  #1:  (net_cleanup_work){+.+.}, at: [<000000006c7c48a3>]
>> process_one_work+0xb2f/0x1bc0 kernel/workqueue.c:2088
>>  #2:  (net_mutex){+.+.}, at: [<00000000bf4709f3>] cleanup_net+0x247/0xb60
>> net/core/net_namespace.c:450
>>  #3:  (rtnl_mutex){+.+.}, at: [<0000000053390f0b>] rtnl_lock+0x17/0x20
>> net/core/rtnetlink.c:74
>> 3 locks held by kworker/1:0/17:
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000b9f061a2>]
>> __write_once_size include/linux/compiler.h:212 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000b9f061a2>]
>> atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000b9f061a2>]
>> atomic_long_set include/asm-generic/atomic-long.h:57 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000b9f061a2>]
>> set_work_data kernel/workqueue.c:619 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000b9f061a2>]
>> set_work_pool_and_clear_pending kernel/workqueue.c:646 [inline]
>>  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000b9f061a2>]
>> process_one_work+0xad4/0x1bc0 kernel/workqueue.c:2084
>>  #1:  ((addr_chk_work).work){+.+.}, at: [<000000006c7c48a3>]
>> process_one_work+0xb2f/0x1bc0 kernel/workqueue.c:2088
>>  #2:  (rtnl_mutex){+.+.}, at: [<0000000053390f0b>] rtnl_lock+0x17/0x20
>> net/core/rtnetlink.c:74
>> 2 locks held by khungtaskd/675:
>>  #0:  (rcu_read_lock){....}, at: [<00000000587c8471>]
>> check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
>>  #0:  (rcu_read_lock){....}, at: [<00000000587c8471>] watchdog+0x1c5/0xd60
>> kernel/hung_task.c:249
>>  #1:  (tasklist_lock){.+.+}, at: [<000000005288685e>]
>> debug_show_all_locks+0xd3/0x400 kernel/locking/lockdep.c:4554
>> 1 lock held by rsyslogd/2974:
>>  #0:  (&f->f_pos_lock){+.+.}, at: [<0000000011e00499>]
>> __fdget_pos+0x131/0x1a0 fs/file.c:770
>> 2 locks held by getty/3056:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<00000000b9fd70a9>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000abb3bd08>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3057:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<00000000b9fd70a9>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000abb3bd08>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3058:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<00000000b9fd70a9>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000abb3bd08>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3059:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<00000000b9fd70a9>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000abb3bd08>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3060:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<00000000b9fd70a9>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000abb3bd08>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3061:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<00000000b9fd70a9>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000abb3bd08>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>> 2 locks held by getty/3062:
>>  #0:  (&tty->ldisc_sem){++++}, at: [<00000000b9fd70a9>]
>> ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
>>  #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000abb3bd08>]
>> n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
>>
>> =============================================
>>
>> NMI backtrace for cpu 0
>> CPU: 0 PID: 675 Comm: khungtaskd Not tainted 4.15.0-rc2-next-20171205+ #59
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:17 [inline]
>>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>>  nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103
>>  nmi_trigger_cpumask_backtrace+0x122/0x180 lib/nmi_backtrace.c:62
>>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>>  trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
>>  check_hung_task kernel/hung_task.c:132 [inline]
>>  check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline]
>>  watchdog+0x90c/0xd60 kernel/hung_task.c:249
>>  kthread+0x37a/0x440 kernel/kthread.c:238
>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:517
>> Sending NMI from CPU 0 to CPUs 1:
>> NMI backtrace for cpu 1
>> CPU: 1 PID: 13156 Comm: syz-executor7 Not tainted 4.15.0-rc2-next-20171205+
>> #59
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> task: 000000005209c120 task.stack: 00000000ffaab0e8
>> RIP: 0010:inb arch/x86/include/asm/io.h:348 [inline]
>> RIP: 0010:io_serial_in+0x6b/0x90 drivers/tty/serial/8250/8250_port.c:434
>> RSP: 0018:ffff8801c0a16e70 EFLAGS: 00000002
>> RAX: dffffc0000000000 RBX: 00000000000003fd RCX: 0000000000000000
>> RDX: 00000000000003fd RSI: ffffc90003745000 RDI: ffffffff87cf1a40
>> RBP: ffff8801c0a16e80 R08: 0000000000000005 R09: 000000000000000c
>> R10: 0000000000000000 R11: ffffffff8748dd20 R12: ffffffff87cf1a00
>> R13: 0000000000000020 R14: fffffbfff0f9e387 R15: fffffbfff0f9e34a
>> FS:  00007f6d52e3f700(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 000000c42005de80 CR3: 00000001c368a000 CR4: 00000000001406e0
>> DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
>> Call Trace:
>>  serial_in drivers/tty/serial/8250/8250.h:111 [inline]
>>  wait_for_xmitr+0x93/0x1e0 drivers/tty/serial/8250/8250_port.c:2033
>>  serial8250_console_putchar+0x1f/0x60
>> drivers/tty/serial/8250/8250_port.c:3170
>>  uart_console_write+0xac/0xe0 drivers/tty/serial/serial_core.c:1858
>>  serial8250_console_write+0x647/0xa20
>> drivers/tty/serial/8250/8250_port.c:3236
>>  univ8250_console_write+0x5f/0x70 drivers/tty/serial/8250/8250_core.c:590
>>  call_console_drivers kernel/printk/printk.c:1574 [inline]
>>  console_unlock+0x788/0xd70 kernel/printk/printk.c:2233
>>  vprintk_emit+0x4ad/0x590 kernel/printk/printk.c:1757
>>  vprintk_default+0x28/0x30 kernel/printk/printk.c:1796
>>  vprintk_func+0x57/0xc0 kernel/printk/printk_safe.c:379
>>  printk+0xaa/0xca kernel/printk/printk.c:1829
>>  nla_parse+0x374/0x3d0 lib/nlattr.c:257
>>  nlmsg_parse include/net/netlink.h:398 [inline]
>>  nl80211_dump_wiphy_parse.isra.37.constprop.83+0x138/0x5c0
>> net/wireless/nl80211.c:1920
>>  nl80211_dump_interface+0x596/0x820 net/wireless/nl80211.c:2660
>>  genl_lock_dumpit+0x68/0x90 net/netlink/genetlink.c:480
>>  netlink_dump+0x48c/0xce0 net/netlink/af_netlink.c:2186
>>  __netlink_dump_start+0x4f0/0x6d0 net/netlink/af_netlink.c:2283
>>  genl_family_rcv_msg+0xd27/0xfc0 net/netlink/genetlink.c:548
>>  genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:624
>>  netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2405
>>  genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
>>  netlink_unicast_kernel net/netlink/af_netlink.c:1272 [inline]
>>  netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1298
>>  netlink_sendmsg+0xa4a/0xe70 net/netlink/af_netlink.c:1861
>>  sock_sendmsg_nosec net/socket.c:636 [inline]
>>  sock_sendmsg+0xca/0x110 net/socket.c:646
>>  sock_write_iter+0x320/0x5e0 net/socket.c:915
>>  call_write_iter include/linux/fs.h:1776 [inline]
>>  new_sync_write fs/read_write.c:469 [inline]
>>  __vfs_write+0x68a/0x970 fs/read_write.c:482
>>  vfs_write+0x18f/0x510 fs/read_write.c:544
>>  SYSC_write fs/read_write.c:589 [inline]
>>  SyS_write+0xef/0x220 fs/read_write.c:581
>>  entry_SYSCALL_64_fastpath+0x1f/0x96
>> RIP: 0033:0x4529d9
>> RSP: 002b:00007f6d52e3ec58 EFLAGS: 00000212 ORIG_RAX: 0000000000000001
>> RAX: ffffffffffffffda RBX: 00007f6d52e3f700 RCX: 00000000004529d9
>> RDX: 0000000000000024 RSI: 0000000020454000 RDI: 0000000000000016
>> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
>> R13: 0000000000a6f7ff R14: 00007f6d52e3f9c0 R15: 0000000000000000
>> Code: 24 d9 00 00 00 49 8d 7c 24 40 48 b8 00 00 00 00 00 fc ff df 48 89 fa
>> 48 c1 ea 03 d3 e3 80 3c 02 00 75 17 41 03 5c 24 40 89 da ec <5b> 0f b6 c0 41
>> 5c 5d c3 e8 38 b0 18 ff eb c2 e8 91 b0 18 ff eb
>>
>>
>> ---
>> This bug is generated by a dumb bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for details.
>> Direct all questions to syzkaller@googlegroups.com.
>> Please credit me with: Reported-by: syzbot <syzkaller@googlegroups.com>
>>
>> syzbot will keep track of this bug report.
>> Once a fix for this bug is merged into any tree, reply to this email with:
>> #syz fix: exact-commit-title
>> To mark this as a duplicate of another syzbot report, please reply with:
>> #syz dup: exact-subject-of-another-report
>> If it's a one-off invalid bug report, please reply with:
>> #syz invalid
>> Note: if the crash happens again, it will cause creation of a new bug
>> report.
>> Note: all commands must start from beginning of the line in the email body.
>>
> 
> 
> This looks like +rtnetlink issue.
> 

Same with this one, perhaps related to / fixed by:
    http://patchwork.ozlabs.org/patch/850957/


* Re: [RFC PATCH] virtio_net: Extend virtio to use VF datapath when available
From: David Miller @ 2017-12-19 18:20 UTC (permalink / raw)
  To: sridhar.samudrala
  Cc: mst, stephen, netdev, virtualization, alexander.duyck,
	jesse.brandeburg
In-Reply-To: <ca709329-a47a-2e76-b1fc-38fc665d7c0b@intel.com>

From: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>
Date: Tue, 19 Dec 2017 09:41:39 -0800

> This is based on netvsc implementation and here is the commit that
> added this delay.  Not sure if this needs to be 100ms.
> 
> commit 6123c66854c174e4982f98195100c1d990f9e5e6
> Author: stephen hemminger <stephen@networkplumber.org>
> Date:   Wed Aug 9 17:46:03 2017 -0700
> 
>     netvsc: delay setup of VF device
> 
>     When VF device is discovered, delay bring it automatically up in
>     order to allow userspace to some simple changes (like renaming).

This is kind of bogus, I should have called this out when the patch
was posted.

Any delay is wrong, there needs to be tight synchronization if a
userspace operation must occur before proceeding.  If something
happens and userspace is delayed, this whole thing doesn't work.


* Re: [RFC PATCH] virtio_net: Extend virtio to use VF datapath when available
From: David Miller @ 2017-12-19 18:21 UTC (permalink / raw)
  To: stephen
  Cc: sridhar.samudrala, mst, netdev, virtualization, alexander.duyck,
	jesse.brandeburg
In-Reply-To: <20171219095548.61d70d10@xeon-e3>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Tue, 19 Dec 2017 09:55:48 -0800

> could be 10ms, just enough to let udev do its renaming

Please, move to some kind of notification or event based handling of
this problem.

No delay is safe, what if userspace gets swapped out or whatever
else might make userspace stall unexpectedly?
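
To make the contrast with a fixed delay concrete, here is a minimal userspace sketch of event-driven handling. Everything in it is hypothetical and for illustration only (struct fake_dev, the dev_event codes, register_dev_notifier(), rename_dev()); it is only loosely modelled on the idea of a notifier callback and is not the netvsc or virtio_net code. The device is brought up when the rename event is delivered, however long userspace takes, instead of after a timer expires.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event codes, loosely modelled on netdevice notifier events. */
enum dev_event { DEV_EVENT_REGISTER, DEV_EVENT_CHANGENAME };

/* Hypothetical stand-in for a VF network device. */
struct fake_dev {
	const char *name;
	bool up;
};

typedef void (*dev_notifier_fn)(enum dev_event ev, struct fake_dev *dev);

#define MAX_NOTIFIERS 4
static dev_notifier_fn notifiers[MAX_NOTIFIERS];
static size_t nr_notifiers;

/* Register a callback; returns false when the table is full. */
bool register_dev_notifier(dev_notifier_fn fn)
{
	if (nr_notifiers == MAX_NOTIFIERS)
		return false;
	notifiers[nr_notifiers++] = fn;
	return true;
}

/* Deliver one event to every registered callback, in registration order. */
void notify_dev_event(enum dev_event ev, struct fake_dev *dev)
{
	size_t i;

	for (i = 0; i < nr_notifiers; i++)
		notifiers[i](ev, dev);
}

/* Bring the device up only once the rename event has actually arrived. */
void bring_up_after_rename(enum dev_event ev, struct fake_dev *dev)
{
	if (ev == DEV_EVENT_CHANGENAME)
		dev->up = true;
}

/* What a rename request from userspace (e.g. udev) would trigger. */
void rename_dev(struct fake_dev *dev, const char *new_name)
{
	dev->name = new_name;
	notify_dev_event(DEV_EVENT_CHANGENAME, dev);
}
```

The sketch has no timing assumption: a stalled userspace only postpones the callback. It does not answer what should happen when userspace never renames the device, so the choice of event to key on remains an open design question.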

^ permalink raw reply

* [PATCH v1] net: bonding: Replace mac address parsing
From: Andy Shevchenko @ 2017-12-19 18:20 UTC (permalink / raw)
  To: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, netdev; +Cc: Andy Shevchenko

Replace sscanf() with mac_pton().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/net/bonding/bond_options.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 8a9b085c2a98..58c705f24f96 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -1431,13 +1431,9 @@ static int bond_option_ad_actor_system_set(struct bonding *bond,
 {
 	u8 macaddr[ETH_ALEN];
 	u8 *mac;
-	int i;
 
 	if (newval->string) {
-		i = sscanf(newval->string, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
-			   &macaddr[0], &macaddr[1], &macaddr[2],
-			   &macaddr[3], &macaddr[4], &macaddr[5]);
-		if (i != ETH_ALEN)
+		if (!mac_pton(newval->string, macaddr))
 			goto err;
 		mac = macaddr;
 	} else {
-- 
2.15.1
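
For anyone reviewing the behavioural side of this change, below is a minimal userspace sketch of strict MAC parsing. It is an approximation written for illustration, not the kernel's mac_pton() source; parse_mac_strict() and hex_val() are hypothetical names. It assumes the strict form is exactly two hex digits per octet separated by ':', which is narrower than what the removed "%hhx" conversions accept (those also take single-digit octets such as "1:2:3:4:5:6").

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6 /* same value as the kernel constant */

/* Hypothetical helper: value of one hex digit already checked by isxdigit(). */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	return tolower((unsigned char)c) - 'a' + 10;
}

/*
 * Userspace approximation of strict MAC parsing (not the kernel source):
 * accept only "XX:XX:XX:XX:XX:XX" at the start of s, and leave mac
 * untouched unless the whole pattern matches.
 */
bool parse_mac_strict(const char *s, uint8_t *mac)
{
	int i;

	/* Too short to hold six two-digit octets and five separators. */
	if (strlen(s) < 3 * ETH_ALEN - 1)
		return false;

	/* Validate everything first so a bad string never dirties mac. */
	for (i = 0; i < ETH_ALEN; i++) {
		if (!isxdigit((unsigned char)s[i * 3]) ||
		    !isxdigit((unsigned char)s[i * 3 + 1]))
			return false;
		if (i != ETH_ALEN - 1 && s[i * 3 + 2] != ':')
			return false;
	}

	for (i = 0; i < ETH_ALEN; i++)
		mac[i] = (uint8_t)((hex_val(s[i * 3]) << 4) |
				   hex_val(s[i * 3 + 1]));

	return true;
}
```

With this sketch, "00:1a:2B:3c:4d:5e" parses, while "1:2:3:4:5:6" and "00-1a-2b-3c-4d-5e" are rejected and the output buffer is left as it was. If the in-kernel helper behaves the same way, the patch tightens the accepted input format slightly in addition to shortening the code.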


* Re: [PATCH net] net: mediatek: setup proper state for disabled GMAC on the default
From: David Miller @ 2017-12-19 18:21 UTC (permalink / raw)
  To: sean.wang; +Cc: john, nbd, netdev, linux-kernel, linux-mediatek
In-Reply-To: <1ce99b9e64c6212eac501b98ee982b2e37fd7962.1513587298.git.sean.wang@mediatek.com>

From: <sean.wang@mediatek.com>
Date: Mon, 18 Dec 2017 17:00:17 +0800

> From: Sean Wang <sean.wang@mediatek.com>
> 
> The current solution would setup fixed and force link of 1Gbps to the both
> GMAC on the default. However, The GMAC should always be put to link down
> state when the GMAC is disabled on certain target boards. Otherwise,
> the driver possibly receives unexpected data from the floating hardware
> connection through the unused GMAC. Although the driver had been added
> certain protection in RX path to get rid of such kind of unexpected data
> sent to the upper stack.
> 
> Signed-off-by: Sean Wang <sean.wang@mediatek.com>

Applied, thanks.

