Netdev List


* Re: [PATCH mlx5-next] net/mlx5: Fix modify_cq_in alignment
From: Leon Romanovsky @ 2019-07-25  3:02 UTC
  To: Saeed Mahameed
  Cc: davem@davemloft.net, Jason Gunthorpe, Yishai Hadas,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	dledford@redhat.com, Edward Srouji
In-Reply-To: <5447fded90dfd133ef002177b77bfd3685bf8b42.camel@mellanox.com>

On Wed, Jul 24, 2019 at 08:56:08PM +0000, Saeed Mahameed wrote:
> On Tue, 2019-07-23 at 22:04 +0300, Leon Romanovsky wrote:
> > On Tue, Jul 23, 2019 at 11:28:50AM -0700, David Miller wrote:
> > > From: Leon Romanovsky <leon@kernel.org>
> > > Date: Tue, 23 Jul 2019 10:12:55 +0300
> > >
> > > > From: Edward Srouji <edwards@mellanox.com>
> > > >
> > > > Fix modify_cq_in alignment to match the device specification.
> > > > After this fix the 'cq_umem_valid' field will be in the right
> > > > offset.
> > > >
> > > > Cc: <stable@vger.kernel.org> # 4.19
> > > > Fixes: bd37197554eb ("net/mlx5: Update mlx5_ifc with DEVX UID
> > > > bits")
>
> Leon, I applied this patch to my tree, it got marked for -stable 4.20
> and not 4.19, i checked manually and indeed the offending patch came to
> light only on 4.20

Thanks

>


* Re: [PATCH net] net: hns: fix LED configuration for marvell phy
From: liuyonglong @ 2019-07-25  3:00 UTC
  To: David Miller, Andrew Lunn
  Cc: netdev, linux-kernel, linuxarm, salil.mehta, yisen.zhuang,
	shiju.jose
In-Reply-To: <20190722.181906.2225538844348045066.davem@davemloft.net>

> Revert "net: hns: fix LED configuration for marvell phy"
> This reverts commit f4e5f775db5a4631300dccd0de5eafb50a77c131.
>
> Andrew Lunn says this should be handled another way.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>


Hi Andrew:

I see this patch has been reverted. Can you tell me the better way to do this?
Thanks very much!

On 2019/7/23 9:19, David Miller wrote:
> From: Yonglong Liu <liuyonglong@huawei.com>
> Date: Mon, 22 Jul 2019 13:59:12 +0800
> 
>> Since commit(net: phy: marvell: change default m88e1510 LED configuration),
>> the active LED of Hip07 devices is always off, because Hip07 just
>> use 2 LEDs.
>> This patch adds a phy_register_fixup_for_uid() for m88e1510 to
>> correct the LED configuration.
>>
>> Fixes: 077772468ec1 ("net: phy: marvell: change default m88e1510 LED configuration")
>> Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
>> Reviewed-by: linyunsheng <linyunsheng@huawei.com>
> 
> Applied and queued up for -stable.
> 
> .
> 



* Re: [PATCH] rtlwifi: remove unneeded function _rtl_dump_channel_map()
From: Pkshih @ 2019-07-25  2:50 UTC
  To: yuehaibing@huawei.com, kvalo@codeaurora.org
  Cc: linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <20190724141020.58800-1-yuehaibing@huawei.com>

On Wed, 2019-07-24 at 22:10 +0800, YueHaibing wrote:
> Now _rtl_dump_channel_map() does not do any actual
> thing using the channel. So remove it.
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> ---
>  drivers/net/wireless/realtek/rtlwifi/regd.c | 18 ------------------
>  1 file changed, 18 deletions(-)
> 
> diff --git a/drivers/net/wireless/realtek/rtlwifi/regd.c
> b/drivers/net/wireless/realtek/rtlwifi/regd.c
> index 6ccb5b9..c10432c 100644
> --- a/drivers/net/wireless/realtek/rtlwifi/regd.c
> +++ b/drivers/net/wireless/realtek/rtlwifi/regd.c
> @@ -276,22 +276,6 @@ static void _rtl_reg_apply_world_flags(struct wiphy
> *wiphy,
>  	return;
>  }
>  
> -static void _rtl_dump_channel_map(struct wiphy *wiphy)
> -{
> -	enum nl80211_band band;
> -	struct ieee80211_supported_band *sband;
> -	struct ieee80211_channel *ch;
> -	unsigned int i;
> -
> -	for (band = 0; band < NUM_NL80211_BANDS; band++) {
> -		if (!wiphy->bands[band])
> -			continue;
> -		sband = wiphy->bands[band];
> -		for (i = 0; i < sband->n_channels; i++)
> -			ch = &sband->channels[i];
> -	}
> -}
> -
>  static int _rtl_reg_notifier_apply(struct wiphy *wiphy,
>  				   struct regulatory_request *request,
>  				   struct rtl_regulatory *reg)
> @@ -309,8 +293,6 @@ static int _rtl_reg_notifier_apply(struct wiphy *wiphy,
>  		break;
>  	}
>  
> -	_rtl_dump_channel_map(wiphy);
> -
>  	return 0;
>  }
>  

Acked-by: Ping-Ke Shih <pkshih@realtek.com>




* Re: [PATCH net-next v2 3/3] netlink: add validation of NLA_F_NESTED flag
From: David Ahern @ 2019-07-25  2:46 UTC
  To: Thomas Haller, Michal Kubecek, David S. Miller
  Cc: netdev, Johannes Berg, linux-kernel
In-Reply-To: <0fc58a4883f6656208b9250876e53d723919e342.camel@redhat.com>

On 7/23/19 1:57 AM, Thomas Haller wrote:
> Does this flag and strict validation really provide any value? Commonly a netlink message
> is a plain TLV blob, and the meaning depends entirely on the policy.

Strict checking enables kernel side filtering and other features that
require passing attributes as part of the dump request - like address
dumps in a specific namespace.


* Re: [PATCH net-next 06/11] net: hns3: modify firmware version display format
From: tanhuazhong @ 2019-07-25  2:34 UTC
  To: Saeed Mahameed, davem@davemloft.net
  Cc: lipeng321@huawei.com, yisen.zhuang@huawei.com,
	salil.mehta@huawei.com, linuxarm@huawei.com,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	moyufeng@huawei.com
In-Reply-To: <4c4ce27c9a9372340c0e2b0f654b3fb9cd85b3e4.camel@mellanox.com>



On 2019/7/25 2:34, Saeed Mahameed wrote:
> On Wed, 2019-07-24 at 11:18 +0800, Huazhong Tan wrote:
>> From: Yufeng Mo <moyufeng@huawei.com>
>>
>> This patch modifies firmware version display format in
>> hclge(vf)_cmd_init() and hns3_get_drvinfo(). Also, adds
>> some optimizations for firmware version display format.
>>
>> Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
>> Signed-off-by: Peng Li <lipeng321@huawei.com>
>> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
>> ---
>>   drivers/net/ethernet/hisilicon/hns3/hnae3.h              |  9
>> +++++++++
>>   drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c       | 15
>> +++++++++++++--
>>   drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c   | 10
>> +++++++++-
>>   drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c | 11
>> +++++++++--
>>   4 files changed, 40 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
>> b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
>> index 48c7b70..a4624db 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
>> @@ -179,6 +179,15 @@ struct hnae3_vector_info {
>>   #define HNAE3_RING_GL_RX 0
>>   #define HNAE3_RING_GL_TX 1
>>   
>> +#define HNAE3_FW_VERSION_BYTE3_SHIFT	24
>> +#define HNAE3_FW_VERSION_BYTE3_MASK	GENMASK(31, 24)
>> +#define HNAE3_FW_VERSION_BYTE2_SHIFT	16
>> +#define HNAE3_FW_VERSION_BYTE2_MASK	GENMASK(23, 16)
>> +#define HNAE3_FW_VERSION_BYTE1_SHIFT	8
>> +#define HNAE3_FW_VERSION_BYTE1_MASK	GENMASK(15, 8)
>> +#define HNAE3_FW_VERSION_BYTE0_SHIFT	0
>> +#define HNAE3_FW_VERSION_BYTE0_MASK	GENMASK(7, 0)
>> +
>>   struct hnae3_ring_chain_node {
>>   	struct hnae3_ring_chain_node *next;
>>   	u32 tqp_index;
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
>> b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
>> index 5bff98a..e71c92b 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
>> @@ -527,6 +527,7 @@ static void hns3_get_drvinfo(struct net_device
>> *netdev,
>>   {
>>   	struct hns3_nic_priv *priv = netdev_priv(netdev);
>>   	struct hnae3_handle *h = priv->ae_handle;
>> +	u32 fw_version;
>>   
>>   	if (!h->ae_algo->ops->get_fw_version) {
>>   		netdev_err(netdev, "could not get fw version!\n");
>> @@ -545,8 +546,18 @@ static void hns3_get_drvinfo(struct net_device
>> *netdev,
>>   		sizeof(drvinfo->bus_info));
>>   	drvinfo->bus_info[ETHTOOL_BUSINFO_LEN - 1] = '\0';
>>   
>> -	snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
>> "0x%08x",
>> -		 priv->ae_handle->ae_algo->ops->get_fw_version(h));
>> +	fw_version = priv->ae_handle->ae_algo->ops->get_fw_version(h);
>> +
>> +	snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
>> +		 "%lu.%lu.%lu.%lu",
>> +		 hnae3_get_field(fw_version,
>> HNAE3_FW_VERSION_BYTE3_MASK,
>> +				 HNAE3_FW_VERSION_BYTE3_SHIFT),
>> +		 hnae3_get_field(fw_version,
>> HNAE3_FW_VERSION_BYTE2_MASK,
>> +				 HNAE3_FW_VERSION_BYTE2_SHIFT),
>> +		 hnae3_get_field(fw_version,
>> HNAE3_FW_VERSION_BYTE1_MASK,
>> +				 HNAE3_FW_VERSION_BYTE1_SHIFT),
>> +		 hnae3_get_field(fw_version,
>> HNAE3_FW_VERSION_BYTE0_MASK,
>> +				 HNAE3_FW_VERSION_BYTE0_SHIFT));
>>   }
>>   
>>   static u32 hns3_get_link(struct net_device *netdev)
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
>> b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
>> index 22f6acd..c2320bf 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
>> @@ -419,7 +419,15 @@ int hclge_cmd_init(struct hclge_dev *hdev)
>>   	}
>>   	hdev->fw_version = version;
>>   
>> -	dev_info(&hdev->pdev->dev, "The firmware version is %08x\n",
>> version);
>> +	pr_info_once("The firmware version is %lu.%lu.%lu.%lu\n",
>> +		     hnae3_get_field(version,
>> HNAE3_FW_VERSION_BYTE3_MASK,
>> +				     HNAE3_FW_VERSION_BYTE3_SHIFT),
>> +		     hnae3_get_field(version,
>> HNAE3_FW_VERSION_BYTE2_MASK,
>> +				     HNAE3_FW_VERSION_BYTE2_SHIFT),
>> +		     hnae3_get_field(version,
>> HNAE3_FW_VERSION_BYTE1_MASK,
>> +				     HNAE3_FW_VERSION_BYTE1_SHIFT),
>> +		     hnae3_get_field(version,
>> HNAE3_FW_VERSION_BYTE0_MASK,
>> +				     HNAE3_FW_VERSION_BYTE0_SHIFT));
>>   
> 
> Device name/string will not be printed now, what happens if i have
> multiple devices ? at least print the device name as it was before
>
Since each board has only one firmware, the firmware version is the
same for every device and does not change at runtime.
So pr_info_once() looks good for this case.

BTW, maybe we should change the print below, at the end of
hclge_init_ae_dev(), to use dev_info() instead of pr_info(),
so that we can tell which device has finished initializing.
I will send another patch to do that. Is that acceptable to you?

pr_info("%s driver initialization finished.\n", HCLGE_DRIVER_NAME);

Thanks.
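
For readers following the format change, here is a minimal userspace
sketch of the dotted display discussed above. It is not the driver code:
the FW_BYTE* macros and fw_version_to_str() are hypothetical stand-ins
for the HNAE3_FW_VERSION_BYTEn_* definitions and hnae3_get_field() in
the quoted patch, and it assumes the version is a packed 32-bit value
with one byte per component.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the HNAE3_FW_VERSION_BYTEn_SHIFT macros. */
#define FW_BYTE3_SHIFT 24
#define FW_BYTE2_SHIFT 16
#define FW_BYTE1_SHIFT 8
#define FW_BYTE0_SHIFT 0
#define FW_BYTE_MASK   0xffu

/*
 * Format a packed 32-bit firmware version as "b3.b2.b1.b0".
 * Returns what snprintf() returns: the length the full string needs.
 */
static inline int fw_version_to_str(uint32_t version, char *buf, size_t len)
{
	return snprintf(buf, len, "%u.%u.%u.%u",
			(unsigned int)((version >> FW_BYTE3_SHIFT) & FW_BYTE_MASK),
			(unsigned int)((version >> FW_BYTE2_SHIFT) & FW_BYTE_MASK),
			(unsigned int)((version >> FW_BYTE1_SHIFT) & FW_BYTE_MASK),
			(unsigned int)((version >> FW_BYTE0_SHIFT) & FW_BYTE_MASK));
}
```

With this helper, a raw value such as 0x01080a02 would be shown as
1.8.10.2 rather than as the old hex form.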

>>   	return 0;
>>   
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
>> b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
>> index 652b796..004125b 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
>> @@ -405,8 +405,15 @@ int hclgevf_cmd_init(struct hclgevf_dev *hdev)
>>   	}
>>   	hdev->fw_version = version;
>>   
>> -	dev_info(&hdev->pdev->dev, "The firmware version is %08x\n",
>> version);
>> -
>> +	pr_info_once("The firmware version is %lu.%lu.%lu.%lu\n",
>> +		     hnae3_get_field(version,
>> HNAE3_FW_VERSION_BYTE3_MASK,
>> +				     HNAE3_FW_VERSION_BYTE3_SHIFT),
>> +		     hnae3_get_field(version,
>> HNAE3_FW_VERSION_BYTE2_MASK,
>> +				     HNAE3_FW_VERSION_BYTE2_SHIFT),
>> +		     hnae3_get_field(version,
>> HNAE3_FW_VERSION_BYTE1_MASK,
>> +				     HNAE3_FW_VERSION_BYTE1_SHIFT),
>> +		     hnae3_get_field(version,
>> HNAE3_FW_VERSION_BYTE0_MASK,
>> +				     HNAE3_FW_VERSION_BYTE0_SHIFT));
>>   	return 0;
>>   
> 
> Same.
> 

Same
:)



* Re: [PATCH bpf-next v3 03/11] xsk: add support to allow unaligned chunk placement
From: Jakub Kicinski @ 2019-07-25  2:22 UTC
  To: Kevin Laatz
  Cc: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon,
	saeedm, maximmi, stephen, bruce.richardson, ciara.loftus, bpf,
	intel-wired-lan
In-Reply-To: <20190724051043.14348-4-kevin.laatz@intel.com>

On Wed, 24 Jul 2019 05:10:35 +0000, Kevin Laatz wrote:
> Currently, addresses are chunk size aligned. This means, we are very
> restricted in terms of where we can place chunk within the umem. For
> example, if we have a chunk size of 2k, then our chunks can only be placed
> at 0,2k,4k,6k,8k... and so on (ie. every 2k starting from 0).
> 
> This patch introduces the ability to use unaligned chunks. With these
> changes, we are no longer bound to having to place chunks at a 2k (or
> whatever your chunk size is) interval. Since we are no longer dealing with
> aligned chunks, they can now cross page boundaries. Checks for page
> contiguity have been added in order to keep track of which pages are
> followed by a physically contiguous page.
> 
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> 
> ---
> v2:
>   - Add checks for the flags coming from userspace
>   - Fix how we get chunk_size in xsk_diag.c
>   - Add defines for masking the new descriptor format
>   - Modified the rx functions to use new descriptor format
>   - Modified the tx functions to use new descriptor format
> 
> v3:
>   - Add helper function to do address/offset masking/addition
> ---
>  include/net/xdp_sock.h      | 17 ++++++++
>  include/uapi/linux/if_xdp.h |  9 ++++
>  net/xdp/xdp_umem.c          | 18 +++++---
>  net/xdp/xsk.c               | 86 ++++++++++++++++++++++++++++++-------
>  net/xdp/xsk_diag.c          |  2 +-
>  net/xdp/xsk_queue.h         | 68 +++++++++++++++++++++++++----
>  6 files changed, 170 insertions(+), 30 deletions(-)
> 
> diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
> index 69796d264f06..738996c0f995 100644
> --- a/include/net/xdp_sock.h
> +++ b/include/net/xdp_sock.h
> @@ -19,6 +19,7 @@ struct xsk_queue;
>  struct xdp_umem_page {
>  	void *addr;
>  	dma_addr_t dma;
> +	bool next_pg_contig;

IIRC accesses to xdp_umem_page cause a lot of cache misses, so having
this structure grow from 16 to 24B is a little unfortunate :(
Can we try to steal lower bits of addr or dma? Or perhaps not
precompute this info at all?
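
To illustrate the "steal lower bits" idea, a minimal userspace sketch
follows. It is only an assumption of how such packing could look, not
what the series ended up doing: struct umem_page_packed, PG_CONTIG_FLAG
and the helpers are hypothetical names, and the sketch relies on the
stored address being page aligned so that bit 0 is always free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag kept in bit 0 of a page-aligned address. */
#define PG_CONTIG_FLAG ((uintptr_t)1)

/* Same two words as before: no extra bool, so no growth on LP64. */
struct umem_page_packed {
	uintptr_t addr_and_flags;	/* page-aligned address | low-bit flags */
	uint64_t dma;
};

static inline void page_set(struct umem_page_packed *pg, void *addr,
			    bool next_contig, uint64_t dma)
{
	uintptr_t a = (uintptr_t)addr;

	/* Alignment is what guarantees the low bit is unused. */
	assert((a & PG_CONTIG_FLAG) == 0);
	pg->addr_and_flags = a | (next_contig ? PG_CONTIG_FLAG : 0);
	pg->dma = dma;
}

/* Mask the flag off before using the value as a pointer. */
static inline void *page_addr(const struct umem_page_packed *pg)
{
	return (void *)(pg->addr_and_flags & ~PG_CONTIG_FLAG);
}

static inline bool page_next_contig(const struct umem_page_packed *pg)
{
	return (pg->addr_and_flags & PG_CONTIG_FLAG) != 0;
}
```

The cost is that every user of the address has to go through the
masking accessor instead of reading the field directly.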

>  };
>  
>  struct xdp_umem_fq_reuse {
> @@ -48,6 +49,7 @@ struct xdp_umem {
>  	bool zc;
>  	spinlock_t xsk_list_lock;
>  	struct list_head xsk_list;
> +	u32 flags;
>  };
>  
>  struct xdp_sock {
> @@ -144,6 +146,15 @@ static inline void xsk_umem_fq_reuse(struct xdp_umem *umem, u64 addr)
>  
>  	rq->handles[rq->length++] = addr;
>  }
> +
> +static inline u64 xsk_umem_handle_offset(struct xdp_umem *umem, u64 handle,
> +					 u64 offset)
> +{
> +	if (umem->flags & XDP_UMEM_UNALIGNED_CHUNKS)
> +		return handle |= (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT);
> +	else
> +		return handle += offset;
> +}
>  #else
>  static inline int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
>  {
> @@ -241,6 +252,12 @@ static inline void xsk_umem_fq_reuse(struct xdp_umem *umem, u64 addr)
>  {
>  }
>  
> +static inline u64 xsk_umem_handle_offset(struct xdp_umem *umem, u64 handle,
> +					 u64 offset)
> +{
> +	return NULL;

	return 0?

> +}
> +
>  #endif /* CONFIG_XDP_SOCKETS */
>  
>  #endif /* _LINUX_XDP_SOCK_H */
> diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
> index faaa5ca2a117..f8dc68fcdf78 100644
> --- a/include/uapi/linux/if_xdp.h
> +++ b/include/uapi/linux/if_xdp.h
> @@ -17,6 +17,9 @@
>  #define XDP_COPY	(1 << 1) /* Force copy-mode */
>  #define XDP_ZEROCOPY	(1 << 2) /* Force zero-copy mode */
>  
> +/* Flags for xsk_umem_config flags */
> +#define XDP_UMEM_UNALIGNED_CHUNKS (1 << 0)
> +
>  struct sockaddr_xdp {
>  	__u16 sxdp_family;
>  	__u16 sxdp_flags;
> @@ -53,6 +56,7 @@ struct xdp_umem_reg {
>  	__u64 len; /* Length of packet data area */
>  	__u32 chunk_size;
>  	__u32 headroom;
> +	__u32 flags;
>  };
>  
>  struct xdp_statistics {
> @@ -74,6 +78,11 @@ struct xdp_options {
>  #define XDP_UMEM_PGOFF_FILL_RING	0x100000000ULL
>  #define XDP_UMEM_PGOFF_COMPLETION_RING	0x180000000ULL
>  
> +/* Masks for unaligned chunks mode */
> +#define XSK_UNALIGNED_BUF_OFFSET_SHIFT 48
> +#define XSK_UNALIGNED_BUF_ADDR_MASK \
> +	((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1)
> +
>  /* Rx/Tx descriptor */
>  struct xdp_desc {
>  	__u64 addr;
> diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
> index 83de74ca729a..952ca22103e9 100644
> --- a/net/xdp/xdp_umem.c
> +++ b/net/xdp/xdp_umem.c
> @@ -299,6 +299,7 @@ static int xdp_umem_account_pages(struct xdp_umem *umem)
>  
>  static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
>  {
> +	bool unaligned_chunks = mr->flags & XDP_UMEM_UNALIGNED_CHUNKS;
>  	u32 chunk_size = mr->chunk_size, headroom = mr->headroom;
>  	unsigned int chunks, chunks_per_page;
>  	u64 addr = mr->addr, size = mr->len;
> @@ -314,7 +315,10 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
>  		return -EINVAL;
>  	}
>  
> -	if (!is_power_of_2(chunk_size))
> +	if (mr->flags & ~(XDP_UMEM_UNALIGNED_CHUNKS))

parens unnecessary, consider adding a define for known flags.
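
A minimal sketch of what such a define could look like, written as
plain userspace C. XDP_UMEM_UNALIGNED_CHUNKS comes from the patch;
XDP_UMEM_KNOWN_FLAGS and umem_check_flags() are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define XDP_UMEM_UNALIGNED_CHUNKS (1U << 0)	/* from the patch */

/* Hypothetical: one mask listing every flag the kernel understands. */
#define XDP_UMEM_KNOWN_FLAGS XDP_UMEM_UNALIGNED_CHUNKS

/* Reject any bit outside the known set, so new flags can be added later. */
static inline int umem_check_flags(uint32_t flags)
{
	if (flags & ~XDP_UMEM_KNOWN_FLAGS)
		return -EINVAL;
	return 0;
}
```

A new flag would then only need to be OR-ed into the mask, and the
check itself stays untouched.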

> +		return -EINVAL;
> +
> +	if (!unaligned_chunks && !is_power_of_2(chunk_size))
>  		return -EINVAL;
>  
>  	if (!PAGE_ALIGNED(addr)) {
> @@ -331,9 +335,11 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
>  	if (chunks == 0)
>  		return -EINVAL;
>  
> -	chunks_per_page = PAGE_SIZE / chunk_size;
> -	if (chunks < chunks_per_page || chunks % chunks_per_page)
> -		return -EINVAL;
> +	if (!unaligned_chunks) {
> +		chunks_per_page = PAGE_SIZE / chunk_size;
> +		if (chunks < chunks_per_page || chunks % chunks_per_page)
> +			return -EINVAL;
> +	}
>  
>  	headroom = ALIGN(headroom, 64);
>  
> @@ -342,13 +348,15 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
>  		return -EINVAL;
>  
>  	umem->address = (unsigned long)addr;
> -	umem->chunk_mask = ~((u64)chunk_size - 1);
> +	umem->chunk_mask = unaligned_chunks ? XSK_UNALIGNED_BUF_ADDR_MASK
> +					    : ~((u64)chunk_size - 1);
>  	umem->size = size;
>  	umem->headroom = headroom;
>  	umem->chunk_size_nohr = chunk_size - headroom;
>  	umem->npgs = size / PAGE_SIZE;
>  	umem->pgs = NULL;
>  	umem->user = NULL;
> +	umem->flags = mr->flags;
>  	INIT_LIST_HEAD(&umem->xsk_list);
>  	spin_lock_init(&umem->xsk_list_lock);
>  
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 59b57d708697..b3ab653091c4 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -45,7 +45,7 @@ EXPORT_SYMBOL(xsk_umem_has_addrs);
>  
>  u64 *xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr)
>  {
> -	return xskq_peek_addr(umem->fq, addr);
> +	return xskq_peek_addr(umem->fq, addr, umem);
>  }
>  EXPORT_SYMBOL(xsk_umem_peek_addr);
>  
> @@ -55,21 +55,42 @@ void xsk_umem_discard_addr(struct xdp_umem *umem)
>  }
>  EXPORT_SYMBOL(xsk_umem_discard_addr);
>  
> +/* If a buffer crosses a page boundary, we need to do 2 memcpy's, one for
> + * each page. This is only required in copy mode.
> + */
> +static void __xsk_rcv_memcpy(struct xdp_umem *umem, u64 addr, void *from_buf,
> +			     u32 len, u32 metalen)
> +{
> +	void *to_buf = xdp_umem_get_data(umem, addr);
> +
> +	if (xskq_crosses_non_contig_pg(umem, addr, len + metalen)) {
> +		void *next_pg_addr = umem->pages[(addr >> PAGE_SHIFT) + 1].addr;
> +		u64 page_start = addr & (PAGE_SIZE - 1);
> +		u64 first_len = PAGE_SIZE - (addr - page_start);
> +
> +		memcpy(to_buf, from_buf, first_len + metalen);
> +		memcpy(next_pg_addr, from_buf + first_len, len - first_len);
> +
> +		return;
> +	}
> +
> +	memcpy(to_buf, from_buf, len + metalen);
> +}

Why handle this case gracefully? Real XSK use is the zero copy mode,
having extra code to make copy mode more permissive seems a little
counterproductive IMHO.
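
For context, the extra code in question amounts to a copy that is
split at a page boundary. A minimal userspace sketch of that technique,
under stated assumptions: SKETCH_PAGE_SIZE, copy_split() and the
pages[] array of per-page pointers are hypothetical, the buffer is at
most one page long, and this is not the helper from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/*
 * Copy len bytes to logical offset addr in a buffer made of separately
 * mapped pages. If the range crosses into the next page, copy in two
 * parts, one per page.
 */
static inline void copy_split(void *const *pages, uint64_t addr,
			      const void *from, size_t len)
{
	size_t pg = (size_t)(addr / SKETCH_PAGE_SIZE);
	size_t off = (size_t)(addr % SKETCH_PAGE_SIZE);	/* offset in page */
	size_t first = SKETCH_PAGE_SIZE - off;		/* room left in page */
	unsigned char *dst = (unsigned char *)pages[pg] + off;

	assert(len <= SKETCH_PAGE_SIZE);	/* crosses at most one boundary */

	if (len <= first) {
		memcpy(dst, from, len);
		return;
	}

	memcpy(dst, from, first);
	memcpy(pages[pg + 1], (const unsigned char *)from + first, len - first);
}
```

In zero-copy mode there is no such memcpy at all, which is why the
question is whether copy mode should simply reject these buffers.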

>  static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
>  {
> -	void *to_buf, *from_buf;
> +	u64 offset = xs->umem->headroom;
> +	void *from_buf;
>  	u32 metalen;
>  	u64 addr;
>  	int err;
>  
> -	if (!xskq_peek_addr(xs->umem->fq, &addr) ||
> +	if (!xskq_peek_addr(xs->umem->fq, &addr, xs->umem) ||
>  	    len > xs->umem->chunk_size_nohr - XDP_PACKET_HEADROOM) {
>  		xs->rx_dropped++;
>  		return -ENOSPC;
>  	}
>  
> -	addr += xs->umem->headroom;
> -
>  	if (unlikely(xdp_data_meta_unsupported(xdp))) {
>  		from_buf = xdp->data;
>  		metalen = 0;
> @@ -78,9 +99,13 @@ static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
>  		metalen = xdp->data - xdp->data_meta;
>  	}
>  
> -	to_buf = xdp_umem_get_data(xs->umem, addr);
> -	memcpy(to_buf, from_buf, len + metalen);
> -	addr += metalen;
> +	__xsk_rcv_memcpy(xs->umem, addr + offset, from_buf, len, metalen);
> +
> +	offset += metalen;
> +	if (xs->umem->flags & XDP_UMEM_UNALIGNED_CHUNKS)
> +		addr |= offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
> +	else
> +		addr += offset;
>  	err = xskq_produce_batch_desc(xs->rx, addr, len);
>  	if (!err) {
>  		xskq_discard_addr(xs->umem->fq);
> @@ -127,6 +152,7 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
>  	u32 len = xdp->data_end - xdp->data;
>  	void *buffer;
>  	u64 addr;
> +	u64 offset = xs->umem->headroom;

reverse xmas tree, please

>  	int err;
>  
>  	spin_lock_bh(&xs->rx_lock);



* Re: [PATCH] net: sctp: fix memory leak in sctp_send_reset_streams
From: Marcelo Ricardo Leitner @ 2019-07-25  2:19 UTC
  To: Xin Long
  Cc: Neil Horman, Hillf Danton, linux-sctp, network dev, syzkaller,
	David S . Miller, LKML, syzkaller-bugs, syzbot, Vlad Yasevich,
	Eric Dumazet
In-Reply-To: <CADvbK_ddFyO2iz-QS3bHevHN7Q29VUS4joK3Kxam3Y4tEqHFKA@mail.gmail.com>

On Wed, Jul 24, 2019 at 03:56:40PM +0800, Xin Long wrote:
> On Sun, Jun 2, 2019 at 9:36 PM Xin Long <lucien.xin@gmail.com> wrote:
> >
> > On Sun, Jun 2, 2019 at 6:52 PM Neil Horman <nhorman@tuxdriver.com> wrote:
> > >
> > > On Sun, Jun 02, 2019 at 11:44:29AM +0800, Hillf Danton wrote:
> > > >
> > > > syzbot found the following crash on:
> > > >
> > > > HEAD commit:    036e3431 Merge git://git.kernel.org/pub/scm/linux/kernel/g..
> > > > git tree:       upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=153cff12a00000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=8f0f63a62bb5b13c
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=6ad9c3bd0a218a2ab41d
> > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12561c86a00000
> > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15b76fd8a00000
> > > >
> > > > executing program
> > > > executing program
> > > > executing program
> > > > executing program
> > > > executing program
> > > > BUG: memory leak
> > > > unreferenced object 0xffff888123894820 (size 32):
> > > >   comm "syz-executor045", pid 7267, jiffies 4294943559 (age 13.660s)
> > > >   hex dump (first 32 bytes):
> > > >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > > >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > > >   backtrace:
> > > >     [<00000000c7e71c69>] kmemleak_alloc_recursive
> > > > include/linux/kmemleak.h:55 [inline]
> > > >     [<00000000c7e71c69>] slab_post_alloc_hook mm/slab.h:439 [inline]
> > > >     [<00000000c7e71c69>] slab_alloc mm/slab.c:3326 [inline]
> > > >     [<00000000c7e71c69>] __do_kmalloc mm/slab.c:3658 [inline]
> > > >     [<00000000c7e71c69>] __kmalloc+0x161/0x2c0 mm/slab.c:3669
> > > >     [<000000003250ed8e>] kmalloc_array include/linux/slab.h:670 [inline]
> > > >     [<000000003250ed8e>] kcalloc include/linux/slab.h:681 [inline]
> > > >     [<000000003250ed8e>] sctp_send_reset_streams+0x1ab/0x5a0 net/sctp/stream.c:302
> > > >     [<00000000cd899c6e>] sctp_setsockopt_reset_streams net/sctp/socket.c:4314 [inline]
> > > >     [<00000000cd899c6e>] sctp_setsockopt net/sctp/socket.c:4765 [inline]
> > > >     [<00000000cd899c6e>] sctp_setsockopt+0xc23/0x2bf0 net/sctp/socket.c:4608
> > > >     [<00000000ff3a21a2>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130
> > > >     [<000000009eb87ae7>] __sys_setsockopt+0x98/0x120 net/socket.c:2078
> > > >     [<00000000e0ede6ca>] __do_sys_setsockopt net/socket.c:2089 [inline]
> > > >     [<00000000e0ede6ca>] __se_sys_setsockopt net/socket.c:2086 [inline]
> > > >     [<00000000e0ede6ca>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
> > > >     [<00000000c61155f5>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
> > > >     [<00000000e540958c>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > >
> > > >
> > > > It was introduced in commit d570a59c5b5f ("sctp: only allow the out stream
> > > > > reset when the stream outq is empty"), in order to check stream outqs before
> > > > sending SCTP_STRRESET_IN_PROGRESS back to the peer of the stream. EAGAIN is
> > > > returned, however, without the nstr_list slab released, if any outq is found
> > > > to be non empty.
> > > >
> > > > Freeing the slab in question before bailing out fixes it.
> > > >
> > > > Fixes: d570a59c5b5f ("sctp: only allow the out stream reset when the stream outq is empty")
> > > > Reported-by: syzbot <syzbot+6ad9c3bd0a218a2ab41d@syzkaller.appspotmail.com>
> > > > Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> > > > Tested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> > > > Cc: Xin Long <lucien.xin@gmail.com>
> > > > Cc: Neil Horman <nhorman@tuxdriver.com>
> > > > Cc: Vlad Yasevich <vyasevich@gmail.com>
> > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > Signed-off-by: Hillf Danton <hdanton@sina.com>
> > > > ---
> > > > net/sctp/stream.c | 1 +
> > > > 1 file changed, 1 insertion(+)
> > > >
> > > > diff --git a/net/sctp/stream.c b/net/sctp/stream.c
> > > > index 93ed078..d3e2f03 100644
> > > > --- a/net/sctp/stream.c
> > > > +++ b/net/sctp/stream.c
> > > > @@ -310,6 +310,7 @@ int sctp_send_reset_streams(struct sctp_association *asoc,
> > > >
> > > >       if (out && !sctp_stream_outq_is_empty(stream, str_nums, nstr_list)) {
> > > >               retval = -EAGAIN;
> > > > +             kfree(nstr_list);
> > > >               goto out;
> > > >       }
> > > >
> > > > --
> > > >
> > > >
> > > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> > Reviewed-by: Xin Long <lucien.xin@gmail.com>
> This fix is not applied, pls resend it with:
> to = network dev <netdev@vger.kernel.org>
> cc = davem@davemloft.net
> to = linux-sctp@vger.kernel.org
> cc = Neil Horman <nhorman@tuxdriver.com>
> cc = Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

Good catch, thanks Xin. I don't know what happened but I never got
this patch via netdev@, just the direct delivery. If it didn't reach
netdev@, that explains it.

  Marcelo


* Re: [PATCH 4.4 stable net] net: tcp: Fix use-after-free in tcp_write_xmit
From: maowenan @ 2019-07-25  2:06 UTC
  To: Eric Dumazet, davem, gregkh, netdev, linux-kernel
In-Reply-To: <be1aebb5-fee7-e079-d864-a2e4aa13007f@gmail.com>



On 2019/7/24 18:13, Eric Dumazet wrote:
> 
> 
> On 7/24/19 12:01 PM, Eric Dumazet wrote:
>>
>>
>> On 7/24/19 11:17 AM, Mao Wenan wrote:
>>> There is one report about tcp_write_xmit use-after-free with version 4.4.136:
>>
>> Current stable 4.4 is 4.4.186
>>
>> Can you check the bug is still there ?
>>
> 
> BTW, I tried the C repro and another bug showed up.
> 
> It looks like 4.4.186 misses other fixes :/

Do you have logs for this?
Is this bug a follow-up of this UAF?

> 
> [  180.811610] skbuff: skb_under_panic: text:ffffffff825ec6ea len:156 put:84 head:ffff8837dd1f0990 data:ffff8837dd1f098c tail:0x98 end:0xc0 dev:ip6gre0
> [  180.825037] ------------[ cut here ]------------
> [  180.829688] kernel BUG at net/core/skbuff.c:104!
> [  180.834316] invalid opcode: 0000 [#1] SMP KASAN
> [  180.839305] gsmi: Log Shutdown Reason 0x03
> [  180.843426] Modules linked in: ipip bonding bridge stp llc tun veth w1_therm wire i2c_mux_pca954x i2c_mux cdc_acm ehci_pci ehci_hcd ip_gre mlx4_en ib_uverbs mlx4_ib ib_sa ib_mad ib_core ib_addr mlx4_core
> [  180.862052] CPU: 22 PID: 1619 Comm: kworker/22:1 Not tainted 4.4.186-smp-DEV #41
> [  180.869475] Hardware name: Intel BIOS 2.56.0 10/19/2018
> [  180.876463] Workqueue: ipv6_addrconf addrconf_dad_work
> [  180.881658] task: ffff8837f1f59d80 ti: ffff8837eeeb8000 task.ti: ffff8837eeeb8000
> [  180.889171] RIP: 0010:[<ffffffff821ef26f>]  [<ffffffff821ef26f>] skb_panic+0x14f/0x210
> [  180.897162] RSP: 0018:ffff8837eeebf4b8  EFLAGS: 00010282
> [  180.902504] RAX: 0000000000000088 RBX: ffff8837eeeeb600 RCX: 0000000000000000
> [  180.909645] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff83508c00
> [  180.916854] RBP: ffff8837eeebf520 R08: 0000000000000016 R09: 0000000000000000
> [  180.924029] R10: ffff881fc8abf038 R11: 0000000000000007 R12: ffff881fc8abe720
> [  180.931213] R13: ffffffff82aa9e80 R14: 00000000000000c0 R15: 0000000000000098
> [  180.938390] FS:  0000000000000000(0000) GS:ffff8837ff280000(0000) knlGS:0000000000000000
> [  180.946519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  180.952290] CR2: 00007f519426f530 CR3: 00000037d37f2000 CR4: 0000000000160670
> [  180.959447] Stack:
> [  180.961458]  ffff8837dd1f098c 0000000000000098 00000000000000c0 ffff881fc8abe720
> [  180.968909]  ffffea00df747c00 ffff881fff404b40 ffff8837ff2a1a20 ffff8837eeebf5b8
> [  180.976371]  ffff8837eeeeb600 ffffffff825ec6ea 1ffff106fddd7eb6 ffff8837eeeeb600
> [  180.983848] Call Trace:
> [  180.986297]  [<ffffffff825ec6ea>] ? ip6gre_header+0xba/0xd50
> [  180.991962]  [<ffffffff821f0e01>] skb_push+0xc1/0x100
> [  180.997023]  [<ffffffff825ec6ea>] ip6gre_header+0xba/0xd50
> [  181.002519]  [<ffffffff8158dc16>] ? memcpy+0x36/0x40
> [  181.007509]  [<ffffffff825ec630>] ? ip6gre_changelink+0x6d0/0x6d0
> [  181.013629]  [<ffffffff82550741>] ? ndisc_constructor+0x5b1/0x770
> [  181.019728]  [<ffffffff82666861>] ? _raw_write_unlock_bh+0x41/0x50
> [  181.025924]  [<ffffffff8226540b>] ? __neigh_create+0xe6b/0x1670
> [  181.031851]  [<ffffffff8225817f>] neigh_connected_output+0x23f/0x480
> [  181.038219]  [<ffffffff824f61ec>] ip6_finish_output2+0x74c/0x1a90
> [  181.044324]  [<ffffffff810f1d33>] ? print_context_stack+0x73/0xf0
> [  181.050429]  [<ffffffff824f5aa0>] ? ip6_xmit+0x1700/0x1700
> [  181.055933]  [<ffffffff82304a28>] ? nf_hook_slow+0x118/0x1b0
> [  181.061617]  [<ffffffff82502d7a>] ip6_finish_output+0x2ba/0x580
> [  181.067546]  [<ffffffff82503179>] ip6_output+0x139/0x380
> [  181.072884]  [<ffffffff82503040>] ? ip6_finish_output+0x580/0x580
> [  181.079004]  [<ffffffff82502ac0>] ? ip6_fragment+0x31b0/0x31b0
> [  181.084852]  [<ffffffff82251b51>] ? dst_init+0x4b1/0x820
> [  181.090172]  [<ffffffff8158da45>] ? kasan_unpoison_shadow+0x35/0x50
> [  181.096437]  [<ffffffff8158da45>] ? kasan_unpoison_shadow+0x35/0x50
> [  181.102712]  [<ffffffff8254f3ca>] NF_HOOK_THRESH.constprop.22+0xca/0x180
> [  181.109421]  [<ffffffff8254f300>] ? ndisc_alloc_skb+0x340/0x340
> [  181.115338]  [<ffffffff8254d820>] ? compat_ipv6_setsockopt+0x180/0x180
> [  181.121874]  [<ffffffff8254fbc2>] ndisc_send_skb+0x742/0xd10
> [  181.127550]  [<ffffffff8254f480>] ? NF_HOOK_THRESH.constprop.22+0x180/0x180
> [  181.134516]  [<ffffffff821f2440>] ? skb_complete_tx_timestamp+0x280/0x280
> [  181.141311]  [<ffffffff8254e2b3>] ? ndisc_fill_addr_option+0x193/0x260
> [  181.147844]  [<ffffffff82553bd9>] ndisc_send_rs+0x179/0x2d0
> [  181.153426]  [<ffffffff8251e7df>] addrconf_dad_completed+0x41f/0x7c0
> [  181.159795]  [<ffffffff81297f78>] ? pick_next_entity+0x198/0x470
> [  181.165807]  [<ffffffff8251e3c0>] ? addrconf_rs_timer+0x4a0/0x4a0
> [  181.171918]  [<ffffffff81aab928>] ? find_next_bit+0x18/0x20
> [  181.177504]  [<ffffffff81a99ec9>] ? prandom_seed+0xd9/0x160
> [  181.183095]  [<ffffffff8251eef5>] addrconf_dad_work+0x375/0x9e0
> [  181.189024]  [<ffffffff8251eb80>] ? addrconf_dad_completed+0x7c0/0x7c0
> [  181.195576]  [<ffffffff81249d8f>] process_one_work+0x52f/0xf60
> [  181.201468]  [<ffffffff8124a89d>] worker_thread+0xdd/0xe80
> [  181.206977]  [<ffffffff8265cf0a>] ? __schedule+0x73a/0x16d0
> [  181.212550]  [<ffffffff8124a7c0>] ? process_one_work+0xf60/0xf60
> [  181.218572]  [<ffffffff8125a115>] kthread+0x205/0x2b0
> [  181.223633]  [<ffffffff81259f10>] ? kthread_worker_fn+0x4e0/0x4e0
> [  181.229743]  [<ffffffff81259f10>] ? kthread_worker_fn+0x4e0/0x4e0
> [  181.235834]  [<ffffffff8266726f>] ret_from_fork+0x3f/0x70
> [  181.241232]  [<ffffffff81259f10>] ? kthread_worker_fn+0x4e0/0x4e0
> 
> 
> .
> 


^ permalink raw reply

* Re: [PATCH net-next 08/11] net: hns3: add interrupt affinity support for misc interrupt
From: Yunsheng Lin @ 2019-07-25  2:05 UTC (permalink / raw)
  To: Saeed Mahameed, tanhuazhong@huawei.com, davem@davemloft.net
  Cc: lipeng321@huawei.com, yisen.zhuang@huawei.com,
	salil.mehta@huawei.com, linuxarm@huawei.com,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <67b32cdc72c0be03622e78899ac518d807ca7b85.camel@mellanox.com>

On 2019/7/25 3:24, Saeed Mahameed wrote:
> On Wed, 2019-07-24 at 11:18 +0800, Huazhong Tan wrote:
>> From: Yunsheng Lin <linyunsheng@huawei.com>
>>
>> The misc interrupt is used to schedule the reset and mailbox
>> subtask, and a 1 sec timer is used to schedule the service
>> subtask, which does periodic work.
>>
>> This patch sets the above three subtask's affinity using the
>> misc interrupt' affinity.
>>
>> Also this patch setups a affinity notify for misc interrupt to
>> allow user to change the above three subtask's affinity.
>>
>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>> Signed-off-by: Peng Li <lipeng321@huawei.com>
>> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
>> ---
>>  .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 59
>> ++++++++++++++++++++--
>>  .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |  4 ++
>>  2 files changed, 59 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
>> b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
>> index f345095..fe45986 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
>> @@ -1270,6 +1270,12 @@ static int hclge_configure(struct hclge_dev
>> *hdev)
>>  
>>  	hclge_init_kdump_kernel_config(hdev);
>>  
>> +	/* Set the init affinity based on pci func number */
>> +	i = cpumask_weight(cpumask_of_node(dev_to_node(&hdev->pdev-
>>> dev)));
>> +	i = i ? PCI_FUNC(hdev->pdev->devfn) % i : 0;
>> +	cpumask_set_cpu(cpumask_local_spread(i, dev_to_node(&hdev-
>>> pdev->dev)),
>> +			&hdev->affinity_mask);
>> +
>>  	return ret;
>>  }
>>  
>> @@ -2502,14 +2508,16 @@ static void hclge_mbx_task_schedule(struct
>> hclge_dev *hdev)
>>  {
>>  	if (!test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state) &&
>>  	    !test_and_set_bit(HCLGE_STATE_MBX_SERVICE_SCHED, &hdev-
>>> state))
>> -		schedule_work(&hdev->mbx_service_task);
>> +		queue_work_on(cpumask_first(&hdev->affinity_mask),
>> system_wq,
>> +			      &hdev->mbx_service_task);
>>  }
>>  
>>  static void hclge_reset_task_schedule(struct hclge_dev *hdev)
>>  {
>>  	if (!test_bit(HCLGE_STATE_REMOVING, &hdev->state) &&
>>  	    !test_and_set_bit(HCLGE_STATE_RST_SERVICE_SCHED, &hdev-
>>> state))
>> -		schedule_work(&hdev->rst_service_task);
>> +		queue_work_on(cpumask_first(&hdev->affinity_mask),
>> system_wq,
>> +			      &hdev->rst_service_task);
>>  }
>>  
>>  static void hclge_task_schedule(struct hclge_dev *hdev)
>> @@ -2517,7 +2525,8 @@ static void hclge_task_schedule(struct
>> hclge_dev *hdev)
>>  	if (!test_bit(HCLGE_STATE_DOWN, &hdev->state) &&
>>  	    !test_bit(HCLGE_STATE_REMOVING, &hdev->state) &&
>>  	    !test_and_set_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state))
>> -		(void)schedule_work(&hdev->service_task);
>> +		queue_work_on(cpumask_first(&hdev->affinity_mask),
>> system_wq,
>> +			      &hdev->service_task);
>>  }
>>  
>>  static int hclge_get_mac_link_status(struct hclge_dev *hdev)
>> @@ -2921,6 +2930,39 @@ static void hclge_get_misc_vector(struct
>> hclge_dev *hdev)
>>  	hdev->num_msi_used += 1;
>>  }
>>  
>> +static void hclge_irq_affinity_notify(struct irq_affinity_notify
>> *notify,
>> +				      const cpumask_t *mask)
>> +{
>> +	struct hclge_dev *hdev = container_of(notify, struct hclge_dev,
>> +					      affinity_notify);
>> +
>> +	cpumask_copy(&hdev->affinity_mask, mask);
>> +	del_timer_sync(&hdev->service_timer);
>> +	hdev->service_timer.expires = jiffies + HZ;
>> +	add_timer_on(&hdev->service_timer, cpumask_first(&hdev-
>>> affinity_mask));
>> +}
> 
> I don't see any relation between your misc irq vector and
> &hdev->service_timer, to me this looks like an abuse of the irq affinity API
> to allow the user to move the service timer affinity.

Hi, thanks for reviewing.

hdev->service_timer is used to schedule the periodic work
hdev->service_task. We want all the management work,
including hdev->service_task, to be bound to the same CPU
to improve cache and power efficiency, so it is better to move
the service timer affinity accordingly.

hdev->service_task is changed to a delayed work in the
next patch, "net: hns3: make hclge_service use delayed workqueue",
so the affinity of the delayed work's timer is automatically
set to the affinity of the delayed work itself. We will move the
"make hclge_service use delayed workqueue" patch before the
"add interrupt affinity support for misc interrupt" patch, so
we do not have to set the service timer affinity explicitly.

Also, there are three work items (mbx_service_task, service_task,
rst_service_task) in the hns3 driver; we plan to combine them
into one or two to improve efficiency and readability.
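
For reference, the initial CPU choice made in hclge_configure() in the
quoted patch can be sketched in plain userspace C. This is only an
illustration: pick_local_cpu_index() is a hypothetical name, and the
node-local CPU count is passed in directly instead of being derived
from cpumask_weight(cpumask_of_node(...)).

```c
#include <stdint.h>

/* Same idea as the kernel's PCI_FUNC(): the function number is the low 3 bits of devfn. */
static unsigned int pci_func(uint8_t devfn)
{
	return devfn & 0x07;
}

/*
 * Hypothetical helper: pick an index (0 .. node_cpus - 1) into the CPUs
 * local to the device's NUMA node, so that different PCI functions are
 * spread over different CPUs. Falls back to index 0 when the node
 * reports no CPUs, as the quoted patch does.
 */
static unsigned int pick_local_cpu_index(uint8_t devfn, unsigned int node_cpus)
{
	return node_cpus ? pci_func(devfn) % node_cpus : 0;
}
```

In the driver, the resulting index is what gets handed to
cpumask_local_spread() to obtain the actual CPU number.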

> 
>> +
>> +static void hclge_irq_affinity_release(struct kref *ref)
>> +{
>> +}
>> +
>> +static void hclge_misc_affinity_setup(struct hclge_dev *hdev)
>> +{
>> +	irq_set_affinity_hint(hdev->misc_vector.vector_irq,
>> +			      &hdev->affinity_mask);
>> +
>> +	hdev->affinity_notify.notify = hclge_irq_affinity_notify;
>> +	hdev->affinity_notify.release = hclge_irq_affinity_release;
>> +	irq_set_affinity_notifier(hdev->misc_vector.vector_irq,
>> +				  &hdev->affinity_notify);
>> +}
>> +
>> +static void hclge_misc_affinity_teardown(struct hclge_dev *hdev)
>> +{
>> +	irq_set_affinity_notifier(hdev->misc_vector.vector_irq, NULL);
>> +	irq_set_affinity_hint(hdev->misc_vector.vector_irq, NULL);
>> +}
>> +
>>  static int hclge_misc_irq_init(struct hclge_dev *hdev)
>>  {
>>  	int ret;
>> @@ -6151,7 +6193,10 @@ static void hclge_set_timer_task(struct
>> hnae3_handle *handle, bool enable)
>>  	struct hclge_dev *hdev = vport->back;
>>  
>>  	if (enable) {
>> -		mod_timer(&hdev->service_timer, jiffies + HZ);
>> +		del_timer_sync(&hdev->service_timer);
>> +		hdev->service_timer.expires = jiffies + HZ;
>> +		add_timer_on(&hdev->service_timer,
>> +			     cpumask_first(&hdev->affinity_mask));
>>  	} else {
>>  		del_timer_sync(&hdev->service_timer);
>>  		cancel_work_sync(&hdev->service_task);
>> @@ -8809,6 +8854,11 @@ static int hclge_init_ae_dev(struct
>> hnae3_ae_dev *ae_dev)
>>  	INIT_WORK(&hdev->rst_service_task, hclge_reset_service_task);
>>  	INIT_WORK(&hdev->mbx_service_task, hclge_mailbox_service_task);
>>  
>> +	/* Setup affinity after service timer setup because
>> add_timer_on
>> +	 * is called in affinity notify.
>> +	 */
>> +	hclge_misc_affinity_setup(hdev);
>> +
>>  	hclge_clear_all_event_cause(hdev);
>>  	hclge_clear_resetting_state(hdev);
>>  
>> @@ -8970,6 +9020,7 @@ static void hclge_uninit_ae_dev(struct
>> hnae3_ae_dev *ae_dev)
>>  	struct hclge_dev *hdev = ae_dev->priv;
>>  	struct hclge_mac *mac = &hdev->hw.mac;
>>  
>> +	hclge_misc_affinity_teardown(hdev);
>>  	hclge_state_uninit(hdev);
>>  
>>  	if (mac->phydev)
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
>> b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
>> index 6a12285..14df23c 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
>> @@ -864,6 +864,10 @@ struct hclge_dev {
>>  
>>  	DECLARE_KFIFO(mac_tnl_log, struct hclge_mac_tnl_stats,
>>  		      HCLGE_MAC_TNL_LOG_SIZE);
>> +
>> +	/* affinity mask and notify for misc interrupt */
>> +	cpumask_t affinity_mask;
>> +	struct irq_affinity_notify affinity_notify;
>>  };
>>  
>>  /* VPort level vlan tag configuration for TX direction */


^ permalink raw reply

* Re: [PATCH net-next 04/11] net: hns3: fix mis-counting IRQ vector numbers issue
From: tanhuazhong @ 2019-07-25  2:04 UTC (permalink / raw)
  To: Saeed Mahameed, davem@davemloft.net
  Cc: lipeng321@huawei.com, yisen.zhuang@huawei.com,
	salil.mehta@huawei.com, linuxarm@huawei.com,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	liuyonglong@huawei.com
In-Reply-To: <ad63b46dfb7e36d63d95866a023ef181af40aa76.camel@mellanox.com>



On 2019/7/25 2:28, Saeed Mahameed wrote:
> On Wed, 2019-07-24 at 11:18 +0800, Huazhong Tan wrote:
>> From: Yonglong Liu <liuyonglong@huawei.com>
>>
>> The num_msi_left means the vector numbers of NIC, but if the
>> PF supported RoCE, it contains the vector numbers of NIC and
>> RoCE(Not expected).
>>
>> This may cause interrupts lost in some case, because of the
>> NIC module used the vector resources which belongs to RoCE.
>>
>> This patch corrects the value of num_msi_left to be equals to
>> the vector numbers of NIC, and adjust the default tqp numbers
>> according to the value of num_msi_left.
>>
>> Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine &
>> Compatibility Layer Support")
>> Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
>> Signed-off-by: Peng Li <lipeng321@huawei.com>
>> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
>> ---
>>   drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c   |  5 ++++-
>>   drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 12
>> ++++++++++--
>>   2 files changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
>> b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
>> index 3c64d70..a59d13f 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
>> @@ -1470,13 +1470,16 @@ static int hclge_vport_setup(struct
>> hclge_vport *vport, u16 num_tqps)
>>   {
>>   	struct hnae3_handle *nic = &vport->nic;
>>   	struct hclge_dev *hdev = vport->back;
>> +	u16 alloc_tqps;
>>   	int ret;
>>   
>>   	nic->pdev = hdev->pdev;
>>   	nic->ae_algo = &ae_algo;
>>   	nic->numa_node_mask = hdev->numa_node_mask;
>>   
>> -	ret = hclge_knic_setup(vport, num_tqps,
>> +	alloc_tqps = min_t(u16, hdev->roce_base_msix_offset - 1,
> 
> 
> Why do you need the extra alloc_tqps ? just overwrite num_tqps, the
> original value is not needed afterwards.
> 

Yes, using num_tqps is better.
I will remove the extra alloc_tqps in V2.
Thanks.
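
A rough userspace sketch of what the clamp would look like when
num_tqps is overwritten in place. The helper name clamp_tqps() is made
up for illustration; the "- 1" is taken from the quoted patch as-is,
and the u16 truncation mirrors min_t(u16, ...).

```c
#include <stdint.h>

/*
 * Hypothetical helper: limit the requested TQP count to the number of
 * vectors available to the NIC, i.e. those below the RoCE base offset.
 */
static uint16_t clamp_tqps(uint16_t num_tqps, uint16_t roce_base_msix_offset)
{
	/* Cast keeps the arithmetic in 16 bits, like min_t(u16, ...). */
	uint16_t nic_vectors = (uint16_t)(roce_base_msix_offset - 1);

	if (num_tqps > nic_vectors)
		num_tqps = nic_vectors;

	return num_tqps;
}
```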

>> num_tqps);
>> +
>> +	ret = hclge_knic_setup(vport, alloc_tqps,
>>   			       hdev->num_tx_desc, hdev->num_rx_desc);
>>   	if (ret)
>>   		dev_err(&hdev->pdev->dev, "knic setup failed %d\n",
>> ret);
>>


^ permalink raw reply

* [PATCH] net: mscc: ocelot: null check devm_kcalloc
From: Navid Emamdoost @ 2019-07-25  1:56 UTC (permalink / raw)
  Cc: emamd001, kjlu, smccaman, secalert, Navid Emamdoost,
	Alexandre Belloni, Microchip Linux Driver Support,
	David S. Miller, netdev, linux-kernel

devm_kcalloc may fail and return NULL. Added the null check.

Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
---
 drivers/net/ethernet/mscc/ocelot_board.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/mscc/ocelot_board.c b/drivers/net/ethernet/mscc/ocelot_board.c
index 58bde1a9eacb..52377cfdc31a 100644
--- a/drivers/net/ethernet/mscc/ocelot_board.c
+++ b/drivers/net/ethernet/mscc/ocelot_board.c
@@ -257,6 +257,8 @@ static int mscc_ocelot_probe(struct platform_device *pdev)
 
 	ocelot->ports = devm_kcalloc(&pdev->dev, ocelot->num_phys_ports,
 				     sizeof(struct ocelot_port *), GFP_KERNEL);
+	if (!ocelot->ports)
+		return -ENOMEM;
 
 	INIT_LIST_HEAD(&ocelot->multicast);
 	ocelot_init(ocelot);
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH 2/2] arm: Add support for function error injection
From: Leo Yan @ 2019-07-25  1:48 UTC (permalink / raw)
  To: Russell King, Oleg Nesterov, Catalin Marinas, Will Deacon,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Arnd Bergmann, linux-arm-kernel, linux-kernel,
	netdev, bpf, Masami Hiramatsu, Justin He
In-Reply-To: <20190716111301.1855-3-leo.yan@linaro.org>

Hi Russell,

On Tue, Jul 16, 2019 at 07:13:01PM +0800, Leo Yan wrote:
> This patch implement regs_set_return_value() and
> override_function_with_return() to support function error injection
> for arm.
> 
> In the exception flow, we can update pt_regs::ARM_pc with
> pt_regs::ARM_lr so that can override the probed function return.

Gentle ping.

> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  arch/arm/Kconfig                       |  1 +
>  arch/arm/include/asm/error-injection.h | 13 +++++++++++++
>  arch/arm/include/asm/ptrace.h          |  5 +++++
>  arch/arm/lib/Makefile                  |  2 ++
>  arch/arm/lib/error-inject.c            | 19 +++++++++++++++++++
>  5 files changed, 40 insertions(+)
>  create mode 100644 arch/arm/include/asm/error-injection.h
>  create mode 100644 arch/arm/lib/error-inject.c
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 8869742a85df..f7932a5e29ea 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -74,6 +74,7 @@ config ARM
>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
>  	select HAVE_EXIT_THREAD
>  	select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
> +	select HAVE_FUNCTION_ERROR_INJECTION if !THUMB2_KERNEL
>  	select HAVE_FUNCTION_GRAPH_TRACER if !THUMB2_KERNEL && !CC_IS_CLANG
>  	select HAVE_FUNCTION_TRACER if !XIP_KERNEL
>  	select HAVE_GCC_PLUGINS
> diff --git a/arch/arm/include/asm/error-injection.h b/arch/arm/include/asm/error-injection.h
> new file mode 100644
> index 000000000000..da057e8ed224
> --- /dev/null
> +++ b/arch/arm/include/asm/error-injection.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: GPL-2.0+ */
> +
> +#ifndef __ASM_ERROR_INJECTION_H_
> +#define __ASM_ERROR_INJECTION_H_
> +
> +#include <linux/compiler.h>
> +#include <linux/linkage.h>
> +#include <asm/ptrace.h>
> +#include <asm-generic/error-injection.h>
> +
> +void override_function_with_return(struct pt_regs *regs);
> +
> +#endif /* __ASM_ERROR_INJECTION_H_ */
> diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
> index 91d6b7856be4..3b41f37b361a 100644
> --- a/arch/arm/include/asm/ptrace.h
> +++ b/arch/arm/include/asm/ptrace.h
> @@ -89,6 +89,11 @@ static inline long regs_return_value(struct pt_regs *regs)
>  	return regs->ARM_r0;
>  }
>  
> +static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
> +{
> +	regs->ARM_r0 = rc;
> +}
> +
>  #define instruction_pointer(regs)	(regs)->ARM_pc
>  
>  #ifdef CONFIG_THUMB2_KERNEL
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index 0bff0176db2c..d3d7430ecd76 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -43,3 +43,5 @@ ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
>    CFLAGS_xor-neon.o		+= $(NEON_FLAGS)
>    obj-$(CONFIG_XOR_BLOCKS)	+= xor-neon.o
>  endif
> +
> +obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
> diff --git a/arch/arm/lib/error-inject.c b/arch/arm/lib/error-inject.c
> new file mode 100644
> index 000000000000..96319d017114
> --- /dev/null
> +++ b/arch/arm/lib/error-inject.c
> @@ -0,0 +1,19 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/error-injection.h>
> +#include <linux/kprobes.h>
> +
> +void override_function_with_return(struct pt_regs *regs)
> +{
> +	/*
> +	 * 'regs' represents the state on entry of a predefined function in
> +	 * the kernel/module and which is captured on a kprobe.
> +	 *
> +	 * 'regs->ARM_lr' contains the the link register for the probed
> +	 * function and assign it to 'regs->ARM_pc', so when kprobe returns
> +	 * back from exception it will override the end of probed function
> +	 * and drirectly return to the predefined function's caller.
> +	 */
> +	regs->ARM_pc = regs->ARM_lr;
> +}
> +NOKPROBE_SYMBOL(override_function_with_return);
> -- 
> 2.17.1
> 
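
To make the intent easier to review, here is a small userspace model of
the two hooks. It is only a sketch: struct mock_regs and the mock_*
names are invented for illustration and stand in for pt_regs fields
ARM_r0, ARM_lr and ARM_pc.

```c
/* Hypothetical stand-in for the three pt_regs fields the patch touches. */
struct mock_regs {
	unsigned long r0;	/* return value register */
	unsigned long lr;	/* link register: address of the caller */
	unsigned long pc;	/* where execution resumes after the probe */
};

/* Models regs_set_return_value(): store the injected error code. */
static void mock_set_return_value(struct mock_regs *regs, unsigned long rc)
{
	regs->r0 = rc;
}

/* Models override_function_with_return(): skip the probed function body. */
static void mock_override_function_with_return(struct mock_regs *regs)
{
	regs->pc = regs->lr;
}
```

After both calls the probed function appears to have returned the
injected value straight to its caller.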

^ permalink raw reply

* Re: [PATCH 1/2] arm64: Add support for function error injection
From: Leo Yan @ 2019-07-25  1:42 UTC (permalink / raw)
  To: Russell King, Oleg Nesterov, Catalin Marinas, Will Deacon,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Arnd Bergmann, linux-arm-kernel, linux-kernel,
	netdev, bpf, Masami Hiramatsu, Justin He
In-Reply-To: <20190716111301.1855-2-leo.yan@linaro.org>

On Tue, Jul 16, 2019 at 07:13:00PM +0800, Leo Yan wrote:
> This patch implement regs_set_return_value() and
> override_function_with_return() to support function error injection
> for arm64.
> 
> In the exception flow, arm64's general register x30 contains the value
> for the link register; so we can just update pt_regs::pc with it rather
> than redirecting execution to a dummy function that returns.
> 
> This patch is heavily inspired by the commit 7cd01b08d35f ("powerpc:
> Add support for function error injection").
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Catalin, Will:  Gentle ping ...

> ---
>  arch/arm64/Kconfig                       |  1 +
>  arch/arm64/include/asm/error-injection.h | 13 +++++++++++++
>  arch/arm64/include/asm/ptrace.h          |  5 +++++
>  arch/arm64/lib/Makefile                  |  2 ++
>  arch/arm64/lib/error-inject.c            | 19 +++++++++++++++++++
>  5 files changed, 40 insertions(+)
>  create mode 100644 arch/arm64/include/asm/error-injection.h
>  create mode 100644 arch/arm64/lib/error-inject.c
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 697ea0510729..a6d9e622977d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -142,6 +142,7 @@ config ARM64
>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>  	select HAVE_FTRACE_MCOUNT_RECORD
>  	select HAVE_FUNCTION_TRACER
> +	select HAVE_FUNCTION_ERROR_INJECTION
>  	select HAVE_FUNCTION_GRAPH_TRACER
>  	select HAVE_GCC_PLUGINS
>  	select HAVE_HW_BREAKPOINT if PERF_EVENTS
> diff --git a/arch/arm64/include/asm/error-injection.h b/arch/arm64/include/asm/error-injection.h
> new file mode 100644
> index 000000000000..da057e8ed224
> --- /dev/null
> +++ b/arch/arm64/include/asm/error-injection.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: GPL-2.0+ */
> +
> +#ifndef __ASM_ERROR_INJECTION_H_
> +#define __ASM_ERROR_INJECTION_H_
> +
> +#include <linux/compiler.h>
> +#include <linux/linkage.h>
> +#include <asm/ptrace.h>
> +#include <asm-generic/error-injection.h>
> +
> +void override_function_with_return(struct pt_regs *regs);
> +
> +#endif /* __ASM_ERROR_INJECTION_H_ */
> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> index dad858b6adc6..3aafbbe218a2 100644
> --- a/arch/arm64/include/asm/ptrace.h
> +++ b/arch/arm64/include/asm/ptrace.h
> @@ -294,6 +294,11 @@ static inline unsigned long regs_return_value(struct pt_regs *regs)
>  	return regs->regs[0];
>  }
>  
> +static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
> +{
> +	regs->regs[0] = rc;
> +}
> +
>  /**
>   * regs_get_kernel_argument() - get Nth function argument in kernel
>   * @regs:	pt_regs of that context
> diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
> index 33c2a4abda04..f182ccb0438e 100644
> --- a/arch/arm64/lib/Makefile
> +++ b/arch/arm64/lib/Makefile
> @@ -33,3 +33,5 @@ UBSAN_SANITIZE_atomic_ll_sc.o	:= n
>  lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
>  
>  obj-$(CONFIG_CRC32) += crc32.o
> +
> +obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
> diff --git a/arch/arm64/lib/error-inject.c b/arch/arm64/lib/error-inject.c
> new file mode 100644
> index 000000000000..35661c2de4b0
> --- /dev/null
> +++ b/arch/arm64/lib/error-inject.c
> @@ -0,0 +1,19 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/error-injection.h>
> +#include <linux/kprobes.h>
> +
> +void override_function_with_return(struct pt_regs *regs)
> +{
> +	/*
> +	 * 'regs' represents the state on entry of a predefined function in
> +	 * the kernel/module and which is captured on a kprobe.
> +	 *
> +	 * 'regs->regs[30]' contains the the link register for the probed
> +	 * function and assign it to 'regs->pc', so when kprobe returns
> +	 * back from exception it will override the end of probed function
> +	 * and drirectly return to the predefined function's caller.
> +	 */
> +	regs->pc = regs->regs[30];
> +}
> +NOKPROBE_SYMBOL(override_function_with_return);
> -- 
> 2.17.1
> 

^ permalink raw reply

* Re: [PATCH] carl9170: remove set but not used variable 'udev'
From: Yuehaibing @ 2019-07-25  1:40 UTC (permalink / raw)
  To: Christian Lamparter
  Cc: Kalle Valo, linux-wireless, Netdev, kernel-janitors, linux-kernel,
	Hulk Robot
In-Reply-To: <CAAd0S9BvTfRyUVkQzcczyNkU_oeU5hNdK3KVQzLsU21b4JGNTQ@mail.gmail.com>

On 2019/7/25 3:42, Christian Lamparter wrote:
> On Wed, Jul 24, 2019 at 3:48 AM YueHaibing <yuehaibing@huawei.com> wrote:
>>
>> Fixes gcc '-Wunused-but-set-variable' warning:
>>
>> drivers/net/wireless/ath/carl9170/usb.c: In function 'carl9170_usb_disconnect':
>> drivers/net/wireless/ath/carl9170/usb.c:1110:21: warning:
>>  variable 'udev' set but not used [-Wunused-but-set-variable]
>>
>> It is not used, so can be removed.
>>
>> Reported-by: Hulk Robot <hulkci@huawei.com>
>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
>> ---
> Isn't this the same patch you sent earlier:
> 
> https://patchwork.kernel.org/patch/11027909/
> 
> From what I can tell, it's the same but with an extra [-next], I
> remember that I've acked that one
> but your patch now does not have it? Is this an oversight, because I'm
> the maintainer for this
> driver. So, in my opinion at least the "ack" should have some value
> and shouldn't be "ignored".
> 
> Look, from what I know, Kalle is not ignoring you, It's just that
> carl9170 is no longer top priority.
> So please be patient. As long as its queued in the patchwork it will
> get considered.

Thank you for the reminder. I forgot about the previous patch, and our CI robot
reported the warning again, so I sent the fix again. Sorry for the confusion.

Please just drop this one and use the previous one.

> 
> Cheers,
> Christian
> 
>>  drivers/net/wireless/ath/carl9170/usb.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/carl9170/usb.c b/drivers/net/wireless/ath/carl9170/usb.c
>> index 99f1897a775d..486957a04bd1 100644
>> --- a/drivers/net/wireless/ath/carl9170/usb.c
>> +++ b/drivers/net/wireless/ath/carl9170/usb.c
>> @@ -1107,12 +1107,10 @@ static int carl9170_usb_probe(struct usb_interface *intf,
>>  static void carl9170_usb_disconnect(struct usb_interface *intf)
>>  {
>>         struct ar9170 *ar = usb_get_intfdata(intf);
>> -       struct usb_device *udev;
>>
>>         if (WARN_ON(!ar))
>>                 return;
>>
>> -       udev = ar->udev;
>>         wait_for_completion(&ar->fw_load_wait);
>>
>>         if (IS_INITIALIZED(ar)) {
>>
>>
>>
> 
> .
> 


^ permalink raw reply

* linux-next: manual merge of the net-next tree with the net tree
From: Stephen Rothwell @ 2019-07-25  0:58 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List, Wen Yang,
	Sean Nyekjaer

[-- Attachment #1: Type: text/plain, Size: 1294 bytes --]

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/net/can/flexcan.c

between commit:

  e9f2a856e102 ("can: flexcan: fix an use-after-free in flexcan_setup_stop_mode()")

from the net tree and commit:

  915f9666421c ("can: flexcan: add support for DT property 'wakeup-source'")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/net/can/flexcan.c
index fcec8bcb53d6,09d8e623dcf6..000000000000
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@@ -1473,9 -1473,10 +1491,12 @@@ static int flexcan_setup_stop_mode(stru
  
  	device_set_wakeup_capable(&pdev->dev, true);
  
+ 	if (of_property_read_bool(np, "wakeup-source"))
+ 		device_set_wakeup_enable(&pdev->dev, true);
+ 
 -	return 0;
 +out_put_node:
 +	of_node_put(gpr_np);
 +	return ret;
  }
  
  static const struct of_device_id flexcan_of_match[] = {

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* linux-next: manual merge of the net-next tree with the jc_docs tree
From: Stephen Rothwell @ 2019-07-25  0:54 UTC (permalink / raw)
  To: David Miller, Networking, Jonathan Corbet
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Mauro Carvalho Chehab, Benjamin Poirier

[-- Attachment #1: Type: text/plain, Size: 1183 bytes --]

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  Documentation/PCI/pci-error-recovery.rst

between commit:

  4d2e26a38fbc ("docs: powerpc: convert docs to ReST and rename to *.rst")

from the jc_docs tree and commit:

  955315b0dc8c ("qlge: Move drivers/net/ethernet/qlogic/qlge/ to drivers/staging/qlge/")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc Documentation/PCI/pci-error-recovery.rst
index e5d450df06b4,7e30f43a9659..000000000000
--- a/Documentation/PCI/pci-error-recovery.rst
+++ b/Documentation/PCI/pci-error-recovery.rst
@@@ -421,7 -421,3 +421,6 @@@ That is, the recovery API only require
     - drivers/net/ixgbe
     - drivers/net/cxgb3
     - drivers/net/s2io.c
-    - drivers/net/qlge
 +
 +The End
 +-------

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* RE: [PATCH net-next 2/2] qed: Add API for flashing the nvm attributes.
From: Sudarsana Reddy Kalluru @ 2019-07-25  0:48 UTC (permalink / raw)
  To: Saeed Mahameed, davem@davemloft.net
  Cc: Ariel Elior, Michal Kalderon, netdev@vger.kernel.org
In-Reply-To: <24c09b029d00ba73aab58ef09a2e65ac545b3423.camel@mellanox.com>

> -----Original Message-----
> From: Saeed Mahameed <saeedm@mellanox.com>
> Sent: Thursday, July 25, 2019 1:13 AM
> To: Sudarsana Reddy Kalluru <skalluru@marvell.com>;
> davem@davemloft.net
> Cc: Ariel Elior <aelior@marvell.com>; Michal Kalderon
> <mkalderon@marvell.com>; netdev@vger.kernel.org
> Subject: [EXT] Re: [PATCH net-next 2/2] qed: Add API for flashing the nvm
> attributes.
> 
> External Email
> 
> ----------------------------------------------------------------------
> On Tue, 2019-07-23 at 21:51 -0700, Sudarsana Reddy Kalluru wrote:
> > The patch adds driver interface for reading the NVM config request and
> > update the attributes on nvm config flash partition.
> >
> 
> You didn't not use the get_cfg API you added in previous patch.
Thanks for your review. I will move this API to the next patch series, which we plan to send shortly.

> 
> Also can you please clarify how the user reads/write from/to NVM config
> ? i mean what UAPIs and tools are being used ?
The NVM config partition will be updated using the ethtool flash update command (i.e., ethtool -f), just like
the other flash partitions of the qed device. Example code path:
  ethtool flash_device --> qede_flash_device() --> qed_nvm_flash() --> qed_nvm_flash_cfg_write()
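
As a reference for the file layout consumed at the end of that path,
here is a userspace sketch of a parser for the Cfg_id-Length-Value
records. It is not the driver code: parse_nvm_cfg(), struct cfg_attr
and CFG_VALUE_MAX are names invented for this example, the size checks
are an addition of the sketch, and the 32-byte value limit simply
mirrors the buf[32] used in the quoted patch. Multi-byte fields are
read in host byte order, as the patch does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_VALUE_MAX 32	/* mirrors buf[32] in the quoted patch */

/* Hypothetical container for one parsed attribute. */
struct cfg_attr {
	uint16_t cfg_id;
	uint8_t len;
	uint8_t value[CFG_VALUE_MAX];
};

/*
 * Parse one NVM config command: 4-byte command index, 1-byte entity ID,
 * 1 reserved byte, 2-byte attribute count, then count records of
 * (2-byte config ID, 1-byte length, value bytes).
 * Returns the number of attributes stored in out[], or -1 if the input
 * is truncated, a value is too long, or out[] is too small.
 */
static int parse_nvm_cfg(const uint8_t *data, size_t size, uint8_t *entity_id,
			 struct cfg_attr *out, size_t max_out)
{
	size_t off = 4;		/* skip the command index */
	uint16_t count, i;

	if (size < 8)
		return -1;

	*entity_id = data[off];
	off += 2;		/* entity ID plus reserved byte */
	memcpy(&count, data + off, sizeof(count));
	off += 2;

	if (count > max_out)
		return -1;

	for (i = 0; i < count; i++) {
		if (size - off < 3)
			return -1;
		memcpy(&out[i].cfg_id, data + off, 2);
		off += 2;
		out[i].len = data[off++];

		/* Reject values that overflow the buffer or the input. */
		if (out[i].len > CFG_VALUE_MAX || size - off < out[i].len)
			return -1;
		memcpy(out[i].value, data + off, out[i].len);
		off += out[i].len;
	}

	return count;
}
```

Each parsed attribute corresponds to one qed_mcp_nvm_set_cfg() call in
the driver.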

> 
> > Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
> > Signed-off-by: Ariel Elior <aelior@marvell.com>
> > ---
> >  drivers/net/ethernet/qlogic/qed/qed_main.c | 65
> > ++++++++++++++++++++++++++++++
> >  include/linux/qed/qed_if.h                 |  1 +
> >  2 files changed, 66 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c
> > b/drivers/net/ethernet/qlogic/qed/qed_main.c
> > index 829dd60..54f00d2 100644
> > --- a/drivers/net/ethernet/qlogic/qed/qed_main.c
> > +++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
> > @@ -67,6 +67,8 @@
> >  #define QED_ROCE_QPS			(8192)
> >  #define QED_ROCE_DPIS			(8)
> >  #define QED_RDMA_SRQS                   QED_ROCE_QPS
> > +#define QED_NVM_CFG_SET_FLAGS		0xE
> > +#define QED_NVM_CFG_SET_PF_FLAGS	0x1E
> >
> >  static char version[] =
> >  	"QLogic FastLinQ 4xxxx Core Module qed " DRV_MODULE_VERSION
> > "\n";
> > @@ -2227,6 +2229,66 @@ static int qed_nvm_flash_image_validate(struct
> > qed_dev *cdev,
> >  	return 0;
> >  }
> >
> > +/* Binary file format -
> > + *     /----------------------------------------------------------------------\
> > + * 0B  |                       0x5 [command index]                            |
> > + * 4B  | Entity ID     | Reserved        |  Number of config attributes       |
> > + * 8B  | Config ID                       | Length        | Value              |
> > + *     |                                                                      |
> > + *     \----------------------------------------------------------------------/
> > + * There can be several Cfg_id-Length-Value sets as specified by 'Number of...'.
> > + * Entity ID - A non zero entity value for which the config need to be updated.
> > + */
> > +static int qed_nvm_flash_cfg_write(struct qed_dev *cdev, const u8
> > **data)
> > +{
> > +	struct qed_hwfn *hwfn = QED_LEADING_HWFN(cdev);
> > +	u8 entity_id, len, buf[32];
> > +	struct qed_ptt *ptt;
> > +	u16 cfg_id, count;
> > +	int rc = 0, i;
> > +	u32 flags;
> > +
> > +	ptt = qed_ptt_acquire(hwfn);
> > +	if (!ptt)
> > +		return -EAGAIN;
> > +
> > +	/* NVM CFG ID attribute header */
> > +	*data += 4;
> > +	entity_id = **data;
> > +	*data += 2;
> > +	count = *((u16 *)*data);
> > +	*data += 2;
> > +
> > +	DP_VERBOSE(cdev, NETIF_MSG_DRV,
> > +		   "Read config ids: entity id %02x num _attrs =
> > %0d\n",
> > +		   entity_id, count);
> > +	/* NVM CFG ID attributes */
> > +	for (i = 0; i < count; i++) {
> > +		cfg_id = *((u16 *)*data);
> > +		*data += 2;
> > +		len = **data;
> > +		(*data)++;
> > +		memcpy(buf, *data, len);
> > +		*data += len;
> > +
> > +		flags = entity_id ? QED_NVM_CFG_SET_PF_FLAGS :
> > +			QED_NVM_CFG_SET_FLAGS;
> > +
> > +		DP_VERBOSE(cdev, NETIF_MSG_DRV,
> > +			   "cfg_id = %d len = %d\n", cfg_id, len);
> > +		rc = qed_mcp_nvm_set_cfg(hwfn, ptt, cfg_id, entity_id,
> > flags,
> > +					 buf, len);
> > +		if (rc) {
> > +			DP_ERR(cdev, "Error %d configuring %d\n", rc,
> > cfg_id);
> > +			break;
> > +		}
> > +	}
> > +
> > +	qed_ptt_release(hwfn, ptt);
> > +
> > +	return rc;
> > +}
> > +
> >  static int qed_nvm_flash(struct qed_dev *cdev, const char *name)
> >  {
> >  	const struct firmware *image;
> > @@ -2268,6 +2330,9 @@ static int qed_nvm_flash(struct qed_dev *cdev,
> > const char *name)
> >  			rc = qed_nvm_flash_image_access(cdev, &data,
> >  							&check_resp);
> >  			break;
> > +		case QED_NVM_FLASH_CMD_NVM_CFG_ID:
> > +			rc = qed_nvm_flash_cfg_write(cdev, &data);
> > +			break;
> >  		default:
> >  			DP_ERR(cdev, "Unknown command %08x\n",
> > cmd_type);
> >  			rc = -EINVAL;
> > diff --git a/include/linux/qed/qed_if.h b/include/linux/qed/qed_if.h
> > index eef02e6..23805ea 100644
> > --- a/include/linux/qed/qed_if.h
> > +++ b/include/linux/qed/qed_if.h
> > @@ -804,6 +804,7 @@ enum qed_nvm_flash_cmd {
> >  	QED_NVM_FLASH_CMD_FILE_DATA = 0x2,
> >  	QED_NVM_FLASH_CMD_FILE_START = 0x3,
> >  	QED_NVM_FLASH_CMD_NVM_CHANGE = 0x4,
> > +	QED_NVM_FLASH_CMD_NVM_CFG_ID = 0x5,
> >  	QED_NVM_FLASH_CMD_NVM_MAX,
> >  };
> >

^ permalink raw reply

* Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()
From: Bob Liu @ 2019-07-25  0:41 UTC (permalink / raw)
  To: john.hubbard, Andrew Morton
  Cc: Alexander Viro, Anna Schumaker, David S . Miller,
	Dominique Martinet, Eric Van Hensbergen, Jason Gunthorpe,
	Jason Wang, Jens Axboe, Latchesar Ionkov, Michael S . Tsirkin,
	Miklos Szeredi, Trond Myklebust, Christoph Hellwig,
	Matthew Wilcox, linux-mm, LKML, ceph-devel, kvm, linux-block,
	linux-cifs, linux-fsdevel, linux-nfs, linux-rdma, netdev,
	samba-technical, v9fs-developer, virtualization, John Hubbard
In-Reply-To: <20190724042518.14363-1-jhubbard@nvidia.com>

On 7/24/19 12:25 PM, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Hi,
> 
> This is mostly Jerome's work, converting the block/bio and related areas
> to call put_user_page*() instead of put_page(). Because I've changed
> Jerome's patches, in some cases significantly, I'd like to get his
> feedback before we actually leave him listed as the author (he might
> want to disown some or all of these).
> 

Could you add some background to the commit log for people who don't have the context?
Why is this conversion needed, and what are the main differences?

Regards, -Bob

> I added a new patch, in order to make this work with Christoph Hellwig's
> recent overhaul to bio_release_pages(): "block: bio_release_pages: use
> flags arg instead of bool".
> 
> I've started the series with a patch that I've posted in another
> series ("mm/gup: add make_dirty arg to put_user_pages_dirty_lock()"[1]),
> because I'm not sure which of these will go in first, and this allows each
> to stand alone.
> 
> Testing: not much beyond build and boot testing has been done yet. And
> I'm not set up to even exercise all of it (especially the IB parts) at
> run time.
> 
> Anyway, changes here are:
> 
> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>   it is time to release the pages. That allows choosing between put_page()
>   and put_user_page*().
> 
> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>   parameter. Similar use as above.
> 
> * Change the block layer, and several file systems, to use
>   put_user_page*().
> 
> [1] https://lore.kernel.org/r/20190724012606.25844-2-jhubbard@nvidia.com
>     And please note the correction email that I posted as a follow-up,
>     if you're looking closely at that patch. :) The fixed version is
>     included here.
> 
> John Hubbard (3):
>   mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
>   block: bio_release_pages: use flags arg instead of bool
>   fs/ceph: fix a build warning: returning a value from void function
> 
> Jérôme Glisse (9):
>   iov_iter: add helper to test if an iter would use GUP v2
>   block: bio_release_pages: convert put_page() to put_user_page*()
>   block_dev: convert put_page() to put_user_page*()
>   fs/nfs: convert put_page() to put_user_page*()
>   vhost-scsi: convert put_page() to put_user_page*()
>   fs/cifs: convert put_page() to put_user_page*()
>   fs/fuse: convert put_page() to put_user_page*()
>   fs/ceph: convert put_page() to put_user_page*()
>   9p/net: convert put_page() to put_user_page*()
> 
>  block/bio.c                                |  81 ++++++++++++---
>  drivers/infiniband/core/umem.c             |   5 +-
>  drivers/infiniband/hw/hfi1/user_pages.c    |   5 +-
>  drivers/infiniband/hw/qib/qib_user_pages.c |   5 +-
>  drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
>  drivers/infiniband/sw/siw/siw_mem.c        |   8 +-
>  drivers/vhost/scsi.c                       |  13 ++-
>  fs/block_dev.c                             |  22 +++-
>  fs/ceph/debugfs.c                          |   2 +-
>  fs/ceph/file.c                             |  62 ++++++++---
>  fs/cifs/cifsglob.h                         |   3 +
>  fs/cifs/file.c                             |  22 +++-
>  fs/cifs/misc.c                             |  19 +++-
>  fs/direct-io.c                             |   2 +-
>  fs/fuse/dev.c                              |  22 +++-
>  fs/fuse/file.c                             |  53 +++++++---
>  fs/nfs/direct.c                            |  10 +-
>  include/linux/bio.h                        |  22 +++-
>  include/linux/mm.h                         |   5 +-
>  include/linux/uio.h                        |  11 ++
>  mm/gup.c                                   | 115 +++++++++------------
>  net/9p/trans_common.c                      |  14 ++-
>  net/9p/trans_common.h                      |   3 +-
>  net/9p/trans_virtio.c                      |  18 +++-
>  24 files changed, 357 insertions(+), 170 deletions(-)
> 


^ permalink raw reply

* Re: [PATCH bpf-next 01/10] libbpf: add .BTF.ext offset relocation section loading
From: Andrii Nakryiko @ 2019-07-25  0:37 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Yonghong Song, Kernel Team
In-Reply-To: <B5E772A5-C0D9-4697-ADE2-2A94C4AD37B5@fb.com>

On Wed, Jul 24, 2019 at 5:00 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Jul 24, 2019, at 12:27 PM, Andrii Nakryiko <andriin@fb.com> wrote:
> >
> > Add support for BPF CO-RE offset relocations. Add section/record
> > iteration macros for .BTF.ext. These macros are useful for iterating over
> > each .BTF.ext record, either for dumping out contents or later for BPF
> > CO-RE relocation handling.
> >
> > To enable other parts of libbpf to work with .BTF.ext contents, moved
> > a bunch of type definitions into libbpf_internal.h.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> > tools/lib/bpf/btf.c             | 64 +++++++++--------------
> > tools/lib/bpf/btf.h             |  4 ++
> > tools/lib/bpf/libbpf_internal.h | 91 +++++++++++++++++++++++++++++++++
> > 3 files changed, 118 insertions(+), 41 deletions(-)
> >

[...]

> > +
> > static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
> > {
> >       const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
> > @@ -1004,6 +979,13 @@ struct btf_ext *btf_ext__new(__u8 *data, __u32 size)
> >       if (err)
> >               goto done;
> >
> > +     /* check if there is offset_reloc_off/offset_reloc_len fields */
> > +     if (btf_ext->hdr->hdr_len < sizeof(struct btf_ext_header))
>
> This check will break when we add more optional sections to btf_ext_header.
> Maybe use offsetof() instead?

I didn't do it, because there are no fields after offset_reloc_len.
But now I think it might be OK to add a zero-sized marker field,
kind of like marking off the various versions of the btf_ext header?

Alternatively, I can add an offsetofend() macro somewhere in libbpf_internal.h.

Do you have any preference?
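
To make the offsetofend() option concrete, here is a minimal userspace sketch.
The struct ext_header layout and the hdr_has_offset_reloc() helper are
illustrative assumptions, not the actual libbpf definitions; only the idea
(compare hdr_len against the end of the last field this code needs, rather
than against sizeof() of the whole header) comes from the discussion above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a versioned header. Field names follow the
 * thread, but this layout is an assumption made for illustration. */
struct ext_header {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;

	/* fields present in every version of the header */
	uint32_t func_info_off;
	uint32_t func_info_len;
	uint32_t line_info_off;
	uint32_t line_info_len;

	/* optional fields appended later */
	uint32_t offset_reloc_off;
	uint32_t offset_reloc_len;
};

/* Offset of the first byte past MEMBER inside TYPE. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Returns 1 if a header that declares hdr_len bytes is long enough to
 * contain offset_reloc_off/offset_reloc_len, 0 otherwise. Unlike a
 * comparison against sizeof(struct ext_header), the result does not
 * change when more optional fields are appended to the struct. */
static int hdr_has_offset_reloc(uint32_t hdr_len)
{
	return hdr_len >= offsetofend(struct ext_header, offset_reloc_len);
}
```

With this layout the base header is 24 bytes and the offset_reloc fields end
at byte 32, so a 24-byte header is treated as "no relocation section" and
anything of 32 bytes or more is accepted, even after the struct grows.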

>
> > +             goto done;
> > +     err = btf_ext_setup_offset_reloc(btf_ext);
> > +     if (err)
> > +             goto done;
> > +
> > done:

[...]

^ permalink raw reply

* Re: [PATCH net] selftests/net: add missing gitignores (ipv6_flowlabel)
From: Willem de Bruijn @ 2019-07-25  0:22 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Miller, Network Development, oss-drivers, Quentin Monnet
In-Reply-To: <20190725000714.10200-1-jakub.kicinski@netronome.com>

On Wed, Jul 24, 2019 at 8:07 PM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> ipv6_flowlabel and ipv6_flowlabel_mgr are missing from
> gitignore.  Quentin points out that the original
> commit 3fb321fde22d ("selftests/net: ipv6 flowlabel")
> did add ignore entries, they are just missing the "ipv6_"
> prefix.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>

Acked-by: Willem de Bruijn <willemb@google.com>

Thanks Jakub

^ permalink raw reply

* Re: [PATCH v4 net-next 13/19] ionic: Add initial ethtool support
From: Saeed Mahameed @ 2019-07-25  0:17 UTC (permalink / raw)
  To: snelson@pensando.io, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <20190722214023.9513-14-snelson@pensando.io>

On Mon, 2019-07-22 at 14:40 -0700, Shannon Nelson wrote:
> Add in the basic ethtool callbacks for device information
> and control.
> 
> Signed-off-by: Shannon Nelson <snelson@pensando.io>
> ---
>  drivers/net/ethernet/pensando/ionic/Makefile  |   2 +-
>  .../net/ethernet/pensando/ionic/ionic_dev.h   |   3 +
>  .../ethernet/pensando/ionic/ionic_ethtool.c   | 495
> ++++++++++++++++++
>  .../ethernet/pensando/ionic/ionic_ethtool.h   |   9 +
>  .../net/ethernet/pensando/ionic/ionic_lif.c   |   2 +
>  .../net/ethernet/pensando/ionic/ionic_lif.h   |   8 +
>  6 files changed, 518 insertions(+), 1 deletion(-)
>  create mode 100644
> drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
>  create mode 100644
> drivers/net/ethernet/pensando/ionic/ionic_ethtool.h
> 
> diff --git a/drivers/net/ethernet/pensando/ionic/Makefile
> b/drivers/net/ethernet/pensando/ionic/Makefile
> index 7d9cdc5f02a1..9b19bf57a489 100644
> --- a/drivers/net/ethernet/pensando/ionic/Makefile
> +++ b/drivers/net/ethernet/pensando/ionic/Makefile
> @@ -3,5 +3,5 @@
>  
>  obj-$(CONFIG_IONIC) := ionic.o
>  
> -ionic-y := ionic_main.o ionic_bus_pci.o ionic_dev.o \
> +ionic-y := ionic_main.o ionic_bus_pci.o ionic_dev.o ionic_ethtool.o
> \
>  	   ionic_lif.o ionic_rx_filter.o ionic_debugfs.o
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> index 523927566925..bacc9c557329 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
> @@ -12,6 +12,9 @@
>  
>  #define IONIC_MIN_MTU			ETH_MIN_MTU
>  #define IONIC_MAX_MTU			9194
> +#define IONIC_MAX_TXRX_DESC		16384
> +#define IONIC_MIN_TXRX_DESC		16
> +#define IONIC_DEF_TXRX_DESC		4096
>  #define IONIC_LIFS_MAX			1024
>  
>  struct ionic_dev_bar {
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
> b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
> new file mode 100644
> index 000000000000..f7899be547c3
> --- /dev/null
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
> @@ -0,0 +1,495 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright(c) 2017 - 2019 Pensando Systems, Inc */
> +
> +#include <linux/module.h>
> +#include <linux/netdevice.h>
> +
> +#include "ionic.h"
> +#include "ionic_bus.h"
> +#include "ionic_lif.h"
> +#include "ionic_ethtool.h"
> +
> +static void ionic_get_drvinfo(struct net_device *netdev,
> +			      struct ethtool_drvinfo *drvinfo)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic *ionic = lif->ionic;
> +	struct ionic_dev *idev = &ionic->idev;
> +
> +	strlcpy(drvinfo->driver, DRV_NAME, sizeof(drvinfo->driver));
> +	strlcpy(drvinfo->version, DRV_VERSION, sizeof(drvinfo-
> >version));
> +	strlcpy(drvinfo->fw_version, idev->dev_info.fw_version,
> +		sizeof(drvinfo->fw_version));
> +	strlcpy(drvinfo->bus_info, ionic_bus_info(ionic),
> +		sizeof(drvinfo->bus_info));
> +}
> +
> +#define DEV_CMD_REG_VERSION 1
> +#define DEV_INFO_REG_COUNT  32
> +#define DEV_CMD_REG_COUNT   32
> +static int ionic_get_regs_len(struct net_device *netdev)
> +{
> +	return (DEV_INFO_REG_COUNT + DEV_CMD_REG_COUNT) * sizeof(u32);
> +}
> +
> +static void ionic_get_regs(struct net_device *netdev, struct
> ethtool_regs *regs,
> +			   void *p)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	unsigned int size;
> +
> +	regs->version = DEV_CMD_REG_VERSION;
> +
> +	size = DEV_INFO_REG_COUNT * sizeof(u32);
> +	memcpy_fromio(p, lif->ionic->idev.dev_info_regs->words, size);
> +
> +	size = DEV_CMD_REG_COUNT * sizeof(u32);
> +	memcpy_fromio(p, lif->ionic->idev.dev_cmd_regs->words, size);
> +}
> +
> +static int ionic_get_link_ksettings(struct net_device *netdev,
> +				    struct ethtool_link_ksettings *ks)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	int copper_seen = 0;
> +
> +	ethtool_link_ksettings_zero_link_mode(ks, supported);
> +
> +	/* The port_info data is found in a DMA space that the NIC
> keeps
> +	 * up-to-date, so there's no need to request the data from the
> +	 * NIC, we already have it in our memory space.
> +	 */
> +
> +	switch (le16_to_cpu(idev->port_info->status.xcvr.pid)) {
> +		/* Copper */
> +	case XCVR_PID_QSFP_100G_CR4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     100000baseCR4_Full
> );
> +		copper_seen++;
> +		break;
> +	case XCVR_PID_QSFP_40GBASE_CR4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     40000baseCR4_Full)
> ;
> +		copper_seen++;
> +		break;
> +	case XCVR_PID_SFP_25GBASE_CR_S:
> +	case XCVR_PID_SFP_25GBASE_CR_L:
> +	case XCVR_PID_SFP_25GBASE_CR_N:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     25000baseCR_Full);
> +		copper_seen++;
> +		break;
> +	case XCVR_PID_SFP_10GBASE_AOC:
> +	case XCVR_PID_SFP_10GBASE_CU:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseCR_Full);
> +		copper_seen++;
> +		break;
> +
> +		/* Fibre */
> +	case XCVR_PID_QSFP_100G_SR4:
> +	case XCVR_PID_QSFP_100G_AOC:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     100000baseSR4_Full
> );
> +		break;
> +	case XCVR_PID_QSFP_100G_LR4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     100000baseLR4_ER4_
> Full);
> +		break;
> +	case XCVR_PID_QSFP_100G_ER4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     100000baseLR4_ER4_
> Full);
> +		break;
> +	case XCVR_PID_QSFP_40GBASE_SR4:
> +	case XCVR_PID_QSFP_40GBASE_AOC:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     40000baseSR4_Full)
> ;
> +		break;
> +	case XCVR_PID_QSFP_40GBASE_LR4:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     40000baseLR4_Full)
> ;
> +		break;
> +	case XCVR_PID_SFP_25GBASE_SR:
> +	case XCVR_PID_SFP_25GBASE_AOC:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     25000baseSR_Full);
> +		break;
> +	case XCVR_PID_SFP_10GBASE_SR:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseSR_Full);
> +		break;
> +	case XCVR_PID_SFP_10GBASE_LR:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseLR_Full);
> +		break;
> +	case XCVR_PID_SFP_10GBASE_LRM:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseLRM_Full)
> ;
> +		break;
> +	case XCVR_PID_SFP_10GBASE_ER:
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> +						     10000baseER_Full);
> +		break;
> +	case XCVR_PID_UNKNOWN:
> +		break;
> +	default:
> +		dev_info(lif->ionic->dev, "unknown xcvr type pid=%d /
> 0x%x\n",
> +			 idev->port_info->status.xcvr.pid,
> +			 idev->port_info->status.xcvr.pid);
> +		break;
> +	}
> +	ethtool_link_ksettings_add_link_mode(ks, supported, FIBRE);
> +
> +	if (ionic_is_pf(lif->ionic))
> +		ethtool_link_ksettings_add_link_mode(ks, supported,
> Autoneg);
> +
> +	bitmap_copy(ks->link_modes.advertising, ks-
> >link_modes.supported,
> +		    __ETHTOOL_LINK_MODE_MASK_NBITS);
> +
> +	ethtool_link_ksettings_add_link_mode(ks, supported, FEC_NONE);
> +	ethtool_link_ksettings_add_link_mode(ks, supported, FEC_RS);
> +	ethtool_link_ksettings_add_link_mode(ks, supported, FEC_BASER);
> +
> +	if (idev->port_info->config.fec_type == PORT_FEC_TYPE_FC)
> +		ethtool_link_ksettings_add_link_mode(ks, advertising,
> FEC_BASER);
> +	else if (idev->port_info->config.fec_type == PORT_FEC_TYPE_RS)
> +		ethtool_link_ksettings_add_link_mode(ks, advertising,
> FEC_RS);
> +	else if (idev->port_info->config.fec_type ==
> PORT_FEC_TYPE_NONE)
> +		ethtool_link_ksettings_add_link_mode(ks, advertising,
> FEC_NONE);
> +
> +	ethtool_link_ksettings_add_link_mode(ks, supported, Pause);
> +	if (idev->port_info->config.pause_type)
> +		ethtool_link_ksettings_add_link_mode(ks, advertising,
> Pause);
> +
> +	if (idev->port_info->status.xcvr.phy == PHY_TYPE_COPPER ||
> +	    copper_seen) {
> +		ks->base.port = PORT_DA;
> +	} else if (idev->port_info->status.xcvr.phy == PHY_TYPE_FIBER)
> {
> +		ks->base.port = PORT_FIBRE;
> +	} else {
> +		ks->base.port = PORT_OTHER;
> +	}
> +
> +	ks->base.speed = le32_to_cpu(lif->info->status.link_speed);
> +
> +	if (idev->port_info->config.an_enable)
> +		ks->base.autoneg = AUTONEG_ENABLE;
> +
> +	if (le16_to_cpu(lif->info->status.link_status))
> +		ks->base.duplex = DUPLEX_FULL;
> +	else
> +		ks->base.duplex = DUPLEX_UNKNOWN;
> +
> +	return 0;
> +}
> +
> +static int ionic_set_link_ksettings(struct net_device *netdev,
> +				    const struct ethtool_link_ksettings
> *ks)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic *ionic = lif->ionic;
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	u8 fec_type = PORT_FEC_TYPE_NONE;
> +	u32 req_rs, req_fc;
> +	int err = 0;
> +
> +	/* set autoneg */
> +	if (ks->base.autoneg != idev->port_info->config.an_enable) {
> +		mutex_lock(&ionic->dev_cmd_lock);
> +		ionic_dev_cmd_port_autoneg(idev, ks->base.autoneg);
> +		err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +		mutex_unlock(&ionic->dev_cmd_lock);
> +		if (err)
> +			return err;
> +	}
> +
> +	/* set speed */
> +	if (ks->base.speed != le32_to_cpu(idev->port_info-
> >config.speed)) {
> +		mutex_lock(&ionic->dev_cmd_lock);
> +		ionic_dev_cmd_port_speed(idev, ks->base.speed);
> +		err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +		mutex_unlock(&ionic->dev_cmd_lock);
> +		if (err)
> +			return err;
> +	}
> +
> +	/* set FEC */
> +	req_rs = ethtool_link_ksettings_test_link_mode(ks, advertising,
> FEC_RS);
> +	req_fc = ethtool_link_ksettings_test_link_mode(ks, advertising,
> FEC_BASER);
> +	if (req_rs && req_fc) {
> +		netdev_info(netdev, "Only select one FEC mode at a
> time\n");
> +		return -EINVAL;
> +	} else if (req_fc &&
> +		   idev->port_info->config.fec_type !=
> PORT_FEC_TYPE_FC) {
> +		fec_type = PORT_FEC_TYPE_FC;
> +	} else if (req_rs &&
> +		   idev->port_info->config.fec_type !=
> PORT_FEC_TYPE_RS) {
> +		fec_type = PORT_FEC_TYPE_RS;
> +	} else if (!(req_rs | req_fc) &&
> +		   idev->port_info->config.fec_type !=
> PORT_FEC_TYPE_NONE) {
> +		fec_type = PORT_FEC_TYPE_NONE;
> +	}
> +
> +	if (fec_type != idev->port_info->config.fec_type) {
> +		mutex_lock(&ionic->dev_cmd_lock);
> +		ionic_dev_cmd_port_fec(idev, fec_type);
> +		err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +		mutex_unlock(&ionic->dev_cmd_lock);
> +		if (err)
> +			return err;
> +
> +		idev->port_info->config.fec_type = fec_type;
> +	}
> +
> +	return 0;
> +}
> +
> +static void ionic_get_pauseparam(struct net_device *netdev,
> +				 struct ethtool_pauseparam *pause)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	uint8_t pause_type = idev->port_info->config.pause_type;
> +
> +	pause->autoneg = 0;
> +
> +	if (pause_type) {
> +		pause->rx_pause = pause_type & IONIC_PAUSE_F_RX ? 1 :
> 0;
> +		pause->tx_pause = pause_type & IONIC_PAUSE_F_TX ? 1 :
> 0;
> +	}
> +}
> +
> +static int ionic_set_pauseparam(struct net_device *netdev,
> +				struct ethtool_pauseparam *pause)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic *ionic = lif->ionic;
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	u32 requested_pause;
> +	int err;
> +
> +	if (pause->autoneg == AUTONEG_ENABLE) {
> +		netdev_info(netdev, "Please use 'ethtool -s ...' to
> change autoneg\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	/* change both at the same time */
> +	requested_pause = PORT_PAUSE_TYPE_LINK;
> +	if (pause->rx_pause)
> +		requested_pause |= IONIC_PAUSE_F_RX;
> +	if (pause->tx_pause)
> +		requested_pause |= IONIC_PAUSE_F_TX;
> +
> +	if (requested_pause == idev->port_info->config.pause_type)
> +		return 0;
> +
> +	idev->port_info->config.pause_type = requested_pause;
> +
> +	mutex_lock(&ionic->dev_cmd_lock);
> +	ionic_dev_cmd_port_pause(idev, requested_pause);
> +	err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +	mutex_unlock(&ionic->dev_cmd_lock);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> +
> +static int ionic_get_coalesce(struct net_device *netdev,
> +			      struct ethtool_coalesce *coalesce)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +
> +	coalesce->tx_coalesce_usecs = lif->tx_coalesce_usecs;
> +	coalesce->rx_coalesce_usecs = lif->rx_coalesce_usecs;
> +
> +	return 0;
> +}
> +
> +static void ionic_get_ringparam(struct net_device *netdev,
> +				struct ethtool_ringparam *ring)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +
> +	ring->tx_max_pending = IONIC_MAX_TXRX_DESC;
> +	ring->tx_pending = lif->ntxq_descs;
> +	ring->rx_max_pending = IONIC_MAX_TXRX_DESC;
> +	ring->rx_pending = lif->nrxq_descs;
> +}
> +
> +static int ionic_set_ringparam(struct net_device *netdev,
> +			       struct ethtool_ringparam *ring)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	bool running;
> +
> +	if (ring->rx_mini_pending || ring->rx_jumbo_pending) {
> +		netdev_info(netdev, "Changing jumbo or mini descriptors
> not supported\n");
> +		return -EINVAL;
> +	}
> +
> +	if (!is_power_of_2(ring->tx_pending) ||
> +	    !is_power_of_2(ring->rx_pending)) {
> +		netdev_info(netdev, "Descriptor count must be a power
> of 2\n");
> +		return -EINVAL;
> +	}
> +
> +	/* if nothing to do return success */
> +	if (ring->tx_pending == lif->ntxq_descs &&
> +	    ring->rx_pending == lif->nrxq_descs)
> +		return 0;
> +
> +	while (test_and_set_bit(LIF_QUEUE_RESET, lif->state))
> +		usleep_range(200, 400);
> +
> +	running = test_bit(LIF_UP, lif->state);
> +	if (running)
> +		ionic_stop(netdev);
> +
> +	lif->ntxq_descs = ring->tx_pending;
> +	lif->nrxq_descs = ring->rx_pending;
> +
> +	if (running)
> +		ionic_open(netdev);
> +	clear_bit(LIF_QUEUE_RESET, lif->state);
> +
> +	return 0;
> +}
> +
> +static void ionic_get_channels(struct net_device *netdev,
> +			       struct ethtool_channels *ch)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +
> +	/* report maximum channels */
> +	ch->max_combined = lif->ionic->ntxqs_per_lif;
> +
> +	/* report current channels */
> +	ch->combined_count = lif->nxqs;
> +}
> +
> +static int ionic_set_channels(struct net_device *netdev,
> +			      struct ethtool_channels *ch)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	bool running;
> +
> +	if (!ch->combined_count || ch->other_count ||
> +	    ch->rx_count || ch->tx_count)
> +		return -EINVAL;
> +
> +	if (ch->combined_count == lif->nxqs)
> +		return 0;
> +
> +	while (test_and_set_bit(LIF_QUEUE_RESET, lif->state))
> +		usleep_range(200, 400);
> +

I see this pattern recurs a lot in the driver. I suggest adding a helper
function (wait_pending_reset_timeout) that returns a timeout errno after
a reasonable amount of time, especially in user-context flows.
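
A rough userspace sketch of what such a helper could look like. This is not
driver code: struct lif_state, the atomic_flag, and nanosleep() are stand-ins
assumed here for lif->state, test_and_set_bit()/clear_bit(), and
usleep_range(), and the 300us polling interval is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdatomic.h>
#include <time.h>

/* Userspace stand-in for the LIF_QUEUE_RESET bit in lif->state. */
struct lif_state {
	atomic_flag queue_reset;
};

/* Try to take the reset flag, polling until timeout_ms has been requested
 * in sleeps. Returns 0 when the caller owns the flag, -ETIMEDOUT otherwise.
 * The budget counts requested sleep time, not measured wall-clock time. */
static int wait_pending_reset_timeout(struct lif_state *st,
				      unsigned int timeout_ms)
{
	const struct timespec nap = { .tv_sec = 0, .tv_nsec = 300 * 1000 };
	unsigned int waited_us = 0;

	while (atomic_flag_test_and_set(&st->queue_reset)) {
		if (waited_us >= timeout_ms * 1000u)
			return -ETIMEDOUT;
		nanosleep(&nap, NULL);
		waited_us += 300;
	}

	return 0;
}

/* Release the flag once the queue reset has finished. */
static void reset_done(struct lif_state *st)
{
	atomic_flag_clear(&st->queue_reset);
}
```

Callers such as the set_ringparam and set_channels paths would then return
the error to userspace instead of spinning indefinitely when another reset
never completes.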

> +	running = test_bit(LIF_UP, lif->state);
> +	if (running)
> +		ionic_stop(netdev);
> +
> +	lif->nxqs = ch->combined_count;
> +
> +	if (running)
> +		ionic_open(netdev);
> +	clear_bit(LIF_QUEUE_RESET, lif->state);
> +
> +	return 0;
> +}
> +
> +static int ionic_get_module_info(struct net_device *netdev,
> +				 struct ethtool_modinfo *modinfo)
> +
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	struct xcvr_status *xcvr;
> +
> +	xcvr = &idev->port_info->status.xcvr;
> +
> +	/* report the module data type and length */
> +	switch (xcvr->sprom[0]) {
> +	case 0x03: /* SFP */
> +		modinfo->type = ETH_MODULE_SFF_8079;
> +		modinfo->eeprom_len = ETH_MODULE_SFF_8079_LEN;
> +		break;
> +	case 0x0D: /* QSFP */
> +	case 0x11: /* QSFP28 */
> +		modinfo->type = ETH_MODULE_SFF_8436;
> +		modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN;
> +		break;
> +	default:
> +		netdev_info(netdev, "unknown xcvr type 0x%02x\n",
> +			    xcvr->sprom[0]);
> +		break;
> +	}
> +
> +	return 0;
> +}
> +
> +static int ionic_get_module_eeprom(struct net_device *netdev,
> +				   struct ethtool_eeprom *ee,
> +				   u8 *data)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic_dev *idev = &lif->ionic->idev;
> +	struct xcvr_status *xcvr;
> +	u32 len;
> +
> +	/* The NIC keeps the module prom up-to-date in the DMA space
> +	 * so we can simply copy the module bytes into the data buffer.
> +	 */
> +	xcvr = &idev->port_info->status.xcvr;
> +	len = min_t(u32, sizeof(xcvr->sprom), ee->len);
> +	memcpy(data, xcvr->sprom, len);
> +
> +	return 0;
> +}
> +
> +static int ionic_nway_reset(struct net_device *netdev)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct ionic *ionic = lif->ionic;
> +	int err = 0;
> +
> +	/* flap the link to force auto-negotiation */
> +
> +	mutex_lock(&ionic->dev_cmd_lock);
> +
> +	ionic_dev_cmd_port_state(&ionic->idev, PORT_ADMIN_STATE_DOWN);
> +	err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +
> +	if (!err) {
> +		ionic_dev_cmd_port_state(&ionic->idev,
> PORT_ADMIN_STATE_UP);
> +		err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
> +	}
> +
> +	mutex_unlock(&ionic->dev_cmd_lock);
> +
> +	return err;
> +}
> +
> +static const struct ethtool_ops ionic_ethtool_ops = {
> +	.get_drvinfo		= ionic_get_drvinfo,
> +	.get_regs_len		= ionic_get_regs_len,
> +	.get_regs		= ionic_get_regs,
> +	.get_link		= ethtool_op_get_link,
> +	.get_link_ksettings	= ionic_get_link_ksettings,
> +	.get_coalesce		= ionic_get_coalesce,
> +	.get_ringparam		= ionic_get_ringparam,
> +	.set_ringparam		= ionic_set_ringparam,
> +	.get_channels		= ionic_get_channels,
> +	.set_channels		= ionic_set_channels,
> +	.get_module_info	= ionic_get_module_info,
> +	.get_module_eeprom	= ionic_get_module_eeprom,
> +	.get_pauseparam		= ionic_get_pauseparam,
> +	.set_pauseparam		= ionic_set_pauseparam,
> +	.set_link_ksettings	= ionic_set_link_ksettings,
> +	.nway_reset		= ionic_nway_reset,
> +};
> +
> +void ionic_ethtool_set_ops(struct net_device *netdev)
> +{
> +	netdev->ethtool_ops = &ionic_ethtool_ops;
> +}
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.h
> b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.h
> new file mode 100644
> index 000000000000..38b91b1d70ae
> --- /dev/null
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright(c) 2017 - 2019 Pensando Systems, Inc */
> +
> +#ifndef _IONIC_ETHTOOL_H_
> +#define _IONIC_ETHTOOL_H_
> +
> +void ionic_ethtool_set_ops(struct net_device *netdev);
> +
> +#endif /* _IONIC_ETHTOOL_H_ */
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> index f52af9cb6264..2bd8ce61c4a0 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> @@ -10,6 +10,7 @@
>  #include "ionic.h"
>  #include "ionic_bus.h"
>  #include "ionic_lif.h"
> +#include "ionic_ethtool.h"
>  #include "ionic_debugfs.h"
>  
>  static void ionic_lif_rx_mode(struct lif *lif, unsigned int
> rx_mode);
> @@ -980,6 +981,7 @@ static struct lif *ionic_lif_alloc(struct ionic
> *ionic, unsigned int index)
>  	lif->netdev = netdev;
>  	ionic->master_lif = lif;
>  	netdev->netdev_ops = &ionic_netdev_ops;
> +	ionic_ethtool_set_ops(netdev);
>  
>  	netdev->watchdog_timeo = 2 * HZ;
>  	netdev->min_mtu = IONIC_MIN_MTU;
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> index 9930b9390c8a..d8589a306aa5 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> @@ -111,6 +111,8 @@ struct lif {
>  	u64 last_eid;
>  	unsigned int neqs;
>  	unsigned int nxqs;
> +	unsigned int ntxq_descs;
> +	unsigned int nrxq_descs;
>  	unsigned int rx_mode;
>  	u64 hw_features;
>  	bool mc_overflow;
> @@ -124,6 +126,8 @@ struct lif {
>  
>  	struct rx_filters rx_filters;
>  	struct ionic_deferred deferred;
> +	u32 tx_coalesce_usecs;
> +	u32 rx_coalesce_usecs;
>  	unsigned long *dbid_inuse;
>  	unsigned int dbid_count;
>  	struct dentry *dentry;
> @@ -165,6 +169,10 @@ int ionic_lif_identify(struct ionic *ionic, u8
> lif_type,
>  		       union lif_identity *lif_ident);
>  int ionic_lifs_size(struct ionic *ionic);
>  
> +int ionic_open(struct net_device *netdev);
> +int ionic_stop(struct net_device *netdev);
> +int ionic_reset_queues(struct lif *lif);
> +
>  static inline void debug_stats_napi_poll(struct qcq *qcq,
>  					 unsigned int work_done)
>  {

^ permalink raw reply

* [PATCH net] selftests/net: add missing gitignores (ipv6_flowlabel)
From: Jakub Kicinski @ 2019-07-25  0:07 UTC (permalink / raw)
  To: davem; +Cc: netdev, oss-drivers, willemb, Jakub Kicinski, Quentin Monnet

ipv6_flowlabel and ipv6_flowlabel_mgr are missing from
gitignore.  Quentin points out that the original
commit 3fb321fde22d ("selftests/net: ipv6 flowlabel")
did add ignore entries, they are just missing the "ipv6_"
prefix.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/testing/selftests/net/.gitignore | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore
index 4ce0bc1612f5..c7cced739c34 100644
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore
@@ -17,7 +17,7 @@ tcp_inq
 tls
 txring_overwrite
 ip_defrag
+ipv6_flowlabel
+ipv6_flowlabel_mgr
 so_txtime
-flowlabel
-flowlabel_mgr
 tcp_fastopen_backup_key
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH v4 net-next 12/19] ionic: Add async link status check and basic stats
From: Saeed Mahameed @ 2019-07-25  0:04 UTC (permalink / raw)
  To: snelson@pensando.io, netdev@vger.kernel.org, davem@davemloft.net
In-Reply-To: <20190722214023.9513-13-snelson@pensando.io>

On Mon, 2019-07-22 at 14:40 -0700, Shannon Nelson wrote:
> Add code to handle the link status event, and wire up the
> basic netdev hardware stats.
> 
> Signed-off-by: Shannon Nelson <snelson@pensando.io>
> ---
>  .../net/ethernet/pensando/ionic/ionic_lif.c   | 116
> ++++++++++++++++++
>  .../net/ethernet/pensando/ionic/ionic_lif.h   |   1 +
>  2 files changed, 117 insertions(+)
> 
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> index efcda1337f91..f52af9cb6264 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> @@ -15,6 +15,7 @@
>  static void ionic_lif_rx_mode(struct lif *lif, unsigned int
> rx_mode);
>  static int ionic_lif_addr_add(struct lif *lif, const u8 *addr);
>  static int ionic_lif_addr_del(struct lif *lif, const u8 *addr);
> +static void ionic_link_status_check(struct lif *lif);
>  
>  static int ionic_set_nic_features(struct lif *lif, netdev_features_t
> features);
>  static int ionic_notifyq_clean(struct lif *lif, int budget);
> @@ -44,6 +45,9 @@ static void ionic_lif_deferred_work(struct
> work_struct *work)
>  		case DW_TYPE_RX_ADDR_DEL:
>  			ionic_lif_addr_del(lif, w->addr);
>  			break;
> +		case DW_TYPE_LINK_STATUS:
> +			ionic_link_status_check(lif);
> +			break;
>  		default:
>  			break;
>  		}
> @@ -69,6 +73,7 @@ int ionic_open(struct net_device *netdev)
>  
>  	set_bit(LIF_UP, lif->state);
>  
> +	ionic_link_status_check(lif);
>  	if (netif_carrier_ok(netdev))
>  		netif_tx_wake_all_queues(netdev);
>  
> @@ -151,6 +156,39 @@ static int ionic_adminq_napi(struct napi_struct
> *napi, int budget)
>  	return max(n_work, a_work);
>  }
>  
> +static void ionic_link_status_check(struct lif *lif)
> +{
> +	struct net_device *netdev = lif->netdev;
> +	u16 link_status;
> +	bool link_up;
> +
> +	clear_bit(LIF_LINK_CHECK_NEEDED, lif->state);
> +
> +	link_status = le16_to_cpu(lif->info->status.link_status);
> +	link_up = link_status == PORT_OPER_STATUS_UP;
> +
> +	/* filter out the no-change cases */
> +	if (link_up == netif_carrier_ok(netdev))
> +		return;
> +
> +	if (link_up) {
> +		netdev_info(netdev, "Link up - %d Gbps\n",
> +			    le32_to_cpu(lif->info->status.link_speed) /
> 1000);
> +
> +		if (test_bit(LIF_UP, lif->state)) {
> +			netif_tx_wake_all_queues(lif->netdev);
> +			netif_carrier_on(netdev);
> +		}
> +	} else {
> +		netdev_info(netdev, "Link down\n");
> +
> +		/* carrier off first to avoid watchdog timeout */
> +		netif_carrier_off(netdev);
> +		if (test_bit(LIF_UP, lif->state))
> +			netif_tx_stop_all_queues(netdev);
> +	}
> +}
> +
>  static bool ionic_notifyq_service(struct cq *cq, struct cq_info
> *cq_info)
>  {
>  	union notifyq_comp *comp = cq_info->cq_desc;
> @@ -182,6 +220,9 @@ static bool ionic_notifyq_service(struct cq *cq,
> struct cq_info *cq_info)
>  			    "  link_status=%d link_speed=%d\n",
>  			    le16_to_cpu(comp->link_change.link_status),
>  			    le32_to_cpu(comp->link_change.link_speed));
> +
> +		set_bit(LIF_LINK_CHECK_NEEDED, lif->state);
> +
>  		break;
>  	case EVENT_OPCODE_RESET:
>  		netdev_info(netdev, "Notifyq EVENT_OPCODE_RESET
> eid=%lld\n",
> @@ -222,10 +263,81 @@ static int ionic_notifyq_clean(struct lif *lif,
> int budget)
>  	if (work_done == budget)
>  		goto return_to_napi;
>  
> +	/* After outstanding events are processed we can check on
> +	 * the link status and any outstanding interrupt credits.
> +	 *
> +	 * We wait until here to check on the link status in case
> +	 * there was a long list of link events from a flap episode.
> +	 */
> +	if (test_bit(LIF_LINK_CHECK_NEEDED, lif->state)) {
> +		struct ionic_deferred_work *work;
> +
> +		work = kzalloc(sizeof(*work), GFP_ATOMIC);
> +		if (!work) {
> +			netdev_err(lif->netdev, "%s OOM\n", __func__);

Why not have a pre-allocated, dedicated lif->link_check_work instead of
allocating in atomic context on every link check event?
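
Roughly what I have in mind, as a userspace sketch only. struct lif_sketch,
struct deferred_work and both helpers are made-up names for illustration, not
the ionic definitions, and a real driver would need the list lock or an atomic
bit op around the "queued" test:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the ionic definitions. */
enum dw_type { DW_TYPE_NONE, DW_TYPE_LINK_STATUS };

struct deferred_work {
	enum dw_type type;
	bool queued;			/* true while the item is on the list */
	struct deferred_work *next;
};

struct lif_sketch {
	struct deferred_work link_check_work;	/* embedded, lives as long as the lif */
	struct deferred_work *head;		/* simple singly linked pending list */
};

/* Queue the embedded item; returns false if it was already pending. */
static bool link_check_enqueue(struct lif_sketch *lif)
{
	struct deferred_work *w = &lif->link_check_work;

	if (w->queued)
		return false;	/* a burst of link events collapses into one check */

	w->type = DW_TYPE_LINK_STATUS;
	w->queued = true;
	w->next = lif->head;
	lif->head = w;
	return true;
}

/* Pop one pending item, or return NULL when the list is empty. */
static struct deferred_work *deferred_dequeue(struct lif_sketch *lif)
{
	struct deferred_work *w = lif->head;

	if (w) {
		lif->head = w->next;
		w->next = NULL;
		w->queued = false;
	}
	return w;
}
```

With the item embedded there is no allocation in the notifyq path, so there is
no OOM branch to handle, and repeated events during a flap cost nothing extra.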

> +		} else {
> +			work->type = DW_TYPE_LINK_STATUS;
> +			ionic_lif_deferred_enqueue(&lif->deferred,
> work);
> +		}
> +	}
> +
>  return_to_napi:
>  	return work_done;
>  }
>  
> +static void ionic_get_stats64(struct net_device *netdev,
> +			      struct rtnl_link_stats64 *ns)
> +{
> +	struct lif *lif = netdev_priv(netdev);
> +	struct lif_stats *ls;
> +
> +	memset(ns, 0, sizeof(*ns));
> +	ls = &lif->info->stats;
> +
> +	ns->rx_packets = le64_to_cpu(ls->rx_ucast_packets) +
> +			 le64_to_cpu(ls->rx_mcast_packets) +
> +			 le64_to_cpu(ls->rx_bcast_packets);
> +
> +	ns->tx_packets = le64_to_cpu(ls->tx_ucast_packets) +
> +			 le64_to_cpu(ls->tx_mcast_packets) +
> +			 le64_to_cpu(ls->tx_bcast_packets);
> +
> +	ns->rx_bytes = le64_to_cpu(ls->rx_ucast_bytes) +
> +		       le64_to_cpu(ls->rx_mcast_bytes) +
> +		       le64_to_cpu(ls->rx_bcast_bytes);
> +
> +	ns->tx_bytes = le64_to_cpu(ls->tx_ucast_bytes) +
> +		       le64_to_cpu(ls->tx_mcast_bytes) +
> +		       le64_to_cpu(ls->tx_bcast_bytes);
> +
> +	ns->rx_dropped = le64_to_cpu(ls->rx_ucast_drop_packets) +
> +			 le64_to_cpu(ls->rx_mcast_drop_packets) +
> +			 le64_to_cpu(ls->rx_bcast_drop_packets);
> +
> +	ns->tx_dropped = le64_to_cpu(ls->tx_ucast_drop_packets) +
> +			 le64_to_cpu(ls->tx_mcast_drop_packets) +
> +			 le64_to_cpu(ls->tx_bcast_drop_packets);
> +
> +	ns->multicast = le64_to_cpu(ls->rx_mcast_packets);
> +
> +	ns->rx_over_errors = le64_to_cpu(ls->rx_queue_empty);
> +
> +	ns->rx_missed_errors = le64_to_cpu(ls->rx_dma_error) +
> +			       le64_to_cpu(ls->rx_queue_disabled) +
> +			       le64_to_cpu(ls->rx_desc_fetch_error) +
> +			       le64_to_cpu(ls->rx_desc_data_error);
> +
> +	ns->tx_aborted_errors = le64_to_cpu(ls->tx_dma_error) +
> +				le64_to_cpu(ls->tx_queue_disabled) +
> +				le64_to_cpu(ls->tx_desc_fetch_error) +
> +				le64_to_cpu(ls->tx_desc_data_error);
> +
> +	ns->rx_errors = ns->rx_over_errors +
> +			ns->rx_missed_errors;
> +
> +	ns->tx_errors = ns->tx_aborted_errors;
> +}
> +
>  static int ionic_lif_addr_add(struct lif *lif, const u8 *addr)
>  {
>  	struct ionic_admin_ctx ctx = {
> @@ -581,6 +693,7 @@ static int ionic_vlan_rx_kill_vid(struct
> net_device *netdev, __be16 proto,
>  static const struct net_device_ops ionic_netdev_ops = {
>  	.ndo_open               = ionic_open,
>  	.ndo_stop               = ionic_stop,
> +	.ndo_get_stats64	= ionic_get_stats64,
>  	.ndo_set_rx_mode	= ionic_set_rx_mode,
>  	.ndo_set_features	= ionic_set_features,
>  	.ndo_set_mac_address	= ionic_set_mac_address,
> @@ -1418,6 +1531,8 @@ static int ionic_lif_init(struct lif *lif)
>  
>  	set_bit(LIF_INITED, lif->state);
>  
> +	ionic_link_status_check(lif);
> +
>  	return 0;
>  
>  err_out_notifyq_deinit:
> @@ -1461,6 +1576,7 @@ int ionic_lifs_register(struct ionic *ionic)
>  		return err;
>  	}
>  

Are events (NotifyQ) enabled at this stage? If so, you might end up
racing ionic_link_status_check with itself.

> +	ionic_link_status_check(ionic->master_lif);
>  	ionic->master_lif->registered = true;
>  
>  	return 0;
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> index 20b4fa573f77..9930b9390c8a 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
> @@ -86,6 +86,7 @@ struct ionic_deferred {
>  enum lif_state_flags {
>  	LIF_INITED,
>  	LIF_UP,
> +	LIF_LINK_CHECK_NEEDED,
>  	LIF_QUEUE_RESET,
>  
>  	/* leave this as last */

^ permalink raw reply

* Re: [PATCH bpf-next 01/10] libbpf: add .BTF.ext offset relocation section loading
From: Song Liu @ 2019-07-25  0:00 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Yonghong Song, andrii.nakryiko@gmail.com, Kernel Team
In-Reply-To: <20190724192742.1419254-2-andriin@fb.com>



> On Jul 24, 2019, at 12:27 PM, Andrii Nakryiko <andriin@fb.com> wrote:
> 
> Add support for BPF CO-RE offset relocations. Add section/record
> iteration macros for .BTF.ext. These macro are useful for iterating over
> each .BTF.ext record, either for dumping out contents or later for BPF
> CO-RE relocation handling.
> 
> To enable other parts of libbpf to work with .BTF.ext contents, moved
> a bunch of type definitions into libbpf_internal.h.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
> tools/lib/bpf/btf.c             | 64 +++++++++--------------
> tools/lib/bpf/btf.h             |  4 ++
> tools/lib/bpf/libbpf_internal.h | 91 +++++++++++++++++++++++++++++++++
> 3 files changed, 118 insertions(+), 41 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 467224feb43b..4a36bc783848 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -42,47 +42,6 @@ struct btf {
> 	int fd;
> };
> 
> -struct btf_ext_info {
> -	/*
> -	 * info points to the individual info section (e.g. func_info and
> -	 * line_info) from the .BTF.ext. It does not include the __u32 rec_size.
> -	 */
> -	void *info;
> -	__u32 rec_size;
> -	__u32 len;
> -};
> -
> -struct btf_ext {
> -	union {
> -		struct btf_ext_header *hdr;
> -		void *data;
> -	};
> -	struct btf_ext_info func_info;
> -	struct btf_ext_info line_info;
> -	__u32 data_size;
> -};
> -
> -struct btf_ext_info_sec {
> -	__u32	sec_name_off;
> -	__u32	num_info;
> -	/* Followed by num_info * record_size number of bytes */
> -	__u8	data[0];
> -};
> -
> -/* The minimum bpf_func_info checked by the loader */
> -struct bpf_func_info_min {
> -	__u32   insn_off;
> -	__u32   type_id;
> -};
> -
> -/* The minimum bpf_line_info checked by the loader */
> -struct bpf_line_info_min {
> -	__u32	insn_off;
> -	__u32	file_name_off;
> -	__u32	line_off;
> -	__u32	line_col;
> -};
> -
> static inline __u64 ptr_to_u64(const void *ptr)
> {
> 	return (__u64) (unsigned long) ptr;
> @@ -831,6 +790,9 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
> 	/* The start of the info sec (including the __u32 record_size). */
> 	void *info;
> 
> +	if (ext_sec->len == 0)
> +		return 0;
> +
> 	if (ext_sec->off & 0x03) {
> 		pr_debug(".BTF.ext %s section is not aligned to 4 bytes\n",
> 		     ext_sec->desc);
> @@ -934,6 +896,19 @@ static int btf_ext_setup_line_info(struct btf_ext *btf_ext)
> 	return btf_ext_setup_info(btf_ext, &param);
> }
> 
> +static int btf_ext_setup_offset_reloc(struct btf_ext *btf_ext)
> +{
> +	struct btf_ext_sec_setup_param param = {
> +		.off = btf_ext->hdr->offset_reloc_off,
> +		.len = btf_ext->hdr->offset_reloc_len,
> +		.min_rec_size = sizeof(struct bpf_offset_reloc),
> +		.ext_info = &btf_ext->offset_reloc_info,
> +		.desc = "offset_reloc",
> +	};
> +
> +	return btf_ext_setup_info(btf_ext, &param);
> +}
> +
> static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
> {
> 	const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
> @@ -1004,6 +979,13 @@ struct btf_ext *btf_ext__new(__u8 *data, __u32 size)
> 	if (err)
> 		goto done;
> 
> +	/* check if there is offset_reloc_off/offset_reloc_len fields */
> +	if (btf_ext->hdr->hdr_len < sizeof(struct btf_ext_header))

This check will break when we add more optional sections to btf_ext_header.
Maybe use offsetof() instead?
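
Something like the following, as a standalone sketch. struct hdr_sketch,
OFFSETOFEND and hdr_has_offset_reloc are illustrative names only, and the
leading header fields are assumed to match the current btf_ext_header layout:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the header from the patch; leading fields are an assumption. */
struct hdr_sketch {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;

	uint32_t func_info_off;
	uint32_t func_info_len;
	uint32_t line_info_off;
	uint32_t line_info_len;

	/* optional part of .BTF.ext header */
	uint32_t offset_reloc_off;
	uint32_t offset_reloc_len;
};

/* End offset of a member: the smallest hdr_len that still contains it. */
#define OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* True when a header of hdr_len bytes carries both offset_reloc fields. */
static bool hdr_has_offset_reloc(uint32_t hdr_len)
{
	return hdr_len >= OFFSETOFEND(struct hdr_sketch, offset_reloc_len);
}
```

The comparison is then tied to the last field this code actually reads, so
appending further optional fields to the struct later does not change which
headers are accepted here.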

> +		goto done;
> +	err = btf_ext_setup_offset_reloc(btf_ext);
> +	if (err)
> +		goto done;
> +
> done:
> 	if (err) {
> 		btf_ext__free(btf_ext);
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index 88a52ae56fc6..287361ee1f6b 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -57,6 +57,10 @@ struct btf_ext_header {
> 	__u32	func_info_len;
> 	__u32	line_info_off;
> 	__u32	line_info_len;
> +
> +	/* optional part of .BTF.ext header */
> +	__u32	offset_reloc_off;
> +	__u32	offset_reloc_len;
> };
> 
> LIBBPF_API void btf__free(struct btf *btf);
> diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> index 2ac29bd36226..087ff512282f 100644
> --- a/tools/lib/bpf/libbpf_internal.h
> +++ b/tools/lib/bpf/libbpf_internal.h
> @@ -46,4 +46,95 @@ do {				\
> int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
> 			 const char *str_sec, size_t str_len);
> 
> +struct btf_ext_info {
> +	/*
> +	 * info points to the individual info section (e.g. func_info and
> +	 * line_info) from the .BTF.ext. It does not include the __u32 rec_size.
> +	 */
> +	void *info;
> +	__u32 rec_size;
> +	__u32 len;
> +};
> +
> +#define for_each_btf_ext_sec(seg, sec)					\
> +	for (sec = (seg)->info;						\
> +	     (void *)sec < (seg)->info + (seg)->len;			\
> +	     sec = (void *)sec + sizeof(struct btf_ext_info_sec) +	\
> +		   (seg)->rec_size * sec->num_info)
> +
> +#define for_each_btf_ext_rec(seg, sec, i, rec)				\
> +	for (i = 0, rec = (void *)&(sec)->data;				\
> +	     i < (sec)->num_info;					\
> +	     i++, rec = (void *)rec + (seg)->rec_size)
> +
> +struct btf_ext {
> +	union {
> +		struct btf_ext_header *hdr;
> +		void *data;
> +	};
> +	struct btf_ext_info func_info;
> +	struct btf_ext_info line_info;
> +	struct btf_ext_info offset_reloc_info;
> +	__u32 data_size;
> +};
> +
> +struct btf_ext_info_sec {
> +	__u32	sec_name_off;
> +	__u32	num_info;
> +	/* Followed by num_info * record_size number of bytes */
> +	__u8	data[0];
> +};
> +
> +/* The minimum bpf_func_info checked by the loader */
> +struct bpf_func_info_min {
> +	__u32   insn_off;
> +	__u32   type_id;
> +};
> +
> +/* The minimum bpf_line_info checked by the loader */
> +struct bpf_line_info_min {
> +	__u32	insn_off;
> +	__u32	file_name_off;
> +	__u32	line_off;
> +	__u32	line_col;
> +};
> +
> +/* The minimum bpf_offset_reloc checked by the loader
> + *
> + * Offset relocation captures the following data:
> + * - insn_off - instruction offset (in bytes) within a BPF program that needs
> + *   its insn->imm field to be relocated with actual offset;
> + * - type_id - BTF type ID of the "root" (containing) entity of a relocatable
> + *   offset;
> + * - access_str_off - offset into corresponding .BTF string section. String
> + *   itself encodes an accessed field using a sequence of field and array
> + *   indicies, separated by colon (:). It's conceptually very close to LLVM's
> + *   getelementptr ([0]) instruction's arguments for identifying offset to 
> + *   a field.
> + *
> + * Example to provide a better feel.
> + *
> + *   struct sample {
> + *       int a;
> + *       struct {
> + *           int b[10];
> + *       };
> + *   };
> + * 
> + *   struct sample *s = ...;
> + *   int x = &s->a;     // encoded as "0:0" (a is field #0)
> + *   int y = &s->b[5];  // encoded as "0:1:5" (b is field #1, arr elem #5)
> + *   int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
> + *
> + * type_id for all relocs in this example  will capture BTF type id of
> + * `struct sample`.
> + *
> + *   [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
> + */
> +struct bpf_offset_reloc {
> +	__u32   insn_off;
> +	__u32   type_id;
> +	__u32   access_str_off;
> +};
> +
> #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
> -- 
> 2.17.1
> 


^ permalink raw reply

* Re: [PATCH bpf-next 5/7] sefltests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG
From: Stanislav Fomichev @ 2019-07-24 23:52 UTC (permalink / raw)
  To: Song Liu
  Cc: Stanislav Fomichev, Networking, bpf, David S . Miller,
	Alexei Starovoitov, Daniel Borkmann, Willem de Bruijn,
	Petar Penkov
In-Reply-To: <CAPhsuW6Z2Bx66ZDOV-9jW+hsxKbZJxY-YFgP0rL_4QipAuptQA@mail.gmail.com>

On 07/24, Song Liu wrote:
> On Wed, Jul 24, 2019 at 10:11 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > bpf_flow.c: exit early unless FLOW_DISSECTOR_F_PARSE_1ST_FRAG is passed
> > in flags. Also, set ip_proto earlier, this makes sure we have correct
> > value with fragmented packets.
> >
> > Add selftest cases to test ipv4/ipv6 fragments and skip eth_get_headlen
> > tests that don't have FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag.
> >
> > eth_get_headlen calls flow dissector with
> > FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag so we can't run tests that
> > have different set of input flags against it.
> >
> > Cc: Willem de Bruijn <willemb@google.com>
> > Cc: Petar Penkov <ppenkov@google.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  .../selftests/bpf/prog_tests/flow_dissector.c | 129 ++++++++++++++++++
> >  tools/testing/selftests/bpf/progs/bpf_flow.c  |  28 +++-
> >  2 files changed, 151 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> > index c938283ac232..966cb3b06870 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> > @@ -5,6 +5,10 @@
> >  #include <linux/if_tun.h>
> >  #include <sys/uio.h>
> >
> > +#ifndef IP_MF
> > +#define IP_MF 0x2000
> > +#endif
> > +
> >  #define CHECK_FLOW_KEYS(desc, got, expected)                           \
> >         CHECK_ATTR(memcmp(&got, &expected, sizeof(got)) != 0,           \
> >               desc,                                                     \
> > @@ -49,6 +53,18 @@ struct ipv6_pkt {
> >         struct tcphdr tcp;
> >  } __packed;
> >
> > +struct ipv6_frag_pkt {
> > +       struct ethhdr eth;
> > +       struct ipv6hdr iph;
> > +       struct frag_hdr {
> > +               __u8 nexthdr;
> > +               __u8 reserved;
> > +               __be16 frag_off;
> > +               __be32 identification;
> > +       } ipf;
> > +       struct tcphdr tcp;
> > +} __packed;
> > +
> >  struct dvlan_ipv6_pkt {
> >         struct ethhdr eth;
> >         __u16 vlan_tci;
> > @@ -65,9 +81,11 @@ struct test {
> >                 struct ipv4_pkt ipv4;
> >                 struct svlan_ipv4_pkt svlan_ipv4;
> >                 struct ipv6_pkt ipv6;
> > +               struct ipv6_frag_pkt ipv6_frag;
> >                 struct dvlan_ipv6_pkt dvlan_ipv6;
> >         } pkt;
> >         struct bpf_flow_keys keys;
> > +       __u32 flags;
> >  };
> >
> >  #define VLAN_HLEN      4
> > @@ -143,6 +161,102 @@ struct test tests[] = {
> >                         .n_proto = __bpf_constant_htons(ETH_P_IPV6),
> >                 },
> >         },
> > +       {
> > +               .name = "ipv4-frag",
> > +               .pkt.ipv4 = {
> > +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> > +                       .iph.ihl = 5,
> > +                       .iph.protocol = IPPROTO_TCP,
> > +                       .iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> > +                       .iph.frag_off = __bpf_constant_htons(IP_MF),
> > +                       .tcp.doff = 5,
> > +                       .tcp.source = 80,
> > +                       .tcp.dest = 8080,
> > +               },
> > +               .keys = {
> > +                       .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> > +                       .nhoff = ETH_HLEN,
> > +                       .thoff = ETH_HLEN + sizeof(struct iphdr),
> > +                       .addr_proto = ETH_P_IP,
> > +                       .ip_proto = IPPROTO_TCP,
> > +                       .n_proto = __bpf_constant_htons(ETH_P_IP),
> > +                       .is_frag = true,
> > +                       .is_first_frag = true,
> > +                       .sport = 80,
> > +                       .dport = 8080,
> > +               },
> > +               .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> > +       },
> > +       {
> > +               .name = "ipv4-no-frag",
> > +               .pkt.ipv4 = {
> > +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> > +                       .iph.ihl = 5,
> > +                       .iph.protocol = IPPROTO_TCP,
> > +                       .iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> > +                       .iph.frag_off = __bpf_constant_htons(IP_MF),
> > +                       .tcp.doff = 5,
> > +                       .tcp.source = 80,
> > +                       .tcp.dest = 8080,
> > +               },
> > +               .keys = {
> > +                       .nhoff = ETH_HLEN,
> > +                       .thoff = ETH_HLEN + sizeof(struct iphdr),
> > +                       .addr_proto = ETH_P_IP,
> > +                       .ip_proto = IPPROTO_TCP,
> > +                       .n_proto = __bpf_constant_htons(ETH_P_IP),
> > +                       .is_frag = true,
> > +                       .is_first_frag = true,
> > +               },
> > +       },
> > +       {
> > +               .name = "ipv6-frag",
> > +               .pkt.ipv6_frag = {
> > +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> > +                       .iph.nexthdr = IPPROTO_FRAGMENT,
> > +                       .iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> > +                       .ipf.nexthdr = IPPROTO_TCP,
> > +                       .tcp.doff = 5,
> > +                       .tcp.source = 80,
> > +                       .tcp.dest = 8080,
> > +               },
> > +               .keys = {
> > +                       .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> > +                       .nhoff = ETH_HLEN,
> > +                       .thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> > +                               sizeof(struct frag_hdr),
> > +                       .addr_proto = ETH_P_IPV6,
> > +                       .ip_proto = IPPROTO_TCP,
> > +                       .n_proto = __bpf_constant_htons(ETH_P_IPV6),
> > +                       .is_frag = true,
> > +                       .is_first_frag = true,
> > +                       .sport = 80,
> > +                       .dport = 8080,
> > +               },
> > +               .flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> > +       },
> > +       {
> > +               .name = "ipv6-no-frag",
> > +               .pkt.ipv6_frag = {
> > +                       .eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> > +                       .iph.nexthdr = IPPROTO_FRAGMENT,
> > +                       .iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> > +                       .ipf.nexthdr = IPPROTO_TCP,
> > +                       .tcp.doff = 5,
> > +                       .tcp.source = 80,
> > +                       .tcp.dest = 8080,
> > +               },
> > +               .keys = {
> > +                       .nhoff = ETH_HLEN,
> > +                       .thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> > +                               sizeof(struct frag_hdr),
> > +                       .addr_proto = ETH_P_IPV6,
> > +                       .ip_proto = IPPROTO_TCP,
> > +                       .n_proto = __bpf_constant_htons(ETH_P_IPV6),
> > +                       .is_frag = true,
> > +                       .is_first_frag = true,
> > +               },
> > +       },
> >  };
> >
> >  static int create_tap(const char *ifname)
> > @@ -225,6 +339,13 @@ void test_flow_dissector(void)
> >                         .data_size_in = sizeof(tests[i].pkt),
> >                         .data_out = &flow_keys,
> >                 };
> > +               static struct bpf_flow_keys ctx = {};
> > +
> > +               if (tests[i].flags) {
> > +                       tattr.ctx_in = &ctx;
> > +                       tattr.ctx_size_in = sizeof(ctx);
> > +                       ctx.flags = tests[i].flags;
> > +               }
> >
> >                 err = bpf_prog_test_run_xattr(&tattr);
> >                 CHECK_ATTR(tattr.data_size_out != sizeof(flow_keys) ||
> > @@ -255,6 +376,14 @@ void test_flow_dissector(void)
> >                 struct bpf_prog_test_run_attr tattr = {};
> >                 __u32 key = 0;
> >
> > +               /* Don't run tests that are not marked as
> > +                * FLOW_DISSECTOR_F_PARSE_1ST_FRAG; eth_get_headlen
> > +                * sets this flag.
> > +                */
> > +
> > +               if (tests[i].flags != FLOW_DISSECTOR_F_PARSE_1ST_FRAG)
> > +                       continue;
> 
> Maybe test flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG == 0 instead?
> It is not necessary now, but might be useful in the future.
I'm not sure about this one. We want flags here to match flags
from eth_get_headlen:

	const unsigned int flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG;
	...
	if (!skb_flow_dissect_flow_keys_basic(..., flags))

Otherwise the test might break unexpectedly. So I'd rather manually
adjust a test here if eth_get_headlen flags change.

Maybe I should clarify the comment to signify that dependency? Because
currently it might be read as if we only care about
FLOW_DISSECTOR_F_PARSE_1ST_FRAG, but we really care about all flags
in eth_get_headlen; it just happens that it only has one right now.
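
To make the difference concrete, here is a standalone sketch. The _SKETCH
names and values are made up for illustration and are not the kernel
definitions; headlen_flags_sketch stands in for whatever eth_get_headlen
passes:

```c
#include <stdbool.h>

/* Illustrative flag values; not the kernel's definitions. */
#define F_PARSE_1ST_FRAG_SKETCH	0x1u
#define F_OTHER_SKETCH		0x2u

/* Flags the consumer is assumed to pass to the flow dissector. */
static const unsigned int headlen_flags_sketch = F_PARSE_1ST_FRAG_SKETCH;

/* Exact match: run only tests whose flags equal the consumer's flags. */
static bool run_exact(unsigned int test_flags)
{
	return test_flags == headlen_flags_sketch;
}

/* Mask test: run any test with the bit set, whatever else is set. */
static bool run_mask(unsigned int test_flags)
{
	return (test_flags & F_PARSE_1ST_FRAG_SKETCH) != 0;
}
```

A test carrying F_PARSE_1ST_FRAG_SKETCH | F_OTHER_SKETCH passes the mask test
but not the exact one, and that is the case where the expected keys would no
longer match what the consumer produces. Separately, a mask test needs the
parentheses shown above: in C, == binds tighter than &, so
"flags & X == 0" parses as "flags & (X == 0)".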

^ permalink raw reply
