Netdev List


* Re: [PATCH iproute2-next v2] rdma: Document IB device renaming option
From: Leon Romanovsky @ 2018-11-04 19:00 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, RDMA mailing list, Stephen Hemminger
In-Reply-To: <07ffbccc-916c-25d3-654e-7e457be3c133@gmail.com>

On Sun, Nov 04, 2018 at 07:26:24AM -0700, David Ahern wrote:
> On 11/4/18 4:54 AM, Leon Romanovsky wrote:
> > @@ -45,6 +53,11 @@ rdma dev show mlx5_3
> >  Shows the state of specified RDMA device.
> >  .RE
> >  .PP
> > +rdma dev set mlx5_3 name rdma_0
> > +.RS 4
> > +Renames the mlx5_3 device to be named rdma_0.
> > +.RE
> > +.PP
>
> You missed my other comment: Fix the "Renames .... to be named ..."

Sorry, my bad.

I'm resending it right now.

Thanks

^ permalink raw reply

* [PATCH iproute2-next v3] rdma: Document IB device renaming option
From: Leon Romanovsky @ 2018-11-04 19:11 UTC (permalink / raw)
  To: David Ahern; +Cc: Leon Romanovsky, netdev, RDMA mailing list, Stephen Hemminger

From: Leon Romanovsky <leonro@mellanox.com>

[leonro@server /]$ lspci |grep -i Ether
00:08.0 Ethernet controller: Red Hat, Inc. Virtio network device
00:09.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
[leonro@server /]$ sudo rdma dev
1: mlx5_0: node_type ca fw 3.8.9999 node_guid 5254:00c0:fe12:3455
sys_image_guid 5254:00c0:fe12:3455
[leonro@server /]$ sudo rdma dev set mlx5_0 name hfi1_0
[leonro@server /]$ sudo rdma dev
1: hfi1_0: node_type ca fw 3.8.9999 node_guid 5254:00c0:fe12:3455
sys_image_guid 5254:00c0:fe12:3455

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
Changelog:
v2->v3:
 * Dropped "to be named" words from example section of man
---
 man/man8/rdma-dev.8 | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/man/man8/rdma-dev.8 b/man/man8/rdma-dev.8
index 461681b6..7c275180 100644
--- a/man/man8/rdma-dev.8
+++ b/man/man8/rdma-dev.8
@@ -1,6 +1,6 @@
 .TH RDMA\-DEV 8 "06 Jul 2017" "iproute2" "Linux"
 .SH NAME
-rdmak-dev \- RDMA device configuration
+rdma-dev \- RDMA device configuration
 .SH SYNOPSIS
 .sp
 .ad l
@@ -22,10 +22,18 @@ rdmak-dev \- RDMA device configuration
 .B rdma dev show
 .RI "[ " DEV " ]"

+.ti -8
+.B rdma dev set
+.RI "[ " DEV " ]"
+.BR name
+.BR NEWNAME
+
 .ti -8
 .B rdma dev help

 .SH "DESCRIPTION"
+.SS rdma dev set - rename rdma device
+
 .SS rdma dev show - display rdma device attributes

 .PP
@@ -45,6 +53,11 @@ rdma dev show mlx5_3
 Shows the state of specified RDMA device.
 .RE
 .PP
+rdma dev set mlx5_3 name rdma_0
+.RS 4
+Renames the mlx5_3 device to rdma_0.
+.RE
+.PP

 .SH SEE ALSO
 .BR rdma (8),

^ permalink raw reply related

* Re: [PATCH 1/6] phy: Add max_bitrate attribute & phy_get_max_bitrate()
From: Faiz Abbas @ 2018-11-05  6:27 UTC (permalink / raw)
  To: Marc Kleine-Budde, linux-kernel, devicetree, netdev, linux-can
  Cc: wg, robh+dt, mark.rutland, kishon
In-Reply-To: <5e1a0b67-510a-5512-d477-0b363e4733fe@pengutronix.de>

Hi Marc,

On Saturday 03 November 2018 03:06 PM, Marc Kleine-Budde wrote:
> On 11/02/2018 08:26 PM, Faiz Abbas wrote:
>> In some subsystems (eg. CAN) the physical layer capabilities are
>> the limiting factor in the datarate of the device. Typically, the
>> physical layer transceiver does not provide a way to discover this
>> limitation at runtime. Thus this information needs to be represented as
>> a phy attribute which is read from the device tree.
>>
>> Therefore, add an optional max_bitrate attribute to the generic phy
>> subsystem. Also add the complementary API which enables the consumer
>> to get max_bitrate.
>>
>> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> 
> NACK - We already have such a functionality in the CAN subsystem.
> Please have a look at the patches:
> 
> e759c626d826 can: m_can: Support higher speed CAN-FD bitrates
> b54f9eea7667 dt-bindings: can: m_can: Document new can transceiver binding
> 2290aefa2e90 can: dev: Add support for limiting configured bitrate
> 54a7fbcc17bc dt-bindings: can: can-transceiver: Document new binding
> 

I remove the transceiver child node binding documentation in patch 5/6.

The existing implementation is pretty limiting as it just has a child
node with no associated device. What if a transceiver requires its own
configurations before it can start sending/receiving messages (for
example, my usecase requires it to pull the standby line low)?

I think that can be solved by implementing the transceiver as a phy and
exposing a generic get_max_bitrate API. That way, the transceiver device
can do all its startup configuration in the phy probe function.

In any case, please suggest a better way to implement the pull-GPIO-low
requirement if you have one.
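
The proposal above (transceiver modeled as a phy that does its own start-up
configuration and exposes a maximum bitrate) can be sketched in plain
user-space C. This is only an illustration under stated assumptions:
struct xcvr_phy, xcvr_phy_probe() and xcvr_limit_bitrate() are hypothetical
names, not the kernel's generic phy API, and the standby GPIO is reduced to
a flag.

```c
#include <stdint.h>

/* Hypothetical stand-in for a transceiver modeled as a phy. */
struct xcvr_phy {
	uint32_t max_bitrate;	/* bits per second; 0 means no known limit */
	int standby_low;	/* nonzero once the standby line is pulled low */
};

/* Hypothetical probe step: record the limit read from the device tree
 * and perform the start-up configuration the transceiver needs.
 */
static void xcvr_phy_probe(struct xcvr_phy *phy, uint32_t max_bitrate_from_dt)
{
	phy->max_bitrate = max_bitrate_from_dt;
	phy->standby_low = 1;	/* stands in for driving the standby GPIO low */
}

/* Consumer side: limit a requested bitrate to what the transceiver supports. */
static uint32_t xcvr_limit_bitrate(const struct xcvr_phy *phy, uint32_t requested)
{
	if (phy->max_bitrate && requested > phy->max_bitrate)
		return phy->max_bitrate;
	return requested;
}
```

With a 5 Mbit/s limit, a request for 8 Mbit/s would be reduced to 5 Mbit/s,
while a request for 1 Mbit/s would pass through unchanged.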

Thanks,
Faiz

^ permalink raw reply

* [PATCH 3/5] VSOCK: support receive mergeable rx buffer in guest
From: jiangyiwen @ 2018-11-05  7:47 UTC (permalink / raw)
  To: stefanha, Jason Wang; +Cc: netdev, kvm, virtualization

When the guest receives mergeable rx buffers, it can treat the
scattered rx buffers as one big buffer and then copy the data
to user space.

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
---
 include/linux/virtio_vsock.h            |  9 ++++
 net/vmw_vsock/virtio_transport.c        | 75 +++++++++++++++++++++++++++++----
 net/vmw_vsock/virtio_transport_common.c | 59 ++++++++++++++++++++++----
 3 files changed, 127 insertions(+), 16 deletions(-)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index da9e1fe..6be3cd7 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -13,6 +13,8 @@
 #define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE	(1024 * 4)
 #define VIRTIO_VSOCK_MAX_BUF_SIZE		0xFFFFFFFFUL
 #define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE		(1024 * 64)
+/* virtio_vsock_pkt + max_pkt_len(default MAX_PKT_BUF_SIZE) */
+#define VIRTIO_VSOCK_MAX_MRG_BUF_NUM ((VIRTIO_VSOCK_MAX_PKT_BUF_SIZE / PAGE_SIZE) + 1)

 /* Virtio-vsock feature */
 #define VIRTIO_VSOCK_F_MRG_RXBUF 0 /* Host can merge receive buffers. */
@@ -48,6 +50,11 @@ struct virtio_vsock_sock {
 	struct list_head rx_queue;
 };

+struct virtio_vsock_mrg_rxbuf {
+	void *buf;
+	u32 len;
+};
+
 struct virtio_vsock_pkt {
 	struct virtio_vsock_hdr	hdr;
 	struct virtio_vsock_mrg_rxbuf_hdr mrg_rxbuf_hdr;
@@ -59,6 +66,8 @@ struct virtio_vsock_pkt {
 	u32 len;
 	u32 off;
 	bool reply;
+	bool mergeable;
+	struct virtio_vsock_mrg_rxbuf mrg_rxbuf[VIRTIO_VSOCK_MAX_MRG_BUF_NUM];
 };

 struct virtio_vsock_pkt_info {
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 2040a9e..3557ad3 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -359,11 +359,62 @@ static bool virtio_transport_more_replies(struct virtio_vsock *vsock)
 	return val < virtqueue_get_vring_size(vq);
 }

+static struct virtio_vsock_pkt *receive_mergeable(struct virtqueue *vq,
+		struct virtio_vsock *vsock, unsigned int *total_len)
+{
+	struct virtio_vsock_pkt *pkt;
+	u16 num_buf;
+	void *page;
+	unsigned int len;
+	int i = 0;
+
+	page = virtqueue_get_buf(vq, &len);
+	if (!page)
+		return NULL;
+
+	*total_len = len;
+	vsock->rx_buf_nr--;
+
+	pkt = page;
+	num_buf = le16_to_cpu(pkt->mrg_rxbuf_hdr.num_buffers);
+	if (!num_buf || num_buf > VIRTIO_VSOCK_MAX_MRG_BUF_NUM)
+		goto err;
+
+	pkt->mergeable = true;
+	if (!le32_to_cpu(pkt->hdr.len))
+		return pkt;
+
+	len -= sizeof(struct virtio_vsock_pkt);
+	pkt->mrg_rxbuf[i].buf = page + sizeof(struct virtio_vsock_pkt);
+	pkt->mrg_rxbuf[i].len = len;
+	i++;
+
+	while (--num_buf) {
+		page = virtqueue_get_buf(vq, &len);
+		if (!page)
+			goto err;
+
+		*total_len += len;
+		vsock->rx_buf_nr--;
+
+		pkt->mrg_rxbuf[i].buf = page;
+		pkt->mrg_rxbuf[i].len = len;
+		i++;
+	}
+
+	return pkt;
+err:
+	virtio_transport_free_pkt(pkt);
+	return NULL;
+}
+
 static void virtio_transport_rx_work(struct work_struct *work)
 {
 	struct virtio_vsock *vsock =
 		container_of(work, struct virtio_vsock, rx_work);
 	struct virtqueue *vq;
+	size_t vsock_hlen = vsock->mergeable ? sizeof(struct virtio_vsock_pkt) :
+			sizeof(struct virtio_vsock_hdr);

 	vq = vsock->vqs[VSOCK_VQ_RX];

@@ -383,21 +434,29 @@ static void virtio_transport_rx_work(struct work_struct *work)
 				goto out;
 			}

-			pkt = virtqueue_get_buf(vq, &len);
-			if (!pkt) {
-				break;
-			}
+			if (likely(vsock->mergeable)) {
+				pkt = receive_mergeable(vq, vsock, &len);
+				if (!pkt)
+					break;

-			vsock->rx_buf_nr--;
+				pkt->len = le32_to_cpu(pkt->hdr.len);
+			} else {
+				pkt = virtqueue_get_buf(vq, &len);
+				if (!pkt) {
+					break;
+				}
+
+				vsock->rx_buf_nr--;
+			}

 			/* Drop short/long packets */
-			if (unlikely(len < sizeof(pkt->hdr) ||
-				     len > sizeof(pkt->hdr) + pkt->len)) {
+			if (unlikely(len < vsock_hlen ||
+				     len > vsock_hlen + pkt->len)) {
 				virtio_transport_free_pkt(pkt);
 				continue;
 			}

-			pkt->len = len - sizeof(pkt->hdr);
+			pkt->len = len - vsock_hlen;
 			virtio_transport_deliver_tap_pkt(pkt);
 			virtio_transport_recv_pkt(pkt);
 		}
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 3ae3a33..7bef1d5 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -272,14 +272,49 @@ static int virtio_transport_send_credit_update(struct vsock_sock *vsk,
 		 */
 		spin_unlock_bh(&vvs->rx_lock);

-		err = memcpy_to_msg(msg, pkt->buf + pkt->off, bytes);
-		if (err)
-			goto out;
+		if (pkt->mergeable) {
+			struct virtio_vsock_mrg_rxbuf *buf = pkt->mrg_rxbuf;
+			size_t mrg_copy_bytes, last_buf_total = 0, rxbuf_off;
+			size_t tmp_bytes = bytes;
+			int i;
+
+			for (i = 0; i < le16_to_cpu(pkt->mrg_rxbuf_hdr.num_buffers); i++) {
+				if (pkt->off > last_buf_total + buf[i].len) {
+					last_buf_total += buf[i].len;
+					continue;
+				}
+
+				rxbuf_off = pkt->off - last_buf_total;
+				mrg_copy_bytes = min(buf[i].len - rxbuf_off, tmp_bytes);
+				err = memcpy_to_msg(msg, buf[i].buf + rxbuf_off, mrg_copy_bytes);
+				if (err)
+					goto out;
+
+				tmp_bytes -= mrg_copy_bytes;
+				pkt->off += mrg_copy_bytes;
+				last_buf_total += buf[i].len;
+				if (!tmp_bytes)
+					break;
+			}
+
+			if (tmp_bytes) {
+				printk(KERN_WARNING "WARNING! bytes = %llu, "
+						"bytes = %llu\n",
+						(unsigned long long)bytes,
+						(unsigned long long)tmp_bytes);
+			}
+
+			total += (bytes - tmp_bytes);
+		} else {
+			err = memcpy_to_msg(msg, pkt->buf + pkt->off, bytes);
+			if (err)
+				goto out;
+
+			total += bytes;
+			pkt->off += bytes;
+		}

 		spin_lock_bh(&vvs->rx_lock);
-
-		total += bytes;
-		pkt->off += bytes;
 		if (pkt->off == pkt->len) {
 			virtio_transport_dec_rx_pkt(vvs, pkt);
 			list_del(&pkt->list);
@@ -1050,8 +1085,16 @@ void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt)

 void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
 {
-	kfree(pkt->buf);
-	kfree(pkt);
+	int i;
+
+	if (pkt->mergeable) {
+		for (i = 1; i < le16_to_cpu(pkt->mrg_rxbuf_hdr.num_buffers); i++)
+			free_page((unsigned long)pkt->mrg_rxbuf[i].buf);
+		free_page((unsigned long)(void *)pkt);
+	} else {
+		kfree(pkt->buf);
+		kfree(pkt);
+	}
 }
 EXPORT_SYMBOL_GPL(virtio_transport_free_pkt);

-- 
1.8.3.1
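
The copy loop in the mergeable branch walks a list of scattered buffers and
copies a byte range that is addressed by one logical offset. A minimal
user-space sketch of that idea follows. It is not the patch's code:
struct seg and copy_from_segs() are hypothetical names, and memcpy() stands
in for memcpy_to_msg().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical scattered buffer descriptor (buffer pointer plus length). */
struct seg {
	const void *buf;
	size_t len;
};

/* Copy up to 'bytes' bytes, starting at logical offset 'off' within the
 * concatenation of all segments, into 'dst'. Returns the bytes copied.
 */
static size_t copy_from_segs(const struct seg *segs, size_t nsegs,
			     size_t off, void *dst, size_t bytes)
{
	size_t seg_start = 0;	/* logical offset at which segs[i] begins */
	size_t copied = 0;
	size_t i;

	for (i = 0; i < nsegs && copied < bytes; i++) {
		size_t seg_end = seg_start + segs[i].len;

		if (off < seg_end) {
			size_t in_seg = off - seg_start;
			size_t n = segs[i].len - in_seg;

			if (n > bytes - copied)
				n = bytes - copied;
			memcpy((char *)dst + copied,
			       (const char *)segs[i].buf + in_seg, n);
			copied += n;
			off += n;
		}
		seg_start = seg_end;
	}
	return copied;
}
```

A return value smaller than 'bytes' corresponds to the case the patch
reports with its printk warning: the segments ran out before the request
was satisfied.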

^ permalink raw reply related

* Re: [PATCH 0/5] Use common cordic algorithm for b43
From: Kalle Valo @ 2018-11-05  8:02 UTC (permalink / raw)
  To: Priit Laes
  Cc: linux-wireless, b43-dev, netdev, linux-kernel,
	brcm80211-dev-list.pdl, brcm80211-dev-list
In-Reply-To: <cover.f21c8e62e188620d586edb3f77514e6237122c4c.1541238842.git-series.plaes@plaes.org>

Priit Laes <plaes@plaes.org> writes:

> b43 wireless driver included internal implementation of cordic
> algorithm which has now been removed in favor of library
> implementation.
>
> During the process, the brcmfmac driver was also cleaned.
>
> Please note that this series is only compile-tested, as I
> do not have access to the hardware.
>
> Priit Laes (5):
>   lib: cordic: Move cordic macros and defines to header file
>   brcmfmac: Use common CORDIC_FLOAT macro from lib
>   brcmfmac: Drop unused cordic defines and macros
>   b43: Use common cordic algorithm from kernel lib
>   b43: Drop internal cordic algorithm implementation
>
>  drivers/net/wireless/broadcom/b43/Kconfig                      |  1 +-
>  drivers/net/wireless/broadcom/b43/phy_common.c                 | 47 +-------
>  drivers/net/wireless/broadcom/b43/phy_common.h                 |  9 +-
>  drivers/net/wireless/broadcom/b43/phy_lp.c                     | 13 +-
>  drivers/net/wireless/broadcom/b43/phy_n.c                      | 13 +-
>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_int.h |  7 +-
>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c |  4 +-
>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c   |  4 +-
>  include/linux/cordic.h                                         |  9 +-
>  lib/cordic.c                                                   | 23 +---
>  10 files changed, 35 insertions(+), 95 deletions(-)

I don't see patch 1 in linux-wireless patchwork:

https://patchwork.kernel.org/project/linux-wireless/list/?series=38033&state=*

Via which tree are you planning to push these? These could potentially
go via my wireless-drivers-next tree (if review goes ok) but I need to
have all five patches in patchwork.

Also I don't see MAINTAINERS entry for cordic.[c|h], that would be good
to have as well.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH 0/5] Use common cordic algorithm for b43
From: Kalle Valo @ 2018-11-05  8:07 UTC (permalink / raw)
  To: Priit Laes
  Cc: linux-wireless, b43-dev, netdev, linux-kernel,
	brcm80211-dev-list.pdl, brcm80211-dev-list
In-Reply-To: <87muqoar5i.fsf@purkki.adurom.net>

Kalle Valo <kvalo@codeaurora.org> writes:

> Priit Laes <plaes@plaes.org> writes:
>
>> b43 wireless driver included internal implementation of cordic
>> algorithm which has now been removed in favor of library
>> implementation.
>>
>> During the process, the brcmfmac driver was also cleaned.
>>
>> Please note that this series is only compile-tested, as I
>> do not have access to the hardware.
>>
>> Priit Laes (5):
>>   lib: cordic: Move cordic macros and defines to header file
>>   brcmfmac: Use common CORDIC_FLOAT macro from lib
>>   brcmfmac: Drop unused cordic defines and macros
>>   b43: Use common cordic algorithm from kernel lib
>>   b43: Drop internal cordic algorithm implementation
>>
>>  drivers/net/wireless/broadcom/b43/Kconfig                      |  1 +-
>>  drivers/net/wireless/broadcom/b43/phy_common.c                 | 47 +-------
>>  drivers/net/wireless/broadcom/b43/phy_common.h                 |  9 +-
>>  drivers/net/wireless/broadcom/b43/phy_lp.c                     | 13 +-
>>  drivers/net/wireless/broadcom/b43/phy_n.c                      | 13 +-
>>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_int.h |  7 +-
>>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c |  4 +-
>>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c   |  4 +-
>>  include/linux/cordic.h                                         |  9 +-
>>  lib/cordic.c                                                   | 23 +---
>>  10 files changed, 35 insertions(+), 95 deletions(-)
>
> I don't see patch 1 in linux-wireless patchwork:
>
> https://patchwork.kernel.org/project/linux-wireless/list/?series=38033&state=*
>
> Via which tree are you planning to push these? These could potentially
> go via my wireless-drivers-next tree (if review goes ok) but I need to
> have all five patches in patchwork.

Oh, forgot to mention that please resubmit all five patches, not just
patch 1, because then it's easier for me to apply them.

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches#resubmit_the_whole_patchset

-- 
Kalle Valo

^ permalink raw reply

* [PATCH] kselftests/bpf: use ping6 as the default ipv6 ping binary when it exists
From: Li Zhijian @ 2018-11-05  8:13 UTC (permalink / raw)
  To: Song Liu, shuah, netdev, linux-kselftest
  Cc: linux-kernel, ast, daniel, Li Zhijian, Philip Li

Commit deee2cae27 ("kselftests/bpf: use ping6 as the default ipv6 ping
binary if it exists") fixed similar issues in the shell scripts, but
missed the same issue in the C code.

Fixes: 371e4fcc9d96 ("selftests/bpf: cgroup local storage-based network counters")
CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 tools/testing/selftests/bpf/test_netcnt.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_netcnt.c b/tools/testing/selftests/bpf/test_netcnt.c
index 7887df6..44889bc 100644
--- a/tools/testing/selftests/bpf/test_netcnt.c
+++ b/tools/testing/selftests/bpf/test_netcnt.c
@@ -81,7 +81,11 @@ int main(int argc, char **argv)
 		goto err;
 	}
 
-	assert(system("ping localhost -6 -c 10000 -f -q > /dev/null") == 0);
+	if (system("which ping6 &>/dev/null") == 0) {
+		assert(system("ping6 localhost -c 10000 -f -q > /dev/null") == 0);
+	} else {
+		assert(system("ping -6 localhost -c 10000 -f -q > /dev/null") == 0);
+	}
 
 	if (bpf_prog_query(cgroup_fd, BPF_CGROUP_INET_EGRESS, 0, NULL, NULL,
 			   &prog_cnt)) {
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH 0/5] Use common cordic algorithm for b43
From: Arend van Spriel @ 2018-11-05  8:24 UTC (permalink / raw)
  To: Priit Laes, linux-wireless, b43-dev
  Cc: netdev, linux-kernel, brcm80211-dev-list.pdl, brcm80211-dev-list
In-Reply-To: <cover.f21c8e62e188620d586edb3f77514e6237122c4c.1541238842.git-series.plaes@plaes.org>

On 11/3/2018 10:59 AM, Priit Laes wrote:
> b43 wireless driver included internal implementation of cordic
> algorithm which has now been removed in favor of library
> implementation.
>
> During the process, the brcmfmac driver was also cleaned.

You actually touched the *brcmsmac* driver, not brcmfmac. Please fix the 
driver prefix where appropriate in this series, ie. patches 2 and 3.

> Please note that this series is only compile-tested, as I
> do not have access to the hardware.

I can/will verify brcmsmac. As Kalle mentioned it makes more sense to 
push the 'lib: cordic:' patch through the wireless tree as well as it 
only is used by wireless drivers right now.

Regards,
Arend

^ permalink raw reply

* [PATCH] 9p/net: put a lower bound on msize
From: Dominique Martinet @ 2018-11-05  8:52 UTC (permalink / raw)
  Cc: Dominique Martinet, Eric Van Hensbergen, Latchesar Ionkov,
	syzkaller-bugs, v9fs-developer, linux-kernel, netdev
In-Reply-To: <20181102223908.GA9565@nautica>

From: Dominique Martinet <dominique.martinet@cea.fr>

If the requested msize is too small (either from command line argument
or from the server version reply), we won't get any work done.
If it's *really* too small, nothing will work, and this got caught by
syzbot recently (on a new kmem_cache_create_usercopy() call)

Just set a minimum msize to 4k in both code paths, until someone
complains they have a use-case for a smaller msize.

We need to check in both mount option and server reply individually
because the msize for the first version request would be unchecked
with just a global check on clnt->msize.

Reported-by: syzbot+0c1d61e4db7db94102ca@syzkaller.appspotmail.com
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
---
 net/9p/client.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/net/9p/client.c b/net/9p/client.c
index 2c9a17b9b46b..b1163ebdc622 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -181,6 +181,12 @@ static int parse_opts(char *opts, struct p9_client *clnt)
 				ret = r;
 				continue;
 			}
+			if (option < 4096) {
+				p9_debug(P9_DEBUG_ERROR,
+					 "msize should be at least 4k\n");
+				ret = -EINVAL;
+				continue;
+			}
 			clnt->msize = option;
 			break;
 		case Opt_trans:
@@ -983,10 +989,18 @@ static int p9_client_version(struct p9_client *c)
 	else if (!strncmp(version, "9P2000", 6))
 		c->proto_version = p9_proto_legacy;
 	else {
+		p9_debug(P9_DEBUG_ERROR,
+			 "server returned an unknown version: %s\n", version);
 		err = -EREMOTEIO;
 		goto error;
 	}
 
+	if (msize < 4096) {
+		p9_debug(P9_DEBUG_ERROR,
+			 "server returned a msize < 4096: %d\n", msize);
+		err = -EREMOTEIO;
+		goto error;
+	}
 	if (msize < c->msize)
 		c->msize = msize;
 
@@ -1043,6 +1057,13 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
 	if (clnt->msize > clnt->trans_mod->maxsize)
 		clnt->msize = clnt->trans_mod->maxsize;
 
+	if (clnt->msize < 4096) {
+		p9_debug(P9_DEBUG_ERROR,
+			 "Please specify a msize of at least 4k\n");
+		err = -EINVAL;
+		goto free_client;
+	}
+
 	err = p9_client_version(clnt);
 	if (err)
 		goto close_trans;
-- 
2.19.1
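
The two checks described in the commit message (one for the mount option,
one for the server's version reply) can be sketched as two small helpers.
This is an illustration only: MIN_MSIZE, check_option_msize() and
apply_server_msize() are hypothetical names, and the error codes simply
mirror the ones used in the diff (EREMOTEIO is Linux-specific).

```c
#include <errno.h>

#define MIN_MSIZE 4096	/* the 4k floor chosen in the commit message */

/* Mount-option path: reject a requested msize that is too small. */
static int check_option_msize(int option)
{
	return option < MIN_MSIZE ? -EINVAL : 0;
}

/* Version-reply path: reject a too-small server msize, otherwise lower
 * the client's msize to the server's value if that is smaller.
 */
static int apply_server_msize(int *client_msize, int server_msize)
{
	if (server_msize < MIN_MSIZE)
		return -EREMOTEIO;
	if (server_msize < *client_msize)
		*client_msize = server_msize;
	return 0;
}
```

Both helpers are needed for the reason given above: the first version
request is sent with the option value before any server reply exists, so a
single check after negotiation would leave that request unchecked.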

^ permalink raw reply related

* [PATCH v2] kselftests/bpf: use ping6 as the default ipv6 ping binary when it exists
From: Li Zhijian @ 2018-11-05  8:57 UTC (permalink / raw)
  To: Song Liu, shuah, netdev, linux-kselftest
  Cc: linux-kernel, ast, daniel, Li Zhijian, Philip Li

Commit deee2cae27d1 ("kselftests/bpf: use ping6 as the default ipv6 ping
binary if it exists") fixed similar issues in the shell scripts, but
missed the same issue in the C code.

Fixes: 371e4fcc9d96 ("selftests/bpf: cgroup local storage-based network counters")
CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
V2: Fix coding style: remove '{}' and 80+ characters per line

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 tools/testing/selftests/bpf/test_netcnt.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_netcnt.c b/tools/testing/selftests/bpf/test_netcnt.c
index 7887df6..44ed7f2 100644
--- a/tools/testing/selftests/bpf/test_netcnt.c
+++ b/tools/testing/selftests/bpf/test_netcnt.c
@@ -81,7 +81,10 @@ int main(int argc, char **argv)
 		goto err;
 	}
 
-	assert(system("ping localhost -6 -c 10000 -f -q > /dev/null") == 0);
+	if (system("which ping6 &>/dev/null") == 0)
+		assert(!system("ping6 localhost -c 10000 -f -q > /dev/null"));
+	else
+		assert(!system("ping -6 localhost -c 10000 -f -q > /dev/null"));
 
 	if (bpf_prog_query(cgroup_fd, BPF_CGROUP_INET_EGRESS, 0, NULL, NULL,
 			   &prog_cnt)) {
-- 
2.7.4
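
The fallback logic in this patch can also be written as two helpers that
separate the probe from the command selection. This is a sketch, not the
patch: have_ping6() and ipv6_ping_cmd() are hypothetical names. One
deliberate difference: the probe uses POSIX "command -v" with
">/dev/null 2>&1", because "&>" is a bash extension and system() runs the
command through sh, which may be a shell that parses "&>" differently.

```c
#include <stdlib.h>

/* Returns nonzero if a 'ping6' binary is found on PATH. */
static int have_ping6(void)
{
	return system("command -v ping6 >/dev/null 2>&1") == 0;
}

/* Pick the IPv6 flood-ping command line used by the test. */
static const char *ipv6_ping_cmd(int ping6_available)
{
	return ping6_available ?
		"ping6 localhost -c 10000 -f -q > /dev/null" :
		"ping -6 localhost -c 10000 -f -q > /dev/null";
}
```

A caller would then run system(ipv6_ping_cmd(have_ping6())) and check the
result, as the patched test does with assert().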

^ permalink raw reply related

* Re: [PATCH v2 0/3] Allwinner H6 Ethernet support
From: Maxime Ripard @ 2018-11-05  9:00 UTC (permalink / raw)
  To: Icenowy Zheng
  Cc: Chen-Yu Tsai, Corentin Labbe, Rob Herring, David S . Miller,
	netdev, devicetree, linux-arm-kernel, linux-kernel, linux-sunxi
In-Reply-To: <20181103123238.4665-1-icenowy-h8G6r0blFSE@public.gmane.org>

On Sat, Nov 03, 2018 at 08:32:35PM +0800, Icenowy Zheng wrote:
> This patchset introduces Allwinner H6 Ethernet support with code already
> available for A64.
> 
> As the EMAC on H6 is similar to the A64 one, support for it is directly
> reused, by using fallback compatible strings.
> 
> Patches for the system controller in v1 were sent by Jernej Skrabec as part
> of his H6 display patchset, and have already been applied.
> 
> NOTE: This patchset targets the final version of Pine H64, and also
> supports the early sample of Pine H64 model B. However, it's not
> compatible with the early sample of Pine H64 model A. Please DO NOT test
> this patchset on the Pine H64 model A samples.

Applied thanks!
Maxime

-- 
Maxime Ripard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* Re: [PATCH 0/5] Use common cordic algorithm for b43
From: Kalle Valo @ 2018-11-05  9:02 UTC (permalink / raw)
  To: Arend van Spriel
  Cc: Priit Laes, linux-wireless, b43-dev, netdev, linux-kernel,
	brcm80211-dev-list.pdl, brcm80211-dev-list
In-Reply-To: <870a5f59-1031-c62d-0ee8-742bd0f16d8a-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>

Arend van Spriel <arend.vanspriel-dY08KVG/lbpWk0Htik3J/w@public.gmane.org> writes:

> On 11/5/2018 9:02 AM, Kalle Valo wrote:
>> Also I don't see MAINTAINERS entry for cordic.[c|h], that would be good
>> to have as well.
>
> We added the cordic library functions during brcm80211 staging
> cleanup. We can add it to MAINTAINERS file.

Great, thanks.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH 2/5] brcmfmac: Use common CORDIC_FLOAT macro from lib
From: Kalle Valo @ 2018-11-05  9:05 UTC (permalink / raw)
  To: Priit Laes
  Cc: linux-kernel, Arend van Spriel, Franky Lin, Hante Meuleman,
	Chi-Hsien Lin, Wright Feng, David S. Miller, linux-wireless,
	brcm80211-dev-list.pdl, brcm80211-dev-list, netdev
In-Reply-To: <4bd6e7758bc0d88b33cdb09448633bb5b97aba7c.1541238842.git-series.plaes@plaes.org>

Priit Laes <plaes@plaes.org> writes:

> Now that cordic library has the CORDIC_FLOAT macro, use that
>
> Signed-off-by: Priit Laes <plaes@plaes.org>
> ---
>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c | 4 ++--
>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c   | 4 ++--

The driver is "brcmsmac" (note the 's', not 'f'), you should fix the
title accordingly.

> --- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c
> @@ -3447,8 +3447,8 @@ wlc_lcnphy_start_tx_tone(struct brcms_phy *pi, s32 f_kHz, u16 max_val,
>  
>  		theta += rot;
>  
> -		i_samp = (u16) (FLOAT(tone_samp.i * max_val) & 0x3ff);
> -		q_samp = (u16) (FLOAT(tone_samp.q * max_val) & 0x3ff);
> +		i_samp = (u16)(CORDIC_FLOAT(tone_samp.i * max_val) & 0x3ff);
> +		q_samp = (u16)(CORDIC_FLOAT(tone_samp.q * max_val) & 0x3ff);

I haven't seen the patch 1 yet, but just from seeing this patch I don't
get what's the benefit.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH 3/5] brcmfmac: Drop unused cordic defines and macros
From: Kalle Valo @ 2018-11-05  9:07 UTC (permalink / raw)
  To: Priit Laes
  Cc: linux-kernel, Arend van Spriel, Franky Lin, Hante Meuleman,
	Chi-Hsien Lin, Wright Feng, David S. Miller, linux-wireless,
	brcm80211-dev-list.pdl, brcm80211-dev-list, netdev
In-Reply-To: <7f3dbe604102f7d765149fffef8a4e6b9fa15552.1541238842.git-series.plaes@plaes.org>

Priit Laes <plaes@plaes.org> writes:

> Now that we use library macros, we can drop internal copies
>
> Signed-off-by: Priit Laes <plaes@plaes.org>
> ---
>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_int.h | 7 +-------

Also here this is about brcmsmac.

> --- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_int.h
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_int.h
> @@ -220,13 +220,6 @@ enum phy_cal_mode {
>  #define BB_MULT_MASK		0x0000ffff
>  #define BB_MULT_VALID_MASK	0x80000000
>  
> -#define CORDIC_AG	39797
> -#define	CORDIC_NI	18
> -#define	FIXED(X)	((s32)((X) << 16))
> -
> -#define	FLOAT(X) \
> -	(((X) >= 0) ? ((((X) >> 15) + 1) >> 1) : -((((-(X)) >> 15) + 1) >> 1))
> -

Ah, now I see the benefit from patch 2. IMHO you could just fold patch 3
into patch 2, no need to split them.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH 5/5] b43: Drop internal cordic algorithm implementation
From: Kalle Valo @ 2018-11-05  9:09 UTC (permalink / raw)
  To: Priit Laes; +Cc: linux-kernel, David S. Miller, linux-wireless, b43-dev, netdev
In-Reply-To: <51f98dbd0efe48c315d8e7876074aeaa22fde580.1541238842.git-series.plaes@plaes.org>

Priit Laes <plaes@plaes.org> writes:

> Signed-off-by: Priit Laes <plaes@plaes.org>

No empty commit logs, please.

And IMHO you could fold patch 5 into patch 4.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH 2/5] brcmfmac: Use common CORDIC_FLOAT macro from lib
From: Arend van Spriel @ 2018-11-05  9:13 UTC (permalink / raw)
  To: Kalle Valo, Priit Laes
  Cc: linux-kernel, Franky Lin, Hante Meuleman, Chi-Hsien Lin,
	Wright Feng, David S. Miller, linux-wireless,
	brcm80211-dev-list.pdl, brcm80211-dev-list, netdev
In-Reply-To: <877ehrhp21.fsf-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>

On 11/5/2018 10:05 AM, Kalle Valo wrote:
> Priit Laes <plaes@plaes.org> writes:
>
>> Now that cordic library has the CORDIC_FLOAT macro, use that
>>
>> Signed-off-by: Priit Laes <plaes@plaes.org>
>> ---
>>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c | 4 ++--
>>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c   | 4 ++--
>
> The driver is "brcmsmac" (note the 's', not 'f'), you should fix the
> title accordingly.
>
>> --- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c
>> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c
>> @@ -3447,8 +3447,8 @@ wlc_lcnphy_start_tx_tone(struct brcms_phy *pi, s32 f_kHz, u16 max_val,
>>
>>  		theta += rot;
>>
>> -		i_samp = (u16) (FLOAT(tone_samp.i * max_val) & 0x3ff);
>> -		q_samp = (u16) (FLOAT(tone_samp.q * max_val) & 0x3ff);
>> +		i_samp = (u16)(CORDIC_FLOAT(tone_samp.i * max_val) & 0x3ff);
>> +		q_samp = (u16)(CORDIC_FLOAT(tone_samp.q * max_val) & 0x3ff);
>
> I haven't seen the patch 1 yet, but just from seeing this patch I don't
> get what's the benefit.

The FLOAT macro was defined in brcmsmac (see patch 3). It is now moved 
to the cordic library simply because it is more closely related to that.
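
For readers who have not seen the macro: FLOAT(X), as quoted in patch 3,
rounds a 16.16 fixed-point value to the nearest integer, and FIXED(X)
converts the other way. The sketch below restates that arithmetic as
functions. The names int_to_fixed() and fixed_to_int_round() are
hypothetical, and the conversion to fixed point multiplies instead of
shifting so that negative inputs stay well defined in standard C.

```c
#include <stdint.h>

#define FRAC_BITS 16	/* 16.16 fixed point, matching FIXED(X) = X << 16 */

/* Integer to fixed point. */
static int32_t int_to_fixed(int32_t x)
{
	return x * (1 << FRAC_BITS);
}

/* Same arithmetic as the FLOAT(X) macro: drop 15 fraction bits, add one,
 * drop the last bit. This rounds to the nearest integer, with halves
 * rounded away from zero; negative values are handled by symmetry.
 */
static int32_t fixed_to_int_round(int32_t x)
{
	if (x >= 0)
		return ((x >> 15) + 1) >> 1;
	return -((((-x) >> 15) + 1) >> 1);
}
```

For example, 0x18000 represents 1.5 and rounds to 2, while 0x14000
represents 1.25 and rounds to 1.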

Regards,
Arend

^ permalink raw reply

* Re: [PATCH 1/2] net: axienet: recheck condition after timeout in mdio_wait()
From: Sebastian Andrzej Siewior @ 2018-11-05  9:16 UTC (permalink / raw)
  To: David Miller
  Cc: kurt, anirudh, John.Linn, michal.simek, radhey.shyam.pandey,
	andrew, yuehaibing, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <20181030.112511.957438343014682098.davem@davemloft.net>

On 2018-10-30 11:25:11 [-0700], David Miller wrote:
> From: Kurt Kanzenbach <kurt@linutronix.de>
> Date: Tue, 30 Oct 2018 10:31:38 +0100
> 
> > The function could report a false positive if it gets preempted between reading
> > the XAE_MDIO_MCR_OFFSET register and checking for the timeout.  In such a case,
> > the condition has to be rechecked to avoid false positives.
> > 
> > Therefore, check for expected condition even after the timeout occurred.
> > 
> > Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
>  ...
> >  		if (time_before_eq(end, jiffies)) {
> > -			WARN_ON(1);
> > -			return -ETIMEDOUT;
> > +			val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET);
> > +			break;
> >  		}
> > +
> >  		udelay(1);
> >  	}
> > -	return 0;
> > +	if (val & XAE_MDIO_MCR_READY_MASK)
> > +		return 0;
> > +
> > +	WARN_ON(1);
> > +	return -ETIMEDOUT;
> 
> You are not fundamentally changing the situation at all.

> The condition could change right after your last read of
> XAE_MDIO_MCR_OFFSET, which is the same thing that happens before your
> modifications to this code.
> 
> It sounds more like the timeout is slightly too short, and that's the
> real problem that causes whatever behavior you think you are fixing
> here.

There is a timeout of two jiffies. If the condition is not true within
those two jiffies, it will check the condition one last time after
the timeout occurred.
If the task got preempted after reading the register but before
the timeout, it is possible that the task gets back on the CPU after the
timeout occurred. And since the timeout occurred, it won't check whether
the condition changed:
Time
 0   +---+
     | c | Check for condition (false)
     | c |
     | c |
     | c |
     | c |
     | P | Task gets preempted
     |   |
     | O | Condition is true, task still preempted, no check
     |   |
 2   | T | The timeout is true
     |   |
     |   |
     |   |
     | p | Task gets back on the CPU, no re-check of condition

In the last step, the condition is not checked after the timeout
occurred, so the code wrongly assumes that the condition is not true.
Increasing the timeout would only help as long as the task is not
preempted past the new timeout.
The same pattern (check condition after timeout) is also used in
wait_event_timeout() or readx_poll_timeout().  Would you prefer to
refactor this with readx_poll_timeout() instead?
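
To make the race concrete, here is a minimal userspace sketch of the
polling loop. It is not the driver code: struct sim, sim_read(),
wait_ready(), READY_MASK and ERR_TIMEDOUT are hypothetical stand-ins for
jiffies, axienet_ior(), the wait function, XAE_MDIO_MCR_READY_MASK and
-ETIMEDOUT. The preemption is modelled as a jump of the fake clock right
after a register read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define READY_MASK   0x80u   /* hypothetical stand-in for XAE_MDIO_MCR_READY_MASK */
#define ERR_TIMEDOUT (-110)  /* stand-in for -ETIMEDOUT */

/* Hypothetical simulated clock and device, not the driver's real state. */
struct sim {
	uint64_t now;          /* fake jiffies */
	uint64_t ready_at;     /* time at which the device reports ready */
	uint64_t preempt_at;   /* first read at or after this time is followed by preemption */
	uint64_t preempt_len;  /* how long the task stays off the CPU */
	bool preempt_done;     /* the preemption happens only once */
};

/* Sample the register, then model a preemption right after the read. */
static uint32_t sim_read(struct sim *s)
{
	uint32_t val;

	assert(s);
	val = (s->now >= s->ready_at) ? READY_MASK : 0;
	if (!s->preempt_done && s->now >= s->preempt_at) {
		s->preempt_done = true;
		s->now += s->preempt_len;
	}
	return val;
}

/* Poll until ready or timeout; optionally re-read once after the timeout. */
static int wait_ready(struct sim *s, uint64_t timeout, bool recheck)
{
	uint64_t end = s->now + timeout;
	uint32_t val;

	for (;;) {
		val = sim_read(s);
		if (val & READY_MASK)
			return 0;
		if (s->now >= end) {            /* like time_before_eq(end, jiffies) */
			if (recheck)
				val = sim_read(s);  /* final look at the register */
			break;
		}
		s->now += 1;                    /* stand-in for udelay(1) */
	}
	return (val & READY_MASK) ? 0 : ERR_TIMEDOUT;
}
```

With a timeout of 2, a device that becomes ready at time 3 and a
preemption of 10 ticks after the read at time 1, wait_ready() without the
re-check reports a timeout although the device has been ready for a
while. With the re-check it returns 0, and a device that never becomes
ready still times out.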

> I'm not applying this.
Please reconsider.

Sebastian

^ permalink raw reply

* Re: [PATCH] libertas: don't set URB_ZERO_PACKET on IN USB transfer
From: Pavel Machek @ 2018-11-05  9:19 UTC (permalink / raw)
  To: Lubomir Rintel
  Cc: Kalle Valo, David S. Miller, libertas-dev, linux-wireless, netdev,
	linux-kernel, stable
In-Reply-To: <20181006201232.2789936-1-lkundrak@v3.sk>

[-- Attachment #1: Type: text/plain, Size: 2720 bytes --]

On Sat 2018-10-06 22:12:32, Lubomir Rintel wrote:
> The USB core gets rightfully upset:
> 
>   usb 1-1: BOGUS urb flags, 240 --> 200
>   WARNING: CPU: 0 PID: 60 at drivers/usb/core/urb.c:503 usb_submit_urb+0x2f8/0x3ed
>   Modules linked in:
>   CPU: 0 PID: 60 Comm: kworker/0:3 Not tainted 4.19.0-rc6-00319-g5206d00a45c7 #39
>   Hardware name: OLPC XO/XO, BIOS OLPC Ver 1.00.01 06/11/2014
>   Workqueue: events request_firmware_work_func
>   EIP: usb_submit_urb+0x2f8/0x3ed
>   Code: 75 06 8b 8f 80 00 00 00 8d 47 78 89 4d e4 89 55 e8 e8 35 1c f6 ff 8b 55 e8 56 52 8b 4d e4 51 50 68 e3 ce c7 c0 e8 ed 18 c6 ff <0f> 0b 83 c4 14 80 7d ef 01 74 0a 80 7d ef 03 0f 85 b8 00 00 00 8b
>   EAX: 00000025 EBX: ce7d4980 ECX: 00000000 EDX: 00000001
>   ESI: 00000200 EDI: ce7d8800 EBP: ce7f5ea8 ESP: ce7f5e70
>   DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 EFLAGS: 00210292
>   CR0: 80050033 CR2: 00000000 CR3: 00e80000 CR4: 00000090
>   Call Trace:
>    ? if_usb_fw_timeo+0x64/0x64
>    __if_usb_submit_rx_urb+0x85/0xe6
>    ? if_usb_fw_timeo+0x64/0x64
>    if_usb_submit_rx_urb_fwload+0xd/0xf
>    if_usb_prog_firmware+0xc0/0x3db
>    ? _request_firmware+0x54/0x47b
>    ? _request_firmware+0x89/0x47b
>    ? if_usb_probe+0x412/0x412
>    lbs_fw_loaded+0x55/0xa6
>    ? debug_smp_processor_id+0x12/0x14
>    helper_firmware_cb+0x3c/0x3f
>    request_firmware_work_func+0x37/0x6f
>    process_one_work+0x164/0x25a
>    worker_thread+0x1c4/0x284
>    kthread+0xec/0xf1
>    ? cancel_delayed_work_sync+0xf/0xf
>    ? kthread_create_on_node+0x1a/0x1a
>    ret_from_fork+0x2e/0x38
>   ---[ end trace 3ef1e3b2dd53852f ]---
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>

Acked-by: Pavel Machek <pavel@ucw.cz>

> ---
>  drivers/net/wireless/marvell/libertas/if_usb.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/net/wireless/marvell/libertas/if_usb.c b/drivers/net/wireless/marvell/libertas/if_usb.c
> index 5fee555a3d60..220dcdee8d2b 100644
> --- a/drivers/net/wireless/marvell/libertas/if_usb.c
> +++ b/drivers/net/wireless/marvell/libertas/if_usb.c
> @@ -459,8 +459,6 @@ static int __if_usb_submit_rx_urb(struct if_usb_card *cardp,
>  			  MRVDRV_ETH_RX_PACKET_BUFFER_SIZE, callbackfn,
>  			  cardp);
>  
> -	cardp->rx_urb->transfer_flags |= URB_ZERO_PACKET;
> -
>  	lbs_deb_usb2(&cardp->udev->dev, "Pointer for rx_urb %p\n", cardp->rx_urb);
>  	if ((ret = usb_submit_urb(cardp->rx_urb, GFP_ATOMIC))) {
>  		lbs_deb_usbd(&cardp->udev->dev, "Submit Rx URB failed: %d\n", ret);

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply

* Re: [PATCH 1/2] mm/page_alloc: free order-0 pages through PCP in page_frag_free()
From: Vlastimil Babka @ 2018-11-05  9:26 UTC (permalink / raw)
  To: Aaron Lu, linux-mm, linux-kernel, netdev
  Cc: Andrew Morton, Paweł Staszewski, Jesper Dangaard Brouer,
	Eric Dumazet, Tariq Toukan, Ilias Apalodimas, Yoel Caspersen,
	Mel Gorman, Saeed Mahameed, Michal Hocko, Dave Hansen
In-Reply-To: <20181105085820.6341-1-aaron.lu@intel.com>

On 11/5/18 9:58 AM, Aaron Lu wrote:
> page_frag_free() calls __free_pages_ok() to free the page back to
> Buddy. This is OK for high order page, but for order-0 pages, it
> misses the optimization opportunity of using Per-Cpu-Pages and can
> cause zone lock contention when called frequently.
> 
> Paweł Staszewski recently shared his result of 'how Linux kernel
> handles normal traffic'[1] and from perf data, Jesper Dangaard Brouer
> found the lock contention comes from page allocator:
> 
>   mlx5e_poll_tx_cq
>   |
>    --16.34%--napi_consume_skb
>              |
>              |--12.65%--__free_pages_ok
>              |          |
>              |           --11.86%--free_one_page
>              |                     |
>              |                     |--10.10%--queued_spin_lock_slowpath
>              |                     |
>              |                      --0.65%--_raw_spin_lock
>              |
>              |--1.55%--page_frag_free
>              |
>               --1.44%--skb_release_data
> 
> Jesper explained how it happened: mlx5 driver RX-page recycle
> mechanism is not effective in this workload and pages have to go
> through the page allocator. The lock contention happens during
> mlx5 DMA TX completion cycle. And the page allocator cannot keep
> up at these speeds.[2]
> 
> I thought that __free_pages_ok() are mostly freeing high order
> pages and thought this is an lock contention for high order pages
> but Jesper explained in detail that __free_pages_ok() here are
> actually freeing order-0 pages because mlx5 is using order-0 pages
> to satisfy its page pool allocation request.[3]
> 
> The free path as pointed out by Jesper is:
> skb_free_head()
>   -> skb_free_frag()
>     -> skb_free_frag()
>       -> page_frag_free()
> And the pages being freed on this path are order-0 pages.
> 
> Fix this by doing similar things as in __page_frag_cache_drain() -
> send the being freed page to PCP if it's an order-0 page, or
> directly to Buddy if it is a high order page.
> 
> With this change, Paweł hasn't noticed lock contention yet in
> his workload and Jesper has noticed a 7% performance improvement
> using a micro benchmark and lock contention is gone.
> 
> [1]: https://www.spinics.net/lists/netdev/msg531362.html
> [2]: https://www.spinics.net/lists/netdev/msg531421.html
> [3]: https://www.spinics.net/lists/netdev/msg531556.html
> Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
> Analysed-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Signed-off-by: Aaron Lu <aaron.lu@intel.com>

Yeah looks like an obvious thing to do.
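
For readers less familiar with the allocator, a toy userspace model of why
the per-cpu path helps. This is only a sketch: zone_model, pcp_model,
PCP_BATCH and the model_* helpers are hypothetical, and the real PCP lists
use high and batch watermarks rather than this simple counter. The point
is only that order-0 frees take the (modelled) zone lock once per batch
instead of once per page.

```c
#include <assert.h>

#define PCP_BATCH 31u  /* hypothetical batch size, not the kernel's value */

/* Counts how often the zone lock would have been taken. */
struct zone_model {
	unsigned long lock_acquisitions;
	unsigned long free_pages;
};

/* Pages parked on the per-cpu list, no lock needed to add one. */
struct pcp_model {
	unsigned int count;
};

/* Direct free to the buddy lists: one lock acquisition per call. */
static void model_free_to_buddy(struct zone_model *z, unsigned int nr_pages)
{
	assert(z);
	z->lock_acquisitions++;
	z->free_pages += nr_pages;
}

/* Order-0 free through the per-cpu list: lock taken once per batch. */
static void model_free_via_pcp(struct zone_model *z, struct pcp_model *pcp)
{
	assert(pcp);
	pcp->count++;
	if (pcp->count >= PCP_BATCH) {
		model_free_to_buddy(z, pcp->count);
		pcp->count = 0;
	}
}

/* Same dispatch shape as the patched page_frag_free(). */
static void model_page_frag_free(struct zone_model *z, struct pcp_model *pcp,
				 unsigned int order)
{
	if (order == 0)
		model_free_via_pcp(z, pcp);
	else
		model_free_to_buddy(z, 1u << order);
}
```

In this model, freeing 310 order-0 pages one at a time costs 310 lock
acquisitions on the direct path and 10 through the per-cpu list.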

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/page_alloc.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ae31839874b8..91a9a6af41a2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4555,8 +4555,14 @@ void page_frag_free(void *addr)
>  {
>  	struct page *page = virt_to_head_page(addr);
>  
> -	if (unlikely(put_page_testzero(page)))
> -		__free_pages_ok(page, compound_order(page));
> +	if (unlikely(put_page_testzero(page))) {
> +		unsigned int order = compound_order(page);
> +
> +		if (order == 0)
> +			free_unref_page(page);
> +		else
> +			__free_pages_ok(page, order);
> +	}
>  }
>  EXPORT_SYMBOL(page_frag_free);
>  
> 

^ permalink raw reply

* Re: [PATCH 1/2] mm/page_alloc: free order-0 pages through PCP in page_frag_free()
From: Mel Gorman @ 2018-11-05  9:26 UTC (permalink / raw)
  To: Aaron Lu
  Cc: linux-mm, linux-kernel, netdev, Andrew Morton,
	Paweł Staszewski, Jesper Dangaard Brouer,
	Eric Dumazet, Tariq Toukan, Ilias Apalodimas, Yoel Caspersen,
	Saeed Mahameed, Michal Hocko, Vlastimil Babka, Dave Hansen
In-Reply-To: <20181105085820.6341-1-aaron.lu@intel.com>

On Mon, Nov 05, 2018 at 04:58:19PM +0800, Aaron Lu wrote:
> page_frag_free() calls __free_pages_ok() to free the page back to
> Buddy. This is OK for high order page, but for order-0 pages, it
> misses the optimization opportunity of using Per-Cpu-Pages and can
> cause zone lock contention when called frequently.
> 
> [1]: https://www.spinics.net/lists/netdev/msg531362.html
> [2]: https://www.spinics.net/lists/netdev/msg531421.html
> [3]: https://www.spinics.net/lists/netdev/msg531556.html
> Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
> Analysed-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Signed-off-by: Aaron Lu <aaron.lu@intel.com>

Well spotted,

Acked-by: Mel Gorman <mgorman@techsingularity.net>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply

* Re: [PATCH 1/6] phy: Add max_bitrate attribute & phy_get_max_bitrate()
From: Marc Kleine-Budde @ 2018-11-05  9:37 UTC (permalink / raw)
  To: Faiz Abbas, linux-kernel, devicetree, netdev, linux-can
  Cc: wg, robh+dt, mark.rutland, kishon
In-Reply-To: <d75c9e3c-e12b-e0b8-832c-ad117dcbe990@ti.com>


[-- Attachment #1.1: Type: text/plain, Size: 1259 bytes --]

On 11/05/2018 07:27 AM, Faiz Abbas wrote:
> I remove the transceiver child node binding documentation in patch 5/6.
> 
> The existing implementation is pretty limiting as it just has a child
> node with no associated device. What if a transceiver requires its own
> configurations before it can start sending/receiving messages (for
> example, my usecase requires it to pull the standby line low)?
> 
> I think that can be solved by implementing the transceiver as a phy and
> exposing a generic get_max_bitrate API. That way, the transceiver device
> can do all its startup configuration in the phy probe function.
> 
> In any case, do suggest if you have a better idea on how to implement
> pull gpio low requirement.

As long as we don't have any proper transceiver/phy driver that does
more than switch a GPIO on/off, please add a "xceiver" regulator to your
driver. Look for:

> devm_regulator_get(&pdev->dev, "xceiver");

in the flexcan driver.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH 1/2] mm/page_alloc: free order-0 pages through PCP in page_frag_free()
From: Jesper Dangaard Brouer @ 2018-11-05  9:55 UTC (permalink / raw)
  To: Aaron Lu
  Cc: linux-mm, linux-kernel, netdev, Andrew Morton,
	Paweł Staszewski, Eric Dumazet, Tariq Toukan,
	Ilias Apalodimas, Yoel Caspersen, Mel Gorman, Saeed Mahameed,
	Michal Hocko, Vlastimil Babka, Dave Hansen, brouer
In-Reply-To: <20181105085820.6341-1-aaron.lu@intel.com>

On Mon,  5 Nov 2018 16:58:19 +0800
Aaron Lu <aaron.lu@intel.com> wrote:

> page_frag_free() calls __free_pages_ok() to free the page back to
> Buddy. This is OK for high order page, but for order-0 pages, it
> misses the optimization opportunity of using Per-Cpu-Pages and can
> cause zone lock contention when called frequently.
> 
> Paweł Staszewski recently shared his result of 'how Linux kernel
> handles normal traffic'[1] and from perf data, Jesper Dangaard Brouer
> found the lock contention comes from page allocator:
> 
>   mlx5e_poll_tx_cq
>   |
>    --16.34%--napi_consume_skb
>              |
>              |--12.65%--__free_pages_ok
>              |          |
>              |           --11.86%--free_one_page
>              |                     |
>              |                     |--10.10%--queued_spin_lock_slowpath
>              |                     |
>              |                      --0.65%--_raw_spin_lock
>              |
>              |--1.55%--page_frag_free
>              |
>               --1.44%--skb_release_data
> 
> Jesper explained how it happened: mlx5 driver RX-page recycle
> mechanism is not effective in this workload and pages have to go
> through the page allocator. The lock contention happens during
> mlx5 DMA TX completion cycle. And the page allocator cannot keep
> up at these speeds.[2]
> 
> I thought that __free_pages_ok() are mostly freeing high order
> pages and thought this is an lock contention for high order pages
> but Jesper explained in detail that __free_pages_ok() here are
> actually freeing order-0 pages because mlx5 is using order-0 pages
> to satisfy its page pool allocation request.[3]
> 
> The free path as pointed out by Jesper is:
> skb_free_head()
>   -> skb_free_frag()
>     -> skb_free_frag()

Nitpick: you added skb_free_frag() two times, else correct.
(All this stuff gets inlined by the compiler, which makes it hard to
spot with perf report).

>       -> page_frag_free()  
> And the pages being freed on this path are order-0 pages.
> 
> Fix this by doing similar things as in __page_frag_cache_drain() -
> send the being freed page to PCP if it's an order-0 page, or
> directly to Buddy if it is a high order page.
> 
> With this change, Paweł hasn't noticed lock contention yet in
> his workload and Jesper has noticed a 7% performance improvement
> using a micro benchmark and lock contention is gone.
> 
> [1]: https://www.spinics.net/lists/netdev/msg531362.html
> [2]: https://www.spinics.net/lists/netdev/msg531421.html
> [3]: https://www.spinics.net/lists/netdev/msg531556.html
> Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
> Analysed-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Signed-off-by: Aaron Lu <aaron.lu@intel.com>
> ---

It is REALLY great that Aaron spotted this! (based on my analysis).
This has likely been causing scalability issues on real-life network
traffic, but has been hiding behind the driver-level recycle tricks
for micro-benchmarking.

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

>  mm/page_alloc.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ae31839874b8..91a9a6af41a2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4555,8 +4555,14 @@ void page_frag_free(void *addr)
>  {
>  	struct page *page = virt_to_head_page(addr);
>  
> -	if (unlikely(put_page_testzero(page)))
> -		__free_pages_ok(page, compound_order(page));
> +	if (unlikely(put_page_testzero(page))) {
> +		unsigned int order = compound_order(page);
> +
> +		if (order == 0)
> +			free_unref_page(page);
> +		else
> +			__free_pages_ok(page, order);
> +	}
>  }
>  EXPORT_SYMBOL(page_frag_free);
>  

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH] net: phy: realtek: fix RTL8201F sysfs name
From: David Miller @ 2018-11-05  0:44 UTC (permalink / raw)
  To: andrew; +Cc: holger, netdev
In-Reply-To: <20181104184741.GB27023@lunn.ch>

From: Andrew Lunn <andrew@lunn.ch>
Date: Sun, 4 Nov 2018 19:47:41 +0100

> On Sun, Nov 04, 2018 at 07:02:42PM +0100, Holger Hoffstätte wrote:
>> Since 4.19 the following error in sysfs has appeared when using the
>> r8169 NIC driver:
>> 
>> $cd /sys/module/realtek/drivers
>> $ls -l
>> ls: cannot access 'mdio_bus:RTL8201F 10/100Mbps Ethernet': No such file or directory
>> [..garbled dir entries follow..]
>> 
>> Apparently the forward slash in "10/100Mbps Ethernet" is interpreted
>> as directory separator that leads nowhere, and was introduced in commit
>> 513588dd44b ("net: phy: realtek: add RTL8201F phy-id and functions").
>> 
>> Fix this by removing the offending slash in the driver name.
>> 
>> Other drivers in net/phy seem to have the same problem, but I cannot
>> test/verify them.
> 
> Hi Holger
> 
> This last comment would generally be placed after the ---. It will
> then not appear in the commit message.
> 
> Also, in future, please put the target tree, net or net-next as part
> of the subject line:
> 
> [PATCH net] ....

This didn't apply cleanly, something mangled the patch.

But I fixed it up and queued this up for -stable.
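
For the other drivers mentioned above, a check along these lines could
flag problematic names. This is a userspace sketch only;
name_is_single_path_component() is a hypothetical helper, not an existing
kernel or phylib function. It encodes the rule that a name used as a
directory entry must not contain '/', must not be empty, and must not be
"." or "..".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if name can be used as one directory entry. */
static bool name_is_single_path_component(const char *name)
{
	assert(name);
	if (name[0] == '\0')
		return false;
	if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
		return false;
	/* A '/' would be parsed as a directory separator. */
	return strchr(name, '/') == NULL;
}
```

The name from the report, "RTL8201F 10/100Mbps Ethernet", fails this
check. A variant without the slash, for example the made-up
"RTL8201F 10-100Mbps Ethernet", passes.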

^ permalink raw reply

* Re: [RFC PATCH] lib: Introduce generic __cmpxchg_u64() and use it where needed
From: Peter Zijlstra @ 2018-11-05 10:38 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: mark.rutland@arm.com, linux-mips@linux-mips.org,
	will.deacon@arm.com, bfields@fieldses.org, paulus@samba.org,
	Trond Myklebust, jhogan@kernel.org, Paul McKenney,
	linux@roeck-us.net, arnd@arndb.de, boqun.feng@gmail.com, dvyukov,
	linux-nfs@vger.kernel.org, netdev@vger.kernel.org,
	jlayton@kernel.org, linux-kernel@vger.kernel.org,
	ralf@linux-mips.org, anna.schumaker@netapp.com,
	paul.burton@mips.com, "akpm@linu
In-Reply-To: <5a846924-e642-d9d1-4e0e-810bd4d01c26@virtuozzo.com>

On Fri, Nov 02, 2018 at 07:19:15PM +0300, Andrey Ryabinin wrote:

> UBSAN warns about signed overflows despite -fno-strict-overflow if gcc
> version is < 8.  I have learned recently that UBSAN in GCC 8 ignores
> signed overflows if -fno-strict-overflow or -fwrapv is used.

Ah, good.

> We can always just drop -fsanitize=signed-integer-overflow if it is considered too noisy.

I think that is the most consistent behaviour. Signed overflow is not UB
in the kernel.

> Although it did catch some real bugs.

If we want an over/under-flow checker, then that should be a separate
plugin and not specific to signed or unsigned.
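
As a rough illustration of a check that treats signed and unsigned alike,
here is a userspace sketch built on the GCC/Clang __builtin_add_overflow()
builtin. The wrapper names are hypothetical and this is not a proposal for
the plugin itself. The builtin stores the wrapped result and returns
whether the mathematically exact result did not fit, for either
signedness.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical wrapper: true if a + b does not fit in an int. */
static bool add_int_overflows(int a, int b, int *sum)
{
	assert(sum);
	/* *sum receives the wrapped (two's complement) result. */
	return __builtin_add_overflow(a, b, sum);
}

/* Hypothetical wrapper: true if a + b does not fit in an unsigned int. */
static bool add_uint_overflows(unsigned int a, unsigned int b,
			       unsigned int *sum)
{
	assert(sum);
	return __builtin_add_overflow(a, b, sum);
}
```

Here INT_MAX + 1 is reported as an overflow and stores INT_MIN, and
UINT_MAX + 1 is reported as an overflow and stores 0, so both cases are
caught by the same mechanism without relying on UB.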

^ permalink raw reply

* Re: [PATCH 1/2] mm/page_alloc: free order-0 pages through PCP in page_frag_free()
From: Ilias Apalodimas @ 2018-11-05 10:46 UTC (permalink / raw)
  To: Aaron Lu
  Cc: linux-mm, linux-kernel, netdev, Andrew Morton,
	Paweł Staszewski, Jesper Dangaard Brouer, Eric Dumazet,
	Tariq Toukan, Yoel Caspersen, Mel Gorman, Saeed Mahameed,
	Michal Hocko, Vlastimil Babka, Dave Hansen
In-Reply-To: <20181105085820.6341-1-aaron.lu@intel.com>

Hi Aaron,
> page_frag_free() calls __free_pages_ok() to free the page back to
> Buddy. This is OK for high order page, but for order-0 pages, it
> misses the optimization opportunity of using Per-Cpu-Pages and can
> cause zone lock contention when called frequently.
> 
> Paweł Staszewski recently shared his result of 'how Linux kernel
> handles normal traffic'[1] and from perf data, Jesper Dangaard Brouer
> found the lock contention comes from page allocator:
> 
>   mlx5e_poll_tx_cq
>   |
>    --16.34%--napi_consume_skb
>              |
>              |--12.65%--__free_pages_ok
>              |          |
>              |           --11.86%--free_one_page
>              |                     |
>              |                     |--10.10%--queued_spin_lock_slowpath
>              |                     |
>              |                      --0.65%--_raw_spin_lock
>              |
>              |--1.55%--page_frag_free
>              |
>               --1.44%--skb_release_data
> 
> Jesper explained how it happened: mlx5 driver RX-page recycle
> mechanism is not effective in this workload and pages have to go
> through the page allocator. The lock contention happens during
> mlx5 DMA TX completion cycle. And the page allocator cannot keep
> up at these speeds.[2]
> 
> I thought that __free_pages_ok() are mostly freeing high order
> pages and thought this is an lock contention for high order pages
> but Jesper explained in detail that __free_pages_ok() here are
> actually freeing order-0 pages because mlx5 is using order-0 pages
> to satisfy its page pool allocation request.[3]
> 
> The free path as pointed out by Jesper is:
> skb_free_head()
>   -> skb_free_frag()
>     -> skb_free_frag()
>       -> page_frag_free()
> And the pages being freed on this path are order-0 pages.
> 
> Fix this by doing similar things as in __page_frag_cache_drain() -
> send the being freed page to PCP if it's an order-0 page, or
> directly to Buddy if it is a high order page.
> 
> With this change, Paweł hasn't noticed lock contention yet in
> his workload and Jesper has noticed a 7% performance improvement
> using a micro benchmark and lock contention is gone.
I did the same tests on a 'low' speed 1Gbit interface on a Cortex-A53.
I used Socionext's netsec driver and switched buffer allocation from the
current scheme to the page_pool API (which by default allocates order-0
pages).

Running 'perf top' pre and post patch got me the same results.
__free_pages_ok() disappeared from perf top and I got an ~11%
performance boost testing with 64-byte packets.

Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Tested-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>

^ permalink raw reply
