Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] net: phy: marvell: clear wol event before setting it
From: Bhadram Varka @ 2018-04-27  3:52 UTC (permalink / raw)
  To: Andrew Lunn, Jisheng Zhang
  Cc: Florian Fainelli, David S. Miller, netdev, linux-kernel,
	Jingju Hou
In-Reply-To: <20180426124007.GC13467@lunn.ch>

Hi Andrew/Jisheng,

On 4/26/2018 6:10 PM, Andrew Lunn wrote:
>> hmm, so you want a "stick" WOL feature, I dunno whether Linux kernel
>> requires WOL should be "stick".
> I see two different cases:
>
> Suspend/resume: The WoL state in the kernel is probably kept across
> such a cycle. If so, you would expect another suspend/resume to also
> work, without needs to reconfigure it.
Trying this scenario (suspend/resume) from my side. In this case WoL 
should be enabled in the HW. For Marvell PHY to generate WoL interrupt 
we need to clear WoL status.
Above mentioned change required to make this happen. Please share your 
thoughts on this.
>
> Boot from powered off: If the interrupt just enables the power supply,
> it is possible to wake up after a shutdown. There is no state kept, so
> WoL will be disabled in the kernel. So WoL should also be disabled in
> the hardware.
>
>      Andrew

^ permalink raw reply

* Re: [RFC v3 0/5] virtio: support packed ring
From: Jason Wang @ 2018-04-27  3:56 UTC (permalink / raw)
  To: Tiwei Bie, mst, virtualization, linux-kernel, netdev; +Cc: wexu, jfreimann
In-Reply-To: <20180425051550.24342-1-tiwei.bie@intel.com>



On 2018年04月25日 13:15, Tiwei Bie wrote:
> Hello everyone,
>
> This RFC implements packed ring support in virtio driver.
>
> Some simple functional tests have been done with Jason's
> packed ring implementation in vhost:
>
> https://lkml.org/lkml/2018/4/23/12
>
> Both of ping and netperf worked as expected (with EVENT_IDX
> disabled). But there are below known issues:
>
> 1. Reloading the guest driver will break the Tx/Rx;

Will have a look at this issue.

> 2. Zeroing the flags when detaching a used desc will
>     break the guest -> host path.

I still think zeroing flags is unnecessary or even a bug. At host, I 
track last observed avail wrap counter and detect avail like (what is 
suggested in the example code in the spec):

static bool desc_is_avail(struct vhost_virtqueue *vq, __virtio16 flags)
{
        bool avail = flags & cpu_to_vhost16(vq, DESC_AVAIL);

        return avail == vq->avail_wrap_counter;
}

So zeroing wrap can not work with this obviously.

Thanks

>
> Some simple functional tests have also been done with
> Wei's packed ring implementation in QEMU:
>
> http://lists.nongnu.org/archive/html/qemu-devel/2018-04/msg00342.html
>
> Both of ping and netperf worked as expected (with EVENT_IDX
> disabled). Reloading the guest driver also worked as expected.
>
> TODO:
> - Refinements (for code and commit log) and bug fixes;
> - Discuss/fix/test EVENT_IDX support;
> - Test devices other than net;
>
> RFC v2 -> RFC v3:
> - Split into small patches (Jason);
> - Add helper virtqueue_use_indirect() (Jason);
> - Just set id for the last descriptor of a list (Jason);
> - Calculate the prev in virtqueue_add_packed() (Jason);
> - Fix/improve desc suppression code (Jason/MST);
> - Refine the code layout for XXX_split/packed and wrappers (MST);
> - Fix the comments and API in uapi (MST);
> - Remove the BUG_ON() for indirect (Jason);
> - Some other refinements and bug fixes;
>
> RFC v1 -> RFC v2:
> - Add indirect descriptor support - compile test only;
> - Add event suppression supprt - compile test only;
> - Move vring_packed_init() out of uapi (Jason, MST);
> - Merge two loops into one in virtqueue_add_packed() (Jason);
> - Split vring_unmap_one() for packed ring and split ring (Jason);
> - Avoid using '%' operator (Jason);
> - Rename free_head -> next_avail_idx (Jason);
> - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
> - Some other refinements and bug fixes;
>
> Thanks!
>
> Tiwei Bie (5):
>    virtio: add packed ring definitions
>    virtio_ring: support creating packed ring
>    virtio_ring: add packed ring support
>    virtio_ring: add event idx support in packed ring
>    virtio_ring: enable packed ring
>
>   drivers/virtio/virtio_ring.c       | 1271 ++++++++++++++++++++++++++++--------
>   include/linux/virtio_ring.h        |    8 +-
>   include/uapi/linux/virtio_config.h |   12 +-
>   include/uapi/linux/virtio_ring.h   |   36 +
>   4 files changed, 1049 insertions(+), 278 deletions(-)
>

^ permalink raw reply

* Re: [RFC v3 0/5] virtio: support packed ring
From: Michael S. Tsirkin @ 2018-04-27  4:18 UTC (permalink / raw)
  To: Jason Wang
  Cc: Tiwei Bie, virtualization, linux-kernel, netdev, wexu, jfreimann
In-Reply-To: <5ad1ca01-1e5c-7105-f303-7e8d42f6a068@redhat.com>

On Fri, Apr 27, 2018 at 11:56:05AM +0800, Jason Wang wrote:
> 
> 
> On 2018年04月25日 13:15, Tiwei Bie wrote:
> > Hello everyone,
> > 
> > This RFC implements packed ring support in virtio driver.
> > 
> > Some simple functional tests have been done with Jason's
> > packed ring implementation in vhost:
> > 
> > https://lkml.org/lkml/2018/4/23/12
> > 
> > Both of ping and netperf worked as expected (with EVENT_IDX
> > disabled). But there are below known issues:
> > 
> > 1. Reloading the guest driver will break the Tx/Rx;
> 
> Will have a look at this issue.
> 
> > 2. Zeroing the flags when detaching a used desc will
> >     break the guest -> host path.
> 
> I still think zeroing flags is unnecessary or even a bug. At host, I track
> last observed avail wrap counter and detect avail like (what is suggested in
> the example code in the spec):
> 
> static bool desc_is_avail(struct vhost_virtqueue *vq, __virtio16 flags)
> {
>        bool avail = flags & cpu_to_vhost16(vq, DESC_AVAIL);
> 
>        return avail == vq->avail_wrap_counter;
> }
> 
> So zeroing wrap can not work with this obviously.
> 
> Thanks

I agree. I think what one should do is flip the available bit.

> > 
> > Some simple functional tests have also been done with
> > Wei's packed ring implementation in QEMU:
> > 
> > http://lists.nongnu.org/archive/html/qemu-devel/2018-04/msg00342.html
> > 
> > Both of ping and netperf worked as expected (with EVENT_IDX
> > disabled). Reloading the guest driver also worked as expected.
> > 
> > TODO:
> > - Refinements (for code and commit log) and bug fixes;
> > - Discuss/fix/test EVENT_IDX support;
> > - Test devices other than net;
> > 
> > RFC v2 -> RFC v3:
> > - Split into small patches (Jason);
> > - Add helper virtqueue_use_indirect() (Jason);
> > - Just set id for the last descriptor of a list (Jason);
> > - Calculate the prev in virtqueue_add_packed() (Jason);
> > - Fix/improve desc suppression code (Jason/MST);
> > - Refine the code layout for XXX_split/packed and wrappers (MST);
> > - Fix the comments and API in uapi (MST);
> > - Remove the BUG_ON() for indirect (Jason);
> > - Some other refinements and bug fixes;
> > 
> > RFC v1 -> RFC v2:
> > - Add indirect descriptor support - compile test only;
> > - Add event suppression supprt - compile test only;
> > - Move vring_packed_init() out of uapi (Jason, MST);
> > - Merge two loops into one in virtqueue_add_packed() (Jason);
> > - Split vring_unmap_one() for packed ring and split ring (Jason);
> > - Avoid using '%' operator (Jason);
> > - Rename free_head -> next_avail_idx (Jason);
> > - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
> > - Some other refinements and bug fixes;
> > 
> > Thanks!
> > 
> > Tiwei Bie (5):
> >    virtio: add packed ring definitions
> >    virtio_ring: support creating packed ring
> >    virtio_ring: add packed ring support
> >    virtio_ring: add event idx support in packed ring
> >    virtio_ring: enable packed ring
> > 
> >   drivers/virtio/virtio_ring.c       | 1271 ++++++++++++++++++++++++++++--------
> >   include/linux/virtio_ring.h        |    8 +-
> >   include/uapi/linux/virtio_config.h |   12 +-
> >   include/uapi/linux/virtio_ring.h   |   36 +
> >   4 files changed, 1049 insertions(+), 278 deletions(-)
> > 

^ permalink raw reply

* [PATCH v3] net: qrtr: Expose tunneling endpoint to user space
From: Bjorn Andersson @ 2018-04-27  5:28 UTC (permalink / raw)
  To: David S. Miller, Chris Lew; +Cc: linux-kernel, netdev, linux-arm-msm

This implements a misc character device named "qrtr-tun" for the purpose
of allowing user space applications to implement endpoints in the qrtr
network.

This allows more advanced (and dynamic) testing of the qrtr code as well
as opens up the ability of tunneling qrtr over a network or USB link.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
---

Changes since v2:
- Add support for read poll

Changes since v1:
- Dropped queue lock

 net/qrtr/Kconfig  |   7 +++
 net/qrtr/Makefile |   2 +
 net/qrtr/tun.c    | 161 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 170 insertions(+)
 create mode 100644 net/qrtr/tun.c

diff --git a/net/qrtr/Kconfig b/net/qrtr/Kconfig
index 326fd97444f5..1944834d225c 100644
--- a/net/qrtr/Kconfig
+++ b/net/qrtr/Kconfig
@@ -21,4 +21,11 @@ config QRTR_SMD
 	  Say Y here to support SMD based ipcrouter channels.  SMD is the
 	  most common transport for IPC Router.
 
+config QRTR_TUN
+	tristate "TUN device for Qualcomm IPC Router"
+	---help---
+	  Say Y here to expose a character device that allows user space to
+	  implement endpoints of QRTR, for purpose of tunneling data to other
+	  hosts or testing purposes.
+
 endif # QRTR
diff --git a/net/qrtr/Makefile b/net/qrtr/Makefile
index ab09e40f7c74..be012bfd3e52 100644
--- a/net/qrtr/Makefile
+++ b/net/qrtr/Makefile
@@ -2,3 +2,5 @@ obj-$(CONFIG_QRTR) := qrtr.o
 
 obj-$(CONFIG_QRTR_SMD) += qrtr-smd.o
 qrtr-smd-y	:= smd.o
+obj-$(CONFIG_QRTR_TUN) += qrtr-tun.o
+qrtr-tun-y	:= tun.o
diff --git a/net/qrtr/tun.c b/net/qrtr/tun.c
new file mode 100644
index 000000000000..ccff1e544c21
--- /dev/null
+++ b/net/qrtr/tun.c
@@ -0,0 +1,161 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018, Linaro Ltd */
+
+#include <linux/miscdevice.h>
+#include <linux/module.h>
+#include <linux/poll.h>
+#include <linux/skbuff.h>
+#include <linux/uaccess.h>
+
+#include "qrtr.h"
+
+struct qrtr_tun {
+	struct qrtr_endpoint ep;
+
+	struct sk_buff_head queue;
+	wait_queue_head_t readq;
+};
+
+static int qrtr_tun_send(struct qrtr_endpoint *ep, struct sk_buff *skb)
+{
+	struct qrtr_tun *tun = container_of(ep, struct qrtr_tun, ep);
+
+	skb_queue_tail(&tun->queue, skb);
+
+	/* wake up any blocking processes, waiting for new data */
+	wake_up_interruptible(&tun->readq);
+
+	return 0;
+}
+
+static int qrtr_tun_open(struct inode *inode, struct file *filp)
+{
+	struct qrtr_tun *tun;
+
+	tun = kzalloc(sizeof(*tun), GFP_KERNEL);
+	if (!tun)
+		return -ENOMEM;
+
+	skb_queue_head_init(&tun->queue);
+	init_waitqueue_head(&tun->readq);
+
+	tun->ep.xmit = qrtr_tun_send;
+
+	filp->private_data = tun;
+
+	return qrtr_endpoint_register(&tun->ep, QRTR_EP_NID_AUTO);
+}
+
+static ssize_t qrtr_tun_read_iter(struct kiocb *iocb, struct iov_iter *to)
+{
+	struct file *filp = iocb->ki_filp;
+	struct qrtr_tun *tun = filp->private_data;
+	struct sk_buff *skb;
+	int count;
+
+	while (!(skb = skb_dequeue(&tun->queue))) {
+		if (filp->f_flags & O_NONBLOCK)
+			return -EAGAIN;
+
+		/* Wait until we get data or the endpoint goes away */
+		if (wait_event_interruptible(tun->readq,
+					     !skb_queue_empty(&tun->queue)))
+			return -ERESTARTSYS;
+	}
+
+	count = min_t(size_t, iov_iter_count(to), skb->len);
+	if (copy_to_iter(skb->data, count, to) != count)
+		count = -EFAULT;
+
+	kfree_skb(skb);
+
+	return count;
+}
+
+static ssize_t qrtr_tun_write_iter(struct kiocb *iocb, struct iov_iter *from)
+{
+	struct file *filp = iocb->ki_filp;
+	struct qrtr_tun *tun = filp->private_data;
+	size_t len = iov_iter_count(from);
+	ssize_t ret;
+	void *kbuf;
+
+	kbuf = kzalloc(len, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	if (!copy_from_iter_full(kbuf, len, from))
+		return -EFAULT;
+
+	ret = qrtr_endpoint_post(&tun->ep, kbuf, len);
+
+	return ret < 0 ? ret : len;
+}
+
+static __poll_t qrtr_tun_poll(struct file *filp, poll_table *wait)
+{
+	struct qrtr_tun *tun = filp->private_data;
+	__poll_t mask = 0;
+
+	poll_wait(filp, &tun->readq, wait);
+
+	if (!skb_queue_empty(&tun->queue))
+		mask |= EPOLLIN | EPOLLRDNORM;
+
+	return mask;
+}
+
+static int qrtr_tun_release(struct inode *inode, struct file *filp)
+{
+	struct qrtr_tun *tun = filp->private_data;
+	struct sk_buff *skb;
+
+	qrtr_endpoint_unregister(&tun->ep);
+
+	/* Discard all SKBs */
+	while (!skb_queue_empty(&tun->queue)) {
+		skb = skb_dequeue(&tun->queue);
+		kfree_skb(skb);
+	}
+
+	kfree(tun);
+
+	return 0;
+}
+
+static const struct file_operations qrtr_tun_ops = {
+	.owner = THIS_MODULE,
+	.open = qrtr_tun_open,
+	.poll = qrtr_tun_poll,
+	.read_iter = qrtr_tun_read_iter,
+	.write_iter = qrtr_tun_write_iter,
+	.release = qrtr_tun_release,
+};
+
+static struct miscdevice qrtr_tun_miscdev = {
+	MISC_DYNAMIC_MINOR,
+	"qrtr-tun",
+	&qrtr_tun_ops,
+};
+
+static int __init qrtr_tun_init(void)
+{
+	int ret;
+
+	ret = misc_register(&qrtr_tun_miscdev);
+	if (ret)
+		pr_err("failed to register Qualcomm IPC Router tun device\n");
+
+	return ret;
+}
+
+static void __exit qrtr_tun_exit(void)
+{
+	misc_deregister(&qrtr_tun_miscdev);
+}
+
+module_init(qrtr_tun_init);
+module_exit(qrtr_tun_exit);
+
+MODULE_DESCRIPTION("Qualcomm IPC Router TUN device");
+MODULE_LICENSE("GPL v2");
-- 
2.16.2

^ permalink raw reply related

* [PATCH v4 net-next 0/3] lan78xx updates along with Fixed phy Support
From: Raghuram Chary J @ 2018-04-27  6:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, unglinuxdriver, woojung.huh, raghuramchary.jallipalli

These series of patches handle few modifications in driver
and adds support for fixed phy.

Raghuram Chary J (3):
  lan78xx: Lan7801 Support for Fixed PHY
  lan78xx: Remove DRIVER_VERSION for lan78xx driver
  lan78xx: Modify error messages

 drivers/net/usb/Kconfig   |   1 +
 drivers/net/usb/lan78xx.c | 106 ++++++++++++++++++++++++++++++++--------------
 2 files changed, 75 insertions(+), 32 deletions(-)

-- 
2.16.2

^ permalink raw reply

* [PATCH v4 net-next 1/3] lan78xx: Lan7801 Support for Fixed PHY
From: Raghuram Chary J @ 2018-04-27  6:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, unglinuxdriver, woojung.huh, raghuramchary.jallipalli
In-Reply-To: <20180427060024.23989-1-raghuramchary.jallipalli@microchip.com>

Adding Fixed PHY support to the lan78xx driver.

Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com>
---
v0->v1:
   * Remove driver version #define
   * Modify netdev_info to netdev_dbg
   * Move lan7801 specific to new routine and add switch case
   * Minor cleanup

v1->v2:
   * Removed fixedphy variable and used phy_is_pseudo_fixed_link() check.
v2->v3:
   * Revert driver version, debug statment changes for separate patch.
   * Modify lan7801 specific routine with return type struct phy_device.
v3->v4:
   * Modify lan7801 specific routine by removing phydev arg and get phydev.
---
 drivers/net/usb/Kconfig   |   1 +
 drivers/net/usb/lan78xx.c | 100 +++++++++++++++++++++++++++++++++-------------
 2 files changed, 73 insertions(+), 28 deletions(-)

diff --git a/drivers/net/usb/Kconfig b/drivers/net/usb/Kconfig
index f28bd74ac275..418b0904cecb 100644
--- a/drivers/net/usb/Kconfig
+++ b/drivers/net/usb/Kconfig
@@ -111,6 +111,7 @@ config USB_LAN78XX
 	select MII
 	select PHYLIB
 	select MICROCHIP_PHY
+	select FIXED_PHY
 	help
 	  This option adds support for Microchip LAN78XX based USB 2
 	  & USB 3 10/100/1000 Ethernet adapters.
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 0867f7275852..cb35cfa20ca0 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -36,7 +36,7 @@
 #include <linux/irq.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/microchipphy.h>
-#include <linux/phy.h>
+#include <linux/phy_fixed.h>
 #include "lan78xx.h"
 
 #define DRIVER_AUTHOR	"WOOJUNG HUH <woojung.huh@microchip.com>"
@@ -2003,52 +2003,91 @@ static int ksz9031rnx_fixup(struct phy_device *phydev)
 	return 1;
 }
 
-static int lan78xx_phy_init(struct lan78xx_net *dev)
+static struct phy_device *lan7801_phy_init(struct lan78xx_net *dev)
 {
+	u32 buf;
 	int ret;
-	u32 mii_adv;
+	struct fixed_phy_status fphy_status = {
+		.link = 1,
+		.speed = SPEED_1000,
+		.duplex = DUPLEX_FULL,
+	};
 	struct phy_device *phydev;
 
 	phydev = phy_find_first(dev->mdiobus);
 	if (!phydev) {
-		netdev_err(dev->net, "no PHY found\n");
-		return -EIO;
-	}
-
-	if ((dev->chipid == ID_REV_CHIP_ID_7800_) ||
-	    (dev->chipid == ID_REV_CHIP_ID_7850_)) {
-		phydev->is_internal = true;
-		dev->interface = PHY_INTERFACE_MODE_GMII;
-
-	} else if (dev->chipid == ID_REV_CHIP_ID_7801_) {
+		netdev_dbg(dev->net, "PHY Not Found!! Registering Fixed PHY\n");
+		phydev = fixed_phy_register(PHY_POLL, &fphy_status, -1,
+					    NULL);
+		if (IS_ERR(phydev)) {
+			netdev_err(dev->net, "No PHY/fixed_PHY found\n");
+			return NULL;
+		}
+		netdev_dbg(dev->net, "Registered FIXED PHY\n");
+		dev->interface = PHY_INTERFACE_MODE_RGMII;
+		ret = lan78xx_write_reg(dev, MAC_RGMII_ID,
+					MAC_RGMII_ID_TXC_DELAY_EN_);
+		ret = lan78xx_write_reg(dev, RGMII_TX_BYP_DLL, 0x3D00);
+		ret = lan78xx_read_reg(dev, HW_CFG, &buf);
+		buf |= HW_CFG_CLK125_EN_;
+		buf |= HW_CFG_REFCLK25_EN_;
+		ret = lan78xx_write_reg(dev, HW_CFG, buf);
+	} else {
 		if (!phydev->drv) {
 			netdev_err(dev->net, "no PHY driver found\n");
-			return -EIO;
+			return NULL;
 		}
-
 		dev->interface = PHY_INTERFACE_MODE_RGMII;
-
 		/* external PHY fixup for KSZ9031RNX */
 		ret = phy_register_fixup_for_uid(PHY_KSZ9031RNX, 0xfffffff0,
 						 ksz9031rnx_fixup);
 		if (ret < 0) {
 			netdev_err(dev->net, "fail to register fixup\n");
-			return ret;
+			return NULL;
 		}
 		/* external PHY fixup for LAN8835 */
 		ret = phy_register_fixup_for_uid(PHY_LAN8835, 0xfffffff0,
 						 lan8835_fixup);
 		if (ret < 0) {
 			netdev_err(dev->net, "fail to register fixup\n");
-			return ret;
+			return NULL;
 		}
 		/* add more external PHY fixup here if needed */
 
 		phydev->is_internal = false;
-	} else {
-		netdev_err(dev->net, "unknown ID found\n");
-		ret = -EIO;
-		goto error;
+	}
+	return phydev;
+}
+
+static int lan78xx_phy_init(struct lan78xx_net *dev)
+{
+	int ret;
+	u32 mii_adv;
+	struct phy_device *phydev;
+
+	switch (dev->chipid) {
+	case ID_REV_CHIP_ID_7801_:
+		phydev = lan7801_phy_init(dev);
+		if (!phydev) {
+			netdev_err(dev->net, "lan7801: PHY Init Failed");
+			return -EIO;
+		}
+		break;
+
+	case ID_REV_CHIP_ID_7800_:
+	case ID_REV_CHIP_ID_7850_:
+		phydev = phy_find_first(dev->mdiobus);
+		if (!phydev) {
+			netdev_err(dev->net, "no PHY found\n");
+			return -EIO;
+		}
+		phydev->is_internal = true;
+		dev->interface = PHY_INTERFACE_MODE_GMII;
+		break;
+
+	default:
+		netdev_err(dev->net, "Unknown CHIP ID found\n");
+		return -EIO;
 	}
 
 	/* if phyirq is not set, use polling mode in phylib */
@@ -2067,6 +2106,12 @@ static int lan78xx_phy_init(struct lan78xx_net *dev)
 	if (ret) {
 		netdev_err(dev->net, "can't attach PHY to %s\n",
 			   dev->mdiobus->id);
+		if (dev->chipid == ID_REV_CHIP_ID_7801_) {
+			phy_unregister_fixup_for_uid(PHY_KSZ9031RNX,
+						     0xfffffff0);
+			phy_unregister_fixup_for_uid(PHY_LAN8835,
+						     0xfffffff0);
+		}
 		return -EIO;
 	}
 
@@ -2084,12 +2129,6 @@ static int lan78xx_phy_init(struct lan78xx_net *dev)
 	dev->fc_autoneg = phydev->autoneg;
 
 	return 0;
-
-error:
-	phy_unregister_fixup_for_uid(PHY_KSZ9031RNX, 0xfffffff0);
-	phy_unregister_fixup_for_uid(PHY_LAN8835, 0xfffffff0);
-
-	return ret;
 }
 
 static int lan78xx_set_rx_max_frame_length(struct lan78xx_net *dev, int size)
@@ -3487,6 +3526,7 @@ static void lan78xx_disconnect(struct usb_interface *intf)
 	struct lan78xx_net		*dev;
 	struct usb_device		*udev;
 	struct net_device		*net;
+	struct phy_device		*phydev;
 
 	dev = usb_get_intfdata(intf);
 	usb_set_intfdata(intf, NULL);
@@ -3495,12 +3535,16 @@ static void lan78xx_disconnect(struct usb_interface *intf)
 
 	udev = interface_to_usbdev(intf);
 	net = dev->net;
+	phydev = net->phydev;
 
 	phy_unregister_fixup_for_uid(PHY_KSZ9031RNX, 0xfffffff0);
 	phy_unregister_fixup_for_uid(PHY_LAN8835, 0xfffffff0);
 
 	phy_disconnect(net->phydev);
 
+	if (phy_is_pseudo_fixed_link(phydev))
+		fixed_phy_unregister(phydev);
+
 	unregister_netdev(net);
 
 	cancel_delayed_work_sync(&dev->wq);
-- 
2.16.2

^ permalink raw reply related

* [PATCH v4 net-next 2/3] lan78xx: Remove DRIVER_VERSION for lan78xx driver
From: Raghuram Chary J @ 2018-04-27  6:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, unglinuxdriver, woojung.huh, raghuramchary.jallipalli
In-Reply-To: <20180427060024.23989-1-raghuramchary.jallipalli@microchip.com>

Remove driver version info from the lan78xx driver.

Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com>
---
 drivers/net/usb/lan78xx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index cb35cfa20ca0..5da5f0e3cd21 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -42,7 +42,6 @@
 #define DRIVER_AUTHOR	"WOOJUNG HUH <woojung.huh@microchip.com>"
 #define DRIVER_DESC	"LAN78XX USB 3.0 Gigabit Ethernet Devices"
 #define DRIVER_NAME	"lan78xx"
-#define DRIVER_VERSION	"1.0.6"
 
 #define TX_TIMEOUT_JIFFIES		(5 * HZ)
 #define THROTTLE_JIFFIES		(HZ / 8)
@@ -1477,7 +1476,6 @@ static void lan78xx_get_drvinfo(struct net_device *net,
 	struct lan78xx_net *dev = netdev_priv(net);
 
 	strncpy(info->driver, DRIVER_NAME, sizeof(info->driver));
-	strncpy(info->version, DRIVER_VERSION, sizeof(info->version));
 	usb_make_path(dev->udev, info->bus_info, sizeof(info->bus_info));
 }
 
-- 
2.16.2

^ permalink raw reply related

* [PATCH v4 net-next 3/3] lan78xx: Modify error messages
From: Raghuram Chary J @ 2018-04-27  6:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, unglinuxdriver, woojung.huh, raghuramchary.jallipalli
In-Reply-To: <20180427060024.23989-1-raghuramchary.jallipalli@microchip.com>

Modify the error messages when phy registration fails.

Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com>
---
 drivers/net/usb/lan78xx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 5da5f0e3cd21..525bb4bf1975 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -2040,14 +2040,14 @@ static struct phy_device *lan7801_phy_init(struct lan78xx_net *dev)
 		ret = phy_register_fixup_for_uid(PHY_KSZ9031RNX, 0xfffffff0,
 						 ksz9031rnx_fixup);
 		if (ret < 0) {
-			netdev_err(dev->net, "fail to register fixup\n");
+			netdev_err(dev->net, "fail to register fixup for PHY_KSZ9031RNX\n");
 			return NULL;
 		}
 		/* external PHY fixup for LAN8835 */
 		ret = phy_register_fixup_for_uid(PHY_LAN8835, 0xfffffff0,
 						 lan8835_fixup);
 		if (ret < 0) {
-			netdev_err(dev->net, "fail to register fixup\n");
+			netdev_err(dev->net, "fail to register fixup for PHY_LAN8835\n");
 			return NULL;
 		}
 		/* add more external PHY fixup here if needed */
-- 
2.16.2

^ permalink raw reply related

* Re: [RFC v3 0/5] virtio: support packed ring
From: Jason Wang @ 2018-04-27  6:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tiwei Bie, virtualization, linux-kernel, netdev, wexu, jfreimann
In-Reply-To: <20180427071725-mutt-send-email-mst@kernel.org>



On 2018年04月27日 12:18, Michael S. Tsirkin wrote:
> On Fri, Apr 27, 2018 at 11:56:05AM +0800, Jason Wang wrote:
>> On 2018年04月25日 13:15, Tiwei Bie wrote:
>>> Hello everyone,
>>>
>>> This RFC implements packed ring support in virtio driver.
>>>
>>> Some simple functional tests have been done with Jason's
>>> packed ring implementation in vhost:
>>>
>>> https://lkml.org/lkml/2018/4/23/12
>>>
>>> Both of ping and netperf worked as expected (with EVENT_IDX
>>> disabled). But there are below known issues:
>>>
>>> 1. Reloading the guest driver will break the Tx/Rx;
>> Will have a look at this issue.
>>
>>> 2. Zeroing the flags when detaching a used desc will
>>>      break the guest -> host path.
>> I still think zeroing flags is unnecessary or even a bug. At host, I track
>> last observed avail wrap counter and detect avail like (what is suggested in
>> the example code in the spec):
>>
>> static bool desc_is_avail(struct vhost_virtqueue *vq, __virtio16 flags)
>> {
>>         bool avail = flags & cpu_to_vhost16(vq, DESC_AVAIL);
>>
>>         return avail == vq->avail_wrap_counter;
>> }
>>
>> So zeroing wrap can not work with this obviously.
>>
>> Thanks
> I agree. I think what one should do is flip the available bit.
>

But is this flipping a must?

Thanks

^ permalink raw reply

* Re: [PATCH] mac80211_hwsim: fix a possible memory leak in hwsim_new_radio_nl()
From: YueHaibing @ 2018-04-27  6:28 UTC (permalink / raw)
  To: johannes-cdvu00un1VgdHxzADdlk8Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, kvalo-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	ben.hutchings-4yDnlxn2s6sWdaTGBSpHTA
In-Reply-To: <20180424030835.21776-1-yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>


cc  Ben Hutchings <ben.hutchings-4yDnlxn2s6sWdaTGBSpHTA@public.gmane.org>

On 2018/4/24 11:08, YueHaibing wrote:
> 'hwname' should be freed before leaving from the error handling cases,
> otherwise it will cause mem leak
> 
> Fixes: cb1a5bae5684 ("mac80211_hwsim: add permanent mac address option for new radios")
> Signed-off-by: YueHaibing <yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/net/wireless/mac80211_hwsim.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
> index 96d26cf..4a017a0 100644
> --- a/drivers/net/wireless/mac80211_hwsim.c
> +++ b/drivers/net/wireless/mac80211_hwsim.c
> @@ -3236,6 +3236,7 @@ static int hwsim_new_radio_nl(struct sk_buff *msg, struct genl_info *info)
>  			GENL_SET_ERR_MSG(info,"MAC is no valid source addr");
>  			NL_SET_BAD_ATTR(info->extack,
>  					info->attrs[HWSIM_ATTR_PERM_ADDR]);
> +			kfree(hwname);
>  			return -EINVAL;
>  		}
>  
> 

^ permalink raw reply

* Re: [PATCH 0/3] can: fix ndo_start_xmit()'s return type
From: Marc Kleine-Budde @ 2018-04-27  7:21 UTC (permalink / raw)
  To: Luc Van Oostenryck
  Cc: Wolfgang Grandegger, Maxime Ripard, Chen-Yu Tsai, Michal Simek,
	open list:CAN NETWORK DRIVERS, open list:NETWORKING DRIVERS,
	open list, moderated list:ARM/Allwinner sunXi SoC support
In-Reply-To: <20180426211339.30821-1-luc.vanoostenryck@gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 1004 bytes --]

On 04/26/2018 11:13 PM, Luc Van Oostenryck wrote:
> ndo_start_xmit() is defined as returing an 'netdev_tx_t'.
> However, several can drivers use 'int' as the return type
> of their start_xmit() method.
> This series contains the fix for all three of them.
> 
> Luc Van Oostenryck (3):
>   can: janz-ican3: fix ican3_xmit()'s return type
>   can: sun4i: fix sun4ican_start_xmit()'s return type
>   can: xilinx: fix xcan_start_xmit()'s return type
> 
>  drivers/net/can/janz-ican3.c | 2 +-
>  drivers/net/can/sun4i_can.c  | 2 +-
>  drivers/net/can/xilinx_can.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)

Applied all 3 to linux-can-next, added similar patch for the flexcan driver.

Tnx,
Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH] net: phy: marvell: clear wol event before setting it
From: Jisheng Zhang @ 2018-04-27  7:23 UTC (permalink / raw)
  To: Bhadram Varka
  Cc: Andrew Lunn, Florian Fainelli, David S. Miller, netdev,
	linux-kernel, Jingju Hou
In-Reply-To: <7b63b6f5-d93c-f2c8-c448-b81a8323c305@nvidia.com>

On Fri, 27 Apr 2018 09:22:34 +0530 Bhadram Varka wrote:

> Hi Andrew/Jisheng,
> 
> On 4/26/2018 6:10 PM, Andrew Lunn wrote:
> >> hmm, so you want a "stick" WOL feature, I dunno whether Linux kernel
> >> requires WOL should be "stick".  
> > I see two different cases:
> >
> > Suspend/resume: The WoL state in the kernel is probably kept across
> > such a cycle. If so, you would expect another suspend/resume to also
> > work, without needs to reconfigure it.  
> Trying this scenario (suspend/resume) from my side. In this case WoL 
> should be enabled in the HW. For Marvell PHY to generate WoL interrupt 
> we need to clear WoL status.
> Above mentioned change required to make this happen. Please share your 
> thoughts on this.

I'm fine with that patch. Maybe you could send out a patch?

Thanks

^ permalink raw reply

* Re: [PATCH] net: phy: marvell: clear wol event before setting it
From: Jisheng Zhang @ 2018-04-27  7:25 UTC (permalink / raw)
  To: Bhadram Varka
  Cc: Andrew Lunn, Florian Fainelli, David S. Miller, netdev,
	linux-kernel, Jingju Hou
In-Reply-To: <73e21c83-f78a-8b22-a421-f179ef6adef1@nvidia.com>

On Thu, 26 Apr 2018 12:39:59 +0530 Bhadram Varka wrote:

> >>>>>
> >>>>> diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
> >>>>> index c22e8e383247..b6abe1cbc84b 100644
> >>>>> --- a/drivers/net/phy/marvell.c
> >>>>> +++ b/drivers/net/phy/marvell.c
> >>>>> @@ -115,6 +115,9 @@
> >>>>>    /* WOL Event Interrupt Enable */
> >>>>>    #define MII_88E1318S_PHY_CSIER_WOL_EIE BIT(7)
> >>>>>    +/* Copper Specific Interrupt Status Register */
> >>>>> +#define MII_88E1318S_PHY_CSISR                0x13
> >>>>> +
> >>>>>    /* LED Timer Control Register */
> >>>>>    #define MII_88E1318S_PHY_LED_TCR            0x12
> >>>>>    #define MII_88E1318S_PHY_LED_TCR_FORCE_INT BIT(15)
> >>>>> @@ -1393,6 +1396,12 @@ static int m88e1318_set_wol(struct 
> >>>>> phy_device *phydev,
> >>>>>            if (err < 0)
> >>>>>                goto error;
> >>>>>    +        /* If WOL event happened once, the LED[2] interrupt pin
> >>>>> +         * will not be cleared unless reading the CSISR register.
> >>>>> +         * So clear the WOL event first before enabling it.
> >>>>> +         */
> >>>>> +        phy_read(phydev, MII_88E1318S_PHY_CSISR);
> >>>>> +  
> >>>> Hi Jisheng
> >>>>
> >>>> The problem with this is, you could be clearing a real interrupt, link
> >>>> down/up etc. If interrupts are in use, i think the normal interrupt
> >>>> handling will clear the WOL interrupt? So can you make this read
> >>>> conditional on !phy_interrupt_is_valid()?  
> >>> So this will clear WoL interrupt bit from Copper Interrupt status 
> >>> register.
> >>>
> >>> How about clearing WoL status (Page 17, register 17) for every WOL 
> >>> event ?
> >>>  
> >> This is already properly done by setting 
> >> MII_88E1318S_PHY_WOL_CTRL_CLEAR_WOL_STATUS
> >> in m88e1318_set_wol()  
> > This part of the code executes only when we enable WOL through ethtool 
> > (ethtool -s eth0 wol g)
> >
> > Lets say once WOL enabled through magic packet - HW generates WOL 
> > interrupt once magic packet received.
> > The problem that I see here is that for the next immediate magic 
> > packet I don't see WOL interrupt generated by the HW.
> > I need to explicitly clear WOL status for HW to generate WOL interrupt.  
> 
> With the below patch I see WOL event interrupt for every magic packet 
> that HW receives...
> 
> diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
> index ed8a67d..5d3d138 100644
> --- a/drivers/net/phy/marvell.c
> +++ b/drivers/net/phy/marvell.c
> @@ -55,6 +55,7 @@
> 
>   #define MII_M1011_IEVENT               0x13
>   #define MII_M1011_IEVENT_CLEAR         0x0000
> +#define MII_M1011_IEVENT_WOL_EVENT     BIT(7)
> 
>   #define MII_M1011_IMASK                        0x12
> - #define MII_M1011_IMASK_INIT           0x6400
> + #define MII_M1011_IMASK_INIT           0x6480
> 
> @@ -195,13 +196,40 @@ struct marvell_priv {
>          bool copper;
>   };
> 
> +static int marvell_clear_wol_status(struct phy_device *phydev)
> +{
> +       int err, temp, oldpage;
> +
> +       oldpage = phy_read(phydev, MII_MARVELL_PHY_PAGE);
> +       if (oldpage < 0)
> +               return oldpage;
> +
> +       err = phy_write(phydev, MII_MARVELL_PHY_PAGE,
> +                        MII_88E1318S_PHY_WOL_PAGE);
> +       if (err < 0)
> +               return err;
> +
> +       /*
> +        * Clear WOL status so that for next WOL event
> +        * interrupt will be generated by HW
> +        */
> +       temp = phy_read(phydev, MII_88E1318S_PHY_WOL_CTRL);
> +       temp |= MII_88E1318S_PHY_WOL_CTRL_CLEAR_WOL_STATUS;
> +       err = phy_write(phydev, MII_88E1318S_PHY_WOL_CTRL, temp);

is it better to reuse __phy_write()?

> +       if (err < 0)
> +               return err;
> +
> +
> +       phy_write(phydev, MII_MARVELL_PHY_PAGE, oldpage);
> +
> +       return 0;
> +}
> +
>   static int marvell_ack_interrupt(struct phy_device *phydev)
>   {
>          int err;
> 
>          /* Clear the interrupts by reading the reg */
>          err = phy_read(phydev, MII_M1011_IEVENT);
> -
>          if (err < 0)
>                  return err;
> 
> @@ -1454,12 +1482,18 @@ static int marvell_aneg_done(struct phy_device 
> *phydev)
> 
>   static int m88e1121_did_interrupt(struct phy_device *phydev)
>   {
> -       int imask;
> +       int imask, err;
> 
>          imask = phy_read(phydev, MII_M1011_IEVENT);
> 
> -       if (imask & MII_M1011_IMASK_INIT)
> +       if (imask & MII_M1011_IMASK_INIT) {
> +               if (imask & MII_M1011_IEVENT_WOL_EVENT) {
> +                       err = marvell_clear_wol_status(phydev);
> +                       if (err < 0)
> +                               return 0;
> +               }
>                  return 1;
> +       }
> 
>          return 0;
>   }

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] openvswitch: Support conntrack zone limit
From: Pravin Shelar @ 2018-04-27  7:28 UTC (permalink / raw)
  To: Yi-Hung Wei; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAG1aQh+KXPxB3yvmrdxgV6eO63CiwDFSB_RGcsMhGd4rUvSxSQ@mail.gmail.com>

On Wed, Apr 25, 2018 at 2:51 PM, Yi-Hung Wei <yihung.wei@gmail.com> wrote:
>>> +#if    IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
>>> +#define OVS_CT_LIMIT_UNLIMITED 0
>>> +#define OVS_CT_LIMIT_DEFAULT OVS_CT_LIMIT_UNLIMITED
>>> +#define CT_LIMIT_HASH_BUCKETS 512
>>> +
>> Can you use static key when the limit is not set.
>> This would avoid overhead in datapath when these limits are not used.
>>
> Thanks Parvin for the review.  I'm not sure about the 'static key'
> part, are you suggesting that say if we can have a flag to check if no
> one has ever set the ct_limit?   If the ct_limit feature is not used
> so far, then instead of look up the hash table for the zone limit, we
> just return the default unlimited value.  So that we can avoid the
> overhead of checking ct_limit.
>

Right, here documentaion of static keys:
https://www.kernel.org/doc/Documentation/static-keys.txt

>>> +struct ovs_ct_limit {
>>> +       /* Elements in ovs_ct_limit_info->limits hash table */
>>> +       struct hlist_node hlist_node;
>>> +       struct rcu_head rcu;
>>> +       u16 zone;
>>> +       u32 limit;
>>> +};
>>> +
>> ...
>
>
>>>  /* Lookup connection and confirm if unconfirmed. */
>>>  static int ovs_ct_commit(struct net *net, struct sw_flow_key *key,
>>>                          const struct ovs_conntrack_info *info,
>>> @@ -1054,6 +1176,13 @@ static int ovs_ct_commit(struct net *net, struct sw_flow_key *key,
>>>         if (!ct)
>>>                 return 0;
>>>
>>> +#if    IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
>>> +       err = ovs_ct_check_limit(net, info,
>>> +                                &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
>>> +       if (err)
>>> +               return err;
>>> +#endif
>>> +
>>
>> This could be checked during flow install time, so that only permitted
>> flows would have 'ct commit' action, we can avoid per packet cost
>> checking the limit.
> It seems to me that it would be hard to check the # of connections of
> a flow in the flow installation stage.  For example, if we have a
> datapath flow that performs “ct commit” action on all UDP traffic from
> in_port 1, then it could be various combinations of 5-tuple that form
> various # of connections.  Therefore, it would be hard to do the
> admission control there.
>

Ok, I see the issue.
I think we could still avoid the lookup every time by checking the
limits for unconfirmed connections (ref: nf_ct_is_confirmed()).
So once connection confirmed and is under limit  we should allow all
packets for given connection.
This way we can avoid connection disruption when we are reaching near
connection limit.

>
>> returning error code form ovs_ct_commit() is lost in datapath and it
>> would be hard to debug packet lost in case of the limit is reached. So
>> another advantage of checking the limit in flow install be better
>> traceability. datapath would return error to usespace and it can log
>> the error code.
> Thanks for pointing out the error code from ovs_ct_commit() will be
> lost in the datapath.  In this case, shall we consider to report the
> packet drop by some rate_limit logging such as net_warn_ratelimited()
> or net_info_ratelimited()?
>

ok, that would be fine.

^ permalink raw reply

* [PATCH] ptp_pch: use helpers function for converting between ns and timespec
From: YueHaibing @ 2018-04-27  7:36 UTC (permalink / raw)
  To: richardcochran, davem; +Cc: netdev, linux-kernel, YueHaibing

use ns_to_timespec64() and timespec64_to_ns() instead of open coding

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/ptp/ptp_pch.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/ptp/ptp_pch.c b/drivers/ptp/ptp_pch.c
index b328517..78ccf93 100644
--- a/drivers/ptp/ptp_pch.c
+++ b/drivers/ptp/ptp_pch.c
@@ -452,7 +452,6 @@ static int ptp_pch_adjtime(struct ptp_clock_info *ptp, s64 delta)
 static int ptp_pch_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
 {
 	u64 ns;
-	u32 remainder;
 	unsigned long flags;
 	struct pch_dev *pch_dev = container_of(ptp, struct pch_dev, caps);
 	struct pch_ts_regs __iomem *regs = pch_dev->regs;
@@ -461,8 +460,7 @@ static int ptp_pch_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
 	ns = pch_systime_read(regs);
 	spin_unlock_irqrestore(&pch_dev->register_lock, flags);
 
-	ts->tv_sec = div_u64_rem(ns, 1000000000, &remainder);
-	ts->tv_nsec = remainder;
+	*ts = ns_to_timespec64(ns);
 	return 0;
 }
 
@@ -474,8 +472,7 @@ static int ptp_pch_settime(struct ptp_clock_info *ptp,
 	struct pch_dev *pch_dev = container_of(ptp, struct pch_dev, caps);
 	struct pch_ts_regs __iomem *regs = pch_dev->regs;
 
-	ns = ts->tv_sec * 1000000000ULL;
-	ns += ts->tv_nsec;
+	ns = timespec64_to_ns(ts);
 
 	spin_lock_irqsave(&pch_dev->register_lock, flags);
 	pch_systime_write(regs, ns);
-- 
2.7.0

^ permalink raw reply related

* Re: [PATCH net-next v2 5/7] MIPS: mscc: Add switch to ocelot
From: Alexandre Belloni @ 2018-04-27  7:40 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: David S . Miller, Allan Nielsen, razvan.stefanescu, po.liu,
	Thomas Petazzoni, Florian Fainelli, netdev, devicetree,
	linux-kernel, linux-mips, James Hogan
In-Reply-To: <20180426205113.GD23481@lunn.ch>

On 26/04/2018 22:51:13+0200, Andrew Lunn wrote:
> > +
> > +		mdio0: mdio@107009c {
> > +			#address-cells = <1>;
> > +			#size-cells = <0>;
> > +			compatible = "mscc,ocelot-miim";
> > +			reg = <0x107009c 0x36>, <0x10700f0 0x8>;
> > +			interrupts = <14>;
> > +			status = "disabled";
> > +
> > +			phy0: ethernet-phy@0 {
> > +				reg = <0>;
> > +			};
> > +			phy1: ethernet-phy@1 {
> > +				reg = <1>;
> > +			};
> > +			phy2: ethernet-phy@2 {
> > +				reg = <2>;
> > +			};
> > +			phy3: ethernet-phy@3 {
> > +				reg = <3>;
> > +			};
> 
> Hi Alexandre
> 
> These are internal PHYs? Is there an option to use external PHYs for
> the ports which have internal PHYs?
> 
> I'm just wondering if they should be linked together by default. Or a
> comment added to the commit message about why they are not linked
> together here.
> 

They are dual media ports so they are not necessarily using the
integrated PHY but can use SerDEs1G lanes.

I'll add that to the commit message.

-- 
Alexandre Belloni, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* Re: [PATCH 3/3] can: xilinx: fix xcan_start_xmit()'s return type
From: Michal Simek @ 2018-04-27  7:49 UTC (permalink / raw)
  To: Luc Van Oostenryck, Marc Kleine-Budde
  Cc: Wolfgang Grandegger, Maxime Ripard, Chen-Yu Tsai, Michal Simek,
	open list:CAN NETWORK DRIVERS, open list:NETWORKING DRIVERS,
	open list, moderated list:ARM/Allwinner sunXi SoC support
In-Reply-To: <20180426211339.30821-4-luc.vanoostenryck@gmail.com>

On 26.4.2018 23:13, Luc Van Oostenryck wrote:
> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
> which is a typedef for an enum type, but the implementation in this
> driver returns an 'int'.
> 
> Fix this by returning 'netdev_tx_t' in this driver too.
> 
> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
> ---
>  drivers/net/can/xilinx_can.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
> index 89aec07c2..a19648606 100644
> --- a/drivers/net/can/xilinx_can.c
> +++ b/drivers/net/can/xilinx_can.c
> @@ -386,7 +386,7 @@ static int xcan_do_set_mode(struct net_device *ndev, enum can_mode mode)
>   *
>   * Return: 0 on success and failure value on error
>   */
> -static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> +static netdev_tx_t xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>  {
>  	struct xcan_priv *priv = netdev_priv(ndev);
>  	struct net_device_stats *stats = &ndev->stats;
> 

It was applied already but there should be also kernel-doc update too to
use enum values instead of 0.

M

^ permalink raw reply

* Re: [PATCH 3/3] can: xilinx: fix xcan_start_xmit()'s return type
From: Marc Kleine-Budde @ 2018-04-27  7:55 UTC (permalink / raw)
  To: Michal Simek, Luc Van Oostenryck
  Cc: Wolfgang Grandegger, Maxime Ripard, Chen-Yu Tsai,
	open list:CAN NETWORK DRIVERS, open list:NETWORKING DRIVERS,
	open list, moderated list:ARM/Allwinner sunXi SoC support
In-Reply-To: <79ea4ab6-a12a-21ad-b586-6a1dad176c0a@xilinx.com>


[-- Attachment #1.1: Type: text/plain, Size: 2245 bytes --]

On 04/27/2018 09:49 AM, Michal Simek wrote:
> On 26.4.2018 23:13, Luc Van Oostenryck wrote:
>> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
>> which is a typedef for an enum type, but the implementation in this
>> driver returns an 'int'.
>>
>> Fix this by returning 'netdev_tx_t' in this driver too.
>>
>> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
>> ---
>>  drivers/net/can/xilinx_can.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
>> index 89aec07c2..a19648606 100644
>> --- a/drivers/net/can/xilinx_can.c
>> +++ b/drivers/net/can/xilinx_can.c
>> @@ -386,7 +386,7 @@ static int xcan_do_set_mode(struct net_device *ndev, enum can_mode mode)
>>   *
>>   * Return: 0 on success and failure value on error
>>   */
>> -static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>> +static netdev_tx_t xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>>  {
>>  	struct xcan_priv *priv = netdev_priv(ndev);
>>  	struct net_device_stats *stats = &ndev->stats;
>>
> 
> It was applied already but there should be also kernel-doc update too to
> use enum values instead of 0.

Like this:

> diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
> index f07ce4945356..d0ad1473f689 100644
> --- a/drivers/net/can/xilinx_can.c
> +++ b/drivers/net/can/xilinx_can.c
> @@ -398,7 +398,7 @@ static int xcan_do_set_mode(struct net_device *ndev, enum can_mode mode)
>   * function uses the next available free txbuff and populates their fields to
>   * start the transmission.
>   *
> - * Return: 0 on success and failure value on error
> + * Return: NETDEV_TX_OK on success and NETDEV_TX_BUSY when the tx queue is full
>   */
>  static netdev_tx_t xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>  {

I can squash in that change.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH 3/3] can: xilinx: fix xcan_start_xmit()'s return type
From: Michal Simek @ 2018-04-27  8:03 UTC (permalink / raw)
  To: Marc Kleine-Budde, Michal Simek, Luc Van Oostenryck
  Cc: Wolfgang Grandegger, Maxime Ripard, Chen-Yu Tsai,
	open list:CAN NETWORK DRIVERS, open list:NETWORKING DRIVERS,
	open list, moderated list:ARM/Allwinner sunXi SoC support
In-Reply-To: <02e9be66-1345-2af7-c343-9f94d71e1d10@pengutronix.de>

On 27.4.2018 09:55, Marc Kleine-Budde wrote:
> On 04/27/2018 09:49 AM, Michal Simek wrote:
>> On 26.4.2018 23:13, Luc Van Oostenryck wrote:
>>> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
>>> which is a typedef for an enum type, but the implementation in this
>>> driver returns an 'int'.
>>>
>>> Fix this by returning 'netdev_tx_t' in this driver too.
>>>
>>> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
>>> ---
>>>  drivers/net/can/xilinx_can.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
>>> index 89aec07c2..a19648606 100644
>>> --- a/drivers/net/can/xilinx_can.c
>>> +++ b/drivers/net/can/xilinx_can.c
>>> @@ -386,7 +386,7 @@ static int xcan_do_set_mode(struct net_device *ndev, enum can_mode mode)
>>>   *
>>>   * Return: 0 on success and failure value on error
>>>   */
>>> -static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>>> +static netdev_tx_t xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>>>  {
>>>  	struct xcan_priv *priv = netdev_priv(ndev);
>>>  	struct net_device_stats *stats = &ndev->stats;
>>>
>>
>> It was applied already but there should be also kernel-doc update too to
>> use enum values instead of 0.
> 
> Like this:
> 
>> diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
>> index f07ce4945356..d0ad1473f689 100644
>> --- a/drivers/net/can/xilinx_can.c
>> +++ b/drivers/net/can/xilinx_can.c
>> @@ -398,7 +398,7 @@ static int xcan_do_set_mode(struct net_device *ndev, enum can_mode mode)
>>   * function uses the next available free txbuff and populates their fields to
>>   * start the transmission.
>>   *
>> - * Return: 0 on success and failure value on error
>> + * Return: NETDEV_TX_OK on success and NETDEV_TX_BUSY when the tx queue is full
>>   */
>>  static netdev_tx_t xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>>  {
> 
> I can squash in that change.

looks good to me.
Acked-by: Michal Simek <michal.simek@xilinx.com>

Thanks,
Michal

^ permalink raw reply

* [PATCH] drivers: net: replace UINT64_MAX with U64_MAX
From: Jisheng Zhang @ 2018-04-27  8:18 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller
  Cc: netdev, linux-kernel

U64_MAX is well defined now while the UINT64_MAX is not, so we fall
back to drivers' own definition as below:

	#ifndef UINT64_MAX
	#define UINT64_MAX             (u64)(~((u64)0))
	#endif

I believe this is in one phy driver then copied and pasted to other phy
drivers.

Replace the UINT64_MAX with U64_MAX to clean up the source code.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 6 +++---
 drivers/net/dsa/mv88e6xxx/chip.h | 4 ----
 drivers/net/phy/bcm-phy-lib.c    | 6 +-----
 drivers/net/phy/marvell.c        | 5 +----
 drivers/net/phy/micrel.c         | 5 +----
 drivers/net/phy/smsc.c           | 5 +----
 6 files changed, 7 insertions(+), 24 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 3d2091099f7f..1a2dd340853f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -665,13 +665,13 @@ static uint64_t _mv88e6xxx_get_ethtool_stat(struct mv88e6xxx_chip *chip,
 	case STATS_TYPE_PORT:
 		err = mv88e6xxx_port_read(chip, port, s->reg, &reg);
 		if (err)
-			return UINT64_MAX;
+			return U64_MAX;
 
 		low = reg;
 		if (s->size == 4) {
 			err = mv88e6xxx_port_read(chip, port, s->reg + 1, &reg);
 			if (err)
-				return UINT64_MAX;
+				return U64_MAX;
 			high = reg;
 		}
 		break;
@@ -685,7 +685,7 @@ static uint64_t _mv88e6xxx_get_ethtool_stat(struct mv88e6xxx_chip *chip,
 			mv88e6xxx_g1_stats_read(chip, reg + 1, &high);
 		break;
 	default:
-		return UINT64_MAX;
+		return U64_MAX;
 	}
 	value = (((u64)high) << 16) | low;
 	return value;
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index 80490f66bc06..4163c8099d0b 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -21,10 +21,6 @@
 #include <linux/timecounter.h>
 #include <net/dsa.h>
 
-#ifndef UINT64_MAX
-#define UINT64_MAX		(u64)(~((u64)0))
-#endif
-
 #define SMI_CMD			0x00
 #define SMI_CMD_BUSY		BIT(15)
 #define SMI_CMD_CLAUSE_22	BIT(12)
diff --git a/drivers/net/phy/bcm-phy-lib.c b/drivers/net/phy/bcm-phy-lib.c
index 5ad130c3da43..0876aec7328c 100644
--- a/drivers/net/phy/bcm-phy-lib.c
+++ b/drivers/net/phy/bcm-phy-lib.c
@@ -346,10 +346,6 @@ void bcm_phy_get_strings(struct phy_device *phydev, u8 *data)
 }
 EXPORT_SYMBOL_GPL(bcm_phy_get_strings);
 
-#ifndef UINT64_MAX
-#define UINT64_MAX              (u64)(~((u64)0))
-#endif
-
 /* Caller is supposed to provide appropriate storage for the library code to
  * access the shadow copy
  */
@@ -362,7 +358,7 @@ static u64 bcm_phy_get_stat(struct phy_device *phydev, u64 *shadow,
 
 	val = phy_read(phydev, stat.reg);
 	if (val < 0) {
-		ret = UINT64_MAX;
+		ret = U64_MAX;
 	} else {
 		val >>= stat.shift;
 		val = val & ((1 << stat.bits) - 1);
diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 25e2a099b71c..b8f57e9b9379 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -1482,9 +1482,6 @@ static void marvell_get_strings(struct phy_device *phydev, u8 *data)
 	}
 }
 
-#ifndef UINT64_MAX
-#define UINT64_MAX		(u64)(~((u64)0))
-#endif
 static u64 marvell_get_stat(struct phy_device *phydev, int i)
 {
 	struct marvell_hw_stat stat = marvell_hw_stats[i];
@@ -1494,7 +1491,7 @@ static u64 marvell_get_stat(struct phy_device *phydev, int i)
 
 	val = phy_read_paged(phydev, stat.page, stat.reg);
 	if (val < 0) {
-		ret = UINT64_MAX;
+		ret = U64_MAX;
 	} else {
 		val = val & ((1 << stat.bits) - 1);
 		priv->stats[i] += val;
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index f41b224a9cdb..de31c5170a5b 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -650,9 +650,6 @@ static void kszphy_get_strings(struct phy_device *phydev, u8 *data)
 	}
 }
 
-#ifndef UINT64_MAX
-#define UINT64_MAX              (u64)(~((u64)0))
-#endif
 static u64 kszphy_get_stat(struct phy_device *phydev, int i)
 {
 	struct kszphy_hw_stat stat = kszphy_hw_stats[i];
@@ -662,7 +659,7 @@ static u64 kszphy_get_stat(struct phy_device *phydev, int i)
 
 	val = phy_read(phydev, stat.reg);
 	if (val < 0) {
-		ret = UINT64_MAX;
+		ret = U64_MAX;
 	} else {
 		val = val & ((1 << stat.bits) - 1);
 		priv->stats[i] += val;
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
index be399d645224..c328208388da 100644
--- a/drivers/net/phy/smsc.c
+++ b/drivers/net/phy/smsc.c
@@ -168,9 +168,6 @@ static void smsc_get_strings(struct phy_device *phydev, u8 *data)
 	}
 }
 
-#ifndef UINT64_MAX
-#define UINT64_MAX              (u64)(~((u64)0))
-#endif
 static u64 smsc_get_stat(struct phy_device *phydev, int i)
 {
 	struct smsc_hw_stat stat = smsc_hw_stats[i];
@@ -179,7 +176,7 @@ static u64 smsc_get_stat(struct phy_device *phydev, int i)
 
 	val = phy_read(phydev, stat.reg);
 	if (val < 0)
-		ret = UINT64_MAX;
+		ret = U64_MAX;
 	else
 		ret = val;
 
-- 
2.17.0

^ permalink raw reply related

* Re: [dm-devel] [PATCH v5] fault-injection: introduce kvmalloc fallback options
From: Michal Hocko @ 2018-04-27  8:25 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Michael S. Tsirkin, John Stoffel, James Bottomley, Michal,
	eric.dumazet, netdev, jasowang, Randy Dunlap, linux-kernel,
	Matthew Wilcox, linux-mm, dm-devel, Vlastimil Babka, Andrew,
	David Rientjes, Morton, virtualization, David Miller, edumazet
In-Reply-To: <alpine.LRH.2.02.1804261829190.30599@file01.intranet.prod.int.rdu2.redhat.com>

On Thu 26-04-18 18:52:05, Mikulas Patocka wrote:
> 
> 
> On Fri, 27 Apr 2018, Michael S. Tsirkin wrote:
[...]
> >    But assuming it's important to control this kind of
> >    fault injection to be controlled from
> >    a dedicated menuconfig option, why not the rest of
> >    faults?
> 
> The injected faults cause damage to the user, so there's no point to 
> enable them by default. vmalloc fallback should not cause any damage 
> (assuming that the code is correctly written).

But you want to find those bugs which would BUG_ON easier, so there is a
risk of harm IIUC and this is not much different than other fault
injecting paths.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply

* Re: [PATCH net] bridge: check iface upper dev when setting master via ioctl
From: Nikolay Aleksandrov @ 2018-04-27  8:33 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: netdev, Dmitry Vyukov, syzbot, David Miller
In-Reply-To: <20180427013102.GJ20683@leo.usersys.redhat.com>

On 27/04/18 04:31, Hangbin Liu wrote:
> Hi Nikolay,
> 
> Thanks for the comments.
> On Thu, Apr 26, 2018 at 05:22:46PM +0300, Nikolay Aleksandrov wrote:
>>> Not all upper devs are masters. This can break some setups.
> 
> Ah, like vlan device.. So how about
> 
> +	if (netdev_master_upper_dev_get(dev))
>   		return -EBUSY;

That should be fine, yes.

> 
>>>
>>>
>>
>> Also it's not really a bug, the device begins to get initialized but it
>> will get removed at netdev_master_upper_dev_link() anyway if there's
>> already a master. Why would it be better ?
> 
>> It's clearly wrong to try and enslave a device that already has a master
>> via ioctl, rtnetlink already deals with that and the old ioctl interface
>> will get an error, yes it will initialize some structs but they'll get
>> freed later. This is common practice, check the bonding for example.
> 
> Bonding use netdev_is_rx_handler_busy(slave_dev) to check if the slave
> already has a master, which is another solution.

Some masters don't use rx_handlers and the bonding fails at linking them
as a master which is still fine, it cleans up after the error like the bridge.

>>
>> If anything do the check in the ioctl interface (add_del_if) only and
>> maybe target net-next, there's really no bug fix here. IMO it's not
> 
> What if someone do like
> 
> while true; do brctl addif br0 bond_slave &; done
> 
> I know this is stupid and almost no one will do that in real world.
> But syzbot run some similar test and get warn from kobject_add_internal()
> with -ENOMEM. That's why I think we should fix it before allocate any
> resource.
> 
> What do you think?

The bridge code is only a symptom of what happened, that warn was placed to
warn people against doing stupid things - it was literally in the commit message
of some kobject patch. As long as the resources involved are cleaned up and it's
returned to the bridge to cleanup after itself, it should be fine.
  
You can add the check if you feel like it, I don't have an
objection against failing earlier. My main concern was the netdev_has_any_upper
usage which can break some setups.

Cheers,
  Nik

> 
> [1] https://syzkaller.appspot.com/bug?id=3e0339080acd6a2a350a900bc6533b03f5498490
> 
> Thanks
> Hangbin
> 

^ permalink raw reply

* Re: [PATCH net-next 1/2 v2] netns: restrict uevents
From: Christian Brauner @ 2018-04-27  8:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, netdev, linux-kernel, avagin, ktkhai, serge, gregkh
In-Reply-To: <87vacdbi58.fsf@xmission.com>

On Thu, Apr 26, 2018 at 07:35:47PM -0500, Eric W. Biederman wrote:
> Christian Brauner <christian.brauner@canonical.com> writes:
> 
> > On Thu, Apr 26, 2018 at 12:10:30PM -0500, Eric W. Biederman wrote:
> >> Christian Brauner <christian.brauner@canonical.com> writes:
> >> 
> >> > On Thu, Apr 26, 2018 at 11:47:19AM -0500, Eric W. Biederman wrote:
> >> >> Christian Brauner <christian.brauner@canonical.com> writes:
> >> >> 
> >> >> > On Tue, Apr 24, 2018 at 06:00:35PM -0500, Eric W. Biederman wrote:
> >> >> >> Christian Brauner <christian.brauner@canonical.com> writes:
> >> >> >> 
> >> >> >> > On Wed, Apr 25, 2018, 00:41 Eric W. Biederman <ebiederm@xmission.com> wrote:
> >> >> >> >
> >> >> >> >  Bah. This code is obviously correct and probably wrong.
> >> >> >> >
> >> >> >> >  How do we deliver uevents for network devices that are outside of the
> >> >> >> >  initial user namespace? The kernel still needs to deliver those.
> >> >> >> >
> >> >> >> >  The logic to figure out which network namespace a device needs to be
> >> >> >> >  delivered to is is present in kobj_bcast_filter. That logic will almost
> >> >> >> >  certainly need to be turned inside out. Sign not as easy as I would
> >> >> >> >  have hoped.
> >> >> >> >
> >> >> >> > My first patch that we discussed put additional filtering logic into kobj_bcast_filter for that very reason. But I can move that logic
> >> >> >> > out and come up with a new patch.
> >> >> >> 
> >> >> >> I may have mis-understood.
> >> >> >> 
> >> >> >> I heard and am still hearing additional filtering to reduce the places
> >> >> >> the packet is delievered.
> >> >> >> 
> >> >> >> I am saying something needs to change to increase the number of places
> >> >> >> the packet is delivered.
> >> >> >> 
> >> >> >> For the special class of devices that kobj_bcast_filter would apply to
> >> >> >> those need to be delivered to netowrk namespaces  that are no longer on
> >> >> >> uevent_sock_list.
> >> >> >> 
> >> >> >> So the code fundamentally needs to split into two paths.  Ordinary
> >> >> >> devices that use uevent_sock_list.  Network devices that are just
> >> >> >> delivered in their own network namespace.
> >> >> >> 
> >> >> >> netlink_broadcast_filtered gets to go away completely.
> >> >> >
> >> >> > The split *might* make sense but I think you're wrong about removing the
> >> >> > kobj_bcast_filter. The current filter doesn't operate on the uevent
> >> >> > socket in uevent_sock_list itself it rather operates on the sockets in
> >> >> > mc_list. And if socket in mc_list can have a different network namespace
> >> >> > then the uevent_socket itself then your way won't work. That's why my
> >> >> > original patch added additional filtering in there. The way I see it we
> >> >> > need something like:
> >> >> 
> >> >> We already filter the sockets in the mc_list by network namespace.
> >> >
> >> > Oh really? That's good to know. I haven't found where in the code this
> >> > actually happens. I thought that when netlink_bind() is called anyone
> >> > could register themselves in mc_list.
> >> 
> >> The code in af_netlink.c does:
> >> > static void do_one_broadcast(struct sock *sk,
> >> > 				    struct netlink_broadcast_data *p)
> >> > {
> >> > 	struct netlink_sock *nlk = nlk_sk(sk);
> >> > 	int val;
> >> > 
> >> > 	if (p->exclude_sk == sk)
> >> > 		return;
> >> > 
> >> > 	if (nlk->portid == p->portid || p->group - 1 >= nlk->ngroups ||
> >> > 	    !test_bit(p->group - 1, nlk->groups))
> >> > 		return;
> >> > 
> >> > 	if (!net_eq(sock_net(sk), p->net)) {
> >>             ^^^^^^^^^^^^ Here
> >> > 		if (!(nlk->flags & NETLINK_F_LISTEN_ALL_NSID))
> >> > 			return;
> >>                 ^^^^^^^^^^^ Here
> >> > 
> >> > 		if (!peernet_has_id(sock_net(sk), p->net))
> >> > 			return;
> >> > 
> >> > 		if (!file_ns_capable(sk->sk_socket->file, p->net->user_ns,
> >> > 				     CAP_NET_BROADCAST))
> >> > 			return;
> >> > 	}
> >> 
> >> Which if you are not a magic NETLINK_F_LISTEN_ALL_NSID socket filters
> >> you out if you are the wrong network namespace.
> >> 
> >> 
> >> >> When a packet is transmitted with netlink_broadcast it is only
> >> >> transmitted within a single network namespace.
> >> >> 
> >> >> Even in the case of a NETLINK_F_LISTEN_ALL_NSID socket the skb is tagged
> >> >> with it's source network namespace so no confusion will result, and the
> >> >> permission checks have been done to make it safe. So you can safely
> >> >> ignore that case.  Please ignore that case.  It only needs to be
> >> >> considered if refactoring af_netlink.c
> >> >> 
> >> >> When I added netlink_broadcast_filtered I imagined that we would need
> >> >> code that worked across network namespaces that worked for different
> >> >> namespaces.   So it looked like we would need the level of granularity
> >> >> that you can get with netlink_broadcast_filtered.  It turns out we don't
> >> >> and that it was a case of over design.  As the only split we care about
> >> >> is per network namespace there is no need for
> >> >> netlink_broadcast_filtered.
> >> >> 
> >> >> > init_user_ns_broadcast_filtered(uevent_sock_list, kobj_bcast_filter);
> >> >> > user_ns_broadcast_filtered(uevent_sock_list,kobj_bcast_filter);
> >> >> >
> >> >> > The question that remains is whether we can rely on the network
> >> >> > namespace information we can gather from the kobject_ns_type_operations
> >> >> > to decide where we want to broadcast that event to. So something
> >> >> > *like*:
> >> >> 
> >> >> We can.  We already do.  That is what kobj_bcast_filter implements.
> >> >> 
> >> >> > 	ops = kobj_ns_ops(kobj);
> >> >> > 	if (!ops && kobj->kset) {
> >> >> > 		struct kobject *ksobj = &kobj->kset->kobj;
> >> >> > 		if (ksobj->parent != NULL)
> >> >> > 			ops = kobj_ns_ops(ksobj->parent);
> >> >> > 	}
> >> >> >
> >> >> > 	if (ops && ops->netlink_ns && kobj->ktype->namespace)
> >> >> > 		if (ops->type == KOBJ_NS_TYPE_NET)
> >> >> > 			net = kobj->ktype->namespace(kobj);
> >> >> 
> >> >> Please note the only entry in the enumeration in the kobj_ns_type
> >> >> enumeration other than KOBJ_NS_TYPE_NONE is KOBJ_NS_TYPE_NET.  So the
> >> >> check for ops->type in this case is redundant.
> >> >
> >> > Yes, I know the reason for doing it explicitly is to block the case
> >> > where kobjects get tagged with other namespaces. So we'd need to be
> >> > vigilant should that ever happen but fine.
> >> 
> >> It is fine to keep the check.
> >> 
> >> I was intending to point out that it is much more likely that we remove
> >> the enumeration and remove some of the extra abstraction, than another
> >> namespace is implemented there.
> >> 
> >> >> That is something else that could be simplifed.  At the time it was the
> >> >> necessary to get the sysfs changes merged.
> >> >> 
> >> >> > 	if (!net || net->user_ns == &init_user_ns)
> >> >> > 		ret = init_user_ns_broadcast(env, action_string, devpath);
> >> >> > 	else
> >> >> > 		ret = user_ns_broadcast(net->uevent_sock->sk, env,
> >> >> > 					action_string, devpath);
> >> >> 
> >> >> Almost.
> >> >> 
> >> >> 	if (!net)
> >> >>         	kobject_uevent_net_broadcast(kobj, env, action_string,
> >> >>         					dev_path);
> >> >> 	else
> >> >>         	netlink_broadcast(net->uevent_sock->sk, skb, 0, 1, GFP_KERNEL);
> >> >> 
> >> >> 
> >> >> I am handwaving to get the skb in the netlink_broadcast case but that
> >> >> should be enough for you to see what I am thinking.
> >> >
> >> > I have added a helper alloc_uevent_skb() that can be used in both cases.
> >> >
> >> > static struct sk_buff *alloc_uevent_skb(struct kobj_uevent_env *env,
> >> > 					const char *action_string,
> >> > 					const char *devpath)
> >> > {
> >> > 	struct sk_buff *skb = NULL;
> >> > 	char *scratch;
> >> > 	size_t len;
> >> >
> >> > 	/* allocate message with maximum possible size */
> >> > 	len = strlen(action_string) + strlen(devpath) + 2;
> >> > 	skb = alloc_skb(len + env->buflen, GFP_KERNEL);
> >> > 	if (!skb)
> >> > 		return NULL;
> >> >
> >> > 	/* add header */
> >> > 	scratch = skb_put(skb, len);
> >> > 	sprintf(scratch, "%s@%s", action_string, devpath);
> >> >
> >> > 	skb_put_data(skb, env->buf, env->buflen);
> >> >
> >> > 	NETLINK_CB(skb).dst_group = 1;
> >> >
> >> > 	return skb;
> >> > }
> >> >
> >> >> 
> >> >> My only concern with the above is that we almost certainly need to fix
> >> >> the credentials on the skb so that userspace does not drop the packet
> >
> > I guess we simply want:
> > 	if (user_ns != &init_user_ns) {
> > 		NETLINK_CB(skb).creds.uid = (kuid_t)0;
> > 		NETLINK_CB(skb).creds.gid = kgid_t)0;
> > 	}
> 
> I believe the above is what we already have.
> 
> > instead of the more complicated and - imho wrong:
> >
> > 	if (user_ns != &init_user_ns) {
> > 		/* fix credentials for udev running in user namespace */
> > 		kuid_t uid = NETLINK_CB(skb).creds.uid;
> > 		kgid_t gid = NETLINK_CB(skb).creds.gid;
> > 		NETLINK_CB(skb).creds.uid = from_kuid_munged(user_ns, uid);
> > 		NETLINK_CB(skb).creds.gid = from_kgid_munged(user_ns, gid);
> > 	}
> 
> The above is most definitely wrong as we store kuids and kgids in
> "NETLINK_CB(skb).creds".
> 
> I am pretty certain what we want is:
> 	kuid_t root_uid = make_kuid(net->user_ns, 0);
>         kgid_g root_gid = make_kgid(net->user_ns, 0);

Thanks! I looked at user_namespace.c which contained map_id_down() which
is the function that I wanted and remembered from a prior patchset of
mine but they weren't exported. I didn't spot make_k{g,u}id() which are
wrapping those. These are the droids^H^H^H^H^H^Hfunctions I was looking
for!

>         if (!uid_valid(root_uid))
>         	root_uid = GLOBAL_ROOT_UID;
> 	if (!gid_valid(root_gid))
>         	root_gid = GLOBAL_ROOT_GID;
> 	NETLINK_CB(skb).creds.uid = root_uid;
>         NETLINK_CB(skb).creds.gid = root_gid;
> 
> Let's be careful and only fix this for the networking uevents please.
> We want the other onces to just go away.

This is already handled by the

if (!net)
       handle_untagged_uevents()
else
       handle_taggged_uevents()

The else branch will only every contain network devices as to my
knowledge no other kernel devices are currently tagged.

Thanks!
Christian

> 
> The networking uevents we have to fix or they will be gone completely.
> 
> Eric

^ permalink raw reply

* Re: [PATCH v2 net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive
From: kbuild test robot @ 2018-04-27  8:44 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: kbuild-all, David S . Miller, netdev, Andy Lutomirski,
	linux-kernel, linux-mm, Eric Dumazet, Eric Dumazet,
	Soheil Hassas Yeganeh
In-Reply-To: <20180425214307.159264-2-edumazet@google.com>

[-- Attachment #1: Type: text/plain, Size: 895 bytes --]

Hi Eric,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Eric-Dumazet/tcp-add-TCP_ZEROCOPY_RECEIVE-support-for-zerocopy-receive/20180427-122234
config: sh-rsk7269_defconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sh 

All errors (new ones prefixed by >>):

   net/ipv4/tcp.o: In function `tcp_setsockopt':
>> tcp.c:(.text+0x3f80): undefined reference to `zap_page_range'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 10487 bytes --]

^ permalink raw reply

* Re: Page allocator bottleneck
From: Aaron Lu @ 2018-04-27  8:45 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: Linux Kernel Network Developers, linux-mm, Mel Gorman,
	David Miller, Jesper Dangaard Brouer, Eric Dumazet,
	Alexei Starovoitov, Saeed Mahameed, Eran Ben Elisha,
	Andrew Morton, Michal Hocko
In-Reply-To: <20180423131033.GA13792@intel.com>

On Mon, Apr 23, 2018 at 09:10:33PM +0800, Aaron Lu wrote:
> On Mon, Apr 23, 2018 at 11:54:57AM +0300, Tariq Toukan wrote:
> > Hi,
> > 
> > I ran my tests with your patches.
> > Initial BW numbers are significantly higher than I documented back then in
> > this mail-thread.
> > For example, in driver #2 (see original mail thread), with 6 rings, I now
> > get 92Gbps (slightly less than linerate) in comparison to 64Gbps back then.
> > 
> > However, there were many kernel changes since then, I need to isolate your
> > changes. I am not sure I can finish this today, but I will surely get to it
> > next week after I'm back from vacation.
> > 
> > Still, when I increase the scale (more rings, i.e. more cpus), I see that
> > queued_spin_lock_slowpath gets to 60%+ cpu. Still high, but lower than it
> > used to be.
> 
> I wonder if it is on allocation path or free path?

Just FYI, I have pushed two more commits on top of the branch.
They should improve free path zone lock contention for MIGRATE_UNMOVABLE
pages(most kernel code alloc such pages), you may consider apply them if
free path contention is a problem.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox