Netdev List


* [PATCH 7/8] can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels
From: Marc Kleine-Budde @ 2015-01-15 16:11 UTC (permalink / raw)
  To: netdev
  Cc: davem, linux-can, kernel, Ahmed S. Darwish, Olivier Sobrie,
	linux-stable, Marc Kleine-Budde
In-Reply-To: <1421338283-12417-1-git-send-email-mkl@pengutronix.de>

From: "Ahmed S. Darwish" <ahmed.darwish@valeo.com>

Recent Leaf firmware versions (>= 3.1.557) do not allow sending
commands for non-existing channels.  If a command is sent for a
non-existing channel, the firmware crashes.
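
A minimal user-space sketch of the idea, not the driver code: the reset
is issued once per channel that actually exists, and a failure aborts
setup. The names fake_dev, send_reset and reset_existing_channels are
hypothetical and exist only for this illustration.

```c
#define MAX_NET_DEVICES 3	/* upper bound the old loop iterated over */

/* Hypothetical stand-in for the USB device state. */
struct fake_dev {
	int nchannels;			/* channels the hardware really has */
	int resets[MAX_NET_DEVICES];	/* how often each channel was reset */
};

/* Hypothetical stand-in for sending CMD_RESET_CHIP to one channel. */
static int send_reset(struct fake_dev *dev, int channel)
{
	if (channel < 0 || channel >= dev->nchannels)
		return -1;	/* models the firmware rejecting the command */
	dev->resets[channel]++;
	return 0;
}

/* Reset only the channels that exist; stop at the first error. */
static int reset_existing_channels(struct fake_dev *dev)
{
	int i, err;

	for (i = 0; i < dev->nchannels; i++) {
		err = send_reset(dev, i);
		if (err)
			return err;
	}
	return 0;
}
```

In the patch itself the same effect is reached by moving the reset into
the per-channel init function, so it only runs for channels the device
reports.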

Reported-by: Christopher Storah <Christopher.Storah@invetech.com.au>
Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/usb/kvaser_usb.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 9accc8272c27..cc7bfc0c0a71 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1503,6 +1503,10 @@ static int kvaser_usb_init_one(struct usb_interface *intf,
 	struct kvaser_usb_net_priv *priv;
 	int i, err;
 
+	err = kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, channel);
+	if (err)
+		return err;
+
 	netdev = alloc_candev(sizeof(*priv), MAX_TX_URBS);
 	if (!netdev) {
 		dev_err(&intf->dev, "Cannot alloc candev\n");
@@ -1607,9 +1611,6 @@ static int kvaser_usb_probe(struct usb_interface *intf,
 
 	usb_set_intfdata(intf, dev);
 
-	for (i = 0; i < MAX_NET_DEVICES; i++)
-		kvaser_usb_send_simple_msg(dev, CMD_RESET_CHIP, i);
-
 	err = kvaser_usb_get_software_info(dev);
 	if (err) {
 		dev_err(&intf->dev,
-- 
2.1.4


* [PATCH 4/8] can: c_can: use regmap_update_bits() to modify RAMINIT register
From: Marc Kleine-Budde @ 2015-01-15 16:11 UTC (permalink / raw)
  To: netdev; +Cc: davem, linux-can, kernel, Roger Quadros, Marc Kleine-Budde
In-Reply-To: <1421338283-12417-1-git-send-email-mkl@pengutronix.de>

From: Roger Quadros <rogerq@ti.com>

Use of regmap_read() and regmap_write() in c_can_hw_raminit_syscon()
is not safe, as the RAMINIT register can be shared between different
drivers, at least on TI SoCs.

To make the modification atomic we switch to using regmap_update_bits().

regmap_update_bits() skips writing to the register if its read content is the
same as what is going to be written. This causes an issue for us when we
need to clear the DONE bit with the initial condition START:0, DONE:1, as the
DONE bit must be written with 1 to clear it.

So we defer the clearing of DONE bit to later when we set the START bit.
There we are sure that START bit is changed from 0 to 1 so the write of
1 to already set DONE bit will happen.
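
The skipped write can be modelled in user space. This is only a sketch
under assumptions: fake_reg, hw_write and update_bits are hypothetical,
the bit positions are made up, and update_bits merely imitates the
"skip the write if nothing changes" behaviour described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define START_BIT (1u << 0)	/* bit positions are assumptions */
#define DONE_BIT  (1u << 1)

/* Simulated register: START is read/write, DONE is write-1-to-clear. */
struct fake_reg {
	uint32_t val;
	unsigned int writes;	/* number of bus writes actually issued */
};

static void hw_write(struct fake_reg *r, uint32_t v)
{
	uint32_t done = r->val & DONE_BIT;

	if (v & DONE_BIT)
		done = 0;	/* writing 1 clears DONE */
	r->val = (v & START_BIT) | done;
	r->writes++;
}

/* Read-modify-write that skips the write when nothing would change. */
static bool update_bits(struct fake_reg *r, uint32_t mask, uint32_t v)
{
	uint32_t orig = r->val;
	uint32_t tmp = (orig & ~mask) | (v & mask);

	if (tmp == orig)
		return false;
	hw_write(r, tmp);
	return true;
}
```

Starting from START:0, DONE:1, a request to write only DONE=1 is
dropped because the value matches what was read. Writing START=1 and
DONE=1 together differs from the read value, so the write goes out and
DONE is cleared.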

Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/c_can/c_can_platform.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/net/can/c_can/c_can_platform.c b/drivers/net/can/c_can/c_can_platform.c
index f363972cd77d..e36d10520e24 100644
--- a/drivers/net/can/c_can/c_can_platform.c
+++ b/drivers/net/can/c_can/c_can_platform.c
@@ -103,27 +103,34 @@ static void c_can_hw_raminit_syscon(const struct c_can_priv *priv, bool enable)
 	mask = 1 << raminit->bits.start | 1 << raminit->bits.done;
 	regmap_read(raminit->syscon, raminit->reg, &ctrl);
 
-	/* We clear the done and start bit first. The start bit is
+	/* We clear the start bit first. The start bit is
 	 * looking at the 0 -> transition, but is not self clearing;
-	 * And we clear the init done bit as well.
 	 * NOTE: DONE must be written with 1 to clear it.
+	 * We can't clear the DONE bit here using regmap_update_bits()
+	 * as it will bypass the write if initial condition is START:0 DONE:1
+	 * e.g. on DRA7 which needs START pulse.
 	 */
-	ctrl &= ~(1 << raminit->bits.start);
-	ctrl |= 1 << raminit->bits.done;
-	regmap_write(raminit->syscon, raminit->reg, ctrl);
+	ctrl &= ~mask;	/* START = 0, DONE = 0 */
+	regmap_update_bits(raminit->syscon, raminit->reg, mask, ctrl);
 
-	ctrl &= ~(1 << raminit->bits.done);
-	c_can_hw_raminit_wait_syscon(priv, mask, ctrl);
+	/* check if START bit is 0. Ignore DONE bit for now
+	 * as it can be either 0 or 1.
+	 */
+	c_can_hw_raminit_wait_syscon(priv, 1 << raminit->bits.start, ctrl);
 
 	if (enable) {
-		/* Set start bit and wait for the done bit. */
+		/* Clear DONE bit & set START bit. */
 		ctrl |= 1 << raminit->bits.start;
-		regmap_write(raminit->syscon, raminit->reg, ctrl);
-
+		/* DONE must be written with 1 to clear it */
+		ctrl |= 1 << raminit->bits.done;
+		regmap_update_bits(raminit->syscon, raminit->reg, mask, ctrl);
+		/* prevent further clearing of DONE bit */
+		ctrl &= ~(1 << raminit->bits.done);
 		/* clear START bit if start pulse is needed */
 		if (raminit->needs_pulse) {
 			ctrl &= ~(1 << raminit->bits.start);
-			regmap_write(raminit->syscon, raminit->reg, ctrl);
+			regmap_update_bits(raminit->syscon, raminit->reg,
+					   mask, ctrl);
 		}
 
 		ctrl |= 1 << raminit->bits.done;
-- 
2.1.4


* [PATCH 2/8] can: dev: fix crtlmode_supported check
From: Marc Kleine-Budde @ 2015-01-15 16:11 UTC (permalink / raw)
  To: netdev
  Cc: davem, linux-can, kernel, Oliver Hartkopp, Wolfgang Grandegger,
	linux-stable, Marc Kleine-Budde
In-Reply-To: <1421338283-12417-1-git-send-email-mkl@pengutronix.de>

From: Oliver Hartkopp <socketcan@hartkopp.net>

When changing flags in the CAN driver's ctrlmode, the provided new content has
to be checked to see whether the bits are allowed to be changed. The bits that
are to be changed are given as a bitfield in cm->mask. Therefore checking
against cm->flags is wrong, as its content can hold any kind of values.

The iproute2 tool sets the bits in cm->mask and cm->flags depending on the
detected command line options. To be robust against bogus user space
applications additionally sanitize the provided flags with the provided mask.
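
The check and the sanitizing step can be sketched outside the kernel as
follows. The struct ctrlmode_req and the helper apply_ctrlmode are
hypothetical names for this illustration; only the mask/flags logic is
taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Same two fields as struct can_ctrlmode: a mask and the new values. */
struct ctrlmode_req {
	uint32_t mask;	/* bits the caller wants to change */
	uint32_t flags;	/* new values for those bits */
};

static int apply_ctrlmode(uint32_t *ctrlmode, uint32_t supported,
			  const struct ctrlmode_req *cm)
{
	/* reject the request if it touches a bit that may not change */
	if (cm->mask & ~supported)
		return -EOPNOTSUPP;

	/* clear the bits to be modified, then copy only the masked flags */
	*ctrlmode &= ~cm->mask;
	*ctrlmode |= cm->flags & cm->mask;
	return 0;
}
```

With supported = 0x13 and a current mode of 0x80, a request with
mask = 0x01 and bogus flags = 0xff yields 0x81: stray flag bits outside
the mask are ignored, and the unsupported bit 0x80 keeps its value.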

Cc: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/dev.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 3ec8f6f25e5f..847c1f813261 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -807,10 +807,14 @@ static int can_changelink(struct net_device *dev,
 		if (dev->flags & IFF_UP)
 			return -EBUSY;
 		cm = nla_data(data[IFLA_CAN_CTRLMODE]);
-		if (cm->flags & ~priv->ctrlmode_supported)
+
+		/* check whether changed bits are allowed to be modified */
+		if (cm->mask & ~priv->ctrlmode_supported)
 			return -EOPNOTSUPP;
+
+		/* clear bits to be modified and copy the flag values */
 		priv->ctrlmode &= ~cm->mask;
-		priv->ctrlmode |= cm->flags;
+		priv->ctrlmode |= (cm->flags & cm->mask);
 
 		/* CAN_CTRLMODE_FD can only be set when driver supports FD */
 		if (priv->ctrlmode & CAN_CTRLMODE_FD)
-- 
2.1.4


* [PATCH 1/8] MAINTAINERS: update linux-can git repositories
From: Marc Kleine-Budde @ 2015-01-15 16:11 UTC (permalink / raw)
  To: netdev; +Cc: davem, linux-can, kernel, Marc Kleine-Budde
In-Reply-To: <1421338283-12417-1-git-send-email-mkl@pengutronix.de>

The linux-can upstream git repositories are now hosted on kernel.org, update
MAINTAINERS accordingly.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 MAINTAINERS | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 600d2aad8276..efa5f8d4086d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2346,7 +2346,8 @@ CAN NETWORK LAYER
 M:	Oliver Hartkopp <socketcan@hartkopp.net>
 L:	linux-can@vger.kernel.org
 W:	http://gitorious.org/linux-can
-T:	git git://gitorious.org/linux-can/linux-can-next.git
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can.git
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next.git
 S:	Maintained
 F:	Documentation/networking/can.txt
 F:	net/can/
@@ -2361,7 +2362,8 @@ M:	Wolfgang Grandegger <wg@grandegger.com>
 M:	Marc Kleine-Budde <mkl@pengutronix.de>
 L:	linux-can@vger.kernel.org
 W:	http://gitorious.org/linux-can
-T:	git git://gitorious.org/linux-can/linux-can-next.git
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can.git
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next.git
 S:	Maintained
 F:	drivers/net/can/
 F:	include/linux/can/dev.h
-- 
2.1.4


* pull-request: can 2015-01-15
From: Marc Kleine-Budde @ 2015-01-15 16:11 UTC (permalink / raw)
  To: netdev; +Cc: davem, linux-can, kernel

Hello David,

this is a pull request of 8 patches.

Ahmed S. Darwish contributes 4 fixes for the kvaser_usb driver. The two patches
by Oliver Hartkopp mark the m_can driver as non-ISO, as the CANFD standard was
updated. Roger Quadros's patch for the c_can driver fixes the register access
during RAMINIT. And one patch by me updates the MAINTAINERS file, as we
moved the git repos to the kernel.org infrastructure.

regards,
Marc

---

The following changes since commit 16dde0d6ac159531a5e03cd3f8bc8a401d9f3fb6:

  be2net: Allow GRE to work concurrently while a VxLAN tunnel is configured (2015-01-15 01:55:05 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can.git tags/linux-can-fixes-for-3.19-20150115

for you to fetch changes up to a58518ccf39f86f898a65201518dd8e799b3abeb:

  can: kvaser_usb: Don't dereference skb after a netif_rx() (2015-01-15 16:58:02 +0100)

----------------------------------------------------------------
linux-can-fixes-for-3.19-20150115

----------------------------------------------------------------
Ahmed S. Darwish (4):
      can: kvaser_usb: Don't free packets when tight on URBs
      can: kvaser_usb: Reset all URB tx contexts upon channel close
      can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels
      can: kvaser_usb: Don't dereference skb after a netif_rx()

Marc Kleine-Budde (1):
      MAINTAINERS: update linux-can git repositories

Oliver Hartkopp (2):
      can: dev: fix crtlmode_supported check
      can: m_can: tag current CAN FD controllers as non-ISO

Roger Quadros (1):
      can: c_can: use regmap_update_bits() to modify RAMINIT register

 MAINTAINERS                            |  6 ++++--
 drivers/net/can/c_can/c_can_platform.c | 29 ++++++++++++++++++-----------
 drivers/net/can/dev.c                  |  8 ++++++--
 drivers/net/can/m_can/m_can.c          |  5 +++++
 drivers/net/can/usb/kvaser_usb.c       | 31 +++++++++++++++----------------
 include/uapi/linux/can/netlink.h       |  1 +
 6 files changed, 49 insertions(+), 31 deletions(-)


* [PATCH 8/8] can: kvaser_usb: Don't dereference skb after a netif_rx()
From: Marc Kleine-Budde @ 2015-01-15 16:11 UTC (permalink / raw)
  To: netdev; +Cc: davem, linux-can, kernel, Ahmed S. Darwish, Marc Kleine-Budde
In-Reply-To: <1421338283-12417-1-git-send-email-mkl@pengutronix.de>

From: "Ahmed S. Darwish" <ahmed.darwish@valeo.com>

We should not touch the packet after a netif_rx: it might
get freed behind our back.
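
The ordering rule can be shown with a small user-space model. Here
hand_off is a hypothetical stand-in for netif_rx() that frees the
packet immediately, which is the worst case the driver has to assume;
fake_pkt, fake_stats and receive are likewise made up for the sketch.

```c
#include <stdlib.h>

struct fake_pkt {
	unsigned int len;
};

struct fake_stats {
	unsigned long rx_packets;
	unsigned long rx_bytes;
};

/* Hypothetical stand-in for netif_rx(): takes ownership, may free at once. */
static void hand_off(struct fake_pkt *pkt)
{
	free(pkt);
}

/* Update the counters first; the hand-off is the last use of pkt. */
static void receive(struct fake_stats *stats, struct fake_pkt *pkt)
{
	stats->rx_packets++;
	stats->rx_bytes += pkt->len;
	hand_off(pkt);
}
```

In the driver the dereference is less obvious, because cf points into
the skb's data, so reading cf->can_dlc after netif_rx() touches memory
that may already be gone.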

Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/usb/kvaser_usb.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index cc7bfc0c0a71..c32cd61073bc 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -520,10 +520,10 @@ static void kvaser_usb_tx_acknowledge(const struct kvaser_usb *dev,
 		skb = alloc_can_err_skb(priv->netdev, &cf);
 		if (skb) {
 			cf->can_id |= CAN_ERR_RESTARTED;
-			netif_rx(skb);
 
 			stats->rx_packets++;
 			stats->rx_bytes += cf->can_dlc;
+			netif_rx(skb);
 		} else {
 			netdev_err(priv->netdev,
 				   "No memory left for err_skb\n");
@@ -770,10 +770,9 @@ static void kvaser_usb_rx_error(const struct kvaser_usb *dev,
 
 	priv->can.state = new_state;
 
-	netif_rx(skb);
-
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
+	netif_rx(skb);
 }
 
 static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
@@ -805,10 +804,9 @@ static void kvaser_usb_rx_can_err(const struct kvaser_usb_net_priv *priv,
 		stats->rx_over_errors++;
 		stats->rx_errors++;
 
-		netif_rx(skb);
-
 		stats->rx_packets++;
 		stats->rx_bytes += cf->can_dlc;
+		netif_rx(skb);
 	}
 }
 
@@ -887,10 +885,9 @@ static void kvaser_usb_rx_can_msg(const struct kvaser_usb *dev,
 			       cf->can_dlc);
 	}
 
-	netif_rx(skb);
-
 	stats->rx_packets++;
 	stats->rx_bytes += cf->can_dlc;
+	netif_rx(skb);
 }
 
 static void kvaser_usb_start_chip_reply(const struct kvaser_usb *dev,
-- 
2.1.4



* [PATCH 6/8] can: kvaser_usb: Reset all URB tx contexts upon channel close
From: Marc Kleine-Budde @ 2015-01-15 16:11 UTC (permalink / raw)
  To: netdev
  Cc: davem, linux-can, kernel, Ahmed S. Darwish, linux-stable,
	Marc Kleine-Budde
In-Reply-To: <1421338283-12417-1-git-send-email-mkl@pengutronix.de>

From: "Ahmed S. Darwish" <ahmed.darwish@valeo.com>

Flooding the Kvaser CAN to USB dongle with multiple reads and
writes at very high frequency (*), closing the CAN channel while
all the transmissions are on (#), opening the device again (@),
then sending a small number of packets would make the driver
enter an almost infinite loop of:

[....]
[15959.853988] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853990] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853991] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853993] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853994] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853995] kvaser_usb 4-3:1.0 can0: cannot find free context
[....]

_dragging the whole system down_ in the process due to the
excessive logging output.

Initially, this has caused random panics in the kernel due to a
buggy error recovery path.  That got fixed in an earlier commit.(%)
This patch aims at solving the root cause. -->

16 tx URBs and contexts are allocated per CAN channel per USB
device. Such URBs are protected by:

a) A simple atomic counter, up to a value of MAX_TX_URBS (16)
b) A flag in each URB context, stating if it's free
c) The fact that ndo_start_xmit calls are themselves protected
   by the networking layers higher above

After grabbing one of the tx URBs, if the driver notices that all
of them are now taken, it stops the netif transmission queue.
The queue is woken up again only when an acknowledgment is received
from the firmware for one of our earlier-sent frames.

Meanwhile, upon channel close (#), the driver sends a CMD_STOP_CHIP
to the firmware, effectively closing all further communication.  In
the high traffic case, the atomic counter remains at MAX_TX_URBS,
and all the URB contexts remain marked as active.  While opening
the channel again (@), it cannot send any further frames since no
more free tx URB contexts are available.

Reset all tx URB contexts upon CAN channel close.
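
A simplified, single-threaded sketch of the bookkeeping described
above, assuming a plain int in place of the atomic counter. The names
tx_pool, tx_get and tx_reset are hypothetical; the driver's real helper
is kvaser_usb_unlink_tx_urbs().

```c
#include <stdbool.h>

#define MAX_TX_URBS 16

struct tx_pool {
	int active;			/* plays the role of the atomic counter */
	bool busy[MAX_TX_URBS];		/* per-context "in use" flag */
};

/* Returns a context index, or -1 when every context is taken. */
static int tx_get(struct tx_pool *p)
{
	int i;

	for (i = 0; i < MAX_TX_URBS; i++) {
		if (!p->busy[i]) {
			p->busy[i] = true;
			p->active++;
			return i;
		}
	}
	return -1;
}

/* What the close path has to do: forget all outstanding contexts. */
static void tx_reset(struct tx_pool *p)
{
	int i;

	p->active = 0;
	for (i = 0; i < MAX_TX_URBS; i++)
		p->busy[i] = false;
}
```

Without the reset, a pool that was full at close time stays full after
reopening, since no acknowledgment will ever arrive to release a
context.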

(*) 50 parallel instances of `cangen0 -g 0 -ix`
(#) `ifconfig can0 down`
(@) `ifconfig can0 up`
(%) "can: kvaser_usb: Don't free packets when tight on URBs"

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/usb/kvaser_usb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 2e7d513a7c11..9accc8272c27 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1246,6 +1246,9 @@ static int kvaser_usb_close(struct net_device *netdev)
 	if (err)
 		netdev_warn(netdev, "Cannot stop device, error %d\n", err);

+	/* reset tx contexts */
+	kvaser_usb_unlink_tx_urbs(priv);
+
 	priv->can.state = CAN_STATE_STOPPED;
 	close_candev(priv->netdev);

-- 
2.1.4


* [PATCH 5/8] can: kvaser_usb: Don't free packets when tight on URBs
From: Marc Kleine-Budde @ 2015-01-15 16:11 UTC (permalink / raw)
  To: netdev
  Cc: davem, linux-can, kernel, Ahmed S. Darwish, linux-stable,
	Marc Kleine-Budde
In-Reply-To: <1421338283-12417-1-git-send-email-mkl@pengutronix.de>

From: "Ahmed S. Darwish" <ahmed.darwish@valeo.com>

Flooding the Kvaser CAN to USB dongle with multiple reads and
writes at high frequency caused seemingly-random panics in the
kernel.

On further inspection, it seems the driver erroneously freed the
to-be-transmitted packet upon getting tight on URBs and returning
NETDEV_TX_BUSY, leading to invalid memory writes and double frees
at a later point in time.
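
The ownership rule behind this can be sketched in user space. The
function xmit and its two boolean parameters are hypothetical; they
only model "out of memory" and "no free context". A busy return leaves
the packet with the caller, so freeing it there sets up a double free
on the retry.

```c
#include <stdbool.h>
#include <stdlib.h>

enum tx_result { TX_OK, TX_BUSY };

struct fake_pkt {
	int len;
};

/*
 * Hypothetical transmit hook.  On TX_OK the packet has been consumed
 * (sent or dropped) and freed here.  On TX_BUSY the caller still owns
 * it and will retry, so it must not be freed.
 */
static enum tx_result xmit(struct fake_pkt *pkt, bool have_context,
			   bool have_mem)
{
	if (!have_mem) {
		free(pkt);	/* drop: we consumed the packet */
		return TX_OK;
	}
	if (!have_context)
		return TX_BUSY;	/* leave pkt alone, the caller requeues it */

	free(pkt);		/* "sent" */
	return TX_OK;
}
```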

Note:

Finding no more URBs/transmit-contexts and returning NETDEV_TX_BUSY
is a driver bug in and of itself: it means that our start/stop
queue flow control is broken.

This patch only fixes the (buggy) error handling code; the root
cause shall be fixed in a later commit.

Acked-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/usb/kvaser_usb.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/can/usb/kvaser_usb.c b/drivers/net/can/usb/kvaser_usb.c
index 541fb7a05625..2e7d513a7c11 100644
--- a/drivers/net/can/usb/kvaser_usb.c
+++ b/drivers/net/can/usb/kvaser_usb.c
@@ -1294,12 +1294,14 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (!urb) {
 		netdev_err(netdev, "No memory left for URBs\n");
 		stats->tx_dropped++;
-		goto nourbmem;
+		dev_kfree_skb(skb);
+		return NETDEV_TX_OK;
 	}
 
 	buf = kmalloc(sizeof(struct kvaser_msg), GFP_ATOMIC);
 	if (!buf) {
 		stats->tx_dropped++;
+		dev_kfree_skb(skb);
 		goto nobufmem;
 	}
 
@@ -1334,6 +1336,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 		}
 	}
 
+	/* This should never happen; it implies a flow control bug */
 	if (!context) {
 		netdev_warn(netdev, "cannot find free context\n");
 		ret =  NETDEV_TX_BUSY;
@@ -1364,9 +1367,6 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
 	if (unlikely(err)) {
 		can_free_echo_skb(netdev, context->echo_index);
 
-		skb = NULL; /* set to NULL to avoid double free in
-			     * dev_kfree_skb(skb) */
-
 		atomic_dec(&priv->active_tx_urbs);
 		usb_unanchor_urb(urb);
 
@@ -1388,8 +1388,6 @@ releasebuf:
 	kfree(buf);
 nobufmem:
 	usb_free_urb(urb);
-nourbmem:
-	dev_kfree_skb(skb);
 	return ret;
 }
 
-- 
2.1.4


* [PATCH 3/8] can: m_can: tag current CAN FD controllers as non-ISO
From: Marc Kleine-Budde @ 2015-01-15 16:11 UTC (permalink / raw)
  To: netdev
  Cc: davem, linux-can, kernel, Oliver Hartkopp, linux-stable,
	Marc Kleine-Budde
In-Reply-To: <1421338283-12417-1-git-send-email-mkl@pengutronix.de>

From: Oliver Hartkopp <socketcan@hartkopp.net>

During the CAN FD standardization process within the ISO it turned out that
the failure detection capability has to be improved.

The CAN in Automation organization (CiA) defined the already implemented CAN
FD controllers as 'non-ISO' and the upcoming improved CAN FD controllers as
'ISO' compliant. See http://www.can-cia.com/index.php?id=1937

Finally there will be three types of CAN FD controllers in the future:

1. ISO compliant (fixed)
2. non-ISO compliant (fixed, like the M_CAN IP v3.0.1 in m_can.c)
3. ISO/non-ISO CAN FD controllers (switchable, like the PEAK USB FD)

So the current M_CAN driver for the M_CAN IP v3.0.1 has to expose its non-ISO
implementation by setting the CAN_CTRLMODE_FD_NON_ISO ctrlmode at startup.
As this bit cannot be switched at configuration time CAN_CTRLMODE_FD_NON_ISO
must not be set in ctrlmode_supported of the current M_CAN driver.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 drivers/net/can/m_can/m_can.c    | 5 +++++
 include/uapi/linux/can/netlink.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index d7bc462aafdc..244529881be9 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -955,6 +955,11 @@ static struct net_device *alloc_m_can_dev(void)
 	priv->can.data_bittiming_const = &m_can_data_bittiming_const;
 	priv->can.do_set_mode = m_can_set_mode;
 	priv->can.do_get_berr_counter = m_can_get_berr_counter;
+
+	/* CAN_CTRLMODE_FD_NON_ISO is fixed with M_CAN IP v3.0.1 */
+	priv->can.ctrlmode = CAN_CTRLMODE_FD_NON_ISO;
+
+	/* CAN_CTRLMODE_FD_NON_ISO can not be changed with M_CAN IP v3.0.1 */
 	priv->can.ctrlmode_supported = CAN_CTRLMODE_LOOPBACK |
 					CAN_CTRLMODE_LISTENONLY |
 					CAN_CTRLMODE_BERR_REPORTING |
diff --git a/include/uapi/linux/can/netlink.h b/include/uapi/linux/can/netlink.h
index 3e4323a3918d..94ffe0c83ce7 100644
--- a/include/uapi/linux/can/netlink.h
+++ b/include/uapi/linux/can/netlink.h
@@ -98,6 +98,7 @@ struct can_ctrlmode {
 #define CAN_CTRLMODE_BERR_REPORTING	0x10	/* Bus-error reporting */
 #define CAN_CTRLMODE_FD			0x20	/* CAN FD mode */
 #define CAN_CTRLMODE_PRESUME_ACK	0x40	/* Ignore missing CAN ACKs */
+#define CAN_CTRLMODE_FD_NON_ISO		0x80	/* CAN FD in non-ISO mode */

 /*
  * CAN device statistics
-- 
2.1.4


* Re: Re: Re: [bisected regression] e1000e: "Detected Hardware Unit Hang"
From: Eric Dumazet @ 2015-01-15 16:00 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: 'Linux Netdev List', Eric Dumazet, Jeff Kirsher,
	e1000-devel
In-Reply-To: <3089325.gjrPpo2XX1@storm>

On Thu, 2015-01-15 at 16:48 +0100, Thomas Jarosch wrote:
> On Thursday, 15. January 2015 07:25:32 Eric Dumazet wrote:
> > On Thu, 2015-01-15 at 15:58 +0100, Thomas Jarosch wrote:
> > > A colleague mentioned to me he saw the "Hardware Unit Hang" message
> > > every
> > > few days even running on kernel 3.4 (without your patch). Basically I'm
> > > testing now if that's still the case with 3.19-rc4+ or not.
> > > 
> > > I'm all for fixing the root cause. I'm just interested if the e1000e
> > > hang can even be triggered when using a max frag page size of 4096.
> > > So far it transferred 751.6 GiB without a hiccup.
> > 
> > You told it was forwarding setup.
> > 
> > 1) What is the NIC receiving traffic.
> > 2) What happens if you disable GRO on it ?
> 
> The setup is like this:
> 
> Win7 notebook (client)
>     -> "private LAN" eth0 (e1000e)
>         -> "external traffic" eth1 (r8169)
> 
>             -> local HTTP server in the intranet
>                (2x e1000e using bonding)
> 
> 
> Disabling gro on eth1 (r8169) seems to make eth0 (e1000e) stable.
> As it usually hangs within seconds, it already transferred 28 GiB right now.
> 
> When I switch gro back on, it takes around three seconds until the hang.
> 
> Does that point into the right / any direction?

Sure. 

Please apply this patch, and try to lower
/proc/sys/net/core/gro_max_frags and see if this makes a difference
(leaving GRO enabled)

(start with 7 and increase it, limit being 17)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 642d426a668f8ac94daf334c00117f96789f3990..817aee05a1b0623e5752beb0952a6fe6d66e583f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3400,6 +3400,7 @@ extern int		netdev_max_backlog;
 extern int		netdev_tstamp_prequeue;
 extern int		weight_p;
 extern int		bpf_jit_enable;
+extern int		sysctl_gro_max_frags;
 
 bool netdev_has_upper_dev(struct net_device *dev, struct net_device *upper_dev);
 struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 56db472e9b864e805e0ab36dd73a0404d2fc66d5..c2c2e7e53014617c5da574f2eb8a2889ed743719 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3197,6 +3197,8 @@ err:
 }
 EXPORT_SYMBOL_GPL(skb_segment);
 
+int sysctl_gro_max_frags = MAX_SKB_FRAGS;
+
 int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 {
 	struct skb_shared_info *pinfo, *skbinfo = skb_shinfo(skb);
@@ -3219,8 +3221,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 		int i = skbinfo->nr_frags;
 		int nr_frags = pinfo->nr_frags + i;
 
-		if (nr_frags > MAX_SKB_FRAGS)
-			goto merge;
+		if (nr_frags > sysctl_gro_max_frags)
+			return -E2BIG;
 
 		offset -= headlen;
 		pinfo->nr_frags = nr_frags;
@@ -3252,8 +3254,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 		unsigned int first_size = headlen - offset;
 		unsigned int first_offset;
 
-		if (nr_frags + 1 + skbinfo->nr_frags > MAX_SKB_FRAGS)
-			goto merge;
+		if (nr_frags + 1 + skbinfo->nr_frags > sysctl_gro_max_frags)
+			return -E2BIG;
 
 		first_offset = skb->data -
 			       (unsigned char *)page_address(page) +
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 31baba2a71ce15e49450f69dae81e7d3be1ff3f2..de73d51381bf8acd0aedeb859ed961468441014a 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -278,6 +278,13 @@ static struct ctl_table net_core_table[] = {
 		.proc_handler	= proc_dointvec
 	},
 	{
+		.procname	= "gro_max_frags",
+		.data		= &sysctl_gro_max_frags,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
+	{
 		.procname	= "netdev_rss_key",
 		.data		= &netdev_rss_key,
 		.maxlen		= sizeof(int),


* Re: Re: Re: [bisected regression] e1000e: "Detected Hardware Unit Hang"
From: Thomas Jarosch @ 2015-01-15 15:48 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: 'Linux Netdev List', Eric Dumazet, Jeff Kirsher,
	e1000-devel
In-Reply-To: <1421335532.11734.73.camel@edumazet-glaptop2.roam.corp.google.com>

On Thursday, 15. January 2015 07:25:32 Eric Dumazet wrote:
> On Thu, 2015-01-15 at 15:58 +0100, Thomas Jarosch wrote:
> > A colleague mentioned to me he saw the "Hardware Unit Hang" message
> > every
> > few days even running on kernel 3.4 (without your patch). Basically I'm
> > testing now if that's still the case with 3.19-rc4+ or not.
> > 
> > I'm all for fixing the root cause. I'm just interested if the e1000e
> > hang can even be triggered when using a max frag page size of 4096.
> > So far it transferred 751.6 GiB without a hiccup.
> 
> You told it was forwarding setup.
> 
> 1) What is the NIC receiving traffic.
> 2) What happens if you disable GRO on it ?

The setup is like this:

Win7 notebook (client)
    -> "private LAN" eth0 (e1000e)
        -> "external traffic" eth1 (r8169)

            -> local HTTP server in the intranet
               (2x e1000e using bonding)


Disabling gro on eth1 (r8169) seems to make eth0 (e1000e) stable.
As it usually hangs within seconds, it already transferred 28 GiB right now.

When I switch gro back on, it takes around three seconds until the hang.

Does that point into the right / any direction?

Thomas


* [PATCH net] net: sctp: fix race for one-to-many sockets in sendmsg's auto associate
From: Daniel Borkmann @ 2015-01-15 15:34 UTC (permalink / raw)
  To: davem; +Cc: vyasevich, netdev, linux-sctp

One-to-many sockets in SCTP are not required to explicitly
call into connect(2) or sctp_connectx(2) prior to data exchange.
Instead, they can directly invoke sendmsg(2) and the SCTP stack
will automatically trigger connection establishment through 4WHS
via sctp_primitive_ASSOCIATE(). However, this in its current
implementation is racy: INIT is being sent out immediately (as
it cannot be bundled anyway) and the rest of the DATA chunks are
queued up for later xmit when connection is established, meaning
sendmsg(2) will return successfully. This behaviour can result
in an undesired side-effect that the kernel made the application
think the data has already been transmitted, although none of it
has actually left the machine, worst case even after close(2)'ing
the socket.
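
A minimal sketch of such a client, assuming an IPv4 peer. The helpers
make_dest and send_hello are hypothetical names, and the sketch is not
the test program used for the traces in this mail. It sends on a
one-to-many socket without connect(2) and closes right away, which is
the pattern affected by the race.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Fill in an IPv4 destination; returns 0 on success, -1 on a bad address. */
static int make_dest(struct sockaddr_in *dst, const char *ip,
		     unsigned short port)
{
	memset(dst, 0, sizeof(*dst));
	dst->sin_family = AF_INET;
	dst->sin_port = htons(port);
	return inet_pton(AF_INET, ip, &dst->sin_addr) == 1 ? 0 : -1;
}

/*
 * Send one message on a one-to-many SCTP socket without connect(2).
 * Returns the sendmsg(2) result, or -1 if the socket cannot be created
 * (for example when the kernel lacks SCTP support).
 */
static ssize_t send_hello(const struct sockaddr_in *dst)
{
	char payload[] = "hello";
	struct iovec iov = { .iov_base = payload, .iov_len = 5 };
	struct msghdr msg;
	ssize_t ret;
	int fd;

	fd = socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP);
	if (fd < 0)
		return -1;

	memset(&msg, 0, sizeof(msg));
	msg.msg_name = (void *)dst;
	msg.msg_namelen = sizeof(*dst);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	ret = sendmsg(fd, &msg, 0);	/* triggers the implicit association */
	close(fd);
	return ret;
}
```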

Instead, when the association from client side has been shut down
e.g. first gracefully through SCTP_EOF and then close(2), the
client could afterwards still receive the server's INIT_ACK due
to a connection with higher latency. This INIT_ACK is then considered
out of the blue and hence responded with ABORT as there was no
alive assoc found anymore. This can be easily reproduced e.g.
with sctp_test application from lksctp. One way to fix this race
is to wait for the handshake to actually complete.

The fix defers waiting after sctp_primitive_ASSOCIATE() and
sctp_primitive_SEND() succeeded, so that DATA chunks cooked up
from sctp_sendmsg() have already been placed into the output
queue through the side-effect interpreter, and therefore can then
be bundled together with COOKIE_ECHO control chunks.

strace from example application (shortened):

socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP) = 3
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(0)=[], msg_controllen=48, {cmsg_len=48, cmsg_level=0x84 /* SOL_??? */, cmsg_type=, ...},
           msg_flags=0}, 0) = 0 // graceful shutdown for SOCK_SEQPACKET via SCTP_EOF
close(3) = 0

tcpdump before patch (fooling the application):

22:33:36.306142 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 3879023686] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3139201684]
22:33:36.316619 IP 192.168.1.115.8888 > 192.168.1.114.41462: sctp (1) [INIT ACK] [init tag: 3345394793] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 3380109591]
22:33:36.317600 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [ABORT]

tcpdump after patch:

14:28:58.884116 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 438593213] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3092969729]
14:28:58.888414 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [INIT ACK] [init tag: 381429855] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 2141904492]
14:28:58.888638 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [COOKIE ECHO] , (2) [DATA] (B)(E) [TSN: 3092969729] [...]
14:28:58.893278 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [COOKIE ACK] , (2) [SACK] [cum ack 3092969729] [a_rwnd 106491] [#gap acks 0] [#dup tsns 0]
14:28:58.893591 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969730] [...]
14:28:59.096963 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969730] [a_rwnd 106496] [#gap acks 0] [#dup tsns 0]
14:28:59.097086 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969731] [...] , (2) [DATA] (B)(E) [TSN: 3092969732] [...]
14:28:59.103218 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969732] [a_rwnd 106486] [#gap acks 0] [#dup tsns 0]
14:28:59.103330 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN]
14:28:59.107793 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SHUTDOWN ACK]
14:28:59.107890 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN COMPLETE]

Looks like this bug is from the pre-git history museum. ;)

Fixes: 08707d5482df ("lksctp-2_5_31-0_5_1.patch")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
 net/sctp/socket.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 2625ecc..aafe94b 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1603,7 +1603,7 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	sctp_assoc_t associd = 0;
 	sctp_cmsgs_t cmsgs = { NULL };
 	sctp_scope_t scope;
-	bool fill_sinfo_ttl = false;
+	bool fill_sinfo_ttl = false, wait_connect = false;
 	struct sctp_datamsg *datamsg;
 	int msg_flags = msg->msg_flags;
 	__u16 sinfo_flags = 0;
@@ -1943,6 +1943,7 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 		if (err < 0)
 			goto out_free;
 
+		wait_connect = true;
 		pr_debug("%s: we associated primitively\n", __func__);
 	}
 
@@ -1980,6 +1981,11 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	sctp_datamsg_put(datamsg);
 	err = msg_len;
 
+	if (unlikely(wait_connect)) {
+		timeo = sock_sndtimeo(sk, msg_flags & MSG_DONTWAIT);
+		sctp_wait_for_connect(asoc, &timeo);
+	}
+
 	/* If we are already past ASSOCIATE, the lower
 	 * layers are responsible for association cleanup.
 	 */
-- 
1.7.11.7

^ permalink raw reply related

* Re: [PATCH v2 2/2] fixup! net/macb: improved ethtool statistics support
From: Xander Huff @ 2015-01-15 15:32 UTC (permalink / raw)
  To: Nicolas Ferre, davem
  Cc: netdev, jaeden.amero, rich.tollerton, ben.shelton, brad.mouring,
	linux-kernel, cyrille.pitchen
In-Reply-To: <54B797E5.6070503@atmel.com>

On 1/15/2015 4:35 AM, Nicolas Ferre wrote:
>>   #define GEM_OTX					0x0100 /* Octets transmitted */
> I see, it's modified hereafter! Why not integrate this part in previous
> patch?
>
>

I split these up the way I did by using the --fixup argument to allow the rebase 
--autosquash to work. I could instead have a series of fixup commits to fix each 
type of style issue and then also the functional issue if you'd prefer, but 
--autosquash will no longer work.

^ permalink raw reply

* Re: [PATCH] e100: Don't enable WoL by default on Toshiba devices
From: Jeff Kirsher @ 2015-01-15 15:31 UTC (permalink / raw)
  To: Ondrej Zary; +Cc: e1000-devel, netdev, David Miller, linux-kernel
In-Reply-To: <201501151618.07434.linux@rainbow-software.org>


[-- Attachment #1.1: Type: text/plain, Size: 2489 bytes --]

On Thu, 2015-01-15 at 16:18 +0100, Ondrej Zary wrote:
> On Thursday 15 January 2015, Jeff Kirsher wrote:
> > On Thu, 2015-01-15 at 14:40 +0100, Ondrej Zary wrote:
> > > On Thursday 13 November 2014, Jeff Kirsher wrote:
> > > > On Wed, 2014-11-12 at 18:18 -0500, David Miller wrote:
> > > > > From: Ondrej Zary <linux@rainbow-software.org>
> > > > > Date: Wed, 12 Nov 2014 23:47:25 +0100
> > > > >
> > > > > > > Enabling WoL on some Toshiba laptops (such as Portege R100) causes
> > > > > > > battery drain after shutdown (WoL is active even on battery). These
> > > > > > > laptops have the WoL bit set in EEPROM ID, causing e100 driver to
> > > > > > > enable WoL by default.
> > > > > > >
> > > > > > > Check subsystem vendor ID and if it's Toshiba, don't enable WoL by
> > > > > > > default from EEPROM settings.
> > > > > > >
> > > > > > > Fixes https://bugs.launchpad.net/ubuntu/+source/linux/+bug/110784
> > > > > > >
> > > > > > > Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
> > > > >
> > > > > Jeff, are you gonna pick this up?
> > > >
> > > > Yes, sorry I did not catch it earlier.
> > >
> > > What happened to this patch? I don't see it in net.git or
> > > net-next.git
> > > (checked both davem's and jkirsher's)
> >
> > Sorry, I thought I had replied with a NAK on this patch after further
> > review of the changes.
> >
> > We don't fix BIOS issues in the driver especially regarding feature
> > enablement like WoL.  We would end up with dozens of these kinds of fixes
> > if we were to allow this.
> >
> > You should go back to the OEM and ask for a BIOS update to resolve this or
> > configure udev so that ethtool disables WoL.
> 
> This is not a BIOS bug. When the machine is powered off in BIOS (or GRUB), 
> everything is OK.
> 
> The bug is that e100 driver enables WoL based on some bit in EEPROM that 
> happens to be set on at least some Toshiba laptops and user has no way to 
> change it.

Yes, the EEPROM can be modified/updated through the BIOS update I
suggested earlier.  So again, a BIOS issue.

OR you can configure udev so that ethtool disables WoL if you do not
want to pursue an EEPROM update through a BIOS update.

>  Windows driver does not do this. Other Linux ethernet drivers 
> don't do this. When user wants WoL, (s)he enables it in BIOS and OS. Maybe 
> this (mis)feature should be removed from the driver.
> 



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 392 bytes --]

------------------------------------------------------------------------------
New Year. New Location. New Benefits. New Data Center in Ashburn, VA.
GigeNET is offering a free month of service with a new server in Ashburn.
Choose from 2 high performing configs, both with 100TB of bandwidth.
Higher redundancy.Lower latency.Increased capacity.Completely compliant.
http://p.sf.net/sfu/gigenet

[-- Attachment #3: Type: text/plain, Size: 257 bytes --]

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply

* [PATCH net-next] bridge: use MDBA_SET_ENTRY_MAX for maxtype in nlmsg_parse()
From: Nicolas Dichtel @ 2015-01-15 15:29 UTC (permalink / raw)
  To: netdev; +Cc: davem, Nicolas Dichtel

This is just a cleanup, because in the current code MDBA_SET_ENTRY_MAX ==
MDBA_SET_ENTRY.
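
As an illustration of why the two values coincide today, here is a minimal
userspace sketch of the usual `__FOO_MAX` / `FOO_MAX` enum idiom. All `demo_*`
and `DEMO_*` names are hypothetical and are not part of the bridge code; the
parser only mimics the "ignore types above maxtype" behaviour and is not the
kernel's nlmsg_parse().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute set following the __FOO_MAX idiom. */
enum {
	DEMO_ATTR_UNSPEC,
	DEMO_ATTR_ENTRY,
	__DEMO_ATTR_MAX,
};
/* Highest valid attribute type; equals DEMO_ATTR_ENTRY for now. */
#define DEMO_ATTR_MAX (__DEMO_ATTR_MAX - 1)

struct demo_attr {
	int type;
	int value;
};

/*
 * Store a pointer to each attribute in tb[type] when 0 <= type <= maxtype.
 * tb must have room for maxtype + 1 entries. Attributes outside that range
 * are skipped. Returns the number of attributes stored.
 */
int demo_parse(const struct demo_attr *attrs, size_t n,
	       const struct demo_attr **tb, int maxtype)
{
	int stored = 0;
	size_t i;

	assert(maxtype >= 0);
	for (i = 0; i <= (size_t)maxtype; i++)
		tb[i] = NULL;

	for (i = 0; i < n; i++) {
		if (attrs[i].type < 0 || attrs[i].type > maxtype)
			continue;
		tb[attrs[i].type] = &attrs[i];
		stored++;
	}
	return stored;
}
```

Sizing the table and the parse call from `DEMO_ATTR_MAX` means a later
attribute added before `__DEMO_ATTR_MAX` is picked up without touching the
call site, which is the kind of drift this cleanup is meant to avoid.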

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/bridge/br_mdb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 5df05269d17a..fed61c971b17 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -276,7 +276,7 @@ static int br_mdb_parse(struct sk_buff *skb, struct nlmsghdr *nlh,
 	struct net_device *dev;
 	int err;
 
-	err = nlmsg_parse(nlh, sizeof(*bpm), tb, MDBA_SET_ENTRY, NULL);
+	err = nlmsg_parse(nlh, sizeof(*bpm), tb, MDBA_SET_ENTRY_MAX, NULL);
 	if (err < 0)
 		return err;
 
-- 
2.2.2

^ permalink raw reply related

* Re: Re: [bisected regression] e1000e: "Detected Hardware Unit Hang"
From: Eric Dumazet @ 2015-01-15 15:25 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: 'Linux Netdev List', Eric Dumazet, Jeff Kirsher,
	e1000-devel
In-Reply-To: <8088599.PZmG8U31O2@storm>

On Thu, 2015-01-15 at 15:58 +0100, Thomas Jarosch wrote:

> A colleague mentioned to me he saw the "Hardware Unit Hang" message every 
> few days even running on kernel 3.4 (without your patch). Basically I'm 
> testing now if that's still the case with 3.19-rc4+ or not.
> 
> I'm all for fixing the root cause. I'm just interested if the e1000e
> hang can even be triggered when using a max frag page size of 4096.
> So far it transferred 751.6 GiB without a hiccup.

You said it was a forwarding setup.

1) Which NIC is receiving the traffic?
2) What happens if you disable GRO on it?

^ permalink raw reply

* Re: [patch-net-next v2 3/3] net: ethernet: cpsw: don't requests IRQs we don't use
From: Felipe Balbi @ 2015-01-15 15:20 UTC (permalink / raw)
  To: Mugunthan V N
  Cc: Felipe Balbi, davem, Tony Lindgren, Linux OMAP Mailing List,
	netdev
In-Reply-To: <54B770FC.6060003@ti.com>

[-- Attachment #1: Type: text/plain, Size: 754 bytes --]

On Thu, Jan 15, 2015 at 01:19:16PM +0530, Mugunthan V N wrote:
> On Wednesday 14 January 2015 10:28 PM, Felipe Balbi wrote:
> > CPSW never uses RX_THRESHOLD or MISC interrupts. In
> > fact, they are always kept masked in their appropriate
> > IRQ Enable register.
> > 
> > Instead of allocating an IRQ that never fires, it's best
> > to remove that code altogether and let future patches
> > implement it if anybody needs those.
> > 
> > Signed-off-by: Felipe Balbi <balbi@ti.com>
> 
> Instead of introducing dummy ISR in previous patch and then removing in
> this patch, both can be squashed into a single patch.

sure they can. I decided to split to ease review and to make sure only
one thing happens in a single patch.

-- 
balbi

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* Re: [PATCH] e100: Don't enable WoL by default on Toshiba devices
From: Ondrej Zary @ 2015-01-15 15:18 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: e1000-devel, netdev, David Miller, linux-kernel
In-Reply-To: <1421333621.2632.19.camel@jtkirshe-mobl>

On Thursday 15 January 2015, Jeff Kirsher wrote:
> On Thu, 2015-01-15 at 14:40 +0100, Ondrej Zary wrote:
> > On Thursday 13 November 2014, Jeff Kirsher wrote:
> > > On Wed, 2014-11-12 at 18:18 -0500, David Miller wrote:
> > > > From: Ondrej Zary <linux@rainbow-software.org>
> > > > Date: Wed, 12 Nov 2014 23:47:25 +0100
> > > >
> > > > > > Enabling WoL on some Toshiba laptops (such as Portege R100) causes
> > > > > > battery drain after shutdown (WoL is active even on battery). These
> > > > > > laptops have the WoL bit set in EEPROM ID, causing e100 driver to
> > > > > > enable WoL by default.
> > > > > >
> > > > > > Check subsystem vendor ID and if it's Toshiba, don't enable WoL by
> > > > > > default from EEPROM settings.
> > > > > >
> > > > > > Fixes https://bugs.launchpad.net/ubuntu/+source/linux/+bug/110784
> > > > > >
> > > > > > Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
> > > >
> > > > Jeff, are you gonna pick this up?
> > >
> > > Yes, sorry I did not catch it earlier.
> >
> > What happened to this patch? I don't see it in net.git or
> > net-next.git
> > (checked both davem's and jkirsher's)
>
> Sorry, I thought I had replied with a NAK on this patch after further
> review of the changes.
>
> We don't fix BIOS issues in the driver especially regarding feature
> enablement like WoL.  We would end up with dozens of these kinds of fixes
> if we were to allow this.
>
> You should go back to the OEM and ask for a BIOS update to resolve this or
> configure udev so that ethtool disables WoL.

This is not a BIOS bug. When the machine is powered off in BIOS (or GRUB), 
everything is OK.

The bug is that e100 driver enables WoL based on some bit in EEPROM that 
happens to be set on at least some Toshiba laptops and user has no way to 
change it. Windows driver does not do this. Other Linux ethernet drivers 
don't do this. When user wants WoL, (s)he enables it in BIOS and OS. Maybe 
this (mis)feature should be removed from the driver.

-- 
Ondrej Zary

^ permalink raw reply

* RE: [net-next 10/17] i40e: clean up PTP log messages
From: Nelson, Shannon @ 2015-01-15 15:01 UTC (permalink / raw)
  To: David Laight, Kirsher, Jeffrey T, davem@davemloft.net
  Cc: netdev@vger.kernel.org, nhorman@redhat.com, sassmann@redhat.com,
	jogreene@redhat.com, Keller, Jacob E
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D1CAC9AB0@AcuExch.aculab.com>

> From: David Laight [mailto:David.Laight@ACULAB.COM]
> Sent: Thursday, January 15, 2015 4:35 AM
> 
> From: Jeff Kirsher
> > From: Shannon Nelson <shannon.nelson@intel.com>
> >
> > The netdev name at init time often defaults to eth0 but later gets
> changed
> > by UDEV, so printing it here is misleading.
> 
> Without the interface name you stand zero chance of working out which
> one it is.
> With it, and provided all the interface renames get into dmesg, you
> stand
> at least some chance.

The dev_info() messages have the PCI device and function number in the string, so it's really not too hard to track down the resulting netdev port:
Jan 13 15:46:55 snelson3-cup kernel: [621235.401627] i40e 0000:84:00.1: PHC enabled

Later messages use netdev_info() and have both the device number and the netdev name:
Jan 13 15:46:56 snelson3-cup kernel: [621236.508868] i40e 0000:04:00.1 p261p2: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None

Note that the driver name appears in both as well.
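
For anyone scripting this, a rough userspace sketch of pulling the driver name
and PCI address out of such a log line might look like the following. The
`demo_*` names are made up for illustration, and the format assumption is only
that the driver name and `domain:bus:slot.func` follow the `[timestamp] `
field as in the two lines above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields we extract. */
struct demo_pci_addr {
	char driver[16];
	unsigned int domain, bus, slot, func;
};

/*
 * Parse "... [timestamp] <driver> dddd:bb:ss.f ..." into *out.
 * Returns 0 on success, -1 if the line does not match that shape.
 */
int demo_parse_pci(const char *line, struct demo_pci_addr *out)
{
	const char *p;

	assert(line && out);

	/* Skip everything up to the end of the "[timestamp]" field. */
	p = strstr(line, "] ");
	if (!p)
		return -1;

	/* The PCI address fields are printed in hexadecimal. */
	if (sscanf(p + 2, "%15s %x:%x:%x.%x", out->driver, &out->domain,
		   &out->bus, &out->slot, &out->func) != 5)
		return -1;

	return 0;
}
```

The same parser accepts both the dev_info() and the netdev_info() form, since
the netdev name only appears after the PCI address.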

sln

^ permalink raw reply

* Re: [bisected regression] e1000e: "Detected Hardware Unit Hang"
From: Jeff Kirsher @ 2015-01-15 14:59 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Thomas Jarosch, 'Linux Netdev List', Eric Dumazet,
	e1000-devel
In-Reply-To: <1421333009.11734.53.camel@edumazet-glaptop2.roam.corp.google.com>

[-- Attachment #1: Type: text/plain, Size: 2404 bytes --]

On Thu, 2015-01-15 at 06:43 -0800, Eric Dumazet wrote:
> On Thu, 2015-01-15 at 11:11 +0100, Thomas Jarosch wrote:
> > On Wednesday, 14. January 2015 09:20:52 Eric Dumazet wrote:
> > > I would try to use lower data per txd. I am not sure 24KB is really
> > > supported.
> > > 
> > > ( check commit d821a4c4d11ad160925dab2bb009b8444beff484 for details)
> > > 
> > > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> > > > index e14fd85f64eb..8d973f7edfbd 100644
> > > --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> > > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> > > @@ -3897,7 +3897,7 @@ void e1000e_reset(struct e1000_adapter *adapter)
> > >  	 * limit of 24KB due to receive synchronization limitations.
> > >  	 */
> > >  	adapter->tx_fifo_limit = min_t(u32, ((er32(PBA) >> 16) << 10) - 96,
> > > -				       24 << 10);
> > > +				       8 << 10);
> > > 
> > >  	/* Disable Adaptive Interrupt Moderation if 2 full packets cannot
> > >  	 * fit in receive buffer.
> > 
> > Thanks for checking!
> > 
> > I just tried that change on top of git f800c25 (git HEAD), same problem. 
> > Let's see what the Intel wizards come up with.
> > 
> > What "works" is to decrease the page size in git HEAD, too:
> > 
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index 85ab7d7..9f0ef97 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -2108,7 +2108,7 @@ static inline void __skb_queue_purge(struct 
> > sk_buff_head *list)
> >                 kfree_skb(skb);
> >  }
> >  
> > -#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
> > +#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(4096)
> >  #define NETDEV_FRAG_PAGE_MAX_SIZE  (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
> >  #define NETDEV_PAGECNT_MAX_BIAS           NETDEV_FRAG_PAGE_MAX_SIZE
> > 
> > 
> > 
> > When I try a page size of 8192, it starts failing again. I'll now run
> > a stress test with 4096 to see if the problem is really gone
> > or just happens more rarely.
> 
> Sure, you basically reverted my patch.
> 
> You are not the first to report a problem caused by this patch.
> 
> This patch is known to have uncovered some driver bugs.
> 
> We are not going to revert it. We are going to fix the real bugs.
> 
> Thanks
> 
> 

Agreed, we are looking into the issue, Thomas.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* Re: Re: [bisected regression] e1000e: "Detected Hardware Unit Hang"
From: Thomas Jarosch @ 2015-01-15 14:58 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: 'Linux Netdev List', Eric Dumazet, Jeff Kirsher,
	e1000-devel
In-Reply-To: <1421333009.11734.53.camel@edumazet-glaptop2.roam.corp.google.com>

On Thursday, 15. January 2015 06:43:29 Eric Dumazet wrote:
> > -#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
> > +#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(4096)
> > 
> >  #define NETDEV_FRAG_PAGE_MAX_SIZE  (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
> >  #define NETDEV_PAGECNT_MAX_BIAS           NETDEV_FRAG_PAGE_MAX_SIZE
> > 
> > When I try a page size of 8192, it starts failing again. I'll now run
> > a stress test with 4096 to see if the problem is really gone
> > or just happens more rarely.
> 
> Sure, you basically reverted my patch.
> 
> You are not the first to report a problem caused by this patch.
> 
> This patch is known to have uncovered some driver bugs.
> 
> We are not going to revert it. We are going to fix the real bugs.
> 
> Thanks

A colleague mentioned to me he saw the "Hardware Unit Hang" message every 
few days even running on kernel 3.4 (without your patch). Basically I'm 
testing now if that's still the case with 3.19-rc4+ or not.

I'm all for fixing the root cause. I'm just interested if the e1000e
hang can even be triggered when using a max frag page size of 4096.
So far it transferred 751.6 GiB without a hiccup.
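
To make the sizes concrete, here is a small userspace sketch of how the
order and the resulting maximum fragment size work out. It assumes 4 KiB
pages; `demo_get_order()` is a simplified stand-in written for this mail,
not the kernel's get_order() implementation.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

/* Smallest order such that (DEMO_PAGE_SIZE << order) >= size. */
unsigned int demo_get_order(unsigned long size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((DEMO_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Mirrors the shape of NETDEV_FRAG_PAGE_MAX_SIZE for a given request. */
unsigned long demo_frag_page_max_size(unsigned long size)
{
	return DEMO_PAGE_SIZE << demo_get_order(size);
}
```

Under that assumption, 32768 maps to order 3, 8192 to order 1 and 4096 to
order 0, so the 4096 setting limits fragments to a single page.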

Cheers,
Thomas

^ permalink raw reply

* Re: [PATCH] e100: Don't enable WoL by default on Toshiba devices
From: Jeff Kirsher @ 2015-01-15 14:53 UTC (permalink / raw)
  To: Ondrej Zary; +Cc: David Miller, e1000-devel, netdev, linux-kernel
In-Reply-To: <201501151440.32159.linux@rainbow-software.org>

[-- Attachment #1: Type: text/plain, Size: 1444 bytes --]

On Thu, 2015-01-15 at 14:40 +0100, Ondrej Zary wrote:
> On Thursday 13 November 2014, Jeff Kirsher wrote:
> > On Wed, 2014-11-12 at 18:18 -0500, David Miller wrote:
> > > From: Ondrej Zary <linux@rainbow-software.org>
> > > Date: Wed, 12 Nov 2014 23:47:25 +0100
> > >
> > > > Enabling WoL on some Toshiba laptops (such as Portege R100) causes
> > > > battery drain after shutdown (WoL is active even on battery). These
> > > > laptops have the WoL bit set in EEPROM ID, causing e100 driver to
> > > > enable WoL by default.
> > > >
> > > > Check subsystem vendor ID and if it's Toshiba, don't enable WoL by
> > > > default from EEPROM settings.
> > > >
> > > > Fixes https://bugs.launchpad.net/ubuntu/+source/linux/+bug/110784
> > > >
> > > > Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
> > >
> > > Jeff, are you gonna pick this up?
> >
> > Yes, sorry I did not catch it earlier.
> 
> What happened to this patch? I don't see it in net.git or
> net-next.git 
> (checked both davem's and jkirsher's)

Sorry, I thought I had replied with a NAK on this patch after further
review of the changes.

We don't fix BIOS issues in the driver especially regarding feature
enablement like WoL.  We would end up with dozens of these kinds of fixes
if we were to allow this.

You should go back to the OEM and ask for a BIOS update to resolve this or
configure udev so that ethtool disables WoL.  


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* Re: [bisected regression] e1000e: "Detected Hardware Unit Hang"
From: Eric Dumazet @ 2015-01-15 14:43 UTC (permalink / raw)
  To: Thomas Jarosch
  Cc: 'Linux Netdev List', Eric Dumazet, Jeff Kirsher,
	e1000-devel
In-Reply-To: <5229621.KczjbIR22Q@storm>

On Thu, 2015-01-15 at 11:11 +0100, Thomas Jarosch wrote:
> On Wednesday, 14. January 2015 09:20:52 Eric Dumazet wrote:
> > I would try to use lower data per txd. I am not sure 24KB is really
> > supported.
> > 
> > ( check commit d821a4c4d11ad160925dab2bb009b8444beff484 for details)
> > 
> > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> > index e14fd85f64eb..8d973f7edfbd 100644
> > --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> > @@ -3897,7 +3897,7 @@ void e1000e_reset(struct e1000_adapter *adapter)
> >  	 * limit of 24KB due to receive synchronization limitations.
> >  	 */
> >  	adapter->tx_fifo_limit = min_t(u32, ((er32(PBA) >> 16) << 10) - 96,
> > -				       24 << 10);
> > +				       8 << 10);
> > 
> >  	/* Disable Adaptive Interrupt Moderation if 2 full packets cannot
> >  	 * fit in receive buffer.
> 
> Thanks for checking!
> 
> I just tried that change on top of git f800c25 (git HEAD), same problem. 
> Let's see what the Intel wizards come up with.
> 
> What "works" is to decrease the page size in git HEAD, too:
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 85ab7d7..9f0ef97 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2108,7 +2108,7 @@ static inline void __skb_queue_purge(struct sk_buff_head *list)
>                 kfree_skb(skb);
>  }
>  
> -#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
> +#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(4096)
>  #define NETDEV_FRAG_PAGE_MAX_SIZE  (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
>  #define NETDEV_PAGECNT_MAX_BIAS           NETDEV_FRAG_PAGE_MAX_SIZE
> 
> 
> 
> When I try a page size of 8192, it starts failing again. I'll now run
> a stress test with 4096 to see if the problem is really gone
> or just happens more rarely.

Sure, you basically reverted my patch.

You are not the first to report a problem caused by this patch.

This patch is known to have uncovered some driver bugs.

We are not going to revert it. We are going to fix the real bugs.

Thanks

^ permalink raw reply

* Re: [PATCH 0/2] Remove T4 FCoE support
From: Hannes Reinecke @ 2015-01-15 14:25 UTC (permalink / raw)
  To: Praveen Madhavan, netdev, linux-scsi
  Cc: davem, JBottomley, hch, hariprasad, varun
In-Reply-To: <cover.1421328605.git.praveenm@chelsio.com>

On 01/15/2015 02:45 PM, Praveen Madhavan wrote:
> These patches remove FCoE support for the Chelsio T4 adapter.
> Please apply on net-next since they depend on previous commits.
> 
> Praveen Madhavan (2):
>   csiostor:Remove T4 FCoE support.
>   csiostor:Removed file csio_hw_t4.c
> 
>  drivers/scsi/csiostor/Makefile       |   2 +-
>  drivers/scsi/csiostor/csio_hw.c      |  61 ++----
>  drivers/scsi/csiostor/csio_hw_chip.h |  43 ----
>  drivers/scsi/csiostor/csio_hw_t4.c   | 404 -----------------------------------
>  drivers/scsi/csiostor/csio_init.c    |   6 +-
>  drivers/scsi/csiostor/csio_wr.c      |  15 +-
>  6 files changed, 25 insertions(+), 506 deletions(-)
>  delete mode 100644 drivers/scsi/csiostor/csio_hw_t4.c
> 
Any particular reason for this?
It worked at one point, didn't it?

So why do you want to remove it?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* [PATCH net-next v5 1/4] netns: add rtnl cmd to add and get peer netns ids
From: Nicolas Dichtel @ 2015-01-15 14:11 UTC (permalink / raw)
  To: netdev, containers, linux-kernel, linux-api
  Cc: davem, ebiederm, stephen, akpm, luto, cwang, Nicolas Dichtel
In-Reply-To: <1421331078-21622-1-git-send-email-nicolas.dichtel@6wind.com>

With this patch, a user can define an id for a peer netns by providing a FD or a
PID. These ids are local to the netns where they are added (i.e. valid only
within this netns).

The main function (i.e. the one exported to other modules), peernet2id(),
returns the id of a peer netns. If no id has been assigned by the user, this
function allocates one.

These ids will be used in netlink messages to point to a peer netns, for example
in the case of an x-netns interface.
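
The lookup-or-allocate behaviour can be sketched in plain userspace C as
follows. This is only an illustration under simplifying assumptions: a
fixed-size array stands in for the idr, there is no locking, and every
`demo_*` / `DEMO_*` name is hypothetical. It also shows why a sentinel is
needed for id 0 when the iterator stops on the first non-zero return value.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_IDS 8
#define DEMO_ID_ZERO (-1)	/* stands in for id 0 inside the iterator */

/* Fixed-size table mapping an id (the index) to a peer pointer. */
struct demo_table {
	const void *slot[DEMO_MAX_IDS];
};

typedef int (*demo_cb)(int id, const void *entry, const void *data);

/* Call fn on every used slot; stop and return its first non-zero result. */
static int demo_for_each(const struct demo_table *t, demo_cb fn,
			 const void *data)
{
	int id, ret;

	for (id = 0; id < DEMO_MAX_IDS; id++) {
		if (!t->slot[id])
			continue;
		ret = fn(id, t->slot[id], data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Returning a plain 0 for id 0 would not stop the walk, hence the sentinel. */
static int demo_match(int id, const void *entry, const void *data)
{
	if (entry == data)
		return id ? id : DEMO_ID_ZERO;
	return 0;
}

/*
 * Return the id of peer, allocating the lowest free id if none exists yet.
 * Returns -1 when the table is full.
 */
int demo_peer2id(struct demo_table *t, const void *peer)
{
	int id;

	assert(t && peer);

	id = demo_for_each(t, demo_match, peer);
	if (id == DEMO_ID_ZERO)
		return 0;
	if (id > 0)
		return id;

	for (id = 0; id < DEMO_MAX_IDS; id++) {
		if (!t->slot[id]) {
			t->slot[id] = peer;
			return id;
		}
	}
	return -1;
}
```

Asking twice for the same peer yields the same id, and the first peer
registered gets id 0, which is the case the sentinel exists for.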

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 MAINTAINERS                        |   1 +
 include/net/net_namespace.h        |   4 +
 include/uapi/linux/Kbuild          |   1 +
 include/uapi/linux/net_namespace.h |  23 ++++
 include/uapi/linux/rtnetlink.h     |   5 +
 net/core/net_namespace.c           | 210 +++++++++++++++++++++++++++++++++++++
 6 files changed, 244 insertions(+)
 create mode 100644 include/uapi/linux/net_namespace.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 9de900572633..9b91d9f0257e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6578,6 +6578,7 @@ F:	include/linux/netdevice.h
 F:	include/uapi/linux/in.h
 F:	include/uapi/linux/net.h
 F:	include/uapi/linux/netdevice.h
+F:	include/uapi/linux/net_namespace.h
 F:	tools/net/
 F:	tools/testing/selftests/net/
 F:	lib/random32.c
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 2e8756b8c775..36faf4990c4b 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -60,6 +60,7 @@ struct net {
 	struct list_head	exit_list;	/* Use only net_mutex */
 
 	struct user_namespace   *user_ns;	/* Owning user namespace */
+	struct idr		netns_ids;
 
 	struct ns_common	ns;
 
@@ -290,6 +291,9 @@ static inline struct net *read_pnet(struct net * const *pnet)
 #define __net_initconst	__initconst
 #endif
 
+int peernet2id(struct net *net, struct net *peer);
+struct net *get_net_ns_by_id(struct net *net, int id);
+
 struct pernet_operations {
 	struct list_head list;
 	int (*init)(struct net *net);
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 00b100023c47..14b7b6e44c77 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -283,6 +283,7 @@ header-y += net.h
 header-y += netlink_diag.h
 header-y += netlink.h
 header-y += netrom.h
+header-y += net_namespace.h
 header-y += net_tstamp.h
 header-y += nfc.h
 header-y += nfs2.h
diff --git a/include/uapi/linux/net_namespace.h b/include/uapi/linux/net_namespace.h
new file mode 100644
index 000000000000..778cd2c3ebf4
--- /dev/null
+++ b/include/uapi/linux/net_namespace.h
@@ -0,0 +1,23 @@
+/* Copyright (c) 2015 6WIND S.A.
+ * Author: Nicolas Dichtel <nicolas.dichtel@6wind.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+#ifndef _UAPI_LINUX_NET_NAMESPACE_H_
+#define _UAPI_LINUX_NET_NAMESPACE_H_
+
+/* Attributes of RTM_NEWNSID/RTM_GETNSID messages */
+enum {
+	NETNSA_NONE,
+#define NETNSA_NSID_NOT_ASSIGNED -1
+	NETNSA_NSID,
+	NETNSA_PID,
+	NETNSA_FD,
+	__NETNSA_MAX,
+};
+
+#define NETNSA_MAX		(__NETNSA_MAX - 1)
+
+#endif /* _UAPI_LINUX_NET_NAMESPACE_H_ */
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index a1d18593f41e..5cc5d66bf519 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -132,6 +132,11 @@ enum {
 	RTM_GETMDB = 86,
 #define RTM_GETMDB RTM_GETMDB
 
+	RTM_NEWNSID = 88,
+#define RTM_NEWNSID RTM_NEWNSID
+	RTM_GETNSID = 90,
+#define RTM_GETNSID RTM_GETNSID
+
 	__RTM_MAX,
 #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
 };
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index ce780c722e48..edf089dd792a 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -15,6 +15,10 @@
 #include <linux/file.h>
 #include <linux/export.h>
 #include <linux/user_namespace.h>
+#include <linux/net_namespace.h>
+#include <linux/rtnetlink.h>
+#include <net/sock.h>
+#include <net/netlink.h>
 #include <net/net_namespace.h>
 #include <net/netns/generic.h>
 
@@ -144,6 +148,77 @@ static void ops_free_list(const struct pernet_operations *ops,
 	}
 }
 
+static int alloc_netid(struct net *net, struct net *peer, int reqid)
+{
+	int min = 0, max = 0;
+
+	ASSERT_RTNL();
+
+	if (reqid >= 0) {
+		min = reqid;
+		max = reqid + 1;
+	}
+
+	return idr_alloc(&net->netns_ids, peer, min, max, GFP_KERNEL);
+}
+
+/* This function is used by idr_for_each(). If net is equal to peer, the
+ * function returns the id so that idr_for_each() stops. Because we cannot
+ * return the id 0 (idr_for_each() would not stop), we return the magic value
+ * NET_ID_ZERO (-1) for it.
+ */
+#define NET_ID_ZERO -1
+static int net_eq_idr(int id, void *net, void *peer)
+{
+	if (net_eq(net, peer))
+		return id ? : NET_ID_ZERO;
+	return 0;
+}
+
+static int __peernet2id(struct net *net, struct net *peer, bool alloc)
+{
+	int id = idr_for_each(&net->netns_ids, net_eq_idr, peer);
+
+	ASSERT_RTNL();
+
+	/* Magic value for id 0. */
+	if (id == NET_ID_ZERO)
+		return 0;
+	if (id > 0)
+		return id;
+
+	if (alloc)
+		return alloc_netid(net, peer, -1);
+
+	return -ENOENT;
+}
+
+/* This function returns the id of a peer netns. If no id is assigned, one will
+ * be allocated and returned.
+ */
+int peernet2id(struct net *net, struct net *peer)
+{
+	int id = __peernet2id(net, peer, true);
+
+	return id >= 0 ? id : NETNSA_NSID_NOT_ASSIGNED;
+}
+
+struct net *get_net_ns_by_id(struct net *net, int id)
+{
+	struct net *peer;
+
+	if (id < 0)
+		return NULL;
+
+	rcu_read_lock();
+	peer = idr_find(&net->netns_ids, id);
+	if (peer)
+		get_net(peer);
+	rcu_read_unlock();
+
+	return peer;
+}
+
 /*
  * setup_net runs the initializers for the network namespace object.
  */
@@ -158,6 +233,7 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 	atomic_set(&net->passive, 1);
 	net->dev_base_seq = 1;
 	net->user_ns = user_ns;
+	idr_init(&net->netns_ids);
 
 #ifdef NETNS_REFCNT_DEBUG
 	atomic_set(&net->use_count, 0);
@@ -288,6 +364,14 @@ static void cleanup_net(struct work_struct *work)
 	list_for_each_entry(net, &net_kill_list, cleanup_list) {
 		list_del_rcu(&net->list);
 		list_add_tail(&net->exit_list, &net_exit_list);
+		for_each_net(tmp) {
+			int id = __peernet2id(tmp, net, false);
+
+			if (id >= 0)
+				idr_remove(&tmp->netns_ids, id);
+		}
+		idr_destroy(&net->netns_ids);
+
 	}
 	rtnl_unlock();
 
@@ -402,6 +486,129 @@ static struct pernet_operations __net_initdata net_ns_ops = {
 	.exit = net_ns_net_exit,
 };
 
+static struct nla_policy rtnl_net_policy[NETNSA_MAX + 1] = {
+	[NETNSA_NONE]		= { .type = NLA_UNSPEC },
+	[NETNSA_NSID]		= { .type = NLA_S32 },
+	[NETNSA_PID]		= { .type = NLA_U32 },
+	[NETNSA_FD]		= { .type = NLA_U32 },
+};
+
+static int rtnl_net_newid(struct sk_buff *skb, struct nlmsghdr *nlh)
+{
+	struct net *net = sock_net(skb->sk);
+	struct nlattr *tb[NETNSA_MAX + 1];
+	struct net *peer;
+	int nsid, err;
+
+	err = nlmsg_parse(nlh, sizeof(struct rtgenmsg), tb, NETNSA_MAX,
+			  rtnl_net_policy);
+	if (err < 0)
+		return err;
+	if (!tb[NETNSA_NSID])
+		return -EINVAL;
+	nsid = nla_get_s32(tb[NETNSA_NSID]);
+
+	if (tb[NETNSA_PID])
+		peer = get_net_ns_by_pid(nla_get_u32(tb[NETNSA_PID]));
+	else if (tb[NETNSA_FD])
+		peer = get_net_ns_by_fd(nla_get_u32(tb[NETNSA_FD]));
+	else
+		return -EINVAL;
+	if (IS_ERR(peer))
+		return PTR_ERR(peer);
+
+	if (__peernet2id(net, peer, false) >= 0) {
+		err = -EEXIST;
+		goto out;
+	}
+
+	err = alloc_netid(net, peer, nsid);
+	if (err > 0)
+		err = 0;
+out:
+	put_net(peer);
+	return err;
+}
+
+static int rtnl_net_get_size(void)
+{
+	return NLMSG_ALIGN(sizeof(struct rtgenmsg))
+	       + nla_total_size(sizeof(s32)) /* NETNSA_NSID */
+	       ;
+}
+
+static int rtnl_net_fill(struct sk_buff *skb, u32 portid, u32 seq, int flags,
+			 int cmd, struct net *net, struct net *peer)
+{
+	struct nlmsghdr *nlh;
+	struct rtgenmsg *rth;
+	int id;
+
+	ASSERT_RTNL();
+
+	nlh = nlmsg_put(skb, portid, seq, cmd, sizeof(*rth), flags);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	rth = nlmsg_data(nlh);
+	rth->rtgen_family = AF_UNSPEC;
+
+	id = __peernet2id(net, peer, false);
+	if  (id < 0)
+		id = NETNSA_NSID_NOT_ASSIGNED;
+	if (nla_put_s32(skb, NETNSA_NSID, id))
+		goto nla_put_failure;
+
+	return nlmsg_end(skb, nlh);
+
+nla_put_failure:
+	nlmsg_cancel(skb, nlh);
+	return -EMSGSIZE;
+}
+
+static int rtnl_net_getid(struct sk_buff *skb, struct nlmsghdr *nlh)
+{
+	struct net *net = sock_net(skb->sk);
+	struct nlattr *tb[NETNSA_MAX + 1];
+	struct sk_buff *msg;
+	int err = -ENOBUFS;
+	struct net *peer;
+
+	err = nlmsg_parse(nlh, sizeof(struct rtgenmsg), tb, NETNSA_MAX,
+			  rtnl_net_policy);
+	if (err < 0)
+		return err;
+	if (tb[NETNSA_PID])
+		peer = get_net_ns_by_pid(nla_get_u32(tb[NETNSA_PID]));
+	else if (tb[NETNSA_FD])
+		peer = get_net_ns_by_fd(nla_get_u32(tb[NETNSA_FD]));
+	else
+		return -EINVAL;
+
+	if (IS_ERR(peer))
+		return PTR_ERR(peer);
+
+	msg = nlmsg_new(rtnl_net_get_size(), GFP_KERNEL);
+	if (!msg) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	err = rtnl_net_fill(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq, 0,
+			    RTM_GETNSID, net, peer);
+	if (err < 0)
+		goto err_out;
+
+	err = rtnl_unicast(msg, net, NETLINK_CB(skb).portid);
+	goto out;
+
+err_out:
+	nlmsg_free(msg);
+out:
+	put_net(peer);
+	return err;
+}
+
 static int __init net_ns_init(void)
 {
 	struct net_generic *ng;
@@ -435,6 +642,9 @@ static int __init net_ns_init(void)
 
 	register_pernet_subsys(&net_ns_ops);
 
+	rtnl_register(PF_UNSPEC, RTM_NEWNSID, rtnl_net_newid, NULL, NULL);
+	rtnl_register(PF_UNSPEC, RTM_GETNSID, rtnl_net_getid, NULL, NULL);
+
 	return 0;
 }
 
-- 
2.2.2

^ permalink raw reply related
