Netdev List
* Re: IPv4 / IPv6 over IPv4 IPsec tunnel: setting the DF bit
From: Hannes Frederic Sowa @ 2014-01-30 14:21 UTC (permalink / raw)
  To: Simon Schneider; +Cc: netdev
In-Reply-To: <trinity-bc5263ea-896d-4350-aa80-fd2895b54b3b-1391084710336@3capp-gmx-bs28>

On Thu, Jan 30, 2014 at 01:25:10PM +0100, Simon Schneider wrote:
> Hi,
> for the scenarios
> - IPv4 over IPv4 IPsec tunnel
> - IPv6 over IPv4 IPsec tunnel
> 
> I wonder how the DF bit of the outer (encrypted) packet is set.
> 
> There are generally three options:
> - DF bit always 0
> - DF bit always 1
> - DF bit copied from inner packet

There is a pmtudisc knob on ip tunnel ... to force DF bit on outgoing packets,
but DF bit should get copied from inner packet up to tunnel header in every
case.
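
For illustration only, the three options from the question can be written
down as a small user-space C model. This is a sketch, not the kernel's xfrm
or ip_tunnel code: the enum, its values and the function name are invented
here, and only the 0x4000 flag value is the standard IPv4 DF bit. It covers
an IPv4 inner packet only; an IPv6 inner header has no DF bit to copy.

```c
#include <assert.h>
#include <stdint.h>

#define IPV4_DF 0x4000u /* "don't fragment" flag in the IPv4 flags/offset field */

/* Hypothetical policy names, one per option listed in the question. */
enum df_policy { DF_CLEAR, DF_SET, DF_COPY };

/*
 * Return the flags/offset field (host byte order) for the outer IPv4
 * header. inner_frag_off is the same field taken from the inner IPv4
 * header, also in host byte order.
 */
uint16_t outer_frag_off(enum df_policy policy, uint16_t inner_frag_off)
{
	switch (policy) {
	case DF_CLEAR:
		return 0;
	case DF_SET:
		return IPV4_DF;
	case DF_COPY:
		return inner_frag_off & IPV4_DF;
	}
	assert(!"unknown policy");
	return 0;
}
```

With DF_COPY, an inner packet that allows fragmentation yields an outer
packet that allows it too, which is the "copied from inner packet" case.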

Greetings,

  Hannes


* Re: [PATCH] rtnetlink: return the newly created link in response to newlink
From: Thomas Graf @ 2014-01-30 14:27 UTC (permalink / raw)
  To: Tom Gundersen
  Cc: netdev, linux-kernel, John Fastabend, Nicolas Dichtel,
	Vlad Yasevich, Marcel Holtmann, David S. Miller
In-Reply-To: <1391087144-24490-1-git-send-email-teg@jklm.no>

On 01/30/14 at 02:05pm, Tom Gundersen wrote:
> Userspace needs to reliably know the ifindex of the netdevs it creates,
> as we cannot rely on the ifname staying unchanged.
> 
> Earlier, a simple NLMSG_ERROR would be returned, but this returns the
> corresponding RTM_NEWLINK on success instead.

This breaks existing Netlink applications in user space. User space
apps are not prepared to receive both a RTM_NEWLINK reply _and_
the ACK unless they have set NLM_F_ECHO in the original request.

You can already reliably retrieve the ifindex by listening to
RTNLGRP_LINK messages and be notified about the link created
including all follow-up renames.
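
A minimal user-space sketch of that approach, assuming a plain
NETLINK_ROUTE socket and no helper library. The two function names are
made up for this example; the socket calls, RTMGRP_LINK (the legacy
bitmask form of RTNLGRP_LINK) and the NLMSG_* macros are the standard
ones from the Linux uapi headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Open a socket subscribed to link notifications; returns an fd or -1. */
int open_link_monitor(void)
{
	struct sockaddr_nl addr;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = RTMGRP_LINK;
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Walk a buffer of netlink messages, as filled in by recv() on the
 * socket above, and return the ifindex of the first RTM_NEWLINK
 * message, or -1 if the buffer contains none.
 */
int first_newlink_ifindex(const void *buf, size_t len)
{
	const struct nlmsghdr *nlh = buf;
	int remaining = (int)len;

	assert(buf != NULL);
	for (; NLMSG_OK(nlh, remaining); nlh = NLMSG_NEXT(nlh, remaining)) {
		if (nlh->nlmsg_type == RTM_NEWLINK &&
		    nlh->nlmsg_len >= NLMSG_LENGTH(sizeof(struct ifinfomsg))) {
			const struct ifinfomsg *ifi = NLMSG_DATA(nlh);

			return ifi->ifi_index;
		}
	}
	return -1;
}
```

A real listener would also compare IFLA_IFNAME or the request sequence
number to make sure the notification belongs to the link it just created;
that part is left out here.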


* [PATCH 0/5] can: sja1000: cleanups and new OF property
From: Florian Vaussard @ 2014-01-30 14:29 UTC (permalink / raw)
  To: Wolfgang Grandegger, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, florian.vaussard

Hello,

The first part of this series performs several small cleanups
(patches 1 to 3).

The second part introduces the 'reg-io-width' binding (already used
by some other drivers) to perform a similar job as what was done
with IORESOURCE_MEM_XXBIT on the sja1000_platform. This is needed
on my system to correctly take into account the aliasing of the
address bus.

All patches were tested on my OMAP3 system with a memory-mapped
SJA1000.

Regards,
Florian

Florian Vaussard (5):
  can: sja1000: remove unused defines
  can: sja1000: convert printk to use netdev API
  can: sja1000: of: use devm_* APIs
  Documentation: devicetree: sja1000: add reg-io-width binding
  can: sja1000: of: add read/write routines for 8, 16 and 32-bit
    register access

 .../devicetree/bindings/net/can/sja1000.txt        |  4 ++
 drivers/net/can/sja1000/sja1000.c                  |  3 +-
 drivers/net/can/sja1000/sja1000_of_platform.c      | 66 ++++++++++++++--------
 3 files changed, 46 insertions(+), 27 deletions(-)

-- 
1.8.1.2


* [PATCH 1/5] can: sja1000: remove unused defines
From: Florian Vaussard @ 2014-01-30 14:29 UTC (permalink / raw)
  To: Wolfgang Grandegger, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, florian.vaussard
In-Reply-To: <1391092168-21246-1-git-send-email-florian.vaussard@epfl.ch>

Remove unused defines for the OF platform.

Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
---
 drivers/net/can/sja1000/sja1000_of_platform.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/can/sja1000/sja1000_of_platform.c b/drivers/net/can/sja1000/sja1000_of_platform.c
index 047accd..2f29eb9 100644
--- a/drivers/net/can/sja1000/sja1000_of_platform.c
+++ b/drivers/net/can/sja1000/sja1000_of_platform.c
@@ -55,9 +55,6 @@ MODULE_LICENSE("GPL v2");
 
 #define SJA1000_OFP_CAN_CLOCK  (16000000 / 2)
 
-#define SJA1000_OFP_OCR        OCR_TX0_PULLDOWN
-#define SJA1000_OFP_CDR        (CDR_CBP | CDR_CLK_OFF)
-
 static u8 sja1000_ofp_read_reg(const struct sja1000_priv *priv, int reg)
 {
 	return ioread8(priv->reg_base + reg);
-- 
1.8.1.2


* [PATCH 2/5] can: sja1000: convert printk to use netdev API
From: Florian Vaussard @ 2014-01-30 14:29 UTC (permalink / raw)
  To: Wolfgang Grandegger, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, florian.vaussard
In-Reply-To: <1391092168-21246-1-git-send-email-florian.vaussard@epfl.ch>

Use netdev_* where applicable.

Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
---
 drivers/net/can/sja1000/sja1000.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/can/sja1000/sja1000.c b/drivers/net/can/sja1000/sja1000.c
index f17c301..55cce47 100644
--- a/drivers/net/can/sja1000/sja1000.c
+++ b/drivers/net/can/sja1000/sja1000.c
@@ -106,8 +106,7 @@ static int sja1000_probe_chip(struct net_device *dev)
 	struct sja1000_priv *priv = netdev_priv(dev);
 
 	if (priv->reg_base && sja1000_is_absent(priv)) {
-		printk(KERN_INFO "%s: probing @0x%lX failed\n",
-		       DRV_NAME, dev->base_addr);
+		netdev_err(dev, "probing failed\n");
 		return 0;
 	}
 	return -1;
-- 
1.8.1.2


* [PATCH 3/5] can: sja1000: of: use devm_* APIs
From: Florian Vaussard @ 2014-01-30 14:29 UTC (permalink / raw)
  To: Wolfgang Grandegger, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, florian.vaussard
In-Reply-To: <1391092168-21246-1-git-send-email-florian.vaussard@epfl.ch>

Simplify probe and remove functions by converting most of the resources
to use devm_* APIs.

Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
---
 drivers/net/can/sja1000/sja1000_of_platform.c | 22 +++++-----------------
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/drivers/net/can/sja1000/sja1000_of_platform.c b/drivers/net/can/sja1000/sja1000_of_platform.c
index 2f29eb9..8ebb4af 100644
--- a/drivers/net/can/sja1000/sja1000_of_platform.c
+++ b/drivers/net/can/sja1000/sja1000_of_platform.c
@@ -69,18 +69,11 @@ static void sja1000_ofp_write_reg(const struct sja1000_priv *priv,
 static int sja1000_ofp_remove(struct platform_device *ofdev)
 {
 	struct net_device *dev = platform_get_drvdata(ofdev);
-	struct sja1000_priv *priv = netdev_priv(dev);
-	struct device_node *np = ofdev->dev.of_node;
-	struct resource res;
 
 	unregister_sja1000dev(dev);
 	free_sja1000dev(dev);
-	iounmap(priv->reg_base);
 	irq_dispose_mapping(dev->irq);
 
-	of_address_to_resource(np, 0, &res);
-	release_mem_region(res.start, resource_size(&res));
-
 	return 0;
 }
 
@@ -102,23 +95,22 @@ static int sja1000_ofp_probe(struct platform_device *ofdev)
 
 	res_size = resource_size(&res);
 
-	if (!request_mem_region(res.start, res_size, DRV_NAME)) {
+	if (!devm_request_mem_region(&ofdev->dev,
+				     res.start, res_size, DRV_NAME)) {
 		dev_err(&ofdev->dev, "couldn't request %pR\n", &res);
 		return -EBUSY;
 	}
 
-	base = ioremap_nocache(res.start, res_size);
+	base = devm_ioremap_nocache(&ofdev->dev, res.start, res_size);
 	if (!base) {
 		dev_err(&ofdev->dev, "couldn't ioremap %pR\n", &res);
-		err = -ENOMEM;
-		goto exit_release_mem;
+		return -ENOMEM;
 	}
 
 	irq = irq_of_parse_and_map(np, 0);
 	if (irq == 0) {
 		dev_err(&ofdev->dev, "no irq found\n");
-		err = -ENODEV;
-		goto exit_unmap_mem;
+		return -ENODEV;
 	}
 
 	dev = alloc_sja1000dev(0);
@@ -191,10 +183,6 @@ exit_free_sja1000:
 	free_sja1000dev(dev);
 exit_dispose_irq:
 	irq_dispose_mapping(irq);
-exit_unmap_mem:
-	iounmap(base);
-exit_release_mem:
-	release_mem_region(res.start, res_size);
 
 	return err;
 }
-- 
1.8.1.2


* [PATCH 4/5] Documentation: devicetree: sja1000: add reg-io-width binding
From: Florian Vaussard @ 2014-01-30 14:29 UTC (permalink / raw)
  To: Wolfgang Grandegger, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, florian.vaussard, Grant Likely,
	Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	devicetree
In-Reply-To: <1391092168-21246-1-git-send-email-florian.vaussard@epfl.ch>

Add the reg-io-width property to describe the width of the memory
accesses.

Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
---
 Documentation/devicetree/bindings/net/can/sja1000.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/can/sja1000.txt b/Documentation/devicetree/bindings/net/can/sja1000.txt
index f2105a4..b4a6d53 100644
--- a/Documentation/devicetree/bindings/net/can/sja1000.txt
+++ b/Documentation/devicetree/bindings/net/can/sja1000.txt
@@ -12,6 +12,10 @@ Required properties:
 
 Optional properties:
 
+- reg-io-width : Specify the size (in bytes) of the IO accesses that
+	should be performed on the device.  Valid value is 1, 2 or 4.
+	Default to 1 (8 bits).
+
 - nxp,external-clock-frequency : Frequency of the external oscillator
 	clock in Hz. Note that the internal clock frequency used by the
 	SJA1000 is half of that value. If not specified, a default value
-- 
1.8.1.2


* [PATCH 5/5] can: sja1000: of: add read/write routines for 8, 16 and 32-bit register access
From: Florian Vaussard @ 2014-01-30 14:29 UTC (permalink / raw)
  To: Wolfgang Grandegger, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, florian.vaussard, Grant Likely,
	Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	devicetree
In-Reply-To: <1391092168-21246-1-git-send-email-florian.vaussard@epfl.ch>

Add routines for 8, 16 and 32-bit access like in sja1000_platform.c

Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
---
 drivers/net/can/sja1000/sja1000_of_platform.c | 41 +++++++++++++++++++++++----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/drivers/net/can/sja1000/sja1000_of_platform.c b/drivers/net/can/sja1000/sja1000_of_platform.c
index 8ebb4af..a9a0696 100644
--- a/drivers/net/can/sja1000/sja1000_of_platform.c
+++ b/drivers/net/can/sja1000/sja1000_of_platform.c
@@ -55,17 +55,39 @@ MODULE_LICENSE("GPL v2");
 
 #define SJA1000_OFP_CAN_CLOCK  (16000000 / 2)
 
-static u8 sja1000_ofp_read_reg(const struct sja1000_priv *priv, int reg)
+static u8 sja1000_ofp_read_reg8(const struct sja1000_priv *priv, int reg)
 {
 	return ioread8(priv->reg_base + reg);
 }
 
-static void sja1000_ofp_write_reg(const struct sja1000_priv *priv,
-				  int reg, u8 val)
+static void sja1000_ofp_write_reg8(const struct sja1000_priv *priv,
+				   int reg, u8 val)
 {
 	iowrite8(val, priv->reg_base + reg);
 }
 
+static u8 sja1000_ofp_read_reg16(const struct sja1000_priv *priv, int reg)
+{
+	return ioread8(priv->reg_base + reg * 2);
+}
+
+static void sja1000_ofp_write_reg16(const struct sja1000_priv *priv,
+				    int reg, u8 val)
+{
+	iowrite8(val, priv->reg_base + reg * 2);
+}
+
+static u8 sja1000_ofp_read_reg32(const struct sja1000_priv *priv, int reg)
+{
+	return ioread8(priv->reg_base + reg * 4);
+}
+
+static void sja1000_ofp_write_reg32(const struct sja1000_priv *priv,
+				    int reg, u8 val)
+{
+	iowrite8(val, priv->reg_base + reg * 4);
+}
+
 static int sja1000_ofp_remove(struct platform_device *ofdev)
 {
 	struct net_device *dev = platform_get_drvdata(ofdev);
@@ -121,8 +143,17 @@ static int sja1000_ofp_probe(struct platform_device *ofdev)
 
 	priv = netdev_priv(dev);
 
-	priv->read_reg = sja1000_ofp_read_reg;
-	priv->write_reg = sja1000_ofp_write_reg;
+	of_property_read_u32(np, "reg-io-width", &prop);
+	if (prop == 4) {
+		priv->read_reg = sja1000_ofp_read_reg32;
+		priv->write_reg = sja1000_ofp_write_reg32;
+	} else if (prop == 2) {
+		priv->read_reg = sja1000_ofp_read_reg16;
+		priv->write_reg = sja1000_ofp_write_reg16;
+	} else {
+		priv->read_reg = sja1000_ofp_read_reg8;
+		priv->write_reg = sja1000_ofp_write_reg8;
+	}
 
 	err = of_property_read_u32(np, "nxp,external-clock-frequency", &prop);
 	if (!err)
-- 
1.8.1.2
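
Two details of the patch above can be modelled in plain C. First, the
"16" and "32" accessors still perform 8-bit reads and writes; what
changes is the spacing between registers on the bus. Second, the hunk
ignores the return value of of_property_read_u32(), which leaves its
output untouched when the property is missing, so if `prop` is not
initialised earlier in the probe function (its declaration is not
visible in the hunk) the comparison would read an indeterminate value.
The sketch below is a user-space model under those assumptions; both
function names are invented for illustration and `err` merely stands in
for the kernel helper's return code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of SJA1000 register number `reg` when each register
 * occupies `width` bytes on the bus (reg-io-width = 1, 2 or 4).
 */
unsigned int sja1000_reg_offset(unsigned int reg, uint32_t width)
{
	assert(width == 1 || width == 2 || width == 4);
	return reg * width;
}

/*
 * Pick the effective width. A non-zero `err` means the property was
 * absent or unreadable, in which case `prop` must not be trusted.
 * Unsupported values also fall back to the documented default.
 */
uint32_t effective_io_width(int err, uint32_t prop)
{
	if (err || (prop != 2 && prop != 4))
		return 1; /* default: 8-bit register spacing */
	return prop;
}
```

In the driver the same effect could be had by checking the return value
and assigning the default before the if/else chain.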


* Re: [PATCH V2 0/4] misc: xgene: Add support for APM X-Gene SoC Queue Manager/Traffic Manager
From: Arnd Bergmann @ 2014-01-30 14:35 UTC (permalink / raw)
  To: Ravi Patel
  Cc: devicetree@vger.kernel.org, Jon Masters, Greg KH, patches@apm.com,
	linux-kernel, Loc Ho, netdev, Keyur Chudgar, davem,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <CAN1v_Ps9U=yjG-G40FK+L9SLaAQp7s8j9mg9DNoiGwqjMiGtiQ@mail.gmail.com>

On Tuesday 28 January 2014, Ravi Patel wrote:
> On Tue, Jan 14, 2014 at 7:15 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> > For the DT binding, I would suggest using something along the lines of
> > what we have for clocks, pinctrl and dmaengine. OMAP doesn't use this
> > (yet), but now would be a good time to standardize it. The QMTM node
> > should define a "#mailbox-cells" property that indicates how many
> > 32-bit cells a qmtm needs to describe the connection between the
> > controller and the slave. My best guess is that this would be hardcoded
> > to <3>, using two cells for a 64-bit FIFO bus address, and a 32-bit cell
> > for the slave-id signal number. All other parameters that you have in
> > the big table in the qmtm driver at the moment can then get moved into
> > the slave drivers, as they are constant per type of slave. This will
> > simplify the QMTM driver.
> >
> > In the slave, you should have a "mailboxes" property with a phandle
> > to the qmtm followed by the three cells to identify the actual
> > queue. If it's possible that a device uses more than one rx and
> > one tx queue, we also need a "mailbox-names" property to identify
> > the individual queues.
> 
> We explored the DT binding suggestion you gave and have come up with a
> sample DT binding, shown below. Would you please review it and give us
> your comments before we change our driver and DTS file to accommodate it?
> 
> Sample DTS node for QM:
>                 qmlite: qmtm@17030000 {
>                         compatible = "apm,xgene-qmtm-lite";

I would use 'mailbox@17030000' as the node name, as the name part
is supposed to be descriptive of the function rather than the implementation.

>                         reg = <0x0 0x17030000 0x0 0x10000>,
>                               <0x0 0x10000000 0x0 0x400000>;
>                         interrupts = <0x0 0x40 0x4>,
>                                      <0x0 0x3c 0x4>;
>                         status = "ok";
>                         #clock-cells = <1>;
>                         clocks = <&qmlclk 0>;
>                         #mailbox-cells = <3>;
>                 };

The #clock-cells seems misplaced here, unless this is also a clock
provider, which I don't think it is.

> 
> Sample DTS node for Ethernet:
>                 menet: ethernet@17020000 {
>                         compatible = "apm,xgene-enet";
>                         status = "disabled";
>                         reg = <0x0 0x17020000 0x0 0x30>,
>                               <0x0 0x17020000 0x0 0x10000>,
>                               <0x0 0x17020000 0x0 0x20>;

Unrelated, but it seems strange to have three register sets of different
sizes at the same offset.

>                         mailboxes = <&qmlite 0x0 0x1000002c 0x0000>,
>                                             <&qmlite 0x0 0x10000052 0x0020>,
>                                             <&qmlite 0x0 0x10000060 0x0f00>;
>                         mailbox-names = "mb-tx", "mb-fp", "mb-rx";

I would leave out the "mb-" part of the strings and just document them
as "tx", "rx" and "fp".

>                         interrupts = <0x0 0x38 0x4>,
>                                      <0x0 0x39 0x4>,
>                                      <0x0 0x3a 0x4>;
>                         #clock-cells = <1>;

Same comment about #clock-cells here.

>                         clocks = <&eth8clk 0>;
>                         local-mac-address = <0x0 0x11 0x3a 0x8a 0x5a 0x78>;
>                         max-frame-size = <0x233a>;
>                         phyid = <3>;
>                         phy-mode = "rgmii";
>                 };
> 
> The mailbox node in DTS has following format:
> mailbox = <&parent 'higher 32 bit bus address' 'lower 32 bit bus
> address' 'signal id'>

sounds good.
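
As a sketch of how a consumer might interpret the three cells that follow
the phandle in that format: the struct and function names below are
hypothetical and not part of the QMTM driver or any mailbox framework,
and the cells are assumed to have been converted to CPU byte order
already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one decoded "mailboxes" specifier. */
struct qmtm_mbox_spec {
	uint64_t bus_addr;  /* cell 0 (upper 32 bits) and cell 1 (lower 32 bits) */
	uint32_t signal_id; /* cell 2 */
};

/* Decode the three cells that follow the phandle. */
struct qmtm_mbox_spec qmtm_decode_cells(const uint32_t cells[3])
{
	struct qmtm_mbox_spec spec;

	assert(cells != NULL);
	spec.bus_addr = ((uint64_t)cells[0] << 32) | cells[1];
	spec.signal_id = cells[2];
	return spec;
}
```

For the first entry of the sample Ethernet node, <&qmlite 0x0 0x1000002c
0x0000>, this gives bus address 0x1000002c and signal id 0.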

> Ethernet driver will call following function in platform_probe:
>  mailbox_get(dev, "mb-tx");
> mailbox_get API will return the context of the allocated and configured mailbox.
> For now, mailbox_get API will be implemented in xgene QMTM driver.
> Eventually when mailbox framework in Linux will be standardized, we
> will use it instead.

Ok.

> > For the in-kernel interfaces, we should probably start a conversation
> > with the owners of the mailbox drivers to get a common API, for now
> > I'd suggest you just leave it as it is, and only adapt for the new
> > binding.
> 
> Sure. For now we will put our driver mostly as is in the
> drivers/mailbox. Can you please help us identify the owners of the
> mailbox drivers? As you suggested, we can start conversation with them
> to define common in kernel APIs.
 
Please talk to "Suman Anna" <s-anna@ti.com> for the TI part and Rob
Herring <robh@kernel.org> for pl320. The pl320 driver was written
by Mark Langsdorf for Calxeda, but I don't have an updated email
address for him and assume that the calxeda address is no longer
functional.

	Arnd


* Re: [PATCH net] net/ipv4: Use proper RCU APIs for writer-side in udp_offload.c
From: Eric Dumazet @ 2014-01-30 14:39 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: davem, netdev, edumazet, Shlomo Pongratz
In-Reply-To: <1391079118-23922-1-git-send-email-ogerlitz@mellanox.com>

On Thu, 2014-01-30 at 12:51 +0200, Or Gerlitz wrote:
> From: Shlomo Pongratz <shlomop@mellanox.com>
> 
> RCU writer side should use rcu_dereference_protected() and not
> rcu_dereference(), fix that. This also removes the "suspicious RCU usage"
> warning seen when running with CONFIG_PROVE_RCU.
> 
> Fixes: b582ef0 ('net: Add GRO support for UDP encapsulating protocols')
> Reported-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
> ---
>  net/ipv4/udp_offload.c |   14 +++++++++-----
>  1 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 2ffea6f..1bf21d4 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -109,7 +109,8 @@ int udp_add_offload(struct udp_offload *uo)
>  	new_offload->offload = uo;
>  
>  	spin_lock(&udp_offload_lock);
> -	rcu_assign_pointer(new_offload->next, rcu_dereference(*head));
> +	rcu_assign_pointer(new_offload->next,
> +			   rcu_dereference_protected(*head, lockdep_is_held(&udp_offload_lock)));

Technically speaking, this first rcu_assign_pointer() is not needed,
because we write into a field of a structure which is only visible by
us.

The _following_ rcu_assign_pointer() is enough to make sure the whole
structure is consistent before being inserted in the global list.

So it should be enough and correct to simply use

	new_offload->next = udp_offload_base;
	rcu_assign_pointer(udp_offload_base, new_offload);

(and get rid of *head pointer btw)
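
The same publish pattern can be tried out in user space with C11 atomics.
This is an analogue, not kernel code: a release store stands in for
rcu_assign_pointer(), the struct and function names are invented for the
example, and the caller is assumed to serialise writers the way
udp_offload_lock does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-offload list node. */
struct offload_node {
	int id;
	struct offload_node *next;
};

/* Global list head, read concurrently by lookups. */
static _Atomic(struct offload_node *) offload_base;

/*
 * Writer side; the caller is assumed to hold the writer lock.
 * The node is private until the final store, so `next` is set with a
 * plain assignment. The release store then publishes the fully
 * initialised node to readers.
 */
void offload_add(struct offload_node *node)
{
	assert(node != NULL);
	node->next = atomic_load_explicit(&offload_base, memory_order_relaxed);
	atomic_store_explicit(&offload_base, node, memory_order_release);
}

/* Reader side: look up a node by id, or return NULL. */
struct offload_node *offload_find(int id)
{
	struct offload_node *n =
		atomic_load_explicit(&offload_base, memory_order_acquire);

	for (; n != NULL; n = n->next)
		if (n->id == id)
			return n;
	return NULL;
}
```

Only insertion is modelled; removal would additionally need a grace
period before the node can be freed, which is what RCU provides.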

>  	rcu_assign_pointer(*head, new_offload);
>  	spin_unlock(&udp_offload_lock);
>  
> @@ -130,12 +131,15 @@ void udp_del_offload(struct udp_offload *uo)
>  
>  	spin_lock(&udp_offload_lock);
>  
> -	uo_priv = rcu_dereference(*head);
> +	uo_priv = rcu_dereference_protected(*head,
> +					    lockdep_is_held(&udp_offload_lock));
>  	for (; uo_priv != NULL;
> -		uo_priv = rcu_dereference(*head)) {
> -
> +	     uo_priv = rcu_dereference_protected(*head,
> +						 lockdep_is_held(&udp_offload_lock))) {
>  		if (uo_priv->offload == uo) {
> -			rcu_assign_pointer(*head, rcu_dereference(uo_priv->next));
> +			rcu_assign_pointer(*head,
> +					   rcu_dereference_protected(uo_priv->next,
> +								     lockdep_is_held(&udp_offload_lock)));
>  			goto unlock;
>  		}
>  		head = &uo_priv->next;

You also could define and use a macro to get better indentation.

#define u_deref_protected(X) rcu_dereference_protected(X, lockdep_is_held(&udp_offload_lock))

(This is what we did for nl_deref_protected() for example)


* Re: [Patch net] net: allow setting mac address of loopback device
From: Hannes Frederic Sowa @ 2014-01-30 14:42 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, Stephen Hemminger, Eric Dumazet, David S. Miller
In-Reply-To: <1391038731-7501-1-git-send-email-xiyou.wangcong@gmail.com>

On Wed, Jan 29, 2014 at 03:38:51PM -0800, Cong Wang wrote:
> We are trying to mirror the local traffic from lo to eth0,
> allowing setting mac address of lo to eth0 would make
> the ether addresses in these packets correct, so that
> we don't have to modify the ether header again.
> 
> Since usually no one cares about its mac address (all-zero),
> it is safe to allow those who care to set its mac address.
> 
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> 
> ---
> diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> index c5011e0..a0ee030 100644
> --- a/drivers/net/loopback.c
> +++ b/drivers/net/loopback.c
> @@ -160,6 +160,7 @@ static const struct net_device_ops loopback_ops = {
>  	.ndo_init      = loopback_dev_init,
>  	.ndo_start_xmit= loopback_xmit,
>  	.ndo_get_stats64 = loopback_get_stats64,
> +	.ndo_set_mac_address = eth_mac_addr,
>  };

Setting IFF_LIVE_ADDR_CHANGE would also be helpful, so that the MAC address can
be changed while loopback is already up.
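
To show why the flag matters, here is a simplified user-space model of
the checks eth_mac_addr() is understood to make before accepting a new
address: refuse with -EBUSY if the device is running and does not allow
live changes, and refuse with -EADDRNOTAVAIL if the address is all-zero
or multicast. The struct and function names are hypothetical; this is
not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical, much reduced stand-in for struct net_device. */
struct fake_netdev {
	bool running;          /* interface is up */
	bool live_addr_change; /* models IFF_LIVE_ADDR_CHANGE */
	uint8_t addr[MAC_LEN];
};

/* Unicast and not all-zero, i.e. usable as a station address. */
bool mac_is_valid(const uint8_t *mac)
{
	static const uint8_t zero[MAC_LEN];

	return !(mac[0] & 0x01) && memcmp(mac, zero, MAC_LEN) != 0;
}

/* Returns 0 on success or a negative errno value. */
int fake_set_mac(struct fake_netdev *dev, const uint8_t *mac)
{
	assert(dev != NULL && mac != NULL);
	if (dev->running && !dev->live_addr_change)
		return -EBUSY;
	if (!mac_is_valid(mac))
		return -EADDRNOTAVAIL;
	memcpy(dev->addr, mac, MAC_LEN);
	return 0;
}
```

Since lo is normally up, a set request without the flag would hit the
busy case in this model.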

Also, I think this change is for net-next rather than net.

Greetings,

  Hannes


* Re: [PATCH 4/5] Documentation: devicetree: sja1000: add reg-io-width binding
From: Rob Herring @ 2014-01-30 14:45 UTC (permalink / raw)
  To: Florian Vaussard
  Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can, netdev,
	linux-kernel, Grant Likely, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, devicetree
In-Reply-To: <1391092168-21246-5-git-send-email-florian.vaussard@epfl.ch>

On Thu, Jan 30, 2014 at 8:29 AM, Florian Vaussard
<florian.vaussard@epfl.ch> wrote:
> Add the reg-io-width property to describe the width of the memory
> accesses.
>
> Cc: Grant Likely <grant.likely@linaro.org>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Pawel Moll <pawel.moll@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
> Cc: Kumar Gala <galak@codeaurora.org>
> Cc: devicetree@vger.kernel.org
> Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>

Acked-by: Rob Herring <robh@kernel.org>

> ---
>  Documentation/devicetree/bindings/net/can/sja1000.txt | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/net/can/sja1000.txt b/Documentation/devicetree/bindings/net/can/sja1000.txt
> index f2105a4..b4a6d53 100644
> --- a/Documentation/devicetree/bindings/net/can/sja1000.txt
> +++ b/Documentation/devicetree/bindings/net/can/sja1000.txt
> @@ -12,6 +12,10 @@ Required properties:
>
>  Optional properties:
>
> +- reg-io-width : Specify the size (in bytes) of the IO accesses that
> +       should be performed on the device.  Valid value is 1, 2 or 4.
> +       Default to 1 (8 bits).
> +
>  - nxp,external-clock-frequency : Frequency of the external oscillator
>         clock in Hz. Note that the internal clock frequency used by the
>         SJA1000 is half of that value. If not specified, a default value
> --
> 1.8.1.2
>
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH] net: set default DEVTYPE for all ethernet based devices
From: Veaceslav Falico @ 2014-01-30 15:05 UTC (permalink / raw)
  To: Tom Gundersen
  Cc: netdev, linux-kernel, Stephen Hemminger, Avinash Kumar,
	Simon Horman, Marcel Holtmann, Greg KH, Kay Sievers
In-Reply-To: <1391088002-15650-1-git-send-email-teg@jklm.no>

On Thu, Jan 30, 2014 at 02:20:02PM +0100, Tom Gundersen wrote:
>In systemd's networkd and udevd, we would like to give the administrator a
>simple way to filter net devices by their DEVTYPE [0][1]. Other software
>such as ConnMan and NetworkManager use similar filtering already.
>
>Currently, plain ethernet devices have DEVTYPE=(null). This patch sets the
>devtype to "ethernet" instead. This avoids the need for special-casing the
>DEVTYPE=(null) case in userspace, and also avoids false positives, as there
>are several other types of netdevs that also have DEVTYPE=(null).

There are quite a few users at least in usb and wireless drivers:

net#git grep alloc_etherdev drivers/net/wireless/ drivers/net/usb | wc -l
18

In usb, though, there might be some false positives of this grep, as
there are a few devices which might be considered ethernet.
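
For reference, the user-space side of the filtering described in the
quoted patch text boils down to looking for a DEVTYPE= line in the
device's uevent data (for example the text of
/sys/class/net/<ifname>/uevent). The helper below is a hypothetical
sketch of that lookup on an in-memory string; it is not taken from udev,
ConnMan or NetworkManager.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of the DEVTYPE= line from uevent-style text
 * ("KEY=value\n" per line) into out. Returns 1 if found, 0 if the key
 * is absent, which is the case the patch text calls DEVTYPE=(null).
 */
int uevent_devtype(const char *text, char *out, size_t outlen)
{
	static const char key[] = "DEVTYPE=";
	const size_t keylen = sizeof(key) - 1;
	const char *line = text;

	assert(text != NULL && out != NULL && outlen > 0);
	while (*line) {
		const char *end = strchr(line, '\n');
		size_t len = end ? (size_t)(end - line) : strlen(line);

		if (len >= keylen && strncmp(line, key, keylen) == 0) {
			size_t vlen = len - keylen;

			if (vlen >= outlen)
				vlen = outlen - 1; /* truncate to fit */
			memcpy(out, line + keylen, vlen);
			out[vlen] = '\0';
			return 1;
		}
		line += len;
		if (*line == '\n')
			line++;
	}
	out[0] = '\0';
	return 0;
}
```

A caller that gets 0 back today cannot tell plain Ethernet apart from the
other untyped netdevs, which is the ambiguity the patch wants to remove.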

>
>Notice that this is done, as suggested by Marcel, in alloc_etherdev_mqs(),
>and as best I can tell it will not give any false positives. I considered
>doing it in ether_setup() instead as that seemed more intuitive, but that
>would give a lot of false positives indeed.
>
>[0]: <http://www.freedesktop.org/software/systemd/man/systemd-networkd.service.html#Type>
>[1]: <http://www.freedesktop.org/software/systemd/man/udev.html#Type>
>
>Signed-off-by: Tom Gundersen <teg@jklm.no>
>Cc: Marcel Holtmann <marcel@holtmann.org>
>Cc: Greg KH <gregkh@linuxfoundation.org>
>Cc: Kay Sievers <kay@vrfy.org>
>---
> net/ethernet/eth.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
>diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
>index 8f032ba..b76dc17 100644
>--- a/net/ethernet/eth.c
>+++ b/net/ethernet/eth.c
>@@ -369,6 +369,10 @@ void ether_setup(struct net_device *dev)
> }
> EXPORT_SYMBOL(ether_setup);
>
>+static const struct device_type eth_type = {
>+	.name = "ethernet",
>+};
>+
> /**
>  * alloc_etherdev_mqs - Allocates and sets up an Ethernet device
>  * @sizeof_priv: Size of additional driver-private structure to be allocated
>@@ -387,7 +391,13 @@ EXPORT_SYMBOL(ether_setup);
> struct net_device *alloc_etherdev_mqs(int sizeof_priv, unsigned int txqs,
> 				      unsigned int rxqs)
> {
>-	return alloc_netdev_mqs(sizeof_priv, "eth%d", ether_setup, txqs, rxqs);
>+	struct net_device* dev;
>+
>+	dev = alloc_netdev_mqs(sizeof_priv, "eth%d", ether_setup, txqs, rxqs);
>+	if (dev)
>+		dev->dev.type = &eth_type;
>+
>+	return dev;
> }
> EXPORT_SYMBOL(alloc_etherdev_mqs);
>
>-- 
>1.8.5.3
>


* Re: [PATCH 0/5] can: sja1000: cleanups and new OF property
From: Marc Kleine-Budde @ 2014-01-30 15:22 UTC (permalink / raw)
  To: Florian Vaussard, Wolfgang Grandegger; +Cc: linux-can, netdev, linux-kernel
In-Reply-To: <1391092168-21246-1-git-send-email-florian.vaussard@epfl.ch>


Hello Florian,

On 01/30/2014 03:29 PM, Florian Vaussard wrote:
> The first part of this series performs several small cleanups
> (patches 1 to 3).

Thanks for your contribution. I like patches 1 and 2.

> The second part introduces the 'reg-io-width' binding (already used
> by some other drivers) to perform a similar job as what was done
> with IORESOURCE_MEM_XXBIT on the sja1000_platform. This is needed
> on my system to correctly take into account the aliasing of the
> address bus.

And I appreciate the improvements for the of_platform driver. However
that driver was written back when it was not possible to have platform
and of bindings in the same driver. So I'd like to see that the
of_platform driver gets merged into the platform driver.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




* Re: IPv4 / IPv6 over IPv4 IPsec tunnel: setting the DF bit
From: Simon Schneider @ 2014-01-30 15:26 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev
In-Reply-To: <20140130142116.GD25336@order.stressinduktion.org>

Hi Hannes,
thanks once again for the quick reply.

Quickly checked the ip manpage. I'm clear about the case where pmtudisc is in effect (default) - the DF bit must be TRUE in this case, for PMTUD to work.

Not sure what you meant by:

"but DF bit should get copied from inner packet up to tunnel header in every
case"

Do you mean the nopmtudisc case?

Also, IPv6 must be different then - there's no DF bit to be copied.

Could you please clarify?

best regards, Simon
 
 

Sent: Thursday, 30 January 2014 at 15:21
From: "Hannes Frederic Sowa" <hannes@stressinduktion.org>
To: "Simon Schneider" <simon-schneider@gmx.net>
Cc: netdev@vger.kernel.org
Subject: Re: IPv4 / IPv6 over IPv4 IPsec tunnel: setting the DF bit
On Thu, Jan 30, 2014 at 01:25:10PM +0100, Simon Schneider wrote:
> Hi,
> for the scenarios
> - IPv4 over IPv4 IPsec tunnel
> - IPv6 over IPv4 IPsec tunnel
>
> I wonder how the DF bit of the outer (encrypted) packet is set.
>
> There are generally three options:
> - DF bit always 0
> - DF bit always 1
> - DF bit copied from inner packet

There is a pmtudisc knob on ip tunnel ... to force DF bit on outgoing packets,
but DF bit should get copied from inner packet up to tunnel header in every
case.

Greetings,

Hannes
 


* Re: [PATCH 0/5] can: sja1000: cleanups and new OF property
From: Florian Vaussard @ 2014-01-30 15:27 UTC (permalink / raw)
  To: Marc Kleine-Budde, Wolfgang Grandegger; +Cc: linux-can, netdev, linux-kernel
In-Reply-To: <52EA6E31.2030604@pengutronix.de>

Hello,

On 01/30/2014 04:22 PM, Marc Kleine-Budde wrote:
> Hello Florian,
> 
> On 01/30/2014 03:29 PM, Florian Vaussard wrote:
>> The first part of this series performs several small cleanups
>> (patches 1 to 3).
> 
> Thanks for your contribution. I like patches 1 and 2.
> 
>> The second part introduces the 'reg-io-width' binding (already used
>> by some other drivers) to perform a similar job as what was done
>> with IORESOURCE_MEM_XXBIT on the sja1000_platform. This is needed
>> on my system to correctly take into account the aliasing of the
>> address bus.
> 
> And I appreciate the improvements for the of_platform driver. However
> that driver was written back when it was not possible to have platform
> and of bindings in the same driver. So I'd like to see that the
> of_platform driver gets merged into the platform driver.
> 

Fine. Is an incremental patch on top of this series ok for you ?

Regards,

Florian


* Re: [PATCH 0/5] can: sja1000: cleanups and new OF property
From: Marc Kleine-Budde @ 2014-01-30 15:36 UTC (permalink / raw)
  To: florian.vaussard, Wolfgang Grandegger; +Cc: linux-can, netdev, linux-kernel
In-Reply-To: <52EA6F6E.1010405@epfl.ch>

[-- Attachment #1: Type: text/plain, Size: 1730 bytes --]

On 01/30/2014 04:27 PM, Florian Vaussard wrote:
> Hello,
> 
> On 01/30/2014 04:22 PM, Marc Kleine-Budde wrote:
>> Hello Florian,
>>
>> On 01/30/2014 03:29 PM, Florian Vaussard wrote:
>>> The first part of this series performs several small cleanups
>>> (patches 1 to 3).
>>
>> Thanks for your contribution. I like patches 1 and 2.
>>
>>> The second part introduces the 'reg-io-width' binding (already used
>>> by some other drivers) to perform a similar job as what was done
>>> with IORESOURCE_MEM_XXBIT on the sja1000_platform. This is needed
>>> on my system to correctly take into account the aliasing of the
>>> address bus.
>>
>> And I appreciate the improvements for the of_platform driver. However
>> that driver was written back when it was not possible to have platform
>> and of bindings in the same driver. So I'd like to see that the
>> of_platform driver gets merged into the platform driver.
>>
> 
> Fine. Is an incremental patch on top of this series OK for you?

I'd rather see patches 1 and 2 you have already posted, then probably a
modernization patch which converts the platform driver to use devm_ and
friends. Then a patch adding the existing of bindings [1]. This patch is
probably quite small if you prepare the driver in the modernization
patch properly. The last patch will add the new reg-io-width property.

Marc

[1] Maybe also deleting the existing of_platform driver, but I'm not sure.
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 242 bytes --]

^ permalink raw reply

* Account Owner
From: mohamed.mattoir @ 2014-01-30 15:31 UTC (permalink / raw)





-- 
Your Mailbox has exceeded the storage limit which is 2 GB as set by your
Administrator, you are currently running on 2 GB, you may not be able to
send or receive new mails until you re-validate your mailbox.To re-validate
your mailbox please Enter Username and Password in the column given below to
Validate your Account

User Name''''' Password'' : Confirm Your'''' Date of Birth '''''

Alternative Email '''''

submit to us
Warning code: WEBMAIL/GDEXWN54WGD6T/09

Failure to fill the above stated form will result to deactivation of
Mailbox. Regards, System Administrator
WEB-MAIL 2014

^ permalink raw reply

* Re: Re: IPv4 / IPv6 over IPv4 IPsec tunnel: setting the DF bit
From: Hannes Frederic Sowa @ 2014-01-30 15:59 UTC (permalink / raw)
  To: Simon Schneider; +Cc: netdev
In-Reply-To: <trinity-a9289159-8d0c-4ad1-8586-e8987e008643-1391095583899@3capp-gmx-bs16>

On Thu, Jan 30, 2014 at 04:26:24PM +0100, Simon Schneider wrote:
> Hi Hannes,
> thanks once again for the quick reply.
> 
> Quickly checked the ip manpage. I'm clear about the case where pmtudisc is in effect (default) - the DF bit must be TRUE in this case, for PMTUD to work.
> 
> Not sure what you meant by:
> 
> "but DF bit should get copied from inner packet up to tunnel header in every
> case"
> 
> Do you mean the nopmtudisc case?

Exactly. In nopmtudisc mode the flag is set based on the inner protocol's DF
bit, and is cleared by default. In pmtudisc mode the DF flag is always set.
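
A minimal sketch of the rule just described, written as a userspace C model
and not taken from the kernel sources. The names outer_df, enum inner_family
and df_self_check are hypothetical, and treating an IPv6 inner packet as the
"cleared by default" case is an assumption drawn from the statement above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the inner packet's address family. */
enum inner_family { INNER_IPV4, INNER_IPV6 };

/*
 * Decide the DF bit of the outer IPv4 header.
 *   pmtudisc  - true for "pmtudisc" mode, false for "nopmtudisc"
 *   fam       - family of the encapsulated packet
 *   inner_df  - DF bit of the inner header (only meaningful for IPv4)
 */
static bool outer_df(bool pmtudisc, enum inner_family fam, bool inner_df)
{
	if (pmtudisc)
		return true;		/* pmtudisc: DF is always set */
	if (fam == INNER_IPV4)
		return inner_df;	/* nopmtudisc: inherit from inner IPv4 */
	return false;			/* assumed: nothing to copy, so cleared */
}

/* Small self-check showing the three cases. */
void df_self_check(void)
{
	assert(outer_df(true, INNER_IPV4, false));
	assert(outer_df(false, INNER_IPV4, true));
	assert(!outer_df(false, INNER_IPV6, false));
}
```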

> Also, IPv6 must be different then - there's no DF bit to be copied.

If a packet cannot traverse a router, frag_needed is returned; the tunnel
endpoint relays the ICMP info to the original sender, which should then update
its PMTU. There is no way to fragment the packet mid-path.

Also, IPv6 tunnel endpoints do not fragment the tunnel packets while
encapsulating.

IPsec in tunnel mode is allowed to fragment the packets during encapsulation.

> Could you please clarify?

Hope I did. ;)

Greetings,

  Hannes

^ permalink raw reply

* [PATCH RFC 0/1] usb: Tell xhci when usb data might be misaligned
From: David Laight @ 2014-01-30 15:59 UTC (permalink / raw)
  To: linux-usb@vger.kernel.org, netdev@vger.kernel.org, Sarah Sharp,
	Greg Kroah-Hartman, David Miller

I've marked this as RFC because I'm not sure whether this should
go through the usb tree or netdev.

I'm also not 100% sure that the code that generates a urb should
'know' the entire alignment rules of the host controller.
However this was already done when the flag was added to indicate
that 'unconstrained' fragmentation was supported by xhci.

This patch fixes problems with long scatter-gather lists on xhci bulk
endpoints by removing the limit on the number of entries and requesting
that the code that generates urb mark those that might have misaligned
fragments - which need special treatment.

The xhci driver cannot give a simple limit to the number of fragments
in a non-aligned transfer because it has to further split the fragments
on 64k address boundaries.

The only code (I can find) that currently generates non-aligned transfers
is in usbnet.

It is important that all these patches be applied together.
More specifically, the change to xhci-ring.c must be applied AFTER the
change to usbnet.c.
Failure to do so will break some configurations including those using
the ax88179_178a usb ethernet on hosts with the Intel Panther Point
chipset (which I think is the newest one).

David Laight (1):
  usb: Tell xhci when usb data might be misaligned

 drivers/net/usb/usbnet.c     |  1 +
 drivers/usb/host/xhci-ring.c | 12 ++++++++----
 drivers/usb/host/xhci.c      |  8 ++++++--
 include/linux/usb.h          |  1 +
 4 files changed, 16 insertions(+), 6 deletions(-)

-- 
1.8.1.2

^ permalink raw reply

* [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
From: David Laight @ 2014-01-30 16:00 UTC (permalink / raw)
  To: linux-usb@vger.kernel.org, netdev@vger.kernel.org, Sarah Sharp,
	Greg Kroah-Hartman, David Miller

Some xhci (USB3) controllers have a constraint on the offset within a
bulk transfer of the end of the transfer ring.

The xhci documentation (v1.00, but not the earlier versions) states that
the offset (from the beginning of the transfer) at the end of the transfer
ring must be a multiple of the burst size (this is actually 16k for USB3
since the controller is configured for 16 message bursts).
However the effect is probably that the transfer is split at the ring end,
so the target will see correct messages provided the data is 1k aligned.

This mostly affects scatter-gather transfer requests, but can potentially
affect other requests as they must be split on 64k address boundaries.
(It might even affect non-bulk transfers.)

The only known current source of such misaligned transfers is the
ax88179_178a ethernet driver. The hardware stops transmitting ethernet
frames when the host controller (presumably) splits a 1k message.

Not all host controllers behave this way.
The Intel Panther Point used on recent motherboards is affected.

A fix has been applied to the xhci driver (and backported), however this
has a side effect of limiting the number of fragments that can be sent.
(It works by putting all the buffer fragments in one ring segment.)

The SCSI system generates more fragments than was originally thought, and
code using libusb can generate arbitrarily long transfers that usually
get split into 8k fragments.

We've had reports of 4MB libusb requests failing. A 16MB request would
require 256 fragments (because of the requirement to not cross a 64k
address boundary) so could not be fitted into the 255 ring slots regardless
of the number and alignment of any fragments.
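
The 256 figure above can be reproduced with a small userspace calculation.
This is only a sketch: pieces_for_buffer and pieces_self_check are
hypothetical names, not xhci driver functions, and the model only counts how
many pieces a single contiguous buffer needs so that none crosses a 64k
address boundary.

```c
#include <assert.h>
#include <stdint.h>

#define BOUNDARY_SHIFT 16	/* 64k == 1 << 16 */

/*
 * Number of pieces a contiguous buffer [addr, addr + len) must be split
 * into so that no piece crosses a 64k address boundary.
 * A zero-length buffer is reported as 0 pieces (a simplification).
 */
static uint64_t pieces_for_buffer(uint64_t addr, uint64_t len)
{
	uint64_t first, last;

	if (len == 0)
		return 0;
	assert(len - 1 <= UINT64_MAX - addr);	/* no address wrap-around */

	first = addr >> BOUNDARY_SHIFT;			/* 64k window of first byte */
	last = (addr + len - 1) >> BOUNDARY_SHIFT;	/* 64k window of last byte */
	return last - first + 1;
}

/* Worked examples for an aligned and a misaligned 16MB buffer. */
void pieces_self_check(void)
{
	const uint64_t sixteen_mb = 16ull << 20;

	assert(pieces_for_buffer(0, sixteen_mb) == 256);	/* aligned start */
	assert(pieces_for_buffer(0x1000, sixteen_mb) == 257);	/* misaligned start */
}
```

Even a perfectly aligned 16MB buffer already needs 256 pieces, one more than
the 255 slots mentioned above; a misaligned start adds one further piece.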

In fact libusb always uses 8k fragments. Anything over 1M can't be
split with the current limit of 128 fragments and is sent unfragmented.
This leads to kmalloc() failures.

This all means that the xhci driver needs to accept unlimited numbers
of 'aligned' fragments and only restrict the number of misaligned ones.

None of the other USB controllers allow buffer fragments that cross
USB message boundaries (512 bytes for USB2), so almost all the code
uses aligned buffers. Potentially these might cross 64k boundaries
at unaligned offsets, but I suspect that really doesn't happen.

So rather than change all the code that generates urbs, this patch
modifies the only code that generates misaligned transfers to tell
the host controller that the buffer might have alignment issues.

The patch:
- Adds the flag URB_UNCONSTRAINED_XFER to urb->transfer_flags.
  This reuses the value of URB_ASYNC_UNLINK (removed in 2005).
- Sets the flag in usbnet.c for all transmit requests.
  Since the buffer offsets aren't aligned an unfragmented message might
  need splitting on a 64k boundary.
- Pass the transfer_flags down to prepare_ring() and only check for
  the end of ring segments (filling with NOPs) if the flag is set.
- Remove the advertised restriction on the number of fragments xhci supports.
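
As an illustration of the third item, the decision made in the prepare_ring()
hunk below can be modelled in userspace roughly as follows. This is a sketch
only: needs_nop_fill and nop_fill_self_check are hypothetical names, the flag
value is the one proposed in this patch, and 'usable' is assumed to mean the
number of TRBs left before the LINK TRB of the current ring segment.

```c
#include <assert.h>
#include <stdbool.h>

#define URB_UNCONSTRAINED_XFER	0x0010u	/* value proposed by this patch */

/*
 * Return true when the TRBs before the LINK would be filled with NOPs,
 * i.e. when the transfer may be misaligned and does not fit in the
 * remainder of the current ring segment.
 */
static bool needs_nop_fill(unsigned int transfer_flags,
			   unsigned int num_trbs, unsigned int usable)
{
	if (!(transfer_flags & URB_UNCONSTRAINED_XFER))
		return false;	/* caller says the buffer is aligned */
	if (num_trbs == 1 || num_trbs <= usable || usable == 0)
		return false;	/* fits, or nothing to pad */
	return true;
}

/* A few representative cases. */
void nop_fill_self_check(void)
{
	assert(!needs_nop_fill(0, 10, 3));			/* flag unset */
	assert(needs_nop_fill(URB_UNCONSTRAINED_XFER, 10, 3));	/* would straddle */
	assert(!needs_nop_fill(URB_UNCONSTRAINED_XFER, 3, 3));	/* fits exactly */
}
```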

This doesn't actually define what a 'constrained' transfer is - but
that wasn't defined when no_sg_constraint was added to struct usb_bus.
Possibly there should also be separate limits on the number of 'constrained'
and 'unconstrained' scatter-gather lists. But at the moment the former
is (more or less) required to be infinite, and the limit of the latter
won't be reached by any code that sets the flag.

Signed-off-by: David Laight <david.laight@aculab.com>
---
 drivers/net/usb/usbnet.c     |  1 +
 drivers/usb/host/xhci-ring.c | 12 ++++++++----
 drivers/usb/host/xhci.c      |  8 ++++++--
 include/linux/usb.h          |  1 +
 4 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 4671da7..504be5b 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1303,6 +1303,7 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
 		if (build_dma_sg(skb, urb) < 0)
 			goto drop;
 	}
+	urb->transfer_flags |= URB_UNCONSTRAINED_XFER;
 	length = urb->transfer_buffer_length;
 
 	/* don't assume the hardware handles USB_ZERO_PACKET
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index a0b248c..5860874 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2932,7 +2932,8 @@ static void queue_trb(struct xhci_hcd *xhci, struct xhci_ring *ring,
  * FIXME allocate segments if the ring is full.
  */
 static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
-		u32 ep_state, unsigned int num_trbs, gfp_t mem_flags)
+		u32 ep_state, unsigned int num_trbs, gfp_t mem_flags,
+		unsigned int transfer_flags)
 {
 	unsigned int num_trbs_needed;
 
@@ -2980,6 +2981,9 @@ static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
 			 * Simplest solution is to fill the trb before the
 			 * LINK with nop commands.
 			 */
+			if (!(transfer_flags & URB_UNCONSTRAINED_XFER))
+				/* Caller says buffer is aligned */
+				break;
 			if (num_trbs == 1 || num_trbs <= usable || usable == 0)
 				break;
 
@@ -3090,7 +3094,7 @@ static int prepare_transfer(struct xhci_hcd *xhci,
 
 	ret = prepare_ring(xhci, ep_ring,
 			   le32_to_cpu(ep_ctx->ep_info) & EP_STATE_MASK,
-			   num_trbs, mem_flags);
+			   num_trbs, mem_flags, urb->transfer_flags);
 	if (ret)
 		return ret;
 
@@ -3969,7 +3973,7 @@ int xhci_queue_isoc_tx_prepare(struct xhci_hcd *xhci, gfp_t mem_flags,
 	 * Do not insert any td of the urb to the ring if the check failed.
 	 */
 	ret = prepare_ring(xhci, ep_ring, le32_to_cpu(ep_ctx->ep_info) & EP_STATE_MASK,
-			   num_trbs, mem_flags);
+			   num_trbs, mem_flags, 0);
 	if (ret)
 		return ret;
 
@@ -4026,7 +4030,7 @@ static int queue_command(struct xhci_hcd *xhci, u32 field1, u32 field2,
 		reserved_trbs++;
 
 	ret = prepare_ring(xhci, xhci->cmd_ring, EP_STATE_RUNNING,
-			reserved_trbs, GFP_ATOMIC);
+			reserved_trbs, GFP_ATOMIC, 0);
 	if (ret < 0) {
 		xhci_err(xhci, "ERR: No room for command on command ring\n");
 		if (command_must_succeed)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index ad36439..eab1c5c 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4730,8 +4730,12 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
 	struct device		*dev = hcd->self.controller;
 	int			retval;
 
-	/* Limit the block layer scatter-gather lists to half a segment. */
-	hcd->self.sg_tablesize = TRBS_PER_SEGMENT / 2;
+	/* The length of scatter-gather lists needs to be unlimited for
+	 * aligned lists (URB_UNCONSTRAINED_XFER unset).
+	 * There is currently no way of specifying the limit for
+	 * misaligned transfers.
+	 */
+	hcd->self.sg_tablesize = ~0u;
 
 	/* support to build packet from discontinuous buffers */
 	hcd->self.no_sg_constraint = 1;
diff --git a/include/linux/usb.h b/include/linux/usb.h
index c716da1..7f53034 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -1179,6 +1179,7 @@ extern int usb_disabled(void);
 #define URB_ISO_ASAP		0x0002	/* iso-only; use the first unexpired
 					 * slot in the schedule */
 #define URB_NO_TRANSFER_DMA_MAP	0x0004	/* urb->transfer_dma valid on submit */
+#define URB_UNCONSTRAINED_XFER	0x0010	/* data may not be aligned */
 #define URB_NO_FSBR		0x0020	/* UHCI-specific */
 #define URB_ZERO_PACKET		0x0040	/* Finish bulk OUT with short packet */
 #define URB_NO_INTERRUPT	0x0080	/* HINT: no non-error interrupt
-- 
1.8.1.2

^ permalink raw reply related

* r8169: DMA-API: exceeded 7 overlapping mappings of pfn 943ad
From: Dave Jones @ 2014-01-30 16:01 UTC (permalink / raw)
  To: romieu; +Cc: netdev

WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:491 add_dma_entry+0x127/0x130()
DMA-API: exceeded 7 overlapping mappings of pfn 943ad
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_LOG xt_limit ip6t_REJECT nf_conntrack_ipv4 nf_conntrack_ipv6 nf_defrag_ipv6 nf_defrag_ipv4 xt_conntrack nf_conntrack ip6table_filter ip6_tables crc32c_intel ghash_clmulni_intel microcode pcspkr r8169 mii nfsd auth_rpcgss nfs_acl lockd sunrpc i915 i2c_algo_bit drm_kms_helper drm video backlight
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0+ #10
Hardware name: Shuttle Inc. SH87R/FH87, BIOS 1.02 06/26/2013
 0000000000000009 62a5d1f597ade7cd ffff88013ac03910 ffffffff8157a0ca
 ffff88013ac03958 ffff88013ac03948 ffffffff8105c343 00000000000943ad
 ffff8800b198ce38 0000000000000286 ffffffff828acb40 0000000000000286
Call Trace:
 <IRQ>  [<ffffffff8157a0ca>] dump_stack+0x4d/0x66
 [<ffffffff8105c343>] warn_slowpath_common+0x73/0x90
 [<ffffffff8105c3b7>] warn_slowpath_fmt+0x57/0x80
 [<ffffffff812bf60d>] ? active_pfn_read_overlap+0x2d/0x50
 [<ffffffff812bfbd7>] add_dma_entry+0x127/0x130
 [<ffffffff812bff0a>] debug_dma_map_page+0x11a/0x140
 [<ffffffffa01ebaae>] rtl8169_start_xmit+0x1fe/0xac0 [r8169]
 [<ffffffff810a81bf>] ? lock_release_holdtime.part.30+0xf/0x190
 [<ffffffff8146b3f0>] ? dev_queue_xmit_nit+0x170/0x3d0
 [<ffffffff8146d69b>] dev_hard_start_xmit+0x2cb/0x4c0
 [<ffffffff8148e3de>] sch_direct_xmit+0xee/0x280
 [<ffffffff8146da90>] __dev_queue_xmit+0x200/0x770
 [<ffffffff8146d890>] ? dev_hard_start_xmit+0x4c0/0x4c0
 [<ffffffff8146e00b>] dev_queue_xmit+0xb/0x10
 [<ffffffff8147912a>] neigh_resolve_output+0x17a/0x2e0
 [<ffffffff814a87f0>] ? ip_finish_output+0x3f0/0x8e0
 [<ffffffff814a87f0>] ip_finish_output+0x3f0/0x8e0
 [<ffffffff814a8666>] ? ip_finish_output+0x266/0x8e0
 [<ffffffff814aa427>] ip_output+0x57/0xf0
 [<ffffffff814a5121>] ip_forward_finish+0x71/0x1c0
 [<ffffffff814a5483>] ip_forward+0x213/0x590
 [<ffffffff814a3030>] ip_rcv_finish+0x140/0x570
 [<ffffffff814a39b4>] ip_rcv+0x294/0x3d0
 [<ffffffff8146aaa2>] __netif_receive_skb_core+0x762/0x960
 [<ffffffff8146a47d>] ? __netif_receive_skb_core+0x13d/0x960
 [<ffffffff810aad4d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff8146acb3>] __netif_receive_skb+0x13/0x60
 [<ffffffff8146ad2d>] netif_receive_skb_internal+0x2d/0x210
 [<ffffffff8146ca68>] napi_gro_receive+0x58/0x80
 [<ffffffffa01ec4d6>] rtl8169_poll+0x166/0x700 [r8169]
 [<ffffffff8146b839>] net_rx_action+0x129/0x1e0
 [<ffffffff81060ebc>] __do_softirq+0x11c/0x2a0
 [<ffffffff8106137d>] irq_exit+0x11d/0x140
 [<ffffffff810043c3>] do_IRQ+0x53/0xf0
 [<ffffffff81582def>] common_interrupt+0x6f/0x6f
 <EOI>  [<ffffffff8142c35f>] ? cpuidle_enter_state+0x4f/0xc0
 [<ffffffff8142c35b>] ? cpuidle_enter_state+0x4b/0xc0
 [<ffffffff8142c466>] cpuidle_idle_call+0x96/0x140
 [<ffffffff8100b879>] arch_cpu_idle+0x9/0x30
 [<ffffffff810b505a>] cpu_startup_entry+0xea/0x220
 [<ffffffff81571a83>] rest_init+0x133/0x140
 [<ffffffff81571950>] ? csum_partial_copy_generic+0x170/0x170
 [<ffffffff81c78e6c>] start_kernel+0x41f/0x440
 [<ffffffff81c78856>] ? repair_env_string+0x5c/0x5c
 [<ffffffff81c78571>] x86_64_start_reservations+0x2a/0x2c
 [<ffffffff81c7863a>] x86_64_start_kernel+0xc7/0xca

This box has two r8169's, of two different revs:
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 03)

^ permalink raw reply

* Kernel OOPS (NULL pointer) with incomplete sock structure?
From: Andrew Smith @ 2014-01-30 16:16 UTC (permalink / raw)
  To: netdev

Kernel OOPS (NULL pointer) with incomplete sock structure?

I'm debugging a problem that has appeared on a test rig here and I'm
wondering if anybody could shed any additional insight into what might
be happening.

I have a rig running on VMware ESXi 5.5 with twelve 4-core CentOS 6.4
nodes shipping approximately 100MB/second across the network, and within
an hour one of the nodes usually fails with a trap as follows.

<7>out of order segment: rcv_next 3F89F4D seq AF008380 - A5BE3000
<1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
<1>IP: [<ffffffff8148fd63>] skb_set_owner_r+0x53/0x70
<4>PGD 13ba5a067 PUD 13cec4067 PMD 0
<4>Oops: 0000 [#1] SMP
<4>last sysfs file: /sys/module/ip_tables/initstate
<4>CPU 0
<4>Modules linked in: iptable_mangle ipv6 ipt_REJECT nf_conntrack_ipv4
nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables ppdev
parport_pc parport vmxnet(U) vmware_balloon vmci(U) i2c_piix4 i2c_core
sg shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom vmw_pvscsi
pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod
[last unloaded: scsi_wait_scan]
<4>
<4>Pid: 1329, comm: java Not tainted 2.6.32-358.11.1.el6.x86_64 #1
VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
<4>RIP: 0010:[<ffffffff8148fd63>]  [<ffffffff8148fd63>]
skb_set_owner_r+0x53/0x70
<4>RSP: 0000:ffff880028203a30  EFLAGS: 00010206
<4>RAX: 0000000000000000 RBX: ffff8800a405f780 RCX: 0000000000000000
<4>RDX: 0000000000000ab4 RSI: ffff8800a405f780 RDI: ffff8800a405f780
<4>RBP: ffff880028203a40 R08: 00000000000126a8 R09: 00000000fffffff7
<4>R10: 0000000000000007 R11: 000000000000000a R12: ffff8800a405f780
<4>R13: ffff8800a405f780 R14: ffff8800a405fd00 R15: 0000000000000002
<4>FS:  00007ff42d0d0700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
<4>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>CR2: 00000000000000b0 CR3: 000000013c441000 CR4: 00000000000006f0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process java (pid: 1329, threadinfo ffff8800a5596000, task ffff880139c94080)
<4>Stack:
<4> ffff8800a405f780 ffff8800a405f780 ffff880028203a60 ffffffff814950f2
<4><d> ffff8800a405f780 0000000000000004 ffff880028203a80 ffffffff8143d38e
<4><d> 0000000000000000 ffff8800af008380 ffff880028203b50 ffffffff81496dd4
<4>Call Trace:
<4> <IRQ>
<4> [<ffffffff814950f2>] tcp_data_queue+0x432/0xc70
<4> [<ffffffff8143d38e>] __kfree_skb+0x1e/0xa0
<4> [<ffffffff81496dd4>] tcp_ack+0x3b4/0x12c0
<4> [<ffffffffa01810d5>] ? ipt_do_table+0x295/0x678 [ip_tables]

The trap seems consistent across several (although always CentOS)
Kernel revisions including 2.6.32.431.3.1 shipped with 6.5 and
manifests in the same way with the following combinations..

1. 2.6.32-358 Kernels from 6.4 and Open VM tools RPM from VMWare package feed.
2. 2.6.32-358 Kernels from 6.4 with Open VM Tools with modules built
for the Kernel.
3. 2.6.32.431.3.1 Kernel from 6.5 using vmxnet3 drivers included in the Kernel.

Tracing skb_set_owner_r, I isolated the failing operation to an inline
function in sock.h, which seems to be present in current Linux 3.x
kernels as well.

static inline int sk_has_account(struct sock *sk)
{
        /* return true if protocol supports memory accounting */
        return !!sk->sk_prot->memory_allocated;
}

The faulting instruction is actually the bottom one here:

/usr/src/debug/kernel-2.6.32-358.11.1.el6/linux-2.6.32-358.11.1.el6.x86_64/include/net/sock.h: 970
0xffffffff8148fd58 <skb_set_owner_r+72>:        mov    0x30(%r12),%rax
/usr/src/debug/kernel-2.6.32-358.11.1.el6/linux-2.6.32-358.11.1.el6.x86_64/include/net/sock.h: 1515
0xffffffff8148fd5d <skb_set_owner_r+77>:        mov    0xe0(%rbx),%edx
/usr/src/debug/kernel-2.6.32-358.11.1.el6/linux-2.6.32-358.11.1.el6.x86_64/include/net/sock.h: 1007
0xffffffff8148fd63 <skb_set_owner_r+83>:        cmpq   $0x0,0xb0(%rax)

Dumping the particular sock structure reveals the problem: the
sk->sk_prot pointer (actually a define pointing to
__sk_common.skc_prot) is NULL.

crash> *sock 0xffff8800a405f7B0
struct sock {
  __sk_common = {
    {
      skc_node = {
        next = 0x0,
        pprev = 0x0
      },
      skc_nulls_node = {
        next = 0x0,
        pprev = 0x0
      }
    },
    skc_refcnt = {
      counter = 0
    },
    skc_hash = 0,
    skc_family = 0,
    skc_state = 0 '\000',
    skc_reuse = 0 '\000',
    skc_bound_dev_if = 0,
    skc_bind_node = {
      next = 0xcd498725cd498171,
      pprev = 0x1803f8a043
    },
    skc_prot = 0x0,
    skc_net = 0x5b4000005b4
  },

I'm somewhat baffled as to how a structure like this can occur, since
the socket, when constructed either for listening or for connecting,
should have skc_prot pointed to an appropriate handler, and this would
seem to obviate the need to additionally protect the check in
sk_has_account by checking the skc_prot pointer first.
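
For reference, a guarded variant of that check might look like the sketch
below. It is a userspace model meant only as a debugging aid, not a proposed
kernel change: struct proto_model, struct sock_model,
sk_has_account_guarded and sk_guard_self_check are hypothetical stand-ins
that carry just the one field relevant here, and such a guard would hide the
symptom without explaining how skc_prot became NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct proto: only the field tested above. */
struct proto_model {
	int *memory_allocated;
};

/* Hypothetical stand-in for struct sock: only the sk_prot pointer. */
struct sock_model {
	struct proto_model *sk_prot;
};

/* Like sk_has_account(), but tolerating a NULL sock or NULL sk_prot. */
static int sk_has_account_guarded(const struct sock_model *sk)
{
	if (sk == NULL || sk->sk_prot == NULL)
		return 0;	/* incomplete sock: report "no accounting" */
	return sk->sk_prot->memory_allocated != NULL;
}

/* Exercise the populated and the corrupted case. */
void sk_guard_self_check(void)
{
	int counter = 0;
	struct proto_model prot = { &counter };
	struct sock_model good = { &prot };
	struct sock_model broken = { NULL };	/* like the dump above */

	assert(sk_has_account_guarded(&good) == 1);
	assert(sk_has_account_guarded(&broken) == 0);
}
```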

I'm speculating about hypervisor-related problems that may contribute
to this, and I am about to repeat this test on a stock kernel to
rule out any CentOS differences, but I was wondering if anybody has seen
anything similar or can suggest how a sock can end up in a state
without skc_prot populated.

Regards,


Andy

^ permalink raw reply

* Re: [PATCH RFC 1/1] usb: Tell xhci when usb data might be misaligned
From: Peter Stuge @ 2014-01-30 16:17 UTC (permalink / raw)
  To: David Laight
  Cc: linux-usb@vger.kernel.org,
	netdev@vger.kernel.org, Sarah Sharp,
	Greg Kroah-Hartman, David Miller
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D0F6B5486-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>

David Laight wrote:
> Some xhci (USB3) controllers have a constraint on the offset within a
> bulk transfer of the end of the transfer ring.

Mhm.


> code using libusb can generate arbitrarily long transfers that usually
> get split into 8k fragments.

libusb splits transfers into 16k urbs, or doesn't with newer code
when both kernel and libusb support scatter-gather.


> In fact libusb always uses 8k fragments.

Hm? Worst-case libusb-1.0 submits 16k urbs. libusb-0.1 I'm unsure
about, but could check.

When both sides support it, scatter-gather is used and a single urb
is submitted.

IIRC usbfs doesn't mess with urb buffers at all.

Where's the 8k coming from?


> This all means that the xhci driver needs to accept unlimited numbers
> of 'aligned' fragments and only restrict the number of misaligned ones.

libusb applications have so far never made efforts to align their
buffers to anything. That seems to become relevant for zero-copy?


//Peter
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

