Netdev List

* Re: Kernel panic when using bridge
From: Jan Lübbe @ 2011-04-12 15:13 UTC
  To: Eric Dumazet; +Cc: Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA, netdev
In-Reply-To: <1302619749.3233.56.camel@edumazet-laptop>

On Tue, 2011-04-12 at 16:49 +0200, Eric Dumazet wrote: 
> Of course, this might be a complete shot in the dark, but a
> stackprotector fault in icmp_send() really sounds like a problem in
> ip_options_echo() [ or bad input data given to this function ]

It was my understanding that all IP options given to ip_options_echo are
either from local sources or have gone through ip_options_compile, which
seems to verify that the sum of the individual option lengths does not
exceed the IP header. So there wouldn't need to be additional checks in
ip_options_echo.

If this is not the case, we need size checks in ip_options_echo before
copying over each option.
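
A minimal sketch of such a size check, assuming a flat destination buffer of at most 40 bytes (the largest option area an IPv4 header can carry). The helper name copy_option_checked() and its parameters are hypothetical and are not taken from net/ipv4/ip_options.c:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not kernel code: append one option of optlen
 * bytes to dst, which has room for dst_size bytes and already holds
 * *used bytes. Returns 0 on success, -1 if the option would not fit.
 */
static int copy_option_checked(unsigned char *dst, size_t dst_size,
			       size_t *used, const unsigned char *opt,
			       size_t optlen)
{
	/* Compare against the remaining space to avoid size_t overflow. */
	if (*used > dst_size || optlen > dst_size - *used)
		return -1;

	memcpy(dst + *used, opt, optlen);
	*used += optlen;
	return 0;
}
```

A rejected option leaves both the buffer and the running length untouched, so the caller can drop that option and carry on.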

> Other related changes (but as old as v2.6.22) :
> 
> commit 11a03f78fbf15a866ba
> ([NetLabel]: core network changes)

When investigating the problem I had with timestamps, I found that most
of the lines in ip_options_echo and _compile have not been changed since
before 2.2 (some even before 2.0). The newer changes have all been
updates for changed API elsewhere in the stack.

Regards,
Jan

* Re: [RFC] iproute2: Fix meta match u32 with 0xffffffff
From: Stephen Hemminger @ 2011-04-12 15:19 UTC
  To: tgraf; +Cc: netdev
In-Reply-To: <1302595009.3664.8.camel@lsx>

On Tue, 12 Apr 2011 09:56:49 +0200
Thomas Graf <tgraf@redhat.com> wrote:

> On Mon, 2011-04-11 at 11:52 -0700, Stephen Hemminger wrote: 
> > The value 0xffffffff is a valid mask and bstrtoul() would return
> > ULONG_MAX which was the error value. Resolve the problem by separating
> > return value and error indication.
> >  
> > -unsigned long bstrtoul(const struct bstr *b)
> > +int bstrtoul(const struct bstr *b, unsigned long *lp)
> >  {
> >  	char *inv = NULL;
> > -	unsigned long l;
> >  	char buf[b->len+1];
> >  
> > +	if (b->len == 0)
> > +		return -EINVAL;
> > +
> >  	memcpy(buf, b->data, b->len);
> >  	buf[b->len] = '\0';
> >  
> > -	l = strtoul(buf, &inv, 0);
> > -	if (l == ULONG_MAX || inv == buf)
> > -		return ULONG_MAX;
> > +	*lp = strtoul(buf, &inv, 0);
> > +	if (inv == buf)
> > +		return -EINVAL;
> > +
> > +	if (*lp == ULONG_MAX || errno == ERANGE)
> > +		return -ERANGE;
> >  
> > -	return l;
> > +	return 0;
> >  }
> 
> This is definitely much better but we still can't parse ULONG_MAX
> as a string representation. Checking glibc docs, the only way to do it is
> to ignore the return value for error checking and look at errno.
> 

I think the error case is ret == ULONG_MAX && errno == ERANGE.
If there is no error, then strtoul doesn't set errno.
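
A minimal sketch of that check, assuming a NUL-terminated input rather than a struct bstr. parse_ulong() is a hypothetical name, not the iproute2 function; errno is cleared before the call so that a stale ERANGE cannot be mistaken for an overflow:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the iproute2 bstrtoul()): parse a
 * NUL-terminated string into *out. Returns 0 on success, -EINVAL if
 * no digits were consumed, -ERANGE on overflow.
 */
static int parse_ulong(const char *s, unsigned long *out)
{
	char *end = NULL;
	unsigned long v;

	errno = 0;		/* strtoul() does not clear errno on success */
	v = strtoul(s, &end, 0);
	if (end == s)
		return -EINVAL;

	/* ULONG_MAX is an error only when errno also reports ERANGE. */
	if (v == ULONG_MAX && errno == ERANGE)
		return -ERANGE;

	*out = v;
	return 0;
}
```

With this shape, "0xffffffff" parses successfully even on a platform where it equals ULONG_MAX, because errno stays 0 for an in-range value.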



-- 

* [PATCH 1/2] net/sis900: store MAC into perm_addr for SiS 900, 630E, 635 and 96x variants
From: Otavio Salvador @ 2011-04-12 15:30 UTC
  To: netdev; +Cc: Otavio Salvador

Signed-off-by: Otavio Salvador <otavio@ossystems.com.br>
---
 drivers/net/sis900.c |   23 +++++++++++++++++++----
 1 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/net/sis900.c b/drivers/net/sis900.c
index cb317cd..484f795 100644
--- a/drivers/net/sis900.c
+++ b/drivers/net/sis900.c
@@ -240,7 +240,8 @@ static const struct ethtool_ops sis900_ethtool_ops;
  *	@net_dev: the net device to get address for
  *
  *	Older SiS900 and friends, use EEPROM to store MAC address.
- *	MAC address is read from read_eeprom() into @net_dev->dev_addr.
+ *	MAC address is read from read_eeprom() into @net_dev->dev_addr and
+ *	@net_dev->perm_addr.
  */
 
 static int __devinit sis900_get_mac_addr(struct pci_dev * pci_dev, struct net_device *net_dev)
@@ -261,6 +262,9 @@ static int __devinit sis900_get_mac_addr(struct pci_dev * pci_dev, struct net_de
 	for (i = 0; i < 3; i++)
 	        ((u16 *)(net_dev->dev_addr))[i] = read_eeprom(ioaddr, i+EEPROMMACAddr);
 
+	/* Store MAC Address in perm_addr */
+	memcpy(net_dev->perm_addr, net_dev->dev_addr, ETH_ALEN);
+
 	return 1;
 }
 
@@ -271,7 +275,8 @@ static int __devinit sis900_get_mac_addr(struct pci_dev * pci_dev, struct net_de
  *
  *	SiS630E model, use APC CMOS RAM to store MAC address.
  *	APC CMOS RAM is accessed through ISA bridge.
- *	MAC address is read into @net_dev->dev_addr.
+ *	MAC address is read into @net_dev->dev_addr and
+ *	@net_dev->perm_addr.
  */
 
 static int __devinit sis630e_get_mac_addr(struct pci_dev * pci_dev,
@@ -296,6 +301,10 @@ static int __devinit sis630e_get_mac_addr(struct pci_dev * pci_dev,
 		outb(0x09 + i, 0x70);
 		((u8 *)(net_dev->dev_addr))[i] = inb(0x71);
 	}
+
+	/* Store MAC Address in perm_addr */
+	memcpy(net_dev->perm_addr, net_dev->dev_addr, ETH_ALEN);
+
 	pci_write_config_byte(isa_bridge, 0x48, reg & ~0x40);
 	pci_dev_put(isa_bridge);
 
@@ -310,7 +319,7 @@ static int __devinit sis630e_get_mac_addr(struct pci_dev * pci_dev,
  *
  *	SiS635 model, set MAC Reload Bit to load Mac address from APC
  *	to rfdr. rfdr is accessed through rfcr. MAC address is read into
- *	@net_dev->dev_addr.
+ *	@net_dev->dev_addr and @net_dev->perm_addr.
  */
 
 static int __devinit sis635_get_mac_addr(struct pci_dev * pci_dev,
@@ -334,6 +343,9 @@ static int __devinit sis635_get_mac_addr(struct pci_dev * pci_dev,
 		*( ((u16 *)net_dev->dev_addr) + i) = inw(ioaddr + rfdr);
 	}
 
+	/* Store MAC Address in perm_addr */
+	memcpy(net_dev->perm_addr, net_dev->dev_addr, ETH_ALEN);
+
 	/* enable packet filtering */
 	outl(rfcrSave | RFEN, rfcr + ioaddr);
 
@@ -353,7 +365,7 @@ static int __devinit sis635_get_mac_addr(struct pci_dev * pci_dev,
  *	EEDONE signal to refuse EEPROM access by LAN.
  *	The EEPROM map of SiS962 or SiS963 is different to SiS900.
  *	The signature field in SiS962 or SiS963 spec is meaningless.
- *	MAC address is read into @net_dev->dev_addr.
+ *	MAC address is read into @net_dev->dev_addr and @net_dev->perm_addr.
  */
 
 static int __devinit sis96x_get_mac_addr(struct pci_dev * pci_dev,
@@ -372,6 +384,9 @@ static int __devinit sis96x_get_mac_addr(struct pci_dev * pci_dev,
 			for (i = 0; i < 3; i++)
 			        ((u16 *)(net_dev->dev_addr))[i] = read_eeprom(ioaddr, i+EEPROMMACAddr);
 
+			/* Store MAC Address in perm_addr */
+			memcpy(net_dev->perm_addr, net_dev->dev_addr, ETH_ALEN);
+
 			outl(EEDONE, ee_addr);
 			return 1;
 		} else {
-- 
1.7.4.1


* [PATCH 2/2] net/natsemi: store MAC into perm_addr
From: Otavio Salvador @ 2011-04-12 15:30 UTC
  To: netdev; +Cc: Otavio Salvador
In-Reply-To: <1302622241-11871-1-git-send-email-otavio@ossystems.com.br>

Signed-off-by: Otavio Salvador <otavio@ossystems.com.br>
---
 drivers/net/natsemi.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/natsemi.c b/drivers/net/natsemi.c
index aa2813e..1074231 100644
--- a/drivers/net/natsemi.c
+++ b/drivers/net/natsemi.c
@@ -860,6 +860,9 @@ static int __devinit natsemi_probe1 (struct pci_dev *pdev,
 		prev_eedata = eedata;
 	}
 
+	/* Store MAC Address in perm_addr */
+	memcpy(dev->perm_addr, dev->dev_addr, ETH_ALEN);
+
 	dev->base_addr = (unsigned long __force) ioaddr;
 	dev->irq = irq;
 
-- 
1.7.4.1


* Re: [net-next PATCH 1/3] vxge: always enable hardware time stamp
From: Jon Mason @ 2011-04-12 15:36 UTC
  To: David Miller; +Cc: netdev
In-Reply-To: <20110410.185845.70188298.davem@davemloft.net>

On Sun, Apr 10, 2011 at 06:58:45PM -0700, David Miller wrote:
> From: Jon Mason <jdmason@kudzu.us>
> Date: Fri,  8 Apr 2011 16:11:21 -0500
> 
> > Hardware time stamp calculation can only be enabled by the privileged
> > function. Enable it always by default and simply use the ethtool
> > interface to set a flag to indicate whether or not the respective
> > function driver should indicate the timestamp along with the received
> > packet.
> > 
> > Also, make certain fields in vxge_hw_device_config bit-fields to reduce
> > the size of the struct.
> > 
> > Signed-off-by: Jon Mason <jdmason@kudzu.us>
> 
> Doesn't this have some performance or latency impact?

It is all done in hardware by replacing the CRC with the HWTS value.
So, no perf or latency issues there.  It still only handles the HWTS
in receive if it is enabled in software via the ioctl. 

> 
> I think it should stay off by default, people who want this know
> they want it and can turn it on if they want to.

* Re: 2.6.39-rc2 boot crash
From: Patrick McHardy @ 2011-04-12 15:39 UTC
  To: Evgeniy Polyakov
  Cc: Eric B Munson, David Miller, dave, linux-kernel, gregkh,
	ksrinivasan, NetDev
In-Reply-To: <4DA44A73.3060801@trash.net>

[-- Attachment #1: Type: text/plain, Size: 871 bytes --]

On 12.04.2011 14:49, Patrick McHardy wrote:
> On 12.04.2011 00:06, Evgeniy Polyakov wrote:
>> Hi.
>>
>> On Mon, Apr 11, 2011 at 05:07:47PM -0400, Eric B Munson (emunson@mgebm.net) wrote:
>>>> I can't figure this out, the only thing that should have changed is the
>>>> time the initial PROC_CN_MCAST_LISTEN message is received. Apparently
>>>> at that point connector is not fully initialized yet. Please post your
>>>> config and the full boot log. Thanks.
>>>>
>>>
>>> I am still seeing this on Linus' tree, is there anything more I can do to help
>>> track the problem?
> 
> Sorry, I had a hardware failure, I'm back working on this now.
> 
>> Patrick, do you need my assist on this bug?
> 
> Thanks, but I can meanwhile reproduce the problem, so I think I
> should have a fix soon.

I think this patch should fix the problem. Eric, could you please
give it a try?




[-- Attachment #2: cn.diff --]
[-- Type: text/x-patch, Size: 838 bytes --]

commit ad676e0dbbe8658ce46e192f449689bf3011bdf5
Author: Patrick McHardy <kaber@trash.net>
Date:   Tue Apr 12 17:37:04 2011 +0200

    connector: fix skb double free in cn_rx_skb()
    
    When a skb is delivered to a registered callback, cn_call_callback()
    incorrectly returns -ENODEV after freeing the skb, causing cn_rx_skb()
    to free the skb a second time.
    
    Reported-by: Eric B Munson <emunson@mgebm.net>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index d770058..219d88a 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -142,6 +142,7 @@ static int cn_call_callback(struct sk_buff *skb)
 		cbq->callback(msg, nsp);
 		kfree_skb(skb);
 		cn_queue_release_callback(cbq);
+		err = 0;
 	}
 
 	return err;
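
The ownership rule behind this fix can be sketched in user space, assuming a plain heap buffer in place of an skb. struct buf, deliver() and rx() are hypothetical stand-ins for the sk_buff, cn_call_callback() and cn_rx_skb(); the real functions do more than this:

```c
#include <errno.h>
#include <stdlib.h>

struct buf { int payload; };

static int frees;	/* counts buf_free() calls, for illustration only */

static void buf_free(struct buf *b)
{
	frees++;
	free(b);
}

/* Consumes b and returns 0 when delivered; otherwise leaves b alone. */
static int deliver(struct buf *b, int have_callback)
{
	int err = -ENODEV;

	if (have_callback) {
		buf_free(b);
		err = 0;	/* report success so the caller does not free again */
	}
	return err;
}

/* The caller frees only when deliver() did not take ownership. */
static void rx(struct buf *b, int have_callback)
{
	if (deliver(b, have_callback) < 0)
		buf_free(b);
}
```

Without the "err = 0" line, deliver() would free the buffer and still return -ENODEV, and rx() would then free it a second time.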

* Re: 2.6.39-rc2 boot crash
From: Eric B Munson @ 2011-04-12 15:59 UTC
  To: Patrick McHardy
  Cc: Evgeniy Polyakov, David Miller, dave, linux-kernel, gregkh,
	NetDev
In-Reply-To: <4DA47247.20700@trash.net>

[-- Attachment #1: Type: text/plain, Size: 2006 bytes --]

On Tue, 12 Apr 2011, Patrick McHardy wrote:

> On 12.04.2011 14:49, Patrick McHardy wrote:
> > On 12.04.2011 00:06, Evgeniy Polyakov wrote:
> >> Hi.
> >>
> >> On Mon, Apr 11, 2011 at 05:07:47PM -0400, Eric B Munson (emunson@mgebm.net) wrote:
> >>>> I can't figure this out, the only thing that should have changed is the
> >>>> time the initial PROC_CN_MCAST_LISTEN message is received. Apparently
> >>>> at that point connector is not fully initialized yet. Please post your
> >>>> config and the full boot log. Thanks.
> >>>>
> >>>
> >>> I am still seeing this on Linus' tree, is there anything more I can do to help
> >>> track the problem?
> > 
> > Sorry, I had a hardware failure, I'm back working on this now.
> > 
> >> Patrick, do you need my assist on this bug?
> > 
> > Thanks, but I can meanwhile reproduce the problem, so I think I
> > should have a fix soon.
> 
> I think this patch should fix the problem. Eric, could you please
> give it a try?

This has me up and running again, thanks!

Tested-by: Eric B Munson <emunson@mgebm.net>
> 
> 
> 

> commit ad676e0dbbe8658ce46e192f449689bf3011bdf5
> Author: Patrick McHardy <kaber@trash.net>
> Date:   Tue Apr 12 17:37:04 2011 +0200
> 
>     connector: fix skb double free in cn_rx_skb()
>     
>     When a skb is delivered to a registered callback, cn_call_callback()
>     incorrectly returns -ENODEV after freeing the skb, causing cn_rx_skb()
>     to free the skb a second time.
>     
>     Reported-by: Eric B Munson <emunson@mgebm.net>
>     Signed-off-by: Patrick McHardy <kaber@trash.net>
> 
> diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
> index d770058..219d88a 100644
> --- a/drivers/connector/connector.c
> +++ b/drivers/connector/connector.c
> @@ -142,6 +142,7 @@ static int cn_call_callback(struct sk_buff *skb)
>  		cbq->callback(msg, nsp);
>  		kfree_skb(skb);
>  		cn_queue_release_callback(cbq);
> +		err = 0;
>  	}
>  
>  	return err;


* Re: [PATCH] s2io: Fix warnings due to -Wunused-but-set-variable.
From: Jon Mason @ 2011-04-12 16:00 UTC
  To: David Miller; +Cc: netdev
In-Reply-To: <20110411.160143.258100330.davem@davemloft.net>

On Mon, Apr 11, 2011 at 04:01:43PM -0700, David Miller wrote:
> 
> Most of these are cases where we are trying to read back a register
> after a write to ensure completion.
> 
> Simply prefixing the readl() or readq() with "(void)" is sufficient
> because these are volatile operations and the compiler cannot eliminate
> them just because no real assignment takes place.
> 
> The case of free_rxd_blk()'s assignments to "struct buffAdd *ba" is a
> real spurious assignment as this variable is completely otherwise
> unused.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Jon Mason <jdmason@kudzu.us>
> ---
>  drivers/net/s2io.c |    5 +----
>  drivers/net/s2io.h |   10 ++++------
>  2 files changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
> index ca8e75e..2d5cc61 100644
> --- a/drivers/net/s2io.c
> +++ b/drivers/net/s2io.c
> @@ -2244,13 +2244,12 @@ static int verify_xena_quiescence(struct s2io_nic *sp)
>  static void fix_mac_address(struct s2io_nic *sp)
>  {
>  	struct XENA_dev_config __iomem *bar0 = sp->bar0;
> -	u64 val64;
>  	int i = 0;
>  
>  	while (fix_mac[i] != END_SIGN) {
>  		writeq(fix_mac[i++], &bar0->gpio_control);
>  		udelay(10);
> -		val64 = readq(&bar0->gpio_control);
> +		(void) readq(&bar0->gpio_control);
>  	}
>  }
>  
> @@ -2727,7 +2726,6 @@ static void free_rxd_blk(struct s2io_nic *sp, int ring_no, int blk)
>  	int j;
>  	struct sk_buff *skb;
>  	struct RxD_t *rxdp;
> -	struct buffAdd *ba;
>  	struct RxD1 *rxdp1;
>  	struct RxD3 *rxdp3;
>  	struct mac_info *mac_control = &sp->mac_control;
> @@ -2751,7 +2749,6 @@ static void free_rxd_blk(struct s2io_nic *sp, int ring_no, int blk)
>  			memset(rxdp, 0, sizeof(struct RxD1));
>  		} else if (sp->rxd_mode == RXD_MODE_3B) {
>  			rxdp3 = (struct RxD3 *)rxdp;
> -			ba = &mac_control->rings[ring_no].ba[blk][j];
>  			pci_unmap_single(sp->pdev,
>  					 (dma_addr_t)rxdp3->Buffer0_ptr,
>  					 BUF0_LEN,
> diff --git a/drivers/net/s2io.h b/drivers/net/s2io.h
> index 628fd27..800b3a4 100644
> --- a/drivers/net/s2io.h
> +++ b/drivers/net/s2io.h
> @@ -1002,18 +1002,16 @@ static inline void writeq(u64 val, void __iomem *addr)
>  #define LF	2
>  static inline void SPECIAL_REG_WRITE(u64 val, void __iomem *addr, int order)
>  {
> -	u32 ret;
> -
>  	if (order == LF) {
>  		writel((u32) (val), addr);
> -		ret = readl(addr);
> +		(void) readl(addr);
>  		writel((u32) (val >> 32), (addr + 4));
> -		ret = readl(addr + 4);
> +		(void) readl(addr + 4);
>  	} else {
>  		writel((u32) (val >> 32), (addr + 4));
> -		ret = readl(addr + 4);
> +		(void) readl(addr + 4);
>  		writel((u32) (val), addr);
> -		ret = readl(addr);
> +		(void) readl(addr);
>  	}
>  }
>  
> -- 
> 1.7.4.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
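
The read-back idiom described in the quoted patch can be sketched outside the kernel, assuming a volatile pointer in place of an __iomem mapping. reg_write32(), reg_read32() and reg_write32_flush() are hypothetical stand-ins for writel()/readl(), not kernel APIs:

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for the kernel's writel()/readl(); the real
 * MMIO accessors also deal with barriers and byte order.
 */
static inline void reg_write32(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;
}

static inline uint32_t reg_read32(const volatile uint32_t *reg)
{
	return *reg;
}

/* Write a register, then read it back to make sure the write completed. */
static void reg_write32_flush(volatile uint32_t *reg, uint32_t val)
{
	reg_write32(reg, val);
	/* The access is volatile, so the load stays even though the value is dropped. */
	(void) reg_read32(reg);
}
```

Casting the result to void states that the value is intentionally unused, which avoids the -Wunused-but-set-variable warning without a dummy variable.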

* [PATCH] iproute2: tc add mqprio qdisc support
From: John Fastabend @ 2011-04-12 15:57 UTC
  To: shemminger; +Cc: netdev, bhutchings

Add mqprio qdisc support. Output matches the following,

# ./tc/tc qdisc
qdisc mq 0: dev eth1 root
qdisc mq 0: dev eth2 root
qdisc mqprio 8001: dev eth3 root  tc 8 map 0 1 2 3 4 5 6 7 1 1 1 1 1 1 1 1
             queues:(0:7) (8:15) (16:23) (24:31) (32:39) (40:47) (48:55) (56:63)

And usage is,

# ./tc/tc qdisc add dev eth3 root mqprio help
Usage: ... mqprio [num_tc NUMBER] [map P0 P1...]
                  [offset txq0 txq1 ...] [count cnt0 cnt1 ...] [hw 1|0]

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 include/linux/pkt_sched.h |   12 ++++
 tc/Makefile               |    1 
 tc/q_mqprio.c             |  125 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 138 insertions(+), 0 deletions(-)
 create mode 100644 tc/q_mqprio.c

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 2cfa4bc..776cd93 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -481,4 +481,16 @@ struct tc_drr_stats {
 	__u32	deficit;
 };
 
+/* MQPRIO */
+#define TC_QOPT_BITMASK 15
+#define TC_QOPT_MAX_QUEUE 16
+
+struct tc_mqprio_qopt {
+	__u8	num_tc;
+	__u8	prio_tc_map[TC_QOPT_BITMASK + 1];
+	__u8	hw;
+	__u16	count[TC_QOPT_MAX_QUEUE];
+	__u16	offset[TC_QOPT_MAX_QUEUE];
+};
+
 #endif
diff --git a/tc/Makefile b/tc/Makefile
index 101cc83..df372c6 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -43,6 +43,7 @@ TCMODULES += em_nbyte.o
 TCMODULES += em_cmp.o
 TCMODULES += em_u32.o
 TCMODULES += em_meta.o
+TCMODULES += q_mqprio.o
 
 TCSO :=
 ifeq ($(TC_CONFIG_ATM),y)
diff --git a/tc/q_mqprio.c b/tc/q_mqprio.c
new file mode 100644
index 0000000..8b2c006
--- /dev/null
+++ b/tc/q_mqprio.c
@@ -0,0 +1,125 @@
+/*
+ * q_mqprio.c	MQ prio qdisc
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Author:	John Fastabend, <john.r.fastabend@intel.com>
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <syslog.h>
+#include <fcntl.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <string.h>
+
+#include "utils.h"
+#include "tc_util.h"
+
+static void explain(void)
+{
+	fprintf(stderr, "Usage: ... mqprio [num_tc NUMBER] [map P0 P1 ...]\n");
+	fprintf(stderr, "                  [offset txq0 txq1 ...] ");
+	fprintf(stderr, "[count cnt0,cnt1 ...] [hw 1|0]\n");
+}
+
+static int mqprio_parse_opt(struct qdisc_util *qu, int argc,
+			    char **argv, struct nlmsghdr *n)
+{
+	int idx;
+	struct tc_mqprio_qopt opt = {
+				     8,
+				     {0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 1, 1, 3, 3, 3, 3},
+				     1,
+				    };
+
+	while (argc > 0) {
+		idx = 0;
+		if (strcmp(*argv, "num_tc") == 0) {
+			NEXT_ARG();
+			if (get_u8(&opt.num_tc, *argv, 10)) {
+				fprintf(stderr, "Illegal \"num_tc\"\n");
+				return -1;
+			}
+		} else if (strcmp(*argv, "map") == 0) {
+			while (idx < TC_QOPT_MAX_QUEUE && NEXT_ARG_OK()) {
+				NEXT_ARG();
+				if (get_u8(&opt.prio_tc_map[idx], *argv, 10)) {
+					PREV_ARG();
+					break;
+				}
+				idx++;
+			}
+			for ( ; idx < TC_QOPT_MAX_QUEUE; idx++)
+				opt.prio_tc_map[idx] = 0;
+		} else if (strcmp(*argv, "offset") == 0) {
+			while (idx < TC_QOPT_MAX_QUEUE && NEXT_ARG_OK()) {
+				NEXT_ARG();
+				if (get_u16(&opt.offset[idx], *argv, 10)) {
+					PREV_ARG();
+					break;
+				}
+				idx++;
+			}
+		} else if (strcmp(*argv, "count") == 0) {
+			while (idx < TC_QOPT_MAX_QUEUE && NEXT_ARG_OK()) {
+				NEXT_ARG();
+				if (get_u16(&opt.count[idx], *argv, 10)) {
+					PREV_ARG();
+					break;
+				}
+				idx++;
+			}
+		} else if (strcmp(*argv, "hw") == 0) {
+			NEXT_ARG();
+			if (get_u8(&opt.hw, *argv, 10)) {
+				fprintf(stderr, "Illegal \"hw\"\n");
+				return -1;
+			}
+			idx++;
+		} else if (strcmp(*argv, "help") == 0) {
+			explain();
+			return -1;
+		} else {
+			fprintf(stderr, "Unknown argument\n");
+			return -1;
+		}
+		argc--; argv++;
+	}
+
+	addattr_l(n, 1024, TCA_OPTIONS, &opt, sizeof(opt));
+	return 0;
+}
+
+int mqprio_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
+{
+	int i;
+	struct tc_mqprio_qopt *qopt;
+
+	if (opt == NULL)
+		return 0;
+
+	qopt = RTA_DATA(opt);
+
+	fprintf(f, " tc %u map ", qopt->num_tc);
+	for (i = 0; i <= TC_PRIO_MAX; i++)
+		fprintf(f, "%d ", qopt->prio_tc_map[i]);
+	fprintf(f, "\n             queues:");
+	for (i = 0; i < qopt->num_tc; i++)
+		fprintf(f, "(%i:%i) ", qopt->offset[i],
+			qopt->offset[i] + qopt->count[i] - 1);
+	return 0;
+}
+
+struct qdisc_util mqprio_qdisc_util = {
+	.id		= "mqprio",
+	.parse_qopt	= mqprio_parse_opt,
+	.print_qopt	= mqprio_print_opt,
+};
+


* [PATCH 2/3] MIPS: lantiq: add ethernet driver
From: John Crispin @ 2011-04-12 16:11 UTC
  To: Ralf Baechle; +Cc: John Crispin, Ralph Hempel, linux-mips, netdev
In-Reply-To: <1302624675-18652-1-git-send-email-blogic@openwrt.org>

This patch adds the driver for the ETOP Packet Processing Engine (PPE32) found
inside the XWAY family of Lantiq MIPS SoCs. This driver makes 100 Mbit Ethernet
work. Support for all 8 DMA channels, gigabit and the embedded switch found on
the AR9/VR9 still needs to be implemented.

Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: Ralph Hempel <ralph.hempel@lantiq.com>
Cc: linux-mips@linux-mips.org
Cc: netdev@vger.kernel.org

--

This patch should go via the MIPS tree.

 .../mips/include/asm/mach-lantiq/lantiq_platform.h |   14 +
 .../mips/include/asm/mach-lantiq/xway/lantiq_soc.h |    4 +-
 drivers/net/Kconfig                                |    7 +
 drivers/net/Makefile                               |    1 +
 drivers/net/lantiq_etop.c                          |  710 ++++++++++++++++++++
 5 files changed, 734 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/lantiq_etop.c

diff --git a/arch/mips/include/asm/mach-lantiq/lantiq_platform.h b/arch/mips/include/asm/mach-lantiq/lantiq_platform.h
index 1f1dba6..d6b600c 100644
--- a/arch/mips/include/asm/mach-lantiq/lantiq_platform.h
+++ b/arch/mips/include/asm/mach-lantiq/lantiq_platform.h
@@ -10,6 +10,7 @@
 #define _LANTIQ_PLATFORM_H__
 
 #include <linux/mtd/partitions.h>
+#include <linux/socket.h>
 
 /* struct used to pass info to the pci core */
 enum {
@@ -43,4 +44,17 @@ struct ltq_pci_data {
 	int irq[16];
 };
 
+/* struct used to pass info to network drivers */
+enum {
+	MII_MODE,
+	REV_MII_MODE,
+};
+
+#define LTQ_ETH_DATA_CHAN_MAX	0x8
+struct ltq_eth_data {
+	struct sockaddr mac;
+	int mii_mode;
+	int channel[LTQ_ETH_DATA_CHAN_MAX];
+};
+
 #endif
diff --git a/arch/mips/include/asm/mach-lantiq/xway/lantiq_soc.h b/arch/mips/include/asm/mach-lantiq/xway/lantiq_soc.h
index 95f1882..0213601 100644
--- a/arch/mips/include/asm/mach-lantiq/xway/lantiq_soc.h
+++ b/arch/mips/include/asm/mach-lantiq/xway/lantiq_soc.h
@@ -81,8 +81,8 @@
 #define PMU_SWITCH		0x10000000
 
 /* ETOP - ethernet */
-#define LTQ_PPE32_BASE_ADDR	0xBE180000
-#define LTQ_PPE32_SIZE		0x40000
+#define LTQ_ETOP_BASE_ADDR	0x1E180000
+#define LTQ_ETOP_SIZE		0x40000
 
 /* DMA */
 #define LTQ_DMA_BASE_ADDR	0x1E104100
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index b30c688..4878587 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2017,6 +2017,13 @@ config FTMAC100
 	  from Faraday. It is used on Faraday A320, Andes AG101 and some
 	  other ARM/NDS32 SoC's.
 
+config LANTIQ_ETOP
+	tristate "Lantiq SoC ETOP driver"
+	depends on SOC_TYPE_XWAY
+	help
+	  Support for the MII0 inside the Lantiq SoC
+
+
 source "drivers/net/fs_enet/Kconfig"
 
 source "drivers/net/octeon/Kconfig"
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index fbfca11..df71da7 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -261,6 +261,7 @@ obj-$(CONFIG_MLX4_CORE) += mlx4/
 obj-$(CONFIG_ENC28J60) += enc28j60.o
 obj-$(CONFIG_ETHOC) += ethoc.o
 obj-$(CONFIG_GRETH) += greth.o
+obj-$(CONFIG_LANTIQ_ETOP) += lantiq_etop.o
 
 obj-$(CONFIG_XTENSA_XT2000_SONIC) += xtsonic.o
 
diff --git a/drivers/net/lantiq_etop.c b/drivers/net/lantiq_etop.c
new file mode 100644
index 0000000..ec70a24
--- /dev/null
+++ b/drivers/net/lantiq_etop.c
@@ -0,0 +1,710 @@
+/*
+ *   This program is free software; you can redistribute it and/or modify it
+ *   under the terms of the GNU General Public License version 2 as published
+ *   by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program; if not, write to the Free Software
+ *   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
+ *
+ *   Copyright (C) 2011 John Crispin <blogic@openwrt.org>
+ */
+
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/interrupt.h>
+#include <linux/uaccess.h>
+#include <linux/in.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/phy.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/skbuff.h>
+#include <linux/mm.h>
+#include <linux/platform_device.h>
+#include <linux/ethtool.h>
+#include <linux/init.h>
+#include <linux/delay.h>
+#include <linux/io.h>
+
+#include <asm/checksum.h>
+
+#include <lantiq_soc.h>
+#include <xway_dma.h>
+#include <lantiq_platform.h>
+
+#define LTQ_ETOP_MDIO		0x11804
+#define MDIO_REQUEST		0x80000000
+#define MDIO_READ		0x40000000
+#define MDIO_ADDR_MASK		0x1f
+#define MDIO_ADDR_OFFSET	0x15
+#define MDIO_REG_MASK		0x1f
+#define MDIO_REG_OFFSET		0x10
+#define MDIO_VAL_MASK		0xffff
+
+#define PPE32_CGEN		0x800
+#define LQ_PPE32_ENET_MAC_CFG	0x1840
+
+#define LTQ_ETOP_ENETS0		0x11850
+#define LTQ_ETOP_MAC_DA0	0x1186C
+#define LTQ_ETOP_MAC_DA1	0x11870
+#define LTQ_ETOP_CFG		0x16020
+#define LTQ_ETOP_IGPLEN		0x16080
+
+#define MAX_DMA_CHAN            0x8
+#define MAX_DMA_CRC_LEN		0x4
+#define MAX_DMA_DATA_LEN	0x600
+
+#define ETOP_FTCU		BIT(28)
+#define ETOP_MII_MASK		0xf
+#define ETOP_MII_NORMAL		0xd
+#define ETOP_MII_REVERSE	0xe
+#define ETOP_PLEN_UNDER		0x40
+#define ETOP_CGEN		0x800
+
+#define IS_TX(x)		(x % 2)
+
+#define ltq_etop_r32(x)		ltq_r32(ltq_etop_membase + (x))
+#define ltq_etop_w32(x, y)	ltq_w32(x, ltq_etop_membase + (y))
+#define ltq_etop_w32_mask(x, y, z)	\
+		ltq_w32_mask(x, y, ltq_etop_membase + (z))
+
+static void __iomem *ltq_etop_membase;
+
+struct ltq_mii_priv {
+	struct ltq_eth_data *pldata;
+	struct resource *res;
+	struct net_device_stats stats;
+
+	struct mii_bus *mii_bus;
+	struct phy_device *phydev;
+
+	struct ltq_dma_channel dma[MAX_DMA_CHAN];
+	struct sk_buff *tx_skb[MAX_DMA_CHAN >> 1][LTQ_DESC_NUM];
+	struct sk_buff *rx_skb[MAX_DMA_CHAN >> 1][LTQ_DESC_NUM];
+
+	struct tasklet_struct rx_tasklet;
+	int rx_tasklet_running;
+	u32 rx_channel_mask;
+
+	struct tasklet_struct tx_tasklet;
+	int tx_tasklet_running;
+	u32 tx_channel_mask;
+	int tx_free[MAX_DMA_CHAN >> 1];
+};
+
+static int
+ltq_etop_alloc_rx_skb(struct ltq_mii_priv *priv, int ch, int desc)
+{
+	int idx = ch >> 1;
+
+	priv->rx_skb[idx][desc] = dev_alloc_skb(MAX_DMA_DATA_LEN);
+	if (!priv->rx_skb[idx][desc])
+		return -ENOMEM;
+
+	priv->dma[ch].desc_base[desc].addr = dma_map_single(NULL,
+			priv->rx_skb[idx][desc]->data,
+			MAX_DMA_DATA_LEN, DMA_FROM_DEVICE);
+	priv->dma[ch].desc_base[desc].addr =
+		CPHYSADDR(priv->rx_skb[idx][desc]->data);
+	priv->dma[ch].desc_base[desc].ctl =
+		LTQ_DMA_OWN | LTQ_DMA_RX_OFFSET(2) | MAX_DMA_DATA_LEN;
+	skb_reserve(priv->rx_skb[idx][desc], 2);
+	return 0;
+}
+
+static void
+ltq_etop_hw_receive(struct net_device *dev, struct ltq_mii_priv *priv, int ch)
+{
+	struct ltq_dma_desc *d = &priv->dma[ch].desc_base[priv->dma[ch].desc];
+	int len = (d->ctl & LTQ_DMA_SIZE_MASK) - MAX_DMA_CRC_LEN;
+	struct sk_buff *skb = priv->rx_skb[ch >> 1][priv->dma[ch].desc];
+
+	if (ltq_etop_alloc_rx_skb(priv, ch, priv->dma[ch].desc)) {
+		netdev_err(dev,
+			"failed to allocate new rx buffer, stopping DMA\n");
+		ltq_dma_close(&priv->dma[ch]);
+	}
+
+	priv->dma[ch].desc++;
+	priv->dma[ch].desc %= LTQ_DESC_NUM;
+
+	skb_put(skb, len);
+	skb->dev = dev;
+	skb->protocol = eth_type_trans(skb, dev);
+	netif_rx(skb);
+	priv->stats.rx_packets++;
+	priv->stats.rx_bytes += len;
+}
+
+static void
+ltq_etop_rx_tasklet(unsigned long _dev)
+{
+	struct net_device *dev = (struct net_device *)_dev;
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	unsigned long flags;
+	int max_irq = 16;
+
+	while (priv->rx_channel_mask && max_irq--) {
+		int ch = __fls(priv->rx_channel_mask);
+		int idx = priv->dma[ch].desc;
+		struct ltq_dma_desc *desc = &priv->dma[ch].desc_base[idx];
+
+		if ((desc->ctl & (LTQ_DMA_OWN | LTQ_DMA_C)) == LTQ_DMA_C) {
+			/* this is a completed rx transaction */
+			ltq_etop_hw_receive(dev, priv, ch);
+		} else {
+			/* there are no more complete descriptors */
+			priv->rx_channel_mask &= ~BIT(ch);
+			ltq_dma_ack_irq(&priv->dma[ch]);
+		}
+	}
+
+	local_irq_save(flags);
+	priv->rx_tasklet_running = 0;
+	if (priv->rx_channel_mask) {
+		priv->rx_tasklet_running = 1;
+		tasklet_schedule(&priv->rx_tasklet);
+	}
+	local_irq_restore(flags);
+}
+
+static int
+ltq_etop_tx_housekeeping(struct net_device *dev, int ch)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	int idx = ch >> 1;
+	int start_queue = 0;
+
+	while ((priv->dma[ch].desc_base[priv->tx_free[idx]].ctl &
+			(LTQ_DMA_OWN | LTQ_DMA_C)) == LTQ_DMA_C) {
+		dev_kfree_skb_any(priv->tx_skb[idx][priv->tx_free[idx]]);
+		priv->tx_skb[idx][priv->tx_free[idx]] = NULL;
+		memset(&priv->dma[ch].desc_base[priv->tx_free[idx]], 0,
+			sizeof(struct ltq_dma_desc));
+		priv->tx_free[idx]++;
+		priv->tx_free[idx] %= LTQ_DESC_NUM;
+		start_queue = 1;
+	}
+	return start_queue;
+}
+
+static void
+ltq_etop_tx_tasklet(unsigned long _dev)
+{
+	struct net_device *dev = (struct net_device *)_dev;
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	int start_queue = 0;
+
+	while (priv->tx_channel_mask) {
+		int ch = __fls(priv->tx_channel_mask);
+		priv->tx_channel_mask &= ~BIT(ch);
+		start_queue |= ltq_etop_tx_housekeeping(dev, ch);
+		ltq_dma_ack_irq(&priv->dma[ch]);
+	}
+	if (start_queue)
+		netif_start_queue(dev);
+	priv->tx_tasklet_running = 0;
+}
+
+static irqreturn_t
+ltq_etop_dma_irq(int irq, void *_priv)
+{
+	struct ltq_mii_priv *priv = _priv;
+	int ch = irq - LTQ_DMA_CH0_INT;
+
+	if (!IS_TX(ch) && !priv->rx_tasklet_running) {
+		priv->rx_channel_mask |= BIT(ch);
+		priv->rx_tasklet_running = 1;
+		tasklet_schedule(&priv->rx_tasklet);
+	}
+
+	if (IS_TX(ch) && !priv->tx_tasklet_running) {
+		priv->tx_channel_mask |= BIT(ch);
+		priv->tx_tasklet_running = 1;
+		tasklet_schedule(&priv->tx_tasklet);
+	}
+	return IRQ_HANDLED;
+}
+
+static void
+ltq_etop_free_channel(struct net_device *dev, struct ltq_dma_channel *ch)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+
+	ltq_dma_free(ch);
+	if (ch->irq)
+		free_irq(ch->irq, priv);
+
+	if (!IS_TX(ch->nr)) {
+		int desc;
+		for (desc = 0; desc < LTQ_DESC_NUM; desc++)
+			if (priv->rx_skb[ch->nr >> 1][desc])
+				dev_kfree_skb_any(
+					priv->rx_skb[ch->nr >> 1][desc]);
+	}
+}
+
+static void
+ltq_etop_hw_exit(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	int ch;
+
+	ltq_pmu_disable(PMU_PPE);
+	for (ch = 0; ch < MAX_DMA_CHAN; ch++)
+		if (priv->pldata->channel[ch])
+			ltq_etop_free_channel(dev, &priv->dma[ch]);
+}
+
+static int
+ltq_etop_hw_init(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	int ch;
+
+	ltq_pmu_enable(PMU_PPE);
+
+	if (priv->pldata->mii_mode == REV_MII_MODE)
+		ltq_etop_w32_mask(ETOP_MII_MASK,
+			ETOP_MII_REVERSE, LTQ_ETOP_CFG);
+	else
+		ltq_etop_w32_mask(ETOP_MII_MASK,
+			ETOP_MII_NORMAL, LTQ_ETOP_CFG);
+	ltq_etop_w32(PPE32_CGEN, LQ_PPE32_ENET_MAC_CFG);
+
+	ltq_dma_init_port(DMA_PORT_ETOP);
+
+	for (ch = 0; ch < MAX_DMA_CHAN; ch++) {
+		int irq = LTQ_DMA_CH0_INT + ch;
+
+		if (!priv->pldata->channel[ch])
+			continue;
+
+		priv->dma[ch].nr = ch;
+
+		if (IS_TX(priv->dma[ch].nr)) {
+			ltq_dma_alloc_tx(&priv->dma[ch]);
+			request_irq(irq, ltq_etop_dma_irq,
+				IRQF_DISABLED, "etop_tx", priv);
+		} else {
+			int desc;
+			ltq_dma_alloc_rx(&priv->dma[ch]);
+			for (desc = 0; desc < LTQ_DESC_NUM; desc++)
+				if (ltq_etop_alloc_rx_skb(priv, ch, desc))
+					return -ENOMEM;
+			request_irq(irq, ltq_etop_dma_irq,
+				IRQF_DISABLED, "etop_rx", priv);
+		}
+		priv->dma[ch].irq = irq;
+	}
+	tasklet_init(&priv->rx_tasklet, ltq_etop_rx_tasklet,
+		(unsigned long)dev);
+	tasklet_init(&priv->tx_tasklet, ltq_etop_tx_tasklet,
+		(unsigned long)dev);
+	return 0;
+}
+
+static int
+ltq_etop_mdio_wr(struct mii_bus *bus, int phy_addr, int phy_reg, u16 phy_data)
+{
+	u32 val = MDIO_REQUEST |
+		((phy_addr & MDIO_ADDR_MASK) << MDIO_ADDR_OFFSET) |
+		((phy_reg & MDIO_REG_MASK) << MDIO_REG_OFFSET) |
+		phy_data;
+
+	while (ltq_etop_r32(LTQ_ETOP_MDIO) & MDIO_REQUEST)
+		;
+	ltq_etop_w32(val, LTQ_ETOP_MDIO);
+	return 0;
+}
+
+static int
+ltq_etop_mdio_rd(struct mii_bus *bus, int phy_addr, int phy_reg)
+{
+	u32 val = MDIO_REQUEST | MDIO_READ |
+		((phy_addr & MDIO_ADDR_MASK) << MDIO_ADDR_OFFSET) |
+		((phy_reg & MDIO_REG_MASK) << MDIO_REG_OFFSET);
+
+	while (ltq_etop_r32(LTQ_ETOP_MDIO) & MDIO_REQUEST)
+		;
+	ltq_etop_w32(val, LTQ_ETOP_MDIO);
+	while (ltq_etop_r32(LTQ_ETOP_MDIO) & MDIO_REQUEST)
+		;
+	val = ltq_etop_r32(LTQ_ETOP_MDIO) & MDIO_VAL_MASK;
+	return val;
+}
+
+static void
+ltq_etop_mdio_link(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	struct phy_device *phydev = priv->phydev;
+
+	phy_print_status(phydev);
+}
+
+static int
+ltq_etop_mdio_probe(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	struct phy_device *phydev = NULL;
+	int phy_addr;
+
+	for (phy_addr = 0; phy_addr < PHY_MAX_ADDR; phy_addr++) {
+		if (priv->mii_bus->phy_map[phy_addr]) {
+			phydev = priv->mii_bus->phy_map[phy_addr];
+			break;
+		}
+	}
+
+	if (!phydev) {
+		netdev_err(dev, "no PHY found\n");
+		return -ENODEV;
+	}
+
+	phydev = phy_connect(dev, dev_name(&phydev->dev), &ltq_etop_mdio_link,
+			0, PHY_INTERFACE_MODE_MII);
+
+	if (IS_ERR(phydev)) {
+		netdev_err(dev, "Could not attach to PHY\n");
+		return PTR_ERR(phydev);
+	}
+
+	phydev->supported &= (SUPPORTED_10baseT_Half
+			      | SUPPORTED_10baseT_Full
+			      | SUPPORTED_100baseT_Half
+			      | SUPPORTED_100baseT_Full
+			      | SUPPORTED_Autoneg
+			      | SUPPORTED_MII
+			      | SUPPORTED_TP);
+
+	phydev->advertising = phydev->supported;
+	priv->phydev = phydev;
+	pr_info("%s: attached PHY [%s] (phy_addr=%s, irq=%d)\n",
+	       dev->name, phydev->drv->name,
+	       dev_name(&phydev->dev), phydev->irq);
+
+	return 0;
+}
+
+static int
+ltq_etop_mdio_init(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	int i;
+	int err;
+
+	priv->mii_bus = mdiobus_alloc();
+	if (!priv->mii_bus) {
+		netdev_err(dev, "failed to allocate mii bus\n");
+		err = -ENOMEM;
+		goto err_out;
+	}
+
+	priv->mii_bus->priv = dev;
+	priv->mii_bus->read = ltq_etop_mdio_rd;
+	priv->mii_bus->write = ltq_etop_mdio_wr;
+	priv->mii_bus->name = "ltq_mii";
+	snprintf(priv->mii_bus->id, MII_BUS_ID_SIZE, "%x", 0);
+	priv->mii_bus->irq = kmalloc(sizeof(int) * PHY_MAX_ADDR, GFP_KERNEL);
+	if (!priv->mii_bus->irq) {
+		err = -ENOMEM;
+		goto err_out_free_mdiobus;
+	}
+
+	for (i = 0; i < PHY_MAX_ADDR; ++i)
+		priv->mii_bus->irq[i] = PHY_POLL;
+
+	if (mdiobus_register(priv->mii_bus)) {
+		err = -ENXIO;
+		goto err_out_free_mdio_irq;
+	}
+
+	if (ltq_etop_mdio_probe(dev)) {
+		err = -ENXIO;
+		goto err_out_unregister_bus;
+	}
+	return 0;
+
+err_out_unregister_bus:
+	mdiobus_unregister(priv->mii_bus);
+err_out_free_mdio_irq:
+	kfree(priv->mii_bus->irq);
+err_out_free_mdiobus:
+	mdiobus_free(priv->mii_bus);
+err_out:
+	return err;
+}
+
+static void
+ltq_etop_mdio_cleanup(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	phy_disconnect(priv->phydev);
+	mdiobus_unregister(priv->mii_bus);
+	kfree(priv->mii_bus->irq);
+	mdiobus_free(priv->mii_bus);
+}
+
+static int
+ltq_etop_open(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	int ch;
+
+	for (ch = 0; ch < MAX_DMA_CHAN; ch++)
+		if (priv->pldata->channel[ch])
+			ltq_dma_open(&priv->dma[ch]);
+	netif_start_queue(dev);
+	return 0;
+}
+
+static int
+ltq_etop_stop(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	int ch;
+
+	for (ch = 0; ch < MAX_DMA_CHAN; ch++)
+		if (priv->pldata->channel[ch])
+			ltq_dma_close(&priv->dma[ch]);
+	netif_stop_queue(dev);
+	return 0;
+}
+
+static int
+ltq_etop_tx(struct sk_buff *skb, struct net_device *dev)
+{
+	int ch = 1;
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	struct ltq_dma_desc *d = &priv->dma[ch].desc_base[priv->dma[ch].desc];
+	int len;
+	unsigned long flags;
+	u32 byte_offset;
+
+	len = skb->len < ETH_ZLEN ? ETH_ZLEN : skb->len;
+	dev->trans_start = jiffies;
+
+	if ((d->ctl & (LTQ_DMA_OWN | LTQ_DMA_C)) ||
+			priv->tx_skb[ch >> 1][priv->dma[ch].desc]) {
+		priv->stats.tx_errors++;
+		priv->stats.tx_dropped++;
+		dev_kfree_skb_any(skb);
+		netdev_err(dev, "tx ring full\n");
+		netif_stop_queue(dev);
+		return NETDEV_TX_OK;
+	}
+
+	/* dma needs to start on a 16 byte aligned address */
+	byte_offset = CPHYSADDR(skb->data) % 16;
+	priv->tx_skb[ch >> 1][priv->dma[ch].desc] = skb;
+
+	local_irq_save(flags);
+	d->addr = ((unsigned int) dma_map_single(NULL, skb->data, len,
+						DMA_TO_DEVICE)) - byte_offset;
+	wmb();
+	d->ctl = LTQ_DMA_OWN | LTQ_DMA_SOP | LTQ_DMA_EOP |
+		LTQ_DMA_TX_OFFSET(byte_offset) | (len & LTQ_DMA_SIZE_MASK);
+	local_irq_restore(flags);
+
+	priv->dma[ch].desc++;
+	priv->dma[ch].desc %= LTQ_DESC_NUM;
+
+	if (priv->dma[ch].desc_base[priv->dma[ch].desc].ctl & LTQ_DMA_OWN)
+		netif_stop_queue(dev);
+
+	priv->stats.tx_packets++;
+	priv->stats.tx_bytes += len;
+	return NETDEV_TX_OK;
+}
+
+static void
+ltq_etop_tx_timeout(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+
+	priv->stats.tx_errors++;
+	netif_wake_queue(dev);
+}
+
+static int
+ltq_etop_change_mtu(struct net_device *dev, int new_mtu)
+{
+	int retval = eth_change_mtu(dev, new_mtu);
+
+	if (!retval)
+		ltq_etop_w32((ETOP_PLEN_UNDER << 16) | new_mtu,
+			LTQ_ETOP_IGPLEN);
+	return retval;
+}
+
+static int
+ltq_etop_set_mac_address(struct net_device *dev, void *p)
+{
+	int retcode = eth_mac_addr(dev, p);
+
+	if (!retcode) {
+		/* store the mac for the unicast filter */
+		ltq_etop_w32(*((u32 *)dev->dev_addr), LTQ_ETOP_MAC_DA0);
+		ltq_etop_w32(*((u16 *)&dev->dev_addr[4]) << 16,
+			LTQ_ETOP_MAC_DA1);
+	}
+	return retcode;
+}
+
+static void
+ltq_etop_set_multicast_list(struct net_device *dev)
+{
+	/* ensure that the unicast filter is not enabled in promiscuous mode */
+	if ((dev->flags & IFF_PROMISC) || (dev->flags & IFF_ALLMULTI))
+		ltq_etop_w32_mask(ETOP_FTCU, 0, LTQ_ETOP_ENETS0);
+	else
+		ltq_etop_w32_mask(0, ETOP_FTCU, LTQ_ETOP_ENETS0);
+}
+
+static int
+ltq_etop_init(struct net_device *dev)
+{
+	struct ltq_mii_priv *priv = netdev_priv(dev);
+	int err;
+
+	ether_setup(dev);
+	dev->watchdog_timeo = 10 * HZ;
+	err = ltq_etop_hw_init(dev);
+	if (err)
+		goto err_hw;
+	ltq_etop_change_mtu(dev, 1500);
+	err = ltq_etop_set_mac_address(dev, &priv->pldata->mac);
+	if (err)
+		goto err_netdev;
+	ltq_etop_set_multicast_list(dev);
+	err = ltq_etop_mdio_init(dev);
+	if (err)
+		goto err_netdev;
+	return 0;
+
+err_netdev:
+	unregister_netdev(dev);
+	free_netdev(dev);
+err_hw:
+	ltq_etop_hw_exit(dev);
+	return err;
+}
+
+static const struct net_device_ops ltq_eth_netdev_ops = {
+	.ndo_open = ltq_etop_open,
+	.ndo_stop = ltq_etop_stop,
+	.ndo_start_xmit = ltq_etop_tx,
+	.ndo_tx_timeout = ltq_etop_tx_timeout,
+	.ndo_change_mtu = ltq_etop_change_mtu,
+	.ndo_set_mac_address = ltq_etop_set_mac_address,
+	.ndo_validate_addr = eth_validate_addr,
+	.ndo_set_multicast_list = ltq_etop_set_multicast_list,
+	.ndo_init = ltq_etop_init,
+};
+
+static int __init
+ltq_etop_probe(struct platform_device *pdev)
+{
+	struct net_device *dev;
+	struct ltq_mii_priv *priv;
+	struct resource *res;
+	int err;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		dev_err(&pdev->dev, "failed to get etop resource\n");
+		err = -ENOENT;
+		goto err_out;
+	}
+
+	res = devm_request_mem_region(&pdev->dev, res->start,
+		resource_size(res), dev_name(&pdev->dev));
+	if (!res) {
+		dev_err(&pdev->dev, "failed to request etop resource\n");
+		err = -EBUSY;
+		goto err_out;
+	}
+
+	ltq_etop_membase = devm_ioremap_nocache(&pdev->dev,
+		res->start, resource_size(res));
+	if (!ltq_etop_membase) {
+		dev_err(&pdev->dev, "failed to remap etop engine %d\n",
+			pdev->id);
+		err = -ENOMEM;
+		goto err_out;
+	}
+
+	dev = alloc_etherdev(sizeof(struct ltq_mii_priv));
+	strcpy(dev->name, "eth%d");
+	dev->netdev_ops = &ltq_eth_netdev_ops;
+	priv = netdev_priv(dev);
+	priv->res = res;
+	priv->pldata = dev_get_platdata(&pdev->dev);
+
+	err = register_netdev(dev);
+	if (err)
+		goto err_free;
+
+	platform_set_drvdata(pdev, dev);
+	return 0;
+
+err_free:
+	free_netdev(dev);
+err_out:
+	return err;
+}
+
+static int __devexit
+ltq_etop_remove(struct platform_device *pdev)
+{
+	struct net_device *dev = platform_get_drvdata(pdev);
+
+	if (dev) {
+		netif_stop_queue(dev);
+		ltq_etop_hw_exit(dev);
+		ltq_etop_mdio_cleanup(dev);
+		unregister_netdev(dev);
+	}
+	return 0;
+}
+
+static struct platform_driver ltq_mii_driver = {
+	.remove = __devexit_p(ltq_etop_remove),
+	.driver = {
+		.name = "ltq_etop",
+		.owner = THIS_MODULE,
+	},
+};
+
+int __init
+init_ltq_etop(void)
+{
+	int ret = platform_driver_probe(&ltq_mii_driver, ltq_etop_probe);
+
+	if (ret)
+		pr_err("ltq_etop: Error registering platform driver!\n");
+	return ret;
+}
+
+static void __exit
+exit_ltq_etop(void)
+{
+	platform_driver_unregister(&ltq_mii_driver);
+}
+
+module_init(init_ltq_etop);
+module_exit(exit_ltq_etop);
+
+MODULE_AUTHOR("John Crispin <blogic@openwrt.org>");
+MODULE_DESCRIPTION("Lantiq SoC ETOP");
+MODULE_LICENSE("GPL");
-- 
1.7.2.3



* Re: Kernel panic when using bridge
From: Eric Dumazet @ 2011-04-12 16:14 UTC (permalink / raw)
  To: Jan Lübbe
  Cc: Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA, netdev,
	Bandan Das
In-Reply-To: <1302621233.30934.44.camel@polaris.local>

On Tuesday, 12 April 2011 at 17:13 +0200, Jan Lübbe wrote:
> On Tue, 2011-04-12 at 16:49 +0200, Eric Dumazet wrote: 
> > Of course, this might be a complete shot in the dark, but a
> > stackprotector fault in icmp_send() really sounds like a problem in
> > ip_options_echo() [ or bad input data given to this function ]
> 
> It was my understanding that all IP options given to ip_options_echo are
> either from local sources or have gone through ip_options_compile, which
> seems to verify that the sum of the individual option lengths do not
> exceed the ip header. So there wouldn't need to be additional checks in
> ip_options_echo.
> 
> If this is not the case, we need size checks in ip_options_echo before
> copying over each option.
> 
> > Other related changes (but as old as v2.6.22) :
> > 
> > commit 11a03f78fbf15a866ba
> > ([NetLabel]: core network changes)
> 
> When investigating the problem I had with timestamps, i found that most
> of the lines in ip_options_echo and _compile have not been changed since
> before 2.2 (some even before 2.0). The newer changes have all been
> updates for changed API elsewhere in the stack.
> 

commit 462fb2af9788a82 might be the problem.
(bridge : Sanitize skb before it enters the IP stack)

We are supposed to provide a zeroed ip_options to ip_options_compile()

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 008ff6c..f3bc322 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
 		goto drop;
 	}
 
-	/* Zero out the CB buffer if no options present */
-	if (iph->ihl == 5) {
-		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	if (iph->ihl == 5)
 		return 0;
-	}
 
 	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
 	if (ip_options_compile(dev_net(dev), opt, skb))
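
A minimal userspace sketch of why the unconditional memset matters,
assuming a parser that only writes the fields for options it actually
finds. struct toy_opts, toy_compile() and toy_parse() are hypothetical
stand-ins for illustration, not kernel code:

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for struct ip_options; the field names are illustrative. */
struct toy_opts {
	unsigned char optlen;	/* total length of the option bytes */
	unsigned char rr;	/* offset of a record-route option, 0 if none */
	unsigned char ts;	/* offset of a timestamp option, 0 if none */
};

/*
 * Toy parser: it records only what it finds and assumes every other
 * field already reads as zero.
 */
static void toy_compile(struct toy_opts *opt, int has_ts)
{
	assert(opt != NULL);
	if (has_ts)
		opt->ts = 20;	/* made-up offset */
}

/* Caller: clear first, then set optlen, then parse. */
static void toy_parse(struct toy_opts *opt, unsigned char optlen, int has_ts)
{
	memset(opt, 0, sizeof(*opt));
	if (optlen == 0)
		return;
	opt->optlen = optlen;
	toy_compile(opt, has_ts);
}
```

If the memset ran only in the "no options" branch, a packet with
options would reach toy_compile() with whatever bytes the previous
user left behind, and a stale rr or ts offset would later be trusted
by the code that echoes options.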





* Re: Kernel panic when using bridge
From: Stephen Hemminger @ 2011-04-12 16:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jan Lübbe, Scot Doyle, Hiroaki SHIMODA, netdev, Bandan Das
In-Reply-To: <1302624851.3233.63.camel@edumazet-laptop>

On Tue, 12 Apr 2011 18:14:11 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Tuesday, 12 April 2011 at 17:13 +0200, Jan Lübbe wrote:
> > On Tue, 2011-04-12 at 16:49 +0200, Eric Dumazet wrote: 
> > > Of course, this might be a complete shot in the dark, but a
> > > stackprotector fault in icmp_send() really sounds like a problem in
> > > ip_options_echo() [ or bad input data given to this function ]
> > 
> > It was my understanding that all IP options given to ip_options_echo are
> > either from local sources or have gone through ip_options_compile, which
> > seems to verify that the sum of the individual option lengths do not
> > exceed the ip header. So there wouldn't need to be additional checks in
> > ip_options_echo.
> > 
> > If this is not the case, we need size checks in ip_options_echo before
> > copying over each option.
> > 
> > > Other related changes (but as old as v2.6.22) :
> > > 
> > > commit 11a03f78fbf15a866ba
> > > ([NetLabel]: core network changes)
> > 
> > When investigating the problem I had with timestamps, i found that most
> > of the lines in ip_options_echo and _compile have not been changed since
> > before 2.2 (some even before 2.0). The newer changes have all been
> > updates for changed API elsewhere in the stack.
> > 
> 
> commit 462fb2af9788a82 might be the problem.
> (bridge : Sanitize skb before it enters the IP stack)
> 
> We are supposed to provide a zeroed ip_options to ip_options_compile()
> 
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index 008ff6c..f3bc322 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
>  		goto drop;
>  	}
>  
> -	/* Zero out the CB buffer if no options present */
> -	if (iph->ihl == 5) {
> -		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +	if (iph->ihl == 5)
>  		return 0;
> -	}
>  
>  	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
>  	if (ip_options_compile(dev_net(dev), opt, skb))

I think the confusion is that IPCB(skb) is not the IP header but
scratch space used during IP header processing. Before the sanitize
patch the CB was cleared.
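
To illustrate the scratch-space point, here is a rough userspace
sketch, assuming two layers that each overlay their own private struct
on the same cb[] bytes. All toy_* names are hypothetical. The kernel's
IPCB() is a plain cast of skb->cb; the sketch copies with memcpy
instead so that it stays within standard C:

```c
#include <assert.h>
#include <string.h>

struct toy_skb {
	char cb[48];		/* control buffer, private to the current layer */
};

/* What a lower layer might keep in cb[] (illustrative). */
struct toy_l2_cb {
	unsigned int cookie;
};

/* What the IP layer expects to find in the same bytes (illustrative). */
struct toy_ip_cb {
	unsigned char optlen;
	unsigned char srr;
	unsigned char rr;
	unsigned char ts;
};

/* Lower layer stores its private state at the start of cb[]. */
static void toy_l2_store(struct toy_skb *skb, unsigned int cookie)
{
	struct toy_l2_cb l2 = { .cookie = cookie };

	assert(sizeof(l2) <= sizeof(skb->cb));
	memcpy(skb->cb, &l2, sizeof(l2));
}

/* IP layer reads the very same bytes through its own layout. */
static struct toy_ip_cb toy_ip_view(const struct toy_skb *skb)
{
	struct toy_ip_cb ip;

	memcpy(&ip, skb->cb, sizeof(ip));
	return ip;
}

/* Hand-over between layers: wipe the scratch area. */
static void toy_cb_reset(struct toy_skb *skb)
{
	memset(skb->cb, 0, sizeof(skb->cb));
}
```

Without the reset, the IP-side view simply reinterprets the lower
layer's leftovers as option offsets, which is the kind of bad input
discussed earlier in this thread.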

Acked-by: Stephen Hemminger <shemminger@vyatta.com>


* Re: [PATCH] iproute2: tc add mqprio qdisc support
From: Stephen Hemminger @ 2011-04-12 16:23 UTC (permalink / raw)
  To: John Fastabend; +Cc: netdev, bhutchings
In-Reply-To: <20110412155727.4656.42756.stgit@jf-dev1-dcblab>

On Tue, 12 Apr 2011 08:57:27 -0700
John Fastabend <john.r.fastabend@intel.com> wrote:

> Add mqprio qdisc support. Output matches the following,
> 
> # ./tc/tc qdisc
> qdisc mq 0: dev eth1 root
> qdisc mq 0: dev eth2 root
> qdisc mqprio 8001: dev eth3 root  tc 8 map 0 1 2 3 4 5 6 7 1 1 1 1 1 1 1 1
>              queues:(0:7) (8:15) (16:23) (24:31) (32:39) (40:47) (48:55) (56:63)
> 
> And usage is,
> 
> # ./tc/tc qdisc add dev eth3 root mqprio help
> Usage: ... mclass [num_tc NUMBER] [map P0 P1...]
>                   [offset txq0 txq1 ...] [count cnt0 cnt1 ...] [hw 1|0]
> 
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>

Applied to net-next branch.



-- 


* RE: [net-next-2.6 RFC PATCH] ethtool: allow custom interval for physical identification
From: Ben Hutchings @ 2011-04-12 16:28 UTC (permalink / raw)
  To: Allan, Bruce W; +Cc: Stephen Hemminger, netdev@vger.kernel.org
In-Reply-To: <8DD2590731AB5D4C9DBF71A877482A90018A2A315B@orsmsx509.amr.corp.intel.com>

On Mon, 2011-04-11 at 18:07 -0700, Allan, Bruce W wrote:
> >-----Original Message-----
> >From: Ben Hutchings [mailto:bhutchings@solarflare.com]
> >Sent: Monday, April 11, 2011 5:01 PM
> >To: Allan, Bruce W
> >Cc: Stephen Hemminger; netdev@vger.kernel.org
> >Subject: RE: [net-next-2.6 RFC PATCH] ethtool: allow custom interval for
> >physical identification
> >
> >I noticed that some drivers did this.  Do you know if these OEMs expect
> >this of all hardware, or do they actually want different vendors'
> >hardware to blink in different ways?  If it's a common requirement to
> >blink at 2 Hz then let's use that frequency for all the drivers that
> >want to be called periodically.
> >
> >Ben.
> 
> Sorry, I don't know.  I'll ask around, but doubt I will get a definitive
> answer.

I enquired here and found that we do have an OEM specifying 1 Hz.

> FWIW, without digging too deep into how other drivers identify their
> respective ports through software, it appears it was split:
> * bnx2*, cxgb3, niu, s2io, sfc, sky2, tg3 - once per second
> * e100*, igb, ixgb*, pcnet32, ewrk3, cxgb4 - approx. twice per second
>
> AFAIK for parts that can set the physical identification through hardware,
> the Intel drivers set the on/off intervals to approximately twice/second;
> I don't know what other drivers do in that situation.
> 
> So, I would guess it is not a common requirement to blink at a specific Hz.
> I have no problem with changing the hard-coded blink frequency to what our
> OEMs expect, but that might be an issue for those other vendors; I was just
> trying to make it flexible.

Sadly it appears this is necessary.

Let's define the return value for drivers wanting periodic callbacks to
be the blink frequency in Hz (normally 1 or 2), and get rid of the
special case of -EINVAL.  This also removes the rather inelegant
semantic that drivers may need to change their state despite returning
an error code.
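
A small sketch of how the core could consume such a return value,
under the assumption that a negative value is an error, zero means the
driver needs no periodic callback, and a positive value is the blink
frequency in Hz. The helper name and the zero convention are
assumptions made for illustration, not part of any posted patch:

```c
#include <assert.h>

/*
 * Hypothetical helper: turn a driver's return value into the delay
 * between LED toggles, in milliseconds. One blink cycle is one "on"
 * plus one "off" phase, so the toggle interval is half the period.
 * Returns 0 when no periodic callback is wanted.
 */
static int toy_blink_toggle_ms(int driver_ret)
{
	assert(driver_ret <= 500);	/* keep the interval at 1 ms or more */

	if (driver_ret <= 0)
		return 0;
	return 1000 / (2 * driver_ret);
}
```

With this convention a driver returning 1 is toggled every 500 ms and
a driver returning 2 every 250 ms, so no separate error code is needed
to request the callbacks.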

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: [Bugme-new] [Bug 33042] New: Marvell 88E1145 phy configured incorrectly in fiber mode
From: David Daney @ 2011-04-12 16:34 UTC (permalink / raw)
  To: Alex Dubov
  Cc: Andrew Morton, netdev, bugzilla-daemon, bugme-daemon,
	Grant Likely, Andy Fleming
In-Reply-To: <392488.26736.qm@web37606.mail.mud.yahoo.com>

On 04/11/2011 08:45 PM, Alex Dubov wrote:
>>
>> How does your u-boot configure the part?  Does it
>> write any of the
>> configuration registers, or is it just the default
>> configuration set via
>> the strapping pins?
>
> U-boot configures this phy just like any other phy - by running a set of
> register assignments from phy_info_M88E1145.
>
> Unfortunately, I don't have a datasheet for this phy

I would guess that the people who designed your board have it.

You do seem to have the uboot code though, so you know which registers 
are being set.  Given this, you can fiddle around with the Linux driver 
until it works.  Then send a patch.

To me it looks like you need to set register 22 (Page Select) to the 
value of 1 to access the register sets associated with fiber.  I don't 
have any hardware with this PHY connected to fiber, so I can't really 
test it.
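
A rough sketch of what paged register access looks like, using a mock
PHY in place of real MDIO accessors. Register 22 as page select comes
from the guess above; the two-page layout and all toy_* names are
assumptions for illustration and are not taken from a datasheet or
from the kernel's marvell driver:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SELECT_REG	22
#define TOY_NUM_PAGES		2	/* assumed: 0 = copper, 1 = fiber */
#define TOY_NUM_REGS		32

/* Mock PHY: one register bank per page plus the current page number. */
struct toy_phy {
	uint16_t page;
	uint16_t regs[TOY_NUM_PAGES][TOY_NUM_REGS];
};

static int toy_mdio_write(struct toy_phy *phy, int reg, uint16_t val)
{
	if (reg < 0 || reg >= TOY_NUM_REGS)
		return -1;
	if (reg == TOY_PAGE_SELECT_REG) {
		if (val >= TOY_NUM_PAGES)
			return -1;
		phy->page = val;
		return 0;
	}
	phy->regs[phy->page][reg] = val;
	return 0;
}

static int toy_mdio_read(const struct toy_phy *phy, int reg)
{
	if (reg < 0 || reg >= TOY_NUM_REGS)
		return -1;
	if (reg == TOY_PAGE_SELECT_REG)
		return phy->page;
	return phy->regs[phy->page][reg];
}

/* Read one register from a given page, then restore the old page. */
static int toy_read_paged(struct toy_phy *phy, uint16_t page, int reg)
{
	int old = toy_mdio_read(phy, TOY_PAGE_SELECT_REG);
	int val;

	assert(old >= 0);
	if (toy_mdio_write(phy, TOY_PAGE_SELECT_REG, page) < 0)
		return -1;
	val = toy_mdio_read(phy, reg);
	toy_mdio_write(phy, TOY_PAGE_SELECT_REG, (uint16_t)old);
	return val;
}
```

Restoring the previous page keeps other code that assumes the copper
page from reading fiber registers by accident.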

David Daney

> and kernel does
> quite a few things differently, so simply copying stuff from u-boot
> does not work well (in kernel, phy initialization is broken into 3
> functions, if I'm not mistaken).
>
> Otherwise, my problem seems to be identical to the one reported some
> time ago against 88E1111 phy (which resulted in the addition of
> "marvell_read_status" in the first place). The problem was, as it seems
> to be now, that phy is always configured in "copper" mode, instead of
> driver checking for the correct "fiber" mode bits.
>
>
>>
>> In any event, you will probably have to read the
>> configuration before
>> the drivers/net/phy/marvel.c changes them.  Then
>> compare that to what
>> the driver is trying to set.  Then you will either
>> have to override the
>> configuration with the device tree "marvell,reg-init"
>> property, or if
>> you are not using the device tree, add a 88e1145 specific
>> flag that you
>> set when calling phy_connect().
>>
>> David Daney
>>



* Re: Kernel panic when using bridge
From: Eric Dumazet @ 2011-04-12 16:35 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Jan Lübbe, Scot Doyle, Hiroaki SHIMODA, netdev, Bandan Das
In-Reply-To: <20110412092039.69f420f6@nehalam>

On Tuesday, 12 April 2011 at 09:20 -0700, Stephen Hemminger wrote:

> I think the confusion is that IPCB(skb) is not the IP header but
> scratch space used during IP header processing. Before the sanitize
> patch the CB was cleared.
> 
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>

Should we clear it also in br_nf_dev_queue_xmit(), since we did this
prior to commit 462fb2af9788a8 ?

Thanks !




* Re: Kernel panic when using bridge
From: Bandan Das @ 2011-04-12 16:32 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jan Lübbe, Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA,
	netdev, Bandan Das
In-Reply-To: <1302624851.3233.63.camel@edumazet-laptop>

On  0, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tuesday, 12 April 2011 at 17:13 +0200, Jan Lübbe wrote:
> > On Tue, 2011-04-12 at 16:49 +0200, Eric Dumazet wrote: 
> > > Of course, this might be a complete shot in the dark, but a
> > > stackprotector fault in icmp_send() really sounds like a problem in
> > > ip_options_echo() [ or bad input data given to this function ]
> > 
> > It was my understanding that all IP options given to ip_options_echo are
> > either from local sources or have gone through ip_options_compile, which
> > seems to verify that the sum of the individual option lengths do not
> > exceed the ip header. So there wouldn't need to be additional checks in
> > ip_options_echo.
> > 
> > If this is not the case, we need size checks in ip_options_echo before
> > copying over each option.
> > 
> > > Other related changes (but as old as v2.6.22) :
> > > 
> > > commit 11a03f78fbf15a866ba
> > > ([NetLabel]: core network changes)
> > 
> > When investigating the problem I had with timestamps, i found that most
> > of the lines in ip_options_echo and _compile have not been changed since
> > before 2.2 (some even before 2.0). The newer changes have all been
> > updates for changed API elsewhere in the stack.
> > 
> 
> commit 462fb2af9788a82 might be the problem.
> (bridge : Sanitize skb before it enters the IP stack)
> 
> We are supposed to provide a zeroed ip_options to ip_options_compile()
> 
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index 008ff6c..f3bc322 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
>  		goto drop;
>  	}
>  
> -	/* Zero out the CB buffer if no options present */
> -	if (iph->ihl == 5) {
> -		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +	if (iph->ihl == 5)
>  		return 0;
> -	}
>  
>  	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
>  	if (ip_options_compile(dev_net(dev), opt, skb))
> 
> 
Looks good to me. The CB area should be cleared out anyways before 
handing over the packet. Thank you for spotting this!

Acked-by: Bandan Das <bandan.das@stratus.com>


* Re: Kernel panic when using bridge
From: Bandan Das @ 2011-04-12 16:45 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Stephen Hemminger, Jan Lübbe, Scot Doyle, Hiroaki SHIMODA,
	netdev, Bandan Das
In-Reply-To: <1302626152.3233.66.camel@edumazet-laptop>

On  0, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tuesday, 12 April 2011 at 09:20 -0700, Stephen Hemminger wrote:
> 
> > I think the confusion is that IPCB(skb) is not the IP header but
> > scratch space used during IP header processing. Before the sanitize
> > patch the CB was cleared.
> > 
> > Acked-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> Should we clear it also in br_nf_dev_queue_xmit(), since we did this
> prior to commit 462fb2af9788a8 ?
> 
> Thanks !
> 
Wouldn't that clear out any valid IP options if it were there ? I think
that was the whole point of adding br_parse_ip_options :

/* BUG: Should really parse the IP options here. */
memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));

 

-- 
Bandan 


* Re: Kernel panic when using bridge
From: Eric Dumazet @ 2011-04-12 16:54 UTC (permalink / raw)
  To: Bandan Das
  Cc: Stephen Hemminger, Jan Lübbe, Scot Doyle, Hiroaki SHIMODA,
	netdev
In-Reply-To: <20110412164557.GF2047@stratus.com>

On Tuesday, 12 April 2011 at 12:45 -0400, Bandan Das wrote:
> On  0, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tuesday, 12 April 2011 at 09:20 -0700, Stephen Hemminger wrote:
> > 
> > > I think the confusion is that IPCB(skb) is not the IP header but
> > > scratch space used during IP header processing. Before the sanitize
> > > patch the CB was cleared.
> > > 
> > > Acked-by: Stephen Hemminger <shemminger@vyatta.com>
> > 
> > Should we clear it also in br_nf_dev_queue_xmit(), since we did this
> > prior to commit 462fb2af9788a8 ?
> > 
> > Thanks !
> > 
> Wouldn't that clear out any valid IP options if it were there ? I think
> that was the whole point of adding br_parse_ip_options :
> 
> /* BUG: Should really parse the IP options here. */
> memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> 
>  
> 

Oh yes, I missed br_nf_dev_queue_xmit() called br_parse_ip_options() and
not ip_options_compile()

I'll submit an official patch, thanks !




* [PATCH] bridge: reset IPCB in br_parse_ip_options
From: Eric Dumazet @ 2011-04-12 17:18 UTC (permalink / raw)
  To: David Miller
  Cc: Stephen Hemminger, Jan Lübbe, Scot Doyle, Hiroaki SHIMODA,
	netdev, Bandan Das
In-Reply-To: <1302627281.3233.70.camel@edumazet-laptop>

Commit 462fb2af9788a82 (bridge : Sanitize skb before it enters the IP
stack), missed one IPCB init before calling ip_options_compile()

Thanks to Scot Doyle for his tests and bug reports.

Reported-by: Scot Doyle <lkml@scotdoyle.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Acked-by: Bandan Das <bandan.das@stratus.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jan Lübbe <jluebbe@debian.org>
---
 net/bridge/br_netfilter.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 008ff6c..b353f7c 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
 		goto drop;
 	}
 
-	/* Zero out the CB buffer if no options present */
-	if (iph->ihl == 5) {
-		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	if (iph->ihl == 5)
 		return 0;
-	}
 
 	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
 	if (ip_options_compile(dev_net(dev), opt, skb))




* Re: [RFC] iproute2: Fix meta match u32 with 0xffffffff
From: Thomas Graf @ 2011-04-12 17:22 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20110412081907.4f2e21fd@nehalam>

On Tue, 2011-04-12 at 08:19 -0700, Stephen Hemminger wrote:

> > 
> > This is definitely much better but we still can't parse ULONG_MAX
> > as string representative. Checking glibc docs, the only way to do it is
> > to ignore the return value for error checking and look errno.
> > 
> 
> I think the error case is ret == ULONG_MAX && errno == ERANGE
> If there is no error, then strtoul doesn't set errno.

That's ok too but your patch adds ret == ULONG_MAX || errno == ERANGE
which will not allow parsing ULONG_MAX as a string. You probably
still have to clear errno to 0 before calling strtoul in case
ULONG_MAX is meant as a legitimate return value but a previous glibc
function has left errno set to ERANGE.
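
The check described here can be sketched as follows. parse_ulong() is
a hypothetical helper on a NUL-terminated string, not the iproute2
bstrtoul(), and the rejection of trailing characters is an extra
assumption about what callers want:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse s into *out. Returns 0 on success, -1 on error, so that
 * ULONG_MAX stays available as an ordinary value.
 */
static int parse_ulong(const char *s, unsigned long *out)
{
	char *end;
	unsigned long val;

	assert(s != NULL && out != NULL);

	errno = 0;	/* a stale ERANGE from an earlier call must not leak in */
	val = strtoul(s, &end, 0);

	/* Overflow is signalled by ULONG_MAX together with ERANGE. */
	if (val == ULONG_MAX && errno == ERANGE)
		return -1;

	/* No digits consumed, or trailing garbage after the number. */
	if (end == s || *end != '\0')
		return -1;

	*out = val;
	return 0;
}
```

Because the error indication travels in the return code, a string that
really spells ULONG_MAX parses successfully, while an out-of-range
string is still rejected.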


* Re: [RFC] iproute2: Fix meta match u32 with 0xffffffff
From: Stephen Hemminger @ 2011-04-12 17:31 UTC (permalink / raw)
  To: tgraf; +Cc: netdev
In-Reply-To: <1302628925.3664.20.camel@lsx>

On Tue, 12 Apr 2011 19:22:05 +0200
Thomas Graf <tgraf@redhat.com> wrote:

> On Tue, 2011-04-12 at 08:19 -0700, Stephen Hemminger wrote:
> 
> > > 
> > > This is definitely much better but we still can't parse ULONG_MAX
> > > as string representative. Checking glibc docs, the only way to do it is
> > > to ignore the return value for error checking and look errno.
> > > 
> > 
> > I think the error case is ret == ULONG_MAX && errno == ERANGE
> > If there is no error, then strtoul doesn't set errno.
> 
> That's ok too but your patch adds ret == ULONG_MAX || errno == ERANGE
> which will not allow parsing ULONG_MAX as a string. You probably
> still have to clear errno to 0 before calling strtoul in case
> ULONG_MAX is meant as a legitimate return value but a previous glibc
> function has left errno set to ERANGE.
> 

I changed to && in final version.
Still needs more work because most of the code assumes that
unsigned long can be used for mask etc, when in fact the code only
allows for u32. 
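
One possible direction for that follow-up work, sketched under the
assumption that callers would rather receive a value already checked
against the 32-bit range. parse_u32() is a hypothetical helper and is
not part of this patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse s into a 32-bit value. Returns 0 on success, -1 on error. */
static int parse_u32(const char *s, uint32_t *out)
{
	char *end;
	unsigned long val;

	assert(s != NULL && out != NULL);

	errno = 0;
	val = strtoul(s, &end, 0);
	if (errno == ERANGE || end == s || *end != '\0')
		return -1;

	/* Only reachable where unsigned long is wider than 32 bits. */
	if (val > UINT32_MAX)
		return -1;

	*out = (uint32_t)val;
	return 0;
}
```

On a 64-bit build this rejects 0x100000000 instead of silently
truncating it when the result is stored in a u32 mask or value.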

From 63da1ed89970af18383c7318deb38bdb062fe74e Mon Sep 17 00:00:00 2001
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Mon, 11 Apr 2011 11:48:55 -0700
Subject: [PATCH 1/2] Fix meta match u32 with 0xffffffff

The value 0xffffffff is a valid mask and bstrtoul() would return
ULONG_MAX which was the error value.
---
 tc/em_cmp.c   |   12 ++++--------
 tc/em_meta.c  |    9 +++------
 tc/em_nbyte.c |    6 ++----
 tc/em_u32.c   |   19 +++++++++----------
 tc/m_ematch.c |   17 +++++++++++------
 tc/m_ematch.h |    2 +-
 6 files changed, 30 insertions(+), 35 deletions(-)

diff --git a/tc/em_cmp.c b/tc/em_cmp.c
index 6addce0..af3e591 100644
--- a/tc/em_cmp.c
+++ b/tc/em_cmp.c
@@ -69,8 +69,7 @@ static int cmp_parse_eopt(struct nlmsghdr *n, struct tcf_ematch_hdr *hdr,
 				return PARSE_ERR(a, "cmp: missing argument");
 			a = bstr_next(a);
 
-			offset = bstrtoul(a);
-			if (offset == ULONG_MAX)
+			if (bstrtoul(a, &offset) < 0)
 				return PARSE_ERR(a, "cmp: invalid offset, " \
 				    "must be numeric");
 
@@ -82,8 +81,7 @@ static int cmp_parse_eopt(struct nlmsghdr *n, struct tcf_ematch_hdr *hdr,
 
 			layer = parse_layer(a);
 			if (layer == INT_MAX) {
-				layer = bstrtoul(a);
-				if (layer == ULONG_MAX)
+				if (bstrtoul(a, &layer) < 0)
 					return PARSE_ERR(a, "cmp: invalid " \
 					    "layer");
 			}
@@ -96,8 +94,7 @@ static int cmp_parse_eopt(struct nlmsghdr *n, struct tcf_ematch_hdr *hdr,
 				return PARSE_ERR(a, "cmp: missing argument");
 			a = bstr_next(a);
 
-			mask = bstrtoul(a);
-			if (mask == ULONG_MAX)
+			if (bstrtoul(a, &mask) < 0)
 				return PARSE_ERR(a, "cmp: invalid mask");
 		} else if (!bstrcmp(a, "trans")) {
 			cmp.flags |= TCF_EM_CMP_TRANS;
@@ -115,8 +112,7 @@ static int cmp_parse_eopt(struct nlmsghdr *n, struct tcf_ematch_hdr *hdr,
 				return PARSE_ERR(a, "cmp: missing argument");
 			a = bstr_next(a);
 
-			value = bstrtoul(a);
-			if (value == ULONG_MAX)
+			if (bstrtoul(a, &value) < 0)
 				return PARSE_ERR(a, "cmp: invalid value");
 
 			value_present = 1;
diff --git a/tc/em_meta.c b/tc/em_meta.c
index 033e29f..276223a 100644
--- a/tc/em_meta.c
+++ b/tc/em_meta.c
@@ -260,8 +260,7 @@ parse_object(struct bstr *args, struct bstr *arg, struct tcf_meta_val *obj,
 		return bstr_next(arg);
 	}
 
-	num = bstrtoul(arg);
-	if (num != ULONG_MAX) {
+	if (bstrtoul(arg, &num) == 0) {
 		obj->kind = TCF_META_TYPE_INT << 12;
 		obj->kind |= TCF_META_ID_VALUE;
 		*dst = (unsigned long) num;
@@ -318,8 +317,7 @@ compatible:
 			}
 			a = bstr_next(a);
 
-			shift = bstrtoul(a);
-			if (shift == ULONG_MAX) {
+			if (bstrtoul(a, &shift) < 0) {
 				PARSE_ERR(a, "meta: invalid shift, must " \
 				    "be numeric");
 				return PARSE_FAILURE;
@@ -336,8 +334,7 @@ compatible:
 			}
 			a = bstr_next(a);
 
-			mask = bstrtoul(a);
-			if (mask == ULONG_MAX) {
+			if (bstrtoul(a, &mask) < 0) {
 				PARSE_ERR(a, "meta: invalid mask, must be " \
 				    "numeric");
 				return PARSE_FAILURE;
diff --git a/tc/em_nbyte.c b/tc/em_nbyte.c
index 87f3e9d..9a52ffc 100644
--- a/tc/em_nbyte.c
+++ b/tc/em_nbyte.c
@@ -63,8 +63,7 @@ static int nbyte_parse_eopt(struct nlmsghdr *n, struct tcf_ematch_hdr *hdr,
 				return PARSE_ERR(a, "nbyte: missing argument");
 			a = bstr_next(a);
 
-			offset = bstrtoul(a);
-			if (offset == ULONG_MAX)
+			if (bstrtoul(a, &offset) < 0)
 				return PARSE_ERR(a, "nbyte: invalid offset, " \
 				    "must be numeric");
 
@@ -76,8 +75,7 @@ static int nbyte_parse_eopt(struct nlmsghdr *n, struct tcf_ematch_hdr *hdr,
 
 			layer = parse_layer(a);
 			if (layer == INT_MAX) {
-				layer = bstrtoul(a);
-				if (layer == ULONG_MAX)
+				if (bstrtoul(a, &layer) < 0)
 					return PARSE_ERR(a, "nbyte: invalid " \
 					    "layer");
 			}
diff --git a/tc/em_u32.c b/tc/em_u32.c
index 21ed70f..88b5fa1 100644
--- a/tc/em_u32.c
+++ b/tc/em_u32.c
@@ -33,6 +33,7 @@ static void u32_print_usage(FILE *fd)
 	    "Example: u32(u16 0x1122 0xffff at nexthdr+4)\n");
 }
 
+
 static int u32_parse_eopt(struct nlmsghdr *n, struct tcf_ematch_hdr *hdr,
 			  struct bstr *args)
 {
@@ -62,16 +63,14 @@ static int u32_parse_eopt(struct nlmsghdr *n, struct tcf_ematch_hdr *hdr,
 	if (a == NULL)
 		return PARSE_ERR(a, "u32: missing key");
 
-	key = bstrtoul(a);
-	if (key == ULONG_MAX)
+	if (bstrtoul(a, &key) < 0)
 		return PARSE_ERR(a, "u32: invalid key, must be numeric");
 
 	a = bstr_next(a);
 	if (a == NULL)
 		return PARSE_ERR(a, "u32: missing mask");
 
-	mask = bstrtoul(a);
-	if (mask == ULONG_MAX)
+	if (bstrtoul(a, &mask) < 0)
 		return PARSE_ERR(a, "u32: invalid mask, must be numeric");
 
 	a = bstr_next(a);
@@ -92,12 +91,12 @@ static int u32_parse_eopt(struct nlmsghdr *n, struct tcf_ematch_hdr *hdr,
 		a = bstr_next(a);
 		if (a == NULL)
 			return PARSE_ERR(a, "u32: missing offset");
-		offset = bstrtoul(a);
-	} else
-		offset = bstrtoul(a);
-
-	if (offset == ULONG_MAX)
-		return PARSE_ERR(a, "u32: invalid offset");
+		if (bstrtoul(a, &offset) < 0)
+			return PARSE_ERR(a, "u32: invalid offset");
+	} else {
+		if (bstrtoul(a, &offset) < 0)
+			return PARSE_ERR(a, "u32: invalid offset");
+	}
 
 	if (a->next)
 		return PARSE_ERR(a->next, "u32: unexpected trailer");
diff --git a/tc/m_ematch.c b/tc/m_ematch.c
index 4c3acf8..b7eed18 100644
--- a/tc/m_ematch.c
+++ b/tc/m_ematch.c
@@ -510,20 +510,25 @@ struct bstr * bstr_alloc(const char *text)
 	return b;
 }
 
-unsigned long bstrtoul(const struct bstr *b)
+int bstrtoul(const struct bstr *b, unsigned long *lp)
 {
 	char *inv = NULL;
-	unsigned long l;
 	char buf[b->len+1];
 
+	if (b->len == 0)
+		return -EINVAL;
+
 	memcpy(buf, b->data, b->len);
 	buf[b->len] = '\0';
 
-	l = strtoul(buf, &inv, 0);
-	if (l == ULONG_MAX || inv == buf)
-		return ULONG_MAX;
+	*lp = strtoul(buf, &inv, 0);
+	if (inv == buf)
+		return -EINVAL;
+
+	if (*lp == ULONG_MAX && errno == ERANGE)
+		return -ERANGE;
 
-	return l;
+	return 0;
 }
 
 void bstr_print(FILE *fd, const struct bstr *b, int ascii)
diff --git a/tc/m_ematch.h b/tc/m_ematch.h
index 5036e9b..e676290 100644
--- a/tc/m_ematch.h
+++ b/tc/m_ematch.h
@@ -49,7 +49,7 @@ static inline struct bstr *bstr_next(struct bstr *b)
 	return b->next;
 }
 
-extern unsigned long bstrtoul(const struct bstr *b);
+extern int bstrtoul(const struct bstr *b, unsigned long *lp);
 extern void bstr_print(FILE *fd, const struct bstr *b, int ascii);
 
 
-- 
1.7.1
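
The change above separates the parsed value from the error indication. A
minimal standalone sketch of that pattern follows. It is not the iproute2
code: struct counted_str and parse_ulong() are hypothetical names standing
in for struct bstr and bstrtoul(), a fixed buffer replaces the
variable-length array, and errno is cleared before strtoul() on the
assumption that a stale ERANGE could otherwise be misread as overflow.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bstr: length-counted, not NUL-terminated. */
struct counted_str {
	const char *data;
	size_t len;
};

/*
 * Parse s as an unsigned long (base chosen by prefix, as strtoul base 0 does).
 * Returns 0 and stores the value in *out on success, -EINVAL if no digits
 * were consumed or the input does not fit the buffer, -ERANGE on overflow.
 */
static int parse_ulong(const struct counted_str *s, unsigned long *out)
{
	char buf[64];
	char *end = NULL;
	unsigned long val;

	if (s->len == 0 || s->len >= sizeof(buf))
		return -EINVAL;

	memcpy(buf, s->data, s->len);
	buf[s->len] = '\0';

	errno = 0;	/* strtoul() leaves errno untouched on success */
	val = strtoul(buf, &end, 0);
	if (end == buf)
		return -EINVAL;
	if (val == ULONG_MAX && errno == ERANGE)
		return -ERANGE;

	*out = val;	/* written only on success */
	return 0;
}
```

With this shape, 0xffffffff parses successfully even where unsigned long
is 32 bits wide, because ULONG_MAX alone is no longer treated as failure.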


^ permalink raw reply related

* [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Bryan Schumaker @ 2011-04-12 17:41 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Myklebust, Trond, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mm-commits-u79uwXL29TY76Z2rM5mHXA, ML netdev,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA, Jiri Slaby
In-Reply-To: <4DA36DB6.8060108-AlSwsSmVLrQ@public.gmane.org>

On 04/11/2011 05:08 PM, Jiri Slaby wrote:
> 
> Sorry for an extra message. I've just found out that these messages
> appear in dmesg:
> [   58.656048] RPC: AUTH_GSS upcall timed out.
> [   58.656050] Please check user daemon is running.
> [   88.656065] RPC: AUTH_GSS upcall timed out.
> [   88.656068] Please check user daemon is running.
> [  118.656077] RPC: AUTH_GSS upcall timed out.
> [  118.656080] Please check user daemon is running.
> [  148.656049] RPC: AUTH_GSS upcall timed out.
> [  148.656052] Please check user daemon is running.
> [  178.656046] RPC: AUTH_GSS upcall timed out.
> [  178.656049] Please check user daemon is running.
> 
> 
> I instrumented the code and it's stuck with trying RPC_AUTH_GSS_KRB5.
> 
> I don't use GSS at all.
> 
> regards,

Does this patch help?

- Bryan



There can be an infinite loop if gss_create_upcall() is called without
the userspace program running.  To prevent this, we return -EACCES if
we notice that pipe_version hasn't changed (indicating that the pipe
has not been opened).

Signed-off-by: Bryan Schumaker <bjschuma-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
--
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9bf41ea..8a03ee0 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2224,8 +2224,9 @@ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
 
 	for (i = 0; i < len; i++) {
 		status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
-		if (status != -EPERM)
-			break;
+		if (status == -EPERM || status == -EACCES)
+			continue;
+		break;
 	}
 	if (status == 0)
 		status = nfs4_server_capabilities(server, fhandle);
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index f3914d0..339ba64 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -520,7 +520,7 @@ gss_refresh_upcall(struct rpc_task *task)
 		warn_gssd();
 		task->tk_timeout = 15*HZ;
 		rpc_sleep_on(&pipe_version_rpc_waitqueue, task, NULL);
-		return 0;
+		return -EAGAIN;
 	}
 	if (IS_ERR(gss_msg)) {
 		err = PTR_ERR(gss_msg);
@@ -563,10 +563,12 @@ retry:
 	if (PTR_ERR(gss_msg) == -EAGAIN) {
 		err = wait_event_interruptible_timeout(pipe_version_waitqueue,
 				pipe_version >= 0, 15*HZ);
+		if (pipe_version < 0) {
+			warn_gssd();
+			err = -EACCES;
+		}
 		if (err)
 			goto out;
-		if (pipe_version < 0)
-			warn_gssd();
 		goto retry;
 	}
 	if (IS_ERR(gss_msg)) {


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
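
The control flow of the patch above can be modelled in userspace. The
sketch below is only an illustration under stated assumptions, not kernel
code: daemon_pipe_version, wait_for_daemon(), try_upcall() and
pick_flavor() are hypothetical names, and the timed wait is reduced to a
stub that always returns 0. It shows the two ideas together: an upcall
gives up with -EACCES after one bounded wait if the daemon is still
absent, and the caller moves on to the next flavor on -EPERM or -EACCES.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for pipe_version: negative means no daemon has opened the pipe. */
static int daemon_pipe_version = -1;

/* Hypothetical stand-in for the timed wait; this stub never reports an error. */
static int wait_for_daemon(void)
{
	return 0;
}

/* One upcall attempt: 0 on success, -EACCES if the daemon is still absent after one wait. */
static int try_upcall(int *waits)
{
	for (;;) {
		int err;

		if (daemon_pipe_version >= 0)
			return 0;

		err = wait_for_daemon();
		(*waits)++;
		if (daemon_pipe_version < 0)
			err = -EACCES;	/* still no daemon: fail instead of retrying forever */
		if (err)
			return err;
	}
}

/* Try each flavor in order; only gss_flavor needs the daemon in this model. */
static int pick_flavor(const int *flavors, size_t n, int gss_flavor,
		       int *chosen, int *waits)
{
	int status = -EPERM;
	size_t i;

	for (i = 0; i < n; i++) {
		status = (flavors[i] == gss_flavor) ? try_upcall(waits) : 0;
		if (status == -EPERM || status == -EACCES)
			continue;	/* flavor unusable, try the next one */
		if (status == 0)
			*chosen = flavors[i];
		break;
	}
	return status;
}
```

The waits counter makes one property of this design visible: each failed
upcall still costs one full wait before the next flavor is tried.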

^ permalink raw reply related

* Re: [net-next PATCH 1/3] vxge: always enable hardware time stamp
From: David Miller @ 2011-04-12 18:01 UTC (permalink / raw)
  To: jdmason; +Cc: netdev
In-Reply-To: <20110412153604.GA1433@kudzu.us>

From: Jon Mason <jdmason@kudzu.us>
Date: Tue, 12 Apr 2011 10:36:06 -0500

> On Sun, Apr 10, 2011 at 06:58:45PM -0700, David Miller wrote:
>> From: Jon Mason <jdmason@kudzu.us>
>> Date: Fri,  8 Apr 2011 16:11:21 -0500
>> 
>> > Hardware time stamp calculation can only be enabled by the privileged
>> > function. Enable it always by default and simply use the ethtool
>> > interface to set a flag to indicate whether or not the respective
>> > function driver should indicate the timestamp along with the received
>> > packet.
>> > 
>> > Also, make certain fields in vxge_hw_device_config bit-fields to reduce
>> > the size of the struct.
>> > 
>> > Signed-off-by: Jon Mason <jdmason@kudzu.us>
>> 
>> Doesn't this have some performance or latency impact?
> 
> It is all done in hardware by replacing the CRC with the HWTS value.
> So, no perf or latency issues there.  It still only handles the HWTS
> in receive if it is enabled in software via the ioctl. 

Ok, thanks for the clarification, I'll apply this patch set.

^ permalink raw reply

* Re: [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Jiri Slaby @ 2011-04-12 18:05 UTC (permalink / raw)
  To: Bryan Schumaker
  Cc: Jiri Slaby, Myklebust, Trond, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mm-commits-u79uwXL29TY76Z2rM5mHXA, ML netdev,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4DA48EB0.40600-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>

On 04/12/2011 07:41 PM, Bryan Schumaker wrote:
> On 04/11/2011 05:08 PM, Jiri Slaby wrote:
>>
>> Sorry for an extra message. I've just found out that these messages
>> appear in dmesg:
>> [   58.656048] RPC: AUTH_GSS upcall timed out.
>> [   58.656050] Please check user daemon is running.
>> [   88.656065] RPC: AUTH_GSS upcall timed out.
>> [   88.656068] Please check user daemon is running.
>> [  118.656077] RPC: AUTH_GSS upcall timed out.
>> [  118.656080] Please check user daemon is running.
>> [  148.656049] RPC: AUTH_GSS upcall timed out.
>> [  148.656052] Please check user daemon is running.
>> [  178.656046] RPC: AUTH_GSS upcall timed out.
>> [  178.656049] Please check user daemon is running.
>>
>>
>> I instrumented the code and it's stuck with trying RPC_AUTH_GSS_KRB5.
>>
>> I don't use GSS at all.
>>
>> regards,
> 
> Does this patch help?
> 
> - Bryan
> 
> 
> 
> There can be an infinite loop if gss_create_upcall() is called without
> the userspace program running.  To prevent this, we return -EACCES if
> we notice that pipe_version hasn't changed (indicating that the pipe
> has not been opened).

Yes, it fixes the problem. But it waits 15s before it times out. This is
unacceptable for automounted NFS dirs.

thanks,
-- 
js
suse labs

^ permalink raw reply
