Netdev List


* Re: [PATCH net 1/1] 8139cp: revert "set ring address before enabling receiver"
From: David Miller @ 2012-11-22  4:39 UTC
  To: jgarzik; +Cc: romieu, netdev, dwmw2, jasowang, gilboad
In-Reply-To: <50AD9F82.6030609@pobox.com>

From: Jeff Garzik <jgarzik@pobox.com>
Date: Wed, 21 Nov 2012 22:44:02 -0500

> On 11/21/2012 03:07 PM, Francois Romieu wrote:
>> This patch reverts b01af4579ec41f48e9b9c774e70bd6474ad210db.
>>
>> The original patch was tested with emulated hardware. Real
>> hardware chokes.
>>
>> Fixes https://bugzilla.kernel.org/show_bug.cgi?id=47041
>>
>> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
> 
> Acked-by: Jeff Garzik <jgarzik@redhat.com>

I'm definitely not applying the revert, because it hurts the largest
user of this driver, which is the virtual hardware.

David Woodhouse's patch is the only reasonable way forward at this
point, so I'd appreciate it if you'd give his patch a review too.

Thank you.


* Re: [PATCH] net: ipv6: change %8s to %s for rt->dst.dev->name in seq_printf of rt6_info_route
From: Shan Wei @ 2012-11-22  5:28 UTC
  To: Chen Gang; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50AD9351.5020805@asianux.com>

Hi Chen Gang,

For device names shorter than 8 characters,
your patch changes the output from right-aligned
to left-aligned. But we have kept this style at least
since 2005 (the start of the git history).
It has probably been right-aligned since this code was born. :-)

Why *should* we change this style?
Just to be consistent with the case where the device
name is longer than 8 characters?

Both the old naming scheme (e.g. eth0, eth1) and the new naming scheme
based on PCI address (e.g. em1, p3p1) mostly give names shorter than 8 characters.
Shouldn't we pay more attention to the case of names shorter than 8 characters?

In addition, if we want to add a new field in the future,
right alignment is a better choice.


Chen Gang said, at 2012/11/22 10:52:
> Hi Shan Wei, Eric Dumazet
> 
>   is this patch integrated into main branch ?
>   if need me for additional completion (such as: merge another 2 trivial patches into this patch, too)
>   please tell me, I will do. 
> 
>   I understand you are working overtime, maybe no time for any minor and trivial patches.
>   if surely it is, I think:
>     you can modify these code manually, and obsolete these minor and trivial patches which I provided.
>     I do not mind whether mention me in another new patches (you can mention me or not mention me, both are OK).
>     since our goal is to provide contributes to outside, efficiently.
> 
>  regards
> 
> gchen
> 
> 
> On 2012-11-05 11:02, Chen Gang wrote:
>>
>> 1. not to send same patch triple times. 
> 
>   thanks, I shall notice, next time.
>   (I shall 'believe' another members).
> 
>> 2. config your email client,because tab is changed to space.
>>    you can read Documentation/email-clients.txt.
> 
>   1) thanks. I shall notice, next time.
>   2) now, I use gvim as extension editor for Thunderbird
>   3) the patch is generated by `git format-patch -s --summary --stat`
>      it use "' '\t" as head, I do not touch it, maybe it is correct.
> 
> welcome any members to giving additional suggestions and completions.
> 
> thanks
> 
> the modified contents are below,
> -----------------------------------------------------------------------------------
> 
>   the length of rt->dst.dev->name is 16 (IFNAMSIZ)
>   in seq_printf, it is not suitable to use %8s for rt->dst.dev->name.
>   so change it to %s, since each line has not been solid any more.
> 
>   additional information:
> 
>     %8s  limit the width, not for the original string output length
>          if name length is more than 8, it still can be fully displayed.
>          if name length is less than 8, the ' ' will be filled before name.
> 
>     %.8s truly limit the original string output length (precision)
> 
> Signed-off-by: Chen Gang <gang.chen@asianux.com>
> ---
>  net/ipv6/route.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index c42650c..b60bc52 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -2835,7 +2835,7 @@ static int rt6_info_route(struct rt6_info *rt, void *p_arg)
>  	} else {
>  		seq_puts(m, "00000000000000000000000000000000");
>  	}
> -	seq_printf(m, " %08x %08x %08x %08x %8s\n",
> +	seq_printf(m, " %08x %08x %08x %08x %s\n",
>  		   rt->rt6i_metric, atomic_read(&rt->dst.__refcnt),
>  		   rt->dst.__use, rt->rt6i_flags,
>  		   rt->dst.dev ? rt->dst.dev->name : "");
> 
> 
> 
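
For reference, the conversions discussed in this thread can be compared
in user space. This is a minimal sketch, assuming snprintf() treats the
field width and precision of %s the same way seq_printf() does; the
fmt_* helper names are made up for illustration and are not kernel
functions.

```c
#include <stdio.h>

/* "%8s": minimum field width of 8, right-aligned, never truncates. */
static int fmt_width8(char *buf, size_t size, const char *name)
{
	return snprintf(buf, size, "%8s", name);
}

/* "%s": no padding and no truncation. */
static int fmt_plain(char *buf, size_t size, const char *name)
{
	return snprintf(buf, size, "%s", name);
}

/* "%.8s": precision of 8, prints at most 8 characters of the name. */
static int fmt_prec8(char *buf, size_t size, const char *name)
{
	return snprintf(buf, size, "%.8s", name);
}

/* "%-8s": minimum field width of 8, left-aligned (padding on the right). */
static int fmt_left8(char *buf, size_t size, const char *name)
{
	return snprintf(buf, size, "%-8s", name);
}
```

With a short name such as "eth0", fmt_width8() pads on the left to eight
columns while fmt_plain() emits the name as-is. A 13-character name
passes through "%8s" untruncated; only "%.8s" cuts it to eight
characters.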


* [PATCH net] bnx2x: remove redundant warning log
From: Ariel Elior @ 2012-11-22 17:16 UTC
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein

Fix a bug where a register that is only meant to be read on 578xx/57712
devices causes a bogus error message to be logged when it is read on
other devices.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |   11 +++++++----
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index bd1fd3d..01611b3 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -9545,10 +9545,13 @@ static int __devinit bnx2x_prev_unload_common(struct bnx2x *bp)
  */
 static void __devinit bnx2x_prev_interrupted_dmae(struct bnx2x *bp)
 {
-	u32 val = REG_RD(bp, PGLUE_B_REG_PGLUE_B_INT_STS);
-	if (val & PGLUE_B_PGLUE_B_INT_STS_REG_WAS_ERROR_ATTN) {
-		BNX2X_ERR("was error bit was found to be set in pglueb upon startup. Clearing");
-		REG_WR(bp, PGLUE_B_REG_WAS_ERROR_PF_7_0_CLR, 1 << BP_FUNC(bp));
+	if (!CHIP_IS_E1x(bp)) {
+		u32 val = REG_RD(bp, PGLUE_B_REG_PGLUE_B_INT_STS);
+		if (val & PGLUE_B_PGLUE_B_INT_STS_REG_WAS_ERROR_ATTN) {
+			BNX2X_ERR("was error bit was found to be set in pglueb upon startup. Clearing");
+			REG_WR(bp, PGLUE_B_REG_WAS_ERROR_PF_7_0_CLR,
+			       1 << BP_FUNC(bp));
+		}
 	}
 }
 
-- 
1.7.9.GIT


* [PATCH] sctp: fix -ENOMEM result with invalid user space pointer in sendto() syscall
From: Tommi Rantala @ 2012-11-22 13:23 UTC
  To: linux-sctp, netdev
  Cc: Neil Horman, Vlad Yasevich, Sridhar Samudrala, David S. Miller,
	Dave Jones, Tommi Rantala

Consider the following program, that sets the second argument to the
sendto() syscall incorrectly:

 #include <string.h>
 #include <arpa/inet.h>
 #include <sys/socket.h>

 int main(void)
 {
         int fd;
         struct sockaddr_in sa;

         fd = socket(AF_INET, SOCK_STREAM, 132 /*IPPROTO_SCTP*/);
         if (fd < 0)
                 return 1;

         memset(&sa, 0, sizeof(sa));
         sa.sin_family = AF_INET;
         sa.sin_addr.s_addr = inet_addr("127.0.0.1");
         sa.sin_port = htons(11111);

         sendto(fd, NULL, 1, 0, (struct sockaddr *)&sa, sizeof(sa));

         return 0;
 }

We get -ENOMEM:

 $ strace -e sendto ./demo
 sendto(3, NULL, 1, 0, {sa_family=AF_INET, sin_port=htons(11111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ENOMEM (Cannot allocate memory)

Propagate the error code from sctp_user_addto_chunk(), so that we will
tell user space what actually went wrong:

 $ strace -e sendto ./demo
 sendto(3, NULL, 1, 0, {sa_family=AF_INET, sin_port=htons(11111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EFAULT (Bad address)

Noticed while running Trinity (the syscall fuzzer).

Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
---
 net/sctp/chunk.c  |   13 +++++++++----
 net/sctp/socket.c |    4 ++--
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index d241ef5..3952ca9 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -183,7 +183,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 
 	msg = sctp_datamsg_new(GFP_KERNEL);
 	if (!msg)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	/* Note: Calculate this outside of the loop, so that all fragments
 	 * have the same expiration.
@@ -280,8 +280,11 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 
 		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag, 0);
 
-		if (!chunk)
+		if (!chunk) {
+			err = -ENOMEM;
 			goto errout;
+		}
+
 		err = sctp_user_addto_chunk(chunk, offset, len, msgh->msg_iov);
 		if (err < 0)
 			goto errout_chunk_put;
@@ -315,8 +318,10 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 
 		chunk = sctp_make_datafrag_empty(asoc, sinfo, over, frag, 0);
 
-		if (!chunk)
+		if (!chunk) {
+			err = -ENOMEM;
 			goto errout;
+		}
 
 		err = sctp_user_addto_chunk(chunk, offset, over,msgh->msg_iov);
 
@@ -342,7 +347,7 @@ errout:
 		sctp_chunk_free(chunk);
 	}
 	sctp_datamsg_put(msg);
-	return NULL;
+	return ERR_PTR(err);
 }
 
 /* Check whether this message has expired. */
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index a60d1f8..406d957 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1915,8 +1915,8 @@ SCTP_STATIC int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 
 	/* Break the message into multiple chunks of maximum size. */
 	datamsg = sctp_datamsg_from_user(asoc, sinfo, msg, msg_len);
-	if (!datamsg) {
-		err = -ENOMEM;
+	if (IS_ERR(datamsg)) {
+		err = PTR_ERR(datamsg);
 		goto out_free;
 	}
 
-- 
1.7.9.5
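
The error-pointer convention used by this patch can be sketched in user
space. This is an illustration only: ERR_PTR(), IS_ERR() and PTR_ERR()
below are simplified stand-ins for the kernel helpers in
include/linux/err.h, and msg_from_user() and send_msg() are hypothetical
analogues of sctp_datamsg_from_user() and sctp_sendmsg(), not the real
functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified user-space stand-ins for the kernel's error-pointer helpers.
 * The top MAX_ERRNO addresses are reserved to encode negative errnos. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct msg {
	size_t len;
};

/* Hypothetical producer: reports *why* it failed instead of returning NULL. */
static struct msg *msg_from_user(const void *ubuf, size_t len)
{
	struct msg *m;

	if (!ubuf)
		return ERR_PTR(-EFAULT);	/* bad user pointer */

	m = malloc(sizeof(*m));
	if (!m)
		return ERR_PTR(-ENOMEM);	/* genuine allocation failure */

	m->len = len;
	return m;
}

/* Hypothetical caller: forwards the encoded errno unchanged. */
static int send_msg(const void *ubuf, size_t len)
{
	struct msg *m = msg_from_user(ubuf, len);

	if (IS_ERR(m))
		return (int)PTR_ERR(m);

	free(m);
	return 0;
}
```

In this sketch send_msg(NULL, 1) yields -EFAULT rather than a blanket
-ENOMEM, which mirrors the change in the strace output above.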


* Re: VXLAN multicast receive not working
From: Bernhard Schmidt @ 2012-11-22  0:09 UTC
  To: netdev
In-Reply-To: <20121122000525.GA3447@fliwatuet.svr02.mucip.net>

On Thu, Nov 22, 2012 at 01:05:25AM +0100, Bernhard Schmidt wrote:

> The same VXLAN domain is defined on the Nexus 1000V and a VM is attached
> to it. When I send some broadcast traffic down vxlan0 (i.e. ping
> 10.1.1.2 which generates an ARP request) the VM sees the packet just
> fine.
> 
> When I do it the other way around (the VM sends a broadcast ARP for
> 10.1.1.3) I see a packet coming into eth1 on the multicast group, but
> vxlan0 stays silent. 

I think I found a possible reason, my vxlan interface is on top of eth1

7: vxlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue
state UNKNOWN mode DEFAULT 
    link/ether 96:06:c6:cf:a0:2e brd ff:ff:ff:ff:ff:ff
    vxlan id 12340 group 239.0.0.42 dev eth1 port 32768 61000 ageing 300 

but the multicast group is joined only on eth0

root@lxbscDA-VXLAN:~/iproute2# ip maddr
1:	lo
	inet  224.0.0.1
	inet6 ff02::1
2:	eth0
	link  33:33:00:00:00:01
	link  01:00:5e:00:00:01
	link  33:33:ff:8e:0d:c7
	link  01:00:5e:00:00:2a
	inet  239.0.0.42
	inet  224.0.0.1
	inet6 ff02::1:ff8e:dc7 users 2
	inet6 ff02::1
4:	eth1
	link  33:33:00:00:00:01
	link  01:00:5e:00:00:01
	link  33:33:ff:8e:0d:c8
	inet  224.0.0.1
	inet6 ff02::1:ff8e:dc8
	inet6 ff02::1
7:	vxlan0
	link  33:33:00:00:00:01
	link  01:00:5e:00:00:01
	link  33:33:ff:cf:a0:2e
	inet  224.0.0.1
	inet6 ff02::1:ffcf:a02e
	inet6 ff02::1

At first glance I did not spot the issue with my two weeks of C, so
someone else has to hunt this down.

Regards,
Bernhard
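
A multicast join is always tied to one interface, which is why a join
that lands on eth0 does not help traffic arriving on eth1. As a
user-space illustration (not the vxlan driver's own code path), the
sketch below pins an IPv4 join to a named interface through struct
ip_mreqn. fill_mreqn() and join_group_on() are hypothetical helper
names, and a Linux/glibc system is assumed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes struct ip_mreqn on glibc */
#endif
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill a membership request for `group`, bound to interface index
 * `ifindex`. Returns 0 on success, -1 if `group` is not a valid
 * dotted-quad IPv4 address. */
static int fill_mreqn(struct ip_mreqn *mreq, const char *group, int ifindex)
{
	memset(mreq, 0, sizeof(*mreq));
	if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
		return -1;
	mreq->imr_ifindex = ifindex;	/* pins the join to one device */
	return 0;
}

/* Join `group` on the interface named `ifname` (for example "eth1").
 * Returns 0 on success, -1 on failure. */
static int join_group_on(int fd, const char *group, const char *ifname)
{
	struct ip_mreqn mreq;
	unsigned int ifindex = if_nametoindex(ifname);

	if (ifindex == 0 || fill_mreqn(&mreq, group, (int)ifindex) < 0)
		return -1;

	return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
			  &mreq, sizeof(mreq));
}
```

With a UDP socket fd, join_group_on(fd, "239.0.0.42", "eth1") would
request the group on eth1 explicitly instead of leaving the choice of
interface to the routing table.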


* VXLAN multicast receive not working
From: Bernhard Schmidt @ 2012-11-22  0:05 UTC
  To: netdev

[Apologies if you receive this twice, my original mail seems to be lost]

Hello,

I'm just trying to play with VXLAN a bit and wanted to build a Linux
gateway routing into separate VXLAN segments.

Debian Wheezy, running 3.7-rc6, with current git HEAD of iproute2.
It's a VMware VM but that should not matter much.

Two vmxnet3 NICs, one with management and one with my VXLAN transport
network.

4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UNKNOWN qlen 1000
    link/ether 00:50:56:8e:0d:c8 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.250/24 scope global eth1
    inet6 fe80::250:56ff:fe8e:dc8/64 scope link 
       valid_lft forever preferred_lft forever

In the same network segment are two VMware ESXi 5.0 hosts with Nexus
1000V for VLAN termination (10.0.0.1 and 10.0.0.2)

On top of that there is a VXLAN interface defined, with ID 12340 and
group 239.0.0.42.

6: vxlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue
state UNKNOWN mode DEFAULT 
    link/ether f6:59:e7:db:82:92 brd ff:ff:ff:ff:ff:ff
    vxlan id 12340 group 239.0.0.42 dev eth1 port 32768 61000 ageing 300 

That interface has an address as well

6: vxlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue
state UNKNOWN 
    link/ether f6:59:e7:db:82:92 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.1/24 scope global vxlan0
    inet6 fe80::f459:e7ff:fedb:8292/64 scope link 
       valid_lft forever preferred_lft forever

The same VXLAN domain is defined on the Nexus 1000V and a VM is attached
to it. When I send some broadcast traffic down vxlan0 (i.e. ping
10.1.1.2 which generates an ARP request) the VM sees the packet just
fine.

When I do it the other way around (the VM sends a broadcast ARP for
10.1.1.3) I see a packet coming into eth1 on the multicast group, but
vxlan0 stays silent. 

I have captured one of those packets. Wireshark does not support
dissecting it yet, but in my eyes the packet is correct. I've put it
online at http://users.birkenwald.de/~berni/temp/vxlan.pcap

Weirdly enough, as soon as I populate the ARP and VXLAN forwarding table
by pinging back from the destination to the source (so the source can
learn both MAC->Nexthop for VXLAN and IP->MAC from the ARP request) it
starts working. 

To summarize, Multicast/Broadcast from N1k to Linux seems to be broken,
the encapsulated packet is seen on the Ethernet but the decapsulated
packet is not seen on vxlan0. Broadcast/Multicast in the other direction
works just fine as well as Unicast in both directions.

Thanks for any pointers,
Bernhard


* Re: [PATCH] net: ipv6: change %8s to %s for rt->dst.dev->name in seq_printf of rt6_info_route
From: Chen Gang @ 2012-11-22  8:37 UTC
  To: Shan Wei; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <50ADB7E7.1000009@gmail.com>

On 2012-11-22 13:28, Shan Wei wrote:
> Hi chen gang:
> 
> For length of device name which less than 8 char，
> your patch changes them to be print from align right 
> to align left. But at least since 2005(git age-time),
> we keep this style so far.
> Maybe, since birth of this code, just align right. :-)
> 

  Originally each line had a fixed output length, defined as "#define
RT6_INFO_LEN (32 + 4 + 32 + 4 + 32 + 40 + 5 + 1)",
  and RHEL5 (kernel-2.6.18-308.20.el5) still uses it.
  It assumes that the length of rt->rt6i_dev->name (in RHEL5) is 8.

> Why we *should* change this style?
> just keep be consistent with the case which length of device
> name greater than 8 char?
> 

  As a fixed length, 8 is not suitable. I first suggested '%16s' (I
call it 'beautiful', but for RHEL5 it is a correctness issue).
  Eric Dumazet then suggested that '%s' is better, since the line no
longer has a fixed length (seq_printf is already used instead of arg->buffer).
  I think what he said is reasonable.

> Not only old name rule i.e. eth0,eth1, but also new name rule
> base on pci address ,i.e. em1,p3p1. most of them are less than 8 char.
> Should not we take more attention on the case less than 8 char?
> 

  I have seen device names longer than 8 characters before.
  I am not quite sure which: maybe they were eth-route* or eth-usb* ...
  I will check in the next few days, so please give me some time.


> By addition, if we want to add new field in the future,
> align right is a better choice.
> 

  Maybe what you suggest is better (still keep it 'beautiful', but use
'%16s' instead of '%8s').

  Eric Dumazet may have his own opinion on this.


 Regards

gchen.

> 
> Chen Gang said, at 2012/11/22 10:52:
>> Hi Shan Wei, Eric Dumazet
>>
>>   is this patch integrated into main branch ?
>>   if need me for additional completion (such as: merge another 2 trivial patches into this patch, too)
>>   please tell me, I will do. 
>>
>>   I understand you are working overtime, maybe no time for any minor and trivial patches.
>>   if surely it is, I think:
>>     you can modify these code manually, and obsolete these minor and trivial patches which I provided.
>>     I do not mind whether mention me in another new patches (you can mention me or not mention me, both are OK).
>>     since our goal is to provide contributes to outside, efficiently.
>>
>>  regards
>>
>> gchen
>>
>>
>> On 2012-11-05 11:02, Chen Gang wrote:
>>>
>>> 1. not to send same patch triple times. 
>>
>>   thanks, I shall notice, next time.
>>   (I shall 'believe' another members).
>>
>>> 2. config your email client,because tab is changed to space.
>>>    you can read Documentation/email-clients.txt.
>>
>>   1) thanks. I shall notice, next time.
>>   2) now, I use gvim as extension editor for Thunderbird
>>   3) the patch is generated by `git format-patch -s --summary --stat`
>>      it use "' '\t" as head, I do not touch it, maybe it is correct.
>>
>> welcome any members to giving additional suggestions and completions.
>>
>> thanks
>>
>> the modified contents are below,
>> -----------------------------------------------------------------------------------
>>
>>   the length of rt->dst.dev->name is 16 (IFNAMSIZ)
>>   in seq_printf, it is not suitable to use %8s for rt->dst.dev->name.
>>   so change it to %s, since each line has not been solid any more.
>>
>>   additional information:
>>
>>     %8s  limit the width, not for the original string output length
>>          if name length is more than 8, it still can be fully displayed.
>>          if name length is less than 8, the ' ' will be filled before name.
>>
>>     %.8s truly limit the original string output length (precision)
>>
>> Signed-off-by: Chen Gang <gang.chen@asianux.com>
>> ---
>>  net/ipv6/route.c |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>> index c42650c..b60bc52 100644
>> --- a/net/ipv6/route.c
>> +++ b/net/ipv6/route.c
>> @@ -2835,7 +2835,7 @@ static int rt6_info_route(struct rt6_info *rt, void *p_arg)
>>  	} else {
>>  		seq_puts(m, "00000000000000000000000000000000");
>>  	}
>> -	seq_printf(m, " %08x %08x %08x %08x %8s\n",
>> +	seq_printf(m, " %08x %08x %08x %08x %s\n",
>>  		   rt->rt6i_metric, atomic_read(&rt->dst.__refcnt),
>>  		   rt->dst.__use, rt->rt6i_flags,
>>  		   rt->dst.dev ? rt->dst.dev->name : "");
>>
>>
>>
> 
> 
> 


-- 
Chen Gang

Asianux Corporation


* Re: [PATCH 080/493] fddi: remove use of __devexit_p
From: Maciej W. Rozycki @ 2012-11-21 23:49 UTC
  To: Greg KH; +Cc: Bill Pemberton, netdev
In-Reply-To: <20121119192949.GA16976@kroah.com>

Greg,

> >  I have unconfused myself now, so please replace the above with the 
> > following question: what about configurations (e.g. buses) that not 
> > support hotplug at all?  For example apart from PCI the defxx driver 
> > concerned here supports the TURBOchannel bus that by design does not have 
> > the concept of live option card removal (no such circuitry).  So should 
> > now the precious memory be wasted on systems that will never ever handle 
> > hotplug?
> 
> CONFIG_HOTPLUG is always enabled now, so that's not an option anymore.
> And again, a user can "hot unbind" a driver from a device from
> userspace, no matter if the bus physically supports it or not.

 Hmm, what purpose does this serve for devices that cannot be physically 
removed?  If there is none, shouldn't that policy be set by individual 
drivers or platform even?  Even if HOTPLUG as a whole is unconditional (I 
suppose the amount of space core support itself takes is much less to what 
driver code can).

 TURBOchannel, although valid, is an old exotic case that might not be 
worth arguing for, except for purity maybe.  But there are surely many 
contemporary systems out there that are known they are never going to 
support hot device replacement.  Consider most of the embedded systems for 
example, where devices may even physically be cast into a single SOC (with 
no prospect of chipping off any pieces ever ;) ), that certainly could not 
care less of device replacement, but they do care a lot about memory 
consumption.

 Was this implication considered, discussed and disregarded as not 
important enough compared to benefits from hardcoding HOTPLUG support?  
I'm seriously asking for a pointer, not trying to cause any stir-up -- 
regrettably I fail to follow most discussions these days, but I would like 
to know what the background behind this decision was.  Thanks a lot!

  Maciej


* [PATCH 1/4] xfrm: remove redundant replay_esn check
From: Steffen Klassert @ 2012-11-22  8:02 UTC
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <1353571362-6774-1-git-send-email-steffen.klassert@secunet.com>

From: Ulrich Weber <ulrich.weber@sophos.com>

x->replay_esn is already checked in the enclosing if clause,
so remove the check and indent properly

Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/xfrm/xfrm_replay.c |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/net/xfrm/xfrm_replay.c b/net/xfrm/xfrm_replay.c
index 3efb07d..765f6fe 100644
--- a/net/xfrm/xfrm_replay.c
+++ b/net/xfrm/xfrm_replay.c
@@ -521,13 +521,12 @@ int xfrm_init_replay(struct xfrm_state *x)
 		    replay_esn->bmp_len * sizeof(__u32) * 8)
 			return -EINVAL;
 
-	if ((x->props.flags & XFRM_STATE_ESN) && replay_esn->replay_window == 0)
-		return -EINVAL;
-
-	if ((x->props.flags & XFRM_STATE_ESN) && x->replay_esn)
-		x->repl = &xfrm_replay_esn;
-	else
-		x->repl = &xfrm_replay_bmp;
+		if (x->props.flags & XFRM_STATE_ESN) {
+			if (replay_esn->replay_window == 0)
+				return -EINVAL;
+			x->repl = &xfrm_replay_esn;
+		} else
+			x->repl = &xfrm_replay_bmp;
 	} else
 		x->repl = &xfrm_replay_legacy;
 
-- 
1.7.9.5


* [PATCH 3/4] xfrm: Use a static gc threshold value for ipv6
From: Steffen Klassert @ 2012-11-22  8:02 UTC
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <1353571362-6774-1-git-send-email-steffen.klassert@secunet.com>

Unlike ipv4 did, ipv6 does not handle the maximum number of cached
routes dynamically. So no need to try to handle the IPsec gc threshold
value dynamically. This patch sets the IPsec gc threshold value back to
1024 routes, as it is for non-IPsec routes.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv6/xfrm6_policy.c |   16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index f3ed8ca..6ce4a4f 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -327,21 +327,7 @@ static struct ctl_table_header *sysctl_hdr;
 int __init xfrm6_init(void)
 {
 	int ret;
-	unsigned int gc_thresh;
-
-	/*
-	 * We need a good default value for the xfrm6 gc threshold.
-	 * In ipv4 we set it to the route hash table size * 8, which
-	 * is half the size of the maximaum route cache for ipv4.  It
-	 * would be good to do the same thing for v6, except the table is
-	 * constructed differently here.  Here each table for a net namespace
-	 * can have FIB_TABLE_HASHSZ entries, so lets go with the same
-	 * computation that we used for ipv4 here.  Also, lets keep the initial
-	 * gc_thresh to a minimum of 1024, since, the ipv6 route cache defaults
-	 * to that as a minimum as well
-	 */
-	gc_thresh = FIB6_TABLE_HASHSZ * 8;
-	xfrm6_dst_ops.gc_thresh = (gc_thresh < 1024) ? 1024 : gc_thresh;
+
 	dst_entries_init(&xfrm6_dst_ops);
 
 	ret = xfrm6_policy_init();
-- 
1.7.9.5
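
The computation being removed amounts to a clamp at 1024. A minimal
sketch, with the hash size taken as a parameter rather than the kernel's
FIB6_TABLE_HASHSZ constant; the function name and the sample sizes in
the note below are illustrative assumptions.

```c
/* Mirrors the removed lines: scale the hash size by 8, but never go
 * below 1024. `fib6_table_hashsz` stands in for FIB6_TABLE_HASHSZ. */
static unsigned int old_xfrm6_gc_thresh(unsigned int fib6_table_hashsz)
{
	unsigned int gc_thresh = fib6_table_hashsz * 8;

	return (gc_thresh < 1024) ? 1024 : gc_thresh;
}
```

For example, a hash size of 1 yields 1024 and a hash size of 256 yields
2048. The patch drops this computation in favour of the static value of
1024 described in the commit message.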


* [PATCH 2/4] net: xfrm: use __this_cpu_read per-cpu helper
From: Steffen Klassert @ 2012-11-22  8:02 UTC
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <1353571362-6774-1-git-send-email-steffen.klassert@secunet.com>

From: Shan Wei <shanwei88@gmail.com>

this_cpu_ptr/this_cpu_read is faster than per_cpu_ptr(p, smp_processor_id())
and can reduce memory accesses.
The latter helper needs to find the offset for the current cpu,
and needs more assembler instructions, as the objdump output below shows.

this_cpu_ptr relocates an address. this_cpu_read() relocates the address
and performs the fetch. this_cpu_read() saves you more instructions
since it can do the relocation and the fetch in one instruction.

per_cpu_ptr(p, smp_processor_id()):
  1e:   65 8b 04 25 00 00 00 00         mov    %gs:0x0,%eax
  26:   48 98                           cltq
  28:   31 f6                           xor    %esi,%esi
  2a:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi
  31:   48 8b 04 c5 00 00 00 00         mov    0x0(,%rax,8),%rax
  39:   c7 44 10 04 14 00 00 00         movl   $0x14,0x4(%rax,%rdx,1)

this_cpu_ptr(p):
  1e:   65 48 03 14 25 00 00 00 00      add    %gs:0x0,%rdx
  27:   31 f6                           xor    %esi,%esi
  29:   c7 42 04 14 00 00 00            movl   $0x14,0x4(%rdx)
  30:   48 c7 c7 00 00 00 00            mov    $0x0,%rdi

Signed-off-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/xfrm/xfrm_ipcomp.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index e5246fb..2906d52 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -276,18 +276,16 @@ static struct crypto_comp * __percpu *ipcomp_alloc_tfms(const char *alg_name)
 	struct crypto_comp * __percpu *tfms;
 	int cpu;
 
-	/* This can be any valid CPU ID so we don't need locking. */
-	cpu = raw_smp_processor_id();
 
 	list_for_each_entry(pos, &ipcomp_tfms_list, list) {
 		struct crypto_comp *tfm;
 
-		tfms = pos->tfms;
-		tfm = *per_cpu_ptr(tfms, cpu);
+		/* This can be any valid CPU ID so we don't need locking. */
+		tfm = __this_cpu_read(*pos->tfms);
 
 		if (!strcmp(crypto_comp_name(tfm), alg_name)) {
 			pos->users++;
-			return tfms;
+			return pos->tfms;
 		}
 	}
 
-- 
1.7.9.5


* pull request: ipsec-next 2012-11-22
From: Steffen Klassert @ 2012-11-22  8:02 UTC
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev

This pull request is intended for net-next and contains the following changes:

1) Remove a redundant check when initializing the xfrm replay functions,
   from Ulrich Weber.
2) Use a faster per-cpu helper when allocating ipcomp transforms,
   from Shan Wei.
3) Use a static gc threshold value for ipv6, similar to what we do
   for ipv4 now.
4) Remove a commented out function call.

Please pull or let me know if there are problems.

Thanks!

The following changes since commit f1e0b5b4f1eae56a3192688177f36e2bdf0e01ac:

  Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge (2012-11-07 19:08:42 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next.git master

for you to fetch changes up to 0afe21fdf6cfe0fe8a184d82a399773cc331bf40:

  xfrm6: Remove commented out function call to xfrm6_input_fini (2012-11-16 08:07:56 +0100)

----------------------------------------------------------------
Shan Wei (1):
      net: xfrm: use __this_cpu_read per-cpu helper

Steffen Klassert (2):
      xfrm: Use a static gc threshold value for ipv6
      xfrm6: Remove commented out function call to xfrm6_input_fini

Ulrich Weber (1):
      xfrm: remove redundant replay_esn check

 net/ipv6/xfrm6_policy.c |   17 +----------------
 net/xfrm/xfrm_ipcomp.c  |    8 +++-----
 net/xfrm/xfrm_replay.c  |   13 ++++++-------
 3 files changed, 10 insertions(+), 28 deletions(-)


* [PATCH 4/4] xfrm6: Remove commented out function call to xfrm6_input_fini
From: Steffen Klassert @ 2012-11-22  8:02 UTC
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <1353571362-6774-1-git-send-email-steffen.klassert@secunet.com>

xfrm6_input_fini() has not been in the tree for more than 10 years,
so remove the commented out function call.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv6/xfrm6_policy.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 6ce4a4f..c984413 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -356,7 +356,6 @@ void xfrm6_fini(void)
 	if (sysctl_hdr)
 		unregister_net_sysctl_table(sysctl_hdr);
 #endif
-	//xfrm6_input_fini();
 	xfrm6_policy_fini();
 	xfrm6_state_fini();
 	dst_entries_destroy(&xfrm6_dst_ops);
-- 
1.7.9.5


* [PATCH v2] bonding: fix multiple bugs
From: Nikolay Aleksandrov @ 2012-11-22 14:39 UTC
  To: netdev; +Cc: fubar, andy, davem
In-Reply-To: <1353589827-2921-1-git-send-email-nikolay@redhat.com>

This patch aims to fix multiple race conditions, missing module parameter checks and
workqueue usage. First I would give three observations which will be used later.
Observation 1: if (delayed_work_pending(wq)) cancel_delayed_work(wq)
 This usage is wrong because the pending bit is cleared just before the work's fn is
 executed and if the function re-arms itself we might end up with the work still
 running. It is safe to call cancel_delayed_work_sync() even if the work is not queued
 at all.
Observation 2: Use of INIT_DELAYED_WORK()
 Work needs to be initialized only once prior to (de/en)queueing.
Observation 3: IFF_UP is set only after ndo_open is called
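
Observation 1 can be illustrated with a deterministic toy model. This is
only a sketch: struct toy_work and the toy_* functions are hypothetical
and single-threaded, they replay one unlucky interleaving step by step
and do not reproduce the kernel's workqueue implementation.

```c
#include <stdbool.h>

/* Toy model of a delayed work item whose handler re-arms itself. */
struct toy_work {
	bool pending;	/* queued, handler not started yet */
	bool running;	/* handler currently executing */
};

static void toy_queue(struct toy_work *w)
{
	w->pending = true;
}

/* The worker clears the pending bit just before running the handler. */
static void toy_worker_start(struct toy_work *w)
{
	w->pending = false;
	w->running = true;
}

/* A self-re-arming handler queues the work again before it returns. */
static void toy_worker_finish_and_rearm(struct toy_work *w)
{
	w->running = false;
	toy_queue(w);
}

/* Pattern criticised above: cancel only if the pending bit is set. */
static void toy_cancel_if_pending(struct toy_work *w)
{
	if (w->pending)
		w->pending = false;
}

/* Model of a synchronous cancel: let a running handler finish first,
 * then drop whatever it re-armed. */
static void toy_cancel_sync(struct toy_work *w)
{
	if (w->running)
		toy_worker_finish_and_rearm(w);
	w->pending = false;
}

/* Replays the interleaving; returns true if the work is still armed
 * after the cancel attempt. */
static bool toy_race(bool use_sync)
{
	struct toy_work w = { false, false };

	toy_queue(&w);
	toy_worker_start(&w);			/* pending bit cleared here */
	if (use_sync) {
		toy_cancel_sync(&w);
	} else {
		toy_cancel_if_pending(&w);	/* sees no pending bit, skips */
		toy_worker_finish_and_rearm(&w);
	}
	return w.pending;
}
```

In this model toy_race(false) leaves the work armed after the
conditional cancel, while toy_race(true) leaves nothing queued.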

Bugs:
1. Race between bonding_store_miimon() and bonding_store_arp_interval()
 Because of Obs.1 we can end up having both works enqueued.
2. Multiple races with INIT_DELAYED_WORK()
 Since the works are not protected by anything between INIT_DELAYED_WORK() and
 calls to (en/de)queue it is possible for races between the following functions:
 (races are also possible between calls to INIT_DELAYED_WORK() and workqueue code)
 bonding_store_miimon() - bonding_store_arp_interval(), bond_close(), bond_open(),
			  enqueued functions
 bonding_store_arp_interval() - bonding_store_miimon(), bond_close(), bond_open(),
				enqueued functions
3. Race between bonding_store_slaves_active() and slave manipulation functions
 The bond_for_each_slave use in bonding_store_slaves_active() is not protected
 by any synchronization mechanisms. NULL pointer dereference is easy to reach.
4. By Obs.1 we need to change bond_cancel_all()
5. The module can be loaded with arp_ip_target="255.255.255.255" which makes it
 impossible to remove as the function in sysfs checks for that value, so we make
 the argument checks consistent with sysfs.

Bugs 1 and 2 are fixed by moving all work initializations in bond_open which by
Obs. 2 and Obs. 3 and the fact that we make sure that all works are cancelled in
bond_close(), is guaranteed not to have any work enqueued. Also RTNL lock is now
acquired in bonding_store_miimon/arp_interval so they can't race with bond_close
and bond_open. The opposing work is cancelled only if the IFF_UP flag is set
and it is cancelled unconditionally. The opposing work is already cancelled if
the interface is down so no need to cancel it again. This way we don't need new
synchronizations for the bonding workqueue. Bug 3 is fixed by acquiring the
read_lock for the slave walk.

v2: Removed 2 trailing white spaces
I sent the wrong version, please use this one without the 2 whitespaces.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_main.c  | 94 +++++++++++++---------------------------
 drivers/net/bonding/bond_sysfs.c | 36 +++++----------
 2 files changed, 42 insertions(+), 88 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 5f5b69f..d0b27a9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3459,6 +3459,28 @@ static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
 
 /*-------------------------- Device entry points ----------------------------*/
 
+static void bond_work_init_all(struct bonding *bond)
+{
+	INIT_DELAYED_WORK(&bond->mcast_work,
+			  bond_resend_igmp_join_requests_delayed);
+	INIT_DELAYED_WORK(&bond->alb_work, bond_alb_monitor);
+	INIT_DELAYED_WORK(&bond->mii_work, bond_mii_monitor);
+	if (bond->params.mode == BOND_MODE_ACTIVEBACKUP)
+		INIT_DELAYED_WORK(&bond->arp_work, bond_activebackup_arp_mon);
+	else
+		INIT_DELAYED_WORK(&bond->arp_work, bond_loadbalance_arp_mon);
+	INIT_DELAYED_WORK(&bond->ad_work, bond_3ad_state_machine_handler);
+}
+
+static void bond_work_cancel_all(struct bonding *bond)
+{
+	cancel_delayed_work_sync(&bond->mii_work);
+	cancel_delayed_work_sync(&bond->arp_work);
+	cancel_delayed_work_sync(&bond->alb_work);
+	cancel_delayed_work_sync(&bond->ad_work);
+	cancel_delayed_work_sync(&bond->mcast_work);
+}
+
 static int bond_open(struct net_device *bond_dev)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
@@ -3481,41 +3503,27 @@ static int bond_open(struct net_device *bond_dev)
 	}
 	read_unlock(&bond->lock);
 
-	INIT_DELAYED_WORK(&bond->mcast_work, bond_resend_igmp_join_requests_delayed);
+	bond_work_init_all(bond);
 
 	if (bond_is_lb(bond)) {
 		/* bond_alb_initialize must be called before the timer
 		 * is started.
 		 */
-		if (bond_alb_initialize(bond, (bond->params.mode == BOND_MODE_ALB))) {
-			/* something went wrong - fail the open operation */
+		if (bond_alb_initialize(bond, (bond->params.mode == BOND_MODE_ALB)))
 			return -ENOMEM;
-		}
-
-		INIT_DELAYED_WORK(&bond->alb_work, bond_alb_monitor);
 		queue_delayed_work(bond->wq, &bond->alb_work, 0);
 	}
 
-	if (bond->params.miimon) {  /* link check interval, in milliseconds. */
-		INIT_DELAYED_WORK(&bond->mii_work, bond_mii_monitor);
+	if (bond->params.miimon)  /* link check interval, in milliseconds. */
 		queue_delayed_work(bond->wq, &bond->mii_work, 0);
-	}
 
 	if (bond->params.arp_interval) {  /* arp interval, in milliseconds. */
-		if (bond->params.mode == BOND_MODE_ACTIVEBACKUP)
-			INIT_DELAYED_WORK(&bond->arp_work,
-					  bond_activebackup_arp_mon);
-		else
-			INIT_DELAYED_WORK(&bond->arp_work,
-					  bond_loadbalance_arp_mon);
-
 		queue_delayed_work(bond->wq, &bond->arp_work, 0);
 		if (bond->params.arp_validate)
 			bond->recv_probe = bond_arp_rcv;
 	}
 
 	if (bond->params.mode == BOND_MODE_8023AD) {
-		INIT_DELAYED_WORK(&bond->ad_work, bond_3ad_state_machine_handler);
 		queue_delayed_work(bond->wq, &bond->ad_work, 0);
 		/* register to receive LACPDUs */
 		bond->recv_probe = bond_3ad_lacpdu_recv;
@@ -3530,34 +3538,10 @@ static int bond_close(struct net_device *bond_dev)
 	struct bonding *bond = netdev_priv(bond_dev);
 
 	write_lock_bh(&bond->lock);
-
 	bond->send_peer_notif = 0;
-
 	write_unlock_bh(&bond->lock);
 
-	if (bond->params.miimon) {  /* link check interval, in milliseconds. */
-		cancel_delayed_work_sync(&bond->mii_work);
-	}
-
-	if (bond->params.arp_interval) {  /* arp interval, in milliseconds. */
-		cancel_delayed_work_sync(&bond->arp_work);
-	}
-
-	switch (bond->params.mode) {
-	case BOND_MODE_8023AD:
-		cancel_delayed_work_sync(&bond->ad_work);
-		break;
-	case BOND_MODE_TLB:
-	case BOND_MODE_ALB:
-		cancel_delayed_work_sync(&bond->alb_work);
-		break;
-	default:
-		break;
-	}
-
-	if (delayed_work_pending(&bond->mcast_work))
-		cancel_delayed_work_sync(&bond->mcast_work);
-
+	bond_work_cancel_all(bond);
 	if (bond_is_lb(bond)) {
 		/* Must be called only after all
 		 * slaves have been released
@@ -4436,26 +4420,6 @@ static void bond_setup(struct net_device *bond_dev)
 	bond_dev->features |= bond_dev->hw_features;
 }
 
-static void bond_work_cancel_all(struct bonding *bond)
-{
-	if (bond->params.miimon && delayed_work_pending(&bond->mii_work))
-		cancel_delayed_work_sync(&bond->mii_work);
-
-	if (bond->params.arp_interval && delayed_work_pending(&bond->arp_work))
-		cancel_delayed_work_sync(&bond->arp_work);
-
-	if (bond->params.mode == BOND_MODE_ALB &&
-	    delayed_work_pending(&bond->alb_work))
-		cancel_delayed_work_sync(&bond->alb_work);
-
-	if (bond->params.mode == BOND_MODE_8023AD &&
-	    delayed_work_pending(&bond->ad_work))
-		cancel_delayed_work_sync(&bond->ad_work);
-
-	if (delayed_work_pending(&bond->mcast_work))
-		cancel_delayed_work_sync(&bond->mcast_work);
-}
-
 /*
 * Destroy a bonding device.
 * Must be under rtnl_lock when this function is called.
@@ -4706,12 +4670,14 @@ static int bond_check_params(struct bond_params *params)
 	     arp_ip_count++) {
 		/* not complete check, but should be good enough to
 		   catch mistakes */
-		if (!isdigit(arp_ip_target[arp_ip_count][0])) {
+		__be32 ip = in_aton(arp_ip_target[arp_ip_count]);
+		if (!isdigit(arp_ip_target[arp_ip_count][0])
+		    || ip == 0
+		    || ip == htonl(INADDR_BROADCAST)) {
 			pr_warning("Warning: bad arp_ip_target module parameter (%s), ARP monitoring will not be performed\n",
 				   arp_ip_target[arp_ip_count]);
 			arp_interval = 0;
 		} else {
-			__be32 ip = in_aton(arp_ip_target[arp_ip_count]);
 			arp_target[arp_ip_count] = ip;
 		}
 	}
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index ef8d2a0..1877ed7 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -513,6 +513,8 @@ static ssize_t bonding_store_arp_interval(struct device *d,
 	int new_value, ret = count;
 	struct bonding *bond = to_bond(d);
 
+	if (!rtnl_trylock())
+		return restart_syscall();
 	if (sscanf(buf, "%d", &new_value) != 1) {
 		pr_err("%s: no arp_interval value specified.\n",
 		       bond->dev->name);
@@ -539,10 +541,6 @@ static ssize_t bonding_store_arp_interval(struct device *d,
 		pr_info("%s: ARP monitoring cannot be used with MII monitoring. %s Disabling MII monitoring.\n",
 			bond->dev->name, bond->dev->name);
 		bond->params.miimon = 0;
-		if (delayed_work_pending(&bond->mii_work)) {
-			cancel_delayed_work(&bond->mii_work);
-			flush_workqueue(bond->wq);
-		}
 	}
 	if (!bond->params.arp_targets[0]) {
 		pr_info("%s: ARP monitoring has been set up, but no ARP targets have been specified.\n",
@@ -554,19 +552,12 @@ static ssize_t bonding_store_arp_interval(struct device *d,
 		 * timer will get fired off when the open function
 		 * is called.
 		 */
-		if (!delayed_work_pending(&bond->arp_work)) {
-			if (bond->params.mode == BOND_MODE_ACTIVEBACKUP)
-				INIT_DELAYED_WORK(&bond->arp_work,
-						  bond_activebackup_arp_mon);
-			else
-				INIT_DELAYED_WORK(&bond->arp_work,
-						  bond_loadbalance_arp_mon);
-
-			queue_delayed_work(bond->wq, &bond->arp_work, 0);
-		}
+		cancel_delayed_work_sync(&bond->mii_work);
+		queue_delayed_work(bond->wq, &bond->arp_work, 0);
 	}
 
 out:
+	rtnl_unlock();
 	return ret;
 }
 static DEVICE_ATTR(arp_interval, S_IRUGO | S_IWUSR,
@@ -962,6 +953,8 @@ static ssize_t bonding_store_miimon(struct device *d,
 	int new_value, ret = count;
 	struct bonding *bond = to_bond(d);
 
+	if (!rtnl_trylock())
+		return restart_syscall();
 	if (sscanf(buf, "%d", &new_value) != 1) {
 		pr_err("%s: no miimon value specified.\n",
 		       bond->dev->name);
@@ -993,10 +986,6 @@ static ssize_t bonding_store_miimon(struct device *d,
 				bond->params.arp_validate =
 					BOND_ARP_VALIDATE_NONE;
 			}
-			if (delayed_work_pending(&bond->arp_work)) {
-				cancel_delayed_work(&bond->arp_work);
-				flush_workqueue(bond->wq);
-			}
 		}
 
 		if (bond->dev->flags & IFF_UP) {
@@ -1005,15 +994,12 @@ static ssize_t bonding_store_miimon(struct device *d,
 			 * timer will get fired off when the open function
 			 * is called.
 			 */
-			if (!delayed_work_pending(&bond->mii_work)) {
-				INIT_DELAYED_WORK(&bond->mii_work,
-						  bond_mii_monitor);
-				queue_delayed_work(bond->wq,
-						   &bond->mii_work, 0);
-			}
+			cancel_delayed_work_sync(&bond->arp_work);
+			queue_delayed_work(bond->wq, &bond->mii_work, 0);
 		}
 	}
 out:
+	rtnl_unlock();
 	return ret;
 }
 static DEVICE_ATTR(miimon, S_IRUGO | S_IWUSR,
@@ -1582,6 +1568,7 @@ static ssize_t bonding_store_slaves_active(struct device *d,
 		goto out;
 	}
 
+	read_lock(&bond->lock);
 	bond_for_each_slave(bond, slave, i) {
 		if (!bond_is_active_slave(slave)) {
 			if (new_value)
@@ -1590,6 +1577,7 @@ static ssize_t bonding_store_slaves_active(struct device *d,
 				slave->inactive = 1;
 		}
 	}
+	read_unlock(&bond->lock);
 out:
 	return ret;
 }
-- 
1.7.11.7

^ permalink raw reply related

* Re: [PATCH] 8139cp: set ring address after enabling C+ mode
From: David Miller @ 2012-11-21 23:12 UTC (permalink / raw)
  To: dwmw2; +Cc: romieu, jgarzik, jasowang, netdev, hayeswang, gilboad
In-Reply-To: <1353538374.26346.203.camel@shinybook.infradead.org>

From: David Woodhouse <dwmw2@infradead.org>
Date: Wed, 21 Nov 2012 22:52:54 +0000

> On Wed, 2012-11-21 at 17:40 -0500, David Miller wrote:
>> On the contrary, for networking I submit everything manually and I
>> remove the CC: tags.
>> 
>> I have a queue on patchwork that I add such patches to, so that they
>> do not get lost.
> 
> Ah, right. Thanks for the correction. Is it even worth giving the hint
> that this should be for the stable tree (from v3.5 onwards), or should I
> leave you to work that all out for yourself? And if it *is* worth giving
> that hint, is it better to do it in a comment after --- at the end of
> the commit comment, rather than the "normal" 'Cc: stable' tag?

The more information you give in the commit message the better, that
way I don't have to guess :-)

^ permalink raw reply

* Re: [PATCH] 8139cp: set ring address after enabling C+ mode
From: Jeff Garzik @ 2012-11-22  3:47 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Jason Wang, David S. Miller, netdev
In-Reply-To: <1353529639.26346.164.camel@shinybook.infradead.org>

On 11/21/2012 03:27 PM, David Woodhouse wrote:
> This fixes (for me) a regression introduced by commit b01af457 ("8139cp:
> set ring address before enabling receiver"). That commit configured the
> descriptor ring addresses earlier in the initialisation sequence, in
> order to avoid the possibility of triggering stray DMA before the
> correct address had been set up.
>
> Unfortunately, it seems that the hardware will scribble garbage into the
> TxRingAddr registers when we enable "plus mode" Tx in the CpCmd
> register. Observed on a Traverse Geos router board.
>
> To deal with this, while not reintroducing the problem which led to the
> original commit, we augment cp_start_hw() to write to the CpCmd register
> *first*, then set the descriptor ring addresses, and then finally to
> enable Rx and Tx in the original 8139 Cmd register. The datasheet
> actually indicates that we should enable Tx/Rx in the Cmd register
> *before* configuring the descriptor addresses, but that would appear to
> re-introduce the problem that the offending commit b01af457 was trying
> to solve. And this variant appears to work fine on real hardware.
>
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
> Cc: stable@kernel.org [3.5+]
>
> ---
> How about this? I'm still somewhat confused about when it actually
> *does* start doing DMA, given what the datasheet says.

Well, we have three logical code states:

State A:  pre-b01af457, known working
State B:  b01af457, known broken
State C:  dwmw2 proposed fix, tested on 1 hardware, new technique, query 
open with Realtek

State A seems safer for late -rc, which is where we are.  Fix the 
regression by reverting to well-tested, widely deployed state.

Then apply your patch here as an immediate candidate for net-next.

	Jeff

^ permalink raw reply

* Re: [PATCH] 8139cp: set ring address after enabling C+ mode
From: David Miller @ 2012-11-21 22:40 UTC (permalink / raw)
  To: dwmw2; +Cc: romieu, jgarzik, jasowang, netdev, hayeswang, gilboad
In-Reply-To: <1353537131.26346.199.camel@shinybook.infradead.org>

From: David Woodhouse <dwmw2@infradead.org>
Date: Wed, 21 Nov 2012 22:32:11 +0000

> On Wed, 2012-11-21 at 21:40 +0100, Francois Romieu wrote:
>> Straight to -stable ?
> 
> That's the way it works. You put the Cc: stable on the *original* commit
> that goes upstream. There's no sane way to retroactively add that tag
> after it's already been merged and tested.
> 
> Yes, you can bug Greg manually to 'please add this upstream commit which
> we forgot to mark as Cc: stable' but that isn't the way it's usually
> done.

On the contrary, for networking I submit everything manually and I
remove the CC: tags.

I have a queue on patchwork that I add such patches to, so that they
do not get lost.

^ permalink raw reply

* Re: [PATCH 2/2] smsc95xx: support PHY wakeup source
From: Steve Glendinning @ 2012-11-22 12:03 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20121121.120721.1391625922017753607.davem@davemloft.net>

>> +{
>> +     struct mii_if_info *mii = &dev->mii;
>> +
>> +        /* first, a dummy read, needed to latch some MII phys */
>> +     int ret = smsc95xx_mdio_read_nopm(dev->net, mii->phy_id, MII_BMSR);
>
> Please keep the local variable declarations together at the beginning
> of the basic block, don't intermix empty lines and comments as you
> are doing here.

Thanks David, please drop these two patches.  I'll fix these points up
and re-submit.

^ permalink raw reply

* Re: [PATCH 080/493] fddi: remove use of __devexit_p
From: Greg KH @ 2012-11-22  0:23 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Bill Pemberton, netdev
In-Reply-To: <alpine.LFD.2.02.1211212334080.25230@eddie.linux-mips.org>

On Wed, Nov 21, 2012 at 11:49:50PM +0000, Maciej W. Rozycki wrote:
> Greg,
> 
> > >  I have unconfused myself now, so please replace the above with the 
> > > following question: what about configurations (e.g. buses) that do not 
> > > support hotplug at all?  For example apart from PCI the defxx driver 
> > > concerned here supports the TURBOchannel bus that by design does not have 
> > > the concept of live option card removal (no such circuitry).  So should 
> > > now the precious memory be wasted on systems that will never ever handle 
> > > hotplug?
> > 
> > CONFIG_HOTPLUG is always enabled now, so that's not an option anymore.
> > And again, a user can "hot unbind" a driver from a device from
> > userspace, no matter if the bus physically supports it or not.
> 
>  Hmm, what purpose does this serve for devices that cannot be physically 
> removed?  If there is none, shouldn't that policy be set by individual 
> drivers or platform even?  Even if HOTPLUG as a whole is unconditional (I 
> suppose the amount of space core support itself takes is much less to what 
> driver code can).
> 
>  TURBOchannel, although valid, is an old exotic case that might not be 
> worth arguing for, except for purity maybe.  But there are surely many 
> contemporary systems out there that are known they are never going to 
> support hot device replacement.  Consider most of the embedded systems for 
> example, where devices may even physically be cast into a single SOC (with 
> no prospect of chipping off any pieces ever ;) ), that certainly could not 
> care less of device replacement, but they do care a lot about memory 
> consumption.

Even those don't care about less than 5k of memory, do they?

>  Was this implication considered, discussed and disregarded as not 
> important enough compared to benefits from hardcoding HOTPLUG support?  

Yes, I don't know of any modern system that does not enable
CONFIG_HOTPLUG, do you?

> I'm seriously asking for a pointer, not trying to cause any stir-up -- 
> regrettably I fail to follow most discussions these days, but I would like 
> to know what the background behind this decision was.  Thanks a lot!

See Russell's response in this thread for details if you are curious.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH] 8139cp: set ring address after enabling C+ mode
From: David Miller @ 2012-11-22  4:39 UTC (permalink / raw)
  To: jgarzik; +Cc: dwmw2, jasowang, netdev
In-Reply-To: <50ADA05B.2000308@pobox.com>

From: Jeff Garzik <jgarzik@pobox.com>
Date: Wed, 21 Nov 2012 22:47:39 -0500

> State A:  pre-b01af457, known working
> State B:  b01af457, known broken

State A is also known buggy on the largest consumer of this driver,
the emulated hardware.

Please evaluate this realistically.

^ permalink raw reply

* Re: 3.6 routing cache regression, multicast loopback broken
From: Maxime Bizon @ 2012-11-22 14:04 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: David Miller, netdev
In-Reply-To: <alpine.LFD.2.00.1211210055040.1751@ja.ssi.bg>


On Wed, 2012-11-21 at 02:35 +0200, Julian Anastasov wrote:

Hi,

> 	Do you also have CONFIG_IP_MROUTE enabled? I see the

I don't

> problem with caching both "RTCF_MULTICAST | RTCF_LOCAL" and
> "RTCF_MULTICAST" flags to same place but I'm not sure in the
> sequence of route lookups for your setup. ip_mc_output does
> not care about RTCF_LOCAL when CONFIG_IP_MROUTE is disabled,
> so in this case it should loop the packet even on wrong
> caching (without RTCF_LOCAL flag).

but the problem here was that dst was using ip_output() instead of
ip_mc_output().

> 	Can you try the following patch. If it does not
> solve the problem we have to add some debugs in __mkroute_output.
> And I'm not sure if this is fatal only when CONFIG_IP_MROUTE
> is enabled, I have to spend more time to check the code.
> As an optimization, the patch avoids lookup for fnhe
> when multicast is not cached because multicasts are
> not redirected and they do not learn PMTU.
> 
> 	The patch is not tested. I'll update the commit
> message after your tests.

the patch fixes the problem, and it looks valid to me.

thanks !

-- 
Maxime

^ permalink raw reply

* [PATCH] arping: Fix find_device_by_ifaddrs()
From: Jan Synacek @ 2012-11-22  8:51 UTC (permalink / raw)
  To: yoshfuji; +Cc: netdev, Jan Synacek

Look for another device if the device name and the currently found one are the
same, not different.

Also, make checking for the device's flags nonfatal.
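
To make the first change easier to review, here is a small sketch of
what the name test evaluates to before and after the patch. The helper
names are hypothetical and only mirror the two conditions in the diff
below; strcmp() returns 0 when the strings are equal.

```c
#include <string.h>

/* Each helper returns 1 if the loop would `continue` (skip the
 * interface named ifname) when the user asked for device `want`.
 */

/* Old condition: strcmp(...) is non-zero when the names differ. */
int skip_before_patch(const char *want, const char *ifname)
{
	return want && ifname && strcmp(ifname, want) != 0;
}

/* New condition: !strcmp(...) is true when the names are equal. */
int skip_after_patch(const char *want, const char *ifname)
{
	return want && ifname && strcmp(ifname, want) == 0;
}
```

For want = "eth0", the old test skips "eth1" and keeps "eth0"; the new
test skips "eth0" and keeps "eth1". With no device name given, neither
test skips anything.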

Signed-off-by: Jan Synacek <jsynacek@redhat.com>
---
 arping.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arping.c b/arping.c
index ff77bec..d0edccf 100644
--- a/arping.c
+++ b/arping.c
@@ -550,10 +550,10 @@ static int find_device_by_ifaddrs(void)
 			continue;
 		if (ifa->ifa_addr->sa_family != AF_PACKET)
 			continue;
-		if (device.name && ifa->ifa_name && strcmp(ifa->ifa_name, device.name))
+		if (device.name && ifa->ifa_name && !strcmp(ifa->ifa_name, device.name))
 			continue;
 
-		if (check_ifflags(ifa->ifa_flags, device.name != NULL) < 0)
+		if (check_ifflags(ifa->ifa_flags, 0) < 0)
 			continue;
 
 		if (!((struct sockaddr_ll *)ifa->ifa_addr)->sll_halen)
-- 
1.7.11.7

^ permalink raw reply related

* [PATCH] xfrm: Fix the gc threshold value for ipv4
From: Steffen Klassert @ 2012-11-22  7:56 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, netdev
In-Reply-To: <1353570997-2571-1-git-send-email-steffen.klassert@secunet.com>

The xfrm gc threshold value depends on ip_rt_max_size. This
value was set to INT_MAX with the routing cache removal patch,
so we start doing garbage collecting when we have INT_MAX/2
IPsec routes cached. Fix this by going back to the static
threshold of 1024 routes.
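
For a sense of scale, the arithmetic can be sketched as below. The
function names are hypothetical; the "refuse new allocations at
2 * gc_thresh" rule is taken from the comment removed in this patch.

```c
#include <limits.h>

/* Threshold as computed by the removed line: rt_max_size / 2. */
int old_gc_thresh(int rt_max_size)
{
	return rt_max_size / 2;
}

/* Point at which new allocations are refused, per the removed comment.
 * long long avoids overflow when doubling values close to INT_MAX.
 */
long long alloc_limit(int gc_thresh)
{
	return 2LL * gc_thresh;
}
```

With rt_max_size == INT_MAX and a 32-bit int, old_gc_thresh() gives
1073741823, so garbage collection would effectively never start. With
the static threshold of 1024, allocations are refused at 2048 entries.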

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/net/xfrm.h      |    2 +-
 net/ipv4/route.c        |    2 +-
 net/ipv4/xfrm4_policy.c |   13 +------------
 3 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 6f0ba01..63445ed 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1351,7 +1351,7 @@ struct xfrm6_tunnel {
 };
 
 extern void xfrm_init(void);
-extern void xfrm4_init(int rt_hash_size);
+extern void xfrm4_init(void);
 extern int xfrm_state_init(struct net *net);
 extern void xfrm_state_fini(struct net *net);
 extern void xfrm4_state_init(void);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index a8c6512..200d287 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2597,7 +2597,7 @@ int __init ip_rt_init(void)
 		pr_err("Unable to create route proc files\n");
 #ifdef CONFIG_XFRM
 	xfrm_init();
-	xfrm4_init(ip_rt_max_size);
+	xfrm4_init();
 #endif
 	rtnl_register(PF_INET, RTM_GETROUTE, inet_rtm_getroute, NULL, NULL);
 
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index 05c5ab8..3be0ac2 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -279,19 +279,8 @@ static void __exit xfrm4_policy_fini(void)
 	xfrm_policy_unregister_afinfo(&xfrm4_policy_afinfo);
 }
 
-void __init xfrm4_init(int rt_max_size)
+void __init xfrm4_init(void)
 {
-	/*
-	 * Select a default value for the gc_thresh based on the main route
-	 * table hash size.  It seems to me the worst case scenario is when
-	 * we have ipsec operating in transport mode, in which we create a
-	 * dst_entry per socket.  The xfrm gc algorithm starts trying to remove
-	 * entries at gc_thresh, and prevents new allocations as 2*gc_thresh
-	 * so lets set an initial xfrm gc_thresh value at the rt_max_size/2.
-	 * That will let us store an ipsec connection per route table entry,
-	 * and start cleaning when were 1/2 full
-	 */
-	xfrm4_dst_ops.gc_thresh = rt_max_size/2;
 	dst_entries_init(&xfrm4_dst_ops);
 
 	xfrm4_state_init();
-- 
1.7.9.5

^ permalink raw reply related

* pull request: ipsec 2012-11-22
From: Steffen Klassert @ 2012-11-22  7:56 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, netdev

This pull request is intended for 3.7 and contains a single patch to
fix the IPsec gc threshold value for ipv4.

Please pull or let me know if there are problems.

Thanks!

The following changes since commit a66fe1653f4e81c007a68ca975067432a42df05b:

  net: usb: cdc_eem: Fix rx skb allocation for 802.1Q VLANs (2012-11-07 21:12:26 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec.git master

for you to fetch changes up to 703fb94ec58e0e8769380c2877a8a34aeb5b6c97:

  xfrm: Fix the gc threshold value for ipv4 (2012-11-13 09:15:07 +0100)

----------------------------------------------------------------
Steffen Klassert (1):
      xfrm: Fix the gc threshold value for ipv4

 include/net/xfrm.h      |    2 +-
 net/ipv4/route.c        |    2 +-
 net/ipv4/xfrm4_policy.c |   13 +------------
 3 files changed, 3 insertions(+), 14 deletions(-)

^ permalink raw reply
