Netdev List
* Re: [PATCH 1/2] ixgb: Don't check for vlan group on transmit.
From: Jeff Kirsher @ 2010-11-05 19:11 UTC (permalink / raw)
  To: Jesse Gross
  Cc: David Miller, netdev@vger.kernel.org, Brandeburg, Jesse,
	Duyck, Alexander H
In-Reply-To: <1288464591-31528-1-git-send-email-jesse@nicira.com>


On Sat, 2010-10-30 at 11:49 -0700, Jesse Gross wrote:
> On transmit, the ixgb driver will only use vlan acceleration if a
> vlan group is configured.  This can lead to tags getting dropped
> when bridging because the networking core assumes that a driver
> that claims vlan acceleration support can do it at all times.  This
> change should have been part of commit eab6d18d "vlan: Don't check for
> vlan group before vlan_tx_tag_present." but was missed.
> 
> Signed-off-by: Jesse Gross <jesse@nicira.com>
> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
> CC: PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com>
> ---
>  drivers/net/ixgb/ixgb_main.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
> index caa8192..d18194e 100644
> --- a/drivers/net/ixgb/ixgb_main.c
> +++ b/drivers/net/ixgb/ixgb_main.c
> @@ -1498,7 +1498,7 @@ ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
>                       DESC_NEEDED)))
>  		return NETDEV_TX_BUSY;
>  
> -	if (adapter->vlgrp && vlan_tx_tag_present(skb)) {
> +	if (vlan_tx_tag_present(skb)) {
>  		tx_flags |= IXGB_TX_FLAGS_VLAN;
>  		vlan_id = vlan_tx_tag_get(skb);
>  	}

After further review, NAK because this will cause a bug.  With this
patch it would be possible to overrun the buffers, so the correct fix is
to increase max_frame_size by VLAN_TAG_SIZE in ixgb/igb_change_mtu.

Alex has said that he will generate the patches for the alternate fix.
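
The sizing concern above comes down to frame-length arithmetic. Below is a
minimal sketch in plain C, not driver code: the SKETCH_* names and the
helper are hypothetical, the values are the standard Ethernet sizes, and
the formula is only an assumption about what "increase max_frame_size by
VLAN_TAG_SIZE" would amount to.

```c
/* Hypothetical stand-ins for the kernel constants; standard Ethernet sizes. */
enum {
	SKETCH_ETH_HEADER_LEN = 14,	/* destination + source + ethertype */
	SKETCH_ETH_FCS_LEN    = 4,	/* trailing frame check sequence */
	SKETCH_VLAN_TAG_LEN   = 4,	/* one 802.1Q tag */
};

/*
 * Largest frame to budget for at a given MTU, always leaving room for
 * one VLAN tag whether or not a vlan group is configured.
 */
static int sketch_max_frame_size(int mtu)
{
	return mtu + SKETCH_ETH_HEADER_LEN + SKETCH_ETH_FCS_LEN +
	       SKETCH_VLAN_TAG_LEN;
}
```

With an MTU of 1500 this gives 1522 bytes, four more than the untagged
1518.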



* Re: [PATCH] atomic: add atomic_inc_not_zero_hint()
From: Eric Dumazet @ 2010-11-05 19:12 UTC (permalink / raw)
  To: paulmck
  Cc: Andrew Morton, linux-kernel, David Miller, netdev,
	Arnaldo Carvalho de Melo, Christoph Lameter, Ingo Molnar,
	Andi Kleen, Nick Piggin
In-Reply-To: <20101105184034.GG2850@linux.vnet.ibm.com>

On Friday, 5 November 2010 at 11:40 -0700, Paul E. McKenney wrote:

> OK, so I cannot resist the challenge...  ;-)
> 

I knew that ;)

> Suppose that the atomic_inc_not_zero_hint() is in common code that might
> be invoked from a cleanup path.  On the cleanup path, perhaps within an
> RCU callback, if the reference is zero, we have the only reference and
> thus don't need to increment the reference count.  On the other hand,
> if the reference is non-zero, we want to obtain a reference in order
> to safely attempt to encourage the other reference holder to let go
> more quickly.
> 
> Perhaps a bit of a stretch, but why not just replace the above
> "return 0" with "atomic_inc_not_zero(v)"?  It will usually be
> compiled out, right?

Yes indeed, thanks !





* Re: [PATCH] atomic: add atomic_inc_not_zero_hint()
From: Eric Dumazet @ 2010-11-05 19:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, David Miller, netdev, Arnaldo Carvalho de Melo,
	Christoph Lameter, Ingo Molnar, Andi Kleen, Paul E. McKenney,
	Nick Piggin
In-Reply-To: <20101105112821.57f80481.akpm@linux-foundation.org>

On Friday, 5 November 2010 at 11:28 -0700, Andrew Morton wrote:
> But we haven't established that there _is_ duplicated code which needs
> that treatment.
> 
> Scanning arch/x86/include/asm/atomic.h, perhaps ATOMIC_INIT() is a
> candidate.  But I'm not sure that it _should_ be hoisted up - if every
> architecture happens to do it the same way then that's just a fluke.
> 
> 

Not sure I understand you. I was trying to avoid recursive includes, but
that should be protected anyway. I see a lot of code that could be
factorized in this new header (atomic_inc_not_zero() for example)

Thanks

[PATCH v3] atomic: add atomic_inc_not_zero_hint()

Followup of perf tools session in Netfilter WorkShop 2010

In network stack we make high usage of atomic_inc_not_zero() in contexts
we know the probable value of atomic before increment (2 for udp sockets
for example)

Using a special version of atomic_inc_not_zero() giving this hint can
help processor to use less bus transactions.

On x86 (MESI protocol) for example, this avoids entering Shared state,
because "lock cmpxchg" issues an RFO (Read For Ownership)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: David Miller <davem@davemloft.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
---
V3: adds the include <asm/atomic.h>
    if hint is null, use atomic_inc_not_zero() (Paul suggestion)
V2: add #ifndef atomic_inc_not_zero_hint
    kerneldoc changes
    test that hint is not null
    Meant to be included at end of arch/*/asm/atomic.h files

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
new file mode 100644
index 0000000..5a7df87
--- /dev/null
+++ b/include/linux/atomic.h
@@ -0,0 +1,37 @@
+#ifndef _LINUX_ATOMIC_H
+#define _LINUX_ATOMIC_H
+#include <asm/atomic.h>
+
+/**
+ * atomic_inc_not_zero_hint - increment if not null
+ * @v: pointer of type atomic_t
+ * @hint: probable value of the atomic before the increment
+ *
+ * This version of atomic_inc_not_zero() gives a hint of probable
+ * value of the atomic. This helps processor to not read the memory
+ * before doing the atomic read/modify/write cycle, lowering
+ * number of bus transactions on some arches.
+ *
+ * Returns: 0 if increment was not done, 1 otherwise.
+ */
+#ifndef atomic_inc_not_zero_hint
+static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
+{
+	int val, c = hint;
+
+	/* sanity test, should be removed by compiler if hint is a constant */
+	if (!hint)
+		return atomic_inc_not_zero(v);
+
+ 	do {
+		val = atomic_cmpxchg(v, c, c + 1);
+		if (val == c)
+			return 1;
+		c = val;
+	} while (c);
+
+	return 0;
+}
+#endif
+
+#endif /* _LINUX_ATOMIC_H */
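
For readers who want to try the idea outside the kernel, here is a rough
user-space analogue of the loop in the patch above. It is a sketch under
assumptions: it uses C11 <stdatomic.h> rather than the kernel's atomic_t,
the name inc_not_zero_hint() is hypothetical, and when the hint is 0 it
loads the current value, whereas the patch calls atomic_inc_not_zero().

```c
#include <stdatomic.h>

/*
 * Increment *v unless it is zero, trying the caller's guess first so the
 * first access to the variable is the compare-and-swap itself.
 * Returns 1 if the increment happened, 0 if the value was zero.
 */
static int inc_not_zero_hint(atomic_int *v, int hint)
{
	int expected = hint;

	/* No useful hint: start from the value currently stored. */
	if (expected == 0)
		expected = atomic_load(v);

	while (expected != 0) {
		/* On failure, expected is overwritten with the observed value. */
		if (atomic_compare_exchange_strong(v, &expected, expected + 1))
			return 1;
	}

	return 0;
}
```

When the guess is right, the very first compare-and-swap succeeds. A wrong
guess costs one extra iteration, because the failed exchange reports the
value it actually saw.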


* Re: [PATCH] atomic: add atomic_inc_not_zero_hint()
From: Andrew Morton @ 2010-11-05 19:39 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linux-kernel, David Miller, netdev, Arnaldo Carvalho de Melo,
	Christoph Lameter, Ingo Molnar, Andi Kleen, Paul E. McKenney,
	Nick Piggin
In-Reply-To: <1288984844.2665.52.camel@edumazet-laptop>

On Fri, 05 Nov 2010 20:20:44 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Friday, 5 November 2010 at 11:28 -0700, Andrew Morton wrote:
> > But we haven't established that there _is_ duplicated code which needs
> > that treatment.
> > 
> > Scanning arch/x86/include/asm/atomic.h, perhaps ATOMIC_INIT() is a
> > candidate.  But I'm not sure that it _should_ be hoisted up - if every
> > architecture happens to do it the same way then that's just a fluke.
> > 
> > 
> 
> Not sure I understand you. I was trying to avoid recursive includes, but
> that should be protected anyway. I see a lot of code that could be
> factorized in this new header (atomic_inc_not_zero() for example)

Ah.  I wasn't able to see much duplicated code at all, so I wasn't sure
that we needed to bother about this issue.

yup, atomic_inc_not_zero() looks like a candidate.

> [PATCH v3] atomic: add atomic_inc_not_zero_hint()

Let's go with this for now ;)

I'll assume that you intend to make use of this function soon, and it
looks safe enough to sneak it into 2.6.37-rc2, IMO.  If Linus shouts at
me then we could merge it into 2.6.38-rc1 via net-next, but I think
straight-to-mainline is best.


* Re: [PATCH] atomic: add atomic_inc_not_zero_hint()
From: Eric Dumazet @ 2010-11-05 19:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, David Miller, netdev, Arnaldo Carvalho de Melo,
	Christoph Lameter, Ingo Molnar, Andi Kleen, Paul E. McKenney,
	Nick Piggin
In-Reply-To: <20101105123927.5779e464.akpm@linux-foundation.org>

On Friday, 5 November 2010 at 12:39 -0700, Andrew Morton wrote:

> Ah.  I wasn't able to see much duplicated code at all, so I wasn't sure
> that we needed to bother about this issue.
> 
> yup, atomic_inc_not_zero() looks like a candidate.

yes, and atomic_add_unless()...

> 
> > [PATCH v3] atomic: add atomic_inc_not_zero_hint()
> 
> Let's go with this for now ;)
> 
> I'll assume that you intend to make use of this function soon, and it
> looks safe enough to sneak it into 2.6.37-rc2, IMO.  If Linus shouts at
> me then we could merge it into 2.6.38-rc1 via net-next, but I think
> straight-to-mainline is best.
> 

Well, I don't expect to use it before 2.6.38, so no hurry, Andrew, but it
can probably be merged earlier, since it has no users yet. It will
certainly help our work.

Thanks




* [PATCH] iputils build fix
From: Lucas C. Villa Real @ 2010-11-05 19:49 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 144 bytes --]

Hi,

Please find attached a patch which fixes the build of
iputils-s20101006 on platforms that don't have the SO_MARK operation.

Thanks,
Lucas

[-- Attachment #2: 01-iputils-SO_MARK.patch --]
[-- Type: application/octet-stream, Size: 857 bytes --]

Fixes build on platforms where SO_MARK is not defined.

Signed-off-by: Lucas C. Villa Real <lucasvr@us.ibm.com>

diff -urp iputils-s20101006.orig/ping_common.c iputils-s20101006/ping_common.c
--- iputils-s20101006.orig/ping_common.c	2010-10-06 08:59:20.000000000 -0300
+++ iputils-s20101006/ping_common.c	2010-11-05 16:08:06.000000000 -0200
@@ -475,6 +475,7 @@ void setup(int icmp_sock)
 			fprintf(stderr, "Warning: no SO_TIMESTAMP support, falling back to SIOCGSTAMP\n");
 	}
 #endif
+#ifdef SO_MARK
 	if (options & F_MARK) {
 		if (setsockopt(icmp_sock, SOL_SOCKET, SO_MARK,
 				&mark, sizeof(mark)) == -1) {
@@ -484,6 +485,7 @@ void setup(int icmp_sock)
 			fprintf(stderr, "Warning: Failed to set mark %d\n", mark);
 		}
 	}
+#endif
 
 	/* Set some SNDTIMEO to prevent blocking forever
 	 * on sends, when device is too slow or stalls. Just put limit
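
The same guard can be shown in isolation. This is a minimal sketch, not
iputils code: sketch_set_mark() is a hypothetical helper, and returning -1
when SO_MARK is missing at build time is an assumption made here for
illustration (the patch above simply compiles the block out).

```c
#include <sys/socket.h>

/*
 * Try to set a socket mark. Returns 0 on success, -1 if setsockopt()
 * fails or if SO_MARK was not defined when this file was compiled.
 */
static int sketch_set_mark(int sock, int mark)
{
#ifdef SO_MARK
	return setsockopt(sock, SOL_SOCKET, SO_MARK, &mark, sizeof(mark));
#else
	(void)sock;	/* unused on platforms without SO_MARK */
	(void)mark;
	return -1;
#endif
}
```

Either way the file still compiles on platforms whose headers lack
SO_MARK, which is the point of the fix.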


* Re: [PATCH] atomic: add atomic_inc_not_zero_hint()
From: Paul E. McKenney @ 2010-11-05 19:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrew Morton, linux-kernel, David Miller, netdev,
	Arnaldo Carvalho de Melo, Christoph Lameter, Ingo Molnar,
	Andi Kleen, Nick Piggin
In-Reply-To: <1288984844.2665.52.camel@edumazet-laptop>

On Fri, Nov 05, 2010 at 08:20:44PM +0100, Eric Dumazet wrote:
> On Friday, 5 November 2010 at 11:28 -0700, Andrew Morton wrote:
> > But we haven't established that there _is_ duplicated code which needs
> > that treatment.
> > 
> > Scanning arch/x86/include/asm/atomic.h, perhaps ATOMIC_INIT() is a
> > candidate.  But I'm not sure that it _should_ be hoisted up - if every
> > architecture happens to do it the same way then that's just a fluke.
> > 
> > 
> 
> Not sure I understand you. I was trying to avoid recursive includes, but
> that should be protected anyway. I see a lot of code that could be
> factorized in this new header (atomic_inc_not_zero() for example)
> 
> Thanks
> 
> [PATCH v3] atomic: add atomic_inc_not_zero_hint()
> 
> Followup of perf tools session in Netfilter WorkShop 2010
> 
> In network stack we make high usage of atomic_inc_not_zero() in contexts
> we know the probable value of atomic before increment (2 for udp sockets
> for example)
> 
> Using a special version of atomic_inc_not_zero() giving this hint can
> help processor to use less bus transactions.
> 
> On x86 (MESI protocol) for example, this avoids entering Shared state,
> because "lock cmpxchg" issues an RFO (Read For Ownership)
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Christoph Lameter <cl@linux-foundation.org>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Andi Kleen <andi@firstfloor.org>
> Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
> Cc: David Miller <davem@davemloft.net>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Looks quite good to me!

Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

> Cc: Nick Piggin <npiggin@kernel.dk>
> ---
> V3: adds the include <asm/atomic.h>
>     if hint is null, use atomic_inc_not_zero() (Paul suggestion)
> V2: add #ifndef atomic_inc_not_zero_hint
>     kerneldoc changes
>     test that hint is not null
>     Meant to be included at end of arch/*/asm/atomic.h files
> 
> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> new file mode 100644
> index 0000000..5a7df87
> --- /dev/null
> +++ b/include/linux/atomic.h
> @@ -0,0 +1,37 @@
> +#ifndef _LINUX_ATOMIC_H
> +#define _LINUX_ATOMIC_H
> +#include <asm/atomic.h>
> +
> +/**
> + * atomic_inc_not_zero_hint - increment if not null
> + * @v: pointer of type atomic_t
> + * @hint: probable value of the atomic before the increment
> + *
> + * This version of atomic_inc_not_zero() gives a hint of probable
> + * value of the atomic. This helps processor to not read the memory
> + * before doing the atomic read/modify/write cycle, lowering
> + * number of bus transactions on some arches.
> + *
> + * Returns: 0 if increment was not done, 1 otherwise.
> + */
> +#ifndef atomic_inc_not_zero_hint
> +static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
> +{
> +	int val, c = hint;
> +
> +	/* sanity test, should be removed by compiler if hint is a constant */
> +	if (!hint)
> +		return atomic_inc_not_zero(v);
> +
> + 	do {
> +		val = atomic_cmpxchg(v, c, c + 1);
> +		if (val == c)
> +			return 1;
> +		c = val;
> +	} while (c);
> +
> +	return 0;
> +}
> +#endif
> +
> +#endif /* _LINUX_ATOMIC_H */
> 
> 


* [PATCH] iputils signal mask issue
From: Lucas C. Villa Real @ 2010-11-05 19:56 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 717 bytes --]

Hi,

Today we found an issue where arping would get stuck in an endless
loop. The cause was that the parent process (a shell script) had
blocked SIGALRM, the signal arping uses to update "count" and to
check the "timeout" global.

Since we cannot make assumptions about the signal mask inherited from
the environment, it's better to explicitly unblock the signals that the
utility needs before the main loop executes. It's worth noting that
although sigaction() is called to associate a handler with a given
signal, that function doesn't automatically unblock that signal.

A similar problem was noticed in rdisc, so the attached patch also
unblocks the signals that it relies on.

Thanks,
Lucas

[-- Attachment #2: 02-iputils-SIG_UNBLOCK.patch --]
[-- Type: application/octet-stream, Size: 1499 bytes --]

Don't rely on the parent signal mask. If that process has blocked SIGALRM 
then chances are high that arping and rdisc will execute forever, as that
signal will never be delivered and the globals, such as 'count' and 'timeout',
won't be updated or checked.

Signed-off-by: Lucas C. Villa Real <lucasvr@us.ibm.com>

diff -urp iputils-s20101006.orig/arping.c iputils-s20101006/arping.c
--- iputils-s20101006.orig/arping.c	2010-10-06 08:59:20.000000000 -0300
+++ iputils-s20101006/arping.c	2010-11-05 16:12:42.000000000 -0200
@@ -346,6 +346,7 @@ main(int argc, char **argv)
 {
 	int socket_errno;
 	int ch;
+	sigset_t sset, osset;
 	uid_t uid = getuid();
 
 	s = socket(PF_PACKET, SOCK_DGRAM, 0);
@@ -544,13 +545,17 @@ main(int argc, char **argv)
 		exit(2);
 	}
 
+	sigemptyset(&sset);
+	sigaddset(&sset, SIGALRM);
+	sigaddset(&sset, SIGINT);
+	sigprocmask(SIG_UNBLOCK, &sset, NULL);
+
 	set_signal(SIGINT, finish);
 	set_signal(SIGALRM, catcher);
 
 	catcher();
 
 	while(1) {
-		sigset_t sset, osset;
 		unsigned char packet[4096];
 		struct sockaddr_storage from;
 		socklen_t alen = sizeof(from);
diff -urp iputils-s20101006.orig/rdisc.c iputils-s20101006/rdisc.c
--- iputils-s20101006.orig/rdisc.c	2010-10-06 08:59:20.000000000 -0300
+++ iputils-s20101006/rdisc.c	2010-11-05 16:14:33.000000000 -0200
@@ -449,6 +449,7 @@ next:
 	sigaddset(&sset, SIGHUP);
 	sigaddset(&sset, SIGTERM);
 	sigaddset(&sset, SIGINT);
+	sigprocmask(SIG_UNBLOCK, &sset, NULL);
 
 	init();
 	if (join(s, &joinaddr) < 0) {


* Re: [PATCH 1/2] ixgb: Don't check for vlan group on transmit.
From: Jesse Gross @ 2010-11-05 19:56 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: David Miller, netdev@vger.kernel.org, Brandeburg, Jesse,
	Duyck, Alexander H
In-Reply-To: <1288984304.3091.11.camel@jtkirshe-MOBL1>

On Fri, Nov 5, 2010 at 12:11 PM, Jeff Kirsher
<jeffrey.t.kirsher@intel.com> wrote:
> On Sat, 2010-10-30 at 11:49 -0700, Jesse Gross wrote:
>> On transmit, the ixgb driver will only use vlan acceleration if a
>> vlan group is configured.  This can lead to tags getting dropped
>> when bridging because the networking core assumes that a driver
>> that claims vlan acceleration support can do it at all times.  This
>> change should have been part of commit eab6d18d "vlan: Don't check for
>> vlan group before vlan_tx_tag_present." but was missed.
>>
>> Signed-off-by: Jesse Gross <jesse@nicira.com>
>> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>> CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
>> CC: PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com>
>> ---
>>  drivers/net/ixgb/ixgb_main.c |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
>> index caa8192..d18194e 100644
>> --- a/drivers/net/ixgb/ixgb_main.c
>> +++ b/drivers/net/ixgb/ixgb_main.c
>> @@ -1498,7 +1498,7 @@ ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
>>                       DESC_NEEDED)))
>>               return NETDEV_TX_BUSY;
>>
>> -     if (adapter->vlgrp && vlan_tx_tag_present(skb)) {
>> +     if (vlan_tx_tag_present(skb)) {
>>               tx_flags |= IXGB_TX_FLAGS_VLAN;
>>               vlan_id = vlan_tx_tag_get(skb);
>>       }
>
> After further review, NAK because this will cause a bug.  With this
> patch it would be possible to overrun the buffers, so the correct fix is
> to increase max_frame_size by VLAN_TAG_SIZE in ixgb/igb_change_mtu.

Hmm, I didn't see any other place where it made changes to the
handling of packets on transmit if a vlan group is configured.  Maybe
the buffer is extended when a group is registered and stripping is
enabled?

In any case, you might want to check the other Intel drivers for
similar problems.  I did a pass and made a mass conversion of this
type a little while ago.  Those changes have already been merged, I
just missed this one by accident.


* Re: [PATCH 1/2] ixgb: Don't check for vlan group on transmit.
From: Jeff Kirsher @ 2010-11-05 20:06 UTC (permalink / raw)
  To: Jesse Gross
  Cc: David Miller, netdev@vger.kernel.org, Brandeburg, Jesse,
	Duyck, Alexander H
In-Reply-To: <AANLkTin62WWL+cnLVNaOYmcvL28rfkUDVCVqiimSPZ=e@mail.gmail.com>


On Fri, 2010-11-05 at 12:56 -0700, Jesse Gross wrote:
> On Fri, Nov 5, 2010 at 12:11 PM, Jeff Kirsher
> <jeffrey.t.kirsher@intel.com> wrote:
> > On Sat, 2010-10-30 at 11:49 -0700, Jesse Gross wrote:
> >> On transmit, the ixgb driver will only use vlan acceleration if a
> >> vlan group is configured.  This can lead to tags getting dropped
> >> when bridging because the networking core assumes that a driver
> >> that claims vlan acceleration support can do it at all times.  This
> >> change should have been part of commit eab6d18d "vlan: Don't check for
> >> vlan group before vlan_tx_tag_present." but was missed.
> >>
> >> Signed-off-by: Jesse Gross <jesse@nicira.com>
> >> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> >> CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
> >> CC: PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com>
> >> ---
> >>  drivers/net/ixgb/ixgb_main.c |    2 +-
> >>  1 files changed, 1 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
> >> index caa8192..d18194e 100644
> >> --- a/drivers/net/ixgb/ixgb_main.c
> >> +++ b/drivers/net/ixgb/ixgb_main.c
> >> @@ -1498,7 +1498,7 @@ ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
> >>                       DESC_NEEDED)))
> >>               return NETDEV_TX_BUSY;
> >>
> >> -     if (adapter->vlgrp && vlan_tx_tag_present(skb)) {
> >> +     if (vlan_tx_tag_present(skb)) {
> >>               tx_flags |= IXGB_TX_FLAGS_VLAN;
> >>               vlan_id = vlan_tx_tag_get(skb);
> >>       }
> >
> > After further review, NAK because this will cause a bug.  With this
> > patch it would be possible to overrun the buffers, so the correct fix is
> > to increase max_frame_size by VLAN_TAG_SIZE in ixgb/igb_change_mtu.
> 
> Hmm, I didn't see any other place where it made changes to the
> handling of packets on transmit if a vlan group is configured.  Maybe
> the buffer is extended when a group is registered and stripping is
> enabled?
> 
> In any case, you might want to check the other Intel drivers for
> similar problems.  I did a pass and made a mass conversion of this
> type a little while ago.  Those changes have already been merged, I
> just missed this one by accident.

I will get with Alex and review the other Intel drivers, thanks Jesse.



* Re: OOM when adding ipv6 route:  How to make available more per-cpu memory?
From: Eric Dumazet @ 2010-11-05 20:20 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, linux-kernel, Tejun Heo
In-Reply-To: <4CD449A5.5070305@candelatech.com>

On Friday, 5 November 2010 at 11:15 -0700, Ben Greear wrote:

> root@lanforge-ubuntu:/home/lanforge# cat /proc/vmallocinfo
> 0xf7ffe000-0xf8000000    8192 hpet_enable+0x2d/0x1b8 phys=fed00000 ioremap
> 0xf8002000-0xf8004000    8192 acpi_os_map_memory+0x16/0x1f phys=df79e000 ioremap
> 0xf8004000-0xf8007000   12288 acpi_os_map_memory+0x16/0x1f phys=df7a0000 ioremap
> 0xf8008000-0xf800a000    8192 acpi_os_map_memory+0x16/0x1f phys=df790000 ioremap
> 0xf800b000-0xf8010000   20480 module_alloc+0x72/0x80 pages=4 vmalloc
> 0xf8010000-0xf8019000   36864 acpi_os_map_memory+0x16/0x1f phys=df790000 ioremap
> 0xf801a000-0xf801c000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xf801d000-0xf8020000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8023000-0xf8026000   12288 module_alloc+0x72/0x80 pages=2 vmalloc
> 0xf8026000-0xf8028000    8192 msix_capability_init+0xae/0x2b0 phys=fa4fe000 ioremap
> 0xf8028000-0xf802a000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xf802b000-0xf802e000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf802f000-0xf8032000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8033000-0xf8036000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8037000-0xf803a000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf803b000-0xf803e000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf803f000-0xf8048000   36864 module_alloc+0x72/0x80 pages=8 vmalloc
> 0xf804b000-0xf8055000   40960 module_alloc+0x72/0x80 pages=9 vmalloc
> 0xf8056000-0xf8059000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf805b000-0xf8063000   32768 module_alloc+0x72/0x80 pages=7 vmalloc
> 0xf8066000-0xf8068000    8192 msix_capability_init+0xae/0x2b0 phys=fa4fa000 ioremap
> 0xf8068000-0xf8070000   32768 module_alloc+0x72/0x80 pages=7 vmalloc
> 0xf8072000-0xf8075000   12288 module_alloc+0x72/0x80 pages=2 vmalloc
> 0xf8076000-0xf8083000   53248 module_alloc+0x72/0x80 pages=12 vmalloc
> 0xf8084000-0xf8087000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8088000-0xf808b000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf808c000-0xf808f000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8090000-0xf8093000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8094000-0xf8097000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8098000-0xf809b000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf809c000-0xf809f000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80a2000-0xf80ad000   45056 module_alloc+0x72/0x80 pages=10 vmalloc
> 0xf80ae000-0xf80b2000   16384 module_alloc+0x72/0x80 pages=3 vmalloc
> 0xf80b3000-0xf80b6000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80b7000-0xf80ba000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80bb000-0xf80be000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80bf000-0xf80c2000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80c3000-0xf80c6000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80c8000-0xf80cd000   20480 pci_iomap+0x81/0x90 phys=fa4fc000 ioremap
> 0xf80ce000-0xf80d1000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80d2000-0xf80d5000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80d6000-0xf80d9000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80da000-0xf80dd000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80de000-0xf80e1000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80e2000-0xf80e5000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80e6000-0xf80e9000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80ea000-0xf80ed000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80ee000-0xf80f1000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80f2000-0xf80f5000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80f6000-0xf80f9000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80fa000-0xf80fd000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf80fe000-0xf8101000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8102000-0xf8105000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8106000-0xf8109000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf810a000-0xf810d000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf810d000-0xf8118000   45056 module_alloc+0x72/0x80 pages=10 vmalloc
> 0xf8119000-0xf811c000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf811d000-0xf8120000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8121000-0xf8124000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8125000-0xf8128000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8129000-0xf812c000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf812d000-0xf8130000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8130000-0xf8132000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xf8133000-0xf8136000   12288 module_alloc+0x72/0x80 pages=2 vmalloc
> 0xf8137000-0xf813a000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf813e000-0xf8140000    8192 msix_capability_init+0xae/0x2b0 phys=fa4f6000 ioremap
> 0xf8140000-0xf8145000   20480 pci_iomap+0x81/0x90 phys=fa4f8000 ioremap
> 0xf8146000-0xf8148000    8192 msix_capability_init+0xae/0x2b0 phys=fa4f2000 ioremap
> 0xf8148000-0xf814d000   20480 pci_iomap+0x81/0x90 phys=fa4f4000 ioremap
> 0xf814e000-0xf8150000    8192 msix_capability_init+0xae/0x2b0 phys=fa4ee000 ioremap
> 0xf8150000-0xf8155000   20480 pci_iomap+0x81/0x90 phys=fa4f0000 ioremap
> 0xf8156000-0xf8158000    8192 msix_capability_init+0xae/0x2b0 phys=fa4ea000 ioremap
> 0xf8158000-0xf815d000   20480 pci_iomap+0x81/0x90 phys=fa4ec000 ioremap
> 0xf815e000-0xf8160000    8192 msix_capability_init+0xae/0x2b0 phys=fa4e6000 ioremap
> 0xf8160000-0xf8165000   20480 pci_iomap+0x81/0x90 phys=fa4e8000 ioremap
> 0xf8166000-0xf8168000    8192 msix_capability_init+0xae/0x2b0 phys=fa4e2000 ioremap
> 0xf8168000-0xf816d000   20480 pci_iomap+0x81/0x90 phys=fa4e4000 ioremap
> 0xf816e000-0xf8170000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xf8170000-0xf8175000   20480 pci_iomap+0x81/0x90 phys=fa4e0000 ioremap
> 0xf8176000-0xf8179000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf817a000-0xf817d000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf817e000-0xf8181000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8182000-0xf8185000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8186000-0xf8189000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf818e000-0xf819d000   61440 module_alloc+0x72/0x80 pages=14 vmalloc
> 0xf819e000-0xf81a1000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81a2000-0xf81a5000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81a6000-0xf81a9000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81aa000-0xf81ad000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81ae000-0xf81b1000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81b2000-0xf81b5000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81b6000-0xf81b9000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81ba000-0xf81bd000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81be000-0xf81c1000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81c2000-0xf81c5000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf81fe000-0xf8201000   12288 e1000e_setup_tx_resources+0x27/0xb0 [e1000e] pages=2 vmalloc
> 0xf8202000-0xf8205000   12288 e1000e_setup_rx_resources+0x2a/0x120 [e1000e] pages=2 vmalloc
> 0xf8206000-0xf8209000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf820c000-0xf820f000   12288 module_alloc+0x72/0x80 pages=2 vmalloc
> 0xf8210000-0xf8213000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8214000-0xf8217000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8218000-0xf821c000   16384 module_alloc+0x72/0x80 pages=3 vmalloc
> 0xf8225000-0xf8228000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8229000-0xf822b000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xf82ab000-0xf82ae000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82af000-0xf82b2000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82b3000-0xf82b6000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82b7000-0xf82ba000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82bb000-0xf82be000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82bf000-0xf82c2000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82c3000-0xf82c6000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82c7000-0xf82ca000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82cb000-0xf82ce000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82cf000-0xf82d2000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82d3000-0xf82d6000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82d7000-0xf82da000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82db000-0xf82de000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82df000-0xf82e2000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82e3000-0xf82e6000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82e7000-0xf82ea000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82eb000-0xf82ee000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf82ef000-0xf82f2000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8332000-0xf8363000  200704 module_alloc+0x72/0x80 pages=48 vmalloc
> 0xf8364000-0xf8367000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8368000-0xf836b000   12288 e1000e_setup_tx_resources+0x27/0xb0 [e1000e] pages=2 vmalloc
> 0xf836c000-0xf836f000   12288 e1000e_setup_rx_resources+0x2a/0x120 [e1000e] pages=2 vmalloc
> 0xf8370000-0xf8373000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8374000-0xf8377000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8378000-0xf837b000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf837c000-0xf837f000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf837f000-0xf83a0000  135168 module_alloc+0x72/0x80 pages=32 vmalloc
> 0xf83a1000-0xf83a4000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf83a5000-0xf83a8000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf83a9000-0xf83ac000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf83ad000-0xf83b0000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf83b0000-0xf83ba000   40960 module_alloc+0x72/0x80 pages=9 vmalloc
> 0xf83bb000-0xf83bd000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xf83be000-0xf83c1000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf83cc000-0xf83e5000  102400 module_alloc+0x72/0x80 pages=24 vmalloc
> 0xf83ec000-0xf83ee000    8192 msix_capability_init+0xae/0x2b0 phys=faedc000 ioremap
> 0xf83f0000-0xf83f2000    8192 msix_capability_init+0xae/0x2b0 phys=fa69c000 ioremap
> 0xf83fa000-0xf83fc000    8192 msix_capability_init+0xae/0x2b0 phys=fafdc000 ioremap
> 0xf83fe000-0xf8400000    8192 msix_capability_init+0xae/0x2b0 phys=fa698000 ioremap
> 0xf8400000-0xf8421000  135168 e1000_probe+0x206/0x9d0 [e1000e] phys=faee0000 ioremap
> 0xf8422000-0xf8424000    8192 msix_capability_init+0xae/0x2b0 phys=fa79c000 ioremap
> 0xf8426000-0xf8428000    8192 msix_capability_init+0xae/0x2b0 phys=fa798000 ioremap
> 0xf842a000-0xf842c000    8192 msix_capability_init+0xae/0x2b0 phys=fa99c000 ioremap
> 0xf842e000-0xf8430000    8192 msix_capability_init+0xae/0x2b0 phys=fa998000 ioremap
> 0xf8432000-0xf8434000    8192 msix_capability_init+0xae/0x2b0 phys=faa9c000 ioremap
> 0xf8436000-0xf8438000    8192 msix_capability_init+0xae/0x2b0 phys=faa98000 ioremap
> 0xf843a000-0xf843c000    8192 msix_capability_init+0xae/0x2b0 phys=fac9c000 ioremap
> 0xf843e000-0xf8440000    8192 msix_capability_init+0xae/0x2b0 phys=fac98000 ioremap
> 0xf8440000-0xf8461000  135168 igb_probe+0x1f5/0x87a [igb] phys=fa6e0000 ioremap
> 0xf8462000-0xf8464000    8192 msix_capability_init+0xae/0x2b0 phys=fad9c000 ioremap
> 0xf8466000-0xf8468000    8192 msix_capability_init+0xae/0x2b0 phys=fad98000 ioremap
> 0xf846c000-0xf846e000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xf8476000-0xf8478000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xf8480000-0xf84a1000  135168 e1000_probe+0x206/0x9d0 [e1000e] phys=fafe0000 ioremap
> 0xf84a9000-0xf84af000   24576 module_alloc+0x72/0x80 pages=5 vmalloc
> 0xf84bc000-0xf84bf000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf84c0000-0xf84e1000  135168 igb_probe+0x1f5/0x87a [igb] phys=fa660000 ioremap
> 0xf84e2000-0xf84e5000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf84e6000-0xf84e9000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf84ea000-0xf84ed000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf84ee000-0xf84f1000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf84f2000-0xf84f5000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf84f6000-0xf84f9000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf84fb000-0xf84ff000   16384 module_alloc+0x72/0x80 pages=3 vmalloc
> 0xf8500000-0xf8521000  135168 igb_probe+0x1f5/0x87a [igb] phys=fa7e0000 ioremap
> 0xf8522000-0xf852f000   53248 module_alloc+0x72/0x80 pages=12 vmalloc
> 0xf8530000-0xf8533000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8534000-0xf8537000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8538000-0xf853b000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf853c000-0xf853f000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8540000-0xf8561000  135168 igb_probe+0x1f5/0x87a [igb] phys=fa760000 ioremap
> 0xf8562000-0xf8565000   12288 module_alloc+0x72/0x80 pages=2 vmalloc
> 0xf8566000-0xf8569000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf856a000-0xf856c000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xf856d000-0xf8570000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8571000-0xf8574000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8575000-0xf8578000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8579000-0xf857c000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf857d000-0xf857f000    8192 swap_cgroup_swapon+0x3c/0x140 pages=1 vmalloc
> 0xf8580000-0xf85a1000  135168 igb_probe+0x1f5/0x87a [igb] phys=fa9e0000 ioremap
> 0xf85a2000-0xf85a5000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf85a6000-0xf85a9000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf85aa000-0xf85ad000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf85af000-0xf85b7000   32768 module_alloc+0x72/0x80 pages=7 vmalloc
> 0xf85b8000-0xf85bb000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf85bc000-0xf85bf000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf85c0000-0xf85e1000  135168 igb_probe+0x1f5/0x87a [igb] phys=fa960000 ioremap
> 0xf85e2000-0xf85ed000   45056 sys_swapon+0x5a6/0xa40 pages=10 vmalloc
> 0xf85ee000-0xf8600000   73728 module_alloc+0x72/0x80 pages=17 vmalloc
> 0xf8600000-0xf8621000  135168 igb_probe+0x1f5/0x87a [igb] phys=faae0000 ioremap
> 0xf8622000-0xf8625000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8626000-0xf8629000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf862a000-0xf862d000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf862e000-0xf8631000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8632000-0xf8635000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8636000-0xf8639000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf863a000-0xf863d000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8640000-0xf8661000  135168 igb_probe+0x1f5/0x87a [igb] phys=faa60000 ioremap
> 0xf8662000-0xf8665000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8666000-0xf8669000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf866a000-0xf866d000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf866e000-0xf8671000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8672000-0xf8675000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8676000-0xf8679000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf867a000-0xf867d000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8680000-0xf86a1000  135168 igb_probe+0x1f5/0x87a [igb] phys=face0000 ioremap
> 0xf86a2000-0xf86a5000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf86a6000-0xf86a9000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf86aa000-0xf86ad000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf86ae000-0xf86b1000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf86b2000-0xf86b5000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf86b6000-0xf86b9000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf86ba000-0xf86bd000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf86c0000-0xf86e1000  135168 igb_probe+0x1f5/0x87a [igb] phys=fac60000 ioremap
> 0xf86e2000-0xf86e5000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8700000-0xf8721000  135168 igb_probe+0x1f5/0x87a [igb] phys=fade0000 ioremap
> 0xf8740000-0xf8761000  135168 igb_probe+0x1f5/0x87a [igb] phys=fad60000 ioremap
> 0xf8762000-0xf8767000   20480 module_alloc+0x72/0x80 pages=4 vmalloc
> 0xf8768000-0xf876b000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf876c000-0xf876f000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8770000-0xf8773000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8774000-0xf8777000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8778000-0xf877b000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf877c000-0xf877f000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8780000-0xf8783000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8784000-0xf8787000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8788000-0xf878b000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf878c000-0xf878f000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8790000-0xf8793000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8793000-0xf87a4000   69632 module_alloc+0x72/0x80 pages=16 vmalloc
> 0xf87a5000-0xf87a8000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf87aa000-0xf87e7000  249856 module_alloc+0x72/0x80 pages=60 vmalloc
> 0xf87f4000-0xf87f7000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf87f8000-0xf87fb000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf87fe000-0xf8877000  495616 module_alloc+0x72/0x80 pages=120 vmalloc
> 0xf8878000-0xf887b000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf887c000-0xf887f000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8880000-0xf8883000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8884000-0xf8887000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8888000-0xf888b000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf888c000-0xf888f000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8890000-0xf8893000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8894000-0xf8897000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf8898000-0xf889b000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf889c000-0xf889f000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf88a0000-0xf88a3000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf88a4000-0xf88a7000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf88a8000-0xf88ab000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf88ac000-0xf88af000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xf88af000-0xf88d5000  155648 module_alloc+0x72/0x80 pages=37 vmalloc
> 0xf8a65000-0xf8a67000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfa026000-0xfa028000    8192 acpi_os_map_memory+0x16/0x1f phys=df79e000 ioremap
> 0xfa02a000-0xfa02c000    8192 acpi_os_map_memory+0x16/0x1f phys=df790000 ioremap
> 0xfa03a000-0xfa03c000    8192 acpi_os_map_memory+0x16/0x1f phys=df790000 ioremap
> 0xfa03e000-0xfa040000    8192 acpi_os_map_memory+0x16/0x1f phys=df790000 ioremap
> 0xfa042000-0xfa044000    8192 acpi_os_map_memory+0x16/0x1f phys=df79e000 ioremap
> 0xfa046000-0xfa048000    8192 acpi_os_map_memory+0x16/0x1f phys=df79a000 ioremap
> 0xfa04a000-0xfa04c000    8192 acpi_os_map_memory+0x16/0x1f phys=df79a000 ioremap
> 0xfa04e000-0xfa050000    8192 acpi_os_map_memory+0x16/0x1f phys=df79a000 ioremap
> 0xfa052000-0xfa054000    8192 acpi_os_map_memory+0x16/0x1f phys=df79a000 ioremap
> 0xfa056000-0xfa058000    8192 acpi_os_map_memory+0x16/0x1f phys=df79a000 ioremap
> 0xfa05e000-0xfa060000    8192 acpi_os_map_memory+0x16/0x1f phys=fed1f000 ioremap
> 0xfa06e000-0xfa070000    8192 usb_hcd_pci_probe+0x17b/0x350 phys=fa4de000 ioremap
> 0xfa072000-0xfa074000    8192 usb_hcd_pci_probe+0x17b/0x350 phys=fa4dc000 ioremap
> 0xfa78c000-0xfa79d000   69632 module_alloc+0x72/0x80 pages=16 vmalloc
> 0xfa805000-0xfa84a000  282624 module_alloc+0x72/0x80 pages=68 vmalloc
> 0xfc000000-0xfc400000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 0xfc400000-0xfc800000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 0xfc93f000-0xfc941000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfc94b000-0xfc94e000   12288 module_alloc+0x72/0x80 pages=2 vmalloc
> 0xfc95e000-0xfc964000   24576 module_alloc+0x72/0x80 pages=5 vmalloc
> 0xfc96f000-0xfc972000   12288 module_alloc+0x72/0x80 pages=2 vmalloc
> 0xfcdcc000-0xfcddf000   77824 module_alloc+0x72/0x80 pages=18 vmalloc
> 0xfce11000-0xfce14000   12288 reqsk_queue_alloc+0x54/0xd0 pages=2 vmalloc
> 0xfce15000-0xfce18000   12288 reqsk_queue_alloc+0x54/0xd0 pages=2 vmalloc
> 0xfce19000-0xfce1c000   12288 reqsk_queue_alloc+0x54/0xd0 pages=2 vmalloc
> 0xfce21000-0xfce23000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfce2a000-0xfce2c000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfce30000-0xfce33000   12288 reqsk_queue_alloc+0x54/0xd0 pages=2 vmalloc
> 0xfce34000-0xfce37000   12288 reqsk_queue_alloc+0x54/0xd0 pages=2 vmalloc
> 0xfce38000-0xfce3b000   12288 reqsk_queue_alloc+0x54/0xd0 pages=2 vmalloc
> 0xfd0e7000-0xfd0ea000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd0eb000-0xfd0ee000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd0ef000-0xfd0f2000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd0f3000-0xfd0f6000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd0f7000-0xfd0fa000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd0fb000-0xfd0fe000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd0ff000-0xfd102000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd103000-0xfd106000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd107000-0xfd10a000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd10b000-0xfd10e000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd10f000-0xfd112000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd113000-0xfd116000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd117000-0xfd11a000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd11b000-0xfd11e000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd11f000-0xfd122000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd123000-0xfd126000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd127000-0xfd12a000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd12b000-0xfd12e000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd12f000-0xfd132000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd133000-0xfd136000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd137000-0xfd13a000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd13b000-0xfd13e000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd13f000-0xfd142000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd143000-0xfd146000   12288 igb_setup_tx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd147000-0xfd14a000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd14b000-0xfd14e000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd14f000-0xfd152000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd153000-0xfd156000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd157000-0xfd15a000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd15b000-0xfd15e000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd15f000-0xfd162000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd163000-0xfd166000   12288 igb_setup_rx_resources+0x27/0x140 [igb] pages=2 vmalloc
> 0xfd1b2000-0xfd1b6000   16384 module_alloc+0x72/0x80 pages=3 vmalloc
> 0xfd1bf000-0xfd1c1000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd1cb000-0xfd1ce000   12288 module_alloc+0x72/0x80 pages=2 vmalloc
> 0xfd1d5000-0xfd1d7000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd1de000-0xfd1e0000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd1e8000-0xfd1ea000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd1f2000-0xfd1f4000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd1fb000-0xfd1fd000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd205000-0xfd207000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd20f000-0xfd211000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd219000-0xfd21b000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd223000-0xfd225000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd22d000-0xfd22f000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd23a000-0xfd23c000    8192 module_alloc+0x72/0x80 pages=1 vmalloc
> 0xfd248000-0xfd24c000   16384 module_alloc+0x72/0x80 pages=3 vmalloc
> 0xfd400000-0xfd800000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 0xfd800000-0xfdc00000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 0xfdc00000-0xfe000000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 0xfe000000-0xfe400000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 0xfe400000-0xfe800000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 0xfe800000-0xfec00000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 0xfec00000-0xff000000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 0xff000000-0xff400000 4194304 pcpu_get_vm_areas+0x0/0x610 vmalloc
> 

Thanks

Your vmalloc space is very fragmented. pcpu_get_vm_areas() wants
hugepages (4MB on your machine, 2MB on mine because I have
CONFIG_HIGHMEM64G=y).

You could :

1) Use a 64 bit kernel ( :) )

or

2) boot parameter vmalloc=256M   to get more room
   (default is 128 Mbytes)

and eventually

select a 2G/2G User/Kernel split to get more LOWMEM, because big vmalloc
windows shrink the LOWMEM zone. (CONFIG_VMSPLIT_2G=y)
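
To put a number on "fragmented", one rough check is to look for the largest hole between consecutive areas in /proc/vmallocinfo. The following is a minimal sketch only: the function name largest_vmalloc_gap is hypothetical, it assumes areas are listed in ascending address order, and it ignores the alignment and relative-offset constraints that the percpu allocator also has to satisfy, so a large hole here does not guarantee that a percpu chunk will fit.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: scan /proc/vmallocinfo-style text and return the
 * largest hole, in bytes, between two consecutive areas.
 * Lines that do not start with "0xSTART-0xEND" are skipped.
 * Assumes areas appear in ascending address order.
 */
unsigned long long largest_vmalloc_gap(FILE *in)
{
    char line[512];
    unsigned long long start, end, prev_end = 0, best = 0;
    int have_prev = 0;

    while (fgets(line, sizeof(line), in)) {
        /* %llx accepts the leading "0x" on both addresses. */
        if (sscanf(line, "%llx-%llx", &start, &end) != 2)
            continue;
        if (have_prev && start > prev_end && start - prev_end > best)
            best = start - prev_end;
        prev_end = end;
        have_prev = 1;
    }
    return best;
}
```

A caller would pass the result of fopen("/proc/vmallocinfo", "r"), which normally requires root, and compare the returned size against the chunk size the allocator is asking for.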

^ permalink raw reply

* Re: OOM when adding ipv6 route:  How to make available more per-cpu memory?
From: Ben Greear @ 2010-11-05 20:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, linux-kernel, Tejun Heo
In-Reply-To: <1288988403.2665.268.camel@edumazet-laptop>

On 11/05/2010 01:20 PM, Eric Dumazet wrote:

> Your vmalloc space is very fragmented. pcpu_get_vm_areas() want
> hugepages (4MB on your machine, 2MB on mine because I have
> CONFIG_HIGHMEM64G=y)
>
> You could :
>
> 1) Use a 64 bit kernel ( :) )

We mostly use 64-bit, but just not for the remastered live cd image.

> or
>
> 2) boot parameter vmalloc=256M   to get more room
>     (default is 128 Mbytes)

We'll try that.

>
> and eventually
>
> select a 2G/2G User/Kernel split to get more LOWMEM, because big vmalloc
> windows shrinks the LOWMEM zone. (CONFIG_VMSPLIT_2G=y)

That sounds promising as well.


I was also wondering if it would make sense to allow one to disable
the snmp stats for ipv6?  I don't think I have any use for those
stats anyway..

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: ping -I eth1 ....
From: Thomas Graf @ 2010-11-05 20:31 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Eric Dumazet, netdev
In-Reply-To: <OFAA7963DE.C3F04675-ONC12577D2.0056F55C-C12577D2.00575EC2@transmode.se>

On Fri, Nov 05, 2010 at 04:54:18PM +0100, Joakim Tjernlund wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/11/05 16:06:54:
> >
> > > Hopefully most of that is legacy or just plain wrong? Unless
> > > someone can say why only test IFF_UP one should consider changing them.
> > >
> >
> > Most of the places are hot path.
> >
> > You don't want to replace one test by four tests.
> >
> > _This_ would be wrong :)
> 
> Wrong is wrong, even if it is in the hot path :)
> Perhaps it is time to define an internal IFF_OPERATIONAL flag
> which is the sum of IFF_UP, IFF_RUNNING etc.? That
> way you still get one test in the hot path and can abstract
> what defines an operational link.

You definitely don't want to have your send() call fail simply because
the carrier was off for a few msec or the routing daemon has put a link
down temporarily. Also, the outgoing interface looked up at routing
decision time is not necessarily the interface used for sending in the end.
The packet may get mangled and rerouted by netfilter or tc on the way.

Personally I'm even ok with the current behaviour of sendto() while the
socket is bound to an interface but if we choose to return an error
if the interface is down we might as well do so based on the operational
status.
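
For reference, the "one test in the hot path" idea quoted above could look roughly like the userspace sketch below. It is an illustration only: LINK_OPERATIONAL_MASK and link_is_operational() are hypothetical names, and the choice of IFF_UP plus IFF_RUNNING as the mask is an assumption, not an existing kernel definition.

```c
#define _DEFAULT_SOURCE   /* make the IFF_* constants visible with glibc */
#include <assert.h>
#include <net/if.h>

/* Hypothetical mask: the flags that together mean "link is usable". */
#define LINK_OPERATIONAL_MASK (IFF_UP | IFF_RUNNING)

/*
 * Returns 1 only if every flag in the mask is set.
 * This stays a single masked compare, however many flags the mask holds.
 */
int link_is_operational(unsigned int flags)
{
    return (flags & LINK_OPERATIONAL_MASK) == LINK_OPERATIONAL_MASK;
}
```

Adding a flag to the mask changes what "operational" means without adding a branch at the call site.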

^ permalink raw reply

* Re: OOM when adding ipv6 route:  How to make available more per-cpu memory?
From: Eric Dumazet @ 2010-11-05 20:53 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, linux-kernel, Tejun Heo
In-Reply-To: <4CD46892.6050408@candelatech.com>

On Friday, 05 November 2010 at 13:26 -0700, Ben Greear wrote:

> 
> I was also wondering if it would make sense to allow one to disable
> the snmp stats for ipv6?  I don't think I have any use for those
> stats anyway..
> 

I agree. IPv6 has per-device SNMP fields, percpu... that's probably not
needed.

We have many SNMP fields that could avoid being percpu, even for ipv4.




^ permalink raw reply

* radvd and auto-ipv6 address regression from 2.6.31 to 2.6.34+
From: Ben Greear @ 2010-11-05 21:24 UTC (permalink / raw)
  To: NetDev


I'm seeing something strange.  I'm running radvd on a VETH interface (veth0 for argument)
with a single global IPv6 address (and a link-local address).

On hacked 2.6.31, this works as I expect:  The veth0 interface does not gain or lose any
IPv6 addresses and the peer VETH port gets an auto-created IPv6 address.

On hacked 2.6.34 and 2.6.36 kernels, however, veth0 gains a new address that appears
to be generated similarly to other IPs associated with auto-creation via radvd.

I have not yet tested intervening kernels or physical interfaces between two machines.

So, the question is:  Is the new behaviour on purpose, or is it a regression bug?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply

* Re: OOM when adding ipv6 route:  How to make available more per-cpu memory?
From: Eric Dumazet @ 2010-11-05 22:11 UTC (permalink / raw)
  To: Ben Greear; +Cc: NetDev, linux-kernel, Tejun Heo
In-Reply-To: <1288988403.2665.268.camel@edumazet-laptop>

On Friday, 05 November 2010 at 21:20 +0100, Eric Dumazet wrote:
> Your vmalloc space is very fragmented. pcpu_get_vm_areas() want
> hugepages (4MB on your machine, 2MB on mine because I have
> CONFIG_HIGHMEM64G=y)

Well, this is wrong. We use normal (4KB) pages, unfortunately.

I have a NUMA machine, with two nodes, so pcpu_get_vm_areas() allocates
two zones, one for each node, with a 'known' offset between them.
Then, 4KB pages are allocated to populate the zone when needed.

# grep pcpu_get_vm_areas /proc/vmallocinfo 
0xffffe8ffa0400000-0xffffe8ffa0600000 2097152 pcpu_get_vm_areas+0x0/0x740 vmalloc
0xffffe8ffffc00000-0xffffe8ffffe00000 2097152 pcpu_get_vm_areas+0x0/0x740 vmalloc

BTW, we don't have the number of pages currently allocated in each
'vmalloc' zone, and/or node information.

Tejun, do you have plans to use hugepages eventually?
(and fall back to 4KB pages, but most percpu data are allocated right
after boot)

Thanks



^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2010-11-05 22:14 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


Here are the bug fixes that have queued up since -rc1,
several "touches netdev queue before device registry"
cases as well as the "net dst" percpu fixup you want
to see merged ASAP.

Please pull, thanks a lot!

The following changes since commit ff8b16d7e15a8ba2a6086645614a483e048e3fbf:

  vmstat: fix offset calculation on void* (2010-11-03 14:39:58 -0400)

are available in the git repository at:
  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Amerigo Wang (1):
      netxen: remove unused firmware exports

André Carvalho de Matos (1):
      caif: Bugfix for socket priority, bindtodev and dbg channel.

David S. Miller (2):
      ibm_newemac: Remove netif_stop_queue() in emac_probe().
      Merge branch 'master' of git://git.kernel.org/.../kaber/nf-2.6

Divy Le Ray (3):
      cxgb3: remove call to stop TX queues at load time.
      cxgb4: remove call to stop TX queues at load time.
      cxgb4vf: remove call to stop TX queues at load time.

Dmitry Artamonow (1):
      USB: gadget: fix ethernet gadget crash in gether_setup

Dr. David Alan Gilbert (1):
      l2tp: kzalloc with swapped params in l2tp_dfs_seq_open

Eric Dumazet (7):
      netfilter: nf_conntrack: allow nf_ct_alloc_hashtable() to get highmem pages
      netfilter: fix nf_conntrack_l4proto_register()
      jme: fix panic on load
      qlcnic: fix panic on load
      atl1 : fix panic on load
      de2104x: fix panic on load
      fib: fib_result_assign() should not change fib refcounts

Herbert Xu (1):
      cls_cgroup: Fix crash on module unload

Jan Engelhardt (1):
      netfilter: ip6_tables: fix information leak to userspace

John Faith (1):
      smsc911x: Set Ethernet EEPROM size to supported device's size

Ming Lei (1):
      usbnet: fix usb_autopm_get_interface failure(v1)

Nelson Elhage (2):
      netlink: Make nlmsg_find_attr take a const nlmsghdr*.
      inet_diag: Make sure we actually run the same bytecode we audited.

Patrick McHardy (1):
      netfilter: nf_nat: fix compiler warning with CONFIG_NF_CT_NETLINK=n

Pavel Emelyanov (2):
      rds: Lost locking in loop connection freeing
      rds: Remove kfreed tcp conn from list

Sjur Brændeland (1):
      caif: SPI-driver bugfix - incorrect padding.

Thomas Graf (1):
      text ematch: check for NULL pointer before destroying textsearch config

Tom Herbert (1):
      net: check queue_index from sock is valid for device

Uwe Kleine-König (1):
      trivial: fix typos concerning "function"

Vasiliy Kulikov (2):
      ipv4: netfilter: arp_tables: fix information leak to userland
      ipv4: netfilter: ip_tables: fix information leak to userland

Xiaotian Feng (1):
      net dst: fix percpu_counter list corruption and poison overwritten

Yaniv Rosner (8):
      bnx2x: Restore appropriate delay during BMAC reset
      bnx2x: Fix waiting for reset complete on BCM848x3 PHYs
      bnx2x: Fix port selection in case of E2
      bnx2x: Clear latch indication on link reset
      bnx2x: Fix resetting BCM8726 PHY during common init
      bnx2x: Do not enable CL37 BAM unless it is explicitly enabled
      bnx2x: Reset 8073 phy during common init
      bnx2x: Update version number

andrew hendry (1):
      memory corruption in X.25 facilities parsing

sjur.brandeland@stericsson.com (1):
      caif: Remove noisy printout when disconnecting caif socket

 drivers/isdn/hisax/isar.c            |    4 +-
 drivers/net/atlx/atl1.c              |    1 -
 drivers/net/bnx2x/bnx2x.h            |    4 +-
 drivers/net/bnx2x/bnx2x_hsi.h        |    9 +++++-
 drivers/net/bnx2x/bnx2x_link.c       |   57 +++++++++++++++++++++++++---------
 drivers/net/caif/caif_spi.c          |   57 +++++++++++++++++++++++++---------
 drivers/net/caif/caif_spi_slave.c    |   13 +++++--
 drivers/net/cxgb3/cxgb3_main.c       |    1 -
 drivers/net/cxgb4/cxgb4_main.c       |    1 -
 drivers/net/cxgb4vf/cxgb4vf_main.c   |    1 -
 drivers/net/ibm_newemac/core.c       |    1 -
 drivers/net/jme.c                    |    4 --
 drivers/net/netxen/netxen_nic_main.c |    3 --
 drivers/net/qlcnic/qlcnic_main.c     |    1 -
 drivers/net/smsc911x.h               |    2 +-
 drivers/net/tulip/de2104x.c          |    1 -
 drivers/net/usb/usbnet.c             |   11 ++++++
 drivers/usb/gadget/u_ether.c         |    1 -
 include/net/caif/caif_dev.h          |    4 +-
 include/net/caif/caif_spi.h          |    2 +
 include/net/caif/cfcnfg.h            |    8 ++--
 include/net/netlink.h                |    2 +-
 net/caif/caif_config_util.c          |   13 ++++++--
 net/caif/caif_dev.c                  |    2 +
 net/caif/caif_socket.c               |   45 +++++++++------------------
 net/caif/cfcnfg.c                    |   17 ++++------
 net/caif/cfctrl.c                    |    3 +-
 net/caif/cfdbgl.c                    |   14 ++++++++
 net/caif/cfrfml.c                    |    2 +-
 net/core/dev.c                       |    2 +-
 net/ipv4/fib_lookup.h                |    5 +--
 net/ipv4/inet_diag.c                 |   27 +++++++++------
 net/ipv4/netfilter/arp_tables.c      |    1 +
 net/ipv4/netfilter/ip_tables.c       |    1 +
 net/ipv4/netfilter/nf_nat_core.c     |   40 ++++++++++++------------
 net/ipv6/netfilter/ip6_tables.c      |    1 +
 net/ipv6/route.c                     |    2 +
 net/l2tp/l2tp_debugfs.c              |    2 +-
 net/netfilter/nf_conntrack_core.c    |    3 +-
 net/netfilter/nf_conntrack_proto.c   |    6 +++
 net/rds/loop.c                       |    4 ++
 net/rds/tcp.c                        |    6 +++
 net/sched/cls_cgroup.c               |    2 -
 net/sched/em_text.c                  |    3 +-
 net/x25/x25_facilities.c             |    8 ++--
 net/x25/x25_in.c                     |    2 +
 46 files changed, 246 insertions(+), 153 deletions(-)

^ permalink raw reply

* RE: [PATCH 1/2] ixgb: Don't check for vlan group on transmit.
From: Duyck, Alexander H @ 2010-11-05 22:30 UTC (permalink / raw)
  To: Kirsher, Jeffrey T, Jesse Gross
  Cc: David Miller, netdev@vger.kernel.org, Brandeburg, Jesse
In-Reply-To: <1288987609.3091.16.camel@jtkirshe-MOBL1>



>-----Original Message-----
>From: Kirsher, Jeffrey T
>Sent: Friday, November 05, 2010 1:07 PM
>To: Jesse Gross
>Cc: David Miller; netdev@vger.kernel.org; Brandeburg, Jesse; Duyck,
>Alexander H
>Subject: Re: [PATCH 1/2] ixgb: Don't check for vlan group on transmit.
>
>On Fri, 2010-11-05 at 12:56 -0700, Jesse Gross wrote:
>> On Fri, Nov 5, 2010 at 12:11 PM, Jeff Kirsher
>> <jeffrey.t.kirsher@intel.com> wrote:
>> > On Sat, 2010-10-30 at 11:49 -0700, Jesse Gross wrote:
>> >> On transmit, the ixgb driver will only use vlan acceleration if a
>> >> vlan group is configured.  This can lead to tags getting dropped
>> >> when bridging because the networking core assumes that a driver
>> >> that claims vlan acceleration support can do it at all times.  This
>> >> change should have been part of commit eab6d18d "vlan: Don't check for
>> >> vlan group before vlan_tx_tag_present." but was missed.
>> >>
>> >> Signed-off-by: Jesse Gross <jesse@nicira.com>
>> >> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>> >> CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
>> >> CC: PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com>
>> >> ---
>> >>  drivers/net/ixgb/ixgb_main.c |    2 +-
>> >>  1 files changed, 1 insertions(+), 1 deletions(-)
>> >>
>> >> diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
>> >> index caa8192..d18194e 100644
>> >> --- a/drivers/net/ixgb/ixgb_main.c
>> >> +++ b/drivers/net/ixgb/ixgb_main.c
>> >> @@ -1498,7 +1498,7 @@ ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
>> >>                       DESC_NEEDED)))
>> >>               return NETDEV_TX_BUSY;
>> >>
>> >> -     if (adapter->vlgrp && vlan_tx_tag_present(skb)) {
>> >> +     if (vlan_tx_tag_present(skb)) {
>> >>               tx_flags |= IXGB_TX_FLAGS_VLAN;
>> >>               vlan_id = vlan_tx_tag_get(skb);
>> >>       }
>> >
>> > After further review, NAK because this will cause a bug.  With
>this
>> > patch it would be possible to overrun the buffers, so the correct
>fix is
>> > to increase max_frame_size by VLAN_TAG_SIZE in
>ixgb/igb_change_mtu.
>>
>> Hmm, I didn't see any other place where it made changes to the
>> handling of packets on transmit if a vlan group is configured.
>Maybe
>> the buffer is extended when a group is registered and stripping is
>> enabled?
>>
>> In any case, you might want to check the other Intel drivers for
>> similar problems.  I did a pass and made a mass conversion of this
>> type a little while ago.  Those changes have already been merged, I
>> just missed this one by accident.
>
>I will get with Alex and review the other Intel drivers, thanks Jesse.

Just to make things clear.  The ixgb patch is fine.  There isn't anything wrong with it.

The patch with the bug is the other patch, "2/2 igb: Don't depend on VLAN group for receive size".  The problem is that it was updating the RLPML register but not updating the buffer sizes, so there were a few cases where we could receive a buffer larger than the SKB head room.  The bug itself probably won't come up very often since there are only a couple of very specific MTU sizes where it will be an issue.

The quick fix for your patch is to move the addition of VLAN_TAG_SIZE to the max_frame in igb_change_mtu instead of in the set_rlpml call.  Otherwise I will see about submitting an updated patch in the next few days.

Thanks,

Alex
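
To make the failure mode described above concrete, here is a minimal user-space sketch. It is not the igb driver code: the bucket sizes and the helper names pick_rx_buffer(), overrun_possible_late_tag() and overrun_possible_early_tag() are assumptions made up for illustration. It only models the ordering issue, namely sizing the receive buffer from the untagged frame and then adding VLAN_TAG_SIZE when programming the hardware limit.

```c
#define ETH_HLEN       14  /* Ethernet header length */
#define ETH_FCS_LEN     4  /* frame check sequence length */
#define VLAN_TAG_SIZE   4  /* 802.1Q tag length */

/* Hypothetical buffer buckets, chosen only for illustration. */
static const int rx_buckets[] = { 256, 512, 1024, 2048 };
#define N_BUCKETS ((int)(sizeof(rx_buckets) / sizeof(rx_buckets[0])))

/* Smallest bucket that holds max_frame bytes, or -1 if none does. */
static int pick_rx_buffer(int max_frame)
{
	for (int i = 0; i < N_BUCKETS; i++)
		if (max_frame <= rx_buckets[i])
			return rx_buckets[i];
	return -1;
}

/* Problematic ordering: the buffer is sized from the untagged frame and
 * the tag is added only to the hardware receive limit. */
static int overrun_possible_late_tag(int mtu)
{
	int max_frame = mtu + ETH_HLEN + ETH_FCS_LEN;
	int buf = pick_rx_buffer(max_frame);
	int hw_limit = max_frame + VLAN_TAG_SIZE;

	return buf > 0 && hw_limit > buf;
}

/* Suggested ordering: the tag is part of max_frame before anything
 * else is derived from it. */
static int overrun_possible_early_tag(int mtu)
{
	int max_frame = mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_TAG_SIZE;
	int buf = pick_rx_buffer(max_frame);
	int hw_limit = max_frame;

	return buf > 0 && hw_limit > buf;
}
```

With these assumed buckets, only MTUs 1003 through 1006 trip the late-tag check at the 1024-byte boundary, which matches the remark that only a few specific MTU sizes are affected.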

^ permalink raw reply

* Sky2 - problems with VLANs - kernel 2.6.36
From: David @ 2010-11-05 23:06 UTC (permalink / raw)
  To: netdev, Linux Kernel Mailing List

I've just installed a Lycom dual port gigabit ethernet card, picked up
as follows :-

03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8062 PCI-E
IPMI Gigabit Ethernet Controller (rev 14)
    Subsystem: Marvell Technology Group Ltd. Device 6222
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 41
    Region 0: Memory at fe8fc000 (64-bit, non-prefetchable) [size=16K]
    Region 2: I/O ports at c800 [size=256]
    Expansion ROM at fe8c0000 [disabled] [size=128K]
    Capabilities: <access denied>
    Kernel driver in use: sky2
    Kernel modules: sky2

I'm having a problem with VLANs. Outgoing packets are tagged correctly
and devices on the VLAN are responding. Unfortunately all of the
response packets stay on the raw device and are not allocated to the VLAN.

I've done some investigation (printks etc.), and have found that neither
of the following cases in sky2_status_intr() is being triggered...

                case OP_RXVLAN:
                        printk("RXVLAN, length=%u, status=%u\n", length, status);
                        sky2->rx_tag = length;
                        break;

                case OP_RXCHKSVLAN:
                        printk("RXCHKSVLAN, length=%u, status=%u\n", length, status);
                        sky2->rx_tag = length;
                        /* fall through */

... however the status when calling sky2_skb_rx() does have GMR_FS_VLAN
set, it's just we haven't been able to find out which VLAN the packet
comes from (and sky2->rx_tag is zero). Does anyone have any suggestions
as to how I proceed from here? I'm happy to test patches etc.

Cheers
David
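
The symptom reported above can be reproduced in a small user-space model. This is only a sketch under assumptions: the opcode and flag values, the struct layouts and walk_status() are hypothetical, and the real handler does more (for example checksum handling after OP_RXCHKSVLAN). It shows that if no VLAN status entry precedes the frame entry, the frame is flagged as tagged but the latched tag stays zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode and flag values, for illustration only. */
enum { OP_RXSTAT = 1, OP_RXVLAN = 2, OP_RXCHKSVLAN = 3 };
#define GMR_FS_VLAN 0x1u

struct status_entry {
	uint8_t  opcode;
	uint16_t length;   /* carries the tag for the VLAN opcodes */
	uint32_t status;
};

struct rx_result {
	int      tagged;   /* frame status says a VLAN tag was seen */
	uint16_t tag;      /* tag latched from a preceding VLAN entry */
};

/* Walk the entries and report each received frame in out[];
 * returns the number of frames reported. */
static size_t walk_status(const struct status_entry *e, size_t n,
			  struct rx_result *out, size_t max_out)
{
	uint16_t rx_tag = 0;
	size_t frames = 0;

	for (size_t i = 0; i < n && frames < max_out; i++) {
		switch (e[i].opcode) {
		case OP_RXVLAN:
		case OP_RXCHKSVLAN:
			rx_tag = e[i].length;	/* latch tag for next frame */
			break;
		case OP_RXSTAT:
			out[frames].tagged = (e[i].status & GMR_FS_VLAN) != 0;
			out[frames].tag = rx_tag;
			frames++;
			rx_tag = 0;	/* model choice: one tag per frame */
			break;
		}
	}
	return frames;
}
```

Feeding the model a lone OP_RXSTAT entry with GMR_FS_VLAN set yields tagged = 1 and tag = 0, which is the state described in the report.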

^ permalink raw reply

* [PATCH 0/7] Convert sprintf_symbol uses to %p[Ss]
From: Joe Perches @ 2010-11-05 23:12 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: linux-arm-kernel, linux-kernel, cluster-devel, linux-mm,
	linux-nfs, netdev

Remove unnecessary declarations of temporary buffers.
Use %pS or %ps as appropriate.
Minor reformatting in a couple of places.

Compiled, but otherwise untested.

Joe Perches (7):
  arch/arm/kernel/traps.c: Convert sprintf_symbol to %pS
  arch/x86/kernel/pci-iommu_table.c: Convert sprintf_symbol to %pS
  fs/gfs2/glock.c: Convert sprintf_symbol to %pS
  fs/proc/base.c kernel/latencytop.c: Convert sprintf_symbol to %ps
  kernel/lockdep_proc.c: Convert sprintf_symbol to %pS
  mm: Convert sprintf_symbol to %pS
  net/sunrpc/clnt.c: Convert sprintf_symbol to %ps

 arch/arm/kernel/traps.c           |    5 +----
 arch/x86/kernel/pci-iommu_table.c |   18 ++++--------------
 fs/gfs2/glock.c                   |   15 +++++++--------
 fs/proc/base.c                    |   22 ++++++++--------------
 kernel/latencytop.c               |   23 +++++++++--------------
 kernel/lockdep_proc.c             |   16 ++++++----------
 mm/slub.c                         |   11 ++++-------
 mm/vmalloc.c                      |    9 ++-------
 net/sunrpc/clnt.c                 |   12 ++----------
 9 files changed, 43 insertions(+), 88 deletions(-)

-- 
1.7.3.2.146.gca209


^ permalink raw reply

* [PATCH 7/7] net/sunrpc/clnt.c: Convert sprintf_symbol to %ps
From: Joe Perches @ 2010-11-05 23:12 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: J. Bruce Fields, Neil Brown, Trond Myklebust, David S. Miller,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1288998760-11775-1-git-send-email-joe-6d6DIl74uiNBDgjK7y7TUQ@public.gmane.org>

Signed-off-by: Joe Perches <joe-6d6DIl74uiNBDgjK7y7TUQ@public.gmane.org>
---
 net/sunrpc/clnt.c |   12 ++----------
 1 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 9dab957..0c3d395 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1824,23 +1824,15 @@ static void rpc_show_task(const struct rpc_clnt *clnt,
 			  const struct rpc_task *task)
 {
 	const char *rpc_waitq = "none";
-	char *p, action[KSYM_SYMBOL_LEN];
 
 	if (RPC_IS_QUEUED(task))
 		rpc_waitq = rpc_qname(task->tk_waitqueue);
 
-	/* map tk_action pointer to a function name; then trim off
-	 * the "+0x0 [sunrpc]" */
-	sprint_symbol(action, (unsigned long)task->tk_action);
-	p = strchr(action, '+');
-	if (p)
-		*p = '\0';
-
-	printk(KERN_INFO "%5u %04x %6d %8p %8p %8ld %8p %sv%u %s a:%s q:%s\n",
+	printk(KERN_INFO "%5u %04x %6d %8p %8p %8ld %8p %sv%u %s a:%ps q:%s\n",
 		task->tk_pid, task->tk_flags, task->tk_status,
 		clnt, task->tk_rqstp, task->tk_timeout, task->tk_ops,
 		clnt->cl_protname, clnt->cl_vers, rpc_proc_name(task),
-		action, rpc_waitq);
+		task->tk_action, rpc_waitq);
 }
 
 void rpc_show_tasks(void)
-- 
1.7.3.2.146.gca209
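
For readers unfamiliar with what the removed lines did, here is a user-space sketch of the same trimming step. The helper name trim_symbol_offset() and the sample string are made up for illustration. The kernel's %ps format prints the symbol name without the offset, which is why the patch can drop the temporary buffer and the manual truncation.

```c
#include <string.h>

/* Truncate "name+0x0/0x10 [mod]" to "name" in place; returns buf.
 * A string without '+' is left unchanged. */
static char *trim_symbol_offset(char *buf)
{
	char *p = strchr(buf, '+');

	if (p)
		*p = '\0';
	return buf;
}
```

The buffer must be writable, so pass a char array rather than a string literal.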


^ permalink raw reply related

* Re: radvd and auto-ipv6 address regression from 2.6.31 to 2.6.34+
From: Ben Greear @ 2010-11-05 23:13 UTC (permalink / raw)
  To: NetDev
In-Reply-To: <4CD47622.5040507@candelatech.com>

On 11/05/2010 02:24 PM, Ben Greear wrote:
>
> I'm seeing something strange. I'm running radvd on a VETH interface
> (veth0 for argument)
> with a single global IPv6 address (and a link-local address).
>
> On hacked 2.6.31, this works as I expect: The veth0 interface does not
> gain or lose any
> IPv6 addresses and the peer VETH port gets an auto-created IPv6 address.
>
> On hacked 2.6.34 and 2.6.36 kernels, however, the veth0 gains a new
> address that appears
> to be generated similar to other IPs associated with auto-creation via
> radvd.
>
> I have not yet tested intervening kernels or physical interfaces between
> two machines.
>
> So, the question is: Is the new behaviour on purpose, or is it a
> regression bug?

Actually, this doesn't seem to work for 2.6.31 either, so I guess it
isn't a regression.

Is it expected behaviour, however?

Thanks,
Ben

>
> Thanks,
> Ben
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply

* Re: [RFC v2] ipvs: allow transmit of GRO aggregated skbs
From: Simon Horman @ 2010-11-05 23:54 UTC (permalink / raw)
  To: lvs-devel, netdev; +Cc: Julian Anastasov
In-Reply-To: <20101105135222.GA8714@verge.net.au>

This is a first attempt at allowing LVS to transmit
skbs of greater than MTU length that have been aggregated by GRO.

I have lightly tested the ip_vs_dr_xmit() portion of this patch and
although it seems to work I am unsure that netif_needs_gso() is the correct
test to use.

Signed-off-by: Simon Horman <horms@verge.net.au>

--- 

* LRO is still an outstanding issue, but as it's deprecated in favour
  of GRO perhaps it doesn't need to be solved.

* v1
  - Based on 2.6.35

* v2
  - Rebase on current nf-next-2.6 tree (~2.6.37-rc1)
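
The change made to each IPv4 MTU check in the diff below can be summarised as a predicate. This is a sketch, not kernel code: struct pkt_view and both function names are hypothetical, and the gso field merely stands in for whatever netif_needs_gso() returns, which the description above itself flags as possibly not the right test.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the fields the check reads. */
struct pkt_view {
	unsigned int len;   /* skb->len */
	bool df;            /* IP_DF set in frag_off */
	bool gso;           /* stand-in for netif_needs_gso(dev, skb) */
};

/* Before the patch: an oversized packet with DF set always triggers
 * the ICMP "fragmentation needed" path. */
static bool frag_needed_old(const struct pkt_view *p, unsigned int mtu)
{
	return p->len > mtu && p->df;
}

/* After the patch: packets for which the GSO test is true are exempt,
 * since they are expected to be segmented before hitting the wire. */
static bool frag_needed_new(const struct pkt_view *p, unsigned int mtu)
{
	return p->len > mtu && p->df && !p->gso;
}
```

The IPv6 variants in the diff follow the same shape without the DF term.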

Index: lvs-test-2.6/net/netfilter/ipvs/ip_vs_xmit.c
===================================================================
--- lvs-test-2.6.orig/net/netfilter/ipvs/ip_vs_xmit.c	2010-10-30 11:43:37.000000000 +0900
+++ lvs-test-2.6/net/netfilter/ipvs/ip_vs_xmit.c	2010-11-06 08:09:17.000000000 +0900
@@ -408,7 +408,8 @@ ip_vs_bypass_xmit(struct sk_buff *skb, s
 
 	/* MTU checking */
 	mtu = dst_mtu(&rt->dst);
-	if ((skb->len > mtu) && (iph->frag_off & htons(IP_DF))) {
+	if ((skb->len > mtu) && (iph->frag_off & htons(IP_DF)) &&
+	    !netif_needs_gso(rt->dst.dev, skb)) {
 		ip_rt_put(rt);
 		icmp_send(skb, ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED, htonl(mtu));
 		IP_VS_DBG_RL("%s(): frag needed\n", __func__);
@@ -461,7 +462,7 @@ ip_vs_bypass_xmit_v6(struct sk_buff *skb
 
 	/* MTU checking */
 	mtu = dst_mtu(&rt->dst);
-	if (skb->len > mtu) {
+	if (skb->len > mtu && !netif_needs_gso(rt->dst.dev, skb)) {
 		if (!skb->dev) {
 			struct net *net = dev_net(skb_dst(skb)->dev);
 
@@ -560,7 +561,8 @@ ip_vs_nat_xmit(struct sk_buff *skb, stru
 
 	/* MTU checking */
 	mtu = dst_mtu(&rt->dst);
-	if ((skb->len > mtu) && (iph->frag_off & htons(IP_DF))) {
+	if ((skb->len > mtu) && (iph->frag_off & htons(IP_DF)) &&
+	    !netif_needs_gso(rt->dst.dev, skb)) {
 		icmp_send(skb, ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED, htonl(mtu));
 		IP_VS_DBG_RL_PKT(0, AF_INET, pp, skb, 0,
 				 "ip_vs_nat_xmit(): frag needed for");
@@ -675,7 +677,7 @@ ip_vs_nat_xmit_v6(struct sk_buff *skb, s
 
 	/* MTU checking */
 	mtu = dst_mtu(&rt->dst);
-	if (skb->len > mtu) {
+	if (skb->len > mtu && !netif_needs_gso(rt->dst.dev, skb)) {
 		if (!skb->dev) {
 			struct net *net = dev_net(skb_dst(skb)->dev);
 
@@ -790,8 +792,9 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, s
 
 	df |= (old_iph->frag_off & htons(IP_DF));
 
-	if ((old_iph->frag_off & htons(IP_DF))
-	    && mtu < ntohs(old_iph->tot_len)) {
+	if ((old_iph->frag_off & htons(IP_DF) &&
+	    mtu < ntohs(old_iph->tot_len) &&
+	    !netif_needs_gso(rt->dst.dev, skb))) {
 		icmp_send(skb, ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED, htonl(mtu));
 		IP_VS_DBG_RL("%s(): frag needed\n", __func__);
 		goto tx_error_put;
@@ -903,7 +906,8 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb
 	if (skb_dst(skb))
 		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), mtu);
 
-	if (mtu < ntohs(old_iph->payload_len) + sizeof(struct ipv6hdr)) {
+	if (mtu < ntohs(old_iph->payload_len) + sizeof(struct ipv6hdr) &&
+	    !netif_needs_gso(rt->dst.dev, skb)) {
 		if (!skb->dev) {
 			struct net *net = dev_net(skb_dst(skb)->dev);
 
@@ -1008,7 +1012,8 @@ ip_vs_dr_xmit(struct sk_buff *skb, struc
 
 	/* MTU checking */
 	mtu = dst_mtu(&rt->dst);
-	if ((iph->frag_off & htons(IP_DF)) && skb->len > mtu) {
+	if ((iph->frag_off & htons(IP_DF)) && skb->len > mtu &&
+	    !netif_needs_gso(rt->dst.dev, skb)) {
 		icmp_send(skb, ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED, htonl(mtu));
 		ip_rt_put(rt);
 		IP_VS_DBG_RL("%s(): frag needed\n", __func__);
@@ -1174,7 +1179,8 @@ ip_vs_icmp_xmit(struct sk_buff *skb, str
 
 	/* MTU checking */
 	mtu = dst_mtu(&rt->dst);
-	if ((skb->len > mtu) && (ip_hdr(skb)->frag_off & htons(IP_DF))) {
+	if ((skb->len > mtu) && (ip_hdr(skb)->frag_off & htons(IP_DF)) &&
+	    !netif_needs_gso(rt->dst.dev, skb)) {
 		icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
 		IP_VS_DBG_RL("%s(): frag needed\n", __func__);
 		goto tx_error_put;
@@ -1288,7 +1294,7 @@ ip_vs_icmp_xmit_v6(struct sk_buff *skb,
 
 	/* MTU checking */
 	mtu = dst_mtu(&rt->dst);
-	if (skb->len > mtu) {
+	if (skb->len > mtu && !netif_needs_gso(rt->dst.dev, skb)) {
 		if (!skb->dev) {
 			struct net *net = dev_net(skb_dst(skb)->dev);
 

^ permalink raw reply

* Re: OOM when adding ipv6 route:  How to make available more per-cpu memory?
From: Ben Greear @ 2010-11-06  0:07 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev, linux-kernel, Tejun Heo
In-Reply-To: <1288995103.2665.653.camel@edumazet-laptop>

On 11/05/2010 03:11 PM, Eric Dumazet wrote:
> On Friday, 5 November 2010 at 21:20 +0100, Eric Dumazet wrote:
>> Your vmalloc space is very fragmented. pcpu_get_vm_areas() wants
>> hugepages (4MB on your machine, 2MB on mine because I have
>> CONFIG_HIGHMEM64G=y)
>
> Well, this is wrong. We use normal (4KB) pages, unfortunately.
>
> I have a NUMA machine, with two nodes, so pcpu_get_vm_areas() allocates
> two zones, one for each node, with a 'known' offset between them.
> Then, 4KB pages are allocated to populate the zone when needed.
>
> # grep pcpu_get_vm_areas /proc/vmallocinfo
> 0xffffe8ffa0400000-0xffffe8ffa0600000 2097152 pcpu_get_vm_areas+0x0/0x740 vmalloc
> 0xffffe8ffffc00000-0xffffe8ffffe00000 2097152 pcpu_get_vm_areas+0x0/0x740 vmalloc
>
> BTW, we don't have the number of pages currently allocated in each
> 'vmalloc' zone, and/or node information.
>
> Tejun, do you have plans to use hugepages eventually ?
> (and fallback to 4KB pages, but most percpu data are allocated right
> after boot)

We just tried creating 1000 macvlans with IPv6 addrs on a 64-bit machine
with 12GB RAM.  Only around 520 interfaces properly set their IPs, and
again there are errors about out-of-memory from 'ip', but no obvious
splats in dmesg.

'top' shows 10G or so free.

It will take some time to figure out what exactly is returning
the ENOMEM....

Thanks,
Ben

>
> Thanks
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
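
When repeating the /proc/vmallocinfo check quoted above on a machine with many interfaces, it may help to total the per-cpu areas programmatically. The following is a minimal sketch that parses one line in the format shown in the quoted output; the helper name vmalloc_line_size() is an assumption, and the parser only handles lines of that shape.

```c
#include <stdio.h>
#include <string.h>

/* Parse one /proc/vmallocinfo line of the form
 *   0xSTART-0xEND SIZE CALLER+0x.../0x... ...
 * If the caller symbol starts with `caller`, store SIZE in *size and
 * return 1; otherwise return 0 and leave *size untouched. */
static int vmalloc_line_size(const char *line, const char *caller,
			     unsigned long *size)
{
	unsigned long long start, end;
	unsigned long sz;
	char sym[128];

	if (sscanf(line, "0x%llx-0x%llx %lu %127s",
		   &start, &end, &sz, sym) != 4)
		return 0;
	if (strncmp(sym, caller, strlen(caller)) != 0)
		return 0;
	*size = sz;
	return 1;
}
```

A caller would read the file line by line with fgets() and sum the sizes for "pcpu_get_vm_areas"; the "#" prompt in the quoted output suggests the file was read as root.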

^ permalink raw reply

