Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH v2 4/7] fsl_pq_mdio: Add Suport for etsec2.0 devices.
From: David Miller @ 2009-10-28  9:43 UTC (permalink / raw)
  To: sandeep.kumar; +Cc: netdev
In-Reply-To: <12566625182528-git-send-email-sandeep.kumar@freescale.com>

From: Sandeep Gopalpet <sandeep.kumar@freescale.com>
Date: Tue, 27 Oct 2009 22:25:18 +0530

> This patch adds mdio support for etsec2.0 devices.
> 
> Modified the fsl_pq_mdio structure to include the new mdio
> members.
> 
> Signed-off-by: Sandeep Gopalpet <sandeep.kumar@freescale.com>

This is the third time you've submitted this patch, and for
the third time it DOES NOT apply to net-next-2.6 at all when
I try to apply this gianfar patch series.

You must be patching against another tree that has some changes
that conflict with this one.

Sort this out before submitting this again.

If you submit once more this same series, and it doesn't apply
properly to net-next-2.6, I will flat our ignore your submissions
for a week or so.

You are wasting that much of my time by doing this over and over.

Get your act together.

^ permalink raw reply

* Re: [PATCH -next] netxen: fix builds for SYSFS=n or MODULES=n
From: David Miller @ 2009-10-28  9:36 UTC (permalink / raw)
  To: randy.dunlap; +Cc: sfr, netdev, linux-next, linux-kernel, dhananjay
In-Reply-To: <20091026150945.3f35a811.randy.dunlap@oracle.com>

From: Randy Dunlap <randy.dunlap@oracle.com>
Date: Mon, 26 Oct 2009 15:09:45 -0700

> From: Randy Dunlap <randy.dunlap@oracle.com>
> 
> When CONFIG_MODULES=n:
> drivers/net/netxen/netxen_nic_main.c:2751: error: dereferencing pointer to incomplete type
> drivers/net/netxen/netxen_nic_main.c:2764: error: dereferencing pointer to incomplete type
> 
> Also needs addition of <linux/sysfs.h> for sysfs function prototypes or
> stubs when CONFIG_SYSFS=n.
> 
> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>

Applied, thanks Randy.

^ permalink raw reply

* Re: [PATCH -next] netxen: fix builds for SYSFS=n or MODULES=n
From: David Miller @ 2009-10-28  9:34 UTC (permalink / raw)
  To: dhananjay.phadke; +Cc: randy.dunlap, sfr, netdev, linux-next, linux-kernel
In-Reply-To: <7608421F3572AB4292BB2532AE89D5658B0B8903CB@AVEXMB1.qlogic.org>

From: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Date: Tue, 27 Oct 2009 15:09:24 -0700

> Sorry, seems like I am keeping you busy with the build errors.
> Thanks for fixing it again (may be I need to setup all different build configs).
> 
> Acked-by: Dhananjay Phadke <dhananjay@netxen.com>

Please adhere to the following netiquette when replying to
patches:

1) DO NOT TOP POST.

   You can feel free to continue doing this, but I am warning you
   that soon a day will come when replying to postings in a top-post
   manner will have your posting completely rejected from the list.

   This is absolutely the largest pet peeve amongst kernel developers.
   Quote only the relevant parts of the message, nothing more, and
   then afterwards you place your comments.  Do not add comments
   before the quoted material.  People need to read the quoted part to
   gain the context necessary to understand your reply, not the other
   way around.

2) Do not quote the patch when just adding your ACK or signoff.

   That makes the patch appear twice in patchwork, making more work
   for me and confusion for the patch submitted.

Thank you.

^ permalink raw reply

* Re: [PATCH] net: fold network name hash (v2)
From: David Miller @ 2009-10-28  9:28 UTC (permalink / raw)
  To: eric.dumazet
  Cc: shemminger, netdev, linux-kernel, akpm, torvalds, opurdila, viro
In-Reply-To: <4AE7DF8E.3020607@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 28 Oct 2009 07:07:10 +0100

> You should therefore use hash_32(hash, NETDEV_HASHBITS),
> not hash_long() that maps to hash_64() on 64 bit arches, which is
> slower and certainly not any better with a 32bits input.

Agreed.

^ permalink raw reply

* Re: [net-next-2.6 PATCH 0/6] net: Speedup netdevice unregisters
From: David Miller @ 2009-10-28  9:23 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <4AE7279B.7070300@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 27 Oct 2009 18:02:19 +0100

> netdevice unregisters are serialized and call synchronize_{net|rcu}()
> three times per device.
> 
> This means it can take a long time to remove a device, if many virtual
> devices are attached. (vlan, macvlan, tunnels, ...)
>
> This patch series partially solve the problem by batching several devices in a list,
> so that two synchronize_net() calls can be factorized.

Looks really good Eric.

> PATCH 1/6 : net: Introduce unregister_netdevice_queue()
> PATCH 2/6 : net: Introduce unregister_netdevice_many()
> PATCH 3/6 : net: Add a list_head parameter to dellink() method
> PATCH 4/6 : vlan: Optimize multiple unregistration
> PATCH 5/6 : ipip: Optimize multiple unregistration
> PATCH 6/6 : gre: Optimize multiple unregistration
> 

All applied to net-next-2.6, thanks!

^ permalink raw reply

* [PATCH 4/7] mlx4: Use bitmap_find_next_zero_area
From: Akinobu Mita @ 2009-10-28  8:43 UTC (permalink / raw)
  To: linux-kernel, akpm; +Cc: Akinobu Mita, Yevgeny Petrilin, netdev
In-Reply-To: <1256719397-4258-1-git-send-email-akinobu.mita@gmail.com>

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Roland Dreier <rolandd@cisco.com>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Cc: netdev@vger.kernel.org
---
 drivers/net/mlx4/alloc.c |   37 ++++---------------------------------
 1 files changed, 4 insertions(+), 33 deletions(-)

diff --git a/drivers/net/mlx4/alloc.c b/drivers/net/mlx4/alloc.c
index ad95d5f..8c85156 100644
--- a/drivers/net/mlx4/alloc.c
+++ b/drivers/net/mlx4/alloc.c
@@ -72,35 +72,6 @@ void mlx4_bitmap_free(struct mlx4_bitmap *bitmap, u32 obj)
 	mlx4_bitmap_free_range(bitmap, obj, 1);
 }
 
-static unsigned long find_aligned_range(unsigned long *bitmap,
-					u32 start, u32 nbits,
-					int len, int align)
-{
-	unsigned long end, i;
-
-again:
-	start = ALIGN(start, align);
-
-	while ((start < nbits) && test_bit(start, bitmap))
-		start += align;
-
-	if (start >= nbits)
-		return -1;
-
-	end = start+len;
-	if (end > nbits)
-		return -1;
-
-	for (i = start + 1; i < end; i++) {
-		if (test_bit(i, bitmap)) {
-			start = i + 1;
-			goto again;
-		}
-	}
-
-	return start;
-}
-
 u32 mlx4_bitmap_alloc_range(struct mlx4_bitmap *bitmap, int cnt, int align)
 {
 	u32 obj, i;
@@ -110,13 +81,13 @@ u32 mlx4_bitmap_alloc_range(struct mlx4_bitmap *bitmap, int cnt, int align)
 
 	spin_lock(&bitmap->lock);
 
-	obj = find_aligned_range(bitmap->table, bitmap->last,
-				 bitmap->max, cnt, align);
+	obj = bitmap_find_next_zero_area(bitmap->table, bitmap->max,
+				bitmap->last, cnt, align - 1);
 	if (obj >= bitmap->max) {
 		bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top)
 				& bitmap->mask;
-		obj = find_aligned_range(bitmap->table, 0, bitmap->max,
-					 cnt, align);
+		obj = bitmap_find_next_zero_area(bitmap->table, bitmap->max,
+						0, cnt, align - 1);
 	}
 
 	if (obj < bitmap->max) {
-- 
1.6.5.1

^ permalink raw reply related

* [PATCH 1/7] bitmap: Introduce bitmap_set, bitmap_clear, bitmap_find_next_zero_area
From: Akinobu Mita @ 2009-10-28  8:43 UTC (permalink / raw)
  To: linux-kernel, akpm
  Cc: Akinobu Mita, FUJITA Tomonori, David S. Miller, sparclinux,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Lothar Wassmann, linux-usb, Roland Dreier,
	Yevgeny Petrilin, netdev, Tony Luck, Fenghua Yu, linux-ia64,
	linux-altix, Joerg Roedel

This introduces new bitmap functions:

bitmap_set: Set specified bit area
bitmap_clear: Clear specified bit area
bitmap_find_next_zero_area: Find free bit area

These are mostly stolen from iommu helper. The differences are:

- Use find_next_bit instead of doing test_bit for each bit

- Rewrite bitmap_set and bitmap_clear

  Instead of setting or clearing for each bit.

- Check the last bit of the limit

  iommu-helper doesn't want to find such area

- The return value if there is no zero area

  find_next_zero_area in iommu helper: returns -1
  bitmap_find_next_zero_area: return >= bitmap size

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Lothar Wassmann <LW@KARO-electronics.de>
Cc: linux-usb@vger.kernel.org
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Cc: netdev@vger.kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: linux-altix@sgi.com
Cc: Joerg Roedel <joerg.roedel@amd.com>
---
 include/linux/bitmap.h |   11 ++++++
 lib/bitmap.c           |   81 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 92 insertions(+), 0 deletions(-)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 756d78b..daf8c48 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -42,6 +42,9 @@
  * bitmap_empty(src, nbits)			Are all bits zero in *src?
  * bitmap_full(src, nbits)			Are all bits set in *src?
  * bitmap_weight(src, nbits)			Hamming Weight: number set bits
+ * bitmap_set(dst, pos, nbits)			Set specified bit area
+ * bitmap_clear(dst, pos, nbits)		Clear specified bit area
+ * bitmap_find_next_zero_area(buf, len, pos, n, mask)	Find bit free area
  * bitmap_shift_right(dst, src, n, nbits)	*dst = *src >> n
  * bitmap_shift_left(dst, src, n, nbits)	*dst = *src << n
  * bitmap_remap(dst, src, old, new, nbits)	*dst = map(old, new)(src)
@@ -108,6 +111,14 @@ extern int __bitmap_subset(const unsigned long *bitmap1,
 			const unsigned long *bitmap2, int bits);
 extern int __bitmap_weight(const unsigned long *bitmap, int bits);
 
+extern void bitmap_set(unsigned long *map, int i, int len);
+extern void bitmap_clear(unsigned long *map, int start, int nr);
+extern unsigned long bitmap_find_next_zero_area(unsigned long *map,
+					 unsigned long size,
+					 unsigned long start,
+					 unsigned int nr,
+					 unsigned long align_mask);
+
 extern int bitmap_scnprintf(char *buf, unsigned int len,
 			const unsigned long *src, int nbits);
 extern int __bitmap_parse(const char *buf, unsigned int buflen, int is_user,
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 7025658..11bf497 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -271,6 +271,87 @@ int __bitmap_weight(const unsigned long *bitmap, int bits)
 }
 EXPORT_SYMBOL(__bitmap_weight);
 
+#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))
+
+void bitmap_set(unsigned long *map, int start, int nr)
+{
+	unsigned long *p = map + BIT_WORD(start);
+	const int size = start + nr;
+	int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
+	unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+
+	while (nr - bits_to_set >= 0) {
+		*p |= mask_to_set;
+		nr -= bits_to_set;
+		bits_to_set = BITS_PER_LONG;
+		mask_to_set = ~0UL;
+		p++;
+	}
+	if (nr) {
+		mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+		*p |= mask_to_set;
+	}
+}
+EXPORT_SYMBOL(bitmap_set);
+
+void bitmap_clear(unsigned long *map, int start, int nr)
+{
+	unsigned long *p = map + BIT_WORD(start);
+	const int size = start + nr;
+	int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+	unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+
+	while (nr - bits_to_clear >= 0) {
+		*p &= ~mask_to_clear;
+		nr -= bits_to_clear;
+		bits_to_clear = BITS_PER_LONG;
+		mask_to_clear = ~0UL;
+		p++;
+	}
+	if (nr) {
+		mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+		*p &= ~mask_to_clear;
+	}
+}
+EXPORT_SYMBOL(bitmap_clear);
+
+/*
+ * bitmap_find_next_zero_area - find a contiguous aligned zero area
+ * @map: The address to base the search on
+ * @size: The bitmap size in bits
+ * @start: The bitnumber to start searching at
+ * @nr: The number of zeroed bits we're looking for
+ * @align_mask: Alignment mask for zero area
+ *
+ * The @align_mask should be one less than a power of 2; the effect is that
+ * the bit offset of all zero areas this function finds is multiples of that
+ * power of 2. A @align_mask of 0 means no alignment is required.
+ */
+unsigned long bitmap_find_next_zero_area(unsigned long *map,
+					 unsigned long size,
+					 unsigned long start,
+					 unsigned int nr,
+					 unsigned long align_mask)
+{
+	unsigned long index, end, i;
+again:
+	index = find_next_zero_bit(map, size, start);
+
+	/* Align allocation */
+	index = __ALIGN_MASK(index, align_mask);
+
+	end = index + nr;
+	if (end > size)
+		return end;
+	i = find_next_bit(map, end, index);
+	if (i < end) {
+		start = i + 1;
+		goto again;
+	}
+	return index;
+}
+EXPORT_SYMBOL(bitmap_find_next_zero_area);
+
 /*
  * Bitmap printing & parsing functions: first version by Bill Irwin,
  * second version by Paul Jackson, third by Joe Korty.
-- 
1.6.5.1


^ permalink raw reply related

* Re: [net-next-2.6 PATCH 01/20] igb: add new data structure for handling interrupts and NAPI
From: David Miller @ 2009-10-28  8:28 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, alexander.h.duyck
In-Reply-To: <20091028014858.12470.99520.stgit@localhost.localdomain>


All 20 patches applied, thanks a lot.

^ permalink raw reply

* Re: [PATCH] udev: create empty regular files to represent net interfaces
From: Kay Sievers @ 2009-10-28  8:23 UTC (permalink / raw)
  To: Matt Domsch
  Cc: dann frazier, linux-hotplug, Narendra_K, netdev, Jordan_Hargrave,
	Charles_Rose, Ben Hutchings
In-Reply-To: <20091027205551.GA31963@auslistsprd01.us.dell.com>

On Tue, Oct 27, 2009 at 21:55, Matt Domsch <Matt_Domsch@dell.com> wrote:
> On Thu, Oct 22, 2009 at 12:36:20AM -0600, dann frazier wrote:
>> Here's a proof of concept to further the discussion..
>>
>> The default filename uses the format:
>>   /dev/netdev/by-ifindex/$ifindex
>>
>> This provides the infrastructure to permit udev rules to create aliases for
>> network devices using symlinks, for example:
>>
>>   /dev/netdev/by-name/eth0 -> ../by-ifindex/1
>>   /dev/netdev/by-biosname/LOM0 -> ../by-ifindex/3
>>
>> A library (such as the proposed libnetdevname) could use this information
>> to provide an alias->realname mapping for network utilities.
>
> yes, this could work, as IFINDEX is already exported in the uevents,
> and that's the primary value udev needs to set up the mapping.
>
> While I like the little ifindex2name script you've got, I think udev
> could simply call if_indextoname() to get this, and not call an
> external program?  I suppose it could be a really really simple
> external program too.

What's the point of all this? Why would udev ever need to find the
name of a device by the ifindex? The device name is the primary value
for the kernel events udev acts on.

> I'd be fine with this approach.  It has the advantages of not
> requiring a kernel change at all, and not creating a whole character
> device which would be useless.  And it doesn't preclude someone in the
> future from creating a char device for network devices should they so
> choose.
>
> Kay, what say you as udev owner?

That all sounds very much like something which will hit us back some
day. I'm not sure, if udev should publish such dead text files in
/dev, it does not seem to fit the usual APIs/assumptions where /sys
and /dev match, and libudev provides access to both. It all sounds
more like a database for a possible netdevname library, which does not
need to be public in /dev, right?

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-hotplug" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: iproute uses too small of a receive buffer
From: David Miller @ 2009-10-28  7:55 UTC (permalink / raw)
  To: eric.dumazet; +Cc: shemminger, greearb, netdev
In-Reply-To: <4AE7F859.7020105@gmail.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 28 Oct 2009 08:52:57 +0100

> Stephen Hemminger a écrit :
>> 
>> Just having larger buffer isn't guarantee of success. Allocating
>> a huge buffer is not going to work on embedded.
>> 
> 
> Please note we do not allocate a big buffer, only allow more small skbs
> to be queued on socket receive queue.
> 
> If memory is not available, skb allocation will eventually fail
> and be reported as well, embedded or not.
> 
> I vote for allowing 1024*1024 bytes instead of 32768,
> and eventually user should be warned that it is capped by 
> /proc/sys/net/core/rmem_max

This discussion constantly reminds me of:

/*
 *	skb should fit one page. This choice is good for headerless malloc.
 *	But we should limit to 8K so that userspace does not have to
 *	use enormous buffer sizes on recvmsg() calls just to avoid
 *	MSG_TRUNC when PAGE_SIZE is very large.
 */
#if PAGE_SIZE < 8192UL
#define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(PAGE_SIZE)
#else
#define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(8192UL)
#endif

#define NLMSG_DEFAULT_SIZE (NLMSG_GOODSIZE - NLMSG_HDRLEN)

^ permalink raw reply

* Re: iproute uses too small of a receive buffer
From: Eric Dumazet @ 2009-10-28  7:52 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Ben Greear, NetDev
In-Reply-To: <20091027162434.6dc31b2d@nehalam>

Stephen Hemminger a écrit :
> 
> Just having larger buffer isn't guarantee of success. Allocating
> a huge buffer is not going to work on embedded.
> 

Please note we do not allocate a big buffer, only allow more small skbs
to be queued on socket receive queue.

If memory is not available, skb allocation will eventually fail
and be reported as well, embedded or not.

I vote for allowing 1024*1024 bytes instead of 32768,
and eventually user should be warned that it is capped by 
/proc/sys/net/core/rmem_max


> Why not have it continue after one error.

Yes, but caller of 'ip monitor' just restart it anyway

^ permalink raw reply

* Re: iproute uses too small of a receive buffer
From: Eric Dumazet @ 2009-10-28  7:37 UTC (permalink / raw)
  To: Ben Greear, Stephen Hemminger; +Cc: NetDev
In-Reply-To: <4AE7EC65.8000600@gmail.com>

Eric Dumazet a écrit :
> Ben Greear a écrit :
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFs (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
> 
> Real fix is to realloc buffer at receive time, no need for user setting.
> 
> In my testings I saw it reaching 1 Mbyte
> write(2, "REALLOC buflen 8192\n"..., 20) = 20
> write(2, "REALLOC buflen 16384\n"..., 21) = 21
> write(2, "REALLOC buflen 32768\n"..., 21) = 21
> write(2, "REALLOC buflen 65536\n"..., 21) = 21
> write(2, "REALLOC buflen 131072\n"..., 22) = 22
> write(2, "REALLOC buflen 262144\n"..., 22) = 22
> write(2, "REALLOC buflen 524288\n"..., 22) = 22
> 
> 
> [iproute2] realloc buffer in rtnl_listen
> 
> # ip monitor route
> netlink receive error No buffer space available (105)
> Dump terminated 
> 
> Reported-by: Ben Greear<greearb@candelatech.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Oops, this was wrong, Ben was right, sorry...

ENOBUFS errors is a flag to actually report to user that some information was dropped,
not that user supplied buffer at recv() time is not big enough.

I was surprised that buffer could reach 1Mbytes, while RCVBUF was 32768 or so.



^ permalink raw reply

* Re: PATCH 23/10]Optimize the upload speed for PPP connection.
From: fangxiaozhi 00110321 @ 2009-10-28  7:30 UTC (permalink / raw)
  To: davem, william.allen.simpson; +Cc: netdev, kernel, zihan, greg, haegar


Thanks your advice.

But generally, PAGE_SIZE is 4096, whether it is too large or not?

If PAGE_SIZE is really appropriate, then I can resubmit the patch.

Thanks very much.
----- Original Message ----- 
From: "David Miller" <davem@davemloft.net>
To: <william.allen.simpson@gmail.com>
Cc: <huananhu@huawei.com>; <netdev@vger.kernel.org>; <linux-kernel@vger.kernel.org>; <zihan@huawei.com>; <greg@kroah.com>; <haegar@sdinet.de>
Sent: Saturday, October 24, 2009 9:46 PM
Subject: Re: PATCH 23/10]Optimize the upload speed for PPP connection.


> From: William Allen Simpson <william.allen.simpson@gmail.com>
> Date: Fri, 23 Oct 2009 07:46:08 -0400
> 
>> Concur.  I'd go further than that, my code usually made room for at
>> least
>> a full MTU (MRU) with HDLC escaping.  To minimize context switches,
>> that
>> should be 3014 ((1500 MRU + 2 FCS + 4 header) * 2 escapes + 2 flags).
>> 
>> Even in the old days, when memory was tight, context switches and
>> interrupt
>> time were more expensive, too.  PPP is supposed to scale to OC-192.
> 
> Actually I'd like to see ->obuf allocated externally and then
> make it simply PAGE_SIZE.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
******************************************************************************************
 This email and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained here in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email
 immediately and delete it!
 *****************************************************************************************

^ permalink raw reply

* Re: iproute uses too small of a receive buffer
From: Eric Dumazet @ 2009-10-28  7:09 UTC (permalink / raw)
  Cc: Ben Greear, Stephen Hemminger, NetDev
In-Reply-To: <4AE7EC65.8000600@gmail.com>

Eric Dumazet a écrit :
> Ben Greear a écrit :
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFs (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
> 
> Real fix is to realloc buffer at receive time, no need for user setting.
> 

Then, another problem is that some information can be dropped at kernel level
when socket rcvbuf is full (ip monitor too slow to read its socket)

Thats hard to fix because you need to tweak /proc/sys/net/core/rmem_max



^ permalink raw reply

* Re: iproute uses too small of a receive buffer
From: Eric Dumazet @ 2009-10-28  7:01 UTC (permalink / raw)
  To: Ben Greear; +Cc: Stephen Hemminger, NetDev
In-Reply-To: <4AE78297.9000909@candelatech.com>

Ben Greear a écrit :
> 
> Probably the right way is to give a cmd-line arg to set the buffer size
> and also continue if the error is ENOBUFs (but print some error out
> so users know they have issues).  I can make the attempt if that
> sounds good to you.

Real fix is to realloc buffer at receive time, no need for user setting.

In my testings I saw it reaching 1 Mbyte
write(2, "REALLOC buflen 8192\n"..., 20) = 20
write(2, "REALLOC buflen 16384\n"..., 21) = 21
write(2, "REALLOC buflen 32768\n"..., 21) = 21
write(2, "REALLOC buflen 65536\n"..., 21) = 21
write(2, "REALLOC buflen 131072\n"..., 22) = 22
write(2, "REALLOC buflen 262144\n"..., 22) = 22
write(2, "REALLOC buflen 524288\n"..., 22) = 22


[iproute2] realloc buffer in rtnl_listen

# ip monitor route
netlink receive error No buffer space available (105)
Dump terminated 

Reported-by: Ben Greear<greearb@candelatech.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..134ce7f 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -392,8 +392,14 @@ int rtnl_listen(struct rtnl_handle *rtnl,
 		.msg_iov = &iov,
 		.msg_iovlen = 1,
 	};
-	char   buf[8192];
+	char   *buf;
+	size_t buflen = 8192;
 
+	buf = malloc(buflen);
+	if (buf == NULL) {
+		fprintf(stderr, "netlink could not alloc %lu bytes\n", buflen);
+		return -1;
+	}
 	memset(&nladdr, 0, sizeof(nladdr));
 	nladdr.nl_family = AF_NETLINK;
 	nladdr.nl_pid = 0;
@@ -401,12 +407,20 @@ int rtnl_listen(struct rtnl_handle *rtnl,
 
 	iov.iov_base = buf;
 	while (1) {
-		iov.iov_len = sizeof(buf);
+		iov.iov_len = buflen;
 		status = recvmsg(rtnl->fd, &msg, 0);
 
 		if (status < 0) {
 			if (errno == EINTR || errno == EAGAIN)
 				continue;
+			if (errno == ENOBUFS) {
+				buf = realloc(buf, buflen * 2);
+				if (buf) {
+					buflen *= 2;
+					iov.iov_base = buf;
+					continue;
+				}
+			}
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
 			return -1;

^ permalink raw reply related

* Re: [PATCH] net: fold network name hash (v2)
From: Eric Dumazet @ 2009-10-28  6:07 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, netdev, linux-kernel, akpm, torvalds, opurdila,
	viro
In-Reply-To: <20091027150436.56e673cd@nehalam>

Stephen Hemminger a écrit :
> The full_name_hash does not produce a value that is evenly distributed
> over the lower 8 bits. This causes name hash to be unbalanced with large
> number of names. There is a standard function to fold in upper bits
> so use that.
> 
> This is independent of possible improvements to full_name_hash()
> in future.

>  static inline struct hlist_head *dev_name_hash(struct net *net, const char *name)
>  {
>  	unsigned hash = full_name_hash(name, strnlen(name, IFNAMSIZ));
> -	return &net->dev_name_head[hash & ((1 << NETDEV_HASHBITS) - 1)];
> +	return &net->dev_name_head[hash_long(hash, NETDEV_HASHBITS)];
>  }
>  
>  static inline struct hlist_head *dev_index_hash(struct net *net, int ifindex)

full_name_hash() returns an "unsigned int", which is guaranteed to be 32 bits

You should therefore use hash_32(hash, NETDEV_HASHBITS),
not hash_long() that maps to hash_64() on 64 bit arches, which is
slower and certainly not any better with a 32bits input.



/* Compute the hash for a name string. */
static inline unsigned int
full_name_hash(const unsigned char *name, unsigned int len)
{
        unsigned long hash = init_name_hash();
        while (len--)
                hash = partial_name_hash(*name++, hash);
        return end_name_hash(hash);
}

static inline u32 hash_32(u32 val, unsigned int bits)
{
        /* On some cpus multiply is faster, on others gcc will do shifts */
        u32 hash = val * GOLDEN_RATIO_PRIME_32;

        /* High bits are more random, so use them. */
        return hash >> (32 - bits);
}


static inline u64 hash_64(u64 val, unsigned int bits)
{
        u64 hash = val;

        /*  Sigh, gcc can't optimise this alone like it does for 32 bits. */
        u64 n = hash;
        n <<= 18;
        hash -= n;
        n <<= 33;
        hash -= n;
        n <<= 3;
        hash += n;
        n <<= 3;
        hash -= n;
        n <<= 4;
        hash += n;
        n <<= 2;
        hash += n;

        /* High bits are more random, so use them. */
        return hash >> (64 - bits);
}

^ permalink raw reply

* [net-next-2.6 PATCH] vxge: Configure the number of transmit descriptors per packet to MAX_SKB_FRAGS + 1.
From: Sreenivasa Honnur @ 2009-10-28  5:49 UTC (permalink / raw)
  To: davem; +Cc: netdev, support

- Configure the number of transmit descriptors per packet to MAX_SKB_FRAGS + 1.

Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@neterion.com>
---
diff -urpN orig/drivers/net/vxge/vxge-main.c patch1/drivers/net/vxge/vxge-main.c
--- orig/drivers/net/vxge/vxge-main.c	2009-10-26 23:28:15.000000000 -0700
+++ patch1/drivers/net/vxge/vxge-main.c	2009-10-26 23:32:07.000000000 -0700
@@ -3612,11 +3612,12 @@ static int __devinit vxge_config_vpaths(
 		device_config->vp_config[i].fifo.enable =
 						VXGE_HW_FIFO_ENABLE;
 		device_config->vp_config[i].fifo.max_frags =
-				MAX_SKB_FRAGS;
+				MAX_SKB_FRAGS + 1;
 		device_config->vp_config[i].fifo.memblock_size =
 			VXGE_HW_MIN_FIFO_MEMBLOCK_SIZE;
 
-		txdl_size = MAX_SKB_FRAGS * sizeof(struct vxge_hw_fifo_txd);
+		txdl_size = device_config->vp_config[i].fifo.max_frags *
+				sizeof(struct vxge_hw_fifo_txd);
 		txdl_per_memblock = VXGE_HW_MIN_FIFO_MEMBLOCK_SIZE / txdl_size;
 
 		device_config->vp_config[i].fifo.fifo_blocks =
diff -urpN orig/drivers/net/vxge/vxge-version.h patch1/drivers/net/vxge/vxge-version.h
--- orig/drivers/net/vxge/vxge-version.h	2009-10-26 23:28:15.000000000 -0700
+++ patch1/drivers/net/vxge/vxge-version.h	2009-10-27 02:19:18.000000000 -0700
@@ -18,6 +18,6 @@
 #define VXGE_VERSION_MAJOR	"2"
 #define VXGE_VERSION_MINOR	"0"
 #define VXGE_VERSION_FIX	"6"
-#define VXGE_VERSION_BUILD	"18707"
+#define VXGE_VERSION_BUILD	"18937"
 #define VXGE_VERSION_FOR	"k"
 #endif


^ permalink raw reply

* Re: [patch net-next]atl1c: duplicate atl1c_get_tpd
From: David Miller @ 2009-10-28  5:31 UTC (permalink / raw)
  To: jie.yang; +Cc: netdev, linux-kernel
In-Reply-To: <12567068551959-git-send-email-jie.yang@atheros.com>

From: <jie.yang@atheros.com>
Date: Wed, 28 Oct 2009 13:14:15 +0800

> From: Jie Yang <jie.yang@atheros.com>
> 
> remove duplicate atl1c_get_tpd, it may cause hardware to send wrong packets.
> 
> Signed-off-by: Jie Yang <jie.yang@atheros.com>

Applied, thanks.

^ permalink raw reply

* [patch net-next]atl1c: duplicate atl1c_get_tpd
From: jie.yang @ 2009-10-28  5:14 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, Jie Yang

From: Jie Yang <jie.yang@atheros.com>

remove duplicate atl1c_get_tpd, it may cause hardware to send wrong packets.

Signed-off-by: Jie Yang <jie.yang@atheros.com>

---

 drivers/net/atl1c/atl1c_main.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/atl1c/atl1c_main.c b/drivers/net/atl1c/atl1c_main.c
index 1372e9a..3b8801a 100644
--- a/drivers/net/atl1c/atl1c_main.c
+++ b/drivers/net/atl1c/atl1c_main.c
@@ -1981,8 +1981,6 @@ static void atl1c_tx_map(struct atl1c_adapter *adapter,
 		else {
 			use_tpd = atl1c_get_tpd(adapter, type);
 			memcpy(use_tpd, tpd, sizeof(struct atl1c_tpd_desc));
-			use_tpd = atl1c_get_tpd(adapter, type);
-			memcpy(use_tpd, tpd, sizeof(struct atl1c_tpd_desc));
 		}
 		buffer_info = atl1c_get_tx_buffer(adapter, use_tpd);
 		buffer_info->length = buf_len - mapped_len;

^ permalink raw reply related

* Re: [PATCH 2/3] net: TCP thin linear timeouts
From: William Allen Simpson @ 2009-10-28  3:20 UTC (permalink / raw)
  To: Andreas Petlund; +Cc: netdev, linux-kernel, shemminger, ilpo.jarvinen, davem
In-Reply-To: <4AE72079.4030504@simula.no>

Sorry to be too picky about the naming, but "rm_expb" really doesn't
mean what is actually done.  Perhaps TCP_THIN_LINEAR_BACKOFF and
sysctl_tcp_thin_linear_backoff?

Also, as debated on some other recent patches, shouldn't the global
sysctl specify the default, and the per socket option specify the
forced override?

^ permalink raw reply

* Re: [PATCH 1/3] net: TCP thin-stream detection
From: William Allen Simpson @ 2009-10-28  3:09 UTC (permalink / raw)
  To: Andreas Petlund; +Cc: netdev, linux-kernel, shemminger, ilpo.jarvinen, davem
In-Reply-To: <4AE72075.4070702@simula.no>

Andreas Petlund wrote:
> +/* Determines whether this is a thin stream (which may suffer from
> + * increased latency). Used to trigger latency-reducing mechanisms.
> + */
> +static inline unsigned int tcp_stream_is_thin(const struct tcp_sock *tp)
> +{
> +	return tp->packets_out < 4;
> +}
> +
This bothers me a bit.  Having just looked at your Linux presentation,
and not (yet) read your papers, it seems much of your justification was
with 1 packet per RTT.  Here, you seem to be concentrating on 4, probably
because many implementations quickly ramp up to 4.

But there's a fair amount of experience showing that ramping to 4 is
problematic on congested paths, especially wireless networks.  Fast
retransmit in that case would be disastrous.

Once upon a time, I worked on a fair number of interactive games a decade
or so ago.  And agree that this can be a problem, although I've never
been a fan of turning off the Nagle algorithm.  My solution has always
been a heartbeat, rather than trying to shoehorn this into TCP.

Also, I've not seen any discussion on the end-to-end interest list.

^ permalink raw reply

* Re: [PATCH 3/3] net: TCP thin dupack
From: William Allen Simpson @ 2009-10-28  2:43 UTC (permalink / raw)
  To: Andreas Petlund; +Cc: netdev, linux-kernel, shemminger, ilpo.jarvinen, davem
In-Reply-To: <4AE7207D.8090402@simula.no>

Andreas Petlund wrote:
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index e64368d..f4a05ff 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -97,6 +97,7 @@ enum {
>  #define TCP_CONGESTION		13	/* Congestion control algorithm */
>  #define TCP_MD5SIG		14	/* TCP MD5 Signature (RFC2385) */
>  #define TCP_THIN_RM_EXPB        15      /* Remove exp. backoff for thin streams*/
> +#define TCP_THIN_DUPACK         16      /* Fast retrans. after 1 dupack */
>  
I've not had the chance to examine the rest, but I've been poking at a
patch series that's used 15 for over a year, so could you try 16 and 17?

^ permalink raw reply

* [net-next-2.6 PATCH 20/20] igb: cleanup "todo" code found in igb_ethtool.c
From: Jeff Kirsher @ 2009-10-28  1:55 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Alexander Duyck, Jeff Kirsher
In-Reply-To: <20091028014858.12470.99520.stgit@localhost.localdomain>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This patch moves some defines into the e1000_regs.h file since this is the
correct place for register defines and not inside of igb_ethtool.c

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igb/e1000_regs.h  |    7 +++++++
 drivers/net/igb/igb_ethtool.c |   11 +----------
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/net/igb/e1000_regs.h b/drivers/net/igb/e1000_regs.h
index 76c3389..e06c3b7 100644
--- a/drivers/net/igb/e1000_regs.h
+++ b/drivers/net/igb/e1000_regs.h
@@ -288,10 +288,17 @@ enum {
 #define E1000_MTA      0x05200  /* Multicast Table Array - RW Array */
 #define E1000_RA       0x05400  /* Receive Address - RW Array */
 #define E1000_RA2      0x054E0  /* 2nd half of receive address array - RW Array */
+#define E1000_PSRTYPE(_i)       (0x05480 + ((_i) * 4))
 #define E1000_RAL(_i)  (((_i) <= 15) ? (0x05400 + ((_i) * 8)) : \
                                        (0x054E0 + ((_i - 16) * 8)))
 #define E1000_RAH(_i)  (((_i) <= 15) ? (0x05404 + ((_i) * 8)) : \
                                        (0x054E4 + ((_i - 16) * 8)))
+#define E1000_IP4AT_REG(_i)     (0x05840 + ((_i) * 8))
+#define E1000_IP6AT_REG(_i)     (0x05880 + ((_i) * 4))
+#define E1000_WUPM_REG(_i)      (0x05A00 + ((_i) * 4))
+#define E1000_FFMT_REG(_i)      (0x09000 + ((_i) * 8))
+#define E1000_FFVT_REG(_i)      (0x09800 + ((_i) * 8))
+#define E1000_FFLT_REG(_i)      (0x05F00 + ((_i) * 8))
 #define E1000_VFTA     0x05600  /* VLAN Filter Table Array - RW Array */
 #define E1000_VT_CTL   0x0581C  /* VMDq Control - RW */
 #define E1000_WUC      0x05800  /* Wakeup Control - RW */
diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index 65c538f..048a615 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -502,19 +502,10 @@ static void igb_get_regs(struct net_device *netdev,
 	regs_buff[119] = adapter->stats.scvpc;
 	regs_buff[120] = adapter->stats.hrmpc;
 
-	/* These should probably be added to e1000_regs.h instead */
-	#define E1000_PSRTYPE_REG(_i) (0x05480 + ((_i) * 4))
-	#define E1000_IP4AT_REG(_i)   (0x05840 + ((_i) * 8))
-	#define E1000_IP6AT_REG(_i)   (0x05880 + ((_i) * 4))
-	#define E1000_WUPM_REG(_i)    (0x05A00 + ((_i) * 4))
-	#define E1000_FFMT_REG(_i)    (0x09000 + ((_i) * 8))
-	#define E1000_FFVT_REG(_i)    (0x09800 + ((_i) * 8))
-	#define E1000_FFLT_REG(_i)    (0x05F00 + ((_i) * 8))
-
 	for (i = 0; i < 4; i++)
 		regs_buff[121 + i] = rd32(E1000_SRRCTL(i));
 	for (i = 0; i < 4; i++)
-		regs_buff[125 + i] = rd32(E1000_PSRTYPE_REG(i));
+		regs_buff[125 + i] = rd32(E1000_PSRTYPE(i));
 	for (i = 0; i < 4; i++)
 		regs_buff[129 + i] = rd32(E1000_RDBAL(i));
 	for (i = 0; i < 4; i++)


^ permalink raw reply related

* [net-next-2.6 PATCH 19/20] igb: add single vector msi-x testing to interrupt test
From: Jeff Kirsher @ 2009-10-28  1:55 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Alexander Duyck, Jeff Kirsher
In-Reply-To: <20091028014858.12470.99520.stgit@localhost.localdomain>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change adds testing of the first msix vector to the interrupt testing.
This should help with determining the cause of interrupt issues when they are
encountered.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igb/igb_ethtool.c |   27 +++++++++++++++++----------
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index aa05f00..65c538f 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -1123,32 +1123,36 @@ static int igb_intr_test(struct igb_adapter *adapter, u64 *data)
 	*data = 0;
 
 	/* Hook up test interrupt handler just for this test */
-	if (adapter->msix_entries)
-		/* NOTE: we don't test MSI-X interrupts here, yet */
-		return 0;
+	if (adapter->msix_entries) {
+		if (request_irq(adapter->msix_entries[0].vector,
+		                &igb_test_intr, 0, netdev->name, adapter)) {
+			*data = 1;
+			return -1;
+		}
 
-	if (adapter->flags & IGB_FLAG_HAS_MSI) {
+	} else if (adapter->flags & IGB_FLAG_HAS_MSI) {
 		shared_int = false;
-		if (request_irq(irq, &igb_test_intr, 0, netdev->name, netdev)) {
+		if (request_irq(irq,
+		                &igb_test_intr, 0, netdev->name, adapter)) {
 			*data = 1;
 			return -1;
 		}
 	} else if (!request_irq(irq, &igb_test_intr, IRQF_PROBE_SHARED,
-				netdev->name, netdev)) {
+				netdev->name, adapter)) {
 		shared_int = false;
 	} else if (request_irq(irq, &igb_test_intr, IRQF_SHARED,
-		 netdev->name, netdev)) {
+		 netdev->name, adapter)) {
 		*data = 1;
 		return -1;
 	}
 	dev_info(&adapter->pdev->dev, "testing %s interrupt\n",
 		(shared_int ? "shared" : "unshared"));
 	/* Disable all the interrupts */
-	wr32(E1000_IMC, 0xFFFFFFFF);
+	wr32(E1000_IMC, ~0);
 	msleep(10);
 
 	/* Define all writable bits for ICS */
-	switch(hw->mac.type) {
+	switch (hw->mac.type) {
 	case e1000_82575:
 		ics_mask = 0x37F47EDD;
 		break;
@@ -1238,7 +1242,10 @@ static int igb_intr_test(struct igb_adapter *adapter, u64 *data)
 	msleep(10);
 
 	/* Unhook test interrupt handler */
-	free_irq(irq, netdev);
+	if (adapter->msix_entries)
+		free_irq(adapter->msix_entries[0].vector, adapter);
+	else
+		free_irq(irq, adapter);
 
 	return *data;
 }


^ permalink raw reply related

* [net-next-2.6 PATCH 18/20] igb: make ethtool use core xmit map and free functionality
From: Jeff Kirsher @ 2009-10-28  1:55 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Alexander Duyck, Jeff Kirsher
In-Reply-To: <20091028014858.12470.99520.stgit@localhost.localdomain>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change adds a clean_rx/tx_irq type function call to the ethtool loopback
testing which allows us to test the core transmit and receive functionality in
the driver.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igb/igb_ethtool.c |  156 +++++++++++++++++++++++------------------
 1 files changed, 89 insertions(+), 67 deletions(-)

diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index 80afd8a..aa05f00 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -1254,7 +1254,7 @@ static int igb_setup_desc_rings(struct igb_adapter *adapter)
 	struct igb_ring *tx_ring = &adapter->test_tx_ring;
 	struct igb_ring *rx_ring = &adapter->test_rx_ring;
 	struct e1000_hw *hw = &adapter->hw;
-	int i, ret_val;
+	int ret_val;
 
 	/* Setup Tx descriptor ring and Tx buffers */
 	tx_ring->count = IGB_DEFAULT_TXD;
@@ -1270,34 +1270,6 @@ static int igb_setup_desc_rings(struct igb_adapter *adapter)
 	igb_setup_tctl(adapter);
 	igb_configure_tx_ring(adapter, tx_ring);
 
-	for (i = 0; i < tx_ring->count; i++) {
-		union e1000_adv_tx_desc *tx_desc;
-		unsigned int size = 1024;
-		struct sk_buff *skb = alloc_skb(size, GFP_KERNEL);
-
-		if (!skb) {
-			ret_val = 2;
-			goto err_nomem;
-		}
-		skb_put(skb, size);
-		tx_ring->buffer_info[i].skb = skb;
-		tx_ring->buffer_info[i].length = skb->len;
-		tx_ring->buffer_info[i].dma =
-			pci_map_single(tx_ring->pdev, skb->data, skb->len,
-				       PCI_DMA_TODEVICE);
-		tx_desc = E1000_TX_DESC_ADV(*tx_ring, i);
-		tx_desc->read.buffer_addr =
-			cpu_to_le64(tx_ring->buffer_info[i].dma);
-		tx_desc->read.olinfo_status = cpu_to_le32(skb->len) <<
-		                              E1000_ADVTXD_PAYLEN_SHIFT;
-		tx_desc->read.cmd_type_len = cpu_to_le32(skb->len);
-		tx_desc->read.cmd_type_len |= cpu_to_le32(E1000_TXD_CMD_EOP |
-		                                          E1000_TXD_CMD_IFCS |
-		                                          E1000_TXD_CMD_RS |
-		                                          E1000_ADVTXD_DTYP_DATA |
-		                                          E1000_ADVTXD_DCMD_DEXT);
-	}
-
 	/* Setup Rx descriptor ring and Rx buffers */
 	rx_ring->count = IGB_DEFAULT_RXD;
 	rx_ring->pdev = adapter->pdev;
@@ -1470,14 +1442,78 @@ static int igb_check_lbtest_frame(struct sk_buff *skb, unsigned int frame_size)
 	return 13;
 }
 
+static int igb_clean_test_rings(struct igb_ring *rx_ring,
+                                struct igb_ring *tx_ring,
+                                unsigned int size)
+{
+	union e1000_adv_rx_desc *rx_desc;
+	struct igb_buffer *buffer_info;
+	int rx_ntc, tx_ntc, count = 0;
+	u32 staterr;
+
+	/* initialize next to clean and descriptor values */
+	rx_ntc = rx_ring->next_to_clean;
+	tx_ntc = tx_ring->next_to_clean;
+	rx_desc = E1000_RX_DESC_ADV(*rx_ring, rx_ntc);
+	staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
+
+	while (staterr & E1000_RXD_STAT_DD) {
+		/* check rx buffer */
+		buffer_info = &rx_ring->buffer_info[rx_ntc];
+
+		/* unmap rx buffer, will be remapped by alloc_rx_buffers */
+		pci_unmap_single(rx_ring->pdev,
+		                 buffer_info->dma,
+				 rx_ring->rx_buffer_len,
+				 PCI_DMA_FROMDEVICE);
+		buffer_info->dma = 0;
+
+		/* verify contents of skb */
+		if (!igb_check_lbtest_frame(buffer_info->skb, size))
+			count++;
+
+		/* unmap buffer on tx side */
+		buffer_info = &tx_ring->buffer_info[tx_ntc];
+		igb_unmap_and_free_tx_resource(tx_ring, buffer_info);
+
+		/* increment rx/tx next to clean counters */
+		rx_ntc++;
+		if (rx_ntc == rx_ring->count)
+			rx_ntc = 0;
+		tx_ntc++;
+		if (tx_ntc == tx_ring->count)
+			tx_ntc = 0;
+
+		/* fetch next descriptor */
+		rx_desc = E1000_RX_DESC_ADV(*rx_ring, rx_ntc);
+		staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
+	}
+
+	/* re-map buffers to ring, store next to clean values */
+	igb_alloc_rx_buffers_adv(rx_ring, count);
+	rx_ring->next_to_clean = rx_ntc;
+	tx_ring->next_to_clean = tx_ntc;
+
+	return count;
+}
+
 static int igb_run_loopback_test(struct igb_adapter *adapter)
 {
 	struct igb_ring *tx_ring = &adapter->test_tx_ring;
 	struct igb_ring *rx_ring = &adapter->test_rx_ring;
-	int i, j, k, l, lc, good_cnt, ret_val = 0;
-	unsigned long time;
+	int i, j, lc, good_cnt, ret_val = 0;
+	unsigned int size = 1024;
+	netdev_tx_t tx_ret_val;
+	struct sk_buff *skb;
+
+	/* allocate test skb */
+	skb = alloc_skb(size, GFP_KERNEL);
+	if (!skb)
+		return 11;
 
-	writel(rx_ring->count - 1, rx_ring->tail);
+	/* place data into test skb */
+	igb_create_lbtest_frame(skb, size);
+	skb_put(skb, size);
 
 	/* Calculate the loop count based on the largest descriptor ring
 	 * The idea is to wrap the largest ring a number of times using 64
@@ -1489,50 +1525,36 @@ static int igb_run_loopback_test(struct igb_adapter *adapter)
 	else
 		lc = ((rx_ring->count / 64) * 2) + 1;
 
-	k = l = 0;
 	for (j = 0; j <= lc; j++) { /* loop count loop */
-		for (i = 0; i < 64; i++) { /* send the packets */
-			igb_create_lbtest_frame(tx_ring->buffer_info[k].skb,
-						1024);
-			pci_dma_sync_single_for_device(tx_ring->pdev,
-				tx_ring->buffer_info[k].dma,
-				tx_ring->buffer_info[k].length,
-				PCI_DMA_TODEVICE);
-			k++;
-			if (k == tx_ring->count)
-				k = 0;
-		}
-		writel(k, tx_ring->tail);
-		msleep(200);
-		time = jiffies; /* set the start time for the receive */
+		/* reset count of good packets */
 		good_cnt = 0;
-		do { /* receive the sent packets */
-			pci_dma_sync_single_for_cpu(rx_ring->pdev,
-					rx_ring->buffer_info[l].dma,
-					IGB_RXBUFFER_2048,
-					PCI_DMA_FROMDEVICE);
-
-			ret_val = igb_check_lbtest_frame(
-					     rx_ring->buffer_info[l].skb, 1024);
-			if (!ret_val)
+
+		/* place 64 packets on the transmit queue*/
+		for (i = 0; i < 64; i++) {
+			skb_get(skb);
+			tx_ret_val = igb_xmit_frame_ring_adv(skb, tx_ring);
+			if (tx_ret_val == NETDEV_TX_OK)
 				good_cnt++;
-			l++;
-			if (l == rx_ring->count)
-				l = 0;
-			/* time + 20 msecs (200 msecs on 2.4) is more than
-			 * enough time to complete the receives, if it's
-			 * exceeded, break and error off
-			 */
-		} while (good_cnt < 64 && jiffies < (time + 20));
+		}
+
 		if (good_cnt != 64) {
-			ret_val = 13; /* ret_val is the same as mis-compare */
+			ret_val = 12;
 			break;
 		}
-		if (jiffies >= (time + 20)) {
-			ret_val = 14; /* error code for time out error */
+
+		/* allow 200 milliseconds for packets to go from tx to rx */
+		msleep(200);
+
+		good_cnt = igb_clean_test_rings(rx_ring, tx_ring, size);
+		if (good_cnt != 64) {
+			ret_val = 13;
 			break;
 		}
 	} /* end loop count loop */
+
+	/* free the original skb */
+	kfree_skb(skb);
+
 	return ret_val;
 }
 


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox