* Best route for re-implementing TCPHA
From: RichardFliam @ 2011-04-13 23:08 UTC (permalink / raw)
To: netdev
TCPHA (http://dragon.linux-vs.org/~dragonfly/htm/tcpha.htm) provided
several neat features for content- and health-aware load balancing. I
am looking to re-implement it on the 2.6 kernel, and I am undecided on
a few key features.
In particular, the original project created its own polling methods for
TCP sockets based on fs/select.c and tcp_poll, but to me this seems
inelegant. I am wondering whether there is a "correct" way to poll
sockets in the kernel, or whether I should simply call sock_map_fd on
the kernel socket.
After extensive searching I did find this post to this mailing list,
http://permalink.gmane.org/gmane.linux.network/180354, but it does not
seem to contain an answer on the correct approach to polling TCP
sockets in the kernel.
--
--Richard Fliam
* kernel panic, 2.6.38.2, gretap
From: Denys Fedoryshchenko @ 2011-04-13 22:58 UTC (permalink / raw)
To: netdev
I set up the following rule to route incoming traffic (arriving on
eth0) over a gretap interface.
Bringing up the interface:
ip link add eoip1 type gretap remote X.X.X.X local Y.Y.Y.Y nopmtudisc
ifconfig eoip1 10.255.254.1 netmask 255.255.255.252 up mtu 1500
Set up source routing:
32000: from all iif eth0 lookup 203
Some routes were added to table 203.
After a few (1-3) seconds of running at around 30-40 Mbps I get a kernel
panic (log below).
Notes: I have a VLAN on the same interface, eth0.2023, which carries the
rest of the traffic and is shaped by HTB. It is not involved in the
gretap operation.
On eth0 I have a huge bfifo:
qdisc bfifo 8001: dev eth0 root refcnt 9 limit 100000000b
Sent 14652829681 bytes 15646355 pkt (dropped 0, overlimits 0 requeues 8)
backlog 0b 0p requeues 8
[ 658.492347] skb_over_panic: text:f80f37d4 len:3028 put:1514 head:d1af2000 data:d1af20a4 tail:0xd1af2c78 end:0xd1af2700 dev:eth0.2022
[ 658.492975] ------------[ cut here ]------------
[ 658.493264] Kernel BUG at c0377eaf [verbose debug info unavailable]
[ 658.493317] invalid opcode: 0000 [#1] SMP
[ 658.493317] last sysfs file: /sys/devices/virtual/net/eth0.2022/address
[ 658.493317] Modules linked in: ip_gre gre netconsole ipmi_si tun configfs cls_u32 sch_htb 8021q garp stp llc iptable_filter ipt_addrtype xt_dscp xt_string xt_owner xt_multiport xt_iprange xt_hashlimit xt_conntrack xt_DSCP xt_NFQUEUE xt_mark xt_connmark nf_conntrack ip_tables x_tables bnx2 ipmi_devintf ipmi_msghandler processor ata_piix i5k_amb iTCO_wdt pata_acpi hwmon [last unloaded: netconsole]
[ 658.493317]
[ 658.493317] Pid: 0, comm: kworker/0:1 Not tainted 2.6.38.2-devel2 #2 Dell Inc. PowerEdge 1950/0D8635
[ 658.493317] EIP: 0060:[<c0377eaf>] EFLAGS: 00010282 CPU: 3
[ 658.493317] EIP is at skb_put+0x7f/0x89
[ 658.493317] EAX: 0000008e EBX: d1af2c78 ECX: f64b5e40 EDX: c05032e8
[ 658.493317] ESI: 000005ea EDI: f5f28380 EBP: 006d006d ESP: f64b5e3c
[ 658.493317] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 658.493317] Process kworker/0:1 (pid: 0, ti=f64b4000 task=f64a4a80 task.ti=f64b0000)
[ 658.493317] Stack:
[ 658.493317] c05032e8 f80f37d4 00000bd4 000005ea d1af2000 d1af20a4 d1af2c78 d1af2700
[ 658.493317] f5e54000 00000000 eee81e00 f80f37d4 00000604 00000002 00000000 e5602500
[ 658.493317] 00000001 f6b02cb8 0000004d 512e75c0 eef28480 00000000 f5f28400 f5f28380
[ 658.493317] Call Trace:
[ 658.493317] [<f80f37d4>] ? bnx2_poll_work+0x980/0xf48 [bnx2]
[ 658.493317] [<f80f37d4>] ? bnx2_poll_work+0x980/0xf48 [bnx2]
[ 658.493317] [<c0140e49>] ? hrtimer_start+0x20/0x25
[ 658.493317] [<f826ffd1>] ? htb_dequeue+0x757/0x770 [sch_htb]
[ 658.493317] [<f80f3f27>] ? bnx2_poll+0xf7/0x1d9 [bnx2]
[ 658.493317] [<c037f564>] ? net_rx_action+0x8c/0x176
[ 658.493317] [<c012f28f>] ? __do_softirq+0x6b/0x104
[ 658.493317] [<c012f224>] ? __do_softirq+0x0/0x104
[ 658.493317] <IRQ>
[ 658.493317] [<c012f17e>] ? irq_exit+0x26/0x59
[ 658.493317] [<c0103b3d>] ? do_IRQ+0x81/0x95
[ 658.493317] [<c0102ca9>] ? common_interrupt+0x29/0x30
[ 658.493317] [<c010807a>] ? mwait_idle+0x51/0x56
[ 658.493317] [<c0101a97>] ? cpu_idle+0x41/0x5e
[ 658.493317] Code: 24 14 8b 81 a4 00 00 00 89 74 24 0c 89 44 24 10 8b 41 4c c7 04 24 e8 32 50 c0 89 44 24 08 8b 44 24 2c 89 44 24 04 e8 51 85 07 00 <0f> 0b eb fe 83 c4 24 5b 5e c3 55 57 56 53 83 ec 24 fc 89 c5 89
[ 658.493317] EIP: [<c0377eaf>] skb_put+0x7f/0x89 SS:ESP 0068:f64b5e3c
[ 658.512472] ---[ end trace d06a076521439891 ]---
[ 658.512750] Kernel panic - not syncing: Fatal exception in interrupt
[ 658.514034] Rebooting in 5 seconds..
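Note on the numbers in the skb_over_panic line: assuming the usual meaning of the fields (head and end bound the buffer, data and tail bound the payload, len is the length after the failing skb_put()), the arithmetic can be redone in userspace. This is only a sketch on the reported values, not kernel code; struct skb_report and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Values copied from the skb_over_panic line. This struct is a
 * hypothetical stand-in, not the kernel's struct sk_buff. */
struct skb_report {
	uint32_t head, data, tail, end;
	uint32_t len, put;
};

static const struct skb_report report = {
	.head = 0xd1af2000, .data = 0xd1af20a4,
	.tail = 0xd1af2c78, .end  = 0xd1af2700,
	.len  = 3028,       .put  = 1514,
};

/* Total bytes between head and end. */
static uint32_t buf_size(const struct skb_report *r)
{
	return r->end - r->head;
}

/* Payload length implied by the data and tail pointers. */
static uint32_t payload_len(const struct skb_report *r)
{
	return r->tail - r->data;
}

/* How far tail runs past end (0 if it does not). */
static uint32_t overrun(const struct skb_report *r)
{
	return r->tail > r->end ? r->tail - r->end : 0;
}

/* Length the skb had before the failing skb_put(). */
static uint32_t len_before_put(const struct skb_report *r)
{
	return r->len - r->put;
}
```

Under those assumptions the buffer is 1792 bytes, tail - data matches the reported len of 3028, the skb already held 1514 bytes before the call, and the second 1514-byte skb_put() would end 1400 bytes past end. The log by itself does not say why the same skb received a second full frame.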
* RE: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval for physical identification
From: Allan, Bruce W @ 2011-04-13 22:55 UTC (permalink / raw)
To: Ben Hutchings; +Cc: netdev@vger.kernel.org
In-Reply-To: <1302734679.2873.23.camel@bwh-desktop>
>-----Original Message-----
>From: Ben Hutchings [mailto:bhutchings@solarflare.com]
>Sent: Wednesday, April 13, 2011 3:45 PM
>To: Allan, Bruce W
>Cc: netdev@vger.kernel.org
>Subject: RE: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval
>for physical identification
>
>On Wed, 2011-04-13 at 15:39 -0700, Allan, Bruce W wrote:
>>
>> >-----Original Message-----
>> >From: Ben Hutchings [mailto:bhutchings@solarflare.com]
>> >Sent: Wednesday, April 13, 2011 1:25 PM
>> >To: Allan, Bruce W
>> >Cc: netdev@vger.kernel.org
>> >Subject: Re: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval
>> >for physical identification
>> >
>> >I'm sure there ought to be a clearer way to do this, and to avoid any
>> >weird effects from integer overflow in the multiplication. How about
>> >using an inner loop for each second:
>> >
>> > /* Driver expects to be called at twice the frequency in rc */
>> > int n = rc * 2, i, interval = HZ / n;
>> >
>
> /* Count down seconds */
>> > do {
> /* Count down iterations per second */
>> > i = n;
>> > do {
>> > rtnl_lock();
>> > rc = dev->ethtool_ops->set_phys_id(
>> > dev, (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
>> > rtnl_unlock();
>> > if (rc)
>> > break;
>> > schedule_timeout_interruptible(interval);
>> > } while (!signal_pending(current) && --i != 0);
>> > } while (!signal_pending(current) &&
>> > (id.data == 0 || --id.data != 0));
>> >
>> >Ben.
>>
>> OK, if that is clearer to you...v3 forthcoming.
>
>I guess it wouldn't hurt to add comments too. Would you agree that it's
>clear with the additions above?
>
>Ben.
Sure, makes sense to me.
Thanks,
Bruce.
* RE: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval for physical identification
From: Ben Hutchings @ 2011-04-13 22:44 UTC (permalink / raw)
To: Allan, Bruce W; +Cc: netdev@vger.kernel.org
In-Reply-To: <8DD2590731AB5D4C9DBF71A877482A90018A3427B6@orsmsx509.amr.corp.intel.com>
On Wed, 2011-04-13 at 15:39 -0700, Allan, Bruce W wrote:
>
> >-----Original Message-----
> >From: Ben Hutchings [mailto:bhutchings@solarflare.com]
> >Sent: Wednesday, April 13, 2011 1:25 PM
> >To: Allan, Bruce W
> >Cc: netdev@vger.kernel.org
> >Subject: Re: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval
> >for physical identification
> >
> >I'm sure there ought to be a clearer way to do this, and to avoid any
> >weird effects from integer overflow in the multiplication. How about
> >using an inner loop for each second:
> >
> >	/* Driver expects to be called at twice the frequency in rc */
> >	int n = rc * 2, i, interval = HZ / n;
> >
	/* Count down seconds */
> >	do {
		/* Count down iterations per second */
> >		i = n;
> >		do {
> >			rtnl_lock();
> >			rc = dev->ethtool_ops->set_phys_id(
> >				dev, (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
> >			rtnl_unlock();
> >			if (rc)
> >				break;
> >			schedule_timeout_interruptible(interval);
> >		} while (!signal_pending(current) && --i != 0);
> >	} while (!signal_pending(current) &&
> >		 (id.data == 0 || --id.data != 0));
> >
> >Ben.
>
> OK, if that is clearer to you...v3 forthcoming.
I guess it wouldn't hurt to add comments too. Would you agree that it's
clear with the additions above?
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* RE: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval for physical identification
From: Allan, Bruce W @ 2011-04-13 22:39 UTC (permalink / raw)
To: Ben Hutchings; +Cc: netdev@vger.kernel.org
In-Reply-To: <1302726313.2873.18.camel@bwh-desktop>
>-----Original Message-----
>From: Ben Hutchings [mailto:bhutchings@solarflare.com]
>Sent: Wednesday, April 13, 2011 1:25 PM
>To: Allan, Bruce W
>Cc: netdev@vger.kernel.org
>Subject: Re: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval
>for physical identification
>
>I'm sure there ought to be a clearer way to do this, and to avoid any
>weird effects from integer overflow in the multiplication. How about
>using an inner loop for each second:
>
>	/* Driver expects to be called at twice the frequency in rc */
>	int n = rc * 2, i, interval = HZ / n;
>
>	do {
>		i = n;
>		do {
>			rtnl_lock();
>			rc = dev->ethtool_ops->set_phys_id(
>				dev, (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
>			rtnl_unlock();
>			if (rc)
>				break;
>			schedule_timeout_interruptible(interval);
>		} while (!signal_pending(current) && --i != 0);
>	} while (!signal_pending(current) &&
>		 (id.data == 0 || --id.data != 0));
>
>Ben.
OK, if that is clearer to you...v3 forthcoming.
Thanks,
Bruce.
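To see what the nested loop proposed in this thread does, it can be modelled in userspace. This is a rough sketch under stated assumptions, not the ethtool code: simulate_blink() and struct blink_stats are hypothetical names, freq stands in for the rc returned by the driver, seconds for id.data (non-zero only, since 0 means "run until interrupted" in the original), and hz for HZ. Signals and driver errors are not modelled.

```c
/* Hypothetical counters standing in for the driver callback and the sleep. */
struct blink_stats {
	unsigned int on_calls;
	unsigned int off_calls;
	unsigned int jiffies_slept;
};

/*
 * Userspace model of the proposed loop. The outer loop counts down
 * seconds, the inner loop counts down the n = 2 * freq callback
 * invocations per second, alternating ON (even i) and OFF (odd i).
 */
static struct blink_stats simulate_blink(unsigned int freq,
					 unsigned int seconds,
					 unsigned int hz)
{
	struct blink_stats st = { 0, 0, 0 };
	unsigned int n, i, interval;

	/* The model only covers finite runs with a non-zero frequency. */
	if (freq == 0 || seconds == 0)
		return st;

	n = freq * 2;
	interval = hz / n;

	do {
		i = n;
		do {
			if (i & 1)
				st.off_calls++;
			else
				st.on_calls++;
			/* Stands in for schedule_timeout_interruptible(). */
			st.jiffies_slept += interval;
		} while (--i != 0);
	} while (--seconds != 0);

	return st;
}
```

In this model each second starts with ON and ends with OFF, and the time slept per nominal second is n * (hz / n), which is slightly less than hz whenever hz is not a multiple of n (for example 996 jiffies for freq = 3 and hz = 1000).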
* Re: [PATCHv2 net-next-2.6] rndis_host: Poll status before control channel where necessary
From: David Miller @ 2011-04-13 21:49 UTC (permalink / raw)
To: ben-/+tVBieCtBitmTQ+vhA3Yw
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, vzeeaxwl-ubggFOsnOr3gwBMGfI3FeA,
linux-usb-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1302670523.5282.610.camel@localhost>
From: Ben Hutchings <ben-/+tVBieCtBitmTQ+vhA3Yw@public.gmane.org>
Date: Wed, 13 Apr 2011 05:55:23 +0100
> Some RNDIS devices don't respond on the control channel until polled
> on the status channel. In particular, this was reported to be the
> case for the 2Wire HomePortal 1000SW and for some Windows Mobile
> devices.
>
> This is roughly based on a patch by John Carr <john.carr-3P/l8hQepEe9FHfhHBbuYA@public.gmane.org>
> which is currently applied by Mandriva.
>
> Reported-by: Mark Glassberg <vzeeaxwl-ubggFOsnOr3gwBMGfI3FeA@public.gmane.org>
> Signed-off-by: Ben Hutchings <ben-/+tVBieCtBitmTQ+vhA3Yw@public.gmane.org>
> ---
> The first version made this behaviour unconditional and had to be
> reverted. This version adds a quirk flag instead.
Applied, thanks Ben.
The feedback about whether to use the point-to-point flag or not should
be addressed, but separately.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options
From: David Miller @ 2011-04-13 21:48 UTC (permalink / raw)
To: eric.dumazet; +Cc: lkml, shemminger, shimoda.hiroaki, netdev
In-Reply-To: <1302708487.3725.0.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 13 Apr 2011 17:28:07 +0200
> Don't worry, Stephen or I will send it asap.
I'm looking forward to it :)
* [RFC][PATCH] Zero-copy receive from socket into bio
From: Andreas Gruenbacher @ 2011-04-13 21:39 UTC (permalink / raw)
To: David S. Miller, netdev; +Cc: linux-kernel
Hello,
I'm currently looking into supporting zero-copy receive in drbd.
The basic idea is this: drbd transmits bios via sockets. An ideal sender
sends the packet header and data in separate packets, and the network driver
supports RX_COPYBREAK and receives them into separate socket buffers. The
socket buffers end up aligned properly, and we add them to bios and submit
them, no copying required.
This scenario doesn't seem to be supported by the existing infrastructure, so
does this patch make sense?
Thanks,
Andreas
---
[PATCH] Add a generic zero-copy-receive primitive
This requires a network driver which supports header-data split, i.e.,
receiving small header packets and big data packets into different
buffers so that the data will end up aligned well enough for consumption
by the block layer (search for RX_COPYBREAK in the drivers).
diff --git a/tcp_recvbio.c b/tcp_recvbio.c
new file mode 100644
index 0000000..38342e9
--- /dev/null
+++ b/tcp_recvbio.c
@@ -0,0 +1,185 @@
+#include <linux/kernel.h>
+#include <net/tcp.h>
+#include <linux/bio.h>
+#include <linux/blkdev.h>
+#include <linux/fs.h>
+#include "tcp_recvbio.h"
+
+static int tcp_recvbio_add(struct sk_buff *skb, struct bio *bio,
+ struct bio_vec *last)
+{
+ struct request_queue *q = bio->bi_bdev->bd_disk->queue;
+ struct sk_buff **frag_list = &skb_shinfo(skb)->frag_list;
+ int ret;
+
+ /*
+ * Reject fragmented skbs: there should be no need to support them. We
+ * use frag_list to keep track of the skbs attached to a bio instead.
+ */
+ if (*frag_list && skb != (struct sk_buff *)bio->bi_private)
+ return false;
+
+ if (!blk_rq_aligned(q, last->bv_offset, last->bv_len))
+ return false;
+ ret = bio_add_page(bio, last->bv_page, last->bv_len, last->bv_offset);
+
+ if (ret && !*frag_list) {
+ /* Tell the network layer to leave @skb alone. */
+ skb_get(skb);
+
+ /* Put this skb on the list. */
+ *frag_list = (struct sk_buff *)bio->bi_private;
+ bio->bi_private = skb;
+ }
+ return ret;
+}
+
+static int tcp_recvbio_data(read_descriptor_t *rd_desc, struct sk_buff *skb,
+ unsigned int offset, size_t len)
+{
+ struct bio *bio = rd_desc->arg.data;
+ struct request_queue *q = bio->bi_bdev->bd_disk->queue;
+ int start = skb_headlen(skb), consumed = 0, i;
+ struct bio_vec last = { };
+
+ /* Cannot zero-copy from the header. */
+ if (offset < start)
+ goto give_up;
+
+ /* Give up if the payload is unaligned. */
+ if (!blk_rq_aligned(q, offset - start, 0))
+ goto give_up;
+
+ /* Do not consume more data than we need. */
+ if (len > rd_desc->count - rd_desc->written)
+ len = rd_desc->count - rd_desc->written;
+
+ for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+ struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i];
+ int end, frag_len;
+
+ WARN_ON(start > offset + len);
+
+ end = start + frag->size;
+ frag_len = end - offset;
+ if (frag_len > 0) {
+ bool merged = false;
+ unsigned int page_offset;
+
+ if (frag_len > len)
+ frag_len = len;
+
+ page_offset = frag->page_offset + offset - start;
+ if (last.bv_page == frag->page &&
+ last.bv_offset + last.bv_len == page_offset) {
+ /* Merge with the previous fragment. */
+ last.bv_len += frag_len;
+ merged = true;
+ }
+ len -= frag_len;
+ offset += frag_len;
+ if (!len || !merged) {
+ if (last.bv_page) {
+ if (!tcp_recvbio_add(skb, bio, &last))
+ goto give_up;
+ consumed += last.bv_len;
+ }
+ if (!len)
+ goto out;
+ last.bv_page = frag->page;
+ last.bv_offset = page_offset;
+ last.bv_len = frag_len;
+ }
+ }
+ start = end;
+ }
+
+ /*
+ * We don't care if there are additional blocks in the skb's frag_list
+ * that are zero-copyable: at worst, we end up copying too many blocks.
+ * (See skb_copy_bits() for an example of walking the frag_list.)
+ */
+
+out:
+ rd_desc->written += consumed;
+ return consumed;
+
+give_up:
+ rd_desc->count = 0;
+ goto out;
+}
+
+/**
+ * tcp_recvbio - zero-copy receive a bio from a socket
+ * @sk: socket to receive from
+ * @bio: bio to add socket data to
+ * @size: bytes to receive
+ * @list: single linked list of skbs added to @bio
+ *
+ * Zero-copy receive data from @sk into @bio by directly using the socket
+ * buffer pages, bypassing the page cache. To keep the network layer from
+ * modifying the socket buffers while in use by @bio, we skb_get() them and
+ * return a list of skbs that @bio now references. The caller is
+ * responsible for releasing @list with consume_skbs() once done.
+ *
+ * Returns the number of bytes received into @bio.
+ */
+int tcp_recvbio(struct sock *sk, struct bio *bio, size_t size,
+ struct sk_buff **list)
+{
+ read_descriptor_t rd_desc = {
+ .count = size,
+ .arg = { .data = bio },
+ };
+ void *old_bi_private;
+ int err = 0;
+
+ /* Temporarily build referenced skb list in bi_private. */
+ old_bi_private = bio->bi_private;
+ bio->bi_private = NULL;
+
+ lock_sock(sk);
+ while (rd_desc.written < rd_desc.count) {
+ long timeo = sock_rcvtimeo(sk, 0);
+
+ sk_wait_data(sk, &timeo);
+ if (signal_pending(current)) {
+ err = sock_intr_errno(timeo);
+ break;
+ }
+ if (!timeo) {
+ if (!rd_desc.written)
+ err = -EAGAIN;
+ break;
+ }
+ read_lock(&sk->sk_callback_lock);
+ err = tcp_read_sock(sk, &rd_desc, tcp_recvbio_data);
+ read_unlock(&sk->sk_callback_lock);
+ if (err < 0)
+ break;
+ }
+ release_sock(sk);
+
+ *list = (struct sk_buff *)bio->bi_private;
+ bio->bi_private = old_bi_private;
+
+ if (err)
+ return err;
+ return rd_desc.written;
+}
+
+/**
+ * consume_skbs - consume a list of skbs
+ *
+ * This assumes that the skbs are linked on frag_list, as the @list returned
+ * from tcp_recvbio().
+ */
+void consume_skbs(struct sk_buff **skb)
+{
+ while (*skb) {
+ struct sk_buff *tmp = *skb;
+ *skb = skb_shinfo(tmp)->frag_list;
+ skb_shinfo(tmp)->frag_list = NULL;
+ consume_skb(tmp);
+ }
+}
diff --git a/tcp_recvbio.h b/tcp_recvbio.h
new file mode 100644
index 0000000..0ba30ee
--- /dev/null
+++ b/tcp_recvbio.h
@@ -0,0 +1,9 @@
+#ifndef __TCP_RECVBIO_H
+#define __TCP_RECVBIO_H
+
+
+extern int tcp_recvbio(struct sock *, struct bio *, size_t, struct sk_buff **);
+extern void consume_skbs(struct sk_buff **);
+
+
+#endif /* __TCP_RECVBIO_H */
--
1.7.4.1.415.g5e839
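The merging step in tcp_recvbio_data() (extend the pending bio_vec when the next fragment continues on the same page at the next offset) can be modelled in userspace. This is a simplified sketch, not the patch's code: struct seg and coalesce_segs() are hypothetical names, pages are plain integers, and the alignment checks, the length cap, and the bio itself are left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for one page fragment: page id, offset, length. */
struct seg {
	int page;
	unsigned int offset;
	unsigned int len;
};

/*
 * Coalesce in[] into out[]. A segment that starts exactly where the
 * previous output segment ends, on the same page, is folded into it;
 * anything else starts a new output segment. Empty segments are skipped.
 * out[] must have room for n_in entries. Returns the number written.
 */
static size_t coalesce_segs(const struct seg *in, size_t n_in,
			    struct seg *out)
{
	size_t n_out = 0, i;

	for (i = 0; i < n_in; i++) {
		if (in[i].len == 0)
			continue;
		if (n_out > 0 &&
		    out[n_out - 1].page == in[i].page &&
		    out[n_out - 1].offset + out[n_out - 1].len == in[i].offset) {
			/* Contiguous on the same page: merge. */
			out[n_out - 1].len += in[i].len;
			continue;
		}
		out[n_out++] = in[i];
	}
	return n_out;
}
```

The point of merging before adding to the bio is that fewer, larger segments are handed to the block layer; whether the merged segment is then acceptable is a separate check (blk_rq_aligned() in the patch).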
* RE: SMSC 8720a/MDIO/PHY help.
From: ANDY KENNEDY @ 2011-04-13 21:38 UTC (permalink / raw)
To: michael, netdev
In-Reply-To: <1302729564.2742.28.camel@malcolm>
> -----Original Message-----
> From: Michael Riesch [mailto:michael@riesch.at]
> Sent: Wednesday, April 13, 2011 4:19 PM
> To: netdev@vger.kernel.org
> Cc: ANDY KENNEDY
> Subject: Re: SMSC 8720a/MDIO/PHY help.
>
>
> > If you have an idea of something for me to try, I'd love to entertain
> > it.
>
> I am rather new to PHYLIB, but these are my ideas:
>
> 1) make sure phy_connect is executed (AFAIK called by MDIO bus driver)
Going through the phy.txt doc under Documentation/networking ("PHY
Abstraction Layer", updated 2008-04-08), which may be a bit out of date,
I did see what you are talking about. What I'm hung up on at the moment
is the behavior of adjust_link(). It appears that I only need to start
the queues, though I'm not sure.
>
> 2) maybe you need to call phy_start / phy_stop (AFAIK from the PHY
> driver's open / close function)
Currently, when I do this I only get the call to adjust_link() over and over again.
>
> HTH,
> Michael
Thanks for the help!
Andy
* Re: [PATCH 1/1] ipv6: ignore looped-back NA while dad is running
From: David Miller @ 2011-04-13 21:30 UTC (permalink / raw)
To: dwalter; +Cc: netdev, linux-kernel
In-Reply-To: <1302706963.8923.25.camel@localhost>
From: Daniel Walter <dwalter@barracuda.com>
Date: Wed, 13 Apr 2011 17:02:43 +0200
> This message and any attached files are confidential and intended
> solely for the addressee(s). Any publication, transmission or other
> use of the information by a person or entity other than the intended
> addressee is prohibited. If you receive this in error please contact
> the sender and delete the material. The sender does not accept
> liability for any errors or omissions as a result of the
> transmission.
I'm not applying patches that have legal disclaimers like this.
It has no place in a posting made on a public mailing list where open
and unrestricted discussions are essential.
* Re: SMSC 8720a/MDIO/PHY help.
From: Michael Riesch @ 2011-04-13 21:19 UTC (permalink / raw)
To: netdev; +Cc: ANDY KENNEDY
In-Reply-To: <9AC3F0E75060224C8BBC5BA2DDC8853A1FA8E632@EXV1.corp.adtran.com>
> If you have an idea of something for me to try, I'd love to entertain
> it.
I am rather new to PHYLIB, but these are my ideas:
1) make sure phy_connect is executed (AFAIK called by MDIO bus driver)
2) maybe you need to call phy_start / phy_stop (AFAIK from the PHY
driver's open / close function)
HTH,
Michael
* Re: [net-next-2.6 RFC PATCH v2 12/13] sky2: set ethtool set_phys_id on/off cycle frequency to 1/sec
From: Stephen Hemminger @ 2011-04-13 21:00 UTC (permalink / raw)
To: Bruce Allan; +Cc: netdev
In-Reply-To: <20110413195949.25901.86878.stgit@gitlad.jf.intel.com>
On Wed, 13 Apr 2011 12:59:49 -0700
Bruce Allan <bruce.w.allan@intel.com> wrote:
> Physical identification frequency based on how it was done prior to the
> introduction of set_phys_id. Compile tested only.
>
> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
> Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Assume same for skge
* [PATCH net-next 3/5] tg3: Automatically size stat/test string arrays
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson, Benjamin Li
This patch reimplements the size preprocessor constants of the stats and
ethtool test string arrays. The sizes are now derived from the array
definitions with ARRAY_SIZE() rather than maintained as separate
constants.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 15 ++++++++-------
1 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index b61b52f..9975cdb 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -165,11 +165,6 @@
#define TG3_RAW_IP_ALIGN 2
-/* number of ETHTOOL_GSTATS u64's */
-#define TG3_NUM_STATS (sizeof(struct tg3_ethtool_stats)/sizeof(u64))
-
-#define TG3_NUM_TEST 6
-
#define TG3_FW_UPDATE_TIMEOUT_SEC 5
#define FIRMWARE_TG3 "tigon/tg3.bin"
@@ -279,7 +274,7 @@ MODULE_DEVICE_TABLE(pci, tg3_pci_tbl);
static const struct {
const char string[ETH_GSTRING_LEN];
-} ethtool_stats_keys[TG3_NUM_STATS] = {
+} ethtool_stats_keys[] = {
{ "rx_octets" },
{ "rx_fragments" },
{ "rx_ucast_packets" },
@@ -358,9 +353,12 @@ static const struct {
{ "nic_tx_threshold_hit" }
};
+#define TG3_NUM_STATS ARRAY_SIZE(ethtool_stats_keys)
+
+
static const struct {
const char string[ETH_GSTRING_LEN];
-} ethtool_test_keys[TG3_NUM_TEST] = {
+} ethtool_test_keys[] = {
{ "nvram test (online) " },
{ "link test (online) " },
{ "register test (offline)" },
@@ -369,6 +367,9 @@ static const struct {
{ "interrupt test (offline)" },
};
+#define TG3_NUM_TEST ARRAY_SIZE(ethtool_test_keys)
+
+
static void tg3_write32(struct tg3 *tp, u32 off, u32 val)
{
writel(val, tp->regs + off);
--
1.7.3.4
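The idiom this patch switches to can be shown outside the driver. A minimal userspace sketch, assuming a simplified ARRAY_SIZE() (the kernel's version adds a type check) and a shortened, made-up key table; test_keys, NUM_TEST and num_test() are illustrative names, not the driver's.

```c
#include <stddef.h>

#define ETH_GSTRING_LEN 32	/* same value as in the kernel's ethtool header */

/* Simplified form of the kernel macro, without the must-be-array check. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Unsized array: the initializer list alone determines the element count. */
static const struct {
	const char string[ETH_GSTRING_LEN];
} test_keys[] = {
	{ "nvram test (online)" },
	{ "link test (online)" },
	{ "register test (offline)" },
};

/* Defined after the array, so it always tracks the initializer list. */
#define NUM_TEST ARRAY_SIZE(test_keys)

static size_t num_test(void)
{
	return NUM_TEST;
}
```

Adding or removing an entry changes NUM_TEST automatically, which removes the chance of the count and the table drifting apart. The macro has to come after the array definition, which is why the patch moves the #defines below the tables.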
* [PATCH net-next 2/5] tg3: Dump registers when status block shows errors
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson, Michael Chan
This patch monitors the error bit of the status word within the status
block. If it is set, the driver will dump the driver state after
validating the error and then reset the chip.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
---
drivers/net/tg3.c | 40 +++++++++++++++++++++++++++++++++++++++-
drivers/net/tg3.h | 3 +++
2 files changed, 42 insertions(+), 1 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 7274435..b61b52f 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -5259,6 +5259,40 @@ tx_recovery:
return work_done;
}
+static void tg3_process_error(struct tg3 *tp)
+{
+ u32 val;
+ bool real_error = false;
+
+ if (tp->tg3_flags & TG3_FLAG_ERROR_PROCESSED)
+ return;
+
+ /* Check Flow Attention register */
+ val = tr32(HOSTCC_FLOW_ATTN);
+ if (val & ~HOSTCC_FLOW_ATTN_MBUF_LWM) {
+ netdev_err(tp->dev, "FLOW Attention error. Resetting chip.\n");
+ real_error = true;
+ }
+
+ if (tr32(MSGINT_STATUS) & ~MSGINT_STATUS_MSI_REQ) {
+ netdev_err(tp->dev, "MSI Status error. Resetting chip.\n");
+ real_error = true;
+ }
+
+ if (tr32(RDMAC_STATUS) || tr32(WDMAC_STATUS)) {
+ netdev_err(tp->dev, "DMA Status error. Resetting chip.\n");
+ real_error = true;
+ }
+
+ if (!real_error)
+ return;
+
+ tg3_dump_state(tp);
+
+ tp->tg3_flags |= TG3_FLAG_ERROR_PROCESSED;
+ schedule_work(&tp->reset_task);
+}
+
static int tg3_poll(struct napi_struct *napi, int budget)
{
struct tg3_napi *tnapi = container_of(napi, struct tg3_napi, napi);
@@ -5267,6 +5301,9 @@ static int tg3_poll(struct napi_struct *napi, int budget)
struct tg3_hw_status *sblk = tnapi->hw_status;
while (1) {
+ if (sblk->status & SD_STATUS_ERROR)
+ tg3_process_error(tp);
+
tg3_poll_link(tp);
work_done = tg3_poll_work(tnapi, work_done, budget);
@@ -7316,7 +7353,8 @@ static int tg3_chip_reset(struct tg3 *tp)
tg3_restore_pci_state(tp);
- tp->tg3_flags &= ~TG3_FLAG_CHIP_RESETTING;
+ tp->tg3_flags &= ~(TG3_FLAG_CHIP_RESETTING |
+ TG3_FLAG_ERROR_PROCESSED);
val = 0;
if (tp->tg3_flags2 & TG3_FLG2_5780_CLASS)
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 9912010..b3ccfcc 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -1201,6 +1201,7 @@
#define HOSTCC_STATS_BLK_NIC_ADDR 0x00003c40
#define HOSTCC_STATUS_BLK_NIC_ADDR 0x00003c44
#define HOSTCC_FLOW_ATTN 0x00003c48
+#define HOSTCC_FLOW_ATTN_MBUF_LWM 0x00000040
/* 0x3c4c --> 0x3c50 unused */
#define HOSTCC_JUMBO_CON_IDX 0x00003c50
#define HOSTCC_STD_CON_IDX 0x00003c54
@@ -1611,6 +1612,7 @@
#define MSGINT_MODE_ONE_SHOT_DISABLE 0x00000020
#define MSGINT_MODE_MULTIVEC_EN 0x00000080
#define MSGINT_STATUS 0x00006004
+#define MSGINT_STATUS_MSI_REQ 0x00000001
#define MSGINT_FIFO 0x00006008
/* 0x600c --> 0x6400 unused */
@@ -2886,6 +2888,7 @@ struct tg3 {
#define TG3_FLAG_TAGGED_STATUS 0x00000001
#define TG3_FLAG_TXD_MBOX_HWBUG 0x00000002
#define TG3_FLAG_USE_LINKCHG_REG 0x00000008
+#define TG3_FLAG_ERROR_PROCESSED 0x00000010
#define TG3_FLAG_ENABLE_ASF 0x00000020
#define TG3_FLAG_ASPM_WORKAROUND 0x00000040
#define TG3_FLAG_POLL_SERDES 0x00000080
--
1.7.3.4
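The "validate the error" step in tg3_process_error() comes down to masking out bits that are expected to be set in normal operation. A userspace sketch of just that decision, assuming the register values have already been read; struct err_regs and is_real_error() are hypothetical names, and only the two mask constants are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define HOSTCC_FLOW_ATTN_MBUF_LWM	0x00000040u	/* from the patch */
#define MSGINT_STATUS_MSI_REQ		0x00000001u	/* from the patch */

/* Hypothetical snapshot of the four registers the patch reads. */
struct err_regs {
	uint32_t flow_attn;
	uint32_t msgint_status;
	uint32_t rdmac_status;
	uint32_t wdmac_status;
};

/*
 * Mirror of the checks in the patch: any flow attention bit other than
 * MBUF_LWM, any MSI status bit other than MSI_REQ, or any DMA status
 * bit at all counts as a real error.
 */
static bool is_real_error(const struct err_regs *r)
{
	if (r->flow_attn & ~HOSTCC_FLOW_ATTN_MBUF_LWM)
		return true;
	if (r->msgint_status & ~MSGINT_STATUS_MSI_REQ)
		return true;
	if (r->rdmac_status || r->wdmac_status)
		return true;
	return false;
}
```

In the driver the result additionally gates tg3_dump_state() and the reset work, and TG3_FLAG_ERROR_PROCESSED keeps the dump from being repeated until the chip reset clears the flag.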
* [PATCH net-next 5/5] tg3: Add support for extended VPD blocks
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson
In some devices, the VPD block is relocated to a different area in
NVRAM. The original location can still contain old, but still valid VPD
data. This patch changes the code to look for an extended VPD block in
NVRAM. If one is found, that block is used for all VPD operations
instead.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
---
drivers/net/tg3.c | 125 ++++++++++++++++++++++++++++++++++-------------------
drivers/net/tg3.h | 2 +
2 files changed, 83 insertions(+), 44 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 52dd516..10fa476 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -10416,6 +10416,81 @@ static void tg3_get_ethtool_stats(struct net_device *dev,
memcpy(tmp_stats, tg3_get_estats(tp), sizeof(tp->estats));
}
+static __be32 * tg3_vpd_readblock(struct tg3 *tp)
+{
+ int i;
+ __be32 *buf;
+ u32 offset = 0, len = 0;
+ u32 magic, val;
+
+ if ((tp->tg3_flags3 & TG3_FLG3_NO_NVRAM) ||
+ tg3_nvram_read(tp, 0, &magic))
+ return NULL;
+
+ if (magic == TG3_EEPROM_MAGIC) {
+ for (offset = TG3_NVM_DIR_START;
+ offset < TG3_NVM_DIR_END;
+ offset += TG3_NVM_DIRENT_SIZE) {
+ if (tg3_nvram_read(tp, offset, &val))
+ return NULL;
+
+ if ((val >> TG3_NVM_DIRTYPE_SHIFT) ==
+ TG3_NVM_DIRTYPE_EXTVPD)
+ break;
+ }
+
+ if (offset != TG3_NVM_DIR_END) {
+ len = (val & TG3_NVM_DIRTYPE_LENMSK) * 4;
+ if (tg3_nvram_read(tp, offset + 4, &offset))
+ return NULL;
+
+ offset = tg3_nvram_logical_addr(tp, offset);
+ }
+ }
+
+ if (!offset || !len) {
+ offset = TG3_NVM_VPD_OFF;
+ len = TG3_NVM_VPD_LEN;
+ }
+
+ buf = kmalloc(len, GFP_KERNEL);
+ if (buf == NULL)
+ return NULL;
+
+ if (magic == TG3_EEPROM_MAGIC) {
+ for (i = 0; i < len; i += 4) {
+ /* The data is in little-endian format in NVRAM.
+ * Use the big-endian read routines to preserve
+ * the byte order as it exists in NVRAM.
+ */
+ if (tg3_nvram_read_be32(tp, offset + i, &buf[i/4]))
+ goto error;
+ }
+ } else {
+ u8 *ptr;
+ ssize_t cnt;
+ unsigned int pos = 0;
+
+ ptr = (u8 *)&buf[0];
+ for (i = 0; pos < len && i < 3; i++, pos += cnt, ptr += cnt) {
+ cnt = pci_read_vpd(tp->pdev, pos,
+ len - pos, ptr);
+ if (cnt == -ETIMEDOUT || cnt == -EINTR)
+ cnt = 0;
+ else if (cnt < 0)
+ goto error;
+ }
+ if (pos != len)
+ goto error;
+ }
+
+ return buf;
+
+error:
+ kfree(buf);
+ return NULL;
+}
+
#define NVRAM_TEST_SIZE 0x100
#define NVRAM_SELFBOOT_FORMAT1_0_SIZE 0x14
#define NVRAM_SELFBOOT_FORMAT1_2_SIZE 0x18
@@ -10555,14 +10630,11 @@ static int tg3_test_nvram(struct tg3 *tp)
if (csum != le32_to_cpu(buf[0xfc/4]))
goto out;
- for (i = 0; i < TG3_NVM_VPD_LEN; i += 4) {
- /* The data is in little-endian format in NVRAM.
- * Use the big-endian read routines to preserve
- * the byte order as it exists in NVRAM.
- */
- if (tg3_nvram_read_be32(tp, TG3_NVM_VPD_OFF + i, &buf[i/4]))
- goto out;
- }
+ kfree(buf);
+
+ buf = tg3_vpd_readblock(tp);
+ if (!buf)
+ return -ENOMEM;
i = pci_vpd_find_tag((u8 *)buf, 0, TG3_NVM_VPD_LEN,
PCI_VPD_LRDT_RO_DATA);
@@ -12905,46 +12977,11 @@ static void __devinit tg3_read_vpd(struct tg3 *tp)
u8 *vpd_data;
unsigned int block_end, rosize, len;
int j, i = 0;
- u32 magic;
-
- if ((tp->tg3_flags3 & TG3_FLG3_NO_NVRAM) ||
- tg3_nvram_read(tp, 0x0, &magic))
- goto out_no_vpd;
- vpd_data = kmalloc(TG3_NVM_VPD_LEN, GFP_KERNEL);
+ vpd_data = (u8 *)tg3_vpd_readblock(tp);
if (!vpd_data)
goto out_no_vpd;
- if (magic == TG3_EEPROM_MAGIC) {
- for (i = 0; i < TG3_NVM_VPD_LEN; i += 4) {
- u32 tmp;
-
- /* The data is in little-endian format in NVRAM.
- * Use the big-endian read routines to preserve
- * the byte order as it exists in NVRAM.
- */
- if (tg3_nvram_read_be32(tp, TG3_NVM_VPD_OFF + i, &tmp))
- goto out_not_found;
-
- memcpy(&vpd_data[i], &tmp, sizeof(tmp));
- }
- } else {
- ssize_t cnt;
- unsigned int pos = 0;
-
- for (; pos < TG3_NVM_VPD_LEN && i < 3; i++, pos += cnt) {
- cnt = pci_read_vpd(tp->pdev, pos,
- TG3_NVM_VPD_LEN - pos,
- &vpd_data[pos]);
- if (cnt == -ETIMEDOUT || cnt == -EINTR)
- cnt = 0;
- else if (cnt < 0)
- goto out_not_found;
- }
- if (pos != TG3_NVM_VPD_LEN)
- goto out_not_found;
- }
-
i = pci_vpd_find_tag(vpd_data, 0, TG3_NVM_VPD_LEN,
PCI_VPD_LRDT_RO_DATA);
if (i < 0)
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index b3ccfcc..224c3e0 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2009,7 +2009,9 @@
#define TG3_NVM_DIR_END 0x78
#define TG3_NVM_DIRENT_SIZE 0xc
#define TG3_NVM_DIRTYPE_SHIFT 24
+#define TG3_NVM_DIRTYPE_LENMSK 0x003fffff
#define TG3_NVM_DIRTYPE_ASFINI 1
+#define TG3_NVM_DIRTYPE_EXTVPD 20
#define TG3_NVM_PTREV_BCVER 0x94
#define TG3_NVM_BCVER_MAJMSK 0x0000ff00
#define TG3_NVM_BCVER_MAJSFT 8
--
1.7.3.4
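The directory lookup in tg3_vpd_readblock() decodes each directory word into a type and a length. A small userspace sketch of that decoding, using the three constants as they appear in the patch; dirent_is_extvpd() and dirent_len_bytes() are hypothetical helper names, and the example words in the note are made up.

```c
#include <stdbool.h>
#include <stdint.h>

#define TG3_NVM_DIRTYPE_SHIFT	24		/* from tg3.h in the patch */
#define TG3_NVM_DIRTYPE_LENMSK	0x003fffffu	/* from tg3.h in the patch */
#define TG3_NVM_DIRTYPE_EXTVPD	20		/* from tg3.h in the patch */

/* The top byte of the directory word holds the entry type. */
static bool dirent_is_extvpd(uint32_t val)
{
	return (val >> TG3_NVM_DIRTYPE_SHIFT) == TG3_NVM_DIRTYPE_EXTVPD;
}

/* The low 22 bits hold the length in 32-bit words; convert to bytes. */
static uint32_t dirent_len_bytes(uint32_t val)
{
	return (val & TG3_NVM_DIRTYPE_LENMSK) * 4;
}
```

For example, a word of 0x14000040 would decode as an extended VPD entry of 256 bytes, while 0x01000010 (type 1, TG3_NVM_DIRTYPE_ASFINI) would be skipped by the search loop. If no entry matches, or the decoded offset or length is zero, the patch falls back to TG3_NVM_VPD_OFF and TG3_NVM_VPD_LEN.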
* [PATCH net-next 4/5] tg3: Add jumbo frame loopback tests to selftest
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson
This patch adds jumbo frame loopback test support to the ethtool
selftest.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
---
drivers/net/tg3.c | 34 +++++++++++++++++++++++++---------
1 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 9975cdb..52dd516 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -10935,7 +10935,7 @@ static int tg3_test_memory(struct tg3 *tp)
#define TG3_MAC_LOOPBACK 0
#define TG3_PHY_LOOPBACK 1
-static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
+static int tg3_run_loopback(struct tg3 *tp, u32 pktsz, int loopback_mode)
{
u32 mac_mode, rx_start_idx, rx_idx, tx_idx, opaque_key;
u32 desc_idx, coal_now;
@@ -11033,7 +11033,7 @@ static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
err = -EIO;
- tx_len = 1514;
+ tx_len = pktsz;
skb = netdev_alloc_skb(tp->dev, tx_len);
if (!skb)
return -ENOMEM;
@@ -11042,7 +11042,7 @@ static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
memcpy(tx_data, tp->dev->dev_addr, 6);
memset(tx_data + 6, 0x0, 8);
- tw32(MAC_RX_MTU_SIZE, tx_len + 4);
+ tw32(MAC_RX_MTU_SIZE, tx_len + ETH_FCS_LEN);
for (i = 14; i < tx_len; i++)
tx_data[i] = (u8) (i & 0xff);
@@ -11098,8 +11098,6 @@ static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
desc = &rnapi->rx_rcb[rx_start_idx];
desc_idx = desc->opaque & RXD_OPAQUE_INDEX_MASK;
opaque_key = desc->opaque & RXD_OPAQUE_RING_MASK;
- if (opaque_key != RXD_OPAQUE_RING_STD)
- goto out;
if ((desc->err_vlan & RXD_ERR_MASK) != 0 &&
(desc->err_vlan != RXD_ERR_ODD_NIBBLE_RCVD_MII))
@@ -11109,9 +11107,20 @@ static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
if (rx_len != tx_len)
goto out;
- rx_skb = tpr->rx_std_buffers[desc_idx].skb;
+ if (pktsz <= TG3_RX_STD_DMA_SZ - ETH_FCS_LEN) {
+ if (opaque_key != RXD_OPAQUE_RING_STD)
+ goto out;
+
+ rx_skb = tpr->rx_std_buffers[desc_idx].skb;
+ map = dma_unmap_addr(&tpr->rx_std_buffers[desc_idx], mapping);
+ } else {
+ if (opaque_key != RXD_OPAQUE_RING_JUMBO)
+ goto out;
+
+ rx_skb = tpr->rx_jmb_buffers[desc_idx].skb;
+ map = dma_unmap_addr(&tpr->rx_jmb_buffers[desc_idx], mapping);
+ }
- map = dma_unmap_addr(&tpr->rx_std_buffers[desc_idx], mapping);
pci_dma_sync_single_for_cpu(tp->pdev, map, rx_len, PCI_DMA_FROMDEVICE);
for (i = 14; i < tx_len; i++) {
@@ -11177,9 +11186,13 @@ static int tg3_test_loopback(struct tg3 *tp)
CPMU_CTRL_LINK_AWARE_MODE));
}
- if (tg3_run_loopback(tp, TG3_MAC_LOOPBACK))
+ if (tg3_run_loopback(tp, ETH_FRAME_LEN, TG3_MAC_LOOPBACK))
err |= TG3_MAC_LOOPBACK_FAILED;
+ if ((tp->tg3_flags & TG3_FLAG_JUMBO_RING_ENABLE) &&
+ tg3_run_loopback(tp, 9000 + ETH_HLEN, TG3_MAC_LOOPBACK))
+ err |= (TG3_MAC_LOOPBACK_FAILED << 2);
+
if (tp->tg3_flags & TG3_FLAG_CPMU_PRESENT) {
tw32(TG3_CPMU_CTRL, cpmuctrl);
@@ -11189,8 +11202,11 @@ static int tg3_test_loopback(struct tg3 *tp)
if (!(tp->phy_flags & TG3_PHYFLG_PHY_SERDES) &&
!(tp->tg3_flags3 & TG3_FLG3_USE_PHYLIB)) {
- if (tg3_run_loopback(tp, TG3_PHY_LOOPBACK))
+ if (tg3_run_loopback(tp, ETH_FRAME_LEN, TG3_PHY_LOOPBACK))
err |= TG3_PHY_LOOPBACK_FAILED;
+ if ((tp->tg3_flags & TG3_FLAG_JUMBO_RING_ENABLE) &&
+ tg3_run_loopback(tp, 9000 + ETH_HLEN, TG3_PHY_LOOPBACK))
+ err |= (TG3_PHY_LOOPBACK_FAILED << 2);
}
/* Re-enable gphy autopowerdown. */
--
1.7.3.4
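The ring check in this patch depends only on the test packet size, so it can be sketched outside the driver. This is a minimal userspace illustration, not driver code: the `SKETCH_*` names are hypothetical, and the value 1536 for the standard-ring buffer size is an assumption standing in for `TG3_RX_STD_DMA_SZ`.

```c
#include <stdint.h>

/* Illustrative constants; the kernel headers are authoritative. */
#define SKETCH_ETH_FCS_LEN    4     /* frame check sequence, in bytes */
#define SKETCH_ETH_HLEN       14    /* Ethernet header, in bytes */
#define SKETCH_ETH_FRAME_LEN  1514  /* header plus 1500-byte payload */
#define SKETCH_RX_STD_DMA_SZ  1536  /* ASSUMPTION: standard ring buffer size */

enum sketch_ring { SKETCH_RING_STD, SKETCH_RING_JUMBO };

/*
 * Return the ring on which a looped-back frame of pktsz bytes is
 * expected to arrive, mirroring the size test added by the patch.
 */
static enum sketch_ring sketch_expected_ring(uint32_t pktsz)
{
	if (pktsz <= SKETCH_RX_STD_DMA_SZ - SKETCH_ETH_FCS_LEN)
		return SKETCH_RING_STD;
	return SKETCH_RING_JUMBO;
}
```

Under that assumption, the ETH_FRAME_LEN test is expected on the standard ring and the 9000 + ETH_HLEN test on the jumbo ring.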
^ permalink raw reply related
* [PATCH net-next 0/5] tg3: Add more selftest and debug support
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson
This patchset adds register dump capabilities for first failure debugging,
a jumbo frame loopback test mode, and extended VPD block handling.
^ permalink raw reply
* [PATCH net-next 1/5] tg3: Provide full regdump on tx timeout
From: Matt Carlson @ 2011-04-13 21:05 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson, Michael Chan
The current amount of information provided in the output of a tx timeout
is insufficient to determine a root cause. This patch replaces the
terse, four-register status output with a more complete body of
information. For PCIe devices, the full register space is dumped. For
other devices, select registers are dumped instead.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
---
drivers/net/tg3.c | 189 ++++++++++++++++++++++++++++++++++-------------------
drivers/net/tg3.h | 2 +
2 files changed, 123 insertions(+), 68 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 9d7defc..7274435 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4459,6 +4459,123 @@ static inline int tg3_irq_sync(struct tg3 *tp)
return tp->irq_sync;
}
+static inline void tg3_rd32_loop(struct tg3 *tp, u32 *dst, u32 off, u32 len)
+{
+ int i;
+
+ dst = (u32 *)((u8 *)dst + off);
+ for (i = 0; i < len; i += sizeof(u32))
+ *dst++ = tr32(off + i);
+}
+
+static void tg3_dump_legacy_regs(struct tg3 *tp, u32 *regs)
+{
+ tg3_rd32_loop(tp, regs, TG3PCI_VENDOR, 0xb0);
+ tg3_rd32_loop(tp, regs, MAILBOX_INTERRUPT_0, 0x200);
+ tg3_rd32_loop(tp, regs, MAC_MODE, 0x4f0);
+ tg3_rd32_loop(tp, regs, SNDDATAI_MODE, 0xe0);
+ tg3_rd32_loop(tp, regs, SNDDATAC_MODE, 0x04);
+ tg3_rd32_loop(tp, regs, SNDBDS_MODE, 0x80);
+ tg3_rd32_loop(tp, regs, SNDBDI_MODE, 0x48);
+ tg3_rd32_loop(tp, regs, SNDBDC_MODE, 0x04);
+ tg3_rd32_loop(tp, regs, RCVLPC_MODE, 0x20);
+ tg3_rd32_loop(tp, regs, RCVLPC_SELLST_BASE, 0x15c);
+ tg3_rd32_loop(tp, regs, RCVDBDI_MODE, 0x0c);
+ tg3_rd32_loop(tp, regs, RCVDBDI_JUMBO_BD, 0x3c);
+ tg3_rd32_loop(tp, regs, RCVDBDI_BD_PROD_IDX_0, 0x44);
+ tg3_rd32_loop(tp, regs, RCVDCC_MODE, 0x04);
+ tg3_rd32_loop(tp, regs, RCVBDI_MODE, 0x20);
+ tg3_rd32_loop(tp, regs, RCVCC_MODE, 0x14);
+ tg3_rd32_loop(tp, regs, RCVLSC_MODE, 0x08);
+ tg3_rd32_loop(tp, regs, MBFREE_MODE, 0x08);
+ tg3_rd32_loop(tp, regs, HOSTCC_MODE, 0x100);
+
+ if (tp->tg3_flags & TG3_FLAG_SUPPORT_MSIX)
+ tg3_rd32_loop(tp, regs, HOSTCC_RXCOL_TICKS_VEC1, 0x180);
+
+ tg3_rd32_loop(tp, regs, MEMARB_MODE, 0x10);
+ tg3_rd32_loop(tp, regs, BUFMGR_MODE, 0x58);
+ tg3_rd32_loop(tp, regs, RDMAC_MODE, 0x08);
+ tg3_rd32_loop(tp, regs, WDMAC_MODE, 0x08);
+ tg3_rd32_loop(tp, regs, RX_CPU_MODE, 0x04);
+ tg3_rd32_loop(tp, regs, RX_CPU_STATE, 0x04);
+ tg3_rd32_loop(tp, regs, RX_CPU_PGMCTR, 0x04);
+ tg3_rd32_loop(tp, regs, RX_CPU_HWBKPT, 0x04);
+
+ if (!(tp->tg3_flags2 & TG3_FLG2_5705_PLUS)) {
+ tg3_rd32_loop(tp, regs, TX_CPU_MODE, 0x04);
+ tg3_rd32_loop(tp, regs, TX_CPU_STATE, 0x04);
+ tg3_rd32_loop(tp, regs, TX_CPU_PGMCTR, 0x04);
+ }
+
+ tg3_rd32_loop(tp, regs, GRCMBOX_INTERRUPT_0, 0x110);
+ tg3_rd32_loop(tp, regs, FTQ_RESET, 0x120);
+ tg3_rd32_loop(tp, regs, MSGINT_MODE, 0x0c);
+ tg3_rd32_loop(tp, regs, DMAC_MODE, 0x04);
+ tg3_rd32_loop(tp, regs, GRC_MODE, 0x4c);
+
+ if (tp->tg3_flags & TG3_FLAG_NVRAM)
+ tg3_rd32_loop(tp, regs, NVRAM_CMD, 0x24);
+}
+
+static void tg3_dump_state(struct tg3 *tp)
+{
+ int i;
+ u32 *regs;
+
+ regs = kzalloc(TG3_REG_BLK_SIZE, GFP_ATOMIC);
+ if (!regs) {
+ netdev_err(tp->dev, "Failed allocating register dump buffer\n");
+ return;
+ }
+
+ if (tp->tg3_flags2 & TG3_FLG2_PCI_EXPRESS) {
+ /* Read up to but not including private PCI registers */
+ for (i = 0; i < TG3_PCIE_TLDLPL_PORT; i += sizeof(u32))
+ regs[i / sizeof(u32)] = tr32(i);
+ } else
+ tg3_dump_legacy_regs(tp, regs);
+
+ for (i = 0; i < TG3_REG_BLK_SIZE / sizeof(u32); i += 4) {
+ if (!regs[i + 0] && !regs[i + 1] &&
+ !regs[i + 2] && !regs[i + 3])
+ continue;
+
+ netdev_err(tp->dev, "0x%08x: 0x%08x, 0x%08x, 0x%08x, 0x%08x\n",
+ i * 4,
+ regs[i + 0], regs[i + 1], regs[i + 2], regs[i + 3]);
+ }
+
+ kfree(regs);
+
+ for (i = 0; i < tp->irq_cnt; i++) {
+ struct tg3_napi *tnapi = &tp->napi[i];
+
+ /* SW status block */
+ netdev_err(tp->dev,
+ "%d: Host status block [%08x:%08x:(%04x:%04x:%04x):(%04x:%04x)]\n",
+ i,
+ tnapi->hw_status->status,
+ tnapi->hw_status->status_tag,
+ tnapi->hw_status->rx_jumbo_consumer,
+ tnapi->hw_status->rx_consumer,
+ tnapi->hw_status->rx_mini_consumer,
+ tnapi->hw_status->idx[0].rx_producer,
+ tnapi->hw_status->idx[0].tx_consumer);
+
+ netdev_err(tp->dev,
+ "%d: NAPI info [%08x:%08x:(%04x:%04x:%04x):%04x:(%04x:%04x:%04x:%04x)]\n",
+ i,
+ tnapi->last_tag, tnapi->last_irq_tag,
+ tnapi->tx_prod, tnapi->tx_cons, tnapi->tx_pending,
+ tnapi->rx_rcb_ptr,
+ tnapi->prodring.rx_std_prod_idx,
+ tnapi->prodring.rx_std_cons_idx,
+ tnapi->prodring.rx_jmb_prod_idx,
+ tnapi->prodring.rx_jmb_cons_idx);
+ }
+}
+
/* This is called whenever we suspect that the system chipset is re-
* ordering the sequence of MMIO to the tx send mailbox. The symptom
* is bogus tx completions. We try to recover by setting the
@@ -5516,21 +5633,13 @@ out:
tg3_phy_start(tp);
}
-static void tg3_dump_short_state(struct tg3 *tp)
-{
- netdev_err(tp->dev, "DEBUG: MAC_TX_STATUS[%08x] MAC_RX_STATUS[%08x]\n",
- tr32(MAC_TX_STATUS), tr32(MAC_RX_STATUS));
- netdev_err(tp->dev, "DEBUG: RDMAC_STATUS[%08x] WDMAC_STATUS[%08x]\n",
- tr32(RDMAC_STATUS), tr32(WDMAC_STATUS));
-}
-
static void tg3_tx_timeout(struct net_device *dev)
{
struct tg3 *tp = netdev_priv(dev);
if (netif_msg_tx_err(tp)) {
netdev_err(dev, "transmit timed out, resetting\n");
- tg3_dump_short_state(tp);
+ tg3_dump_state(tp);
}
schedule_work(&tp->reset_task);
@@ -9624,82 +9733,26 @@ static void tg3_set_rx_mode(struct net_device *dev)
tg3_full_unlock(tp);
}
-#define TG3_REGDUMP_LEN (32 * 1024)
-
static int tg3_get_regs_len(struct net_device *dev)
{
- return TG3_REGDUMP_LEN;
+ return TG3_REG_BLK_SIZE;
}
static void tg3_get_regs(struct net_device *dev,
struct ethtool_regs *regs, void *_p)
{
- u32 *p = _p;
struct tg3 *tp = netdev_priv(dev);
- u8 *orig_p = _p;
- int i;
regs->version = 0;
- memset(p, 0, TG3_REGDUMP_LEN);
+ memset(_p, 0, TG3_REG_BLK_SIZE);
if (tp->phy_flags & TG3_PHYFLG_IS_LOW_POWER)
return;
tg3_full_lock(tp, 0);
-#define __GET_REG32(reg) (*(p)++ = tr32(reg))
-#define GET_REG32_LOOP(base, len) \
-do { p = (u32 *)(orig_p + (base)); \
- for (i = 0; i < len; i += 4) \
- __GET_REG32((base) + i); \
-} while (0)
-#define GET_REG32_1(reg) \
-do { p = (u32 *)(orig_p + (reg)); \
- __GET_REG32((reg)); \
-} while (0)
-
- GET_REG32_LOOP(TG3PCI_VENDOR, 0xb0);
- GET_REG32_LOOP(MAILBOX_INTERRUPT_0, 0x200);
- GET_REG32_LOOP(MAC_MODE, 0x4f0);
- GET_REG32_LOOP(SNDDATAI_MODE, 0xe0);
- GET_REG32_1(SNDDATAC_MODE);
- GET_REG32_LOOP(SNDBDS_MODE, 0x80);
- GET_REG32_LOOP(SNDBDI_MODE, 0x48);
- GET_REG32_1(SNDBDC_MODE);
- GET_REG32_LOOP(RCVLPC_MODE, 0x20);
- GET_REG32_LOOP(RCVLPC_SELLST_BASE, 0x15c);
- GET_REG32_LOOP(RCVDBDI_MODE, 0x0c);
- GET_REG32_LOOP(RCVDBDI_JUMBO_BD, 0x3c);
- GET_REG32_LOOP(RCVDBDI_BD_PROD_IDX_0, 0x44);
- GET_REG32_1(RCVDCC_MODE);
- GET_REG32_LOOP(RCVBDI_MODE, 0x20);
- GET_REG32_LOOP(RCVCC_MODE, 0x14);
- GET_REG32_LOOP(RCVLSC_MODE, 0x08);
- GET_REG32_1(MBFREE_MODE);
- GET_REG32_LOOP(HOSTCC_MODE, 0x100);
- GET_REG32_LOOP(MEMARB_MODE, 0x10);
- GET_REG32_LOOP(BUFMGR_MODE, 0x58);
- GET_REG32_LOOP(RDMAC_MODE, 0x08);
- GET_REG32_LOOP(WDMAC_MODE, 0x08);
- GET_REG32_1(RX_CPU_MODE);
- GET_REG32_1(RX_CPU_STATE);
- GET_REG32_1(RX_CPU_PGMCTR);
- GET_REG32_1(RX_CPU_HWBKPT);
- GET_REG32_1(TX_CPU_MODE);
- GET_REG32_1(TX_CPU_STATE);
- GET_REG32_1(TX_CPU_PGMCTR);
- GET_REG32_LOOP(GRCMBOX_INTERRUPT_0, 0x110);
- GET_REG32_LOOP(FTQ_RESET, 0x120);
- GET_REG32_LOOP(MSGINT_MODE, 0x0c);
- GET_REG32_1(DMAC_MODE);
- GET_REG32_LOOP(GRC_MODE, 0x4c);
- if (tp->tg3_flags & TG3_FLAG_NVRAM)
- GET_REG32_LOOP(NVRAM_CMD, 0x24);
-
-#undef __GET_REG32
-#undef GET_REG32_LOOP
-#undef GET_REG32_1
+ tg3_dump_legacy_regs(tp, (u32 *)_p);
tg3_full_unlock(tp);
}
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 829a84a..9912010 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -1954,6 +1954,8 @@
#define TG3_PCIE_PL_LO_PHYCTL5 0x00000014
#define TG3_PCIE_PL_LO_PHYCTL5_DIS_L2CLKREQ 0x80000000
+#define TG3_REG_BLK_SIZE 0x00008000
+
/* OTP bit definitions */
#define TG3_OTP_AGCTGT_MASK 0x000000e0
#define TG3_OTP_AGCTGT_SHIFT 1
--
1.7.3.4
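The two ideas in the patch above, copying each register block to its own byte offset in the buffer and then printing only the rows that contain data, can be tried in userspace. This is a rough sketch under stated assumptions: `sketch_read_reg()` is a hypothetical stand-in for `tr32()` and returns made-up values, and all `sketch_*` names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for tr32(): returns a fake, non-zero value. */
static uint32_t sketch_read_reg(uint32_t off)
{
	return off ^ 0xa5a50000u;
}

/* Copy len bytes of registers starting at off to the same byte offset in dst. */
static void sketch_rd32_loop(uint32_t *dst, uint32_t off, uint32_t len)
{
	uint32_t i;

	dst = (uint32_t *)((uint8_t *)dst + off);
	for (i = 0; i < len; i += sizeof(uint32_t))
		*dst++ = sketch_read_reg(off + i);
}

/* Print four words per row, skipping all-zero rows. Returns rows printed. */
static size_t sketch_dump(const uint32_t *regs, size_t nwords)
{
	size_t i, rows = 0;

	for (i = 0; i + 3 < nwords; i += 4) {
		if (!regs[i] && !regs[i + 1] && !regs[i + 2] && !regs[i + 3])
			continue;

		printf("0x%08zx: 0x%08x, 0x%08x, 0x%08x, 0x%08x\n",
		       i * 4,
		       (unsigned int)regs[i], (unsigned int)regs[i + 1],
		       (unsigned int)regs[i + 2], (unsigned int)regs[i + 3]);
		rows++;
	}
	return rows;
}
```

Because the buffer starts zeroed, any block that was never read stays all-zero and drops out of the output.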
^ permalink raw reply related
* Re: [PATCH] NFS: Fix infinite loop in gss_create_upcall()
From: Bryan Schumaker @ 2011-04-13 20:42 UTC (permalink / raw)
To: Jiri Slaby
Cc: Trond Myklebust, Jiri Slaby, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
mm-commits-u79uwXL29TY76Z2rM5mHXA, ML netdev,
linux-nfs-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4DA49F7F.8060005-AlSwsSmVLrQ@public.gmane.org>
On 04/12/2011 02:52 PM, Jiri Slaby wrote:
> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>>> inacceptable for automounted NFS dirs.
>>>>
>>>> I'm still confused as to why you are hitting it at all. In the normal
>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>>> and then trying rpcsec_gss if and only if that fails.
>>>>
>>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>>> supported flavour?
>>>
>>> I don't know, I connect to a nfs server which is not maintained by me.
>>> It looks like that. How can I find out?
>>
>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
>
> I don't have NFS in modules. It's all built-in. And this one is
> unconditionally selected because of CONFIG_NFS_V4.
Does this patch help?
- Bryan
We should attempt an AUTH_NULL style mount before
trying gss flavors. This should prevent a hang if
gss modules are loaded but the userspace program
isn't running.
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9bf41ea..4e3c16b 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2218,8 +2218,8 @@ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
rpc_authflavor_t flav_array[NFS_MAX_SECFLAVORS + 2];
flav_array[0] = RPC_AUTH_UNIX;
- len = gss_mech_list_pseudoflavors(&flav_array[1]);
- flav_array[1+len] = RPC_AUTH_NULL;
+ flav_array[1] = RPC_AUTH_NULL;
+ len = gss_mech_list_pseudoflavors(&flav_array[2]);
len += 2;
for (i = 0; i < len; i++) {
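The effect of the two changed lines is only a reordering of the candidate list, which a small userspace sketch can make visible. Everything named `sketch_*` here is hypothetical, and `sketch_list_gss()` is merely a stand-in for `gss_mech_list_pseudoflavors()` that returns three example pseudoflavor numbers.

```c
#include <stddef.h>

/* Illustrative flavor numbers and limit. */
enum { SKETCH_AUTH_NULL = 0, SKETCH_AUTH_UNIX = 1 };
#define SKETCH_MAX_GSS 12

/* Hypothetical stand-in: fills out[] and returns the number written. */
static int sketch_list_gss(unsigned int *out)
{
	out[0] = 390003;
	out[1] = 390004;
	out[2] = 390005;
	return 3;
}

/*
 * Build the order in which flavors are tried after the patch:
 * AUTH_UNIX, then AUTH_NULL, then any gss pseudoflavors.
 * Returns the number of entries, or -1 if the array is too small.
 */
static int sketch_build_order(unsigned int *flav, size_t cap)
{
	int len;

	if (cap < SKETCH_MAX_GSS + 2)
		return -1;

	flav[0] = SKETCH_AUTH_UNIX;
	flav[1] = SKETCH_AUTH_NULL;
	len = sketch_list_gss(&flav[2]);
	return len + 2;
}
```

With this order, the gss flavors are reached only if both AUTH_UNIX and AUTH_NULL have failed.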
>
> regards,
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* Re: [PATCH] Add Qualcomm Gobi 2000/3000 driver.
From: David Miller @ 2011-04-13 20:37 UTC (permalink / raw)
To: ellyjones; +Cc: netdev, dcbw, mjg59, jglasgow, trond
In-Reply-To: <20110413190023.GC1652@google.com>
From: Elly Jones <ellyjones@google.com>
Date: Wed, 13 Apr 2011 15:00:24 -0400
> +void qcusbnet_put(struct qcusbnet *dev)
> +{
> + mutex_lock(&qcusbnet_lock);
> + kref_put(&dev->refcount, free_dev);
> + mutex_unlock(&qcusbnet_lock);
> +}
This locking looks excessive, and shouldn't be needed simply to
release a reference to an object.
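As a rough userspace analogy for why dropping a reference does not by itself need a mutex, here is a sketch using C11 atomics. It is not the kernel's kref implementation, and all `sketch_*` names are hypothetical. A lock could still be warranted if lookups that take new references are serialized by the same lock; the sketch assumes that is not the case.

```c
#include <stdatomic.h>

struct sketch_ref {
	atomic_int count;
};

static int sketch_released;	/* counts release calls, for the demo only */

static void sketch_release(struct sketch_ref *r)
{
	(void)r;
	sketch_released++;
}

static void sketch_ref_init(struct sketch_ref *r)
{
	atomic_init(&r->count, 1);
}

static void sketch_ref_get(struct sketch_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/*
 * Drop one reference. The atomic decrement alone decides which caller
 * saw the count reach zero, so only that caller runs release().
 * Returns 1 if release() was called, 0 otherwise.
 */
static int sketch_ref_put(struct sketch_ref *r,
			  void (*release)(struct sketch_ref *))
{
	if (atomic_fetch_sub(&r->count, 1) == 1) {
		release(r);
		return 1;
	}
	return 0;
}
```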
> +int qc_suspend(struct usb_interface *iface, pm_message_t event)
> +{
> + struct usbnet *usbnet;
> + struct qcusbnet *dev;
> +
> + if (!iface)
> + return -ENOMEM;
When is qc_suspend() called with a NULL iface argument?
> +static int qc_resume(struct usb_interface *iface)
> +{
> + struct usbnet *usbnet;
> + struct qcusbnet *dev;
> + int ret;
> + int oldstate;
> +
> + if (iface == 0)
> + return -ENOMEM;
Likewise. And if the check is needed, use consistent tests for NULL.
Testing against the integer "0" is definitely the wrong way.
> + if (usb_endpoint_dir_in(&endpoint->desc)
> + && !usb_endpoint_xfer_int(&endpoint->desc)) {
Please do it like this:
if (A &&
B) {
Not like:
if (A
&& B
the latter looks awful at best.
> + if (!usbnet || !usbnet->net) {
> + DBG("failed to get usbnet device\n");
> + return;
> + }
> +
> + dev = (struct qcusbnet *)usbnet->data[0];
> + if (!dev) {
> + DBG("failed to get QMIDevice\n");
> + return;
> + }
These NULL checks are everywhere! Do we really _ever_ create a fully
registered netdev with any of these things being NULL? I severely
doubt it.
> +static int qcnet_worker(void *arg)
> +{
> + struct list_head *node, *tmp;
> + unsigned long activeflags, listflags;
> + struct urbreq *req;
> + int status;
> + struct usb_device *usbdev;
> + struct worker *worker = arg;
> + if (!worker) {
> + DBG("passed null pointer\n");
> + return -EINVAL;
> + }
This NULL check is unnecessary: you register the worker function with an
explicit &dev->worker argument, so seeing NULL here is impossible.
> +static int qcnet_startxmit(struct sk_buff *skb, struct net_device *netdev)
> +{
> + unsigned long listflags;
> + struct qcusbnet *dev;
> + struct worker *worker;
> + struct urbreq *req;
> + void *data;
> + struct usbnet *usbnet = netdev_priv(netdev);
> +
> + DBG("\n");
> +
> + if (!usbnet || !usbnet->net) {
> + DBG("failed to get usbnet device\n");
> + return NETDEV_TX_BUSY;
> + }
> +
> + dev = (struct qcusbnet *)usbnet->data[0];
> + if (!dev) {
Again, kill this NULL check noise; it can't all be necessary.
> + netdev->trans_start = jiffies;
Setting netdev->trans_start in drivers is expensive and deprecated,
please set netdev_queue->trans_start instead.
> +static int qcnet_open(struct net_device *netdev)
> +{
> + int status = 0;
> + struct qcusbnet *dev;
> + struct usbnet *usbnet = netdev_priv(netdev);
> +
> + if (!usbnet) {
> + DBG("failed to get usbnet device\n");
> + return -ENXIO;
> + }
> +
> + dev = (struct qcusbnet *)usbnet->data[0];
> + if (!dev) {
> + DBG("failed to get QMIDevice\n");
> + return -ENXIO;
> + }
Again, excessive NULL checks.
> +int qcnet_stop(struct net_device *netdev)
> +{
> + struct qcusbnet *dev;
> + struct usbnet *usbnet = netdev_priv(netdev);
> +
> + if (!usbnet || !usbnet->net) {
> + DBG("failed to get netdevice\n");
> + return -ENXIO;
> + }
> +
> + dev = (struct qcusbnet *)usbnet->data[0];
> + if (!dev) {
> + DBG("failed to get QMIDevice\n");
> + return -ENXIO;
> + }
Here too.
> +static u8 nibble(unsigned char c)
> +{
> + if (likely(isdigit(c)))
> + return c - '0';
> + c = toupper(c);
> + if (likely(isxdigit(c)))
> + return 10 + c - 'A';
> + return 0;
> +}
Remove this function and use hex_to_bin() instead.
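For readers without the kernel tree at hand, the behaviour expected from such a helper can be sketched in userspace. This is an illustration of the semantics only, not the kernel's source; `sketch_hex_to_bin()` is a hypothetical name. Unlike the quoted nibble(), which returns 0 for bad input, it returns -1 so callers can tell an invalid character from a valid '0'.

```c
#include <ctype.h>

/*
 * Convert one hexadecimal digit to its value.
 * Returns 0..15 for a valid digit, -1 for anything else.
 */
static int sketch_hex_to_bin(char ch)
{
	if (ch >= '0' && ch <= '9')
		return ch - '0';

	ch = (char)tolower((unsigned char)ch);
	if (ch >= 'a' && ch <= 'f')
		return ch - 'a' + 10;

	return -1;
}
```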
^ permalink raw reply
* Re: [net-next-2.6 RFC PATCH v2 00/13] ethtool: allow custom interval for
From: David Miller @ 2011-04-13 20:25 UTC (permalink / raw)
To: bhutchings; +Cc: bruce.w.allan, netdev
In-Reply-To: <1302725464.2873.7.camel@bwh-desktop>
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Wed, 13 Apr 2011 21:11:04 +0100
> On Wed, 2011-04-13 at 12:58 -0700, Bruce Allan wrote:
>> physical identification
>>
>> The following series changes the recently added ethtool set_phys_id
>> functions to allow drivers to provide a frequency at which to cycle
>> through an on/off identifier via software if/when the capability is
>> not provided by hardware.
> [...]
>
> The first patch leaves all the drivers broken temporarily. Since the
> change in each driver is trivial, I think you can squash this all into
> one patch.
Agreed.
^ permalink raw reply
* Re: [net-next-2.6 RFC PATCH v2 01/13] ethtool: allow custom interval for physical identification
From: Ben Hutchings @ 2011-04-13 20:25 UTC (permalink / raw)
To: Bruce Allan; +Cc: netdev
In-Reply-To: <20110413195851.25901.8139.stgit@gitlad.jf.intel.com>
On Wed, 2011-04-13 at 12:58 -0700, Bruce Allan wrote:
> When physical identification of an adapter is done by toggling the
> mechanism on and off through software utilizing the set_phys_id operation,
> it is done with a fixed duration for both on and off states. Some drivers
> may want to set a custom duration for the on/off intervals. This patch
> changes the API so the return code from the driver's entry point when it
> is called with ETHTOOL_ID_ACTIVE can specify the frequency at which to
> cycle the on/off states.
[...]
> @@ -1655,23 +1655,26 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
> schedule_timeout_interruptible(
> id.data ? (id.data * HZ) : MAX_SCHEDULE_TIMEOUT);
> } else {
> - /* Driver expects to be called periodically */
> + /* Driver expects to be called using the frequency in rc */
> + int i = 0, interval = (HZ / (rc * 2));
> +
> do {
> rtnl_lock();
> rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_ON);
> rtnl_unlock();
> if (rc)
> break;
> - schedule_timeout_interruptible(HZ / 2);
> + schedule_timeout_interruptible(interval);
>
> rtnl_lock();
> rc = dev->ethtool_ops->set_phys_id(dev, ETHTOOL_ID_OFF);
> rtnl_unlock();
> if (rc)
> break;
> - schedule_timeout_interruptible(HZ / 2);
> + schedule_timeout_interruptible(interval);
> } while (!signal_pending(current) &&
> - (id.data == 0 || --id.data != 0));
> + (id.data == 0 ||
> + (++i * 2 * interval) < (id.data * HZ)));
[...]
I'm sure there ought to be a clearer way to do this, and to avoid any
weird effects from integer overflow in the multiplication. How about
using an inner loop for each second:
/* Driver expects to be called at twice the frequency in rc */
int n = rc * 2, i, interval = HZ / n;
do {
i = n;
do {
rtnl_lock();
rc = dev->ethtool_ops->set_phys_id(
dev, (i & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
rtnl_unlock();
if (rc)
break;
schedule_timeout_interruptible(interval);
} while (!signal_pending(current) && --i != 0);
} while (!signal_pending(current) &&
(id.data == 0 || --id.data != 0));
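The call pattern this produces can be checked with a small userspace simulation. It is only a sketch: `sketch_set()` is a hypothetical stand-in for the driver's set_phys_id hook that records each state, sleeping and signal handling are left out, and the run time must be finite (the id.data == 0 "run until interrupted" case is not modelled). As an editorial choice, an error here ends the whole run rather than only the inner loop.

```c
#include <stddef.h>

enum sketch_state { SKETCH_ID_ON, SKETCH_ID_OFF };

#define SKETCH_LOG_MAX 64

struct sketch_log {
	enum sketch_state states[SKETCH_LOG_MAX];
	size_t n;
};

/* Hypothetical stand-in for set_phys_id(): records the requested state. */
static int sketch_set(struct sketch_log *log, enum sketch_state s)
{
	if (log->n >= SKETCH_LOG_MAX)
		return -1;
	log->states[log->n++] = s;
	return 0;
}

/* Blink at freq on/off cycles per second for the given number of seconds. */
static int sketch_blink(struct sketch_log *log, int freq, unsigned int seconds)
{
	int n = freq * 2, i, rc;

	if (freq <= 0 || seconds == 0)
		return -1;

	do {
		/* One second: n half-periods, counting down so that an
		 * even i means ON and an odd i means OFF. */
		for (i = n; i != 0; i--) {
			rc = sketch_set(log, (i & 1) ? SKETCH_ID_OFF
						     : SKETCH_ID_ON);
			if (rc)
				return rc;
			/* kernel code would sleep for HZ / n jiffies here */
		}
	} while (--seconds != 0);

	return 0;
}
```

Because n is always even, each second starts with ON and ends with OFF, and no multiplication by the elapsed count is needed.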
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [Bug 32772] New: PROBLEM: kernel BUG at net/ipv4/inetpeer.c:386
From: David Miller @ 2011-04-13 20:24 UTC (permalink / raw)
To: dimetrios; +Cc: eric.dumazet, shemminger, netdev
In-Reply-To: <BANLkTi=PTqcYd1wO_QzQTtg_PWEq2fAJMg@mail.gmail.com>
From: Dmitry Novikov <dimetrios@gmail.com>
Date: Wed, 13 Apr 2011 23:14:03 +0300
> Crash again after 7 days of uptime. slub_nomerge is set
Looks like the stack is too deep; try this patch, which is in net-2.6:
--------------------
inetpeer: reduce stack usage
On 64bit arches, we use 752 bytes of stack when cleanup_once() is called
from inet_getpeer().
Let's share the avl stack to save ~376 bytes.
Before patch :
# objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl
0x000006c3 unlink_from_pool [inetpeer.o]: 376
0x00000721 unlink_from_pool [inetpeer.o]: 376
0x00000cb1 inet_getpeer [inetpeer.o]: 376
0x00000e6d inet_getpeer [inetpeer.o]: 376
0x0004 inet_initpeers [inetpeer.o]: 112
# size net/ipv4/inetpeer.o
text data bss dec hex filename
5320 432 21 5773 168d net/ipv4/inetpeer.o
After patch :
objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl
0x00000c11 inet_getpeer [inetpeer.o]: 376
0x00000dcd inet_getpeer [inetpeer.o]: 376
0x00000ab9 peer_check_expire [inetpeer.o]: 328
0x00000b7f peer_check_expire [inetpeer.o]: 328
0x0004 inet_initpeers [inetpeer.o]: 112
# size net/ipv4/inetpeer.o
text data bss dec hex filename
5163 432 21 5616 15f0 net/ipv4/inetpeer.o
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Scot Doyle <lkml@scotdoyle.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Reviewed-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/ipv4/inetpeer.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index dd1b20e..9df4e63 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -354,7 +354,8 @@ static void inetpeer_free_rcu(struct rcu_head *head)
}
/* May be called with local BH enabled. */
-static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
+static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base,
+ struct inet_peer __rcu **stack[PEER_MAXDEPTH])
{
int do_free;
@@ -368,7 +369,6 @@ static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
* We use refcnt=-1 to alert lockless readers this entry is deleted.
*/
if (atomic_cmpxchg(&p->refcnt, 1, -1) == 1) {
- struct inet_peer __rcu **stack[PEER_MAXDEPTH];
struct inet_peer __rcu ***stackptr, ***delp;
if (lookup(&p->daddr, stack, base) != p)
BUG();
@@ -422,7 +422,7 @@ static struct inet_peer_base *peer_to_base(struct inet_peer *p)
}
/* May be called with local BH enabled. */
-static int cleanup_once(unsigned long ttl)
+static int cleanup_once(unsigned long ttl, struct inet_peer __rcu **stack[PEER_MAXDEPTH])
{
struct inet_peer *p = NULL;
@@ -454,7 +454,7 @@ static int cleanup_once(unsigned long ttl)
* happen because of entry limits in route cache. */
return -1;
- unlink_from_pool(p, peer_to_base(p));
+ unlink_from_pool(p, peer_to_base(p), stack);
return 0;
}
@@ -524,7 +524,7 @@ struct inet_peer *inet_getpeer(struct inetpeer_addr *daddr, int create)
if (base->total >= inet_peer_threshold)
/* Remove one less-recently-used entry. */
- cleanup_once(0);
+ cleanup_once(0, stack);
return p;
}
@@ -540,6 +540,7 @@ static void peer_check_expire(unsigned long dummy)
{
unsigned long now = jiffies;
int ttl, total;
+ struct inet_peer __rcu **stack[PEER_MAXDEPTH];
total = compute_total();
if (total >= inet_peer_threshold)
@@ -548,7 +549,7 @@ static void peer_check_expire(unsigned long dummy)
ttl = inet_peer_maxttl
- (inet_peer_maxttl - inet_peer_minttl) / HZ *
total / inet_peer_threshold * HZ;
- while (!cleanup_once(ttl)) {
+ while (!cleanup_once(ttl, stack)) {
if (jiffies != now)
break;
}
--
1.7.4.3
^ permalink raw reply related
* Re: [Bug 32772] New: PROBLEM: kernel BUG at net/ipv4/inetpeer.c:386
From: Dmitry Novikov @ 2011-04-13 20:14 UTC (permalink / raw)
To: David Miller; +Cc: eric.dumazet, shemminger, netdev
In-Reply-To: <20110406.111649.193697123.davem@davemloft.net>
Hello.
Crash again after 7 days of uptime. slub_nomerge is set
[559353.216526] ------------[ cut here ]------------
[559353.217494] kernel BUG at net/ipv4/inetpeer.c:386!
[559353.217494] invalid opcode: 0000 [#1] SMP
[559353.217494] last sysfs file: /sys/module/nf_conntrack_pptp/initstate
[559353.217494] Modules linked in: nf_nat_pptp nf_nat_proto_gre
nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_ftp nf_conntrack_ftp
ipt_REJECT xt_state xt_tcpudp xt_multiport ip_set iptable_filter
iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack
nf_defrag_ipv4 ip_tables x_tables act_police cls_u32 sch_ingress
sch_tbf 8021q garp bridge ipv6 stp llc loop intel_agp intel_gtt
agpgart rng_core pcspkr i2c_i801 i2c_core processor thermal_sys
parport_pc evdev parport serio_raw tpm_tis tpm button tpm_bios ext3
jbd mbcache sd_mod crc_t10dif ata_generic ata_piix libata scsi_mod
uhci_hcd ide_pci_generic e1000e ehci_hcd igb r8169 ide_core dca mii
usbcore nls_base [last unloaded: scsi_wait_scan]
[559353.217494]
[559353.217494] Pid: 0, comm: kworker/0:0 Not tainted
2.6.38-demyan-1.1demyan #1 Gigabyte Technology Co., Ltd.
G41MT-ES2L/G41MT-ES2L
[559353.217494] EIP: 0060:[<c11e0caa>] EFLAGS: 00010287 CPU: 1
[559353.217494] EIP is at unlink_from_pool+0x85/0x14a
[559353.217494] EAX: c125ff04 EBX: efcb09c0 ECX: abfd6970 EDX: ee6d77c4
[559353.217494] ESI: c1333338 EDI: f4c91bfc EBP: abfea42e ESP: f4c91ba8
[559353.217494] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[559353.217494] Process kworker/0:0 (pid: 0, ti=f4c90000 task=f4c6a400
task.ti=f4c8c000)
[559353.217494] Stack:
[559353.217494] f351c790 00000001 abfd6970 c133333c c1333338 efc6b384
efe2af80 efcd1c04
[559353.217494] f3cc2784 ef3f62c4 f251da80 ef054b40 efcdc300 f0373dc0
f429a144 ef254a80
[559353.217494] ed4e6340 f0705f40 efcdb580 f05261c0 ee6d77c4 f4c91cb4
f351c790 f4c91c78
[559353.217494] Call Trace:
[559353.217494] [<c120f068>] ? fib4_rule_action+0x40/0x4d
[559353.217494] [<c11d1be3>] ? fib_rules_lookup+0x8d/0xe4
[559353.217494] [<c11e0de9>] ? cleanup_once+0x7a/0x7f
[559353.217494] [<c11e0fa9>] ? inet_getpeer+0x1bb/0x1dc
[559353.217494] [<c11dc073>] ? nf_ct_attach+0x12/0x13
[559353.217494] [<c1202404>] ? icmp_glue_bits+0x65/0x6a
[559353.217494] [<c11e4109>] ? ip_append_data+0x595/0x850
[559353.217494] [<c11e025d>] ? rt_bind_peer+0x1d/0x3d
[559353.217494] [<c11e029f>] ? __ip_select_ident+0x22/0xa6
[559353.217494] [<c11e4f60>] ? ip_push_pending_frames+0x206/0x2cb
[559353.217494] [<c120301b>] ? icmp_send+0x4fe/0x523
[559353.217494] [<f81a6b09>] ? ____nf_conntrack_find+0xfa/0x142 [nf_conntrack]
[559353.217494] [<f81a8069>] ? nf_conntrack_in+0x4f3/0x5e3 [nf_conntrack]
[559353.217494] [<f815c536>] ? ipt_do_table+0x4bc/0x4eb [ip_tables]
[559353.217494] [<c11e2949>] ? ip_forward+0x2ef/0x316
[559353.217494] [<c11e13da>] ? ip_rcv_finish+0x2fa/0x31f
[559353.217494] [<c11c1b3c>] ? __netif_receive_skb+0x405/0x42c
[559353.217494] [<c11c1a63>] ? __netif_receive_skb+0x32c/0x42c
[559353.217494] [<c1047585>] ? ktime_get_real+0x10/0x2d
[559353.217494] [<c11c2547>] ? netif_receive_skb+0x5a/0x5f
[559353.217494] [<c11c25ff>] ? napi_skb_finish+0x1b/0x30
[559353.217494] [<f8104723>] ? igb_poll+0x649/0x94a [igb]
[559353.217494] [<c1007765>] ? sched_clock+0x9/0xd
[559353.217494] [<c1030094>] ? wait_consider_task+0x977/0xa91
[559353.217494] [<c104438f>] ? sched_clock_local+0x17/0x13d
[559353.217494] [<c11c2b7b>] ? net_rx_action+0x90/0x150
[559353.217494] [<c1031f12>] ? __do_softirq+0x75/0x10e
[559353.217494] [<c1031e9d>] ? __do_softirq+0x0/0x10e
[559353.217494] <IRQ>
[559353.217494] [<c1031df3>] ? irq_exit+0x31/0x64
[559353.217494] [<c1004397>] ? do_IRQ+0x73/0x84
[559353.217494] [<c1003429>] ? common_interrupt+0x29/0x30
[559353.217494] [<c10089b4>] ? mwait_idle+0x4f/0x59
[559353.217494] [<c10021ef>] ? cpu_idle+0x46/0x63
[559353.217494] Code: 24 08 39 cd 75 09 42 3b 54 24 04 7c e9 eb 18 3b
6c 24 08 8d 50 04 0f 42 d0 89 17 83 c7 04 8b 02 3d 04 ff 25 c1 75 bb
39 d8 74 04 <0f> 0b eb fe 8d 6f fc 81 3b 04 ff 25 c1 89 6c 24 08 75 0d
8b 47
[559353.217494] EIP: [<c11e0caa>] unlink_from_pool+0x85/0x14a SS:ESP
0068:f4c91ba8
[559354.302112] ---[ end trace 55cdab910854890a ]---
[559354.316239] Kernel panic - not syncing: Fatal exception in interrupt
[559354.335557] Pid: 0, comm: kworker/0:0 Tainted: G D
2.6.38-demyan-1.1demyan #1
[559354.359578] Call Trace:
[559354.367198] [<c1231f71>] ? panic+0x4d/0x137
[559354.380274] [<c1005722>] ? oops_end+0x8e/0x99
[559354.393871] [<c1003a0e>] ? do_invalid_op+0x0/0x75
[559354.408509] [<c1003a7a>] ? do_invalid_op+0x6c/0x75
[559354.423407] [<c11e0caa>] ? unlink_from_pool+0x85/0x14a
[559354.439345] [<c120f068>] ? fib4_rule_action+0x40/0x4d
[559354.455022] [<c11d1be3>] ? fib_rules_lookup+0x8d/0xe4
[559354.470700] [<c120f122>] ? fib_lookup+0x31/0x3f
[559354.484818] [<c11ca4f1>] ? neigh_lookup+0x8e/0x96
[559354.499454] [<c123464e>] ? error_code+0x5a/0x60
[559354.513571] [<c1003a0e>] ? do_invalid_op+0x0/0x75
[559354.528208] [<c11e0caa>] ? unlink_from_pool+0x85/0x14a
[559354.544146] [<c120f068>] ? fib4_rule_action+0x40/0x4d
[559354.559823] [<c11d1be3>] ? fib_rules_lookup+0x8d/0xe4
[559354.575500] [<c11e0de9>] ? cleanup_once+0x7a/0x7f
[559354.590137] [<c11e0fa9>] ? inet_getpeer+0x1bb/0x1dc
[559354.605297] [<c11dc073>] ? nf_ct_attach+0x12/0x13
[559354.619934] [<c1202404>] ? icmp_glue_bits+0x65/0x6a
[559354.635090] [<c11e4109>] ? ip_append_data+0x595/0x850
[559354.650767] [<c11e025d>] ? rt_bind_peer+0x1d/0x3d
[559354.665405] [<c11e029f>] ? __ip_select_ident+0x22/0xa6
[559354.681344] [<c11e4f60>] ? ip_push_pending_frames+0x206/0x2cb
[559354.699099] [<c120301b>] ? icmp_send+0x4fe/0x523
[559354.713479] [<f81a6b09>] ? ____nf_conntrack_find+0xfa/0x142 [nf_conntrack]
[559354.734615] [<f81a8069>] ? nf_conntrack_in+0x4f3/0x5e3 [nf_conntrack]
[559354.754452] [<f815c536>] ? ipt_do_table+0x4bc/0x4eb [ip_tables]
[559354.772731] [<c11e2949>] ? ip_forward+0x2ef/0x316
[559354.787366] [<c11e13da>] ? ip_rcv_finish+0x2fa/0x31f
[559354.802785] [<c11c1b3c>] ? __netif_receive_skb+0x405/0x42c
[559354.819762] [<c11c1a63>] ? __netif_receive_skb+0x32c/0x42c
[559354.836738] [<c1047585>] ? ktime_get_real+0x10/0x2d
[559354.851901] [<c11c2547>] ? netif_receive_skb+0x5a/0x5f
[559354.867835] [<c11c25ff>] ? napi_skb_finish+0x1b/0x30
[559354.883254] [<f8104723>] ? igb_poll+0x649/0x94a [igb]
[559354.898930] [<c1007765>] ? sched_clock+0x9/0xd
[559354.912786] [<c1030094>] ? wait_consider_task+0x977/0xa91
[559354.929502] [<c104438f>] ? sched_clock_local+0x17/0x13d
[559354.945701] [<c11c2b7b>] ? net_rx_action+0x90/0x150
[559354.960857] [<c1031f12>] ? __do_softirq+0x75/0x10e
[559354.975756] [<c1031e9d>] ? __do_softirq+0x0/0x10e
[559354.990393] <IRQ> [<c1031df3>] ? irq_exit+0x31/0x64
[559355.005862] [<c1004397>] ? do_IRQ+0x73/0x84
[559355.018941] [<c1003429>] ? common_interrupt+0x29/0x30
[559355.034618] [<c10089b4>] ? mwait_idle+0x4f/0x59
[559355.048734] [<c10021ef>] ? cpu_idle+0x46/0x63
[559355.062333] Rebooting in 5 seconds..
^ permalink raw reply
* Re: [net-next-2.6 RFC PATCH v2 00/13] ethtool: allow custom interval for
From: Ben Hutchings @ 2011-04-13 20:11 UTC (permalink / raw)
To: Bruce Allan; +Cc: netdev
In-Reply-To: <20110413195146.25901.72193.stgit@gitlad.jf.intel.com>
On Wed, 2011-04-13 at 12:58 -0700, Bruce Allan wrote:
> physical identification
>
> The following series changes the recently added ethtool set_phys_id
> functions to allow drivers to provide a frequency at which to cycle
> through an on/off identifier via software if/when the capability is
> not provided by hardware.
[...]
The first patch leaves all the drivers broken temporarily. Since the
change in each driver is trivial, I think you can squash this all into
one patch.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply