From: Michael Breuer <mbreuer@majjas.com>
To: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Berck E. Nash" <flyboy@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
netdev@vger.kernel.org
Subject: Re: sky2 panic in 2.6.32.1 under load (new oops)
Date: Sat, 26 Dec 2009 15:37:29 -0500
Message-ID: <4B367409.5060202@majjas.com>
In-Reply-To: <20091226095723.7ac82b18@nehalam>

On 12/26/2009 12:57 PM, Stephen Hemminger wrote:
> On Fri, 25 Dec 2009 22:23:51 -0500
> Michael Breuer<mbreuer@majjas.com> wrote:
>
>
>> On 12/25/2009 6:22 PM, Stephen Hemminger wrote:
>>
>>> On Fri, 25 Dec 2009 11:28:55 -0500
>>> Michael Breuer<mbreuer@majjas.com> wrote:
>>>
>>>
>>>
>>>> More data points - I'm able to reliably recreate this now.
>>>> While I thought it was coincidence, each and every time I hit this issue
>>>> there is a DHCP renew event immediately before the first error.
>>>> The crash occurs while under load - in my case seems that the traffic is
>>>> actually IPV6 (hadn't noticed that before).
>>>> I ran nethogs on a remote display - the reported rx rate on the IPV6 smb
>>>> connection at the time of the lockup was 33889.688 KB/sec on a 1gbit
>>>> nic. I've got two events like this - don't recall if the earlier one was
>>>> the exact same # - but it was in the ballpark.
>>>>
>>>> On 12/24/2009 2:01 AM, Andrew Morton wrote:
>>>>
>>>>
>>>>> cc's added again.
>>>>>
>>>>> On Wed, 23 Dec 2009 17:54:27 -0500 Michael Breuer<mbreuer@majjas.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Ok - not the firmware. Ran another Windows backup and sky2 went down.
>>>>>>
>>>>>> Nothing in dmesg.old - have oops in syslog. System became unresponsive
>>>>>> and watchdog kicked in after a minute.
>>>>>>
>>>>>> Also note that I have a similar oops with VT-D disabled (posted here on
>>>>>> 12/5). I'm attaching the oops from that below this oops for comparison.
>>>>>> That also happened under similar load.
>>>>>>
>>>>>> On the assumption that I can recreate this (although it takes a while)
>>>>>> please let me know how I can help.
>>>>>>
>>>>>> What's in my log, starting with an smbd error about 2 min before the
>>>>>> oops (note: the dhcpd is not the system doing the backup).
>>>>>>
>>>>>>
>>>>>>
>>>>> This (nastily wordwrapped) oops appears to be quite different from
>>>>> Berck's one.
>>>>>
>>>>>
>>>>>
>>>>>
>>> What is the MTU?
>>>
>>>
>> 1500
>>
>>>>
>>>>
> It looks like the problem only shows up for packets generated by DHCP,
> and these come through AF_PACKET. The problem may be related to how these
> packets are fragmented into header and page, in a different way than other
> packets, confusing the driver or DMA engine.
>
> Does this help?
> -----
>
> --- a/drivers/net/sky2.c 2009-12-26 09:50:20.869565022 -0800
> +++ b/drivers/net/sky2.c 2009-12-26 09:55:54.620645355 -0800
> @@ -1616,6 +1616,13 @@ static netdev_tx_t sky2_xmit_frame(struc
>  	if (unlikely(tx_avail(sky2) < tx_le_req(skb)))
>  		return NETDEV_TX_BUSY;
>
> +	if (!pskb_may_pull(skb, ETH_HLEN)) {
> +		if (net_ratelimit())
> +			pr_info(PFX "%s: packet missing ether header (%d)?",
> +				dev->name, skb->len);
> +		goto drop;
> +	}
> +
>  	len = skb_headlen(skb);
>  	mapping = pci_map_single(hw->pdev, skb->data, len, PCI_DMA_TODEVICE);
>
> @@ -1761,6 +1768,7 @@ mapping_unwind:
>  mapping_error:
>  	if (net_ratelimit())
>  		dev_warn(&hw->pdev->dev, "%s: tx mapping error\n", dev->name);
> +drop:
>  	dev_kfree_skb(skb);
>  	return NETDEV_TX_OK;
>  }
>
>
>
>
>
That seems to have done the trick!

Still one odd message sequence, but no hangs or crashes.

The first time I forced a DHCP renew while running at high throughput, I
got the same SMB errors I saw in my original error log (pre-crash). This
only happened once:

Dec 26 15:24:18 mail dhcpd: DHCPACK on 10.0.0.56 to 00:1c:cc:f3:9f:f6 (BLACKBERRY-9542) via eth0
Dec 26 15:24:25 mail smbd[8937]: [2009/12/26 15:24:25, 0] lib/util_sock.c:1564(matchname)
Dec 26 15:24:25 mail smbd[8937]: matchname: host name/address mismatch: ::ffff:10.0.0.11 != potter.majjas.com
Dec 26 15:24:25 mail smbd[8937]: [2009/12/26 15:24:25, 0] lib/util_sock.c:1685(get_peer_name)
Dec 26 15:24:25 mail smbd[8937]: Matchname failed on potter.majjas.com ::ffff:10.0.0.11
Dec 26 15:24:25 mail smbd[8937]: [2009/12/26 15:24:25, 0] smbd/nttrans.c:2076(call_nt_transact_ioctl)
Dec 26 15:24:25 mail smbd[8937]: call_nt_transact_ioctl(0x900eb): Currently not implemented.

I would discount this, but the same sequence was present in the logs
pre-crash as well. I do not see this at all absent the preceding DHCP
renew sequence. I also don't see this unless the adapter is under load.
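
For readers following the quoted patch: the sketch below is a rough user-space illustration of what a check like pskb_may_pull(skb, ETH_HLEN) guards against. It is not kernel code and not the driver's implementation. The struct toy_skb type and the toy_len() and toy_may_pull() helpers are hypothetical names invented for this example; they only model the idea that a packet can have a short linear header area plus paged data, and that the first ETH_HLEN bytes must be in the linear area before the driver maps it for DMA.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ETH_HLEN 14 /* Ethernet header length in bytes, same value as the kernel's */

/* Hypothetical stand-in for an skb: a linear area plus one paged fragment. */
struct toy_skb {
	unsigned char head[64];    /* linear ("header") area */
	size_t headlen;            /* bytes currently in the linear area */
	const unsigned char *frag; /* paged data following the linear area */
	size_t fraglen;            /* bytes left in the paged fragment */
};

/* Total packet length: linear bytes plus paged bytes. */
static size_t toy_len(const struct toy_skb *skb)
{
	return skb->headlen + skb->fraglen;
}

/*
 * Make sure the first len bytes are in the linear area.
 * Returns true if they already are, or if enough bytes could be copied
 * in from the fragment; returns false if the packet is too short
 * (or, in this toy model only, if the linear buffer is too small).
 */
static bool toy_may_pull(struct toy_skb *skb, size_t len)
{
	size_t need;

	if (len <= skb->headlen)
		return true;            /* already linear */
	if (len > toy_len(skb))
		return false;           /* packet shorter than requested */
	if (len > sizeof(skb->head))
		return false;           /* toy-model limitation */

	need = len - skb->headlen;
	memcpy(skb->head + skb->headlen, skb->frag, need);
	skb->headlen += need;
	skb->frag += need;
	skb->fraglen -= need;
	return true;
}
```

In this model, a packet with 6 linear bytes and 20 paged bytes passes toy_may_pull(&skb, ETH_HLEN) and ends up with 14 linear bytes, while a 4-byte packet with no paged data fails the check, which corresponds to the "goto drop" path in the quoted patch.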