From: Michael Breuer <mbreuer@majjas.com>
To: Jarek Poplawski <jarkao2@gmail.com>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>,
David Miller <davem@davemloft.net>,
akpm@linux-foundation.org, flyboy@gmail.com,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] af_packet: Don't use skb after dev_queue_xmit()
Date: Mon, 18 Jan 2010 11:29:31 -0500
Message-ID: <4B548C6B.10607@majjas.com>
In-Reply-To: <20100118073018.GA6270@ff.dom.local>
On 01/18/2010 02:30 AM, Jarek Poplawski wrote:
> On Sun, Jan 17, 2010 at 06:15:22PM -0500, Michael Breuer wrote:
>
>> On 1/17/2010 6:05 PM, Jarek Poplawski wrote:
>>
>>> On Sun, Jan 17, 2010 at 05:34:19PM -0500, Michael Breuer wrote:
>>>
>>>
>>>> On 1/17/2010 5:17 PM, Jarek Poplawski wrote:
>>>>
>>>>
>>>>> On Sun, Jan 17, 2010 at 11:26:46AM -0500, Michael Breuer wrote:
>>>>>
>>>>>
>>>>>> On 01/13/2010 04:16 PM, Michael Breuer wrote:
>>>>>>
>>>>>>
>>>>>>> On 1/13/2010 4:09 PM, Jarek Poplawski wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Wed, Jan 13, 2010 at 03:39:37PM -0500, Michael Breuer wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>> Update: after leaving the system up for a few days, I hit the DMAR
>>>>>> error again.
>>>>>>
>>>>>>
>>>>> My proposal is to send some summary as a new thread, with dmar in the
>>>>> subject, and cc-ed dmar maintainers.
>>>>>
>>>>>
>>>>>
>>>> Not sure I agree. The symptoms are identical to those I hit without
>>>> DMAR earlier on. Also, as this issue only happens when there is high
>>>> receive load, I'm thinking there's some sort of race between TX and
>>>> RX within the sky2 driver, or hardware. I think that DMAR is
>>>> correctly catching the error.
>>>>
>>>>
>>> Hmm... OK, then let's wait with this report and go back to testing
>>> it "really really long" ;-) without DMAR, and maybe without the
>>> last Stephen's patch either? (So only the two things in the current
>>> linux-2.6.)
>>>
>>> Jarek P.
>>>
>>>
>> Ok - but absent the last patch, I think I still need the pskb_may_pull
>> patch... so it'd be pskb_may_pull and afpacket v3 and no DMAR.
>>
> Exactly. Or if it's working for you already, the mainline (2.6.33-rc4)
> with the pskb_may_pull patch. And check for warnings from the latter.
>
>
>> Also - not sure if related, but there's still the odd tx side behavior
>> when RX is under load. That I CAN reproduce at will (yesterday's report
>> - no crash, but I confirmed that DHCPOFFER packets are being dropped
>> somewhere after wireshark sees them and before hitting the wire).
>>
> I'm not sure either, but until there is no crash it might be some
> minor bug or/and missing stat. Btw, you could probably try alternative
> test with ping from this overloaded box to the router and win7.
>
>
>> I am also wondering whether or not that testing I did yesterday set up
>> today's hang - perhaps those lost TX packets are corrupting something
>> that manifests worse later.
>>
> Maybe, but you wrote earlier they had to fix something around this
> DMAR in the meantime, because it triggered much faster during your
> previous tests. So, I don't know why you assume this DMAR has to be
> correct this time.
>
> Jarek P.
>
Ok - up on the two patches, no DMAR. Some early observations:
1. There's an early MMAP oops (see below). This happens once, at the
completion of the transition to runlevel 5 (I've seen it entering
runlevel 3 as well). It does not recur when runlevels are subsequently
changed. I do not see this when running with DMAR enabled.
2. The dropped tx packet (DHCP) is a bit harder to recreate, but it
still happens. Interestingly, I initially saw no dropped packets with
ping - but after I went the DHCP route and eventually reconnected, I
could then cause dropped tx packets with ping. To clarify:
a) start throughput
b) ping device - no packet loss - this was true for the entire test run.
c) start throughput again
d) ping - no loss.
e) drop wifi on the device & restart - first attempt worked. A repeat
attempt yielded the dropped DHCPOFFER packets. After about 6 tries, the
device reconnected to wifi.
f) ping again (after the reconnection) - packet loss rate about 80%.
g) simultaneously ping the wifi router - no loss.
h) After a while, packets are no longer dropped during ping. If I manage
to cause the dhcp drop again, and then ping after the device finally
reconnects, packet loss is significant for a while (maybe 30 sec to a
minute). Then things return to normal. Note that the packet loss
continues even if the reported throughput drops to nil.
i) I can't cause the initial packet loss at RX rates below about
30,000 KBPS (as reported by nethogs). At rates over 40,000 KBPS I can
reproduce this on this set of patches & config about 60% of the time.
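The loss figures in steps b) through h) could be collected in a repeatable way by parsing the summary line that ping prints at the end of a run. What follows is only a minimal sketch, assuming iputils-style ping output; the function name parse_ping_loss is hypothetical and was not part of the test runs described above.

```python
import re

# Matches the iputils ping summary line, for example:
#   "10 packets transmitted, 2 received, 80% packet loss, time 9012ms"
# Some ping versions print "packets received", so that word is optional.
_SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received"
)


def parse_ping_loss(output):
    """Return (sent, received, loss_fraction) from ping output, or None.

    loss_fraction is computed from the two counters rather than read
    from the printed percentage, which ping rounds.
    """
    match = _SUMMARY.search(output)
    if match is None:
        return None
    sent = int(match.group(1))
    received = int(match.group(2))
    loss = 0.0 if sent == 0 else (sent - received) / sent
    return sent, received, loss
```

Feeding it the captured output of one ping run against the device and one against the wifi router would give the two loss rates compared in steps f) and g).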
The initial sky2 oops:
Jan 18 10:42:43 mail kernel: ------------[ cut here ]------------
Jan 18 10:42:43 mail kernel: WARNING: at lib/dma-debug.c:898 check_sync+0xbd/0x426()
Jan 18 10:42:43 mail kernel: Hardware name: System Product Name
Jan 18 10:42:43 mail kernel: sky2 0000:06:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x00000003249b4022] [size=98 bytes]
Jan 18 10:42:43 mail kernel: Modules linked in: microcode(+) ip6table_mangle ip6table_filter ip6_tables iptable_raw iptable_mangle ipt_MASQUERADE iptable_nat nf_nat appletalk psnap llc nfsd lockd nfs_acl auth_rpcgss exportfs hwmon_vid coretemp sunrpc acpi_cpufreq sit tunnel4 ipt_LOG nf_conntrack_netbios_ns nf_conntrack_ftp nf_conntrack_ipv6 xt_multiport xt_DSCP xt_dscp xt_MARK ipv6 dm_multipath kvm_intel kvm snd_hda_codec_analog snd_ens1371 gameport snd_hda_intel snd_rawmidi snd_hda_codec snd_ac97_codec gspca_spca505 ac97_bus snd_hwdep snd_seq gspca_main snd_seq_device firewire_ohci videodev firewire_core v4l1_compat snd_pcm i2c_i801 pcspkr v4l2_compat_ioctl32 crc_itu_t asus_atk0110 hwmon iTCO_wdt iTCO_vendor_support wmi snd_timer snd sky2 soundcore snd_page_alloc fbcon tileblit font bitblit softcursor raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 ata_generic pata_acpi pata_marvell nouveau ttm drm_kms_helper drm agpgart fb i2c_algo_bit cfbcopyarea i2c_core cfbimgblt cfbfillrect [last unloaded: ip6_tables]
Jan 18 10:42:43 mail kernel: Pid: 0, comm: swapper Not tainted 2.6.32NOMMAPNODMARAF3SKY2PSKBMAYPULL-00893-gb5d5baa-dirty #3
Jan 18 10:42:43 mail kernel: Call Trace:
Jan 18 10:42:43 mail kernel: <IRQ> [<ffffffff81053676>] warn_slowpath_common+0x7c/0x94
Jan 18 10:42:43 mail kernel: [<ffffffff810536e5>] warn_slowpath_fmt+0x41/0x43
Jan 18 10:42:43 mail kernel: [<ffffffff8127ae7d>] check_sync+0xbd/0x426
Jan 18 10:42:43 mail kernel: [<ffffffff813c5b4c>] ? __netdev_alloc_skb+0x34/0x50
Jan 18 10:42:43 mail kernel: [<ffffffff8127b539>] debug_dma_sync_single_for_cpu+0x42/0x44
Jan 18 10:42:43 mail kernel: [<ffffffff812788d7>] ? swiotlb_sync_single+0x2a/0xb6
Jan 18 10:42:43 mail kernel: [<ffffffff81278a33>] ? swiotlb_sync_single_for_cpu+0xc/0xe
Jan 18 10:42:43 mail kernel: [<ffffffffa015eed6>] sky2_poll+0x4c6/0xae1 [sky2]
Jan 18 10:42:43 mail kernel: [<ffffffff814673f2>] ? _spin_unlock_irqrestore+0x29/0x41
Jan 18 10:42:43 mail kernel: [<ffffffff813cc7ea>] net_rx_action+0xb5/0x1f3
Jan 18 10:42:43 mail kernel: [<ffffffff8105ae57>] __do_softirq+0xf8/0x1cd
Jan 18 10:42:43 mail kernel: [<ffffffff810a2e0e>] ? handle_IRQ_event+0x119/0x12b
Jan 18 10:42:43 mail kernel: [<ffffffff81012e1c>] call_softirq+0x1c/0x30
Jan 18 10:42:43 mail kernel: [<ffffffff810143a3>] do_softirq+0x4b/0xa6
Jan 18 10:42:43 mail kernel: [<ffffffff8105aa37>] irq_exit+0x4a/0x8c
Jan 18 10:42:43 mail kernel: [<ffffffff8146b445>] do_IRQ+0xa5/0xbc
Jan 18 10:42:43 mail kernel: [<ffffffff81012613>] ret_from_intr+0x0/0x16
Jan 18 10:42:43 mail kernel: <EOI> [<ffffffff812c251e>] ? acpi_idle_enter_bm+0x256/0x28a
Jan 18 10:42:43 mail kernel: [<ffffffff812c2517>] ? acpi_idle_enter_bm+0x24f/0x28a
Jan 18 10:42:43 mail kernel: [<ffffffff813a1b78>] ? cpuidle_idle_call+0x9e/0xfa
Jan 18 10:42:43 mail kernel: [<ffffffff81010c90>] ? cpu_idle+0xb4/0xf6
Jan 18 10:42:43 mail kernel: [<ffffffff81460acf>] ? start_secondary+0x201/0x242
Jan 18 10:42:43 mail kernel: ---[ end trace 188c0cdbace3665e ]---
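To compare warnings like the one above across boots, the device address and size could be pulled out of the log with a short script. This is only a sketch under assumptions: the function name find_dma_warnings is hypothetical, and the pattern is fitted to the single DMA-API message shown in this mail, so other dma-debug messages may need a different expression.

```python
import re

# Fitted to the one message format seen in this report:
#   "<driver> <pci address>: DMA-API: <text> [device address=0x...] [size=N bytes]"
_DMA_WARN = re.compile(
    r"(?P<driver>\S+) "
    r"(?P<device>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): DMA-API: "
    r"(?P<message>.*?) "
    r"\[device address=(?P<address>0x[0-9a-f]+)\] "
    r"\[size=(?P<size>\d+) bytes\]"
)


def find_dma_warnings(log_text):
    """Return a list of dicts, one per DMA-API warning found in log_text.

    Whitespace (including line breaks added by a mail client) is
    collapsed first, so a warning wrapped over several lines still matches.
    """
    flat = " ".join(log_text.split())
    return [
        {
            "driver": m.group("driver"),
            "device": m.group("device"),
            "address": int(m.group("address"), 16),
            "size": int(m.group("size")),
        }
        for m in _DMA_WARN.finditer(flat)
    ]
```

Collapsing whitespace before matching is a deliberate choice here, since logs pasted into mail are often re-wrapped in the middle of a bracketed field.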