From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753449Ab2ATOYV (ORCPT ); Fri, 20 Jan 2012 09:24:21 -0500
Received: from mta3.srv.hcvlny.cv.net ([167.206.4.198]:56369 "EHLO mta3.srv.hcvlny.cv.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751570Ab2ATOYT (ORCPT ); Fri, 20 Jan 2012 09:24:19 -0500
Date: Fri, 20 Jan 2012 09:24:38 -0500
From: Michael Breuer
Subject: Re: Regression: sky2 kernel between 3.1 and 3.2.1 (last known good 3.0.9)
In-reply-to: <4F1452B1.4010200@majjas.com>
To: Jarek Poplawski
Cc: David Miller, Stephen Hemminger, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Message-id: <4F197926.6080309@majjas.com>
MIME-version: 1.0
Content-type: text/plain; charset=ISO-8859-1; format=flowed
Content-transfer-encoding: 7BIT
References: <20100120094103.GA6225@ff.dom.local> <4B58B217.8030001@majjas.com> <20100121204133.GB3085@del.dom.local> <4B59E7EB.3050605@majjas.com> <4F1452B1.4010200@majjas.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0) Gecko/20120118 Thunderbird/10.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/16/2012 11:39 AM, Michael Breuer wrote:
> Synopsis:
>
> Receiving DMAR and other errors after approximately three days of
> uptime. The symptoms exactly match errors seen and then fixed around
> 2.6.32.4.
>
> While the system remains unaffected for too long to do a bisect, I was
> able to confirm that the problem exists in the 3.1 stable branch (I
> jumped from 3.0 to 3.2 when 3.2. was released).
>
> For now I reverted to the sky2.c from 3.0.9 and am running the rest of
> the kernel from 3.1.2, but won't be certain that this works until
> later in the week.
>
> Note that 20 seconds prior to the log extract below were DHCP renewal
> attempts on eth1, the issue below was on eth0. Not sure it's relevant,
> however back in 2010 a preceding DHCP event did turn out to be
> relevant to the manifestation of the bug.
>
> The 3.2.1-dirty I'm running is from git with a single local patch -
> for sidewinder force-feedback support (shouldn't be relevant to the
> sky2 issue).
>
> Log extract:
>
> Jan 16 05:49:46 mail kernel: [198230.628919] DRHD: handling fault
> status reg 2
> Jan 16 05:49:46 mail kernel: [198230.628925] sky2 0000:06:00.0: error
> interrupt status=0x80000000
> Jan 16 05:49:46 mail kernel: [198230.628929] DMAR:[DMA Read] Request
> device [06:00.0] fault addr fff78000
> Jan 16 05:49:46 mail kernel: [198230.628931] DMAR:[fault reason 06]
> PTE Read access is not set
> Jan 16 05:49:46 mail kernel: [198230.628939] sky2 0000:06:00.0: PCI
> hardware error (0x2010)
> Jan 16 05:49:53 mail dhclient[1616]: DHCPREQUEST on eth1 to
> 10.240.184.29 port 67
> Jan 16 05:50:01 mail kernel: [198246.288400] ------------[ cut here
> ]------------
> Jan 16 05:50:01 mail kernel: [198246.288408] WARNING: at
> net/sched/sch_generic.c:255 dev_watchdog+0x247/0x250()
> Jan 16 05:50:01 mail kernel: [198246.288411] Hardware name: System
> Product Name
> Jan 16 05:50:01 mail kernel: [198246.288413] NETDEV WATCHDOG: eth0
> (sky2): transmit queue 0 timed out
> Jan 16 05:50:01 mail kernel: [198246.288415] Modules linked in: tcp_lp
> cpufreq_stats ebtable_nat ebtables nf_conntrack_netbios_ns
> nf_conntrack_broadcast ip6table_mangle ip6table_filter ip6_tables
> iptable_mangle ipt_MASQUERADE iptable_nat nf_nat iptable_raw tun
> bridge stp llc lockd sit tunnel4 ipt_LOG nf_conntrack_ftp
> nf_conntrack_ipv6 nf_defrag_ipv6 xt_CHECKSUM xt_multiport xt_DSCP
> w83627ehf xt_mark xt_dscp hwmon_vid binfmt_misc raid1 btrfs sunrpc
> zlib_deflate libcrc32c snd_hda_codec_analog snd_ens1371 gameport
> snd_hda_intel snd_rawmidi snd_ac97_codec snd_hda_codec snd_hwdep
> ac97_bus snd_seq snd_seq_device snd_pcm gspca_spca505 snd_timer
> gspca_main snd videodev media soundcore i2c_i801 iTCO_wdt microcode
> v4l2_compat_ioctl32 snd_page_alloc i7core_edac sky2 edac_core pcspkr
> iTCO_vendor_support virtio_net virtio virtio_ring kvm_intel kvm uinput
> ipv6 raid456 async_raid6_recov async_pq raid6_pq async_xor
> firewire_ohci firewire_core pata_acpi ata_generic xor async_memcpy
> async_tx crc_itu_t pata_marvell nouveau ttm d
> Jan 16 05:50:01 mail kernel: rm_kms_helper drm i2c_algo_bit i2c_core
> mxm_wmi video [last unloaded: nf_conntrack_broadcast]
> Jan 16 05:50:01 mail kernel: [198246.288487] Pid: 0, comm: swapper/0
> Tainted: G W 3.2.1-dirty #1
> Jan 16 05:50:01 mail kernel: [198246.288489] Call Trace:
> Jan 16 05:50:01 mail kernel: [198246.288491]
> [] warn_slowpath_common+0x7f/0xc0
> Jan 16 05:50:01 mail kernel: [198246.288501] [] ?
> lapic_next_event+0x1d/0x30
> Jan 16 05:50:01 mail kernel: [198246.288504] []
> warn_slowpath_fmt+0x46/0x50
> Jan 16 05:50:01 mail kernel: [198246.288509] [] ?
> read_tsc+0x9/0x20
> Jan 16 05:50:01 mail kernel: [198246.288513] []
> dev_watchdog+0x247/0x250
> Jan 16 05:50:01 mail kernel: [198246.288518] []
> run_timer_softirq+0x12b/0x3b0
> Jan 16 05:50:01 mail kernel: [198246.288521] [] ?
> qdisc_reset+0x50/0x50
> Jan 16 05:50:01 mail kernel: [198246.288525] []
> __do_softirq+0xa8/0x210
> Jan 16 05:50:01 mail kernel: [198246.288529] []
> call_softirq+0x1c/0x30
> Jan 16 05:50:01 mail kernel: [198246.288533] []
> do_softirq+0x65/0xa0
> Jan 16 05:50:01 mail kernel: [198246.288536] []
> irq_exit+0x8e/0xb0
> Jan 16 05:50:01 mail kernel: [198246.288539] []
> do_IRQ+0x63/0xe0
> Jan 16 05:50:01 mail kernel: [198246.288543] []
> common_interrupt+0x6e/0x6e
> Jan 16 05:50:01 mail kernel: [198246.288545]
> [] ? intel_idle+0xed/0x150
> Jan 16 05:50:01 mail kernel: [198246.288551] [] ?
> intel_idle+0xcf/0x150
> Jan 16 05:50:01 mail kernel: [198246.288555] []
> cpuidle_idle_call+0xc1/0x280
> Jan 16 05:50:01 mail kernel: [198246.288559] []
> cpu_idle+0xca/0x120
> Jan 16 05:50:01 mail kernel: [198246.288563] []
> rest_init+0x72/0x74
> Jan 16 05:50:01 mail kernel: [198246.288568] []
> start_kernel+0x3b5/0x3c0
> Jan 16 05:50:01 mail kernel: [198246.288572] []
> x86_64_start_reservations+0x132/0x136
> Jan 16 05:50:01 mail kernel: [198246.288576] [] ?
> early_idt_handlers+0x140/0x140
> Jan 16 05:50:01 mail kernel: [198246.288580] []
> x86_64_start_kernel+0x102/0x111
> Jan 16 05:50:01 mail kernel: [198246.288583] ---[ end trace
> bb26011d21a2b1d7 ]---
> Jan 16 05:50:01 mail kernel: [198246.288586] sky2 0000:06:00.0: eth0:
> tx timeout
> Jan 16 05:50:01 mail kernel: [198246.288593] sky2 0000:06:00.0: eth0:
> transmit ring 115 .. 10 report=115 done=115
>
>
>
FYI - I've been up for four days now without issues running on 3.2.1 +
sky2.c from 3.0.9. Looks like the issue is in fact in one of the
modifications made in sky2.c between those two releases.

-- Mike