From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Vetter
Subject: Re: Question about how to troubleshoot sandybridge kernel opps and subsequest GPU lockup
Date: Mon, 24 Oct 2011 08:46:56 +0200
Message-ID: <20111024064656.GA2908@phenom.ffwll.local>
References: <20111024024822.GA5123@mindspring.com> <20111024041219.GA7575@mindspring.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail-wy0-f177.google.com (mail-wy0-f177.google.com [74.125.82.177]) by gabe.freedesktop.org (Postfix) with ESMTP id B0AAE9E97F for ; Sun, 23 Oct 2011 23:46:09 -0700 (PDT)
Received: by wyg8 with SMTP id 8so7171376wyg.36 for ; Sun, 23 Oct 2011 23:46:08 -0700 (PDT)
Content-Disposition: inline
In-Reply-To: <20111024041219.GA7575@mindspring.com>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: "James R. Leu"
Cc: intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

On Sun, Oct 23, 2011 at 11:12:21PM -0500, James R. Leu wrote:
> I'm running wow in wine on 64 bit fedora rawhide on a dell vostro 3550
> (i5 with integrated GPU).
>
> I'm reliably able to produce 2 types of crashes:
> - wow freezes, but I can get to text console, in this case I'm able to
> grab a kernel stack trace (below) prior to seeing the normal
> [drm:i915_wait_request] *ERROR* i915_wait_request returns -11 (awaiting 452684 at 452608, next 452686)

I'm pretty sure that below that line there's a gpu hang report. If
that's the case, then please grab everything in /sys/kernel/debug/dri,
put it into a tar.gz and attach it (you need to do this _after_ the
machine is hung, the kernel will write a gpu crash dump into
i915_error_state).

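A minimal sketch of that collection step, assuming Python 3 is
available, debugfs is mounted at /sys/kernel/debug, and the script runs
as root. The function name snapshot_tree is made up for this mail and is
not part of any i915 tooling. It reads each file's contents explicitly
because debugfs files typically report a size of 0, so an archiver that
trusts the stat size may store them empty:

```python
import io
import os
import tarfile


def snapshot_tree(src_dir, out_path):
    """Copy every readable regular file under src_dir into a gzip'd tarball.

    Contents are read into memory first, since files on debugfs usually
    report a stat size of 0. Returns the list of archived member names.
    (Hypothetical helper, written for illustration only.)
    """
    src_dir = os.path.abspath(src_dir)
    base = os.path.dirname(src_dir)
    archived = []
    with tarfile.open(out_path, "w:gz") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                path = os.path.join(root, name)
                if not os.path.isfile(path):
                    continue
                try:
                    with open(path, "rb") as fh:
                        data = fh.read()
                except OSError:
                    # Some entries are write-only or fail on read; skip them.
                    continue
                # Store the path relative to the parent of src_dir,
                # e.g. "dri/0/i915_error_state".
                info = tarfile.TarInfo(os.path.relpath(path, base))
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
                archived.append(info.name)
    return archived
```

Calling snapshot_tree("/sys/kernel/debug/dri", "dri-debug.tar.gz") after
the hang should then leave a tarball that can be attached to a reply or
bug report.
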
The userspace parts of the i915 driver are very important for gpu hangs,
so please attach the version of mesa, libdrm and xf86-video-intel you've
installed. Also please attach all your i915.ko module options as listed
in /sys/module/i915/parameters.

> - the other is a complete freeze of the system, hard reset required, nothing logged to /var/log/messages

It's rather likely that this is the same issue as above. Depending upon
exact circumstances the gpu can take down the entire system.

> Is there any value in me creating a bug report for this, it seems to be a pretty common issue.
> Is there any use in my trying different kernel command line optios for
> the i915 driver or config options to the xorg intel driver?

Yes, gpu hangs are one of the more common issues, but until you've
submitted the error_state there's no way to diagnose the issue and tell
whether we have got a report already.

> I have the various git trees pulled out (I was looking for recent changes that might be related
> to this issue). I'm capable of building and installing from these git trees if there are specific
> bits that I should test.
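
For the module options requested further up, a rough sketch of dumping
/sys/module/i915/parameters as name=value lines, again assuming
Python 3. Both function names are invented for this example, and some
parameters may only be readable as root:

```python
import os


def read_module_parameters(param_dir):
    """Return {parameter name: value} for each regular file in param_dir.

    (Hypothetical helper, written for illustration only.)
    """
    params = {}
    for name in sorted(os.listdir(param_dir)):
        path = os.path.join(param_dir, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path, "r", errors="replace") as fh:
                params[name] = fh.read().strip()
        except OSError as exc:
            # Keep the name in the report even when the value is unreadable.
            params[name] = "<unreadable: %s>" % exc.strerror
    return params


def format_parameters(params):
    """Render the parameters as sorted 'name=value' lines for pasting."""
    return "\n".join("%s=%s" % (k, v) for k, v in sorted(params.items()))
```

Printing format_parameters(read_module_parameters("/sys/module/i915/parameters"))
gives a block that can be pasted straight into the reply.
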
>
> [ 939.830806] ------------[ cut here ]------------
> [ 939.830814] WARNING: at drivers/gpu/drm/i915/i915_drv.c:372 gen6_gt_force_wake_put+0x29/0x51 [i915]()
> [ 939.830816] Hardware name: Vostro 3550
> [ 939.830818] Modules linked in: snd_seq_dummy fuse ip6table_filter ip6_tables ebtable_nat ebtables xt_state xt_CHECKSUM iptable_mangle ppdev parport_pc lp parport vboxpci vboxnetadp vboxnetflt vboxdrv bridge stp llc tun rfcomm bnep ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 snd_hda_codec_hdmi snd_hda_codec_idt uvcvideo videodev btusb media bluetooth v4l2_compat_ioctl32 arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm iwlagn microcode mac80211 dell_laptop iTCO_wdt r8169 i2c_i801 snd_timer cfg80211 snd mii iTCO_vendor_support dcdbas dell_wmi sparse_keymap soundcore rfkill snd_page_alloc virtio_net kvm_intel kvm binfmt_misc wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
> [ 939.830926] Pid: 0, comm: swapper Tainted: G WC 3.1.0-0.rc10.git0.1.fc17.x86_64 #1
> [ 939.830928] Call Trace:
> [ 939.830930] ] warn_slowpath_common+0x83/0x9b
> [ 939.830941] [] warn_slowpath_null+0x1a/0x1c
> [ 939.830952] [] gen6_gt_force_wake_put+0x29/0x51 [i915]
> [ 939.830963] [] i915_read32+0x44/0x6b [i915]
> [ 939.830975] [] i915_hangcheck_elapsed+0xe8/0x1f8 [i915]
> [ 939.831027] [] irq_exit+0x5d/0xcf
> [ 939.831032] [] smp_apic_timer_interrupt+0x7c/0x8a
> [ 939.831036] [] apic_timer_interrupt+0x73/0x80
> [ 939.831038] ] ? paravirt_read_tsc+0x9/0xd
> [ 939.831046] [] ? intel_idle+0xe5/0x10c
> [ 939.831050] [] ? intel_idle+0xe1/0x10c
> [ 939.831054] [] cpuidle_idle_call+0x11c/0x1fe
> [ 939.831059] [] cpu_idle+0xab/0x101
> [ 939.831063] [] rest_init+0xd7/0xde
> [ 939.831067] [] ? csum_partial_copy_generic+0x16c/0x16c
> [ 939.831072] [] start_kernel+0x3dd/0x3ea
> [ 939.831076] [] x86_64_start_reservations+0xaf/0xb3
> [ 939.831081] [] ? early_idt_handlers+0x140/0x140
> [ 939.831085] [] x86_64_start_kernel+0x102/0x111
> [ 939.831088] ---[ end trace f5cba358bac6b7e5 ]---

This WARN is a possible side effect of a dying gpu. It's an independent
but rather harmless bug. Unfortunately there's no easy solution, hence
no patch atm.

Yours, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48