From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: xen.git pvops kernel bug: i915 bug after memory upgrade
Date: Tue, 15 Jun 2010 19:49:16 -0400
Message-ID: <20100615234916.GA6141@phenom.dumpdata.com>
References: <4B729C3B.8080604@endlessvoid.com> <20100210205738.GB21068@phenom.dumpdata.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="opJtzjQTFsWo+cga"
Return-path:
Content-Disposition: inline
In-Reply-To: <20100210205738.GB21068@phenom.dumpdata.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Yasir Assam
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

--opJtzjQTFsWo+cga
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Feb 10, 2010 at 03:57:38PM -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Feb 10, 2010 at 10:44:59PM +1100, Yasir Assam wrote:
> > I upgraded my RAM from 2GB to 8GB today, and I'm no longer able to run
> > X. My guess is this is a bug in the xen.git kernel (the dom0 kernel) in
> > the i915 module. Other kernels (vanilla 2.6.32.x) work fine.
> >
> > I have attached the full dmesg log. The problem is completely
> > reproducible on my machine.
>
> 1) Can you give me the hardware specs?

Note: Per personal conversation, it was an Asus P7H55-M Pro, which has the
Intel H55 chipset, or I965.

> > .. snip ..
> > [ 23.261678] BUG: unable to handle kernel paging request at ffffc900000c6000
> > [ 23.261685] IP: [] intel_i915_chipset_flush+0x22/0x3e [intel_agp]
> > [ 23.261694] PGD 33d2067 PUD 33d3067 PMD 33d4067 PTE 0
> > [ 23.261700] Oops: 0002 [#1] SMP
> > [ 23.261703] last sysfs file: /sys/module/i2c_core/initstate
> > [ 23.261705] CPU 0
> > [ 23.261707] Modules linked in: i915(+) drm i2c_algo_bit video output ppdev lp parport sco bnep rfcomm l2cap bluetooth rfkill battery cpufreq_stats cpufreq_userspace cpufreq_conservative cpufreq_powersave fuse hwmon_vid k8temp eeprom i2c_nforce2 firewire_sbp2 firewire_core crc_itu_t loop snd_hda_codec_intelhdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi processor snd_seq_midi_event evdev pcspkr i2c_i801 i2c_core asus_atk0110 snd_seq snd_timer button snd_seq_device acpi_processor snd soundcore snd_page_alloc ext3 jbd mbcache dm_mod raid1 md_mod sg sd_mod crc_t10dif sr_mod cdrom usbhid hid pata_jmicron ata_generic ata_piix libata scsi_mod ide_pci_generic ehci_hcd r8169 mii ide_core usbcore nls_base intel_agp thermal fan thermal_sys [last unloaded: scsi_wait_scan]
> > [ 23.261775] Pid: 2379, comm: modprobe Not tainted 2.6.31.6-pvops-dom0 #7 System Product Name
> > [ 23.261777] RIP: e030:[] [] intel_i915_chipset_flush+0x22/0x3e [intel_agp]
> > [ 23.261783] RSP: e02b:ffff880002155a58 EFLAGS: 00010286
> > [ 23.261785] RAX: 0000000000000001 RBX: ffff88001e0f7300 RCX: 0000000000001000
> > [ 23.261787] RDX: ffffc900000c6000 RSI: 00000000000007e9 RDI: ffff88001d5efe00
> > [ 23.261789] RBP: ffff88001e96c000 R08: 0000000000000040 R09: ffff8800016f1000
> > [ 23.261792] R10: ffff880000000000 R11: 6db6db6db6db6db7 R12: 0000000000000001
> > [ 23.261794] R13: 00000000007e9000 R14: ffff88001e0f7f00 R15: 00000000007e9000
> > [ 23.261799] FS: 00007f1a00ddd6f0(0000) GS:ffffc90000000000(0000) knlGS:0000000000000000
> > [ 23.261801] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [ 23.261803] CR2: ffffc900000c6000 CR3: 000000001dd65000 CR4: 0000000000002660
> > [ 23.261806] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 23.261808] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [ 23.261811] Process modprobe (pid: 2379, threadinfo ffff880002154000, task ffff8800198b8000)
> > [ 23.261812] Stack:
> > [ 23.261814] 0000000d00000000 000000007ea42086 000000007ea42086 ffffffffa03c387c
> > [ 23.261818] <0> ffff88001e0f7f00 000000007ea42086 ffff88001e96c000 ffff88001e0f7300
> > [ 23.261823] <0> ffff88001e0f7f00 ffffffffa03c4fc3 ffff88001e0f7300 0000000000000000
> > [ 23.261828] Call Trace:
> > [ 23.261840] [] ? i915_gem_object_flush_cpu_write_domain+0x30/0x53 [i915]
> > [ 23.261849] [] ? i915_gem_object_set_to_gtt_domain+0x57/0x9d [i915]
> > [ 23.261860] [] ? intelfb_create+0x1e5/0x7a3 [i915]
> > [ 23.261866] [] ? xen_force_evtchn_callback+0x1d/0x37
> > [ 23.261877] [] ? intelfb_probe+0x3c6/0x62e [i915]
> > [ 23.261881] [] ? xen_restore_fl_direct_end+0x0/0x1
> > [ 23.261894] [] ? drm_helper_initial_config+0x176/0x19c [drm]
> > [ 23.261902] [] ? i915_driver_load+0xaa7/0xb3c [i915]
> > [ 23.261913] [] ? drm_get_dev+0x321/0x444 [drm]
> > [ 23.261919] [] ? local_pci_probe+0x22/0x3e
> > [ 23.261922] [] ? xen_force_evtchn_callback+0x1d/0x37
> > [ 23.261925] [] ? pci_device_probe+0x68/0xab
> > [ 23.261930] [] ? driver_probe_device+0xa2/0x13a
> > [ 23.261933] [] ? xen_restore_fl_direct_end+0x0/0x1
> > [ 23.261936] [] ? __driver_attach+0x63/0x9a
> > [ 23.261939] [] ? __driver_attach+0x0/0x9a
> > [ 23.261942] [] ? bus_for_each_dev+0x54/0x9d
> > [ 23.261945] [] ? bus_add_driver+0xbc/0x218
> > [ 23.261948] [] ? driver_register+0xa3/0x122
> > [ 23.261951] [] ? __pci_register_driver+0x5e/0xe7
> > [ 23.261959] [] ? i915_init+0x0/0x74 [i915]
> > [ 23.261962] [] ? do_one_initcall+0x77/0x1c1
> > [ 23.261966] [] ? sys_init_module+0xda/0x223
> > [ 23.261970] [] ? system_call_fastpath+0x16/0x1b
> > [ 23.261972] Code: 86 51 06 e1 48 83 c4 18 c3 48 83 ec 18 48 8b 15 f1 80 00 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 48 85 d2 74 04 b0 01 <89> 02 48 8b 44 24 10 65 48 33 04 25 28 00 00 00 74 05 e8 48 51
> > [ 23.262012] RIP [] intel_i915_chipset_flush+0x22/0x3e [intel_agp]
> > [ 23.262017] RSP
> > [ 23.262019] CR2: ffffc900000c6000
> > [ 23.262022] ---[ end trace cf5e2ee5497e2d52 ]---
> > [ 26.955198] eth0: no IPv6 routers present
> > [ 27.230515] peth0: no IPv6 routers present

In the latest PV-OPS kernel (and in 2.6.31.x) I do not see anything
obvious that would explain why this happens. There are two things that I
think might be at fault here:

 1). CONFIG_DMAR was not set and you ended up using the non-PCI DMA
     mapping of pages.
 2). We mapped the wrong address.

I am perplexed here, but we can narrow this down:

 1). Apply the attached patch.
 2). With a working setup (perhaps booting the PV-OPS kernel without Xen),
     but still with 8GB of RAM, run 'lspci -vvv' and 'dmesg' and send the
     output.
 3). Get a PCI or PCI-e serial card. I've been using the Rosewill RC-301
     and RC-301EU with success. I had to figure out the I/O ports from
     'lspci' and put this on my Xen command line:
     "com1=115200,8n1,0xd800,0". The 0xd800 is what lspci reported as the
     first I/O port of that serial card.
 4). Also add to your Xen command line: "console=com1,vga guest_loglvl=all"
 5). On your Linux kernel command line add: "initcall_debug debug"
 6). Compile the kernel and reboot. Make sure to have CONFIG_DMAR=y set.

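For steps 3 and 4, here is a minimal sketch of how the I/O port could be
pulled out of saved 'lspci -vvv' output and turned into the Xen options
quoted above. The helper names (first_io_port, xen_serial_args) and the
sample lspci text are made up for illustration; only the option strings
come from this mail, and the trailing ",0" is copied as-is from the example
in step 3. It assumes the usual "Region N: I/O ports at <hex>" lines that
lspci prints for I/O BARs.

```python
import re

# Matches lines such as "Region 0: I/O ports at d800 [size=8]".
IO_REGION = re.compile(r"Region \d+: I/O ports at ([0-9a-fA-F]+)")


def first_io_port(lspci_text, device_keyword):
    """Return the first I/O port base (as int) listed under a device whose
    header line contains device_keyword, or None if nothing matches."""
    in_device = False
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new device block.
            in_device = device_keyword.lower() in line.lower()
            continue
        if in_device:
            match = IO_REGION.search(line)
            if match:
                return int(match.group(1), 16)
    return None


def xen_serial_args(io_base, baud=115200):
    """Format the Xen command-line options from steps 3 and 4."""
    return f"com1={baud},8n1,0x{io_base:x},0 console=com1,vga guest_loglvl=all"


# Hypothetical lspci excerpt, NOT taken from the reporter's machine.
SAMPLE = """\
05:00.0 Serial controller: Example PCI serial card (made-up entry)
\tRegion 0: I/O ports at d800 [size=8]
\tRegion 1: I/O ports at d400 [size=8]
"""

if __name__ == "__main__":
    port = first_io_port(SAMPLE, "Serial controller")
    print(xen_serial_args(port))
```

On a real machine you would feed it the saved output of 'lspci -vvv'
instead of SAMPLE, and double-check the result against the lspci text by
eye before putting it on the Xen command line.
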
--opJtzjQTFsWo+cga
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="test.patch"

diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
index 4d01d0e..134c115 100644
--- a/drivers/char/agp/intel-agp.c
+++ b/drivers/char/agp/intel-agp.c
@@ -1133,6 +1133,9 @@ static void intel_i9xx_setup_flush(void)
 	}
 
 	if (intel_private.ifp_resource.start) {
+		printk(KERN_INFO "%s: %lx->%lx\n", __FUNCTION__,
+			intel_private.ifp_resource.start,
+			intel_private.ifp_resource.end);
 		intel_private.i9xx_flush_page = ioremap_nocache(intel_private.ifp_resource.start, PAGE_SIZE);
 		if (!intel_private.i9xx_flush_page)
 			dev_info(&intel_private.pcidev->dev, "can't ioremap flush page - no chipset flushing");

--opJtzjQTFsWo+cga
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--opJtzjQTFsWo+cga--