* SB600 AHCI: Hard Disk Corruption
@ 2008-05-25 12:10 Patrick
  2008-05-25 12:16 ` Patrick
  2008-05-25 17:38 ` Pavel Machek
  0 siblings, 2 replies; 25+ messages in thread
From: Patrick @ 2008-05-25 12:10 UTC
To: linux-kernel

Hello (Tejun Heo *)

I've got an annoying problem with my athlon 64bit, 4gb ram, asus m2a-vm
(->SB600 AHCI controller), SAMSUNG HD501LJ SATA Disk. I'm using kernel
2.6.26-rc3. Everything works fine, except for standby/suspend/hibernate.
Standby freezes, hibernate, I actually haven't tested lately cause I
want suspend to ram to work first.

"echo mem > /sys/power/state; vbetool post;" (on text console)
successfully suspends the system and it resumes as well, BUT: After
resuming, things quickly turn bad: "file not found", kernel reports
ext2 errors on root (lvm) partition. After a (hard) reboot the root
filesystem won't even be recognized again by mount and e2fsck can hardly
recover it (thousands of inodes go to lost+found, have to restore
backups to make the system work again). This happened even when the
partition was mounted _readonly_ and it happens to ALL partitions
mounted during suspend. **

I'm testing now by appending break=init to the kernel command line,
getting to a busybox on the initramfs, and then unmounting "root" before
suspending. From there I can dmesg to see what's happening (though the
dmesg buffer is quite small...can I increase that in proc somewhere?).

I'd be willing to test and send whatever logs you need to get this fixed.

Some additional info:

Upgrading from 2.6.24, I hoped the AHCI_HFLAG_NO_MSI in
drivers/ata/ahci.c might solve the issue - no luck. All the other sb600
workarounds: obviously no luck as well.

irqpoll: slightly different behaviour when unloading sd_mod and ahci
modules before suspending:

without irqpoll, the disk ([sda]) doesn't show up again after "modprobe
ahci; modprobe sd_mod" and I get
"ata5.00: failed to IDENTIFY [...] err_mask=0x80"
"failed to restore some devices [...]" errors

with irqpoll, disk shows up again and no errors, but "there is different
data" on each read (head -c10000) from /dev/sda. Though the disk is not
changed, after rebooting it contains the original data. I just wonder
how the data is "created" - it seems to be disk content from different
locations (not beginning) on the disk - if I "dd if=/dev/sda
of=/dev/null", I hear the disk reading data....

Well - I hope you might be able to make some sense of that and tell me
what logs and dumps exactly you need to fix it...

Greets - Patrick

* I read many threads in which Tejun provided patches for the SB600 AHCI
Controller which seems to be seriously broken - if only I knew that in
advance... Maybe he can fix this issue as well - last resort. Otherwise
I'll burn that mobo!

** After my first install and configuring the system for a day, trying
out suspend to ram smashed it with no backups, since then I didn't learn
my lesson and smashed it again 2-3 times, this time with backups at hand
though, ...
* Re: SB600 AHCI: Hard Disk Corruption
  2008-05-25 12:10 SB600 AHCI: Hard Disk Corruption Patrick
@ 2008-05-25 12:16 ` Patrick
  2008-05-25 17:38 ` Pavel Machek
  1 sibling, 0 replies; 25+ messages in thread
From: Patrick @ 2008-05-25 12:16 UTC
To: linux-kernel

Uh, d'oh, forgot:

Could you please CC me on replies, since I'm not subscribed - thanks
* Re: SB600 AHCI: Hard Disk Corruption
  2008-05-25 12:10 SB600 AHCI: Hard Disk Corruption Patrick
  2008-05-25 12:16 ` Patrick
@ 2008-05-25 17:38 ` Pavel Machek
  2008-05-25 20:08   ` Patrick
  1 sibling, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2008-05-25 17:38 UTC
To: Patrick; +Cc: linux-kernel

Hi!

> I've got an annoying problem with my athlon 64bit, 4gb ram, asus m2a-vm
> (->SB600 AHCI controller), SAMSUNG HD501LJ SATA Disk. I'm using kernel
> 2.6.26-rc3. Everything works fine, except for standby/suspend/hibernate.
> Standby freezes, hibernate, I actually haven't tested lately cause I
> want suspend to ram to work first.
>
> "echo mem > /sys/power/state; vbetool post;" (on text console)
> successfully suspends the system and it resumes as well, BUT: After
> resuming, things quickly turn bad: "file not found", kernel reports

iommu problem? Try it with mem=3G.

> * I read many threads in which Tejun provided patches for the SB600 AHCI
> Controller which seems to be seriously broken - if only I knew that in
> advance... Maybe he can fix this issue as well - last resort. Otherwise
> I'll burn that mobo!

I suspect all you need is to burn one dimm.. or send it to me so that
I can reproduce it ;-).

> ** After my first install and configuring the system for a day, trying
> out suspend to ram smashed it with no backups, since then I didn't learn
> my lesson and smashed it again 2-3 times, this time with backups at hand
> though, ...

Boot it from cd ;-).

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: SB600 AHCI: Hard Disk Corruption
  2008-05-25 17:38 ` Pavel Machek
@ 2008-05-25 20:08   ` Patrick
  2008-05-25 20:39     ` >3G => iommu => suspend problems -- was " Pavel Machek
  0 siblings, 1 reply; 25+ messages in thread
From: Patrick @ 2008-05-25 20:08 UTC
To: Pavel Machek; +Cc: linux-kernel

> iommu problem? Try it with mem=3G.

YES! :-) How did you know?

Even mem=4G did the trick. I should have tried that out a long time ago
(!), since it also used to make the fglrx driver problems I was dealing
with go away. These are solved now though, by a recent bios update
(Version 1705 2008/04/21). The fglrx driver used to work once but
stopped working after an upgrade one day. Now with the new bios, the
current (ubuntu 8.04) version is working.

So now, using mem=4G, after successfully suspending once, my
ahci-scsi-libata-[sda] disk is still working. The error messages I
mentioned disappeared from the kernel messages. Some other, usb driver
error messages, that used to show up after resume have disappeared as
well.

Anyway: using mem=4G is definitely no option! I get only 3G of usable
memory!!! *

So how are we going to get this fixed???
Should the IOMMU be reinitialized after resume? Or should the bios do it
but doesn't? ** On the m2a-vm, the "GART" seems to be used as iommu,
so... :-) ???

I put some kernel message logs here:
http://zefir.freesitespace.net/dmesg/
One log without mem=4G will also be there shortly, where you will see
the messages concerning the iommu (which seems to be unused right now).

Greets

* > I suspect all you need is to burn one dimm.. or send it to me so that
> I can reproduce it ;-).

;-) ...looking back - all the trouble the 4G caused me till now, I would
have saved lots of time mounting only one 2G dimm in the first place,
even if the system was a little slower, but be it...

> Boot it from cd ;-).

I'm impatient...

** Seems, the SB600 is not to blame - alas sorry for the subject.
* >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-05-25 20:08   ` Patrick
@ 2008-05-25 20:39     ` Pavel Machek
  2008-05-25 21:10       ` Pavel Machek
  0 siblings, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2008-05-25 20:39 UTC
To: Patrick; +Cc: linux-kernel

Hi!

> > iommu problem? Try it with mem=3G.
>
> YES! :-) How did you know?

Guess how... I hit it myself.

> So how are we going to get this fixed???

Write a patch, submit it? ;-).

Okay, I guess I should do the patch, but I can't test it easily. If
you can do testing/some development, I guess I can try to cook up
something. (But no, I'm not an IOMMU expert).

> Should the IOMMU be reinitialized after resume? Or should the bios do it
> but doesn't? ** On the m2a-vm, the "GART" seems to be used as iommu,
> so... :-) ???

It is a Linux bug. BIOS could be more helpful, but... this is a Linux
problem.

> I put some kernel message logs here:
> http://zefir.freesitespace.net/dmesg/
> One log without mem=4G will also be there shortly, where you will see
> the messages concerning the iommu (which seems to be unused right
> now).

Yep, they are similar to what I see.

> ** Seems, the SB600 is not to blame - alas sorry for the subject.

Subject is easy to change ;-).
								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-05-25 20:39 ` >3G => iommu => suspend problems -- was " Pavel Machek
@ 2008-05-25 21:10 ` Pavel Machek
  2008-05-26 15:31   ` Patrick
  2008-05-27 10:23   ` Pavel Machek
  0 siblings, 2 replies; 25+ messages in thread
From: Pavel Machek @ 2008-05-25 21:10 UTC
To: Patrick; +Cc: linux-kernel

Hi!

> > > iommu problem? Try it with mem=3G.
> >
> > YES! :-) How did you know?
>
> Guess how... I hit it myself.
>
> > So how are we going to get this fixed???
>
> Write a patch, submit it? ;-).

Can you try this one? It should prevent suspend in the broken cases,
but allow it in mem=4G config.

								Pavel

iommu/gart support misses suspend/resume code, which can do bad stuff,
including memory corruption on resume. Prevent system suspend in case we
would be unable to resume.

Signed-off-by: Pavel Machek <pavel@suse.cz>

---
commit 7724af033ea084f0b037ae8a2032da5e40255088
tree 06de661a106ba83a96cab2ee0e76e3f3c44823ab
parent e9f4353b46ec2b05f73e1a84085c305de211bd3e
author Pavel <pavel@amd.ucw.cz> Sun, 25 May 2008 23:08:17 +0200
committer Pavel <pavel@amd.ucw.cz> Sun, 25 May 2008 23:08:17 +0200

 arch/x86/kernel/pci-gart_64.c |   31 ++++++++++++++++++++++++++++++-
 1 files changed, 30 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index 620ec3a..926af9c 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -26,6 +26,7 @@ #include <linux/bitops.h>
 #include <linux/kdebug.h>
 #include <linux/scatterlist.h>
 #include <linux/iommu-helper.h>
+#include <linux/sysdev.h>
 #include <asm/atomic.h>
 #include <asm/io.h>
 #include <asm/mtrr.h>
@@ -548,6 +549,28 @@ static __init unsigned read_aperture(str
 	return aper_base;
 }
 
+static int gart_resume(struct sys_device *dev)
+{
+	return 0;
+}
+
+static int gart_suspend(struct sys_device *dev, pm_message_t state)
+{
+	return -EINVAL;
+}
+
+static struct sysdev_class gart_sysdev_class = {
+	.name = "gart",
+	.suspend = gart_suspend,
+	.resume = gart_resume,
+
+};
+
+static struct sys_device device_gart = {
+	.id = 0,
+	.cls = &gart_sysdev_class,
+};
+
 /*
  * Private Northbridge GATT initialization in case we cannot use the
  * AGP driver for some reason.
@@ -558,7 +581,7 @@ static __init int init_k8_gatt(struct ag
 	unsigned aper_base, new_aper_base;
 	struct pci_dev *dev;
 	void *gatt;
-	int i;
+	int i, error;
 
 	printk(KERN_INFO "PCI-DMA: Disabling AGP.\n");
 	aper_size = aper_base = info->aper_size = 0;
@@ -595,6 +618,12 @@ static __init int init_k8_gatt(struct ag
 		dev = k8_northbridges[i];
 		enable_gart_translation(dev, __pa(gatt));
 	}
+
+	error = sysdev_class_register(&gart_sysdev_class);
+	if (!error)
+		error = sysdev_register(&device_gart);
+	if (error)
+		panic("Could not register gart_sysdev -- would corrupt data on next suspend");
 	flush_gart();
 
 	printk(KERN_INFO "PCI-DMA: aperture base @ %x size %u KB\n",

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
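The idea behind the patch above can be modelled in user space. This is a minimal sketch, assuming that a non-zero return from a suspend hook makes the core abort the suspend and resume whatever it had already put to sleep. `toy_dev`, `toy_suspend_all` and the hook names are hypothetical and only illustrate the veto; they are not the kernel's sysdev API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a system device with suspend/resume hooks. */
struct toy_dev {
	const char *name;
	int (*suspend)(void);
	int (*resume)(void);
	int suspended;
};

/* A well-behaved device: both hooks succeed. */
static int ok_suspend(void) { return 0; }
static int ok_resume(void) { return 0; }

/* Mirrors gart_suspend() in the patch: always refuse to suspend. */
static int veto_suspend(void) { return -EINVAL; }

/*
 * Suspend devices in order. On the first failure, resume the devices
 * that were already suspended (in reverse order) and report the error.
 */
static int toy_suspend_all(struct toy_dev *devs, size_t n)
{
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int err = devs[i].suspend();

		if (err) {
			while (i-- > 0) {
				devs[i].resume();
				devs[i].suspended = 0;
			}
			return err;
		}
		devs[i].suspended = 1;
	}
	return 0;
}
```

With a device list that contains one vetoing entry, `toy_suspend_all()` returns `-EINVAL` and leaves every device awake, which is the behaviour the patch relies on to keep a machine with an active GART from suspending at all.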
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-05-25 21:10 ` Pavel Machek
@ 2008-05-26 15:31   ` Patrick
  2008-05-27 11:22     ` Pavel Machek
  2008-05-27 10:23   ` Pavel Machek
  1 sibling, 1 reply; 25+ messages in thread
From: Patrick @ 2008-05-26 15:31 UTC
To: Pavel Machek; +Cc: linux-kernel

> Can you try this one? It should prevent suspend in the broken cases,
> but allow it in mem=4G config.

Sure!

root@babar:/usr/src/linux-2.6.25# patch -p1 < /home/pat/patch-2.6.26-rc3.gart-suspend
patching file arch/x86/kernel/pci-gart_64.c
Hunk #4 succeeded at 629 with fuzz 2 (offset 11 lines).

.....make; cp bzImage /boot; reboot....

without mem=4G:
...
[17180761.682783] CPU1 is down
[17180761.682968] Class suspend failed for gart0
[17180761.683321] PM: Some devices failed to power down
[17180761.683326] Enabling non-boot CPUs ...

with mem=4G:
...
[no change]

-> perfect! Behaves as you predicted.

logs are ... here: http://zefir.890m.com/dmesg/ *

Thanks for attacking the issue really quickly. I'm ready to do further
testing and will try to find enough time to do one test a day during the
week.

Cheers

* The other host's dns entry is gone, I guess that's because it had too
many requests and was taken down...? Hope this one survives!
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-05-26 15:31   ` Patrick
@ 2008-05-27 11:22     ` Pavel Machek
  2008-05-29 18:44       ` Patrick
  2008-06-03 22:33       ` Rafael J. Wysocki
  0 siblings, 2 replies; 25+ messages in thread
From: Pavel Machek @ 2008-05-27 11:22 UTC
To: Patrick; +Cc: linux-kernel

Hi!

> > Can you try this one? It should prevent suspend in the broken cases,
> > but allow it in mem=4G config.
>
> Sure!
>
> root@babar:/usr/src/linux-2.6.25# patch -p1 < /home/pat/patch-2.6.26-rc3.gart-suspend
> patching file arch/x86/kernel/pci-gart_64.c
> Hunk #4 succeeded at 629 with fuzz 2 (offset 11 lines).
>
> .....make; cp bzImage /boot; reboot....

Thanks!

This goes on top of the second patch... it makes it work.

								Pavel

For iommu suspend/resume code to work, functions it calls may not be
__init.

Signed-off-by: Pavel Machek <pavel@suse.cz>

---
commit 0ea376de01be797f9563c2c2464149f8f0af6329
tree 4b5179fe97fe045cc770091bce94f898f26e4499
parent 017834f8541b8ded8ef831e5fe2b5f9cead4f6b0
author Pavel <pavel@amd.ucw.cz> Tue, 27 May 2008 13:21:05 +0200
committer Pavel <pavel@amd.ucw.cz> Tue, 27 May 2008 13:21:05 +0200

 arch/x86/kernel/k8.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/k8.c b/arch/x86/kernel/k8.c
index 7377ccb..acf4770 100644
--- a/arch/x86/kernel/k8.c
+++ b/arch/x86/kernel/k8.c
@@ -76,7 +76,7 @@ EXPORT_SYMBOL_GPL(cache_k8_northbridges)
 
 /* Ignores subdevice/subvendor but as far as I can figure out
    they're useless anyways */
-int __init early_is_k8_nb(u32 device)
+int early_is_k8_nb(u32 device)
 {
 	struct pci_device_id *id;
 	u32 vendor = device & 0xffff;

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-05-27 11:22     ` Pavel Machek
@ 2008-05-29 18:44       ` Patrick
  2008-05-29 18:51         ` Patrick
  2008-05-29 21:05         ` Patrick
  2008-06-03 22:33       ` Rafael J. Wysocki
  1 sibling, 2 replies; 25+ messages in thread
From: Patrick @ 2008-05-29 18:44 UTC
To: Pavel Machek; +Cc: linux-kernel

Hi!

> This goes on top of the second patch... it makes it work.
>
> Pavel
>

All right. It works! :-) Thank you very much!

I'll describe how I tested:

So far you sent 3 patches to the list. As I couldn't apply the second
one to my v2.6.26-rc4 tree I had to get git and then your tree. Now I
have a git working tree with two branches *master* and *pavel*,
corresponding to torvalds/linux-2.6 and pavel/work respectively. I made
*pavel* the current branch and issued the following command:

git diff v2.6.26-rc4 arch/x86/kernel/aperture_64.c \
	arch/x86/kernel/k8.c \
	arch/x86/kernel/pci-gart_64.c \
	drivers/char/agp/generic.c \
	include/asm-x86/gart.h \
	> /home/pat/suspend-vs-iommu.patch

The result is on http://zefir.890m.com/dmesg/

I applied this patch to my old v2.6.26-rc4 tree and recompiled a new
kernel there *, put it in place and rebooted. I suspended several times
since then and put a kernel log of one normal boot process with suspend
from console while X is running (radeonhd) on the site mentioned before.

I'm running a debian testing / ubuntu hardy mixed system and used the
kernel .config from linux-headers-2.6.25-2-amd64.deb (debian unstable)
with "make oldconfig" and defaults on new options.

The diff is nearly 500 lines long, but it could be narrowed, I guess.
For me it works perfectly like this though.

If you want me to do any further tests, just say so. I'm trying to get
to know git a bit better and hope to be able to help again sometime.
It's great fun! :-)

Cheers

* I get
"ACPI: Unable to turn cooling device [ffff81012fa5cdd0] 'off'" every two
seconds and "fancontrol" isn't working any more... going to send an
acpidump to the acpi group. Maybe I'll find out myself as well...
investigating. Not sure if it's caused by the patch.
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-05-29 18:44       ` Patrick
@ 2008-05-29 18:51         ` Patrick
  2008-05-29 21:05         ` Patrick
  1 sibling, 0 replies; 25+ messages in thread
From: Patrick @ 2008-05-29 18:51 UTC
To: Pavel Machek; +Cc: linux-kernel

_This_ was meant to be the first footnote, actually:

* I guess these build warnings have nothing to do with the patch I
applied:

root@babar:/mnt/reiser/linux/linux-2.6.25# make -j3 CONFIG_DEBUG_SECTION_MISMATCH=y
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  CALL    scripts/checksyscalls.sh
  CHK     include/linux/compile.h
  Building modules, stage 2.
Kernel: arch/x86/boot/bzImage is ready  (#2)
  MODPOST 1876 modules
WARNING: drivers/isdn/hisax/hisax.o(.text+0xbc7): Section mismatch in reference from the function HiSax_inithardware() to the function .devinit.text:hisax_cs_setup_card()
The function HiSax_inithardware() references
the function __devinit hisax_cs_setup_card().
This is often because HiSax_inithardware lacks a __devinit
annotation or the annotation of hisax_cs_setup_card is wrong.

WARNING: drivers/isdn/hisax/hisax.o(.text+0xccc): Section mismatch in reference from the function hisax_init_pcmcia() to the function .devinit.text:hisax_cs_setup_card()
The function hisax_init_pcmcia() references
the function __devinit hisax_cs_setup_card().
This is often because hisax_init_pcmcia lacks a __devinit
annotation or the annotation of hisax_cs_setup_card is wrong.

WARNING: drivers/isdn/hisax/hisax.o(.text+0x1198): Section mismatch in reference from the function hisax_register() to the function .devinit.text:hisax_cs_setup_card()
The function hisax_register() references
the function __devinit hisax_cs_setup_card().
This is often because hisax_register lacks a __devinit
annotation or the annotation of hisax_cs_setup_card is wrong.

WARNING: drivers/scsi/gdth.o(.text+0x3c35): Section mismatch in reference from the function gdth_pci_probe_one() to the function .init.text:gdth_search_drives()
The function gdth_pci_probe_one() references
the function __init gdth_search_drives().
This is often because gdth_pci_probe_one lacks a __init
annotation or the annotation of gdth_search_drives is wrong.

WARNING: drivers/scsi/gdth.o(.text+0x3d3a): Section mismatch in reference from the function gdth_pci_probe_one() to the function .init.text:gdth_enable_int()
The function gdth_pci_probe_one() references
the function __init gdth_enable_int().
This is often because gdth_pci_probe_one lacks a __init
annotation or the annotation of gdth_enable_int is wrong.
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-05-29 18:44       ` Patrick
  2008-05-29 18:51         ` Patrick
@ 2008-05-29 21:05         ` Patrick
  1 sibling, 0 replies; 25+ messages in thread
From: Patrick @ 2008-05-29 21:05 UTC
To: Pavel Machek; +Cc: linux-kernel

> * I get
> "ACPI: Unable to turn cooling device [ffff81012fa5cdd0] 'off'" every two
> seconds and "fancontrol" isn't working any more... going to send an
> acpidump to the acpi group. Maybe I'll find out myself as well...
> investigating. Not sure if it's caused by the patch.

It's caused by the suspend/resume process, not the patch - it happens
with the unpatched -rc4 kernel and mem=4G as well after resume.
"fancontrol" needed a config file update only (new device, hwmon0).
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-05-27 11:22     ` Pavel Machek
  2008-05-29 18:44       ` Patrick
@ 2008-06-03 22:33       ` Rafael J. Wysocki
  2008-06-06 13:20         ` Pavel Machek
  1 sibling, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2008-06-03 22:33 UTC
To: Pavel Machek; +Cc: Patrick, linux-kernel, Ingo Molnar, pm list

On Tuesday, 27 of May 2008, Pavel Machek wrote:
> Hi!
>
> > > Can you try this one? It should prevent suspend in the broken cases,
> > > but allow it in mem=4G config.
> >
> > Sure!
> >
> > root@babar:/usr/src/linux-2.6.25# patch -p1 < /home/pat/patch-2.6.26-rc3.gart-suspend
> > patching file arch/x86/kernel/pci-gart_64.c
> > Hunk #4 succeeded at 629 with fuzz 2 (offset 11 lines).
> >
> > .....make; cp bzImage /boot; reboot....
>
> Thanks!
>
> This goes on top of the second patch... it makes it work.
>
> 								Pavel
>
> For iommu suspend/resume code to work, functions it calls may not be
> __init.
>
> Signed-off-by: Pavel Machek <pavel@suse.cz>

I consolidated some of your patches sent in this thread and made the result
apply to the current -git. It hasn't been tested yet, but does it look good?

It's on top of the patch that adds the GART sysdev.

Thanks,
Rafael

---
Handle GART IOMMU suspend and resume.

Not-yet-signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 arch/x86/kernel/aperture_64.c |   34 ++++++++++++++++---------
 arch/x86/kernel/k8.c          |    2 -
 arch/x86/kernel/pci-gart_64.c |   55 +++++++++++++++++++++++++++++-------------
 include/asm-x86/gart.h        |    2 +
 4 files changed, 63 insertions(+), 30 deletions(-)

Index: linux-2.6/arch/x86/kernel/pci-gart_64.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/pci-gart_64.c
+++ linux-2.6/arch/x86/kernel/pci-gart_64.c
@@ -549,14 +549,50 @@ static __init unsigned read_aperture(str
 	return aper_base;
 }
 
+static void enable_gart_translations(void)
+{
+	int i;
+
+	for (i = 0; i < num_k8_northbridges; i++) {
+		struct pci_dev *dev;
+		u32 gatt_reg;
+		u32 ctl;
+
+		dev = k8_northbridges[i];
+		gatt_reg = __pa(agp_gatt_table) >> 12;
+		gatt_reg <<= 4;
+		pci_write_config_dword(dev, 0x98, gatt_reg);
+		pci_read_config_dword(dev, 0x90, &ctl);
+
+		ctl |= 1;
+		ctl &= ~((1<<4) | (1<<5));
+
+		pci_write_config_dword(dev, 0x90, ctl);
+	}
+}
+
+static bool fix_north_bridges;	/* call fix_up_north_bridges() on resume */
+static u32 aperture_order;	/* arguments for fix_up_north_bridges() */
+static u32 aperture_alloc;
+
+void set_gart_resume_data(u32 aper_order, u32 aper_alloc)
+{
+	fix_north_bridges = true;
+	aperture_order = aper_order;
+	aperture_alloc = aper_alloc;
+}
+
 static int gart_resume(struct sys_device *dev)
 {
+	if (fix_north_bridges)
+		fix_up_north_bridges(aperture_order, aperture_alloc);
+	enable_gart_translations();
 	return 0;
 }
 
 static int gart_suspend(struct sys_device *dev, pm_message_t state)
 {
-	return -EINVAL;
+	return 0;
 }
 
 static struct sysdev_class gart_sysdev_class = {
@@ -613,27 +649,14 @@ static __init int init_k8_gatt(struct ag
 	memset(gatt, 0, gatt_size);
 	agp_gatt_table = gatt;
 
-	for (i = 0; i < num_k8_northbridges; i++) {
-		u32 gatt_reg;
-		u32 ctl;
-
-		dev = k8_northbridges[i];
-		gatt_reg = __pa(gatt) >> 12;
-		gatt_reg <<= 4;
-		pci_write_config_dword(dev, 0x98, gatt_reg);
-		pci_read_config_dword(dev, 0x90, &ctl);
-
-		ctl |= 1;
-		ctl &= ~((1<<4) | (1<<5));
-
-		pci_write_config_dword(dev, 0x90, ctl);
-	}
+	enable_gart_translations();
 
 	error = sysdev_class_register(&gart_sysdev_class);
 	if (!error)
 		error = sysdev_register(&device_gart);
 	if (error)
 		panic("Could not register gart_sysdev -- would corrupt data on next suspend");
+
 	flush_gart();
 
 	printk(KERN_INFO "PCI-DMA: aperture base @ %x size %u KB\n",
Index: linux-2.6/arch/x86/kernel/k8.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/k8.c
+++ linux-2.6/arch/x86/kernel/k8.c
@@ -76,7 +76,7 @@ EXPORT_SYMBOL_GPL(cache_k8_northbridges)
 
 /* Ignores subdevice/subvendor but as far as I can figure out
    they're useless anyways */
-int __init early_is_k8_nb(u32 device)
+int early_is_k8_nb(u32 device)
 {
 	struct pci_device_id *id;
 	u32 vendor = device & 0xffff;
Index: linux-2.6/arch/x86/kernel/aperture_64.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/aperture_64.c
+++ linux-2.6/arch/x86/kernel/aperture_64.c
@@ -310,6 +310,25 @@ void __init early_gart_iommu_check(void)
 
 }
 
+void fix_up_north_bridges(u32 aper_order, u32 aper_alloc)
+{
+	int num;
+
+	/* Fix up the north bridges */
+	for (num = 24; num < 32; num++) {
+		if (!early_is_k8_nb(read_pci_config(0, num, 3, 0x00)))
+			continue;
+
+		/*
+		 * Don't enable translation yet. That is done later.
+		 * Assume this BIOS didn't initialise the GART so
+		 * just overwrite all previous bits
+		 */
+		write_pci_config(0, num, 3, 0x90, aper_order << 1);
+		write_pci_config(0, num, 3, 0x94, aper_alloc >> 25);
+	}
+}
+
 void __init gart_iommu_hole_init(void)
 {
 	u32 aper_size, aper_alloc = 0, aper_order = 0, last_aper_order = 0;
@@ -400,17 +419,6 @@ void __init gart_iommu_hole_init(void)
 		return;
 	}
 
-	/* Fix up the north bridges */
-	for (num = 24; num < 32; num++) {
-		if (!early_is_k8_nb(read_pci_config(0, num, 3, 0x00)))
-			continue;
-
-		/*
-		 * Don't enable translation yet. That is done later.
-		 * Assume this BIOS didn't initialise the GART so
-		 * just overwrite all previous bits
-		 */
-		write_pci_config(0, num, 3, 0x90, aper_order<<1);
-		write_pci_config(0, num, 3, 0x94, aper_alloc>>25);
-	}
+	fix_up_north_bridges(aper_order, aper_alloc);
+	set_gart_resume_data(aper_order, aper_alloc);
 }
Index: linux-2.6/include/asm-x86/gart.h
===================================================================
--- linux-2.6.orig/include/asm-x86/gart.h
+++ linux-2.6/include/asm-x86/gart.h
@@ -11,6 +11,8 @@ extern void gart_iommu_shutdown(void);
 extern void __init gart_parse_options(char *);
 extern void early_gart_iommu_check(void);
 extern void gart_iommu_hole_init(void);
+extern void set_gart_resume_data(u32, u32);
+extern void fix_up_north_bridges(u32, u32);
 extern int fallback_aper_order;
 extern int fallback_aper_force;
 extern int gart_iommu_aperture;
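The register values that enable_gart_translations() and fix_up_north_bridges() write in the patch above can be checked in user space. The sketch below only repeats the shifts and masks from the diff on plain integers; the helper names are hypothetical, no PCI access is involved, and the meaning of the individual bits is not asserted here beyond what the patch itself does with them.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Value written to config offset 0x98: the physical address of the GATT
 * shifted right by 12 and then left by 4, as in the patch. The kernel
 * stores the intermediate value in a u32; truncating the final result
 * to 32 bits gives the same number.
 */
static inline uint32_t toy_gatt_base_reg(uint64_t gatt_phys)
{
	assert((gatt_phys & 0xfff) == 0);	/* table assumed page aligned */
	return (uint32_t)((gatt_phys >> 12) << 4);
}

/*
 * Value written back to config offset 0x90 when translation is
 * switched on: set bit 0, clear bits 4 and 5, keep everything else.
 */
static inline uint32_t toy_enable_ctl(uint32_t ctl)
{
	ctl |= 1;
	ctl &= ~((1u << 4) | (1u << 5));
	return ctl;
}

/* Value written to offset 0x90 by fix_up_north_bridges(). */
static inline uint32_t toy_aperture_ctl(uint32_t aper_order)
{
	return aper_order << 1;
}

/* Value written to offset 0x94 by fix_up_north_bridges(). */
static inline uint32_t toy_aperture_base(uint32_t aper_alloc)
{
	return aper_alloc >> 25;
}
```

For example, a table at physical address 0x12345000 yields 0x123450 for offset 0x98, and a control word of 0x30 becomes 0x01 after `toy_enable_ctl()`. The two steps are order-sensitive on resume: the aperture order and base are restored first, and only then is bit 0 set.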
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-06-03 22:33       ` Rafael J. Wysocki
@ 2008-06-06 13:20         ` Pavel Machek
  2008-06-08 22:36           ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2008-06-06 13:20 UTC
To: Rafael J. Wysocki; +Cc: Patrick, linux-kernel, Ingo Molnar, pm list

Hi!

> > > > Can you try this one? It should prevent suspend in the broken cases,
> > > > but allow it in mem=4G config.
> > >
> > > Sure!
> > >
> > > root@babar:/usr/src/linux-2.6.25# patch -p1 < /home/pat/patch-2.6.26-rc3.gart-suspend
> > > patching file arch/x86/kernel/pci-gart_64.c
> > > Hunk #4 succeeded at 629 with fuzz 2 (offset 11 lines).
> > >
> > > .....make; cp bzImage /boot; reboot....
> >
> > Thanks!
> >
> > This goes on top of the second patch... it makes it work.
> >
> > 								Pavel
> >
> > For iommu suspend/resume code to work, functions it calls may not be
> > __init.
> >
> > Signed-off-by: Pavel Machek <pavel@suse.cz>
>
> I consolidated some of your patches sent in this thread and made the result
> apply to the current -git. It hasn't been tested yet, but does it look good?
>
> It's on top of the patch that adds the GART sysdev.

Looks ok to me.
								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption 2008-06-06 13:20 ` Pavel Machek @ 2008-06-08 22:36 ` Rafael J. Wysocki [not found] ` <20080609124630.GA28799@elte.hu> 2008-06-11 11:43 ` >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption Patrick 0 siblings, 2 replies; 25+ messages in thread From: Rafael J. Wysocki @ 2008-06-08 22:36 UTC (permalink / raw) To: Pavel Machek; +Cc: Patrick, linux-kernel, Ingo Molnar, pm list, Andi Kleen On Friday, 6 of June 2008, Pavel Machek wrote: > Hi! > > > > > > Can you try this one? It should prevent suspend in the broken cases, > > > > > but allow it in mem=4G config. > > > > > > > > Sure! > > > > > > > > root@babar:/usr/src/linux-2.6.25# patch -p1 < /home/pat/patch-2.6.26-rc3.gart-suspend > > > > patching file arch/x86/kernel/pci-gart_64.c > > > > Hunk #4 succeeded at 629 with fuzz 2 (offset 11 lines). > > > > > > > > .....make; cp bzImage /boot; reboot.... > > > > > > Thanks! > > > > > > This goes on top of the second patch... it makes it work. > > > > > > Pavel > > > > > > For iommu suspend/resume code to work, functions it calls may not be > > > __init. > > > > > > Signed-off-by: Pavel Machek <pavel@suse.cz> > > > > I consolidated some of your patches sent in this thread and made the result > > apply to the current -git. It hasn't been tested yet, but does it look good? > > > > It's on top of the patch that adds the GART sysdev. > > Looks ok to me. Still, it may be improved. :-) First, we shouldn't mix the "early PCI config access" thing with the "normal" method. Second, we don't have to check for the K8 north bridges on resume, because we already know where they are in the configuration space and we can use this information. Updated patch follows. It has been tested a little on my new 4 GB test box on which 2.6.26-rc4 failed miserably with severe consequences. More testing welcome, but please be careful. Thanks, Rafael --- Add resume handling to GART IOMMU. 
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> --- arch/x86/kernel/aperture_64.c | 2 + arch/x86/kernel/pci-gart_64.c | 75 +++++++++++++++++++++++++++++++++--------- include/asm-x86/gart.h | 1 3 files changed, 62 insertions(+), 16 deletions(-) Index: linux-2.6/arch/x86/kernel/pci-gart_64.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/pci-gart_64.c +++ linux-2.6/arch/x86/kernel/pci-gart_64.c @@ -549,14 +549,70 @@ static __init unsigned read_aperture(str return aper_base; } +static void enable_gart_translations(void) +{ + int i; + + for (i = 0; i < num_k8_northbridges; i++) { + struct pci_dev *dev; + u32 gatt_reg; + u32 ctl; + + dev = k8_northbridges[i]; + gatt_reg = __pa(agp_gatt_table) >> 12; + gatt_reg <<= 4; + pci_write_config_dword(dev, 0x98, gatt_reg); + pci_read_config_dword(dev, 0x90, &ctl); + + ctl |= 1; + ctl &= ~((1<<4) | (1<<5)); + + pci_write_config_dword(dev, 0x90, ctl); + } +} + +/* + * If fix_up_north_bridges is set, the north bridges have to be fixed up on + * resume in the same way as they are handled in gart_iommu_hole_init(). + */ +static bool fix_up_north_bridges; +static u32 aperture_order; +static u32 aperture_alloc; + +void set_up_gart_resume(u32 aper_order, u32 aper_alloc) +{ + fix_up_north_bridges = true; + aperture_order = aper_order; + aperture_alloc = aper_alloc; +} + static int gart_resume(struct sys_device *dev) { + printk(KERN_INFO "PCI-DMA: Resuming GART IOMMU\n"); + + if (fix_up_north_bridges) { + int i; + + for (i = 0; i < num_k8_northbridges; i++) { + struct pci_dev *dev = k8_northbridges[i]; + + /* + * Don't enable translations just yet. That is the next + * step. Restore the pre-suspend aperture settings. 
+ */ + pci_write_config_dword(dev, 0x90, aperture_order << 1); + pci_write_config_dword(dev, 0x94, aperture_alloc >> 25); + } + } + + enable_gart_translations(); + return 0; } static int gart_suspend(struct sys_device *dev, pm_message_t state) { - return -EINVAL; + return 0; } static struct sysdev_class gart_sysdev_class = { @@ -614,27 +670,14 @@ static __init int init_k8_gatt(struct ag memset(gatt, 0, gatt_size); agp_gatt_table = gatt; - for (i = 0; i < num_k8_northbridges; i++) { - u32 gatt_reg; - u32 ctl; - - dev = k8_northbridges[i]; - gatt_reg = __pa(gatt) >> 12; - gatt_reg <<= 4; - pci_write_config_dword(dev, 0x98, gatt_reg); - pci_read_config_dword(dev, 0x90, &ctl); - - ctl |= 1; - ctl &= ~((1<<4) | (1<<5)); - - pci_write_config_dword(dev, 0x90, ctl); - } + enable_gart_translations(); error = sysdev_class_register(&gart_sysdev_class); if (!error) error = sysdev_register(&device_gart); if (error) panic("Could not register gart_sysdev -- would corrupt data on next suspend"); + flush_gart(); printk(KERN_INFO "PCI-DMA: aperture base @ %x size %u KB\n", Index: linux-2.6/arch/x86/kernel/aperture_64.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/aperture_64.c +++ linux-2.6/arch/x86/kernel/aperture_64.c @@ -413,4 +413,6 @@ void __init gart_iommu_hole_init(void) write_pci_config(0, num, 3, 0x90, aper_order<<1); write_pci_config(0, num, 3, 0x94, aper_alloc>>25); } + + set_up_gart_resume(aper_order, aper_alloc); } Index: linux-2.6/include/asm-x86/gart.h =================================================================== --- linux-2.6.orig/include/asm-x86/gart.h +++ linux-2.6/include/asm-x86/gart.h @@ -11,6 +11,7 @@ extern void gart_iommu_shutdown(void); extern void __init gart_parse_options(char *); extern void early_gart_iommu_check(void); extern void gart_iommu_hole_init(void); +extern void set_up_gart_resume(u32, u32); extern int fallback_aper_order; extern int fallback_aper_force; extern int 
gart_iommu_aperture; ^ permalink raw reply [flat|nested] 25+ messages in thread
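The register arithmetic used by enable_gart_translations() and gart_resume() in the patch above can be checked outside the kernel. What follows is only a minimal user-space sketch in C, assuming the 0x90/0x94/0x98 offsets and the shifts exactly as they appear in the patch. The helper names (gatt_table_reg, ctl_enable_translation, ctl_from_order, base_from_alloc) are hypothetical and do not exist in the kernel.

```c
#include <stdint.h>

/* Config-space offsets as written in the patch (north bridge function 3). */
#define GART_APERTURE_CTL  0x90
#define GART_APERTURE_BASE 0x94
#define GART_TABLE_BASE    0x98

/* Value written to 0x98: GATT physical address in 4 KB pages, shifted left by 4. */
uint32_t gatt_table_reg(uint64_t gatt_phys)
{
    uint32_t reg = (uint32_t)(gatt_phys >> 12);

    reg <<= 4;
    return reg;
}

/* Read-modify-write applied to 0x90: set bit 0, clear bits 4 and 5. */
uint32_t ctl_enable_translation(uint32_t ctl)
{
    ctl |= 1;
    ctl &= ~((1u << 4) | (1u << 5));
    return ctl;
}

/* Value written to 0x90 on resume before translations are enabled. */
uint32_t ctl_from_order(uint32_t aper_order)
{
    return aper_order << 1;
}

/* Value written to 0x94 on resume: aperture address in 32 MB units. */
uint32_t base_from_alloc(uint32_t aper_alloc)
{
    return aper_alloc >> 25;
}
```

As a usage example, an aperture allocated at 0x20000000 gives base_from_alloc(0x20000000) == 16, and a GATT at physical address 0x12345000 gives gatt_table_reg(0x12345000) == 0x123450. Both input values are made up for illustration.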
[parent not found: <20080609124630.GA28799@elte.hu>]
* [PATCH] x86 GART: Add resume handling (was: Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption) [not found] ` <20080609124630.GA28799@elte.hu> @ 2008-06-09 22:10 ` Rafael J. Wysocki 2008-06-10 10:03 ` Rafael J. Wysocki 0 siblings, 1 reply; 25+ messages in thread From: Rafael J. Wysocki @ 2008-06-09 22:10 UTC (permalink / raw) To: Ingo Molnar; +Cc: Pavel Machek, the arch/x86 maintainers, pm list, LKML On Monday, 9 of June 2008, Ingo Molnar wrote: > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > > Looks ok to me. > > > > Still, it may be improved. :-) > > > > First, we shouldn't mix the "early PCI config access" thing with the > > "normal" method. Second, we don't have to check for the K8 north > > bridges on resume, because we already know where they are in the > > configuration space and we can use this information. > > > > Updated patch follows. It has been tested a little on my new 4 GB > > test box on which 2.6.26-rc4 failed miserably with severe > > consequences. More testing welcome, but please be careful. > > Rafael, could we try this against the tip/x86/gart tree perhaps? It > already has a couple of fixes from Pavel and your patch collides with > them in a non-obvious way. > > http://people.redhat.com/mingo/tip.git/README Okay, appended is the patch rebased on tip/x86/gart with (mainline) commit cd76374e9de4501acc74f833dc6cb5e7a5dca115 "suspend-vs-iommu: prevent suspend if we could not resume" (which appears to be missing from tip/x86/gart) applied. This version of the patch doesn't break compilation, but it hasn't been really tested yet. Thanks, Rafael --- Add resume handling to GART IOMMU. Signed-off-by: Rafael J. 
Wysocki <rjw@sisk.pl> --- arch/x86/kernel/aperture_64.c | 2 + arch/x86/kernel/pci-gart_64.c | 57 ++++++++++++++++++++++++++++++++++++++---- include/asm-x86/gart.h | 1 3 files changed, 55 insertions(+), 5 deletions(-) Index: tip.git/arch/x86/kernel/aperture_64.c =================================================================== --- tip.git.orig/arch/x86/kernel/aperture_64.c +++ tip.git/arch/x86/kernel/aperture_64.c @@ -496,4 +496,6 @@ out: write_pci_config(bus, slot, 3, AMD64_GARTAPERTUREBASE, aper_alloc >> 25); } } + + set_up_gart_resume(aper_order, aper_alloc); } Index: tip.git/arch/x86/kernel/pci-gart_64.c =================================================================== --- tip.git.orig/arch/x86/kernel/pci-gart_64.c +++ tip.git/arch/x86/kernel/pci-gart_64.c @@ -549,14 +549,63 @@ static __init unsigned read_aperture(str return aper_base; } +static void enable_gart_translations(void) +{ + int i; + + for (i = 0; i < num_k8_northbridges; i++) { + struct pci_dev *dev = k8_northbridges[i]; + + enable_gart_translation(dev, __pa(agp_gatt_table)); + } +} + +/* + * If fix_up_north_bridges is set, the north bridges have to be fixed up on + * resume in the same way as they are handled in gart_iommu_hole_init(). + */ +static bool fix_up_north_bridges; +static u32 aperture_order; +static u32 aperture_alloc; + +void set_up_gart_resume(u32 aper_order, u32 aper_alloc) +{ + fix_up_north_bridges = true; + aperture_order = aper_order; + aperture_alloc = aper_alloc; +} + static int gart_resume(struct sys_device *dev) { + printk(KERN_INFO "PCI-DMA: Resuming GART IOMMU\n"); + + if (fix_up_north_bridges) { + int i; + + printk(KERN_INFO "PCI-DMA: Restoring GART aperture settings\n"); + + for (i = 0; i < num_k8_northbridges; i++) { + struct pci_dev *dev = k8_northbridges[i]; + + /* + * Don't enable translations just yet. That is the next + * step. Restore the pre-suspend aperture settings. 
+ */ + pci_write_config_dword(dev, AMD64_GARTAPERTURECTL, + aperture_order << 1); + pci_write_config_dword(dev, AMD64_GARTAPERTUREBASE, + aperture_alloc >> 25); + } + } + + enable_gart_translations(); + return 0; } static int gart_suspend(struct sys_device *dev, pm_message_t state) { - return -EINVAL; + return 0; } static struct sysdev_class gart_sysdev_class = { @@ -614,16 +663,14 @@ static __init int init_k8_gatt(struct ag memset(gatt, 0, gatt_size); agp_gatt_table = gatt; - for (i = 0; i < num_k8_northbridges; i++) { - dev = k8_northbridges[i]; - enable_gart_translation(dev, __pa(gatt)); - } + enable_gart_translations(); error = sysdev_class_register(&gart_sysdev_class); if (!error) error = sysdev_register(&device_gart); if (error) panic("Could not register gart_sysdev -- would corrupt data on next suspend"); + flush_gart(); printk(KERN_INFO "PCI-DMA: aperture base @ %x size %u KB\n", Index: tip.git/include/asm-x86/gart.h =================================================================== --- tip.git.orig/include/asm-x86/gart.h +++ tip.git/include/asm-x86/gart.h @@ -14,6 +14,7 @@ extern void gart_iommu_shutdown(void); extern void __init gart_parse_options(char *); extern void early_gart_iommu_check(void); extern void gart_iommu_hole_init(void); +extern void set_up_gart_resume(u32, u32); extern int fallback_aper_order; extern int fallback_aper_force; extern int gart_iommu_aperture; ^ permalink raw reply [flat|nested] 25+ messages in thread
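The ordering in gart_resume() above (first restore the aperture settings if the kernel had to fix up the north bridges at boot, then re-enable translations) can be played through in user space. The following is only a sketch in C under stated assumptions: struct fake_nb, nbs, gatt_phys and simulate_suspend() are hypothetical stand-ins for the PCI config space and __pa(agp_gatt_table), the registers are assumed to read back as zero after suspend, and the body of enable_gart_translation() is assumed to match the open-coded loop from the earlier posting, since the helper itself is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_NB 2 /* hypothetical number of K8 north bridges */

/* Hypothetical stand-in for the three config registers the patch touches. */
struct fake_nb {
    uint32_t aperture_ctl;  /* AMD64_GARTAPERTURECTL */
    uint32_t aperture_base; /* AMD64_GARTAPERTUREBASE */
    uint32_t table_base;    /* GATT base register */
};

struct fake_nb nbs[NUM_NB];
uint64_t gatt_phys; /* stands in for __pa(agp_gatt_table) */

bool fix_up_north_bridges;
uint32_t aperture_order;
uint32_t aperture_alloc;

/* Records the boot-time aperture settings so that resume can replay them. */
void set_up_gart_resume(uint32_t aper_order, uint32_t aper_alloc)
{
    fix_up_north_bridges = true;
    aperture_order = aper_order;
    aperture_alloc = aper_alloc;
}

/* Assumed equivalent of the open-coded loop body from the earlier posting. */
void enable_gart_translation(struct fake_nb *nb, uint64_t addr)
{
    nb->table_base = (uint32_t)(addr >> 12) << 4;
    nb->aperture_ctl |= 1;
    nb->aperture_ctl &= ~((1u << 4) | (1u << 5));
}

/* Simplification: all register contents are lost across suspend. */
void simulate_suspend(void)
{
    memset(nbs, 0, sizeof(nbs));
}

int gart_resume(void)
{
    int i;

    if (fix_up_north_bridges) {
        /* Step 1: restore the aperture settings, translations still off. */
        for (i = 0; i < NUM_NB; i++) {
            nbs[i].aperture_ctl = aperture_order << 1;
            nbs[i].aperture_base = aperture_alloc >> 25;
        }
    }

    /* Step 2: point every north bridge at the GATT and enable translation. */
    for (i = 0; i < NUM_NB; i++)
        enable_gart_translation(&nbs[i], gatt_phys);

    return 0;
}
```

Calling set_up_gart_resume(1, 0x20000000), then simulate_suspend() and gart_resume(), leaves every fake north bridge with the aperture registers restored and bit 0 of the control register set. The input values are made up for illustration.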
* Re: [PATCH] x86 GART: Add resume handling (was: Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption) 2008-06-09 22:10 ` [PATCH] x86 GART: Add resume handling (was: Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption) Rafael J. Wysocki @ 2008-06-10 10:03 ` Rafael J. Wysocki 2008-06-12 9:34 ` Ingo Molnar 0 siblings, 1 reply; 25+ messages in thread From: Rafael J. Wysocki @ 2008-06-10 10:03 UTC (permalink / raw) To: Ingo Molnar; +Cc: Pavel Machek, the arch/x86 maintainers, pm list, LKML On Tuesday, 10 of June 2008, Rafael J. Wysocki wrote: > On Monday, 9 of June 2008, Ingo Molnar wrote: > > > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > > > > Looks ok to me. > > > > > > Still, it may be improved. :-) > > > > > > First, we shouldn't mix the "early PCI config access" thing with the > > > "normal" method. Second, we don't have to check for the K8 north > > > bridges on resume, because we already know where they are in the > > > configuration space and we can use this information. > > > > > > Updated patch follows. It has been tested a little on my new 4 GB > > > test box on which 2.6.26-rc4 failed miserably with severe > > > consequences. More testing welcome, but please be careful. > > > > Rafael, could we try this against the tip/x86/gart tree perhaps? It > > already has a couple of fixes from Pavel and your patch collides with > > them in a non-obvious way. > > > > http://people.redhat.com/mingo/tip.git/README > > Okay, appended is the patch rebased on tip/x86/gart with (mainline) commit > cd76374e9de4501acc74f833dc6cb5e7a5dca115 "suspend-vs-iommu: prevent suspend if > we could not resume" (which appears to be missing from tip/x86/gart) applied. > > This version of the patch doesn't break compilation, but it hasn't been really > tested yet. Now it has been (successfully) tested too. :-) Thanks, Rafael -- "Premature optimization is the root of all evil." - Donald Knuth ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] x86 GART: Add resume handling (was: Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption) 2008-06-10 10:03 ` Rafael J. Wysocki @ 2008-06-12 9:34 ` Ingo Molnar 0 siblings, 0 replies; 25+ messages in thread From: Ingo Molnar @ 2008-06-12 9:34 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Pavel Machek, the arch/x86 maintainers, pm list, LKML * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > Okay, appended is the patch rebased on tip/x86/gart with (mainline) > > commit cd76374e9de4501acc74f833dc6cb5e7a5dca115 "suspend-vs-iommu: > > prevent suspend if we could not resume" (which appears to be missing > > from tip/x86/gart) applied. > > > > This version of the patch doesn't break compilation, but it hasn't > > been really tested yet. > > Now it has been (successfully) tested too. :-) applied to tip/x86/gart, thanks Rafael. Ingo ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption 2008-06-08 22:36 ` Rafael J. Wysocki [not found] ` <20080609124630.GA28799@elte.hu> @ 2008-06-11 11:43 ` Patrick 2008-06-11 14:38 ` Rafael J. Wysocki 1 sibling, 1 reply; 25+ messages in thread From: Patrick @ 2008-06-11 11:43 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pavel Machek, linux-kernel, Ingo Molnar, pm list, Andi Kleen Hello On Mon, 2008-06-09 at 00:36 +0200, Rafael J. Wysocki wrote: > > > It's on top of the patch that adds the GART sysdev. > Updated patch follows. It has been tested a little on my new 4 GB test box on > which 2.6.26-rc4 failed miserably with severe consequences. More testing > welcome, but please be careful. No risk no fun! :-) As the suspend-vs-iommu-prevent-suspend-if-we-could-not-resume.patch, where the GART sysdev is added, is now included in 2.6.26-rc5, I was able to apply this one seamlessly to it (-rc5 tree) and have just rebooted *, suspended to disk and ram once with all filesystems (>400GB) mounted rw ** and everything is _working fine_. As usual, a kernel log, the applied patch and a test report are available at http://zefir.890m.com/kernel-testing/ . pat@babar:~/tmp/dmesg$ grep -C3 GART dmesg.2.6.26-rc5-gart-suspend.txt [ 0.294419] PCI: Using ACPI for IRQ routing [ 0.314420] PCI-DMA: Disabling AGP. [ 0.314420] PCI-DMA: aperture base @ 4000000 size 65536 KB [ 0.314420] PCI-DMA: using GART IOMMU. [ 0.314420] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture [ 0.314420] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0 [ 0.314420] hpet0: 4 32-bit timers, 14318180 Hz -- [ 221.770810] CPU1 is down [ 221.770966] PM: Creating hibernation image: [ 221.772014] PM: Need to copy 220260 pages [ 221.772014] PCI-DMA: Resuming GART IOMMU [ 221.772014] Enabling non-boot CPUs ... [ 221.772014] CPU0 attaching NULL sched-domain. [ 221.781073] SMP alternatives: switching to SMP code -- [ 251.756678] groups: 0 [ 251.756984] CPU1 is down [ 251.756984] Back to C! [ 251.756988] PCI-DMA: Resuming GART IOMMU [ 251.757355] Enabling non-boot CPUs ... [ 251.757515] CPU0 attaching NULL sched-domain. [ 251.766920] SMP alternatives: switching to SMP code * Rebooted for the first time since I compiled and rebooted -rc4 with a similar "home-grown" patch on 31 May, with which I suspended several times to disk and ram without any problems. ** and also generally the same setup where suspending has disastrous effects using kernel < 2.6.26-rc5 ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption 2008-06-11 11:43 ` >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption Patrick @ 2008-06-11 14:38 ` Rafael J. Wysocki 2008-06-11 15:04 ` Andi Kleen 0 siblings, 1 reply; 25+ messages in thread From: Rafael J. Wysocki @ 2008-06-11 14:38 UTC (permalink / raw) To: Patrick; +Cc: Pavel Machek, linux-kernel, Ingo Molnar, pm list, Andi Kleen On Wednesday, 11 of June 2008, Patrick wrote: > Hello Hi, > On Mon, 2008-06-09 at 00:36 +0200, Rafael J. Wysocki wrote: > > > > > It's on top of the patch that adds the GART sysdev. > > > Updated patch follows. It has been tested a little on my new 4 GB test box on > > which 2.6.26-rc4 failed miserably with severe consequences. More testing > > welcome, but please be careful. > > No risk no fun! :-) > > As the suspend-vs-iommu-prevent-suspend-if-we-could-not-resume.patch, > where the GART sysdev is added, is now included in 2.6.26-rc5, I was > able to apply this one seamlessly to it (-rc5 tree) and have just > rebooted *, suspended to disk and ram once with all filesystems (>400GB) > mounted rw ** and everything is _working fine_. > > As usual, a kernel log, the applied patch and a test report are > available at http://zefir.890m.com/kernel-testing/ . Thanks for the testing! Well, I was hoping to be able to get this patch into 2.6.26, as I don't really like the temporary hack, preventing the affected systems from resuming at all, that we have in there, but it seems to be too late. :-( Hopefully, we'll get it into 2.6.27. Thanks, Rafael ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption 2008-06-11 14:38 ` Rafael J. Wysocki @ 2008-06-11 15:04 ` Andi Kleen 2008-07-03 17:35 ` Patrick 0 siblings, 1 reply; 25+ messages in thread From: Andi Kleen @ 2008-06-11 15:04 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Patrick, Pavel Machek, linux-kernel, Ingo Molnar, pm list > Well, I was hoping to be able to get this patch into 2.6.26, as I don't really > like the temporary hack, preventing the affected systems from resuming > at all, that we have in there, but it seems to be too late. :-( I agree with you that this patch would be far better than the hack. -Andi ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption 2008-06-11 15:04 ` Andi Kleen @ 2008-07-03 17:35 ` Patrick 2008-08-07 8:17 ` Pavel Machek 0 siblings, 1 reply; 25+ messages in thread From: Patrick @ 2008-07-03 17:35 UTC (permalink / raw) To: Andi Kleen Cc: Rafael J. Wysocki, Pavel Machek, linux-kernel, Ingo Molnar, pm list *bump* Why "too late" ??? It (2.6.26) is still not out! Please (;-) oh please!) reconsider / push it a bit! It's tested and has been working perfectly here ever since. But I'd like, for example, vmware to work again as well, and it doesn't yet even with 2.6.25... Please let this patch go into 2.6.26 so there's a chance of having a stock kernel with working vmware, fglrx, and such again soon! Thx Greets - Patrick On Wed, 2008-06-11 at 17:04 +0200, Andi Kleen wrote: > > Well, I was hoping to be able to get this patch into 2.6.26, as I don't really > > like the temporary hack, preventing the affected systems from resuming > > at all, that we have in there, but it seems to be too late. :-( > > I agree with you that this patch would be far better than the hack. > > -Andi ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption 2008-07-03 17:35 ` Patrick @ 2008-08-07 8:17 ` Pavel Machek 2008-08-08 22:40 ` Patrick 0 siblings, 1 reply; 25+ messages in thread From: Pavel Machek @ 2008-08-07 8:17 UTC (permalink / raw) To: Patrick; +Cc: Andi Kleen, Rafael J. Wysocki, linux-kernel, Ingo Molnar, pm list Hi! > *bump* > > Why "too late" ??? It (2.6.26) is still not out! Please (;-) oh please!) > reconsider / push it a bit! It's tested and working perfectly here ever > since. But I'd like for example vmware to work again as well and it > doesn't yet even with 2.6.25... Please let this patch go into 2.6.26 so > there's a chance of having a stock kernel with working vmware, fglrx, > and such again soon! Can you test 2.6.27-rc2? It should be all there... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption 2008-08-07 8:17 ` Pavel Machek @ 2008-08-08 22:40 ` Patrick 2008-09-02 8:05 ` Pavel Machek 0 siblings, 1 reply; 25+ messages in thread From: Patrick @ 2008-08-08 22:40 UTC (permalink / raw) To: Pavel Machek Cc: Andi Kleen, Rafael J. Wysocki, linux-kernel, Ingo Molnar, pm list On Thu, 2008-08-07 at 10:17 +0200, Pavel Machek wrote: > Can you test 2.6.27-rc2? It should be all there... Ja, sure! I noticed that it's there already... thank you all! So as you requested, I got myself a -rc2 tree, compiled and tested #s2ram and #s2disk. It works! pat@babar:~/tmp/dmesg$ grep -C3 GART dmesg.2.6.27-rc2-gart-suspend.txt [ 0.208013] PCI: Using ACPI for IRQ routing [ 0.228126] PCI-DMA: Disabling AGP. [ 0.228918] PCI-DMA: aperture base @ 20000000 size 65536 KB [ 0.228956] PCI-DMA: using GART IOMMU. [ 0.228993] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture [ 0.229429] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0 [ 0.229615] hpet0: 4 32-bit timers, 14318180 Hz -- [ 45.170754] [drm] Initialized drm 1.1.0 20060810 [ 45.199305] pci 0000:01:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18 [ 45.199447] [drm] Initialized radeon 1.29.0 20080528 on minor 0 [ 46.127083] [drm] Setting GART location based on new memory map [ 46.128083] [drm] Loading RS690 Microcode [ 46.128116] [drm] Num pipes: 1 [ 46.128123] [drm] writeback test succeeded in 1 usecs -- [ 5341.902040] groups: 0 [ 5341.902511] CPU1 is down [ 5341.902511] Back to C! [ 5341.902535] PCI-DMA: Resuming GART IOMMU [ 5341.902537] PCI-DMA: Restoring GART aperture settings [ 5341.902831] Enabling non-boot CPUs ... 
[ 5341.903244] SMP alternatives: switching to SMP code [ 5341.914231] Booting processor 1/1 ip 6000 -- [ 5778.246557] CPU1 is down [ 5778.246726] PM: Creating hibernation image: [ 5778.256008] PM: Need to copy 236714 pages [ 5778.256008] PCI-DMA: Resuming GART IOMMU [ 5778.256008] PCI-DMA: Restoring GART aperture settings [ 5778.256008] Enabling non-boot CPUs ... [ 5778.256008] SMP alternatives: switching to SMP code [ 5778.267186] Booting processor 1/1 ip 6000 -------- The full log and a small test report are here: http://zefir.890m.com/kernel-testing/ http://zefir.890m.com/kernel-testing/dmesg.2.6.27-rc2.txt I have no clue what caused the oops (see dmesg), but I think it's unrelated (look at the time stamps). Maybe it's got something to do with the fact that I'm using the https://launchpad.net/~xorg-edgers xserver now to have 3d acceleration working for my radeon X1200 (RS690 builtin) and the drm module versions don't match. Xserver wants 1.30 but starts anyway. I can't fix this because the script can't compile the git://anongit.freedesktop.org/git/mesa/drm modules for 2.6.27-rc2 .... This creates some unpredictable problems with the X server process when I log into my account (can't find out what triggers it ... cpu usage goes to 100%, have to reboot). As a workaround, I could log into a guest account to do the tests, but I switched back to my patched 2.6.26-rc5 for now for everyday PC usage. BTW, if somebody would like to explain to me why the simultaneous uses of the GART by drm and as iommu DON'T bite each other, I'd be quite interested to learn. Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-08-08 22:40 ` Patrick
@ 2008-09-02  8:05 ` Pavel Machek
  0 siblings, 0 replies; 25+ messages in thread
From: Pavel Machek @ 2008-09-02 8:05 UTC
To: Patrick; +Cc: Andi Kleen, Rafael J. Wysocki, linux-kernel, Ingo Molnar, pm list

Hi!

> > Can you test 2.6.27-rc2? It should be all there...
>
> Ja, sure! I noticed that it's there already... thank you all!
>
> So as you requested, I got myself a -rc2 tree, compiled and tested
> #s2ram and #s2disk. It works!

Good.

> Maybe it's got something to do with the fact that I'm using
> https://launchpad.net/~xorg-edgers xserver now to have 3d acceleration
> working for my radeon X1200 (RS690 builtin) and the drm module versions
> don't match. Xserver wants 1.30 but starts anyway. I can't fix this
> because the script can't compile the
> git://anongit.freedesktop.org/git/mesa/drm modules for 2.6.27-rc2 ....
> This creates some unpredictable problems with the X server process when
> I log into my account (can't find out what triggers it ... cpu usage
> goes to 100%, have to reboot) As a workaround, I could log into a guest
> account to do the tests but switched back to my patched 2.6.26-rc5 for
> now for everyday PC usage.

Yes, 3d has some problems.

> BTW, if somebody would like to explain to me, why the simultaneous usage
> of the GART by drm and as iommu DON'T bite each other, I'd be quite
> interested to learn.

No idea, sorry.
								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption
  2008-05-25 21:10 ` Pavel Machek
  2008-05-26 15:31   ` Patrick
@ 2008-05-27 10:23   ` Pavel Machek
  1 sibling, 0 replies; 25+ messages in thread
From: Pavel Machek @ 2008-05-27 10:23 UTC
To: Patrick, Rafael J. Wysocki; +Cc: linux-kernel

Hi!

> > > > iommu problem? Try it with mem=3G.
> > >
> > > YES! :-) How did you know?
> >
> > Guess how... I hit it myself.
> >
> > > So how are we going to get this fixed???
> >
> > Write a patch, submit it? ;-).
>
> Can you try this one? It should prevent suspend in the broken cases,
> but allow it in mem=4G config.

Apply this on top of the previous patch, and you may get a working
system _and_ all the memory...

Add resume support to pci-gart_64.c. This is necessary for resume not
to corrupt the disk on >3GB machines.

Signed-off-by: Pavel Machek <pavel@suse.cz>

---
commit cc8201de538dda6c17e03fe495146e7fc755f64d
tree 9f4ece8312b59e6e14eb3a38d489ff37070d6cf1
parent db95a81f7f2106655c6ceb05a38300fd26f6ea3f
author Pavel <pavel@amd.ucw.cz> Tue, 27 May 2008 12:23:45 +0200
committer Pavel <pavel@amd.ucw.cz> Tue, 27 May 2008 12:23:45 +0200

 arch/x86/kernel/aperture_64.c |   45 +++++++++++++++++++++++++++--------------
 arch/x86/kernel/pci-gart_64.c |   28 ++++++++++++++++++++------
 drivers/char/agp/generic.c    |    2 +-
 include/asm-x86/gart.h        |    1 +
 4 files changed, 54 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index 2088b6a..2571dc4 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -302,6 +302,32 @@ void __init early_gart_iommu_check(void)
 
 }
 
+u32 fix_aper_enabled, fix_aper_order, fix_aper_alloc;
+
+void fix_up_north_bridges(void)
+{
+	int num;
+	if (!fix_aper_enabled)
+		return;
+
+	/* Fix up the north bridges */
+	for (num = 24; num < 32; num++) {
+		if (!early_is_k8_nb(read_pci_config(0, num, 3, 0x00)))
+			continue;
+
+		/*
+		 * Don't enable translation yet. That is done later
+		 * by enable_gart_translation.
+		 *
+		 * Assume this BIOS didn't initialise the GART so
+		 * just overwrite all previous bits
+		 */
+		write_pci_config(0, num, 3, AMD64_GARTAPERTURECTL, fix_aper_order<<1);
+		write_pci_config(0, num, 3, AMD64_GARTAPERTUREBASE, fix_aper_alloc>>25);
+	}
+}
+
+
 void __init gart_iommu_hole_init(void)
 {
 	u32 aper_size, aper_alloc = 0, aper_order = 0, last_aper_order = 0;
@@ -393,19 +419,8 @@ void __init gart_iommu_hole_init(void)
 		return;
 	}
 
-	/* Fix up the north bridges */
-	for (num = 24; num < 32; num++) {
-		if (!early_is_k8_nb(read_pci_config(0, num, 3, 0x00)))
-			continue;
-
-		/*
-		 * Don't enable translation yet. That is done later
-		 * by enable_gart_translation.
-		 *
-		 * Assume this BIOS didn't initialise the GART so
-		 * just overwrite all previous bits
-		 */
-		write_pci_config(0, num, 3, AMD64_GARTAPERTURECTL, aper_order<<1);
-		write_pci_config(0, num, 3, AMD64_GARTAPERTUREBASE, aper_alloc>>25);
-	}
+	fix_aper_enabled = 1;
+	fix_aper_order = aper_order;
+	fix_aper_alloc = aper_alloc;
+	fix_up_north_bridges();
 }
diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index 926af9c..dbd3000 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -549,14 +549,27 @@ static __init unsigned read_aperture(str
 	return aper_base;
 }
 
+static void enable_gart_translations(void)
+{
+	int i;
+	struct pci_dev *dev;
+
+	for (i = 0; i < num_k8_northbridges; i++) {
+		dev = k8_northbridges[i];
+		enable_gart_translation(dev, __pa(agp_gatt_table));
+	}
+}
+
 static int gart_resume(struct sys_device *dev)
 {
+	fix_up_north_bridges();
+	enable_gart_translations();
 	return 0;
 }
 
 static int gart_suspend(struct sys_device *dev, pm_message_t state)
 {
-	return -EINVAL;
+	return 0;
 }
 
 static struct sysdev_class gart_sysdev_class = {
@@ -571,6 +584,7 @@ static struct sys_device device_gart = {
 	.cls = &gart_sysdev_class,
 };
 
+
 /*
  * Private Northbridge GATT initialization in case we cannot use the
  * AGP driver for some reason.
@@ -614,11 +628,8 @@ static __init int init_k8_gatt(struct ag
 	memset(gatt, 0, gatt_size);
 	agp_gatt_table = gatt;
 
-	for (i = 0; i < num_k8_northbridges; i++) {
-		dev = k8_northbridges[i];
-		enable_gart_translation(dev, __pa(gatt));
-	}
-
+	enable_gart_translations();
+
 	error = sysdev_class_register(&gart_sysdev_class);
 	if (!error)
 		error = sysdev_register(&device_gart);
@@ -651,6 +662,11 @@ static const struct dma_mapping_ops gart
 	.unmap_sg = gart_unmap_sg,
 };
 
+/* Called from native_machine_shutdown; should this use regular
+ * shutdown call from sysdev? Why is this needed at all? Some broken
+ * BIOS can't cope with gart enabled during reboot?
+ */
+
 void gart_iommu_shutdown(void)
 {
 	struct pci_dev *dev;
diff --git a/drivers/char/agp/generic.c b/drivers/char/agp/generic.c
index 7fc0c99..7fb4d5b 100644
--- a/drivers/char/agp/generic.c
+++ b/drivers/char/agp/generic.c
@@ -43,7 +43,7 @@
 #include <asm/cacheflush.h>
 #include <asm/pgtable.h>
 #include "agp.h"
-__u32 *agp_gatt_table;
+u32 *agp_gatt_table;
 int agp_memory_reserved;
 
 /*
diff --git a/include/asm-x86/gart.h b/include/asm-x86/gart.h
index f37d83b..6f27b14 100644
--- a/include/asm-x86/gart.h
+++ b/include/asm-x86/gart.h
@@ -93,5 +93,6 @@ static inline int __aperture_valid(u64 a
 	return 1;
 }
 
+extern void fix_up_north_bridges(void);
 
 #endif

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
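
For readers following the register arithmetic in fix_up_north_bridges(), here is a minimal standalone sketch (not part of the patch) of how the two values written to each northbridge appear to be formed. It assumes the K8 GART layout as the kernel's aperture code uses it: the aperture size is 32 MB << order, the order sits in bits 3:1 of the aperture control register with bit 0 as the translation-enable bit, and the aperture base register holds the physical base in 32 MB units. The names gart_ctl_value, gart_base_value and gart_aper_size are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the smallest K8 GART aperture is 32 MB (1 << 25 bytes). */
#define APER_MIN_SHIFT 25u

/*
 * Value for the aperture control register, mirroring the patch's
 * `fix_aper_order << 1`: the order goes into bits 3:1 and bit 0
 * (translation enable) stays clear until enable_gart_translation().
 */
uint32_t gart_ctl_value(uint32_t order)
{
	assert(order <= 7);	/* the order field is three bits wide */
	return order << 1;
}

/*
 * Value for the aperture base register, mirroring the patch's
 * `fix_aper_alloc >> 25`: the physical base expressed in 32 MB units.
 */
uint32_t gart_base_value(uint32_t aper_alloc)
{
	return aper_alloc >> APER_MIN_SHIFT;
}

/* Aperture size in bytes for a given order: 32 MB doubled `order` times. */
uint64_t gart_aper_size(uint32_t order)
{
	assert(order <= 7);
	return ((uint64_t)1 << APER_MIN_SHIFT) << order;
}
```

With the values from Patrick's 2.6.27-rc2 dmesg (aperture base @ 20000000, size 65536 KB), the order is 1, so gart_ctl_value(1) gives 0x2 and gart_base_value(0x20000000) gives 0x10. The patch saves exactly these two inputs (fix_aper_order, fix_aper_alloc) at boot so that gart_resume() can write the same values again after the northbridge has lost its state.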
Thread overview: 25+ messages
2008-05-25 12:10 SB600 AHCI: Hard Disk Corruption Patrick
2008-05-25 12:16 ` Patrick
2008-05-25 17:38 ` Pavel Machek
2008-05-25 20:08 ` Patrick
2008-05-25 20:39 ` >3G => iommu => suspend problems -- was " Pavel Machek
2008-05-25 21:10 ` Pavel Machek
2008-05-26 15:31 ` Patrick
2008-05-27 11:22 ` Pavel Machek
2008-05-29 18:44 ` Patrick
2008-05-29 18:51 ` Patrick
2008-05-29 21:05 ` Patrick
2008-06-03 22:33 ` Rafael J. Wysocki
2008-06-06 13:20 ` Pavel Machek
2008-06-08 22:36 ` Rafael J. Wysocki
[not found] ` <20080609124630.GA28799@elte.hu>
2008-06-09 22:10 ` [PATCH] x86 GART: Add resume handling (was: Re: >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption) Rafael J. Wysocki
2008-06-10 10:03 ` Rafael J. Wysocki
2008-06-12 9:34 ` Ingo Molnar
2008-06-11 11:43 ` >3G => iommu => suspend problems -- was Re: SB600 AHCI: Hard Disk Corruption Patrick
2008-06-11 14:38 ` Rafael J. Wysocki
2008-06-11 15:04 ` Andi Kleen
2008-07-03 17:35 ` Patrick
2008-08-07 8:17 ` Pavel Machek
2008-08-08 22:40 ` Patrick
2008-09-02 8:05 ` Pavel Machek
2008-05-27 10:23 ` Pavel Machek