* [PATCH] [29/82] x86_64: Rename IOMMU option, fix help and mark option embedded.
@ 2006-06-24 0:19 Andi Kleen
2006-06-24 23:14 ` Alan Cox
0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2006-06-24 0:19 UTC (permalink / raw)
To: torvalds; +Cc: discuss, akpm, markh, alan, linux-scsi
- Rename the GART_IOMMU option to IOMMU to make clear it's not
just for AMD
- Rewrite the help text to better emphasize this fact
- Make it an embedded option because too many people get it wrong.
To my astonishment I discovered the aacraid driver tests this
symbol directly. This looks quite broken to me - it's an internal
implementation detail of the PCI DMA API. Can the maintainer
please clarify what this test was intended to do?
Cc: linux-scsi@vger.kernel.org
Cc: alan@redhat.com
Cc: markh@osdl.org
Signed-off-by: Andi Kleen <ak@suse.de>
---
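As a rough illustration of the "only active when needed" rule in the new
help text, the condition can be modeled in user-space C. This is a sketch
under assumptions, not the kernel code: iommu_needed() and its parameters
are hypothetical names, 4 KiB pages are assumed, and the memory test only
mirrors the "end_pfn > MAX_DMA32_PFN || force_iommu" check visible in the
io_apic.c hunk of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12u /* assumption: 4 KiB pages, as on x86-64 */

/* Page frame number of the 4 GiB boundary (modeled on MAX_DMA32_PFN). */
static uint64_t max_dma32_pfn(void)
{
	return (UINT64_C(1) << 32) >> PAGE_SHIFT;
}

/*
 * Hypothetical helper, not a kernel function.
 *   end_pfn:     one past the highest page frame of RAM
 *   dma_mask:    highest bus address the device can generate
 *   force_iommu: models iommu=force or CONFIG_IOMMU_DEBUG
 */
static bool iommu_needed(uint64_t end_pfn, uint64_t dma_mask, bool force_iommu)
{
	assert(end_pfn > 0);

	bool enough_memory = end_pfn > max_dma32_pfn();  /* RAM above 4 GiB */
	bool limited_device = dma_mask <= UINT32_MAX;    /* 32-bit-only device */

	return force_iommu || (enough_memory && limited_device);
}
```

Under this model a 6GB machine with a 32-bit-only device needs remapping,
while the same device on a 2GB machine does not unless forced.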
arch/x86_64/Kconfig | 28 +++++++++++++++-------------
arch/x86_64/Kconfig.debug | 2 +-
arch/x86_64/kernel/Makefile | 2 +-
arch/x86_64/kernel/io_apic.c | 2 +-
arch/x86_64/kernel/pci-dma.c | 2 +-
arch/x86_64/kernel/setup.c | 2 +-
drivers/char/agp/Kconfig | 4 ++--
drivers/char/agp/amd64-agp.c | 4 ++--
drivers/scsi/aacraid/comminit.c | 5 ++++-
include/asm-x86_64/pci.h | 2 +-
include/asm-x86_64/proto.h | 2 +-
11 files changed, 30 insertions(+), 25 deletions(-)
Index: linux/arch/x86_64/Kconfig
===================================================================
--- linux.orig/arch/x86_64/Kconfig
+++ linux/arch/x86_64/Kconfig
@@ -386,24 +386,26 @@ config HPET_EMULATE_RTC
bool "Provide RTC interrupt"
depends on HPET_TIMER && RTC=y
-config GART_IOMMU
- bool "K8 GART IOMMU support"
+# Mark as embedded because too many people got it wrong.
+# The code disables itself when not needed.
+config IOMMU
+ bool "IOMMU support" if EMBEDDED
default y
select SWIOTLB
select AGP
depends on PCI
help
- Support for hardware IOMMU in AMD's Opteron/Athlon64 Processors
- and for the bounce buffering software IOMMU.
- Needed to run systems with more than 3GB of memory properly with
- 32-bit PCI devices that do not support DAC (Double Address Cycle).
- The IOMMU can be turned off at runtime with the iommu=off parameter.
- Normally the kernel will take the right choice by itself.
- This option includes a driver for the AMD Opteron/Athlon64 IOMMU
- northbridge and a software emulation used on other systems without
- hardware IOMMU. If unsure, say Y.
+ Support for full DMA access of devices with 32bit memory access only
+ on systems with more than 3GB. This is usually needed for USB,
+ sound, many IDE/SATA chipsets and some other devices.
+ Provides a driver for the AMD Athlon64/Opteron/Turion/Sempron GART
+ based IOMMU and a software bounce buffer based IOMMU used on Intel
+ systems and as fallback.
+ The code is only active when needed (enough memory and limited
+ device) unless CONFIG_IOMMU_DEBUG or iommu=force is specified
+ too.
-# need this always selected by GART_IOMMU for the VIA workaround
+# need this always selected by IOMMU for the VIA workaround
config SWIOTLB
bool
@@ -503,7 +505,7 @@ config REORDER
config K8_NB
def_bool y
- depends on AGP_AMD64 || GART_IOMMU || (PCI && NUMA)
+ depends on AGP_AMD64 || IOMMU || (PCI && NUMA)
endmenu
Index: linux/arch/x86_64/Kconfig.debug
===================================================================
--- linux.orig/arch/x86_64/Kconfig.debug
+++ linux/arch/x86_64/Kconfig.debug
@@ -13,7 +13,7 @@ config DEBUG_RODATA
If in doubt, say "N".
config IOMMU_DEBUG
- depends on GART_IOMMU && DEBUG_KERNEL
+ depends on IOMMU && DEBUG_KERNEL
bool "Enable IOMMU debugging"
help
Force the IOMMU to on even when you have less than 4GB of
Index: linux/arch/x86_64/kernel/Makefile
===================================================================
--- linux.orig/arch/x86_64/kernel/Makefile
+++ linux/arch/x86_64/kernel/Makefile
@@ -28,7 +28,7 @@ obj-$(CONFIG_PM) += suspend.o
obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend_asm.o
obj-$(CONFIG_CPU_FREQ) += cpufreq/
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
-obj-$(CONFIG_GART_IOMMU) += pci-gart.o aperture.o
+obj-$(CONFIG_IOMMU) += pci-gart.o aperture.o
obj-$(CONFIG_SWIOTLB) += pci-swiotlb.o
obj-$(CONFIG_KPROBES) += kprobes.o
obj-$(CONFIG_X86_PM_TIMER) += pmtimer.o
Index: linux/arch/x86_64/kernel/io_apic.c
===================================================================
--- linux.orig/arch/x86_64/kernel/io_apic.c
+++ linux/arch/x86_64/kernel/io_apic.c
@@ -319,7 +319,7 @@ void __init check_ioapic(void)
vendor &= 0xffff;
switch (vendor) {
case PCI_VENDOR_ID_VIA:
-#ifdef CONFIG_GART_IOMMU
+#ifdef CONFIG_IOMMU
if ((end_pfn > MAX_DMA32_PFN ||
force_iommu) &&
!iommu_aperture_allowed) {
Index: linux/arch/x86_64/kernel/pci-dma.c
===================================================================
--- linux.orig/arch/x86_64/kernel/pci-dma.c
+++ linux/arch/x86_64/kernel/pci-dma.c
@@ -266,7 +266,7 @@ __init int iommu_setup(char *p)
swiotlb = 1;
#endif
-#ifdef CONFIG_GART_IOMMU
+#ifdef CONFIG_IOMMU
gart_parse_options(p);
#endif
Index: linux/arch/x86_64/kernel/setup.c
===================================================================
--- linux.orig/arch/x86_64/kernel/setup.c
+++ linux/arch/x86_64/kernel/setup.c
@@ -797,7 +797,7 @@ void __init setup_arch(char **cmdline_p)
e820_setup_gap();
-#ifdef CONFIG_GART_IOMMU
+#ifdef CONFIG_IOMMU
iommu_hole_init();
#endif
Index: linux/drivers/char/agp/Kconfig
===================================================================
--- linux.orig/drivers/char/agp/Kconfig
+++ linux/drivers/char/agp/Kconfig
@@ -55,9 +55,9 @@ config AGP_AMD
X on AMD Irongate, 761, and 762 chipsets.
config AGP_AMD64
- tristate "AMD Opteron/Athlon64 on-CPU GART support" if !GART_IOMMU
+ tristate "AMD Opteron/Athlon64 on-CPU GART support" if !IOMMU
depends on AGP && X86
- default y if GART_IOMMU
+ default y if IOMMU
help
This option gives you AGP support for the GLX component of
X using the on-CPU northbridge of the AMD Athlon64/Opteron CPUs.
Index: linux/drivers/char/agp/amd64-agp.c
===================================================================
--- linux.orig/drivers/char/agp/amd64-agp.c
+++ linux/drivers/char/agp/amd64-agp.c
@@ -292,7 +292,7 @@ static int __devinit aperture_valid(u64
/*
* W*s centric BIOS sometimes only set up the aperture in the AGP
* bridge, not the northbridge. On AMD64 this is handled early
- * in aperture.c, but when GART_IOMMU is not enabled or we run
+ * in aperture.c, but when IOMMU is not enabled or we run
* on a 32bit kernel this needs to be redone.
* Unfortunately it is impossible to fix the aperture here because it's too late
* to allocate that much memory. But at least error out cleanly instead of
@@ -775,7 +775,7 @@ static void __exit agp_amd64_cleanup(voi
/* On AMD64 the PCI driver needs to initialize this driver early
for the IOMMU, so it has to be called via a backdoor. */
-#ifndef CONFIG_GART_IOMMU
+#ifndef CONFIG_IOMMU
module_init(agp_amd64_init);
module_exit(agp_amd64_cleanup);
#endif
Index: linux/drivers/scsi/aacraid/comminit.c
===================================================================
--- linux.orig/drivers/scsi/aacraid/comminit.c
+++ linux/drivers/scsi/aacraid/comminit.c
@@ -104,8 +104,11 @@ static int aac_alloc_comm(struct aac_dev
* always true on real computers. It also has some slight problems
* with the GART on x86-64. I've btw never tried DMA from PCI space
* on this platform but don't be surprised if its problematic.
+ * [AK: something is very very wrong when a driver tests this symbol.
+ * Someone should figure out what the comment writer really meant here and fix
+ * the code. Or just remove that bad code. ]
*/
-#ifndef CONFIG_GART_IOMMU
+#ifndef CONFIG_IOMMU
if ((num_physpages << (PAGE_SHIFT - 12)) <= AAC_MAX_HOSTPHYSMEMPAGES) {
init->HostPhysMemPages =
cpu_to_le32(num_physpages << (PAGE_SHIFT-12));
Index: linux/include/asm-x86_64/pci.h
===================================================================
--- linux.orig/include/asm-x86_64/pci.h
+++ linux/include/asm-x86_64/pci.h
@@ -52,7 +52,7 @@ extern int iommu_setup(char *opt);
*/
#define PCI_DMA_BUS_IS_PHYS (dma_ops->is_phys)
-#ifdef CONFIG_GART_IOMMU
+#ifdef CONFIG_IOMMU
/*
* x86-64 always supports DAC, but sometimes it is useful to force
Index: linux/include/asm-x86_64/proto.h
===================================================================
--- linux.orig/include/asm-x86_64/proto.h
+++ linux/include/asm-x86_64/proto.h
@@ -116,7 +116,7 @@ extern int skip_ioapic_setup;
extern int acpi_ht;
extern int acpi_disabled;
-#ifdef CONFIG_GART_IOMMU
+#ifdef CONFIG_IOMMU
extern int fallback_aper_order;
extern int fallback_aper_force;
extern int iommu_aperture;
* Re: [PATCH] [29/82] x86_64: Rename IOMMU option, fix help and mark option embedded.
2006-06-24 0:19 [PATCH] [29/82] x86_64: Rename IOMMU option, fix help and mark option embedded Andi Kleen
@ 2006-06-24 23:14 ` Alan Cox
2006-06-25 15:22 ` Andi Kleen
0 siblings, 1 reply; 8+ messages in thread
From: Alan Cox @ 2006-06-24 23:14 UTC (permalink / raw)
To: Andi Kleen; +Cc: torvalds, discuss, akpm, markh, alan, linux-scsi
On Sat, Jun 24, 2006 at 02:19:56AM +0200, Andi Kleen wrote:
> To my astonishment I discovered the aacraid driver tests this
> symbol directly. This looks quite broken to me - it's an internal
> implementation detail of the PCI DMA API. Can the maintainer
> please clarify what this test was intended to do?
Shouldn't be to your astonishment, you were involved in the process
Older aacraid hardware cannot address the 3-4GB range where the iommu
remaps pages. As the PCI DMA implementation for the x86-64 is flawed and
doesn't support any nice way to deal with this via swiotlb instead the
driver handles it internally.
Yeah, it's ugly, but it's old hw and it's a one-off that can do without tangling
the core code. At least that's what you decided last time around ;)
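For reference, the calculation guarded by the #ifndef in comminit.c can be
modeled in user space roughly as follows. This is a sketch, not the driver
code: host_phys_mem_pages() is a hypothetical name, and the limit value
0xfffff is an assumption standing in for AAC_MAX_HOSTPHYSMEMPAGES.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for AAC_MAX_HOSTPHYSMEMPAGES (just under 4 GiB of 4 KiB pages). */
#define MAX_HOST_4K_PAGES UINT32_C(0xfffff)

/*
 * Hypothetical helper: convert a count of system pages into 4 KiB pages
 * and clamp it to the limit, as the non-IOMMU branch of the driver does.
 */
static uint32_t host_phys_mem_pages(uint64_t num_physpages, unsigned page_shift)
{
	assert(page_shift >= 12 && page_shift < 32);

	uint64_t pages_4k = num_physpages << (page_shift - 12);

	return pages_4k <= MAX_HOST_4K_PAGES ? (uint32_t)pages_4k
					     : MAX_HOST_4K_PAGES;
}
```

With the IOMMU option enabled, the driver skips the first branch and always
reports the clamped maximum.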
* Re: [PATCH] [29/82] x86_64: Rename IOMMU option, fix help and mark option embedded.
2006-06-24 23:14 ` Alan Cox
@ 2006-06-25 15:22 ` Andi Kleen
2006-06-25 19:00 ` Alan Cox
0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2006-06-25 15:22 UTC (permalink / raw)
To: Alan Cox; +Cc: torvalds, discuss, akpm, markh, linux-scsi, axboe
On Sunday 25 June 2006 01:14, Alan Cox wrote:
> On Sat, Jun 24, 2006 at 02:19:56AM +0200, Andi Kleen wrote:
> > To my astonishment I discovered the aacraid driver tests this
> > symbol directly. This looks quite broken to me - it's an internal
> > implementation detail of the PCI DMA API. Can the maintainer
> > please clarify what this test was intended to do?
>
> Shouldn't be to your astonishment, you were involved in the process
Can't remember, sorry.
> Older aacraid hardware cannot address the 3-4GB range where the iommu
> remaps pages. As the PCI DMA implementation for the x86-64 is flawed and
> doesn't support any nice way to deal with this via swiotlb instead the
> driver handles it internally.
Then you should just force a low bounce pfn < 0xfffffff for the block device -
then the block layer should use GFP_DMA bouncing.
-Andi
* Re: [PATCH] [29/82] x86_64: Rename IOMMU option, fix help and mark option embedded.
2006-06-25 15:22 ` Andi Kleen
@ 2006-06-25 19:00 ` Alan Cox
2006-06-25 19:32 ` Andi Kleen
0 siblings, 1 reply; 8+ messages in thread
From: Alan Cox @ 2006-06-25 19:00 UTC (permalink / raw)
To: Andi Kleen; +Cc: Alan Cox, torvalds, discuss, akpm, markh, linux-scsi, axboe
On Sun, Jun 25, 2006 at 05:22:23PM +0200, Andi Kleen wrote:
> > Older aacraid hardware cannot address the 3-4GB range where the iommu
> > remaps pages. As the PCI DMA implementation for the x86-64 is flawed and
> > doesn't support any nice way to deal with this via swiotlb instead the
> > driver handles it internally.
>
> Then you should just force a low bounce pfn < 0xfffffff for the block device -
> then the block layer should use GFP_DMA bouncing.
From a tiny 16MB DMA pool that can't sustain the required load ? Or has that
bit changed.
Alan
* Re: [PATCH] [29/82] x86_64: Rename IOMMU option, fix help and mark option embedded.
2006-06-25 19:00 ` Alan Cox
@ 2006-06-25 19:32 ` Andi Kleen
2006-06-25 22:21 ` Alan Cox
0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2006-06-25 19:32 UTC (permalink / raw)
To: Alan Cox; +Cc: torvalds, discuss, akpm, markh, linux-scsi, axboe
On Sunday 25 June 2006 21:00, Alan Cox wrote:
> On Sun, Jun 25, 2006 at 05:22:23PM +0200, Andi Kleen wrote:
> > > Older aacraid hardware cannot address the 3-4GB range where the iommu
> > > remaps pages. As the PCI DMA implementation for the x86-64 is flawed and
> > > doesn't support any nice way to deal with this via swiotlb instead the
> > > driver handles it internally.
> >
> > Then you should just force a low bounce pfn < 0xfffffff for the block device -
> > then the block layer should use GFP_DMA bouncing.
>
> From a tiny 16MB DMA pool that can't sustain the required load ? Or has that
> bit changed.
It should be ok because it blocks. It will be slow, but what else do you expect
from broken hardware like this?
-Andi
* Re: [PATCH] [29/82] x86_64: Rename IOMMU option, fix help and mark option embedded.
2006-06-25 19:32 ` Andi Kleen
@ 2006-06-25 22:21 ` Alan Cox
2006-06-26 6:24 ` Andi Kleen
0 siblings, 1 reply; 8+ messages in thread
From: Alan Cox @ 2006-06-25 22:21 UTC (permalink / raw)
To: Andi Kleen; +Cc: Alan Cox, torvalds, discuss, akpm, markh, linux-scsi, axboe
On Sun, Jun 25, 2006 at 09:32:41PM +0200, Andi Kleen wrote:
> > From a tiny 16MB DMA pool that can't sustain the required load ? Or has that
> > bit changed.
>
> It should be ok because it blocks. It will be slow, but what else do you expect
> from broken hardware like this?
Sustained 80Mbyte/second I/O rates. At least that's what it gets in other OS
products. This is one of the reasons (Broadcom 4400 was another) that a 30-
or 31-bit DMA zone, not a 32-bit one, was called for, which also didn't happen
it seems.
The ifdef approach isn't perfect but it essentially means you get good
performance on x86-32 and only x86-64 has problems, which seems fine to me,
because it's an old card.
Alan
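The zone sizes being compared here follow directly from the number of
address bits a device can drive. A minimal sketch of that arithmetic
(dma_zone_bytes() is a hypothetical helper, not a kernel interface):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable by a device that can drive addr_bits address bits. */
static uint64_t dma_zone_bytes(unsigned addr_bits)
{
	assert(addr_bits > 0 && addr_bits < 64);

	return UINT64_C(1) << addr_bits;
}
```

Here dma_zone_bytes(24) is the 16MB pool mentioned earlier in the thread,
while 30, 31 and 32 bits give 1GB, 2GB and 4GB respectively.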
* Re: [PATCH] [29/82] x86_64: Rename IOMMU option, fix help and mark option embedded.
2006-06-25 22:21 ` Alan Cox
@ 2006-06-26 6:24 ` Andi Kleen
0 siblings, 0 replies; 8+ messages in thread
From: Andi Kleen @ 2006-06-26 6:24 UTC (permalink / raw)
To: Alan Cox; +Cc: torvalds, discuss, akpm, markh, linux-scsi, axboe
On Monday 26 June 2006 00:21, Alan Cox wrote:
> On Sun, Jun 25, 2006 at 09:32:41PM +0200, Andi Kleen wrote:
> > > From a tiny 16MB DMA pool that can't sustain the required load ? Or has that
> > > bit changed.
> >
> > It should be ok because it blocks. It will be slow, but what else do you expect
> > from broken hardware like this?
>
> Sustained 80Mbyte/second I/O rates. At least that's what it gets in other OS
> products. This is one of the reasons (Broadcom 4400 was another) that a 30-
> or 31-bit DMA zone, not a 32-bit one, was called for, which also didn't happen
> it seems.
Should we hurt everybody for two terminally broken devices?
BCM4400 and its wireless brethren are fine with GFP_DMA btw and mostly only
exist in low-end machines where most/all of the memory is below 2GB anyway
(so the check-first approach works quite well)
>
> The ifdef approach isn't perfect but it essentially means you get good
> performance on x86-32 and only x86-64 has problems, which seems fine to me,
> because it's an old card.
Blk-bounce would have the same effect.
If you set blk-bounce then the block layer will only bounce if the
address is > limit, and it knows about the different zones on 32bit
and will use ZONE_NORMAL memory there.
If you have up to 6GB you have a 50:50 chance of not bouncing at least.
The ifdef approach is just broken. Please fix it.
-Andi
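The bounce decision and the rough odds quoted above can be sketched in
user-space C. This is a simplified model, not the block layer code:
needs_bounce() and percent_not_bounced() are hypothetical names, limit_pfn
is taken to be the first page frame the device cannot reach, RAM is assumed
to occupy page frames 0..total_pages-1, and I/O buffers are assumed to be
spread evenly over it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A page must be bounced when the device cannot address it directly. */
static bool needs_bounce(uint64_t pfn, uint64_t limit_pfn)
{
	return pfn >= limit_pfn;
}

/* Percentage of RAM pages usable for I/O without a bounce copy. */
static unsigned percent_not_bounced(uint64_t total_pages, uint64_t limit_pfn)
{
	assert(total_pages > 0);

	uint64_t reachable = total_pages < limit_pfn ? total_pages : limit_pfn;

	return (unsigned)(reachable * 100 / total_pages);
}
```

With 4 KiB pages, 6GB is 0x180000 page frames; if the device can reach the
first 3GB (0xC0000 page frames), percent_not_bounced(0x180000, 0xC0000)
evaluates to 50. The 3GB figure is an assumption chosen to illustrate the
50:50 estimate, not something stated in the thread.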
* RE: [PATCH] [29/82] x86_64: Rename IOMMU option, fix help and mark option embedded.
@ 2006-06-26 12:37 Salyzyn, Mark
0 siblings, 0 replies; 8+ messages in thread
From: Salyzyn, Mark @ 2006-06-26 12:37 UTC (permalink / raw)
To: Andi Kleen, Alan Cox; +Cc: torvalds, discuss, akpm, markh, linux-scsi, axboe
[-- Attachment #1: Type: text/plain, Size: 1398 bytes --]
MarkH is out on vacation for a few weeks.
This may seem like a DILLIGAF, but after chatting with the F/W folks,
there is no harm in dropping the page calculation as denoted in the
enclosed patch for these older adapters in this new age of 4GB+ memory
sticks. Any resource optimization within the old-old-old adapters for
systems with less than 4G of memory is of little consequence. The
existing AAC_QUIRK_31BIT flag in linit.c should look after the rest of
the legacy hardware DMA limitations.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
---
Applies to the scsi-misc-2.6 git tree.
<enclosed aacraid-GART.patch>
Andi Kleen sez:
> On Sunday 25 June 2006 21:00, Alan Cox wrote:
>> On Sun, Jun 25, 2006 at 05:22:23PM +0200, Andi Kleen wrote:
>>>> Older aacraid hardware cannot address the 3-4GB range where the iommu
>>>> remaps pages. As the PCI DMA implementation for the x86-64 is flawed and
>>>> doesn't support any nice way to deal with this via swiotlb instead the
>>>> driver handles it internally.
>>> Then you should just force a low bounce pfn < 0xfffffff for the block device -
>>> then the block layer should use GFP_DMA bouncing.
>> From a tiny 16MB DMA pool that can't sustain the required load ? Or has that
>> bit changed.
> It should be ok because it blocks. It will be slow, but what
> else do you expect from broken hardware like this?
[-- Attachment #2: aacraid-GART.patch --]
[-- Type: application/octet-stream, Size: 1383 bytes --]
--- a/drivers/scsi/aacraid/comminit.c 2006-06-14 09:26:19.000000000 -0400
+++ b/drivers/scsi/aacraid/comminit.c 2006-06-26 08:19:29.186795710 -0400
@@ -92,28 +92,7 @@
init->AdapterFibsPhysicalAddress = cpu_to_le32((u32)phys);
init->AdapterFibsSize = cpu_to_le32(fibsize);
init->AdapterFibAlign = cpu_to_le32(sizeof(struct hw_fib));
- /*
- * number of 4k pages of host physical memory. The aacraid fw needs
- * this number to be less than 4gb worth of pages. num_physpages is in
- * system page units. New firmware doesn't have any issues with the
- * mapping system, but older Firmware did, and had *troubles* dealing
- * with the math overloading past 32 bits, thus we must limit this
- * field.
- *
- * This assumes the memory is mapped zero->n, which isnt
- * always true on real computers. It also has some slight problems
- * with the GART on x86-64. I've btw never tried DMA from PCI space
- * on this platform but don't be surprised if its problematic.
- */
-#ifndef CONFIG_GART_IOMMU
- if ((num_physpages << (PAGE_SHIFT - 12)) <= AAC_MAX_HOSTPHYSMEMPAGES) {
- init->HostPhysMemPages =
- cpu_to_le32(num_physpages << (PAGE_SHIFT-12));
- } else
-#endif
- {
- init->HostPhysMemPages = cpu_to_le32(AAC_MAX_HOSTPHYSMEMPAGES);
- }
+ init->HostPhysMemPages = cpu_to_le32(AAC_MAX_HOSTPHYSMEMPAGES);
init->InitFlags = 0;
if (dev->new_comm_interface) {