public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
@ 2008-11-18 15:43 Joerg Roedel
  2008-11-18 15:49 ` Ingo Molnar
  2008-11-19  6:05 ` FUJITA Tomonori
  0 siblings, 2 replies; 10+ messages in thread
From: Joerg Roedel @ 2008-11-18 15:43 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: iommu, linux-kernel

(sorry for resend, forgot to add the mailing lists)

Hi Ingo,


The following changes since commit 4e14e833ac3b97a4aa8803eea49f899adc5bb5f4:
  Linus Torvalds (1):
        Merge git://git.kernel.org/.../sfrench/cifs-2.6

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu.git iommu-fixes-2.6.28

Joerg Roedel (4):
      AMD IOMMU: add parameter to disable device isolation
      AMD IOMMU: enable device isolation per default
      AMD IOMMU: fix fullflush comparison length
      AMD IOMMU: check for next_bit also in unmapped area

 Documentation/kernel-parameters.txt |    4 +++-
 arch/x86/kernel/amd_iommu.c         |    2 +-
 arch/x86/kernel/amd_iommu_init.c    |    6 ++++--
 3 files changed, 8 insertions(+), 4 deletions(-)

As the most important change these patches enable device isolation per
default. Tests have shown that there are drivers which have bugs and do
double-freeing of DMA memory. This can lead to data corruption with a
hardware IOMMU when multiple devices share the same protection domain.
Therefore device isolation should be enabled by default.
The full diff of these changes is appended. Please pull.

Thanks,

Joerg

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9fa6508..f2e1e7f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -294,7 +294,9 @@ and is between 256 and 4096 characters. It is defined in the file
 			Possible values are:
 			isolate - enable device isolation (each device, as far
 			          as possible, will get its own protection
-			          domain)
+			          domain) [default]
+			share - put every device behind one IOMMU into the
+				same protection domain
 			fullflush - enable flushing of IO/TLB entries when
 				    they are unmapped. Otherwise they are
 				    flushed before they will be reused, which
diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c
index 331b318..e4899e0 100644
--- a/arch/x86/kernel/amd_iommu.c
+++ b/arch/x86/kernel/amd_iommu.c
@@ -537,7 +537,7 @@ static void dma_ops_free_addresses(struct dma_ops_domain *dom,
 	address >>= PAGE_SHIFT;
 	iommu_area_free(dom->bitmap, address, pages);
 
-	if (address + pages >= dom->next_bit)
+	if (address >= dom->next_bit)
 		dom->need_flush = true;
 }
 
diff --git a/arch/x86/kernel/amd_iommu_init.c b/arch/x86/kernel/amd_iommu_init.c
index 0cdcda3..30ae270 100644
--- a/arch/x86/kernel/amd_iommu_init.c
+++ b/arch/x86/kernel/amd_iommu_init.c
@@ -121,7 +121,7 @@ u16 amd_iommu_last_bdf;			/* largest PCI device id we have
 LIST_HEAD(amd_iommu_unity_map);		/* a list of required unity mappings
 					   we find in ACPI */
 unsigned amd_iommu_aperture_order = 26; /* size of aperture in power of 2 */
-int amd_iommu_isolate;			/* if 1, device isolation is enabled */
+int amd_iommu_isolate = 1;		/* if 1, device isolation is enabled */
 bool amd_iommu_unmap_flush;		/* if true, flush on every unmap */
 
 LIST_HEAD(amd_iommu_list);		/* list of all AMD IOMMUs in the
@@ -1213,7 +1213,9 @@ static int __init parse_amd_iommu_options(char *str)
 	for (; *str; ++str) {
 		if (strncmp(str, "isolate", 7) == 0)
 			amd_iommu_isolate = 1;
-		if (strncmp(str, "fullflush", 11) == 0)
+		if (strncmp(str, "share", 5) == 0)
+			amd_iommu_isolate = 0;
+		if (strncmp(str, "fullflush", 9) == 0)
 			amd_iommu_unmap_flush = true;
 	}
 
-- 
           |           AMD Saxony Limited Liability Company & Co. KG
 Operating |         Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    |                  Register Court Dresden: HRA 4896
 Research  |              General Partner authorized to represent:
 Center    |             AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
  2008-11-18 15:43 [GIT PULL] AMD IOMMU updates for 2.6.28-rc5 Joerg Roedel
@ 2008-11-18 15:49 ` Ingo Molnar
  2008-11-19  6:05 ` FUJITA Tomonori
  1 sibling, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2008-11-18 15:49 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Ingo Molnar, iommu, linux-kernel


* Joerg Roedel <joerg.roedel@amd.com> wrote:

> (sorry for resend, forgot to add the mailing lists)
> 
> Hi Ingo,
> 
> 
> The following changes since commit 4e14e833ac3b97a4aa8803eea49f899adc5bb5f4:
>   Linus Torvalds (1):
>         Merge git://git.kernel.org/.../sfrench/cifs-2.6
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu.git iommu-fixes-2.6.28
> 
> Joerg Roedel (4):
>       AMD IOMMU: add parameter to disable device isolation
>       AMD IOMMU: enable device isolation per default
>       AMD IOMMU: fix fullflush comparison length
>       AMD IOMMU: check for next_bit also in unmapped area
> 
>  Documentation/kernel-parameters.txt |    4 +++-
>  arch/x86/kernel/amd_iommu.c         |    2 +-
>  arch/x86/kernel/amd_iommu_init.c    |    6 ++++--
>  3 files changed, 8 insertions(+), 4 deletions(-)
> 
> As the most important change these patches enable device isolation 
> per default. Tests have shown that there are drivers which have bugs 
> and do double-freeing of DMA memory. This can lead to data 
> corruption with a hardware IOMMU when multiple devices share the 
> same protection domain. Therefore device isolation should be enabled 
> by default. The full diff of these changes is appended. Please pull.

pulled into tip/x86/urgent, thanks Joerg!

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
  2008-11-18 15:43 [GIT PULL] AMD IOMMU updates for 2.6.28-rc5 Joerg Roedel
  2008-11-18 15:49 ` Ingo Molnar
@ 2008-11-19  6:05 ` FUJITA Tomonori
  2008-11-19  9:25   ` Joerg Roedel
  2008-11-19 12:57   ` Muli Ben-Yehuda
  1 sibling, 2 replies; 10+ messages in thread
From: FUJITA Tomonori @ 2008-11-19  6:05 UTC (permalink / raw)
  To: joerg.roedel; +Cc: mingo, iommu, linux-kernel

On Tue, 18 Nov 2008 16:43:22 +0100
Joerg Roedel <joerg.roedel@amd.com> wrote:

> Joerg Roedel (4):
>       AMD IOMMU: add parameter to disable device isolation
>       AMD IOMMU: enable device isolation per default
>       AMD IOMMU: fix fullflush comparison length
>       AMD IOMMU: check for next_bit also in unmapped area
> 
>  Documentation/kernel-parameters.txt |    4 +++-
>  arch/x86/kernel/amd_iommu.c         |    2 +-
>  arch/x86/kernel/amd_iommu_init.c    |    6 ++++--
>  3 files changed, 8 insertions(+), 4 deletions(-)
> 
> As the most important change these patches enable device isolation per
> default. Tests have shown that there are drivers which have bugs and do
> double-freeing of DMA memory.

What drivers? We need to fix them if they are mainline drivers.


> This can lead to data corruption with a
> hardware IOMMU when multiple devices share the same protection domain.
> Therefore device isolation should be enabled by default.

Hmm, the change is just because of the bug workaround? If so, I'm not
sure it's a good idea. We need to fix the buggy drivers anyway. And
device isolation is not free; e.g. use more memory rather than sharing
a protection domain. I guess that more people prefer sharing a
protection domain by default. It had been the default option for AMD
IOMMU until you hit the bugs. IIRC, VT-d also shares a protection
domain by default. It would be nice to avoid surprising users if the
two virtualization IOMMUs works in the similar way.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
  2008-11-19  6:05 ` FUJITA Tomonori
@ 2008-11-19  9:25   ` Joerg Roedel
  2008-11-19  9:36     ` Ingo Molnar
  2008-11-20  4:25     ` FUJITA Tomonori
  2008-11-19 12:57   ` Muli Ben-Yehuda
  1 sibling, 2 replies; 10+ messages in thread
From: Joerg Roedel @ 2008-11-19  9:25 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: joerg.roedel, iommu, mingo, linux-kernel

On Wed, Nov 19, 2008 at 03:05:24PM +0900, FUJITA Tomonori wrote:
> On Tue, 18 Nov 2008 16:43:22 +0100
> Joerg Roedel <joerg.roedel@amd.com> wrote:
> 
> > Joerg Roedel (4):
> >       AMD IOMMU: add parameter to disable device isolation
> >       AMD IOMMU: enable device isolation per default
> >       AMD IOMMU: fix fullflush comparison length
> >       AMD IOMMU: check for next_bit also in unmapped area
> > 
> >  Documentation/kernel-parameters.txt |    4 +++-
> >  arch/x86/kernel/amd_iommu.c         |    2 +-
> >  arch/x86/kernel/amd_iommu_init.c    |    6 ++++--
> >  3 files changed, 8 insertions(+), 4 deletions(-)
> > 
> > As the most important change these patches enable device isolation per
> > default. Tests have shown that there are drivers which have bugs and do
> > double-freeing of DMA memory.
> 
> What drivers? We need to fix them if they are mainline drivers.

I found issues in network drivers only for now. The two drivers where I
found issues are the in-kernel ixgbe driver (I see IO_PAGE_FAULTS
there), the ixgbe version from the Intel website has a double-free bug
when unloading the driver or changing the device mtu. The same problem
was found with the Broadcom NetXtreme II driver.

> > This can lead to data corruption with a
> > hardware IOMMU when multiple devices share the same protection domain.
> > Therefore device isolation should be enabled by default.
> 
> Hmm, the change is just because of the bug workaround? If so, I'm not
> sure it's a good idea. We need to fix the buggy drivers anyway. And
> device isolation is not free; e.g. use more memory rather than sharing
> a protection domain. I guess that more people prefer sharing a
> protection domain by default. It had been the default option for AMD
> IOMMU until you hit the bugs. IIRC, VT-d also shares a protection
> domain by default. It would be nice to avoid surprising users if the
> two virtualization IOMMUs works in the similar way.

We can't test all drivers for those bugs until 2.6.28 will be released.
And these bugs can corrupt data, for example when a driver frees dma
addresses allocated by another driver and these addresses are then
reallocated.
The only way to protect the drivers from each other is to isolate them
in different protection domains. The AMD IOMMU driver prints a WARN_ON()
if a driver frees dma addresses not yet mapped. This triggered with the
bnx2 and the ixgbe driver.
And the data corruption is real, it eat the root-fs of my testbox one
time.
I agree that we need to fix the drivers. I plan to implement some debug
code which allows driver developers to detect those bugs even if they
have no IOMMU in the system.

Joerg


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
  2008-11-19  9:25   ` Joerg Roedel
@ 2008-11-19  9:36     ` Ingo Molnar
  2008-11-20  4:25     ` FUJITA Tomonori
  1 sibling, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2008-11-19  9:36 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: FUJITA Tomonori, joerg.roedel, iommu, mingo, linux-kernel


* Joerg Roedel <joro@8bytes.org> wrote:

> We can't test all drivers for those bugs until 2.6.28 will be 
> released. And these bugs can corrupt data, for example when a driver 
> frees dma addresses allocated by another driver and these addresses 
> are then reallocated.
>
> The only way to protect the drivers from each other is to isolate 
> them in different protection domains. The AMD IOMMU driver prints a 
> WARN_ON() if a driver frees dma addresses not yet mapped. This 
> triggered with the bnx2 and the ixgbe driver.
>
> And the data corruption is real, it eat the root-fs of my testbox 
> one time.

a WARN_ON() can be acted upon much easier than silent/spurious data 
corruption. So printing a WARN_ON() will result in drivers being fixed 
a lot faster (and with a lot less debugging needed) than if we were 
intentionally letting DMA corruption happen. The WARN_ON() will be 
routed to kerneloops.org on the major distros, etc. etc.

> I agree that we need to fix the drivers. I plan to implement some 
> debug code which allows driver developers to detect those bugs even 
> if they have no IOMMU in the system.

That would be _really_ nice to have.

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
  2008-11-19  6:05 ` FUJITA Tomonori
  2008-11-19  9:25   ` Joerg Roedel
@ 2008-11-19 12:57   ` Muli Ben-Yehuda
  2008-11-20  4:25     ` FUJITA Tomonori
  1 sibling, 1 reply; 10+ messages in thread
From: Muli Ben-Yehuda @ 2008-11-19 12:57 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: joerg.roedel, iommu, mingo, linux-kernel

On Wed, Nov 19, 2008 at 03:05:24PM +0900, FUJITA Tomonori wrote:
> On Tue, 18 Nov 2008 16:43:22 +0100
> Joerg Roedel <joerg.roedel@amd.com> wrote:
> 
> > Joerg Roedel (4):
> >       AMD IOMMU: add parameter to disable device isolation
> >       AMD IOMMU: enable device isolation per default
> >       AMD IOMMU: fix fullflush comparison length
> >       AMD IOMMU: check for next_bit also in unmapped area
> > 
> >  Documentation/kernel-parameters.txt |    4 +++-
> >  arch/x86/kernel/amd_iommu.c         |    2 +-
> >  arch/x86/kernel/amd_iommu_init.c    |    6 ++++--
> >  3 files changed, 8 insertions(+), 4 deletions(-)
> > 
> > As the most important change these patches enable device isolation
> > per default. Tests have shown that there are drivers which have
> > bugs and do double-freeing of DMA memory.
> 
> What drivers? We need to fix them if they are mainline drivers.
> 
> > This can lead to data corruption with a hardware IOMMU when
> > multiple devices share the same protection domain.  Therefore
> > device isolation should be enabled by default.
> 
> Hmm, the change is just because of the bug workaround? If so, I'm not
> sure it's a good idea. We need to fix the buggy drivers anyway.

This won't work around the bug, it will just make its outcome less
severe (by restricting the fault to the offensive device only).

> device isolation is not free; e.g. use more memory rather than
> sharing a protection domain. I guess that more people prefer sharing
> a protection domain by default.

I doubt it, why use an isolation-capable IOMMU at all if not for the
increased reliability? The majority of modern devices---those that you
are likely to find on machines with an IOMMU---don't have DMA
limitations.

> It had been the default option for AMD IOMMU until you hit the
> bugs. IIRC, VT-d also shares a protection domain by default. It
> would be nice to avoid surprising users if the two virtualization
> IOMMUs works in the similar way.

Calgary has a per-bus protection domain, both on x86 and PPC.

Cheers,
Muli
-- 
The First Workshop on I/O Virtualization (WIOV '08)
Dec 2008, San Diego, CA, http://www.usenix.org/wiov08/
                       <->
SYSTOR 2009---The Israeli Experimental Systems Conference
http://www.haifa.il.ibm.com/conferences/systor2009/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
  2008-11-19  9:25   ` Joerg Roedel
  2008-11-19  9:36     ` Ingo Molnar
@ 2008-11-20  4:25     ` FUJITA Tomonori
  2008-11-20 11:31       ` Joerg Roedel
  1 sibling, 1 reply; 10+ messages in thread
From: FUJITA Tomonori @ 2008-11-20  4:25 UTC (permalink / raw)
  To: joro; +Cc: fujita.tomonori, joerg.roedel, iommu, mingo, linux-kernel

On Wed, 19 Nov 2008 10:25:44 +0100
Joerg Roedel <joro@8bytes.org> wrote:

> On Wed, Nov 19, 2008 at 03:05:24PM +0900, FUJITA Tomonori wrote:
> > On Tue, 18 Nov 2008 16:43:22 +0100
> > Joerg Roedel <joerg.roedel@amd.com> wrote:
> > 
> > > Joerg Roedel (4):
> > >       AMD IOMMU: add parameter to disable device isolation
> > >       AMD IOMMU: enable device isolation per default
> > >       AMD IOMMU: fix fullflush comparison length
> > >       AMD IOMMU: check for next_bit also in unmapped area
> > > 
> > >  Documentation/kernel-parameters.txt |    4 +++-
> > >  arch/x86/kernel/amd_iommu.c         |    2 +-
> > >  arch/x86/kernel/amd_iommu_init.c    |    6 ++++--
> > >  3 files changed, 8 insertions(+), 4 deletions(-)
> > > 
> > > As the most important change these patches enable device isolation per
> > > default. Tests have shown that there are drivers which have bugs and do
> > > double-freeing of DMA memory.
> > 
> > What drivers? We need to fix them if they are mainline drivers.
> 
> I found issues in network drivers only for now. The two drivers where I
> found issues are the in-kernel ixgbe driver (I see IO_PAGE_FAULTS
> there), the ixgbe version from the Intel website has a double-free bug
> when unloading the driver or changing the device mtu. The same problem
> was found with the Broadcom NetXtreme II driver.

I see, thanks. You already reported the bugs to netdev?


> > > This can lead to data corruption with a
> > > hardware IOMMU when multiple devices share the same protection domain.
> > > Therefore device isolation should be enabled by default.
> > 
> > Hmm, the change is just because of the bug workaround? If so, I'm not
> > sure it's a good idea. We need to fix the buggy drivers anyway. And
> > device isolation is not free; e.g. use more memory rather than sharing
> > a protection domain. I guess that more people prefer sharing a
> > protection domain by default. It had been the default option for AMD
> > IOMMU until you hit the bugs. IIRC, VT-d also shares a protection
> > domain by default. It would be nice to avoid surprising users if the
> > two virtualization IOMMUs works in the similar way.
> 
> We can't test all drivers for those bugs until 2.6.28 will be released.
> And these bugs can corrupt data, for example when a driver frees dma
> addresses allocated by another driver and these addresses are then
> reallocated.
> The only way to protect the drivers from each other is to isolate them
> in different protection domains. The AMD IOMMU driver prints a WARN_ON()
> if a driver frees dma addresses not yet mapped. This triggered with the
> bnx2 and the ixgbe driver.

It would be better to add such WARN_ON to VT-d. VT-d is everywhere
nowadays. I think that there are some developers who can test these
drivers with VT-d.


> And the data corruption is real, it eat the root-fs of my testbox one
> time.
> I agree that we need to fix the drivers. I plan to implement some debug
> code which allows driver developers to detect those bugs even if they
> have no IOMMU in the system.

It's not so hard to add such debug feature to swiotlb, I guess.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
  2008-11-19 12:57   ` Muli Ben-Yehuda
@ 2008-11-20  4:25     ` FUJITA Tomonori
  2008-11-20  7:51       ` Ingo Molnar
  0 siblings, 1 reply; 10+ messages in thread
From: FUJITA Tomonori @ 2008-11-20  4:25 UTC (permalink / raw)
  To: muli; +Cc: fujita.tomonori, joerg.roedel, iommu, mingo, linux-kernel

On Wed, 19 Nov 2008 14:57:50 +0200
Muli Ben-Yehuda <muli@il.ibm.com> wrote:

> > device isolation is not free; e.g. use more memory rather than
> > sharing a protection domain. I guess that more people prefer sharing
> > a protection domain by default.
> 
> I doubt it, why use an isolation-capable IOMMU at all if not for the
> increased reliability? The majority of modern devices---those that you
> are likely to find on machines with an IOMMU---don't have DMA
> limitations.

I guess that there are still some modern SATA HBAs that are not
capable of 64bit DMA. You might be right though.


> > It had been the default option for AMD IOMMU until you hit the
> > bugs. IIRC, VT-d also shares a protection domain by default. It
> > would be nice to avoid surprising users if the two virtualization
> > IOMMUs works in the similar way.
> 
> Calgary has a per-bus protection domain, both on x86 and PPC.

I see. Then it might be better to change VT-d to use a separate
protection domain by default.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
  2008-11-20  4:25     ` FUJITA Tomonori
@ 2008-11-20  7:51       ` Ingo Molnar
  0 siblings, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2008-11-20  7:51 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: muli, joerg.roedel, iommu, mingo, linux-kernel


* FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:

> > > It had been the default option for AMD IOMMU until you hit the 
> > > bugs. IIRC, VT-d also shares a protection domain by default. It 
> > > would be nice to avoid surprising users if the two 
> > > virtualization IOMMUs works in the similar way.
> > 
> > Calgary has a per-bus protection domain, both on x86 and PPC.
> 
> I see. Then it might be better to change VT-d to use a separate 
> protection domain by default.

yes, agreed, and that should be the sane default for any IOMMU driver 
- unless the performance impact is prohibitive.

Note that this widens the positive impact of the IOMMU code: not only 
does it enable transparent support of DMA to/from devices that have a 
limited DMA range, not only does it help isolation in virtualization - 
it also acts as a daily debug helper for _native_ drivers.

Note that people will prefer to run with an IOMMU enabled even if all 
devices support the full memory range - just due to the DMA protection 
features. Just like people prefer to run an OS with paging protections 
enabled ;-)

It also puts pressure on the hw design side to treat IOMMUs not just 
as some fringe feature for compatibility with older transports or 
virtualization, but also as a prime-time native IO feature.

	Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [GIT PULL] AMD IOMMU updates for 2.6.28-rc5
  2008-11-20  4:25     ` FUJITA Tomonori
@ 2008-11-20 11:31       ` Joerg Roedel
  0 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2008-11-20 11:31 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: joro, iommu, mingo, linux-kernel

On Thu, Nov 20, 2008 at 01:25:15PM +0900, FUJITA Tomonori wrote:
> On Wed, 19 Nov 2008 10:25:44 +0100
> Joerg Roedel <joro@8bytes.org> wrote:
> > I found issues in network drivers only for now. The two drivers where I
> > found issues are the in-kernel ixgbe driver (I see IO_PAGE_FAULTS
> > there), the ixgbe version from the Intel website has a double-free bug
> > when unloading the driver or changing the device mtu. The same problem
> > was found with the Broadcom NetXtreme II driver.
> 
> I see, thanks. You already reported the bugs to netdev?
> 

Not yet, but I will do so when I can prove the bug is not in my driver
;) (== when my debug code is ready)

> > We can't test all drivers for those bugs until 2.6.28 will be released.
> > And these bugs can corrupt data, for example when a driver frees dma
> > addresses allocated by another driver and these addresses are then
> > reallocated.
> > The only way to protect the drivers from each other is to isolate them
> > in different protection domains. The AMD IOMMU driver prints a WARN_ON()
> > if a driver frees dma addresses not yet mapped. This triggered with the
> > bnx2 and the ixgbe driver.
> 
> It would be better to add such WARN_ON to VT-d. VT-d is everywhere
> nowadays. I think that there are some developers who can test these
> drivers with VT-d.

Yes, I agree.

> > And the data corruption is real, it eat the root-fs of my testbox one
> > time.
> > I agree that we need to fix the drivers. I plan to implement some debug
> > code which allows driver developers to detect those bugs even if they
> > have no IOMMU in the system.
> 
> It's not so hard to add such debug feature to swiotlb, I guess.
> 

Yes, but I prefer to have it outside of any dma_ops implementation. This
way it can also be used to find out if a bug originated from the device
driver or the dma_ops implementation.

Joerg

-- 
           |           AMD Saxony Limited Liability Company & Co. KG
 Operating |         Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    |                  Register Court Dresden: HRA 4896
 Research  |              General Partner authorized to represent:
 Center    |             AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2008-11-20 11:34 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-18 15:43 [GIT PULL] AMD IOMMU updates for 2.6.28-rc5 Joerg Roedel
2008-11-18 15:49 ` Ingo Molnar
2008-11-19  6:05 ` FUJITA Tomonori
2008-11-19  9:25   ` Joerg Roedel
2008-11-19  9:36     ` Ingo Molnar
2008-11-20  4:25     ` FUJITA Tomonori
2008-11-20 11:31       ` Joerg Roedel
2008-11-19 12:57   ` Muli Ben-Yehuda
2008-11-20  4:25     ` FUJITA Tomonori
2008-11-20  7:51       ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox