* x86 swiotlb questions
@ 2006-12-15 12:50 Jan Beulich
  2006-12-15 13:35 ` Keir Fraser
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2006-12-15 12:50 UTC (permalink / raw)
  To: xen-devel

I'm not certain when this code was changed significantly from what I remember, but

- What is the purpose of using alloc_bootmem_low variants here? I.e., where is the
dependency on physical addresses being below 4G here (machine addresses are
being restricted after the allocation anyway)? The panic message text after the
failed allocation is confusing me additionally.

- While I can see the idea behind the overflow buffer, it doesn't seem to prevent
data corruption, and if I understand it correctly it doesn't even prevent memory
corruption (since its machine address doesn't get restricted anywhere, so the fall
back return value would not necessarily meet the device requirements).

- With various parameters now being command line configurable, if any of these
get set inconsistently or incorrectly the user would probably get a cryptic crash
(from the BUG_ON() following the call to xen_create_contiguous_region()).

- The default bit width for DMA (also in Xen itself) was now changed to 30, just
because of a single device (b44). Shouldn't it, with the settings being
customizable now, rather be 32 (and those who own such ill devices need to
make use of the option)?

- The DMA bit widths can be set to different values in Xen and kernel, which can
lead to surprising results, I would think. Shouldn't the kernel rather obtain Xen's
value, so they are consistent?

Thanks, Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: x86 swiotlb questions
  2006-12-15 12:50 x86 swiotlb questions Jan Beulich
@ 2006-12-15 13:35 ` Keir Fraser
  2006-12-15 13:53   ` Jan Beulich
                     ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Keir Fraser @ 2006-12-15 13:35 UTC (permalink / raw)
  To: Jan Beulich, xen-devel

On 15/12/06 12:50, "Jan Beulich" <jbeulich@novell.com> wrote:

> I'm not certain when this code was changed significantly from what I remember,
> but
> 
> - What is the purpose of using alloc_bootmem_low variants here? I.e., where is
> the dependency on physical addresses being below 4G here (machine addresses
> are
> being restricted after the allocation anyway)? The panic message text after
> the
> failed allocation is confusing me additionally.

This is how it's always been since we took ia64's swiotlb.c.

> - While I can see the idea behind the overflow buffer, it doesn't seem to
> prevent
> data corruption, and if I understand it correctly it doesn't even prevent
> memory
> corruption (since its machine address doesn't get restricted anywhere, so the
> fall
> back return value would not necessarily meet the device requirements).

Same here. We didn't implement this. It doesn't seem to make that much
sense. Sync'ing with lib/swiotlb.c and throwing away our special one would be
very nice. :-)

> - With various parameters now being command line configurable, if any of these
> get set inconsistently or incorrectly the user would probably get a cryptic
> crash
> (from the BUG_ON() following the call to xen_create_contiguous_region()).

True. Best know what you're doing if you mess with these.

> - The default bit width for DMA (also in Xen itself) was now changed to 30,
> just
> because of a single device (b44). Shouldn't it, with the settings being
> customizable now, rather be 32 (and those who own such ill devices need to
> make use of the option)?

Arguable. Depends how common 31- and 30-bit limitations are. Limiting the
swiotlb to 1GB of memory doesn't seem that harsh -- how much swiotlb memory
are you likely to want, system wide?

> - The DMA bit widths can be set to different values in Xen and kernel, which
> can
> lead to surprising results, I would think. Shouldn't the kernel rather obtain
> Xen's
> value, so they are consistent?

We would like to generalise Xen's heap allocator so that it keeps separate
heaps for different bit widths. Then there would be no 'DMA width' or 'DMA
pool' in Xen.

 -- Keir
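
To make the 30-bit versus 32-bit trade-off discussed above concrete, here is a minimal sketch in C. It is not the Xen or Linux code: `dma_mask_for_bits` and `needs_bounce` are hypothetical names chosen for illustration, and the only assumption is that a device with an N-bit DMA limit can reach addresses 0 through 2^N - 1. With N = 30 that window is exactly 1GB, which is the limit mentioned for the swiotlb.

```c
#include <stdint.h>

/* Hypothetical helper: highest address reachable with `bits` address bits. */
static uint64_t dma_mask_for_bits(unsigned int bits)
{
    return bits >= 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * Hypothetical helper: returns 1 if any byte of [addr, addr + size) lies
 * above the device's DMA window, i.e. the buffer would have to be bounced.
 */
static int needs_bounce(uint64_t addr, uint64_t size, unsigned int bits)
{
    uint64_t mask = dma_mask_for_bits(bits);

    if (addr > mask)
        return 1;
    /* The last byte is addr + size - 1; compare without overflowing. */
    return size > 0 && (size - 1) > (mask - addr);
}
```

Under these assumptions, a page at machine address 0x40000000 needs bouncing for a 30-bit device but not for a 32-bit one, so a lower default width mainly costs extra bouncing for devices that could have reached the page directly.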


* Re: x86 swiotlb questions
  2006-12-15 13:35 ` Keir Fraser
@ 2006-12-15 13:53   ` Jan Beulich
  2006-12-15 14:03     ` Keir Fraser
  2006-12-15 16:19   ` Alan
  2006-12-18  7:44   ` Jan Beulich
  2 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2006-12-15 13:53 UTC (permalink / raw)
  To: xen-devel, Keir Fraser

>> - The DMA bit widths can be set to different values in Xen and kernel, which
>> can
>> lead to surprising results, I would think. Shouldn't the kernel rather obtain
>> Xen's
>> value, so they are consistent?
>
>We would like to generalise Xen's heap allocator so that it keeps separate
>heaps for different bit widths. Then there would be no 'DMA width' or 'DMA
>pool' in Xen.

I already have patches ready to do this (the DMA thing really is a nice side
effect, I mostly wanted it for 32on64, so that I can restrict domain
allocations for 32-bit domains). Are you saying I should throw away the
DMA specialization then altogether (I already have no special DMA heap
anymore)? The leftovers from it are so that one can reserve some portion
of low memory to be returned only when the width restriction is low enough
(i.e. to retain dma_emergency_pool functionality), which certainly isn't
really appropriate anymore now (it should rather be a percentage or
something like that, so that the lower you get the more of the memory
remains reserved for specialized allocations).

Jan


* Re: x86 swiotlb questions
  2006-12-15 13:53   ` Jan Beulich
@ 2006-12-15 14:03     ` Keir Fraser
  2006-12-15 14:17       ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2006-12-15 14:03 UTC (permalink / raw)
  To: Jan Beulich, xen-devel, Keir Fraser

On 15/12/06 13:53, "Jan Beulich" <jbeulich@novell.com> wrote:

> I already have patches ready to do this (the DMA thing really is a nice side
> effect, I mostly wanted it for 32on64, so that I can restrict domain
> allocations for 32-bit domains). Are you saying I should throw away the
> DMA specialization then altogether (I already have no special DMA heap
> anymore)? The leftovers from it are so that one can reserve some portion
> of low memory to be returned only when the width restriction is low enough
> (i.e. to retain dma_emergency_pool functionality), which certainly isn't
> really appropriate anymore now (it should rather be a percentage or
> something like that, so that the lower you get the more of the memory
> remains reserved for specialized allocations).

I think dma_emergency_pool as is can go. Possibly it should be replaced by
allocator-management tools in dom0 to allow setting of limits on a
per-bitwidth basis.

Is this one of the patches you already sent in your 32-on-64 batch, or an
additional one?

 -- Keir


* Re: x86 swiotlb questions
  2006-12-15 14:03     ` Keir Fraser
@ 2006-12-15 14:17       ` Jan Beulich
  2006-12-15 14:19         ` Keir Fraser
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2006-12-15 14:17 UTC (permalink / raw)
  To: xen-devel, Keir Fraser

>>> Keir Fraser <keir@xensource.com> 15.12.06 15:03 >>>
>On 15/12/06 13:53, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>> I already have patches ready to do this (the DMA thing really is a nice side
>> effect, I mostly wanted it for 32on64, so that I can restrict domain
>> allocations for 32-bit domains). Are you saying I should throw away the
>> DMA specialization then altogether (I already have no special DMA heap
>> anymore)? The leftovers from it are so that one can reserve some portion
>> of low memory to be returned only when the width restriction is low enough
>> (i.e. to retain dma_emergency_pool functionality), which certainly isn't
>> really appropriate anymore now (it should rather be a percentage or
>> something like that, so that the lower you get the more of the memory
>> remains reserved for specialized allocations).
>
>I think dma_emergency_pool as is can go. Possibly it should be replaced by
>allocator-management tools in dom0 to allow setting of limits on a
>per-bitwidth basis.

Okay, but I think I'll leave this as a separate change (that we probably first
should reach agreement on what it really ought to do and not do).

>Is this one of the patches you already sent in your 32-on-64 batch, or an
>additional one?

An additional one (or actually, a set of them, to make the individual steps
more clear). As a followup, I'm also planning to get rid of the Xen heaps
on those arches where they aren't needed (x86-64, not sure about ppc
and ia64, but I would assume it's really only x86-32 that needs it).

Jan


* Re: x86 swiotlb questions
  2006-12-15 14:17       ` Jan Beulich
@ 2006-12-15 14:19         ` Keir Fraser
  2006-12-15 14:46           ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2006-12-15 14:19 UTC (permalink / raw)
  To: Jan Beulich, xen-devel, Keir Fraser




On 15/12/06 14:17, "Jan Beulich" <jbeulich@novell.com> wrote:

>> Is this one of the patches you already sent in your 32-on-64 batch, or an
>> additional one?
> 
> An additional one (or actually, a set of them, to make the individual steps
> more clear). As a followup, I'm also planning to get rid of the Xen heaps
> on those arches where they aren't needed (x86-64, not sure about ppc
> and ia64, but I would assume it's really only x86-32 that needs it).

Yes, that would be great. There are a couple of minor caveats around special
semantics of the Xen heap but those could do with cleaning up and being made
explicit anyway.

 -- Keir


* Re: x86 swiotlb questions
  2006-12-15 14:19         ` Keir Fraser
@ 2006-12-15 14:46           ` Jan Beulich
  2006-12-15 16:47             ` Keir Fraser
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2006-12-15 14:46 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

>>> Keir Fraser <keir@xensource.com> 15.12.06 15:19 >>>
>On 15/12/06 14:17, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>>> Is this one of the patches you already sent in your 32-on-64 batch, or an
>>> additional one?
>> 
>> An additional one (or actually, a set of them, to make the individual steps
>> more clear). As a followup, I'm also planning to get rid of the Xen heaps
>> on those arches where they aren't needed (x86-64, not sure about ppc
>> and ia64, but I would assume it's really only x86-32 that needs it).
>
>Yes, that would be great. There are a couple of minor caveats around special
>semantics of the Xen heap but those could do with cleaning up and being made
>explicit anyway.

Mind adding a little more detail here?

Jan


* Re: x86 swiotlb questions
  2006-12-15 13:35 ` Keir Fraser
  2006-12-15 13:53   ` Jan Beulich
@ 2006-12-15 16:19   ` Alan
  2006-12-18  7:44   ` Jan Beulich
  2 siblings, 0 replies; 29+ messages in thread
From: Alan @ 2006-12-15 16:19 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Jan Beulich

> Arguable. Depends how common 31- and 30-bit limitations are. Limiting the
> swiotlb to 1GB of memory doesn't seem that harsh -- how much swiotlb memory
> are you likely to want, system wide?

31bit pops up now and then, 30bit also. There are a few devices that are
even worse but they are older sound cards so not important.


* Re: x86 swiotlb questions
  2006-12-15 14:46           ` Jan Beulich
@ 2006-12-15 16:47             ` Keir Fraser
  0 siblings, 0 replies; 29+ messages in thread
From: Keir Fraser @ 2006-12-15 16:47 UTC (permalink / raw)
  To: Jan Beulich, Keir Fraser; +Cc: xen-devel

On 15/12/06 14:46, "Jan Beulich" <jbeulich@novell.com> wrote:

>> Yes, that would be great. There are a couple of minor caveats around special
>> semantics of the Xen heap but those could do with cleaning up and being made
>> explicit anyway.
> 
> Mind adding a little more detail here?

Some allocations need to be below 4GB (e.g., domain pointers, as they get
pickled into a 32-bit field in page structures). These should use an
explicit address-width parameter to the allocator.

Pages allocated from the 'Xen heap' but shared with a domain have special
semantics when their refcount falls to zero. This includes shared_info and
grant-table pages. We may need an extra PGC_ bit to flag these pages.

That's all I can think of.

 -- Keir
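
The first caveat above (pointers stored in a 32-bit field) can be illustrated with a small sketch. This is not Xen's page structure code; `pickle_addr32` and `unpickle_addr32` are hypothetical names, and the sketch only assumes that the field holds the address verbatim. It shows why such allocations need an explicit address-width limit: anything at or above 4GB cannot be stored without losing bits.

```c
#include <stdint.h>

/*
 * Hypothetical: store an address in a 32-bit field.
 * Returns 0 on success, -1 if the address does not fit (at or above 4GB).
 */
static int pickle_addr32(uint64_t addr, uint32_t *field)
{
    if (addr > UINT32_MAX)
        return -1;              /* would be truncated */
    *field = (uint32_t)addr;
    return 0;
}

/* Hypothetical: recover the full-width address from the 32-bit field. */
static uint64_t unpickle_addr32(uint32_t field)
{
    return (uint64_t)field;
}
```

An allocator that accepts an address-width parameter would make the `-1` case impossible by construction, which is presumably the point of making the requirement explicit.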


* Re: x86 swiotlb questions
  2006-12-15 13:35 ` Keir Fraser
  2006-12-15 13:53   ` Jan Beulich
  2006-12-15 16:19   ` Alan
@ 2006-12-18  7:44   ` Jan Beulich
  2006-12-18  9:39     ` Keir Fraser
  2 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2006-12-18  7:44 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

>> - What is the purpose of using alloc_bootmem_low variants here? I.e., where is
>> the dependency on physical addresses being below 4G here (machine addresses
>are
>> being restricted after the allocation anyway)? The panic message text after
>> the
>> failed allocation is confusing me additionally.
>
>This is how it's always been since we took ia64's swiotlb.c.

Okay, then I must have forgotten what it looked like. However, the
specific panic message has a Xen-specific addition, so I still wonder what its
background is...

>> - While I can see the idea behind the overflow buffer, it doesn't seem to
>> prevent
>> data corruption, and if I understand it correctly it doesn't even prevent
>> memory
>> corruption (since its machine address doesn't get restricted anywhere, so the
>> fall
>> back return value would not necessarily meet the device requirements).
>
>Same here. We didn't implement this. It doesn't seem to make that much
>sense. Sync'ing with lib/swiotlb.c and throwing away our special one would be
>very nice. :-)

Trying to do that I find one extra issue: in_swiotlb_aperture() does its check
based on pfn, while lib/swiotlb.c uses the virtual address in the respective
checks instead. Is there some subtlety behind that (that then should be
commented upon), or is this just due to this originally having been an
mfn-based check?

Thanks, Jan


* Re: x86 swiotlb questions
  2006-12-18  7:44   ` Jan Beulich
@ 2006-12-18  9:39     ` Keir Fraser
  2006-12-19 12:48       ` Jan Beulich
  2006-12-20 16:40       ` Jan Beulich
  0 siblings, 2 replies; 29+ messages in thread
From: Keir Fraser @ 2006-12-18  9:39 UTC (permalink / raw)
  To: Jan Beulich, Keir Fraser; +Cc: xen-devel

On 18/12/06 7:44 am, "Jan Beulich" <jbeulich@novell.com> wrote:

>> Same here. We didn't implement this. It doesn't seem to make that much
>> sense. Sync'ing with lib/swiotlb.c and throwing away our special one would be
>> very nice. :-)
> 
> Trying to do that I find one extra issue: in_swiotlb_aperture() does its check
> based on pfn, while lib/swiotlb.c uses the virtual address in the respective
> checks instead. Is there some subtlety behind that (that then should be
> commented upon), or is this just due to this originally having been an
> mfn-based check?

Yes, it's because we used to do an mfn range check which was okay when the
swiotlb aperture was filled with contiguous machine memory. Since it is
composed of discontiguous slabs now, we changed to a pfn check but that
could equally well be a virtual-address check.

Do we merge okay with lib/swiotlb.c then? One concern I had was with our
preferred setup semantics -- we really want the user to be able to forcibly
enable the swiotlb via a boot parameter *but* not have to suffer using it
for every DMA operation. Last I looked the generic swiotlb didn't have that
option. That and our very Xen-specific checks for whether to auto-enable the
swiotlb led me to think that the very start-of-day setup of swiotlb would
need to be overridable by architecture.

 -- Keir
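
As a rough illustration of the virtual-address form of the aperture check mentioned above, here is a minimal sketch in C. The `struct aperture` type and `in_aperture_virt` are hypothetical names, not the actual `in_swiotlb_aperture()` implementation. The assumption is only that the aperture is contiguous in kernel virtual address space even when the machine memory behind it consists of discontiguous slabs, so a simple range test on the virtual address is enough.

```c
#include <stdint.h>

/* Hypothetical description of the bounce-buffer aperture: [start, end). */
struct aperture {
    const char *start;
    const char *end;
};

/* Returns 1 if p points inside the aperture, 0 otherwise. */
static int in_aperture_virt(const struct aperture *ap, const void *p)
{
    uintptr_t a = (uintptr_t)p;

    return a >= (uintptr_t)ap->start && a < (uintptr_t)ap->end;
}
```

A pfn-based check is the same test applied to the addresses shifted right by the page shift, which is why the two forms are interchangeable as long as the aperture is page-aligned.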


* Re: x86 swiotlb questions
  2006-12-18  9:39     ` Keir Fraser
@ 2006-12-19 12:48       ` Jan Beulich
  2006-12-19 14:14         ` Keir Fraser
  2006-12-20 16:40       ` Jan Beulich
  1 sibling, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2006-12-19 12:48 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

>Do we merge okay with lib/swiotlb.c then?

Not yet - because of the highmem handling needed for i386. I wonder, however,
how native Linux gets away with not handling this through swiotlb, and why
nevertheless Xen needs to special case this. Any ideas?

Jan


* Re: x86 swiotlb questions
  2006-12-19 12:48       ` Jan Beulich
@ 2006-12-19 14:14         ` Keir Fraser
  2006-12-19 14:39           ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2006-12-19 14:14 UTC (permalink / raw)
  To: Jan Beulich, Keir Fraser; +Cc: xen-devel




On 19/12/06 12:48, "Jan Beulich" <jbeulich@novell.com> wrote:

>> Do we merge okay with lib/swiotlb.c then?
> 
> Not yet - because of the highmem handling needed for i386. I wonder, however,
> how native Linux gets away with not handling this through swiotlb, and why
> nevertheless Xen needs to special case this. Any ideas?

Probably because GFP_KERNEL and GFP_DMA allocations are guaranteed to be
DMAable by 30-bit-capable devices on native, but not on Xen.

 -- Keir


* Re: x86 swiotlb questions
  2006-12-19 14:14         ` Keir Fraser
@ 2006-12-19 14:39           ` Jan Beulich
  2006-12-19 14:46             ` Keir Fraser
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2006-12-19 14:39 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

>>> Keir Fraser <keir@xensource.com> 19.12.06 15:14 >>>
>On 19/12/06 12:48, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>>> Do we merge okay with lib/swiotlb.c then?
>> 
>> Not yet - because of the highmem handling needed for i386. I wonder, however,
>> how native Linux gets away with not handling this through swiotlb, and why
>> nevertheless Xen needs to special case this. Any ideas?
>
>Probably because GFP_KERNEL and GFP_DMA allocations are guaranteed to be
>DMAable by 30-bit-capable devices on native, but not on Xen.

Not sure I understand your thinking here. Nothing prevents user pages (or anything
else that I/O may happen against) from coming from highmem, hence the bounce logic
in mm/highmem.c needs to control this anyway (as I understand it). And since all
we're talking about here are physical addresses (and their translations to virtual
ones), I would rather conclude that we'll never see a page in the I/O path that
page_address() would return NULL for, but if that's the case, then there's no need
to kmap such pages or to favor page_to_bus() over virt_to_bus(page_address()).

Jan


* Re: x86 swiotlb questions
  2006-12-19 14:39           ` Jan Beulich
@ 2006-12-19 14:46             ` Keir Fraser
  2006-12-19 17:07               ` Muli Ben-Yehuda
  0 siblings, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2006-12-19 14:46 UTC (permalink / raw)
  To: Jan Beulich, Keir Fraser; +Cc: xen-devel


One thing is we change the value of PCI_BUS_IS_PHYS (or some similar macro
which I can't quite remember the name of) which I believe turns off some
bounce-buffer logic contained within the block-device subsystem. So that
will mean that we get highmem requests hitting the DMA interfaces, where on
native they would have got filtered earlier by the highmem/lowmem bounce
buffer logic that is specific to block-device requests.

Obviously we want to turn off that lowmem/highmem bounce buffer logic on Xen
as it is nonsense when there is no direct correspondence to low/high machine
addresses.

 -- Keir

On 19/12/06 14:39, "Jan Beulich" <jbeulich@novell.com> wrote:

>>> Not yet - because of the highmem handling needed for i386. I wonder,
>>> however,
>>> how native Linux gets away with not handling this through swiotlb, and why
>>> nevertheless Xen needs to special case this. Any ideas?
>> 
>> Probably because GFP_KERNEL and GFP_DMA allocations are guaranteed to be
>> DMAable by 30-bit-capable devices on native, but not on Xen.
> 
> Not sure I understand your thinking here. Nothing prevents user pages (or
> anything
> else that I/O may happen against) to come from highmem, hence the bounce logic
> in mm/highmem.c needs to control this anyway (as I understand it). And since
> all
> we're talking about here are physical addresses (and their translations to
> virtual
> ones), I would rather conclude that we'll never see a page in the I/O path
> that
> page_address() would return NULL for, but if that's the case, then there's no
> need
> to kmap such pages or to favor page_to_bus() over virt_to_bus(page_address()).


* Re: x86 swiotlb questions
  2006-12-19 14:46             ` Keir Fraser
@ 2006-12-19 17:07               ` Muli Ben-Yehuda
  0 siblings, 0 replies; 29+ messages in thread
From: Muli Ben-Yehuda @ 2006-12-19 17:07 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Jan Beulich

On Tue, Dec 19, 2006 at 02:46:41PM +0000, Keir Fraser wrote:
> 
> One thing is we change the value of PCI_BUS_IS_PHYS (or some similar macro
> which I can't quite remember the name of) which I believe turns off some
> bounce-buffer logic contained within the block-device subsystem. So that
> will mean that we get highmem requests hitting the DMA interfaces, where on
> native they would have got filtered earlier by the highmem/lowmem bounce
> buffer logic that is specific to block-device requests.

PCI_DMA_BUS_IS_PHYS, commonly found in asm-$(arch)/pci.h.
 
Cheers,
Muli


* Re: x86 swiotlb questions
  2006-12-18  9:39     ` Keir Fraser
  2006-12-19 12:48       ` Jan Beulich
@ 2006-12-20 16:40       ` Jan Beulich
  1 sibling, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2006-12-20 16:40 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Muli Ben-Yehuda, xen-devel

[-- Attachment #1: Type: text/plain, Size: 976 bytes --]

>Do we merge okay with lib/swiotlb.c then? One concern I had was with our
>preferred setup semantics -- we really want the user to be able to forcibly
>enable the swiotlb via a boot parameter *but* not have to suffer using it
>for every DMA operation. Last I looked the generic swiotlb didn't have that
>option. That and our very Xen-specific checks for whether to auto-enable the
>swiotlb led me to think that the very start-of-day setup of swiotlb would
>need to be overridable by architecture.

I think I retained all of the semantics, attached the patches as I have them
by now. This is a submission for review only, as the first four patches will
need to go to kernel.org (and hopefully will get accepted). The Xen
customization is fairly ugly, but I didn't see anything nicer than that while
also keeping the amount of changes to lib/swiotlb.c reasonable.

Patch order is
swiotlb-bugs.patch
swiotlb-bus.patch
swiotlb-cleanup.patch
swiotlb-split.patch
xen-swiotlb.patch
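
The customization approach described above, where generic code supplies defaults that an architecture may replace, can be sketched roughly as follows. This is a simplified illustration, not the patch itself: `ARCH_HAS_SYNC_SINGLE`, `arch_sync_single` and `sync_single_sketch` are hypothetical names. The assumption is that an architecture header included earlier may define the guard macro and provide its own function, in which case the generic default is compiled out.

```c
#include <stddef.h>
#include <string.h>

/*
 * Generic default. An architecture header (hypothetical) could define
 * ARCH_HAS_SYNC_SINGLE and supply its own arch_sync_single() instead.
 */
#ifndef ARCH_HAS_SYNC_SINGLE
static void arch_sync_single(char *orig, char *bounce, size_t size,
                             int to_device)
{
    if (to_device)
        memcpy(bounce, orig, size);   /* CPU data goes out to the device */
    else
        memcpy(orig, bounce, size);   /* device data comes back to the CPU */
}
#endif

/* Generic caller: unchanged whether or not the default was overridden. */
static void sync_single_sketch(char *orig, char *bounce, size_t size,
                               int to_device)
{
    arch_sync_single(orig, bounce, size, to_device);
}
```

The appeal of this pattern is that callers in the generic file stay untouched; the cost is that the override points are only discoverable by reading the preprocessor guards.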


[-- Attachment #2: swiotlb-split.patch --]
[-- Type: text/plain, Size: 12706 bytes --]

This patch adds abstraction so that the file can be used by environments other
than IA64 and EM64T, namely for Xen.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: sle10-sp1-2006-12-18/include/asm-ia64/swiotlb.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sle10-sp1-2006-12-18/include/asm-ia64/swiotlb.h	2006-12-20 15:54:48.000000000 +0100
@@ -0,0 +1,8 @@
+#ifndef _ASM_SWIOTLB_H
+#define _ASM_SWIOTLB_H 1
+
+#include <asm/machvec.h>
+
+#define SWIOTLB_ARCH_WANT_LATE_INIT
+
+#endif /* _ASM_SWIOTLB_H */
Index: sle10-sp1-2006-12-18/lib/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-18.orig/lib/swiotlb.c	2006-12-20 12:39:41.000000000 +0100
+++ sle10-sp1-2006-12-18/lib/swiotlb.c	2006-12-20 15:35:01.000000000 +0100
@@ -28,6 +28,7 @@
 #include <asm/io.h>
 #include <asm/dma.h>
 #include <asm/scatterlist.h>
+#include <asm/swiotlb.h>
 
 #include <linux/init.h>
 #include <linux/bootmem.h>
@@ -36,7 +37,9 @@
 	                   ( (val) & ( (align) - 1)))
 
 #define SG_ENT_VIRT_ADDRESS(sg)	(page_address((sg)->page) + (sg)->offset)
+#ifndef SG_ENT_PHYS_ADDRESS
 #define SG_ENT_PHYS_ADDRESS(sg)	virt_to_bus(SG_ENT_VIRT_ADDRESS(sg))
+#endif
 
 /*
  * Maximum allowable number of contiguous slabs to map,
@@ -101,13 +104,25 @@ static unsigned int io_tlb_index;
  * We need to save away the original address corresponding to a mapped entry
  * for the sync operations.
  */
-static unsigned char **io_tlb_orig_addr;
+#ifndef SWIOTLB_ARCH_HAS_IO_TLB_ADDR_T
+typedef char *io_tlb_addr_t;
+#define swiotlb_orig_addr_null(buffer) (!(buffer))
+#define ptr_to_io_tlb_addr(ptr) (ptr)
+#define page_to_io_tlb_addr(pg, off) (page_address(pg) + (off))
+#define sg_to_io_tlb_addr(sg) SG_ENT_VIRT_ADDRESS(sg)
+#endif
+static io_tlb_addr_t *io_tlb_orig_addr;
 
 /*
  * Protect the above data structures in the map and unmap calls
  */
 static DEFINE_SPINLOCK(io_tlb_lock);
 
+#ifdef SWIOTLB_EXTRA_VARIABLES
+SWIOTLB_EXTRA_VARIABLES;
+#endif
+
+#ifndef SWIOTLB_ARCH_HAS_SETUP_IO_TLB_NPAGES
 static int __init
 setup_io_tlb_npages(char *str)
 {
@@ -122,9 +137,25 @@ setup_io_tlb_npages(char *str)
 		swiotlb_force = 1;
 	return 1;
 }
+#endif
 __setup("swiotlb=", setup_io_tlb_npages);
 /* make io_tlb_overflow tunable too? */
 
+#ifndef swiotlb_adjust_size
+#define swiotlb_adjust_size(size) ((void)0)
+#endif
+
+#ifndef swiotlb_adjust_seg
+#define swiotlb_adjust_seg(start, size) ((void)0)
+#endif
+
+#ifndef swiotlb_print_info
+#define swiotlb_print_info(bytes) \
+	printk(KERN_INFO "Placing %luMB software IO TLB between 0x%lx - " \
+	       "0x%lx\n", bytes >> 20, \
+	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end))
+#endif
+
 /*
  * Statically reserve bounce buffer space and initialize bounce buffer data
  * structures for the software IO TLB used to implement the DMA API.
@@ -138,6 +169,8 @@ swiotlb_init_with_default_size(size_t de
 		io_tlb_nslabs = (default_size >> IO_TLB_SHIFT);
 		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
 	}
+	swiotlb_adjust_size(io_tlb_nslabs);
+	swiotlb_adjust_size(io_tlb_overflow);
 
 	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
@@ -155,25 +188,33 @@ swiotlb_init_with_default_size(size_t de
 	 * between io_tlb_start and io_tlb_end.
 	 */
 	io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int));
-	for (i = 0; i < io_tlb_nslabs; i++)
+	for (i = 0; i < io_tlb_nslabs; i++) {
+		if ( !(i % IO_TLB_SEGSIZE) )
+			swiotlb_adjust_seg(io_tlb_start + (i << IO_TLB_SHIFT),
+				IO_TLB_SEGSIZE << IO_TLB_SHIFT);
  		io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+ 	}
 	io_tlb_index = 0;
-	io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(char *));
+	io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(io_tlb_addr_t));
 
 	/*
 	 * Get the overflow emergency buffer
 	 */
 	io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow);
-	printk(KERN_INFO "Placing software IO TLB between 0x%lx - 0x%lx\n",
-	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
+	swiotlb_adjust_seg(io_tlb_overflow_buffer, io_tlb_overflow);
+	swiotlb_print_info(bytes);
 }
+#ifndef __swiotlb_init_with_default_size
+#define __swiotlb_init_with_default_size swiotlb_init_with_default_size
+#endif
 
 void __init
 swiotlb_init(void)
 {
-	swiotlb_init_with_default_size(64 * (1<<20));	/* default to 64MB */
+	__swiotlb_init_with_default_size(64 * (1<<20));	/* default to 64MB */
 }
 
+#ifdef SWIOTLB_ARCH_WANT_LATE_INIT
 /*
  * Systems with larger DMA zones (those that don't support ISA) can
  * initialize the swiotlb later using the slab allocator if needed.
@@ -231,12 +272,12 @@ swiotlb_late_init_with_default_size(size
  		io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
 	io_tlb_index = 0;
 
-	io_tlb_orig_addr = (unsigned char **)__get_free_pages(GFP_KERNEL,
-	                           get_order(io_tlb_nslabs * sizeof(char *)));
+	io_tlb_orig_addr = (io_tlb_addr_t *)__get_free_pages(GFP_KERNEL,
+	                           get_order(io_tlb_nslabs * sizeof(io_tlb_addr_t)));
 	if (!io_tlb_orig_addr)
 		goto cleanup3;
 
-	memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(char *));
+	memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(io_tlb_addr_t));
 
 	/*
 	 * Get the overflow emergency buffer
@@ -246,19 +287,17 @@ swiotlb_late_init_with_default_size(size
 	if (!io_tlb_overflow_buffer)
 		goto cleanup4;
 
-	printk(KERN_INFO "Placing %luMB software IO TLB between 0x%lx - "
-	       "0x%lx\n", bytes >> 20,
-	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
+	swiotlb_print_info(bytes);
 
 	return 0;
 
 cleanup4:
-	free_pages((unsigned long)io_tlb_orig_addr, get_order(io_tlb_nslabs *
-	                                                      sizeof(char *)));
+	free_pages((unsigned long)io_tlb_orig_addr,
+		   get_order(io_tlb_nslabs * sizeof(io_tlb_addr_t)));
 	io_tlb_orig_addr = NULL;
 cleanup3:
-	free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
-	                                                 sizeof(int)));
+	free_pages((unsigned long)io_tlb_list,
+		   get_order(io_tlb_nslabs * sizeof(int)));
 	io_tlb_list = NULL;
 cleanup2:
 	io_tlb_end = NULL;
@@ -268,7 +307,9 @@ cleanup1:
 	io_tlb_nslabs = req_nslabs;
 	return -ENOMEM;
 }
+#endif
 
+#ifndef SWIOTLB_ARCH_HAS_NEEDS_MAPPING
 static inline int
 address_needs_mapping(struct device *hwdev, dma_addr_t addr)
 {
@@ -279,11 +320,35 @@ address_needs_mapping(struct device *hwd
 	return (addr & ~mask) != 0;
 }
 
+static inline int range_needs_mapping(const void *ptr, size_t size)
+{
+	return swiotlb_force;
+}
+
+static inline int order_needs_mapping(unsigned int order)
+{
+	return 0;
+}
+#endif
+
+static void
+__sync_single(io_tlb_addr_t buffer, char *dma_addr, size_t size, int dir)
+{
+#ifndef SWIOTLB_ARCH_HAS_SYNC_SINGLE
+	if (dir == DMA_TO_DEVICE)
+		memcpy(dma_addr, buffer, size);
+	else
+		memcpy(buffer, dma_addr, size);
+#else
+	__swiotlb_arch_sync_single(buffer, dma_addr, size, dir);
+#endif
+}
+
 /*
  * Allocates bounce buffer and returns its kernel virtual address.
  */
 static void *
-map_single(struct device *hwdev, char *buffer, size_t size, int dir)
+map_single(struct device *hwdev, io_tlb_addr_t buffer, size_t size, int dir)
 {
 	unsigned long flags;
 	char *dma_addr;
@@ -357,7 +422,7 @@ map_single(struct device *hwdev, char *b
 	 */
 	io_tlb_orig_addr[index] = buffer;
 	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
-		memcpy(dma_addr, buffer, size);
+		__sync_single(buffer, dma_addr, size, DMA_TO_DEVICE);
 
 	return dma_addr;
 }
@@ -371,17 +436,18 @@ unmap_single(struct device *hwdev, char 
 	unsigned long flags;
 	int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
 	int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT;
-	char *buffer = io_tlb_orig_addr[index];
+	io_tlb_addr_t buffer = io_tlb_orig_addr[index];
 
 	/*
 	 * First, sync the memory before unmapping the entry
 	 */
-	if (buffer && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
+	if (!swiotlb_orig_addr_null(buffer)
+	    && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
 		/*
 		 * bounce... copy the data back into the original buffer * and
 		 * delete the bounce buffer.
 		 */
-		memcpy(buffer, dma_addr, size);
+		__sync_single(buffer, dma_addr, size, DMA_FROM_DEVICE);
 
 	/*
 	 * Return the buffer to the free list by setting the corresponding
@@ -414,18 +480,18 @@ sync_single(struct device *hwdev, char *
 	    int dir, int target)
 {
 	int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT;
-	char *buffer = io_tlb_orig_addr[index];
+	io_tlb_addr_t buffer = io_tlb_orig_addr[index];
 
 	switch (target) {
 	case SYNC_FOR_CPU:
 		if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL))
-			memcpy(buffer, dma_addr, size);
+			__sync_single(buffer, dma_addr, size, DMA_FROM_DEVICE);
 		else if (dir != DMA_TO_DEVICE)
 			BUG();
 		break;
 	case SYNC_FOR_DEVICE:
 		if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
-			memcpy(dma_addr, buffer, size);
+			__sync_single(buffer, dma_addr, size, DMA_TO_DEVICE);
 		else if (dir != DMA_FROM_DEVICE)
 			BUG();
 		break;
@@ -449,7 +515,10 @@ swiotlb_alloc_coherent(struct device *hw
 	 */
 	flags |= GFP_DMA;
 
-	ret = (void *)__get_free_pages(flags, order);
+	if (!order_needs_mapping(order))
+		ret = (void *)__get_free_pages(flags, order);
+	else
+		ret = NULL;
 	if (ret && address_needs_mapping(hwdev, virt_to_bus(ret))) {
 		/*
 		 * The allocated memory isn't reachable by the device.
@@ -536,18 +605,20 @@ swiotlb_map_single(struct device *hwdev,
 
 	if (dir == DMA_NONE)
 		BUG();
+
 	/*
 	 * If the pointer passed in happens to be in the device's DMA window,
 	 * we can safely return the device addr and not worry about bounce
 	 * buffering it.
 	 */
-	if (!address_needs_mapping(hwdev, dev_addr) && !swiotlb_force)
+	if (!range_needs_mapping(ptr, size)
+	    && !address_needs_mapping(hwdev, dev_addr))
 		return dev_addr;
 
 	/*
 	 * Oh well, have to allocate and map a bounce buffer.
 	 */
-	map = map_single(hwdev, ptr, size, dir);
+	map = map_single(hwdev, ptr_to_io_tlb_addr(ptr), size, dir);
 	if (!map) {
 		swiotlb_full(hwdev, size, dir, 1);
 		map = io_tlb_overflow_buffer;
@@ -688,8 +759,9 @@ swiotlb_map_sg(struct device *hwdev, str
 	for (i = 0; i < nelems; i++, sg++) {
 		addr = SG_ENT_VIRT_ADDRESS(sg);
 		dev_addr = virt_to_bus(addr);
-		if (swiotlb_force || address_needs_mapping(hwdev, dev_addr)) {
-			void *map = map_single(hwdev, addr, sg->length, dir);
+		if (range_needs_mapping(addr, sg->length)
+		    || address_needs_mapping(hwdev, dev_addr)) {
+			void *map = map_single(hwdev, sg_to_io_tlb_addr(sg), sg->length, dir);
 			if (!map) {
 				/* Don't panic here, we expect map_sg users
 				   to do proper error handling. */
@@ -765,6 +837,46 @@ swiotlb_sync_sg_for_device(struct device
 	swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE);
 }
 
+#ifdef SWIOTLB_ARCH_NEED_MAP_PAGE
+
+dma_addr_t
+swiotlb_map_page(struct device *hwdev, struct page *page,
+		 unsigned long offset, size_t size,
+		 enum dma_data_direction direction)
+{
+	dma_addr_t dev_addr;
+	char *map;
+
+	dev_addr = page_to_bus(page) + offset;
+	if (address_needs_mapping(hwdev, dev_addr)) {
+		map = map_single(hwdev, page_to_io_tlb_addr(page, offset), size, direction);
+		if (!map) {
+			swiotlb_full(hwdev, size, direction, 1);
+			map = io_tlb_overflow_buffer;
+		}
+		dev_addr = virt_to_bus(map);
+	}
+
+	return dev_addr;
+}
+EXPORT_SYMBOL(swiotlb_map_page);
+
+void
+swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
+		   size_t size, enum dma_data_direction direction)
+{
+	char *dma_addr = bus_to_virt(dev_addr);
+
+	BUG_ON(direction == DMA_NONE);
+	if (dma_addr >= io_tlb_start && dma_addr < io_tlb_end)
+		unmap_single(hwdev, dma_addr, size, direction);
+	else if (direction == DMA_FROM_DEVICE)
+		dma_mark_clean(dma_addr, size);
+}
+EXPORT_SYMBOL(swiotlb_unmap_page);
+
+#endif
+
 int
 swiotlb_dma_mapping_error(dma_addr_t dma_addr)
 {
@@ -780,7 +892,11 @@ swiotlb_dma_mapping_error(dma_addr_t dma
 int
 swiotlb_dma_supported (struct device *hwdev, u64 mask)
 {
+#ifndef __swiotlb_dma_supported
 	return (virt_to_bus(io_tlb_end) - 1) <= mask;
+#else
+	return __swiotlb_dma_supported(hwdev, mask);
+#endif
 }
 
 EXPORT_SYMBOL(swiotlb_init);

[-- Attachment #3: swiotlb-bus.patch --]
[-- Type: text/plain, Size: 5241 bytes --]

Convert all phys_to_virt/virt_to_phys uses to bus_to_virt/virt_to_bus.
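
Why the distinction matters can be shown with a small user-space model. This
is only a sketch: the model_* names and the frame table are hypothetical and
are not kernel APIs. It assumes an environment (such as a Xen guest) where the
address a device sees differs from the address the kernel calls "physical".

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_NR_FRAMES  4

/* Hypothetical pseudo-physical -> machine frame table (illustration only). */
static const uint64_t model_p2m[MODEL_NR_FRAMES] = {
	0x00010, 0x40000, 0x00011, 0x80000
};

/* The address as the kernel sees it; offset is assumed to be below page size. */
static uint64_t model_virt_to_phys(uint64_t pfn, uint64_t offset)
{
	return (pfn << MODEL_PAGE_SHIFT) | offset;
}

/* The address a device would have to put on the bus in this model. */
static uint64_t model_virt_to_bus(uint64_t pfn, uint64_t offset)
{
	assert(pfn < MODEL_NR_FRAMES);
	return (model_p2m[pfn] << MODEL_PAGE_SHIFT) | offset;
}

/* Same test as address_needs_mapping(): is any bit set outside the mask? */
static int model_needs_mapping(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) != 0;
}
```

With a 30-bit mask, frame 1 passes the check when the "physical" address is
tested but fails it when the bus address is tested, which is the case the
conversion is meant to get right.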

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: sle10-sp1-2006-12-18/lib/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-18.orig/lib/swiotlb.c	2006-12-20 12:02:01.000000000 +0100
+++ sle10-sp1-2006-12-18/lib/swiotlb.c	2006-12-20 12:09:03.000000000 +0100
@@ -36,7 +36,7 @@
 	                   ( (val) & ( (align) - 1)))
 
 #define SG_ENT_VIRT_ADDRESS(sg)	(page_address((sg)->page) + (sg)->offset)
-#define SG_ENT_PHYS_ADDRESS(SG)	virt_to_phys(SG_ENT_VIRT_ADDRESS(SG))
+#define SG_ENT_PHYS_ADDRESS(sg)	virt_to_bus(SG_ENT_VIRT_ADDRESS(sg))
 
 /*
  * Maximum allowable number of contiguous slabs to map,
@@ -163,7 +163,7 @@ swiotlb_init_with_default_size (size_t d
 	 */
 	io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow);
 	printk(KERN_INFO "Placing software IO TLB between 0x%lx - 0x%lx\n",
-	       virt_to_phys(io_tlb_start), virt_to_phys(io_tlb_end));
+	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
 }
 
 void
@@ -244,7 +244,7 @@ swiotlb_late_init_with_default_size (siz
 
 	printk(KERN_INFO "Placing %ldMB software IO TLB between 0x%lx - "
 	       "0x%lx\n", (io_tlb_nslabs * (1 << IO_TLB_SHIFT)) >> 20,
-	       virt_to_phys(io_tlb_start), virt_to_phys(io_tlb_end));
+	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
 
 	return 0;
 
@@ -446,7 +446,7 @@ swiotlb_alloc_coherent(struct device *hw
 	flags |= GFP_DMA;
 
 	ret = (void *)__get_free_pages(flags, order);
-	if (ret && address_needs_mapping(hwdev, virt_to_phys(ret))) {
+	if (ret && address_needs_mapping(hwdev, virt_to_bus(ret))) {
 		/*
 		 * The allocated memory isn't reachable by the device.
 		 * Fall back on swiotlb_map_single().
@@ -466,11 +466,11 @@ swiotlb_alloc_coherent(struct device *hw
 		if (swiotlb_dma_mapping_error(handle))
 			return NULL;
 
-		ret = phys_to_virt(handle);
+		ret = bus_to_virt(handle);
 	}
 
 	memset(ret, 0, size);
-	dev_addr = virt_to_phys(ret);
+	dev_addr = virt_to_bus(ret);
 
 	/* Confirm address can be DMA'd by device */
 	if (address_needs_mapping(hwdev, dev_addr)) {
@@ -526,7 +526,7 @@ swiotlb_full(struct device *dev, size_t 
 dma_addr_t
 swiotlb_map_single(struct device *hwdev, void *ptr, size_t size, int dir)
 {
-	unsigned long dev_addr = virt_to_phys(ptr);
+	unsigned long dev_addr = virt_to_bus(ptr);
 	void *map;
 
 	if (dir == DMA_NONE)
@@ -548,7 +548,7 @@ swiotlb_map_single(struct device *hwdev,
 		map = io_tlb_overflow_buffer;
 	}
 
-	dev_addr = virt_to_phys(map);
+	dev_addr = virt_to_bus(map);
 
 	/*
 	 * Ensure that the address returned is DMA'ble
@@ -571,7 +571,7 @@ void
 swiotlb_unmap_single(struct device *hwdev, dma_addr_t dev_addr, size_t size,
 		     int dir)
 {
-	char *dma_addr = phys_to_virt(dev_addr);
+	char *dma_addr = bus_to_virt(dev_addr);
 
 	if (dir == DMA_NONE)
 		BUG();
@@ -595,7 +595,7 @@ static inline void
 swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
 		    size_t size, int dir, int target)
 {
-	char *dma_addr = phys_to_virt(dev_addr);
+	char *dma_addr = bus_to_virt(dev_addr);
 
 	if (dir == DMA_NONE)
 		BUG();
@@ -627,7 +627,7 @@ swiotlb_sync_single_range(struct device 
 			  unsigned long offset, size_t size,
 			  int dir, int target)
 {
-	char *dma_addr = phys_to_virt(dev_addr) + offset;
+	char *dma_addr = bus_to_virt(dev_addr) + offset;
 
 	if (dir == DMA_NONE)
 		BUG();
@@ -682,7 +682,7 @@ swiotlb_map_sg(struct device *hwdev, str
 
 	for (i = 0; i < nelems; i++, sg++) {
 		addr = SG_ENT_VIRT_ADDRESS(sg);
-		dev_addr = virt_to_phys(addr);
+		dev_addr = virt_to_bus(addr);
 		if (swiotlb_force || address_needs_mapping(hwdev, dev_addr)) {
 			void *map = map_single(hwdev, addr, sg->length, dir);
 			if (!map) {
@@ -716,7 +716,8 @@ swiotlb_unmap_sg(struct device *hwdev, s
 
 	for (i = 0; i < nelems; i++, sg++)
 		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			unmap_single(hwdev, (void *) phys_to_virt(sg->dma_address), sg->dma_length, dir);
+			unmap_single(hwdev, bus_to_virt(sg->dma_address),
+				     sg->dma_length, dir);
 		else if (dir == DMA_FROM_DEVICE)
 			dma_mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
 }
@@ -739,7 +740,7 @@ swiotlb_sync_sg(struct device *hwdev, st
 
 	for (i = 0; i < nelems; i++, sg++)
 		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			sync_single(hwdev, phys_to_virt(sg->dma_address),
+			sync_single(hwdev, bus_to_virt(sg->dma_address),
 				    sg->dma_length, dir, target);
 		else if (dir == DMA_FROM_DEVICE)
 			dma_mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
@@ -762,7 +763,7 @@ swiotlb_sync_sg_for_device(struct device
 int
 swiotlb_dma_mapping_error(dma_addr_t dma_addr)
 {
-	return (dma_addr == virt_to_phys(io_tlb_overflow_buffer));
+	return (dma_addr == virt_to_bus(io_tlb_overflow_buffer));
 }
 
 /*
@@ -774,7 +775,7 @@ swiotlb_dma_mapping_error(dma_addr_t dma
 int
 swiotlb_dma_supported (struct device *hwdev, u64 mask)
 {
-	return (virt_to_phys (io_tlb_end) - 1) <= mask;
+	return (virt_to_bus(io_tlb_end) - 1) <= mask;
 }
 
 EXPORT_SYMBOL(swiotlb_init);

[-- Attachment #4: swiotlb-cleanup.patch --]
[-- Type: text/plain, Size: 7532 bytes --]

This patch
- adds proper __init decoration to swiotlb's init code (and the code calling
  it, where not already the case)
- replaces uses of 'unsigned long' with dma_addr_t where appropriate
- does miscellaneous simplification and cleanup
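
One of the simplifications is computing the buffer size once as
"nslabs << IO_TLB_SHIFT". A user-space sketch of that arithmetic follows. The
model_* helpers are hypothetical, and the two constants are the values quoted
from the Xen copy of swiotlb.c elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define IO_TLB_SHIFT   11	/* log2 of the slab size in bytes */
#define IO_TLB_SEGSIZE 128	/* slabs per contiguous segment */
#define MODEL_ALIGN(x, a) (((x) + (a) - 1) & ~((unsigned long)(a) - 1))

/* Number of slabs for a requested default size, rounded up to a segment. */
static unsigned long model_nslabs(size_t default_size)
{
	unsigned long nslabs = default_size >> IO_TLB_SHIFT;

	return MODEL_ALIGN(nslabs, IO_TLB_SEGSIZE);
}

/* Total buffer size in bytes for a given slab count. */
static unsigned long model_bytes(unsigned long nslabs)
{
	/* Guard against the shift overflowing unsigned long. */
	assert(nslabs <= (~0UL >> IO_TLB_SHIFT));
	return nslabs << IO_TLB_SHIFT;
}
```

For the 64MB default this gives 32768 slabs of 2KB each, and shifting back
yields the same 64MB, so the single "bytes" variable can stand in for every
repeated "io_tlb_nslabs * (1 << IO_TLB_SHIFT)" expression.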

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: sle10-sp1-2006-12-18/arch/ia64/mm/init.c
===================================================================
--- sle10-sp1-2006-12-18.orig/arch/ia64/mm/init.c	2006-12-20 12:02:01.000000000 +0100
+++ sle10-sp1-2006-12-18/arch/ia64/mm/init.c	2006-12-20 12:09:22.000000000 +0100
@@ -586,7 +586,7 @@ nolwsys_setup (char *s)
 
 __setup("nolwsys", nolwsys_setup);
 
-void
+void __init
 mem_init (void)
 {
 	long reserved_pages, codesize, datasize, initsize;
Index: sle10-sp1-2006-12-18/arch/x86_64/kernel/pci-swiotlb.c
===================================================================
--- sle10-sp1-2006-12-18.orig/arch/x86_64/kernel/pci-swiotlb.c	2006-12-20 11:51:05.000000000 +0100
+++ sle10-sp1-2006-12-18/arch/x86_64/kernel/pci-swiotlb.c	2006-12-20 12:09:22.000000000 +0100
@@ -28,7 +28,7 @@ struct dma_mapping_ops swiotlb_dma_ops =
 	.dma_supported = NULL,
 };
 
-void pci_swiotlb_init(void)
+void __init pci_swiotlb_init(void)
 {
 	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
 	if (!iommu_detected && !no_iommu &&
Index: sle10-sp1-2006-12-18/lib/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-18.orig/lib/swiotlb.c	2006-12-20 12:09:03.000000000 +0100
+++ sle10-sp1-2006-12-18/lib/swiotlb.c	2006-12-20 12:39:41.000000000 +0100
@@ -1,7 +1,7 @@
 /*
  * Dynamic DMA mapping support.
  *
- * This implementation is for IA-64 and EM64T platforms that do not support
+ * This implementation is a fallback for platforms that do not support
  * I/O TLBs (aka DMA address translation hardware).
  * Copyright (C) 2000 Asit Mallick <Asit.K.Mallick@intel.com>
  * Copyright (C) 2000 Goutham Rao <goutham.rao@intel.com>
@@ -68,7 +68,7 @@ enum dma_sync_target {
 	SYNC_FOR_DEVICE = 1,
 };
 
-int swiotlb_force;
+static int swiotlb_force;
 
 /*
  * Used to do a quick range check in swiotlb_unmap_single and
@@ -129,23 +129,25 @@ __setup("swiotlb=", setup_io_tlb_npages)
  * Statically reserve bounce buffer space and initialize bounce buffer data
  * structures for the software IO TLB used to implement the DMA API.
  */
-void
-swiotlb_init_with_default_size (size_t default_size)
+void __init
+swiotlb_init_with_default_size(size_t default_size)
 {
-	unsigned long i;
+	unsigned long i, bytes;
 
 	if (!io_tlb_nslabs) {
 		io_tlb_nslabs = (default_size >> IO_TLB_SHIFT);
 		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
 	}
 
+	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
+
 	/*
 	 * Get IO TLB memory from the low pages
 	 */
-	io_tlb_start = alloc_bootmem_low_pages(io_tlb_nslabs * (1 << IO_TLB_SHIFT));
+	io_tlb_start = alloc_bootmem_low_pages(bytes);
 	if (!io_tlb_start)
 		panic("Cannot allocate SWIOTLB buffer");
-	io_tlb_end = io_tlb_start + io_tlb_nslabs * (1 << IO_TLB_SHIFT);
+	io_tlb_end = io_tlb_start + bytes;
 
 	/*
 	 * Allocate and initialize the free list array.  This array is used
@@ -166,8 +168,8 @@ swiotlb_init_with_default_size (size_t d
 	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
 }
 
-void
-swiotlb_init (void)
+void __init
+swiotlb_init(void)
 {
 	swiotlb_init_with_default_size(64 * (1<<20));	/* default to 64MB */
 }
@@ -178,9 +180,9 @@ swiotlb_init (void)
  * This should be just like above, but with some error catching.
  */
 int
-swiotlb_late_init_with_default_size (size_t default_size)
+swiotlb_late_init_with_default_size(size_t default_size)
 {
-	unsigned long i, req_nslabs = io_tlb_nslabs;
+	unsigned long i, bytes, req_nslabs = io_tlb_nslabs;
 	unsigned int order;
 
 	if (!io_tlb_nslabs) {
@@ -191,8 +193,9 @@ swiotlb_late_init_with_default_size (siz
 	/*
 	 * Get IO TLB memory from the low pages
 	 */
-	order = get_order(io_tlb_nslabs * (1 << IO_TLB_SHIFT));
+	order = get_order(io_tlb_nslabs << IO_TLB_SHIFT);
 	io_tlb_nslabs = SLABS_PER_PAGE << order;
+	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
 	while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
 		io_tlb_start = (char *)__get_free_pages(GFP_DMA | __GFP_NOWARN,
@@ -205,13 +208,14 @@ swiotlb_late_init_with_default_size (siz
 	if (!io_tlb_start)
 		goto cleanup1;
 
-	if (order != get_order(io_tlb_nslabs * (1 << IO_TLB_SHIFT))) {
+	if (order != get_order(bytes)) {
 		printk(KERN_WARNING "Warning: only able to allocate %ld MB "
 		       "for software IO TLB\n", (PAGE_SIZE << order) >> 20);
 		io_tlb_nslabs = SLABS_PER_PAGE << order;
+		bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 	}
-	io_tlb_end = io_tlb_start + io_tlb_nslabs * (1 << IO_TLB_SHIFT);
-	memset(io_tlb_start, 0, io_tlb_nslabs * (1 << IO_TLB_SHIFT));
+	io_tlb_end = io_tlb_start + bytes;
+	memset(io_tlb_start, 0, bytes);
 
 	/*
 	 * Allocate and initialize the free list array.  This array is used
@@ -242,8 +246,8 @@ swiotlb_late_init_with_default_size (siz
 	if (!io_tlb_overflow_buffer)
 		goto cleanup4;
 
-	printk(KERN_INFO "Placing %ldMB software IO TLB between 0x%lx - "
-	       "0x%lx\n", (io_tlb_nslabs * (1 << IO_TLB_SHIFT)) >> 20,
+	printk(KERN_INFO "Placing %luMB software IO TLB between 0x%lx - "
+	       "0x%lx\n", bytes >> 20,
 	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
 
 	return 0;
@@ -256,8 +260,8 @@ cleanup3:
 	free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
 	                                                 sizeof(int)));
 	io_tlb_list = NULL;
-	io_tlb_end = NULL;
 cleanup2:
+	io_tlb_end = NULL;
 	free_pages((unsigned long)io_tlb_start, order);
 	io_tlb_start = NULL;
 cleanup1:
@@ -434,7 +438,7 @@ void *
 swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 		       dma_addr_t *dma_handle, gfp_t flags)
 {
-	unsigned long dev_addr;
+	dma_addr_t dev_addr;
 	void *ret;
 	int order = get_order(size);
 
@@ -474,8 +478,9 @@ swiotlb_alloc_coherent(struct device *hw
 
 	/* Confirm address can be DMA'd by device */
 	if (address_needs_mapping(hwdev, dev_addr)) {
-		printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016lx\n",
-		       (unsigned long long)*hwdev->dma_mask, dev_addr);
+		printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
+		       (unsigned long long)*hwdev->dma_mask,
+		       (unsigned long long)dev_addr);
 		panic("swiotlb_alloc_coherent: allocated memory is out of "
 		      "range for device");
 	}
@@ -505,7 +510,7 @@ swiotlb_full(struct device *dev, size_t 
 	 * When the mapping is small enough return a static buffer to limit
 	 * the damage, or panic when the transfer is too big.
 	 */
-	printk(KERN_ERR "DMA: Out of SW-IOMMU space for %lu bytes at "
+	printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at "
 	       "device %s\n", size, dev ? dev->bus_id : "?");
 
 	if (size > io_tlb_overflow && do_panic) {
@@ -526,7 +531,7 @@ swiotlb_full(struct device *dev, size_t 
 dma_addr_t
 swiotlb_map_single(struct device *hwdev, void *ptr, size_t size, int dir)
 {
-	unsigned long dev_addr = virt_to_bus(ptr);
+	dma_addr_t dev_addr = virt_to_bus(ptr);
 	void *map;
 
 	if (dir == DMA_NONE)
@@ -674,7 +679,7 @@ swiotlb_map_sg(struct device *hwdev, str
 	       int dir)
 {
 	void *addr;
-	unsigned long dev_addr;
+	dma_addr_t dev_addr;
 	int i;
 
 	if (dir == DMA_NONE)

[-- Attachment #5: swiotlb-bugs.patch --]
[-- Type: text/plain, Size: 6364 bytes --]

This patch fixes
- marking pages that were DMAed to as I-cache clean, which is now done only on IA64
- broken multiple inclusion in include/asm-x86_64/swiotlb.h
- missing phys-to-virt translation in swiotlb_sync_sg()
- missing call to mark_clean in swiotlb_sync_sg()
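
For reference, the marking helper only touches pages that lie completely
inside the DMAed range. The sketch below models just that page arithmetic in
user space. model_complete_pages() is a hypothetical name, and the 4KB page
size is an assumption made for the example (IA64 page sizes are configurable).

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL
#define MODEL_PAGE_ALIGN(a) \
	(((a) + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1))

/* Count the pages that lie entirely within [addr, addr + size). */
static unsigned long model_complete_pages(unsigned long addr, size_t size)
{
	unsigned long pg_addr = MODEL_PAGE_ALIGN(addr);	/* round start up */
	unsigned long end = addr + size;
	unsigned long n = 0;

	assert(end >= addr);	/* the range must not wrap around */
	while (pg_addr + MODEL_PAGE_SIZE <= end) {
		n++;		/* the kernel helper sets PG_arch_1 here */
		pg_addr += MODEL_PAGE_SIZE;
	}
	return n;
}
```

A partially covered page at either end of the range is skipped, since only
part of its contents was written by the device.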

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: sle10-sp1-2006-12-18/arch/ia64/mm/init.c
===================================================================
--- sle10-sp1-2006-12-18.orig/arch/ia64/mm/init.c	2006-03-20 06:53:29.000000000 +0100
+++ sle10-sp1-2006-12-18/arch/ia64/mm/init.c	2006-12-20 12:02:01.000000000 +0100
@@ -123,6 +123,25 @@ lazy_mmu_prot_update (pte_t pte)
 	set_bit(PG_arch_1, &page->flags);	/* mark page as clean */
 }
 
+/*
+ * Since DMA is i-cache coherent, any (complete) pages that were written via
+ * DMA can be marked as "clean" so that lazy_mmu_prot_update() doesn't have to
+ * flush them when they get mapped into an executable vm-area.
+ */
+void
+dma_mark_clean(void *addr, size_t size)
+{
+	unsigned long pg_addr, end;
+
+	pg_addr = PAGE_ALIGN((unsigned long) addr);
+	end = (unsigned long) addr + size;
+	while (pg_addr + PAGE_SIZE <= end) {
+		struct page *page = virt_to_page(pg_addr);
+		set_bit(PG_arch_1, &page->flags);
+		pg_addr += PAGE_SIZE;
+	}
+}
+
 inline void
 ia64_set_rbs_bot (void)
 {
Index: sle10-sp1-2006-12-18/include/asm-ia64/dma.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-ia64/dma.h	2006-03-20 06:53:29.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-ia64/dma.h	2006-12-20 12:02:01.000000000 +0100
@@ -20,4 +20,6 @@ extern unsigned long MAX_DMA_ADDRESS;
 
 #define free_dma(x)
 
+void dma_mark_clean(void *addr, size_t size);
+
 #endif /* _ASM_IA64_DMA_H */
Index: sle10-sp1-2006-12-18/include/asm-x86_64/mach-xen/asm/dma-mapping.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-x86_64/mach-xen/asm/dma-mapping.h	2006-12-20 15:53:41.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-x86_64/mach-xen/asm/dma-mapping.h	2006-12-20 12:02:01.000000000 +0100
@@ -10,7 +10,6 @@
 
 #include <asm/scatterlist.h>
 #include <asm/io.h>
-#include <asm/swiotlb.h>
 
 struct dma_mapping_ops {
 	int             (*mapping_error)(dma_addr_t dma_addr);
Index: sle10-sp1-2006-12-18/include/asm-x86_64/swiotlb.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-x86_64/swiotlb.h	2006-12-20 15:53:41.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-x86_64/swiotlb.h	2006-12-20 15:53:57.000000000 +0100
@@ -1,5 +1,5 @@
 #ifndef _ASM_SWIOTLB_H
-#define _ASM_SWTIOLB_H 1
+#define _ASM_SWIOTLB_H 1
 
 #include <linux/config.h>
 
@@ -59,4 +59,6 @@ extern int swiotlb;
 
 extern void pci_swiotlb_init(void);
 
-#endif /* _ASM_SWTIOLB_H */
+static inline void dma_mark_clean(void *addr, size_t size) {}
+
+#endif /* _ASM_SWIOTLB_H */
Index: sle10-sp1-2006-12-18/lib/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-18.orig/lib/swiotlb.c	2006-03-20 06:53:29.000000000 +0100
+++ sle10-sp1-2006-12-18/lib/swiotlb.c	2006-12-20 12:02:01.000000000 +0100
@@ -560,25 +560,6 @@ swiotlb_map_single(struct device *hwdev,
 }
 
 /*
- * Since DMA is i-cache coherent, any (complete) pages that were written via
- * DMA can be marked as "clean" so that lazy_mmu_prot_update() doesn't have to
- * flush them when they get mapped into an executable vm-area.
- */
-static void
-mark_clean(void *addr, size_t size)
-{
-	unsigned long pg_addr, end;
-
-	pg_addr = PAGE_ALIGN((unsigned long) addr);
-	end = (unsigned long) addr + size;
-	while (pg_addr + PAGE_SIZE <= end) {
-		struct page *page = virt_to_page(pg_addr);
-		set_bit(PG_arch_1, &page->flags);
-		pg_addr += PAGE_SIZE;
-	}
-}
-
-/*
  * Unmap a single streaming mode DMA translation.  The dma_addr and size must
  * match what was provided for in a previous swiotlb_map_single call.  All
  * other usages are undefined.
@@ -597,7 +578,7 @@ swiotlb_unmap_single(struct device *hwde
 	if (dma_addr >= io_tlb_start && dma_addr < io_tlb_end)
 		unmap_single(hwdev, dma_addr, size, dir);
 	else if (dir == DMA_FROM_DEVICE)
-		mark_clean(dma_addr, size);
+		dma_mark_clean(dma_addr, size);
 }
 
 /*
@@ -621,7 +602,7 @@ swiotlb_sync_single(struct device *hwdev
 	if (dma_addr >= io_tlb_start && dma_addr < io_tlb_end)
 		sync_single(hwdev, dma_addr, size, dir, target);
 	else if (dir == DMA_FROM_DEVICE)
-		mark_clean(dma_addr, size);
+		dma_mark_clean(dma_addr, size);
 }
 
 void
@@ -653,7 +634,7 @@ swiotlb_sync_single_range(struct device 
 	if (dma_addr >= io_tlb_start && dma_addr < io_tlb_end)
 		sync_single(hwdev, dma_addr, size, dir, target);
 	else if (dir == DMA_FROM_DEVICE)
-		mark_clean(dma_addr, size);
+		dma_mark_clean(dma_addr, size);
 }
 
 void
@@ -704,7 +685,6 @@ swiotlb_map_sg(struct device *hwdev, str
 		dev_addr = virt_to_phys(addr);
 		if (swiotlb_force || address_needs_mapping(hwdev, dev_addr)) {
 			void *map = map_single(hwdev, addr, sg->length, dir);
-			sg->dma_address = virt_to_bus(map);
 			if (!map) {
 				/* Don't panic here, we expect map_sg users
 				   to do proper error handling. */
@@ -713,6 +693,7 @@ swiotlb_map_sg(struct device *hwdev, str
 				sg[0].dma_length = 0;
 				return 0;
 			}
+			sg->dma_address = virt_to_bus(map);
 		} else
 			sg->dma_address = dev_addr;
 		sg->dma_length = sg->length;
@@ -737,7 +718,7 @@ swiotlb_unmap_sg(struct device *hwdev, s
 		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
 			unmap_single(hwdev, (void *) phys_to_virt(sg->dma_address), sg->dma_length, dir);
 		else if (dir == DMA_FROM_DEVICE)
-			mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
+			dma_mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
 }
 
 /*
@@ -758,8 +739,10 @@ swiotlb_sync_sg(struct device *hwdev, st
 
 	for (i = 0; i < nelems; i++, sg++)
 		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			sync_single(hwdev, (void *) sg->dma_address,
+			sync_single(hwdev, phys_to_virt(sg->dma_address),
 				    sg->dma_length, dir, target);
+		else if (dir == DMA_FROM_DEVICE)
+			dma_mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
 }
 
 void

[-- Attachment #6: xen-swiotlb.patch --]
[-- Type: text/plain, Size: 34517 bytes --]

This patch eliminates Xen's special version of the swiotlb code. Along with
that it adds trivial forwarding of dma_{,un}map_page when not using highmem.
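
The generic code can absorb the Xen variant because the copy step becomes a
hook with a plain memcpy() default. The sketch below models only that default
in user space. The model_* names are hypothetical, and the real hook is
selected at build time by SWIOTLB_ARCH_HAS_SYNC_SINGLE rather than called
like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum model_dir { MODEL_TO_DEVICE, MODEL_FROM_DEVICE };

/* Default bounce copy: the direction decides which side is the source. */
static void model_sync_single(char *orig, char *bounce, size_t size,
			      enum model_dir dir)
{
	assert(orig && bounce);
	if (dir == MODEL_TO_DEVICE)
		memcpy(bounce, orig, size);	/* CPU data -> bounce buffer */
	else
		memcpy(orig, bounce, size);	/* device data -> original */
}
```

An architecture that cannot use a plain memcpy(), for example because the
original buffer may be in highmem, would supply its own copy routine instead
of this default.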

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: sle10-sp1-2006-12-18/arch/i386/kernel/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-18.orig/arch/i386/kernel/swiotlb.c	2006-12-20 15:53:32.000000000 +0100
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,682 +0,0 @@
-/*
- * Dynamic DMA mapping support.
- *
- * This implementation is a fallback for platforms that do not support
- * I/O TLBs (aka DMA address translation hardware).
- * Copyright (C) 2000 Asit Mallick <Asit.K.Mallick@intel.com>
- * Copyright (C) 2000 Goutham Rao <goutham.rao@intel.com>
- * Copyright (C) 2000, 2003 Hewlett-Packard Co
- *	David Mosberger-Tang <davidm@hpl.hp.com>
- */
-
-#include <linux/cache.h>
-#include <linux/mm.h>
-#include <linux/module.h>
-#include <linux/pci.h>
-#include <linux/spinlock.h>
-#include <linux/string.h>
-#include <linux/types.h>
-#include <linux/ctype.h>
-#include <linux/init.h>
-#include <linux/bootmem.h>
-#include <linux/highmem.h>
-#include <asm/io.h>
-#include <asm/pci.h>
-#include <asm/dma.h>
-#include <asm/uaccess.h>
-#include <xen/interface/memory.h>
-
-int swiotlb;
-EXPORT_SYMBOL(swiotlb);
-
-#define OFFSET(val,align) ((unsigned long)((val) & ( (align) - 1)))
-
-#define SG_ENT_PHYS_ADDRESS(sg)	(page_to_bus((sg)->page) + (sg)->offset)
-
-/*
- * Maximum allowable number of contiguous slabs to map,
- * must be a power of 2.  What is the appropriate value ?
- * The complexity of {map,unmap}_single is linearly dependent on this value.
- */
-#define IO_TLB_SEGSIZE	128
-
-/*
- * log of the size of each IO TLB slab.  The number of slabs is command line
- * controllable.
- */
-#define IO_TLB_SHIFT 11
-
-/* Width of DMA addresses in the IO TLB. 30 bits is a b44 limitation. */
-#define DEFAULT_IO_TLB_DMA_BITS 30
-
-static int swiotlb_force;
-static char *iotlb_virt_start;
-static unsigned long iotlb_nslabs;
-
-/*
- * Used to do a quick range check in swiotlb_unmap_single and
- * swiotlb_sync_single_*, to see if the memory was in fact allocated by this
- * API.
- */
-static unsigned long iotlb_pfn_start, iotlb_pfn_end;
-
-/* Does the given dma address reside within the swiotlb aperture? */
-static inline int in_swiotlb_aperture(dma_addr_t dev_addr)
-{
-	unsigned long pfn = mfn_to_local_pfn(dev_addr >> PAGE_SHIFT);
-	return (pfn_valid(pfn)
-		&& (pfn >= iotlb_pfn_start)
-		&& (pfn < iotlb_pfn_end));
-}
-
-/*
- * When the IOMMU overflows we return a fallback buffer. This sets the size.
- */
-static unsigned long io_tlb_overflow = 32*1024;
-
-void *io_tlb_overflow_buffer;
-
-/*
- * This is a free list describing the number of free entries available from
- * each index
- */
-static unsigned int *io_tlb_list;
-static unsigned int io_tlb_index;
-
-/*
- * We need to save away the original address corresponding to a mapped entry
- * for the sync operations.
- */
-static struct phys_addr {
-	struct page *page;
-	unsigned int offset;
-} *io_tlb_orig_addr;
-
-/*
- * Protect the above data structures in the map and unmap calls
- */
-static DEFINE_SPINLOCK(io_tlb_lock);
-
-static unsigned int io_tlb_dma_bits = DEFAULT_IO_TLB_DMA_BITS;
-static int __init
-setup_io_tlb_bits(char *str)
-{
-	io_tlb_dma_bits = simple_strtoul(str, NULL, 0);
-	return 0;
-}
-__setup("swiotlb_bits=", setup_io_tlb_bits);
-
-static int __init
-setup_io_tlb_npages(char *str)
-{
-	/* Unlike ia64, the size is aperture in megabytes, not 'slabs'! */
-	if (isdigit(*str)) {
-		iotlb_nslabs = simple_strtoul(str, &str, 0) <<
-			(20 - IO_TLB_SHIFT);
-		iotlb_nslabs = ALIGN(iotlb_nslabs, IO_TLB_SEGSIZE);
-		/* Round up to power of two (xen_create_contiguous_region). */
-		while (iotlb_nslabs & (iotlb_nslabs-1))
-			iotlb_nslabs += iotlb_nslabs & ~(iotlb_nslabs-1);
-	}
-	if (*str == ',')
-		++str;
-	/*
-         * NB. 'force' enables the swiotlb, but doesn't force its use for
-         * every DMA like it does on native Linux. 'off' forcibly disables
-         * use of the swiotlb.
-         */
-	if (!strcmp(str, "force"))
-		swiotlb_force = 1;
-	else if (!strcmp(str, "off"))
-		swiotlb_force = -1;
-	return 1;
-}
-__setup("swiotlb=", setup_io_tlb_npages);
-/* make io_tlb_overflow tunable too? */
-
-/*
- * Statically reserve bounce buffer space and initialize bounce buffer data
- * structures for the software IO TLB used to implement the PCI DMA API.
- */
-void
-swiotlb_init_with_default_size (size_t default_size)
-{
-	unsigned long i, bytes;
-
-	if (!iotlb_nslabs) {
-		iotlb_nslabs = (default_size >> IO_TLB_SHIFT);
-		iotlb_nslabs = ALIGN(iotlb_nslabs, IO_TLB_SEGSIZE);
-		/* Round up to power of two (xen_create_contiguous_region). */
-		while (iotlb_nslabs & (iotlb_nslabs-1))
-			iotlb_nslabs += iotlb_nslabs & ~(iotlb_nslabs-1);
-	}
-
-	bytes = iotlb_nslabs * (1UL << IO_TLB_SHIFT);
-
-	/*
-	 * Get IO TLB memory from the low pages
-	 */
-	iotlb_virt_start = alloc_bootmem_low_pages(bytes);
-	if (!iotlb_virt_start)
-		panic("Cannot allocate SWIOTLB buffer!\n"
-		      "Use dom0_mem Xen boot parameter to reserve\n"
-		      "some DMA memory (e.g., dom0_mem=-128M).\n");
-
-	for (i = 0; i < iotlb_nslabs; i += IO_TLB_SEGSIZE) {
-		int rc = xen_create_contiguous_region(
-			(unsigned long)iotlb_virt_start + (i << IO_TLB_SHIFT),
-			get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT),
-			io_tlb_dma_bits);
-		BUG_ON(rc);
-	}
-
-	/*
-	 * Allocate and initialize the free list array.  This array is used
-	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE.
-	 */
-	io_tlb_list = alloc_bootmem(iotlb_nslabs * sizeof(int));
-	for (i = 0; i < iotlb_nslabs; i++)
- 		io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-	io_tlb_index = 0;
-	io_tlb_orig_addr = alloc_bootmem(
-		iotlb_nslabs * sizeof(*io_tlb_orig_addr));
-
-	/*
-	 * Get the overflow emergency buffer
-	 */
-	io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow);
-
-	iotlb_pfn_start = __pa(iotlb_virt_start) >> PAGE_SHIFT;
-	iotlb_pfn_end   = iotlb_pfn_start + (bytes >> PAGE_SHIFT);
-
-	printk(KERN_INFO "Software IO TLB enabled: \n"
-	       " Aperture:     %lu megabytes\n"
-	       " Kernel range: 0x%016lx - 0x%016lx\n"
-	       " Address size: %u bits\n",
-	       bytes >> 20,
-	       (unsigned long)iotlb_virt_start,
-	       (unsigned long)iotlb_virt_start + bytes,
-	       io_tlb_dma_bits);
-}
-
-void
-swiotlb_init(void)
-{
-	long ram_end;
-	size_t defsz = 64 * (1 << 20); /* 64MB default size */
-
-	if (swiotlb_force == 1) {
-		swiotlb = 1;
-	} else if ((swiotlb_force != -1) &&
-		   is_running_on_xen() &&
-		   is_initial_xendomain()) {
-		/* Domain 0 always has a swiotlb. */
-		ram_end = HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL);
-		if (ram_end <= 0x7ffff)
-			defsz = 2 * (1 << 20); /* 2MB on <2GB on systems. */
-		swiotlb = 1;
-	}
-
-	if (swiotlb)
-		swiotlb_init_with_default_size(defsz);
-	else
-		printk(KERN_INFO "Software IO TLB disabled\n");
-}
-
-/*
- * We use __copy_to_user_inatomic to transfer to the host buffer because the
- * buffer may be mapped read-only (e.g, in blkback driver) but lower-level
- * drivers map the buffer for DMA_BIDIRECTIONAL access. This causes an
- * unnecessary copy from the aperture to the host buffer, and a page fault.
- */
-static void
-__sync_single(struct phys_addr buffer, char *dma_addr, size_t size, int dir)
-{
-	if (PageHighMem(buffer.page)) {
-		size_t len, bytes;
-		char *dev, *host, *kmp;
-		len = size;
-		while (len != 0) {
-			if (((bytes = len) + buffer.offset) > PAGE_SIZE)
-				bytes = PAGE_SIZE - buffer.offset;
-			kmp  = kmap_atomic(buffer.page, KM_SWIOTLB);
-			dev  = dma_addr + size - len;
-			host = kmp + buffer.offset;
-			if (dir == DMA_FROM_DEVICE) {
-				if (__copy_to_user_inatomic(host, dev, bytes))
-					/* inaccessible */;
-			} else
-				memcpy(dev, host, bytes);
-			kunmap_atomic(kmp, KM_SWIOTLB);
-			len -= bytes;
-			buffer.page++;
-			buffer.offset = 0;
-		}
-	} else {
-		char *host = (char *)phys_to_virt(
-			page_to_pseudophys(buffer.page)) + buffer.offset;
-		if (dir == DMA_FROM_DEVICE) {
-			if (__copy_to_user_inatomic(host, dma_addr, size))
-				/* inaccessible */;
-		} else if (dir == DMA_TO_DEVICE)
-			memcpy(dma_addr, host, size);
-	}
-}
-
-/*
- * Allocates bounce buffer and returns its kernel virtual address.
- */
-static void *
-map_single(struct device *hwdev, struct phys_addr buffer, size_t size, int dir)
-{
-	unsigned long flags;
-	char *dma_addr;
-	unsigned int nslots, stride, index, wrap;
-	int i;
-
-	/*
-	 * For mappings greater than a page, we limit the stride (and
-	 * hence alignment) to a page size.
-	 */
-	nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	if (size > PAGE_SIZE)
-		stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT));
-	else
-		stride = 1;
-
-	BUG_ON(!nslots);
-
-	/*
-	 * Find suitable number of IO TLB entries size that will fit this
-	 * request and allocate a buffer from that IO TLB pool.
-	 */
-	spin_lock_irqsave(&io_tlb_lock, flags);
-	{
-		wrap = index = ALIGN(io_tlb_index, stride);
-
-		if (index >= iotlb_nslabs)
-			wrap = index = 0;
-
-		do {
-			/*
-			 * If we find a slot that indicates we have 'nslots'
-			 * number of contiguous buffers, we allocate the
-			 * buffers from that slot and mark the entries as '0'
-			 * indicating unavailable.
-			 */
-			if (io_tlb_list[index] >= nslots) {
-				int count = 0;
-
-				for (i = index; i < (int)(index + nslots); i++)
-					io_tlb_list[i] = 0;
-				for (i = index - 1;
-				     (OFFSET(i, IO_TLB_SEGSIZE) !=
-				      IO_TLB_SEGSIZE -1) && io_tlb_list[i];
-				     i--)
-					io_tlb_list[i] = ++count;
-				dma_addr = iotlb_virt_start +
-					(index << IO_TLB_SHIFT);
-
-				/*
-				 * Update the indices to avoid searching in
-				 * the next round.
-				 */
-				io_tlb_index = 
-					((index + nslots) < iotlb_nslabs
-					 ? (index + nslots) : 0);
-
-				goto found;
-			}
-			index += stride;
-			if (index >= iotlb_nslabs)
-				index = 0;
-		} while (index != wrap);
-
-		spin_unlock_irqrestore(&io_tlb_lock, flags);
-		return NULL;
-	}
-  found:
-	spin_unlock_irqrestore(&io_tlb_lock, flags);
-
-	/*
-	 * Save away the mapping from the original address to the DMA address.
-	 * This is needed when we sync the memory.  Then we sync the buffer if
-	 * needed.
-	 */
-	io_tlb_orig_addr[index] = buffer;
-	if ((dir == DMA_TO_DEVICE) || (dir == DMA_BIDIRECTIONAL))
-		__sync_single(buffer, dma_addr, size, DMA_TO_DEVICE);
-
-	return dma_addr;
-}
-
-/*
- * dma_addr is the kernel virtual address of the bounce buffer to unmap.
- */
-static void
-unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir)
-{
-	unsigned long flags;
-	int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	int index = (dma_addr - iotlb_virt_start) >> IO_TLB_SHIFT;
-	struct phys_addr buffer = io_tlb_orig_addr[index];
-
-	/*
-	 * First, sync the memory before unmapping the entry
-	 */
-	if ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))
-		__sync_single(buffer, dma_addr, size, DMA_FROM_DEVICE);
-
-	/*
-	 * Return the buffer to the free list by setting the corresponding
-	 * entries to indicate the number of contigous entries available.
-	 * While returning the entries to the free list, we merge the entries
-	 * with slots below and above the pool being returned.
-	 */
-	spin_lock_irqsave(&io_tlb_lock, flags);
-	{
-		count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ?
-			 io_tlb_list[index + nslots] : 0);
-		/*
-		 * Step 1: return the slots to the free list, merging the
-		 * slots with superceeding slots
-		 */
-		for (i = index + nslots - 1; i >= index; i--)
-			io_tlb_list[i] = ++count;
-		/*
-		 * Step 2: merge the returned slots with the preceding slots,
-		 * if available (non zero)
-		 */
-		for (i = index - 1;
-		     (OFFSET(i, IO_TLB_SEGSIZE) !=
-		      IO_TLB_SEGSIZE -1) && io_tlb_list[i];
-		     i--)
-			io_tlb_list[i] = ++count;
-	}
-	spin_unlock_irqrestore(&io_tlb_lock, flags);
-}
-
-static void
-sync_single(struct device *hwdev, char *dma_addr, size_t size, int dir)
-{
-	int index = (dma_addr - iotlb_virt_start) >> IO_TLB_SHIFT;
-	struct phys_addr buffer = io_tlb_orig_addr[index];
-	BUG_ON((dir != DMA_FROM_DEVICE) && (dir != DMA_TO_DEVICE));
-	__sync_single(buffer, dma_addr, size, dir);
-}
-
-static void
-swiotlb_full(struct device *dev, size_t size, int dir, int do_panic)
-{
-	/*
-	 * Ran out of IOMMU space for this operation. This is very bad.
-	 * Unfortunately the drivers cannot handle this operation properly.
-	 * unless they check for pci_dma_mapping_error (most don't)
-	 * When the mapping is small enough return a static buffer to limit
-	 * the damage, or panic when the transfer is too big.
-	 */
-	printk(KERN_ERR "PCI-DMA: Out of SW-IOMMU space for %lu bytes at "
-	       "device %s\n", (unsigned long)size, dev ? dev->bus_id : "?");
-
-	if (size > io_tlb_overflow && do_panic) {
-		if (dir == PCI_DMA_FROMDEVICE || dir == PCI_DMA_BIDIRECTIONAL)
-			panic("PCI-DMA: Memory would be corrupted\n");
-		if (dir == PCI_DMA_TODEVICE || dir == PCI_DMA_BIDIRECTIONAL)
-			panic("PCI-DMA: Random memory would be DMAed\n");
-	}
-}
-
-/*
- * Map a single buffer of the indicated size for DMA in streaming mode.  The
- * PCI address to use is returned.
- *
- * Once the device is given the dma address, the device owns this memory until
- * either swiotlb_unmap_single or swiotlb_dma_sync_single is performed.
- */
-dma_addr_t
-swiotlb_map_single(struct device *hwdev, void *ptr, size_t size, int dir)
-{
-	dma_addr_t dev_addr = virt_to_bus(ptr);
-	void *map;
-	struct phys_addr buffer;
-
-	BUG_ON(dir == DMA_NONE);
-
-	/*
-	 * If the pointer passed in happens to be in the device's DMA window,
-	 * we can safely return the device addr and not worry about bounce
-	 * buffering it.
-	 */
-	if (!range_straddles_page_boundary(ptr, size) &&
-	    !address_needs_mapping(hwdev, dev_addr))
-		return dev_addr;
-
-	/*
-	 * Oh well, have to allocate and map a bounce buffer.
-	 */
-	buffer.page   = virt_to_page(ptr);
-	buffer.offset = (unsigned long)ptr & ~PAGE_MASK;
-	map = map_single(hwdev, buffer, size, dir);
-	if (!map) {
-		swiotlb_full(hwdev, size, dir, 1);
-		map = io_tlb_overflow_buffer;
-	}
-
-	dev_addr = virt_to_bus(map);
-	return dev_addr;
-}
-
-/*
- * Unmap a single streaming mode DMA translation.  The dma_addr and size must
- * match what was provided for in a previous swiotlb_map_single call.  All
- * other usages are undefined.
- *
- * After this call, reads by the cpu to the buffer are guaranteed to see
- * whatever the device wrote there.
- */
-void
-swiotlb_unmap_single(struct device *hwdev, dma_addr_t dev_addr, size_t size,
-		     int dir)
-{
-	BUG_ON(dir == DMA_NONE);
-	if (in_swiotlb_aperture(dev_addr))
-		unmap_single(hwdev, bus_to_virt(dev_addr), size, dir);
-}
-
-/*
- * Make physical memory consistent for a single streaming mode DMA translation
- * after a transfer.
- *
- * If you perform a swiotlb_map_single() but wish to interrogate the buffer
- * using the cpu, yet do not wish to teardown the PCI dma mapping, you must
- * call this function before doing so.  At the next point you give the PCI dma
- * address back to the card, you must first perform a
- * swiotlb_dma_sync_for_device, and then the device again owns the buffer
- */
-void
-swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr,
-			    size_t size, int dir)
-{
-	BUG_ON(dir == DMA_NONE);
-	if (in_swiotlb_aperture(dev_addr))
-		sync_single(hwdev, bus_to_virt(dev_addr), size, dir);
-}
-
-void
-swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
-			       size_t size, int dir)
-{
-	BUG_ON(dir == DMA_NONE);
-	if (in_swiotlb_aperture(dev_addr))
-		sync_single(hwdev, bus_to_virt(dev_addr), size, dir);
-}
-
-/*
- * Map a set of buffers described by scatterlist in streaming mode for DMA.
- * This is the scatter-gather version of the above swiotlb_map_single
- * interface.  Here the scatter gather list elements are each tagged with the
- * appropriate dma address and length.  They are obtained via
- * sg_dma_{address,length}(SG).
- *
- * NOTE: An implementation may be able to use a smaller number of
- *       DMA address/length pairs than there are SG table elements.
- *       (for example via virtual mapping capabilities)
- *       The routine returns the number of addr/length pairs actually
- *       used, at most nents.
- *
- * Device ownership issues as mentioned above for swiotlb_map_single are the
- * same here.
- */
-int
-swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg, int nelems,
-	       int dir)
-{
-	struct phys_addr buffer;
-	dma_addr_t dev_addr;
-	char *map;
-	int i;
-
-	BUG_ON(dir == DMA_NONE);
-
-	for (i = 0; i < nelems; i++, sg++) {
-		dev_addr = SG_ENT_PHYS_ADDRESS(sg);
-		if (address_needs_mapping(hwdev, dev_addr)) {
-			buffer.page   = sg->page;
-			buffer.offset = sg->offset;
-			map = map_single(hwdev, buffer, sg->length, dir);
-			if (!map) {
-				/* Don't panic here, we expect map_sg users
-				   to do proper error handling. */
-				swiotlb_full(hwdev, sg->length, dir, 0);
-				swiotlb_unmap_sg(hwdev, sg - i, i, dir);
-				sg[0].dma_length = 0;
-				return 0;
-			}
-			sg->dma_address = (dma_addr_t)virt_to_bus(map);
-		} else
-			sg->dma_address = dev_addr;
-		sg->dma_length = sg->length;
-	}
-	return nelems;
-}
-
-/*
- * Unmap a set of streaming mode DMA translations.  Again, cpu read rules
- * concerning calls here are the same as for swiotlb_unmap_single() above.
- */
-void
-swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg, int nelems,
-		 int dir)
-{
-	int i;
-
-	BUG_ON(dir == DMA_NONE);
-
-	for (i = 0; i < nelems; i++, sg++)
-		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			unmap_single(hwdev, 
-				     (void *)bus_to_virt(sg->dma_address),
-				     sg->dma_length, dir);
-}
-
-/*
- * Make physical memory consistent for a set of streaming mode DMA translations
- * after a transfer.
- *
- * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules
- * and usage.
- */
-void
-swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg,
-			int nelems, int dir)
-{
-	int i;
-
-	BUG_ON(dir == DMA_NONE);
-
-	for (i = 0; i < nelems; i++, sg++)
-		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			sync_single(hwdev,
-				    (void *)bus_to_virt(sg->dma_address),
-				    sg->dma_length, dir);
-}
-
-void
-swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
-			   int nelems, int dir)
-{
-	int i;
-
-	BUG_ON(dir == DMA_NONE);
-
-	for (i = 0; i < nelems; i++, sg++)
-		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			sync_single(hwdev,
-				    (void *)bus_to_virt(sg->dma_address),
-				    sg->dma_length, dir);
-}
-
-dma_addr_t
-swiotlb_map_page(struct device *hwdev, struct page *page,
-		 unsigned long offset, size_t size,
-		 enum dma_data_direction direction)
-{
-	struct phys_addr buffer;
-	dma_addr_t dev_addr;
-	char *map;
-
-	dev_addr = page_to_bus(page) + offset;
-	if (address_needs_mapping(hwdev, dev_addr)) {
-		buffer.page   = page;
-		buffer.offset = offset;
-		map = map_single(hwdev, buffer, size, direction);
-		if (!map) {
-			swiotlb_full(hwdev, size, direction, 1);
-			map = io_tlb_overflow_buffer;
-		}
-		dev_addr = (dma_addr_t)virt_to_bus(map);
-	}
-
-	return dev_addr;
-}
-
-void
-swiotlb_unmap_page(struct device *hwdev, dma_addr_t dma_address,
-		   size_t size, enum dma_data_direction direction)
-{
-	BUG_ON(direction == DMA_NONE);
-	if (in_swiotlb_aperture(dma_address))
-		unmap_single(hwdev, bus_to_virt(dma_address), size, direction);
-}
-
-int
-swiotlb_dma_mapping_error(dma_addr_t dma_addr)
-{
-	return (dma_addr == virt_to_bus(io_tlb_overflow_buffer));
-}
-
-/*
- * Return whether the given PCI device DMA address mask can be supported
- * properly.  For example, if your device can only drive the low 24-bits
- * during PCI bus mastering, then you would pass 0x00ffffff as the mask to
- * this function.
- */
-int
-swiotlb_dma_supported (struct device *hwdev, u64 mask)
-{
-	return (mask >= ((1UL << io_tlb_dma_bits) - 1));
-}
-
-EXPORT_SYMBOL(swiotlb_init);
-EXPORT_SYMBOL(swiotlb_map_single);
-EXPORT_SYMBOL(swiotlb_unmap_single);
-EXPORT_SYMBOL(swiotlb_map_sg);
-EXPORT_SYMBOL(swiotlb_unmap_sg);
-EXPORT_SYMBOL(swiotlb_sync_single_for_cpu);
-EXPORT_SYMBOL(swiotlb_sync_single_for_device);
-EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu);
-EXPORT_SYMBOL(swiotlb_sync_sg_for_device);
-EXPORT_SYMBOL(swiotlb_map_page);
-EXPORT_SYMBOL(swiotlb_unmap_page);
-EXPORT_SYMBOL(swiotlb_dma_mapping_error);
-EXPORT_SYMBOL(swiotlb_dma_supported);
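
The slot bookkeeping done by map_single() and unmap_single() in the file removed
above can be sketched in user space as follows. This is only an illustration,
not the kernel code: the names (free_list, pool_alloc, pool_free), the tiny
segment size and the fixed stride of 1 are assumptions made for the example.
Each entry holds the number of free slots from that position to the end of its
segment, which is what lets allocation test a single entry and lets free merge
with its neighbours.

```c
#include <assert.h>

#define SEGSIZE 8	/* slots per segment (illustrative; assumed value) */
#define NSLABS  16	/* total slots: two segments (assumed value) */
#define OFFSET(val, align) ((unsigned)(val) & ((align) - 1))
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

static unsigned int free_list[NSLABS];	/* free run length starting at each slot */
static unsigned int next_index;		/* where the next search starts */

static void pool_init(void)
{
	for (int i = 0; i < NSLABS; i++)
		free_list[i] = SEGSIZE - OFFSET(i, SEGSIZE);
	next_index = 0;
}

/* Renumber the free slots that precede 'index' within the same segment. */
static void fix_preceding(int index, unsigned int count)
{
	for (int i = index - 1;
	     i >= 0 && OFFSET(i, SEGSIZE) != SEGSIZE - 1 && free_list[i];
	     i--)
		free_list[i] = ++count;
}

/* Returns the first slot of the allocation, or -1 if no run is large enough. */
static int pool_alloc(unsigned int nslots)
{
	unsigned int index, wrap;

	assert(nslots > 0 && nslots <= SEGSIZE);	/* stands in for BUG_ON() */
	wrap = index = (next_index >= NSLABS) ? 0 : next_index;
	do {
		if (free_list[index] >= nslots) {
			/* Mark the slots as used (0 = unavailable). */
			for (unsigned int i = index; i < index + nslots; i++)
				free_list[i] = 0;
			fix_preceding((int)index, 0);
			next_index = (index + nslots < NSLABS) ? index + nslots : 0;
			return (int)index;
		}
		if (++index >= NSLABS)
			index = 0;
	} while (index != wrap);
	return -1;
}

/* Returns 'nslots' slots starting at 'index', merging with free neighbours. */
static void pool_free(int index, unsigned int nslots)
{
	unsigned int count;

	/* Free run that follows the block, if it lies in the same segment. */
	count = ((unsigned)index + nslots < ALIGN_UP((unsigned)index + 1, SEGSIZE))
		? free_list[index + nslots] : 0;
	for (int i = index + (int)nslots - 1; i >= index; i--)
		free_list[i] = ++count;
	fix_preceding(index, count);
}
```

After pool_init(), allocating 3 slots takes slots 0-2; a following request for
6 slots cannot be served from the 5 slots left in the first segment, so it is
placed at slot 8, because a run never crosses a segment boundary.
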
Index: sle10-sp1-2006-12-18/arch/i386/kernel/pci-dma-xen.c
===================================================================
--- sle10-sp1-2006-12-18.orig/arch/i386/kernel/pci-dma-xen.c	2006-12-20 15:53:32.000000000 +0100
+++ sle10-sp1-2006-12-18/arch/i386/kernel/pci-dma-xen.c	2006-12-20 14:06:09.000000000 +0100
@@ -16,7 +16,7 @@
 #include <asm/io.h>
 #include <xen/balloon.h>
 #include <asm/tlbflush.h>
-#include <asm-i386/mach-xen/asm/swiotlb.h>
+#include <asm/dma-mapping.h>
 #include <asm/bug.h>
 
 #ifdef __x86_64__
@@ -93,13 +93,7 @@ dma_unmap_sg(struct device *hwdev, struc
 }
 EXPORT_SYMBOL(dma_unmap_sg);
 
-/*
- * XXX This file is also used by xenLinux/ia64. 
- * "defined(__i386__) || defined (__x86_64__)" means "!defined(__ia64__)".
- * This #if work around should be removed once this file is merbed back into
- * i386' pci-dma or is moved to drivers/xen/core.
- */
-#if defined(__i386__) || defined(__x86_64__)
+#ifdef CONFIG_HIGHMEM
 dma_addr_t
 dma_map_page(struct device *dev, struct page *page, unsigned long offset,
 	     size_t size, enum dma_data_direction direction)
@@ -129,7 +123,7 @@ dma_unmap_page(struct device *dev, dma_a
 		swiotlb_unmap_page(dev, dma_address, size, direction);
 }
 EXPORT_SYMBOL(dma_unmap_page);
-#endif /* defined(__i386__) || defined(__x86_64__) */
+#endif /* CONFIG_HIGHMEM */
 
 int
 dma_mapping_error(dma_addr_t dma_addr)
Index: sle10-sp1-2006-12-18/include/asm-i386/mach-xen/asm/dma-mapping.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-i386/mach-xen/asm/dma-mapping.h	2006-12-20 15:53:32.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-i386/mach-xen/asm/dma-mapping.h	2006-12-20 14:06:09.000000000 +0100
@@ -24,7 +24,7 @@ address_needs_mapping(struct device *hwd
 }
 
 static inline int
-range_straddles_page_boundary(void *p, size_t size)
+range_straddles_page_boundary(const void *p, size_t size)
 {
 	extern unsigned long *contiguous_bitmap;
 	return (((((unsigned long)p & ~PAGE_MASK) + size) > PAGE_SIZE) &&
@@ -53,6 +53,7 @@ extern int dma_map_sg(struct device *hwd
 extern void dma_unmap_sg(struct device *hwdev, struct scatterlist *sg,
 			 int nents, enum dma_data_direction direction);
 
+#ifdef CONFIG_HIGHMEM
 extern dma_addr_t
 dma_map_page(struct device *dev, struct page *page, unsigned long offset,
 	     size_t size, enum dma_data_direction direction);
@@ -60,6 +61,11 @@ dma_map_page(struct device *dev, struct 
 extern void
 dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
 	       enum dma_data_direction direction);
+#else
+#define dma_map_page(dev, page, offset, size, dir) \
+	dma_map_single(dev, page_address(page) + (offset), (size), (dir))
+#define dma_unmap_page dma_unmap_single
+#endif
 
 extern void
 dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, size_t size,
Index: sle10-sp1-2006-12-18/include/asm-i386/mach-xen/asm/swiotlb.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-i386/mach-xen/asm/swiotlb.h	2006-12-20 15:53:32.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-i386/mach-xen/asm/swiotlb.h	2006-12-20 14:04:33.000000000 +0100
@@ -1,43 +1,18 @@
-#ifndef _ASM_SWIOTLB_H
-#define _ASM_SWIOTLB_H 1
+#ifndef _ASM_XEN_SWIOTLB_H
+#define _ASM_XEN_SWIOTLB_H 1
 
-#include <linux/config.h>
+#include <xen/swiotlb.h>
 
-/* SWIOTLB interface */
+#ifdef CONFIG_HIGHMEM
+
+#define SG_ENT_PHYS_ADDRESS(sg)	(page_to_bus((sg)->page) + (sg)->offset)
 
-extern dma_addr_t swiotlb_map_single(struct device *hwdev, void *ptr, size_t size,
-				      int dir);
-extern void swiotlb_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
-				  size_t size, int dir);
-extern void swiotlb_sync_single_for_cpu(struct device *hwdev,
-					 dma_addr_t dev_addr,
-					 size_t size, int dir);
-extern void swiotlb_sync_single_for_device(struct device *hwdev,
-					    dma_addr_t dev_addr,
-					    size_t size, int dir);
-extern void swiotlb_sync_sg_for_cpu(struct device *hwdev,
-				     struct scatterlist *sg, int nelems,
-				     int dir);
-extern void swiotlb_sync_sg_for_device(struct device *hwdev,
-					struct scatterlist *sg, int nelems,
-					int dir);
-extern int swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg,
-		      int nents, int direction);
-extern void swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg,
-			 int nents, int direction);
-extern int swiotlb_dma_mapping_error(dma_addr_t dma_addr);
 extern dma_addr_t swiotlb_map_page(struct device *hwdev, struct page *page,
                                    unsigned long offset, size_t size,
                                    enum dma_data_direction direction);
 extern void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dma_address,
                                size_t size, enum dma_data_direction direction);
-extern int swiotlb_dma_supported(struct device *hwdev, u64 mask);
-extern void swiotlb_init(void);
 
-#ifdef CONFIG_SWIOTLB
-extern int swiotlb;
-#else
-#define swiotlb 0
 #endif
 
 #endif
Index: sle10-sp1-2006-12-18/include/asm-ia64/swiotlb.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-ia64/swiotlb.h	2006-12-20 15:54:48.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-ia64/swiotlb.h	2006-12-20 16:07:27.000000000 +0100
@@ -1,8 +1,16 @@
 #ifndef _ASM_SWIOTLB_H
 #define _ASM_SWIOTLB_H 1
 
+#include <linux/config.h>
+
+#ifndef CONFIG_XEN
+
 #include <asm/machvec.h>
 
 #define SWIOTLB_ARCH_WANT_LATE_INIT
 
+#else
+#include <xen/swiotlb.h>
+#endif
+
 #endif /* _ASM_SWIOTLB_H */
Index: sle10-sp1-2006-12-18/include/asm-x86_64/mach-xen/asm/swiotlb.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sle10-sp1-2006-12-18/include/asm-x86_64/mach-xen/asm/swiotlb.h	2006-12-20 16:07:36.000000000 +0100
@@ -0,0 +1 @@
+#include <xen/swiotlb.h>
Index: sle10-sp1-2006-12-18/include/asm-x86_64/swiotlb.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-x86_64/swiotlb.h	2006-12-20 15:53:57.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-x86_64/swiotlb.h	2006-12-20 15:55:44.000000000 +0100
@@ -43,14 +43,6 @@ extern void swiotlb_free_coherent (struc
 extern int swiotlb_dma_supported(struct device *hwdev, u64 mask);
 extern void swiotlb_init(void);
 
-#ifdef CONFIG_XEN
-extern dma_addr_t swiotlb_map_page(struct device *hwdev, struct page *page,
-                                   unsigned long offset, size_t size,
-                                   enum dma_data_direction direction);
-extern void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dma_address,
-                               size_t size, enum dma_data_direction direction);
-#endif
-
 #ifdef CONFIG_SWIOTLB
 extern int swiotlb;
 #else
Index: sle10-sp1-2006-12-18/include/xen/swiotlb.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sle10-sp1-2006-12-18/include/xen/swiotlb.h	2006-12-20 15:37:50.000000000 +0100
@@ -0,0 +1,169 @@
+/*
+ * Copyright (C) 2005 Keir Fraser <keir@xensource.com>
+ */
+
+#ifndef _XEN_SWIOTLB_H
+#define _XEN_SWIOTLB_H 1
+
+#include <asm-x86_64/swiotlb.h>
+#include <linux/highmem.h>
+#include <asm/uaccess.h>
+#include <xen/interface/memory.h>
+
+/* Width of DMA addresses in the IO TLB. 30 bits is a b44 limitation. */
+#define DEFAULT_IO_TLB_DMA_BITS 30
+
+#define SWIOTLB_EXTRA_VARIABLES \
+int swiotlb; \
+static unsigned int io_tlb_dma_bits = DEFAULT_IO_TLB_DMA_BITS; \
+static int __init setup_io_tlb_bits(char *str) \
+{ \
+	io_tlb_dma_bits = simple_strtoul(str, NULL, 0); \
+	return 0; \
+} \
+__setup("swiotlb_bits=", setup_io_tlb_bits); \
+static int __init setup_io_tlb_npages(char *str) \
+{ \
+	if (isdigit(*str)) { \
+		io_tlb_nslabs = simple_strtoul(str, &str, 0); \
+		/* Unlike ia64, the size is the aperture in megabytes, not 'slabs'! */ \
+		io_tlb_nslabs <<= (20 - IO_TLB_SHIFT); \
+		/* avoid tail segment of size < IO_TLB_SEGSIZE */ \
+		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); \
+	} \
+	if (*str == ',') \
+		++str; \
+	/* \
+	 * NB. 'force' enables the swiotlb, but doesn't force its use for \
+	 * every DMA like it does on native Linux. 'off' forcibly disables \
+	 * use of the swiotlb. \
+	 */ \
+	if (!strcmp(str, "force")) \
+		swiotlb_force = 1; \
+	else if (!strcmp(str, "off")) \
+		swiotlb_force = -1; \
+	return 1; \
+} \
+/* \
+ * We use __copy_to_user_inatomic to transfer to the host buffer because the \
+ * buffer may be mapped read-only (e.g, in blkback driver) but lower-level \
+ * drivers map the buffer for DMA_BIDIRECTIONAL access. This causes an \
+ * unnecessary copy from the aperture to the host buffer, and a page fault. \
+ */ \
+static inline void \
+__swiotlb_arch_sync_single(io_tlb_addr_t buffer, char *dma_addr, size_t size, int dir) \
+{ \
+	if (PageHighMem(buffer.page)) { \
+		size_t len, bytes; \
+		char *dev, *host, *kmp; \
+		len = size; \
+		while (len != 0) { \
+			if (((bytes = len) + buffer.offset) > PAGE_SIZE) \
+				bytes = PAGE_SIZE - buffer.offset; \
+			kmp  = kmap_atomic(buffer.page, KM_SWIOTLB); \
+			dev  = dma_addr + size - len; \
+			host = kmp + buffer.offset; \
+			if (dir == DMA_FROM_DEVICE) { \
+				if (__copy_to_user_inatomic(host, dev, bytes)) \
+					/* inaccessible */; \
+			} else \
+				memcpy(dev, host, bytes); \
+			kunmap_atomic(kmp, KM_SWIOTLB); \
+			len -= bytes; \
+			buffer.page++; \
+			buffer.offset = 0; \
+		} \
+	} else { \
+		char *host = (char *)phys_to_virt( \
+			page_to_pseudophys(buffer.page)) + buffer.offset; \
+		if (dir == DMA_FROM_DEVICE) { \
+			if (__copy_to_user_inatomic(host, dma_addr, size)) \
+				/* inaccessible */; \
+		} else if (dir == DMA_TO_DEVICE) \
+			memcpy(dma_addr, host, size); \
+	} \
+} \
+EXPORT_SYMBOL(swiotlb)
+#define SWIOTLB_ARCH_HAS_SETUP_IO_TLB_NPAGES
+#define SWIOTLB_ARCH_HAS_SYNC_SINGLE
+
+#define swiotlb_adjust_size(size) do { \
+	/* Round up to power of two (xen_create_contiguous_region). */ \
+	while (size & (size - 1)) \
+		size += size & ~(size - 1); \
+} while (0)
+
+#define swiotlb_adjust_seg(start, size) do { \
+	int rc = xen_create_contiguous_region( \
+		(unsigned long)(start), \
+		get_order(size), \
+		io_tlb_dma_bits); \
+	BUG_ON(rc); \
+} while (0)
+
+#define swiotlb_print_info(bytes) \
+	printk(KERN_INFO "Software IO TLB enabled: \n" \
+	       " Aperture:     %lu megabytes\n" \
+	       " Kernel range: 0x%016lx - 0x%016lx\n" \
+	       " Address size: %u bits\n", \
+	       bytes >> 20, \
+	       (unsigned long)io_tlb_start, \
+	       (unsigned long)io_tlb_end, \
+	       io_tlb_dma_bits)
+
+typedef struct {
+	struct page *page;
+	unsigned int offset;
+} io_tlb_addr_t;
+#define SWIOTLB_ARCH_HAS_IO_TLB_ADDR_T
+#define swiotlb_orig_addr_null(buffer) (!(buffer).page)
+#define ptr_to_io_tlb_addr(ptr) ({ \
+	io_tlb_addr_t __buf; \
+	if (ptr) \
+		__buf.page = virt_to_page(ptr); \
+	else \
+		__buf.page = NULL; \
+	__buf.offset = (unsigned long)(ptr) & ~PAGE_MASK; \
+	__buf; \
+})
+#define page_to_io_tlb_addr(pg, off) ({ \
+	io_tlb_addr_t __buf; \
+	__buf.page   = pg; \
+	__buf.offset = off; \
+	__buf; \
+})
+#define sg_to_io_tlb_addr(sg) ({ \
+	io_tlb_addr_t __buf; \
+	__buf.page   = (sg)->page; \
+	__buf.offset = (sg)->offset; \
+	__buf; \
+})
+
+#define __swiotlb_init_with_default_size(defsz) do { \
+	size_t size = (defsz); \
+	if (swiotlb_force == 1) { \
+		swiotlb = 1; \
+	} else if ((swiotlb_force != -1) && \
+		   is_running_on_xen() && \
+		   is_initial_xendomain()) { \
+		/* Domain 0 always has a swiotlb. */ \
+		long ram_end = HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL); \
+		if (ram_end <= 0x7ffff) \
+			size = 2 * (1 << 20); /* 2MB on systems with <2GB. */ \
+		swiotlb = 1; \
+	} \
+	if (swiotlb) \
+		swiotlb_init_with_default_size(size); \
+	else \
+		printk(KERN_INFO "Software IO TLB disabled\n"); \
+} while(0)
+
+#define range_needs_mapping range_straddles_page_boundary
+#define order_needs_mapping(order) ((order) != 0)
+#define SWIOTLB_ARCH_HAS_NEEDS_MAPPING
+
+#define SWIOTLB_ARCH_NEED_MAP_PAGE
+
+#define __swiotlb_dma_supported(hwdev, mask) ((mask) >= ((1UL << io_tlb_dma_bits) - 1))
+
+#endif /* _XEN_SWIOTLB_H */
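
The rounding loop in swiotlb_adjust_size() above works by repeatedly adding the
lowest set bit until only one bit remains. A minimal stand-alone sketch of the
same idea, written as a function instead of a macro (the name round_up_pow2 and
the non-zero precondition are assumptions of this example, not part of the
patch):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round 'size' up to the next power of two by repeatedly adding its
 * lowest set bit. Assumes size is non-zero and small enough that the
 * result does not overflow size_t.
 */
static size_t round_up_pow2(size_t size)
{
	assert(size != 0);
	while (size & (size - 1))		/* more than one bit set? */
		size += size & ~(size - 1);	/* add the lowest set bit */
	return size;
}
```

For example, 5 becomes 6 and then 8, while a value that is already a power of
two is returned unchanged. A function also avoids evaluating its argument more
than once, which the macro form does.
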
Index: sle10-sp1-2006-12-18/lib/Makefile
===================================================================
--- sle10-sp1-2006-12-18.orig/lib/Makefile	2006-12-20 15:53:32.000000000 +0100
+++ sle10-sp1-2006-12-18/lib/Makefile	2006-12-20 15:45:20.000000000 +0100
@@ -49,7 +49,6 @@ obj-$(CONFIG_SGRB)	 += sgrb.o
 obj-$(CONFIG_STATISTICS) += statistic.o
 
 obj-$(CONFIG_SWIOTLB) += swiotlb.o
-swiotlb-$(CONFIG_XEN) := ../arch/i386/kernel/swiotlb.o
 
 hostprogs-y	:= gen_crc32table
 clean-files	:= crc32table.h

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: x86 swiotlb questions
@ 2006-12-22 14:49 Jan Beulich
  2006-12-25  4:50 ` Muli Ben-Yehuda
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2006-12-22 14:49 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Muli Ben-Yehuda, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1393 bytes --]

Patch update, fixing a bug on x86/PAE and making include/xen/swiotlb.h look
a lot nicer (though still not really nice). My plan is to submit the non-Xen
patches to lkml right after New Year, unless I hear negative feedback.

What are the plans on the Xen side - pull the non-Xen patches into patches/,
or ignore everything until (hopefully) mainline has picked up some or all of
the native ones?

Jan

>>Do we merge okay with lib/swiotlb.c then? One concern I had was with our
>>preferred setup semantics -- we really want the user to be able to forcibly
>>enable the swiotlb via a boot parameter *but* not have to suffer using it
>>for every DMA operation. Last I looked the generic swiotlb didn't have that
>>option. That and our very Xen-specific checks for whether to auto-enable the
>>swiotlb led me to think that the very start-of-day setup of swiotlb would
>>need to be overridable by architecture.
>
>I think I retained all of the semantics, attached the patches as I have them
>by now. This is a submission for review only, as the first four patches will
>need to go to kernel.org (and hopefully will get accepted). The Xen
>customization is fairly ugly, but I didn't see anything nicer than that while
>also keeping the amount of changes to lib/swiotlb.c reasonable.
>
>Patch order is
>swiotlb-bugs.patch
>swiotlb-bus.patch
>swiotlb-cleanup.patch
>swiotlb-split.patch
>xen-swiotlb.patch
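
The setup semantics discussed in the quoted text (a size given in megabytes,
plus "force" to enable the swiotlb and "off" to disable it) can be sketched in
user space as below. This is a rough illustration of the parsing done by the
Xen variant of setup_io_tlb_npages(), not the kernel code itself: the demo_*
names and the returned struct are invented for the example, and the shift of
11 and segment size of 128 are assumed values.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_IO_TLB_SHIFT   11	/* assumed: 2 KiB slabs */
#define DEMO_IO_TLB_SEGSIZE 128	/* assumed: slabs per segment */
#define DEMO_ALIGN(x, a) (((x) + (a) - 1) & ~((unsigned long)(a) - 1))

struct demo_swiotlb_opts {
	unsigned long nslabs;	/* 0 means "use the default size" */
	int force;		/* 1 = enable, -1 = disable, 0 = auto-detect */
};

/* Parse the value of a "swiotlb=<megabytes>[,force|off]" style option. */
static struct demo_swiotlb_opts demo_parse_swiotlb(const char *str)
{
	struct demo_swiotlb_opts o = { 0, 0 };
	char *end;

	if (isdigit((unsigned char)*str)) {
		o.nslabs = strtoul(str, &end, 0);
		/* Convert megabytes to slabs. */
		o.nslabs <<= (20 - DEMO_IO_TLB_SHIFT);
		/* Avoid a tail segment smaller than a full segment. */
		o.nslabs = DEMO_ALIGN(o.nslabs, DEMO_IO_TLB_SEGSIZE);
		str = end;
	}
	if (*str == ',')
		++str;
	if (!strcmp(str, "force"))
		o.force = 1;
	else if (!strcmp(str, "off"))
		o.force = -1;
	return o;
}
```

With these assumed constants, "64,force" yields 32768 slabs with force set to
1, and "off" alone leaves the size at its default and sets force to -1.
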


[-- Attachment #2: swiotlb-split.patch --]
[-- Type: text/plain, Size: 14632 bytes --]

This patch adds an abstraction layer so that lib/swiotlb.c can be used by
environments other than IA64 and EM64T, namely Xen.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
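
The abstraction relies on the generic file supplying a default for every hook
that the architecture header has not already defined. A minimal sketch of that
pattern, with hypothetical demo_* names standing in for the real hooks:

```c
#include <assert.h>
#include <stddef.h>

/*
 * An architecture header would be included here and may define any of the
 * hooks below. None are defined in this sketch, so the generic defaults
 * apply. To simulate an override, define a hook before this point, e.g.:
 *   #define demo_adjust_size(size) do { (size) *= 2; } while (0)
 */

#ifndef demo_adjust_size
#define demo_adjust_size(size) ((void)0)	/* default: leave size alone */
#endif

#ifndef DEMO_ARCH_HAS_ADDR_T
typedef char *demo_addr_t;			/* default: plain virtual address */
#define demo_addr_null(buffer) (!(buffer))
#define demo_ptr_to_addr(ptr) (ptr)
#endif

/* Generic code calls the hooks without knowing who provided them. */
static size_t demo_init(size_t nslabs)
{
	demo_adjust_size(nslabs);
	return nslabs;
}

/* Returns non-zero when the recorded original address is unset. */
static int demo_is_unmapped(demo_addr_t buffer)
{
	return demo_addr_null(buffer);
}
```

With no override in place, demo_init() returns its argument unchanged, which
mirrors how the defaults keep the native behaviour intact.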

Index: sle10-sp1-2006-12-21/include/asm-ia64/swiotlb.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sle10-sp1-2006-12-21/include/asm-ia64/swiotlb.h	2006-12-21 16:13:18.000000000 +0100
@@ -0,0 +1,9 @@
+#ifndef _ASM_SWIOTLB_H
+#define _ASM_SWIOTLB_H 1
+
+#include <asm/machvec.h>
+
+#define SWIOTLB_ARCH_NEED_LATE_INIT
+#define SWIOTLB_ARCH_NEED_ALLOC
+
+#endif /* _ASM_SWIOTLB_H */
Index: sle10-sp1-2006-12-21/include/asm-x86_64/swiotlb.h
===================================================================
--- sle10-sp1-2006-12-21.orig/include/asm-x86_64/swiotlb.h	2006-12-21 16:10:20.000000000 +0100
+++ sle10-sp1-2006-12-21/include/asm-x86_64/swiotlb.h	2006-12-21 16:13:03.000000000 +0100
@@ -52,6 +52,7 @@ extern void swiotlb_unmap_page(struct de
 #endif
 
 #ifdef CONFIG_SWIOTLB
+#define SWIOTLB_ARCH_NEED_ALLOC
 extern int swiotlb;
 #else
 #define swiotlb 0
Index: sle10-sp1-2006-12-21/lib/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-21.orig/lib/swiotlb.c	2006-12-21 15:41:31.000000000 +0100
+++ sle10-sp1-2006-12-21/lib/swiotlb.c	2006-12-21 16:09:44.000000000 +0100
@@ -28,6 +28,7 @@
 #include <asm/io.h>
 #include <asm/dma.h>
 #include <asm/scatterlist.h>
+#include <asm/swiotlb.h>
 
 #include <linux/init.h>
 #include <linux/bootmem.h>
@@ -35,8 +36,10 @@
 #define OFFSET(val,align) ((unsigned long)	\
 	                   ( (val) & ( (align) - 1)))
 
+#ifndef SG_ENT_VIRT_ADDRESS
 #define SG_ENT_VIRT_ADDRESS(sg)	(page_address((sg)->page) + (sg)->offset)
 #define SG_ENT_PHYS_ADDRESS(sg)	virt_to_bus(SG_ENT_VIRT_ADDRESS(sg))
+#endif
 
 /*
  * Maximum allowable number of contiguous slabs to map,
@@ -101,13 +104,25 @@ static unsigned int io_tlb_index;
  * We need to save away the original address corresponding to a mapped entry
  * for the sync operations.
  */
-static unsigned char **io_tlb_orig_addr;
+#ifndef SWIOTLB_ARCH_HAS_IO_TLB_ADDR_T
+typedef char *io_tlb_addr_t;
+#define swiotlb_orig_addr_null(buffer) (!(buffer))
+#define ptr_to_io_tlb_addr(ptr) (ptr)
+#define page_to_io_tlb_addr(pg, off) (page_address(pg) + (off))
+#define sg_to_io_tlb_addr(sg) SG_ENT_VIRT_ADDRESS(sg)
+#endif
+static io_tlb_addr_t *io_tlb_orig_addr;
 
 /*
  * Protect the above data structures in the map and unmap calls
  */
 static DEFINE_SPINLOCK(io_tlb_lock);
 
+#ifdef SWIOTLB_EXTRA_VARIABLES
+SWIOTLB_EXTRA_VARIABLES;
+#endif
+
+#ifndef SWIOTLB_ARCH_HAS_SETUP_IO_TLB_NPAGES
 static int __init
 setup_io_tlb_npages(char *str)
 {
@@ -122,9 +137,25 @@ setup_io_tlb_npages(char *str)
 		swiotlb_force = 1;
 	return 1;
 }
+#endif
 __setup("swiotlb=", setup_io_tlb_npages);
 /* make io_tlb_overflow tunable too? */
 
+#ifndef swiotlb_adjust_size
+#define swiotlb_adjust_size(size) ((void)0)
+#endif
+
+#ifndef swiotlb_adjust_seg
+#define swiotlb_adjust_seg(start, size) ((void)0)
+#endif
+
+#ifndef swiotlb_print_info
+#define swiotlb_print_info(bytes) \
+	printk(KERN_INFO "Placing %luMB software IO TLB between 0x%lx - " \
+	       "0x%lx\n", bytes >> 20, \
+	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end))
+#endif
+
 /*
  * Statically reserve bounce buffer space and initialize bounce buffer data
  * structures for the software IO TLB used to implement the DMA API.
@@ -138,6 +169,8 @@ swiotlb_init_with_default_size(size_t de
 		io_tlb_nslabs = (default_size >> IO_TLB_SHIFT);
 		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
 	}
+	swiotlb_adjust_size(io_tlb_nslabs);
+	swiotlb_adjust_size(io_tlb_overflow);
 
 	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
@@ -155,25 +188,33 @@ swiotlb_init_with_default_size(size_t de
 	 * between io_tlb_start and io_tlb_end.
 	 */
 	io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int));
-	for (i = 0; i < io_tlb_nslabs; i++)
+	for (i = 0; i < io_tlb_nslabs; i++) {
+		if ( !(i % IO_TLB_SEGSIZE) )
+			swiotlb_adjust_seg(io_tlb_start + (i << IO_TLB_SHIFT),
+				IO_TLB_SEGSIZE << IO_TLB_SHIFT);
  		io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
+ 	}
 	io_tlb_index = 0;
-	io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(char *));
+	io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(io_tlb_addr_t));
 
 	/*
 	 * Get the overflow emergency buffer
 	 */
 	io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow);
-	printk(KERN_INFO "Placing software IO TLB between 0x%lx - 0x%lx\n",
-	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
+	swiotlb_adjust_seg(io_tlb_overflow_buffer, io_tlb_overflow);
+	swiotlb_print_info(bytes);
 }
+#ifndef __swiotlb_init_with_default_size
+#define __swiotlb_init_with_default_size swiotlb_init_with_default_size
+#endif
 
 void __init
 swiotlb_init(void)
 {
-	swiotlb_init_with_default_size(64 * (1<<20));	/* default to 64MB */
+	__swiotlb_init_with_default_size(64 * (1<<20));	/* default to 64MB */
 }
 
+#ifdef SWIOTLB_ARCH_NEED_LATE_INIT
 /*
  * Systems with larger DMA zones (those that don't support ISA) can
  * initialize the swiotlb later using the slab allocator if needed.
@@ -231,12 +272,12 @@ swiotlb_late_init_with_default_size(size
  		io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
 	io_tlb_index = 0;
 
-	io_tlb_orig_addr = (unsigned char **)__get_free_pages(GFP_KERNEL,
-	                           get_order(io_tlb_nslabs * sizeof(char *)));
+	io_tlb_orig_addr = (io_tlb_addr_t *)__get_free_pages(GFP_KERNEL,
+	                           get_order(io_tlb_nslabs * sizeof(io_tlb_addr_t)));
 	if (!io_tlb_orig_addr)
 		goto cleanup3;
 
-	memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(char *));
+	memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(io_tlb_addr_t));
 
 	/*
 	 * Get the overflow emergency buffer
@@ -246,19 +287,17 @@ swiotlb_late_init_with_default_size(size
 	if (!io_tlb_overflow_buffer)
 		goto cleanup4;
 
-	printk(KERN_INFO "Placing %luMB software IO TLB between 0x%lx - "
-	       "0x%lx\n", bytes >> 20,
-	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
+	swiotlb_print_info(bytes);
 
 	return 0;
 
 cleanup4:
-	free_pages((unsigned long)io_tlb_orig_addr, get_order(io_tlb_nslabs *
-	                                                      sizeof(char *)));
+	free_pages((unsigned long)io_tlb_orig_addr,
+		   get_order(io_tlb_nslabs * sizeof(io_tlb_addr_t)));
 	io_tlb_orig_addr = NULL;
 cleanup3:
-	free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
-	                                                 sizeof(int)));
+	free_pages((unsigned long)io_tlb_list,
+		   get_order(io_tlb_nslabs * sizeof(int)));
 	io_tlb_list = NULL;
 cleanup2:
 	io_tlb_end = NULL;
@@ -268,7 +307,9 @@ cleanup1:
 	io_tlb_nslabs = req_nslabs;
 	return -ENOMEM;
 }
+#endif
 
+#ifndef SWIOTLB_ARCH_HAS_NEEDS_MAPPING
 static inline int
 address_needs_mapping(struct device *hwdev, dma_addr_t addr)
 {
@@ -279,11 +320,35 @@ address_needs_mapping(struct device *hwd
 	return (addr & ~mask) != 0;
 }
 
+static inline int range_needs_mapping(const void *ptr, size_t size)
+{
+	return swiotlb_force;
+}
+
+static inline int order_needs_mapping(unsigned int order)
+{
+	return 0;
+}
+#endif
+
+static void
+__sync_single(io_tlb_addr_t buffer, char *dma_addr, size_t size, int dir)
+{
+#ifndef SWIOTLB_ARCH_HAS_SYNC_SINGLE
+	if (dir == DMA_TO_DEVICE)
+		memcpy(dma_addr, buffer, size);
+	else
+		memcpy(buffer, dma_addr, size);
+#else
+	__swiotlb_arch_sync_single(buffer, dma_addr, size, dir);
+#endif
+}
+
 /*
  * Allocates bounce buffer and returns its kernel virtual address.
  */
 static void *
-map_single(struct device *hwdev, char *buffer, size_t size, int dir)
+map_single(struct device *hwdev, io_tlb_addr_t buffer, size_t size, int dir)
 {
 	unsigned long flags;
 	char *dma_addr;
@@ -357,7 +422,7 @@ map_single(struct device *hwdev, char *b
 	 */
 	io_tlb_orig_addr[index] = buffer;
 	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
-		memcpy(dma_addr, buffer, size);
+		__sync_single(buffer, dma_addr, size, DMA_TO_DEVICE);
 
 	return dma_addr;
 }
@@ -371,17 +436,18 @@ unmap_single(struct device *hwdev, char 
 	unsigned long flags;
 	int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
 	int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT;
-	char *buffer = io_tlb_orig_addr[index];
+	io_tlb_addr_t buffer = io_tlb_orig_addr[index];
 
 	/*
 	 * First, sync the memory before unmapping the entry
 	 */
-	if (buffer && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
+	if (!swiotlb_orig_addr_null(buffer)
+	    && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
 		/*
 		 * bounce... copy the data back into the original buffer * and
 		 * delete the bounce buffer.
 		 */
-		memcpy(buffer, dma_addr, size);
+		__sync_single(buffer, dma_addr, size, DMA_FROM_DEVICE);
 
 	/*
 	 * Return the buffer to the free list by setting the corresponding
@@ -414,18 +480,18 @@ sync_single(struct device *hwdev, char *
 	    int dir, int target)
 {
 	int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT;
-	char *buffer = io_tlb_orig_addr[index];
+	io_tlb_addr_t buffer = io_tlb_orig_addr[index];
 
 	switch (target) {
 	case SYNC_FOR_CPU:
 		if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL))
-			memcpy(buffer, dma_addr, size);
+			__sync_single(buffer, dma_addr, size, DMA_FROM_DEVICE);
 		else if (dir != DMA_TO_DEVICE)
 			BUG();
 		break;
 	case SYNC_FOR_DEVICE:
 		if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
-			memcpy(dma_addr, buffer, size);
+			__sync_single(buffer, dma_addr, size, DMA_TO_DEVICE);
 		else if (dir != DMA_FROM_DEVICE)
 			BUG();
 		break;
@@ -434,6 +500,8 @@ sync_single(struct device *hwdev, char *
 	}
 }
 
+#ifdef SWIOTLB_ARCH_NEED_ALLOC
+
 void *
 swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 		       dma_addr_t *dma_handle, gfp_t flags)
@@ -449,7 +517,10 @@ swiotlb_alloc_coherent(struct device *hw
 	 */
 	flags |= GFP_DMA;
 
-	ret = (void *)__get_free_pages(flags, order);
+	if (!order_needs_mapping(order))
+		ret = (void *)__get_free_pages(flags, order);
+	else
+		ret = NULL;
 	if (ret && address_needs_mapping(hwdev, virt_to_bus(ret))) {
 		/*
 		 * The allocated memory isn't reachable by the device.
@@ -487,6 +558,7 @@ swiotlb_alloc_coherent(struct device *hw
 	*dma_handle = dev_addr;
 	return ret;
 }
+EXPORT_SYMBOL(swiotlb_alloc_coherent);
 
 void
 swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
@@ -499,6 +571,9 @@ swiotlb_free_coherent(struct device *hwd
 		/* DMA_TO_DEVICE to avoid memcpy in unmap_single */
 		swiotlb_unmap_single (hwdev, dma_handle, size, DMA_TO_DEVICE);
 }
+EXPORT_SYMBOL(swiotlb_free_coherent);
+
+#endif
 
 static void
 swiotlb_full(struct device *dev, size_t size, int dir, int do_panic)
@@ -536,18 +611,20 @@ swiotlb_map_single(struct device *hwdev,
 
 	if (dir == DMA_NONE)
 		BUG();
+
 	/*
 	 * If the pointer passed in happens to be in the device's DMA window,
 	 * we can safely return the device addr and not worry about bounce
 	 * buffering it.
 	 */
-	if (!address_needs_mapping(hwdev, dev_addr) && !swiotlb_force)
+	if (!range_needs_mapping(ptr, size)
+	    && !address_needs_mapping(hwdev, dev_addr))
 		return dev_addr;
 
 	/*
 	 * Oh well, have to allocate and map a bounce buffer.
 	 */
-	map = map_single(hwdev, ptr, size, dir);
+	map = map_single(hwdev, ptr_to_io_tlb_addr(ptr), size, dir);
 	if (!map) {
 		swiotlb_full(hwdev, size, dir, 1);
 		map = io_tlb_overflow_buffer;
@@ -678,7 +755,6 @@ int
 swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg, int nelems,
 	       int dir)
 {
-	void *addr;
 	dma_addr_t dev_addr;
 	int i;
 
@@ -686,10 +762,10 @@ swiotlb_map_sg(struct device *hwdev, str
 		BUG();
 
 	for (i = 0; i < nelems; i++, sg++) {
-		addr = SG_ENT_VIRT_ADDRESS(sg);
-		dev_addr = virt_to_bus(addr);
-		if (swiotlb_force || address_needs_mapping(hwdev, dev_addr)) {
-			void *map = map_single(hwdev, addr, sg->length, dir);
+		dev_addr = SG_ENT_PHYS_ADDRESS(sg);
+		if (range_needs_mapping(SG_ENT_VIRT_ADDRESS(sg), sg->length)
+		    || address_needs_mapping(hwdev, dev_addr)) {
+			void *map = map_single(hwdev, sg_to_io_tlb_addr(sg), sg->length, dir);
 			if (!map) {
 				/* Don't panic here, we expect map_sg users
 				   to do proper error handling. */
@@ -765,6 +841,46 @@ swiotlb_sync_sg_for_device(struct device
 	swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE);
 }
 
+#ifdef SWIOTLB_ARCH_NEED_MAP_PAGE
+
+dma_addr_t
+swiotlb_map_page(struct device *hwdev, struct page *page,
+		 unsigned long offset, size_t size,
+		 enum dma_data_direction direction)
+{
+	dma_addr_t dev_addr;
+	char *map;
+
+	dev_addr = page_to_bus(page) + offset;
+	if (address_needs_mapping(hwdev, dev_addr)) {
+		map = map_single(hwdev, page_to_io_tlb_addr(page, offset), size, direction);
+		if (!map) {
+			swiotlb_full(hwdev, size, direction, 1);
+			map = io_tlb_overflow_buffer;
+		}
+		dev_addr = virt_to_bus(map);
+	}
+
+	return dev_addr;
+}
+EXPORT_SYMBOL(swiotlb_map_page);
+
+void
+swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
+		   size_t size, enum dma_data_direction direction)
+{
+	char *dma_addr = bus_to_virt(dev_addr);
+
+	BUG_ON(direction == DMA_NONE);
+	if (dma_addr >= io_tlb_start && dma_addr < io_tlb_end)
+		unmap_single(hwdev, dma_addr, size, direction);
+	else if (direction == DMA_FROM_DEVICE)
+		dma_mark_clean(dma_addr, size);
+}
+EXPORT_SYMBOL(swiotlb_unmap_page);
+
+#endif
+
 int
 swiotlb_dma_mapping_error(dma_addr_t dma_addr)
 {
@@ -780,7 +896,11 @@ swiotlb_dma_mapping_error(dma_addr_t dma
 int
 swiotlb_dma_supported(struct device *hwdev, u64 mask)
 {
+#ifndef __swiotlb_dma_supported
 	return (virt_to_bus(io_tlb_end) - 1) <= mask;
+#else
+	return __swiotlb_dma_supported(hwdev, mask);
+#endif
 }
 
 EXPORT_SYMBOL(swiotlb_init);
@@ -795,6 +915,4 @@ EXPORT_SYMBOL_GPL(swiotlb_sync_single_ra
 EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu);
 EXPORT_SYMBOL(swiotlb_sync_sg_for_device);
 EXPORT_SYMBOL(swiotlb_dma_mapping_error);
-EXPORT_SYMBOL(swiotlb_alloc_coherent);
-EXPORT_SYMBOL(swiotlb_free_coherent);
 EXPORT_SYMBOL(swiotlb_dma_supported);

[-- Attachment #3: swiotlb-bus.patch --]
[-- Type: text/plain, Size: 5241 bytes --]

Convert all phys_to_virt/virt_to_phys uses to bus_to_virt/virt_to_bus.
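
On native x86 the two conversions usually yield the same value, which makes
the distinction easy to overlook. Under a hypervisor the address the kernel
treats as physical and the address a device sees can differ. A toy sketch of
that difference, assuming a made-up frame remapping table (`toy_p2m` and the
`toy_*` helpers are hypothetical and not kernel interfaces):

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NPAGES 4

/* Hypothetical pseudo-physical -> machine frame table (illustration only). */
static const uint64_t toy_p2m[TOY_NPAGES] = { 7, 2, 9, 4 };

/* "Physical" address as the guest kernel sees it: identity in this toy. */
static uint64_t toy_virt_to_phys(uint64_t vaddr)
{
	return vaddr;
}

/* "Bus" address as the device sees it: the frame number is remapped. */
static uint64_t toy_virt_to_bus(uint64_t vaddr)
{
	uint64_t pfn = vaddr >> TOY_PAGE_SHIFT;
	uint64_t off = vaddr & ((1ULL << TOY_PAGE_SHIFT) - 1);

	/* The modulo only keeps the toy table lookup in bounds. */
	return (toy_p2m[pfn % TOY_NPAGES] << TOY_PAGE_SHIFT) | off;
}
```

Comparing a device address against the "physical" value would give the wrong
answer whenever the table is not the identity, which is the kind of mismatch
the conversion is meant to rule out.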

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: sle10-sp1-2006-12-18/lib/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-18.orig/lib/swiotlb.c	2006-12-20 12:02:01.000000000 +0100
+++ sle10-sp1-2006-12-18/lib/swiotlb.c	2006-12-20 12:09:03.000000000 +0100
@@ -36,7 +36,7 @@
 	                   ( (val) & ( (align) - 1)))
 
 #define SG_ENT_VIRT_ADDRESS(sg)	(page_address((sg)->page) + (sg)->offset)
-#define SG_ENT_PHYS_ADDRESS(SG)	virt_to_phys(SG_ENT_VIRT_ADDRESS(SG))
+#define SG_ENT_PHYS_ADDRESS(sg)	virt_to_bus(SG_ENT_VIRT_ADDRESS(sg))
 
 /*
  * Maximum allowable number of contiguous slabs to map,
@@ -163,7 +163,7 @@ swiotlb_init_with_default_size (size_t d
 	 */
 	io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow);
 	printk(KERN_INFO "Placing software IO TLB between 0x%lx - 0x%lx\n",
-	       virt_to_phys(io_tlb_start), virt_to_phys(io_tlb_end));
+	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
 }
 
 void
@@ -244,7 +244,7 @@ swiotlb_late_init_with_default_size (siz
 
 	printk(KERN_INFO "Placing %ldMB software IO TLB between 0x%lx - "
 	       "0x%lx\n", (io_tlb_nslabs * (1 << IO_TLB_SHIFT)) >> 20,
-	       virt_to_phys(io_tlb_start), virt_to_phys(io_tlb_end));
+	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
 
 	return 0;
 
@@ -446,7 +446,7 @@ swiotlb_alloc_coherent(struct device *hw
 	flags |= GFP_DMA;
 
 	ret = (void *)__get_free_pages(flags, order);
-	if (ret && address_needs_mapping(hwdev, virt_to_phys(ret))) {
+	if (ret && address_needs_mapping(hwdev, virt_to_bus(ret))) {
 		/*
 		 * The allocated memory isn't reachable by the device.
 		 * Fall back on swiotlb_map_single().
@@ -466,11 +466,11 @@ swiotlb_alloc_coherent(struct device *hw
 		if (swiotlb_dma_mapping_error(handle))
 			return NULL;
 
-		ret = phys_to_virt(handle);
+		ret = bus_to_virt(handle);
 	}
 
 	memset(ret, 0, size);
-	dev_addr = virt_to_phys(ret);
+	dev_addr = virt_to_bus(ret);
 
 	/* Confirm address can be DMA'd by device */
 	if (address_needs_mapping(hwdev, dev_addr)) {
@@ -526,7 +526,7 @@ swiotlb_full(struct device *dev, size_t 
 dma_addr_t
 swiotlb_map_single(struct device *hwdev, void *ptr, size_t size, int dir)
 {
-	unsigned long dev_addr = virt_to_phys(ptr);
+	unsigned long dev_addr = virt_to_bus(ptr);
 	void *map;
 
 	if (dir == DMA_NONE)
@@ -548,7 +548,7 @@ swiotlb_map_single(struct device *hwdev,
 		map = io_tlb_overflow_buffer;
 	}
 
-	dev_addr = virt_to_phys(map);
+	dev_addr = virt_to_bus(map);
 
 	/*
 	 * Ensure that the address returned is DMA'ble
@@ -571,7 +571,7 @@ void
 swiotlb_unmap_single(struct device *hwdev, dma_addr_t dev_addr, size_t size,
 		     int dir)
 {
-	char *dma_addr = phys_to_virt(dev_addr);
+	char *dma_addr = bus_to_virt(dev_addr);
 
 	if (dir == DMA_NONE)
 		BUG();
@@ -595,7 +595,7 @@ static inline void
 swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
 		    size_t size, int dir, int target)
 {
-	char *dma_addr = phys_to_virt(dev_addr);
+	char *dma_addr = bus_to_virt(dev_addr);
 
 	if (dir == DMA_NONE)
 		BUG();
@@ -627,7 +627,7 @@ swiotlb_sync_single_range(struct device 
 			  unsigned long offset, size_t size,
 			  int dir, int target)
 {
-	char *dma_addr = phys_to_virt(dev_addr) + offset;
+	char *dma_addr = bus_to_virt(dev_addr) + offset;
 
 	if (dir == DMA_NONE)
 		BUG();
@@ -682,7 +682,7 @@ swiotlb_map_sg(struct device *hwdev, str
 
 	for (i = 0; i < nelems; i++, sg++) {
 		addr = SG_ENT_VIRT_ADDRESS(sg);
-		dev_addr = virt_to_phys(addr);
+		dev_addr = virt_to_bus(addr);
 		if (swiotlb_force || address_needs_mapping(hwdev, dev_addr)) {
 			void *map = map_single(hwdev, addr, sg->length, dir);
 			if (!map) {
@@ -716,7 +716,8 @@ swiotlb_unmap_sg(struct device *hwdev, s
 
 	for (i = 0; i < nelems; i++, sg++)
 		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			unmap_single(hwdev, (void *) phys_to_virt(sg->dma_address), sg->dma_length, dir);
+			unmap_single(hwdev, bus_to_virt(sg->dma_address),
+				     sg->dma_length, dir);
 		else if (dir == DMA_FROM_DEVICE)
 			dma_mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
 }
@@ -739,7 +740,7 @@ swiotlb_sync_sg(struct device *hwdev, st
 
 	for (i = 0; i < nelems; i++, sg++)
 		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			sync_single(hwdev, phys_to_virt(sg->dma_address),
+			sync_single(hwdev, bus_to_virt(sg->dma_address),
 				    sg->dma_length, dir, target);
 		else if (dir == DMA_FROM_DEVICE)
 			dma_mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
@@ -762,7 +763,7 @@ swiotlb_sync_sg_for_device(struct device
 int
 swiotlb_dma_mapping_error(dma_addr_t dma_addr)
 {
-	return (dma_addr == virt_to_phys(io_tlb_overflow_buffer));
+	return (dma_addr == virt_to_bus(io_tlb_overflow_buffer));
 }
 
 /*
@@ -774,7 +775,7 @@ swiotlb_dma_mapping_error(dma_addr_t dma
 int
 swiotlb_dma_supported (struct device *hwdev, u64 mask)
 {
-	return (virt_to_phys (io_tlb_end) - 1) <= mask;
+	return (virt_to_bus(io_tlb_end) - 1) <= mask;
 }
 
 EXPORT_SYMBOL(swiotlb_init);

[-- Attachment #4: swiotlb-cleanup.patch --]
[-- Type: text/plain, Size: 7796 bytes --]

This patch
- adds proper __init decoration to swiotlb's init code (and the code calling
  it, where not already the case)
- replaces uses of 'unsigned long' with dma_addr_t where appropriate
- does miscellaneous simplification and cleanup
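
The dma_addr_t change matters on configurations where a DMA address can be
wider than 'unsigned long', for example a 32-bit kernel using 64-bit DMA
addresses. A minimal sketch of what goes wrong, using fixed-width stand-in
types (`toy_ulong32`, `toy_dma_addr64` and both functions are hypothetical
names, not kernel types or APIs):

```c
#include <stdint.h>

typedef uint32_t toy_ulong32;     /* stand-in for a 32-bit unsigned long */
typedef uint64_t toy_dma_addr64;  /* stand-in for a 64-bit dma_addr_t */

/* Same mask test as address_needs_mapping(): is any bit outside the mask? */
static int toy_needs_mapping(toy_dma_addr64 addr, toy_dma_addr64 mask)
{
	return (addr & ~mask) != 0;
}

/* Buggy variant: the address is narrowed before the check. */
static int toy_needs_mapping_truncated(toy_dma_addr64 addr,
				       toy_dma_addr64 mask)
{
	toy_ulong32 narrowed = (toy_ulong32)addr;  /* high bits are lost */

	return (narrowed & ~mask) != 0;
}
```

With an address just above 4G and a 32-bit mask, the first function reports
that bouncing is needed while the truncated variant does not.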

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: sle10-sp1-2006-12-21/arch/ia64/mm/init.c
===================================================================
--- sle10-sp1-2006-12-21.orig/arch/ia64/mm/init.c	2006-12-20 12:02:01.000000000 +0100
+++ sle10-sp1-2006-12-21/arch/ia64/mm/init.c	2006-12-20 12:09:22.000000000 +0100
@@ -586,7 +586,7 @@ nolwsys_setup (char *s)
 
 __setup("nolwsys", nolwsys_setup);
 
-void
+void __init
 mem_init (void)
 {
 	long reserved_pages, codesize, datasize, initsize;
Index: sle10-sp1-2006-12-21/arch/x86_64/kernel/pci-swiotlb.c
===================================================================
--- sle10-sp1-2006-12-21.orig/arch/x86_64/kernel/pci-swiotlb.c	2006-12-21 11:10:55.000000000 +0100
+++ sle10-sp1-2006-12-21/arch/x86_64/kernel/pci-swiotlb.c	2006-12-20 12:09:22.000000000 +0100
@@ -28,7 +28,7 @@ struct dma_mapping_ops swiotlb_dma_ops =
 	.dma_supported = NULL,
 };
 
-void pci_swiotlb_init(void)
+void __init pci_swiotlb_init(void)
 {
 	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
 	if (!iommu_detected && !no_iommu &&
Index: sle10-sp1-2006-12-21/lib/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-21.orig/lib/swiotlb.c	2006-12-20 12:09:03.000000000 +0100
+++ sle10-sp1-2006-12-21/lib/swiotlb.c	2006-12-21 15:41:31.000000000 +0100
@@ -1,7 +1,7 @@
 /*
  * Dynamic DMA mapping support.
  *
- * This implementation is for IA-64 and EM64T platforms that do not support
+ * This implementation is a fallback for platforms that do not support
  * I/O TLBs (aka DMA address translation hardware).
  * Copyright (C) 2000 Asit Mallick <Asit.K.Mallick@intel.com>
  * Copyright (C) 2000 Goutham Rao <goutham.rao@intel.com>
@@ -68,7 +68,7 @@ enum dma_sync_target {
 	SYNC_FOR_DEVICE = 1,
 };
 
-int swiotlb_force;
+static int swiotlb_force;
 
 /*
  * Used to do a quick range check in swiotlb_unmap_single and
@@ -129,23 +129,25 @@ __setup("swiotlb=", setup_io_tlb_npages)
  * Statically reserve bounce buffer space and initialize bounce buffer data
  * structures for the software IO TLB used to implement the DMA API.
  */
-void
-swiotlb_init_with_default_size (size_t default_size)
+void __init
+swiotlb_init_with_default_size(size_t default_size)
 {
-	unsigned long i;
+	unsigned long i, bytes;
 
 	if (!io_tlb_nslabs) {
 		io_tlb_nslabs = (default_size >> IO_TLB_SHIFT);
 		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
 	}
 
+	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
+
 	/*
 	 * Get IO TLB memory from the low pages
 	 */
-	io_tlb_start = alloc_bootmem_low_pages(io_tlb_nslabs * (1 << IO_TLB_SHIFT));
+	io_tlb_start = alloc_bootmem_low_pages(bytes);
 	if (!io_tlb_start)
 		panic("Cannot allocate SWIOTLB buffer");
-	io_tlb_end = io_tlb_start + io_tlb_nslabs * (1 << IO_TLB_SHIFT);
+	io_tlb_end = io_tlb_start + bytes;
 
 	/*
 	 * Allocate and initialize the free list array.  This array is used
@@ -166,8 +168,8 @@ swiotlb_init_with_default_size (size_t d
 	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
 }
 
-void
-swiotlb_init (void)
+void __init
+swiotlb_init(void)
 {
 	swiotlb_init_with_default_size(64 * (1<<20));	/* default to 64MB */
 }
@@ -178,9 +180,9 @@ swiotlb_init (void)
  * This should be just like above, but with some error catching.
  */
 int
-swiotlb_late_init_with_default_size (size_t default_size)
+swiotlb_late_init_with_default_size(size_t default_size)
 {
-	unsigned long i, req_nslabs = io_tlb_nslabs;
+	unsigned long i, bytes, req_nslabs = io_tlb_nslabs;
 	unsigned int order;
 
 	if (!io_tlb_nslabs) {
@@ -191,8 +193,9 @@ swiotlb_late_init_with_default_size (siz
 	/*
 	 * Get IO TLB memory from the low pages
 	 */
-	order = get_order(io_tlb_nslabs * (1 << IO_TLB_SHIFT));
+	order = get_order(io_tlb_nslabs << IO_TLB_SHIFT);
 	io_tlb_nslabs = SLABS_PER_PAGE << order;
+	bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
 	while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
 		io_tlb_start = (char *)__get_free_pages(GFP_DMA | __GFP_NOWARN,
@@ -205,13 +208,14 @@ swiotlb_late_init_with_default_size (siz
 	if (!io_tlb_start)
 		goto cleanup1;
 
-	if (order != get_order(io_tlb_nslabs * (1 << IO_TLB_SHIFT))) {
+	if (order != get_order(bytes)) {
 		printk(KERN_WARNING "Warning: only able to allocate %ld MB "
 		       "for software IO TLB\n", (PAGE_SIZE << order) >> 20);
 		io_tlb_nslabs = SLABS_PER_PAGE << order;
+		bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 	}
-	io_tlb_end = io_tlb_start + io_tlb_nslabs * (1 << IO_TLB_SHIFT);
-	memset(io_tlb_start, 0, io_tlb_nslabs * (1 << IO_TLB_SHIFT));
+	io_tlb_end = io_tlb_start + bytes;
+	memset(io_tlb_start, 0, bytes);
 
 	/*
 	 * Allocate and initialize the free list array.  This array is used
@@ -242,8 +246,8 @@ swiotlb_late_init_with_default_size (siz
 	if (!io_tlb_overflow_buffer)
 		goto cleanup4;
 
-	printk(KERN_INFO "Placing %ldMB software IO TLB between 0x%lx - "
-	       "0x%lx\n", (io_tlb_nslabs * (1 << IO_TLB_SHIFT)) >> 20,
+	printk(KERN_INFO "Placing %luMB software IO TLB between 0x%lx - "
+	       "0x%lx\n", bytes >> 20,
 	       virt_to_bus(io_tlb_start), virt_to_bus(io_tlb_end));
 
 	return 0;
@@ -256,8 +260,8 @@ cleanup3:
 	free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
 	                                                 sizeof(int)));
 	io_tlb_list = NULL;
-	io_tlb_end = NULL;
 cleanup2:
+	io_tlb_end = NULL;
 	free_pages((unsigned long)io_tlb_start, order);
 	io_tlb_start = NULL;
 cleanup1:
@@ -434,7 +438,7 @@ void *
 swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 		       dma_addr_t *dma_handle, gfp_t flags)
 {
-	unsigned long dev_addr;
+	dma_addr_t dev_addr;
 	void *ret;
 	int order = get_order(size);
 
@@ -474,8 +478,9 @@ swiotlb_alloc_coherent(struct device *hw
 
 	/* Confirm address can be DMA'd by device */
 	if (address_needs_mapping(hwdev, dev_addr)) {
-		printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016lx\n",
-		       (unsigned long long)*hwdev->dma_mask, dev_addr);
+		printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
+		       (unsigned long long)*hwdev->dma_mask,
+		       (unsigned long long)dev_addr);
 		panic("swiotlb_alloc_coherent: allocated memory is out of "
 		      "range for device");
 	}
@@ -505,7 +510,7 @@ swiotlb_full(struct device *dev, size_t 
 	 * When the mapping is small enough return a static buffer to limit
 	 * the damage, or panic when the transfer is too big.
 	 */
-	printk(KERN_ERR "DMA: Out of SW-IOMMU space for %lu bytes at "
+	printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at "
 	       "device %s\n", size, dev ? dev->bus_id : "?");
 
 	if (size > io_tlb_overflow && do_panic) {
@@ -526,7 +531,7 @@ swiotlb_full(struct device *dev, size_t 
 dma_addr_t
 swiotlb_map_single(struct device *hwdev, void *ptr, size_t size, int dir)
 {
-	unsigned long dev_addr = virt_to_bus(ptr);
+	dma_addr_t dev_addr = virt_to_bus(ptr);
 	void *map;
 
 	if (dir == DMA_NONE)
@@ -674,7 +679,7 @@ swiotlb_map_sg(struct device *hwdev, str
 	       int dir)
 {
 	void *addr;
-	unsigned long dev_addr;
+	dma_addr_t dev_addr;
 	int i;
 
 	if (dir == DMA_NONE)
@@ -773,7 +778,7 @@ swiotlb_dma_mapping_error(dma_addr_t dma
  * this function.
  */
 int
-swiotlb_dma_supported (struct device *hwdev, u64 mask)
+swiotlb_dma_supported(struct device *hwdev, u64 mask)
 {
 	return (virt_to_bus(io_tlb_end) - 1) <= mask;
 }

[-- Attachment #5: swiotlb-bugs.patch --]
[-- Type: text/plain, Size: 6364 bytes --]

This patch fixes
- marking of DMAed-to pages as I-cache clean, which is now done only for IA64
- broken multiple inclusion in include/asm-x86_64/swiotlb.h
- missing phys-to-virt translation in swiotlb_sync_sg()
- missing call to mark_clean in swiotlb_sync_sg()
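
The page-marking helper only touches pages that lie completely inside the
buffer. A small sketch of that arithmetic, counting pages instead of setting
a page flag (`TOY_PAGE_SIZE`, `TOY_PAGE_ALIGN` and `toy_count_complete_pages`
are hypothetical names, and a 4 KiB page size is assumed):

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_PAGE_ALIGN(a) (((a) + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1))

/* Count the pages fully contained in [addr, addr + size). */
static unsigned long toy_count_complete_pages(unsigned long addr, size_t size)
{
	unsigned long pg_addr = TOY_PAGE_ALIGN(addr);  /* first page boundary */
	unsigned long end = addr + size;
	unsigned long n = 0;

	while (pg_addr + TOY_PAGE_SIZE <= end) {
		n++;  /* the real helper would mark this page clean */
		pg_addr += TOY_PAGE_SIZE;
	}
	return n;
}
```

A buffer that starts mid-page skips its partial first page, so an 8 KiB
transfer at offset 100 covers only one complete page.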

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: sle10-sp1-2006-12-18/arch/ia64/mm/init.c
===================================================================
--- sle10-sp1-2006-12-18.orig/arch/ia64/mm/init.c	2006-03-20 06:53:29.000000000 +0100
+++ sle10-sp1-2006-12-18/arch/ia64/mm/init.c	2006-12-20 12:02:01.000000000 +0100
@@ -123,6 +123,25 @@ lazy_mmu_prot_update (pte_t pte)
 	set_bit(PG_arch_1, &page->flags);	/* mark page as clean */
 }
 
+/*
+ * Since DMA is i-cache coherent, any (complete) pages that were written via
+ * DMA can be marked as "clean" so that lazy_mmu_prot_update() doesn't have to
+ * flush them when they get mapped into an executable vm-area.
+ */
+void
+dma_mark_clean(void *addr, size_t size)
+{
+	unsigned long pg_addr, end;
+
+	pg_addr = PAGE_ALIGN((unsigned long) addr);
+	end = (unsigned long) addr + size;
+	while (pg_addr + PAGE_SIZE <= end) {
+		struct page *page = virt_to_page(pg_addr);
+		set_bit(PG_arch_1, &page->flags);
+		pg_addr += PAGE_SIZE;
+	}
+}
+
 inline void
 ia64_set_rbs_bot (void)
 {
Index: sle10-sp1-2006-12-18/include/asm-ia64/dma.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-ia64/dma.h	2006-03-20 06:53:29.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-ia64/dma.h	2006-12-20 12:02:01.000000000 +0100
@@ -20,4 +20,6 @@ extern unsigned long MAX_DMA_ADDRESS;
 
 #define free_dma(x)
 
+void dma_mark_clean(void *addr, size_t size);
+
 #endif /* _ASM_IA64_DMA_H */
Index: sle10-sp1-2006-12-18/include/asm-x86_64/mach-xen/asm/dma-mapping.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-x86_64/mach-xen/asm/dma-mapping.h	2006-12-20 15:53:41.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-x86_64/mach-xen/asm/dma-mapping.h	2006-12-20 12:02:01.000000000 +0100
@@ -10,7 +10,6 @@
 
 #include <asm/scatterlist.h>
 #include <asm/io.h>
-#include <asm/swiotlb.h>
 
 struct dma_mapping_ops {
 	int             (*mapping_error)(dma_addr_t dma_addr);
Index: sle10-sp1-2006-12-18/include/asm-x86_64/swiotlb.h
===================================================================
--- sle10-sp1-2006-12-18.orig/include/asm-x86_64/swiotlb.h	2006-12-20 15:53:41.000000000 +0100
+++ sle10-sp1-2006-12-18/include/asm-x86_64/swiotlb.h	2006-12-20 15:53:57.000000000 +0100
@@ -1,5 +1,5 @@
 #ifndef _ASM_SWIOTLB_H
-#define _ASM_SWTIOLB_H 1
+#define _ASM_SWIOTLB_H 1
 
 #include <linux/config.h>
 
@@ -59,4 +59,6 @@ extern int swiotlb;
 
 extern void pci_swiotlb_init(void);
 
-#endif /* _ASM_SWTIOLB_H */
+static inline void dma_mark_clean(void *addr, size_t size) {}
+
+#endif /* _ASM_SWIOTLB_H */
Index: sle10-sp1-2006-12-18/lib/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-18.orig/lib/swiotlb.c	2006-03-20 06:53:29.000000000 +0100
+++ sle10-sp1-2006-12-18/lib/swiotlb.c	2006-12-20 12:02:01.000000000 +0100
@@ -560,25 +560,6 @@ swiotlb_map_single(struct device *hwdev,
 }
 
 /*
- * Since DMA is i-cache coherent, any (complete) pages that were written via
- * DMA can be marked as "clean" so that lazy_mmu_prot_update() doesn't have to
- * flush them when they get mapped into an executable vm-area.
- */
-static void
-mark_clean(void *addr, size_t size)
-{
-	unsigned long pg_addr, end;
-
-	pg_addr = PAGE_ALIGN((unsigned long) addr);
-	end = (unsigned long) addr + size;
-	while (pg_addr + PAGE_SIZE <= end) {
-		struct page *page = virt_to_page(pg_addr);
-		set_bit(PG_arch_1, &page->flags);
-		pg_addr += PAGE_SIZE;
-	}
-}
-
-/*
  * Unmap a single streaming mode DMA translation.  The dma_addr and size must
  * match what was provided for in a previous swiotlb_map_single call.  All
  * other usages are undefined.
@@ -597,7 +578,7 @@ swiotlb_unmap_single(struct device *hwde
 	if (dma_addr >= io_tlb_start && dma_addr < io_tlb_end)
 		unmap_single(hwdev, dma_addr, size, dir);
 	else if (dir == DMA_FROM_DEVICE)
-		mark_clean(dma_addr, size);
+		dma_mark_clean(dma_addr, size);
 }
 
 /*
@@ -621,7 +602,7 @@ swiotlb_sync_single(struct device *hwdev
 	if (dma_addr >= io_tlb_start && dma_addr < io_tlb_end)
 		sync_single(hwdev, dma_addr, size, dir, target);
 	else if (dir == DMA_FROM_DEVICE)
-		mark_clean(dma_addr, size);
+		dma_mark_clean(dma_addr, size);
 }
 
 void
@@ -653,7 +634,7 @@ swiotlb_sync_single_range(struct device 
 	if (dma_addr >= io_tlb_start && dma_addr < io_tlb_end)
 		sync_single(hwdev, dma_addr, size, dir, target);
 	else if (dir == DMA_FROM_DEVICE)
-		mark_clean(dma_addr, size);
+		dma_mark_clean(dma_addr, size);
 }
 
 void
@@ -704,7 +685,6 @@ swiotlb_map_sg(struct device *hwdev, str
 		dev_addr = virt_to_phys(addr);
 		if (swiotlb_force || address_needs_mapping(hwdev, dev_addr)) {
 			void *map = map_single(hwdev, addr, sg->length, dir);
-			sg->dma_address = virt_to_bus(map);
 			if (!map) {
 				/* Don't panic here, we expect map_sg users
 				   to do proper error handling. */
@@ -713,6 +693,7 @@ swiotlb_map_sg(struct device *hwdev, str
 				sg[0].dma_length = 0;
 				return 0;
 			}
+			sg->dma_address = virt_to_bus(map);
 		} else
 			sg->dma_address = dev_addr;
 		sg->dma_length = sg->length;
@@ -737,7 +718,7 @@ swiotlb_unmap_sg(struct device *hwdev, s
 		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
 			unmap_single(hwdev, (void *) phys_to_virt(sg->dma_address), sg->dma_length, dir);
 		else if (dir == DMA_FROM_DEVICE)
-			mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
+			dma_mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
 }
 
 /*
@@ -758,8 +739,10 @@ swiotlb_sync_sg(struct device *hwdev, st
 
 	for (i = 0; i < nelems; i++, sg++)
 		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			sync_single(hwdev, (void *) sg->dma_address,
+			sync_single(hwdev, phys_to_virt(sg->dma_address),
 				    sg->dma_length, dir, target);
+		else if (dir == DMA_FROM_DEVICE)
+			dma_mark_clean(SG_ENT_VIRT_ADDRESS(sg), sg->dma_length);
 }
 
 void

[-- Attachment #6: xen-swiotlb.patch --]
[-- Type: text/plain, Size: 35000 bytes --]

This patch eliminates Xen's special version of the swiotlb code. Along with
that it adds trivial forwarding of dma_{,un}map_page when not using highmem.
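
The Xen-specific file being removed rounds the slab count up to a power of
two, as its comment says, for xen_create_contiguous_region. The rounding
idiom it uses can be sketched standalone as follows (the function name
`toy_round_up_pow2` is hypothetical):

```c
/* Round n up to the next power of two; powers of two are returned as-is. */
static unsigned long toy_round_up_pow2(unsigned long n)
{
	/* n & (n - 1) is nonzero while more than one bit is set. */
	while (n & (n - 1))
		n += n & ~(n - 1);  /* add the lowest set bit, carrying upward */
	return n;
}
```

The default of 64MB with 2KB slabs already gives a power-of-two slab count,
so the loop only changes values supplied on the command line.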

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Index: sle10-sp1-2006-12-21/arch/i386/kernel/swiotlb.c
===================================================================
--- sle10-sp1-2006-12-21.orig/arch/i386/kernel/swiotlb.c	2006-12-21 16:13:03.000000000 +0100
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,683 +0,0 @@
-/*
- * Dynamic DMA mapping support.
- *
- * This implementation is a fallback for platforms that do not support
- * I/O TLBs (aka DMA address translation hardware).
- * Copyright (C) 2000 Asit Mallick <Asit.K.Mallick@intel.com>
- * Copyright (C) 2000 Goutham Rao <goutham.rao@intel.com>
- * Copyright (C) 2000, 2003 Hewlett-Packard Co
- *	David Mosberger-Tang <davidm@hpl.hp.com>
- * Copyright (C) 2005 Keir Fraser <keir@xensource.com>
- */
-
-#include <linux/cache.h>
-#include <linux/mm.h>
-#include <linux/module.h>
-#include <linux/pci.h>
-#include <linux/spinlock.h>
-#include <linux/string.h>
-#include <linux/types.h>
-#include <linux/ctype.h>
-#include <linux/init.h>
-#include <linux/bootmem.h>
-#include <linux/highmem.h>
-#include <asm/io.h>
-#include <asm/pci.h>
-#include <asm/dma.h>
-#include <asm/uaccess.h>
-#include <xen/interface/memory.h>
-
-int swiotlb;
-EXPORT_SYMBOL(swiotlb);
-
-#define OFFSET(val,align) ((unsigned long)((val) & ( (align) - 1)))
-
-#define SG_ENT_PHYS_ADDRESS(sg)	(page_to_bus((sg)->page) + (sg)->offset)
-
-/*
- * Maximum allowable number of contiguous slabs to map,
- * must be a power of 2.  What is the appropriate value ?
- * The complexity of {map,unmap}_single is linearly dependent on this value.
- */
-#define IO_TLB_SEGSIZE	128
-
-/*
- * log of the size of each IO TLB slab.  The number of slabs is command line
- * controllable.
- */
-#define IO_TLB_SHIFT 11
-
-/* Width of DMA addresses. 30 bits is a b44 limitation. */
-#define DEFAULT_DMA_BITS 30
-
-static int swiotlb_force;
-static char *iotlb_virt_start;
-static unsigned long iotlb_nslabs;
-
-/*
- * Used to do a quick range check in swiotlb_unmap_single and
- * swiotlb_sync_single_*, to see if the memory was in fact allocated by this
- * API.
- */
-static unsigned long iotlb_pfn_start, iotlb_pfn_end;
-
-/* Does the given dma address reside within the swiotlb aperture? */
-static inline int in_swiotlb_aperture(dma_addr_t dev_addr)
-{
-	unsigned long pfn = mfn_to_local_pfn(dev_addr >> PAGE_SHIFT);
-	return (pfn_valid(pfn)
-		&& (pfn >= iotlb_pfn_start)
-		&& (pfn < iotlb_pfn_end));
-}
-
-/*
- * When the IOMMU overflows we return a fallback buffer. This sets the size.
- */
-static unsigned long io_tlb_overflow = 32*1024;
-
-void *io_tlb_overflow_buffer;
-
-/*
- * This is a free list describing the number of free entries available from
- * each index
- */
-static unsigned int *io_tlb_list;
-static unsigned int io_tlb_index;
-
-/*
- * We need to save away the original address corresponding to a mapped entry
- * for the sync operations.
- */
-static struct phys_addr {
-	struct page *page;
-	unsigned int offset;
-} *io_tlb_orig_addr;
-
-/*
- * Protect the above data structures in the map and unmap calls
- */
-static DEFINE_SPINLOCK(io_tlb_lock);
-
-unsigned int dma_bits = DEFAULT_DMA_BITS;
-static int __init
-setup_dma_bits(char *str)
-{
-	dma_bits = simple_strtoul(str, NULL, 0);
-	return 0;
-}
-__setup("dma_bits=", setup_dma_bits);
-
-static int __init
-setup_io_tlb_npages(char *str)
-{
-	/* Unlike ia64, the size is aperture in megabytes, not 'slabs'! */
-	if (isdigit(*str)) {
-		iotlb_nslabs = simple_strtoul(str, &str, 0) <<
-			(20 - IO_TLB_SHIFT);
-		iotlb_nslabs = ALIGN(iotlb_nslabs, IO_TLB_SEGSIZE);
-		/* Round up to power of two (xen_create_contiguous_region). */
-		while (iotlb_nslabs & (iotlb_nslabs-1))
-			iotlb_nslabs += iotlb_nslabs & ~(iotlb_nslabs-1);
-	}
-	if (*str == ',')
-		++str;
-	/*
-         * NB. 'force' enables the swiotlb, but doesn't force its use for
-         * every DMA like it does on native Linux. 'off' forcibly disables
-         * use of the swiotlb.
-         */
-	if (!strcmp(str, "force"))
-		swiotlb_force = 1;
-	else if (!strcmp(str, "off"))
-		swiotlb_force = -1;
-	return 1;
-}
-__setup("swiotlb=", setup_io_tlb_npages);
-/* make io_tlb_overflow tunable too? */
-
-/*
- * Statically reserve bounce buffer space and initialize bounce buffer data
- * structures for the software IO TLB used to implement the PCI DMA API.
- */
-void
-swiotlb_init_with_default_size (size_t default_size)
-{
-	unsigned long i, bytes;
-
-	if (!iotlb_nslabs) {
-		iotlb_nslabs = (default_size >> IO_TLB_SHIFT);
-		iotlb_nslabs = ALIGN(iotlb_nslabs, IO_TLB_SEGSIZE);
-		/* Round up to power of two (xen_create_contiguous_region). */
-		while (iotlb_nslabs & (iotlb_nslabs-1))
-			iotlb_nslabs += iotlb_nslabs & ~(iotlb_nslabs-1);
-	}
-
-	bytes = iotlb_nslabs * (1UL << IO_TLB_SHIFT);
-
-	/*
-	 * Get IO TLB memory from the low pages
-	 */
-	iotlb_virt_start = alloc_bootmem_low_pages(bytes);
-	if (!iotlb_virt_start)
-		panic("Cannot allocate SWIOTLB buffer!\n"
-		      "Use dom0_mem Xen boot parameter to reserve\n"
-		      "some DMA memory (e.g., dom0_mem=-128M).\n");
-
-	for (i = 0; i < iotlb_nslabs; i += IO_TLB_SEGSIZE) {
-		int rc = xen_create_contiguous_region(
-			(unsigned long)iotlb_virt_start + (i << IO_TLB_SHIFT),
-			get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT),
-			dma_bits);
-		BUG_ON(rc);
-	}
-
-	/*
-	 * Allocate and initialize the free list array.  This array is used
-	 * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE.
-	 */
-	io_tlb_list = alloc_bootmem(iotlb_nslabs * sizeof(int));
-	for (i = 0; i < iotlb_nslabs; i++)
- 		io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE);
-	io_tlb_index = 0;
-	io_tlb_orig_addr = alloc_bootmem(
-		iotlb_nslabs * sizeof(*io_tlb_orig_addr));
-
-	/*
-	 * Get the overflow emergency buffer
-	 */
-	io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow);
-
-	iotlb_pfn_start = __pa(iotlb_virt_start) >> PAGE_SHIFT;
-	iotlb_pfn_end   = iotlb_pfn_start + (bytes >> PAGE_SHIFT);
-
-	printk(KERN_INFO "Software IO TLB enabled: \n"
-	       " Aperture:     %lu megabytes\n"
-	       " Kernel range: 0x%016lx - 0x%016lx\n"
-	       " Address size: %u bits\n",
-	       bytes >> 20,
-	       (unsigned long)iotlb_virt_start,
-	       (unsigned long)iotlb_virt_start + bytes,
-	       dma_bits);
-}
-
-void
-swiotlb_init(void)
-{
-	long ram_end;
-	size_t defsz = 64 * (1 << 20); /* 64MB default size */
-
-	if (swiotlb_force == 1) {
-		swiotlb = 1;
-	} else if ((swiotlb_force != -1) &&
-		   is_running_on_xen() &&
-		   is_initial_xendomain()) {
-		/* Domain 0 always has a swiotlb. */
-		ram_end = HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL);
-		if (ram_end <= 0x7ffff)
-			defsz = 2 * (1 << 20); /* 2MB on <2GB on systems. */
-		swiotlb = 1;
-	}
-
-	if (swiotlb)
-		swiotlb_init_with_default_size(defsz);
-	else
-		printk(KERN_INFO "Software IO TLB disabled\n");
-}
-
-/*
- * We use __copy_to_user_inatomic to transfer to the host buffer because the
- * buffer may be mapped read-only (e.g, in blkback driver) but lower-level
- * drivers map the buffer for DMA_BIDIRECTIONAL access. This causes an
- * unnecessary copy from the aperture to the host buffer, and a page fault.
- */
-static void
-__sync_single(struct phys_addr buffer, char *dma_addr, size_t size, int dir)
-{
-	if (PageHighMem(buffer.page)) {
-		size_t len, bytes;
-		char *dev, *host, *kmp;
-		len = size;
-		while (len != 0) {
-			if (((bytes = len) + buffer.offset) > PAGE_SIZE)
-				bytes = PAGE_SIZE - buffer.offset;
-			kmp  = kmap_atomic(buffer.page, KM_SWIOTLB);
-			dev  = dma_addr + size - len;
-			host = kmp + buffer.offset;
-			if (dir == DMA_FROM_DEVICE) {
-				if (__copy_to_user_inatomic(host, dev, bytes))
-					/* inaccessible */;
-			} else
-				memcpy(dev, host, bytes);
-			kunmap_atomic(kmp, KM_SWIOTLB);
-			len -= bytes;
-			buffer.page++;
-			buffer.offset = 0;
-		}
-	} else {
-		char *host = (char *)phys_to_virt(
-			page_to_pseudophys(buffer.page)) + buffer.offset;
-		if (dir == DMA_FROM_DEVICE) {
-			if (__copy_to_user_inatomic(host, dma_addr, size))
-				/* inaccessible */;
-		} else if (dir == DMA_TO_DEVICE)
-			memcpy(dma_addr, host, size);
-	}
-}
-
-/*
- * Allocates bounce buffer and returns its kernel virtual address.
- */
-static void *
-map_single(struct device *hwdev, struct phys_addr buffer, size_t size, int dir)
-{
-	unsigned long flags;
-	char *dma_addr;
-	unsigned int nslots, stride, index, wrap;
-	int i;
-
-	/*
-	 * For mappings greater than a page, we limit the stride (and
-	 * hence alignment) to a page size.
-	 */
-	nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	if (size > PAGE_SIZE)
-		stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT));
-	else
-		stride = 1;
-
-	BUG_ON(!nslots);
-
-	/*
-	 * Find suitable number of IO TLB entries size that will fit this
-	 * request and allocate a buffer from that IO TLB pool.
-	 */
-	spin_lock_irqsave(&io_tlb_lock, flags);
-	{
-		wrap = index = ALIGN(io_tlb_index, stride);
-
-		if (index >= iotlb_nslabs)
-			wrap = index = 0;
-
-		do {
-			/*
-			 * If we find a slot that indicates we have 'nslots'
-			 * number of contiguous buffers, we allocate the
-			 * buffers from that slot and mark the entries as '0'
-			 * indicating unavailable.
-			 */
-			if (io_tlb_list[index] >= nslots) {
-				int count = 0;
-
-				for (i = index; i < (int)(index + nslots); i++)
-					io_tlb_list[i] = 0;
-				for (i = index - 1;
-				     (OFFSET(i, IO_TLB_SEGSIZE) !=
-				      IO_TLB_SEGSIZE -1) && io_tlb_list[i];
-				     i--)
-					io_tlb_list[i] = ++count;
-				dma_addr = iotlb_virt_start +
-					(index << IO_TLB_SHIFT);
-
-				/*
-				 * Update the indices to avoid searching in
-				 * the next round.
-				 */
-				io_tlb_index = 
-					((index + nslots) < iotlb_nslabs
-					 ? (index + nslots) : 0);
-
-				goto found;
-			}
-			index += stride;
-			if (index >= iotlb_nslabs)
-				index = 0;
-		} while (index != wrap);
-
-		spin_unlock_irqrestore(&io_tlb_lock, flags);
-		return NULL;
-	}
-  found:
-	spin_unlock_irqrestore(&io_tlb_lock, flags);
-
-	/*
-	 * Save away the mapping from the original address to the DMA address.
-	 * This is needed when we sync the memory.  Then we sync the buffer if
-	 * needed.
-	 */
-	io_tlb_orig_addr[index] = buffer;
-	if ((dir == DMA_TO_DEVICE) || (dir == DMA_BIDIRECTIONAL))
-		__sync_single(buffer, dma_addr, size, DMA_TO_DEVICE);
-
-	return dma_addr;
-}
-
-/*
- * dma_addr is the kernel virtual address of the bounce buffer to unmap.
- */
-static void
-unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir)
-{
-	unsigned long flags;
-	int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	int index = (dma_addr - iotlb_virt_start) >> IO_TLB_SHIFT;
-	struct phys_addr buffer = io_tlb_orig_addr[index];
-
-	/*
-	 * First, sync the memory before unmapping the entry
-	 */
-	if ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))
-		__sync_single(buffer, dma_addr, size, DMA_FROM_DEVICE);
-
-	/*
-	 * Return the buffer to the free list by setting the corresponding
-	 * entries to indicate the number of contigous entries available.
-	 * While returning the entries to the free list, we merge the entries
-	 * with slots below and above the pool being returned.
-	 */
-	spin_lock_irqsave(&io_tlb_lock, flags);
-	{
-		count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ?
-			 io_tlb_list[index + nslots] : 0);
-		/*
-		 * Step 1: return the slots to the free list, merging the
-		 * slots with superceeding slots
-		 */
-		for (i = index + nslots - 1; i >= index; i--)
-			io_tlb_list[i] = ++count;
-		/*
-		 * Step 2: merge the returned slots with the preceding slots,
-		 * if available (non zero)
-		 */
-		for (i = index - 1;
-		     (OFFSET(i, IO_TLB_SEGSIZE) !=
-		      IO_TLB_SEGSIZE -1) && io_tlb_list[i];
-		     i--)
-			io_tlb_list[i] = ++count;
-	}
-	spin_unlock_irqrestore(&io_tlb_lock, flags);
-}
-
-static void
-sync_single(struct device *hwdev, char *dma_addr, size_t size, int dir)
-{
-	int index = (dma_addr - iotlb_virt_start) >> IO_TLB_SHIFT;
-	struct phys_addr buffer = io_tlb_orig_addr[index];
-	BUG_ON((dir != DMA_FROM_DEVICE) && (dir != DMA_TO_DEVICE));
-	__sync_single(buffer, dma_addr, size, dir);
-}
-
-static void
-swiotlb_full(struct device *dev, size_t size, int dir, int do_panic)
-{
-	/*
-	 * Ran out of IOMMU space for this operation. This is very bad.
-	 * Unfortunately the drivers cannot handle this operation properly.
-	 * unless they check for pci_dma_mapping_error (most don't)
-	 * When the mapping is small enough return a static buffer to limit
-	 * the damage, or panic when the transfer is too big.
-	 */
-	printk(KERN_ERR "PCI-DMA: Out of SW-IOMMU space for %lu bytes at "
-	       "device %s\n", (unsigned long)size, dev ? dev->bus_id : "?");
-
-	if (size > io_tlb_overflow && do_panic) {
-		if (dir == PCI_DMA_FROMDEVICE || dir == PCI_DMA_BIDIRECTIONAL)
-			panic("PCI-DMA: Memory would be corrupted\n");
-		if (dir == PCI_DMA_TODEVICE || dir == PCI_DMA_BIDIRECTIONAL)
-			panic("PCI-DMA: Random memory would be DMAed\n");
-	}
-}
-
-/*
- * Map a single buffer of the indicated size for DMA in streaming mode.  The
- * PCI address to use is returned.
- *
- * Once the device is given the dma address, the device owns this memory until
- * either swiotlb_unmap_single or swiotlb_dma_sync_single is performed.
- */
-dma_addr_t
-swiotlb_map_single(struct device *hwdev, void *ptr, size_t size, int dir)
-{
-	dma_addr_t dev_addr = virt_to_bus(ptr);
-	void *map;
-	struct phys_addr buffer;
-
-	BUG_ON(dir == DMA_NONE);
-
-	/*
-	 * If the pointer passed in happens to be in the device's DMA window,
-	 * we can safely return the device addr and not worry about bounce
-	 * buffering it.
-	 */
-	if (!range_straddles_page_boundary(ptr, size) &&
-	    !address_needs_mapping(hwdev, dev_addr))
-		return dev_addr;
-
-	/*
-	 * Oh well, have to allocate and map a bounce buffer.
-	 */
-	buffer.page   = virt_to_page(ptr);
-	buffer.offset = (unsigned long)ptr & ~PAGE_MASK;
-	map = map_single(hwdev, buffer, size, dir);
-	if (!map) {
-		swiotlb_full(hwdev, size, dir, 1);
-		map = io_tlb_overflow_buffer;
-	}
-
-	dev_addr = virt_to_bus(map);
-	return dev_addr;
-}
-
-/*
- * Unmap a single streaming mode DMA translation.  The dma_addr and size must
- * match what was provided for in a previous swiotlb_map_single call.  All
- * other usages are undefined.
- *
- * After this call, reads by the cpu to the buffer are guaranteed to see
- * whatever the device wrote there.
- */
-void
-swiotlb_unmap_single(struct device *hwdev, dma_addr_t dev_addr, size_t size,
-		     int dir)
-{
-	BUG_ON(dir == DMA_NONE);
-	if (in_swiotlb_aperture(dev_addr))
-		unmap_single(hwdev, bus_to_virt(dev_addr), size, dir);
-}
-
-/*
- * Make physical memory consistent for a single streaming mode DMA translation
- * after a transfer.
- *
- * If you perform a swiotlb_map_single() but wish to interrogate the buffer
- * using the cpu, yet do not wish to teardown the PCI dma mapping, you must
- * call this function before doing so.  At the next point you give the PCI dma
- * address back to the card, you must first perform a
- * swiotlb_dma_sync_for_device, and then the device again owns the buffer
- */
-void
-swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr,
-			    size_t size, int dir)
-{
-	BUG_ON(dir == DMA_NONE);
-	if (in_swiotlb_aperture(dev_addr))
-		sync_single(hwdev, bus_to_virt(dev_addr), size, dir);
-}
-
-void
-swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
-			       size_t size, int dir)
-{
-	BUG_ON(dir == DMA_NONE);
-	if (in_swiotlb_aperture(dev_addr))
-		sync_single(hwdev, bus_to_virt(dev_addr), size, dir);
-}
-
-/*
- * Map a set of buffers described by scatterlist in streaming mode for DMA.
- * This is the scatter-gather version of the above swiotlb_map_single
- * interface.  Here the scatter gather list elements are each tagged with the
- * appropriate dma address and length.  They are obtained via
- * sg_dma_{address,length}(SG).
- *
- * NOTE: An implementation may be able to use a smaller number of
- *       DMA address/length pairs than there are SG table elements.
- *       (for example via virtual mapping capabilities)
- *       The routine returns the number of addr/length pairs actually
- *       used, at most nents.
- *
- * Device ownership issues as mentioned above for swiotlb_map_single are the
- * same here.
- */
-int
-swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg, int nelems,
-	       int dir)
-{
-	struct phys_addr buffer;
-	dma_addr_t dev_addr;
-	char *map;
-	int i;
-
-	BUG_ON(dir == DMA_NONE);
-
-	for (i = 0; i < nelems; i++, sg++) {
-		dev_addr = SG_ENT_PHYS_ADDRESS(sg);
-		if (address_needs_mapping(hwdev, dev_addr)) {
-			buffer.page   = sg->page;
-			buffer.offset = sg->offset;
-			map = map_single(hwdev, buffer, sg->length, dir);
-			if (!map) {
-				/* Don't panic here, we expect map_sg users
-				   to do proper error handling. */
-				swiotlb_full(hwdev, sg->length, dir, 0);
-				swiotlb_unmap_sg(hwdev, sg - i, i, dir);
-				sg[0].dma_length = 0;
-				return 0;
-			}
-			sg->dma_address = (dma_addr_t)virt_to_bus(map);
-		} else
-			sg->dma_address = dev_addr;
-		sg->dma_length = sg->length;
-	}
-	return nelems;
-}
-
-/*
- * Unmap a set of streaming mode DMA translations.  Again, cpu read rules
- * concerning calls here are the same as for swiotlb_unmap_single() above.
- */
-void
-swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg, int nelems,
-		 int dir)
-{
-	int i;
-
-	BUG_ON(dir == DMA_NONE);
-
-	for (i = 0; i < nelems; i++, sg++)
-		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			unmap_single(hwdev, 
-				     (void *)bus_to_virt(sg->dma_address),
-				     sg->dma_length, dir);
-}
-
-/*
- * Make physical memory consistent for a set of streaming mode DMA translations
- * after a transfer.
- *
- * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules
- * and usage.
- */
-void
-swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg,
-			int nelems, int dir)
-{
-	int i;
-
-	BUG_ON(dir == DMA_NONE);
-
-	for (i = 0; i < nelems; i++, sg++)
-		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			sync_single(hwdev,
-				    (void *)bus_to_virt(sg->dma_address),
-				    sg->dma_length, dir);
-}
-
-void
-swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg,
-			   int nelems, int dir)
-{
-	int i;
-
-	BUG_ON(dir == DMA_NONE);
-
-	for (i = 0; i < nelems; i++, sg++)
-		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
-			sync_single(hwdev,
-				    (void *)bus_to_virt(sg->dma_address),
-				    sg->dma_length, dir);
-}
-
-dma_addr_t
-swiotlb_map_page(struct device *hwdev, struct page *page,
-		 unsigned long offset, size_t size,
-		 enum dma_data_direction direction)
-{
-	struct phys_addr buffer;
-	dma_addr_t dev_addr;
-	char *map;
-
-	dev_addr = page_to_bus(page) + offset;
-	if (address_needs_mapping(hwdev, dev_addr)) {
-		buffer.page   = page;
-		buffer.offset = offset;
-		map = map_single(hwdev, buffer, size, direction);
-		if (!map) {
-			swiotlb_full(hwdev, size, direction, 1);
-			map = io_tlb_overflow_buffer;
-		}
-		dev_addr = (dma_addr_t)virt_to_bus(map);
-	}
-
-	return dev_addr;
-}
-
-void
-swiotlb_unmap_page(struct device *hwdev, dma_addr_t dma_address,
-		   size_t size, enum dma_data_direction direction)
-{
-	BUG_ON(direction == DMA_NONE);
-	if (in_swiotlb_aperture(dma_address))
-		unmap_single(hwdev, bus_to_virt(dma_address), size, direction);
-}
-
-int
-swiotlb_dma_mapping_error(dma_addr_t dma_addr)
-{
-	return (dma_addr == virt_to_bus(io_tlb_overflow_buffer));
-}
-
-/*
- * Return whether the given PCI device DMA address mask can be supported
- * properly.  For example, if your device can only drive the low 24-bits
- * during PCI bus mastering, then you would pass 0x00ffffff as the mask to
- * this function.
- */
-int
-swiotlb_dma_supported (struct device *hwdev, u64 mask)
-{
-	return (mask >= ((1UL << dma_bits) - 1));
-}
-
-EXPORT_SYMBOL(swiotlb_init);
-EXPORT_SYMBOL(swiotlb_map_single);
-EXPORT_SYMBOL(swiotlb_unmap_single);
-EXPORT_SYMBOL(swiotlb_map_sg);
-EXPORT_SYMBOL(swiotlb_unmap_sg);
-EXPORT_SYMBOL(swiotlb_sync_single_for_cpu);
-EXPORT_SYMBOL(swiotlb_sync_single_for_device);
-EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu);
-EXPORT_SYMBOL(swiotlb_sync_sg_for_device);
-EXPORT_SYMBOL(swiotlb_map_page);
-EXPORT_SYMBOL(swiotlb_unmap_page);
-EXPORT_SYMBOL(swiotlb_dma_mapping_error);
-EXPORT_SYMBOL(swiotlb_dma_supported);
Index: sle10-sp1-2006-12-21/arch/i386/kernel/pci-dma-xen.c
===================================================================
--- sle10-sp1-2006-12-21.orig/arch/i386/kernel/pci-dma-xen.c	2006-12-21 16:13:03.000000000 +0100
+++ sle10-sp1-2006-12-21/arch/i386/kernel/pci-dma-xen.c	2006-12-21 15:51:29.000000000 +0100
@@ -17,7 +17,7 @@
 #include <xen/balloon.h>
 #include <asm/swiotlb.h>
 #include <asm/tlbflush.h>
-#include <asm-i386/mach-xen/asm/swiotlb.h>
+#include <asm/dma-mapping.h>
 #include <asm/bug.h>
 
 #ifdef __x86_64__
@@ -38,6 +38,15 @@ __init int iommu_setup(char *p)
 }
 #endif
 
+unsigned int dma_bits = DEFAULT_DMA_BITS;
+static int __init
+setup_dma_bits(char *str)
+{
+	dma_bits = simple_strtoul(str, NULL, 0);
+	return 0;
+}
+__setup("dma_bits=", setup_dma_bits);
+
 struct dma_coherent_mem {
 	void		*virt_base;
 	u32		device_base;
@@ -94,13 +103,7 @@ dma_unmap_sg(struct device *hwdev, struc
 }
 EXPORT_SYMBOL(dma_unmap_sg);
 
-/*
- * XXX This file is also used by xenLinux/ia64. 
- * "defined(__i386__) || defined (__x86_64__)" means "!defined(__ia64__)".
- * This #if work around should be removed once this file is merbed back into
- * i386' pci-dma or is moved to drivers/xen/core.
- */
-#if defined(__i386__) || defined(__x86_64__)
+#ifdef CONFIG_HIGHMEM
 dma_addr_t
 dma_map_page(struct device *dev, struct page *page, unsigned long offset,
 	     size_t size, enum dma_data_direction direction)
@@ -130,7 +133,7 @@ dma_unmap_page(struct device *dev, dma_a
 		swiotlb_unmap_page(dev, dma_address, size, direction);
 }
 EXPORT_SYMBOL(dma_unmap_page);
-#endif /* defined(__i386__) || defined(__x86_64__) */
+#endif /* CONFIG_HIGHMEM */
 
 int
 dma_mapping_error(dma_addr_t dma_addr)
Index: sle10-sp1-2006-12-21/include/asm-i386/mach-xen/asm/dma-mapping.h
===================================================================
--- sle10-sp1-2006-12-21.orig/include/asm-i386/mach-xen/asm/dma-mapping.h	2006-12-21 16:13:03.000000000 +0100
+++ sle10-sp1-2006-12-21/include/asm-i386/mach-xen/asm/dma-mapping.h	2006-12-21 11:01:11.000000000 +0100
@@ -24,7 +24,7 @@ address_needs_mapping(struct device *hwd
 }
 
 static inline int
-range_straddles_page_boundary(void *p, size_t size)
+range_straddles_page_boundary(const void *p, size_t size)
 {
 	extern unsigned long *contiguous_bitmap;
 	return (((((unsigned long)p & ~PAGE_MASK) + size) > PAGE_SIZE) &&
@@ -53,6 +53,7 @@ extern int dma_map_sg(struct device *hwd
 extern void dma_unmap_sg(struct device *hwdev, struct scatterlist *sg,
 			 int nents, enum dma_data_direction direction);
 
+#ifdef CONFIG_HIGHMEM
 extern dma_addr_t
 dma_map_page(struct device *dev, struct page *page, unsigned long offset,
 	     size_t size, enum dma_data_direction direction);
@@ -60,6 +61,11 @@ dma_map_page(struct device *dev, struct 
 extern void
 dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
 	       enum dma_data_direction direction);
+#else
+#define dma_map_page(dev, page, offset, size, dir) \
+	dma_map_single(dev, page_address(page) + (offset), (size), (dir))
+#define dma_unmap_page dma_unmap_single
+#endif
 
 extern void
 dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, size_t size,
Index: sle10-sp1-2006-12-21/include/asm-i386/mach-xen/asm/swiotlb.h
===================================================================
--- sle10-sp1-2006-12-21.orig/include/asm-i386/mach-xen/asm/swiotlb.h	2006-12-21 16:13:03.000000000 +0100
+++ sle10-sp1-2006-12-21/include/asm-i386/mach-xen/asm/swiotlb.h	2006-12-21 15:56:11.000000000 +0100
@@ -1,45 +1,26 @@
-#ifndef _ASM_SWIOTLB_H
-#define _ASM_SWIOTLB_H 1
+#ifndef _ASM_XEN_SWIOTLB_H
+#define _ASM_XEN_SWIOTLB_H 1
 
 #include <linux/config.h>
+#include <xen/swiotlb.h>
 
-/* SWIOTLB interface */
+#ifdef CONFIG_HIGHMEM
+
+/*
+ * Intentionally leaving off the base address of the page here - it may not
+ * be valid at all (for highmem pages), and the macro is needed only for
+ * passing as argument to range_needs_mapping(), which doesn't care about
+ * the base address, and dma_mark_clean(), which is a no-op.
+ */
+#define SG_ENT_VIRT_ADDRESS(sg)	((void *)(sg)->offset)
+#define SG_ENT_PHYS_ADDRESS(sg)	(page_to_bus((sg)->page) + (sg)->offset)
 
-extern dma_addr_t swiotlb_map_single(struct device *hwdev, void *ptr, size_t size,
-				      int dir);
-extern void swiotlb_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
-				  size_t size, int dir);
-extern void swiotlb_sync_single_for_cpu(struct device *hwdev,
-					 dma_addr_t dev_addr,
-					 size_t size, int dir);
-extern void swiotlb_sync_single_for_device(struct device *hwdev,
-					    dma_addr_t dev_addr,
-					    size_t size, int dir);
-extern void swiotlb_sync_sg_for_cpu(struct device *hwdev,
-				     struct scatterlist *sg, int nelems,
-				     int dir);
-extern void swiotlb_sync_sg_for_device(struct device *hwdev,
-					struct scatterlist *sg, int nelems,
-					int dir);
-extern int swiotlb_map_sg(struct device *hwdev, struct scatterlist *sg,
-		      int nents, int direction);
-extern void swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg,
-			 int nents, int direction);
-extern int swiotlb_dma_mapping_error(dma_addr_t dma_addr);
 extern dma_addr_t swiotlb_map_page(struct device *hwdev, struct page *page,
                                    unsigned long offset, size_t size,
                                    enum dma_data_direction direction);
 extern void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dma_address,
                                size_t size, enum dma_data_direction direction);
-extern int swiotlb_dma_supported(struct device *hwdev, u64 mask);
-extern void swiotlb_init(void);
-
-extern unsigned int dma_bits;
 
-#ifdef CONFIG_SWIOTLB
-extern int swiotlb;
-#else
-#define swiotlb 0
 #endif
 
 #endif
Index: sle10-sp1-2006-12-21/include/asm-ia64/swiotlb.h
===================================================================
--- sle10-sp1-2006-12-21.orig/include/asm-ia64/swiotlb.h	2006-12-21 16:13:18.000000000 +0100
+++ sle10-sp1-2006-12-21/include/asm-ia64/swiotlb.h	2006-12-21 16:13:45.000000000 +0100
@@ -1,9 +1,17 @@
 #ifndef _ASM_SWIOTLB_H
 #define _ASM_SWIOTLB_H 1
 
+#include <linux/config.h>
+
+#ifndef CONFIG_XEN
+
 #include <asm/machvec.h>
 
 #define SWIOTLB_ARCH_NEED_LATE_INIT
 #define SWIOTLB_ARCH_NEED_ALLOC
 
+#else
+#include <xen/swiotlb.h>
+#endif
+
 #endif /* _ASM_SWIOTLB_H */
Index: sle10-sp1-2006-12-21/include/asm-x86_64/mach-xen/asm/swiotlb.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sle10-sp1-2006-12-21/include/asm-x86_64/mach-xen/asm/swiotlb.h	2006-12-21 11:01:12.000000000 +0100
@@ -0,0 +1 @@
+#include <xen/swiotlb.h>
Index: sle10-sp1-2006-12-21/include/asm-x86_64/swiotlb.h
===================================================================
--- sle10-sp1-2006-12-21.orig/include/asm-x86_64/swiotlb.h	2006-12-21 16:13:03.000000000 +0100
+++ sle10-sp1-2006-12-21/include/asm-x86_64/swiotlb.h	2006-12-21 11:01:12.000000000 +0100
@@ -43,14 +43,6 @@ extern void swiotlb_free_coherent (struc
 extern int swiotlb_dma_supported(struct device *hwdev, u64 mask);
 extern void swiotlb_init(void);
 
-#ifdef CONFIG_XEN
-extern dma_addr_t swiotlb_map_page(struct device *hwdev, struct page *page,
-                                   unsigned long offset, size_t size,
-                                   enum dma_data_direction direction);
-extern void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dma_address,
-                               size_t size, enum dma_data_direction direction);
-#endif
-
 #ifdef CONFIG_SWIOTLB
 #define SWIOTLB_ARCH_NEED_ALLOC
 extern int swiotlb;
Index: sle10-sp1-2006-12-21/include/xen/swiotlb.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sle10-sp1-2006-12-21/include/xen/swiotlb.h	2006-12-21 16:14:27.000000000 +0100
@@ -0,0 +1,166 @@
+/*
+ * Copyright (C) 2005 Keir Fraser <keir@xensource.com>
+ */
+
+#ifndef _XEN_SWIOTLB_H
+#define _XEN_SWIOTLB_H 1
+
+#include <asm-x86_64/swiotlb.h>
+#include <linux/highmem.h>
+#include <asm/uaccess.h>
+#include <xen/interface/memory.h>
+
+/* Width of DMA addresses. 30 bits is a b44 limitation. */
+#define DEFAULT_DMA_BITS 30
+extern unsigned int dma_bits;
+
+#undef SWIOTLB_ARCH_NEED_ALLOC
+
+#define SWIOTLB_EXTRA_VARIABLES \
+static int __init setup_io_tlb_npages(char *str) \
+{ \
+	if (isdigit(*str)) { \
+		io_tlb_nslabs = simple_strtoul(str, &str, 0); \
+		/* Unlike ia64, the size is aperture in megabytes, not 'slabs'! */ \
+		io_tlb_nslabs <<= (20 - IO_TLB_SHIFT); \
+		/* avoid tail segment of size < IO_TLB_SEGSIZE */ \
+		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); \
+	} \
+	if (*str == ',') \
+		++str; \
+	/* \
+	 * NB. 'force' enables the swiotlb, but doesn't force its use for \
+	 * every DMA like it does on native Linux. 'off' forcibly disables \
+	 * use of the swiotlb. \
+	 */ \
+	if (!strcmp(str, "force")) \
+		swiotlb_force = 1; \
+	else if (!strcmp(str, "off")) \
+		swiotlb_force = -1; \
+	return 1; \
+} \
+int swiotlb; \
+EXPORT_SYMBOL(swiotlb)
+#define SWIOTLB_ARCH_HAS_SETUP_IO_TLB_NPAGES
+
+#define swiotlb_adjust_size(size) do { \
+	/* Round up to power of two (xen_create_contiguous_region). */ \
+	while (size & (size - 1)) \
+		size += size & ~(size - 1); \
+} while (0)
+
+#define swiotlb_adjust_seg(start, size) do { \
+	int rc = xen_create_contiguous_region( \
+		(unsigned long)(start), \
+		get_order(size), \
+		dma_bits); \
+	BUG_ON(rc); \
+} while (0)
+
+#define swiotlb_print_info(bytes) \
+	printk(KERN_INFO "Software IO TLB enabled: \n" \
+	       " Aperture:     %lu megabytes\n" \
+	       " Kernel range: 0x%016lx - 0x%016lx\n" \
+	       " Address size: %u bits\n", \
+	       bytes >> 20, \
+	       (unsigned long)io_tlb_start, \
+	       (unsigned long)io_tlb_end, \
+	       dma_bits)
+
+#define __swiotlb_init_with_default_size(defsz) do { \
+	size_t size = (defsz); \
+	if (swiotlb_force == 1) { \
+		swiotlb = 1; \
+	} else if ((swiotlb_force != -1) && \
+		   is_running_on_xen() && \
+		   is_initial_xendomain()) { \
+		/* Domain 0 always has a swiotlb. */ \
+		long ram_end = HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL); \
+		if (ram_end <= 0x7ffff) \
+			size = 2 * (1 << 20); /* 2MB on <2GB systems. */ \
+		swiotlb = 1; \
+	} \
+	if (swiotlb) \
+		swiotlb_init_with_default_size(size); \
+	else \
+		printk(KERN_INFO "Software IO TLB disabled\n"); \
+} while(0)
+
+#define range_needs_mapping range_straddles_page_boundary
+#define order_needs_mapping(order) ((order) != 0)
+#define SWIOTLB_ARCH_HAS_NEEDS_MAPPING
+
+typedef struct {
+	struct page *page;
+	unsigned int offset;
+} io_tlb_addr_t;
+#define SWIOTLB_ARCH_HAS_IO_TLB_ADDR_T
+#define swiotlb_orig_addr_null(buffer) (!(buffer).page)
+#define ptr_to_io_tlb_addr(ptr) ({ \
+	io_tlb_addr_t __buf; \
+	if (ptr) \
+		__buf.page = virt_to_page(ptr); \
+	else \
+		__buf.page = NULL; \
+	__buf.offset = (unsigned long)(ptr) & ~PAGE_MASK; \
+	__buf; \
+})
+#define page_to_io_tlb_addr(pg, off) ({ \
+	io_tlb_addr_t __buf; \
+	__buf.page   = pg; \
+	__buf.offset = off; \
+	__buf; \
+})
+#define sg_to_io_tlb_addr(sg) ({ \
+	io_tlb_addr_t __buf; \
+	__buf.page   = (sg)->page; \
+	__buf.offset = (sg)->offset; \
+	__buf; \
+})
+
+/*
+ * We use __copy_to_user_inatomic to transfer to the host buffer because the
+ * buffer may be mapped read-only (e.g., in the blkback driver) but lower-level
+ * drivers map the buffer for DMA_BIDIRECTIONAL access. This causes an
+ * unnecessary copy from the aperture to the host buffer, and a page fault.
+ */
+static inline void
+__swiotlb_arch_sync_single(io_tlb_addr_t buffer, char *dma_addr, size_t size, int dir)
+{
+	if (PageHighMem(buffer.page)) {
+		size_t len, bytes;
+		char *dev, *host, *kmp;
+		len = size;
+		while (len != 0) {
+			if (((bytes = len) + buffer.offset) > PAGE_SIZE)
+				bytes = PAGE_SIZE - buffer.offset;
+			kmp  = kmap_atomic(buffer.page, KM_SWIOTLB);
+			dev  = dma_addr + size - len;
+			host = kmp + buffer.offset;
+			if (dir == DMA_FROM_DEVICE) {
+				if (__copy_to_user_inatomic(host, dev, bytes))
+					/* inaccessible */;
+			} else
+				memcpy(dev, host, bytes);
+			kunmap_atomic(kmp, KM_SWIOTLB);
+			len -= bytes;
+			buffer.page++;
+			buffer.offset = 0;
+		}
+	} else {
+		char *host = (char *)phys_to_virt(
+			page_to_pseudophys(buffer.page)) + buffer.offset;
+		if (dir == DMA_FROM_DEVICE) {
+			if (__copy_to_user_inatomic(host, dma_addr, size))
+				/* inaccessible */;
+		} else if (dir == DMA_TO_DEVICE)
+			memcpy(dma_addr, host, size);
+	}
+}
+#define SWIOTLB_ARCH_HAS_SYNC_SINGLE
+
+#define SWIOTLB_ARCH_NEED_MAP_PAGE
+
+#define __swiotlb_dma_supported(hwdev, mask) ((((u64)1 << dma_bits) - 1) <= (mask))
+
+#endif /* _XEN_SWIOTLB_H */
Index: sle10-sp1-2006-12-21/lib/Makefile
===================================================================
--- sle10-sp1-2006-12-21.orig/lib/Makefile	2006-12-21 16:13:03.000000000 +0100
+++ sle10-sp1-2006-12-21/lib/Makefile	2006-12-21 11:01:12.000000000 +0100
@@ -49,7 +49,6 @@ obj-$(CONFIG_SGRB)	 += sgrb.o
 obj-$(CONFIG_STATISTICS) += statistic.o
 
 obj-$(CONFIG_SWIOTLB) += swiotlb.o
-swiotlb-$(CONFIG_XEN) := ../arch/i386/kernel/swiotlb.o
 
 hostprogs-y	:= gen_crc32table
 clean-files	:= crc32table.h


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: x86 swiotlb questions
@ 2006-12-22 16:20 Jan Beulich
  2006-12-22 21:00 ` Herbert Xu
  2006-12-23  9:48 ` Keir Fraser
  0 siblings, 2 replies; 29+ messages in thread
From: Jan Beulich @ 2006-12-22 16:20 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Herbert Xu

One more thing: Is it really necessary to restrict dma_alloc_coherent() to dma_bits?
I.e., couldn't we, once the bit-level page allocator is merged, use the real bit width
needed for the requesting device here? If the restriction is not needed, this would then permit using the
original implementation of swiotlb_dma_supported() (as dma_alloc_coherent() then
no longer depends on dma_bits), and perhaps even auto-setting dma_bits based
on what memory we can get out of Xen in swiotlb_init(), making the mismatching of
command line options (between Xen and kernel) impossible (the kernel simply
wouldn't have one anymore).

As a nice side effect, using the original implementation of swiotlb_dma_supported()
would require slightly less tweaking of lib/swiotlb.c, hence slightly raising the
chances of the changes getting accepted into mainline. And clearly, if the kernel
manages to allocate the swiotlb at an address with less than dma_bits bits, there
seems to be no reason to refuse use of I/O devices that the actual buffer fits, but
dma_bits doesn't.
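
To make the difference concrete, here is a minimal user-space sketch in C of
the two checks being compared. The names supported_by_dma_bits,
supported_by_buffer_end and highest_buf_addr are illustrative assumptions, not
kernel symbols. The first function mirrors the mask comparison in
__swiotlb_dma_supported() from the patch; the second is only a guess at what an
address-based test could look like, checking the device mask against the
highest address the bounce buffer actually occupies.

```c
#include <stdint.h>

/*
 * Mask-based check: the device must be able to address everything
 * below 2^dma_bits. Assumes dma_bits < 64.
 */
static int supported_by_dma_bits(unsigned int dma_bits, uint64_t mask)
{
	return (((uint64_t)1 << dma_bits) - 1) <= mask;
}

/*
 * Address-based check (hypothetical): the device only has to reach
 * the last byte the bounce buffer really uses.
 */
static int supported_by_buffer_end(uint64_t highest_buf_addr, uint64_t mask)
{
	return highest_buf_addr <= mask;
}
```

With dma_bits at 30, a device with a 28-bit mask is refused by the first check
but accepted by the second whenever the buffer happens to end below 256MB.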

Jan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: x86 swiotlb questions
  2006-12-22 16:20 Jan Beulich
@ 2006-12-22 21:00 ` Herbert Xu
  2006-12-23  9:48 ` Keir Fraser
  1 sibling, 0 replies; 29+ messages in thread
From: Herbert Xu @ 2006-12-22 21:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Keir Fraser

On Fri, Dec 22, 2006 at 04:20:28PM +0000, Jan Beulich wrote:
> One more thing: Is it really necessary to restrict dma_alloc_coherent() to dma_bits?
> I.e., couldn't we, once the bit-level page allocator is merged, use the real bit width
> needed for the requesting device here? If not, this would then permit using the

Yes I think that should work.  If the width is less than what the
hypervisor supports it'll simply fail which is better than returning
the wrong memory silently.
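
A minimal sketch of the "fail rather than return the wrong memory" behaviour, as a user-space model rather than kernel code. The names fits_in_bits() and validate_coherent_alloc() are made up for illustration and are not part of any Xen or Linux API; the machine address is assumed to be whatever the allocator handed back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if every byte of [addr, addr + size) is reachable with 'bits'
 * address bits. Hypothetical helper, not a kernel function. */
static bool fits_in_bits(uint64_t addr, uint64_t size, unsigned int bits)
{
    uint64_t limit;

    assert(size > 0);
    /* Highest address a device with 'bits' address lines can reach. */
    limit = bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
    /* Written so that addr + size cannot overflow. */
    return addr <= limit && size - 1 <= limit - addr;
}

/* Returns 0 if the allocation is usable by the device, -1 otherwise.
 * The point is that an unsuitable buffer is rejected explicitly
 * instead of being handed to the driver. */
static int validate_coherent_alloc(uint64_t machine_addr, uint64_t size,
                                   unsigned int device_bits)
{
    return fits_in_bits(machine_addr, size, device_bits) ? 0 : -1;
}
```

With a 30-bit device, a page ending exactly at 0x3FFFFFFF passes, while one starting at 0x40000000 is refused.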

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: x86 swiotlb questions
  2006-12-22 16:20 Jan Beulich
  2006-12-22 21:00 ` Herbert Xu
@ 2006-12-23  9:48 ` Keir Fraser
  1 sibling, 0 replies; 29+ messages in thread
From: Keir Fraser @ 2006-12-23  9:48 UTC (permalink / raw)
  To: Jan Beulich, Keir Fraser; +Cc: xen-devel, Herbert Xu

Firstly, I think we should wait and see if the patches are acceptable
upstream in their current form before switching to using them.

As for dma_alloc_coherent() -- yes it makes sense to make an allocation
request with the device's specified bit width. And we could be opportunistic
about the bit width we advertise from swiotlb if we happen to get lower
memory than we asked for. *but*:
 1. Our swiotlb is made up of separately allocated strides, so the swiotlb
is not contiguous in machine memory. That needs to be kept in mind when
calculating the bit width as it'll be max over all strides.
 2. Also because of this, the existing swiotlb_dma_supported() cannot work
as is. Firstly it would need to use virt_to_machine(), and even then it
doesn't take into account that the aperture is not contiguous in machine
memory.
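
Point 1 can be sketched as follows, assuming each stride is described by a machine start address and a size. The struct and function names are hypothetical and this is a user-space model, not the Xen swiotlb code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one separately allocated stride. */
struct stride {
    uint64_t machine_start; /* first machine address of the stride */
    uint64_t size;          /* length in bytes, assumed non-zero */
};

/* Number of address bits needed to reach the byte at 'addr'. */
static unsigned int bits_for_addr(uint64_t addr)
{
    unsigned int bits = 0;

    while (addr) {
        bits++;
        addr >>= 1;
    }
    return bits;
}

/* Width that could be advertised for the whole aperture: the maximum
 * over all strides, since they are not contiguous in machine memory. */
static unsigned int aperture_dma_bits(const struct stride *s, size_t n)
{
    unsigned int max_bits = 0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++) {
        /* The last byte of the stride decides how many bits it needs. */
        unsigned int bits = bits_for_addr(s[i].machine_start + s[i].size - 1);

        if (bits > max_bits)
            max_bits = bits;
    }
    return max_bits;
}
```

A single stride that lands just above the 1G boundary raises the result from 30 to 31, regardless of where the other strides are.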

And as I said before, dma_bits will disappear from Xen in due course when
the dma pool goes away and is replaced with something more flexible. The
plan is to leave it (or a similarly-named parameter) in Linux, at least as a
guide to the swiotlb pre-allocation (even if no longer used for
dma_alloc_coherent).

 -- Keir

On 22/12/06 4:20 pm, "Jan Beulich" <jbeulich@novell.com> wrote:

> One more thing: Is it really necessary to restrict dma_alloc_coherent() to
> dma_bits?
> I.e., couldn't we, once the bit-level page allocator is merged, use the real
> bit width
> needed for the requesting device here? If not, this would then permit using
> the
> original implementation of swiotlb_dma_supported() (as dma_alloc_coherent()
> then
> no longer depends on dma_bits), and perhaps even auto-setting dma_bits based
> on what memory we can get out of Xen in swiotlb_init(), making the mismatching
> of
> command line options (between Xen and kernel) impossible (the kernel simply
> wouldn't have one anymore).
> 
> As a nice side effect, using the original implementation of
> swiotlb_dma_supported()
> would require slightly less tweaking of lib/swiotlb.c, hence slightly raising
> the
> chances of the changes getting accepted into mainline. And clearly, if the
> kernel
> manages to allocate the swiotlb at an address with less than dma_bits bits,
> there
> seems to be no reason to refuse use of I/O devices that the actual buffer
> fits, but
> dma_bits doesn't.

* Re: x86 swiotlb questions
  2006-12-22 14:49 Jan Beulich
@ 2006-12-25  4:50 ` Muli Ben-Yehuda
  2006-12-25 10:20   ` Keir Fraser
  0 siblings, 1 reply; 29+ messages in thread
From: Muli Ben-Yehuda @ 2006-12-25  4:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Keir Fraser

On Fri, Dec 22, 2006 at 02:49:53PM +0000, Jan Beulich wrote:

> Patch update, fixing a bug on x86/PAE, and making
> include/xen/swiotlb.h look a lot nicer (but still not really
> nice). My plan is to submit the non-Xen ones to lkml right after New
> Year, unless I hear negative feedback.

> >Patch order is
> >swiotlb-bugs.patch
> >swiotlb-bus.patch
> >swiotlb-cleanup.patch
> >swiotlb-split.patch
> >xen-swiotlb.patch

Comments inline.

swiotlb-bugs.patch:

[snip]

>  /*
> @@ -758,8 +739,10 @@ swiotlb_sync_sg(struct device *hwdev, st
>  
>  	for (i = 0; i < nelems; i++, sg++)
>  		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
> -			sync_single(hwdev, (void *) sg->dma_address,
> +			sync_single(hwdev, phys_to_virt(sg->dma_address),
>  				    sg->dma_length, dir, target);

Fix looks correct and bug looks painful. I think you should send this
one to mainline immediately.

swiotlb-bus.patch:

> Convert all phys_to_virt/virt_to_phys uses to
> bus_to_virt/virt_to_bus.

"... because Xen needs it", otherwise someone is bound to ask why, as
all other archs define _bus as _phys.

> Signed-off-by: Jan Beulich <jbeulich@novell.com>

Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>

swiotlb-cleanup:

> This patch
> - adds proper __init decoration to swiotlb's init code (and the code calling
>   it, where not already the case)
> - replaces uses of 'unsigned long' with dma_addr_t where appropriate
> - does miscellaneous simplification and cleanup
> 
> Signed-off-by: Jan Beulich <jbeulich@novell.com>

Looks good in general, not acking yet because I want to give it a spin
first.

swiotlb-split.patch:

> This patch adds abstraction so that the file can be used by environments other
> than IA64 and EM64T, namely for Xen.

I'm sorry, but this patch is horrible. swiotlb.c is now pretty much
unreadable. I'd be surprised if mainline accepted it - I would
certainly NAK it with my mainline hat on, especially for an unmerged
architecture.

If Xen needs so many "abstractions", I have to ask whether it isn't
better off just using its own swiotlb.c as we are now.

I'll take another look at this later and try to come up with a
different way of merging them that isn't quite this horrible. Maybe
using function pointers for the "low level" operations?
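
A rough sketch of what the function-pointer idea might look like, as a user-space model. All names here (struct lowlevel_ops, map_single_sketch(), the two ops tables) are invented for illustration; nothing in this thread defines such an interface, and the "shifted" translation is only a stand-in for an environment where bus addresses differ from virtual ones.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t bus_addr_t; /* stand-in for dma_addr_t */

/* Hypothetical table of the "low level" operations that differ
 * between environments. */
struct lowlevel_ops {
    bus_addr_t (*virt_to_bus)(const void *vaddr);
    void (*bounce)(void *dst, const void *src, size_t len);
};

static bus_addr_t identity_virt_to_bus(const void *vaddr)
{
    return (bus_addr_t)(uintptr_t)vaddr;
}

/* Made-up translation that adds a fixed offset, standing in for an
 * environment with a non-identity address mapping. */
static bus_addr_t shifted_virt_to_bus(const void *vaddr)
{
    return (bus_addr_t)(uintptr_t)vaddr + 0x1000;
}

static void memcpy_bounce(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

static const struct lowlevel_ops native_ops = {
    .virt_to_bus = identity_virt_to_bus,
    .bounce = memcpy_bounce,
};

static const struct lowlevel_ops shifted_ops = {
    .virt_to_bus = shifted_virt_to_bus,
    .bounce = memcpy_bounce,
};

/* Generic code: copy the caller's buffer into a bounce slot and return
 * the slot's bus address, without knowing which environment it runs in. */
static bus_addr_t map_single_sketch(const struct lowlevel_ops *ops,
                                    void *slot, const void *buf, size_t len)
{
    assert(ops && ops->virt_to_bus && ops->bounce);
    ops->bounce(slot, buf, len);
    return ops->virt_to_bus(slot);
}
```

The generic mapping routine stays free of #ifdefs; only the ops table passed in changes.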

Cheers,
Muli

* Re: x86 swiotlb questions
  2006-12-25  4:50 ` Muli Ben-Yehuda
@ 2006-12-25 10:20   ` Keir Fraser
  0 siblings, 0 replies; 29+ messages in thread
From: Keir Fraser @ 2006-12-25 10:20 UTC (permalink / raw)
  To: Muli Ben-Yehuda, Jan Beulich; +Cc: xen-devel, Keir Fraser

On 25/12/06 4:50 am, "Muli Ben-Yehuda" <muli@il.ibm.com> wrote:

> I'm sorry, but this patch is horrible. swiotlb.c is now pretty much
> unreadable. I'd be surprised if mainline accepted it - I would
> certainly NAK it with my mainline hat on, especially for an unmerged
> architecture.
> 
> If Xen needs so many "abstractions", I have to ask whether it isn't
> better off just using its own swiotlb.c as we are now.
> 
> I'll take another look at this later and try to come up with a
> different way of merging them that isn't quite this horrible. Maybe
> using function pointers for the "low level" operations?

A lot of this ugliness comes from swiotlb_[un]map_page, the introduction of
a structural 'io_tlb_addr_t', and a more complicated synchronisation
function than memcpy (which uses copy_to_user and kmap).

However, I don't remember *why* we need these Xen-specific customisations
(even though I was the one who originally made them, way back). That given,
it would probably be sensible to make a more brutal merge that discards Xen
differences which we cannot explain. For example:

 * Why can't we turn dma_[un]map_page into dma_[un]map_single, as x86_64
does? This would avoid needing to expand the swiotlb api.

 * Why can't we store virtual addresses in the io_tlb_orig_addr[] array just
like lib/swiotlb.c, given that the native swiotlb is happy to use
page_address() on any 'struct page' that is passed to it? I can't see why we
actually need KM_SWIOTLB and to use kmap_atomic.

 * If we make the above changes, do we need special sync_single() any more?
We'll no longer need kmap_atomic, but we'll still have
copy_to_user_inatomic, due to abuses of the DMA API by the block-device
subsystem. Maybe that has been fixed already? Or maybe, in merging this
upstream, we should aim to fix the block-device drivers rather than work
around?

Even if we wait for some of the patches to be merged upstream before
applying the swiotlb changes to our own Linux tree, I'd consider patches
just to remove the above differences relative to lib/swiotlb.c. This might
reduce the diff but would also let us get some testing of these swiotlb
simplifications.

The differences I was expecting to need to make were just the following:
 * Initialisation time -- we want to be able to automatically conditionally
initialise the swiotlb, and allocate the aperture contents in a special way.
 * Usage -- 'force' has different semantics for Xen (means forcibly
allocate a swiotlb, but not use it on every device access whether it is
needed or not). We want to be able to use the swiotlb only when it is needed
for a particular device access; this may in particular require special
treatment of the unmap api calls to decide whether or not the given
dma_address resides in the swiotlb aperture.
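
The unmap-side decision could be modelled roughly like this, assuming the aperture is a small set of strides that are not contiguous in machine memory. The names are hypothetical and this is a user-space sketch, not the actual Xen implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one stride of the aperture. */
struct aperture_stride {
    uint64_t machine_start; /* first machine address of the stride */
    uint64_t size;          /* length in bytes */
};

/* True if 'dma_addr' lies inside any stride, i.e. the mapping went
 * through the bounce buffer and has to be synced and released on unmap.
 * A single start/end comparison is not enough when strides are scattered. */
static bool in_swiotlb_aperture(const struct aperture_stride *s, size_t n,
                                uint64_t dma_addr)
{
    size_t i;

    assert(s != NULL || n == 0);
    for (i = 0; i < n; i++) {
        /* Subtract first so the comparison cannot overflow. */
        if (dma_addr >= s[i].machine_start &&
            dma_addr - s[i].machine_start < s[i].size)
            return true;
    }
    return false;
}
```

An address that falls in the gap between two strides is treated as a direct mapping and is left alone.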

 -- Keir

* Re: x86 swiotlb questions
@ 2006-12-30 17:32 Jan Beulich
  2006-12-30 17:47 ` Keir Fraser
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2006-12-30 17:32 UTC (permalink / raw)
  To: Keir.Fraser, muli; +Cc: xen-devel, keir

>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 12/25/06 11:20 AM >>>
>On 25/12/06 4:50 am, "Muli Ben-Yehuda" <muli@il.ibm.com> wrote:
>
>> I'm sorry, but this patch is horrible. swiotlb.c is now pretty much
>> unreadable. I'd be surprised if mainline accepted it - I would
>> certainly NAK it with my mainline hat on, especially for an unmerged
>> architecture.
>> 
>> If Xen needs so many "abstractions", I have to ask whether it isn't
>> better off just using its own swiotlb.c as we are now.
>> 
>> I'll take another look at this later and try to come up with a
>> different way of merging them that isn't quite this horrible. Maybe
>> using function pointers for the "low level" operations?
>
>A lot of this ugliness comes from swiotlb_[un]map_page, the introduction of
>a structural 'io_tlb_addr_t', and a more complicated synchronisation
>function than memcpy (which uses copy_to_user and kmap).
>
>However, I don't remember *why* we need these Xen-specific customisations
>(even though I was the one who originally made them, way back). That given,
>it would probably be sensible to make a more brutal merge that discards Xen
>differences which we cannot explain. For example:
>
> * Why can't we turn dma_[un]map_page into dma_[un]map_single, as x86_64
>does? This would avoid needing to expand the swiotlb api.

Because we allow highmem pages in the I/O path, hence page_address() cannot
be used. As you may have concluded from my sending of a second rev of the
patches, I had a bug in exactly that path, so I know it is being exercised.
Of course, all this exists for x86-32/PAE *only*, so it may be valid to
raise the question if it's worth it. But otoh with supporting (only) 32-bit
PAE PV guests on x86-64 we are in the process of widening the use case here.
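
To illustrate the highmem point, here is a small user-space model. It only mimics the shape of page_address() and kmap_atomic(); the model_* names, the page size and the struct layout are assumptions made for this sketch and do not correspond to kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64 /* tiny page size, for the model only */

struct model_page {
    int is_highmem;                      /* no permanent mapping when set */
    unsigned char data[MODEL_PAGE_SIZE]; /* backing store */
};

/* Models page_address(): yields nothing usable for a highmem page. */
static void *model_page_address(struct model_page *pg)
{
    return pg->is_highmem ? NULL : pg->data;
}

/* Models a temporary atomic mapping, which works for any page. */
static void *model_map_atomic(struct model_page *pg)
{
    return pg->data;
}

static void model_unmap_atomic(void *vaddr)
{
    (void)vaddr; /* nothing to undo in the model */
}

/* Bounce copy into a page at 'offset'. It goes through the temporary
 * mapping, so it works for lowmem and highmem pages alike. */
static void bounce_to_page(struct model_page *pg, size_t offset,
                           const void *src, size_t len)
{
    unsigned char *va;

    assert(offset <= MODEL_PAGE_SIZE && len <= MODEL_PAGE_SIZE - offset);
    va = model_map_atomic(pg);
    memcpy(va + offset, src, len);
    model_unmap_atomic(va);
}
```

A copy routine built on model_page_address() alone would have no address to write to for the highmem case, which is why a plain memcpy() is not sufficient there.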

> * Why can't we store virtual addresses in the io_tlb_orig_addr[] array just
>like lib/swiotlb.c, given that the native swiotlb is happy to use
>page_address() on any 'struct page' that is passed to it? I can't see why we
>actually need KM_SWIOTLB and to use kmap_atomic.

Likewise.

> * If we make the above changes, do we need special sync_single() any more?
>We'll no longer need kmap_atomic, but we'll still have
>copy_to_user_inatomic, due to abuses of the DMA API by the block-device
>subsystem. Maybe that has been fixed already? Or maybe, in merging this
>upstream, we should aim to fix the block-device drivers rather than work
>around?

Fixing the block drivers would certainly be nice (I'm not exactly clear where
this dependency lives on their side), but due to the above we'll be unable to
completely switch over to memcpy() only.

Jan

* Re: x86 swiotlb questions
  2006-12-30 17:32 Jan Beulich
@ 2006-12-30 17:47 ` Keir Fraser
  2007-01-02  8:39   ` Jan Beulich
  2007-01-03  7:10   ` Jan Beulich
  0 siblings, 2 replies; 29+ messages in thread
From: Keir Fraser @ 2006-12-30 17:47 UTC (permalink / raw)
  To: Jan Beulich, Keir.Fraser, muli; +Cc: xen-devel

On 30/12/06 5:32 pm, "Jan Beulich" <jbeulich@novell.com> wrote:

>> * Why can't we turn dma_[un]map_page into dma_[un]map_single, as x86_64
>> does? This would avoid needing to expand the swiotlb api.
> 
> Because we allow highmem pages in the I/O path, hence page_address() cannot
> be used. As you may have concluded from my sending of a second rev of the
> patches, I had a bug in exactly that path, so I know it is being exercised.
> Of course, all this exists for x86-32/PAE *only*, so it may be valid to
> raise the question if it's worth it. But otoh with supporting (only) 32-bit
> PAE PV guests on x86-64 we are in the process of widening the use case here.

Ah of course, the generic swiotlb has not (yet) been used by an architecture
with highmem requiring use of kmap(). I forgot about that.

Unfortunately highmem does rather complicate things -- I guess it's up to
the lib/swiotlb maintainers whether they want to keep that complexity in an
xen-i386-specific swiotlb.c or attempt a merge.

Here's a thought: if the highmem DMA requests come from *only* the blkdev
subsystem, then perhaps we could use its highmem bounce buffer (I think that
still exists?). We turn that off on Xen right now, but we could re-enable
it, leading to a slightly odd 'double bounce buffer': the first taking us
from high pseudophysical memory to low pseudophysical memory, and the second
taking us from high machine memory to low machine memory. If we can ensure
only lowmem requests get to the swiotlb then a lot of the Xen diffs go away.
I'm not sure whether we might get DMA requests from high memory from things
other than block devices though, nor whether all block devices would
actually pass through the blkdev bounce buffer code.

 Thanks,
 Keir

* Re: x86 swiotlb questions
  2006-12-30 17:47 ` Keir Fraser
@ 2007-01-02  8:39   ` Jan Beulich
  2007-01-03  7:10   ` Jan Beulich
  1 sibling, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2007-01-02  8:39 UTC (permalink / raw)
  To: Keir.Fraser; +Cc: Muli Ben-Yehuda, xen-devel

>Here's a thought: if the highmem DMA requests come from *only* the blkdev
>subsystem, then perhaps we could use its highmem bounce buffer (I think that
>still exists?). We turn that off on Xen right now, but we could re-enable
>it, leading to a slightly odd 'double bounce buffer': the first taking us
>from high pseudophysical memory to low pseudophysical memory, and the second
>taking us from high machine memory to low machine memory. If we can ensure
>only lowmem requests get to the swiotlb then a lot of the Xen diffs go away.
>I'm not sure whether we might get DMA requests from high memory from things
>other than block devices though, nor whether all block devices would
>actually pass through the blkdev bounce buffer code.

Proving that no driver ever passes highmem pages into any of the DMA ops would
be difficult at best, but it would be needed, since native i386 has no requirement that
dma_map_sg() or dma_map_page() be called only with non-highmem pages. I
therefore don't think this is a realistic option.

Jan

* Re: x86 swiotlb questions
  2006-12-30 17:47 ` Keir Fraser
  2007-01-02  8:39   ` Jan Beulich
@ 2007-01-03  7:10   ` Jan Beulich
  2007-01-03  9:32     ` Keir Fraser
  1 sibling, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2007-01-03  7:10 UTC (permalink / raw)
  To: Keir.Fraser; +Cc: muli, xen-devel

>Here's a thought: if the highmem DMA requests come from *only* the blkdev
>subsystem, then perhaps we could use its highmem bounce buffer (I think that
>still exists?).

Another, only partially related thought: The addition of KM_SWIOTLB is the only
difference to native kmap_types.h. It would seem to me that, at the price of
disabling interrupts around the use, it should be possible to replace that with
KM_BOUNCE_READ and let go of the Xen specific header...

Jan

* Re: x86 swiotlb questions
  2007-01-03  7:10   ` Jan Beulich
@ 2007-01-03  9:32     ` Keir Fraser
  2007-01-03 11:01       ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Keir Fraser @ 2007-01-03  9:32 UTC (permalink / raw)
  To: Jan Beulich; +Cc: muli, xen-devel

On 3/1/07 07:10, "Jan Beulich" <jbeulich@novell.com> wrote:

>> Here's a thought: if the highmem DMA requests come from *only* the blkdev
>> subsystem, then perhaps we could use its highmem bounce buffer (I think that
>> still exists?).
> 
> Another, only partially related thought: The addition of KM_SWIOTLB is the
> only
> difference to native kmap_types.h. It would seem to me that, at the price of
> disabling interrupts around the use, it should be possible to replace that
> with
> KM_BOUNCE_READ and let go of the Xen specific header...

Sounds reasonable, although KM_BOUNCE_READ is a misnomer (since we write to
the kmap as well as read from it).

 -- Keir

* Re: x86 swiotlb questions
  2007-01-03  9:32     ` Keir Fraser
@ 2007-01-03 11:01       ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2007-01-03 11:01 UTC (permalink / raw)
  To: Keir Fraser; +Cc: muli, xen-devel

>>> Keir Fraser <keir@xensource.com> 03.01.07 10:32 >>>
>On 3/1/07 07:10, "Jan Beulich" <jbeulich@novell.com> wrote:
>> Another, only partially related thought: The addition of KM_SWIOTLB is the
>> only
>> difference to native kmap_types.h. It would seem to me that, at the price of
>> disabling interrupts around the use, it should be possible to replace that
>> with
>> KM_BOUNCE_READ and let go of the Xen specific header...
>
>Sounds reasonable, although KM_BOUNCE_READ is a misnomer (since we write to
>the kmap as well as read from it).

... similar to the (mis-)use of it in the EDAC driver.

Jan

Thread overview: 29+ messages
2006-12-15 12:50 x86 swiotlb questions Jan Beulich
2006-12-15 13:35 ` Keir Fraser
2006-12-15 13:53   ` Jan Beulich
2006-12-15 14:03     ` Keir Fraser
2006-12-15 14:17       ` Jan Beulich
2006-12-15 14:19         ` Keir Fraser
2006-12-15 14:46           ` Jan Beulich
2006-12-15 16:47             ` Keir Fraser
2006-12-15 16:19   ` Alan
2006-12-18  7:44   ` Jan Beulich
2006-12-18  9:39     ` Keir Fraser
2006-12-19 12:48       ` Jan Beulich
2006-12-19 14:14         ` Keir Fraser
2006-12-19 14:39           ` Jan Beulich
2006-12-19 14:46             ` Keir Fraser
2006-12-19 17:07               ` Muli Ben-Yehuda
2006-12-20 16:40       ` Jan Beulich
  -- strict thread matches above, loose matches on Subject: below --
2006-12-22 14:49 Jan Beulich
2006-12-25  4:50 ` Muli Ben-Yehuda
2006-12-25 10:20   ` Keir Fraser
2006-12-22 16:20 Jan Beulich
2006-12-22 21:00 ` Herbert Xu
2006-12-23  9:48 ` Keir Fraser
2006-12-30 17:32 Jan Beulich
2006-12-30 17:47 ` Keir Fraser
2007-01-02  8:39   ` Jan Beulich
2007-01-03  7:10   ` Jan Beulich
2007-01-03  9:32     ` Keir Fraser
2007-01-03 11:01       ` Jan Beulich
