* [PATCH] DMA 4GB boundary protection
@ 2007-03-02 21:49 Jake Moilanen
2007-03-02 22:27 ` Olof Johansson
2007-03-03 23:29 ` Olof Johansson
0 siblings, 2 replies; 25+ messages in thread
From: Jake Moilanen @ 2007-03-02 21:49 UTC (permalink / raw)
To: paulus, olof; +Cc: linuxppc-dev
There are many adapters which can not handle DMAing acrosss any 4 GB
boundary. For instance the latest Emulex adapters.
This normally is not an issue as firmware gives us dma-windows under
4gigs. However, some of the new System-P boxes have dma-windows above
4gigs, and this present a problem.
I propose fixing it in the IOMMU allocation instead of making each
driver protect against it as it is more efficient, and won't require
changing every driver which has not considered this issue.
This patch checks to see if the mapping spans a 4 gig boundary, and if
it does, retries the allocation. It tries the next allocation at the
start of the crossed 4 gig boundary.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
---
arch/powerpc/kernel/iommu.c | 10 ++++++++++
1 files changed, 10 insertions(+)
Index: powerpc/arch/powerpc/kernel/iommu.c
===================================================================
--- powerpc.orig/arch/powerpc/kernel/iommu.c
+++ powerpc/arch/powerpc/kernel/iommu.c
@@ -76,6 +76,7 @@ static unsigned long iommu_range_alloc(s
unsigned int align_order)
{
unsigned long n, end, i, start;
+ unsigned long start_addr, end_addr;
unsigned long limit;
int largealloc = npages > 15;
int pass = 0;
@@ -146,6 +147,15 @@ static unsigned long iommu_range_alloc(s
}
}
+ /* DMA cannot cross 4 GB boundary */
+ start_addr = (n + tbl->it_offset) << PAGE_SHIFT;
+ end_addr = (end + tbl->it_offset) << PAGE_SHIFT;
+ if ((start_addr >> 32) != (end_addr >> 32)) {
+ end_addr &= 0xffffffff00000000l;
+ start = (end_addr >> PAGE_SHIFT) - tbl->it_offset;
+ goto again;
+ }
+
for (i = n; i < end; i++)
if (test_bit(i, tbl->it_map)) {
start = i+1;
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-02 21:49 [PATCH] DMA 4GB boundary protection Jake Moilanen
@ 2007-03-02 22:27 ` Olof Johansson
2007-03-03 8:27 ` Benjamin Herrenschmidt
2007-03-03 23:29 ` Olof Johansson
1 sibling, 1 reply; 25+ messages in thread
From: Olof Johansson @ 2007-03-02 22:27 UTC (permalink / raw)
To: Jake Moilanen; +Cc: linuxppc-dev, paulus
On Fri, Mar 02, 2007 at 03:49:43PM -0600, Jake Moilanen wrote:
> There are many adapters which can not handle DMAing acrosss any 4 GB
> boundary. For instance the latest Emulex adapters.
>
> This normally is not an issue as firmware gives us dma-windows under
> 4gigs. However, some of the new System-P boxes have dma-windows above
> 4gigs, and this present a problem.
>
> I propose fixing it in the IOMMU allocation instead of making each
> driver protect against it as it is more efficient, and won't require
> changing every driver which has not considered this issue.
Sorry, but you need to fix the drivers. They might be used on a system
without an IOMMU, or with direct mapping. It's particularily likely in
the case of DAC-capable devices, and that's the only case for which you
can cross a 4GB boundary in the first place.
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-02 22:27 ` Olof Johansson
@ 2007-03-03 8:27 ` Benjamin Herrenschmidt
2007-03-03 23:25 ` Olof Johansson
2007-03-04 5:17 ` Christoph Hellwig
0 siblings, 2 replies; 25+ messages in thread
From: Benjamin Herrenschmidt @ 2007-03-03 8:27 UTC (permalink / raw)
To: Olof Johansson; +Cc: linuxppc-dev, paulus
On Fri, 2007-03-02 at 16:27 -0600, Olof Johansson wrote:
> On Fri, Mar 02, 2007 at 03:49:43PM -0600, Jake Moilanen wrote:
> > There are many adapters which can not handle DMAing acrosss any 4 GB
> > boundary. For instance the latest Emulex adapters.
> >
> > This normally is not an issue as firmware gives us dma-windows under
> > 4gigs. However, some of the new System-P boxes have dma-windows above
> > 4gigs, and this present a problem.
> >
> > I propose fixing it in the IOMMU allocation instead of making each
> > driver protect against it as it is more efficient, and won't require
> > changing every driver which has not considered this issue.
>
> Sorry, but you need to fix the drivers. They might be used on a system
> without an IOMMU, or with direct mapping. It's particularily likely in
> the case of DAC-capable devices, and that's the only case for which you
> can cross a 4GB boundary in the first place.
Hrm... ok, that would be ideal, _but_
- We currently don't have platforms that DMA > 32 bits and don't have an
iommu
- Drivers are broken -today- and I doubt they can all be audited and
fixed (and fixes bacported to distros) quickly
- That problem is very common and can be very hard to debug when it
happens.
So while I agree, the logical fix would be in the drivers, I think that
-also- making sure the iommu code will not hand off sections crossing 4G
boundaries does make some sense. It will at least reduce the likehood of
strange and hard to debug problems on those setups that do have an iommu
and >4G of RAM, which is pretty much all of them at the moment on ppc64.
Can you double check the correctness of the patch ? I'll have a better
look myself next week hopefully.
Ben.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-03 8:27 ` Benjamin Herrenschmidt
@ 2007-03-03 23:25 ` Olof Johansson
2007-03-04 5:17 ` Christoph Hellwig
1 sibling, 0 replies; 25+ messages in thread
From: Olof Johansson @ 2007-03-03 23:25 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, paulus
On Sat, Mar 03, 2007 at 09:27:31AM +0100, Benjamin Herrenschmidt wrote:
> On Fri, 2007-03-02 at 16:27 -0600, Olof Johansson wrote:
> > On Fri, Mar 02, 2007 at 03:49:43PM -0600, Jake Moilanen wrote:
> > > There are many adapters which can not handle DMAing acrosss any 4 GB
> > > boundary. For instance the latest Emulex adapters.
> > >
> > > This normally is not an issue as firmware gives us dma-windows under
> > > 4gigs. However, some of the new System-P boxes have dma-windows above
> > > 4gigs, and this present a problem.
> > >
> > > I propose fixing it in the IOMMU allocation instead of making each
> > > driver protect against it as it is more efficient, and won't require
> > > changing every driver which has not considered this issue.
> >
> > Sorry, but you need to fix the drivers. They might be used on a system
> > without an IOMMU, or with direct mapping. It's particularily likely in
> > the case of DAC-capable devices, and that's the only case for which you
> > can cross a 4GB boundary in the first place.
>
> Hrm... ok, that would be ideal, _but_
>
> - We currently don't have platforms that DMA > 32 bits and don't have an
> iommu
Up until this week, we didn't have a platform that did DMA > 32, period.
On PPC I suppose there's only one right now that can do it without IOMMU,
and that's 1682M. I'd expect there to be more in the future. On non-PPC
I'm sure there's several.
> - Drivers are broken -today- and I doubt they can all be audited and
> fixed (and fixes bacported to distros) quickly
Yes, and the chance of them getting audited and fixed suddenly drops to
0% once we put this in instead. The same problem applies to this patch
with respect to backporting it to distros.
> - That problem is very common and can be very hard to debug when it
> happens.
So let's just make someone else suffer instead?
> So while I agree, the logical fix would be in the drivers, I think that
> -also- making sure the iommu code will not hand off sections crossing 4G
> boundaries does make some sense. It will at least reduce the likehood of
> strange and hard to debug problems on those setups that do have an iommu
> and >4G of RAM, which is pretty much all of them at the moment on ppc64.
Even though most ppc64 platforms have more than 4GB of ram, the IOMMUs
have up until now not done more than 32-bit addresses. This 4GB boundary
bug won't affect the majority of users out there, only those on platforms
whose firmware gives out address ranges that crosses the 4GB boundary.
Besides virtualization, there's no need to use an IOMMU with devices
that can address the full memory range of the system anyway. In most
cases it just hurts performance.
> Can you double check the correctness of the patch ? I'll have a better
> look myself next week hopefully.
It's incorrect as posted, and I think it can be done in a nicer
manner. I'll post feedback to the patch.
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-02 21:49 [PATCH] DMA 4GB boundary protection Jake Moilanen
2007-03-02 22:27 ` Olof Johansson
@ 2007-03-03 23:29 ` Olof Johansson
2007-03-03 23:32 ` Segher Boessenkool
2007-03-21 21:05 ` Jake Moilanen
1 sibling, 2 replies; 25+ messages in thread
From: Olof Johansson @ 2007-03-03 23:29 UTC (permalink / raw)
To: Jake Moilanen; +Cc: linuxppc-dev, paulus
On Fri, Mar 02, 2007 at 03:49:43PM -0600, Jake Moilanen wrote:
> This normally is not an issue as firmware gives us dma-windows under
> 4gigs. However, some of the new System-P boxes have dma-windows above
> 4gigs, and this present a problem.
Above 4 gigs, or that crosses the 4GB boundary? There's a difference.
> I propose fixing it in the IOMMU allocation instead of making each
> driver protect against it as it is more efficient, and won't require
> changing every driver which has not considered this issue.
The drawback of this patch is that it adds code to every single allocation.
Instead, you should just mark the last entry before the 4GB boundary
as allocated when you setup the bitmaps for the table. That way, no
allocation will ever be able to cross over.
Even nicer would be to only do it when a boot option is specified, so
we actually have a chance to expose and find the driver bugs instead of
papering them over.
Also, remember to use IOMMU_PAGE_SHIFT instead of PAGE_SHIFT, they might
be different.
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-03 23:29 ` Olof Johansson
@ 2007-03-03 23:32 ` Segher Boessenkool
2007-03-03 23:57 ` Olof Johansson
2007-03-21 21:05 ` Jake Moilanen
1 sibling, 1 reply; 25+ messages in thread
From: Segher Boessenkool @ 2007-03-03 23:32 UTC (permalink / raw)
To: Olof Johansson; +Cc: linuxppc-dev, paulus
> The drawback of this patch is that it adds code to every single
> allocation.
> Instead, you should just mark the last entry before the 4GB boundary
> as allocated when you setup the bitmaps for the table. That way, no
> allocation will ever be able to cross over.
Jake said that this bug happens when crossing _any_ 4GB
boundary, so that means reserving a few more blocks.
> Even nicer would be to only do it when a boot option is specified, so
> we actually have a chance to expose and find the driver bugs instead of
> papering them over.
Almost all drivers (*) that can do DAC already avoid
crossing the SAC-to-DAC boundary. I have never heard
about a card having the bug on _any_ 4GB crossing, I
doubt they are that common. Just fix the drivers :-)
(*) Well, amongst those that matter and have this bug.
Segher
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-03 23:32 ` Segher Boessenkool
@ 2007-03-03 23:57 ` Olof Johansson
0 siblings, 0 replies; 25+ messages in thread
From: Olof Johansson @ 2007-03-03 23:57 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: linuxppc-dev, paulus
On Sun, Mar 04, 2007 at 12:32:04AM +0100, Segher Boessenkool wrote:
> > The drawback of this patch is that it adds code to every single
> > allocation.
> > Instead, you should just mark the last entry before the 4GB boundary
> > as allocated when you setup the bitmaps for the table. That way, no
> > allocation will ever be able to cross over.
>
> Jake said that this bug happens when crossing _any_ 4GB
> boundary, so that means reserving a few more blocks.
Yes, sorry. I was presuming that the table provided by firmware was less
than 4GB in size. He would obviously need to mark at each boundary.
> > Even nicer would be to only do it when a boot option is specified, so
> > we actually have a chance to expose and find the driver bugs instead of
> > papering them over.
>
> Almost all drivers (*) that can do DAC already avoid
> crossing the SAC-to-DAC boundary. I have never heard
> about a card having the bug on _any_ 4GB crossing, I
> doubt they are that common. Just fix the drivers :-)
I can see the point with dealig with backports. Having a boot option
that is disabled by default would help working around it until a fix is
available, but it'd still help expose driver bugs.
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-03 8:27 ` Benjamin Herrenschmidt
2007-03-03 23:25 ` Olof Johansson
@ 2007-03-04 5:17 ` Christoph Hellwig
2007-03-04 5:52 ` Olof Johansson
1 sibling, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2007-03-04 5:17 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Olof Johansson, linuxppc-dev, paulus
On Sat, Mar 03, 2007 at 09:27:31AM +0100, Benjamin Herrenschmidt wrote:
> - Drivers are broken -today- and I doubt they can all be audited and
> fixed (and fixes bacported to distros) quickly
Which drivers? All requests coming from the block layer make sure
you never span the 4GB boundary, and IIRC that same is true for
networking. Similarly dma_alloc_coherent and pci_alloc_consinstant
only allocate from the lower 4G unless told otherwise. A driver writer
would have to be extra-ordinarily stuid to get this wrong. (And yes,
some driver writers are..)
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-04 5:17 ` Christoph Hellwig
@ 2007-03-04 5:52 ` Olof Johansson
0 siblings, 0 replies; 25+ messages in thread
From: Olof Johansson @ 2007-03-04 5:52 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: paulus, linuxppc-dev
On Sun, Mar 04, 2007 at 06:17:38AM +0100, Christoph Hellwig wrote:
> On Sat, Mar 03, 2007 at 09:27:31AM +0100, Benjamin Herrenschmidt wrote:
> > - Drivers are broken -today- and I doubt they can all be audited and
> > fixed (and fixes bacported to distros) quickly
>
> Which drivers? All requests coming from the block layer make sure
> you never span the 4GB boundary, and IIRC that same is true for
> networking. Similarly dma_alloc_coherent and pci_alloc_consinstant
> only allocate from the lower 4G unless told otherwise.
Since the IOMMU layer adds a completely independent address space, any
mapping might end up straddling the boundary if you're unlucky. But, with
the above info, adding the restrictions in the iommu code seems safer --
at least it won't cause problems on non-translated enviromnents if the
block and network code makes sure it can't happen there.
> A driver writer
> would have to be extra-ordinarily stuid to get this wrong. (And yes,
> some driver writers are..)
Any pci_map_single or friends should be able to end up there based on
how the iommu allocator works. It doesn't take into consideration any
kind of alignment of the physical regions being mapped.
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-03 23:29 ` Olof Johansson
2007-03-03 23:32 ` Segher Boessenkool
@ 2007-03-21 21:05 ` Jake Moilanen
2007-03-21 21:39 ` Olof Johansson
2007-03-22 17:53 ` Olof Johansson
1 sibling, 2 replies; 25+ messages in thread
From: Jake Moilanen @ 2007-03-21 21:05 UTC (permalink / raw)
To: Olof Johansson; +Cc: linuxppc-dev, paulus
> > I propose fixing it in the IOMMU allocation instead of making each
> > driver protect against it as it is more efficient, and won't require
> > changing every driver which has not considered this issue.
>
> The drawback of this patch is that it adds code to every single allocation.
> Instead, you should just mark the last entry before the 4GB boundary
> as allocated when you setup the bitmaps for the table. That way, no
> allocation will ever be able to cross over.
Agreed. While it's not an issue at 4KB, when we go to say 64k, or 16MB
TCE sizes, this extra space will be wasted. But, I believe we'll need a
little code for this anyways.
> Even nicer would be to only do it when a boot option is specified, so
> we actually have a chance to expose and find the driver bugs instead of
> papering them over.
Done.
> Also, remember to use IOMMU_PAGE_SHIFT instead of PAGE_SHIFT, they might
> be different.
Yup...
Here's a patch addressing Olof's concerns.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
---
arch/powerpc/kernel/iommu.c | 32 +++++++++++++++++++++++++++++++-
1 files changed, 31 insertions(+), 1 deletion(-)
Index: powerpc/arch/powerpc/kernel/iommu.c
===================================================================
--- powerpc.orig/arch/powerpc/kernel/iommu.c
+++ powerpc/arch/powerpc/kernel/iommu.c
@@ -47,6 +47,8 @@ static int novmerge = 0;
static int novmerge = 1;
#endif
+static int protect4gb = 1;
+
static inline unsigned long iommu_num_pages(unsigned long vaddr,
unsigned long slen)
{
@@ -58,6 +60,16 @@ static inline unsigned long iommu_num_pa
return npages;
}
+static int __init setup_protect4gb(char *str)
+{
+ if (strcmp(str, "on") == 0)
+ protect4gb = 1;
+ else if (strcmp(str, "off") == 0)
+ protect4gb = 0;
+
+ return 1;
+}
+
static int __init setup_iommu(char *str)
{
if (!strcmp(str, "novmerge"))
@@ -67,6 +79,7 @@ static int __init setup_iommu(char *str)
return 1;
}
+__setup("protect4gb=", setup_protect4gb);
__setup("iommu=", setup_iommu);
static unsigned long iommu_range_alloc(struct iommu_table *tbl,
@@ -429,6 +442,8 @@ void iommu_unmap_sg(struct iommu_table *
struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
{
unsigned long sz;
+ unsigned long start_addr, end_addr;
+ unsigned long index;
static int welcomed = 0;
struct page *page;
@@ -450,7 +465,7 @@ struct iommu_table *iommu_init_table(str
#ifdef CONFIG_CRASH_DUMP
if (ppc_md.tce_get) {
- unsigned long index, tceval;
+ unsigned long tceval;
unsigned long tcecount = 0;
/*
@@ -480,6 +495,21 @@ struct iommu_table *iommu_init_table(str
ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
#endif
+ /*
+ * DMA cannot cross 4 GB boundary. Mark first entry of each 4
+ * GB chunk as reserved.
+ */
+ if (protect4gb) {
+ start_addr = tbl->it_offset << IOMMU_PAGE_SHIFT;
+ /* go up to next 4GB boundary */
+ start_addr = (start_addr + 0x00000000ffffffffl) >> 32;
+ end_addr = (tbl->it_offset + tbl->it_size) << IOMMU_PAGE_SHIFT;
+ for (index = start_addr; index < end_addr; index += (1l << 32)) {
+ /* Reserve 4GB entry */
+ __set_bit((index >> IOMMU_PAGE_SHIFT) - tbl->it_offset,
tbl->it_map);
+ }
+ }
+
if (!welcomed) {
printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
novmerge ? "disabled" : "enabled");
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-21 21:05 ` Jake Moilanen
@ 2007-03-21 21:39 ` Olof Johansson
2007-03-22 17:53 ` Olof Johansson
1 sibling, 0 replies; 25+ messages in thread
From: Olof Johansson @ 2007-03-21 21:39 UTC (permalink / raw)
To: Jake Moilanen; +Cc: linuxppc-dev, paulus
On Wed, Mar 21, 2007 at 04:05:48PM -0500, Jake Moilanen wrote:
> > > I propose fixing it in the IOMMU allocation instead of making each
> > > driver protect against it as it is more efficient, and won't require
> > > changing every driver which has not considered this issue.
> >
> > The drawback of this patch is that it adds code to every single allocation.
> > Instead, you should just mark the last entry before the 4GB boundary
> > as allocated when you setup the bitmaps for the table. That way, no
> > allocation will ever be able to cross over.
>
> Agreed. While it's not an issue at 4KB, when we go to say 64k, or 16MB
> TCE sizes, this extra space will be wasted. But, I believe we'll need a
> little code for this anyways.
Doing 16MB mappings will have it's challenges anyway, especially for
things such as network and file I/O. It's similar to what happens for
64kB base page size and the buffer cache: lots and lots of waste, and
the risk of running out of address space is high (depending on how big
the window is per device, of course).
> > Even nicer would be to only do it when a boot option is specified, so
> > we actually have a chance to expose and find the driver bugs instead of
> > papering them over.
>
> Done.
>
> > Also, remember to use IOMMU_PAGE_SHIFT instead of PAGE_SHIFT, they might
> > be different.
>
> Yup...
>
> Here's a patch addressing Olof's concerns.
Thanks, I'll have a closer look at it tonight.
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-22 17:53 ` Olof Johansson
@ 2007-03-22 17:47 ` Jake Moilanen
2007-03-22 22:52 ` Segher Boessenkool
2007-03-27 20:10 ` Jake Moilanen
2 siblings, 0 replies; 25+ messages in thread
From: Jake Moilanen @ 2007-03-22 17:47 UTC (permalink / raw)
To: Olof Johansson; +Cc: linuxppc-dev, paulus
On Thu, 2007-03-22 at 12:53 -0500, Olof Johansson wrote:
> On Wed, Mar 21, 2007 at 04:05:48PM -0500, Jake Moilanen wrote:
>
> > @@ -480,6 +495,21 @@ struct iommu_table *iommu_init_table(str
> > ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
> > #endif
> >
> > + /*
> > + * DMA cannot cross 4 GB boundary. Mark first entry of each 4
> > + * GB chunk as reserved.
> > + */
> > + if (protect4gb) {
> > + start_addr = tbl->it_offset << IOMMU_PAGE_SHIFT;
> > + /* go up to next 4GB boundary */
> > + start_addr = (start_addr + 0x00000000ffffffffl) >> 32;
> > + end_addr = (tbl->it_offset + tbl->it_size) << IOMMU_PAGE_SHIFT;
> > + for (index = start_addr; index < end_addr; index += (1l << 32)) {
> > + /* Reserve 4GB entry */
> > + __set_bit((index >> IOMMU_PAGE_SHIFT) - tbl->it_offset,
> > tbl->it_map);
> > + }
> > + }
> > +
>
>
> This is done a bit more complicated than it has to be. The >> 32 is a
> red flag as well.
>
> I would also like it to be the last page in the range, not the first
> (since otherwise you'll reserve even if the window is less than 4GB.
>
> Something like (untested):
>
> entries_per_4g = 0x100000000 >> IOMMU_PAGE_SHIFT;
>
> /* Mark the last bit before a 4GB boundary as used */
> start_index = tbl->it_offset | (entries_per_4g - 1);
> end_index = tbl->it_offset + tbl->it_size;
>
> for (index = start_index; index < end_index; index += entries_per_4g)
> __set_bit(index, tbl->it_map);
>
> Is easier to follow.
Yup, I like that better. I'll give it a run.
Jake
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-21 21:05 ` Jake Moilanen
2007-03-21 21:39 ` Olof Johansson
@ 2007-03-22 17:53 ` Olof Johansson
2007-03-22 17:47 ` Jake Moilanen
` (2 more replies)
1 sibling, 3 replies; 25+ messages in thread
From: Olof Johansson @ 2007-03-22 17:53 UTC (permalink / raw)
To: Jake Moilanen; +Cc: linuxppc-dev, paulus
On Wed, Mar 21, 2007 at 04:05:48PM -0500, Jake Moilanen wrote:
> @@ -480,6 +495,21 @@ struct iommu_table *iommu_init_table(str
> ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
> #endif
>
> + /*
> + * DMA cannot cross 4 GB boundary. Mark first entry of each 4
> + * GB chunk as reserved.
> + */
> + if (protect4gb) {
> + start_addr = tbl->it_offset << IOMMU_PAGE_SHIFT;
> + /* go up to next 4GB boundary */
> + start_addr = (start_addr + 0x00000000ffffffffl) >> 32;
> + end_addr = (tbl->it_offset + tbl->it_size) << IOMMU_PAGE_SHIFT;
> + for (index = start_addr; index < end_addr; index += (1l << 32)) {
> + /* Reserve 4GB entry */
> + __set_bit((index >> IOMMU_PAGE_SHIFT) - tbl->it_offset,
> tbl->it_map);
> + }
> + }
> +
This is done a bit more complicated than it has to be. The >> 32 is a
red flag as well.
I would also like it to be the last page in the range, not the first
(since otherwise you'll reserve even if the window is less than 4GB.
Something like (untested):
entries_per_4g = 0x100000000 >> IOMMU_PAGE_SHIFT;
/* Mark the last bit before a 4GB boundary as used */
start_index = tbl->it_offset | (entries_per_4g - 1);
end_index = tbl->it_offset + tbl->it_size;
for (index = start_index; index < end_index; index += entries_per_4g)
__set_bit(index, tbl->it_map);
Is easier to follow.
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-22 17:53 ` Olof Johansson
2007-03-22 17:47 ` Jake Moilanen
@ 2007-03-22 22:52 ` Segher Boessenkool
2007-03-27 20:10 ` Jake Moilanen
2 siblings, 0 replies; 25+ messages in thread
From: Segher Boessenkool @ 2007-03-22 22:52 UTC (permalink / raw)
To: Olof Johansson; +Cc: linuxppc-dev, paulus
> I would also like it to be the last page in the range, not the first
> (since otherwise you'll reserve even if the window is less than 4GB.
>
> Something like (untested):
>
> entries_per_4g = 0x100000000 >> IOMMU_PAGE_SHIFT;
>
> /* Mark the last bit before a 4GB boundary as used */
> start_index = tbl->it_offset | (entries_per_4g - 1);
> end_index = tbl->it_offset + tbl->it_size;
>
> for (index = start_index; index < end_index; index += entries_per_4g)
> __set_bit(index, tbl->it_map);
As a very minor optimisation, avoid reserving the last page
of the I/O window if it ends on a 4GB boundary?
Segher
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-22 17:53 ` Olof Johansson
2007-03-22 17:47 ` Jake Moilanen
2007-03-22 22:52 ` Segher Boessenkool
@ 2007-03-27 20:10 ` Jake Moilanen
2007-03-27 20:55 ` Benjamin Herrenschmidt
` (2 more replies)
2 siblings, 3 replies; 25+ messages in thread
From: Jake Moilanen @ 2007-03-27 20:10 UTC (permalink / raw)
To: Olof Johansson; +Cc: linuxppc-dev, paulus
On Thu, 2007-03-22 at 12:53 -0500, Olof Johansson wrote:
> On Wed, Mar 21, 2007 at 04:05:48PM -0500, Jake Moilanen wrote:
>
> > @@ -480,6 +495,21 @@ struct iommu_table *iommu_init_table(str
> > ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
> > #endif
> >
> > + /*
> > + * DMA cannot cross 4 GB boundary. Mark first entry of each 4
> > + * GB chunk as reserved.
> > + */
> > + if (protect4gb) {
> > + start_addr = tbl->it_offset << IOMMU_PAGE_SHIFT;
> > + /* go up to next 4GB boundary */
> > + start_addr = (start_addr + 0x00000000ffffffffl) >> 32;
> > + end_addr = (tbl->it_offset + tbl->it_size) << IOMMU_PAGE_SHIFT;
> > + for (index = start_addr; index < end_addr; index += (1l << 32)) {
> > + /* Reserve 4GB entry */
> > + __set_bit((index >> IOMMU_PAGE_SHIFT) - tbl->it_offset,
> > tbl->it_map);
> > + }
> > + }
> > +
>
>
> This is done a bit more complicated than it has to be. The >> 32 is a
> red flag as well.
>
> I would also like it to be the last page in the range, not the first
> (since otherwise you'll reserve even if the window is less than 4GB.
>
> Something like (untested):
>
> entries_per_4g = 0x100000000 >> IOMMU_PAGE_SHIFT;
>
> /* Mark the last bit before a 4GB boundary as used */
> start_index = tbl->it_offset | (entries_per_4g - 1);
> end_index = tbl->it_offset + tbl->it_size;
>
> for (index = start_index; index < end_index; index += entries_per_4g)
> __set_bit(index, tbl->it_map);
I just had to realign the start_index in case it didn't start at a 4GB
boundary. Other than that, it worked like a champ. I also changed to
for loop to exit if it's the last entry.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
---
arch/powerpc/kernel/iommu.c | 35 ++++++++++++++++++++++++++++++++++-
1 files changed, 34 insertions(+), 1 deletion(-)
Index: powerpc/arch/powerpc/kernel/iommu.c
===================================================================
--- powerpc.orig/arch/powerpc/kernel/iommu.c
+++ powerpc/arch/powerpc/kernel/iommu.c
@@ -47,6 +47,8 @@ static int novmerge = 0;
static int novmerge = 1;
#endif
+static int protect4gb = 1;
+
static inline unsigned long iommu_num_pages(unsigned long vaddr,
unsigned long slen)
{
@@ -58,6 +60,16 @@ static inline unsigned long iommu_num_pa
return npages;
}
+static int __init setup_protect4gb(char *str)
+{
+ if (strcmp(str, "on") == 0)
+ protect4gb = 1;
+ else if (strcmp(str, "off") == 0)
+ protect4gb = 0;
+
+ return 1;
+}
+
static int __init setup_iommu(char *str)
{
if (!strcmp(str, "novmerge"))
@@ -67,6 +79,7 @@ static int __init setup_iommu(char *str)
return 1;
}
+__setup("protect4gb=", setup_protect4gb);
__setup("iommu=", setup_iommu);
static unsigned long iommu_range_alloc(struct iommu_table *tbl,
@@ -429,6 +442,9 @@ void iommu_unmap_sg(struct iommu_table *
struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
{
unsigned long sz;
+ unsigned long start_index, end_index;
+ unsigned long entries_per_4g;
+ unsigned long index;
static int welcomed = 0;
struct page *page;
@@ -450,7 +466,7 @@ struct iommu_table *iommu_init_table(str
#ifdef CONFIG_CRASH_DUMP
if (ppc_md.tce_get) {
- unsigned long index, tceval;
+ unsigned long tceval;
unsigned long tcecount = 0;
/*
@@ -480,6 +496,23 @@ struct iommu_table *iommu_init_table(str
ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
#endif
+ /*
+ * DMA cannot cross 4 GB boundary. Mark last entry of each 4
+ * GB chunk as reserved.
+ */
+ if (protect4gb) {
+ entries_per_4g = 0x100000000l >> IOMMU_PAGE_SHIFT;
+
+ /* Mark the last bit before a 4GB boundary as used */
+ start_index = (tbl->it_offset << IOMMU_PAGE_SHIFT) >> 32;
+ start_index |= (entries_per_4g - 1);
+
+ end_index = tbl->it_offset + tbl->it_size;
+
+ for (index = start_index; index < end_index - 1; index +=
entries_per_4g)
+ __set_bit(index, tbl->it_map);
+ }
+
if (!welcomed) {
printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
novmerge ? "disabled" : "enabled");
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-27 20:10 ` Jake Moilanen
@ 2007-03-27 20:55 ` Benjamin Herrenschmidt
2007-03-27 23:48 ` Paul Mackerras
2007-03-28 15:56 ` Olof Johansson
2 siblings, 0 replies; 25+ messages in thread
From: Benjamin Herrenschmidt @ 2007-03-27 20:55 UTC (permalink / raw)
To: Jake Moilanen; +Cc: Olof Johansson, linuxppc-dev, paulus
On Tue, 2007-03-27 at 15:10 -0500, Jake Moilanen wrote:
> I just had to realign the start_index in case it didn't start at a 4GB
> boundary. Other than that, it worked like a champ. I also changed to
> forloop to exit if it's the last entry.
>
> Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
patch got mangled ...
> ---
> arch/powerpc/kernel/iommu.c | 35 ++++++++++++++++++++++++++++++++++-
> 1 files changed, 34 insertions(+), 1 deletion(-)
>
> Index: powerpc/arch/powerpc/kernel/iommu.c
> ===================================================================
> --- powerpc.orig/arch/powerpc/kernel/iommu.c
> +++ powerpc/arch/powerpc/kernel/iommu.c
> @@ -47,6 +47,8 @@ static int novmerge = 0;
> static int novmerge = 1;
> #endif
>
> +static int protect4gb = 1;
> +
> static inline unsigned long iommu_num_pages(unsigned long vaddr,
> unsigned long slen)
> {
> @@ -58,6 +60,16 @@ static inline unsigned long iommu_num_pa
> return npages;
> }
>
> +static int __init setup_protect4gb(char *str)
> +{
> + if (strcmp(str, "on") == 0)
> + protect4gb = 1;
> + else if (strcmp(str, "off") == 0)
> + protect4gb = 0;
> +
> + return 1;
> +}
> +
> static int __init setup_iommu(char *str)
> {
> if (!strcmp(str, "novmerge"))
> @@ -67,6 +79,7 @@ static int __init setup_iommu(char *str)
> return 1;
> }
>
> +__setup("protect4gb=", setup_protect4gb);
> __setup("iommu=", setup_iommu);
>
> static unsigned long iommu_range_alloc(struct iommu_table *tbl,
> @@ -429,6 +442,9 @@ void iommu_unmap_sg(struct iommu_table *
> struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
> {
> unsigned long sz;
> + unsigned long start_index, end_index;
> + unsigned long entries_per_4g;
> + unsigned long index;
> static int welcomed = 0;
> struct page *page;
>
> @@ -450,7 +466,7 @@ struct iommu_table *iommu_init_table(str
>
> #ifdef CONFIG_CRASH_DUMP
> if (ppc_md.tce_get) {
> - unsigned long index, tceval;
> + unsigned long tceval;
> unsigned long tcecount = 0;
>
> /*
> @@ -480,6 +496,23 @@ struct iommu_table *iommu_init_table(str
> ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
> #endif
>
> + /*
> + * DMA cannot cross 4 GB boundary. Mark last entry of each 4
> + * GB chunk as reserved.
> + */
> + if (protect4gb) {
> + entries_per_4g = 0x100000000l >> IOMMU_PAGE_SHIFT;
> +
> + /* Mark the last bit before a 4GB boundary as used */
> + start_index = (tbl->it_offset << IOMMU_PAGE_SHIFT) >> 32;
> + start_index |= (entries_per_4g - 1);
> +
> + end_index = tbl->it_offset + tbl->it_size;
> +
> + for (index = start_index; index < end_index - 1; index +=
> entries_per_4g)
> + __set_bit(index, tbl->it_map);
> + }
> +
> if (!welcomed) {
> printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
> novmerge ? "disabled" : "enabled");
>
>
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc-dev
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-27 20:10 ` Jake Moilanen
2007-03-27 20:55 ` Benjamin Herrenschmidt
@ 2007-03-27 23:48 ` Paul Mackerras
2007-03-28 15:56 ` Olof Johansson
2 siblings, 0 replies; 25+ messages in thread
From: Paul Mackerras @ 2007-03-27 23:48 UTC (permalink / raw)
To: Jake Moilanen; +Cc: Olof Johansson, linuxppc-dev
Jake Moilanen writes:
> I just had to realign the start_index in case it didn't start at a 4GB
> boundary. Other than that, it worked like a champ. I also changed to
> for loop to exit if it's the last entry.
If this is now suitable for going upstream, please re-send with a
proper patch description, e.g. the original patch description modified
or augmented as necessary for the changes you've made. A description
of what has changed since the previous version of the patch is not
suitable as the commit text in the git repository.
Paul.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-27 20:10 ` Jake Moilanen
2007-03-27 20:55 ` Benjamin Herrenschmidt
2007-03-27 23:48 ` Paul Mackerras
@ 2007-03-28 15:56 ` Olof Johansson
2007-03-28 18:17 ` Jake Moilanen
2 siblings, 1 reply; 25+ messages in thread
From: Olof Johansson @ 2007-03-28 15:56 UTC (permalink / raw)
To: Jake Moilanen; +Cc: linuxppc-dev, paulus
On Tue, Mar 27, 2007 at 03:10:41PM -0500, Jake Moilanen wrote:
> I just had to realign the start_index in case it didn't start at a 4GB
> boundary. Other than that, it worked like a champ. I also changed to
> for loop to exit if it's the last entry.
Hmm. See below.
> + /*
> + * DMA cannot cross 4 GB boundary. Mark last entry of each 4
> + * GB chunk as reserved.
> + */
> + if (protect4gb) {
> + entries_per_4g = 0x100000000l >> IOMMU_PAGE_SHIFT;
> +
> + /* Mark the last bit before a 4GB boundary as used */
> + start_index = (tbl->it_offset << IOMMU_PAGE_SHIFT) >> 32;
> + start_index |= (entries_per_4g - 1);
This looks broken.
The idea is to make start_index the last page before the first 4GB
boundary after it_offset. If that happens to be beyond end_index the
for loop below will never run. If it's below that, every last page in
the 4GB ranges will be marked in the loop. This will work even if the
table starts at i.e. 2GB and goes until 10GB.
With the first line above, your start_index will always be 0xfffff
(unless the offset is waay up there in the address space).
The logic I had was:
start_index = tbl->it_offset | (entries_per_4g - 1);
This is also broken, since it doesn't consider it_offset in the loop
below. That was my bad, and I guess was what you tried to fix above.
What you really want is:
start_index = tbl->it_offset | (entries_per_4g - 1);
start_index -= tbl->it_offset;
end_index = tbl->it_size;
Say that it_offset is at 3GB, with 4KB pages that means the value is
0xc0000. entries_per_4g is 0x100000, i.e. the logic becomes: 0xc0000 |
0xfffff = 0xfffff (- 0xc0000 = 0x3ffff), which indeed is the last page
before 4GB.
If it_offset is at 9GB, i.e. 0x240000, then we get start_index at 0x2fffff
(- 0x240000 = 0xbffff) , i.e. yet again last page before the 12GB wrap.
> +
> + end_index = tbl->it_offset + tbl->it_size;
> +
> + for (index = start_index; index < end_index - 1; index +=
> entries_per_4g)
> + __set_bit(index, tbl->it_map);
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-28 15:56 ` Olof Johansson
@ 2007-03-28 18:17 ` Jake Moilanen
2007-03-28 23:23 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 25+ messages in thread
From: Jake Moilanen @ 2007-03-28 18:17 UTC (permalink / raw)
To: paulus, Olof Johansson; +Cc: linuxppc-dev
> > + /*
> > + * DMA cannot cross 4 GB boundary. Mark last entry of each 4
> > + * GB chunk as reserved.
> > + */
> > + if (protect4gb) {
> > + entries_per_4g = 0x100000000l >> IOMMU_PAGE_SHIFT;
> > +
> > + /* Mark the last bit before a 4GB boundary as used */
> > + start_index = (tbl->it_offset << IOMMU_PAGE_SHIFT) >> 32;
> > + start_index |= (entries_per_4g - 1);
>
> This looks broken.
>
> The idea is to make start_index the last page before the first 4GB
> boundary after it_offset. If that happens to be beyond end_index the
> for loop below will never run. If it's below that, every last page in
> the 4GB ranges will be marked in the loop. This will work even if the
> table starts at i.e. 2GB and goes until 10GB.
>
> With the first line above, your start_index will always be 0xfffff
> (unless the offset is waay up there in the address space).
>
> The logic I had was:
>
> start_index = tbl->it_offset | (entries_per_4g - 1);
>
> This is also broken, since it doesn't consider it_offset in the loop
> below. That was my bad, and I guess was what you tried to fix above.
>
> What you really want is:
>
> start_index = tbl->it_offset | (entries_per_4g - 1);
> start_index -= tbl->it_offset;
>
> end_index = tbl->it_size;
Yup.
> Say that it_offset is at 3GB, with 4KB pages that means the value is
> 0xc0000. entries_per_4g is 0x100000, i.e. the logic becomes: 0xc0000 |
> 0xfffff = 0xfffff (- 0xc0000 = 0x3ffff), which indeed is the last page
> before 4GB.
>
> If it_offset is at 9GB, i.e. 0x240000, then we get start_index at 0x2fffff
> (- 0x240000 = 0xbffff) , i.e. yet again last page before the 12GB wrap.
One more try.
There are many adapters which can not handle DMAing acrosss any 4 GB
boundary. For instance the latest Emulex adapters.
This normally is not an issue as firmware gives dma-windows under
4gigs. However, some of the new System-P boxes have dma-windows above
4gigs, and this present a problem.
During initialization of the IOMMU tables, the last entry at each 4GB
boundary is marked as used. Thus no mappings can cross the boundary.
If a table ends at a 4GB boundary, the entry is not marked as used.
A boot option to remove this 4GB protection is given w/ protect4gb=off.
This exposes the potential issue for driver and hardware development
purposes.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
---
arch/powerpc/kernel/iommu.c | 35 ++++++++++++++++++++++++++++++++++-
1 files changed, 34 insertions(+), 1 deletion(-)
Index: powerpc/arch/powerpc/kernel/iommu.c
===================================================================
--- powerpc.orig/arch/powerpc/kernel/iommu.c
+++ powerpc/arch/powerpc/kernel/iommu.c
@@ -47,6 +47,8 @@ static int novmerge = 0;
static int novmerge = 1;
#endif
+static int protect4gb = 1;
+
static inline unsigned long iommu_num_pages(unsigned long vaddr,
unsigned long slen)
{
@@ -58,6 +60,16 @@ static inline unsigned long iommu_num_pa
return npages;
}
+static int __init setup_protect4gb(char *str)
+{
+ if (strcmp(str, "on") == 0)
+ protect4gb = 1;
+ else if (strcmp(str, "off") == 0)
+ protect4gb = 0;
+
+ return 1;
+}
+
static int __init setup_iommu(char *str)
{
if (!strcmp(str, "novmerge"))
@@ -67,6 +79,7 @@ static int __init setup_iommu(char *str)
return 1;
}
+__setup("protect4gb=", setup_protect4gb);
__setup("iommu=", setup_iommu);
static unsigned long iommu_range_alloc(struct iommu_table *tbl,
@@ -429,6 +442,9 @@ void iommu_unmap_sg(struct iommu_table *
struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
{
unsigned long sz;
+ unsigned long start_index, end_index;
+ unsigned long entries_per_4g;
+ unsigned long index;
static int welcomed = 0;
struct page *page;
@@ -450,7 +466,7 @@ struct iommu_table *iommu_init_table(str
#ifdef CONFIG_CRASH_DUMP
if (ppc_md.tce_get) {
- unsigned long index, tceval;
+ unsigned long tceval;
unsigned long tcecount = 0;
/*
@@ -480,6 +496,23 @@ struct iommu_table *iommu_init_table(str
ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
#endif
+ /*
+ * DMA cannot cross 4 GB boundary. Mark last entry of each 4
+ * GB chunk as reserved.
+ */
+ if (protect4gb) {
+ entries_per_4g = 0x100000000l >> IOMMU_PAGE_SHIFT;
+
+ /* Mark the last bit before a 4GB boundary as used */
+ start_index = tbl->it_offset | (entries_per_4g - 1);
+ start_index -= tbl->it_offset;
+
+ end_index = tbl->it_size;
+
+ for (index = start_index; index < end_index - 1; index +=
entries_per_4g)
+ __set_bit(index, tbl->it_map);
+ }
+
if (!welcomed) {
printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
novmerge ? "disabled" : "enabled");
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-28 18:17 ` Jake Moilanen
@ 2007-03-28 23:23 ` Benjamin Herrenschmidt
2007-03-29 13:44 ` Jake Moilanen
0 siblings, 1 reply; 25+ messages in thread
From: Benjamin Herrenschmidt @ 2007-03-28 23:23 UTC (permalink / raw)
To: Jake Moilanen; +Cc: Olof Johansson, linuxppc-dev, paulus
On Wed, 2007-03-28 at 13:17 -0500, Jake Moilanen wrote:
> One more try.
And patch is still mangled....
> There are many adapters which can not handle DMAing acrosss any 4 GB
> boundary. For instance the latest Emulex adapters.
>
> This normally is not an issue as firmware gives dma-windows under
> 4gigs. However, some of the new System-P boxes have dma-windows above
> 4gigs, and this present a problem.
>
> During initialization of the IOMMU tables, the last entry at each 4GB
> boundary is marked as used. Thus no mappings can cross the boundary.
> If a table ends at a 4GB boundary, the entry is not marked as used.
>
> A boot option to remove this 4GB protection is given w/ protect4gb=off.
> This exposes the potential issue for driver and hardware development
> purposes.
>
> Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
> ---
> arch/powerpc/kernel/iommu.c | 35 ++++++++++++++++++++++++++++++++++-
> 1 files changed, 34 insertions(+), 1 deletion(-)
>
> Index: powerpc/arch/powerpc/kernel/iommu.c
> ===================================================================
> --- powerpc.orig/arch/powerpc/kernel/iommu.c
> +++ powerpc/arch/powerpc/kernel/iommu.c
> @@ -47,6 +47,8 @@ static int novmerge = 0;
> static int novmerge = 1;
> #endif
>
> +static int protect4gb = 1;
> +
> static inline unsigned long iommu_num_pages(unsigned long vaddr,
> unsigned long slen)
> {
> @@ -58,6 +60,16 @@ static inline unsigned long iommu_num_pa
> return npages;
> }
>
> +static int __init setup_protect4gb(char *str)
> +{
> + if (strcmp(str, "on") == 0)
> + protect4gb = 1;
> + else if (strcmp(str, "off") == 0)
> + protect4gb = 0;
> +
> + return 1;
> +}
> +
> static int __init setup_iommu(char *str)
> {
> if (!strcmp(str, "novmerge"))
> @@ -67,6 +79,7 @@ static int __init setup_iommu(char *str)
> return 1;
> }
>
> +__setup("protect4gb=", setup_protect4gb);
> __setup("iommu=", setup_iommu);
>
> static unsigned long iommu_range_alloc(struct iommu_table *tbl,
> @@ -429,6 +442,9 @@ void iommu_unmap_sg(struct iommu_table *
> struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
> {
> unsigned long sz;
> + unsigned long start_index, end_index;
> + unsigned long entries_per_4g;
> + unsigned long index;
> static int welcomed = 0;
> struct page *page;
>
> @@ -450,7 +466,7 @@ struct iommu_table *iommu_init_table(str
>
> #ifdef CONFIG_CRASH_DUMP
> if (ppc_md.tce_get) {
> - unsigned long index, tceval;
> + unsigned long tceval;
> unsigned long tcecount = 0;
>
> /*
> @@ -480,6 +496,23 @@ struct iommu_table *iommu_init_table(str
> ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
> #endif
>
> + /*
> + * DMA cannot cross 4 GB boundary. Mark last entry of each 4
> + * GB chunk as reserved.
> + */
> + if (protect4gb) {
> + entries_per_4g = 0x100000000l >> IOMMU_PAGE_SHIFT;
> +
> + /* Mark the last bit before a 4GB boundary as used */
> + start_index = tbl->it_offset | (entries_per_4g - 1);
> + start_index -= tbl->it_offset;
> +
> + end_index = tbl->it_size;
> +
> + for (index = start_index; index < end_index - 1; index +=
> entries_per_4g)
> + __set_bit(index, tbl->it_map);
> + }
> +
> if (!welcomed) {
> printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
> novmerge ? "disabled" : "enabled");
>
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc-dev
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-28 23:23 ` Benjamin Herrenschmidt
@ 2007-03-29 13:44 ` Jake Moilanen
2007-03-29 14:52 ` Olof Johansson
` (2 more replies)
0 siblings, 3 replies; 25+ messages in thread
From: Jake Moilanen @ 2007-03-29 13:44 UTC (permalink / raw)
To: paulus, Benjamin Herrenschmidt; +Cc: Olof Johansson, linuxppc-dev
On Thu, 2007-03-29 at 09:23 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2007-03-28 at 13:17 -0500, Jake Moilanen wrote:
>
> > One more try.
>
> And patch is still mangled....
Hrm....I should just go back to Sylpheed.
Unmangled version:
There are many adapters which can not handle DMAing acrosss any 4 GB
boundary. For instance the latest Emulex adapters.
This normally is not an issue as firmware gives dma-windows under
4gigs. However, some of the new System-P boxes have dma-windows above
4gigs, and this present a problem.
During initialization of the IOMMU tables, the last entry at each 4GB
boundary is marked as used. Thus no mappings can cross the boundary.
If a table ends at a 4GB boundary, the entry is not marked as used.
A boot option to remove this 4GB protection is given w/ protect4gb=off.
This exposes the potential issue for driver and hardware development
purposes.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
---
arch/powerpc/kernel/iommu.c | 35 ++++++++++++++++++++++++++++++++++-
1 files changed, 34 insertions(+), 1 deletion(-)
Index: powerpc/arch/powerpc/kernel/iommu.c
===================================================================
--- powerpc.orig/arch/powerpc/kernel/iommu.c
+++ powerpc/arch/powerpc/kernel/iommu.c
@@ -47,6 +47,8 @@ static int novmerge = 0;
static int novmerge = 1;
#endif
+static int protect4gb = 1;
+
static inline unsigned long iommu_num_pages(unsigned long vaddr,
unsigned long slen)
{
@@ -58,6 +60,16 @@ static inline unsigned long iommu_num_pa
return npages;
}
+static int __init setup_protect4gb(char *str)
+{
+ if (strcmp(str, "on") == 0)
+ protect4gb = 1;
+ else if (strcmp(str, "off") == 0)
+ protect4gb = 0;
+
+ return 1;
+}
+
static int __init setup_iommu(char *str)
{
if (!strcmp(str, "novmerge"))
@@ -67,6 +79,7 @@ static int __init setup_iommu(char *str)
return 1;
}
+__setup("protect4gb=", setup_protect4gb);
__setup("iommu=", setup_iommu);
static unsigned long iommu_range_alloc(struct iommu_table *tbl,
@@ -429,6 +442,9 @@ void iommu_unmap_sg(struct iommu_table *
struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
{
unsigned long sz;
+ unsigned long start_index, end_index;
+ unsigned long entries_per_4g;
+ unsigned long index;
static int welcomed = 0;
struct page *page;
@@ -450,7 +466,7 @@ struct iommu_table *iommu_init_table(str
#ifdef CONFIG_CRASH_DUMP
if (ppc_md.tce_get) {
- unsigned long index, tceval;
+ unsigned long tceval;
unsigned long tcecount = 0;
/*
@@ -480,6 +496,23 @@ struct iommu_table *iommu_init_table(str
ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
#endif
+ /*
+ * DMA cannot cross 4 GB boundary. Mark last entry of each 4
+ * GB chunk as reserved.
+ */
+ if (protect4gb) {
+ entries_per_4g = 0x100000000l >> IOMMU_PAGE_SHIFT;
+
+ /* Mark the last bit before a 4GB boundary as used */
+ start_index = tbl->it_offset | (entries_per_4g - 1);
+ start_index -= tbl->it_offset;
+
+ end_index = tbl->it_size;
+
+ for (index = start_index; index < end_index - 1; index += entries_per_4g)
+ __set_bit(index, tbl->it_map);
+ }
+
if (!welcomed) {
printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
novmerge ? "disabled" : "enabled");
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-29 13:44 ` Jake Moilanen
@ 2007-03-29 14:52 ` Olof Johansson
2007-03-29 21:54 ` Benjamin Herrenschmidt
2007-04-23 12:22 ` Paul Mackerras
2 siblings, 0 replies; 25+ messages in thread
From: Olof Johansson @ 2007-03-29 14:52 UTC (permalink / raw)
To: Jake Moilanen; +Cc: paulus, linuxppc-dev
On Thu, Mar 29, 2007 at 08:44:02AM -0500, Jake Moilanen wrote:
> On Thu, 2007-03-29 at 09:23 +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2007-03-28 at 13:17 -0500, Jake Moilanen wrote:
> >
> > > One more try.
> >
> > And patch is still mangled....
>
> Hrm....I should just go back to Sylpheed.
>
> Unmangled version:
>
>
> There are many adapters which can not handle DMAing acrosss any 4 GB
> boundary. For instance the latest Emulex adapters.
>
> This normally is not an issue as firmware gives dma-windows under
> 4gigs. However, some of the new System-P boxes have dma-windows above
> 4gigs, and this present a problem.
>
> During initialization of the IOMMU tables, the last entry at each 4GB
> boundary is marked as used. Thus no mappings can cross the boundary.
> If a table ends at a 4GB boundary, the entry is not marked as used.
>
> A boot option to remove this 4GB protection is given w/ protect4gb=off.
> This exposes the potential issue for driver and hardware development
> purposes.
>
> Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-29 13:44 ` Jake Moilanen
2007-03-29 14:52 ` Olof Johansson
@ 2007-03-29 21:54 ` Benjamin Herrenschmidt
2007-04-23 12:22 ` Paul Mackerras
2 siblings, 0 replies; 25+ messages in thread
From: Benjamin Herrenschmidt @ 2007-03-29 21:54 UTC (permalink / raw)
To: Jake Moilanen; +Cc: Olof Johansson, linuxppc-dev, paulus
On Thu, 2007-03-29 at 08:44 -0500, Jake Moilanen wrote:
> On Thu, 2007-03-29 at 09:23 +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2007-03-28 at 13:17 -0500, Jake Moilanen wrote:
> >
> > > One more try.
> >
> > And patch is still mangled....
>
> Hrm....I should just go back to Sylpheed.
Heh, I never have problems with Evolution... just make sure you use
"Preformat" as a style.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-03-29 13:44 ` Jake Moilanen
2007-03-29 14:52 ` Olof Johansson
2007-03-29 21:54 ` Benjamin Herrenschmidt
@ 2007-04-23 12:22 ` Paul Mackerras
2007-04-24 3:07 ` Olof Johansson
2 siblings, 1 reply; 25+ messages in thread
From: Paul Mackerras @ 2007-04-23 12:22 UTC (permalink / raw)
To: Jake Moilanen; +Cc: Olof Johansson, linuxppc-dev
Jake,
It looks like I have applied both the final version of this patch
(569975591c5530fdc9c7a3c45122e5e46f075a74) and the original version
(618d3adc351a24c4c48437c767befb88ca2d199d) that modified
iommu_range_alloc().
Should I revert the original version?
Paul.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] DMA 4GB boundary protection
2007-04-23 12:22 ` Paul Mackerras
@ 2007-04-24 3:07 ` Olof Johansson
0 siblings, 0 replies; 25+ messages in thread
From: Olof Johansson @ 2007-04-24 3:07 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev
On Mon, Apr 23, 2007 at 10:22:00PM +1000, Paul Mackerras wrote:
> Jake,
>
> It looks like I have applied both the final version of this patch
> (569975591c5530fdc9c7a3c45122e5e46f075a74) and the original version
> (618d3adc351a24c4c48437c767befb88ca2d199d) that modified
> iommu_range_alloc().
>
> Should I revert the original version?
(Since I didn't see a reply from Jake today):
Yes, please do. It's redundant given the better patch that went in later.
Thanks,
-Olof
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2007-04-24 3:07 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-03-02 21:49 [PATCH] DMA 4GB boundary protection Jake Moilanen
2007-03-02 22:27 ` Olof Johansson
2007-03-03 8:27 ` Benjamin Herrenschmidt
2007-03-03 23:25 ` Olof Johansson
2007-03-04 5:17 ` Christoph Hellwig
2007-03-04 5:52 ` Olof Johansson
2007-03-03 23:29 ` Olof Johansson
2007-03-03 23:32 ` Segher Boessenkool
2007-03-03 23:57 ` Olof Johansson
2007-03-21 21:05 ` Jake Moilanen
2007-03-21 21:39 ` Olof Johansson
2007-03-22 17:53 ` Olof Johansson
2007-03-22 17:47 ` Jake Moilanen
2007-03-22 22:52 ` Segher Boessenkool
2007-03-27 20:10 ` Jake Moilanen
2007-03-27 20:55 ` Benjamin Herrenschmidt
2007-03-27 23:48 ` Paul Mackerras
2007-03-28 15:56 ` Olof Johansson
2007-03-28 18:17 ` Jake Moilanen
2007-03-28 23:23 ` Benjamin Herrenschmidt
2007-03-29 13:44 ` Jake Moilanen
2007-03-29 14:52 ` Olof Johansson
2007-03-29 21:54 ` Benjamin Herrenschmidt
2007-04-23 12:22 ` Paul Mackerras
2007-04-24 3:07 ` Olof Johansson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).