public inbox for linux-ia64@vger.kernel.org
* pgprot_writecombine & shub 1.x
@ 2005-01-11 20:00 Jesse Barnes
  2005-01-11 22:12 ` David Mosberger
                   ` (31 more replies)
  0 siblings, 32 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-11 20:00 UTC (permalink / raw)
  To: linux-ia64

SGI sn2 systems based on SHub 1.x chipsets don't support the write combine 
attribute on PIO space, only regular memory space.  This is a problem because 
drivers often expect to map PIO space with certain memory attributes (e.g. 
the PCI mmap code, drm, fbmem, acorn video).  None of the calls to 
pgprot_writecombine are in performance-critical paths, so would making 
pgprot_writecombine into a machine vector be ok?  Comments?

Thanks,
Jesse
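
A rough userspace sketch of what "making pgprot_writecombine into a
machine vector" could look like.  Everything here is hypothetical: the
struct, the hook names and the attribute bit values are placeholders
for illustration, not the real ia64 machine-vector or PTE definitions.

```c
#include <stdint.h>

/* Illustrative memory-attribute field; placeholder values only. */
#define MA_MASK 0x1cULL
#define MA_UC   0x10ULL   /* uncacheable */
#define MA_WC   0x18ULL   /* write coalescing */

typedef uint64_t pgprot_bits;

/* Hypothetical machine vector: one hook per platform. */
struct machvec_sketch {
	const char *name;
	pgprot_bits (*pgprot_writecombine)(pgprot_bits prot);
};

/* Default platform: really set the WC attribute. */
static pgprot_bits generic_writecombine(pgprot_bits prot)
{
	return (prot & ~MA_MASK) | MA_WC;
}

/* SHub 1.x-style platform: no WC on PIO space, so hand back UC. */
static pgprot_bits shub1_writecombine(pgprot_bits prot)
{
	return (prot & ~MA_MASK) | MA_UC;
}

static const struct machvec_sketch mv_generic = { "generic", generic_writecombine };
static const struct machvec_sketch mv_shub1   = { "shub1",   shub1_writecombine };
```

A caller would go through the selected vector, e.g.
mv_shub1.pgprot_writecombine(prot), and get UC bits back on the
platform that cannot do WC.  Note that the hook keeps the name
"writecombine" even when it returns an uncached mapping.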


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
@ 2005-01-11 22:12 ` David Mosberger
  2005-01-11 22:35 ` Jesse Barnes
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: David Mosberger @ 2005-01-11 22:12 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Tue, 11 Jan 2005 12:00:02 -0800, Jesse Barnes <jbarnes@sgi.com> said:

  Jesse> SGI sn2 systems based on SHub 1.x chipsets don't support the
  Jesse> write combine attribute on PIO space, only regular memory
  Jesse> space.  This is a problem because drivers often expect to map
  Jesse> PIO space with certain memory attributes (e.g.  the PCI mmap
  Jesse> code, drm, fbmem, acorn video).  None of the calls to
  Jesse> pgprot_writecombine are in performance-critical paths, so
  Jesse> would making pgprot_writecombine into a machine vector be
  Jesse> ok?  Comments?

This seems wrong to me.  pgprot_writecombine() should do what it says,
no more and no less.

The EFI memory-map has already all the info needed to determine
whether write-combine mapping is supported, so perhaps the code should
be changed to take that into consideration instead?

	--david


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
  2005-01-11 22:12 ` David Mosberger
@ 2005-01-11 22:35 ` Jesse Barnes
  2005-01-12 18:51 ` Jim Hull
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-11 22:35 UTC (permalink / raw)
  To: linux-ia64

On Tuesday, January 11, 2005 2:12 pm, David Mosberger wrote:
> This seems wrong to me.  pgprot_writecombine() should do what it says,
> no more and no less.
>
> The EFI memory-map has already all the info needed to determine
> whether write-combine mapping is supported, so perhaps the code should
> be changed to take that into consideration instead?

But what about places that unconditionally set the WC bit regardless of what 
the EFI memory map says?  pci_mmap_page_range does this for example if the 
write_combine flag is set on the vma.  I'm looking for a way to abstract out 
uses like that, so that shub 1.x systems don't set the bit.

Jesse


* RE: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
  2005-01-11 22:12 ` David Mosberger
  2005-01-11 22:35 ` Jesse Barnes
@ 2005-01-12 18:51 ` Jim Hull
  2005-01-12 19:31 ` Hugo Kohmann
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jim Hull @ 2005-01-12 18:51 UTC (permalink / raw)
  To: linux-ia64

Jesse Barnes wrote:

> But what about places that unconditionally set the WC bit 
> regardless of what the EFI memory map says?

To be blunt - those places are broken, at least from the perspective of the IPF
(ia64 if you prefer) architecture.  IPF declares that it is the platform which
gets to decide the supported attributes for each address range, and provides the
EFI memory map to inform the OS of this support.

> pci_mmap_page_range does this for 
> example if the write_combine flag is set on the vma.
> I'm looking for a way to abstract out 
> uses like that, so that shub 1.x systems don't set the bit.

I'm not really qualified to design the right linux interfaces, but to be IPF
compliant, you need to change all such places to first consult the EFI memory
map.  Whether you do this once at boot time, on every call, whether to fail an
unsupported request or remap the attribute to something the platform can support
(e.g., mapping WC to UC), is all up to you.

 -- Jim
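
The two policies mentioned above (fail the request, or remap WC to
something the platform supports) can be sketched as follows.  The
helper name and the policy enum are made up for illustration; only the
EFI_MEMORY_UC/EFI_MEMORY_WC bit values follow the EFI attribute
definitions.  "supported" stands for whatever attribute mask the EFI
memory map reports for the range in question.

```c
#include <stdint.h>

/* Attribute bits as used in EFI memory descriptors. */
#define EFI_MEMORY_UC 0x1ULL
#define EFI_MEMORY_WC 0x2ULL

/* Hypothetical policy switch: refuse, or quietly downgrade to UC. */
enum wc_policy { WC_FAIL, WC_FALLBACK_UC };

/*
 * Returns the attribute to map with, or 0 if the request cannot be
 * satisfied under the chosen policy.
 */
static uint64_t pick_wc_attr(uint64_t supported, enum wc_policy policy)
{
	if (supported & EFI_MEMORY_WC)
		return EFI_MEMORY_WC;
	if (policy == WC_FALLBACK_UC && (supported & EFI_MEMORY_UC))
		return EFI_MEMORY_UC;
	return 0;
}
```

With WC_FAIL the caller sees the failure and has to cope with it;
with WC_FALLBACK_UC existing callers keep working but may silently
lose the performance they asked for.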




* RE: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (2 preceding siblings ...)
  2005-01-12 18:51 ` Jim Hull
@ 2005-01-12 19:31 ` Hugo Kohmann
  2005-01-12 19:32 ` Jesse Barnes
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Hugo Kohmann @ 2005-01-12 19:31 UTC (permalink / raw)
  To: linux-ia64


Hi,

Are there any new plans to implement support for writecombine in kernel 
space?

We have implemented a new socket transport family, AF_SCI, that provides
UDP/TCP compliant transport over SCI and needs writecombine to achieve
max throughput and low latency for small messages. Large messages can be 
sent using DMA, but small messages benefit greatly from memory-mapped 
transmission.

Best regards

Hugo Kohmann
Dolphin Interconnect Solutions


On Wed, 12 Jan 2005, Jim Hull wrote:

>Date: Wed, 12 Jan 2005 10:51:49 -0800
>From: Jim Hull <jim.hull@hp.com>
>To: 'Jesse Barnes' <jbarnes@sgi.com>,
>    "Mosberger, David" <david.mosberger@hp.com>
>Cc: linux-ia64@vger.kernel.org, 'Tony Luck' <tony.luck@intel.com>
>Subject: RE: pgprot_writecombine & shub 1.x
>
> Jesse Barnes wrote:
>
>> But what about places that unconditionally set the WC bit
>> regardless of what the EFI memory map says?
>
> To be blunt - those places are broken, at least from the perspective of the IPF
> (ia64 if you prefer) architecture.  IPF declares that it is the platform which
> gets to decide the supported attributes for each address range, and provides the
> EFI memory map to inform the OS of this support.
>
>> pci_mmap_page_range does this for
>> example if the write_combine flag is set on the vma.
>> I'm looking for a way to abstract out
>> uses like that, so that shub 1.x systems don't set the bit.
>
> I'm not really qualified to design the right linux interfaces, but to be IPF
> compliant, you need to change all such places to first consult the EFI memory
> map.  Whether you do this once at boot time, on every call, whether to fail an
> unsupported request or remap the attribute to something the platform can support
> (e.g., mapping WC to UC), is all up to you.
>
> -- Jim


============================================
Hugo Kohmann                           |
Dolphin Interconnect Solutions AS      | E-mail:
P.O. Box 150 Oppsal                    | hugo@dolphinics.com
N-0619 Oslo, Norway                    | Web:
Tel:+47 23 16 71 83                    | http://www.dolphinics.com
Fax:+47 23 16 71 80                    | 
Visiting Address: Olaf Helsets vei 6   |


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (3 preceding siblings ...)
  2005-01-12 19:31 ` Hugo Kohmann
@ 2005-01-12 19:32 ` Jesse Barnes
  2005-01-12 21:54 ` David Mosberger
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-12 19:32 UTC (permalink / raw)
  To: linux-ia64

On Wednesday, January 12, 2005 10:51 am, Jim Hull wrote:
> Jesse Barnes wrote:
> > But what about places that unconditionally set the WC bit
> > regardless of what the EFI memory map says?
>
> To be blunt - those places are broken, at least from the perspective of the
> IPF (ia64 if you prefer) architecture.  IPF declares that it is the
> platform which gets to decide the supported attributes for each address
> range, and provides the EFI memory map to inform the OS of this support.
>
> > pci_mmap_page_range does this for
> > example if the write_combine flag is set on the vma.
> > I'm looking for a way to abstract out
> > uses like that, so that shub 1.x systems don't set the bit.
>
> I'm not really qualified to design the right linux interfaces, but to be
> IPF compliant, you need to change all such places to first consult the EFI
> memory map.  Whether you do this once at boot time, on every call, whether
> to fail an unsupported request or remap the attribute to something the
> platform can support (e.g., mapping WC to UC), is all up to you.

Thanks for clarifying, that would certainly make things easier from our 
perspective.  I suppose that potentially makes the pgprot_* calls a bit more 
expensive, but with the benefit that they won't clash with what EFI tells the 
kernel about how memory works.

Jesse


* RE: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (4 preceding siblings ...)
  2005-01-12 19:32 ` Jesse Barnes
@ 2005-01-12 21:54 ` David Mosberger
  2005-01-19 17:28 ` Jesse Barnes
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: David Mosberger @ 2005-01-12 21:54 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 12 Jan 2005 20:31:30 +0100 (CET), Hugo Kohmann <hugo@dolphinics.no> said:

  Hugo> Are there any new plans to implement support for writecombine
  Hugo> in kernel space?

I don't know of anybody who's working on it at the moment.

  Hugo> We have implemented a new socket transport family, AF_SCI,
  Hugo> that provides UDP/TCP compliant transport over SCI and needs
  Hugo> writecombine to achieve max throughput and low latency for
  Hugo> small messages. Large messages can be sent using DMA, but
  Hugo> small messages benefit greatly from memory-mapped
  Hugo> transmission.

I think there are two somewhat separate issues here:

 (1) WC mapping of uncachable address-ranges.
 (2) WC mapping of memory.

(1) isn't much of an issue (at least for today's chips; I think we may
still be violating the architecture if we end up mapping the same
address-range both UC and WC, but current chips don't care, AFAIK).

(2) is where things get trickier.  You'd have to make sure that any
memory that's mapped WC is allocated in granule-sized chunks.  There
is no support for this at the moment.  I do think it might be
worthwhile to support this, but don't know of anybody working on it.

	--david
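
For case (2), a small sketch of what "allocated in granule-sized
chunks" means arithmetically.  The 16 MB granule size is an assumption
for illustration (the real value is a kernel configuration choice), and
both helper names are hypothetical.  This only shows the rounding and
alignment checks such an allocator would need, not an allocator.

```c
#include <stdint.h>

/* Assumed granule size for illustration: 16 MB. */
#define GRANULE_SHIFT 24
#define GRANULE_SIZE  (1ULL << GRANULE_SHIFT)

/* Round a requested size up to a whole number of granules. */
static uint64_t granule_round_up(uint64_t bytes)
{
	return (bytes + GRANULE_SIZE - 1) & ~(GRANULE_SIZE - 1);
}

/* True if [start, start + len) is non-empty and covers only whole granules. */
static int granule_aligned(uint64_t start, uint64_t len)
{
	return len != 0 && ((start | len) & (GRANULE_SIZE - 1)) == 0;
}
```

Under these assumptions even a one-page WC buffer would consume a full
16 MB granule, which is presumably part of why nobody had implemented
it yet.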


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (5 preceding siblings ...)
  2005-01-12 21:54 ` David Mosberger
@ 2005-01-19 17:28 ` Jesse Barnes
  2005-01-19 17:53 ` David Mosberger
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-19 17:28 UTC (permalink / raw)
  To: linux-ia64

On Wednesday, January 12, 2005 10:51 am, Jim Hull wrote:
> I'm not really qualified to design the right linux interfaces, but to be
> IPF compliant, you need to change all such places to first consult the EFI
> memory map.  Whether you do this once at boot time, on every call, whether
> to fail an unsupported request or remap the attribute to something the
> platform can support (e.g., mapping WC to UC), is all up to you.

So I guess we need to add ia64_set_page_wc() and ia64_set_page_uc() functions 
that set the appropriate page bits or return errors if the EFI memory map 
says a particular mode isn't allowed?  How does that sound to you, Tony & 
David?  Fortunately there aren't that many users of pgprot_writecombine so 
fixing them up should be easy.

Thanks,
Jesse


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (6 preceding siblings ...)
  2005-01-19 17:28 ` Jesse Barnes
@ 2005-01-19 17:53 ` David Mosberger
  2005-01-19 17:56 ` Jesse Barnes
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: David Mosberger @ 2005-01-19 17:53 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 19 Jan 2005 09:28:27 -0800, Jesse Barnes <jbarnes@sgi.com> said:

  Jesse> On Wednesday, January 12, 2005 10:51 am, Jim Hull wrote:
  >> I'm not really qualified to design the right linux interfaces,
  >> but to be IPF compliant, you need to change all such places to
  >> first consult the EFI memory map.  Whether you do this once at
  >> boot time, on every call, whether to fail an unsupported request
  >> or remap the attribute to something the platform can support
  >> (e.g., mapping WC to UC), is all up to you.

  Jesse> So I guess we need to add ia64_set_page_wc() and
  Jesse> ia64_set_page_uc() functions that set the appropriate page
  Jesse> bits or return errors if the EFI memory map says a particular
  Jesse> mode isn't allowed?  How does that sound to you, Tony &
  Jesse> David?  Fortunately there aren't that many users of
  Jesse> pgprot_writecombine so fixing them up should be easy.

I'm still not clear which case you're concerned about: mapping I/O
memory with WC or mapping real memory with WC.  For the latter, I
think we may need an API to allocate memory that can be mapped WC (if
possible at all).

	--david


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (7 preceding siblings ...)
  2005-01-19 17:53 ` David Mosberger
@ 2005-01-19 17:56 ` Jesse Barnes
  2005-01-19 18:04 ` David Mosberger
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-19 17:56 UTC (permalink / raw)
  To: linux-ia64

On Wednesday, January 19, 2005 9:53 am, David Mosberger wrote:
> I'm still not clear which case you're concerned about: mapping I/O
> memory with WC or mapping real memory with WC.  For the latter, I
> think we may need an API to allocate memory that can be mapped WC (if
> possible at all).

Oh, I thought I said that in the first message, I'm talking about I/O memory 
here.  In any event though, we shouldn't be mapping anything with the WC 
attribute that the EFI memory map doesn't allow.  And yes, we probably do 
need a special allocator for mapping regular memory with WC or UC attributes 
to avoid mapping the same region with different attributes.

Thanks,
Jesse


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (8 preceding siblings ...)
  2005-01-19 17:56 ` Jesse Barnes
@ 2005-01-19 18:04 ` David Mosberger
  2005-01-19 18:16 ` Luck, Tony
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: David Mosberger @ 2005-01-19 18:04 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 19 Jan 2005 09:56:59 -0800, Jesse Barnes <jbarnes@sgi.com> said:

  Jesse> On Wednesday, January 19, 2005 9:53 am, David Mosberger
  Jesse> wrote:
  >> I'm still not clear which case you're concerned about: mapping
  >> I/O memory with WC or mapping real memory with WC.  For the
  >> latter, I think we may need an API to allocate memory that can be
  >> mapped WC (if possible at all).

  Jesse> Oh, I thought I said that in the first message, I'm talking
  Jesse> about I/O memory here.

OK, thanks for clarifying.

  Jesse> In any event though, we shouldn't be mapping anything with
  Jesse> the WC attribute that the EFI memory map doesn't allow.

Adding new routines for supporting this sounds reasonable, but I think
it needs to be added to linux/efi.h, since EFI isn't ia64-specific.

  Jesse> And yes, we probably do need a special allocator for mapping
  Jesse> regular memory with WC or UC attributes to avoid mapping the
  Jesse> same region with different attributes.

OK, sounds like we're in violent agreement.

	--david


* RE: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (9 preceding siblings ...)
  2005-01-19 18:04 ` David Mosberger
@ 2005-01-19 18:16 ` Luck, Tony
  2005-01-19 18:21 ` Jesse Barnes
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Luck, Tony @ 2005-01-19 18:16 UTC (permalink / raw)
  To: linux-ia64

>Oh, I thought I said that in the first message, I'm talking 
>about I/O memory here.  In any event though, we shouldn't be
>mapping anything with the WC attribute that the EFI memory
>map doesn't allow.

Does the EFI memory map help here?  Is I/O memory included
in the efi memory map?  What about the hot-plug case ... the
EFI memory map isn't updated for new i/o memory that appears
when you plug in a new card?

-Tony


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (10 preceding siblings ...)
  2005-01-19 18:16 ` Luck, Tony
@ 2005-01-19 18:21 ` Jesse Barnes
  2005-01-19 18:38 ` Luck, Tony
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-19 18:21 UTC (permalink / raw)
  To: linux-ia64

On Wednesday, January 19, 2005 10:16 am, Luck, Tony wrote:
> >Oh, I thought I said that in the first message, I'm talking
> >about I/O memory here.  In any event though, we shouldn't be
> >mapping anything with the WC attribute that the EFI memory
> >map doesn't allow.
>
> Does the EFI memory map help here?  Is I/O memory included
> in the efi memory map?  What about the hot-plug case ... the
> EFI memory map isn't updated for new i/o memory that appears
> when you plug in a new card?

I think it's supposed to appear as memory mapped I/O or memory mapped I/O port 
space in the EFI memory map.  Shouldn't the map be updated in the case of 
hotplug?

Jesse


* RE: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (11 preceding siblings ...)
  2005-01-19 18:21 ` Jesse Barnes
@ 2005-01-19 18:38 ` Luck, Tony
  2005-01-19 19:33 ` Bjorn Helgaas
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Luck, Tony @ 2005-01-19 18:38 UTC (permalink / raw)
  To: linux-ia64

>> Does the EFI memory map help here?  Is I/O memory included
>> in the efi memory map?  What about the hot-plug case ... the
>> EFI memory map isn't updated for new i/o memory that appears
>> when you plug in a new card?
>
>I think it's supposed to appear as memory mapped I/O or memory 
>mapped I/O port space in the EFI memory map.  Shouldn't the map
>be updated in the case of hotplug?

I'm not an EFI expert ... but it looks like the i/f to get the
memory map is only part of the boot time services, not the runtime
services.  If that's right, then I don't think there's a way for
it to be updated.

-Tony


* RE: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (12 preceding siblings ...)
  2005-01-19 18:38 ` Luck, Tony
@ 2005-01-19 19:33 ` Bjorn Helgaas
  2005-01-19 21:51 ` Jesse Barnes
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Bjorn Helgaas @ 2005-01-19 19:33 UTC (permalink / raw)
  To: linux-ia64

On Wed, 2005-01-19 at 10:38 -0800, Luck, Tony wrote:
> >> Does the EFI memory map help here?  Is I/O memory included
> >> in the efi memory map?  What about the hot-plug case ... the
> >> EFI memory map isn't updated for new i/o memory that appears
> >> when you plug in a new card?
> >
> >I think it's supposed to appear as memory mapped I/O or memory 
> >mapped I/O port space in the EFI memory map.  Shouldn't the map
> >be updated in the case of hotplug?
> 
> I'm not an EFI expert ... but it looks like the i/f to get the
> memory map is only part of the boot time services, not the runtime
> services.  If that's right, then I don't think there's a way for
> it to be updated.

That's correct.  The EFI memory map is only a boot-time interface.
You have to learn about hot-plug stuff via ACPI.
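
To illustrate the consequence: a lookup against a map captured once at
boot simply finds nothing for an address range that shows up later.
The descriptor layout and addresses below are invented for the example
and the function names are hypothetical; the lookup is only meant to
mimic the general shape of an attribute query that returns 0 when no
descriptor covers the address.

```c
#include <stddef.h>
#include <stdint.h>

#define EFI_MEMORY_UC 0x1ULL
#define EFI_MEMORY_WC 0x2ULL

/* Simplified stand-in for an EFI memory descriptor. */
struct md_sketch {
	uint64_t phys_start;
	uint64_t num_bytes;
	uint64_t attribute;
};

/* Made-up snapshot, as it might be captured once at boot. */
static const struct md_sketch boot_map[] = {
	{ 0x00000000ULL, 0x10000000ULL, EFI_MEMORY_UC | EFI_MEMORY_WC },
	{ 0x80000000ULL, 0x01000000ULL, EFI_MEMORY_UC },
};

/* Attributes for paddr, or 0 when no descriptor covers it. */
static uint64_t map_attributes(uint64_t paddr)
{
	size_t i;

	for (i = 0; i < sizeof(boot_map) / sizeof(boot_map[0]); i++) {
		const struct md_sketch *md = &boot_map[i];

		if (paddr >= md->phys_start &&
		    paddr - md->phys_start < md->num_bytes)
			return md->attribute;
	}
	return 0;
}
```

A card hot-plugged at, say, 0xc0000000 is not described here, so a
"WC allowed?" test against the snapshot answers no and the mapping
would end up uncached unless the information comes from elsewhere.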



* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (13 preceding siblings ...)
  2005-01-19 19:33 ` Bjorn Helgaas
@ 2005-01-19 21:51 ` Jesse Barnes
  2005-01-19 22:00 ` Luck, Tony
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-19 21:51 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: text/plain, Size: 550 bytes --]

On Wednesday, January 19, 2005 10:04 am, David Mosberger wrote:
> Adding new routines for supporting this sounds reasonable, but I think
> it needs to be added to linux/efi.h, since EFI isn't ia64-specific.

Something like this then?  Works for my test cases.

Add a new EFI memory map checking function called efi_range_is_wc for checking 
if address ranges can be mapped with the WC (write coalescing) attribute.  
Useful for fbmem, userspace PCI mapping, and other places where write 
combining might be beneficial but unavailable.

Thanks,
Jesse

[-- Attachment #2: efi-check-wc-range-3.patch --]
[-- Type: text/plain, Size: 2365 bytes --]

===== arch/ia64/pci/pci.c 1.65 vs edited =====
--- 1.65/arch/ia64/pci/pci.c	2005-01-12 10:08:48 -08:00
+++ edited/arch/ia64/pci/pci.c	2005-01-19 11:09:32 -08:00
@@ -539,7 +539,8 @@
 	 */
 	vma->vm_flags |= (VM_SHM | VM_RESERVED | VM_IO);
 
-	if (write_combine)
+	if (write_combine && efi_range_is_wc(vma->vm_start,
+					     vma->vm_end - vma->vm_start))
 		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
 	else
 		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
===== drivers/video/fbmem.c 1.148 vs edited =====
--- 1.148/drivers/video/fbmem.c	2005-01-05 15:46:41 -08:00
+++ edited/drivers/video/fbmem.c	2005-01-19 13:45:35 -08:00
@@ -35,6 +35,7 @@
 #include <linux/err.h>
 #include <linux/kernel.h>
 #include <linux/device.h>
+#include <linux/efi.h>
 
 #if defined(__mc68000__) || defined(CONFIG_APUS)
 #include <asm/setup.h>
@@ -950,9 +951,14 @@
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 #elif defined(__hppa__)
 	pgprot_val(vma->vm_page_prot) |= _PAGE_NO_CACHE;
-#elif defined(__ia64__) || defined(__arm__) || defined(__sh__) || \
+#elif defined(__arm__) || defined(__sh__) || \
       defined(__m32r__)
 	vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+#elif defined(__ia64)
+	if (efi_range_is_wc(vma->vm_start, vma->vm_end - vma->vm_start))
+		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+	else
+		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 #else
 #warning What do we have to do here??
 #endif
===== include/linux/efi.h 1.12 vs edited =====
--- 1.12/include/linux/efi.h	2004-11-18 23:03:10 -08:00
+++ edited/include/linux/efi.h	2005-01-19 13:41:44 -08:00
@@ -305,6 +305,27 @@
 extern int __init efi_set_rtc_mmss(unsigned long nowtime);
 extern struct efi_memory_map memmap;
 
+/**
+ * efi_range_is_wc - check the WC bit on an address range
+ * @start: starting kvirt address
+ * @len: length of range
+ *
+ * Consult the EFI memory map and make sure it's ok to set this range WC.
+ * Returns true or false.
+ */
+static inline int efi_range_is_wc(unsigned long start, unsigned long len)
+{
+	unsigned long i;
+
+	for (i = 0; i < len; i++) {
+		unsigned long paddr = __pa(start + i);
+		if (!(efi_mem_attributes(paddr) & EFI_MEMORY_WC))
+			return 0;
+	}
+	/* The range checked out */
+	return 1;
+}
+
 #ifdef CONFIG_EFI_PCDP
 extern int __init efi_setup_pcdp_console(char *);
 #endif


* RE: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (14 preceding siblings ...)
  2005-01-19 21:51 ` Jesse Barnes
@ 2005-01-19 22:00 ` Luck, Tony
  2005-01-19 22:03 ` Jesse Barnes
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Luck, Tony @ 2005-01-19 22:00 UTC (permalink / raw)
  To: linux-ia64

>On Wednesday, January 19, 2005 10:04 am, David Mosberger wrote:
>> Adding new routines for supporting this sounds reasonable, 
>but I think
>> it needs to be added to linux/efi.h, since EFI isn't ia64-specific.
>
>Something like this then?  Works for my test cases.
>
>Add a new EFI memory map checking function called 
>efi_range_is_wc for checking 
>if address ranges can be mapped with the WC (write coalescing) 
>attribute.  
>Useful for fbmem, userspace PCI mapping, and other places where write 
>combining might be beneficial but unavailable.

+	for (i = 0; i < len; i++) {
+		unsigned long paddr = __pa(start + i);
+		if (!(efi_mem_attributes(paddr) & EFI_MEMORY_WC))
+			return 0;
+	}

More EFI ignorance on my part ... do you really have to check
each *byte* of the address range, or is there some larger unit
that you could step by?  It doesn't seem rational that this
attribute could change for anything less than cache line size.

-Tony


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (15 preceding siblings ...)
  2005-01-19 22:00 ` Luck, Tony
@ 2005-01-19 22:03 ` Jesse Barnes
  2005-01-19 22:07 ` David Mosberger
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-19 22:03 UTC (permalink / raw)
  To: linux-ia64

On Wednesday, January 19, 2005 2:00 pm, Luck, Tony wrote:
> More EFI ignorance on my part ... do you really have to check
> each *byte* of the address range, or is there some larger unit
> that you could step by?  It doesn't seem rational that this
> attribute could change for anything less than cache line size.

I think I could stride by 1 << EFI_PAGE_SHIFT since the memory map is 
described in that granularity.

Jesse
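
A userspace sketch of the page-stride idea, working on physical
addresses directly (no __pa) and with a toy attribute lookup standing
in for the real EFI query; toy_mem_attributes and range_is_wc are
hypothetical names.  It also rounds the start down to an EFI page
boundary, which should only matter for callers passing unaligned
ranges.

```c
#include <stdint.h>

#define EFI_PAGE_SHIFT 12
#define EFI_PAGE_SIZE  (1ULL << EFI_PAGE_SHIFT)
#define EFI_MEMORY_UC  0x1ULL
#define EFI_MEMORY_WC  0x2ULL

/* Toy lookup: WC is allowed only in [0x10000, 0x20000). */
static uint64_t toy_mem_attributes(uint64_t paddr)
{
	if (paddr >= 0x10000 && paddr < 0x20000)
		return EFI_MEMORY_UC | EFI_MEMORY_WC;
	return EFI_MEMORY_UC;
}

/* Check every EFI page touched by [start, start + len). */
static int range_is_wc(uint64_t start, uint64_t len)
{
	uint64_t addr, end;

	if (len == 0)
		return 0;
	end = start + len;
	/* Begin at the page containing "start" so that the last partial
	 * page of an unaligned range is still checked. */
	for (addr = start & ~(EFI_PAGE_SIZE - 1); addr < end;
	     addr += EFI_PAGE_SIZE) {
		if (!(toy_mem_attributes(addr) & EFI_MEMORY_WC))
			return 0;
	}
	return 1;
}
```

For a 1 MB range this does 256 lookups rather than 1048576.  Walking
the memory descriptors themselves would need fewer still, but that is a
different interface from the per-address query used in this thread.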


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (16 preceding siblings ...)
  2005-01-19 22:03 ` Jesse Barnes
@ 2005-01-19 22:07 ` David Mosberger
  2005-01-19 22:16 ` Jesse Barnes
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: David Mosberger @ 2005-01-19 22:07 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 19 Jan 2005 13:51:45 -0800, Jesse Barnes <jbarnes@sgi.com> said:

  Jesse> Something like this then?

Shouldn't the #ifdef test be for CONFIG_EFI rather than ia64?

	--david


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (17 preceding siblings ...)
  2005-01-19 22:07 ` David Mosberger
@ 2005-01-19 22:16 ` Jesse Barnes
  2005-01-19 22:20 ` David Mosberger
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-19 22:16 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: text/plain, Size: 489 bytes --]

On Wednesday, January 19, 2005 2:07 pm, David Mosberger wrote:
> >>>>> On Wed, 19 Jan 2005 13:51:45 -0800, Jesse Barnes <jbarnes@sgi.com>
> >>>>> said:
>
>   Jesse> Something like this then?
>
> Shouldn't the #ifdef test be for CONFIG_EFI rather than ia64?

No, it still depends on pgprot_writecombine, which is ia64 specific.  x86 has 
its own way to set the wc bit and afaik doesn't have the same aliasing issues 
that ia64 has.

This one fixes the loop stride Tony pointed out.

Jesse

[-- Attachment #2: efi-check-wc-range-4.patch --]
[-- Type: text/plain, Size: 2406 bytes --]

===== arch/ia64/pci/pci.c 1.65 vs edited =====
--- 1.65/arch/ia64/pci/pci.c	2005-01-12 10:08:48 -08:00
+++ edited/arch/ia64/pci/pci.c	2005-01-19 11:09:32 -08:00
@@ -539,7 +539,8 @@
 	 */
 	vma->vm_flags |= (VM_SHM | VM_RESERVED | VM_IO);
 
-	if (write_combine)
+	if (write_combine && efi_range_is_wc(vma->vm_start,
+					     vma->vm_end - vma->vm_start))
 		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
 	else
 		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
===== drivers/video/fbmem.c 1.148 vs edited =====
--- 1.148/drivers/video/fbmem.c	2005-01-05 15:46:41 -08:00
+++ edited/drivers/video/fbmem.c	2005-01-19 14:15:41 -08:00
@@ -35,6 +35,7 @@
 #include <linux/err.h>
 #include <linux/kernel.h>
 #include <linux/device.h>
+#include <linux/efi.h>
 
 #if defined(__mc68000__) || defined(CONFIG_APUS)
 #include <asm/setup.h>
@@ -950,9 +951,13 @@
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 #elif defined(__hppa__)
 	pgprot_val(vma->vm_page_prot) |= _PAGE_NO_CACHE;
-#elif defined(__ia64__) || defined(__arm__) || defined(__sh__) || \
-      defined(__m32r__)
+#elif defined(__arm__) || defined(__sh__) || defined(__m32r__)
 	vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+#elif defined(__ia64)
+	if (efi_range_is_wc(vma->vm_start, vma->vm_end - vma->vm_start))
+		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+	else
+		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 #else
 #warning What do we have to do here??
 #endif
===== include/linux/efi.h 1.12 vs edited =====
--- 1.12/include/linux/efi.h	2004-11-18 23:03:10 -08:00
+++ edited/include/linux/efi.h	2005-01-19 14:04:14 -08:00
@@ -305,6 +305,27 @@
 extern int __init efi_set_rtc_mmss(unsigned long nowtime);
 extern struct efi_memory_map memmap;
 
+/**
+ * efi_range_is_wc - check the WC bit on an address range
+ * @start: starting kvirt address
+ * @len: length of range
+ *
+ * Consult the EFI memory map and make sure it's ok to set this range WC.
+ * Returns true or false.
+ */
+static inline int efi_range_is_wc(unsigned long start, unsigned long len)
+{
+	unsigned long i;
+
+	for (i = 0; i < len; i += (1UL << EFI_PAGE_SHIFT)) {
+		unsigned long paddr = __pa(start + i);
+		if (!(efi_mem_attributes(paddr) & EFI_MEMORY_WC))
+			return 0;
+	}
+	/* The range checked out */
+	return 1;
+}
+
 #ifdef CONFIG_EFI_PCDP
 extern int __init efi_setup_pcdp_console(char *);
 #endif


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (18 preceding siblings ...)
  2005-01-19 22:16 ` Jesse Barnes
@ 2005-01-19 22:20 ` David Mosberger
  2005-01-19 22:22 ` Jesse Barnes
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: David Mosberger @ 2005-01-19 22:20 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 19 Jan 2005 14:16:28 -0800, Jesse Barnes <jbarnes@sgi.com> said:

  Jesse> No, it still depends on pgprot_writecombine, which is ia64
  Jesse> specific.

Not exactly ia64-specific, but yeah, I see x86 doesn't have it (I
thought it did).

  Jesse> x86 has its own way to set the wc bit and afaik doesn't have
  Jesse> the same aliasing issues that ia64 has.

In my opinion, an EFI-based system should respect the
attribute-restrictions specified in the memory-map.

	--david


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (19 preceding siblings ...)
  2005-01-19 22:20 ` David Mosberger
@ 2005-01-19 22:22 ` Jesse Barnes
  2005-01-19 22:25 ` David Mosberger
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-19 22:22 UTC (permalink / raw)
  To: linux-ia64

On Wednesday, January 19, 2005 2:20 pm, David Mosberger wrote:
> In my opinion, an EFI-based system should respect the
> attribute-restrictions specified in the memory-map.

That makes sense to me.  Seems like we could use a whole new arch abstraction 
for setting memory attributes (wc and uncached at the very least) on page 
ranges, but that's a much bigger task.

Jesse


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (20 preceding siblings ...)
  2005-01-19 22:22 ` Jesse Barnes
@ 2005-01-19 22:25 ` David Mosberger
  2005-01-19 22:36 ` Jesse Barnes
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: David Mosberger @ 2005-01-19 22:25 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 19 Jan 2005 14:22:56 -0800, Jesse Barnes <jbarnes@sgi.com> said:

  Jesse> On Wednesday, January 19, 2005 2:20 pm, David Mosberger
  Jesse> wrote:
  >> In my opinion, an EFI-based system should respect the
  >> attribute-restrictions specified in the memory-map.

  Jesse> That makes sense to me.  Seems like we could use a whole new
  Jesse> arch abstraction for setting memory attributes (wc and
  Jesse> uncached at the very least) on page ranges, but that's a much
  Jesse> bigger task.

Absolutely.  Perhaps it would make sense to clean this up once the
hot-plug stuff has settled down?

	--david


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (21 preceding siblings ...)
  2005-01-19 22:25 ` David Mosberger
@ 2005-01-19 22:36 ` Jesse Barnes
  2005-01-19 22:39 ` David Mosberger
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-19 22:36 UTC (permalink / raw)
  To: linux-ia64

On Wednesday, January 19, 2005 2:25 pm, David Mosberger wrote:
> >>>>> On Wed, 19 Jan 2005 14:22:56 -0800, Jesse Barnes <jbarnes@sgi.com>
> >>>>> said:
>
>   Jesse> On Wednesday, January 19, 2005 2:20 pm, David Mosberger
>
>   Jesse> wrote:
>   >> In my opinion, an EFI-based system should respect the
>   >> attribute-restrictions specified in the memory-map.
>
>   Jesse> That makes sense to me.  Seems like we could use a whole new
>   Jesse> arch abstraction for setting memory attributes (wc and
>   Jesse> uncached at the very least) on page ranges, but that's a much
>   Jesse> bigger task.
>
> Absolutely.  Perhaps it would make sense to clean this up once the
> hot-plug stuff has settled down?

Sounds good, though that could be a long wait.  Should we add this new EFI 
call anyway?  Seems like it'll be useful regardless...

Thanks,
Jesse


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (22 preceding siblings ...)
  2005-01-19 22:36 ` Jesse Barnes
@ 2005-01-19 22:39 ` David Mosberger
  2005-01-19 22:53 ` Jesse Barnes
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: David Mosberger @ 2005-01-19 22:39 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 19 Jan 2005 14:36:41 -0800, Jesse Barnes <jbarnes@sgi.com> said:

  Jesse> Sounds good, though that could be a long wait.  Should we add
  Jesse> this new EFI call anyway?  Seems like it'll be useful
  Jesse> regardless...

I agree.

	--david


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (23 preceding siblings ...)
  2005-01-19 22:39 ` David Mosberger
@ 2005-01-19 22:53 ` Jesse Barnes
  2005-01-20  9:03 ` Jes Sorensen
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-19 22:53 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: text/plain, Size: 869 bytes --]

On Wednesday, January 19, 2005 2:39 pm, David Mosberger wrote:
>   Jesse> Sounds good, though that could be a long wait.  Should we add
>   Jesse> this new EFI call anyway?  Seems like it'll be useful
>   Jesse> regardless...
>
> I agree.

Ok, here you go Tony.  This one fixes the loop and also fixes drm_vm.c.  All 
of the bits aside from the efi.h bit are ia64 specific (either under 
arch/ia64 or __ia64__), so your tree is probably the right place for all of 
it.

This patch adds efi_range_is_wc() to efi.h.  It's used to determine whether an 
address range can be mapped with the write coalescing attribute.  It also 
fixes up some ia64 specific callers to use the new routine instead of 
unconditionally calling pgprot_writecombine, which can be dangerous if used 
on ranges that don't support it.
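
The range walk described above can be modelled in user space.  This is
only a sketch: model_desc, model_mem_attributes() and model_range_is_wc()
are hypothetical stand-ins for the EFI memory map, efi_mem_attributes()
and efi_range_is_wc(), and the model takes physical addresses directly
instead of going through __pa().  It uses a 64-bit offset so the loop
counter cannot overflow on large ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFI_PAGE_SHIFT 12       /* EFI pages are 4 KB */
#define EFI_MEMORY_UC  0x1ULL   /* attribute bits as defined by the EFI spec */
#define EFI_MEMORY_WC  0x2ULL
#define EFI_MEMORY_WB  0x8ULL

/* Hypothetical stand-in for one EFI memory descriptor. */
struct model_desc {
	uint64_t phys_start;    /* physical start address */
	uint64_t num_pages;     /* length in 4 KB EFI pages */
	uint64_t attribute;     /* supported attributes (EFI_MEMORY_*) */
};

/* Hypothetical stand-in for efi_mem_attributes(): 0 if nothing covers paddr. */
static uint64_t model_mem_attributes(const struct model_desc *map, size_t n,
				     uint64_t paddr)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t end = map[i].phys_start +
			       (map[i].num_pages << EFI_PAGE_SHIFT);
		if (paddr >= map[i].phys_start && paddr < end)
			return map[i].attribute;
	}
	return 0;
}

/* Returns 1 only if every EFI page in [start, start + len) allows WC. */
static int model_range_is_wc(const struct model_desc *map, size_t n,
			     uint64_t start, uint64_t len)
{
	for (uint64_t off = 0; off < len; off += 1ULL << EFI_PAGE_SHIFT) {
		if (!(model_mem_attributes(map, n, start + off) & EFI_MEMORY_WC))
			return 0;
	}
	return 1;
}
```

A caller would pick the write-combine attribute when this returns 1 and
fall back to uncached otherwise, which is the pattern the callers in the
patch follow.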

Signed-off-by: Jesse Barnes <jbarnes@sgi.com>

Thanks,
Jesse

[-- Attachment #2: efi-check-wc-range-6.patch --]
[-- Type: text/plain, Size: 3070 bytes --]

===== arch/ia64/pci/pci.c 1.65 vs edited =====
--- 1.65/arch/ia64/pci/pci.c	2005-01-12 10:08:48 -08:00
+++ edited/arch/ia64/pci/pci.c	2005-01-19 11:09:32 -08:00
@@ -539,7 +539,8 @@
 	 */
 	vma->vm_flags |= (VM_SHM | VM_RESERVED | VM_IO);
 
-	if (write_combine)
+	if (write_combine && efi_range_is_wc(vma->vm_start,
+					     vma->vm_end - vma->vm_start))
 		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
 	else
 		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
===== drivers/char/drm/drm_vm.c 1.36 vs edited =====
--- 1.36/drivers/char/drm/drm_vm.c	2004-11-04 02:35:43 -08:00
+++ edited/drivers/char/drm/drm_vm.c	2005-01-19 14:31:47 -08:00
@@ -612,8 +612,13 @@
 			vma->vm_flags |= VM_IO;	/* not in core dump */
 		}
 #if defined(__ia64__)
-		if (map->type != _DRM_AGP)
-			vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+		if (efi_range_is_wc(vma->vm_start, vma->vm_end -
+				    vma->vm_start))
+			vma->vm_page_prot =
+				pgprot_writecombine(vma->vm_page_prot);
+		else
+			vma->vm_page_prot =
+				pgprot_noncached(vma->vm_page_prot);
 #endif
 		offset = dev->driver->get_reg_ofs(dev);
 #ifdef __sparc__
===== drivers/video/fbmem.c 1.148 vs edited =====
--- 1.148/drivers/video/fbmem.c	2005-01-05 15:46:41 -08:00
+++ edited/drivers/video/fbmem.c	2005-01-19 14:52:06 -08:00
@@ -35,6 +35,7 @@
 #include <linux/err.h>
 #include <linux/kernel.h>
 #include <linux/device.h>
+#include <linux/efi.h>
 
 #if defined(__mc68000__) || defined(CONFIG_APUS)
 #include <asm/setup.h>
@@ -950,9 +951,13 @@
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 #elif defined(__hppa__)
 	pgprot_val(vma->vm_page_prot) |= _PAGE_NO_CACHE;
-#elif defined(__ia64__) || defined(__arm__) || defined(__sh__) || \
-      defined(__m32r__)
+#elif defined(__arm__) || defined(__sh__) || defined(__m32r__)
 	vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+#elif defined(__ia64__)
+	if (efi_range_is_wc(vma->vm_start, vma->vm_end - vma->vm_start))
+		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+	else
+		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 #else
 #warning What do we have to do here??
 #endif
===== include/linux/efi.h 1.12 vs edited =====
--- 1.12/include/linux/efi.h	2004-11-18 23:03:10 -08:00
+++ edited/include/linux/efi.h	2005-01-19 14:04:14 -08:00
@@ -305,6 +305,27 @@
 extern int __init efi_set_rtc_mmss(unsigned long nowtime);
 extern struct efi_memory_map memmap;
 
+/**
+ * efi_range_is_wc - check the WC bit on an address range
+ * @start: starting kvirt address
+ * @len: length of range
+ *
+ * Consult the EFI memory map and make sure it's ok to set this range WC.
+ * Returns true or false.
+ */
+static inline int efi_range_is_wc(unsigned long start, unsigned long len)
+{
+	unsigned long i;
+
+	for (i = 0; i < len; i += (1UL << EFI_PAGE_SHIFT)) {
+		unsigned long paddr = __pa(start + i);
+		if (!(efi_mem_attributes(paddr) & EFI_MEMORY_WC))
+			return 0;
+	}
+	/* The range checked out */
+	return 1;
+}
+
 #ifdef CONFIG_EFI_PCDP
 extern int __init efi_setup_pcdp_console(char *);
 #endif


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (24 preceding siblings ...)
  2005-01-19 22:53 ` Jesse Barnes
@ 2005-01-20  9:03 ` Jes Sorensen
  2005-01-20 13:43 ` Hugo Kohmann
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jes Sorensen @ 2005-01-20  9:03 UTC (permalink / raw)
  To: linux-ia64

>>>>> "David" == David Mosberger <davidm@napali.hpl.hp.com> writes:

>>>>> On Wed, 19 Jan 2005 09:28:27 -0800, Jesse Barnes <jbarnes@sgi.com> said:
Jesse> So I guess we need to add ia64_set_page_wc() and
Jesse> ia64_set_page_uc() functions that set the appropriate page bits
Jesse> or return errors if the EFI memory map says a particular mode
Jesse> isn't allowed?  How does that sound to you, Tony & David?
Jesse> Fortunately there aren't that many users of pgprot_writecombine
Jesse> so fixing them up should be easy.

David> I'm still not clear which case you're concerned about: mapping
David> I/O memory with WC or mapping real memory with WC.  For the
David> latter, I think we may need an API to allocate memory that can
David> be mapped WC (if possible at all).

Hmmm,

What about real memory and mapping it uncached? Do we need to play the
same trick before allowing it to be mapped uncached? and for IO memory
mapped uncached?

Trying to map real memory uncached was what made me stumble upon the
PREFETCH_VISIBILITY limitation in the PAL code.

Cheers,
Jes


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (25 preceding siblings ...)
  2005-01-20  9:03 ` Jes Sorensen
@ 2005-01-20 13:43 ` Hugo Kohmann
  2005-01-20 16:42 ` Jesse Barnes
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Hugo Kohmann @ 2005-01-20 13:43 UTC (permalink / raw)
  To: linux-ia64


FYI:


X86 and X86_64 kernels rely on MTRR to enable writecombining. X86 and 
X86_64 systems support WriteCombining using the PAT bit in the PTE, but 
this is not yet supported by Linux kernels. (There is no PAGE_PAT bit 
defined and the PAT MSR power on value does not include support for the WC 
attribute.) After changing the PAT MSR to enable WC, I can do:

#define _PAGE_PAT  0x80 /* for AMD64 and X86 */
  pgprot_val(vma->vm_page_prot) |= _PAGE_PAT;

to set up a user space write combine map (using mmap())

and

__ioremap(ioaddr, size, _PAGE_PAT);

to set up a kernel space write combine map.

I can send the code to change the PAT MSR if anybody needs it - but I 
guess this is the wrong interest group (As this only works for 
x86/x86_64)
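
For reference, the PAT lookup behind this can be sketched as follows.
The PAT, PCD and PWT bits of a 4 KB PTE form a 3-bit index into the
eight entries of the PAT MSR, and the power-on value of that MSR has no
WC entry.  The helper names are hypothetical, and reprogramming entry 4
is an assumption made for illustration: it is simply the entry that
_PAGE_PAT alone selects, not necessarily the one changed in the code
mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Memory type encodings used in PAT entries. */
enum {
	PAT_UC = 0x00, PAT_WC = 0x01, PAT_WT = 0x04,
	PAT_WP = 0x05, PAT_WB = 0x06, PAT_UC_MINUS = 0x07
};

/* Power-on PAT value: WB, WT, UC-, UC, repeated twice. */
#define PAT_POWER_ON 0x0007040600070406ULL

/* Cache-control bits of a 4 KB page table entry. */
#define PTE_PWT 0x008ULL
#define PTE_PCD 0x010ULL
#define PTE_PAT 0x080ULL        /* same value as _PAGE_PAT above */

/* Index into the PAT selected by a PTE: PAT:PCD:PWT. */
static unsigned pat_index(uint64_t pte)
{
	return ((pte & PTE_PAT) ? 4u : 0u) |
	       ((pte & PTE_PCD) ? 2u : 0u) |
	       ((pte & PTE_PWT) ? 1u : 0u);
}

/* Memory type stored in PAT entry idx (0..7). */
static unsigned pat_entry(uint64_t pat_msr, unsigned idx)
{
	return (unsigned)((pat_msr >> (idx * 8)) & 0x07);
}

/* Return a copy of pat_msr with entry idx replaced by type. */
static uint64_t pat_set_entry(uint64_t pat_msr, unsigned idx, unsigned type)
{
	unsigned shift = idx * 8;

	return (pat_msr & ~(0xFFULL << shift)) | ((uint64_t)type << shift);
}
```

With the power-on value, a PTE carrying only the PAT bit selects entry 4,
which is WB.  After pat_set_entry(PAT_POWER_ON, 4, PAT_WC) the same PTE
selects WC.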

Best regards

Hugo


On Wed, 19 Jan 2005, David Mosberger wrote:

>Date: Wed, 19 Jan 2005 14:20:21 -0800
>From: David Mosberger <davidm@napali.hpl.hp.com>
>Reply-To: davidm@hpl.hp.com
>To: Jesse Barnes <jbarnes@sgi.com>
>Cc: davidm@hpl.hp.com, Jim Hull <jim.hull@hp.com>, linux-ia64@vger.kernel.org,
>    'Tony Luck' <tony.luck@intel.com>
>Subject: Re: pgprot_writecombine & shub 1.x
>
>>>>>> On Wed, 19 Jan 2005 14:16:28 -0800, Jesse Barnes <jbarnes@sgi.com> said:
>
>  Jesse> No, it still depends on pgprot_writecombine, which is ia64
>  Jesse> specific.
>
> Not exactly ia64-specific, but yeah, I see x86 doesn't have it (I
> thought it did).
>
>  Jesse> x86 has its own way to set the wc bit and afaik doesn't have
>  Jesse> the same aliasing issues that ia64 has.
>
> In my opinion, an EFI-based system should respect the
> attribute-restrictions specified in the memory-map.
>
> 	--david


============================================
Hugo Kohmann                           |
Dolphin Interconnect Solutions AS      | E-mail:
P.O. Box 150 Oppsal                    | hugo@dolphinics.com
N-0619 Oslo, Norway                    | Web:
Tel:+47 23 16 71 83                    | http://www.dolphinics.com
Fax:+47 23 16 71 80                    | 
Visiting Address: Olaf Helsets vei 6   |


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (26 preceding siblings ...)
  2005-01-20 13:43 ` Hugo Kohmann
@ 2005-01-20 16:42 ` Jesse Barnes
  2005-01-20 16:45 ` Jesse Barnes
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-20 16:42 UTC (permalink / raw)
  To: linux-ia64

On Thursday, January 20, 2005 5:43 am, Hugo Kohmann wrote:
> X86 and X86_64 kernels rely on MTRR to enable writecombining. X86 and
> X86_64 systems support WriteCombining using the PAT bit in the PTE, but
> this is not yet supported by Linux kernels. (There is no PAGE_PAT bit
> defined and the PAT MSR power on value does not include support for the WC
> attribute.) After changing the PAT MSR to enable WC, I can do:
>
> #define _PAGE_PAT  0x80 /* for AMD64 and X86 */
>   pgprot_val(vma->vm_page_prot) |= _PAGE_PAT;
>
> to set up a user space write combine map (using mmap())
>
> and
>
> __ioremap(ioaddr, size, _PAGE_PAT);
>
> to set up a kernel space write combine map.
>
> I can send the code to change the PAT MSR if anybody needs it - but I
> guess this is the wrong interest group (As this only works for
> x86/x86_64)

Yeah, I think this is the wrong list for that, but it definitely sounds 
useful.  You should send the code to Andi Kleen and see if he's interested in 
adding support for WC in the x86-64 codebase.

Jesse


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (27 preceding siblings ...)
  2005-01-20 16:42 ` Jesse Barnes
@ 2005-01-20 16:45 ` Jesse Barnes
  2005-01-20 17:17 ` David Mosberger
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 33+ messages in thread
From: Jesse Barnes @ 2005-01-20 16:45 UTC (permalink / raw)
  To: linux-ia64

On Thursday, January 20, 2005 1:03 am, Jes Sorensen wrote:
> What about real memory and mapping it uncached? Do we need to play the
> same trick before allowing it to be mapped uncached? and for IO memory
> mapped uncached?

Real memory that we map uncached or WC should be in its own granule.  Since 
we don't have an allocator for that yet, it's generally an unsafe thing to do 
(there are exceptions though, like the mspec driver).  And yes, we should 
probably be consulting the EFI memory map before setting any attributes on a 
page (i.e. at memory init time and whenever we change the pgprot bits), but 
since almost all conventional memory can be mapped uncached or cached, and I 
think all I/O memory can be mapped uncached, I didn't worry about those cases 
(well, that and I doubt that many EFI memory maps are up to the task--it 
would be a shame to break a bunch of otherwise working machines by forcing 
them to move to a more complete EFI memory map).
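
The granule constraint can be written down as a simple alignment check.
This is a sketch under stated assumptions: the 16 MB granule size is
just one possible configuration, and the helper names are hypothetical.
The idea is that a range given a different attribute must cover whole
granules, so that no granule is shared with memory mapped cached.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_SHIFT 24                        /* assumed: 16 MB granules */
#define GRANULE_SIZE  (1ULL << GRANULE_SHIFT)

/* Round addr down to the start of its granule. */
static uint64_t granule_down(uint64_t addr)
{
	return addr & ~(GRANULE_SIZE - 1);
}

/* Round addr up to the next granule boundary (no change if aligned). */
static uint64_t granule_up(uint64_t addr)
{
	return (addr + GRANULE_SIZE - 1) & ~(GRANULE_SIZE - 1);
}

/*
 * Returns 1 when [start, start + len) consists of whole granules only,
 * i.e. no granule is shared with memory under another attribute.
 */
static int range_owns_its_granules(uint64_t start, uint64_t len)
{
	return len != 0 &&
	       granule_down(start) == start &&
	       granule_up(start + len) == start + len;
}
```

An allocator for uncached or WC memory would hand out pages only from
ranges that pass a check like this one.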

> Trying to map real memory uncached was what made me stumble upon the
> PREFETCH_VISIBILITY limitation in the PAL code.

Ah, I wondered about that :)

Jesse


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (28 preceding siblings ...)
  2005-01-20 16:45 ` Jesse Barnes
@ 2005-01-20 17:17 ` David Mosberger
  2005-01-21  9:00 ` Jes Sorensen
  2005-01-21  9:01 ` Jes Sorensen
  31 siblings, 0 replies; 33+ messages in thread
From: David Mosberger @ 2005-01-20 17:17 UTC (permalink / raw)
  To: linux-ia64

>>>>> On 20 Jan 2005 04:03:19 -0500, Jes Sorensen <jes@wildopensource.com> said:

  Jes> What about real memory and mapping it uncached? Do we need to
  Jes> play the same trick before allowing it to be mapped uncached?

It needs to be done on a granule-boundary, too, yes.

  Jes> Trying to map real memory uncached was what made me stumble
  Jes> upon the PREFETCH_VISIBILITY limitation in the PAL code.

I assume you're using this as part of the procedure outlined in
Section 4.4.11.2 of IASD v2 "Physical Addressing Attribute
Transition", right?

	--david


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (29 preceding siblings ...)
  2005-01-20 17:17 ` David Mosberger
@ 2005-01-21  9:00 ` Jes Sorensen
  2005-01-21  9:01 ` Jes Sorensen
  31 siblings, 0 replies; 33+ messages in thread
From: Jes Sorensen @ 2005-01-21  9:00 UTC (permalink / raw)
  To: linux-ia64

>>>>> "David" == David Mosberger <davidm@napali.hpl.hp.com> writes:

>>>>> On 20 Jan 2005 04:03:19 -0500, Jes Sorensen <jes@wildopensource.com> said:
Jes> Trying to map real memory uncached was what made me stumble upon
Jes> the PREFETCH_VISIBILITY limitation in the PAL code.

David> I assume you're using this as part of the procedure outlined in
David> Section 4.4.11.2 of IASD v2 "Physical Addressing Attribute
David> Transition", right?

The section code named 'clear as pea soup'? ;-) Yes, that's exactly
where I ran into it.

Cheers,
Jes


* Re: pgprot_writecombine & shub 1.x
  2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
                   ` (30 preceding siblings ...)
  2005-01-21  9:00 ` Jes Sorensen
@ 2005-01-21  9:01 ` Jes Sorensen
  31 siblings, 0 replies; 33+ messages in thread
From: Jes Sorensen @ 2005-01-21  9:01 UTC (permalink / raw)
  To: linux-ia64

>>>>> "Jesse" == Jesse Barnes <jbarnes@engr.sgi.com> writes:

Jesse> On Thursday, January 20, 2005 1:03 am, Jes Sorensen wrote:
>> What about real memory and mapping it uncached? Do we need to play
>> the same trick before allowing it to be mapped uncached? and for IO
>> memory mapped uncached?

Jesse> Real memory that we map uncached or WC should be in its own
Jesse> granule.  Since we don't have an allocator for that yet, it's
Jesse> generally an unsafe thing to do (there are exceptions though,
Jesse> like the mspec driver).  And yes, we should probably be
Jesse> consulting the EFI memory map before setting any attributes on
Jesse> a page (i.e. at memory init time and whenever we change the
Jesse> pgprot bits), but since almost all conventional memory can be
Jesse> mapped uncached or cached, and I think all I/O memory can be
Jesse> mapped uncached, I didn't worry about those cases (well, that
Jesse> and I doubt that many EFI memory maps are up to the task--it
Jesse> would be a shame to break a bunch of otherwise working machines
Jesse> by forcing them to move to a more complete EFI memory map).

I am working on an allocator for that very reason, however I plan to
make it more generic so one can stick more than just uncached memory
into it (in its own separate pool of course).

Cheers,
Jes


end of thread, other threads:[~2005-01-21  9:01 UTC | newest]

Thread overview: 33+ messages
2005-01-11 20:00 pgprot_writecombine & shub 1.x Jesse Barnes
2005-01-11 22:12 ` David Mosberger
2005-01-11 22:35 ` Jesse Barnes
2005-01-12 18:51 ` Jim Hull
2005-01-12 19:31 ` Hugo Kohmann
2005-01-12 19:32 ` Jesse Barnes
2005-01-12 21:54 ` David Mosberger
2005-01-19 17:28 ` Jesse Barnes
2005-01-19 17:53 ` David Mosberger
2005-01-19 17:56 ` Jesse Barnes
2005-01-19 18:04 ` David Mosberger
2005-01-19 18:16 ` Luck, Tony
2005-01-19 18:21 ` Jesse Barnes
2005-01-19 18:38 ` Luck, Tony
2005-01-19 19:33 ` Bjorn Helgaas
2005-01-19 21:51 ` Jesse Barnes
2005-01-19 22:00 ` Luck, Tony
2005-01-19 22:03 ` Jesse Barnes
2005-01-19 22:07 ` David Mosberger
2005-01-19 22:16 ` Jesse Barnes
2005-01-19 22:20 ` David Mosberger
2005-01-19 22:22 ` Jesse Barnes
2005-01-19 22:25 ` David Mosberger
2005-01-19 22:36 ` Jesse Barnes
2005-01-19 22:39 ` David Mosberger
2005-01-19 22:53 ` Jesse Barnes
2005-01-20  9:03 ` Jes Sorensen
2005-01-20 13:43 ` Hugo Kohmann
2005-01-20 16:42 ` Jesse Barnes
2005-01-20 16:45 ` Jesse Barnes
2005-01-20 17:17 ` David Mosberger
2005-01-21  9:00 ` Jes Sorensen
2005-01-21  9:01 ` Jes Sorensen
