linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC Patch] use MTRR for write combining if PAT is not available
@ 2009-10-10  1:22 Thomas Schlichter
  2009-10-10  4:24 ` Arjan van de Ven
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-10  1:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Thomas Hellstrom, Arjan van de Ven

[-- Attachment #1: Type: Text/Plain, Size: 1494 bytes --]

Hi,

I've found a problem with X.org not setting up MTRR for the framebuffer 
memory. After I investigated I think this is not a X.org problem, but a kernel 
issue.

X.org uses libpciaccess to map the framebuffer memory. This library opens 
/sys/bus/pci/devices/*/resource0_wc and mmaps the memory. Unfortunately, the 
kernel only enables write combining if PAT is enabled, if it is not, the 
memory is mmapped uncached. But Xorg (respectively libpciaccess) thinks it was 
successfully mapped with write combining enabled and thus does not 
additionally set up MTRR entries.

The corresponding libpciaccess code can be found here:
  http://cgit.freedesktop.org/xorg/lib/libpciaccess/tree/src/linux_sysfs.c#n501

If the kernel behavior is intentional and X.org should always set up MTRR 
entries, why should it use /sys/.../resource0_wc at all? I think there are 2 
possibilities to make the kernel behavior consistent:

 1. The mmap_wc should fail if PAT is not enabled.
    (libpciaccess will then map the framebuffer uncached and set up
     MTRR entries)
 2. Use MTRR to enable write combining if PAT is not available.

In an earlier thread about ioremap_wc, Arjan van de Ven wrote that option 2 is 
preferred over option 1:

    http://lkml.indiana.edu/hypermail/linux/kernel/0805.3/2925.html

So, I've created the attached patch implementing option 2. For me this solves 
the problem with the slow Video playback due to not correctly set up MTRR 
entries.

Kind regards,
Thomas Schlichter

[-- Attachment #2: 0001-Use-MTRR-for-write-combining-mmap-ioremap-if-PAT-is-.patch --]
[-- Type: text/x-patch, Size: 3346 bytes --]

From 19fb39061825a0110d1a4feb3f83dfa3f09fb738 Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Thu, 8 Oct 2009 21:24:07 +0200
Subject: [PATCH] Use MTRR for write combining mmap/ioremap if PAT is not available

X.org uses libpciaccess which tries to mmap with write combining enabled via
/sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, we
fall back to uncached mmap. Then libpciaccess thinks it succeeded mapping
with write combining anabled and does not set up suited MTRR entries. ;-(

So when falling back to uncached mapping, we better try to set up MTRR
entries automatically. To match this modified PCI mmap behavior, also
ioremap_wc and set_memory_wc are adjusted.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/mm/ioremap.c  |   14 +++++++++-----
 arch/x86/mm/pageattr.c |   10 ++++++++--
 arch/x86/pci/i386.c    |    6 ++++++
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 334e63c..1a73bbc 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -21,6 +21,7 @@
 #include <asm/tlbflush.h>
 #include <asm/pgalloc.h>
 #include <asm/pat.h>
+#include <asm/mtrr.h>
 
 #include "physaddr.h"
 
@@ -268,11 +269,14 @@ EXPORT_SYMBOL(ioremap_nocache);
  */
 void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 {
-	if (pat_enabled)
-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
-					__builtin_return_address(0));
-	else
-		return ioremap_nocache(phys_addr, size);
+	if (!pat_enabled) {
+		void __iomem *ret = ioremap_nocache(phys_addr, size);
+		if (ret)
+			mtrr_add(phys_addr, size, MTRR_TYPE_WRCOMB, false);
+		return ret;
+	}
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
+				__builtin_return_address(0));
 }
 EXPORT_SYMBOL(ioremap_wc);
 
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index dd38bfb..7f3a85b 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -23,6 +23,7 @@
 #include <asm/pgalloc.h>
 #include <asm/proto.h>
 #include <asm/pat.h>
+#include <asm/mtrr.h>
 
 /*
  * The current flushing context - we pass it instead of 5 arguments:
@@ -1010,8 +1011,13 @@ int set_memory_wc(unsigned long addr, int numpages)
 {
 	int ret;
 
-	if (!pat_enabled)
-		return set_memory_uc(addr, numpages);
+	if (!pat_enabled) {
+		ret = set_memory_uc(addr, numpages);
+		if (!ret)
+			mtrr_add(__pa(addr), numpages * PAGE_SIZE,
+				 MTRR_TYPE_WRCOMB, false);
+		return ret;
+	}
 
 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
 		_PAGE_CACHE_WC, NULL);
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index b22d13b..06cf678 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -33,6 +33,7 @@
 #include <linux/bootmem.h>
 
 #include <asm/pat.h>
+#include <asm/mtrr.h>
 #include <asm/e820.h>
 #include <asm/pci_x86.h>
 #include <asm/io_apic.h>
@@ -301,5 +302,10 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 
 	vma->vm_ops = &pci_mmap_ops;
 
+	if (!pat_enabled && write_combine)
+		mtrr_add_page(vma->vm_pgoff,
+			     (vma->vm_end - vma->vm_start) >> PAGE_SHIFT,
+			      MTRR_TYPE_WRCOMB, false);
+
 	return 0;
 }
-- 
1.6.4.4


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-10  1:22 [RFC Patch] use MTRR for write combining if PAT is not available Thomas Schlichter
@ 2009-10-10  4:24 ` Arjan van de Ven
  2009-10-10  8:31   ` Thomas Schlichter
  2009-10-11 18:51   ` Henrique de Moraes Holschuh
  0 siblings, 2 replies; 60+ messages in thread
From: Arjan van de Ven @ 2009-10-10  4:24 UTC (permalink / raw)
  To: Thomas Schlichter; +Cc: linux-kernel, Thomas Hellstrom

Thomas Schlichter wrote:
> Hi,
> 
> I've found a problem with X.org not setting up MTRR for the framebuffer 
> memory. After I investigated I think this is not a X.org problem, but a kernel 
> issue.

is there any CPU left that does not support PAT ?

eg in other words, should we try instead to get PAT working for you?

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-10  4:24 ` Arjan van de Ven
@ 2009-10-10  8:31   ` Thomas Schlichter
  2009-10-10 15:45     ` Arjan van de Ven
  2009-10-11 18:51   ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-10  8:31 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, Thomas Hellstrom

Am Samstag 10 Oktober 2009 06:24:41 schrieb Arjan van de Ven:
> Thomas Schlichter wrote:
> > Hi,
> >
> > I've found a problem with X.org not setting up MTRR for the framebuffer
> > memory. After I investigated I think this is not a X.org problem, but a
> > kernel issue.
> 
> is there any CPU left that does not support PAT ?
> 
> eg in other words, should we try instead to get PAT working for you?

Hmm, my CPU (VIA Nano) seems to support PAT, but Ubuntu disables this
feature in its kernel config. I'm using 9.04, but even 9.10 will have PAT
disabled:

http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-
karmic.git;a=blob;f=debian.master/config/config.common.ubuntu;h=a43cbabd02bfdf5412de6fea232db86aa5d3e742;hb=HEAD

The patch I sent does nothing if PAT is enabled, so it should be quite
safe for this case. And for the case when PAT is not enabled, it simply
tries to set up MTRR entries, if this fails it will still behave like
it does today...

So, convincing Ubuntu to enable PAT would be a workaround, but I think
the problem would still exist and should be fixed.

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-10  8:31   ` Thomas Schlichter
@ 2009-10-10 15:45     ` Arjan van de Ven
  2009-10-10 17:50       ` Roland Dreier
                         ` (2 more replies)
  0 siblings, 3 replies; 60+ messages in thread
From: Arjan van de Ven @ 2009-10-10 15:45 UTC (permalink / raw)
  To: Thomas Schlichter; +Cc: linux-kernel, Thomas Hellstrom

Thomas Schlichter wrote:

> Hmm, my CPU (VIA Nano) seems to support PAT, but Ubuntu disables this
> feature in its kernel config. I'm using 9.04, but even 9.10 will have PAT
> disabled:

that sounds like a bad idea. But to each distro their own I suppose.


> 
> http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-
> karmic.git;a=blob;f=debian.master/config/config.common.ubuntu;h=a43cbabd02bfdf5412de6fea232db86aa5d3e742;hb=HEAD

> So, convincing Ubuntu to enable PAT would be a workaround, but I think
> the problem would still exist and should be fixed.

I think it's time to get rid of CONFIG_PAT at all.. and just have it be there.
Making 10+ year old CPU features that are essential functionality a config option...
... not a good idea to be honest.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-10 15:45     ` Arjan van de Ven
@ 2009-10-10 17:50       ` Roland Dreier
  2009-10-11  7:40       ` Ingo Molnar
  2009-10-11  9:56       ` Thomas Schlichter
  2 siblings, 0 replies; 60+ messages in thread
From: Roland Dreier @ 2009-10-10 17:50 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Thomas Schlichter, linux-kernel, Thomas Hellstrom


 > > Hmm, my CPU (VIA Nano) seems to support PAT, but Ubuntu disables this
 > > feature in its kernel config. I'm using 9.04, but even 9.10 will have PAT
 > > disabled:
 > 
 > that sounds like a bad idea. But to each distro their own I suppose.

Seems that Ubuntu has caught this issue, and it will probably be
resolved: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/446480

 - R.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-10 15:45     ` Arjan van de Ven
  2009-10-10 17:50       ` Roland Dreier
@ 2009-10-11  7:40       ` Ingo Molnar
  2009-10-11  9:56       ` Thomas Schlichter
  2 siblings, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2009-10-11  7:40 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Thomas Schlichter, linux-kernel, Thomas Hellstrom, Yinghai Lu,
	H. Peter Anvin, Thomas Gleixner


* Arjan van de Ven <arjan@linux.intel.com> wrote:

> Thomas Schlichter wrote:
>
>> Hmm, my CPU (VIA Nano) seems to support PAT, but Ubuntu disables this
>> feature in its kernel config. I'm using 9.04, but even 9.10 will have PAT
>> disabled:
>
> that sounds like a bad idea. But to each distro their own I suppose.
>
>
>>
>> http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-
>> karmic.git;a=blob;f=debian.master/config/config.common.ubuntu;h=a43cbabd02bfdf5412de6fea232db86aa5d3e742;hb=HEAD
>
>> So, convincing Ubuntu to enable PAT would be a workaround, but I think
>> the problem would still exist and should be fixed.
>
> I think it's time to get rid of CONFIG_PAT at all.. and just have it 
> be there. Making 10+ year old CPU features that are essential 
> functionality a config option... ... not a good idea to be honest.

Mind sending a patch that makes the configurability of PAT dependent on 
EMBEDDED or so? There's still the nopat boot option so it can still be 
configured by a distro if it wishes to.

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-10 15:45     ` Arjan van de Ven
  2009-10-10 17:50       ` Roland Dreier
  2009-10-11  7:40       ` Ingo Molnar
@ 2009-10-11  9:56       ` Thomas Schlichter
  2 siblings, 0 replies; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-11  9:56 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: linux-kernel, Thomas Hellstrom

Arjan van de Ven wrote:
> Thomas Schlichter wrote:
> > So, convincing Ubuntu to enable PAT would be a workaround, but I think
> > the problem would still exist and should be fixed.
> 
> I think it's time to get rid of CONFIG_PAT at all.. and just have it be
>  there. Making 10+ year old CPU features that are essential functionality a
>  config option... ... not a good idea to be honest.

I do completely agree with you in this point.

But should the current behavior of not using MTRR for write combining when PAT 
is disabled (for whatever reason) stay? I think it is more consistent to do 
whatever is necessary to enable write combining when we are asked to.

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-10  4:24 ` Arjan van de Ven
  2009-10-10  8:31   ` Thomas Schlichter
@ 2009-10-11 18:51   ` Henrique de Moraes Holschuh
  2009-10-11 18:54     ` Arjan van de Ven
  1 sibling, 1 reply; 60+ messages in thread
From: Henrique de Moraes Holschuh @ 2009-10-11 18:51 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Thomas Schlichter, linux-kernel, Thomas Hellstrom

On Fri, 09 Oct 2009, Arjan van de Ven wrote:
> >I've found a problem with X.org not setting up MTRR for the
> >framebuffer memory. After I investigated I think this is not a
> >X.org problem, but a kernel issue.
> 
> is there any CPU left that does not support PAT ?

A few million of them.  Like every Centrino laptop out there, unless the
kernel blacklist for PAT on Intel CPUs is wrong.

> eg in other words, should we try instead to get PAT working for you?

Is it even possible?  Can you get Intel to fix my Pentium M?  I'd be quite
happy to test any microcode updates you send my way :)

As things stand right now, you can assume PAT will be there in 2015 or
thereabouts.  Machine models that can run Linux have a service life of
around 15 years.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-11 18:51   ` Henrique de Moraes Holschuh
@ 2009-10-11 18:54     ` Arjan van de Ven
  2009-10-11 20:19       ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 60+ messages in thread
From: Arjan van de Ven @ 2009-10-11 18:54 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Thomas Schlichter, linux-kernel, Thomas Hellstrom

Henrique de Moraes Holschuh wrote:
> On Fri, 09 Oct 2009, Arjan van de Ven wrote:
>>> I've found a problem with X.org not setting up MTRR for the
>>> framebuffer memory. After I investigated I think this is not a
>>> X.org problem, but a kernel issue.
>> is there any CPU left that does not support PAT ?
> 
> A few million of them.  Like every Centrino laptop out there, unless the
> kernel blacklist for PAT on Intel CPUs is wrong.

afaik it is extremely conservative right now. FAR too much so.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-11 18:54     ` Arjan van de Ven
@ 2009-10-11 20:19       ` Henrique de Moraes Holschuh
  2009-10-12 18:09         ` Robert Hancock
  0 siblings, 1 reply; 60+ messages in thread
From: Henrique de Moraes Holschuh @ 2009-10-11 20:19 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Thomas Schlichter, linux-kernel, Thomas Hellstrom

On Sun, 11 Oct 2009, Arjan van de Ven wrote:
> Henrique de Moraes Holschuh wrote:
> >On Fri, 09 Oct 2009, Arjan van de Ven wrote:
> >>>I've found a problem with X.org not setting up MTRR for the
> >>>framebuffer memory. After I investigated I think this is not a
> >>>X.org problem, but a kernel issue.
> >>is there any CPU left that does not support PAT ?
> >
> >A few million of them.  Like every Centrino laptop out there, unless the
> >kernel blacklist for PAT on Intel CPUs is wrong.
> 
> afaik it is extremely conservative right now. FAR too much so.

That *still* means we need the patch in this thread.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not   available
  2009-10-11 20:19       ` Henrique de Moraes Holschuh
@ 2009-10-12 18:09         ` Robert Hancock
  0 siblings, 0 replies; 60+ messages in thread
From: Robert Hancock @ 2009-10-12 18:09 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Arjan van de Ven, Thomas Schlichter, linux-kernel,
	Thomas Hellstrom

On 10/11/2009 02:19 PM, Henrique de Moraes Holschuh wrote:
> On Sun, 11 Oct 2009, Arjan van de Ven wrote:
>> Henrique de Moraes Holschuh wrote:
>>> On Fri, 09 Oct 2009, Arjan van de Ven wrote:
>>>>> I've found a problem with X.org not setting up MTRR for the
>>>>> framebuffer memory. After I investigated I think this is not a
>>>>> X.org problem, but a kernel issue.
>>>> is there any CPU left that does not support PAT ?
>>>
>>> A few million of them.  Like every Centrino laptop out there, unless the
>>> kernel blacklist for PAT on Intel CPUs is wrong.
>>
>> afaik it is extremely conservative right now. FAR too much so.
>
> That *still* means we need the patch in this thread.

I don't know that it is too conservative - the erratum which justifies 
the disablement (Core Solo/Duo erratum AE7, Pentium M erratum Y31, 
likely other IDs for it too) states that "A page whose PAT memory type 
is USWC while the relevant MTRR memory type is UC, the consolidated 
memory type may be treated as UC (rather than WC as specified in IA-32 
Intel® Architecture Software Developer's Manual). When this erratum 
occurs, the memory page may be as UC (rather than WC). This may have a 
negative performance impact." This is exactly the case being discussed 
here with the framebuffer which we're trying to set as WC with existing 
MTRRs set (or defaulting to) UC. This not only affects Pentium M but 
also the original Core Solo and Core Duo CPUs which surely is millions 
of existing laptops. It seems pretty clear-cut and there's no workaround 
identified, so I think we do indeed need a patch like this one.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
       [not found] <200910122032.52168.thomas.schlichter@web.de>
@ 2009-10-12 19:16 ` Thomas Hellstrom
  2009-10-12 19:45   ` Thomas Schlichter
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Hellstrom @ 2009-10-12 19:16 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	dri-devel@lists.sourceforge.net, Arjan van de Ven, Yinghai Lu,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Jeremy Fitzhardinge,
	Venkatesh Pallipadi, Suresh Siddha, Jan Beulich, Tejun Heo,
	Jesse Barnes, Henrique de Moraes Holschuh, Robert Hancock

Hi!

One problem with this patch is that it conflicts with the way graphics 
drivers traditionally handles
the situation, namely

1) Set up mtrr
2) Map. If fallback to uncached minus we will still have write-combined 
access.

I think mtrr-add used in this fashion will typically fail due to the 
alignment constraints. In particular,
for set_memory_wc() the typical usage pattern is a large number of pages 
in a fragmented physical address space.

So if we were to fix the problem with libpciaccess in the kernel, I 
think the best option would be to fail the user-space mapping when we 
can't make it write-combined.

Thanks,
Thomas

Thomas Schlichter wrote:
> Hi,
>
> when I first sent this E-Mail on Saturday, I unfortunately forgot to CC many 
> people. Now I used get_maintainer.pl to get the list of people that may want 
> to contribute to this topic.
>
> Because of this topic, there is already a patch from Arjan in the -tip tree to 
> make PAT and MTRR options only configurable if EMBEDDED and enabled by 
> default. I think this is a step in the right direction, but at least Henrique, 
> Robert and I seem to think something like the attached patch is still 
> required. What do you think?
>
> Kind regards,
>   Thomas
>
> -----------------------------------------------------------------------------
>
> Hi,
>
> I've found a problem with X.org not setting up MTRR for the framebuffer 
> memory. After I investigated I think this is not a X.org problem, but a kernel 
> issue.
>
> X.org uses libpciaccess to map the framebuffer memory. This library opens 
> /sys/bus/pci/devices/*/resource0_wc and mmaps the memory. Unfortunately, the 
> kernel only enables write combining if PAT is enabled, if it is not, the 
> memory is mmapped uncached. But Xorg (respectively libpciaccess) thinks it was 
> successfully mapped with write combining enabled and thus does not 
> additionally set up MTRR entries.
>
> The corresponding libpciaccess code can be found here:
>   http://cgit.freedesktop.org/xorg/lib/libpciaccess/tree/src/linux_sysfs.c#n501
>
> If the kernel behavior is intentional and X.org should always set up MTRR 
> entries, why should it use /sys/.../resource0_wc at all? I think there are 2 
> possibilities to make the kernel behavior consistent:
>
>  1. The mmap_wc should fail if PAT is not enabled.
>     (libpciaccess will then map the framebuffer uncached and set up
>      MTRR entries)
>  2. Use MTRR to enable write combining if PAT is not available.
>
> In an earlier thread about ioremap_wc, Arjan van de Ven wrote that option 2 is 
> preferred over option 1:
>
>     http://lkml.indiana.edu/hypermail/linux/kernel/0805.3/2925.html
>
> So, I've created the attached patch implementing option 2. For me this solves 
> the problem with the slow Video playback due to not correctly set up MTRR 
> entries.
>
> Kind regards,
> Thomas Schlichter
>   


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-12 19:16 ` Thomas Hellstrom
@ 2009-10-12 19:45   ` Thomas Schlichter
       [not found]     ` <1255378684.2063.5.camel@gaiman.anholt.net>
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-12 19:45 UTC (permalink / raw)
  To: Thomas Hellstrom
  Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	dri-devel@lists.sourceforge.net, Arjan van de Ven, Yinghai Lu,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Jeremy Fitzhardinge,
	Venkatesh Pallipadi, Suresh Siddha, Jan Beulich, Tejun Heo,
	Jesse Barnes, Henrique de Moraes Holschuh, Robert Hancock

[-- Attachment #1: Type: Text/Plain, Size: 1222 bytes --]

Thomas Hellstrom wrote:
> Hi!
> 
> One problem with this patch is that it conflicts with the way graphics
> drivers traditionally handles
> the situation, namely
> 
> 1) Set up mtrr
> 2) Map. If fallback to uncached minus we will still have write-combined
> access.
> 
> I think mtrr-add used in this fashion will typically fail due to the
> alignment constraints. In particular,
> for set_memory_wc() the typical usage pattern is a large number of pages
> in a fragmented physical address space.

Yes, maybe this patch tries to change current behavior too less. Indeed, if 
setting up MTRR entries it simply behaves as today, and userspace does not see 
that write combining is not correctly enabled.

> So if we were to fix the problem with libpciaccess in the kernel, I
> think the best option would be to fail the user-space mapping when we
> can't make it write-combined.

One idea to do this would be the attached patch. It simply returns an error if 
PAT is not available. It does not even try to use MTRR on its own. But maybe 
even better would be to combine both patches to something like this:
 1. try to use PAT
 2. if this fails try to set up MTRR
 3. if this also fails, return error

Kind regards,
  Thomas

[-- Attachment #2: 0001-Do-not-mmap-ioremap-uncached-when-WC-is-requested.patch --]
[-- Type: text/x-patch, Size: 2707 bytes --]

From afb48e1a1ef035c4580a5ce59a956b54a56a5c18 Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Thu, 8 Oct 2009 00:42:47 +0200
Subject: [PATCH] Do not mmap/ioremap uncached when WC is requested

X.org uses libpciaccess which tries to mmap with write combining enabled via
/sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, we
fall back to uncached mmap. Then libpciaccess thinks it succeeded mapping
with write combining anabled and does not set up suited MTRR entries. ;-(

So instead of silently falling back to uncached mapping, we better fail. In
this case libpciaccess mmaps via /sys/bus/pci/devices/*/resource0 and correctly
sets up MTRR entries.

Aditionally modify ioremap_wc and set_memory_wc to match this behavior.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/mm/ioremap.c  |   10 +++++-----
 arch/x86/mm/pageattr.c |    2 +-
 arch/x86/pci/i386.c    |    6 ++++++
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 334e63c..293581e 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -268,11 +268,11 @@ EXPORT_SYMBOL(ioremap_nocache);
  */
 void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 {
-	if (pat_enabled)
-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
-					__builtin_return_address(0));
-	else
-		return ioremap_nocache(phys_addr, size);
+	if (!pat_enabled)
+		return NULL;
+
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
+				__builtin_return_address(0));
 }
 EXPORT_SYMBOL(ioremap_wc);
 
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index dd38bfb..b1287a9 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1011,7 +1011,7 @@ int set_memory_wc(unsigned long addr, int numpages)
 	int ret;
 
 	if (!pat_enabled)
-		return set_memory_uc(addr, numpages);
+		return -EINVAL;
 
 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
 		_PAGE_CACHE_WC, NULL);
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index b22d13b..cf63f9c 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -281,6 +281,12 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 	if (mmap_state == pci_mmap_io)
 		return -EINVAL;
 
+	/* We cannot mmap write combining (WC) without PAT enabled.
+	 * So better fail and let the user map without WC and use MTRR.
+	 */
+	if (!pat_enabled && write_combine)
+		return -EINVAL;
+
 	prot = pgprot_val(vma->vm_page_prot);
 	if (pat_enabled && write_combine)
 		prot |= _PAGE_CACHE_WC;
-- 
1.6.4.4


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
@ 2009-10-13  7:34 Jan Beulich
  2009-10-13 21:29 ` Thomas Schlichter
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2009-10-13  7:34 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Suresh Siddha, Venkatesh Pallipadi, Tejun Heo, x86, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven, dri-devel, Ingo Molnar,
	linux-kernel, jbarnes, Thomas Hellstrom, H. Peter Anvin

>>> Thomas Schlichter <thomas.schlichter@web.de> 12.10.09 20:32 >>>
>@@ -268,11 +269,14 @@ EXPORT_SYMBOL(ioremap_nocache);
>  */
> void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
> {
>-	if (pat_enabled)
>-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
>-					__builtin_return_address(0));
>-	else
>-		return ioremap_nocache(phys_addr, size);
>+	if (!pat_enabled) {
>+		void __iomem *ret = ioremap_nocache(phys_addr, size);
>+		if (ret)
>+			mtrr_add(phys_addr, size, MTRR_TYPE_WRCOMB, false);

This won't work if phys_addr or size aren't page aligned, or if size isn't
a power of two.

>+		return ret;
>+	}
>+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
>+				__builtin_return_address(0));
> }
> EXPORT_SYMBOL(ioremap_wc);
> 
>@@ -1010,8 +1011,13 @@ int set_memory_wc(unsigned long addr, int numpages)
> {
> 	int ret;
> 
>-	if (!pat_enabled)
>-		return set_memory_uc(addr, numpages);
>+	if (!pat_enabled) {
>+		ret = set_memory_uc(addr, numpages);
>+		if (!ret)
>+			mtrr_add(__pa(addr), numpages * PAGE_SIZE,
>+				 MTRR_TYPE_WRCOMB, false);

Similarly, this requires numpages to be a power of two.

>+		return ret;
>+	}
> 
> 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
> 		_PAGE_CACHE_WC, NULL);

I think user mode code handles this by splitting the request and trying to
establish as many entries as possible (and those as big as possible).

Also, in all cases you pass 'false' for the 'increment' parameter, in order
to avoid having to tear down the established entries. While this may be
reasonable for kernel initiated mappings, I don't think that's acceptable
for such originating from user mode.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
       [not found]     ` <1255378684.2063.5.camel@gaiman.anholt.net>
@ 2009-10-13 21:05       ` Thomas Schlichter
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-13 21:05 UTC (permalink / raw)
  To: Eric Anholt
  Cc: Thomas Hellstrom, linux-kernel@vger.kernel.org, x86@kernel.org,
	dri-devel@lists.sourceforge.net, Arjan van de Ven, Yinghai Lu,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Jeremy Fitzhardinge,
	Venkatesh Pallipadi, Suresh Siddha, Jan Beulich, Tejun Heo,
	Jesse Barnes, Henrique de Moraes Holschuh, Robert Hancock

Eric Anholt wrote:
> On Mon, 2009-10-12 at 21:45 +0200, Thomas Schlichter wrote:
> > Yes, maybe this patch tries to change current behavior too less. Indeed,
> > if setting up MTRR entries it simply behaves as today, and userspace does
> > not see that write combining is not correctly enabled.
> 
> I'm uncomfortable with this patch because it doesn't appear to cover any
> callers of these functions inside of the kernel.  Have you audited them
> to make sure they can handle NULL being returned?

No, I haven't. And to be honest, I think the earlier patch that adds MTRR 
entries should be more safe, as it modifies behavior only slightly.

> Seems like we should install an MTRR instead.  Requiring userland to set
> up the MTRR on the kernel's behalf is backwards.

Exactly.

> With modern drivers we're installing any required MTRRs at module load
> in the kernel, and that's where things should be moving.

I think in general this is not possible, because not for all graphics chips 
there are kernel drivers (required). E.g. for my VIA VX800 based notebook, 
there is no kernel module that should be responsible to set-up the MTRR 
entries. Here it's up to the (userspace) X.org driver. It opens the 
/sys/bus/pci/devices/.../resource0_wc device and mmaps the framebuffer memory. 
With PAT this will set up a write combining memory area, and with my first 
patch this should also happen without PAT with MTRR.

> As long as
> this doesn't interfere with them, I'm OK.  And if some kernel driver is
> failing to install its MTRR, well, let's fix it.

Well, I think the mtrr_add inside mmap should not do any harm...

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-13  7:34 Jan Beulich
@ 2009-10-13 21:29 ` Thomas Schlichter
  2009-10-14  8:13   ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-13 21:29 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Suresh Siddha, Venkatesh Pallipadi, Tejun Heo, x86, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven, dri-devel, Ingo Molnar,
	linux-kernel, jbarnes, Thomas Hellstrom, H. Peter Anvin

Jan Beulich wrote:
> >>> Thomas Schlichter <thomas.schlichter@web.de> 12.10.09 20:32 >>>
> >+	if (!pat_enabled) {
> >+		void __iomem *ret = ioremap_nocache(phys_addr, size);
> >+		if (ret)
> >+			mtrr_add(phys_addr, size, MTRR_TYPE_WRCOMB, false);
> 
> This won't work if phys_addr or size aren't page aligned, or if size isn't
> a power of two.

No, at least the comments in mtrr_add and mtrr_check state that it is just 
required that phys_addr and size are multiple of PAGE_SIZE. And I'm not sure 
if it is always safe to round these up/down to the next PAGE boundary. If it 
is not, maybe it is better to fail...

> >+	if (!pat_enabled) {
> >+		ret = set_memory_uc(addr, numpages);
> >+		if (!ret)
> >+			mtrr_add(__pa(addr), numpages * PAGE_SIZE,
> >+				 MTRR_TYPE_WRCOMB, false);
> 
> Similarly, this requires numpages to be a power of two.

Same as above.

> I think user mode code handles this by splitting the request and trying to
> establish as many entries as possible (and those as big as possible).

Seems not required here...

> Also, in all cases you pass 'false' for the 'increment' parameter, in order
> to avoid having to tear down the established entries. While this may be
> reasonable for kernel initiated mappings, I don't think that's acceptable
> for such originating from user mode.

Well, therefore I would have to remember which mmap successfully set up which 
MTRR entries. And it must be ensured that for each mmap there always is a 
munmap. If this is indeed always ensured, and if it's worth to maintain the 
required data, I'd be happy to add a corresponding mtrr_del there.

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-13 21:29 ` Thomas Schlichter
@ 2009-10-14  8:13   ` Jan Beulich
  2009-10-14 19:14     ` Thomas Schlichter
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2009-10-14  8:13 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Suresh Siddha, Venkatesh Pallipadi, Tejun Heo, x86, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven, dri-devel, Ingo Molnar,
	linux-kernel, jbarnes, Thomas Hellstrom, H. Peter Anvin

>>> Thomas Schlichter <thomas.schlichter@web.de> 13.10.09 23:29 >>>
>Jan Beulich wrote:
>> >>> Thomas Schlichter <thomas.schlichter@web.de> 12.10.09 20:32 >>>
>> >+	if (!pat_enabled) {
>> >+		void __iomem *ret = ioremap_nocache(phys_addr, size);
>> >+		if (ret)
>> >+			mtrr_add(phys_addr, size, MTRR_TYPE_WRCOMB, false);
>> 
>> This won't work if phys_addr or size aren't page aligned, or if size isn't
>> a power of two.
>
>No, at least the comments in mtrr_add and mtrr_check state that it is just 
>required that phys_addr and size are multiple of PAGE_SIZE. And I'm not sure 
>if it is always safe to round these up/down to the next PAGE boundary. If it 
>is not, maybe it is better to fail...

That function isn't the limiting factor, generic_validate_add_page() is
what you need to look at (plus the spec on how the MTRR ranges are
calculated by the CPU from the base/mask register pairs).

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-14  8:13   ` Jan Beulich
@ 2009-10-14 19:14     ` Thomas Schlichter
  2009-10-15  7:48       ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-14 19:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Suresh Siddha, Venkatesh Pallipadi, Tejun Heo, x86, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven, dri-devel, Ingo Molnar,
	linux-kernel, jbarnes, Thomas Hellstrom, H. Peter Anvin

[-- Attachment #1: Type: text/plain, Size: 1043 bytes --]

Jan Beulich wrote:
> >>> Thomas Schlichter <thomas.schlichter@web.de> 13.10.09 23:29 >>>
> >No, at least the comments in mtrr_add and mtrr_check state that it is just
> >required that phys_addr and size are multiple of PAGE_SIZE. And I'm not
> > sure if it is always safe to round these up/down to the next PAGE
> > boundary. If it is not, maybe it is better to fail...
> 
> That function isn't the limiting factor, generic_validate_add_page() is
> what you need to look at (plus the spec on how the MTRR ranges are
> calculated by the CPU from the base/mask register pairs).

Yes, you are right. Sorry for not looking into generic_validate_add_page() 
before. Thank you for showing!

I added a function mtrr_add_unaligned() that tries to create as many MTRR 
entries as necessary, beginning with the biggest regions. It does not check 
the return values of each mtrr_add(), nor does it return the indexes of the 
created MTRR entries. So it seems to be only useful with increment=false. Or 
do you have a better idea?

Kind regards,
  Thomas

[-- Attachment #2: 0001-Add-new-mtrr_add_unaligned-function.patch --]
[-- Type: text/x-patch, Size: 2619 bytes --]

>From 3708f0f4c729de32445ba13ab16b6268920af0bb Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Wed, 14 Oct 2009 19:25:33 +0200
Subject: [PATCH 1/2] Add new mtrr_add_unaligned function

This function creates multiple MTRR entries for unaligned memory regions.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/include/asm/mtrr.h     |    6 ++++++
 arch/x86/kernel/cpu/mtrr/main.c |   25 +++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 4365ffd..0ad8e68 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -116,6 +116,8 @@ extern int mtrr_add(unsigned long base, unsigned long size,
 		    unsigned int type, bool increment);
 extern int mtrr_add_page(unsigned long base, unsigned long size,
 			 unsigned int type, bool increment);
+extern void mtrr_add_unaligned(unsigned long base, unsigned long size,
+			       unsigned int type, bool increment);
 extern int mtrr_del(int reg, unsigned long base, unsigned long size);
 extern int mtrr_del_page(int reg, unsigned long base, unsigned long size);
 extern void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi);
@@ -146,6 +148,10 @@ static inline int mtrr_add_page(unsigned long base, unsigned long size,
 {
     return -ENODEV;
 }
+static inline void mtrr_add_unaligned(unsigned long base, unsigned long size,
+				      unsigned int type, bool increment)
+{
+}
 static inline int mtrr_del(int reg, unsigned long base, unsigned long size)
 {
     return -ENODEV;
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 84e83de..7417ebb 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -487,6 +487,31 @@ int mtrr_add(unsigned long base, unsigned long size, unsigned int type,
 }
 EXPORT_SYMBOL(mtrr_add);
 
+void mtrr_add_unaligned(unsigned long base, unsigned long size,
+			unsigned int type, bool increment)
+{
+	unsigned long ptr1, ptr2, end = base + size;
+
+	// round down size to next power ot two
+	size = __rounddown_pow_of_two(size);
+
+	// accordingly align pointers
+	ptr1 = ptr2 = (base + size - 1) & ~(size - 1);
+
+	while (size >= PAGE_SIZE) {
+		if (ptr1 + size <= end) {
+			mtrr_add(ptr1, size, type, increment);
+			ptr1 += size;
+		}
+		if (base + size <= ptr2) {
+			ptr2 -= size;
+			mtrr_add(ptr2, size, type, increment);
+		}
+		size >>= 1;
+	}
+}
+EXPORT_SYMBOL(mtrr_add_unaligned);
+
 /**
  * mtrr_del_page - delete a memory type region
  * @reg: Register returned by mtrr_add
-- 
1.6.5


[-- Attachment #3: 0002-Use-MTRR-for-write-combining-mmap-ioremap-if-PAT-is-.patch --]
[-- Type: text/x-patch, Size: 3275 bytes --]

>From 3533b9e66c5144844a0b0864d0f57f43d57aea1a Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Thu, 8 Oct 2009 21:24:07 +0200
Subject: [PATCH 2/2] Use MTRR for write combining mmap/ioremap if PAT is not available

X.org uses libpciaccess which tries to mmap with write combining enabled via
/sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, we
fall back to uncached mmap. Then libpciaccess thinks it succeeded mapping
with write combining anabled and does not set up suited MTRR entries. ;-(

So when falling back to uncached mapping, we better try to set up MTRR
entries automatically. To match this modified PCI mmap behavior, also
ioremap_wc and set_memory_wc are adjusted.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/mm/ioremap.c  |   15 ++++++++++-----
 arch/x86/mm/pageattr.c |   10 ++++++++--
 arch/x86/pci/i386.c    |    6 ++++++
 3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 334e63c..abe40fa 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -21,6 +21,7 @@
 #include <asm/tlbflush.h>
 #include <asm/pgalloc.h>
 #include <asm/pat.h>
+#include <asm/mtrr.h>
 
 #include "physaddr.h"
 
@@ -268,11 +269,15 @@ EXPORT_SYMBOL(ioremap_nocache);
  */
 void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 {
-	if (pat_enabled)
-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
-					__builtin_return_address(0));
-	else
-		return ioremap_nocache(phys_addr, size);
+	if (!pat_enabled) {
+		void __iomem *ret = ioremap_nocache(phys_addr, size);
+		if (ret)
+			mtrr_add_unaligned(phys_addr, size,
+					   MTRR_TYPE_WRCOMB, false);
+		return ret;
+	}
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
+				__builtin_return_address(0));
 }
 EXPORT_SYMBOL(ioremap_wc);
 
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index dd38bfb..c25f697 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -23,6 +23,7 @@
 #include <asm/pgalloc.h>
 #include <asm/proto.h>
 #include <asm/pat.h>
+#include <asm/mtrr.h>
 
 /*
  * The current flushing context - we pass it instead of 5 arguments:
@@ -1010,8 +1011,13 @@ int set_memory_wc(unsigned long addr, int numpages)
 {
 	int ret;
 
-	if (!pat_enabled)
-		return set_memory_uc(addr, numpages);
+	if (!pat_enabled) {
+		ret = set_memory_uc(addr, numpages);
+		if (!ret)
+			mtrr_add_unaligned(__pa(addr), numpages * PAGE_SIZE,
+					   MTRR_TYPE_WRCOMB, false);
+		return ret;
+	}
 
 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
 		_PAGE_CACHE_WC, NULL);
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index b22d13b..8379e9b 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -33,6 +33,7 @@
 #include <linux/bootmem.h>
 
 #include <asm/pat.h>
+#include <asm/mtrr.h>
 #include <asm/e820.h>
 #include <asm/pci_x86.h>
 #include <asm/io_apic.h>
@@ -301,5 +302,10 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 
 	vma->vm_ops = &pci_mmap_ops;
 
+	if (!pat_enabled && write_combine)
+		mtrr_add_unaligned(vma->vm_pgoff << PAGE_SHIFT,
+				   vma->vm_end - vma->vm_start,
+				   MTRR_TYPE_WRCOMB, false);
+
 	return 0;
 }
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-14 19:14     ` Thomas Schlichter
@ 2009-10-15  7:48       ` Jan Beulich
  2009-10-17 19:48         ` Thomas Schlichter
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2009-10-15  7:48 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Suresh Siddha, Venkatesh Pallipadi, Tejun Heo, x86, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven, dri-devel, Ingo Molnar,
	linux-kernel, jbarnes, Thomas Hellstrom, H. Peter Anvin

>>> Thomas Schlichter <thomas.schlichter@web.de> 14.10.09 21:14 >>>
>I added a function mtrr_add_unaligned() that tries to create as many MTRR 
>entries as necessary, beginning with the biggest regions. It does not check 
>the return values of each mtrr_add(), nor does it return the indexes of the 
>created MTRR entries. So it seems to be only useful with increment=false. Or 
>do you have a better idea?

I don't have immediate thoughts on how to address this, but nevertheless
I continue to think that the issue must be solved in some way, even more
that now you may be leaking multiple MTRRs.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-15  7:48       ` Jan Beulich
@ 2009-10-17 19:48         ` Thomas Schlichter
  2009-10-19  9:16           ` Jan Beulich
  2009-10-19 13:36           ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-17 19:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Suresh Siddha, Venkatesh Pallipadi, Tejun Heo, x86, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven, dri-devel, Ingo Molnar,
	linux-kernel, jbarnes, Thomas Hellstrom, H. Peter Anvin

[-- Attachment #1: Type: Text/Plain, Size: 1181 bytes --]

Jan Beulich wrote:
> >>> Thomas Schlichter <thomas.schlichter@web.de> 14.10.09 21:14 >>>
> >
> >I added a function mtrr_add_unaligned() that tries to create as many MTRR
> >entries as necessary, beginning with the biggest regions. It does not
> > check the return values of each mtrr_add(), nor does it return the
> > indexes of the created MTRR entries. So it seems to be only useful with
> > increment=false. Or do you have a better idea?
> 
> I don't have immediate thoughts on how to address this, but nevertheless
> I continue to think that the issue must be solved in some way, even more
> that now you may be leaking multiple MTRRs.

OK, it required some changes on different edges, but now I have it...

Patches 0001-0003 introduce functionality/changes to the MTRR and sysfs 
subsystems. Patch 0004 is the main patch which sets up the MTRR entries when 
pci memory regions are mmapped. The Patches 0005-0006 also change ioremap() 
and set_mempry_wc() to set up MTRR entries. These two are completely optional, 
especially and there is currently no way to automatically remove MTRR entries 
set up with them.

What do you think about these patches?

Kind regards,
  Thomas

[-- Attachment #2: 0005-Use-MTRR-for-write-combining-ioremap.patch --]
[-- Type: text/x-patch, Size: 1368 bytes --]

>From 4f000ec18079924b1191118dccf468bba50c402d Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 21:29:43 +0200
Subject: [PATCH 5/6] Use MTRR for write combining ioremap

If PAT is not enabled, set up write combining MTRR entries in ioremap().

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/mm/ioremap.c |   15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 334e63c..abe40fa 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -21,6 +21,7 @@
 #include <asm/tlbflush.h>
 #include <asm/pgalloc.h>
 #include <asm/pat.h>
+#include <asm/mtrr.h>
 
 #include "physaddr.h"
 
@@ -268,11 +269,15 @@ EXPORT_SYMBOL(ioremap_nocache);
  */
 void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 {
-	if (pat_enabled)
-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
-					__builtin_return_address(0));
-	else
-		return ioremap_nocache(phys_addr, size);
+	if (!pat_enabled) {
+		void __iomem *ret = ioremap_nocache(phys_addr, size);
+		if (ret)
+			mtrr_add_unaligned(phys_addr, size,
+					   MTRR_TYPE_WRCOMB, false);
+		return ret;
+	}
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WC,
+				__builtin_return_address(0));
 }
 EXPORT_SYMBOL(ioremap_wc);
 
-- 
1.6.5


[-- Attachment #3: 0001-Add-new-mtrr_add_unaligned-function.patch --]
[-- Type: text/x-patch, Size: 2846 bytes --]

>From 9ef568a835c5033efb8032f1a4415ac2a128d4fc Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Wed, 14 Oct 2009 19:25:33 +0200
Subject: [PATCH 1/6] Add new mtrr_add_unaligned function

This function creates multiple MTRR entries for unaligned memory regions.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/include/asm/mtrr.h     |    8 ++++++++
 arch/x86/kernel/cpu/mtrr/main.c |   30 ++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 4365ffd..2e0d99d 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -116,6 +116,9 @@ extern int mtrr_add(unsigned long base, unsigned long size,
 		    unsigned int type, bool increment);
 extern int mtrr_add_page(unsigned long base, unsigned long size,
 			 unsigned int type, bool increment);
+extern void mtrr_add_unaligned(unsigned long base, unsigned long size,
+			       unsigned int type, bool increment,
+			       int *mtrr_usage);
 extern int mtrr_del(int reg, unsigned long base, unsigned long size);
 extern int mtrr_del_page(int reg, unsigned long base, unsigned long size);
 extern void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi);
@@ -146,6 +149,11 @@ static inline int mtrr_add_page(unsigned long base, unsigned long size,
 {
     return -ENODEV;
 }
+static inline void mtrr_add_unaligned(unsigned long base, unsigned long size,
+				      unsigned int type, bool increment,
+				      int *mtrr_usage)
+{
+}
 static inline int mtrr_del(int reg, unsigned long base, unsigned long size)
 {
     return -ENODEV;
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 84e83de..5bf769b 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -487,6 +487,36 @@ int mtrr_add(unsigned long base, unsigned long size, unsigned int type,
 }
 EXPORT_SYMBOL(mtrr_add);
 
+void mtrr_add_unaligned(unsigned long base, unsigned long size,
+			unsigned int type, bool increment, int *mtrr_usage)
+{
+	int i;
+	unsigned long ptr1, ptr2, end = base + size;
+
+	// round down size to next power ot two
+	size = __rounddown_pow_of_two(size);
+
+	// accordingly align pointers
+	ptr1 = ptr2 = (base + size - 1) & ~(size - 1);
+
+	while (size >= PAGE_SIZE) {
+		if (ptr1 + size <= end) {
+			i = mtrr_add(ptr1, size, type, increment);
+			if (i >= 0 && increment && mtrr_usage)
+				++mtrr_usage[i];
+			ptr1 += size;
+		}
+		if (base + size <= ptr2) {
+			ptr2 -= size;
+			i = mtrr_add(ptr2, size, type, increment);
+			if (i >= 0 && increment && mtrr_usage)
+				++mtrr_usage[i];
+		}
+		size >>= 1;
+	}
+}
+EXPORT_SYMBOL(mtrr_add_unaligned);
+
 /**
  * mtrr_del_page - delete a memory type region
  * @reg: Register returned by mtrr_add
-- 
1.6.5


[-- Attachment #4: 0002-Make-num_var_ranges-accessible-outside-MTRR-code.patch --]
[-- Type: text/x-patch, Size: 1636 bytes --]

>From 513955a4697328d319739799b2bca650df895689 Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 12:36:05 +0200
Subject: [PATCH 2/6] Make num_var_ranges accessible outside MTRR code

Code outside MTRR will have to remember which MTRR entries are used for a
given purpose. Therefore it has to access num_var_ranges which holds the
value of the maximum count of MTRR entries available. So we make this value
accessible from outside the core MTRR code.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/include/asm/mtrr.h     |    2 ++
 arch/x86/kernel/cpu/mtrr/mtrr.h |    1 -
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 2e0d99d..179a767 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -72,6 +72,8 @@ typedef __u8 mtrr_type;
 #define MTRR_NUM_FIXED_RANGES 88
 #define MTRR_MAX_VAR_RANGES 256
 
+extern unsigned int num_var_ranges;
+
 struct mtrr_state_type {
 	struct mtrr_var_range var_ranges[MTRR_MAX_VAR_RANGES];
 	mtrr_type fixed_ranges[MTRR_NUM_FIXED_RANGES];
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h
index a501dee..ba6a8a5 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.h
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.h
@@ -61,7 +61,6 @@ extern struct mtrr_ops *mtrr_if;
 #define is_cpu(vnd)	(mtrr_if && mtrr_if->vendor == X86_VENDOR_##vnd)
 #define use_intel()	(mtrr_if && mtrr_if->use_intel_if == 1)
 
-extern unsigned int num_var_ranges;
 extern u64 mtrr_tom2;
 extern struct mtrr_state_type mtrr_state;
 
-- 
1.6.5


[-- Attachment #5: 0003-Provide-per-file-private-data-for-bin-sysfs-files.patch --]
[-- Type: text/x-patch, Size: 2813 bytes --]

>From 0d7525045927c8056ff28dd9465766862faba227 Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 12:39:11 +0200
Subject: [PATCH 3/6] Provide per-file private data for bin sysfs files

For binary sysfs files, provide a per-file private field in
struct bin_buffer. Therefore this modified struct is exported in the
<linux/sysfs.h> header file. Additionally add a release() callback
that can be used to free the private data on file release.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 fs/sysfs/bin.c        |   18 ++++++++----------
 include/linux/sysfs.h |   14 ++++++++++++++
 2 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 60c702b..3647b7a 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -36,15 +36,6 @@
  */
 static DEFINE_MUTEX(sysfs_bin_lock);
 
-struct bin_buffer {
-	struct mutex			mutex;
-	void				*buffer;
-	int				mmapped;
-	const struct vm_operations_struct *vm_ops;
-	struct file			*file;
-	struct hlist_node		list;
-};
-
 static int
 fill_read(struct dentry *dentry, char *buffer, loff_t off, size_t count)
 {
@@ -438,14 +429,21 @@ static int open(struct inode * inode, struct file * file)
 static int release(struct inode * inode, struct file * file)
 {
 	struct bin_buffer *bb = file->private_data;
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
+	struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+	int rc = 0;
 
+	if (attr->release)
+		rc = attr->release(kobj, attr, file);
+	
 	mutex_lock(&sysfs_bin_lock);
 	hlist_del(&bb->list);
 	mutex_unlock(&sysfs_bin_lock);
 
 	kfree(bb->buffer);
 	kfree(bb);
-	return 0;
+	return rc;
 }
 
 const struct file_operations bin_fops = {
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 9d68fed..bcad8ad 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -15,6 +15,7 @@
 #include <linux/compiler.h>
 #include <linux/errno.h>
 #include <linux/list.h>
+#include <linux/mutex.h>
 #include <asm/atomic.h>
 
 struct kobject;
@@ -60,8 +61,19 @@ struct attribute_group {
 
 #define attr_name(_attr) (_attr).attr.name
 
+struct file;
 struct vm_area_struct;
 
+struct bin_buffer {
+	struct mutex			mutex;
+	void				*buffer;
+	void				*private;
+	int				mmapped;
+	const struct vm_operations_struct *vm_ops;
+	struct file			*file;
+	struct hlist_node		list;
+};
+
 struct bin_attribute {
 	struct attribute	attr;
 	size_t			size;
@@ -72,6 +84,8 @@ struct bin_attribute {
 			 char *, loff_t, size_t);
 	int (*mmap)(struct kobject *, struct bin_attribute *attr,
 		    struct vm_area_struct *vma);
+	int (*release)(struct kobject *, struct bin_attribute *attr,
+		      struct file *file);
 };
 
 struct sysfs_ops {
-- 
1.6.5


[-- Attachment #6: 0004-Use-MTRR-for-pci_mmap_resource_wc-if-PAT-is-not-avai.patch --]
[-- Type: text/x-patch, Size: 3530 bytes --]

>From d7d9760bf7052ee53e50beb7e89fc14371e33cfa Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 21:17:16 +0200
Subject: [PATCH 4/6] Use MTRR for pci_mmap_resource_wc if PAT is not available

X.org uses libpciaccess which tries to mmap with write combining enabled via
/sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, we
fall back to uncached mmap. Then libpciaccess thinks it succeeded mapping
with write combining anabled and does not set up suited MTRR entries. ;-(

So when falling back to uncached mapping, we better try to set up MTRR
entries automatically. When the resource file is closed, we remove the MTRR
entries again.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 drivers/pci/pci-sysfs.c |   45 +++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 0f6382f..604a063 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -23,6 +23,11 @@
 #include <linux/mm.h>
 #include <linux/capability.h>
 #include <linux/pci-aspm.h>
+#include <linux/fs.h>
+#ifdef CONFIG_X86
+#  include <asm/mtrr.h>
+#  include <asm/pat.h>
+#endif
 #include "pci.h"
 
 static int sysfs_initialized;	/* = 0 */
@@ -692,9 +697,10 @@ pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
 	struct pci_dev *pdev = to_pci_dev(container_of(kobj,
 						       struct device, kobj));
 	struct resource *res = (struct resource *)attr->private;
+	struct bin_buffer *bb = (struct bin_buffer *)vma->vm_file->private_data;
 	enum pci_mmap_state mmap_type;
 	resource_size_t start, end;
-	int i;
+	int rc, i;
 
 	for (i = 0; i < PCI_ROM_RESOURCE; i++)
 		if (res == &pdev->resource[i])
@@ -716,7 +722,18 @@ pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
 	if (res->flags & IORESOURCE_MEM && iomem_is_exclusive(start))
 		return -EINVAL;
 
-	return pci_mmap_page_range(pdev, vma, mmap_type, write_combine);
+	rc = pci_mmap_page_range(pdev, vma, mmap_type, write_combine);
+#ifdef CONFIG_X86
+	if (!rc && !pat_enabled && write_combine) {
+		if (!bb->private)
+			bb->private = kzalloc(num_var_ranges * sizeof(int),
+					      GFP_KERNEL);
+		mtrr_add_unaligned(vma->vm_pgoff << PAGE_SHIFT,
+				   vma->vm_end - vma->vm_start,
+				   MTRR_TYPE_WRCOMB, true, bb->private);
+	}
+#endif // CONFIG_X86
+	return rc;
 }
 
 static int
@@ -733,6 +750,29 @@ pci_mmap_resource_wc(struct kobject *kobj, struct bin_attribute *attr,
 	return pci_mmap_resource(kobj, attr, vma, 1);
 }
 
+static int
+pci_release(struct kobject *kobj, struct bin_attribute *attr, struct file *file)
+{
+#ifdef CONFIG_X86
+	struct bin_buffer *bb = (struct bin_buffer *)file->private_data;
+	int i, *mtrr_usage = (int *)bb->private;
+
+	if (!mtrr_usage)
+		return 0;
+
+	for (i = 0; i < num_var_ranges; ++i) {
+		while (mtrr_usage[i] > 0) {
+			mtrr_del(i, 0, 0);
+			--mtrr_usage[i];
+		}
+	}
+
+	kfree(bb->private);
+	bb->private = NULL;
+#endif // CONFIG_X86
+	return 0;
+}
+
 /**
  * pci_remove_resource_files - cleanup resource files
  * @pdev: dev to cleanup
@@ -782,6 +822,7 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
 			sprintf(res_attr_name, "resource%d", num);
 			res_attr->mmap = pci_mmap_resource_uc;
 		}
+		res_attr->release = pci_release;
 		res_attr->attr.name = res_attr_name;
 		res_attr->attr.mode = S_IRUSR | S_IWUSR;
 		res_attr->size = pci_resource_len(pdev, num);
-- 
1.6.5


[-- Attachment #7: 0006-Set-up-MTRR-entries-within-set_memory_wc.patch --]
[-- Type: text/x-patch, Size: 1247 bytes --]

>From 8eea502750fa642ff9f74e91466646e669eb5791 Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 21:36:37 +0200
Subject: [PATCH 6/6] Set up MTRR entries within set_memory_wc

If PAT is not enabled, set up write combining MTRR entries in set_memory_wc().

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/mm/pageattr.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index dd38bfb..c25f697 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -23,6 +23,7 @@
 #include <asm/pgalloc.h>
 #include <asm/proto.h>
 #include <asm/pat.h>
+#include <asm/mtrr.h>
 
 /*
  * The current flushing context - we pass it instead of 5 arguments:
@@ -1010,8 +1011,13 @@ int set_memory_wc(unsigned long addr, int numpages)
 {
 	int ret;
 
-	if (!pat_enabled)
-		return set_memory_uc(addr, numpages);
+	if (!pat_enabled) {
+		ret = set_memory_uc(addr, numpages);
+		if (!ret)
+			mtrr_add_unaligned(__pa(addr), numpages * PAGE_SIZE,
+					   MTRR_TYPE_WRCOMB, false);
+		return ret;
+	}
 
 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
 		_PAGE_CACHE_WC, NULL);
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-17 19:48         ` Thomas Schlichter
@ 2009-10-19  9:16           ` Jan Beulich
  2009-10-19 13:44             ` Suresh Siddha
  2009-10-19 13:36           ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2009-10-19  9:16 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Suresh Siddha, Venkatesh Pallipadi, Tejun Heo, x86, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven, dri-devel, Ingo Molnar,
	linux-kernel, jbarnes, Thomas Hellstrom, H. Peter Anvin

>>> Thomas Schlichter <thomas.schlichter@web.de> 17.10.09 21:48 >>>
>Patches 0001-0003 introduce functionality/changes to the MTRR and sysfs 
>subsystems. Patch 0004 is the main patch which sets up the MTRR entries when 
>pci memory regions are mmapped. The Patches 0005-0006 also change ioremap() 
>and set_mempry_wc() to set up MTRR entries. These two are completely optional, 
>especially and there is currently no way to automatically remove MTRR entries 
>set up with them.
>
>What do you think about these patches?

Functionality-wise this looks fine to me; whether the core sysfs changes
are acceptable I can't judge, though.

However, I would recommend folding the last two arguments of
mtrr_add_unaligned(), i.e. use mtrr_usage != NULL as the increment
argument passed to mtrr_add().

And patches 5 and 6 would apparently not build right now, as they're
failing to pass the new 5th argument to mtrr_add_unaligned().

Also, why do you add x86-specific code to drivers/pci-sysfs.c when the
function called there (pci_mmap_page_range()) already is arch-specific?
Moving your addition there would also allow covering the related
(legacy?) procfs based functionality... pci_release() could also become
arch-specific, with a fall-back definition to NULL.

Additionally, I would suggest making those code portions depend on
CONFIG_X86_MTRR rather than CONFIG_X86, so that you don't
needlessly (try to) allocate the mtrr_usage vector.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-17 19:48         ` Thomas Schlichter
  2009-10-19  9:16           ` Jan Beulich
@ 2009-10-19 13:36           ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 60+ messages in thread
From: Konrad Rzeszutek Wilk @ 2009-10-19 13:36 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jan Beulich, Thomas Hellstrom, Jeremy Fitzhardinge,
	H. Peter Anvin, Suresh Siddha, Arjan van de Ven, x86,
	linux-kernel, jbarnes, Ingo Molnar, Henrique de Moraes Holschuh,
	Venkatesh Pallipadi, Tejun Heo, Thomas Gleixner, dri-devel,
	Yinghai Lu, Robert Hancock

> X.org uses libpciaccess which tries to mmap with write combining enabled via
> /sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, we
> fall back to uncached mmap. Then libpciaccess thinks it succeeded mapping
> with write combining anabled and does not set up suited MTRR entries. ;-(

I am not familiar with libpciaccess, but I was wondering why that library
cannot realize it failed in its endeavour and use other means to accomplish
its goals?


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-19  9:16           ` Jan Beulich
@ 2009-10-19 13:44             ` Suresh Siddha
  2009-10-19 13:54               ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Suresh Siddha @ 2009-10-19 13:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Thomas Schlichter, Jeremy Fitzhardinge, Robert Hancock,
	Henrique de Moraes Holschuh, Pallipadi, Venkatesh, Tejun Heo,
	x86@kernel.org, Yinghai Lu, Thomas Gleixner, Arjan van de Ven,
	dri-devel@lists.sourceforge.net, Ingo Molnar,
	linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
	Thomas Hellstrom, H. Peter Anvin

On Mon, 2009-10-19 at 02:16 -0700, Jan Beulich wrote:
> Functionality-wise this looks fine to me

If we are going to make ioremap() and set_memory_wc() add mtrr's in
non-pat case, then we need to delete the added mtrr(s) in the
corresponding iounmap() and set_memory_wb() aswell.

hmm, this is becoming too complex. The way i915 and other graphics
drivers are using set_memory_wc(), it is def a bad idea to start adding
mtrr's behind the back for non-pat case.

Can't we just force PAT option always and we probably don't care about
ioremap_wc() on processors were PAT doesn't get enabled because of known
errata.

or Perhaps just try to add mtrr only for the pci mmap case like the 4th
patch in this series..

thanks,
suresh


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-19 13:44             ` Suresh Siddha
@ 2009-10-19 13:54               ` Ingo Molnar
  0 siblings, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2009-10-19 13:54 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Jan Beulich, Thomas Schlichter, Jeremy Fitzhardinge,
	Robert Hancock, Henrique de Moraes Holschuh, Pallipadi, Venkatesh,
	Tejun Heo, x86@kernel.org, Yinghai Lu, Thomas Gleixner,
	Arjan van de Ven, dri-devel@lists.sourceforge.net, Ingo Molnar,
	linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
	Thomas Hellstrom, H. Peter Anvin


* Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> On Mon, 2009-10-19 at 02:16 -0700, Jan Beulich wrote:
> > Functionality-wise this looks fine to me
> 
> If we are going to make ioremap() and set_memory_wc() add mtrr's in 
> non-pat case, then we need to delete the added mtrr(s) in the 
> corresponding iounmap() and set_memory_wb() aswell.
> 
> hmm, this is becoming too complex. The way i915 and other graphics 
> drivers are using set_memory_wc(), it is def a bad idea to start 
> adding mtrr's behind the back for non-pat case.

Touching MTRRs beyond working around basic bugs like non-cached RAM 
sounds like madness. The interactions with PAT are ... countless.

> Can't we just force PAT option always and we probably don't care about 
> ioremap_wc() on processors were PAT doesn't get enabled because of 
> known errata.

We can make PAT configurability dependent on EMBEDDED-y - mind sending a 
patch for that?

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not  available
@ 2009-10-19 14:47 Thomas Schlichter
  2009-10-20 19:54 ` Thomas Schlichter
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-19 14:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Arjan van de Ven, dri-devel, Robert Hancock,
	Henrique de Moraes Holschuh, H. Peter Anvin, jbarnes,
	Jeremy Fitzhardinge, linux-kernel, Ingo Molnar, Suresh Siddha,
	Thomas Gleixner, Thomas Hellstrom, Tejun Heo, Venkatesh Pallipadi,
	x86, Yinghai Lu

Jan Beulich wrote:
> >>> Thomas Schlichter <thomas.schlichter@web.de> 17.10.09 21:48 >>>
> >What do you think about these patches?
> 
> Functionality-wise this looks fine to me; whether the core sysfs changes
> are acceptable I can't judge, though.

OK, who can? Or shall I simply ask get_maintainer.pl again?

> However, I would recommend folding the last two arguments of
> mtrr_add_unaligned(), i.e. use mtrr_usage != NULL as the increment
> argument passed to mtrr_add().

Good idea, I will do so this evening...

> And patches 5 and 6 would apparently not build right now, as they're
> failing to pass the new 5th argument to mtrr_add_unaligned().

*Grml* you are of course right. But I am not sure if these both patches
are a goot idea at all.

> Also, why do you add x86-specific code to drivers/pci-sysfs.c when the
> function called there (pci_mmap_page_range()) already is arch-specific?
> Moving your addition there would also allow covering the related
> (legacy?) procfs based functionality... pci_release() could also become
> arch-specific, with a fall-back definition to NULL.

You are right, I should do that...

> Additionally, I would suggest making those code portions depend on
> CONFIG_X86_MTRR rather than CONFIG_X86, so that you don't
> needlessly (try to) allocate the mtrr_usage vector.

Good idea, I will do so.

Thank you very much for your feedback, I'll hopefully come up with an
even better version tonight...

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
@ 2009-10-19 14:59 Thomas Schlichter
  2009-10-19 15:31 ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-19 14:59 UTC (permalink / raw)
  To: Jan Beulich, Suresh Siddha
  Cc: arjan, dri-devel, hancockrwd, hmh, hpa, jbarnes,
	jeremy.fitzhardinge, linux-kernel, mingo, tglx, thellstrom, tj,
	venkatesh.pallipadi, x86, yinghai

Suresh Siddha wrote:
> If we are going to make ioremap() and set_memory_wc() add mtrr's in
> non-pat case, then we need to delete the added mtrr(s) in the
> corresponding iounmap() and set_memory_wb() aswell.
> 
> hmm, this is becoming too complex. The way i915 and other graphics
> drivers are using set_memory_wc(), it is def a bad idea to start adding
> mtrr's behind the back for non-pat case.

Yes, maybe it's better to drop it for ioremap() and set_memory_wc(). But I'd like
to keep it for mmapping the PCI region. It should help all the people with
PAT-incapable CPUs and graphics chips without DRM support (for them there
simply is no driver that should set up the MTRR entries...).

> Can't we just force PAT option always and we probably don't care about
> ioremap_wc() on processors were PAT doesn't get enabled because of known
> errata.

I don't think this is a good idea, Robert Hancock wrote there may be millions of
such Laptops (Core Solo/Duo erratum AE7, Pentium M erratum Y31) :
  http://marc.info/?l=linux-kernel&m=125537136105246&w=2

> or Perhaps just try to add mtrr only for the pci mmap case like the 4th
> patch in this series..

I'd prefer this! ;-)

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
@ 2009-10-19 15:07 Thomas Schlichter
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-19 15:07 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Jan Beulich, Arjan van de Ven, dri-devel, Robert Hancock,
	Henrique de Moraes Holschuh, H. Peter Anvin, jbarnes,
	Jeremy Fitzhardinge, linux-kernel, Ingo Molnar, Suresh Siddha,
	Thomas Gleixner, Thomas Hellstrom, Tejun Heo, Venkatesh Pallipadi,
	x86, Yinghai Lu

Konrad Rzeszutek Wilk wrote:
> I am not familiar with libpciaccess, but I was wondering why that library
> cannot realize it failed in its endeavour and use other means to accomplish
> its goals?

The thing is that the mmap() will succeed! Only the memory region will not be
set up as write combining when PAT is disabled. So userspace would have to
find out it PAT is enabled in kernel and accordingly set up MTRR entries. And
currently I don't know of a kernel interface that would tell userspace if PAT is
enabled.

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
@ 2009-10-19 15:10 Thomas Schlichter
  2009-10-19 15:28 ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-19 15:10 UTC (permalink / raw)
  To: Ingo Molnar, Suresh Siddha
  Cc: JBeulich, arjan, dri-devel, hancockrwd, hmh, hpa, jbarnes,
	jeremy.fitzhardinge, linux-kernel, mingo, tglx, thellstrom, tj,
	venkatesh.pallipadi, x86, yinghai

Ingo Molnar wrote:
> * Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > Can't we just force PAT option always and we probably don't care about 
> > ioremap_wc() on processors were PAT doesn't get enabled because of 
> > known errata.
> 
> We can make PAT configurability dependent on EMBEDDED-y - mind sending a 
> patch for that?

I think there already is such a patch in -tip:
http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=c03cb3149daed3e411657e3212d05ae27cf1a874

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-19 15:10 Thomas Schlichter
@ 2009-10-19 15:28 ` Ingo Molnar
  0 siblings, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2009-10-19 15:28 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Suresh Siddha, JBeulich, arjan, dri-devel, hancockrwd, hmh, hpa,
	jbarnes, jeremy.fitzhardinge, linux-kernel, mingo, tglx,
	thellstrom, tj, venkatesh.pallipadi, x86, yinghai


* Thomas Schlichter <thomas.schlichter@web.de> wrote:

> > We can make PAT configurability dependent on EMBEDDED-y - mind 
> > sending a patch for that?
> 
> I think there already is such a patch in -tip: 
> http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=c03cb3149daed3e411657e3212d05ae27cf1a874

Yes :-)

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-19 14:59 Thomas Schlichter
@ 2009-10-19 15:31 ` Ingo Molnar
  2009-10-19 21:49   ` Suresh Siddha
  2009-10-20 20:35   ` Thomas Schlichter
  0 siblings, 2 replies; 60+ messages in thread
From: Ingo Molnar @ 2009-10-19 15:31 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jan Beulich, Suresh Siddha, arjan, dri-devel, hancockrwd, hmh,
	hpa, jbarnes, jeremy.fitzhardinge, linux-kernel, mingo, tglx,
	thellstrom, tj, venkatesh.pallipadi, x86, yinghai


* Thomas Schlichter <thomas.schlichter@web.de> wrote:

> I don't think this is a good idea, Robert Hancock wrote there may be 
> millions of such Laptops (Core Solo/Duo erratum AE7, Pentium M erratum 
> Y31) :
>
>   http://marc.info/?l=linux-kernel&m=125537136105246&w=2
> 
> > or Perhaps just try to add mtrr only for the pci mmap case like the 
> > 4th patch in this series..
> 
> I'd prefer this! ;-)

Hm, we could perhaps do that - but i think we should only do that on 
systems that have PAT disabled.

Which brings up the question of how to properly QA such a narrow segment 
of the market. Maybe disabling CONFIG_X86_PAT should enable that logic 
too.

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-19 15:31 ` Ingo Molnar
@ 2009-10-19 21:49   ` Suresh Siddha
  2009-10-20 20:35   ` Thomas Schlichter
  1 sibling, 0 replies; 60+ messages in thread
From: Suresh Siddha @ 2009-10-19 21:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Schlichter, Jan Beulich, arjan@linux.intel.com,
	dri-devel@lists.sourceforge.net, hancockrwd@gmail.com,
	hmh@hmh.eng.br, hpa@zytor.com, jbarnes@virtuousgeek.org,
	jeremy.fitzhardinge@citrix.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, tglx@linutronix.de, thellstrom@vmware.com,
	tj@kernel.org, Pallipadi, Venkatesh, x86@kernel.org,
	yinghai@kernel.org

On Mon, 2009-10-19 at 08:31 -0700, Ingo Molnar wrote:
> * Thomas Schlichter <thomas.schlichter@web.de> wrote:
> 
> > I don't think this is a good idea, Robert Hancock wrote there may be 
> > millions of such Laptops (Core Solo/Duo erratum AE7, Pentium M erratum 
> > Y31) :
> >
> >   http://marc.info/?l=linux-kernel&m=125537136105246&w=2
> > 
> > > or Perhaps just try to add mtrr only for the pci mmap case like the 
> > > 4th patch in this series..
> > 
> > I'd prefer this! ;-)
> 
> Hm, we could perhaps do that - but i think we should only do that on 
> systems that have PAT disabled.
> 
> Which brings up the question of how to properly QA such a narrow segment 
> of the market. Maybe disabling CONFIG_X86_PAT should enable that logic 
> too.

Also, I think we can simplify and avoid mtrr_add_unaligned() and worry
only about the pci mmap aligned cases etc. Those are the only
interesting ones.

thanks,
suresh


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not  available
  2009-10-19 14:47 Thomas Schlichter
@ 2009-10-20 19:54 ` Thomas Schlichter
  2009-10-21 11:57   ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-20 19:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Suresh Siddha, Venkatesh Pallipadi, Tejun Heo, x86, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven, dri-devel, Ingo Molnar,
	linux-kernel, jbarnes, Thomas Hellstrom, H. Peter Anvin

[-- Attachment #1: Type: Text/Plain, Size: 890 bytes --]

Jan Beulich wrote:
> Functionality-wise this looks fine to me; whether the core sysfs changes
> are acceptable I can't judge, though.

OK, I think I should have addressed your comments. Unfortunately I had to use 
a little "hack" to make pci_mmap_page_range() work for sysfs and proc. I 
placed a "private" pointer in the beginning of both per-file private 
structures. So this pointer can be accessed independent from the caller. I 
hope this is acceptable.

I dropped the ioremap() and set_memory_wc() patches, I could not implement 
reference counting for them and it may interact too much with existing GPU 
drivers.

Again, this series should not change the current behavior if either MTRR is 
disabled or PAT is enabled. But it helps in the case that MTRR is enabled and 
PAT is not available.

What should be done now to get this series on the right "track"?

Kind regards,
  Thomas

[-- Attachment #2: 0004-Use-MTRR-for-pci_mmap_resource_wc-if-PAT-is-not-avai.patch --]
[-- Type: text/x-patch, Size: 5518 bytes --]

>From 88d77b939afa1934eb2158f6d28db500eab0d345 Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 21:17:16 +0200
Subject: [PATCH 4/4] Use MTRR for pci_mmap_resource_wc if PAT is not available

X.org uses libpciaccess which tries to mmap with write combining enabled via
/sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, the
kernel does fall back to uncached mmap. Then libpciaccess thinks it succeeded
mapping with write combining enabled and does not set up suited MTRR entries.
;-(

So when falling back to uncached mapping, we better try to set up MTRR
entries automatically. When the resource file is closed, we remove the MTRR
entries again.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/include/asm/pci.h |    2 ++
 arch/x86/pci/i386.c        |   38 ++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci-sysfs.c    |    8 ++++++++
 drivers/pci/pci.h          |    6 ++++++
 drivers/pci/proc.c         |    5 ++++-
 5 files changed, 58 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index ada8c20..e94aeb1 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -64,6 +64,8 @@ struct irq_routing_table *pcibios_get_irq_routing_table(void);
 int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
 
 
+#define HAVE_PCI_RELEASE_FILE
+
 #define HAVE_PCI_MMAP
 extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 			       enum pci_mmap_state mmap_state,
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index b22d13b..4cef0c1 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -31,7 +31,9 @@
 #include <linux/ioport.h>
 #include <linux/errno.h>
 #include <linux/bootmem.h>
+#include <linux/fs.h>
 
+#include <asm/mtrr.h>
 #include <asm/pat.h>
 #include <asm/e820.h>
 #include <asm/pci_x86.h>
@@ -274,6 +276,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 			enum pci_mmap_state mmap_state, int write_combine)
 {
 	unsigned long prot;
+	int **p_mtrr_usage = (int **)vma->vm_file->private_data;
 
 	/* I/O space cannot be accessed via normal processor loads and
 	 * stores on this platform.
@@ -301,5 +304,40 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 
 	vma->vm_ops = &pci_mmap_ops;
 
+#ifdef CONFIG_MTRR
+	if (!pat_enabled && write_combine) {
+		if (*p_mtrr_usage == NULL)
+			*p_mtrr_usage = kzalloc(num_var_ranges * sizeof(int),
+						GFP_KERNEL);
+		mtrr_add_unaligned(vma->vm_pgoff << PAGE_SHIFT,
+				   vma->vm_end - vma->vm_start,
+				   MTRR_TYPE_WRCOMB, *p_mtrr_usage);
+	}
+#endif // CONFIG_MTRR
+
+	return 0;
+}
+
+int pci_release_file(struct file *file)
+{
+#ifdef CONFIG_MTRR
+	int i;
+	int **p_mtrr_usage = (int **)file->private_data;
+	int *mtrr_usage = *p_mtrr_usage;
+
+	if (!mtrr_usage)
+		return 0;
+
+	for (i = 0; i < num_var_ranges; ++i) {
+		while (mtrr_usage[i] > 0) {
+			mtrr_del(i, 0, 0);
+			--mtrr_usage[i];
+		}
+	}
+
+	kfree(*p_mtrr_usage);
+	*p_mtrr_usage = NULL;
+#endif // CONFIG_MTRR
+
 	return 0;
 }
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 0f6382f..4f36835 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -733,6 +733,13 @@ pci_mmap_resource_wc(struct kobject *kobj, struct bin_attribute *attr,
 	return pci_mmap_resource(kobj, attr, vma, 1);
 }
 
+static int
+pci_release_resource(struct kobject *kobj, struct bin_attribute *attr,
+		     struct file *file)
+{
+	return pci_release_file(file);
+}
+
 /**
  * pci_remove_resource_files - cleanup resource files
  * @pdev: dev to cleanup
@@ -782,6 +789,7 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
 			sprintf(res_attr_name, "resource%d", num);
 			res_attr->mmap = pci_mmap_resource_uc;
 		}
+		res_attr->release = pci_release_resource;
 		res_attr->attr.name = res_attr_name;
 		res_attr->attr.mode = S_IRUSR | S_IWUSR;
 		res_attr->size = pci_resource_len(pdev, num);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index d92d195..cff3b7d 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -84,6 +84,12 @@ static inline void pci_vpd_release(struct pci_dev *dev)
 		dev->vpd->ops->release(dev);
 }
 
+#ifdef HAVE_PCI_RELEASE_FILE
+extern int pci_release_file(struct file *file);
+#else
+static inline int pci_release_file(struct file *file) { return 0; }
+#endif
+
 /* PCI /proc functions */
 #ifdef CONFIG_PROC_FS
 extern int pci_proc_attach_device(struct pci_dev *dev);
diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 593bb84..91c8a48 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -197,6 +197,7 @@ proc_bus_pci_write(struct file *file, const char __user *buf, size_t nbytes, lof
 }
 
 struct pci_filp_private {
+	void *private;
 	enum pci_mmap_state mmap_state;
 	int write_combine;
 };
@@ -277,7 +278,7 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
 
 static int proc_bus_pci_open(struct inode *inode, struct file *file)
 {
-	struct pci_filp_private *fpriv = kmalloc(sizeof(*fpriv), GFP_KERNEL);
+	struct pci_filp_private *fpriv = kzalloc(sizeof(*fpriv), GFP_KERNEL);
 
 	if (!fpriv)
 		return -ENOMEM;
@@ -292,6 +293,8 @@ static int proc_bus_pci_open(struct inode *inode, struct file *file)
 
 static int proc_bus_pci_release(struct inode *inode, struct file *file)
 {
+	pci_release_file(file);
+
 	kfree(file->private_data);
 	file->private_data = NULL;
 
-- 
1.6.5.1


[-- Attachment #3: 0001-Add-new-mtrr_add_unaligned-function.patch --]
[-- Type: text/x-patch, Size: 2756 bytes --]

>From 7cd44581d15aaa3b2ef38a4d76ba19dd50e7cb7b Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Wed, 14 Oct 2009 19:25:33 +0200
Subject: [PATCH 1/4] Add new mtrr_add_unaligned function

This function creates multiple MTRR entries for unaligned memory regions.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/include/asm/mtrr.h     |    6 ++++++
 arch/x86/kernel/cpu/mtrr/main.c |   30 ++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 4365ffd..9f447c4 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -116,6 +116,8 @@ extern int mtrr_add(unsigned long base, unsigned long size,
 		    unsigned int type, bool increment);
 extern int mtrr_add_page(unsigned long base, unsigned long size,
 			 unsigned int type, bool increment);
+extern void mtrr_add_unaligned(unsigned long base, unsigned long size,
+			       unsigned int type, int *mtrr_usage);
 extern int mtrr_del(int reg, unsigned long base, unsigned long size);
 extern int mtrr_del_page(int reg, unsigned long base, unsigned long size);
 extern void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi);
@@ -146,6 +148,10 @@ static inline int mtrr_add_page(unsigned long base, unsigned long size,
 {
     return -ENODEV;
 }
+static inline void mtrr_add_unaligned(unsigned long base, unsigned long size,
+				      unsigned int type, int *mtrr_usage)
+{
+}
 static inline int mtrr_del(int reg, unsigned long base, unsigned long size)
 {
     return -ENODEV;
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 84e83de..5654ef7 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -487,6 +487,36 @@ int mtrr_add(unsigned long base, unsigned long size, unsigned int type,
 }
 EXPORT_SYMBOL(mtrr_add);
 
+void mtrr_add_unaligned(unsigned long base, unsigned long size,
+			unsigned int type, int *mtrr_usage)
+{
+	int i;
+	unsigned long ptr1, ptr2, end = base + size;
+
+	// round down size to next power ot two
+	size = __rounddown_pow_of_two(size);
+
+	// accordingly align pointers
+	ptr1 = ptr2 = (base + size - 1) & ~(size - 1);
+
+	while (size >= PAGE_SIZE) {
+		if (ptr1 + size <= end) {
+			i = mtrr_add(ptr1, size, type, !!mtrr_usage);
+			if (i >= 0 && mtrr_usage)
+				++mtrr_usage[i];
+			ptr1 += size;
+		}
+		if (base + size <= ptr2) {
+			ptr2 -= size;
+			i = mtrr_add(ptr2, size, type, !!mtrr_usage);
+			if (i >= 0 && mtrr_usage)
+				++mtrr_usage[i];
+		}
+		size >>= 1;
+	}
+}
+EXPORT_SYMBOL(mtrr_add_unaligned);
+
 /**
  * mtrr_del_page - delete a memory type region
  * @reg: Register returned by mtrr_add
-- 
1.6.5.1


[-- Attachment #4: 0002-Make-num_var_ranges-accessible-outside-MTRR-code.patch --]
[-- Type: text/x-patch, Size: 1638 bytes --]

>From 4299569a5ed27b2a0bc2c0a04014edc4bb29bb7e Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 12:36:05 +0200
Subject: [PATCH 2/4] Make num_var_ranges accessible outside MTRR code

Code outside MTRR will have to remember which MTRR entries are used for a
given purpose. Therefore it has to access num_var_ranges which holds the
value of the maximum count of MTRR entries available. So we make this value
accessible from outside the core MTRR code.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/include/asm/mtrr.h     |    2 ++
 arch/x86/kernel/cpu/mtrr/mtrr.h |    1 -
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 9f447c4..cc6f123 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -72,6 +72,8 @@ typedef __u8 mtrr_type;
 #define MTRR_NUM_FIXED_RANGES 88
 #define MTRR_MAX_VAR_RANGES 256
 
+extern unsigned int num_var_ranges;
+
 struct mtrr_state_type {
 	struct mtrr_var_range var_ranges[MTRR_MAX_VAR_RANGES];
 	mtrr_type fixed_ranges[MTRR_NUM_FIXED_RANGES];
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h
index a501dee..ba6a8a5 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.h
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.h
@@ -61,7 +61,6 @@ extern struct mtrr_ops *mtrr_if;
 #define is_cpu(vnd)	(mtrr_if && mtrr_if->vendor == X86_VENDOR_##vnd)
 #define use_intel()	(mtrr_if && mtrr_if->use_intel_if == 1)
 
-extern unsigned int num_var_ranges;
 extern u64 mtrr_tom2;
 extern struct mtrr_state_type mtrr_state;
 
-- 
1.6.5.1


[-- Attachment #5: 0003-Provide-per-file-private-data-for-bin-sysfs-files.patch --]
[-- Type: text/x-patch, Size: 2157 bytes --]

>From 92735813dbd6ea3c9ee56a7e23d7a8d8483e9e7d Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 12:39:11 +0200
Subject: [PATCH 3/4] Provide per-file private data for bin sysfs files

For binary sysfs files, provide a per-file private field in struct bin_buffer.
To be easily accessible, it is located on head of the struct. Additionally add
a release() callback that may be used to e.g. free the private data on file
release.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 fs/sysfs/bin.c        |   10 +++++++++-
 include/linux/sysfs.h |    3 +++
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 60c702b..c0575a3 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -37,6 +37,7 @@
 static DEFINE_MUTEX(sysfs_bin_lock);
 
 struct bin_buffer {
+	void				*private;
 	struct mutex			mutex;
 	void				*buffer;
 	int				mmapped;
@@ -438,14 +439,21 @@ static int open(struct inode * inode, struct file * file)
 static int release(struct inode * inode, struct file * file)
 {
 	struct bin_buffer *bb = file->private_data;
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
+	struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+	int rc = 0;
 
+	if (attr->release)
+		rc = attr->release(kobj, attr, file);
+	
 	mutex_lock(&sysfs_bin_lock);
 	hlist_del(&bb->list);
 	mutex_unlock(&sysfs_bin_lock);
 
 	kfree(bb->buffer);
 	kfree(bb);
-	return 0;
+	return rc;
 }
 
 const struct file_operations bin_fops = {
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 9d68fed..751ea68 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -19,6 +19,7 @@
 
 struct kobject;
 struct module;
+struct file;
 
 /* FIXME
  * The *owner field is no longer used.
@@ -72,6 +73,8 @@ struct bin_attribute {
 			 char *, loff_t, size_t);
 	int (*mmap)(struct kobject *, struct bin_attribute *attr,
 		    struct vm_area_struct *vma);
+	int (*release)(struct kobject *, struct bin_attribute *attr,
+		      struct file *file);
 };
 
 struct sysfs_ops {
-- 
1.6.5.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-19 15:31 ` Ingo Molnar
  2009-10-19 21:49   ` Suresh Siddha
@ 2009-10-20 20:35   ` Thomas Schlichter
  2009-10-20 21:59     ` Suresh Siddha
  1 sibling, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-20 20:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jan Beulich, Suresh Siddha, arjan, dri-devel, hancockrwd, hmh,
	hpa, jbarnes, jeremy.fitzhardinge, linux-kernel, mingo, tglx,
	thellstrom, tj, venkatesh.pallipadi, x86, yinghai

Ingo Molnar wrote:
> * Thomas Schlichter <thomas.schlichter@web.de> wrote:
> > > or Perhaps just try to add mtrr only for the pci mmap case like the
> > > 4th patch in this series..
> >
> > I'd prefer this! ;-)
> 
> Hm, we could perhaps do that - but i think we should only do that on
> systems that have PAT disabled.

The patch I sent should do exactly that.

> Which brings up the question of how to properly QA such a narrow segment
> of the market. Maybe disabling CONFIG_X86_PAT should enable that logic
> too.

When CONFIG_X86_PAT is disabled, pat_enabled is 0, and thus my new code should 
be active. Or do I miss something?

What do you think about the latest version of my patch series I just sent?

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-20 20:35   ` Thomas Schlichter
@ 2009-10-20 21:59     ` Suresh Siddha
  2009-10-21 11:52       ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Suresh Siddha @ 2009-10-20 21:59 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Ingo Molnar, Jan Beulich, arjan@linux.intel.com,
	dri-devel@lists.sourceforge.net, hancockrwd@gmail.com,
	hmh@hmh.eng.br, hpa@zytor.com, jbarnes@virtuousgeek.org,
	jeremy.fitzhardinge@citrix.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, tglx@linutronix.de, thellstrom@vmware.com,
	tj@kernel.org, Pallipadi, Venkatesh, x86@kernel.org,
	yinghai@kernel.org

On Tue, 2009-10-20 at 13:35 -0700, Thomas Schlichter wrote:
> What do you think about the latest version of my patch series I just sent?

I think we can simplify this  by just using mtrr_add_page() and avoid
all the complexity that comes with mtrr_add_unaligned().

pci_mmap_range() should call one mtrr_add_page() if the base and size
are nicely aligned. Almost all the cases, this is the usage sequence
here anyway.

thanks,
suresh


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-20 21:59     ` Suresh Siddha
@ 2009-10-21 11:52       ` Ingo Molnar
  0 siblings, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2009-10-21 11:52 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Thomas Schlichter, Jan Beulich, arjan@linux.intel.com,
	dri-devel@lists.sourceforge.net, hancockrwd@gmail.com,
	hmh@hmh.eng.br, hpa@zytor.com, jbarnes@virtuousgeek.org,
	jeremy.fitzhardinge@citrix.com, linux-kernel@vger.kernel.org,
	mingo@redhat.com, tglx@linutronix.de, thellstrom@vmware.com,
	tj@kernel.org, Pallipadi, Venkatesh, x86@kernel.org,
	yinghai@kernel.org


* Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> On Tue, 2009-10-20 at 13:35 -0700, Thomas Schlichter wrote:
> > What do you think about the latest version of my patch series I just sent?
> 
> I think we can simplify this by just using mtrr_add_page() and avoid 
> all the complexity that comes with mtrr_add_unaligned().
> 
> pci_mmap_range() should call one mtrr_add_page() if the base and size 
> are nicely aligned. Almost all the cases, this is the usage sequence 
> here anyway.

yep, keeping it simple is a must.

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-20 19:54 ` Thomas Schlichter
@ 2009-10-21 11:57   ` Ingo Molnar
  0 siblings, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2009-10-21 11:57 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jan Beulich, Jeremy Fitzhardinge, Robert Hancock,
	Henrique de Moraes Holschuh, Suresh Siddha, Venkatesh Pallipadi,
	Tejun Heo, x86, Yinghai Lu, Thomas Gleixner, Arjan van de Ven,
	dri-devel, Ingo Molnar, linux-kernel, jbarnes, Thomas Hellstrom,
	H. Peter Anvin


* Thomas Schlichter <thomas.schlichter@web.de> wrote:

> Jan Beulich wrote:
>
> > Functionality-wise this looks fine to me; whether the core sysfs 
> > changes are acceptable I can't judge, though.
> 
> OK, I think I should have addressed your comments. Unfortunately I had 
> to use a little "hack" to make pci_mmap_page_range() work for sysfs 
> and proc. I placed a "private" pointer in the beginning of both 
> per-file private structures. So this pointer can be accessed 
> independent from the caller. I hope this is acceptable.
> 
> I dropped the ioremap() and set_memory_wc() patches, I could not 
> implement reference counting for them and it may interact too much 
> with existing GPU drivers.
> 
> Again, this series should not change the current behavior if either 
> MTRR is disabled or PAT is enabled. But it helps in the case that MTRR 
> is enabled and PAT is not available.
> 
> What should be done now to get this series on the right "track"?

Can we eliminate mtrr_add_unaligned() as Suresh suggested, and still 
make it work on your testbox?

Once that's done i'll look at putting it into the x86 tree for testing. 
The acks of Suresh/Venki/Jan would be nice to have.

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
@ 2009-10-21 13:45 Thomas Schlichter
  2009-10-21 14:11 ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-21 13:45 UTC (permalink / raw)
  To: Ingo Molnar, Suresh Siddha
  Cc: Jan Beulich, Arjan van de Ven, dri-devel, Robert Hancock,
	Henrique de Moraes Holschuh, H. Peter Anvin, jbarnes,
	Jeremy Fitzhardinge, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Thomas Hellstrom, Tejun Heo, Venkatesh Pallipadi, x86, Yinghai Lu

Suresh Siddha wrote:
> I think we can simplify this  by just using mtrr_add_page() and avoid
> all the complexity that comes with mtrr_add_unaligned().
> 
> pci_mmap_range() should call one mtrr_add_page() if the base and size
> are nicely aligned. Almost all the cases, this is the usage sequence
> here anyway.

Ingo Molnar wrote:
> Can we eliminate mtrr_add_unaligned() as Suresh suggested, and still 
> make it work on your testbox?

Yes, I had that in the first place, but Jan suggested to extend it to also
handle non-aligned, non-power-of-two cases:
  http://marc.info/?l=linux-kernel&m=125541951529918&w=2

So if it's OK for Jan, I'll reduce the functionality again and use mtrr_add()
instead. Btw. this simply means to drop mtrr_add_unaligned(), all the other
parts are still required for reference counting and a proper mtrr_del() on file
close.

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-21 13:45 Thomas Schlichter
@ 2009-10-21 14:11 ` Jan Beulich
  2009-10-21 17:35   ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2009-10-21 14:11 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jeremy Fitzhardinge, Ingo Molnar, Robert Hancock,
	Henrique de Moraes Holschuh, Suresh Siddha, Venkatesh Pallipadi,
	Tejun Heo, x86, Yinghai Lu, Thomas Gleixner, Arjan van de Ven,
	dri-devel, Ingo Molnar, linux-kernel, jbarnes, Thomas Hellstrom,
	H. Peter Anvin

>>> Thomas Schlichter <thomas.schlichter@web.de> 21.10.09 15:45 >>>
>Ingo Molnar wrote:
>> Can we eliminate mtrr_add_unaligned() as Suresh suggested, and still 
>> make it work on your testbox?
>
>Yes, I had that in the first place, but Jan suggested to extend it to also
>handle non-aligned, non-power-of-two cases:
>  http://marc.info/?l=linux-kernel&m=125541951529918&w=2 

I merely pointed out it wouldn't work for unaligned addresses or sizes
passed in.

>So if it's OK for Jan, I'll reduce the functionality again and use mtrr_add()
>instead. Btw. this simply means to drop mtrr_add_unaligned(), all the other
>parts are still required for reference counting and a proper mtrr_del() on file
>close.

I'm perfectly fine with just handling the aligned case, as long as the code
checks that the alignment constraints are met.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not  available
@ 2009-10-21 14:38 Thomas Schlichter
  2009-10-21 15:14 ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-21 14:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Arjan van de Ven, dri-devel, Robert Hancock,
	Henrique de Moraes Holschuh, H. Peter Anvin, jbarnes,
	Jeremy Fitzhardinge, linux-kernel, Ingo Molnar, Ingo Molnar,
	Suresh Siddha, Thomas Gleixner, Thomas Hellstrom, Tejun Heo,
	Venkatesh Pallipadi, x86, Yinghai Lu

Jan Beulich wrote:
> >Yes, I had that in the first place, but Jan suggested to extend it to also
> >handle non-aligned, non-power-of-two cases:
> >  http://marc.info/?l=linux-kernel&m=125541951529918&w=2 
> 
> I merely pointed out it wouldn't work for unaligned addresses or sizes
> passed in.

Oh, I'm sorry, I must have misinterpreted it...

> >So if it's OK for Jan, I'll reduce the functionality again and use mtrr_add()
> >instead. Btw. this simply means to drop mtrr_add_unaligned(), all the other
> >parts are still required for reference counting and a proper mtrr_del() on file
> >close.
> 
> I'm perfectly fine with just handling the aligned case, as long as the code
> checks that the alignment constraints are met.

Hmm, as far as I see mtrr_add() and mtrr_add_page() already check these
constraints. Do you want me to check them additionally? Or do you want to
completely fail the mmap() if these constraints are violated?

I'd let mmap() succeed even if the mtrr_add() fails...

Regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not  available
  2009-10-21 14:38 Thomas Schlichter
@ 2009-10-21 15:14 ` Jan Beulich
  0 siblings, 0 replies; 60+ messages in thread
From: Jan Beulich @ 2009-10-21 15:14 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jeremy Fitzhardinge, Ingo Molnar, Robert Hancock,
	Henrique de Moraes Holschuh, Suresh Siddha, Venkatesh Pallipadi,
	Tejun Heo, x86, Yinghai Lu, Thomas Gleixner, Arjan van de Ven,
	dri-devel, Ingo Molnar, linux-kernel, jbarnes, Thomas Hellstrom,
	H. Peter Anvin

>>> Thomas Schlichter <thomas.schlichter@web.de> 21.10.09 16:38 >>>
>Hmm, as far as I see mtrr_add() and mtrr_add_page() already check these
>constraints. Do you want me to check them additionally? Or do you want to
>completely fail the mmap() if these constraints are violated?

I'd say you should check it explicitly in the caller, to avoid the warning
message being printed otherwise.

>I'd let mmap() succeed even if the mtrr_add() fails...

Yes, of course.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-21 14:11 ` Jan Beulich
@ 2009-10-21 17:35   ` Ingo Molnar
  2009-10-21 20:01     ` Thomas Schlichter
  0 siblings, 1 reply; 60+ messages in thread
From: Ingo Molnar @ 2009-10-21 17:35 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Thomas Schlichter, Jeremy Fitzhardinge, Robert Hancock,
	Henrique de Moraes Holschuh, Suresh Siddha, Venkatesh Pallipadi,
	Tejun Heo, x86, Yinghai Lu, Thomas Gleixner, Arjan van de Ven,
	dri-devel, Ingo Molnar, linux-kernel, jbarnes, Thomas Hellstrom,
	H. Peter Anvin


* Jan Beulich <JBeulich@novell.com> wrote:

> > So if it's OK for Jan, I'll reduce the functionality again and use 
> > mtrr_add() instead. Btw. this simply means to drop 
> > mtrr_add_unaligned(), all the other parts are still required for 
> > reference counting and a proper mtrr_del() on file close.
> 
> I'm perfectly fine with just handling the aligned case, as long as the 
> code checks that the alignment constraints are met.

It could even emit a debug warning if they are not met - that way we'll 
see how important the unaligned case is.

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-21 17:35   ` Ingo Molnar
@ 2009-10-21 20:01     ` Thomas Schlichter
  2009-10-22  9:53       ` Suresh Siddha
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-21 20:01 UTC (permalink / raw)
  To: Ingo Molnar, Jan Beulich
  Cc: Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Suresh Siddha, Venkatesh Pallipadi, Tejun Heo, x86, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven, dri-devel, Ingo Molnar,
	linux-kernel, jbarnes, Thomas Hellstrom, H. Peter Anvin

[-- Attachment #1: Type: Text/Plain, Size: 1080 bytes --]

Ingo Molnar wrote:
> * Jan Beulich <JBeulich@novell.com> wrote:
> > I'm perfectly fine with just handling the aligned case, as long as the
> > code checks that the alignment constraints are met.
> 
> It could even emit a debug warning if they are not met - that way we'll
> see how important the unaligned case is.

OK, so I think the attached patches should do it. Hopefully I have addressed 
all your comments.

This series works for my test machine, without it, or without X running, I 
have these MTRR entries:
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x06ff00000 ( 1791MB), size=    1MB, count=1: uncachable
reg02: base=0x070000000 ( 1792MB), size=  256MB, count=1: uncachable

With the patches applied and X running I get these:
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x06ff00000 ( 1791MB), size=    1MB, count=1: uncachable
reg02: base=0x070000000 ( 1792MB), size=  256MB, count=1: uncachable
reg03: base=0x0d0000000 ( 3328MB), size=  256MB, count=1: write-combining

Best regards,
  Thomas

[-- Attachment #2: 0001-Make-num_var_ranges-accessible-outside-MTRR-code.patch --]
[-- Type: text/x-patch, Size: 1638 bytes --]

>From e946206915e3b023eb331499d73014105429200c Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 12:36:05 +0200
Subject: [PATCH 1/3] Make num_var_ranges accessible outside MTRR code

Code outside MTRR will have to remember which MTRR entries are used for a
given purpose. Therefore it has to access num_var_ranges which holds the
value of the maximum count of MTRR entries available. So we make this value
accessible from outside the core MTRR code.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/include/asm/mtrr.h     |    2 ++
 arch/x86/kernel/cpu/mtrr/mtrr.h |    1 -
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index 4365ffd..1e7a824 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -72,6 +72,8 @@ typedef __u8 mtrr_type;
 #define MTRR_NUM_FIXED_RANGES 88
 #define MTRR_MAX_VAR_RANGES 256
 
+extern unsigned int num_var_ranges;
+
 struct mtrr_state_type {
 	struct mtrr_var_range var_ranges[MTRR_MAX_VAR_RANGES];
 	mtrr_type fixed_ranges[MTRR_NUM_FIXED_RANGES];
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h
index a501dee..ba6a8a5 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.h
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.h
@@ -61,7 +61,6 @@ extern struct mtrr_ops *mtrr_if;
 #define is_cpu(vnd)	(mtrr_if && mtrr_if->vendor == X86_VENDOR_##vnd)
 #define use_intel()	(mtrr_if && mtrr_if->use_intel_if == 1)
 
-extern unsigned int num_var_ranges;
 extern u64 mtrr_tom2;
 extern struct mtrr_state_type mtrr_state;
 
-- 
1.6.5.1


[-- Attachment #3: 0002-Provide-per-file-private-data-for-bin-sysfs-files.patch --]
[-- Type: text/x-patch, Size: 2203 bytes --]

>From 7210888ded750f87f5509bdd5b95363a476ce307 Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 12:39:11 +0200
Subject: [PATCH 2/3] Provide per-file private data for bin sysfs files

For binary sysfs files, provide a per-file private field in struct bin_buffer.
To be easily accessible, it is located on head of the struct. Additionally add
a release() callback that may be used to e.g. free the private data on file
release.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 fs/sysfs/bin.c        |   10 +++++++++-
 include/linux/sysfs.h |    3 +++
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/fs/sysfs/bin.c b/fs/sysfs/bin.c
index 60c702b..28c039a 100644
--- a/fs/sysfs/bin.c
+++ b/fs/sysfs/bin.c
@@ -37,6 +37,7 @@
 static DEFINE_MUTEX(sysfs_bin_lock);
 
 struct bin_buffer {
+	void				*private;
 	struct mutex			mutex;
 	void				*buffer;
 	int				mmapped;
@@ -438,6 +439,13 @@ static int open(struct inode * inode, struct file * file)
 static int release(struct inode * inode, struct file * file)
 {
 	struct bin_buffer *bb = file->private_data;
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
+	struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+	int rc = 0;
+
+	if (attr->release)
+		rc = attr->release(kobj, attr, file);
 
 	mutex_lock(&sysfs_bin_lock);
 	hlist_del(&bb->list);
@@ -445,7 +453,7 @@ static int release(struct inode * inode, struct file * file)
 
 	kfree(bb->buffer);
 	kfree(bb);
-	return 0;
+	return rc;
 }
 
 const struct file_operations bin_fops = {
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 9d68fed..751ea68 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -19,6 +19,7 @@
 
 struct kobject;
 struct module;
+struct file;
 
 /* FIXME
  * The *owner field is no longer used.
@@ -72,6 +73,8 @@ struct bin_attribute {
 			 char *, loff_t, size_t);
 	int (*mmap)(struct kobject *, struct bin_attribute *attr,
 		    struct vm_area_struct *vma);
+	int (*release)(struct kobject *, struct bin_attribute *attr,
+		      struct file *file);
 };
 
 struct sysfs_ops {
-- 
1.6.5.1


[-- Attachment #4: 0003-Use-MTRR-for-pci_mmap_resource_wc-if-PAT-is-not-avai.patch --]
[-- Type: text/x-patch, Size: 5751 bytes --]

>From e3317e73726152380cd05f75c87c433b9185b291 Mon Sep 17 00:00:00 2001
From: Thomas Schlichter <thomas.schlichter@web.de>
Date: Sat, 17 Oct 2009 21:17:16 +0200
Subject: [PATCH 3/3] Use MTRR for pci_mmap_resource_wc if PAT is not available

X.org uses libpciaccess which tries to mmap with write combining enabled via
/sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, the
kernel does fall back to uncached mmap. Then libpciaccess thinks it succeeded
mapping with write combining enabled and does not set up suited MTRR entries.
;-(

So when falling back to uncached mapping, we better try to set up MTRR
entries automatically. When the resource file is closed, we remove the MTRR
entries again.

Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
---
 arch/x86/include/asm/pci.h |    2 +
 arch/x86/pci/i386.c        |   60 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci-sysfs.c    |    8 ++++++
 drivers/pci/pci.h          |    6 ++++
 drivers/pci/proc.c         |    5 +++-
 5 files changed, 80 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index ada8c20..e94aeb1 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -64,6 +64,8 @@ struct irq_routing_table *pcibios_get_irq_routing_table(void);
 int pcibios_set_irq_routing(struct pci_dev *dev, int pin, int irq);
 
 
+#define HAVE_PCI_RELEASE_FILE
+
 #define HAVE_PCI_MMAP
 extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 			       enum pci_mmap_state mmap_state,
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index b22d13b..60b6fdc 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -31,7 +31,9 @@
 #include <linux/ioport.h>
 #include <linux/errno.h>
 #include <linux/bootmem.h>
+#include <linux/fs.h>
 
+#include <asm/mtrr.h>
 #include <asm/pat.h>
 #include <asm/e820.h>
 #include <asm/pci_x86.h>
@@ -301,5 +303,63 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 
 	vma->vm_ops = &pci_mmap_ops;
 
+#ifdef CONFIG_MTRR
+	if (!pat_enabled && write_combine) {
+		int i;
+		unsigned long base = vma->vm_pgoff << PAGE_SHIFT;
+		unsigned long size = vma->vm_end - vma->vm_start;
+
+		if ((size < PAGE_SIZE) || (size & (size - 1)) ||
+		    (base & (size - 1)))
+		{
+			printk(KERN_DEBUG
+				"Unaligned memory region passed to "
+				"pci_mmap_page_range().\n");
+			printk(KERN_DEBUG
+				"Thus no write combining MTRR entry "
+				"is created.\n");
+			printk(KERN_DEBUG
+				"   base = 0x%lx  size = 0x%lx\n", base, size);
+			return 0;
+		}
+
+		i = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);
+		if (i >= 0) {
+			int **p_mtrr_usage = (int **)vma->vm_file->private_data;
+
+			if (*p_mtrr_usage == NULL)
+				*p_mtrr_usage = kzalloc(num_var_ranges *
+							sizeof(int),
+							GFP_KERNEL);
+			if (*p_mtrr_usage != NULL)
+				(*p_mtrr_usage)[i]++;
+		}
+	}
+#endif // CONFIG_MTRR
+
+	return 0;
+}
+
+int pci_release_file(struct file *file)
+{
+#ifdef CONFIG_MTRR
+	int i;
+	int **p_mtrr_usage = (int **)file->private_data;
+	int *mtrr_usage = *p_mtrr_usage;
+
+	if (!mtrr_usage)
+		return 0;
+
+	for (i = 0; i < num_var_ranges; ++i) {
+		while (mtrr_usage[i] > 0) {
+			mtrr_del(i, 0, 0);
+			--mtrr_usage[i];
+		}
+	}
+
+	kfree(*p_mtrr_usage);
+	*p_mtrr_usage = NULL;
+#endif // CONFIG_MTRR
+
 	return 0;
 }
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 0f6382f..4f36835 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -733,6 +733,13 @@ pci_mmap_resource_wc(struct kobject *kobj, struct bin_attribute *attr,
 	return pci_mmap_resource(kobj, attr, vma, 1);
 }
 
+static int
+pci_release_resource(struct kobject *kobj, struct bin_attribute *attr,
+		     struct file *file)
+{
+	return pci_release_file(file);
+}
+
 /**
  * pci_remove_resource_files - cleanup resource files
  * @pdev: dev to cleanup
@@ -782,6 +789,7 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
 			sprintf(res_attr_name, "resource%d", num);
 			res_attr->mmap = pci_mmap_resource_uc;
 		}
+		res_attr->release = pci_release_resource;
 		res_attr->attr.name = res_attr_name;
 		res_attr->attr.mode = S_IRUSR | S_IWUSR;
 		res_attr->size = pci_resource_len(pdev, num);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index d92d195..cff3b7d 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -84,6 +84,12 @@ static inline void pci_vpd_release(struct pci_dev *dev)
 		dev->vpd->ops->release(dev);
 }
 
+#ifdef HAVE_PCI_RELEASE_FILE
+extern int pci_release_file(struct file *file);
+#else
+static inline int pci_release_file(struct file *file) { return 0; }
+#endif
+
 /* PCI /proc functions */
 #ifdef CONFIG_PROC_FS
 extern int pci_proc_attach_device(struct pci_dev *dev);
diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 593bb84..91c8a48 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -197,6 +197,7 @@ proc_bus_pci_write(struct file *file, const char __user *buf, size_t nbytes, lof
 }
 
 struct pci_filp_private {
+	void *private;
 	enum pci_mmap_state mmap_state;
 	int write_combine;
 };
@@ -277,7 +278,7 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
 
 static int proc_bus_pci_open(struct inode *inode, struct file *file)
 {
-	struct pci_filp_private *fpriv = kmalloc(sizeof(*fpriv), GFP_KERNEL);
+	struct pci_filp_private *fpriv = kzalloc(sizeof(*fpriv), GFP_KERNEL);
 
 	if (!fpriv)
 		return -ENOMEM;
@@ -292,6 +293,8 @@ static int proc_bus_pci_open(struct inode *inode, struct file *file)
 
 static int proc_bus_pci_release(struct inode *inode, struct file *file)
 {
+	pci_release_file(file);
+
 	kfree(file->private_data);
 	file->private_data = NULL;
 
-- 
1.6.5.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-21 20:01     ` Thomas Schlichter
@ 2009-10-22  9:53       ` Suresh Siddha
  2009-10-22 15:34         ` Eric Anholt
  0 siblings, 1 reply; 60+ messages in thread
From: Suresh Siddha @ 2009-10-22  9:53 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Ingo Molnar, Jan Beulich, Jeremy Fitzhardinge, Robert Hancock,
	Henrique de Moraes Holschuh, Pallipadi, Venkatesh, Tejun Heo,
	x86@kernel.org, Yinghai Lu, Thomas Gleixner, Arjan van de Ven,
	dri-devel@lists.sourceforge.net, Ingo Molnar,
	linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
	Thomas Hellstrom, H. Peter Anvin

On Wed, 2009-10-21 at 13:01 -0700, Thomas Schlichter wrote:
> OK, so I think the attached patches should do it. Hopefully I have addressed 
> all your comments.

Thomas,

I have couple of issues with this patchset still. pci_mmap_page_range()
doesn't get called for each fork(). So, we won't be ref counting the
mtrr usage properly.

I need to think a bit more carefully on this. Can I get back to you
early next week on this, as I am traveling and need to think through
this?

We already keep track of some of the PAT ref counting using
track_pfn_vma_copy(). And we need to extend/use something similar here.

Even if we need to extend sysfs or pci vma ops, we need to increment and
decrement the ref count of the mtrr register that gets used. There is no
need to go through num_var_ranges etc.

thanks,
suresh


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
@ 2009-10-22 12:08 Thomas Schlichter
  2009-10-22 12:14 ` H. Peter Anvin
  2009-10-22 13:35 ` Suresh Siddha
  0 siblings, 2 replies; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-22 12:08 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Jan Beulich, Arjan van de Ven, dri-devel@lists.sourceforge.net,
	Robert Hancock, Henrique de Moraes Holschuh, H. Peter Anvin,
	jbarnes@virtuousgeek.org, Jeremy Fitzhardinge,
	linux-kernel@vger.kernel.org, Ingo Molnar, Ingo Molnar,
	Thomas Gleixner, Thomas Hellstrom, Tejun Heo,
	Pallipadi, Venkatesh, x86@kernel.org, Yinghai Lu

Suresh Siddha wrote:
> On Wed, 2009-10-21 at 13:01 -0700, Thomas Schlichter wrote:
> > OK, so I think the attached patches should do it. Hopefully I have addressed 
> > all your comments.
> 
> Thomas,
> 
> I have couple of issues with this patchset still. pci_mmap_page_range()
> doesn't get called for each fork(). So, we won't be ref counting the
> mtrr usage properly.

When forking, what happens with the "struct file"? If it is being copied, then the
processes share the same private data which would be freed during the first
release(). I think this would be a problem whereever file-private data are used.

So I think it must be shared between the forked processes and some reference
counting must exist. This reference counting must ensure that release() is only
called when all processes did close() their file.

And in that case (shared "struct file", one single release() call in the end) this
implementation should be completely safe...

> I need to think a bit more carefully on this. Can I get back to you
> early next week on this, as I am traveling and need to think through
> this?

Of course you can. But as I do this just in my spare time, I can only work on this
in the evenings...

> We already keep track of some of the PAT ref counting using
> track_pfn_vma_copy(). And we need to extend/use something similar here.
> 
> Even if we need to extend sysfs or pci vma ops, we need to increment and
> decrement the ref count of the mtrr register that gets used.

The MTRR register ref count is already incremented with each mtrr_add() or
mtrr_add_page() that has the "increment" parameter set to true. Which is the
case in my implementation.

> There is no need to go through num_var_ranges etc.

Well I have to remember wich file added which MTRR entries. Because I have
to remove them if the file is being closed. Therefore I need an array of size
"num_var_ranges" (or MTRR_MAX_VAR_RANGES which is the uper bound).

Btw. if there really is a problem in my reference counting code, it also exists in
the current code in:
  arch/x86/kernel/cpu/mtrr/if.c

My patch borrows its ref-counting parts from there...

Kind regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-22 12:08 Thomas Schlichter
@ 2009-10-22 12:14 ` H. Peter Anvin
  2009-10-22 13:26   ` Suresh Siddha
  2009-10-22 13:35 ` Suresh Siddha
  1 sibling, 1 reply; 60+ messages in thread
From: H. Peter Anvin @ 2009-10-22 12:14 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Suresh Siddha, Jan Beulich, Arjan van de Ven,
	dri-devel@lists.sourceforge.net, Robert Hancock,
	Henrique de Moraes Holschuh, jbarnes@virtuousgeek.org,
	Jeremy Fitzhardinge, linux-kernel@vger.kernel.org, Ingo Molnar,
	Ingo Molnar, Thomas Gleixner, Thomas Hellstrom, Tejun Heo,
	Pallipadi, Venkatesh, x86@kernel.org, Yinghai Lu

On 10/22/2009 09:08 PM, Thomas Schlichter wrote:
>>
>> I have couple of issues with this patchset still. pci_mmap_page_range()
>> doesn't get called for each fork(). So, we won't be ref counting the
>> mtrr usage properly.
>
> When forking, what happens with the "struct file"? If it is being copied, then the
> processes share the same private data which would be freed during the first
> release(). I think this would be a problem whereever file-private data are used.
>
> So I think it must be shared between the forked processes and some reference
> counting must exist. This reference counting must ensure that release() is only
> called when all processes did close() their file.
>
> And in that case (shared "struct file", one single release() call in the end) this
> implementation should be completely safe...
>

struct file is shared between forked processes.

	-hpa

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-22 12:14 ` H. Peter Anvin
@ 2009-10-22 13:26   ` Suresh Siddha
  0 siblings, 0 replies; 60+ messages in thread
From: Suresh Siddha @ 2009-10-22 13:26 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Thomas Schlichter, Jan Beulich, Arjan van de Ven,
	dri-devel@lists.sourceforge.net, Robert Hancock,
	Henrique de Moraes Holschuh, jbarnes@virtuousgeek.org,
	Jeremy Fitzhardinge, linux-kernel@vger.kernel.org, Ingo Molnar,
	Ingo Molnar, Thomas Gleixner, Thomas Hellstrom, Tejun Heo,
	Pallipadi, Venkatesh, x86@kernel.org, Yinghai Lu

On Thu, 2009-10-22 at 05:14 -0700, H. Peter Anvin wrote:
> On 10/22/2009 09:08 PM, Thomas Schlichter wrote:
> >>
> >> I have couple of issues with this patchset still. pci_mmap_page_range()
> >> doesn't get called for each fork(). So, we won't be ref counting the
> >> mtrr usage properly.
> >
> > When forking, what happens with the "struct file"? If it is being copied, then the
> > processes share the same private data which would be freed during the first
> > release(). I think this would be a problem whereever file-private data are used.
> >
> > So I think it must be shared between the forked processes and some reference
> > counting must exist. This reference counting must ensure that release() is only
> > called when all processes did close() their file.
> >
> > And in that case (shared "struct file", one single release() call in the end) this
> > implementation should be completely safe...
> >
> 
> struct file is shared between forked processes.

That is correct. But I am referring to the ref-count getting incremented
in Thomas's patch only in the pci_mmap_page_range() which will be called
only during first mmap.

We need to keep track of the counts of later forks too. For PAT, we keep
track of this ref counting in track_pfn_vma_copy(). We shouldn't use
different tracking mechanisms for PAT and non-PAT. We should cleanly tap
into track_pfn_vma_copy() or extend that to cover this case aswell.

thanks,
suresh


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-22 12:08 Thomas Schlichter
  2009-10-22 12:14 ` H. Peter Anvin
@ 2009-10-22 13:35 ` Suresh Siddha
  1 sibling, 0 replies; 60+ messages in thread
From: Suresh Siddha @ 2009-10-22 13:35 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jan Beulich, Arjan van de Ven, dri-devel@lists.sourceforge.net,
	Robert Hancock, Henrique de Moraes Holschuh, H. Peter Anvin,
	jbarnes@virtuousgeek.org, Jeremy Fitzhardinge,
	linux-kernel@vger.kernel.org, Ingo Molnar, Ingo Molnar,
	Thomas Gleixner, Thomas Hellstrom, Tejun Heo,
	Pallipadi, Venkatesh, x86@kernel.org, Yinghai Lu

On Thu, 2009-10-22 at 05:08 -0700, Thomas Schlichter wrote:
> When forking, what happens with the "struct file"? If it is being copied, then the
> processes share the same private data which would be freed during the first
> release(). I think this would be a problem whereever file-private data are used.
> 
> So I think it must be shared between the forked processes and some reference
> counting must exist. This reference counting must ensure that release() is only
> called when all processes did close() their file.
> 
> And in that case (shared "struct file", one single release() call in the end) this
> implementation should be completely safe...

I am referring to the refcount getting incremented. Also, let me think
about your direction (as the pci_mmap_page_rane() is explicitly adding
the mtrr, we should perhaps do the ref counting there, perhaps)

> > There is no need to go through num_var_ranges etc.
> 
> Well I have to remember wich file added which MTRR entries. Because I have
> to remove them if the file is being closed. Therefore I need an array of size
> "num_var_ranges" (or MTRR_MAX_VAR_RANGES which is the uper bound).

No. the private data  for example can keep track of a struct containing
mtrr number and ref count etc. Exporting var_ranges and going through
var ranges elements in an array is not clean, especially when you are
populating only one element.

thanks
suresh


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
@ 2009-10-22 14:27 Thomas Schlichter
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-22 14:27 UTC (permalink / raw)
  To: H. Peter Anvin, Suresh Siddha
  Cc: JBeulich, arjan, dri-devel, hancockrwd, hmh, jbarnes,
	jeremy.fitzhardinge, linux-kernel, mingo, mingo, tglx, thellstrom,
	tj, venkatesh.pallipadi, x86, yinghai

Suresh Siddha wrote:
> On Thu, 2009-10-22 at 05:14 -0700, H. Peter Anvin wrote:
> > On 10/22/2009 09:08 PM, Thomas Schlichter wrote:
> > > And in that case (shared "struct file", one single release() call in the end) this
> > > implementation should be completely safe...
> > 
> > struct file is shared between forked processes.
> 
> That is correct. But I am referring to the ref-count getting incremented
> in Thomas's patch only in the pci_mmap_page_range() which will be called
> only during first mmap.
> 
> We need to keep track of the counts of later forks too.

When forked processes do mmap() PCI additional memory,
pci_mmap_page_range() will be called again and the corresponding (shared)
mtrr_usage_count wil be incremented. So we do keep track of later forks...

Yes, the MTRR reference count has nothing to do with the processes using this
memory, but if you want this, arch/x86/kernel/cpu/mtrr/if.c must be changed, too.

Regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
@ 2009-10-22 14:41 Thomas Schlichter
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-22 14:41 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Jan Beulich, Arjan van de Ven, dri-devel@lists.sourceforge.net,
	Robert Hancock, Henrique de Moraes Holschuh, H. Peter Anvin,
	jbarnes@virtuousgeek.org, Jeremy Fitzhardinge,
	linux-kernel@vger.kernel.org, Ingo Molnar, Ingo Molnar,
	Thomas Gleixner, Thomas Hellstrom, Tejun Heo,
	Pallipadi, Venkatesh, x86@kernel.org, Yinghai Lu

Suresh Siddha wrote:
> > > There is no need to go through num_var_ranges etc.
> > 
> > Well I have to remember wich file added which MTRR entries. Because I have
> > to remove them if the file is being closed. Therefore I need an array of size
> > "num_var_ranges" (or MTRR_MAX_VAR_RANGES which is the uper bound).
> 
> No. the private data  for example can keep track of a struct containing
> mtrr number and ref count etc. Exporting var_ranges and going through
> var ranges elements in an array is not clean, especially when you are
> populating only one element.

OK, I should have written that num_var_ranges is neccessary if I do copy a
algorithm for exactly the same purpose from an other place. And I don't see
anything better in having a dynamically growing list that makes the operation
of incrementing MTRR entries an O(n) operation where it now is a O(1) operation.

Additionally, the worst case memory requirement would be
2*sizeof(int)*num_var_ranges, where it is now 1*sizeof(int)*num_var_ranges.

So for me, this would be a step back, but if you want this, you should additionally
change arch/x86/kernel/cpu/mtrr/if.c from where I reused the algorithm.

Regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-22  9:53       ` Suresh Siddha
@ 2009-10-22 15:34         ` Eric Anholt
  2009-10-22 21:47           ` Suresh Siddha
  0 siblings, 1 reply; 60+ messages in thread
From: Eric Anholt @ 2009-10-22 15:34 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Thomas Schlichter, Ingo Molnar, Jan Beulich, Jeremy Fitzhardinge,
	Robert Hancock, Henrique de Moraes Holschuh, Pallipadi, Venkatesh,
	Tejun Heo, x86@kernel.org, Yinghai Lu, Thomas Gleixner,
	Arjan van de Ven, dri-devel@lists.sourceforge.net, Ingo Molnar,
	linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
	Thomas Hellstrom, H. Peter Anvin

[-- Attachment #1: Type: text/plain, Size: 1159 bytes --]

On Thu, 2009-10-22 at 02:53 -0700, Suresh Siddha wrote:
> On Wed, 2009-10-21 at 13:01 -0700, Thomas Schlichter wrote:
> > OK, so I think the attached patches should do it. Hopefully I have addressed 
> > all your comments.
> 
> Thomas,
> 
> I have couple of issues with this patchset still. pci_mmap_page_range()
> doesn't get called for each fork(). So, we won't be ref counting the
> mtrr usage properly.
> 
> I need to think a bit more carefully on this. Can I get back to you
> early next week on this, as I am traveling and need to think through
> this?
> 
> We already keep track of some of the PAT ref counting using
> track_pfn_vma_copy(). And we need to extend/use something similar here.
> 
> Even if we need to extend sysfs or pci vma ops, we need to increment and
> decrement the ref count of the mtrr register that gets used. There is no
> need to go through num_var_ranges etc.

Can we just not create the _wc sysfs entry if we don't have PAT?  I
don't think there's userland relying on its presence as opposed to the
non-_wc entry.

-- 
Eric Anholt
eric@anholt.net                         eric.anholt@intel.com



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-22 15:34         ` Eric Anholt
@ 2009-10-22 21:47           ` Suresh Siddha
  2009-10-22 23:10             ` Jesse Barnes
  0 siblings, 1 reply; 60+ messages in thread
From: Suresh Siddha @ 2009-10-22 21:47 UTC (permalink / raw)
  To: Eric Anholt
  Cc: Thomas Schlichter, Ingo Molnar, Jan Beulich, Jeremy Fitzhardinge,
	Robert Hancock, Henrique de Moraes Holschuh, Pallipadi, Venkatesh,
	Tejun Heo, x86@kernel.org, Yinghai Lu, Thomas Gleixner,
	Arjan van de Ven, dri-devel@lists.sourceforge.net, Ingo Molnar,
	linux-kernel@vger.kernel.org, jbarnes@virtuousgeek.org,
	Thomas Hellstrom, H. Peter Anvin

On Thu, 2009-10-22 at 08:34 -0700, Eric Anholt wrote:
> Can we just not create the _wc sysfs entry if we don't have PAT?  I
> don't think there's userland relying on its presence as opposed to the
> non-_wc entry.

Yes indeed. Jesse do you see an issue with this? This is simple and
clean. Thanks Eric.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-22 21:47           ` Suresh Siddha
@ 2009-10-22 23:10             ` Jesse Barnes
  2009-10-23  0:11               ` Suresh Siddha
  0 siblings, 1 reply; 60+ messages in thread
From: Jesse Barnes @ 2009-10-22 23:10 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Eric Anholt, Thomas Schlichter, Ingo Molnar, Jan Beulich,
	Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Pallipadi, Venkatesh, Tejun Heo, x86@kernel.org, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven,
	dri-devel@lists.sourceforge.net, Ingo Molnar,
	linux-kernel@vger.kernel.org, Thomas Hellstrom, H. Peter Anvin

On Thu, 22 Oct 2009 14:47:30 -0700
Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> On Thu, 2009-10-22 at 08:34 -0700, Eric Anholt wrote:
> > Can we just not create the _wc sysfs entry if we don't have PAT?  I
> > don't think there's userland relying on its presence as opposed to
> > the non-_wc entry.
> 
> Yes indeed. Jesse do you see an issue with this? This is simple and
> clean. Thanks Eric.

Yeah, I think that will be fine.  In fact, older versions of
libpciaccess will behave better if we do it that way (iirc it only
allocates an MTRR if the resource_wc file doesn't exist or fails to get
mapped).

Jesse

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-22 23:10             ` Jesse Barnes
@ 2009-10-23  0:11               ` Suresh Siddha
  2009-10-23  1:53                 ` Eric Anholt
  0 siblings, 1 reply; 60+ messages in thread
From: Suresh Siddha @ 2009-10-23  0:11 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Eric Anholt, Thomas Schlichter, Ingo Molnar, Jan Beulich,
	Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Pallipadi, Venkatesh, Tejun Heo, x86@kernel.org, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven,
	dri-devel@lists.sourceforge.net, Ingo Molnar,
	linux-kernel@vger.kernel.org, Thomas Hellstrom, H. Peter Anvin

On Thu, 2009-10-22 at 16:10 -0700, Jesse Barnes wrote:
> On Thu, 22 Oct 2009 14:47:30 -0700
> Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> 
> > On Thu, 2009-10-22 at 08:34 -0700, Eric Anholt wrote:
> > > Can we just not create the _wc sysfs entry if we don't have PAT?  I
> > > don't think there's userland relying on its presence as opposed to
> > > the non-_wc entry.
> > 
> > Yes indeed. Jesse do you see an issue with this? This is simple and
> > clean. Thanks Eric.
> 
> Yeah, I think that will be fine.  In fact, older versions of
> libpciaccess will behave better if we do it that way (iirc it only
> allocates an MTRR if the resource_wc file doesn't exist or fails to get
> mapped).

Eric, care to send the patch?


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-23  0:11               ` Suresh Siddha
@ 2009-10-23  1:53                 ` Eric Anholt
  2009-10-23  4:31                   ` Jesse Barnes
  2009-10-23  4:33                   ` Suresh Siddha
  0 siblings, 2 replies; 60+ messages in thread
From: Eric Anholt @ 2009-10-23  1:53 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Jesse Barnes, Thomas Schlichter, Ingo Molnar, Jan Beulich,
	Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Pallipadi, Venkatesh, Tejun Heo, x86@kernel.org, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven,
	dri-devel@lists.sourceforge.net, Ingo Molnar,
	linux-kernel@vger.kernel.org, Thomas Hellstrom, H. Peter Anvin

[-- Attachment #1: Type: text/plain, Size: 1164 bytes --]

On Thu, 2009-10-22 at 17:11 -0700, Suresh Siddha wrote:
> On Thu, 2009-10-22 at 16:10 -0700, Jesse Barnes wrote:
> > On Thu, 22 Oct 2009 14:47:30 -0700
> > Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > 
> > > On Thu, 2009-10-22 at 08:34 -0700, Eric Anholt wrote:
> > > > Can we just not create the _wc sysfs entry if we don't have PAT?  I
> > > > don't think there's userland relying on its presence as opposed to
> > > > the non-_wc entry.
> > > 
> > > Yes indeed. Jesse do you see an issue with this? This is simple and
> > > clean. Thanks Eric.
> > 
> > Yeah, I think that will be fine.  In fact, older versions of
> > libpciaccess will behave better if we do it that way (iirc it only
> > allocates an MTRR if the resource_wc file doesn't exist or fails to get
> > mapped).
> 
> Eric, care to send the patch?

I don't have a patch, I was just suggesting a way to handle the
submitter's problem that won't involve complicated changes that nobody
else will be testing since everyone *should* have a graphics driver for
their graphics hardware.

-- 
Eric Anholt
eric@anholt.net                         eric.anholt@intel.com



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-23  1:53                 ` Eric Anholt
@ 2009-10-23  4:31                   ` Jesse Barnes
  2009-10-23  4:58                     ` Suresh Siddha
  2009-10-23  4:33                   ` Suresh Siddha
  1 sibling, 1 reply; 60+ messages in thread
From: Jesse Barnes @ 2009-10-23  4:31 UTC (permalink / raw)
  To: Eric Anholt
  Cc: Suresh Siddha, Thomas Hellstrom, Jeremy Fitzhardinge,
	H. Peter Anvin, Arjan van de Ven, x86@kernel.org,
	linux-kernel@vger.kernel.org, Jan Beulich, Ingo Molnar,
	Thomas Gleixner, Henrique de Moraes Holschuh,
	Pallipadi, Venkatesh, Tejun Heo, Thomas Schlichter,
	dri-devel@lists.sourceforge.net, Yinghai Lu, Ingo Molnar,
	Robert Hancock

On Thu, 22 Oct 2009 18:53:19 -0700
Eric Anholt <eric@anholt.net> wrote:

> On Thu, 2009-10-22 at 17:11 -0700, Suresh Siddha wrote:
> > On Thu, 2009-10-22 at 16:10 -0700, Jesse Barnes wrote:
> > > On Thu, 22 Oct 2009 14:47:30 -0700
> > > Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > > 
> > > > On Thu, 2009-10-22 at 08:34 -0700, Eric Anholt wrote:
> > > > > Can we just not create the _wc sysfs entry if we don't have
> > > > > PAT?  I don't think there's userland relying on its presence
> > > > > as opposed to the non-_wc entry.
> > > > 
> > > > Yes indeed. Jesse do you see an issue with this? This is simple
> > > > and clean. Thanks Eric.
> > > 
> > > Yeah, I think that will be fine.  In fact, older versions of
> > > libpciaccess will behave better if we do it that way (iirc it only
> > > allocates an MTRR if the resource_wc file doesn't exist or fails
> > > to get mapped).
> > 
> > Eric, care to send the patch?
> 
> I don't have a patch, I was just suggesting a way to handle the
> submitter's problem that won't involve complicated changes that nobody
> else will be testing since everyone *should* have a graphics driver
> for their graphics hardware.

Here's a quick & dirty version, totally untested.  A cleaner approach
would be to separate the WC mapping routines and hide the return
-EINVAL in arch specific code...

Jesse

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 0f6382f..41010bb 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -23,6 +23,9 @@
 #include <linux/mm.h>
 #include <linux/capability.h>
 #include <linux/pci-aspm.h>
+#ifdef CONFIG_X86
+#include <asm/pat.h>
+#endif
 #include "pci.h"
 
 static int sysfs_initialized;	/* = 0 */
@@ -730,6 +733,10 @@ static int
 pci_mmap_resource_wc(struct kobject *kobj, struct bin_attribute *attr,
 		     struct vm_area_struct *vma)
 {
+#ifdef CONFIG_X86
+	if (!pat_enabled)
+		return -EINVAL;
+#endif
 	return pci_mmap_resource(kobj, attr, vma, 1);
 }
 

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-23  1:53                 ` Eric Anholt
  2009-10-23  4:31                   ` Jesse Barnes
@ 2009-10-23  4:33                   ` Suresh Siddha
  1 sibling, 0 replies; 60+ messages in thread
From: Suresh Siddha @ 2009-10-23  4:33 UTC (permalink / raw)
  To: Eric Anholt
  Cc: Jesse Barnes, Thomas Schlichter, Ingo Molnar, Jan Beulich,
	Jeremy Fitzhardinge, Robert Hancock, Henrique de Moraes Holschuh,
	Pallipadi, Venkatesh, Tejun Heo, x86@kernel.org, Yinghai Lu,
	Thomas Gleixner, Arjan van de Ven,
	dri-devel@lists.sourceforge.net, Ingo Molnar,
	linux-kernel@vger.kernel.org, Thomas Hellstrom, H. Peter Anvin

On Thu, 2009-10-22 at 18:53 -0700, Eric Anholt wrote:
> On Thu, 2009-10-22 at 17:11 -0700, Suresh Siddha wrote:
> > On Thu, 2009-10-22 at 16:10 -0700, Jesse Barnes wrote:
> > > On Thu, 22 Oct 2009 14:47:30 -0700
> > > Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > > 
> > > > On Thu, 2009-10-22 at 08:34 -0700, Eric Anholt wrote:
> > > > > Can we just not create the _wc sysfs entry if we don't have PAT?  I
> > > > > don't think there's userland relying on its presence as opposed to
> > > > > the non-_wc entry.
> > > > 
> > > > Yes indeed. Jesse do you see an issue with this? This is simple and
> > > > clean. Thanks Eric.
> > > 
> > > Yeah, I think that will be fine.  In fact, older versions of
> > > libpciaccess will behave better if we do it that way (iirc it only
> > > allocates an MTRR if the resource_wc file doesn't exist or fails to get
> > > mapped).
> > 
> > Eric, care to send the patch?
> 
> I don't have a patch, I was just suggesting a way to handle the
> submitter's problem that won't involve complicated changes that nobody
> else will be testing since everyone *should* have a graphics driver for
> their graphics hardware.

I can send the patch early next week unless Thomas or someone else beats
me.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-23  4:31                   ` Jesse Barnes
@ 2009-10-23  4:58                     ` Suresh Siddha
  2009-10-23  7:24                       ` Thomas Schlichter
  0 siblings, 1 reply; 60+ messages in thread
From: Suresh Siddha @ 2009-10-23  4:58 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Eric Anholt, Thomas Hellstrom, Jeremy Fitzhardinge,
	H. Peter Anvin, Arjan van de Ven, x86@kernel.org,
	linux-kernel@vger.kernel.org, Jan Beulich, Ingo Molnar,
	Thomas Gleixner, Henrique de Moraes Holschuh,
	Pallipadi, Venkatesh, Tejun Heo, Thomas Schlichter,
	dri-devel@lists.sourceforge.net, Yinghai Lu, Ingo Molnar,
	Robert Hancock

On Thu, 2009-10-22 at 21:31 -0700, Jesse Barnes wrote:
> Here's a quick & dirty version, totally untested.  A cleaner approach
> would be to separate the WC mapping routines and hide the return
> -EINVAL in arch specific code...

Jesse How about this patch? Doing this in x86 is cleaner.

I would like Acks/sign-offs-by Thomas, Eric and Jesse, if it is ok with
this patch and works.

thanks,
suresh
---

From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: x86, pat: return EINVAL for pci mmap WC request for !pat_enabled

Thomas Schlichter reported:
> X.org uses libpciaccess which tries to mmap with write combining enabled via
> /sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, the
> kernel does fall back to uncached mmap. Then libpciaccess thinks it succeeded
> mapping with write combining enabled and does not set up suited MTRR entries.
> ;-(

Instead of silently mapping pci mmap region as UC minus in the case
of !pat_enabled and wc request, we can return error. Eric Anholt mentioned
that caller (like X) typically follows up with UC minus pci mmap request and
if there is a free mtrr slot, caller will manage adding WC mtrr.

Jesse Barnes says:
> Older versions of libpciaccess will behave better if we do it that way
> (iirc it only allocates an MTRR if the resource_wc file doesn't exist or
> fails to get mapped).

Reported-by: Thomas Schlichter <thomas.schlichter@web.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---

diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index b22d13b..a672f12 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -282,6 +282,15 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 		return -EINVAL;
 
 	prot = pgprot_val(vma->vm_page_prot);
+
+	/*
+ 	 * Return error if pat is not enabled and write_combine is requested.
+ 	 * Caller can followup with UC MINUS request and add a WC mtrr if there
+ 	 * is a free mtrr slot.
+ 	 */
+	if (!pat_enabled && write_combine)
+		return -EINVAL;
+
 	if (pat_enabled && write_combine)
 		prot |= _PAGE_CACHE_WC;
 	else if (pat_enabled || boot_cpu_data.x86 > 3)



^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-23  4:58                     ` Suresh Siddha
@ 2009-10-23  7:24                       ` Thomas Schlichter
  2009-10-23 14:24                         ` Suresh Siddha
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Schlichter @ 2009-10-23  7:24 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Jesse Barnes, Eric Anholt, Thomas Hellstrom, Jeremy Fitzhardinge,
	H. Peter Anvin, Arjan van de Ven, x86@kernel.org,
	linux-kernel@vger.kernel.org, Jan Beulich, Ingo Molnar,
	Thomas Gleixner, Henrique de Moraes Holschuh,
	Pallipadi, Venkatesh, Tejun Heo, dri-devel@lists.sourceforge.net,
	Yinghai Lu, Ingo Molnar, Robert Hancock

Suresh Siddha wrote:
> On Thu, 2009-10-22 at 21:31 -0700, Jesse Barnes wrote:
> > Here's a quick & dirty version, totally untested.  A cleaner approach
> > would be to separate the WC mapping routines and hide the return
> > -EINVAL in arch specific code...
> 
> Jesse How about this patch? Doing this in x86 is cleaner.
> 
> I would like Acks/sign-offs-by Thomas, Eric and Jesse, if it is ok with
> this patch and works.

Hmm, at this point I already was more than a week ago:
  http://marc.info/?l=linux-kernel&m=125537770514713&w=2

OK, I also modified ioremap() and set_memory_wc() but your patch is just part 
of what I did there...

And Eric Anholt answered:
> Seems like we should install an MTRR instead.  Requiring userland to set
> up the MTRR on the kernel's behalf is backwards.

Where I totally agree with him.

Regards,
  Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-23  7:24                       ` Thomas Schlichter
@ 2009-10-23 14:24                         ` Suresh Siddha
  2009-10-23 14:37                           ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Suresh Siddha @ 2009-10-23 14:24 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jesse Barnes, Eric Anholt, Thomas Hellstrom, Jeremy Fitzhardinge,
	H. Peter Anvin, Arjan van de Ven, x86@kernel.org,
	linux-kernel@vger.kernel.org, Jan Beulich, Ingo Molnar,
	Thomas Gleixner, Henrique de Moraes Holschuh,
	Pallipadi, Venkatesh, Tejun Heo, dri-devel@lists.sourceforge.net,
	Yinghai Lu, Ingo Molnar, Robert Hancock

On Fri, 2009-10-23 at 00:24 -0700, Thomas Schlichter wrote:
> Hmm, at this point I already was more than a week ago:
>   http://marc.info/?l=linux-kernel&m=125537770514713&w=2
> 
> OK, I also modified ioremap() and set_memory_wc() but your patch is just part 
> of what I did there...

oops! Sorry I read the later emails in this thread carefully but not the
conversation with Thomas Hellstrom. Anyhow, glad that you were on top of
this. Thanks.

Ingo, can you please queue this patch with a sign-off from Thomas
Schlichter too?

thanks,
suresh


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC Patch] use MTRR for write combining if PAT is not available
  2009-10-23 14:24                         ` Suresh Siddha
@ 2009-10-23 14:37                           ` Ingo Molnar
  0 siblings, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2009-10-23 14:37 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Thomas Schlichter, Jesse Barnes, Eric Anholt, Thomas Hellstrom,
	Jeremy Fitzhardinge, H. Peter Anvin, Arjan van de Ven,
	x86@kernel.org, linux-kernel@vger.kernel.org, Jan Beulich,
	Ingo Molnar, Thomas Gleixner, Henrique de Moraes Holschuh,
	Pallipadi, Venkatesh, Tejun Heo, dri-devel@lists.sourceforge.net,
	Yinghai Lu, Robert Hancock


* Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> On Fri, 2009-10-23 at 00:24 -0700, Thomas Schlichter wrote:
> > Hmm, at this point I already was more than a week ago:
> >   http://marc.info/?l=linux-kernel&m=125537770514713&w=2
> > 
> > OK, I also modified ioremap() and set_memory_wc() but your patch is just part 
> > of what I did there...
> 
> oops! Sorry I read the later emails in this thread carefully but not 
> the conversation with Thomas Hellstrom. Anyhow, glad that you were on 
> top of this. Thanks.
> 
> Ingo, can you please queue this patch with a sign-off from Thomas 
> Schlichter too?

Please resend the latest with all the acks/signoffs added, with the 
subject line changed to '[PATCH] ...' and all people Cc:-ed.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2009-10-23 14:38 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-10  1:22 [RFC Patch] use MTRR for write combining if PAT is not available Thomas Schlichter
2009-10-10  4:24 ` Arjan van de Ven
2009-10-10  8:31   ` Thomas Schlichter
2009-10-10 15:45     ` Arjan van de Ven
2009-10-10 17:50       ` Roland Dreier
2009-10-11  7:40       ` Ingo Molnar
2009-10-11  9:56       ` Thomas Schlichter
2009-10-11 18:51   ` Henrique de Moraes Holschuh
2009-10-11 18:54     ` Arjan van de Ven
2009-10-11 20:19       ` Henrique de Moraes Holschuh
2009-10-12 18:09         ` Robert Hancock
     [not found] <200910122032.52168.thomas.schlichter@web.de>
2009-10-12 19:16 ` Thomas Hellstrom
2009-10-12 19:45   ` Thomas Schlichter
     [not found]     ` <1255378684.2063.5.camel@gaiman.anholt.net>
2009-10-13 21:05       ` Thomas Schlichter
  -- strict thread matches above, loose matches on Subject: below --
2009-10-13  7:34 Jan Beulich
2009-10-13 21:29 ` Thomas Schlichter
2009-10-14  8:13   ` Jan Beulich
2009-10-14 19:14     ` Thomas Schlichter
2009-10-15  7:48       ` Jan Beulich
2009-10-17 19:48         ` Thomas Schlichter
2009-10-19  9:16           ` Jan Beulich
2009-10-19 13:44             ` Suresh Siddha
2009-10-19 13:54               ` Ingo Molnar
2009-10-19 13:36           ` Konrad Rzeszutek Wilk
2009-10-19 14:47 Thomas Schlichter
2009-10-20 19:54 ` Thomas Schlichter
2009-10-21 11:57   ` Ingo Molnar
2009-10-19 14:59 Thomas Schlichter
2009-10-19 15:31 ` Ingo Molnar
2009-10-19 21:49   ` Suresh Siddha
2009-10-20 20:35   ` Thomas Schlichter
2009-10-20 21:59     ` Suresh Siddha
2009-10-21 11:52       ` Ingo Molnar
2009-10-19 15:07 Thomas Schlichter
2009-10-19 15:10 Thomas Schlichter
2009-10-19 15:28 ` Ingo Molnar
2009-10-21 13:45 Thomas Schlichter
2009-10-21 14:11 ` Jan Beulich
2009-10-21 17:35   ` Ingo Molnar
2009-10-21 20:01     ` Thomas Schlichter
2009-10-22  9:53       ` Suresh Siddha
2009-10-22 15:34         ` Eric Anholt
2009-10-22 21:47           ` Suresh Siddha
2009-10-22 23:10             ` Jesse Barnes
2009-10-23  0:11               ` Suresh Siddha
2009-10-23  1:53                 ` Eric Anholt
2009-10-23  4:31                   ` Jesse Barnes
2009-10-23  4:58                     ` Suresh Siddha
2009-10-23  7:24                       ` Thomas Schlichter
2009-10-23 14:24                         ` Suresh Siddha
2009-10-23 14:37                           ` Ingo Molnar
2009-10-23  4:33                   ` Suresh Siddha
2009-10-21 14:38 Thomas Schlichter
2009-10-21 15:14 ` Jan Beulich
2009-10-22 12:08 Thomas Schlichter
2009-10-22 12:14 ` H. Peter Anvin
2009-10-22 13:26   ` Suresh Siddha
2009-10-22 13:35 ` Suresh Siddha
2009-10-22 14:27 Thomas Schlichter
2009-10-22 14:41 Thomas Schlichter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).