* [PATCH v4 0/5] pci/devres: add and use pci_iomap_wc() and pcim_iomap_wc()
@ 2015-04-29 21:36 Luis R. Rodriguez
  2015-04-29 21:36 ` [PATCH v4 1/5] pci: add pci_iomap_wc() variants Luis R. Rodriguez
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-04-29 21:36 UTC (permalink / raw)
  To: bhelgaas, mst, plagnioj, tomi.valkeinen, airlied, daniel.vetter
  Cc: linux-fbdev, luto, cocci, linux-kernel, Luis R. Rodriguez
From: "Luis R. Rodriguez" <mcgrof@suse.com>
This series adds pci_iomap_wc() and the respective devm helper pcim_iomap_wc().
Luis R. Rodriguez (5):
  pci: add pci_iomap_wc() variants
  lib: devres: add pcim_iomap_wc() variants
  video: fbdev: arkfb: use arch_phys_wc_add() and pci_iomap_wc()
  video: fbdev: s3fb: use arch_phys_wc_add() and pci_iomap_wc()
  video: fbdev: vt8623fb: use arch_phys_wc_add() and pci_iomap_wc()
 drivers/video/fbdev/arkfb.c     | 36 +++----------------
 drivers/video/fbdev/s3fb.c      | 35 ++++--------------
 drivers/video/fbdev/vt8623fb.c  | 31 ++++------------
 include/asm-generic/pci_iomap.h | 14 ++++++++
 include/linux/pci.h             |  2 ++
 lib/devres.c                    | 78 +++++++++++++++++++++++++++++++++++++++++
 lib/pci_iomap.c                 | 61 ++++++++++++++++++++++++++++++++
 7 files changed, 172 insertions(+), 85 deletions(-)
-- 
2.3.2.209.gd67f9d5.dirty
^ permalink raw reply	[flat|nested] 14+ messages in thread
* [PATCH v4 1/5] pci: add pci_iomap_wc() variants
  2015-04-29 21:36 [PATCH v4 0/5] pci/devres: add and use pci_iomap_wc() and pcim_iomap_wc() Luis R. Rodriguez
@ 2015-04-29 21:36 ` Luis R. Rodriguez
  2015-04-30 15:59   ` Bjorn Helgaas
  2015-04-29 21:36 ` [PATCH v4 2/5] lib: devres: add pcim_iomap_wc() variants Luis R. Rodriguez
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-04-29 21:36 UTC (permalink / raw)
  To: bhelgaas, mst, plagnioj, tomi.valkeinen, airlied, daniel.vetter
  Cc: linux-fbdev, luto, cocci, linux-kernel, Luis R. Rodriguez,
	Toshi Kani, Suresh Siddha, Ingo Molnar, Thomas Gleixner,
	Juergen Gross, Daniel Vetter, Dave Airlie, Antonino Daplas,
	Dave Hansen, Arnd Bergmann, venkatesh.pallipadi, Stefan Bader,
	Ville Syrjälä, Mel Gorman, Vlastimil Babka,
	Borislav Petkov, Davidlohr Bueso, konrad.wilk, ville.syrjala,
	david.vrabel, jbeulich, Roger Pau Monné, xen-devel
From: "Luis R. Rodriguez" <mcgrof@suse.com>
This allows drivers to take advantage of write-combining
when possible. Ideally we'd have pci_read_bases() just
peg an IORESOURCE_WC flag for us but where exactly
video devices memory lie varies *largely* and at times things
are mixed with MMIO registers, sometimes we can address
the changes in drivers, other times the change requires
intrusive changes.
Although there is also arch_phys_wc_add() that makes use of
architecture specific write-combining alternatives (MTRR on
x86 when a system does not have PAT) we void polluting
pci_iomap() space with it and force drivers and subsystems
that want to use it to be explicit.
There are a few motivations for this:
a) Take advantage of PAT when available
b) Help bury MTRR code away, MTRR is architecture specific and on
   x86 its replaced by PAT
c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: venkatesh.pallipadi@intel.com
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: Ville Syrjälä <syrjala@sci.fi>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Borislav Petkov <bp@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: konrad.wilk@oracle.com
Cc: ville.syrjala@linux.intel.com
Cc: david.vrabel@citrix.com
Cc: jbeulich@suse.com
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 include/asm-generic/pci_iomap.h | 14 ++++++++++
 lib/pci_iomap.c                 | 61 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+)
diff --git a/include/asm-generic/pci_iomap.h b/include/asm-generic/pci_iomap.h
index 7389c87..b1e17fc 100644
--- a/include/asm-generic/pci_iomap.h
+++ b/include/asm-generic/pci_iomap.h
@@ -15,9 +15,13 @@ struct pci_dev;
 #ifdef CONFIG_PCI
 /* Create a virtual mapping cookie for a PCI BAR (memory or IO) */
 extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max);
+extern void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned long max);
 extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
 				     unsigned long offset,
 				     unsigned long maxlen);
+extern void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
+					unsigned long offset,
+					unsigned long maxlen);
 /* Create a virtual mapping cookie for a port on a given PCI device.
  * Do not call this directly, it exists to make it easier for architectures
  * to override */
@@ -34,12 +38,22 @@ static inline void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned lon
 	return NULL;
 }
 
+static inline void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned long max)
+{
+	return NULL;
+}
 static inline void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
 					    unsigned long offset,
 					    unsigned long maxlen)
 {
 	return NULL;
 }
+static inline void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
+					       unsigned long offset,
+					       unsigned long maxlen)
+{
+	return NULL;
+}
 #endif
 
 #endif /* __ASM_GENERIC_IO_H */
diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
index bcce5f1..30b65ae 100644
--- a/lib/pci_iomap.c
+++ b/lib/pci_iomap.c
@@ -52,6 +52,46 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
 EXPORT_SYMBOL(pci_iomap_range);
 
 /**
+ * pci_iomap_wc_range - create a virtual WC mapping cookie for a PCI BAR
+ * @dev: PCI device that owns the BAR
+ * @bar: BAR number
+ * @offset: map memory at the given offset in BAR
+ * @maxlen: max length of the memory to map
+ *
+ * Using this function you will get a __iomem address to your device BAR.
+ * You can access it using ioread*() and iowrite*(). These functions hide
+ * the details if this is a MMIO or PIO address space and will just do what
+ * you expect from them in the correct way. When possible write combining
+ * is used.
+ *
+ * @maxlen specifies the maximum length to map. If you want to get access to
+ * the complete BAR from offset to the end, pass %0 here.
+ * */
+void __iomem *pci_iomap_wc_range(struct pci_dev *dev,
+				 int bar,
+				 unsigned long offset,
+				 unsigned long maxlen)
+{
+	resource_size_t start = pci_resource_start(dev, bar);
+	resource_size_t len = pci_resource_len(dev, bar);
+	unsigned long flags = pci_resource_flags(dev, bar);
+
+	if (len <= offset || !start)
+		return NULL;
+	len -= offset;
+	start += offset;
+	if (maxlen && len > maxlen)
+		len = maxlen;
+	if (flags & IORESOURCE_IO)
+		return __pci_ioport_map(dev, start, len);
+	if (flags & IORESOURCE_MEM)
+		return ioremap_wc(start, len);
+	/* What? */
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(pci_iomap_wc_range);
+
+/**
  * pci_iomap - create a virtual mapping cookie for a PCI BAR
  * @dev: PCI device that owns the BAR
  * @bar: BAR number
@@ -70,4 +110,25 @@ void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
 	return pci_iomap_range(dev, bar, 0, maxlen);
 }
 EXPORT_SYMBOL(pci_iomap);
+
+/**
+ * pci_iomap_wc - create a virtual WC mapping cookie for a PCI BAR
+ * @dev: PCI device that owns the BAR
+ * @bar: BAR number
+ * @maxlen: length of the memory to map
+ *
+ * Using this function you will get a __iomem address to your device BAR.
+ * You can access it using ioread*() and iowrite*(). These functions hide
+ * the details if this is a MMIO or PIO address space and will just do what
+ * you expect from them in the correct way. When possible write combining
+ * is used.
+ *
+ * @maxlen specifies the maximum length to map. If you want to get access to
+ * the complete BAR without checking for its length first, pass %0 here.
+ * */
+void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned long maxlen)
+{
+	return pci_iomap_wc_range(dev, bar, 0, maxlen);
+}
+EXPORT_SYMBOL_GPL(pci_iomap_wc);
 #endif /* CONFIG_PCI */
-- 
2.3.2.209.gd67f9d5.dirty
^ permalink raw reply related	[flat|nested] 14+ messages in thread
* [PATCH v4 2/5] lib: devres: add pcim_iomap_wc() variants
  2015-04-29 21:36 [PATCH v4 0/5] pci/devres: add and use pci_iomap_wc() and pcim_iomap_wc() Luis R. Rodriguez
  2015-04-29 21:36 ` [PATCH v4 1/5] pci: add pci_iomap_wc() variants Luis R. Rodriguez
@ 2015-04-29 21:36 ` Luis R. Rodriguez
  2015-04-30 16:26   ` Bjorn Helgaas
  2015-04-29 21:36 ` [PATCH v4 3/5] video: fbdev: arkfb: use arch_phys_wc_add() and pci_iomap_wc() Luis R. Rodriguez
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-04-29 21:36 UTC (permalink / raw)
  To: bhelgaas, mst, plagnioj, tomi.valkeinen, airlied, daniel.vetter
  Cc: linux-fbdev, luto, cocci, linux-kernel, Luis R. Rodriguez,
	Toshi Kani, Suresh Siddha, Ingo Molnar, Thomas Gleixner,
	Juergen Gross, Daniel Vetter, Dave Airlie, Antonino Daplas,
	Dave Hansen, Arnd Bergmann, venkatesh.pallipadi, Stefan Bader,
	Ville Syrjälä, Mel Gorman, Vlastimil Babka,
	Borislav Petkov, Davidlohr Bueso, konrad.wilk, ville.syrjala,
	david.vrabel, jbeulich, Roger Pau Monné, xen-devel
From: "Luis R. Rodriguez" <mcgrof@suse.com>
Now that we have pci_iomap_wc() add the respective devres helpers.
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: venkatesh.pallipadi@intel.com
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: Ville Syrjälä <syrjala@sci.fi>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Borislav Petkov <bp@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: konrad.wilk@oracle.com
Cc: ville.syrjala@linux.intel.com
Cc: david.vrabel@citrix.com
Cc: jbeulich@suse.com
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 include/linux/pci.h |  2 ++
 lib/devres.c        | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 490ca41..cb317c4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1619,9 +1619,11 @@ static inline void pci_dev_specific_enable_acs(struct pci_dev *dev) { }
 #endif
 
 void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned long maxlen);
+void __iomem *pcim_iomap_wc(struct pci_dev *pdev, int bar, unsigned long maxlen);
 void pcim_iounmap(struct pci_dev *pdev, void __iomem *addr);
 void __iomem * const *pcim_iomap_table(struct pci_dev *pdev);
 int pcim_iomap_regions(struct pci_dev *pdev, int mask, const char *name);
+int pcim_iomap_wc_regions(struct pci_dev *pdev, int mask, const char *name);
 int pcim_iomap_regions_request_all(struct pci_dev *pdev, int mask,
 				   const char *name);
 void pcim_iounmap_regions(struct pci_dev *pdev, int mask);
diff --git a/lib/devres.c b/lib/devres.c
index fbe2aac..d59a2b9 100644
--- a/lib/devres.c
+++ b/lib/devres.c
@@ -304,6 +304,30 @@ void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned long maxlen)
 EXPORT_SYMBOL(pcim_iomap);
 
 /**
+ * pcim_iomap_wc - Managed pcim_iomap_wc()
+ * @pdev: PCI device to iomap for
+ * @bar: BAR to iomap
+ * @maxlen: Maximum length of iomap
+ *
+ * Managed pci_iomap_wc().  Map is automatically unmapped on driver
+ * detach.
+ */
+void __iomem *pcim_iomap_wc(struct pci_dev *pdev, int bar, unsigned long maxlen)
+{
+	void __iomem **tbl;
+
+	BUG_ON(bar >= PCIM_IOMAP_MAX);
+
+	tbl = (void __iomem **)pcim_iomap_table(pdev);
+	if (!tbl || tbl[bar])	/* duplicate mappings not allowed */
+		return NULL;
+
+	tbl[bar] = pci_iomap_wc(pdev, bar, maxlen);
+	return tbl[bar];
+}
+EXPORT_SYMBOL_GPL(pcim_iomap_wc);
+
+/**
  * pcim_iounmap - Managed pci_iounmap()
  * @pdev: PCI device to iounmap for
  * @addr: Address to unmap
@@ -383,6 +407,60 @@ int pcim_iomap_regions(struct pci_dev *pdev, int mask, const char *name)
 EXPORT_SYMBOL(pcim_iomap_regions);
 
 /**
+ * pcim_iomap_wc_regions - Request and iomap PCI BARs with write-combining
+ * @pdev: PCI device to map IO resources for
+ * @mask: Mask of BARs to request and iomap
+ * @name: Name used when requesting regions
+ *
+ * Request and iomap regions specified by @mask with a preference for
+ * write-combining.
+ */
+int pcim_iomap_wc_regions(struct pci_dev *pdev, int mask, const char *name)
+{
+	void __iomem * const *iomap;
+	int i, rc;
+
+	iomap = pcim_iomap_table(pdev);
+	if (!iomap)
+		return -ENOMEM;
+
+	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
+		unsigned long len;
+
+		if (!(mask & (1 << i)))
+			continue;
+
+		rc = -EINVAL;
+		len = pci_resource_len(pdev, i);
+		if (!len)
+			goto err_inval;
+
+		rc = pci_request_region(pdev, i, name);
+		if (rc)
+			goto err_inval;
+
+		rc = -ENOMEM;
+		if (!pcim_iomap_wc(pdev, i, 0))
+			goto err_region;
+	}
+
+	return 0;
+
+ err_region:
+	pci_release_region(pdev, i);
+ err_inval:
+	while (--i >= 0) {
+		if (!(mask & (1 << i)))
+			continue;
+		pcim_iounmap(pdev, iomap[i]);
+		pci_release_region(pdev, i);
+	}
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(pcim_iomap_wc_regions);
+
+/**
  * pcim_iomap_regions_request_all - Request all BARs and iomap specified ones
  * @pdev: PCI device to map IO resources for
  * @mask: Mask of BARs to iomap
-- 
2.3.2.209.gd67f9d5.dirty
^ permalink raw reply related	[flat|nested] 14+ messages in thread
* [PATCH v4 3/5] video: fbdev: arkfb: use arch_phys_wc_add() and pci_iomap_wc()
  2015-04-29 21:36 [PATCH v4 0/5] pci/devres: add and use pci_iomap_wc() and pcim_iomap_wc() Luis R. Rodriguez
  2015-04-29 21:36 ` [PATCH v4 1/5] pci: add pci_iomap_wc() variants Luis R. Rodriguez
  2015-04-29 21:36 ` [PATCH v4 2/5] lib: devres: add pcim_iomap_wc() variants Luis R. Rodriguez
@ 2015-04-29 21:36 ` Luis R. Rodriguez
  2015-04-29 21:36 ` [PATCH v4 4/5] video: fbdev: s3fb: " Luis R. Rodriguez
  2015-04-29 21:36 ` [PATCH v4 5/5] video: fbdev: vt8623fb: " Luis R. Rodriguez
  4 siblings, 0 replies; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-04-29 21:36 UTC (permalink / raw)
  To: bhelgaas, mst, plagnioj, tomi.valkeinen, airlied, daniel.vetter
  Cc: linux-fbdev, luto, cocci, linux-kernel, Luis R. Rodriguez,
	Laurent Pinchart, Geert Uytterhoeven, Lad, Prabhakar,
	Suresh Siddha, Ingo Molnar, Thomas Gleixner, Juergen Gross,
	Daniel Vetter, Dave Airlie, Antonino Daplas
From: "Luis R. Rodriguez" <mcgrof@suse.com>
Convert the driver from using the x86 specific MTRR code to
the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available, in order to
take advantage of that also ensure the ioremap'd area is requested
as write-combining.
There are a few motivations for this:
a) Take advantage of PAT when available
b) Help bury MTRR code away, MTRR is architecture specific and on
   x86 its replaced by PAT
c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")
The conversion done is expressed by the following Coccinelle
SmPL patch, it additionally required manual intervention to
address all the #ifdery and removal of redundant things which
arch_phys_wc_add() already addresses such as verbose message
about when MTRR fails and doing nothing when we didn't get
an MTRR.
@ mtrr_found @
expression index, base, size;
@@
-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);
@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@
-mtrr_del(index, base, size);
+arch_phys_wc_del(index);
@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@
-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);
@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@
-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);
@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@
-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);
@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@
-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);
Generated-by: Coccinelle SmPL
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "Lad, Prabhakar" <prabhakar.csengg@gmail.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 drivers/video/fbdev/arkfb.c | 36 +++++-------------------------------
 1 file changed, 5 insertions(+), 31 deletions(-)
diff --git a/drivers/video/fbdev/arkfb.c b/drivers/video/fbdev/arkfb.c
index b305a1e..6a317de 100644
--- a/drivers/video/fbdev/arkfb.c
+++ b/drivers/video/fbdev/arkfb.c
@@ -26,13 +26,9 @@
 #include <linux/console.h> /* Why should fb driver call console functions? because console_lock() */
 #include <video/vga.h>
 
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
-
 struct arkfb_info {
 	int mclk_freq;
-	int mtrr_reg;
+	int wc_cookie;
 
 	struct dac_info *dac;
 	struct vgastate state;
@@ -102,10 +98,6 @@ static const struct svga_timing_regs ark_timing_regs     = {
 
 static char *mode_option = "640x480-8@60";
 
-#ifdef CONFIG_MTRR
-static int mtrr = 1;
-#endif
-
 MODULE_AUTHOR("(c) 2007 Ondrej Zajicek <santiago@crfreenet.org>");
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("fbdev driver for ARK 2000PV");
@@ -115,11 +107,6 @@ MODULE_PARM_DESC(mode_option, "Default video mode ('640x480-8@60', etc)");
 module_param_named(mode, mode_option, charp, 0444);
 MODULE_PARM_DESC(mode, "Default video mode ('640x480-8@60', etc) (deprecated)");
 
-#ifdef CONFIG_MTRR
-module_param(mtrr, int, 0444);
-MODULE_PARM_DESC(mtrr, "Enable write-combining with MTRR (1=enable, 0=disable, default=1)");
-#endif
-
 static int threshold = 4;
 
 module_param(threshold, int, 0644);
@@ -1002,7 +989,7 @@ static int ark_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	info->fix.smem_len = pci_resource_len(dev, 0);
 
 	/* Map physical IO memory address into kernel space */
-	info->screen_base = pci_iomap(dev, 0, 0);
+	info->screen_base = pci_iomap_wc(dev, 0, 0);
 	if (! info->screen_base) {
 		rc = -ENOMEM;
 		dev_err(info->device, "iomap for framebuffer failed\n");
@@ -1057,14 +1044,8 @@ static int ark_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 
 	/* Record a reference to the driver data */
 	pci_set_drvdata(dev, info);
-
-#ifdef CONFIG_MTRR
-	if (mtrr) {
-		par->mtrr_reg = -1;
-		par->mtrr_reg = mtrr_add(info->fix.smem_start, info->fix.smem_len, MTRR_TYPE_WRCOMB, 1);
-	}
-#endif
-
+	par->wc_cookie = arch_phys_wc_add(info->fix.smem_start,
+					  info->fix.smem_len);
 	return 0;
 
 	/* Error handling */
@@ -1092,14 +1073,7 @@ static void ark_pci_remove(struct pci_dev *dev)
 
 	if (info) {
 		struct arkfb_info *par = info->par;
-
-#ifdef CONFIG_MTRR
-		if (par->mtrr_reg >= 0) {
-			mtrr_del(par->mtrr_reg, 0, 0);
-			par->mtrr_reg = -1;
-		}
-#endif
-
+		arch_phys_wc_del(par->wc_cookie);
 		dac_release(par->dac);
 		unregister_framebuffer(info);
 		fb_dealloc_cmap(&info->cmap);
-- 
2.3.2.209.gd67f9d5.dirty
^ permalink raw reply related	[flat|nested] 14+ messages in thread
* [PATCH v4 4/5] video: fbdev: s3fb: use arch_phys_wc_add() and pci_iomap_wc()
  2015-04-29 21:36 [PATCH v4 0/5] pci/devres: add and use pci_iomap_wc() and pcim_iomap_wc() Luis R. Rodriguez
                   ` (2 preceding siblings ...)
  2015-04-29 21:36 ` [PATCH v4 3/5] video: fbdev: arkfb: use arch_phys_wc_add() and pci_iomap_wc() Luis R. Rodriguez
@ 2015-04-29 21:36 ` Luis R. Rodriguez
  2015-04-29 21:36 ` [PATCH v4 5/5] video: fbdev: vt8623fb: " Luis R. Rodriguez
  4 siblings, 0 replies; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-04-29 21:36 UTC (permalink / raw)
  To: bhelgaas, mst, plagnioj, tomi.valkeinen, airlied, daniel.vetter
  Cc: linux-fbdev, luto, cocci, linux-kernel, Luis R. Rodriguez,
	Jingoo Han, Geert Uytterhoeven, Daniel Vetter, Lad, Prabhakar,
	Rickard Strandqvist, Suresh Siddha, Ingo Molnar, Thomas Gleixner,
	Juergen Gross, Dave Airlie, Antonino Daplas
From: "Luis R. Rodriguez" <mcgrof@suse.com>
This driver uses the same area for MTRR as for the ioremap().
Convert the driver from using the x86 specific MTRR code to
the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available, in order to
take advantage of that also ensure the ioremap'd area is requested
as write-combining.
There are a few motivations for this:
a) Take advantage of PAT when available
b) Help bury MTRR code away, MTRR is architecture specific and on
   x86 its replaced by PAT
c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")
The conversion done is expressed by the following Coccinelle
SmPL patch, it additionally required manual intervention to
address all the #ifdery and removal of redundant things which
arch_phys_wc_add() already addresses such as verbose message
about when MTRR fails and doing nothing when we didn't get
an MTRR.
@ mtrr_found @
expression index, base, size;
@@
-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);
@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@
-mtrr_del(index, base, size);
+arch_phys_wc_del(index);
@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@
-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);
@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@
-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);
@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@
-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);
@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@
-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);
Generated-by: Coccinelle SmPL
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Lad, Prabhakar" <prabhakar.csengg@gmail.com>
Cc: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 drivers/video/fbdev/s3fb.c | 35 ++++++-----------------------------
 1 file changed, 6 insertions(+), 29 deletions(-)
diff --git a/drivers/video/fbdev/s3fb.c b/drivers/video/fbdev/s3fb.c
index f0ae61a..13b1090 100644
--- a/drivers/video/fbdev/s3fb.c
+++ b/drivers/video/fbdev/s3fb.c
@@ -28,13 +28,9 @@
 #include <linux/i2c.h>
 #include <linux/i2c-algo-bit.h>
 
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
-
 struct s3fb_info {
 	int chip, rev, mclk_freq;
-	int mtrr_reg;
+	int wc_cookie;
 	struct vgastate state;
 	struct mutex open_lock;
 	unsigned int ref_count;
@@ -154,11 +150,7 @@ static const struct svga_timing_regs s3_timing_regs     = {
 
 
 static char *mode_option;
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
-
 static int fasttext = 1;
 
 
@@ -170,11 +162,8 @@ module_param(mode_option, charp, 0444);
 MODULE_PARM_DESC(mode_option, "Default video mode ('640x480-8@60', etc)");
 module_param_named(mode, mode_option, charp, 0444);
 MODULE_PARM_DESC(mode, "Default video mode ('640x480-8@60', etc) (deprecated)");
-
-#ifdef CONFIG_MTRR
 module_param(mtrr, int, 0444);
 MODULE_PARM_DESC(mtrr, "Enable write-combining with MTRR (1=enable, 0=disable, default=1)");
-#endif
 
 module_param(fasttext, int, 0644);
 MODULE_PARM_DESC(fasttext, "Enable S3 fast text mode (1=enable, 0=disable, default=1)");
@@ -1168,7 +1157,7 @@ static int s3_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	info->fix.smem_len = pci_resource_len(dev, 0);
 
 	/* Map physical IO memory address into kernel space */
-	info->screen_base = pci_iomap(dev, 0, 0);
+	info->screen_base = pci_iomap_wc(dev, 0, 0);
 	if (! info->screen_base) {
 		rc = -ENOMEM;
 		dev_err(info->device, "iomap for framebuffer failed\n");
@@ -1365,12 +1354,9 @@ static int s3_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	/* Record a reference to the driver data */
 	pci_set_drvdata(dev, info);
 
-#ifdef CONFIG_MTRR
-	if (mtrr) {
-		par->mtrr_reg = -1;
-		par->mtrr_reg = mtrr_add(info->fix.smem_start, info->fix.smem_len, MTRR_TYPE_WRCOMB, 1);
-	}
-#endif
+	if (mtrr)
+		par->wc_cookie = arch_phys_wc_add(info->fix.smem_start,
+						  info->fix.smem_len);
 
 	return 0;
 
@@ -1405,14 +1391,7 @@ static void s3_pci_remove(struct pci_dev *dev)
 
 	if (info) {
 		par = info->par;
-
-#ifdef CONFIG_MTRR
-		if (par->mtrr_reg >= 0) {
-			mtrr_del(par->mtrr_reg, 0, 0);
-			par->mtrr_reg = -1;
-		}
-#endif
-
+		arch_phys_wc_del(par->wc_cookie);
 		unregister_framebuffer(info);
 		fb_dealloc_cmap(&info->cmap);
 
@@ -1551,10 +1530,8 @@ static int  __init s3fb_setup(char *options)
 
 		if (!*opt)
 			continue;
-#ifdef CONFIG_MTRR
 		else if (!strncmp(opt, "mtrr:", 5))
 			mtrr = simple_strtoul(opt + 5, NULL, 0);
-#endif
 		else if (!strncmp(opt, "fasttext:", 9))
 			fasttext = simple_strtoul(opt + 9, NULL, 0);
 		else
-- 
2.3.2.209.gd67f9d5.dirty
^ permalink raw reply related	[flat|nested] 14+ messages in thread
* [PATCH v4 5/5] video: fbdev: vt8623fb: use arch_phys_wc_add() and pci_iomap_wc()
  2015-04-29 21:36 [PATCH v4 0/5] pci/devres: add and use pci_iomap_wc() and pcim_iomap_wc() Luis R. Rodriguez
                   ` (3 preceding siblings ...)
  2015-04-29 21:36 ` [PATCH v4 4/5] video: fbdev: s3fb: " Luis R. Rodriguez
@ 2015-04-29 21:36 ` Luis R. Rodriguez
  4 siblings, 0 replies; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-04-29 21:36 UTC (permalink / raw)
  To: bhelgaas, mst, plagnioj, tomi.valkeinen, airlied, daniel.vetter
  Cc: linux-fbdev, luto, cocci, linux-kernel, Luis R. Rodriguez,
	Rob Clark, Laurent Pinchart, Jingoo Han, Lad, Prabhakar,
	Suresh Siddha, Ingo Molnar, Thomas Gleixner, Juergen Gross,
	Daniel Vetter, Dave Airlie, Antonino Daplas
From: "Luis R. Rodriguez" <mcgrof@suse.com>
This driver uses the same area for MTRR as for the ioremap().
Convert the driver from using the x86 specific MTRR code to
the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available, in order to
take advantage of that also ensure the ioremap'd area is requested
as write-combining.
There are a few motivations for this:
a) Take advantage of PAT when available
b) Help bury MTRR code away, MTRR is architecture specific and on
   x86 its replaced by PAT
c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")
The conversion done is expressed by the following Coccinelle
SmPL patch, it additionally required manual intervention to
address all the #ifdery and removal of redundant things which
arch_phys_wc_add() already addresses such as verbose message
about when MTRR fails and doing nothing when we didn't get
an MTRR.
@ mtrr_found @
expression index, base, size;
@@
-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);
@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@
-mtrr_del(index, base, size);
+arch_phys_wc_del(index);
@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@
-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);
@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@
-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);
@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@
-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);
@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@
-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);
Generated-by: Coccinelle SmPL
Cc: Rob Clark <robdclark@gmail.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: "Lad, Prabhakar" <prabhakar.csengg@gmail.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 drivers/video/fbdev/vt8623fb.c | 31 ++++++-------------------------
 1 file changed, 6 insertions(+), 25 deletions(-)
diff --git a/drivers/video/fbdev/vt8623fb.c b/drivers/video/fbdev/vt8623fb.c
index ea7f056..60f24828 100644
--- a/drivers/video/fbdev/vt8623fb.c
+++ b/drivers/video/fbdev/vt8623fb.c
@@ -26,13 +26,9 @@
 #include <linux/console.h> /* Why should fb driver call console functions? because console_lock() */
 #include <video/vga.h>
 
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
-
 struct vt8623fb_info {
 	char __iomem *mmio_base;
-	int mtrr_reg;
+	int wc_cookie;
 	struct vgastate state;
 	struct mutex open_lock;
 	unsigned int ref_count;
@@ -99,10 +95,7 @@ static struct svga_timing_regs vt8623_timing_regs     = {
 /* Module parameters */
 
 static char *mode_option = "640x480-8@60";
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
 
 MODULE_AUTHOR("(c) 2006 Ondrej Zajicek <santiago@crfreenet.org>");
 MODULE_LICENSE("GPL");
@@ -112,11 +105,8 @@ module_param(mode_option, charp, 0644);
 MODULE_PARM_DESC(mode_option, "Default video mode ('640x480-8@60', etc)");
 module_param_named(mode, mode_option, charp, 0);
 MODULE_PARM_DESC(mode, "Default video mode e.g. '648x480-8@60' (deprecated)");
-
-#ifdef CONFIG_MTRR
 module_param(mtrr, int, 0444);
 MODULE_PARM_DESC(mtrr, "Enable write-combining with MTRR (1=enable, 0=disable, default=1)");
-#endif
 
 
 /* ------------------------------------------------------------------------- */
@@ -710,7 +700,7 @@ static int vt8623_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	info->fix.mmio_len = pci_resource_len(dev, 1);
 
 	/* Map physical IO memory address into kernel space */
-	info->screen_base = pci_iomap(dev, 0, 0);
+	info->screen_base = pci_iomap_wc(dev, 0, 0);
 	if (! info->screen_base) {
 		rc = -ENOMEM;
 		dev_err(info->device, "iomap for framebuffer failed\n");
@@ -781,12 +771,9 @@ static int vt8623_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	/* Record a reference to the driver data */
 	pci_set_drvdata(dev, info);
 
-#ifdef CONFIG_MTRR
-	if (mtrr) {
-		par->mtrr_reg = -1;
-		par->mtrr_reg = mtrr_add(info->fix.smem_start, info->fix.smem_len, MTRR_TYPE_WRCOMB, 1);
-	}
-#endif
+	if (mtrr)
+		par->wc_cookie = arch_phys_wc_add(info->fix.smem_start,
+						  info->fix.smem_len);
 
 	return 0;
 
@@ -816,13 +803,7 @@ static void vt8623_pci_remove(struct pci_dev *dev)
 	if (info) {
 		struct vt8623fb_info *par = info->par;
 
-#ifdef CONFIG_MTRR
-		if (par->mtrr_reg >= 0) {
-			mtrr_del(par->mtrr_reg, 0, 0);
-			par->mtrr_reg = -1;
-		}
-#endif
-
+		arch_phys_wc_del(par->wc_cookie);
 		unregister_framebuffer(info);
 		fb_dealloc_cmap(&info->cmap);
 
-- 
2.3.2.209.gd67f9d5.dirty
^ permalink raw reply related	[flat|nested] 14+ messages in thread
* Re: [PATCH v4 1/5] pci: add pci_iomap_wc() variants
  2015-04-29 21:36 ` [PATCH v4 1/5] pci: add pci_iomap_wc() variants Luis R. Rodriguez
@ 2015-04-30 15:59   ` Bjorn Helgaas
  2015-04-30 16:52     ` Luis R. Rodriguez
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2015-04-30 15:59 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: mst, plagnioj, tomi.valkeinen, airlied, daniel.vetter,
	linux-fbdev, luto, cocci, linux-kernel, Luis R. Rodriguez,
	Toshi Kani, Suresh Siddha, Ingo Molnar, Thomas Gleixner,
	Juergen Gross, Daniel Vetter, Dave Airlie, Antonino Daplas,
	Dave Hansen, Arnd Bergmann, venkatesh.pallipadi, Stefan Bader,
	Ville Syrjälä, Mel Gorman, Vlastimil Babka,
	Borislav Petkov, Davidlohr Bueso, konrad.wilk, ville.syrjala,
	david.vrabel, jbeulich, Roger Pau Monné, xen-devel,
	linux-pci
[+cc linux-pci]
Hi Luis,
On Wed, Apr 29, 2015 at 02:36:08PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> 
> This allows drivers to take advantage of write-combining
> when possible. Ideally we'd have pci_read_bases() just
> peg an IORESOURCE_WC flag for us 
This makes it sound like pci_read_bases() could do a better job
if we just tried harder, but I don't think that's the case.  All
pci_read_bases() can do is look at the bits in the BAR.  For
memory BARs, there's a "prefetchable" bit and a "64-bit" bit.
If you just want to complain that the PCI spec didn't define a
way for software to discover whether a BAR can be mapped with WC,
that's fine, but it's misleading to suggest that pci_read_bases()
could figure out WC without some help from the spec.
> but where exactly
> video devices memory lie varies *largely* and at times things
> are mixed with MMIO registers, sometimes we can address
> the changes in drivers, other times the change requires
> intrusive changes.
> 
> Although there is also arch_phys_wc_add() that makes use of
> architecture specific write-combining alternatives (MTRR on
> x86 when a system does not have PAT) we void polluting
> pci_iomap() space with it and force drivers and subsystems
> that want to use it to be explicit.
I'm not quite sure I understand the point you're making here
about not polluting pci_iomap_wc() with arch_phys_wc_add().  I
think the choice is for a driver to do either this:
  info->screen_base = pci_iomap_wc(dev, 0, 0);
or this:
  info->screen_base = pci_iomap_wc(dev, 0, 0);
  par->wc_cookie = arch_phys_wc_add(pci_resource_start(dev, 0),
				    pci_resource_len(dev, 0));
The driver is *already* being explicit because it calls
pci_iomap_wc() instead of pci_iomap().
It seems like it would be ideal if ioremap_wc() could call
arch_phys_wc_add() internally.  Doesn't any caller of
arch_phys_wc_add() have to also do some sort of ioremap()
beforehand?  I assume there's some reason for separating them,
and I see that the current arch_phys_wc_add() requires the caller
to store a handle, but doing both seems confusing.
> There are a few motivations for this:
> 
> a) Take advantage of PAT when available
> 
> b) Help bury MTRR code away, MTRR is architecture specific and on
>    x86 its replaced by PAT
> 
> c) Help with the goal of eventually using _PAGE_CACHE_UC over
>    _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
>    de33c442e titled "x86 PAT: fix performance drop for glx,
>    use UC minus for ioremap(), ioremap_nocache() and
>    pci_mmap_page_range()")
I think these are now _PAGE_CACHE_MODE_UC and
_PAGE_CACHE_MODE_UC_MINUS, right? 
> ...
> +void __iomem *pci_iomap_wc_range(struct pci_dev *dev,
> +				 int bar,
> +				 unsigned long offset,
> +				 unsigned long maxlen)
> +{
> +	resource_size_t start = pci_resource_start(dev, bar);
> +	resource_size_t len = pci_resource_len(dev, bar);
> +	unsigned long flags = pci_resource_flags(dev, bar);
> +
> +	if (len <= offset || !start)
> +		return NULL;
> +	len -= offset;
> +	start += offset;
> +	if (maxlen && len > maxlen)
> +		len = maxlen;
> +	if (flags & IORESOURCE_IO)
> +		return __pci_ioport_map(dev, start, len);
Is there any point in checking for IORESOURCE_IO?  If a driver
calls pci_iomap_wc_range(), I assume it already knows this is an
IORESOURCE_MEM BAR, so if we see IORESOURCE_IO here we should
just return an error, i.e., NULL.
> +	if (flags & IORESOURCE_MEM)
> +		return ioremap_wc(start, len);
> +	/* What? */
> +	return NULL;
> +}
> +EXPORT_SYMBOL_GPL(pci_iomap_wc_range);
Bjorn
^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v4 2/5] lib: devres: add pcim_iomap_wc() variants
  2015-04-29 21:36 ` [PATCH v4 2/5] lib: devres: add pcim_iomap_wc() variants Luis R. Rodriguez
@ 2015-04-30 16:26   ` Bjorn Helgaas
  2015-04-30 17:27     ` Luis R. Rodriguez
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2015-04-30 16:26 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: mst, plagnioj, tomi.valkeinen, airlied, daniel.vetter,
	linux-fbdev, luto, cocci, linux-kernel, Luis R. Rodriguez,
	Toshi Kani, Suresh Siddha, Ingo Molnar, Thomas Gleixner,
	Juergen Gross, Daniel Vetter, Dave Airlie, Antonino Daplas,
	Dave Hansen, Arnd Bergmann, venkatesh.pallipadi, Stefan Bader,
	Ville Syrjälä, Mel Gorman, Vlastimil Babka,
	Borislav Petkov, Davidlohr Bueso, konrad.wilk, ville.syrjala,
	david.vrabel, jbeulich, Roger Pau Monné, xen-devel,
	linux-pci
[+cc linux-pci]
Hi Luis,
On Wed, Apr 29, 2015 at 02:36:09PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> 
> Now that we have pci_iomap_wc() add the respective devres helpers.
I guess I'm still confused about the relationship between pci_iomap_wc()
and arch_phys_wc_add().
Do you expect every caller of pcim_iomap_wc() to also call
arch_phys_wc_add()?
If so, I'm not sure how pcim_iomap_wc() fits into the picture.  A driver
can call both pcim_iomap_wc() and arch_phys_wc_add(), but the driver
doesn't explicitly do the unmap, so where would the arch_phys_wc_del()
happen?
If not, how does a driver know whether it should call arch_phys_wc_add()?
> ...
>  /**
> + * pcim_iomap_wc_regions - Request and iomap PCI BARs with write-combining
> + * @pdev: PCI device to map IO resources for
> + * @mask: Mask of BARs to request and iomap
> + * @name: Name used when requesting regions
> + *
> + * Request and iomap regions specified by @mask with a preference for
> + * write-combining.
> + */
> +int pcim_iomap_wc_regions(struct pci_dev *pdev, int mask, const char *name)
> +{
> +	void __iomem * const *iomap;
> +	int i, rc;
> +
> +	iomap = pcim_iomap_table(pdev);
> +	if (!iomap)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
> +		unsigned long len;
> +
> +		if (!(mask & (1 << i)))
> +			continue;
> +
> +		rc = -EINVAL;
> +		len = pci_resource_len(pdev, i);
> +		if (!len)
> +			goto err_inval;
> +
> +		rc = pci_request_region(pdev, i, name);
> +		if (rc)
> +			goto err_inval;
> +
> +		rc = -ENOMEM;
> +		if (!pcim_iomap_wc(pdev, i, 0))
> +			goto err_region;
Is there a user for this?  Are there really devices where *all* the BARs
can be mapped with WC?  Are there enough of them to make it worth adding
this?
I don't see users of either pcim_iomap_wc() or pcim_iomap_wc_regions() so
far.  Did I miss them, or do you just expect them in the near future?
Bjorn
^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v4 1/5] pci: add pci_iomap_wc() variants
  2015-04-30 15:59   ` Bjorn Helgaas
@ 2015-04-30 16:52     ` Luis R. Rodriguez
  2015-04-30 17:03       ` Andy Lutomirski
  0 siblings, 1 reply; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-04-30 16:52 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Luis R. Rodriguez, mst, plagnioj, tomi.valkeinen, airlied,
	daniel.vetter, linux-fbdev, luto, cocci, linux-kernel, Toshi Kani,
	Suresh Siddha, Ingo Molnar, Thomas Gleixner, Juergen Gross,
	Daniel Vetter, Dave Airlie, Antonino Daplas, Dave Hansen,
	Arnd Bergmann, venkatesh.pallipadi, Stefan Bader,
	Ville Syrjälä, Mel Gorman, Vlastimil Babka,
	Borislav Petkov, Davidlohr Bueso, konrad.wilk, ville.syrjala,
	david.vrabel, jbeulich, Roger Pau Monné, xen-devel,
	linux-pci
On Thu, Apr 30, 2015 at 10:59:17AM -0500, Bjorn Helgaas wrote:
> [+cc linux-pci]
> 
> Hi Luis,
> 
> On Wed, Apr 29, 2015 at 02:36:08PM -0700, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" <mcgrof@suse.com>
> > 
> > This allows drivers to take advantage of write-combining
> > when possible. Ideally we'd have pci_read_bases() just
> > peg an IORESOURCE_WC flag for us 
> 
> This makes it sound like pci_read_bases() could do a better job
> if we just tried harder, but I don't think that's the case.  All
> pci_read_bases() can do is look at the bits in the BAR.  For
> memory BARs, there's a "prefetchable" bit and a "64-bit" bit.
> 
> If you just want to complain that the PCI spec didn't define a
> way for software to discover whether a BAR can be mapped with WC,
> that's fine, but it's misleading to suggest that pci_read_bases()
> could figure out WC without some help from the spec.
You're right sorry about that, in my original patch this was more
of a question and I did not have a full answer for but mst had
clarified before the spec doesn't allow for this [0] and you are
confirming this now as well.
[0] https://lkml.org/lkml/2015/4/21/714
I'll update the patch and at least document we did think about
this and that its a shortcoming of the spec.
> > but where exactly
> > video devices memory lie varies *largely* and at times things
> > are mixed with MMIO registers, sometimes we can address
> > the changes in drivers, other times the change requires
> > intrusive changes.
> > 
> > Although there is also arch_phys_wc_add() that makes use of
> > architecture specific write-combining alternatives (MTRR on
> > x86 when a system does not have PAT) we void polluting
> > pci_iomap() space with it and force drivers and subsystems
> > that want to use it to be explicit.
> 
> I'm not quite sure I understand the point you're making here
> about not polluting pci_iomap_wc() with arch_phys_wc_add().  I
> think the choice is for a driver to do either this:
> 
>   info->screen_base = pci_iomap_wc(dev, 0, 0);
> 
> or this:
> 
>   info->screen_base = pci_iomap_wc(dev, 0, 0);
>   par->wc_cookie = arch_phys_wc_add(pci_resource_start(dev, 0),
> 				    pci_resource_len(dev, 0));
> 
> The driver is *already* being explicit because it calls
> pci_iomap_wc() instead of pci_iomap().
> 
> It seems like it would be ideal if ioremap_wc() could call
> arch_phys_wc_add() internally.
Indeed, that's what I was alluding to.
> Doesn't any caller of
> arch_phys_wc_add() have to also do some sort of ioremap()
> beforehand? 
This is not a requirement as the physical address is used,
not the virtual address.
> I assume there's some reason for separating them,
Well a full sweep to change to arch_phys_wc_add() was never done,
consider this part of the last effort to do so. In retrospect now
that I've covered all other drivers in 12 different series of patches
I think its perhaps best to not mesh them together as we're phasing
out MTRR and the only reason to have arch_phys_wc_add() is for MTRR
which is legacy.
I'll update the commit log to mention that.
> and I see that the current arch_phys_wc_add() requires the caller
> to store a handle, but doing both seems confusing.
That's just a cookie so that later when we undo the driver we can
tell the backend to remove it.
> > There are a few motivations for this:
> > 
> > a) Take advantage of PAT when available
> > 
> > b) Help bury MTRR code away, MTRR is architecture specific and on
> >    x86 its replaced by PAT
> > 
> > c) Help with the goal of eventually using _PAGE_CACHE_UC over
> >    _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
> >    de33c442e titled "x86 PAT: fix performance drop for glx,
> >    use UC minus for ioremap(), ioremap_nocache() and
> >    pci_mmap_page_range()")
> 
> I think these are now _PAGE_CACHE_MODE_UC and
> _PAGE_CACHE_MODE_UC_MINUS, right? 
Indeed, thanks, I'll fix that in the commit log.
> > ...
> 
> > +void __iomem *pci_iomap_wc_range(struct pci_dev *dev,
> > +				 int bar,
> > +				 unsigned long offset,
> > +				 unsigned long maxlen)
> > +{
> > +	resource_size_t start = pci_resource_start(dev, bar);
> > +	resource_size_t len = pci_resource_len(dev, bar);
> > +	unsigned long flags = pci_resource_flags(dev, bar);
> > +
> > +	if (len <= offset || !start)
> > +		return NULL;
> > +	len -= offset;
> > +	start += offset;
> > +	if (maxlen && len > maxlen)
> > +		len = maxlen;
> > +	if (flags & IORESOURCE_IO)
> > +		return __pci_ioport_map(dev, start, len);
> 
> Is there any point in checking for IORESOURCE_IO?  If a driver
> calls pci_iomap_wc_range(), I assume it already knows this is an
> IORESOURCE_MEM BAR, so if we see IORESOURCE_IO here we should
> just return an error, i.e., NULL.
Agreed, will fix with all the other changes on the commit log and
repost. I won't repost the full series but just this one patch as
a v5.
Thanks for the review.
 Luis
^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v4 1/5] pci: add pci_iomap_wc() variants
  2015-04-30 16:52     ` Luis R. Rodriguez
@ 2015-04-30 17:03       ` Andy Lutomirski
  2015-04-30 17:15         ` Luis R. Rodriguez
  0 siblings, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2015-04-30 17:03 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Bjorn Helgaas, Luis R. Rodriguez, Michael S. Tsirkin,
	Jean-Christophe Plagniol-Villard, Tomi Valkeinen, David Airlie,
	daniel.vetter, Linux Fbdev development list, cocci,
	linux-kernel@vger.kernel.org, Toshi Kani, Suresh Siddha,
	Ingo Molnar, Thomas Gleixner, Juergen Gross, Daniel Vetter,
	Dave Airlie, Antonino Daplas, Dave Hansen, Arnd Bergmann,
	venkatesh.pallipadi, Stefan Bader, Ville Syrjälä,
	Mel Gorman, Vlastimil Babka, Borislav Petkov, Davidlohr Bueso,
	Konrad Rzeszutek Wilk, ville.syrjala, David Vrabel, Jan Beulich,
	Roger Pau Monné, xen-devel, linux-pci@vger.kernel.org
On Thu, Apr 30, 2015 at 9:52 AM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
> On Thu, Apr 30, 2015 at 10:59:17AM -0500, Bjorn Helgaas wrote:
>> [+cc linux-pci]
>>
>> Hi Luis,
>>
>> On Wed, Apr 29, 2015 at 02:36:08PM -0700, Luis R. Rodriguez wrote:
>> > From: "Luis R. Rodriguez" <mcgrof@suse.com>
>> >
>> > This allows drivers to take advantage of write-combining
>> > when possible. Ideally we'd have pci_read_bases() just
>> > peg an IORESOURCE_WC flag for us
>>
>> This makes it sound like pci_read_bases() could do a better job
>> if we just tried harder, but I don't think that's the case.  All
>> pci_read_bases() can do is look at the bits in the BAR.  For
>> memory BARs, there's a "prefetchable" bit and a "64-bit" bit.
>>
>> If you just want to complain that the PCI spec didn't define a
>> way for software to discover whether a BAR can be mapped with WC,
>> that's fine, but it's misleading to suggest that pci_read_bases()
>> could figure out WC without some help from the spec.
>
> You're right sorry about that, in my original patch this was more
> of a question and I did not have a full answer for but mst had
> clarified before the spec doesn't allow for this [0] and you are
> confirming this now as well.
>
> [0] https://lkml.org/lkml/2015/4/21/714
>
> I'll update the patch and at least document we did think about
> this and that its a shortcoming of the spec.
>
>> > but where exactly
>> > video devices memory lie varies *largely* and at times things
>> > are mixed with MMIO registers, sometimes we can address
>> > the changes in drivers, other times the change requires
>> > intrusive changes.
>> >
>> > Although there is also arch_phys_wc_add() that makes use of
>> > architecture specific write-combining alternatives (MTRR on
>> > x86 when a system does not have PAT) we void polluting
>> > pci_iomap() space with it and force drivers and subsystems
>> > that want to use it to be explicit.
>>
>> I'm not quite sure I understand the point you're making here
>> about not polluting pci_iomap_wc() with arch_phys_wc_add().  I
>> think the choice is for a driver to do either this:
>>
>>   info->screen_base = pci_iomap_wc(dev, 0, 0);
>>
>> or this:
>>
>>   info->screen_base = pci_iomap_wc(dev, 0, 0);
>>   par->wc_cookie = arch_phys_wc_add(pci_resource_start(dev, 0),
>>                                   pci_resource_len(dev, 0));
>>
>> The driver is *already* being explicit because it calls
>> pci_iomap_wc() instead of pci_iomap().
>>
>> It seems like it would be ideal if ioremap_wc() could call
>> arch_phys_wc_add() internally.
>
> Indeed, that's what I was alluding to.
>
>> Doesn't any caller of
>> arch_phys_wc_add() have to also do some sort of ioremap()
>> beforehand?
>
> This is not a requirement as the physical address is used,
> not the virtual address.
>
>> I assume there's some reason for separating them,
>
> Well a full sweep to change to arch_phys_wc_add() was never done,
> consider this part of the last effort to do so. In retrospect now
> that I've covered all other drivers in 12 different series of patches
> I think its perhaps best to not mesh them together as we're phasing
> out MTRR and the only reason to have arch_phys_wc_add() is for MTRR
> which is legacy.
I would say it much more strongly.
Drivers for new hardware SHOULD NOT call arch_phys_wc_add, directly or
otherwise.  MTRRs are crap.  They have nasty alignment requirements,
they are a very limited and unpredictable resource, and the interact
poorly with BIOS.  They should really only be used for old video
framebuffers and such.
Anything new should use PAT (it's been available for a long time) and
possibly streaming memory writes.  Even fancy server gear (myri10ge,
for example) should stay far away from MTRRs and such: it's very easy
to put enough devices in a server board that you simply run out of
MTRRs and arch_phys_wc_add will stop working.
If we make ioremap_wc and similar start automatically adding MTRRs,
then performance will vary wildly with the order of driver loading,
because we'll run out of MTRRs part-way through bootup.
ioremap_wc via PAT, on the other hand, is 100% reliable on newer hardware.
Maybe I should have called it arch_phys_wc_add_awful_legacy_hack.
--Andy
^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v4 1/5] pci: add pci_iomap_wc() variants
  2015-04-30 17:03       ` Andy Lutomirski
@ 2015-04-30 17:15         ` Luis R. Rodriguez
  0 siblings, 0 replies; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-04-30 17:15 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Bjorn Helgaas, Luis R. Rodriguez, Michael S. Tsirkin,
	Jean-Christophe Plagniol-Villard, Tomi Valkeinen, David Airlie,
	daniel.vetter, Linux Fbdev development list, cocci,
	linux-kernel@vger.kernel.org, Toshi Kani, Suresh Siddha,
	Ingo Molnar, Thomas Gleixner, Juergen Gross, Daniel Vetter,
	Dave Airlie, Antonino Daplas, Dave Hansen, Arnd Bergmann,
	venkatesh.pallipadi, Stefan Bader, Ville Syrjälä,
	Mel Gorman, Vlastimil Babka, Borislav Petkov, Davidlohr Bueso,
	Konrad Rzeszutek Wilk, ville.syrjala, David Vrabel, Jan Beulich,
	Roger Pau Monné, xen-devel, linux-pci@vger.kernel.org
On Thu, Apr 30, 2015 at 10:03:18AM -0700, Andy Lutomirski wrote:
> On Thu, Apr 30, 2015 at 9:52 AM, Luis R. Rodriguez <mcgrof@suse.com> wrote:
> > On Thu, Apr 30, 2015 at 10:59:17AM -0500, Bjorn Helgaas wrote:
> >> [+cc linux-pci]
> >>
> >> Hi Luis,
> >>
> >> On Wed, Apr 29, 2015 at 02:36:08PM -0700, Luis R. Rodriguez wrote:
> >> > From: "Luis R. Rodriguez" <mcgrof@suse.com>
> >> >
> >> > This allows drivers to take advantage of write-combining
> >> > when possible. Ideally we'd have pci_read_bases() just
> >> > peg an IORESOURCE_WC flag for us
> >>
> >> This makes it sound like pci_read_bases() could do a better job
> >> if we just tried harder, but I don't think that's the case.  All
> >> pci_read_bases() can do is look at the bits in the BAR.  For
> >> memory BARs, there's a "prefetchable" bit and a "64-bit" bit.
> >>
> >> If you just want to complain that the PCI spec didn't define a
> >> way for software to discover whether a BAR can be mapped with WC,
> >> that's fine, but it's misleading to suggest that pci_read_bases()
> >> could figure out WC without some help from the spec.
> >
> > You're right sorry about that, in my original patch this was more
> > of a question and I did not have a full answer for but mst had
> > clarified before the spec doesn't allow for this [0] and you are
> > confirming this now as well.
> >
> > [0] https://lkml.org/lkml/2015/4/21/714
> >
> > I'll update the patch and at least document we did think about
> > this and that its a shortcoming of the spec.
> >
> >> > but where exactly
> >> > video devices memory lie varies *largely* and at times things
> >> > are mixed with MMIO registers, sometimes we can address
> >> > the changes in drivers, other times the change requires
> >> > intrusive changes.
> >> >
> >> > Although there is also arch_phys_wc_add() that makes use of
> >> > architecture specific write-combining alternatives (MTRR on
> >> > x86 when a system does not have PAT) we void polluting
> >> > pci_iomap() space with it and force drivers and subsystems
> >> > that want to use it to be explicit.
> >>
> >> I'm not quite sure I understand the point you're making here
> >> about not polluting pci_iomap_wc() with arch_phys_wc_add().  I
> >> think the choice is for a driver to do either this:
> >>
> >>   info->screen_base = pci_iomap_wc(dev, 0, 0);
> >>
> >> or this:
> >>
> >>   info->screen_base = pci_iomap_wc(dev, 0, 0);
> >>   par->wc_cookie = arch_phys_wc_add(pci_resource_start(dev, 0),
> >>                                   pci_resource_len(dev, 0));
> >>
> >> The driver is *already* being explicit because it calls
> >> pci_iomap_wc() instead of pci_iomap().
> >>
> >> It seems like it would be ideal if ioremap_wc() could call
> >> arch_phys_wc_add() internally.
> >
> > Indeed, that's what I was alluding to.
> >
> >> Doesn't any caller of
> >> arch_phys_wc_add() have to also do some sort of ioremap()
> >> beforehand?
> >
> > This is not a requirement as the physical address is used,
> > not the virtual address.
> >
> >> I assume there's some reason for separating them,
> >
> > Well a full sweep to change to arch_phys_wc_add() was never done,
> > consider this part of the last effort to do so. In retrospect now
> > that I've covered all other drivers in 12 different series of patches
> > I think its perhaps best to not mesh them together as we're phasing
> > out MTRR and the only reason to have arch_phys_wc_add() is for MTRR
> > which is legacy.
> 
> I would say it much more strongly.
> 
> Drivers for new hardware SHOULD NOT call arch_phys_wc_add, directly or
> otherwise.  MTRRs are crap.  They have nasty alignment requirements,
> they are a very limited and unpredictable resource, and the interact
> poorly with BIOS.  They should really only be used for old video
> framebuffers and such.
> 
> Anything new should use PAT (it's been available for a long time) and
> possibly streaming memory writes.  Even fancy server gear (myri10ge,
> for example) should stay far away from MTRRs and such: it's very easy
> to put enough devices in a server board that you simply run out of
> MTRRs and arch_phys_wc_add will stop working.
> 
> If we make ioremap_wc and similar start automatically adding MTRRs,
> then performance will vary wildly with the order of driver loading,
> because we'll run out of MTRRs part-way through bootup.
> 
> ioremap_wc via PAT, on the other hand, is 100% reliable on newer hardware.
> 
> Maybe I should have called it arch_phys_wc_add_awful_legacy_hack.
Thanks, I'll document such technicalities as well ;)
 Luis
^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v4 2/5] lib: devres: add pcim_iomap_wc() variants
  2015-04-30 16:26   ` Bjorn Helgaas
@ 2015-04-30 17:27     ` Luis R. Rodriguez
  2015-04-30 21:46       ` Bjorn Helgaas
  0 siblings, 1 reply; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-04-30 17:27 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Luis R. Rodriguez, mst, plagnioj, tomi.valkeinen, airlied,
	daniel.vetter, linux-fbdev, luto, cocci, linux-kernel, Toshi Kani,
	Suresh Siddha, Ingo Molnar, Thomas Gleixner, Juergen Gross,
	Daniel Vetter, Dave Airlie, Antonino Daplas, Dave Hansen,
	Arnd Bergmann, venkatesh.pallipadi, Stefan Bader,
	Ville Syrjälä, Mel Gorman, Vlastimil Babka,
	Borislav Petkov, Davidlohr Bueso, konrad.wilk, ville.syrjala,
	david.vrabel, jbeulich, Roger Pau Monné, xen-devel,
	linux-pci
On Thu, Apr 30, 2015 at 11:26:47AM -0500, Bjorn Helgaas wrote:
> [+cc linux-pci]
> 
> Hi Luis,
> 
> On Wed, Apr 29, 2015 at 02:36:09PM -0700, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" <mcgrof@suse.com>
> > 
> > Now that we have pci_iomap_wc() add the respective devres helpers.
> 
> I guess I'm still confused about the relationship between pci_iomap_wc()
> and arch_phys_wc_add().
> 
> Do you expect every caller of pcim_iomap_wc() to also call
> arch_phys_wc_add()?
Yeap.
> If so, I'm not sure how pcim_iomap_wc() fits into the picture.  A driver
> can call both pcim_iomap_wc() and arch_phys_wc_add(), but the driver
> doesn't explicitly do the unmap, so where would the arch_phys_wc_del()
> happen?
As with other current drivers not using devres, upon exit or where they
would otherwise typically iounmap().
> If not, how does a driver know whether it should call arch_phys_wc_add()?
Sadly they'd have to figure it out, as Andy notes arch_phys_wc_add() is
a hack so I think we need to leave it as such and hope to see arch_phys_wc_add()
use phased as it won't be needed anymore really. arch_phys_wc_add() really should
only be used by device drivers that know that are working with non-PAT systems.
The code already takes care of this but since its an x86 write-combining hack
we should not consider meshing it with devres.
> > ...
> >  /**
> > + * pcim_iomap_wc_regions - Request and iomap PCI BARs with write-combining
> > + * @pdev: PCI device to map IO resources for
> > + * @mask: Mask of BARs to request and iomap
> > + * @name: Name used when requesting regions
> > + *
> > + * Request and iomap regions specified by @mask with a preference for
> > + * write-combining.
> > + */
> > +int pcim_iomap_wc_regions(struct pci_dev *pdev, int mask, const char *name)
> > +{
> > +	void __iomem * const *iomap;
> > +	int i, rc;
> > +
> > +	iomap = pcim_iomap_table(pdev);
> > +	if (!iomap)
> > +		return -ENOMEM;
> > +
> > +	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
> > +		unsigned long len;
> > +
> > +		if (!(mask & (1 << i)))
> > +			continue;
> > +
> > +		rc = -EINVAL;
> > +		len = pci_resource_len(pdev, i);
> > +		if (!len)
> > +			goto err_inval;
> > +
> > +		rc = pci_request_region(pdev, i, name);
> > +		if (rc)
> > +			goto err_inval;
> > +
> > +		rc = -ENOMEM;
> > +		if (!pcim_iomap_wc(pdev, i, 0))
> > +			goto err_region;
> 
> Is there a user for this?  Are there really devices where *all* the BARs
> can be mapped with WC?  Are there enough of them to make it worth adding
> this?
Not right now, I did this more to help with a friend who is testing one
driver for a feature. The driver is upstream but a way to make the feature
take effect only under certain conditions still would need to be done.
> I don't see users of either pcim_iomap_wc() or pcim_iomap_wc_regions() so
> far.  Did I miss them, or do you just expect them in the near future?
The later, and also I hate seeing folks later add code under EXPORT_SYMBOL()
rather than EXPORT_SYMBOL_GPL() so I figure I'd rather do it first. It happened
recently in my v1 series, someone beat me to a write-combining export symbol
and changed it to EXPORT_SYMBOL(). Feel free to drop this though but I hope
no one out there then tries to just add an EXPORT_SYMBOL() later for this...
 Luis
^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v4 2/5] lib: devres: add pcim_iomap_wc() variants
  2015-04-30 17:27     ` Luis R. Rodriguez
@ 2015-04-30 21:46       ` Bjorn Helgaas
  2015-05-01  0:20         ` Luis R. Rodriguez
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2015-04-30 21:46 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Luis R. Rodriguez, mst, plagnioj, tomi.valkeinen, airlied,
	daniel.vetter, linux-fbdev, luto, cocci, linux-kernel, Toshi Kani,
	Suresh Siddha, Ingo Molnar, Thomas Gleixner, Juergen Gross,
	Daniel Vetter, Dave Airlie, Antonino Daplas, Dave Hansen,
	Arnd Bergmann, venkatesh.pallipadi, Stefan Bader,
	Ville Syrjälä, Mel Gorman, Vlastimil Babka,
	Borislav Petkov, Davidlohr Bueso, konrad.wilk, ville.syrjala,
	david.vrabel, jbeulich, Roger Pau Monné, xen-devel,
	linux-pci
On Thu, Apr 30, 2015 at 07:27:23PM +0200, Luis R. Rodriguez wrote:
> On Thu, Apr 30, 2015 at 11:26:47AM -0500, Bjorn Helgaas wrote:
> > I don't see users of either pcim_iomap_wc() or pcim_iomap_wc_regions() so
> > far.  Did I miss them, or do you just expect them in the near future?
> 
> The later, and also I hate seeing folks later add code under EXPORT_SYMBOL()
> rather than EXPORT_SYMBOL_GPL() so I figure I'd rather do it first. It happened
> recently in my v1 series, someone beat me to a write-combining export symbol
> and changed it to EXPORT_SYMBOL(). Feel free to drop this though but I hope
> no one out there then tries to just add an EXPORT_SYMBOL() later for this...
Why do you want them to be EXPORT_SYMBOL_GPL?  I would expect them to be
exported the same way pcim_iomap(), pcim_iomap_regions(), and ioremap_wc()
are exported, i.e., with EXPORT_SYMBOL.
Per Documentation/DocBook/kernel-hacking.tmpl, EXPORT_SYMBOL_GPL "implies
that the function is considered an internal implementation issue, and not
really an interface."  I don't think these are internal implementation
issues.
Bjorn
^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v4 2/5] lib: devres: add pcim_iomap_wc() variants
  2015-04-30 21:46       ` Bjorn Helgaas
@ 2015-05-01  0:20         ` Luis R. Rodriguez
  0 siblings, 0 replies; 14+ messages in thread
From: Luis R. Rodriguez @ 2015-05-01  0:20 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Luis R. Rodriguez, mst, plagnioj, tomi.valkeinen, airlied,
	daniel.vetter, linux-fbdev, luto, cocci, linux-kernel, Toshi Kani,
	Suresh Siddha, Ingo Molnar, Thomas Gleixner, Juergen Gross,
	Daniel Vetter, Dave Airlie, Antonino Daplas, Dave Hansen,
	Arnd Bergmann, venkatesh.pallipadi, Stefan Bader,
	Ville Syrjälä, Mel Gorman, Vlastimil Babka,
	Borislav Petkov, Davidlohr Bueso, konrad.wilk, ville.syrjala,
	david.vrabel, jbeulich, Roger Pau Monné, xen-devel,
	linux-pci
On Thu, Apr 30, 2015 at 04:46:38PM -0500, Bjorn Helgaas wrote:
> On Thu, Apr 30, 2015 at 07:27:23PM +0200, Luis R. Rodriguez wrote:
> > On Thu, Apr 30, 2015 at 11:26:47AM -0500, Bjorn Helgaas wrote:
> 
> > > I don't see users of either pcim_iomap_wc() or pcim_iomap_wc_regions() so
> > > far.  Did I miss them, or do you just expect them in the near future?
> > 
> > The later, and also I hate seeing folks later add code under EXPORT_SYMBOL()
> > rather than EXPORT_SYMBOL_GPL() so I figure I'd rather do it first. It happened
> > recently in my v1 series, someone beat me to a write-combining export symbol
> > and changed it to EXPORT_SYMBOL(). Feel free to drop this though but I hope
> > no one out there then tries to just add an EXPORT_SYMBOL() later for this...
> 
> Why do you want them to be EXPORT_SYMBOL_GPL?  I would expect them to be
> exported the same way pcim_iomap(), pcim_iomap_regions(), and ioremap_wc()
> are exported, i.e., with EXPORT_SYMBOL.
>> 
> Per Documentation/DocBook/kernel-hacking.tmpl, EXPORT_SYMBOL_GPL "implies
> that the function is considered an internal implementation issue, and not
> really an interface."  I don't think these are internal implementation
> issues.
What Documentation/DocBook/kernel-hacking.tmpl states over EXPORT_SYMBOL_GPL()
is old and in no way reflects current trends and reality. For instance, some
folks believe that if some code has EXPORT_SYMBOL() declared that they can use
it on proprietary modules. This is terribly incorrect, quite a few developers
do not in any way stand by this as a "needed" clarification on their code [0].
I'm one of them, but to be even more clear on this I simply *always* use
EXPORT_SYMBOL_GPL() to remove any possible doubt over this on any symbols
that I export. Heck, even tons of driver library code uses EXPORT_SYMBOL_GPL().
[0] https://lkml.org/lkml/2012/4/20/402
  Luis
^ permalink raw reply	[flat|nested] 14+ messages in thread
end of thread, other threads:[~2015-05-01  0:20 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-29 21:36 [PATCH v4 0/5] pci/devres: add and use pci_iomap_wc() and pcim_iomap_wc() Luis R. Rodriguez
2015-04-29 21:36 ` [PATCH v4 1/5] pci: add pci_iomap_wc() variants Luis R. Rodriguez
2015-04-30 15:59   ` Bjorn Helgaas
2015-04-30 16:52     ` Luis R. Rodriguez
2015-04-30 17:03       ` Andy Lutomirski
2015-04-30 17:15         ` Luis R. Rodriguez
2015-04-29 21:36 ` [PATCH v4 2/5] lib: devres: add pcim_iomap_wc() variants Luis R. Rodriguez
2015-04-30 16:26   ` Bjorn Helgaas
2015-04-30 17:27     ` Luis R. Rodriguez
2015-04-30 21:46       ` Bjorn Helgaas
2015-05-01  0:20         ` Luis R. Rodriguez
2015-04-29 21:36 ` [PATCH v4 3/5] video: fbdev: arkfb: use arch_phys_wc_add() and pci_iomap_wc() Luis R. Rodriguez
2015-04-29 21:36 ` [PATCH v4 4/5] video: fbdev: s3fb: " Luis R. Rodriguez
2015-04-29 21:36 ` [PATCH v4 5/5] video: fbdev: vt8623fb: " Luis R. Rodriguez
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).