* [PATCH 1/2] drm/i915: Limit the ioremap of the PCI bar to the registers
@ 2012-09-14 10:57 Chris Wilson
2012-09-14 10:57 ` [PATCH 2/2] agp/intel: Use a write-combining map for updating PTEs Chris Wilson
2012-09-14 18:50 ` [PATCH 1/2] drm/i915: Limit the ioremap of the PCI bar to the registers Ben Widawsky
0 siblings, 2 replies; 5+ messages in thread
From: Chris Wilson @ 2012-09-14 10:57 UTC
To: intel-gfx
In the future we may like to experiment with using a WC map of the GTT
portion. However, that will conflict with i915.ko mapping the entire bar
as UC in order to access the GPU registers. Instead we can shrink the
register ioremap to only map the register block.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_dma.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index ee9fe72..18bb48b 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1460,7 +1460,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 {
 	struct drm_i915_private *dev_priv;
 	struct intel_device_info *info;
-	int ret = 0, mmio_bar;
+	int ret = 0, mmio_bar, mmio_size;
 	uint32_t aperture_size;
 	info = (struct intel_device_info *) flags;
@@ -1524,8 +1524,18 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev))
 		dma_set_coherent_mask(&dev->pdev->dev, DMA_BIT_MASK(32));
+	/* Restrict iomap to avoid clobbering the GTT which we want WC mapped.
+	 * Do not attempt to map the whole BAR!
+	 */
 	mmio_bar = IS_GEN2(dev) ? 1 : 0;
-	dev_priv->regs = pci_iomap(dev->pdev, mmio_bar, 0);
+	if (info->gen < 3)
+		mmio_size = 64*1024;
+	else if (info->gen < 5)
+		mmio_size = 512*1024;
+	else
+		mmio_size = 2*1024*1024;
+
+	dev_priv->regs = pci_iomap(dev->pdev, mmio_bar, mmio_size);
 	if (!dev_priv->regs) {
 		DRM_ERROR("failed to map registers\n");
 		ret = -EIO;
--
1.7.10.4
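
For readers following along, the size selection added by this patch can be sketched as a standalone function. This is only an illustration under stated assumptions: `mmio_size_for_gen()` and `check_mmio_sizes()` are hypothetical names that do not exist in the driver, and the thresholds are copied from the hunk above rather than from hardware documentation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: returns how many bytes of
 * the MMIO BAR to map for a given hardware generation, mirroring the
 * if/else chain the patch adds to i915_driver_load().
 */
size_t mmio_size_for_gen(int gen)
{
	if (gen < 3)
		return 64 * 1024;	/* gen2: 64 KiB */
	else if (gen < 5)
		return 512 * 1024;	/* gen3 and gen4: 512 KiB */

	return 2 * 1024 * 1024;		/* gen5 and later: 2 MiB */
}

/* Checks the boundaries between the three ranges used in the patch. */
void check_mmio_sizes(void)
{
	assert(mmio_size_for_gen(2) == 64 * 1024);
	assert(mmio_size_for_gen(3) == 512 * 1024);
	assert(mmio_size_for_gen(4) == 512 * 1024);
	assert(mmio_size_for_gen(5) == 2 * 1024 * 1024);
}
```

In the patch itself the result is passed as the third argument of pci_iomap(), where the previous value of 0 requested a mapping of the whole BAR.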
* [PATCH 2/2] agp/intel: Use a write-combining map for updating PTEs
2012-09-14 10:57 [PATCH 1/2] drm/i915: Limit the ioremap of the PCI bar to the registers Chris Wilson
@ 2012-09-14 10:57 ` Chris Wilson
2012-09-14 11:08 ` Chris Wilson
2012-09-14 18:50 ` [PATCH 1/2] drm/i915: Limit the ioremap of the PCI bar to the registers Ben Widawsky
1 sibling, 1 reply; 5+ messages in thread
From: Chris Wilson @ 2012-09-14 10:57 UTC
To: intel-gfx
Rewriting the PTE entries using a WC mapping is roughly an order of
magnitude faster than through the uncached mapping. This makes an
observable difference on workloads that cycle through large numbers of
buffers, for example Chromium using ShmPixmaps where virtually all the
CPU time is currently spent rebinding the userptr.
v2: Limit the WC mapping to older generations as we should the TLB
invalidation on SandyBridge+ unreliable.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/char/agp/intel-gtt.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 258873a..8b0f6d19 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -666,9 +666,14 @@ static int intel_gtt_init(void)
 	gtt_map_size = intel_private.base.gtt_total_entries * 4;
-	intel_private.gtt = ioremap(intel_private.gtt_bus_addr,
-				    gtt_map_size);
-	if (!intel_private.gtt) {
+	intel_private.gtt = NULL;
+	if (INTEL_GTT_GEN < 6)
+		intel_private.gtt = ioremap_wc(intel_private.gtt_bus_addr,
+					       gtt_map_size);
+	if (intel_private.gtt == NULL)
+		intel_private.gtt = ioremap(intel_private.gtt_bus_addr,
+					    gtt_map_size);
+	if (intel_private.gtt == NULL) {
 		intel_private.driver->cleanup();
 		iounmap(intel_private.registers);
 		return -ENOMEM;
--
1.7.10.4
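
The control flow of this hunk (try a write-combining map on older generations, fall back to the ordinary uncached map, and fail only if both attempts return NULL) can be sketched in user space. Everything below is an assumption made for illustration: `map_gtt()`, `struct gtt_map`, the `GTT_MAP_*` constants and the `fake_map_*` stubs are hypothetical stand-ins for ioremap_wc() and ioremap(), which are only available inside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Which kind of mapping ended up being used (hypothetical bookkeeping). */
enum gtt_map_kind { GTT_MAP_NONE, GTT_MAP_WC, GTT_MAP_UC };

struct gtt_map {
	void *ptr;
	enum gtt_map_kind kind;
};

/* Stand-in signature for ioremap_wc()/ioremap(); returns NULL on failure. */
typedef void *(*map_fn)(unsigned long bus_addr, size_t size);

/* Stub that always succeeds, returning a dummy non-NULL address. */
void *fake_map_ok(unsigned long bus_addr, size_t size)
{
	static char backing[1];

	(void)bus_addr;
	(void)size;
	return backing;
}

/* Stub that always fails, as a real mapping call may. */
void *fake_map_fail(unsigned long bus_addr, size_t size)
{
	(void)bus_addr;
	(void)size;
	return NULL;
}

/*
 * Same ordering as the patch: attempt WC only when gen < 6, then fall
 * back to UC if no mapping was obtained. A NULL ptr in the result
 * corresponds to the -ENOMEM path in intel_gtt_init().
 */
struct gtt_map map_gtt(int gen, unsigned long bus_addr, size_t size,
		       map_fn map_wc, map_fn map_uc)
{
	struct gtt_map m = { NULL, GTT_MAP_NONE };

	if (gen < 6) {
		m.ptr = map_wc(bus_addr, size);
		if (m.ptr != NULL)
			m.kind = GTT_MAP_WC;
	}
	if (m.ptr == NULL) {
		m.ptr = map_uc(bus_addr, size);
		if (m.ptr != NULL)
			m.kind = GTT_MAP_UC;
	}
	return m;
}
```

Note that on gen6 and later the WC attempt is skipped entirely, so those parts keep the uncached mapping they had before the patch.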
* Re: [PATCH 2/2] agp/intel: Use a write-combining map for updating PTEs
2012-09-14 10:57 ` [PATCH 2/2] agp/intel: Use a write-combining map for updating PTEs Chris Wilson
@ 2012-09-14 11:08 ` Chris Wilson
0 siblings, 0 replies; 5+ messages in thread
From: Chris Wilson @ 2012-09-14 11:08 UTC
To: intel-gfx
On Fri, 14 Sep 2012 11:57:47 +0100, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> v2: Limit the WC mapping to older generations as we should the TLB
> invalidation on SandyBridge+ unreliable.
/me fires his editor and proof-reader
v2: Limit the WC mapping to older generations as we have observed that
the TLB invalidation on SandyBridge+ is unreliable with WC updates.
See i-g-t/tests/gem_gtt_cpu_tlb
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 1/2] drm/i915: Limit the ioremap of the PCI bar to the registers
2012-09-14 10:57 [PATCH 1/2] drm/i915: Limit the ioremap of the PCI bar to the registers Chris Wilson
2012-09-14 10:57 ` [PATCH 2/2] agp/intel: Use a write-combining map for updating PTEs Chris Wilson
@ 2012-09-14 18:50 ` Ben Widawsky
2012-09-14 21:27 ` Daniel Vetter
1 sibling, 1 reply; 5+ messages in thread
From: Ben Widawsky @ 2012-09-14 18:50 UTC
To: Chris Wilson; +Cc: intel-gfx
On Fri, 14 Sep 2012 11:57:46 +0100
Chris Wilson <chris@chris-wilson.co.uk> wrote:
> In the future we may like to experiment with using a WC map of the GTT
> portion. However, that will conflict with i915.ko mapping the entire bar
> as UC in order to access the GPU registers. Instead we can shrink the
> register ioremap to only map the register block.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since I really don't want to pull out docs for anything less than
gen6...
Tested-by (IVB): Ben Widawsky <ben@bwidawsk.net>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
--
Ben Widawsky, Intel Open Source Technology Center
* Re: [PATCH 1/2] drm/i915: Limit the ioremap of the PCI bar to the registers
2012-09-14 18:50 ` [PATCH 1/2] drm/i915: Limit the ioremap of the PCI bar to the registers Ben Widawsky
@ 2012-09-14 21:27 ` Daniel Vetter
0 siblings, 0 replies; 5+ messages in thread
From: Daniel Vetter @ 2012-09-14 21:27 UTC
To: Ben Widawsky; +Cc: intel-gfx
On Fri, Sep 14, 2012 at 11:50:30AM -0700, Ben Widawsky wrote:
> On Fri, 14 Sep 2012 11:57:46 +0100
> Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> > In the future we may like to experiment with using a WC map of the GTT
> > portion. However, that will conflict with i915.ko mapping the entire bar
> > as UC in order to access the GPU registers. Instead we can shrink the
> > register ioremap to only map the register block.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Since I really don't want to pull out docs for anything less than
> gen6...
> Tested-by (IVB): Ben Widawsky <ben@bwidawsk.net>
> Acked-by: Ben Widawsky <ben@bwidawsk.net>
Both patches merged, with the commit message frobbed.
Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch