* Re: [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
[not found] <Y04i8V7xamTkuqNA@mail-itl>
@ 2022-10-18 8:24 ` Christoph Hellwig
2022-10-18 8:57 ` Jan Beulich
2022-10-18 8:31 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork
1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2022-10-18 8:24 UTC (permalink / raw)
To: Marek Marczykowski-Górecki
Cc: Juergen Gross, Stefano Stabellini, regressions, dri-devel,
Anshuman Khandual, intel-gfx, Konrad Rzeszutek Wilk,
Oleksandr Tyshchenko, iommu, Matthew Auld, Rodrigo Vivi,
xen-devel, Christoph Hellwig
On Tue, Oct 18, 2022 at 05:52:16AM +0200, Marek Marczykowski-Górecki wrote:
> not only) when using IGD in Xen PV dom0. After a short while Xorg
> crashes, and dmesg contains messages like this:
>
> i915 0000:00:02.0: [drm] GPU HANG: ecode 7:1:01fffbfe, in Xorg [5337]
> i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
> i915 0000:00:02.0: [drm] Xorg[5337] context reset due to GPU hang
<snip>
> I tried reverting just this commit on top of 6.0.x, but the context
> changed significantly in subsequent commits, so after trying reverting
> it together with 3 or 4 more commits I gave up.
>
> What may be an important detail, the system heavily uses cross-VM shared
> memory (gntdev) to map window contents from VMs. This is Qubes OS, and
> it uses Xen 4.14.
Can you try the patch below?
---
From 26fe4749750f1bf843666ca777e297279994e33a Mon Sep 17 00:00:00 2001
From: Robert Beckett <bob.beckett@collabora.com>
Date: Tue, 26 Jul 2022 16:39:35 +0100
Subject: drm/i915: stop abusing swiotlb_max_segment
Calling swiotlb functions directly is nowadays considered harmful. See
https://lore.kernel.org/intel-gfx/20220711082614.GA29487@lst.de/
Replace swiotlb_max_segment() calls with dma_max_mapping_size().
In i915_gem_object_get_pages_internal() no longer consider max_segment
only if CONFIG_SWIOTLB is enabled. There can be other (iommu related)
causes of specific max segment sizes.
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[hch: added the Xen hack]
---
drivers/gpu/drm/i915/gem/i915_gem_internal.c | 19 +++----------
drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 4 +--
drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 2 +-
drivers/gpu/drm/i915/i915_scatterlist.h | 30 +++++++++++---------
5 files changed, 25 insertions(+), 32 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index c698f95af15fe..629acb403a2c9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -6,7 +6,6 @@
#include <linux/scatterlist.h>
#include <linux/slab.h>
-#include <linux/swiotlb.h>
#include "i915_drv.h"
#include "i915_gem.h"
@@ -38,22 +37,12 @@ static int i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
struct scatterlist *sg;
unsigned int sg_page_sizes;
unsigned int npages;
- int max_order;
+ int max_order = MAX_ORDER;
+ unsigned int max_segment;
gfp_t gfp;
- max_order = MAX_ORDER;
-#ifdef CONFIG_SWIOTLB
- if (is_swiotlb_active(obj->base.dev->dev)) {
- unsigned int max_segment;
-
- max_segment = swiotlb_max_segment();
- if (max_segment) {
- max_segment = max_t(unsigned int, max_segment,
- PAGE_SIZE) >> PAGE_SHIFT;
- max_order = min(max_order, ilog2(max_segment));
- }
- }
-#endif
+ max_segment = i915_sg_segment_size(i915->drm.dev) >> PAGE_SHIFT;
+ max_order = min(max_order, get_order(max_segment));
gfp = GFP_KERNEL | __GFP_HIGHMEM | __GFP_RECLAIMABLE;
if (IS_I965GM(i915) || IS_I965G(i915)) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index f42ca1179f373..11125c32dd35d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -194,7 +194,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
struct intel_memory_region *mem = obj->mm.region;
struct address_space *mapping = obj->base.filp->f_mapping;
const unsigned long page_count = obj->base.size / PAGE_SIZE;
- unsigned int max_segment = i915_sg_segment_size();
+ unsigned int max_segment = i915_sg_segment_size(i915->drm.dev);
struct sg_table *st;
struct sgt_iter sgt_iter;
struct page *page;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index e3fc38dd5db04..de5d0a7241027 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -189,7 +189,7 @@ static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
- const unsigned int max_segment = i915_sg_segment_size();
+ const unsigned int max_segment = i915_sg_segment_size(i915->drm.dev);
const size_t size = (size_t)ttm->num_pages << PAGE_SHIFT;
struct file *filp = i915_tt->filp;
struct sgt_iter sgt_iter;
@@ -538,7 +538,7 @@ static struct i915_refct_sgt *i915_ttm_tt_get_st(struct ttm_tt *ttm)
ret = sg_alloc_table_from_pages_segment(st,
ttm->pages, ttm->num_pages,
0, (unsigned long)ttm->num_pages << PAGE_SHIFT,
- i915_sg_segment_size(), GFP_KERNEL);
+ i915_sg_segment_size(i915_tt->dev), GFP_KERNEL);
if (ret) {
st->sgl = NULL;
return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 8423df021b713..e4515d6acd43c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -129,7 +129,7 @@ static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj)
static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
{
const unsigned long num_pages = obj->base.size >> PAGE_SHIFT;
- unsigned int max_segment = i915_sg_segment_size();
+ unsigned int max_segment = i915_sg_segment_size(obj->base.dev->dev);
struct sg_table *st;
unsigned int sg_page_sizes;
struct page **pvec;
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915_scatterlist.h
index 9ddb3e743a3e5..c278888f71528 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.h
+++ b/drivers/gpu/drm/i915/i915_scatterlist.h
@@ -9,7 +9,8 @@
#include <linux/pfn.h>
#include <linux/scatterlist.h>
-#include <linux/swiotlb.h>
+#include <linux/dma-mapping.h>
+#include <xen/xen.h>
#include "i915_gem.h"
@@ -127,19 +128,22 @@ static inline unsigned int i915_sg_dma_sizes(struct scatterlist *sg)
return page_sizes;
}
-static inline unsigned int i915_sg_segment_size(void)
+static inline unsigned int i915_sg_segment_size(struct device *dev)
{
- unsigned int size = swiotlb_max_segment();
-
- if (size == 0)
- size = UINT_MAX;
-
- size = rounddown(size, PAGE_SIZE);
- /* swiotlb_max_segment_size can return 1 byte when it means one page. */
- if (size < PAGE_SIZE)
- size = PAGE_SIZE;
-
- return size;
+ size_t max = min_t(size_t, UINT_MAX, dma_max_mapping_size(dev));
+
+ /*
+ * Xen on x86 can reshuffle pages under us. The DMA API takes
+ * care of that both in dma_alloc_* (by calling into the hypervisor
+ * to make the pages contigous) and in dma_map_* (by bounce buffering).
+ * But i915 abuses ignores the coherency aspects of the DMA API and
+ * thus can't cope with bounce buffering actually happening, so add
+ * a hack here to force small allocations and mapping when running on
+ * Xen. (good luck with TDX, btw --hch)
+ */
+ if (IS_ENABLED(CONFIG_X86) && xen_domain())
+ max = PAGE_SIZE;
+ return round_down(max, PAGE_SIZE);
}
bool i915_sg_trim(struct sg_table *orig_st);
--
2.30.2
* [Intel-gfx] ✗ Fi.CI.BUILD: failure for i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
[not found] <Y04i8V7xamTkuqNA@mail-itl>
2022-10-18 8:24 ` [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment" Christoph Hellwig
@ 2022-10-18 8:31 ` Patchwork
1 sibling, 0 replies; 8+ messages in thread
From: Patchwork @ 2022-10-18 8:31 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: intel-gfx
== Series Details ==
Series: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
URL : https://patchwork.freedesktop.org/series/109817/
State : failure
== Summary ==
Error: patch https://patchwork.freedesktop.org/api/1.0/series/109817/revisions/1/mbox/ not applied
Applying: i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
error: No valid patches in input (allow with "--allow-empty")
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
* Re: [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
2022-10-18 8:24 ` [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment" Christoph Hellwig
@ 2022-10-18 8:57 ` Jan Beulich
2022-10-18 11:02 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2022-10-18 8:57 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Juergen Gross, Stefano Stabellini, regressions, dri-devel,
Anshuman Khandual, intel-gfx, Konrad Rzeszutek Wilk,
Marek Marczykowski-Górecki, Oleksandr Tyshchenko, iommu,
Matthew Auld, Rodrigo Vivi, xen-devel
On 18.10.2022 10:24, Christoph Hellwig wrote:
> @@ -127,19 +128,22 @@ static inline unsigned int i915_sg_dma_sizes(struct scatterlist *sg)
> return page_sizes;
> }
>
> -static inline unsigned int i915_sg_segment_size(void)
> +static inline unsigned int i915_sg_segment_size(struct device *dev)
> {
> - unsigned int size = swiotlb_max_segment();
> -
> - if (size == 0)
> - size = UINT_MAX;
> -
> - size = rounddown(size, PAGE_SIZE);
> - /* swiotlb_max_segment_size can return 1 byte when it means one page. */
> - if (size < PAGE_SIZE)
> - size = PAGE_SIZE;
> -
> - return size;
> + size_t max = min_t(size_t, UINT_MAX, dma_max_mapping_size(dev));
> +
> + /*
> + * Xen on x86 can reshuffle pages under us. The DMA API takes
> + * care of that both in dma_alloc_* (by calling into the hypervisor
> + * to make the pages contigous) and in dma_map_* (by bounce buffering).
> + * But i915 abuses ignores the coherency aspects of the DMA API and
> + * thus can't cope with bounce buffering actually happening, so add
> + * a hack here to force small allocations and mapping when running on
> + * Xen. (good luck with TDX, btw --hch)
> + */
> + if (IS_ENABLED(CONFIG_X86) && xen_domain())
> + max = PAGE_SIZE;
> + return round_down(max, PAGE_SIZE);
> }
Shouldn't this then be xen_pv_domain() that you use here, and - if you
really want IS_ENABLED() in addition - CONFIG_XEN_PV?
Jan
* Re: [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
2022-10-18 8:57 ` Jan Beulich
@ 2022-10-18 11:02 ` Christoph Hellwig
2022-10-18 14:21 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2022-10-18 11:02 UTC (permalink / raw)
To: Jan Beulich
Cc: Juergen Gross, Stefano Stabellini, regressions, dri-devel,
Anshuman Khandual, intel-gfx, Konrad Rzeszutek Wilk,
Marek Marczykowski-Górecki, Oleksandr Tyshchenko, iommu,
Matthew Auld, Rodrigo Vivi, xen-devel, Christoph Hellwig
On Tue, Oct 18, 2022 at 10:57:37AM +0200, Jan Beulich wrote:
> Shouldn't this then be xen_pv_domain() that you use here, and - if you
> really want IS_ENABLED() in addition - CONFIG_XEN_PV?
I'll need help from people who understand Xen better than me to pin down
the exact conditions (and maybe also the comments).
* Re: [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
2022-10-18 11:02 ` Christoph Hellwig
@ 2022-10-18 14:21 ` Jan Beulich
2022-10-18 14:33 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2022-10-18 14:21 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Juergen Gross, Stefano Stabellini, regressions, dri-devel,
Anshuman Khandual, intel-gfx, Konrad Rzeszutek Wilk,
Marek Marczykowski-Górecki, Oleksandr Tyshchenko, iommu,
Matthew Auld, Rodrigo Vivi, xen-devel
On 18.10.2022 13:02, Christoph Hellwig wrote:
> On Tue, Oct 18, 2022 at 10:57:37AM +0200, Jan Beulich wrote:
>> Shouldn't this then be xen_pv_domain() that you use here, and - if you
>> really want IS_ENABLED() in addition - CONFIG_XEN_PV?
>
> I'll need help from people who understand Xen better than me to pin down
> the exact conditions (and maybe also the comments).
Leaving the "i915 abuses" part aside (because I can't tell what exactly the
abuse is), but assuming that "can't cope with bounce buffering" means they
don't actually use the allocated buffers, I'd suggest this:
/*
* For Xen PV guests pages aren't contiguous in DMA (machine) address
* space. The DMA API takes care of that both in dma_alloc_* (by
* calling into the hypervisor to make the pages contiguous) and in
* dma_map_* (by bounce buffering). But i915 abuses ignores the
* coherency aspects of the DMA API and thus can't cope with bounce
* buffering actually happening, so add a hack here to force small
* allocations and mappings when running in PV mode on Xen.
*/
if (IS_ENABLED(CONFIG_XEN_PV) && xen_pv_domain())
max = PAGE_SIZE;
I've dropped the TDX related remark because I don't think it's meaningful
for PV guests. Otoh I've left the "abuses ignores" word sequence as is, no
matter that it reads odd to me. Plus, as hinted at before, I'm not
convinced the IS_ENABLED() use is actually necessary or warranted here.
Jan
* Re: [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
2022-10-18 14:21 ` Jan Beulich
@ 2022-10-18 14:33 ` Christoph Hellwig
2022-10-18 14:53 ` Juergen Gross
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2022-10-18 14:33 UTC (permalink / raw)
To: Jan Beulich
Cc: Juergen Gross, Stefano Stabellini, regressions, dri-devel,
Anshuman Khandual, intel-gfx, Konrad Rzeszutek Wilk,
Marek Marczykowski-Górecki, Oleksandr Tyshchenko, iommu,
Matthew Auld, Rodrigo Vivi, xen-devel, Christoph Hellwig
On Tue, Oct 18, 2022 at 04:21:43PM +0200, Jan Beulich wrote:
> Leaving the "i915 abuses" part aside (because I can't tell what exactly the
> abuse is), but assuming that "can't cope with bounce buffering" means they
> don't actually use the allocated buffers, I'd suggest this:
Except for one odd place i915 never uses dma_alloc_* but always allocates
memory itself and then maps it, but then treats it as if it was a
dma_alloc_coherent allocations, that is never does ownership changes.
> I've dropped the TDX related remark because I don't think it's meaningful
> for PV guests.
This remark is for TDX in general, not Xen related. With TDX and other
confidential computing schemes, all DMA must be bounce buffered, and
all drivers skipping dma_sync* calls are broken.
> Otoh I've left the "abuses ignores" word sequence as is, no
> matter that it reads odd to me. Plus, as hinted at before, I'm not
> convinced the IS_ENABLED() use is actually necessary or warranted here.
If the IS_ENABLED is not needed I'm all for dropping it.
But unless I misread the code, on arm/arm64 even PV guests are 1:1
mapped so that all Linux physically contiguous memory also is Xen
contiguous, so we don't need the hack.
* Re: [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
2022-10-18 14:33 ` Christoph Hellwig
@ 2022-10-18 14:53 ` Juergen Gross
2022-10-18 14:55 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Juergen Gross @ 2022-10-18 14:53 UTC (permalink / raw)
To: Christoph Hellwig, Jan Beulich
Cc: Stefano Stabellini, regressions, dri-devel, Anshuman Khandual,
intel-gfx, Konrad Rzeszutek Wilk, Marek Marczykowski-Górecki,
Oleksandr Tyshchenko, iommu, Matthew Auld, Rodrigo Vivi,
xen-devel
On 18.10.22 16:33, Christoph Hellwig wrote:
> On Tue, Oct 18, 2022 at 04:21:43PM +0200, Jan Beulich wrote:
>> Leaving the "i915 abuses" part aside (because I can't tell what exactly the
>> abuse is), but assuming that "can't cope with bounce buffering" means they
>> don't actually use the allocated buffers, I'd suggest this:
>
> Except for one odd place i915 never uses dma_alloc_* but always allocates
> memory itself and then maps it, but then treats it as if it was a
> dma_alloc_coherent allocations, that is never does ownership changes.
>
>> I've dropped the TDX related remark because I don't think it's meaningful
>> for PV guests.
>
> This remark is for TDX in general, not Xen related. With TDX and other
> confidential computing schemes, all DMA must be bounce buffered, and
> all drivers skipping dma_sync* calls are broken.
>
>> Otoh I've left the "abuses ignores" word sequence as is, no
>> matter that it reads odd to me. Plus, as hinted at before, I'm not
>> convinced the IS_ENABLED() use is actually necessary or warranted here.
>
> If the IS_ENABLED is not needed I'm all for dropping it.
> But unless I misread the code, on arm/arm64 even PV guests are 1:1
> mapped so that all Linux physically contiguous memory also is Xen
> contiguous, so we don't need the hack.
There are no PV guests on arm/arm64.
Juergen
* Re: [Intel-gfx] i915 "GPU HANG", bisected to a2daa27c0c61 "swiotlb: simplify swiotlb_max_segment"
2022-10-18 14:53 ` Juergen Gross
@ 2022-10-18 14:55 ` Christoph Hellwig
0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2022-10-18 14:55 UTC (permalink / raw)
To: Juergen Gross
Cc: Stefano Stabellini, regressions, dri-devel, Anshuman Khandual,
intel-gfx, Konrad Rzeszutek Wilk, Marek Marczykowski-Górecki,
Oleksandr Tyshchenko, iommu, Matthew Auld, Jan Beulich,
Rodrigo Vivi, xen-devel, Christoph Hellwig
On Tue, Oct 18, 2022 at 04:53:50PM +0200, Juergen Gross wrote:
>> If we don't need the IS_ENABLED is not needed I'm all for dropping it.
>> But unless I misread the code, on arm/arm64 even PV guests are 1:1
>> mapped so that all Linux physically contigous memory also is Xen
>> contigous, so we don't need the hack.
>
> There are no PV guests on arm/arm64.
Ok, that's the part I was missing. In that case we should be fine
without the IS_ENABLED indeed.