dri-devel.lists.freedesktop.org archive mirror
* [PATCH] drm/i915/dmabuf: Flush the cache in vmap
@ 2025-10-24 11:04 Jocelyn Falempe
  2025-10-24 11:53 ` Tvrtko Ursulin
  0 siblings, 1 reply; 18+ messages in thread
From: Jocelyn Falempe @ 2025-10-24 11:04 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Simona Vetter, Christian Brauner, Andi Shyti,
	intel-gfx, dri-devel
  Cc: Jocelyn Falempe

On a Lenovo SE100 server, when using the i915 GPU for rendering and the
ast driver for display, the graphics output is corrupted and almost
unusable.

Adding a clflush call in the vmap function fixes this issue
completely.

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index f4f1c979d1b9..f6a8c1cbe4d1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -77,6 +77,7 @@ static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf,
 		return PTR_ERR(vaddr);
 
 	iosys_map_set_vaddr(map, vaddr);
+	drm_clflush_virt_range(vaddr, dma_buf->size);
 
 	return 0;
 }

base-commit: 0790925dadad0997580df6e32cdccd54316807f2
-- 
2.51.0



* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 11:04 [PATCH] drm/i915/dmabuf: Flush the cache in vmap Jocelyn Falempe
@ 2025-10-24 11:53 ` Tvrtko Ursulin
  2025-10-24 12:40   ` Thomas Zimmermann
  2025-10-24 12:48   ` Jocelyn Falempe
  0 siblings, 2 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2025-10-24 11:53 UTC (permalink / raw)
  To: Jocelyn Falempe, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	David Airlie, Simona Vetter, Christian Brauner, Andi Shyti,
	intel-gfx, dri-devel


On 24/10/2025 12:04, Jocelyn Falempe wrote:
> On a lenovo se100 server, when using i915 GPU for rendering, and the
> ast driver for display, the graphic output is corrupted, and almost
> unusable.
> 
> Adding a clflush call in the vmap function fixes this issue
> completely.

Is AST importing an i915-allocated buffer in this use case, or what
exactly is the relationship?

I wonder whether some path is not calling dma_buf_begin/end_cpu_access().
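
For readers less familiar with that bracket: userspace reaches the same
begin/end_cpu_access hooks through DMA_BUF_IOCTL_SYNC on the dma-buf file
descriptor. The following is only a minimal sketch of that userspace
pattern, not the in-kernel path ast would use. The helper name
read_with_cpu_access() and the read_fn callback are hypothetical, and the
fd is assumed to be a dma-buf fd obtained elsewhere (for example via PRIME
export).

```c
#include <errno.h>
#include <linux/dma-buf.h>
#include <sys/ioctl.h>

/*
 * Bracket a CPU read of a dma-buf with DMA_BUF_IOCTL_SYNC.
 *
 * dmabuf_fd: assumed to be a dma-buf file descriptor obtained elsewhere.
 * read_fn:   hypothetical callback that performs the actual CPU access.
 *
 * Returns 0 on success or a negative errno value.
 */
static int read_with_cpu_access(int dmabuf_fd, void (*read_fn)(void *), void *ctx)
{
	struct dma_buf_sync sync = { 0 };
	int ret = 0;

	/* Start of CPU read access: the exporter may flush or invalidate caches here. */
	sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
		return -errno;

	if (read_fn)
		read_fn(ctx);

	/* End of CPU read access: must pair with the START call above. */
	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
		ret = -errno;

	return ret;
}
```

A production caller would probably also retry on EINTR/EAGAIN; that is
left out here to keep the sketch short.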

Regards,

Tvrtko

> 
> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> index f4f1c979d1b9..f6a8c1cbe4d1 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> @@ -77,6 +77,7 @@ static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf,
>   		return PTR_ERR(vaddr);
>   
>   	iosys_map_set_vaddr(map, vaddr);
> +	drm_clflush_virt_range(vaddr, dma_buf->size);
>   
>   	return 0;
>   }
> 
> base-commit: 0790925dadad0997580df6e32cdccd54316807f2



* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 11:53 ` Tvrtko Ursulin
@ 2025-10-24 12:40   ` Thomas Zimmermann
  2025-10-24 13:33     ` Jocelyn Falempe
  2025-10-24 12:48   ` Jocelyn Falempe
  1 sibling, 1 reply; 18+ messages in thread
From: Thomas Zimmermann @ 2025-10-24 12:40 UTC (permalink / raw)
  To: Tvrtko Ursulin, Jocelyn Falempe, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

Hi

On 24.10.25 at 13:53, Tvrtko Ursulin wrote:
>
> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>> ast driver for display, the graphic output is corrupted, and almost
>> unusable.
>>
>> Adding a clflush call in the vmap function fixes this issue
>> completely.
>
> AST is importing i915 allocated buffer in this use case, or how 
> exactly is the relationship?
>
> Wondering if some path is not calling dma_buf_begin/end_cpu_access().

Yes, ast doesn't call begin/end_cpu_access in [1].

Jocelyn, if that fixes the issue, feel free to send me a patch for review.

[1] 
https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ast/ast_mode.c

Best regards
Thomas

>
> Regards,
>
> Tvrtko
>
>>
>> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>> index f4f1c979d1b9..f6a8c1cbe4d1 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>> @@ -77,6 +77,7 @@ static int i915_gem_dmabuf_vmap(struct dma_buf 
>> *dma_buf,
>>           return PTR_ERR(vaddr);
>>         iosys_map_set_vaddr(map, vaddr);
>> +    drm_clflush_virt_range(vaddr, dma_buf->size);
>>         return 0;
>>   }
>>
>> base-commit: 0790925dadad0997580df6e32cdccd54316807f2
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 11:53 ` Tvrtko Ursulin
  2025-10-24 12:40   ` Thomas Zimmermann
@ 2025-10-24 12:48   ` Jocelyn Falempe
  2025-10-24 15:18     ` Tvrtko Ursulin
  2025-10-24 16:49     ` Michel Dänzer
  1 sibling, 2 replies; 18+ messages in thread
From: Jocelyn Falempe @ 2025-10-24 12:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	David Airlie, Simona Vetter, Christian Brauner, Andi Shyti,
	intel-gfx, dri-devel, Michel Dänzer

On 24/10/2025 13:53, Tvrtko Ursulin wrote:
> 
> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>> ast driver for display, the graphic output is corrupted, and almost
>> unusable.
>>
>> Adding a clflush call in the vmap function fixes this issue
>> completely.
> 
> AST is importing i915 allocated buffer in this use case, or how exactly 
> is the relationship?

I think it's mutter/gnome-shell that copies the buffer from i915 to ast;
here are the logs:

gnome-shell[2079]: Failed to initialize accelerated iGPU/dGPU 
framebuffer sharing: Do not want to use software renderer (llvmpipe 
(LLVM 19.1.7, 256 bits)), falling back to CPU copy path
gnome-shell[1533]: Created gbm renderer for '/dev/dri/card0'
gnome-shell[1533]: GPU /dev/dri/card1 selected as primary

card0 is ast and card1 is i915

Do you think there is something missing in mutter?

(Added Michel Dänzer, who may know more about mutter.)

-- 

Jocelyn


> 
> Wondering if some path is not calling dma_buf_begin/end_cpu_access().
> 
> Regards,
> 
> Tvrtko
> 
>>
>> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/ 
>> drm/i915/gem/i915_gem_dmabuf.c
>> index f4f1c979d1b9..f6a8c1cbe4d1 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>> @@ -77,6 +77,7 @@ static int i915_gem_dmabuf_vmap(struct dma_buf 
>> *dma_buf,
>>           return PTR_ERR(vaddr);
>>       iosys_map_set_vaddr(map, vaddr);
>> +    drm_clflush_virt_range(vaddr, dma_buf->size);
>>       return 0;
>>   }
>>
>> base-commit: 0790925dadad0997580df6e32cdccd54316807f2
> 



* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 12:40   ` Thomas Zimmermann
@ 2025-10-24 13:33     ` Jocelyn Falempe
  2025-10-24 15:18       ` Thomas Zimmermann
  0 siblings, 1 reply; 18+ messages in thread
From: Jocelyn Falempe @ 2025-10-24 13:33 UTC (permalink / raw)
  To: Thomas Zimmermann, Tvrtko Ursulin, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

On 24/10/2025 14:40, Thomas Zimmermann wrote:
> Hi
> 
> On 24.10.25 at 13:53, Tvrtko Ursulin wrote:
>>
>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>>> ast driver for display, the graphic output is corrupted, and almost
>>> unusable.
>>>
>>> Adding a clflush call in the vmap function fixes this issue
>>> completely.
>>
>> AST is importing i915 allocated buffer in this use case, or how 
>> exactly is the relationship?
>>
>> Wondering if some path is not calling dma_buf_begin/end_cpu_access().
> 
> Yes, ast doesn't call begin/end_cpu_access in [1].
> 
> Jocelyn, if that fixes the issue, feel free to send me a patch for review.
> 
> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ast/ 
> ast_mode.c

I tried the following patch, but that doesn't fix the graphical issue:

diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index b4e8edc7c767..e50f95a4c8a9 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -564,6 +564,7 @@ static void ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
         struct drm_crtc_state *crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
         struct drm_rect damage;
         struct drm_atomic_helper_damage_iter iter;
+       int ret;

         if (!old_fb || (fb->format != old_fb->format) || crtc_state->mode_changed) {
                 struct ast_crtc_state *ast_crtc_state = to_ast_crtc_state(crtc_state);
@@ -572,11 +573,16 @@ static void ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
                 ast_set_vbios_color_reg(ast, fb->format, ast_crtc_state->vmode);
         }

+       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
+       pr_info("AST begin_cpu_access %d\n", ret);
+
         drm_atomic_helper_damage_iter_init(&iter, old_plane_state, plane_state);
         drm_atomic_for_each_plane_damage(&iter, &damage) {
                 ast_handle_damage(ast_plane, shadow_plane_state->data, fb, &damage);
         }
         }

+       drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
+
         /*
          * Some BMCs stop scanning out the video signal after the driver
          * reprogrammed the offset. This stalls display output for several



Best regards,

-- 

Jocelyn

> 
> Best regards
> Thomas
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>>
>>> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
>>> ---
>>>   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/ 
>>> gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> index f4f1c979d1b9..f6a8c1cbe4d1 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> @@ -77,6 +77,7 @@ static int i915_gem_dmabuf_vmap(struct dma_buf 
>>> *dma_buf,
>>>           return PTR_ERR(vaddr);
>>>         iosys_map_set_vaddr(map, vaddr);
>>> +    drm_clflush_virt_range(vaddr, dma_buf->size);
>>>         return 0;
>>>   }
>>>
>>> base-commit: 0790925dadad0997580df6e32cdccd54316807f2
>>
> 



* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 13:33     ` Jocelyn Falempe
@ 2025-10-24 15:18       ` Thomas Zimmermann
  2025-10-24 15:55         ` Tvrtko Ursulin
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Zimmermann @ 2025-10-24 15:18 UTC (permalink / raw)
  To: Jocelyn Falempe, Tvrtko Ursulin, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

Hi

On 24.10.25 at 15:33, Jocelyn Falempe wrote:
> On 24/10/2025 14:40, Thomas Zimmermann wrote:
>> Hi
>>
>> On 24.10.25 at 13:53, Tvrtko Ursulin wrote:
>>>
>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>>>> ast driver for display, the graphic output is corrupted, and almost
>>>> unusable.
>>>>
>>>> Adding a clflush call in the vmap function fixes this issue
>>>> completely.
>>>
>>> AST is importing i915 allocated buffer in this use case, or how 
>>> exactly is the relationship?
>>>
>>> Wondering if some path is not calling dma_buf_begin/end_cpu_access().
>>
>> Yes, ast doesn't call begin/end_cpu_access in [1].
>>
>> Jocelyn, if that fixes the issue, feel free to send me a patch for 
>> review.
>>
>> [1] 
>> https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ast/ 
>> ast_mode.c
>
> I tried the following patch, but that doesn't fix the graphical issue:
>
> diff --git a/drivers/gpu/drm/ast/ast_mode.c 
> b/drivers/gpu/drm/ast/ast_mode.c
> index b4e8edc7c767..e50f95a4c8a9 100644
> --- a/drivers/gpu/drm/ast/ast_mode.c
> +++ b/drivers/gpu/drm/ast/ast_mode.c
> @@ -564,6 +564,7 @@ static void 
> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>         struct drm_crtc_state *crtc_state = 
> drm_atomic_get_new_crtc_state(state, crtc);
>         struct drm_rect damage;
>         struct drm_atomic_helper_damage_iter iter;
> +       int ret;
>
>         if (!old_fb || (fb->format != old_fb->format) || 
> crtc_state->mode_changed) {
>                 struct ast_crtc_state *ast_crtc_state = 
> to_ast_crtc_state(crtc_state);
> @@ -572,11 +573,16 @@ static void 
> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>                 ast_set_vbios_color_reg(ast, fb->format, 
> ast_crtc_state->vmode);
>         }
>
> +       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
> +       pr_info("AST begin_cpu_access %d\n", ret);

Presumably, you end up in [1]. I cannot find the clflush there or in [2].
Maybe you need to add this call somewhere in there, similar to [3]. Just
guessing.

Best regards
Thomas

[1] 
https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L117
[2] 
https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/i915/gem/i915_gem_domain.c#L493
[3] 
https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/i915/gem/i915_gem_object.c#L509

> +
>         drm_atomic_helper_damage_iter_init(&iter, old_plane_state, 
> plane_state);
>         drm_atomic_for_each_plane_damage(&iter, &damage) {
>                 ast_handle_damage(ast_plane, shadow_plane_state->data, 
> fb, &damage);
>         }
>
> +       drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
> +
>         /*
>          * Some BMCs stop scanning out the video signal after the driver
>          * reprogrammed the offset. This stalls display output for 
> several
>
>
>
> Best regards,
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 12:48   ` Jocelyn Falempe
@ 2025-10-24 15:18     ` Tvrtko Ursulin
  2025-10-24 16:32       ` Jocelyn Falempe
  2025-10-24 16:49     ` Michel Dänzer
  1 sibling, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2025-10-24 15:18 UTC (permalink / raw)
  To: Jocelyn Falempe, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	David Airlie, Simona Vetter, Christian Brauner, Andi Shyti,
	intel-gfx, dri-devel, Michel Dänzer


On 24/10/2025 13:48, Jocelyn Falempe wrote:
> On 24/10/2025 13:53, Tvrtko Ursulin wrote:
>>
>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>>> ast driver for display, the graphic output is corrupted, and almost
>>> unusable.
>>>
>>> Adding a clflush call in the vmap function fixes this issue
>>> completely.
>>
>> AST is importing i915 allocated buffer in this use case, or how 
>> exactly is the relationship?
> 
> I think it's mutter/gnome-shell who copy the buffer from i915 to ast, 
> here is the logs:
> 
> gnome-shell[2079]: Failed to initialize accelerated iGPU/dGPU 
> framebuffer sharing: Do not want to use software renderer (llvmpipe 
> (LLVM 19.1.7, 256 bits)), falling back to CPU copy path
> gnome-shell[1533]: Created gbm renderer for '/dev/dri/card0'
> gnome-shell[1533]: GPU /dev/dri/card1 selected as primary
> 
> card0 is ast and card1 is i915
> 
> Do you think there is something missing in mutter?

No idea. I only notice that the log message "falling back to CPU copy 
path" has been removed from the codebase in:

commit d4e8cfa17a3a69965b9bd130ded46cf54a438908
Author: Jonas Ådahl <jadahl@gmail.com>
Date:   Wed May 5 15:53:59 2021 +0200

     renderer/native: Use MetaRenderDevice

     This replaces functionality that MetaRenderDevice and friends has
     learned, e.g. buffer allocation, EGLDisplay creation, with the usage of
     those helper objects. The main objective is to shrink
     meta-renderer-native.c and by extension meta-onscreen-native.c, moving
     its functionality into more isolated objects.

     Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1854>

So a long, long time ago. Maybe it's worth trying with a newer version?

Regards,

Tvrtko



* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 15:18       ` Thomas Zimmermann
@ 2025-10-24 15:55         ` Tvrtko Ursulin
  2025-10-24 16:28           ` Jocelyn Falempe
  2025-10-27  9:46           ` Jocelyn Falempe
  0 siblings, 2 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2025-10-24 15:55 UTC (permalink / raw)
  To: Thomas Zimmermann, Jocelyn Falempe, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel


On 24/10/2025 16:18, Thomas Zimmermann wrote:
> Hi
> 
> On 24.10.25 at 15:33, Jocelyn Falempe wrote:
>> On 24/10/2025 14:40, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> On 24.10.25 at 13:53, Tvrtko Ursulin wrote:
>>>>
>>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>>>>> ast driver for display, the graphic output is corrupted, and almost
>>>>> unusable.
>>>>>
>>>>> Adding a clflush call in the vmap function fixes this issue
>>>>> completely.
>>>>
>>>> AST is importing i915 allocated buffer in this use case, or how 
>>>> exactly is the relationship?
>>>>
>>>> Wondering if some path is not calling dma_buf_begin/end_cpu_access().
>>>
>>> Yes, ast doesn't call begin/end_cpu_access in [1].
>>>
>>> Jocelyn, if that fixes the issue, feel free to send me a patch for 
>>> review.
>>>
>>> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ 
>>> ast/ ast_mode.c
>>
>> I tried the following patch, but that doesn't fix the graphical issue:
>>
>> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ 
>> ast_mode.c
>> index b4e8edc7c767..e50f95a4c8a9 100644
>> --- a/drivers/gpu/drm/ast/ast_mode.c
>> +++ b/drivers/gpu/drm/ast/ast_mode.c
>> @@ -564,6 +564,7 @@ static void 
>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>         struct drm_crtc_state *crtc_state = 
>> drm_atomic_get_new_crtc_state(state, crtc);
>>         struct drm_rect damage;
>>         struct drm_atomic_helper_damage_iter iter;
>> +       int ret;
>>
>>         if (!old_fb || (fb->format != old_fb->format) || crtc_state- 
>> >mode_changed) {
>>                 struct ast_crtc_state *ast_crtc_state = 
>> to_ast_crtc_state(crtc_state);
>> @@ -572,11 +573,16 @@ static void 
>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>                 ast_set_vbios_color_reg(ast, fb->format, 
>> ast_crtc_state->vmode);
>>         }
>>
>> +       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>> +       pr_info("AST begin_cpu_access %d\n", ret);
> 
> Presumably, you end up in [1]. I cannot find the cflush there or in [2]. 
> Maybe you need to add this call somewhere in there, similar to [3]. Just 
> guessing.

Near [2], a clflush can happen at [4] *if* the driver thinks it is needed.
Most GPUs are cache coherent, so mostly it isn't. But if this is a
Meteor Lake machine (googling the Lenovo SE100 makes me think so?),
then userspace has some responsibility to manage things, since there
the coherency is only 1-way. Or userspace could even have told the driver
to stay out of it, in which case userspace needs to manage everything. Off
the top of my head, I am not sure how exactly this used to work, or how it
is supposed to interact with exported buffers.

If this is indeed on Meteor Lake, maybe Joonas or Rodrigo remember better
how the special 1-way coherency is supposed to be managed there?
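
To make concrete what "a clflush over the buffer" amounts to, here is a
rough userspace analogue of flushing a virtual address range one cache
line at a time. It is a sketch under assumptions, not the kernel code:
flush_virt_range() is a hypothetical name, the 64-byte line size is
hard-coded (the kernel helper, as far as I can tell, takes it from the CPU
data instead), and on non-SSE2 builds it only counts lines without
flushing anything.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Assumed cache line size; real code should query the CPU. */
#define CACHE_LINE 64u

/*
 * Flush every cache line that overlaps [addr, addr + len).
 * Returns the number of cache lines visited.
 */
static size_t flush_virt_range(const void *addr, size_t len)
{
	/* Round the start down to a cache line boundary. */
	uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
	uintptr_t end = (uintptr_t)addr + len;
	size_t lines = 0;

	if (len == 0)
		return 0;

#if defined(__SSE2__)
	_mm_mfence();	/* clflush is only ordered by a full fence */
#endif
	for (; p < end; p += CACHE_LINE) {
#if defined(__SSE2__)
		_mm_clflush((const void *)p);
#endif
		lines++;
	}
#if defined(__SSE2__)
	_mm_mfence();
#endif

	return lines;
}
```

The cost scales with the buffer size, which is presumably why the driver
only flushes when it believes the object is not coherent with the CPU
cache.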

Regards,

Tvrtko

[4] 
https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/i915/gem/i915_gem_domain.c#L510

> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ 
> i915/gem/i915_gem_dmabuf.c#L117
> [2] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ 
> i915/gem/i915_gem_domain.c#L493
> [3] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ 
> i915/gem/i915_gem_object.c#L509
> 
>> +
>>         drm_atomic_helper_damage_iter_init(&iter, old_plane_state, 
>> plane_state);
>>         drm_atomic_for_each_plane_damage(&iter, &damage) {
>>                 ast_handle_damage(ast_plane, shadow_plane_state->data, 
>> fb, &damage);
>>         }
>>
>> +       drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
>> +
>>         /*
>>          * Some BMCs stop scanning out the video signal after the driver
>>          * reprogrammed the offset. This stalls display output for 
>> several
>>
>>
>>
>> Best regards,
>>
> 



* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 15:55         ` Tvrtko Ursulin
@ 2025-10-24 16:28           ` Jocelyn Falempe
  2025-10-27  9:46           ` Jocelyn Falempe
  1 sibling, 0 replies; 18+ messages in thread
From: Jocelyn Falempe @ 2025-10-24 16:28 UTC (permalink / raw)
  To: Tvrtko Ursulin, Thomas Zimmermann, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

On 24/10/2025 17:55, Tvrtko Ursulin wrote:
> 
> On 24/10/2025 16:18, Thomas Zimmermann wrote:
>> Hi
>>
>> On 24.10.25 at 15:33, Jocelyn Falempe wrote:
>>> On 24/10/2025 14:40, Thomas Zimmermann wrote:
>>>> Hi
>>>>
>>>> On 24.10.25 at 13:53, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>>>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>>>>>> ast driver for display, the graphic output is corrupted, and almost
>>>>>> unusable.
>>>>>>
>>>>>> Adding a clflush call in the vmap function fixes this issue
>>>>>> completely.
>>>>>
>>>>> AST is importing i915 allocated buffer in this use case, or how 
>>>>> exactly is the relationship?
>>>>>
>>>>> Wondering if some path is not calling dma_buf_begin/end_cpu_access().
>>>>
>>>> Yes, ast doesn't call begin/end_cpu_access in [1].
>>>>
>>>> Jocelyn, if that fixes the issue, feel free to send me a patch for 
>>>> review.
>>>>
>>>> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ 
>>>> ast/ ast_mode.c
>>>
>>> I tried the following patch, but that doesn't fix the graphical issue:
>>>
>>> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ 
>>> ast_mode.c
>>> index b4e8edc7c767..e50f95a4c8a9 100644
>>> --- a/drivers/gpu/drm/ast/ast_mode.c
>>> +++ b/drivers/gpu/drm/ast/ast_mode.c
>>> @@ -564,6 +564,7 @@ static void 
>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>         struct drm_crtc_state *crtc_state = 
>>> drm_atomic_get_new_crtc_state(state, crtc);
>>>         struct drm_rect damage;
>>>         struct drm_atomic_helper_damage_iter iter;
>>> +       int ret;
>>>
>>>         if (!old_fb || (fb->format != old_fb->format) || crtc_state- 
>>> >mode_changed) {
>>>                 struct ast_crtc_state *ast_crtc_state = 
>>> to_ast_crtc_state(crtc_state);
>>> @@ -572,11 +573,16 @@ static void 
>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>                 ast_set_vbios_color_reg(ast, fb->format, 
>>> ast_crtc_state->vmode);
>>>         }
>>>
>>> +       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>>> +       pr_info("AST begin_cpu_access %d\n", ret);
>>
>> Presumably, you end up in [1]. I cannot find the cflush there or in 
>> [2]. Maybe you need to add this call somewhere in there, similar to 
>> [3]. Just guessing.
> 
> Near [2] clflush can happen at [4] *if* the driver thinks it is needed. 
> Most GPUs are cache coherent so mostly it isn't. But if this is a 
> Meteorlake machine (when I google Lenovo se100 it makes me think so?) 
> then the userspace has some responsibility to manage things since there 
> it is only 1-way coherency. Or userspace could have even told the driver 
> to stay off in which case it then needs to manage everything. From the 
> top of my head I am not sure how exactly this used to work, or how it is 
> supposed to interact with exported buffers.
> 
> If this is indeed on Meteorlake, maybe Joonas or Rodrigo remember better 
> how the special 1-way coherency is supposed to be managed there?

The CPU is an Intel(R) Core(TM) Ultra 5 225H.

lspci says Arrow Lake for the GPU:
00:02.0 VGA compatible controller: Intel Corporation Arrow Lake-P [Intel 
Graphics] (rev 03)

But some other PCI devices say Meteor Lake:
00:0a.0 Signal processing controller: Intel Corporation Meteor Lake-P 
Platform Monitoring Technology (rev 01)
00:0b.0 Processing accelerators: Intel Corporation Meteor Lake NPU (rev 05)

Thanks a lot for helping with this.

Best regards,

-- 

Jocelyn

> 
> Regards,
> 
> Tvrtko
> 
> [4] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ 
> i915/gem/i915_gem_domain.c#L510
> 
>> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ 
>> i915/gem/i915_gem_dmabuf.c#L117
>> [2] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ 
>> i915/gem/i915_gem_domain.c#L493
>> [3] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ 
>> i915/gem/i915_gem_object.c#L509
>>
>>> +
>>>         drm_atomic_helper_damage_iter_init(&iter, old_plane_state, 
>>> plane_state);
>>>         drm_atomic_for_each_plane_damage(&iter, &damage) {
>>>                 ast_handle_damage(ast_plane, shadow_plane_state- 
>>> >data, fb, &damage);
>>>         }
>>>
>>> +       drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
>>> +
>>>         /*
>>>          * Some BMCs stop scanning out the video signal after the driver
>>>          * reprogrammed the offset. This stalls display output for 
>>> several
>>>
>>>
>>>
>>> Best regards,
>>>
>>
> 




* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 15:18     ` Tvrtko Ursulin
@ 2025-10-24 16:32       ` Jocelyn Falempe
  0 siblings, 0 replies; 18+ messages in thread
From: Jocelyn Falempe @ 2025-10-24 16:32 UTC (permalink / raw)
  To: Tvrtko Ursulin, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	David Airlie, Simona Vetter, Christian Brauner, Andi Shyti,
	intel-gfx, dri-devel, Michel Dänzer

On 24/10/2025 17:18, Tvrtko Ursulin wrote:
> 
> On 24/10/2025 13:48, Jocelyn Falempe wrote:
>> On 24/10/2025 13:53, Tvrtko Ursulin wrote:
>>>
>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>>>> ast driver for display, the graphic output is corrupted, and almost
>>>> unusable.
>>>>
>>>> Adding a clflush call in the vmap function fixes this issue
>>>> completely.
>>>
>>> AST is importing i915 allocated buffer in this use case, or how 
>>> exactly is the relationship?
>>
>> I think it's mutter/gnome-shell who copy the buffer from i915 to ast, 
>> here is the logs:
>>
>> gnome-shell[2079]: Failed to initialize accelerated iGPU/dGPU 
>> framebuffer sharing: Do not want to use software renderer (llvmpipe 
>> (LLVM 19.1.7, 256 bits)), falling back to CPU copy path
>> gnome-shell[1533]: Created gbm renderer for '/dev/dri/card0'
>> gnome-shell[1533]: GPU /dev/dri/card1 selected as primary
>>
>> card0 is ast and card1 is i915
>>
>> Do you think there is something missing in mutter?
> 
> No idea. I only notice that the log message "falling back to CPU copy 
> path" has been removed from the codebase in:
> 
> commit d4e8cfa17a3a69965b9bd130ded46cf54a438908
> Author: Jonas Ådahl <jadahl@gmail.com>
> Date:   Wed May 5 15:53:59 2021 +0200
> 
>      renderer/native: Use MetaRenderDevice
> 
>      This replaces functionality that MetaRenderDevice and friends has
>      learned, e.g. buffer allocation, EGLDisplay creation, with the 
> usage of
>      those helper objects. The main objective is to shrink
>      meta-renderer-native.c and by extension meta-onscreen-native.c, moving
>      its functionality into more isolated objects.
> 
>      Part-of: <https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1854>
> 
> So long long time ago. Maybe worth trying with a newer version?

Yes, I've reproduced the issue on an old version (RHEL 9.6), but I'm
told it's still present in newer versions as well.
I will test with the latest version of mutter to see if it's still present.

Thanks,

-- 

Jocelyn

> 
> Regards,
> 
> Tvrtko
> 



* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 12:48   ` Jocelyn Falempe
  2025-10-24 15:18     ` Tvrtko Ursulin
@ 2025-10-24 16:49     ` Michel Dänzer
  2025-10-27 10:23       ` Thomas Zimmermann
  1 sibling, 1 reply; 18+ messages in thread
From: Michel Dänzer @ 2025-10-24 16:49 UTC (permalink / raw)
  To: Jocelyn Falempe, Tvrtko Ursulin, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

On 10/24/25 14:48, Jocelyn Falempe wrote:
> On 24/10/2025 13:53, Tvrtko Ursulin wrote:
>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>>> ast driver for display, the graphic output is corrupted, and almost
>>> unusable.
>>>
>>> Adding a clflush call in the vmap function fixes this issue
>>> completely.
>>
>> AST is importing i915 allocated buffer in this use case, or how exactly is the relationship?
> 
> I think it's mutter/gnome-shell who copy the buffer from i915 to ast, here is the logs:
> 
> gnome-shell[2079]: Failed to initialize accelerated iGPU/dGPU framebuffer sharing: Do not want to use software renderer (llvmpipe (LLVM 19.1.7, 256 bits)), falling back to CPU copy path

FWIW, the code which logged "falling back to CPU copy path" was removed in mutter 42. Can you still reproduce the issue with a newer version of mutter, ideally 49? A lot has changed over the last 3.5 years.


> gnome-shell[1533]: Created gbm renderer for '/dev/dri/card0'
> gnome-shell[1533]: GPU /dev/dri/card1 selected as primary
> 
> card0 is ast and card1 is i915
> 
> Do you think there is something missing in mutter?

Not that I can see offhand.

After logging "Failed to initialize accelerated iGPU/dGPU framebuffer sharing", mutter tries to export a dma-buf from the i915 BO, import it into the AST device, create a KMS FB for the imported BO and assign the FB to the primary plane of a CRTC. If this path works (which seems to be the case in this scenario, or you wouldn't see BOs shared between i915 & AST), mutter doesn't do any copies between different BOs. It's between the AST & i915 drivers to correctly handle access to the shared BO.

Does the AST driver wait for the i915 GPU to finish drawing to the shared BO, by waiting for the exclusive fence of the dma-buf synchronization object, before reading from the BO?


P.S. If the path described above didn't work, mutter would fall back to reading the i915 BO contents with glReadPixels and copying them to a dumb BO allocated from the AST device, so you wouldn't see BOs shared between the drivers. With current mutter, you can force this fallback with the environment variable MUTTER_DEBUG_MULTI_GPU_FORCE_COPY_MODE=primary-gpu-cpu . Does that avoid the issue?


-- 
Earthling Michel Dänzer       \        GNOME / Xwayland / Mesa developer
https://redhat.com             \               Libre software enthusiast

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 15:55         ` Tvrtko Ursulin
  2025-10-24 16:28           ` Jocelyn Falempe
@ 2025-10-27  9:46           ` Jocelyn Falempe
  2025-10-27 10:26             ` Thomas Zimmermann
  1 sibling, 1 reply; 18+ messages in thread
From: Jocelyn Falempe @ 2025-10-27  9:46 UTC (permalink / raw)
  To: Tvrtko Ursulin, Thomas Zimmermann, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

On 24/10/2025 17:55, Tvrtko Ursulin wrote:
> 
> On 24/10/2025 16:18, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 24.10.25 um 15:33 schrieb Jocelyn Falempe:
>>> On 24/10/2025 14:40, Thomas Zimmermann wrote:
>>>> Hi
>>>>
>>>> Am 24.10.25 um 13:53 schrieb Tvrtko Ursulin:
>>>>>
>>>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>>>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>>>>>> ast driver for display, the graphic output is corrupted, and almost
>>>>>> unusable.
>>>>>>
>>>>>> Adding a clflush call in the vmap function fixes this issue
>>>>>> completely.
>>>>>
>>>>> AST is importing i915 allocated buffer in this use case, or how 
>>>>> exactly is the relationship?
>>>>>
>>>>> Wondering if some path is not calling dma_buf_begin/end_cpu_access().
>>>>
>>>> Yes, ast doesn't call begin/end_cpu_access in [1].
>>>>
>>>> Jocelyn, if that fixes the issue, feel free to send me a patch for 
>>>> review.
>>>>
>>>> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ast/ast_mode.c
>>>
>>> I tried the following patch, but that doesn't fix the graphical issue:
>>>
>>> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
>>> index b4e8edc7c767..e50f95a4c8a9 100644
>>> --- a/drivers/gpu/drm/ast/ast_mode.c
>>> +++ b/drivers/gpu/drm/ast/ast_mode.c
>>> @@ -564,6 +564,7 @@ static void ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>         struct drm_crtc_state *crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
>>>         struct drm_rect damage;
>>>         struct drm_atomic_helper_damage_iter iter;
>>> +       int ret;
>>>
>>>         if (!old_fb || (fb->format != old_fb->format) || crtc_state->mode_changed) {
>>>                 struct ast_crtc_state *ast_crtc_state = to_ast_crtc_state(crtc_state);
>>> @@ -572,11 +573,16 @@ static void ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>                 ast_set_vbios_color_reg(ast, fb->format, ast_crtc_state->vmode);
>>>         }
>>>
>>> +       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>>> +       pr_info("AST begin_cpu_access %d\n", ret);
>>
>> Presumably, you end up in [1]. I cannot find the cflush there or in 
>> [2]. Maybe you need to add this call somewhere in there, similar to 
>> [3]. Just guessing.
> 
> Near [2] clflush can happen at [4] *if* the driver thinks it is needed. 
> Most GPUs are cache coherent so mostly it isn't. But if this is a 
> Meteorlake machine (when I google Lenovo se100 it makes me think so?) 
> then the userspace has some responsibility to manage things since there 
> it is only 1-way coherency. Or userspace could have even told the driver 
> to stay off in which case it then needs to manage everything. From the 
> top of my head I am not sure how exactly this used to work, or how it is 
> supposed to interact with exported buffers.
> 
> If this is indeed on Meteorlake, maybe Joonas or Rodrigo remember better 
> how the special 1-way coherency is supposed to be managed there?

I ran an experiment. If I add:

* a call to drm_gem_fb_begin_cpu_access() in the ast driver,
* and, in flush_write_domain() in i915_gem_domain.c:
         case I915_GEM_DOMAIN_RENDER:
+               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC | I915_CLFLUSH_FORCE);

then that fixes the issue too.

So I think there are two things to fix:
  * The missing call to drm_gem_fb_begin_cpu_access() in ast.
  * The missing cache flush in i915 for the Arrowlake iGPU (but probably 
not the way I've done it).
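
To illustrate why both pieces seem to be needed, here is a toy userspace
model in C. It is only a sketch, not kernel code: the toy_* names, the
single gpu_dirty flag and the whole-buffer write-back are assumptions made
for illustration and do not describe the real i915 or ast code paths. In
the model, the importer only sees the rendered data when it brackets its
read with a begin-access call and the exporter actually writes back on
that call.

```c
/* Toy userspace model, not kernel code: every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 16

struct toy_shared_buf {
	unsigned char memory[TOY_BUF_SIZE];    /* what a CPU read returns */
	unsigned char gpu_cache[TOY_BUF_SIZE]; /* GPU writes not yet written back */
	bool gpu_dirty;                        /* gpu_cache holds newer data */
	bool exporter_flushes;                 /* stands in for the i915-side fix */
};

/* The GPU renders: data lands in its cache, not in memory. */
static void toy_gpu_write(struct toy_shared_buf *buf, unsigned char value)
{
	memset(buf->gpu_cache, value, TOY_BUF_SIZE);
	buf->gpu_dirty = true;
}

/* Exporter hook, run when the importer announces a CPU read. */
static void toy_begin_cpu_access(struct toy_shared_buf *buf)
{
	if (buf->exporter_flushes && buf->gpu_dirty) {
		memcpy(buf->memory, buf->gpu_cache, TOY_BUF_SIZE);
		buf->gpu_dirty = false;
	}
}

/* Importer reads a frame; bracket_access stands in for the ast-side fix. */
static unsigned char toy_importer_read(struct toy_shared_buf *buf,
				       bool bracket_access)
{
	unsigned char first;

	assert(buf != NULL);
	if (bracket_access)
		toy_begin_cpu_access(buf);
	first = buf->memory[0];
	/* A matching end-access call would go here; it is a no-op in this model. */
	return first;
}

/* Render one frame with the given fixes enabled; return what the importer sees. */
static unsigned char toy_run(bool bracket_access, bool exporter_flushes)
{
	struct toy_shared_buf buf;

	memset(&buf, 0, sizeof(buf));
	buf.exporter_flushes = exporter_flushes;
	toy_gpu_write(&buf, 0xAB);
	return toy_importer_read(&buf, bracket_access);
}
```

In this model toy_run(true, false) still returns stale zeroes, which lines
up with the ast-only patch not helping earlier in the thread, while
toy_run(true, true) returns the rendered value.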

Regards,

-- 

Jocelyn

> 
> Regards,
> 
> Tvrtko
> 
> [4] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/i915/gem/i915_gem_domain.c#L510
> 
>> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L117
>> [2] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/i915/gem/i915_gem_domain.c#L493
>> [3] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/i915/gem/i915_gem_object.c#L509
>>
>>> +
>>>         drm_atomic_helper_damage_iter_init(&iter, old_plane_state, plane_state);
>>>         drm_atomic_for_each_plane_damage(&iter, &damage) {
>>>                 ast_handle_damage(ast_plane, shadow_plane_state->data, fb, &damage);
>>>         }
>>>
>>> +       drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
>>> +
>>>         /*
>>>          * Some BMCs stop scanning out the video signal after the driver
>>>          * reprogrammed the offset. This stalls display output for several
>>>
>>>
>>>
>>> Best regards,
>>>
>>
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-24 16:49     ` Michel Dänzer
@ 2025-10-27 10:23       ` Thomas Zimmermann
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Zimmermann @ 2025-10-27 10:23 UTC (permalink / raw)
  To: Michel Dänzer, Jocelyn Falempe, Tvrtko Ursulin, Jani Nikula,
	Joonas Lahtinen, Rodrigo Vivi, David Airlie, Simona Vetter,
	Christian Brauner, Andi Shyti, intel-gfx, dri-devel

Hi

Am 24.10.25 um 18:49 schrieb Michel Dänzer:
> On 10/24/25 14:48, Jocelyn Falempe wrote:
>> On 24/10/2025 13:53, Tvrtko Ursulin wrote:
>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>> On a lenovo se100 server, when using i915 GPU for rendering, and the
>>>> ast driver for display, the graphic output is corrupted, and almost
>>>> unusable.
>>>>
>>>> Adding a clflush call in the vmap function fixes this issue
>>>> completely.
>>> AST is importing i915 allocated buffer in this use case, or how exactly is the relationship?
>> I think it's mutter/gnome-shell who copy the buffer from i915 to ast, here is the logs:
>>
>> gnome-shell[2079]: Failed to initialize accelerated iGPU/dGPU framebuffer sharing: Do not want to use software renderer (llvmpipe (LLVM 19.1.7, 256 bits)), falling back to CPU copy path
> FWIW, the code which logged "falling back to CPU copy path" was removed in mutter 42. Can you still reproduce the issue with a newer version of mutter, ideally 49? A lot has changed over the last 3.5 years.
>
>
>> gnome-shell[1533]: Created gbm renderer for '/dev/dri/card0'
>> gnome-shell[1533]: GPU /dev/dri/card1 selected as primary
>>
>> card0 is ast and card1 is i915
>>
>> Do you think there is something missing in mutter?
> Not that I can see offhand.
>
> After logging "Failed to initialize accelerated iGPU/dGPU framebuffer sharing", mutter tries to export a dma-buf from the i915 BO, import it into the AST device, create an KMS FB for the imported BO and assign the FB to the primary plane of a CRTC. If this path works (which seems to be the case in this scenario, or you wouldn't see BOs shared between i915 & AST), mutter doesn't do any copies between different BOs. It's between the AST & i915 drivers to correctly handle access to the shared BO.
>
> Does the AST driver wait for the i915 GPU to finish drawing to the shared BO, by waiting for the exclusive fence of the dma-buf synchronization object, before reading from the BO?

Yes, ast waits for all the plane fences here: 
https://elixir.bootlin.com/linux/v6.17.5/source/drivers/gpu/drm/drm_atomic_helper.c#L2762

Best regards
Thomas

>
>
> P.S. If the path described above didn't work, mutter would fall back to reading the i915 BO contents with glReadPixels and copying them to a dumb BO allocated from the AST device, so you wouldn't see BOs shared between the drivers. With current mutter, you can force this fallback with the environment variable MUTTER_DEBUG_MULTI_GPU_FORCE_COPY_MODE=primary-gpu-cpu . Does that avoid the issue?
>
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-27  9:46           ` Jocelyn Falempe
@ 2025-10-27 10:26             ` Thomas Zimmermann
  2025-10-31 13:12               ` Tvrtko Ursulin
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Zimmermann @ 2025-10-27 10:26 UTC (permalink / raw)
  To: Jocelyn Falempe, Tvrtko Ursulin, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

Hi

Am 27.10.25 um 10:46 schrieb Jocelyn Falempe:
> On 24/10/2025 17:55, Tvrtko Ursulin wrote:
>>
>> On 24/10/2025 16:18, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> Am 24.10.25 um 15:33 schrieb Jocelyn Falempe:
>>>> On 24/10/2025 14:40, Thomas Zimmermann wrote:
>>>>> Hi
>>>>>
>>>>> Am 24.10.25 um 13:53 schrieb Tvrtko Ursulin:
>>>>>>
>>>>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>>>>> On a lenovo se100 server, when using i915 GPU for rendering, and 
>>>>>>> the
>>>>>>> ast driver for display, the graphic output is corrupted, and almost
>>>>>>> unusable.
>>>>>>>
>>>>>>> Adding a clflush call in the vmap function fixes this issue
>>>>>>> completely.
>>>>>>
>>>>>> AST is importing i915 allocated buffer in this use case, or how 
>>>>>> exactly is the relationship?
>>>>>>
>>>>>> Wondering if some path is not calling 
>>>>>> dma_buf_begin/end_cpu_access().
>>>>>
>>>>> Yes, ast doesn't call begin/end_cpu_access in [1].
>>>>>
>>>>> Jocelyn, if that fixes the issue, feel free to send me a patch for 
>>>>> review.
>>>>>
>>>>> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ast/ast_mode.c
>>>>
>>>> I tried the following patch, but that doesn't fix the graphical issue:
>>>>
>>>> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ 
>>>> ast_mode.c
>>>> index b4e8edc7c767..e50f95a4c8a9 100644
>>>> --- a/drivers/gpu/drm/ast/ast_mode.c
>>>> +++ b/drivers/gpu/drm/ast/ast_mode.c
>>>> @@ -564,6 +564,7 @@ static void 
>>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>         struct drm_crtc_state *crtc_state = 
>>>> drm_atomic_get_new_crtc_state(state, crtc);
>>>>         struct drm_rect damage;
>>>>         struct drm_atomic_helper_damage_iter iter;
>>>> +       int ret;
>>>>
>>>>         if (!old_fb || (fb->format != old_fb->format) || 
>>>> crtc_state- >mode_changed) {
>>>>                 struct ast_crtc_state *ast_crtc_state = 
>>>> to_ast_crtc_state(crtc_state);
>>>> @@ -572,11 +573,16 @@ static void 
>>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>                 ast_set_vbios_color_reg(ast, fb->format, 
>>>> ast_crtc_state->vmode);
>>>>         }
>>>>
>>>> +       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>>>> +       pr_info("AST begin_cpu_access %d\n", ret);
>>>
>>> Presumably, you end up in [1]. I cannot find the cflush there or in 
>>> [2]. Maybe you need to add this call somewhere in there, similar to 
>>> [3]. Just guessing.
>>
>> Near [2] clflush can happen at [4] *if* the driver thinks it is 
>> needed. Most GPUs are cache coherent so mostly it isn't. But if this 
>> is a Meteorlake machine (when I google Lenovo se100 it makes me think 
>> so?) then the userspace has some responsibility to manage things 
>> since there it is only 1-way coherency. Or userspace could have even 
>> told the driver to stay off in which case it then needs to manage 
>> everything. From the top of my head I am not sure how exactly this 
>> used to work, or how it is supposed to interact with exported buffers.
>>
>> If this is indeed on Meteorlake, maybe Joonas or Rodrigo remember 
>> better how the special 1-way coherency is supposed to be managed there?
>
> I've made an experiment, and if I add:
>
> * a calls to drm_gem_fb_begin_cpu_access() in the ast driver.
> * and in i915_gem_domain.c flush_write_domain():
>         case I915_GEM_DOMAIN_RENDER:
> +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC | 
> I915_CLFLUSH_FORCE);
>
> Then that fixes the issue too.
>
> So I think there are two things to fix:
>  * The missing call to drm_gem_fb_begin_cpu_access() in ast.

Yes. We definitely want to add these calls, as they are expected for 
this case.

>  * The missing cache flush in i915 for the Arrowlake iGPU (but 
> probably not the way I've done it).

You call begin_cpu_access with DMA_FROM_DEVICE, but there's no support 
for that flag in i915 AFAICT. Maybe this needs to be added somehow?

Best regards
Thomas

>
> Regards,
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-27 10:26             ` Thomas Zimmermann
@ 2025-10-31 13:12               ` Tvrtko Ursulin
  2025-10-31 15:04                 ` Jocelyn Falempe
  2025-10-31 16:48                 ` Thomas Zimmermann
  0 siblings, 2 replies; 18+ messages in thread
From: Tvrtko Ursulin @ 2025-10-31 13:12 UTC (permalink / raw)
  To: Thomas Zimmermann, Jocelyn Falempe, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel


On 27/10/2025 10:26, Thomas Zimmermann wrote:
> Hi
> 
> Am 27.10.25 um 10:46 schrieb Jocelyn Falempe:
>> On 24/10/2025 17:55, Tvrtko Ursulin wrote:
>>>
>>> On 24/10/2025 16:18, Thomas Zimmermann wrote:
>>>> Hi
>>>>
>>>> Am 24.10.25 um 15:33 schrieb Jocelyn Falempe:
>>>>> On 24/10/2025 14:40, Thomas Zimmermann wrote:
>>>>>> Hi
>>>>>>
>>>>>> Am 24.10.25 um 13:53 schrieb Tvrtko Ursulin:
>>>>>>>
>>>>>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>>>>>> On a lenovo se100 server, when using i915 GPU for rendering, and 
>>>>>>>> the
>>>>>>>> ast driver for display, the graphic output is corrupted, and almost
>>>>>>>> unusable.
>>>>>>>>
>>>>>>>> Adding a clflush call in the vmap function fixes this issue
>>>>>>>> completely.
>>>>>>>
>>>>>>> AST is importing i915 allocated buffer in this use case, or how 
>>>>>>> exactly is the relationship?
>>>>>>>
>>>>>>> Wondering if some path is not calling dma_buf_begin/ 
>>>>>>> end_cpu_access().
>>>>>>
>>>>>> Yes, ast doesn't call begin/end_cpu_access in [1].
>>>>>>
>>>>>> Jocelyn, if that fixes the issue, feel free to send me a patch for 
>>>>>> review.
>>>>>>
>>>>>> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ast/ast_mode.c
>>>>>
>>>>> I tried the following patch, but that doesn't fix the graphical issue:
>>>>>
>>>>> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ 
>>>>> ast_mode.c
>>>>> index b4e8edc7c767..e50f95a4c8a9 100644
>>>>> --- a/drivers/gpu/drm/ast/ast_mode.c
>>>>> +++ b/drivers/gpu/drm/ast/ast_mode.c
>>>>> @@ -564,6 +564,7 @@ static void 
>>>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>>         struct drm_crtc_state *crtc_state = 
>>>>> drm_atomic_get_new_crtc_state(state, crtc);
>>>>>         struct drm_rect damage;
>>>>>         struct drm_atomic_helper_damage_iter iter;
>>>>> +       int ret;
>>>>>
>>>>>         if (!old_fb || (fb->format != old_fb->format) || 
>>>>> crtc_state- >mode_changed) {
>>>>>                 struct ast_crtc_state *ast_crtc_state = 
>>>>> to_ast_crtc_state(crtc_state);
>>>>> @@ -572,11 +573,16 @@ static void 
>>>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>>                 ast_set_vbios_color_reg(ast, fb->format, 
>>>>> ast_crtc_state->vmode);
>>>>>         }
>>>>>
>>>>> +       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>>>>> +       pr_info("AST begin_cpu_access %d\n", ret);
>>>>
>>>> Presumably, you end up in [1]. I cannot find the cflush there or in 
>>>> [2]. Maybe you need to add this call somewhere in there, similar to 
>>>> [3]. Just guessing.
>>>
>>> Near [2] clflush can happen at [4] *if* the driver thinks it is 
>>> needed. Most GPUs are cache coherent so mostly it isn't. But if this 
>>> is a Meteorlake machine (when I google Lenovo se100 it makes me think 
>>> so?) then the userspace has some responsibility to manage things 
>>> since there it is only 1-way coherency. Or userspace could have even 
>>> told the driver to stay off in which case it then needs to manage 
>>> everything. From the top of my head I am not sure how exactly this 
>>> used to work, or how it is supposed to interact with exported buffers.
>>>
>>> If this is indeed on Meteorlake, maybe Joonas or Rodrigo remember 
>>> better how the special 1-way coherency is supposed to be managed there?
>>
>> I've made an experiment, and if I add:
>>
>> * a calls to drm_gem_fb_begin_cpu_access() in the ast driver.
>> * and in i915_gem_domain.c flush_write_domain():
>>         case I915_GEM_DOMAIN_RENDER:
>> +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC | 
>> I915_CLFLUSH_FORCE);
>>
>> Then that fixes the issue too.
>>
>> So I think there are two things to fix:
>>  * The missing call to drm_gem_fb_begin_cpu_access() in ast.
> 
> Yes. We definitely want to add these calls, as they are expected for 
> this case.

Browsing around a bit, I notice ast_primary_plane_helper_atomic_update() 
calls to_drm_shadow_plane_state() to get the source of the memcpy. 
Should there be calls to drm_gem_begin_shadow_fb_access() and 
drm_gem_end_shadow_fb_access() somewhere? Or should those be set as 
vfuncs by someone? Sorry, I get lost easily in the DRM maze of 
helpers<->vfuncs<->helpers<->vfuncs..

>>  * The missing cache flush in i915 for the Arrowlake iGPU (but 
>> probably not the way I've done it).
> 
> You call begin_cpu_access with DMA_FROM_DEVICE, but there's no support 
> for that flag in i915 AFAICT. Maybe this needs to be added somehow?

AFAIR the premise is that GPU writes will not get stuck in the last level 
cache, but I might be remembering the reverse of what Meteorlake 1-way 
coherency means. This area of the driver ended up a mess and was never 
properly cleaned up. I even had a series to try and do it but it never 
happened. We will need someone who actually remembers how Meteorlake works.

Regards,

Tvrtko


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-31 13:12               ` Tvrtko Ursulin
@ 2025-10-31 15:04                 ` Jocelyn Falempe
  2025-10-31 16:50                   ` Thomas Zimmermann
  2025-10-31 16:48                 ` Thomas Zimmermann
  1 sibling, 1 reply; 18+ messages in thread
From: Jocelyn Falempe @ 2025-10-31 15:04 UTC (permalink / raw)
  To: Tvrtko Ursulin, Thomas Zimmermann, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

On 31/10/2025 14:12, Tvrtko Ursulin wrote:
> 
> On 27/10/2025 10:26, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 27.10.25 um 10:46 schrieb Jocelyn Falempe:
>>> On 24/10/2025 17:55, Tvrtko Ursulin wrote:
>>>>
>>>> On 24/10/2025 16:18, Thomas Zimmermann wrote:
>>>>> Hi
>>>>>
>>>>> Am 24.10.25 um 15:33 schrieb Jocelyn Falempe:
>>>>>> On 24/10/2025 14:40, Thomas Zimmermann wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> Am 24.10.25 um 13:53 schrieb Tvrtko Ursulin:
>>>>>>>>
>>>>>>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>>>>>>> On a lenovo se100 server, when using i915 GPU for rendering, 
>>>>>>>>> and the
>>>>>>>>> ast driver for display, the graphic output is corrupted, and 
>>>>>>>>> almost
>>>>>>>>> unusable.
>>>>>>>>>
>>>>>>>>> Adding a clflush call in the vmap function fixes this issue
>>>>>>>>> completely.
>>>>>>>>
>>>>>>>> AST is importing i915 allocated buffer in this use case, or how 
>>>>>>>> exactly is the relationship?
>>>>>>>>
>>>>>>>> Wondering if some path is not calling dma_buf_begin/ 
>>>>>>>> end_cpu_access().
>>>>>>>
>>>>>>> Yes, ast doesn't call begin/end_cpu_access in [1].
>>>>>>>
>>>>>>> Jocelyn, if that fixes the issue, feel free to send me a patch 
>>>>>>> for review.
>>>>>>>
>>>>>>> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ast/ast_mode.c
>>>>>>
>>>>>> I tried the following patch, but that doesn't fix the graphical 
>>>>>> issue:
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ 
>>>>>> ast_mode.c
>>>>>> index b4e8edc7c767..e50f95a4c8a9 100644
>>>>>> --- a/drivers/gpu/drm/ast/ast_mode.c
>>>>>> +++ b/drivers/gpu/drm/ast/ast_mode.c
>>>>>> @@ -564,6 +564,7 @@ static void 
>>>>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>>>         struct drm_crtc_state *crtc_state = 
>>>>>> drm_atomic_get_new_crtc_state(state, crtc);
>>>>>>         struct drm_rect damage;
>>>>>>         struct drm_atomic_helper_damage_iter iter;
>>>>>> +       int ret;
>>>>>>
>>>>>>         if (!old_fb || (fb->format != old_fb->format) || 
>>>>>> crtc_state- >mode_changed) {
>>>>>>                 struct ast_crtc_state *ast_crtc_state = 
>>>>>> to_ast_crtc_state(crtc_state);
>>>>>> @@ -572,11 +573,16 @@ static void 
>>>>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>>>                 ast_set_vbios_color_reg(ast, fb->format, 
>>>>>> ast_crtc_state->vmode);
>>>>>>         }
>>>>>>
>>>>>> +       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>>>>>> +       pr_info("AST begin_cpu_access %d\n", ret);
>>>>>
>>>>> Presumably, you end up in [1]. I cannot find the cflush there or in 
>>>>> [2]. Maybe you need to add this call somewhere in there, similar to 
>>>>> [3]. Just guessing.
>>>>
>>>> Near [2] clflush can happen at [4] *if* the driver thinks it is 
>>>> needed. Most GPUs are cache coherent so mostly it isn't. But if this 
>>>> is a Meteorlake machine (when I google Lenovo se100 it makes me 
>>>> think so?) then the userspace has some responsibility to manage 
>>>> things since there it is only 1-way coherency. Or userspace could 
>>>> have even told the driver to stay off in which case it then needs to 
>>>> manage everything. From the top of my head I am not sure how exactly 
>>>> this used to work, or how it is supposed to interact with exported 
>>>> buffers.
>>>>
>>>> If this is indeed on Meteorlake, maybe Joonas or Rodrigo remember 
>>>> better how the special 1-way coherency is supposed to be managed there?
>>>
>>> I've made an experiment, and if I add:
>>>
>>> * a calls to drm_gem_fb_begin_cpu_access() in the ast driver.
>>> * and in i915_gem_domain.c flush_write_domain():
>>>         case I915_GEM_DOMAIN_RENDER:
>>> +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC | 
>>> I915_CLFLUSH_FORCE);
>>>
>>> Then that fixes the issue too.
>>>
>>> So I think there are two things to fix:
>>>  * The missing call to drm_gem_fb_begin_cpu_access() in ast.
>>
>> Yes. We definitely want to add these calls, as they are expected for 
>> this case.
> 
> Browsing around a bit, I notice ast_primary_plane_helper_atomic_update() 
> calls to_drm_shadow_plane_state() to get the source of the memcpy. 
> Should there somewhere be calls to drm_gem_begin_shadow_fb_access() and 
> drm_gem_end_shadow_fb_access()? Or those should be set as  vfuncs by 
> someone? Sorry I get lost easily in the DRM maze of helpers<->vfuncs<- 
>  >helpers<->vfuncs..

I've already submitted [1] for the ast driver.

> 
>>>  * The missing cache flush in i915 for the Arrowlake iGPU (but 
>>> probably not the way I've done it).
>>
>> You call begin_cpu_access with DMA_FROM_DEVICE, but there's no support 
>> for that flag in i915 AFAICT. Maybe this needs to be added somehow?
> 
> AFAIR the premise is GPU writes will not get stuck in the last level 
> cache but I might be remembering a reverse of what Meteorlake 1-way 
> coherency means. This area of the driver ended up a mess and was never 
> properly cleaned up. I even had a series to try and do it but it never 
> happened. We will need someone who actually remembers how Meteorlake works.

Logically, clflush() shouldn't have any effect, because it flushes the 
CPU cache, and ast then reads from the CPU.
But judging from the tests, it also flushes the iGPU cache, and the iGPU 
cache is not coherent with the CPU cache (at least for iGPU writes), or 
this problem wouldn't exist.

I have the setup ready, and this is easily reproducible, so you can send 
me patches, and I can quickly check whether they work.

[1] 
https://lore.kernel.org/dri-devel/908ff7f7-cdf3-4596-9246-0100e4c15820@suse.de/T/#t

Best regards,

-- 

Jocelyn


> 
> Regards,
> 
> Tvrtko
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-31 13:12               ` Tvrtko Ursulin
  2025-10-31 15:04                 ` Jocelyn Falempe
@ 2025-10-31 16:48                 ` Thomas Zimmermann
  1 sibling, 0 replies; 18+ messages in thread
From: Thomas Zimmermann @ 2025-10-31 16:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, Jocelyn Falempe, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

Hi

Am 31.10.25 um 14:12 schrieb Tvrtko Ursulin:
>
> On 27/10/2025 10:26, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 27.10.25 um 10:46 schrieb Jocelyn Falempe:
>>> On 24/10/2025 17:55, Tvrtko Ursulin wrote:
>>>>
>>>> On 24/10/2025 16:18, Thomas Zimmermann wrote:
>>>>> Hi
>>>>>
>>>>> Am 24.10.25 um 15:33 schrieb Jocelyn Falempe:
>>>>>> On 24/10/2025 14:40, Thomas Zimmermann wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> Am 24.10.25 um 13:53 schrieb Tvrtko Ursulin:
>>>>>>>>
>>>>>>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>>>>>>> On a lenovo se100 server, when using i915 GPU for rendering, 
>>>>>>>>> and the
>>>>>>>>> ast driver for display, the graphic output is corrupted, and 
>>>>>>>>> almost
>>>>>>>>> unusable.
>>>>>>>>>
>>>>>>>>> Adding a clflush call in the vmap function fixes this issue
>>>>>>>>> completely.
>>>>>>>>
>>>>>>>> AST is importing i915 allocated buffer in this use case, or how 
>>>>>>>> exactly is the relationship?
>>>>>>>>
>>>>>>>> Wondering if some path is not calling dma_buf_begin/ 
>>>>>>>> end_cpu_access().
>>>>>>>
>>>>>>> Yes, ast doesn't call begin/end_cpu_access in [1].
>>>>>>>
>>>>>>> Jocelyn, if that fixes the issue, feel free to send me a patch 
>>>>>>> for review.
>>>>>>>
>>>>>>> [1] https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ast/ast_mode.c
>>>>>>
>>>>>> I tried the following patch, but that doesn't fix the graphical 
>>>>>> issue:
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/ast/ast_mode.c 
>>>>>> b/drivers/gpu/drm/ast/ ast_mode.c
>>>>>> index b4e8edc7c767..e50f95a4c8a9 100644
>>>>>> --- a/drivers/gpu/drm/ast/ast_mode.c
>>>>>> +++ b/drivers/gpu/drm/ast/ast_mode.c
>>>>>> @@ -564,6 +564,7 @@ static void 
>>>>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>>>         struct drm_crtc_state *crtc_state = 
>>>>>> drm_atomic_get_new_crtc_state(state, crtc);
>>>>>>         struct drm_rect damage;
>>>>>>         struct drm_atomic_helper_damage_iter iter;
>>>>>> +       int ret;
>>>>>>
>>>>>>         if (!old_fb || (fb->format != old_fb->format) || 
>>>>>> crtc_state- >mode_changed) {
>>>>>>                 struct ast_crtc_state *ast_crtc_state = 
>>>>>> to_ast_crtc_state(crtc_state);
>>>>>> @@ -572,11 +573,16 @@ static void 
>>>>>> ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>>>                 ast_set_vbios_color_reg(ast, fb->format, 
>>>>>> ast_crtc_state->vmode);
>>>>>>         }
>>>>>>
>>>>>> +       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>>>>>> +       pr_info("AST begin_cpu_access %d\n", ret);
>>>>>
>>>>> Presumably, you end up in [1]. I cannot find the cflush there or 
>>>>> in [2]. Maybe you need to add this call somewhere in there, 
>>>>> similar to [3]. Just guessing.
>>>>
>>>> Near [2] clflush can happen at [4] *if* the driver thinks it is 
>>>> needed. Most GPUs are cache coherent so mostly it isn't. But if 
>>>> this is a Meteorlake machine (when I google Lenovo se100 it makes 
>>>> me think so?) then the userspace has some responsibility to manage 
>>>> things since there it is only 1-way coherency. Or userspace could 
>>>> have even told the driver to stay off in which case it then needs 
>>>> to manage everything. From the top of my head I am not sure how 
>>>> exactly this used to work, or how it is supposed to interact with 
>>>> exported buffers.
>>>>
>>>> If this is indeed on Meteorlake, maybe Joonas or Rodrigo remember 
>>>> better how the special 1-way coherency is supposed to be managed 
>>>> there?
>>>
>>> I've made an experiment, and if I add:
>>>
>>> * a calls to drm_gem_fb_begin_cpu_access() in the ast driver.
>>> * and in i915_gem_domain.c flush_write_domain():
>>>         case I915_GEM_DOMAIN_RENDER:
>>> +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC | 
>>> I915_CLFLUSH_FORCE);
>>>
>>> Then that fixes the issue too.
>>>
>>> So I think there are two things to fix:
>>>  * The missing call to drm_gem_fb_begin_cpu_access() in ast.
>>
>> Yes. We definitely want to add these calls, as they are expected for 
>> this case.
>
> Browsing around a bit, I notice 
> ast_primary_plane_helper_atomic_update() calls 
> to_drm_shadow_plane_state() to get the source of the memcpy. Should 
> there somewhere be calls to drm_gem_begin_shadow_fb_access() and 
> drm_gem_end_shadow_fb_access()? Or those should be set as  vfuncs by 
> someone? Sorry I get lost easily in the DRM maze of 
> helpers<->vfuncs<->helpers<->vfuncs..

They are used by DRM_GEM_SHADOW_PLANE_HELPER_FUNCS [1], which ast uses at 
[2]. The atomic helpers then call them around the plane update.

[1] 
https://elixir.bootlin.com/linux/v6.17.6/source/include/drm/drm_gem_atomic_helper.h#L125
[2] 
https://elixir.bootlin.com/linux/v6.17.6/source/drivers/gpu/drm/ast/ast_mode.c#L631
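
As a rough illustration of that call order, here is a toy userspace model
in C. It is a sketch only: the toy_* types, the
TOY_SHADOW_PLANE_HELPER_FUNCS macro and the single toy_commit_plane()
function are hypothetical stand-ins, and the real atomic helpers are more
involved than one function calling three hooks in a row.

```c
/* Toy userspace model of the call order, not kernel code; names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_LOG_SIZE 64

struct toy_plane;

/* Per-plane callbacks, loosely modelled on a plane helper vfunc table. */
struct toy_plane_helper_funcs {
	int (*begin_fb_access)(struct toy_plane *plane);
	void (*atomic_update)(struct toy_plane *plane);
	void (*end_fb_access)(struct toy_plane *plane);
};

struct toy_plane {
	const struct toy_plane_helper_funcs *funcs;
	char log[TOY_LOG_SIZE]; /* records the order in which callbacks ran */
};

/* Append one step name to the plane's log, truncating if it is full. */
static void toy_log(struct toy_plane *plane, const char *step)
{
	strncat(plane->log, step, TOY_LOG_SIZE - strlen(plane->log) - 1);
}

static int toy_begin_fb_access(struct toy_plane *plane)
{
	toy_log(plane, "begin;");
	return 0;
}

static void toy_end_fb_access(struct toy_plane *plane)
{
	toy_log(plane, "end;");
}

/* The only callback the "driver" writes itself. */
static void toy_driver_atomic_update(struct toy_plane *plane)
{
	toy_log(plane, "update;");
}

/* A macro that fills in the access hooks for the driver. */
#define TOY_SHADOW_PLANE_HELPER_FUNCS \
	.begin_fb_access = toy_begin_fb_access, \
	.end_fb_access = toy_end_fb_access

static const struct toy_plane_helper_funcs toy_driver_funcs = {
	TOY_SHADOW_PLANE_HELPER_FUNCS,
	.atomic_update = toy_driver_atomic_update,
};

/* The "atomic helper": brackets the driver's update with the access hooks. */
static int toy_commit_plane(struct toy_plane *plane)
{
	int ret;

	assert(plane != NULL && plane->funcs != NULL);
	ret = plane->funcs->begin_fb_access(plane);
	if (ret)
		return ret;
	plane->funcs->atomic_update(plane);
	plane->funcs->end_fb_access(plane);
	return 0;
}
```

The point of the model is that the driver's update callback never calls
the access hooks itself; it gets them through the macro, and the helper
invokes them around the update.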

Best regards
Thomas

>
>>>  * The missing cache flush in i915 for the Arrowlake iGPU (but 
>>> probably not the way I've done it).
>>
>> You call begin_cpu_access with DMA_FROM_DEVICE, but there's no 
>> support for that flag in i915 AFAICT. Maybe this needs to be added 
>> somehow?
>
> AFAIR the premise is GPU writes will not get stuck in the last level 
> cache but I might be remembering a reverse of what Meteorlake 1-way 
> coherency means. This area of the driver ended up a mess and was never 
> properly cleaned up. I even had a series to try and do it but it never 
> happened. We will need someone who actually remembers how Meteorlake 
> works.
>
> Regards,
>
> Tvrtko
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/i915/dmabuf: Flush the cache in vmap
  2025-10-31 15:04                 ` Jocelyn Falempe
@ 2025-10-31 16:50                   ` Thomas Zimmermann
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Zimmermann @ 2025-10-31 16:50 UTC (permalink / raw)
  To: Jocelyn Falempe, Tvrtko Ursulin, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, David Airlie, Simona Vetter, Christian Brauner,
	Andi Shyti, intel-gfx, dri-devel

Hi

Am 31.10.25 um 16:04 schrieb Jocelyn Falempe:
> On 31/10/2025 14:12, Tvrtko Ursulin wrote:
>>
>> On 27/10/2025 10:26, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> Am 27.10.25 um 10:46 schrieb Jocelyn Falempe:
>>>> On 24/10/2025 17:55, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 24/10/2025 16:18, Thomas Zimmermann wrote:
>>>>>> Hi
>>>>>>
>>>>>> Am 24.10.25 um 15:33 schrieb Jocelyn Falempe:
>>>>>>> On 24/10/2025 14:40, Thomas Zimmermann wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Am 24.10.25 um 13:53 schrieb Tvrtko Ursulin:
>>>>>>>>>
>>>>>>>>> On 24/10/2025 12:04, Jocelyn Falempe wrote:
>>>>>>>>>> On a lenovo se100 server, when using i915 GPU for rendering, 
>>>>>>>>>> and the
>>>>>>>>>> ast driver for display, the graphic output is corrupted, and 
>>>>>>>>>> almost
>>>>>>>>>> unusable.
>>>>>>>>>>
>>>>>>>>>> Adding a clflush call in the vmap function fixes this issue
>>>>>>>>>> completely.
>>>>>>>>>
>>>>>>>>> AST is importing i915 allocated buffer in this use case, or 
>>>>>>>>> how exactly is the relationship?
>>>>>>>>>
>>>>>>>>> Wondering if some path is not calling 
>>>>>>>>> dma_buf_begin/end_cpu_access().
>>>>>>>>
>>>>>>>> Yes, ast doesn't call begin/end_cpu_access in [1].
>>>>>>>>
>>>>>>>> Jocelyn, if that fixes the issue, feel free to send me a patch 
>>>>>>>> for review.
>>>>>>>>
>>>>>>>> [1] 
>>>>>>>> https://elixir.bootlin.com/linux/v6.17.4/source/drivers/gpu/drm/ast/ast_mode.c
>>>>>>>
>>>>>>> I tried the following patch, but that doesn't fix the graphical 
>>>>>>> issue:
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
>>>>>>> index b4e8edc7c767..e50f95a4c8a9 100644
>>>>>>> --- a/drivers/gpu/drm/ast/ast_mode.c
>>>>>>> +++ b/drivers/gpu/drm/ast/ast_mode.c
>>>>>>> @@ -564,6 +564,7 @@ static void ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>>>>         struct drm_crtc_state *crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
>>>>>>>         struct drm_rect damage;
>>>>>>>         struct drm_atomic_helper_damage_iter iter;
>>>>>>> +       int ret;
>>>>>>>
>>>>>>>         if (!old_fb || (fb->format != old_fb->format) || crtc_state->mode_changed) {
>>>>>>>                 struct ast_crtc_state *ast_crtc_state = to_ast_crtc_state(crtc_state);
>>>>>>> @@ -572,11 +573,16 @@ static void ast_primary_plane_helper_atomic_update(struct drm_plane *plane,
>>>>>>>                 ast_set_vbios_color_reg(ast, fb->format, ast_crtc_state->vmode);
>>>>>>>         }
>>>>>>>
>>>>>>> +       ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>>>>>>> +       pr_info("AST begin_cpu_access %d\n", ret);
>>>>>>
>>>>>> Presumably, you end up in [1]. I cannot find the clflush there or 
>>>>>> in [2]. Maybe you need to add this call somewhere in there, 
>>>>>> similar to [3]. Just guessing.
>>>>>
>>>>> Near [2] clflush can happen at [4] *if* the driver thinks it is 
>>>>> needed. Most GPUs are cache coherent so mostly it isn't. But if 
>>>>> this is a Meteorlake machine (when I google Lenovo se100 it makes 
>>>>> me think so?) then the userspace has some responsibility to manage 
>>>>> things since there it is only 1-way coherency. Or userspace could 
>>>>> have even told the driver to stay off in which case it then needs 
>>>>> to manage everything. From the top of my head I am not sure how 
>>>>> exactly this used to work, or how it is supposed to interact with 
>>>>> exported buffers.
>>>>>
>>>>> If this is indeed on Meteorlake, maybe Joonas or Rodrigo remember 
>>>>> better how the special 1-way coherency is supposed to be managed 
>>>>> there?
>>>>
>>>> I've made an experiment, and if I add:
>>>>
>>>> * a call to drm_gem_fb_begin_cpu_access() in the ast driver.
>>>> * and in i915_gem_domain.c flush_write_domain():
>>>>         case I915_GEM_DOMAIN_RENDER:
>>>> +               i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC | 
>>>> I915_CLFLUSH_FORCE);
>>>>
>>>> Then that fixes the issue too.
>>>>
>>>> So I think there are two things to fix:
>>>>  * The missing call to drm_gem_fb_begin_cpu_access() in ast.
>>>
>>> Yes. We definitely want to add these calls, as they are expected for 
>>> this case.
>>
>> Browsing around a bit, I notice 
>> ast_primary_plane_helper_atomic_update() calls 
>> to_drm_shadow_plane_state() to get the source of the memcpy. Should 
>> there somewhere be calls to drm_gem_begin_shadow_fb_access() and 
>> drm_gem_end_shadow_fb_access()? Or those should be set as vfuncs by 
>> someone? Sorry I get lost easily in the DRM maze of 
>> helpers<->vfuncs<->helpers<->vfuncs..
>
> I've already submitted [1] for the ast driver.

Don't be confused by the similar naming. The shadow-plane functions do 
the vmap(). The cpu-access functions do the caching and DMA sync.
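
To make the distinction concrete, here is a toy userspace model, not kernel
code. It assumes a buffer whose device writes sit in a separate staging
area until a sync step writes them back. The names toy_buf, toy_vmap,
toy_device_write, toy_begin_cpu_access and toy_end_cpu_access are invented
for the sketch, and the staging area only loosely stands in for a
non-coherent cache. How the real hardware behaves is exactly what is still
open in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_BUF_SIZE = 16 };

/* Hypothetical buffer with a non-coherent "device side". */
struct toy_buf {
	char memory[TOY_BUF_SIZE];       /* what a CPU mapping reads */
	char device_cache[TOY_BUF_SIZE]; /* device writes not yet written back */
	int dirty;                       /* nonzero if device_cache is newer */
};

/* Device-side write: lands in the staging area only. */
static void toy_device_write(struct toy_buf *b, const char *s)
{
	strncpy(b->device_cache, s, TOY_BUF_SIZE - 1);
	b->device_cache[TOY_BUF_SIZE - 1] = '\0';
	b->dirty = 1;
}

/* Mapping step: hands out a CPU pointer and does no synchronization. */
static const char *toy_vmap(struct toy_buf *b)
{
	return b->memory;
}

/* Sync step: makes pending device writes visible to the CPU mapping. */
static void toy_begin_cpu_access(struct toy_buf *b)
{
	if (b->dirty) {
		memcpy(b->memory, b->device_cache, TOY_BUF_SIZE);
		b->dirty = 0;
	}
}

/* Nothing to do after a read-only CPU access in this model. */
static void toy_end_cpu_access(struct toy_buf *b)
{
	(void)b;
}
```

In this model, a reader that only calls toy_vmap() after a device write
still sees the old contents through the returned pointer. The new contents
show up only once toy_begin_cpu_access() has run, which mirrors why mapping
alone is not expected to give coherent data.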

Best regards
Thomas

>
>>
>>>>  * The missing cache flush in i915 for the Arrowlake iGPU (but 
>>>> probably not the way I've done it).
>>>
>>> You call begin_cpu_access with DMA_FROM_DEVICE, but there's no 
>>> support for that flag in i915 AFAICT. Maybe this needs to be added 
>>> somehow?
>>
>> AFAIR the premise is GPU writes will not get stuck in the last level 
>> cache but I might be remembering a reverse of what Meteorlake 1-way 
>> coherency means. This area of the driver ended up a mess and was 
>> never properly cleaned up. I even had a series to try and do it but 
>> it never happened. We will need someone who actually remembers how 
>> Meteorlake works.
>
> Logically clflush() shouldn't have any effect, because it flushes the 
> CPU cache, and then ast is reading from the CPU.
> But from the tests, it also flushes the iGPU cache, and the iGPU cache 
> is not coherent with the CPU cache (at least for iGPU writes), or this 
> problem won't exist.
>
> I've the setup ready, and this is easily reproducible, so you can send 
> me patches, and I can quickly check if that works or not.
>
> [1] 
> https://lore.kernel.org/dri-devel/908ff7f7-cdf3-4596-9246-0100e4c15820@suse.de/T/#t
>
> Best regards,
>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




Thread overview: 18+ messages
2025-10-24 11:04 [PATCH] drm/i915/dmabuf: Flush the cache in vmap Jocelyn Falempe
2025-10-24 11:53 ` Tvrtko Ursulin
2025-10-24 12:40   ` Thomas Zimmermann
2025-10-24 13:33     ` Jocelyn Falempe
2025-10-24 15:18       ` Thomas Zimmermann
2025-10-24 15:55         ` Tvrtko Ursulin
2025-10-24 16:28           ` Jocelyn Falempe
2025-10-27  9:46           ` Jocelyn Falempe
2025-10-27 10:26             ` Thomas Zimmermann
2025-10-31 13:12               ` Tvrtko Ursulin
2025-10-31 15:04                 ` Jocelyn Falempe
2025-10-31 16:50                   ` Thomas Zimmermann
2025-10-31 16:48                 ` Thomas Zimmermann
2025-10-24 12:48   ` Jocelyn Falempe
2025-10-24 15:18     ` Tvrtko Ursulin
2025-10-24 16:32       ` Jocelyn Falempe
2025-10-24 16:49     ` Michel Dänzer
2025-10-27 10:23       ` Thomas Zimmermann
