All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
@ 2018-03-30 12:27 Hans de Goede
  2018-03-30 12:30 ` [Intel-gfx] " Chris Wilson
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Hans de Goede @ 2018-03-30 12:27 UTC (permalink / raw)
  To: Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Ville Syrjälä
  Cc: Hans de Goede, intel-gfx, dri-devel

Before this commit the WaSkipStolenMemoryFirstPage workaround code was
skipping the first 4k by passing 4096 as start of the address range passed
to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
reserve the firmware framebuffer so that we can inherit it would always
fail, as the firmware framebuffer starts at address 0.

Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
everything >= gen8") says in its commit message: "This is confirmed to fix
Skylake screen flickering issues (probably caused by the fact that we
initialized a ring in the first page of stolen, but I didn't 100% confirm
this theory)."

Which suggests that it is safe to use the first page for a linear
framebuffer as the firmware is doing.

This commit always passes 0 as start to drm_mm_init() and works around
WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
by insuring the start address passed by to drm_mm_insert_node_in_range()
is always 4k or more. All entry points to i915_gem_stolen.c go through
i915_gem_stolen_insert_node_in_range(), so that any newly allocated
objects such as ring-buffers will not be allocated in the first 4k.

The one exception is i915_gem_object_create_stolen_for_preallocated()
which directly calls drm_mm_reserve_node() which now will be able to
use the first 4k.

This fixes the i915 driver no longer being able to inherit the firmware
framebuffer on gen8+, which fixes the video output changing from the
vendor logo to a black screen as soon as the i915 driver is loaded
(on systems without fbcon).

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index af915d041281..ad949cc30928 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
 	if (!drm_mm_initialized(&dev_priv->mm.stolen))
 		return -ENODEV;
 
+	/* WaSkipStolenMemoryFirstPage:bdw+ */
+	if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
+		start = 4096;
+
 	mutex_lock(&dev_priv->mm.stolen_lock);
 	ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
 					  size, alignment, 0,
@@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
 {
 	resource_size_t reserved_base, stolen_top;
 	resource_size_t reserved_total, reserved_size;
-	resource_size_t stolen_usable_start;
 
 	mutex_init(&dev_priv->mm.stolen_lock);
 
@@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
 			 (u64)resource_size(&dev_priv->dsm) >> 10,
 			 ((u64)resource_size(&dev_priv->dsm) - reserved_total) >> 10);
 
-	stolen_usable_start = 0;
-	/* WaSkipStolenMemoryFirstPage:bdw+ */
-	if (INTEL_GEN(dev_priv) >= 8)
-		stolen_usable_start = 4096;
-
 	dev_priv->stolen_usable_size =
-		resource_size(&dev_priv->dsm) - reserved_total - stolen_usable_start;
+		resource_size(&dev_priv->dsm) - reserved_total;
 
 	/* Basic memrange allocator for stolen space. */
-	drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
-		    dev_priv->stolen_usable_size);
+	drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->stolen_usable_size);
 
 	return 0;
 }
-- 
2.17.0.rc2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-03-30 12:27 [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers Hans de Goede
@ 2018-03-30 12:30 ` Chris Wilson
  2018-03-30 12:37   ` Hans de Goede
  2018-03-30 12:48 ` ✓ Fi.CI.BAT: success for " Patchwork
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2018-03-30 12:30 UTC (permalink / raw)
  To: Hans de Goede, Jani, Nikula, " <jani.nikula, Joonas,
	Lahtinen, " <joonas.lahtinen, Rodrigo, Vivi,
	" <rodrigo.vivi
  Cc: Hans de Goede, intel-gfx, dri-devel

Quoting Hans de Goede (2018-03-30 13:27:15)
> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> skipping the first 4k by passing 4096 as start of the address range passed
> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
> reserve the firmware framebuffer so that we can inherit it would always
> fail, as the firmware framebuffer starts at address 0.
> 
> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> everything >= gen8") says in its commit message: "This is confirmed to fix
> Skylake screen flickering issues (probably caused by the fact that we
> initialized a ring in the first page of stolen, but I didn't 100% confirm
> this theory)."
> 
> Which suggests that it is safe to use the first page for a linear
> framebuffer as the firmware is doing.
> 
> This commit always passes 0 as start to drm_mm_init() and works around
> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> by insuring the start address passed by to drm_mm_insert_node_in_range()
> is always 4k or more. All entry points to i915_gem_stolen.c go through
> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> objects such as ring-buffers will not be allocated in the first 4k.
> 
> The one exception is i915_gem_object_create_stolen_for_preallocated()
> which directly calls drm_mm_reserve_node() which now will be able to
> use the first 4k.
> 
> This fixes the i915 driver no longer being able to inherit the firmware
> framebuffer on gen8+, which fixes the video output changing from the
> vendor logo to a black screen as soon as the i915 driver is loaded
> (on systems without fbcon).

We've been told by the HW guys not to use the first page. (That's my
understanding from every time this gets questioned.)
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-03-30 12:30 ` [Intel-gfx] " Chris Wilson
@ 2018-03-30 12:37   ` Hans de Goede
  2018-03-30 12:44     ` Chris Wilson
  0 siblings, 1 reply; 25+ messages in thread
From: Hans de Goede @ 2018-03-30 12:37 UTC (permalink / raw)
  To: Chris Wilson, Jani Nikula, Rodrigo Vivi, Ville Syrjälä,
	Joonas Lahtinen
  Cc: Hans de Goede, intel-gfx, dri-devel

Hi,

On 30-03-18 14:30, Chris Wilson wrote:
> Quoting Hans de Goede (2018-03-30 13:27:15)
>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>> skipping the first 4k by passing 4096 as start of the address range passed
>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
>> reserve the firmware framebuffer so that we can inherit it would always
>> fail, as the firmware framebuffer starts at address 0.
>>
>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>> everything >= gen8") says in its commit message: "This is confirmed to fix
>> Skylake screen flickering issues (probably caused by the fact that we
>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>> this theory)."
>>
>> Which suggests that it is safe to use the first page for a linear
>> framebuffer as the firmware is doing.
>>
>> This commit always passes 0 as start to drm_mm_init() and works around
>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>> objects such as ring-buffers will not be allocated in the first 4k.
>>
>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>> which directly calls drm_mm_reserve_node() which now will be able to
>> use the first 4k.
>>
>> This fixes the i915 driver no longer being able to inherit the firmware
>> framebuffer on gen8+, which fixes the video output changing from the
>> vendor logo to a black screen as soon as the i915 driver is loaded
>> (on systems without fbcon).
> 
> We've been told by the HW guys not to use the first page. (That's my
> understanding from every time this gets questioned.)

Yet the GOP is happily using the first page. I think we may need to make
a difference here between the GPU not using the first page and the
display engine/pipeline not using the first page. Note that my patch
only influences the inheriting of the initial framebuffer as allocated
by the GOP. It does not influence any other allocations from the
reserved range, those will still all avoid the first page.

Without this patch fastboot / flickerfree support is essentially broken
on any gen8+ hardware given that one of the goals of atomic is to be
able to do flickerfree transitions I think that this warrants a closer
look then just simply saying never use the first page.

Regards,

Hans




_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-03-30 12:37   ` Hans de Goede
@ 2018-03-30 12:44     ` Chris Wilson
  2018-03-30 13:25       ` Hans de Goede
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2018-03-30 12:44 UTC (permalink / raw)
  To: Jani, Nikula, " <jani.nikula, Rodrigo, Vivi,
	" <rodrigo.vivi, Joonas, Lahtinen,
	" <joonas.lahtinen
  Cc: Hans de Goede, intel-gfx, dri-devel

Quoting Hans de Goede (2018-03-30 13:37:40)
> Hi,
> 
> On 30-03-18 14:30, Chris Wilson wrote:
> > Quoting Hans de Goede (2018-03-30 13:27:15)
> >> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> >> skipping the first 4k by passing 4096 as start of the address range passed
> >> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
> >> reserve the firmware framebuffer so that we can inherit it would always
> >> fail, as the firmware framebuffer starts at address 0.
> >>
> >> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> >> everything >= gen8") says in its commit message: "This is confirmed to fix
> >> Skylake screen flickering issues (probably caused by the fact that we
> >> initialized a ring in the first page of stolen, but I didn't 100% confirm
> >> this theory)."
> >>
> >> Which suggests that it is safe to use the first page for a linear
> >> framebuffer as the firmware is doing.
> >>
> >> This commit always passes 0 as start to drm_mm_init() and works around
> >> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> >> by insuring the start address passed by to drm_mm_insert_node_in_range()
> >> is always 4k or more. All entry points to i915_gem_stolen.c go through
> >> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> >> objects such as ring-buffers will not be allocated in the first 4k.
> >>
> >> The one exception is i915_gem_object_create_stolen_for_preallocated()
> >> which directly calls drm_mm_reserve_node() which now will be able to
> >> use the first 4k.
> >>
> >> This fixes the i915 driver no longer being able to inherit the firmware
> >> framebuffer on gen8+, which fixes the video output changing from the
> >> vendor logo to a black screen as soon as the i915 driver is loaded
> >> (on systems without fbcon).
> > 
> > We've been told by the HW guys not to use the first page. (That's my
> > understanding from every time this gets questioned.)
> 
> Yet the GOP is happily using the first page. I think we may need to make
> a difference here between the GPU not using the first page and the
> display engine/pipeline not using the first page. Note that my patch
> only influences the inheriting of the initial framebuffer as allocated
> by the GOP. It does not influence any other allocations from the
> reserved range, those will still all avoid the first page.
> 
> Without this patch fastboot / flickerfree support is essentially broken
> on any gen8+ hardware given that one of the goals of atomic is to be
> able to do flickerfree transitions I think that this warrants a closer
> look then just simply saying never use the first page.

The concern is what else (i.e. nothing that we allocated ourselves) that
may be in the first page...
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-03-30 12:27 [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers Hans de Goede
  2018-03-30 12:30 ` [Intel-gfx] " Chris Wilson
@ 2018-03-30 12:48 ` Patchwork
  2018-03-30 13:31 ` ✓ Fi.CI.IGT: " Patchwork
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2018-03-30 12:48 UTC (permalink / raw)
  To: Hans de Goede; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
URL   : https://patchwork.freedesktop.org/series/40929/
State : success

== Summary ==

Series 40929v1 drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
https://patchwork.freedesktop.org/api/1.0/series/40929/revisions/1/mbox/

fi-bdw-5557u     total:285  pass:264  dwarn:0   dfail:0   fail:0   skip:21  time:429s
fi-bdw-gvtdvm    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:443s
fi-blb-e6850     total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:380s
fi-bsw-n3050     total:285  pass:239  dwarn:0   dfail:0   fail:0   skip:46  time:533s
fi-bwr-2160      total:285  pass:180  dwarn:0   dfail:0   fail:0   skip:105 time:295s
fi-bxt-dsi       total:285  pass:255  dwarn:0   dfail:0   fail:0   skip:30  time:518s
fi-bxt-j4205     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:511s
fi-byt-j1900     total:285  pass:250  dwarn:0   dfail:0   fail:0   skip:35  time:518s
fi-byt-n2820     total:285  pass:246  dwarn:0   dfail:0   fail:0   skip:39  time:508s
fi-cfl-8700k     total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:408s
fi-cfl-s3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:558s
fi-cfl-u         total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:518s
fi-cnl-y3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:586s
fi-elk-e7500     total:285  pass:225  dwarn:1   dfail:0   fail:0   skip:59  time:421s
fi-gdg-551       total:285  pass:176  dwarn:0   dfail:0   fail:1   skip:108 time:315s
fi-glk-1         total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:539s
fi-hsw-4770      total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:404s
fi-ilk-650       total:285  pass:225  dwarn:0   dfail:0   fail:0   skip:60  time:423s
fi-ivb-3520m     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:470s
fi-ivb-3770      total:285  pass:252  dwarn:0   dfail:0   fail:0   skip:33  time:429s
fi-kbl-7500u     total:285  pass:260  dwarn:1   dfail:0   fail:0   skip:24  time:470s
fi-kbl-7567u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:460s
fi-kbl-r         total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:509s
fi-pnv-d510      total:285  pass:219  dwarn:1   dfail:0   fail:0   skip:65  time:658s
fi-skl-6260u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:442s
fi-skl-6600u     total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:533s
fi-skl-6700k2    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:500s
fi-skl-6770hq    total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:509s
fi-skl-guc       total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:430s
fi-skl-gvtdvm    total:285  pass:262  dwarn:0   dfail:0   fail:0   skip:23  time:446s
fi-snb-2520m     total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:581s
fi-snb-2600      total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:402s
Blacklisted hosts:
fi-cnl-psr       total:285  pass:254  dwarn:3   dfail:0   fail:2   skip:26  time:515s
fi-glk-j4005     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:484s

335ef9849310af26d65c54f7d2d2e9dcbce238b9 drm-tip: 2018y-03m-30d-09h-08m-40s UTC integration manifest
bfdc21d08f2b drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8547/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-03-30 12:44     ` Chris Wilson
@ 2018-03-30 13:25       ` Hans de Goede
  2018-03-30 14:26         ` [Intel-gfx] " Hans de Goede
  0 siblings, 1 reply; 25+ messages in thread
From: Hans de Goede @ 2018-03-30 13:25 UTC (permalink / raw)
  To: Chris Wilson, Jani Nikula, Rodrigo Vivi, Ville Syrjälä,
	Joonas Lahtinen
  Cc: intel-gfx, dri-devel

Hi,

On 30-03-18 14:44, Chris Wilson wrote:
> Quoting Hans de Goede (2018-03-30 13:37:40)
>> Hi,
>>
>> On 30-03-18 14:30, Chris Wilson wrote:
>>> Quoting Hans de Goede (2018-03-30 13:27:15)
>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>> skipping the first 4k by passing 4096 as start of the address range passed
>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>> fail, as the firmware framebuffer starts at address 0.
>>>>
>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>> everything >= gen8") says in its commit message: "This is confirmed to fix
>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>> this theory)."
>>>>
>>>> Which suggests that it is safe to use the first page for a linear
>>>> framebuffer as the firmware is doing.
>>>>
>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>
>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>> use the first 4k.
>>>>
>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>> (on systems without fbcon).
>>>
>>> We've been told by the HW guys not to use the first page. (That's my
>>> understanding from every time this gets questioned.)
>>
>> Yet the GOP is happily using the first page. I think we may need to make
>> a difference here between the GPU not using the first page and the
>> display engine/pipeline not using the first page. Note that my patch
>> only influences the inheriting of the initial framebuffer as allocated
>> by the GOP. It does not influence any other allocations from the
>> reserved range, those will still all avoid the first page.
>>
>> Without this patch fastboot / flickerfree support is essentially broken
>> on any gen8+ hardware given that one of the goals of atomic is to be
>> able to do flickerfree transitions I think that this warrants a closer
>> look then just simply saying never use the first page.
> 
> The concern is what else (i.e. nothing that we allocated ourselves) that
> may be in the first page...

Given that the GOP has put its framebuffer there at least at boot there
is nothing there, otherwise it would show up on the display.

We have a whole bunch of code to inherit the BIOS fb in intel_display.c
and AFAIK that code is there because this inheriting the BIOS fb is
deemed an important feature. So I'm not happy at all with the handwavy
best to not touch it answer I'm getting to this patch.

Unless there are some clear answer from the hardware folks which specifically
say we cannot put a framebuffer there (and then why is the GOP doing it?)
then I believe that the best way forward here is to get various people to
test with this patch and the best way to do that is probably to put it
in next. Note I deliberately did not add a Cc stable.

Regards,

Hans



_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* ✓ Fi.CI.IGT: success for drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-03-30 12:27 [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers Hans de Goede
  2018-03-30 12:30 ` [Intel-gfx] " Chris Wilson
  2018-03-30 12:48 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2018-03-30 13:31 ` Patchwork
  2018-04-05  7:14 ` [PATCH] " Daniel Vetter
  2018-04-19 13:54 ` ✗ Fi.CI.BAT: failure for drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers (rev2) Patchwork
  4 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2018-03-30 13:31 UTC (permalink / raw)
  To: Hans de Goede; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
URL   : https://patchwork.freedesktop.org/series/40929/
State : success

== Summary ==

---- Known issues:

Test kms_flip:
        Subgroup 2x-dpms-vs-vblank-race:
                fail       -> PASS       (shard-hsw) fdo#103060
        Subgroup 2x-flip-vs-expired-vblank-interruptible:
                fail       -> PASS       (shard-hsw) fdo#102887
        Subgroup flip-vs-blocking-wf-vblank:
                fail       -> PASS       (shard-hsw) fdo#100368 +1
        Subgroup flip-vs-panning-vs-hang-interruptible:
                dmesg-warn -> PASS       (shard-snb) fdo#103821
Test kms_sysfs_edid_timing:
                pass       -> WARN       (shard-apl) fdo#100047

fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#103821 https://bugs.freedesktop.org/show_bug.cgi?id=103821
fdo#100047 https://bugs.freedesktop.org/show_bug.cgi?id=100047

shard-apl        total:3495 pass:1831 dwarn:1   dfail:0   fail:7   skip:1655 time:12913s
shard-hsw        total:3495 pass:1782 dwarn:1   dfail:0   fail:2   skip:1709 time:11419s
shard-snb        total:3495 pass:1374 dwarn:1   dfail:0   fail:3   skip:2117 time:6970s
Blacklisted hosts:
shard-kbl        total:3495 pass:1946 dwarn:12  dfail:0   fail:9   skip:1528 time:9154s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8547/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-03-30 13:25       ` Hans de Goede
@ 2018-03-30 14:26         ` Hans de Goede
  2018-04-04 15:50           ` Ville Syrjälä
  0 siblings, 1 reply; 25+ messages in thread
From: Hans de Goede @ 2018-03-30 14:26 UTC (permalink / raw)
  To: Chris Wilson, Jani Nikula, Rodrigo Vivi, Ville Syrjälä,
	Joonas Lahtinen
  Cc: intel-gfx, dri-devel

Hi,

On 30-03-18 15:25, Hans de Goede wrote:
> Hi,
> 
> On 30-03-18 14:44, Chris Wilson wrote:
>> Quoting Hans de Goede (2018-03-30 13:37:40)
>>> Hi,
>>>
>>> On 30-03-18 14:30, Chris Wilson wrote:
>>>> Quoting Hans de Goede (2018-03-30 13:27:15)
>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>>> skipping the first 4k by passing 4096 as start of the address range passed
>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
>>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>>> fail, as the firmware framebuffer starts at address 0.
>>>>>
>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix
>>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>>> this theory)."
>>>>>
>>>>> Which suggests that it is safe to use the first page for a linear
>>>>> framebuffer as the firmware is doing.
>>>>>
>>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>>
>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>>> use the first 4k.
>>>>>
>>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>>> (on systems without fbcon).
>>>>
>>>> We've been told by the HW guys not to use the first page. (That's my
>>>> understanding from every time this gets questioned.)
>>>
>>> Yet the GOP is happily using the first page. I think we may need to make
>>> a difference here between the GPU not using the first page and the
>>> display engine/pipeline not using the first page. Note that my patch
>>> only influences the inheriting of the initial framebuffer as allocated
>>> by the GOP. It does not influence any other allocations from the
>>> reserved range, those will still all avoid the first page.
>>>
>>> Without this patch fastboot / flickerfree support is essentially broken
>>> on any gen8+ hardware given that one of the goals of atomic is to be
>>> able to do flickerfree transitions I think that this warrants a closer
>>> look then just simply saying never use the first page.
>>
>> The concern is what else (i.e. nothing that we allocated ourselves) that
>> may be in the first page...
> 
> Given that the GOP has put its framebuffer there at least at boot there
> is nothing there, otherwise it would show up on the display.
> 
> We have a whole bunch of code to inherit the BIOS fb in intel_display.c
> and AFAIK that code is there because this inheriting the BIOS fb is
> deemed an important feature. So I'm not happy at all with the handwavy
> best to not touch it answer I'm getting to this patch.
> 
> Unless there are some clear answer from the hardware folks which specifically
> say we cannot put a framebuffer there (and then why is the GOP doing it?)
> then I believe that the best way forward here is to get various people to
> test with this patch and the best way to do that is probably to put it
> in next. Note I deliberately did not add a Cc stable.

To elaborate on this, the excluding of the first 4k of the stolen memory
region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
which in turn causes intel_find_initial_plane_obj() to call
intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
completely turns off the display which leads to a very ugly flickering
of the display at boot (as well as replacing the vendor logo with a
black screen).

I think we can all agree that this behavior is undesirable and even a
regression in behavior caused by the fix to exclude the first 4k.

Chris, if my patch is not an acceptable way to fix this, then how do you
suggest that we fix this?

Digging a bit deeper I found this:

https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf

Which says:

"WaSkipStolenMemoryFirstPage:

WA to skip the first page of stolen
memory due to sporadic HW write on *CS Idle"

And also about FBC:

"First line of FBC getting corrupted when FBC
compressed frame buffer offset is programmed to
zero. Command streamers are doing flush writes to
base of stolen.
WA: New restriction to program FBC compressed
frame buffer offset to at least 4KB."

So using the first 4kB for the *framebuffer* as done by the GOP will
not cause any major problems (freezes, hangs, etc.), and commit
d43537610470 ("drm/i915: skip the first 4k of stolen memory on
everything >= gen8") was correct in deducing that the problem was
likely that some *vital* information was being stored i the first 4k
and that go overwritten.

But the contents of the (first lines of) the framebuffer may become
corrupted once we actually start using the command-streamers, which
is still very much not wanted.

In practice Xorg or Wayland will likely have setup another framebuffer
by the time the command-streamers will start to get used.

Alternatively we could start with inheriting the BIOS framebuffer
(as my patch allows) so that we don't get the flicker and then soon
afterwards atomically transit to a new framebuffer (which should
contain a copy of the BIOS fb contents) at a different location.

I think it might be worthwhile to first just try my patch and then if
we see or receive reports of problems with the fb becoming corrupt
because in some cases it ends up being used for longer then expected,
we can do the atomic transition to a new fb thing.

Regards,

Hans


p.s.

Fi.CI.BAT and Fi.CI.IGT have run successfully for this patch, I realize
that this does not mean that much, but it is an indication that it is
not completely broken.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-03-30 14:26         ` [Intel-gfx] " Hans de Goede
@ 2018-04-04 15:50           ` Ville Syrjälä
  2018-04-04 17:26             ` Rodrigo Vivi
  2018-04-04 20:06             ` [Intel-gfx] " Hans de Goede
  0 siblings, 2 replies; 25+ messages in thread
From: Ville Syrjälä @ 2018-04-04 15:50 UTC (permalink / raw)
  To: Hans de Goede; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 30-03-18 15:25, Hans de Goede wrote:
> > Hi,
> > 
> > On 30-03-18 14:44, Chris Wilson wrote:
> >> Quoting Hans de Goede (2018-03-30 13:37:40)
> >>> Hi,
> >>>
> >>> On 30-03-18 14:30, Chris Wilson wrote:
> >>>> Quoting Hans de Goede (2018-03-30 13:27:15)
> >>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> >>>>> skipping the first 4k by passing 4096 as start of the address range passed
> >>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
> >>>>> reserve the firmware framebuffer so that we can inherit it would always
> >>>>> fail, as the firmware framebuffer starts at address 0.
> >>>>>
> >>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> >>>>> everything >= gen8") says in its commit message: "This is confirmed to fix
> >>>>> Skylake screen flickering issues (probably caused by the fact that we
> >>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
> >>>>> this theory)."
> >>>>>
> >>>>> Which suggests that it is safe to use the first page for a linear
> >>>>> framebuffer as the firmware is doing.
> >>>>>
> >>>>> This commit always passes 0 as start to drm_mm_init() and works around
> >>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> >>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
> >>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
> >>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> >>>>> objects such as ring-buffers will not be allocated in the first 4k.
> >>>>>
> >>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
> >>>>> which directly calls drm_mm_reserve_node() which now will be able to
> >>>>> use the first 4k.
> >>>>>
> >>>>> This fixes the i915 driver no longer being able to inherit the firmware
> >>>>> framebuffer on gen8+, which fixes the video output changing from the
> >>>>> vendor logo to a black screen as soon as the i915 driver is loaded
> >>>>> (on systems without fbcon).
> >>>>
> >>>> We've been told by the HW guys not to use the first page. (That's my
> >>>> understanding from every time this gets questioned.)
> >>>
> >>> Yet the GOP is happily using the first page. I think we may need to make
> >>> a difference here between the GPU not using the first page and the
> >>> display engine/pipeline not using the first page. Note that my patch
> >>> only influences the inheriting of the initial framebuffer as allocated
> >>> by the GOP. It does not influence any other allocations from the
> >>> reserved range, those will still all avoid the first page.
> >>>
> >>> Without this patch fastboot / flickerfree support is essentially broken
> >>> on any gen8+ hardware given that one of the goals of atomic is to be
> >>> able to do flickerfree transitions I think that this warrants a closer
> >>> look then just simply saying never use the first page.
> >>
> >> The concern is what else (i.e. nothing that we allocated ourselves) that
> >> may be in the first page...
> > 
> > Given that the GOP has put its framebuffer there at least at boot there
> > is nothing there, otherwise it would show up on the display.
> > 
> > We have a whole bunch of code to inherit the BIOS fb in intel_display.c
> > and AFAIK that code is there because this inheriting the BIOS fb is
> > deemed an important feature. So I'm not happy at all with the handwavy
> > best to not touch it answer I'm getting to this patch.
> > 
> > Unless there are some clear answer from the hardware folks which specifically
> > say we cannot put a framebuffer there (and then why is the GOP doing it?)
> > then I believe that the best way forward here is to get various people to
> > test with this patch and the best way to do that is probably to put it
> > in next. Note I deliberately did not add a Cc stable.
> 
> To elaborate on this, the excluding of the first 4k of the stolen memory
> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
> which in turn causes intel_find_initial_plane_obj() to call
> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
> completely turns off the display which leads to a very ugly flickering
> of the display at boot (as well as replacing the vendor logo with a
> black screen).
> 
> I think we can all agree that this behavior is undesirable and even a
> regression in behavior caused by the fix to exclude the first 4k.
> 
> Chris, if my patch is not an acceptable way to fix this, then how do you
> suggest that we fix this?
> 
> Digging a bit deeper I found this:
> 
> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf
> 
> Which says:
> 
> "WaSkipStolenMemoryFirstPage:
> 
> WA to skip the first page of stolen
> memory due to sporadic HW write on *CS Idle"
> 
> And also about FBC:
> 
> "First line of FBC getting corrupted when FBC
> compressed frame buffer offset is programmed to
> zero. Command streamers are doing flush writes to
> base of stolen.
> WA: New restriction to program FBC compressed
> frame buffer offset to at least 4KB."
> 
> So using the first 4kB for the *framebuffer* as done by the GOP will
> not cause any major problems (freezes, hangs, etc.), and commit
> d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> everything >= gen8") was correct in deducing that the problem was
> likely that some *vital* information was being stored i the first 4k
> and that go overwritten.
> 
> But the contents of the (first lines of) the framebuffer may become
> corrupted once we actually start using the command-streamers, which
> is still very much not wanted.
> 
> In practice Xorg or Wayland will likely have setup another framebuffer
> by the time the command-streamers will start to get used.
> 
> Alternatively we could start with inheriting the BIOS framebuffer
> (as my patch allows) so that we don't get the flicker and then soon
> afterwards atomically transit to a new framebuffer (which should
> contain a copy of the BIOS fb contents) at a different location.

What I suggested long ago was to copy just the first page and adjust the
sg list. But I'm not sure if our stolen gem code would be happy with an
sg list with two entries instead of one.

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-04 15:50           ` Ville Syrjälä
@ 2018-04-04 17:26             ` Rodrigo Vivi
  2018-04-04 20:06             ` [Intel-gfx] " Hans de Goede
  1 sibling, 0 replies; 25+ messages in thread
From: Rodrigo Vivi @ 2018-04-04 17:26 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, dri-devel, Hans de Goede

On Wed, Apr 04, 2018 at 06:50:16PM +0300, Ville Syrjälä wrote:
> On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote:
> > Hi,
> > 
> > On 30-03-18 15:25, Hans de Goede wrote:
> > > Hi,
> > > 
> > > On 30-03-18 14:44, Chris Wilson wrote:
> > >> Quoting Hans de Goede (2018-03-30 13:37:40)
> > >>> Hi,
> > >>>
> > >>> On 30-03-18 14:30, Chris Wilson wrote:
> > >>>> Quoting Hans de Goede (2018-03-30 13:27:15)
> > >>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> > >>>>> skipping the first 4k by passing 4096 as start of the address range passed
> > >>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
> > >>>>> reserve the firmware framebuffer so that we can inherit it would always
> > >>>>> fail, as the firmware framebuffer starts at address 0.
> > >>>>>
> > >>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> > >>>>> everything >= gen8") says in its commit message: "This is confirmed to fix
> > >>>>> Skylake screen flickering issues (probably caused by the fact that we
> > >>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
> > >>>>> this theory)."
> > >>>>>
> > >>>>> Which suggests that it is safe to use the first page for a linear
> > >>>>> framebuffer as the firmware is doing.
> > >>>>>
> > >>>>> This commit always passes 0 as start to drm_mm_init() and works around
> > >>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> > >>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
> > >>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
> > >>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> > >>>>> objects such as ring-buffers will not be allocated in the first 4k.
> > >>>>>
> > >>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
> > >>>>> which directly calls drm_mm_reserve_node() which now will be able to
> > >>>>> use the first 4k.
> > >>>>>
> > >>>>> This fixes the i915 driver no longer being able to inherit the firmware
> > >>>>> framebuffer on gen8+, which fixes the video output changing from the
> > >>>>> vendor logo to a black screen as soon as the i915 driver is loaded
> > >>>>> (on systems without fbcon).
> > >>>>
> > >>>> We've been told by the HW guys not to use the first page. (That's my
> > >>>> understanding from every time this gets questioned.)
> > >>>
> > >>> Yet the GOP is happily using the first page. I think we may need to make
> > >>> a difference here between the GPU not using the first page and the
> > >>> display engine/pipeline not using the first page. Note that my patch
> > >>> only influences the inheriting of the initial framebuffer as allocated
> > >>> by the GOP. It does not influence any other allocations from the
> > >>> reserved range, those will still all avoid the first page.
> > >>>
> > >>> Without this patch fastboot / flickerfree support is essentially broken
> > >>> on any gen8+ hardware given that one of the goals of atomic is to be
> > >>> able to do flickerfree transitions I think that this warrants a closer
> > >>> look then just simply saying never use the first page.
> > >>
> > >> The concern is what else (i.e. nothing that we allocated ourselves) that
> > >> may be in the first page...
> > > 
> > > Given that the GOP has put its framebuffer there at least at boot there
> > > is nothing there, otherwise it would show up on the display.
> > > 
> > > We have a whole bunch of code to inherit the BIOS fb in intel_display.c
> > > and AFAIK that code is there because this inheriting the BIOS fb is
> > > deemed an important feature. So I'm not happy at all with the handwavy
> > > best to not touch it answer I'm getting to this patch.
> > > 
> > > Unless there are some clear answer from the hardware folks which specifically
> > > say we cannot put a framebuffer there (and then why is the GOP doing it?)
> > > then I believe that the best way forward here is to get various people to
> > > test with this patch and the best way to do that is probably to put it
> > > in next. Note I deliberately did not add a Cc stable.
> > 
> > To elaborate on this, the excluding of the first 4k of the stolen memory
> > region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
> > which in turn causes intel_find_initial_plane_obj() to call
> > intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
> > completely turns off the display which leads to a very ugly flickering
> > of the display at boot (as well as replacing the vendor logo with a
> > black screen).
> > 
> > I think we can all agree that this behavior is undesirable and even a
> > regression in behavior caused by the fix to exclude the first 4k.
> > 
> > Chris, if my patch is not an acceptable way to fix this, then how do you
> > suggest that we fix this?
> > 
> > Digging a bit deeper I found this:
> > 
> > https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf
> > 
> > Which says:
> > 
> > "WaSkipStolenMemoryFirstPage:
> > 
> > WA to skip the first page of stolen
> > memory due to sporadic HW write on *CS Idle"
> > 
> > And also about FBC:
> > 
> > "First line of FBC getting corrupted when FBC
> > compressed frame buffer offset is programmed to
> > zero. Command streamers are doing flush writes to
> > base of stolen.
> > WA: New restriction to program FBC compressed
> > frame buffer offset to at least 4KB."
> > 
> > So using the first 4kB for the *framebuffer* as done by the GOP will
> > not cause any major problems (freezes, hangs, etc.), and commit
> > d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> > everything >= gen8") was correct in deducing that the problem was
> > likely that some *vital* information was being stored i the first 4k
> > and that go overwritten.
> > 
> > But the contents of the (first lines of) the framebuffer may become
> > corrupted once we actually start using the command-streamers, which
> > is still very much not wanted.
> > 
> > In practice Xorg or Wayland will likely have setup another framebuffer
> > by the time the command-streamers will start to get used.
> > 
> > Alternatively we could start with inheriting the BIOS framebuffer
> > (as my patch allows) so that we don't get the flicker and then soon
> > afterwards atomically transit to a new framebuffer (which should
> > contain a copy of the BIOS fb contents) at a different location.
> 
> What I suggested long ago was to copy just the first page and adjust the
> sg list. But I'm not sure if our stolen gem code would be happy with an
> sg list with two entries instead of one.

Oh I vaguely remember the issue and this suggestion.

But looking to Hans code and explanation I believe that makes sense,
although I understand the fear of regressions :/

> 
> -- 
> Ville Syrjälä
> Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-04 15:50           ` Ville Syrjälä
  2018-04-04 17:26             ` Rodrigo Vivi
@ 2018-04-04 20:06             ` Hans de Goede
  2018-04-04 20:49               ` Ville Syrjälä
  1 sibling, 1 reply; 25+ messages in thread
From: Hans de Goede @ 2018-04-04 20:06 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

Hi,

On 04-04-18 17:50, Ville Syrjälä wrote:
> On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 30-03-18 15:25, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 30-03-18 14:44, Chris Wilson wrote:
>>>> Quoting Hans de Goede (2018-03-30 13:37:40)
>>>>> Hi,
>>>>>
>>>>> On 30-03-18 14:30, Chris Wilson wrote:
>>>>>> Quoting Hans de Goede (2018-03-30 13:27:15)
>>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>>>>> skipping the first 4k by passing 4096 as start of the address range passed
>>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
>>>>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>>>>> fail, as the firmware framebuffer starts at address 0.
>>>>>>>
>>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix
>>>>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>>>>> this theory)."
>>>>>>>
>>>>>>> Which suggests that it is safe to use the first page for a linear
>>>>>>> framebuffer as the firmware is doing.
>>>>>>>
>>>>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>>>>
>>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>>>>> use the first 4k.
>>>>>>>
>>>>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>>>>> (on systems without fbcon).
>>>>>>
>>>>>> We've been told by the HW guys not to use the first page. (That's my
>>>>>> understanding from every time this gets questioned.)
>>>>>
>>>>> Yet the GOP is happily using the first page. I think we may need to make
>>>>> a difference here between the GPU not using the first page and the
>>>>> display engine/pipeline not using the first page. Note that my patch
>>>>> only influences the inheriting of the initial framebuffer as allocated
>>>>> by the GOP. It does not influence any other allocations from the
>>>>> reserved range, those will still all avoid the first page.
>>>>>
>>>>> Without this patch fastboot / flickerfree support is essentially broken
>>>>> on any gen8+ hardware given that one of the goals of atomic is to be
>>>>> able to do flickerfree transitions I think that this warrants a closer
>>>>> look then just simply saying never use the first page.
>>>>
>>>> The concern is what else (i.e. nothing that we allocated ourselves) that
>>>> may be in the first page...
>>>
>>> Given that the GOP has put its framebuffer there at least at boot there
>>> is nothing there, otherwise it would show up on the display.
>>>
>>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c
>>> and AFAIK that code is there because this inheriting the BIOS fb is
>>> deemed an important feature. So I'm not happy at all with the handwavy
>>> best to not touch it answer I'm getting to this patch.
>>>
>>> Unless there are some clear answer from the hardware folks which specifically
>>> say we cannot put a framebuffer there (and then why is the GOP doing it?)
>>> then I believe that the best way forward here is to get various people to
>>> test with this patch and the best way to do that is probably to put it
>>> in next. Note I deliberately did not add a Cc stable.
>>
>> To elaborate on this, the excluding of the first 4k of the stolen memory
>> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
>> which in turn causes intel_find_initial_plane_obj() to call
>> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
>> completely turns off the display which leads to a very ugly flickering
>> of the display at boot (as well as replacing the vendor logo with a
>> black screen).
>>
>> I think we can all agree that this behavior is undesirable and even a
>> regression in behavior caused by the fix to exclude the first 4k.
>>
>> Chris, if my patch is not an acceptable way to fix this, then how do you
>> suggest that we fix this?
>>
>> Digging a bit deeper I found this:
>>
>> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf
>>
>> Which says:
>>
>> "WaSkipStolenMemoryFirstPage:
>>
>> WA to skip the first page of stolen
>> memory due to sporadic HW write on *CS Idle"
>>
>> And also about FBC:
>>
>> "First line of FBC getting corrupted when FBC
>> compressed frame buffer offset is programmed to
>> zero. Command streamers are doing flush writes to
>> base of stolen.
>> WA: New restriction to program FBC compressed
>> frame buffer offset to at least 4KB."
>>
>> So using the first 4kB for the *framebuffer* as done by the GOP will
>> not cause any major problems (freezes, hangs, etc.), and commit
>> d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>> everything >= gen8") was correct in deducing that the problem was
>> likely that some *vital* information was being stored i the first 4k
>> and that go overwritten.
>>
>> But the contents of the (first lines of) the framebuffer may become
>> corrupted once we actually start using the command-streamers, which
>> is still very much not wanted.
>>
>> In practice Xorg or Wayland will likely have setup another framebuffer
>> by the time the command-streamers will start to get used.
>>
>> Alternatively we could start with inheriting the BIOS framebuffer
>> (as my patch allows) so that we don't get the flicker and then soon
>> afterwards atomically transit to a new framebuffer (which should
>> contain a copy of the BIOS fb contents) at a different location.
> 
> What I suggested long ago was to copy just the first page and adjust the
> sg list. But I'm not sure if our stolen gem code would be happy with an
> sg list with two entries instead of one.

But that would still require an atomic-modeset to install the new sg-list,
right? Then we might just as well simply alloc a new fb and copy the
contents over, or are you worried that with say a 4k fb that takes too
much time? FWIW I can see how the single memcpy this involves will take
some time, but I don't take it will take so long as to be a problem.

Anyways I could use some help with implementing either solution as I'm
not familiar with the involved parts of the code. I will happily test
a patch for this. Keep in mind that for this to work my original patch
will also be necessary so that the initial takeover of the firmware
fb will work.

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-04 20:06             ` [Intel-gfx] " Hans de Goede
@ 2018-04-04 20:49               ` Ville Syrjälä
  2018-04-05 11:37                 ` Hans de Goede
  0 siblings, 1 reply; 25+ messages in thread
From: Ville Syrjälä @ 2018-04-04 20:49 UTC (permalink / raw)
  To: Hans de Goede; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 04-04-18 17:50, Ville Syrjälä wrote:
> > On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote:
> >> Hi,
> >>
> >> On 30-03-18 15:25, Hans de Goede wrote:
> >>> Hi,
> >>>
> >>> On 30-03-18 14:44, Chris Wilson wrote:
> >>>> Quoting Hans de Goede (2018-03-30 13:37:40)
> >>>>> Hi,
> >>>>>
> >>>>> On 30-03-18 14:30, Chris Wilson wrote:
> >>>>>> Quoting Hans de Goede (2018-03-30 13:27:15)
> >>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> >>>>>>> skipping the first 4k by passing 4096 as start of the address range passed
> >>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
> >>>>>>> reserve the firmware framebuffer so that we can inherit it would always
> >>>>>>> fail, as the firmware framebuffer starts at address 0.
> >>>>>>>
> >>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> >>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix
> >>>>>>> Skylake screen flickering issues (probably caused by the fact that we
> >>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
> >>>>>>> this theory)."
> >>>>>>>
> >>>>>>> Which suggests that it is safe to use the first page for a linear
> >>>>>>> framebuffer as the firmware is doing.
> >>>>>>>
> >>>>>>> This commit always passes 0 as start to drm_mm_init() and works around
> >>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> >>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
> >>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
> >>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> >>>>>>> objects such as ring-buffers will not be allocated in the first 4k.
> >>>>>>>
> >>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
> >>>>>>> which directly calls drm_mm_reserve_node() which now will be able to
> >>>>>>> use the first 4k.
> >>>>>>>
> >>>>>>> This fixes the i915 driver no longer being able to inherit the firmware
> >>>>>>> framebuffer on gen8+, which fixes the video output changing from the
> >>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded
> >>>>>>> (on systems without fbcon).
> >>>>>>
> >>>>>> We've been told by the HW guys not to use the first page. (That's my
> >>>>>> understanding from every time this gets questioned.)
> >>>>>
> >>>>> Yet the GOP is happily using the first page. I think we may need to make
> >>>>> a difference here between the GPU not using the first page and the
> >>>>> display engine/pipeline not using the first page. Note that my patch
> >>>>> only influences the inheriting of the initial framebuffer as allocated
> >>>>> by the GOP. It does not influence any other allocations from the
> >>>>> reserved range, those will still all avoid the first page.
> >>>>>
> >>>>> Without this patch fastboot / flickerfree support is essentially broken
> >>>>> on any gen8+ hardware given that one of the goals of atomic is to be
> >>>>> able to do flickerfree transitions I think that this warrants a closer
> >>>>> look then just simply saying never use the first page.
> >>>>
> >>>> The concern is what else (i.e. nothing that we allocated ourselves) that
> >>>> may be in the first page...
> >>>
> >>> Given that the GOP has put its framebuffer there at least at boot there
> >>> is nothing there, otherwise it would show up on the display.
> >>>
> >>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c
> >>> and AFAIK that code is there because this inheriting the BIOS fb is
> >>> deemed an important feature. So I'm not happy at all with the handwavy
> >>> best to not touch it answer I'm getting to this patch.
> >>>
> >>> Unless there are some clear answer from the hardware folks which specifically
> >>> say we cannot put a framebuffer there (and then why is the GOP doing it?)
> >>> then I believe that the best way forward here is to get various people to
> >>> test with this patch and the best way to do that is probably to put it
> >>> in next. Note I deliberately did not add a Cc stable.
> >>
> >> To elaborate on this, the excluding of the first 4k of the stolen memory
> >> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
> >> which in turn causes intel_find_initial_plane_obj() to call
> >> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
> >> completely turns off the display which leads to a very ugly flickering
> >> of the display at boot (as well as replacing the vendor logo with a
> >> black screen).
> >>
> >> I think we can all agree that this behavior is undesirable and even a
> >> regression in behavior caused by the fix to exclude the first 4k.
> >>
> >> Chris, if my patch is not an acceptable way to fix this, then how do you
> >> suggest that we fix this?
> >>
> >> Digging a bit deeper I found this:
> >>
> >> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf
> >>
> >> Which says:
> >>
> >> "WaSkipStolenMemoryFirstPage:
> >>
> >> WA to skip the first page of stolen
> >> memory due to sporadic HW write on *CS Idle"
> >>
> >> And also about FBC:
> >>
> >> "First line of FBC getting corrupted when FBC
> >> compressed frame buffer offset is programmed to
> >> zero. Command streamers are doing flush writes to
> >> base of stolen.
> >> WA: New restriction to program FBC compressed
> >> frame buffer offset to at least 4KB."
> >>
> >> So using the first 4kB for the *framebuffer* as done by the GOP will
> >> not cause any major problems (freezes, hangs, etc.), and commit
> >> d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> >> everything >= gen8") was correct in deducing that the problem was
> >> likely that some *vital* information was being stored i the first 4k
> >> and that go overwritten.
> >>
> >> But the contents of the (first lines of) the framebuffer may become
> >> corrupted once we actually start using the command-streamers, which
> >> is still very much not wanted.
> >>
> >> In practice Xorg or Wayland will likely have setup another framebuffer
> >> by the time the command-streamers will start to get used.
> >>
> >> Alternatively we could start with inheriting the BIOS framebuffer
> >> (as my patch allows) so that we don't get the flicker and then soon
> >> afterwards atomically transit to a new framebuffer (which should
> >> contain a copy of the BIOS fb contents) at a different location.
> > 
> > What I suggested long ago was to copy just the first page and adjust the
> > sg list. But I'm not sure if our stolen gem code would be happy with an
> > sg list with two entries instead of one.
> 
> But that would still require an atomic-modeset to install the new sg-list,
> right?

Perhaps not. Not sure if the pte update would be atomic enough to just
change it underneath the display engine without ill effects, and then
do the equivalent of a page flip to invalidate the TLBs.

> Then we might just as well simply alloc a new fb and copy the
> contents over, or are you worried that with say a 4k fb that takes too
> much time? FWIW I can see how the single memcpy this involves will take
> some time, but I don't take it will take so long as to be a problem.

Mainly just a question of keeping it in stolen. Assuming we want to keep
things in stolen, which is a matter of some debate as FBC needs stolen
and people might not be happy if it's all taken up by fbdev.

> 
> Anyways I could use some help with implementing either solution as I'm
> not familiar with the involved parts of the code. I will happily test
> a patch for this. Keep in mind that for this to work my original patch
> will also be necessary so that the initial takeover of the firmware
> fb will work.

I guess the trickiest part would be getting both the old and new
location of the page mapped in the ggtt at the same time. Sadly you're
not allowed to access stolen directly. So I suppose this part would
involve some fairly low level frobbing of the ggtt ptes and a
manual ioremap() of the matching ranges of the aperture.

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-03-30 12:27 [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers Hans de Goede
                   ` (2 preceding siblings ...)
  2018-03-30 13:31 ` ✓ Fi.CI.IGT: " Patchwork
@ 2018-04-05  7:14 ` Daniel Vetter
  2018-04-05 11:47   ` Hans de Goede
  2018-04-19 13:54 ` ✗ Fi.CI.BAT: failure for drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers (rev2) Patchwork
  4 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2018-04-05  7:14 UTC (permalink / raw)
  To: Hans de Goede; +Cc: intel-gfx, Hans de Goede, dri-devel, Rodrigo Vivi

On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> skipping the first 4k by passing 4096 as start of the address range passed
> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
> reserve the firmware framebuffer so that we can inherit it would always
> fail, as the firmware framebuffer starts at address 0.
> 
> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> everything >= gen8") says in its commit message: "This is confirmed to fix
> Skylake screen flickering issues (probably caused by the fact that we
> initialized a ring in the first page of stolen, but I didn't 100% confirm
> this theory)."
> 
> Which suggests that it is safe to use the first page for a linear
> framebuffer as the firmware is doing.
> 
> This commit always passes 0 as start to drm_mm_init() and works around
> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> by insuring the start address passed by to drm_mm_insert_node_in_range()
> is always 4k or more. All entry points to i915_gem_stolen.c go through
> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> objects such as ring-buffers will not be allocated in the first 4k.
> 
> The one exception is i915_gem_object_create_stolen_for_preallocated()
> which directly calls drm_mm_reserve_node() which now will be able to
> use the first 4k.
> 
> This fixes the i915 driver no longer being able to inherit the firmware
> framebuffer on gen8+, which fixes the video output changing from the
> vendor logo to a black screen as soon as the i915 driver is loaded
> (on systems without fbcon).
> 
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>

I think this is worth a shot. The only explanation I can think of why the
GOP could get away with this and still follow the w/a is if it doesn't
have a 1:1 mapping between GGTT and stolen. Atm we hardcode that
assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
both the stolen_offset and the gtt_offset (but it's only the gtt_offset
really). And since we're not re-writing the ptes it's not noticeable.

I think to decide whether this is the right approach we should
double-check whether that 1:1 assumption really holds true: Either read
back the ggtt ptes and check their addresses (but iirc on some platforms
their write-only, readback doesn't work), or we also rewrite the ptes
again for preallocated stuff, like when binding a normal object into the
gtt. If either of these approaches confirms that those affected gen8+
machines still use the 1:1 mapping, then I'm happy to put my r-b on this
patch. If not, well then we at least know what to fix: We need to read out
the real stolen_offset, instead of making assumptions.

Cheers, Daniel
> ---
>  drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
>  1 file changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index af915d041281..ad949cc30928 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
>  	if (!drm_mm_initialized(&dev_priv->mm.stolen))
>  		return -ENODEV;
>  
> +	/* WaSkipStolenMemoryFirstPage:bdw+ */
> +	if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
> +		start = 4096;
> +
>  	mutex_lock(&dev_priv->mm.stolen_lock);
>  	ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
>  					  size, alignment, 0,
> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
>  {
>  	resource_size_t reserved_base, stolen_top;
>  	resource_size_t reserved_total, reserved_size;
> -	resource_size_t stolen_usable_start;
>  
>  	mutex_init(&dev_priv->mm.stolen_lock);
>  
> @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
>  			 (u64)resource_size(&dev_priv->dsm) >> 10,
>  			 ((u64)resource_size(&dev_priv->dsm) - reserved_total) >> 10);
>  
> -	stolen_usable_start = 0;
> -	/* WaSkipStolenMemoryFirstPage:bdw+ */
> -	if (INTEL_GEN(dev_priv) >= 8)
> -		stolen_usable_start = 4096;
> -
>  	dev_priv->stolen_usable_size =
> -		resource_size(&dev_priv->dsm) - reserved_total - stolen_usable_start;
> +		resource_size(&dev_priv->dsm) - reserved_total;
>  
>  	/* Basic memrange allocator for stolen space. */
> -	drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
> -		    dev_priv->stolen_usable_size);
> +	drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->stolen_usable_size);
>  
>  	return 0;
>  }
> -- 
> 2.17.0.rc2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-04 20:49               ` Ville Syrjälä
@ 2018-04-05 11:37                 ` Hans de Goede
  2018-04-06 16:06                   ` [Intel-gfx] " Ville Syrjälä
  0 siblings, 1 reply; 25+ messages in thread
From: Hans de Goede @ 2018-04-05 11:37 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

Hi,

On 04-04-18 22:49, Ville Syrjälä wrote:
> On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 04-04-18 17:50, Ville Syrjälä wrote:
>>> On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 30-03-18 15:25, Hans de Goede wrote:
>>>>> Hi,
>>>>>
>>>>> On 30-03-18 14:44, Chris Wilson wrote:
>>>>>> Quoting Hans de Goede (2018-03-30 13:37:40)
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 30-03-18 14:30, Chris Wilson wrote:
>>>>>>>> Quoting Hans de Goede (2018-03-30 13:27:15)
>>>>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>>>>>>> skipping the first 4k by passing 4096 as start of the address range passed
>>>>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
>>>>>>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>>>>>>> fail, as the firmware framebuffer starts at address 0.
>>>>>>>>>
>>>>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix
>>>>>>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>>>>>>> this theory)."
>>>>>>>>>
>>>>>>>>> Which suggests that it is safe to use the first page for a linear
>>>>>>>>> framebuffer as the firmware is doing.
>>>>>>>>>
>>>>>>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>>>>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>>>>>>
>>>>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>>>>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>>>>>>> use the first 4k.
>>>>>>>>>
>>>>>>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>>>>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>>>>>>> (on systems without fbcon).
>>>>>>>>
>>>>>>>> We've been told by the HW guys not to use the first page. (That's my
>>>>>>>> understanding from every time this gets questioned.)
>>>>>>>
>>>>>>> Yet the GOP is happily using the first page. I think we may need to make
>>>>>>> a difference here between the GPU not using the first page and the
>>>>>>> display engine/pipeline not using the first page. Note that my patch
>>>>>>> only influences the inheriting of the initial framebuffer as allocated
>>>>>>> by the GOP. It does not influence any other allocations from the
>>>>>>> reserved range, those will still all avoid the first page.
>>>>>>>
>>>>>>> Without this patch fastboot / flickerfree support is essentially broken
>>>>>>> on any gen8+ hardware given that one of the goals of atomic is to be
>>>>>>> able to do flickerfree transitions I think that this warrants a closer
>>>>>>> look then just simply saying never use the first page.
>>>>>>
>>>>>> The concern is what else (i.e. nothing that we allocated ourselves) that
>>>>>> may be in the first page...
>>>>>
>>>>> Given that the GOP has put its framebuffer there at least at boot there
>>>>> is nothing there, otherwise it would show up on the display.
>>>>>
>>>>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c
>>>>> and AFAIK that code is there because this inheriting the BIOS fb is
>>>>> deemed an important feature. So I'm not happy at all with the handwavy
>>>>> best to not touch it answer I'm getting to this patch.
>>>>>
>>>>> Unless there are some clear answer from the hardware folks which specifically
>>>>> say we cannot put a framebuffer there (and then why is the GOP doing it?)
>>>>> then I believe that the best way forward here is to get various people to
>>>>> test with this patch and the best way to do that is probably to put it
>>>>> in next. Note I deliberately did not add a Cc stable.
>>>>
>>>> To elaborate on this, the excluding of the first 4k of the stolen memory
>>>> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
>>>> which in turn causes intel_find_initial_plane_obj() to call
>>>> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
>>>> completely turns off the display which leads to a very ugly flickering
>>>> of the display at boot (as well as replacing the vendor logo with a
>>>> black screen).
>>>>
>>>> I think we can all agree that this behavior is undesirable and even a
>>>> regression in behavior caused by the fix to exclude the first 4k.
>>>>
>>>> Chris, if my patch is not an acceptable way to fix this, then how do you
>>>> suggest that we fix this?
>>>>
>>>> Digging a bit deeper I found this:
>>>>
>>>> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf
>>>>
>>>> Which says:
>>>>
>>>> "WaSkipStolenMemoryFirstPage:
>>>>
>>>> WA to skip the first page of stolen
>>>> memory due to sporadic HW write on *CS Idle"
>>>>
>>>> And also about FBC:
>>>>
>>>> "First line of FBC getting corrupted when FBC
>>>> compressed frame buffer offset is programmed to
>>>> zero. Command streamers are doing flush writes to
>>>> base of stolen.
>>>> WA: New restriction to program FBC compressed
>>>> frame buffer offset to at least 4KB."
>>>>
>>>> So using the first 4kB for the *framebuffer* as done by the GOP will
>>>> not cause any major problems (freezes, hangs, etc.), and commit
>>>> d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>> everything >= gen8") was correct in deducing that the problem was
>>>> likely that some *vital* information was being stored i the first 4k
>>>> and that go overwritten.
>>>>
>>>> But the contents of the (first lines of) the framebuffer may become
>>>> corrupted once we actually start using the command-streamers, which
>>>> is still very much not wanted.
>>>>
>>>> In practice Xorg or Wayland will likely have setup another framebuffer
>>>> by the time the command-streamers will start to get used.
>>>>
>>>> Alternatively we could start with inheriting the BIOS framebuffer
>>>> (as my patch allows) so that we don't get the flicker and then soon
>>>> afterwards atomically transit to a new framebuffer (which should
>>>> contain a copy of the BIOS fb contents) at a different location.
>>>
>>> What I suggested long ago was to copy just the first page and adjust the
>>> sg list. But I'm not sure if our stolen gem code would be happy with an
>>> sg list with two entries instead of one.
>>
>> But that would still require an atomic-modeset to install the new sg-list,
>> right?
> 
> Perhaps not. Not sure if the pte update would be atomic enough to just
> change it underneath the display engine without ill effects, and then
> do the equivalent of a page flip to invalidate the TLBs.
> 
>> Then we might just as well simply alloc a new fb and copy the
>> contents over, or are you worried that with say a 4k fb that takes too
>> much time? FWIW I can see how the single memcpy this involves will take
>> some time, but I don't take it will take so long as to be a problem.
> 
> Mainly just a question of keeping it in stolen.

Ah I see.

> Assuming we want to keep
> things in stolen, which is a matter of some debate as FBC needs stolen
> and people might not be happy if it's all taken up by fbdev.
> 
>>
>> Anyways I could use some help with implementing either solution as I'm
>> not familiar with the involved parts of the code. I will happily test
>> a patch for this. Keep in mind that for this to work my original patch
>> will also be necessary so that the initial takeover of the firmware
>> fb will work.
> 
> I guess the trickiest part would be getting both the old and new
> location of the page mapped in the ggtt at the same time. Sadly you're
> not allowed to access stolen directly. So I suppose this part would
> involve some fairly low level frobbing of the ggtt ptes and a
> manual ioremap() of the matching ranges of the aperture.

Hmm, you're talking about what needs to be done to copy the contents here,
right?

I have a feeling we really should just try only my patch first, as
mentioned before the worst thing which can happen is some corruption
of the first lines of the display, which I agree is not good, but also
not the end of the world.

Regards,

Hans
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-05  7:14 ` [PATCH] " Daniel Vetter
@ 2018-04-05 11:47   ` Hans de Goede
  2018-04-05 13:26     ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Hans de Goede @ 2018-04-05 11:47 UTC (permalink / raw)
  To: Daniel Vetter, Hans de Goede; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

Hi,

On 05-04-18 09:14, Daniel Vetter wrote:
> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>> skipping the first 4k by passing 4096 as start of the address range passed
>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
>> reserve the firmware framebuffer so that we can inherit it would always
>> fail, as the firmware framebuffer starts at address 0.
>>
>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>> everything >= gen8") says in its commit message: "This is confirmed to fix
>> Skylake screen flickering issues (probably caused by the fact that we
>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>> this theory)."
>>
>> Which suggests that it is safe to use the first page for a linear
>> framebuffer as the firmware is doing.
>>
>> This commit always passes 0 as start to drm_mm_init() and works around
>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>> objects such as ring-buffers will not be allocated in the first 4k.
>>
>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>> which directly calls drm_mm_reserve_node() which now will be able to
>> use the first 4k.
>>
>> This fixes the i915 driver no longer being able to inherit the firmware
>> framebuffer on gen8+, which fixes the video output changing from the
>> vendor logo to a black screen as soon as the i915 driver is loaded
>> (on systems without fbcon).
>>
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> 
> I think this is worth a shot. The only explanation I can think of why the
> GOP could get away with this and still follow the w/a is if it doesn't
> have a 1:1 mapping between GGTT and stolen.

My guess is that the GOP does not care about the w/a because the w/a
states that the command-streamers sometimes do a spurious flush (write)
to the first page, and the command-streamers are not used until much
later, so the GOP is probably ignoring the w/a all together.

At least that is my theory.

> Atm we hardcode that
> assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
> both the stolen_offset and the gtt_offset (but it's only the gtt_offset
> really). And since we're not re-writing the ptes it's not noticeable.
> 
> I think to decide whether this is the right approach we should
> double-check whether that 1:1 assumption really holds true: Either read
> back the ggtt ptes and check their addresses (but iirc on some platforms
> their write-only, readback doesn't work), or we also rewrite the ptes
> again for preallocated stuff, like when binding a normal object into the
> gtt. If either of these approaches confirms that those affected gen8+
> machines still use the 1:1 mapping, then I'm happy to put my r-b on this
> patch. If not, well then we at least know what to fix: We need to read out
> the real stolen_offset, instead of making assumptions.

I'm happy to give this a try on one or more affected machines, I can e.g.
try this on both a skylake desktop and a cherry-trail based tablet.

But I'm clueless about the whole PTE stuff, so I'm going to need someone
to provide me with a patch following either approach. If readback of the
PTEs works on skylake / cherrytrail I guess that would be the best way.

Note to test this I'm currently booting the kernel with the machine's
UEFI vendor logo still being displayed when efifb kicks in. I've added:
"fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
VC0 / VC1, so that the framebuffer contents stays intact, combined with
the patch we are discussing now, this makes the vendor logo stay on
the screen all the way till GDM loads, which looks rather nice actually :)

And on shutdown we fall back to the original framebuffer and the vendor
logo is back again. I cannot see any corruption in the display at either
boot or shutdown time.

Regards,

Hans



> 
> Cheers, Daniel
>> ---
>>   drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
>>   1 file changed, 6 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> index af915d041281..ad949cc30928 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
>>   	if (!drm_mm_initialized(&dev_priv->mm.stolen))
>>   		return -ENODEV;
>>   
>> +	/* WaSkipStolenMemoryFirstPage:bdw+ */
>> +	if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
>> +		start = 4096;
>> +
>>   	mutex_lock(&dev_priv->mm.stolen_lock);
>>   	ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
>>   					  size, alignment, 0,
>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
>>   {
>>   	resource_size_t reserved_base, stolen_top;
>>   	resource_size_t reserved_total, reserved_size;
>> -	resource_size_t stolen_usable_start;
>>   
>>   	mutex_init(&dev_priv->mm.stolen_lock);
>>   
>> @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv)
>>   			 (u64)resource_size(&dev_priv->dsm) >> 10,
>>   			 ((u64)resource_size(&dev_priv->dsm) - reserved_total) >> 10);
>>   
>> -	stolen_usable_start = 0;
>> -	/* WaSkipStolenMemoryFirstPage:bdw+ */
>> -	if (INTEL_GEN(dev_priv) >= 8)
>> -		stolen_usable_start = 4096;
>> -
>>   	dev_priv->stolen_usable_size =
>> -		resource_size(&dev_priv->dsm) - reserved_total - stolen_usable_start;
>> +		resource_size(&dev_priv->dsm) - reserved_total;
>>   
>>   	/* Basic memrange allocator for stolen space. */
>> -	drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
>> -		    dev_priv->stolen_usable_size);
>> +	drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->stolen_usable_size);
>>   
>>   	return 0;
>>   }
>> -- 
>> 2.17.0.rc2
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-05 11:47   ` Hans de Goede
@ 2018-04-05 13:26     ` Daniel Vetter
  2018-04-05 13:31       ` [Intel-gfx] " Hans de Goede
  2018-04-19 13:43       ` Hans de Goede
  0 siblings, 2 replies; 25+ messages in thread
From: Daniel Vetter @ 2018-04-05 13:26 UTC (permalink / raw)
  To: Hans de Goede; +Cc: Hans de Goede, intel-gfx, dri-devel, Rodrigo Vivi

On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote:
> Hi,
>
>
> On 05-04-18 09:14, Daniel Vetter wrote:
>>
>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
>>>
>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>> skipping the first 4k by passing 4096 as start of the address range
>>> passed
>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try
>>> and
>>> reserve the firmware framebuffer so that we can inherit it would always
>>> fail, as the firmware framebuffer starts at address 0.
>>>
>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>> everything >= gen8") says in its commit message: "This is confirmed to
>>> fix
>>> Skylake screen flickering issues (probably caused by the fact that we
>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>> this theory)."
>>>
>>> Which suggests that it is safe to use the first page for a linear
>>> framebuffer as the firmware is doing.
>>>
>>> This commit always passes 0 as start to drm_mm_init() and works around
>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>
>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>> which directly calls drm_mm_reserve_node() which now will be able to
>>> use the first 4k.
>>>
>>> This fixes the i915 driver no longer being able to inherit the firmware
>>> framebuffer on gen8+, which fixes the video output changing from the
>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>> (on systems without fbcon).
>>>
>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>
>>
>> I think this is worth a shot. The only explanation I can think of why the
>> GOP could get away with this and still follow the w/a is if it doesn't
>> have a 1:1 mapping between GGTT and stolen.
>
>
> My guess is that the GOP does not care about the w/a because the w/a
> states that the command-streamers sometimes do a spurious flush (write)
> to the first page, and the command-streamers are not used until much
> later, so the GOP is probably ignoring the w/a all together.
>
> At least that is my theory.

Atm we do not know whether the GOP actually implements the wa or not.
That it doesn't care is just a conjecture, but we have no proof. On
previous platforms the 1:1 mapping did hold, but there's no
fundamental reason why it sitll does.

>> Atm we hardcode that
>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset
>> really). And since we're not re-writing the ptes it's not noticeable.
>>
>> I think to decide whether this is the right approach we should
>> double-check whether that 1:1 assumption really holds true: Either read
>> back the ggtt ptes and check their addresses (but iirc on some platforms
>> their write-only, readback doesn't work), or we also rewrite the ptes
>> again for preallocated stuff, like when binding a normal object into the
>> gtt. If either of these approaches confirms that those affected gen8+
>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this
>> patch. If not, well then we at least know what to fix: We need to read out
>> the real stolen_offset, instead of making assumptions.
>
>
> I'm happy to give this a try on one or more affected machines, I can e.g.
> try this on both a skylake desktop and a cherry-trail based tablet.
>
> But I'm clueless about the whole PTE stuff, so I'm going to need someone
> to provide me with a patch following either approach. If readback of the
> PTEs works on skylake / cherrytrail I guess that would be the best way.
>
> Note to test this I'm currently booting the kernel with the machine's
> UEFI vendor logo still being displayed when efifb kicks in. I've added:
> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
> VC0 / VC1, so that the framebuffer contents stays intact, combined with
> the patch we are discussing now, this makes the vendor logo stay on
> the screen all the way till GDM loads, which looks rather nice actually :)
>
> And on shutdown we fall back to the original framebuffer and the vendor
> logo is back again. I cannot see any corruption in the display at either
> boot or shutdown time.

It shouldn't matter whether efifb takes over or not, it's still the
GOP's framebuffer we're taking over. Different content for sure, logo
is gone, but we only care about which pages we're using.

Wrt the code:
- (Re)binding happens by calling i915_vma_bind, if you put a call to
that at the end of the stolen_preallocated functions you should get
that. Ofc needs the logo still there so you can see it jump/get
mangled.
- GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page
or gen8_ggtt_insert_entries (which takes the vma). Since we only care
about the first entry a readq(dev_priv->ggtt->gsm) is all we really
need. If it's all 0 then it's not working (since at least
_PAGE_PRESENT should be set, plus the physical address of a page in
stolen memory).

I can do some patches for each of those, but a bit swamped. Also no
gen8 handy for testing I think to make sure it works, so would take a
few weeks probably.

Cheers, Daniel
>
>
>>
>> Cheers, Daniel
>>>
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
>>>   1 file changed, 6 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>> b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>> index af915d041281..ad949cc30928 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct
>>> drm_i915_private *dev_priv,
>>>         if (!drm_mm_initialized(&dev_priv->mm.stolen))
>>>                 return -ENODEV;
>>>   +     /* WaSkipStolenMemoryFirstPage:bdw+ */
>>> +       if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
>>> +               start = 4096;
>>> +
>>>         mutex_lock(&dev_priv->mm.stolen_lock);
>>>         ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
>>>                                           size, alignment, 0,
>>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private
>>> *dev_priv)
>>>   {
>>>         resource_size_t reserved_base, stolen_top;
>>>         resource_size_t reserved_total, reserved_size;
>>> -       resource_size_t stolen_usable_start;
>>>         mutex_init(&dev_priv->mm.stolen_lock);
>>>   @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private
>>> *dev_priv)
>>>                          (u64)resource_size(&dev_priv->dsm) >> 10,
>>>                          ((u64)resource_size(&dev_priv->dsm) -
>>> reserved_total) >> 10);
>>>   -     stolen_usable_start = 0;
>>> -       /* WaSkipStolenMemoryFirstPage:bdw+ */
>>> -       if (INTEL_GEN(dev_priv) >= 8)
>>> -               stolen_usable_start = 4096;
>>> -
>>>         dev_priv->stolen_usable_size =
>>> -               resource_size(&dev_priv->dsm) - reserved_total -
>>> stolen_usable_start;
>>> +               resource_size(&dev_priv->dsm) - reserved_total;
>>>         /* Basic memrange allocator for stolen space. */
>>> -       drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
>>> -                   dev_priv->stolen_usable_size);
>>> +       drm_mm_init(&dev_priv->mm.stolen, 0,
>>> dev_priv->stolen_usable_size);
>>>         return 0;
>>>   }
>>> --
>>> 2.17.0.rc2
>>>
>>> _______________________________________________
>>> Intel-gfx mailing list
>>> Intel-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>
>>
>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-05 13:26     ` Daniel Vetter
@ 2018-04-05 13:31       ` Hans de Goede
  2018-04-05 13:38         ` Daniel Vetter
  2018-04-19 13:43       ` Hans de Goede
  1 sibling, 1 reply; 25+ messages in thread
From: Hans de Goede @ 2018-04-05 13:31 UTC (permalink / raw)
  To: Daniel Vetter, Hans de Goede; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

Hi,

On 05-04-18 15:26, Daniel Vetter wrote:
> On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote:
>> Hi,
>>
>>
>> On 05-04-18 09:14, Daniel Vetter wrote:
>>>
>>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
>>>>
>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>> skipping the first 4k by passing 4096 as start of the address range
>>>> passed
>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try
>>>> and
>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>> fail, as the firmware framebuffer starts at address 0.
>>>>
>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>> everything >= gen8") says in its commit message: "This is confirmed to
>>>> fix
>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>> this theory)."
>>>>
>>>> Which suggests that it is safe to use the first page for a linear
>>>> framebuffer as the firmware is doing.
>>>>
>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>
>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>> use the first 4k.
>>>>
>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>> (on systems without fbcon).
>>>>
>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>
>>>
>>> I think this is worth a shot. The only explanation I can think of why the
>>> GOP could get away with this and still follow the w/a is if it doesn't
>>> have a 1:1 mapping between GGTT and stolen.
>>
>>
>> My guess is that the GOP does not care about the w/a because the w/a
>> states that the command-streamers sometimes do a spurious flush (write)
>> to the first page, and the command-streamers are not used until much
>> later, so the GOP is probably ignoring the w/a all together.
>>
>> At least that is my theory.
> 
> Atm we do not know whether the GOP actually implements the wa or not.
> That it doesn't care is just a conjecture, but we have no proof. On
> previous platforms the 1:1 mapping did hold, but there's no
> fundamental reason why it sitll does.
> 
>>> Atm we hardcode that
>>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
>>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset
>>> really). And since we're not re-writing the ptes it's not noticeable.
>>>
>>> I think to decide whether this is the right approach we should
>>> double-check whether that 1:1 assumption really holds true: Either read
>>> back the ggtt ptes and check their addresses (but iirc on some platforms
>>> their write-only, readback doesn't work), or we also rewrite the ptes
>>> again for preallocated stuff, like when binding a normal object into the
>>> gtt. If either of these approaches confirms that those affected gen8+
>>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this
>>> patch. If not, well then we at least know what to fix: We need to read out
>>> the real stolen_offset, instead of making assumptions.
>>
>>
>> I'm happy to give this a try on one or more affected machines, I can e.g.
>> try this on both a skylake desktop and a cherry-trail based tablet.
>>
>> But I'm clueless about the whole PTE stuff, so I'm going to need someone
>> to provide me with a patch following either approach. If readback of the
>> PTEs works on skylake / cherrytrail I guess that would be the best way.
>>
>> Note to test this I'm currently booting the kernel with the machine's
>> UEFI vendor logo still being displayed when efifb kicks in. I've added:
>> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
>> VC0 / VC1, so that the framebuffer contents stays intact, combined with
>> the patch we are discussing now, this makes the vendor logo stay on
>> the screen all the way till GDM loads, which looks rather nice actually :)
>>
>> And on shutdown we fall back to the original framebuffer and the vendor
>> logo is back again. I cannot see any corruption in the display at either
>> boot or shutdown time.
> 
> It shouldn't matter whether efifb takes over or not, it's still the
> GOP's framebuffer we're taking over. Different content for sure, logo
> is gone, but we only care about which pages we're using.

Right, I only mentioned this to explain that I'm not seeing any
(visible) corruption.

> Wrt the code:
> - (Re)binding happens by calling i915_vma_bind, if you put a call to
> that at the end of the stolen_preallocated functions you should get
> that. Ofc needs the logo still there so you can see it jump/get
> mangled.
> - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page
> or gen8_ggtt_insert_entries (which takes the vma). Since we only care
> about the first entry a readq(dev_priv->ggtt->gsm) is all we really
> need. If it's all 0 then it's not working (since at least
> _PAGE_PRESENT should be set, plus the physical address of a page in
> stolen memory).

Ok, I will try to give this a shot, probably not before coming
Monday though.

> I can do some patches for each of those, but a bit swamped. Also no
> gen8 handy for testing I think to make sure it works, so would take a
> few weeks probably.

I'll give this a shot myself first and attach the patch when I reply
with the testing results, so that you can verify that the patch works
as expected.

Regards,

Hans


> 
> Cheers, Daniel
>>
>>
>>>
>>> Cheers, Daniel
>>>>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
>>>>    1 file changed, 6 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>> b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>> index af915d041281..ad949cc30928 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct
>>>> drm_i915_private *dev_priv,
>>>>          if (!drm_mm_initialized(&dev_priv->mm.stolen))
>>>>                  return -ENODEV;
>>>>    +     /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>> +       if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
>>>> +               start = 4096;
>>>> +
>>>>          mutex_lock(&dev_priv->mm.stolen_lock);
>>>>          ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
>>>>                                            size, alignment, 0,
>>>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private
>>>> *dev_priv)
>>>>    {
>>>>          resource_size_t reserved_base, stolen_top;
>>>>          resource_size_t reserved_total, reserved_size;
>>>> -       resource_size_t stolen_usable_start;
>>>>          mutex_init(&dev_priv->mm.stolen_lock);
>>>>    @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private
>>>> *dev_priv)
>>>>                           (u64)resource_size(&dev_priv->dsm) >> 10,
>>>>                           ((u64)resource_size(&dev_priv->dsm) -
>>>> reserved_total) >> 10);
>>>>    -     stolen_usable_start = 0;
>>>> -       /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>> -       if (INTEL_GEN(dev_priv) >= 8)
>>>> -               stolen_usable_start = 4096;
>>>> -
>>>>          dev_priv->stolen_usable_size =
>>>> -               resource_size(&dev_priv->dsm) - reserved_total -
>>>> stolen_usable_start;
>>>> +               resource_size(&dev_priv->dsm) - reserved_total;
>>>>          /* Basic memrange allocator for stolen space. */
>>>> -       drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
>>>> -                   dev_priv->stolen_usable_size);
>>>> +       drm_mm_init(&dev_priv->mm.stolen, 0,
>>>> dev_priv->stolen_usable_size);
>>>>          return 0;
>>>>    }
>>>> --
>>>> 2.17.0.rc2
>>>>
>>>> _______________________________________________
>>>> Intel-gfx mailing list
>>>> Intel-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>
>>>
>>
> 
> 
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-05 13:31       ` [Intel-gfx] " Hans de Goede
@ 2018-04-05 13:38         ` Daniel Vetter
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2018-04-05 13:38 UTC (permalink / raw)
  To: Hans de Goede; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

On Thu, Apr 5, 2018 at 3:31 PM, Hans de Goede <hdegoede@redhat.com> wrote:
> Hi,
>
>
> On 05-04-18 15:26, Daniel Vetter wrote:
>>
>> On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote:
>>>
>>> Hi,
>>>
>>>
>>> On 05-04-18 09:14, Daniel Vetter wrote:
>>>>
>>>>
>>>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
>>>>>
>>>>>
>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>>> skipping the first 4k by passing 4096 as start of the address range
>>>>> passed
>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try
>>>>> and
>>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>>> fail, as the firmware framebuffer starts at address 0.
>>>>>
>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>>> everything >= gen8") says in its commit message: "This is confirmed to
>>>>> fix
>>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>>> initialized a ring in the first page of stolen, but I didn't 100%
>>>>> confirm
>>>>> this theory)."
>>>>>
>>>>> Which suggests that it is safe to use the first page for a linear
>>>>> framebuffer as the firmware is doing.
>>>>>
>>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>>> by insuring the start address passed by to
>>>>> drm_mm_insert_node_in_range()
>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>>
>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>>> use the first 4k.
>>>>>
>>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>>> (on systems without fbcon).
>>>>>
>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>>
>>>>
>>>>
>>>> I think this is worth a shot. The only explanation I can think of why
>>>> the
>>>> GOP could get away with this and still follow the w/a is if it doesn't
>>>> have a 1:1 mapping between GGTT and stolen.
>>>
>>>
>>>
>>> My guess is that the GOP does not care about the w/a because the w/a
>>> states that the command-streamers sometimes do a spurious flush (write)
>>> to the first page, and the command-streamers are not used until much
>>> later, so the GOP is probably ignoring the w/a all together.
>>>
>>> At least that is my theory.
>>
>>
>> Atm we do not know whether the GOP actually implements the wa or not.
>> That it doesn't care is just a conjecture, but we have no proof. On
>> previous platforms the 1:1 mapping did hold, but there's no
>> fundamental reason why it sitll does.
>>
>>>> Atm we hardcode that
>>>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned
>>>> as
>>>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset
>>>> really). And since we're not re-writing the ptes it's not noticeable.
>>>>
>>>> I think to decide whether this is the right approach we should
>>>> double-check whether that 1:1 assumption really holds true: Either read
>>>> back the ggtt ptes and check their addresses (but iirc on some platforms
>>>> their write-only, readback doesn't work), or we also rewrite the ptes
>>>> again for preallocated stuff, like when binding a normal object into the
>>>> gtt. If either of these approaches confirms that those affected gen8+
>>>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this
>>>> patch. If not, well then we at least know what to fix: We need to read
>>>> out
>>>> the real stolen_offset, instead of making assumptions.
>>>
>>>
>>>
>>> I'm happy to give this a try on one or more affected machines, I can e.g.
>>> try this on both a skylake desktop and a cherry-trail based tablet.
>>>
>>> But I'm clueless about the whole PTE stuff, so I'm going to need someone
>>> to provide me with a patch following either approach. If readback of the
>>> PTEs works on skylake / cherrytrail I guess that would be the best way.
>>>
>>> Note to test this I'm currently booting the kernel with the machine's
>>> UEFI vendor logo still being displayed when efifb kicks in. I've added:
>>> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
>>> VC0 / VC1, so that the framebuffer contents stays intact, combined with
>>> the patch we are discussing now, this makes the vendor logo stay on
>>> the screen all the way till GDM loads, which looks rather nice actually
>>> :)
>>>
>>> And on shutdown we fall back to the original framebuffer and the vendor
>>> logo is back again. I cannot see any corruption in the display at either
>>> boot or shutdown time.
>>
>>
>> It shouldn't matter whether efifb takes over or not, it's still the
>> GOP's framebuffer we're taking over. Different content for sure, logo
>> is gone, but we only care about which pages we're using.
>
>
> Right, I only mentioned this to explain that I'm not seeing any
> (visible) corruption.
>
>> Wrt the code:
>> - (Re)binding happens by calling i915_vma_bind, if you put a call to
>> that at the end of the stolen_preallocated functions you should get
>> that. Ofc needs the logo still there so you can see it jump/get
>> mangled.
>> - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page
>> or gen8_ggtt_insert_entries (which takes the vma). Since we only care
>> about the first entry a readq(dev_priv->ggtt->gsm) is all we really
>> need. If it's all 0 then it's not working (since at least
>> _PAGE_PRESENT should be set, plus the physical address of a page in
>> stolen memory).
>
>
> Ok, I will try to give this a shot, probably not before coming
> Monday though.

For the 2nd expirement we ofc also need the base physical address of
stolen. /proc/iomem should dump that already, but if it's not there
pls add a printk for that too. It's dev_priv->dsm.start, see
i915_pages_create_for_stolen.

>> I can do some patches for each of those, but a bit swamped. Also no
>> gen8 handy for testing I think to make sure it works, so would take a
>> few weeks probably.
>
>
> I'll give this a shot myself first and attach the patch when I reply
> with the testing results, so that you can verify that the patch works
> as expected.

Thanks a lot. I'll be travelling a bit later on next week, but I'll
try to take a look.

Cheers, Daniel
>
> Regards,
>
> Hans
>
>
>
>>
>> Cheers, Daniel
>>>
>>>
>>>
>>>>
>>>> Cheers, Daniel
>>>>>
>>>>>
>>>>> ---
>>>>>    drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
>>>>>    1 file changed, 6 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>> b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>> index af915d041281..ad949cc30928 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct
>>>>> drm_i915_private *dev_priv,
>>>>>          if (!drm_mm_initialized(&dev_priv->mm.stolen))
>>>>>                  return -ENODEV;
>>>>>    +     /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>>> +       if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
>>>>> +               start = 4096;
>>>>> +
>>>>>          mutex_lock(&dev_priv->mm.stolen_lock);
>>>>>          ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
>>>>>                                            size, alignment, 0,
>>>>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private
>>>>> *dev_priv)
>>>>>    {
>>>>>          resource_size_t reserved_base, stolen_top;
>>>>>          resource_size_t reserved_total, reserved_size;
>>>>> -       resource_size_t stolen_usable_start;
>>>>>          mutex_init(&dev_priv->mm.stolen_lock);
>>>>>    @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct
>>>>> drm_i915_private
>>>>> *dev_priv)
>>>>>                           (u64)resource_size(&dev_priv->dsm) >> 10,
>>>>>                           ((u64)resource_size(&dev_priv->dsm) -
>>>>> reserved_total) >> 10);
>>>>>    -     stolen_usable_start = 0;
>>>>> -       /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>>> -       if (INTEL_GEN(dev_priv) >= 8)
>>>>> -               stolen_usable_start = 4096;
>>>>> -
>>>>>          dev_priv->stolen_usable_size =
>>>>> -               resource_size(&dev_priv->dsm) - reserved_total -
>>>>> stolen_usable_start;
>>>>> +               resource_size(&dev_priv->dsm) - reserved_total;
>>>>>          /* Basic memrange allocator for stolen space. */
>>>>> -       drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
>>>>> -                   dev_priv->stolen_usable_size);
>>>>> +       drm_mm_init(&dev_priv->mm.stolen, 0,
>>>>> dev_priv->stolen_usable_size);
>>>>>          return 0;
>>>>>    }
>>>>> --
>>>>> 2.17.0.rc2
>>>>>
>>>>> _______________________________________________
>>>>> Intel-gfx mailing list
>>>>> Intel-gfx@lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-05 11:37                 ` Hans de Goede
@ 2018-04-06 16:06                   ` Ville Syrjälä
  2018-04-07  9:34                     ` Hans de Goede
  0 siblings, 1 reply; 25+ messages in thread
From: Ville Syrjälä @ 2018-04-06 16:06 UTC (permalink / raw)
  To: Hans de Goede; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

On Thu, Apr 05, 2018 at 01:37:31PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 04-04-18 22:49, Ville Syrjälä wrote:
> > On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote:
> >> Hi,
> >>
> >> On 04-04-18 17:50, Ville Syrjälä wrote:
> >>> On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote:
> >>>> Hi,
> >>>>
> >>>> On 30-03-18 15:25, Hans de Goede wrote:
> >>>>> Hi,
> >>>>>
> >>>>> On 30-03-18 14:44, Chris Wilson wrote:
> >>>>>> Quoting Hans de Goede (2018-03-30 13:37:40)
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> On 30-03-18 14:30, Chris Wilson wrote:
> >>>>>>>> Quoting Hans de Goede (2018-03-30 13:27:15)
> >>>>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> >>>>>>>>> skipping the first 4k by passing 4096 as start of the address range passed
> >>>>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
> >>>>>>>>> reserve the firmware framebuffer so that we can inherit it would always
> >>>>>>>>> fail, as the firmware framebuffer starts at address 0.
> >>>>>>>>>
> >>>>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> >>>>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix
> >>>>>>>>> Skylake screen flickering issues (probably caused by the fact that we
> >>>>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
> >>>>>>>>> this theory)."
> >>>>>>>>>
> >>>>>>>>> Which suggests that it is safe to use the first page for a linear
> >>>>>>>>> framebuffer as the firmware is doing.
> >>>>>>>>>
> >>>>>>>>> This commit always passes 0 as start to drm_mm_init() and works around
> >>>>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> >>>>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
> >>>>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
> >>>>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> >>>>>>>>> objects such as ring-buffers will not be allocated in the first 4k.
> >>>>>>>>>
> >>>>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
> >>>>>>>>> which directly calls drm_mm_reserve_node() which now will be able to
> >>>>>>>>> use the first 4k.
> >>>>>>>>>
> >>>>>>>>> This fixes the i915 driver no longer being able to inherit the firmware
> >>>>>>>>> framebuffer on gen8+, which fixes the video output changing from the
> >>>>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded
> >>>>>>>>> (on systems without fbcon).
> >>>>>>>>
> >>>>>>>> We've been told by the HW guys not to use the first page. (That's my
> >>>>>>>> understanding from every time this gets questioned.)
> >>>>>>>
> >>>>>>> Yet the GOP is happily using the first page. I think we may need to make
> >>>>>>> a difference here between the GPU not using the first page and the
> >>>>>>> display engine/pipeline not using the first page. Note that my patch
> >>>>>>> only influences the inheriting of the initial framebuffer as allocated
> >>>>>>> by the GOP. It does not influence any other allocations from the
> >>>>>>> reserved range, those will still all avoid the first page.
> >>>>>>>
> >>>>>>> Without this patch fastboot / flickerfree support is essentially broken
> >>>>>>> on any gen8+ hardware given that one of the goals of atomic is to be
> >>>>>>> able to do flickerfree transitions I think that this warrants a closer
> >>>>>>> look then just simply saying never use the first page.
> >>>>>>
> >>>>>> The concern is what else (i.e. nothing that we allocated ourselves) that
> >>>>>> may be in the first page...
> >>>>>
> >>>>> Given that the GOP has put its framebuffer there at least at boot there
> >>>>> is nothing there, otherwise it would show up on the display.
> >>>>>
> >>>>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c
> >>>>> and AFAIK that code is there because this inheriting the BIOS fb is
> >>>>> deemed an important feature. So I'm not happy at all with the handwavy
> >>>>> best to not touch it answer I'm getting to this patch.
> >>>>>
> >>>>> Unless there are some clear answer from the hardware folks which specifically
> >>>>> say we cannot put a framebuffer there (and then why is the GOP doing it?)
> >>>>> then I believe that the best way forward here is to get various people to
> >>>>> test with this patch and the best way to do that is probably to put it
> >>>>> in next. Note I deliberately did not add a Cc stable.
> >>>>
> >>>> To elaborate on this, the excluding of the first 4k of the stolen memory
> >>>> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
> >>>> which in turn causes intel_find_initial_plane_obj() to call
> >>>> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
> >>>> completely turns off the display which leads to a very ugly flickering
> >>>> of the display at boot (as well as replacing the vendor logo with a
> >>>> black screen).
> >>>>
> >>>> I think we can all agree that this behavior is undesirable and even a
> >>>> regression in behavior caused by the fix to exclude the first 4k.
> >>>>
> >>>> Chris, if my patch is not an acceptable way to fix this, then how do you
> >>>> suggest that we fix this?
> >>>>
> >>>> Digging a bit deeper I found this:
> >>>>
> >>>> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf
> >>>>
> >>>> Which says:
> >>>>
> >>>> "WaSkipStolenMemoryFirstPage:
> >>>>
> >>>> WA to skip the first page of stolen
> >>>> memory due to sporadic HW write on *CS Idle"
> >>>>
> >>>> And also about FBC:
> >>>>
> >>>> "First line of FBC getting corrupted when FBC
> >>>> compressed frame buffer offset is programmed to
> >>>> zero. Command streamers are doing flush writes to
> >>>> base of stolen.
> >>>> WA: New restriction to program FBC compressed
> >>>> frame buffer offset to at least 4KB."
> >>>>
> >>>> So using the first 4kB for the *framebuffer* as done by the GOP will
> >>>> not cause any major problems (freezes, hangs, etc.), and commit
> >>>> d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> >>>> everything >= gen8") was correct in deducing that the problem was
> >>>> likely that some *vital* information was being stored i the first 4k
> >>>> and that go overwritten.
> >>>>
> >>>> But the contents of the (first lines of) the framebuffer may become
> >>>> corrupted once we actually start using the command-streamers, which
> >>>> is still very much not wanted.
> >>>>
> >>>> In practice Xorg or Wayland will likely have setup another framebuffer
> >>>> by the time the command-streamers will start to get used.
> >>>>
> >>>> Alternatively we could start with inheriting the BIOS framebuffer
> >>>> (as my patch allows) so that we don't get the flicker and then soon
> >>>> afterwards atomically transit to a new framebuffer (which should
> >>>> contain a copy of the BIOS fb contents) at a different location.
> >>>
> >>> What I suggested long ago was to copy just the first page and adjust the
> >>> sg list. But I'm not sure if our stolen gem code would be happy with an
> >>> sg list with two entries instead of one.
> >>
> >> But that would still require an atomic-modeset to install the new sg-list,
> >> right?
> > 
> > Perhaps not. Not sure if the pte update would be atomic enough to just
> > change it underneath the display engine without ill effects, and then
> > do the equivalent of a page flip to invalidate the TLBs.
> > 
> >> Then we might just as well simply alloc a new fb and copy the
> >> contents over, or are you worried that with say a 4k fb that takes too
> >> much time? FWIW I can see how the single memcpy this involves will take
> >> some time, but I don't take it will take so long as to be a problem.
> > 
> > Mainly just a question of keeping it in stolen.
> 
> Ah I see.
> 
> > Assuming we want to keep
> > things in stolen, which is a matter of some debate as FBC needs stolen
> > and people might not be happy if it's all taken up by fbdev.
> > 
> >>
> >> Anyways I could use some help with implementing either solution as I'm
> >> not familiar with the involved parts of the code. I will happily test
> >> a patch for this. Keep in mind that for this to work my original patch
> >> will also be necessary so that the initial takeover of the firmware
> >> fb will work.
> > 
> > I guess the trickiest part would be getting both the old and new
> > location of the page mapped in the ggtt at the same time. Sadly you're
> > not allowed to access stolen directly. So I suppose this part would
> > involve some fairly low level frobbing of the ggtt ptes and a
> > manual ioremap() of the matching ranges of the aperture.
> 
> Hmm, you're talking about what needs to be done to copy the contents here,
> right?

Yeah.

> I have a feeling we really should just try only my patch first, as
> mentioned before the worst thing which can happen is some corruption
> of the first lines of the display, which I agree is not good, but also
> not the end of the world.

One can hope :)

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-06 16:06                   ` [Intel-gfx] " Ville Syrjälä
@ 2018-04-07  9:34                     ` Hans de Goede
  2018-04-09  8:30                       ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Hans de Goede @ 2018-04-07  9:34 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

Hi,

On 06-04-18 18:06, Ville Syrjälä wrote:
> On Thu, Apr 05, 2018 at 01:37:31PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 04-04-18 22:49, Ville Syrjälä wrote:
>>> On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 04-04-18 17:50, Ville Syrjälä wrote:
>>>>> On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 30-03-18 15:25, Hans de Goede wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 30-03-18 14:44, Chris Wilson wrote:
>>>>>>>> Quoting Hans de Goede (2018-03-30 13:37:40)
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 30-03-18 14:30, Chris Wilson wrote:
>>>>>>>>>> Quoting Hans de Goede (2018-03-30 13:27:15)
>>>>>>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>>>>>>>>> skipping the first 4k by passing 4096 as start of the address range passed
>>>>>>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
>>>>>>>>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>>>>>>>>> fail, as the firmware framebuffer starts at address 0.
>>>>>>>>>>>
>>>>>>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>>>>>>>>> everything >= gen8") says in its commit message: "This is confirmed to fix
>>>>>>>>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>>>>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>>>>>>>>> this theory)."
>>>>>>>>>>>
>>>>>>>>>>> Which suggests that it is safe to use the first page for a linear
>>>>>>>>>>> framebuffer as the firmware is doing.
>>>>>>>>>>>
>>>>>>>>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>>>>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>>>>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>>>>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>>>>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>>>>>>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>>>>>>>>
>>>>>>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>>>>>>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>>>>>>>>> use the first 4k.
>>>>>>>>>>>
>>>>>>>>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>>>>>>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>>>>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>>>>>>>>> (on systems without fbcon).
>>>>>>>>>>
>>>>>>>>>> We've been told by the HW guys not to use the first page. (That's my
>>>>>>>>>> understanding from every time this gets questioned.)
>>>>>>>>>
>>>>>>>>> Yet the GOP is happily using the first page. I think we may need to make
>>>>>>>>> a difference here between the GPU not using the first page and the
>>>>>>>>> display engine/pipeline not using the first page. Note that my patch
>>>>>>>>> only influences the inheriting of the initial framebuffer as allocated
>>>>>>>>> by the GOP. It does not influence any other allocations from the
>>>>>>>>> reserved range, those will still all avoid the first page.
>>>>>>>>>
>>>>>>>>> Without this patch fastboot / flickerfree support is essentially broken
>>>>>>>>> on any gen8+ hardware given that one of the goals of atomic is to be
>>>>>>>>> able to do flickerfree transitions I think that this warrants a closer
>>>>>>>>> look then just simply saying never use the first page.
>>>>>>>>
>>>>>>>> The concern is what else (i.e. nothing that we allocated ourselves) that
>>>>>>>> may be in the first page...
>>>>>>>
>>>>>>> Given that the GOP has put its framebuffer there at least at boot there
>>>>>>> is nothing there, otherwise it would show up on the display.
>>>>>>>
>>>>>>> We have a whole bunch of code to inherit the BIOS fb in intel_display.c
>>>>>>> and AFAIK that code is there because this inheriting the BIOS fb is
>>>>>>> deemed an important feature. So I'm not happy at all with the handwavy
>>>>>>> best to not touch it answer I'm getting to this patch.
>>>>>>>
>>>>>>> Unless there are some clear answer from the hardware folks which specifically
>>>>>>> say we cannot put a framebuffer there (and then why is the GOP doing it?)
>>>>>>> then I believe that the best way forward here is to get various people to
>>>>>>> test with this patch and the best way to do that is probably to put it
>>>>>>> in next. Note I deliberately did not add a Cc stable.
>>>>>>
>>>>>> To elaborate on this, the excluding of the first 4k of the stolen memory
>>>>>> region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
>>>>>> which in turn causes intel_find_initial_plane_obj() to call
>>>>>> intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
>>>>>> completely turns off the display which leads to a very ugly flickering
>>>>>> of the display at boot (as well as replacing the vendor logo with a
>>>>>> black screen).
>>>>>>
>>>>>> I think we can all agree that this behavior is undesirable and even a
>>>>>> regression in behavior caused by the fix to exclude the first 4k.
>>>>>>
>>>>>> Chris, if my patch is not an acceptable way to fix this, then how do you
>>>>>> suggest that we fix this?
>>>>>>
>>>>>> Digging a bit deeper I found this:
>>>>>>
>>>>>> https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf
>>>>>>
>>>>>> Which says:
>>>>>>
>>>>>> "WaSkipStolenMemoryFirstPage:
>>>>>>
>>>>>> WA to skip the first page of stolen
>>>>>> memory due to sporadic HW write on *CS Idle"
>>>>>>
>>>>>> And also about FBC:
>>>>>>
>>>>>> "First line of FBC getting corrupted when FBC
>>>>>> compressed frame buffer offset is programmed to
>>>>>> zero. Command streamers are doing flush writes to
>>>>>> base of stolen.
>>>>>> WA: New restriction to program FBC compressed
>>>>>> frame buffer offset to at least 4KB."
>>>>>>
>>>>>> So using the first 4kB for the *framebuffer* as done by the GOP will
>>>>>> not cause any major problems (freezes, hangs, etc.), and commit
>>>>>> d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>>>> everything >= gen8") was correct in deducing that the problem was
>>>>>> likely that some *vital* information was being stored i the first 4k
>>>>>> and that go overwritten.
>>>>>>
>>>>>> But the contents of the (first lines of) the framebuffer may become
>>>>>> corrupted once we actually start using the command-streamers, which
>>>>>> is still very much not wanted.
>>>>>>
>>>>>> In practice Xorg or Wayland will likely have setup another framebuffer
>>>>>> by the time the command-streamers will start to get used.
>>>>>>
>>>>>> Alternatively we could start with inheriting the BIOS framebuffer
>>>>>> (as my patch allows) so that we don't get the flicker and then soon
>>>>>> afterwards atomically transit to a new framebuffer (which should
>>>>>> contain a copy of the BIOS fb contents) at a different location.
>>>>>
>>>>> What I suggested long ago was to copy just the first page and adjust the
>>>>> sg list. But I'm not sure if our stolen gem code would be happy with an
>>>>> sg list with two entries instead of one.
>>>>
>>>> But that would still require an atomic-modeset to install the new sg-list,
>>>> right?
>>>
>>> Perhaps not. Not sure if the pte update would be atomic enough to just
>>> change it underneath the display engine without ill effects, and then
>>> do the equivalent of a page flip to invalidate the TLBs.
>>>
>>>> Then we might just as well simply alloc a new fb and copy the
>>>> contents over, or are you worried that with say a 4k fb that takes too
>>>> much time? FWIW I can see how the single memcpy this involves will take
>>>> some time, but I don't take it will take so long as to be a problem.
>>>
>>> Mainly just a question of keeping it in stolen.
>>
>> Ah I see.
>>
>>> Assuming we want to keep
>>> things in stolen, which is a matter of some debate as FBC needs stolen
>>> and people might not be happy if it's all taken up by fbdev.
>>>
>>>>
>>>> Anyways I could use some help with implementing either solution as I'm
>>>> not familiar with the involved parts of the code. I will happily test
>>>> a patch for this. Keep in mind that for this to work my original patch
>>>> will also be necessary so that the initial takeover of the firmware
>>>> fb will work.
>>>
>>> I guess the trickiest part would be getting both the old and new
>>> location of the page mapped in the ggtt at the same time. Sadly you're
>>> not allowed to access stolen directly. So I suppose this part would
>>> involve some fairly low level frobbing of the ggtt ptes and a
>>> manual ioremap() of the matching ranges of the aperture.
>>
>> Hmm, you're talking about what needs to be done to copy the contents here,
>> right?
> 
> Yeah.

So thinking more about this, for the old / BIOS framebuffer there
already is a mapping setup for efifb use and for the new one we
should also already set up a mapping when we create it, so I think
we can really just do a memcpy here after creating a new framebuffer?

Anyways I will run the tests Daniel asked me to run first.

Regards,

Hans
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-07  9:34                     ` Hans de Goede
@ 2018-04-09  8:30                       ` Daniel Vetter
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2018-04-09  8:30 UTC (permalink / raw)
  To: Hans de Goede; +Cc: intel-gfx, dri-devel, Rodrigo Vivi

On Sat, Apr 07, 2018 at 11:34:57AM +0200, Hans de Goede wrote:
> Hi,
> 
> On 06-04-18 18:06, Ville Syrjälä wrote:
> > On Thu, Apr 05, 2018 at 01:37:31PM +0200, Hans de Goede wrote:
> > > Hi,
> > > 
> > > On 04-04-18 22:49, Ville Syrjälä wrote:
> > > > On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote:
> > > > > Hi,
> > > > > 
> > > > > On 04-04-18 17:50, Ville Syrjälä wrote:
> > > > > > On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On 30-03-18 15:25, Hans de Goede wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > On 30-03-18 14:44, Chris Wilson wrote:
> > > > > > > > > Quoting Hans de Goede (2018-03-30 13:37:40)
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > On 30-03-18 14:30, Chris Wilson wrote:
> > > > > > > > > > > Quoting Hans de Goede (2018-03-30 13:27:15)
> > > > > > > > > > > > Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> > > > > > > > > > > > skipping the first 4k by passing 4096 as start of the address range passed
> > > > > > > > > > > > to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
> > > > > > > > > > > > reserve the firmware framebuffer so that we can inherit it would always
> > > > > > > > > > > > fail, as the firmware framebuffer starts at address 0.
> > > > > > > > > > > > 
> > > > > > > > > > > > Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> > > > > > > > > > > > everything >= gen8") says in its commit message: "This is confirmed to fix
> > > > > > > > > > > > Skylake screen flickering issues (probably caused by the fact that we
> > > > > > > > > > > > initialized a ring in the first page of stolen, but I didn't 100% confirm
> > > > > > > > > > > > this theory)."
> > > > > > > > > > > > 
> > > > > > > > > > > > Which suggests that it is safe to use the first page for a linear
> > > > > > > > > > > > framebuffer as the firmware is doing.
> > > > > > > > > > > > 
> > > > > > > > > > > > This commit always passes 0 as start to drm_mm_init() and works around
> > > > > > > > > > > > WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> > > > > > > > > > > > by insuring the start address passed by to drm_mm_insert_node_in_range()
> > > > > > > > > > > > is always 4k or more. All entry points to i915_gem_stolen.c go through
> > > > > > > > > > > > i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> > > > > > > > > > > > objects such as ring-buffers will not be allocated in the first 4k.
> > > > > > > > > > > > 
> > > > > > > > > > > > The one exception is i915_gem_object_create_stolen_for_preallocated()
> > > > > > > > > > > > which directly calls drm_mm_reserve_node() which now will be able to
> > > > > > > > > > > > use the first 4k.
> > > > > > > > > > > > 
> > > > > > > > > > > > This fixes the i915 driver no longer being able to inherit the firmware
> > > > > > > > > > > > framebuffer on gen8+, which fixes the video output changing from the
> > > > > > > > > > > > vendor logo to a black screen as soon as the i915 driver is loaded
> > > > > > > > > > > > (on systems without fbcon).
> > > > > > > > > > > 
> > > > > > > > > > > We've been told by the HW guys not to use the first page. (That's my
> > > > > > > > > > > understanding from every time this gets questioned.)
> > > > > > > > > > 
> > > > > > > > > > Yet the GOP is happily using the first page. I think we may need to make
> > > > > > > > > > a difference here between the GPU not using the first page and the
> > > > > > > > > > display engine/pipeline not using the first page. Note that my patch
> > > > > > > > > > only influences the inheriting of the initial framebuffer as allocated
> > > > > > > > > > by the GOP. It does not influence any other allocations from the
> > > > > > > > > > reserved range, those will still all avoid the first page.
> > > > > > > > > > 
> > > > > > > > > > Without this patch fastboot / flickerfree support is essentially broken
> > > > > > > > > > on any gen8+ hardware given that one of the goals of atomic is to be
> > > > > > > > > > able to do flickerfree transitions I think that this warrants a closer
> > > > > > > > > > look then just simply saying never use the first page.
> > > > > > > > > 
> > > > > > > > > The concern is what else (i.e. nothing that we allocated ourselves) that
> > > > > > > > > may be in the first page...
> > > > > > > > 
> > > > > > > > Given that the GOP has put its framebuffer there at least at boot there
> > > > > > > > is nothing there, otherwise it would show up on the display.
> > > > > > > > 
> > > > > > > > We have a whole bunch of code to inherit the BIOS fb in intel_display.c
> > > > > > > > and AFAIK that code is there because this inheriting the BIOS fb is
> > > > > > > > deemed an important feature. So I'm not happy at all with the handwavy
> > > > > > > > best to not touch it answer I'm getting to this patch.
> > > > > > > > 
> > > > > > > > Unless there are some clear answer from the hardware folks which specifically
> > > > > > > > say we cannot put a framebuffer there (and then why is the GOP doing it?)
> > > > > > > > then I believe that the best way forward here is to get various people to
> > > > > > > > test with this patch and the best way to do that is probably to put it
> > > > > > > > in next. Note I deliberately did not add a Cc stable.
> > > > > > > 
> > > > > > > To elaborate on this, the excluding of the first 4k of the stolen memory
> > > > > > > region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
> > > > > > > which in turn causes intel_find_initial_plane_obj() to call
> > > > > > > intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
> > > > > > > completely turns off the display which leads to a very ugly flickering
> > > > > > > of the display at boot (as well as replacing the vendor logo with a
> > > > > > > black screen).
> > > > > > > 
> > > > > > > I think we can all agree that this behavior is undesirable and even a
> > > > > > > regression in behavior caused by the fix to exclude the first 4k.
> > > > > > > 
> > > > > > > Chris, if my patch is not an acceptable way to fix this, then how do you
> > > > > > > suggest that we fix this?
> > > > > > > 
> > > > > > > Digging a bit deeper I found this:
> > > > > > > 
> > > > > > > https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf
> > > > > > > 
> > > > > > > Which says:
> > > > > > > 
> > > > > > > "WaSkipStolenMemoryFirstPage:
> > > > > > > 
> > > > > > > WA to skip the first page of stolen
> > > > > > > memory due to sporadic HW write on *CS Idle"
> > > > > > > 
> > > > > > > And also about FBC:
> > > > > > > 
> > > > > > > "First line of FBC getting corrupted when FBC
> > > > > > > compressed frame buffer offset is programmed to
> > > > > > > zero. Command streamers are doing flush writes to
> > > > > > > base of stolen.
> > > > > > > WA: New restriction to program FBC compressed
> > > > > > > frame buffer offset to at least 4KB."
> > > > > > > 
> > > > > > > So using the first 4kB for the *framebuffer* as done by the GOP will
> > > > > > > not cause any major problems (freezes, hangs, etc.), and commit
> > > > > > > d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> > > > > > > everything >= gen8") was correct in deducing that the problem was
> > > > > > > likely that some *vital* information was being stored i the first 4k
> > > > > > > and that go overwritten.
> > > > > > > 
> > > > > > > But the contents of the (first lines of) the framebuffer may become
> > > > > > > corrupted once we actually start using the command-streamers, which
> > > > > > > is still very much not wanted.
> > > > > > > 
> > > > > > > In practice Xorg or Wayland will likely have setup another framebuffer
> > > > > > > by the time the command-streamers will start to get used.
> > > > > > > 
> > > > > > > Alternatively we could start with inheriting the BIOS framebuffer
> > > > > > > (as my patch allows) so that we don't get the flicker and then soon
> > > > > > > afterwards atomically transit to a new framebuffer (which should
> > > > > > > contain a copy of the BIOS fb contents) at a different location.
> > > > > > 
> > > > > > What I suggested long ago was to copy just the first page and adjust the
> > > > > > sg list. But I'm not sure if our stolen gem code would be happy with an
> > > > > > sg list with two entries instead of one.
> > > > > 
> > > > > But that would still require an atomic-modeset to install the new sg-list,
> > > > > right?
> > > > 
> > > > Perhaps not. Not sure if the pte update would be atomic enough to just
> > > > change it underneath the display engine without ill effects, and then
> > > > do the equivalent of a page flip to invalidate the TLBs.
> > > > 
> > > > > Then we might just as well simply alloc a new fb and copy the
> > > > > contents over, or are you worried that with say a 4k fb that takes too
> > > > > much time? FWIW I can see how the single memcpy this involves will take
> > > > > some time, but I don't take it will take so long as to be a problem.
> > > > 
> > > > Mainly just a question of keeping it in stolen.
> > > 
> > > Ah I see.
> > > 
> > > > Assuming we want to keep
> > > > things in stolen, which is a matter of some debate as FBC needs stolen
> > > > and people might not be happy if it's all taken up by fbdev.
> > > > 
> > > > > 
> > > > > Anyways I could use some help with implementing either solution as I'm
> > > > > not familiar with the involved parts of the code. I will happily test
> > > > > a patch for this. Keep in mind that for this to work my original patch
> > > > > will also be necessary so that the initial takeover of the firmware
> > > > > fb will work.
> > > > 
> > > > I guess the trickiest part would be getting both the old and new
> > > > location of the page mapped in the ggtt at the same time. Sadly you're
> > > > not allowed to access stolen directly. So I suppose this part would
> > > > involve some fairly low level frobbing of the ggtt ptes and a
> > > > manual ioremap() of the matching ranges of the aperture.
> > > 
> > > Hmm, you're talking about what needs to be done to copy the contents here,
> > > right?
> > 
> > Yeah.
> 
> So thinking more about this, for the old / BIOS framebuffer there
> already is a mapping setup for efifb use and for the new one we
> should also already set up a mapping when we create it, so I think
> we can really just do a memcpy here after creating a new framebuffer?
> 
> Anyways I will run the tests Daniel asked me to run first.

Yeah let's not invent solutions for problems we might not even have :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-05 13:26     ` Daniel Vetter
  2018-04-05 13:31       ` [Intel-gfx] " Hans de Goede
@ 2018-04-19 13:43       ` Hans de Goede
  2018-04-20  7:27         ` Daniel Vetter
  1 sibling, 1 reply; 25+ messages in thread
From: Hans de Goede @ 2018-04-19 13:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Hans de Goede, intel-gfx, dri-devel, Rodrigo Vivi

Hi,

On 05-04-18 15:26, Daniel Vetter wrote:
> On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote:
>> Hi,
>>
>>
>> On 05-04-18 09:14, Daniel Vetter wrote:
>>>
>>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
>>>>
>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>> skipping the first 4k by passing 4096 as start of the address range
>>>> passed
>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try
>>>> and
>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>> fail, as the firmware framebuffer starts at address 0.
>>>>
>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>> everything >= gen8") says in its commit message: "This is confirmed to
>>>> fix
>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>> this theory)."
>>>>
>>>> Which suggests that it is safe to use the first page for a linear
>>>> framebuffer as the firmware is doing.
>>>>
>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>
>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>> use the first 4k.
>>>>
>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>> (on systems without fbcon).
>>>>
>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>
>>>
>>> I think this is worth a shot. The only explanation I can think of why the
>>> GOP could get away with this and still follow the w/a is if it doesn't
>>> have a 1:1 mapping between GGTT and stolen.
>>
>>
>> My guess is that the GOP does not care about the w/a because the w/a
>> states that the command-streamers sometimes do a spurious flush (write)
>> to the first page, and the command-streamers are not used until much
>> later, so the GOP is probably ignoring the w/a all together.
>>
>> At least that is my theory.
> 
> Atm we do not know whether the GOP actually implements the wa or not.
> That it doesn't care is just a conjecture, but we have no proof. On
> previous platforms the 1:1 mapping did hold, but there's no
> fundamental reason why it sitll does.

Right.

>>> Atm we hardcode that
>>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
>>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset
>>> really). And since we're not re-writing the ptes it's not noticeable.
>>>
>>> I think to decide whether this is the right approach we should
>>> double-check whether that 1:1 assumption really holds true: Either read
>>> back the ggtt ptes and check their addresses (but iirc on some platforms
>>> their write-only, readback doesn't work), or we also rewrite the ptes
>>> again for preallocated stuff, like when binding a normal object into the
>>> gtt. If either of these approaches confirms that those affected gen8+
>>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this
>>> patch. If not, well then we at least know what to fix: We need to read out
>>> the real stolen_offset, instead of making assumptions.
>>
>>
>> I'm happy to give this a try on one or more affected machines, I can e.g.
>> try this on both a skylake desktop and a cherry-trail based tablet.
>>
>> But I'm clueless about the whole PTE stuff, so I'm going to need someone
>> to provide me with a patch following either approach. If readback of the
>> PTEs works on skylake / cherrytrail I guess that would be the best way.
>>
>> Note to test this I'm currently booting the kernel with the machine's
>> UEFI vendor logo still being displayed when efifb kicks in. I've added:
>> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
>> VC0 / VC1, so that the framebuffer contents stays intact, combined with
>> the patch we are discussing now, this makes the vendor logo stay on
>> the screen all the way till GDM loads, which looks rather nice actually :)
>>
>> And on shutdown we fall back to the original framebuffer and the vendor
>> logo is back again. I cannot see any corruption in the display at either
>> boot or shutdown time.
> 
> It shouldn't matter whether efifb takes over or not, it's still the
> GOP's framebuffer we're taking over. Different content for sure, logo
> is gone, but we only care about which pages we're using.
> 
> Wrt the code:
> - (Re)binding happens by calling i915_vma_bind, if you put a call to
> that at the end of the stolen_preallocated functions you should get
> that. Ofc needs the logo still there so you can see it jump/get
> mangled.
> - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page
> or gen8_ggtt_insert_entries (which takes the vma). Since we only care
> about the first entry a readq(dev_priv->ggtt->gsm) is all we really
> need. If it's all 0 then it's not working (since at least
> _PAGE_PRESENT should be set, plus the physical address of a page in
> stolen memory).
> 
> I can do some patches for each of those, but a bit swamped. Also no
> gen8 handy for testing I think to make sure it works, so would take a
> few weeks probably.

I was a bit swamped with other stuff myself too, but I'm back to working
on this now. Below is the debug patch I came up with based on what you
wrote above.

This results in the following debug output:

[    1.651141] first ggtt entry before bind: 0x0000000078c00001
[    1.651151] i915_vma_bind ret 0
[    1.651152] first ggtt entry after bind: 0x0000000078c00083

And "sudo cat /proc/iomem | grep Stolen" gives:
   78c00000-88bfffff : Graphics Stolen Memory

I see no visual changes with this patch (BIOS vendor logo still
stays in place when using fbcon=vc:2-62 to make fbcon not bind
on boot and keep the vendor logo intact).

The address of the first ggtt entry matches with the start of stolen
and the i915_vma_bind call only changes the first gtt entry's flags,
or-ing in _PAGE_RW (BIT(1)) and PPAT_CACHED (BIT(7)), which perfectly
matches what we would expect based on gen8_pte_encode()'s behavior.

So it seems that the GOP indeed does NOT implement the wa and the i915's
code assuming a linear mapping at the start of stolen for the BIOS fb
still holds true.

The debug output is from my Skylake based machine which uses an
Asrock B150M Pro4S/D3 mainboard.

I've also tested this on a Cherry Trail based device (a GPD Win)
with identical results (the flags are 0x1b after the vma_bind
on CHT, which matches with I915_CACHE_NONE).

Regards,

Hans


 From 400f2b0d722236999a5706a1f2fa0bb70bf9c73f Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede@redhat.com>
Date: Thu, 19 Apr 2018 14:44:43 +0200
Subject: [PATCH resend] i915: i915_vma_bind for pre-allocated stolen test /
  debug

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
  drivers/gpu/drm/i915/i915_gem_stolen.c | 7 +++++++
  1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 152487d91197..7db03f7d2c51 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -648,6 +648,13 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
  	obj->bind_count++;
  	spin_unlock(&dev_priv->mm.obj_lock);

+	pr_err("first ggtt entry before bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
+	ret = i915_vma_bind(vma,
+		    HAS_LLC(dev_priv) ? I915_CACHE_LLC : I915_CACHE_NONE,
+		    PIN_UPDATE);
+	pr_err("i915_vma_bind ret %d\n", ret);
+	pr_err("first ggtt entry after bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
+
  	return obj;

  err_pages:
-- 
2.17.0



> 
> Cheers, Daniel
>>
>>
>>>
>>> Cheers, Daniel
>>>>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
>>>>    1 file changed, 6 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>> b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>> index af915d041281..ad949cc30928 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct
>>>> drm_i915_private *dev_priv,
>>>>          if (!drm_mm_initialized(&dev_priv->mm.stolen))
>>>>                  return -ENODEV;
>>>>    +     /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>> +       if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
>>>> +               start = 4096;
>>>> +
>>>>          mutex_lock(&dev_priv->mm.stolen_lock);
>>>>          ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
>>>>                                            size, alignment, 0,
>>>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private
>>>> *dev_priv)
>>>>    {
>>>>          resource_size_t reserved_base, stolen_top;
>>>>          resource_size_t reserved_total, reserved_size;
>>>> -       resource_size_t stolen_usable_start;
>>>>          mutex_init(&dev_priv->mm.stolen_lock);
>>>>    @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private
>>>> *dev_priv)
>>>>                           (u64)resource_size(&dev_priv->dsm) >> 10,
>>>>                           ((u64)resource_size(&dev_priv->dsm) -
>>>> reserved_total) >> 10);
>>>>    -     stolen_usable_start = 0;
>>>> -       /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>> -       if (INTEL_GEN(dev_priv) >= 8)
>>>> -               stolen_usable_start = 4096;
>>>> -
>>>>          dev_priv->stolen_usable_size =
>>>> -               resource_size(&dev_priv->dsm) - reserved_total -
>>>> stolen_usable_start;
>>>> +               resource_size(&dev_priv->dsm) - reserved_total;
>>>>          /* Basic memrange allocator for stolen space. */
>>>> -       drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
>>>> -                   dev_priv->stolen_usable_size);
>>>> +       drm_mm_init(&dev_priv->mm.stolen, 0,
>>>> dev_priv->stolen_usable_size);
>>>>          return 0;
>>>>    }
>>>> --
>>>> 2.17.0.rc2
>>>>
>>>> _______________________________________________
>>>> Intel-gfx mailing list
>>>> Intel-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>
>>>
>>
> 
> 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers (rev2)
  2018-03-30 12:27 [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers Hans de Goede
                   ` (3 preceding siblings ...)
  2018-04-05  7:14 ` [PATCH] " Daniel Vetter
@ 2018-04-19 13:54 ` Patchwork
  4 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2018-04-19 13:54 UTC (permalink / raw)
  To: Hans de Goede; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers (rev2)
URL   : https://patchwork.freedesktop.org/series/40929/
State : failure

== Summary ==

Applying: drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
error: sha1 information is lacking or useless (drivers/gpu/drm/i915/i915_gem_stolen.c).
error: could not build fake ancestor
Patch failed at 0001 drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-19 13:43       ` Hans de Goede
@ 2018-04-20  7:27         ` Daniel Vetter
  2018-04-20 10:19           ` Hans de Goede
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2018-04-20  7:27 UTC (permalink / raw)
  To: Hans de Goede; +Cc: Hans de Goede, intel-gfx, dri-devel, Rodrigo Vivi

On Thu, Apr 19, 2018 at 03:43:19PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 05-04-18 15:26, Daniel Vetter wrote:
> > On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote:
> > > Hi,
> > > 
> > > 
> > > On 05-04-18 09:14, Daniel Vetter wrote:
> > > > 
> > > > On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
> > > > > 
> > > > > Before this commit the WaSkipStolenMemoryFirstPage workaround code was
> > > > > skipping the first 4k by passing 4096 as start of the address range
> > > > > passed
> > > > > to drm_mm_init(). This means that calling drm_mm_reserve_node() to try
> > > > > and
> > > > > reserve the firmware framebuffer so that we can inherit it would always
> > > > > fail, as the firmware framebuffer starts at address 0.
> > > > > 
> > > > > Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
> > > > > everything >= gen8") says in its commit message: "This is confirmed to
> > > > > fix
> > > > > Skylake screen flickering issues (probably caused by the fact that we
> > > > > initialized a ring in the first page of stolen, but I didn't 100% confirm
> > > > > this theory)."
> > > > > 
> > > > > Which suggests that it is safe to use the first page for a linear
> > > > > framebuffer as the firmware is doing.
> > > > > 
> > > > > This commit always passes 0 as start to drm_mm_init() and works around
> > > > > WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
> > > > > by insuring the start address passed by to drm_mm_insert_node_in_range()
> > > > > is always 4k or more. All entry points to i915_gem_stolen.c go through
> > > > > i915_gem_stolen_insert_node_in_range(), so that any newly allocated
> > > > > objects such as ring-buffers will not be allocated in the first 4k.
> > > > > 
> > > > > The one exception is i915_gem_object_create_stolen_for_preallocated()
> > > > > which directly calls drm_mm_reserve_node() which now will be able to
> > > > > use the first 4k.
> > > > > 
> > > > > This fixes the i915 driver no longer being able to inherit the firmware
> > > > > framebuffer on gen8+, which fixes the video output changing from the
> > > > > vendor logo to a black screen as soon as the i915 driver is loaded
> > > > > (on systems without fbcon).
> > > > > 
> > > > > Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> > > > 
> > > > 
> > > > I think this is worth a shot. The only explanation I can think of why the
> > > > GOP could get away with this and still follow the w/a is if it doesn't
> > > > have a 1:1 mapping between GGTT and stolen.
> > > 
> > > 
> > > My guess is that the GOP does not care about the w/a because the w/a
> > > states that the command-streamers sometimes do a spurious flush (write)
> > > to the first page, and the command-streamers are not used until much
> > > later, so the GOP is probably ignoring the w/a all together.
> > > 
> > > At least that is my theory.
> > 
> > Atm we do not know whether the GOP actually implements the wa or not.
> > That it doesn't care is just a conjecture, but we have no proof. On
> > previous platforms the 1:1 mapping did hold, but there's no
> > fundamental reason why it sitll does.
> 
> Right.
> 
> > > > Atm we hardcode that
> > > > assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
> > > > both the stolen_offset and the gtt_offset (but it's only the gtt_offset
> > > > really). And since we're not re-writing the ptes it's not noticeable.
> > > > 
> > > > I think to decide whether this is the right approach we should
> > > > double-check whether that 1:1 assumption really holds true: Either read
> > > > back the ggtt ptes and check their addresses (but iirc on some platforms
> > > > their write-only, readback doesn't work), or we also rewrite the ptes
> > > > again for preallocated stuff, like when binding a normal object into the
> > > > gtt. If either of these approaches confirms that those affected gen8+
> > > > machines still use the 1:1 mapping, then I'm happy to put my r-b on this
> > > > patch. If not, well then we at least know what to fix: We need to read out
> > > > the real stolen_offset, instead of making assumptions.
> > > 
> > > 
> > > I'm happy to give this a try on one or more affected machines, I can e.g.
> > > try this on both a skylake desktop and a cherry-trail based tablet.
> > > 
> > > But I'm clueless about the whole PTE stuff, so I'm going to need someone
> > > to provide me with a patch following either approach. If readback of the
> > > PTEs works on skylake / cherrytrail I guess that would be the best way.
> > > 
> > > Note to test this I'm currently booting the kernel with the machine's
> > > UEFI vendor logo still being displayed when efifb kicks in. I've added:
> > > "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
> > > VC0 / VC1, so that the framebuffer contents stays intact, combined with
> > > the patch we are discussing now, this makes the vendor logo stay on
> > > the screen all the way till GDM loads, which looks rather nice actually :)
> > > 
> > > And on shutdown we fall back to the original framebuffer and the vendor
> > > logo is back again. I cannot see any corruption in the display at either
> > > boot or shutdown time.
> > 
> > It shouldn't matter whether efifb takes over or not, it's still the
> > GOP's framebuffer we're taking over. Different content for sure, logo
> > is gone, but we only care about which pages we're using.
> > 
> > Wrt the code:
> > - (Re)binding happens by calling i915_vma_bind, if you put a call to
> > that at the end of the stolen_preallocated functions you should get
> > that. Ofc needs the logo still there so you can see it jump/get
> > mangled.
> > - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page
> > or gen8_ggtt_insert_entries (which takes the vma). Since we only care
> > about the first entry a readq(dev_priv->ggtt->gsm) is all we really
> > need. If it's all 0 then it's not working (since at least
> > _PAGE_PRESENT should be set, plus the physical address of a page in
> > stolen memory).
> > 
> > I can do some patches for each of those, but a bit swamped. Also no
> > gen8 handy for testing I think to make sure it works, so would take a
> > few weeks probably.
> 
> I was a bit swamped with other stuff myself too, but I'm back to working
> on this now. Below is the debug patch I came up with based on what you
> wrote above.
> 
> This results in the following debug output:
> 
> [    1.651141] first ggtt entry before bind: 0x0000000078c00001
> [    1.651151] i915_vma_bind ret 0
> [    1.651152] first ggtt entry after bind: 0x0000000078c00083
> 
> And "sudo cat /proc/iomem | grep Stolen" gives:
>   78c00000-88bfffff : Graphics Stolen Memory
> 
> I see no visual changes with this patch (BIOS vendor logo still
> stays in place when using fbcon=vc:2-62 to make fbcon not bind
> on boot and keep the vendor logo intact).
> 
> The address of the first ggtt entry matches with the start of stolen
> and the i915_vma_bind call only changes the first gtt entry's flags,
> or-ing in _PAGE_RW (BIT(1)) and PPAT_CACHED (BIT(7)), which perfectly
> matches what we would expect based on gen8_pte_encode()'s behavior.
> 
> So it seems that the GOP indeed does NOT implement the wa and the i915's
> code assuming a linear mapping at the start of stolen for the BIOS fb
> still holds true.
> 
> The debug output is from my Skylake based machine which uses an
> Asrock B150M Pro4S/D3 mainboard.
> 
> I've also tested this on a Cherry Trail based device (a GPD Win)
> with identical results (the flags are 0x1b after the vma_bind
> on CHT, which matches with I915_CACHE_NONE).

tbh, not what I expected, so good we've done the experiment to confirm :-)

With the above information added to the commit message (you can just paste
your entire message, including test diff to the commit message) the
original patch has my:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Thanks for doing all the digging here.

Cheers, Daniel

> 
> Regards,
> 
> Hans
> 
> 
> From 400f2b0d722236999a5706a1f2fa0bb70bf9c73f Mon Sep 17 00:00:00 2001
> From: Hans de Goede <hdegoede@redhat.com>
> Date: Thu, 19 Apr 2018 14:44:43 +0200
> Subject: [PATCH resend] i915: i915_vma_bind for pre-allocated stolen test /
>  debug
> 
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_stolen.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 152487d91197..7db03f7d2c51 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -648,6 +648,13 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
>  	obj->bind_count++;
>  	spin_unlock(&dev_priv->mm.obj_lock);
> 
> +	pr_err("first ggtt entry before bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
> +	ret = i915_vma_bind(vma,
> +		    HAS_LLC(dev_priv) ? I915_CACHE_LLC : I915_CACHE_NONE,
> +		    PIN_UPDATE);
> +	pr_err("i915_vma_bind ret %d\n", ret);
> +	pr_err("first ggtt entry after bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
> +
>  	return obj;
> 
>  err_pages:
> -- 
> 2.17.0
> 
> 
> 
> > 
> > Cheers, Daniel
> > > 
> > > 
> > > > 
> > > > Cheers, Daniel
> > > > > 
> > > > > ---
> > > > >    drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
> > > > >    1 file changed, 6 insertions(+), 9 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > index af915d041281..ad949cc30928 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct
> > > > > drm_i915_private *dev_priv,
> > > > >          if (!drm_mm_initialized(&dev_priv->mm.stolen))
> > > > >                  return -ENODEV;
> > > > >    +     /* WaSkipStolenMemoryFirstPage:bdw+ */
> > > > > +       if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
> > > > > +               start = 4096;
> > > > > +
> > > > >          mutex_lock(&dev_priv->mm.stolen_lock);
> > > > >          ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
> > > > >                                            size, alignment, 0,
> > > > > @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private
> > > > > *dev_priv)
> > > > >    {
> > > > >          resource_size_t reserved_base, stolen_top;
> > > > >          resource_size_t reserved_total, reserved_size;
> > > > > -       resource_size_t stolen_usable_start;
> > > > >          mutex_init(&dev_priv->mm.stolen_lock);
> > > > >    @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private
> > > > > *dev_priv)
> > > > >                           (u64)resource_size(&dev_priv->dsm) >> 10,
> > > > >                           ((u64)resource_size(&dev_priv->dsm) -
> > > > > reserved_total) >> 10);
> > > > >    -     stolen_usable_start = 0;
> > > > > -       /* WaSkipStolenMemoryFirstPage:bdw+ */
> > > > > -       if (INTEL_GEN(dev_priv) >= 8)
> > > > > -               stolen_usable_start = 4096;
> > > > > -
> > > > >          dev_priv->stolen_usable_size =
> > > > > -               resource_size(&dev_priv->dsm) - reserved_total -
> > > > > stolen_usable_start;
> > > > > +               resource_size(&dev_priv->dsm) - reserved_total;
> > > > >          /* Basic memrange allocator for stolen space. */
> > > > > -       drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
> > > > > -                   dev_priv->stolen_usable_size);
> > > > > +       drm_mm_init(&dev_priv->mm.stolen, 0,
> > > > > dev_priv->stolen_usable_size);
> > > > >          return 0;
> > > > >    }
> > > > > --
> > > > > 2.17.0.rc2
> > > > > 
> > > > > _______________________________________________
> > > > > Intel-gfx mailing list
> > > > > Intel-gfx@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > > > 
> > > > 
> > > 
> > 
> > 
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers
  2018-04-20  7:27         ` Daniel Vetter
@ 2018-04-20 10:19           ` Hans de Goede
  0 siblings, 0 replies; 25+ messages in thread
From: Hans de Goede @ 2018-04-20 10:19 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Hans de Goede, intel-gfx, dri-devel, Rodrigo Vivi

Hi,

On 20-04-18 09:27, Daniel Vetter wrote:
> On Thu, Apr 19, 2018 at 03:43:19PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 05-04-18 15:26, Daniel Vetter wrote:
>>> On Thu, Apr 5, 2018 at 1:47 PM, Hans de Goede <hdegoede@redhat.com> wrote:
>>>> Hi,
>>>>
>>>>
>>>> On 05-04-18 09:14, Daniel Vetter wrote:
>>>>>
>>>>> On Fri, Mar 30, 2018 at 02:27:15PM +0200, Hans de Goede wrote:
>>>>>>
>>>>>> Before this commit the WaSkipStolenMemoryFirstPage workaround code was
>>>>>> skipping the first 4k by passing 4096 as start of the address range
>>>>>> passed
>>>>>> to drm_mm_init(). This means that calling drm_mm_reserve_node() to try
>>>>>> and
>>>>>> reserve the firmware framebuffer so that we can inherit it would always
>>>>>> fail, as the firmware framebuffer starts at address 0.
>>>>>>
>>>>>> Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
>>>>>> everything >= gen8") says in its commit message: "This is confirmed to
>>>>>> fix
>>>>>> Skylake screen flickering issues (probably caused by the fact that we
>>>>>> initialized a ring in the first page of stolen, but I didn't 100% confirm
>>>>>> this theory)."
>>>>>>
>>>>>> Which suggests that it is safe to use the first page for a linear
>>>>>> framebuffer as the firmware is doing.
>>>>>>
>>>>>> This commit always passes 0 as start to drm_mm_init() and works around
>>>>>> WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
>>>>>> by insuring the start address passed by to drm_mm_insert_node_in_range()
>>>>>> is always 4k or more. All entry points to i915_gem_stolen.c go through
>>>>>> i915_gem_stolen_insert_node_in_range(), so that any newly allocated
>>>>>> objects such as ring-buffers will not be allocated in the first 4k.
>>>>>>
>>>>>> The one exception is i915_gem_object_create_stolen_for_preallocated()
>>>>>> which directly calls drm_mm_reserve_node() which now will be able to
>>>>>> use the first 4k.
>>>>>>
>>>>>> This fixes the i915 driver no longer being able to inherit the firmware
>>>>>> framebuffer on gen8+, which fixes the video output changing from the
>>>>>> vendor logo to a black screen as soon as the i915 driver is loaded
>>>>>> (on systems without fbcon).
>>>>>>
>>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>>>
>>>>>
>>>>> I think this is worth a shot. The only explanation I can think of why the
>>>>> GOP could get away with this and still follow the w/a is if it doesn't
>>>>> have a 1:1 mapping between GGTT and stolen.
>>>>
>>>>
>>>> My guess is that the GOP does not care about the w/a because the w/a
>>>> states that the command-streamers sometimes do a spurious flush (write)
>>>> to the first page, and the command-streamers are not used until much
>>>> later, so the GOP is probably ignoring the w/a all together.
>>>>
>>>> At least that is my theory.
>>>
>>> Atm we do not know whether the GOP actually implements the wa or not.
>>> That it doesn't care is just a conjecture, but we have no proof. On
>>> previous platforms the 1:1 mapping did hold, but there's no
>>> fundamental reason why it sitll does.
>>
>> Right.
>>
>>>>> Atm we hardcode that
>>>>> assumption in intel_alloc_initial_plane_obj by passing the base_aligned as
>>>>> both the stolen_offset and the gtt_offset (but it's only the gtt_offset
>>>>> really). And since we're not re-writing the ptes it's not noticeable.
>>>>>
>>>>> I think to decide whether this is the right approach we should
>>>>> double-check whether that 1:1 assumption really holds true: Either read
>>>>> back the ggtt ptes and check their addresses (but iirc on some platforms
>>>>> their write-only, readback doesn't work), or we also rewrite the ptes
>>>>> again for preallocated stuff, like when binding a normal object into the
>>>>> gtt. If either of these approaches confirms that those affected gen8+
>>>>> machines still use the 1:1 mapping, then I'm happy to put my r-b on this
>>>>> patch. If not, well then we at least know what to fix: We need to read out
>>>>> the real stolen_offset, instead of making assumptions.
>>>>
>>>>
>>>> I'm happy to give this a try on one or more affected machines, I can e.g.
>>>> try this on both a skylake desktop and a cherry-trail based tablet.
>>>>
>>>> But I'm clueless about the whole PTE stuff, so I'm going to need someone
>>>> to provide me with a patch following either approach. If readback of the
>>>> PTEs works on skylake / cherrytrail I guess that would be the best way.
>>>>
>>>> Note to test this I'm currently booting the kernel with the machine's
>>>> UEFI vendor logo still being displayed when efifb kicks in. I've added:
>>>> "fbcon=vc:2-62" to the kernel commandline to make fbcon not bind to
>>>> VC0 / VC1, so that the framebuffer contents stays intact, combined with
>>>> the patch we are discussing now, this makes the vendor logo stay on
>>>> the screen all the way till GDM loads, which looks rather nice actually :)
>>>>
>>>> And on shutdown we fall back to the original framebuffer and the vendor
>>>> logo is back again. I cannot see any corruption in the display at either
>>>> boot or shutdown time.
>>>
>>> It shouldn't matter whether efifb takes over or not, it's still the
>>> GOP's framebuffer we're taking over. Different content for sure, logo
>>> is gone, but we only care about which pages we're using.
>>>
>>> Wrt the code:
>>> - (Re)binding happens by calling i915_vma_bind, if you put a call to
>>> that at the end of the stolen_preallocated functions you should get
>>> that. Ofc needs the logo still there so you can see it jump/get
>>> mangled.
>>> - GGTT PTEs for gen8+ are written through e.g. gen8_ggttt_insert_page
>>> or gen8_ggtt_insert_entries (which takes the vma). Since we only care
>>> about the first entry a readq(dev_priv->ggtt->gsm) is all we really
>>> need. If it's all 0 then it's not working (since at least
>>> _PAGE_PRESENT should be set, plus the physical address of a page in
>>> stolen memory).
>>>
>>> I can do some patches for each of those, but a bit swamped. Also no
>>> gen8 handy for testing I think to make sure it works, so would take a
>>> few weeks probably.
>>
>> I was a bit swamped with other stuff myself too, but I'm back to working
>> on this now. Below is the debug patch I came up with based on what you
>> wrote above.
>>
>> This results in the following debug output:
>>
>> [    1.651141] first ggtt entry before bind: 0x0000000078c00001
>> [    1.651151] i915_vma_bind ret 0
>> [    1.651152] first ggtt entry after bind: 0x0000000078c00083
>>
>> And "sudo cat /proc/iomem | grep Stolen" gives:
>>    78c00000-88bfffff : Graphics Stolen Memory
>>
>> I see no visual changes with this patch (BIOS vendor logo still
>> stays in place when using fbcon=vc:2-62 to make fbcon not bind
>> on boot and keep the vendor logo intact).
>>
>> The address of the first ggtt entry matches with the start of stolen
>> and the i915_vma_bind call only changes the first gtt entry's flags,
>> or-ing in _PAGE_RW (BIT(1)) and PPAT_CACHED (BIT(7)), which perfectly
>> matches what we would expect based on gen8_pte_encode()'s behavior.
>>
>> So it seems that the GOP indeed does NOT implement the wa and the i915's
>> code assuming a linear mapping at the start of stolen for the BIOS fb
>> still holds true.
>>
>> The debug output is from my Skylake based machine which uses an
>> Asrock B150M Pro4S/D3 mainboard.
>>
>> I've also tested this on a Cherry Trail based device (a GPD Win)
>> with identical results (the flags are 0x1b after the vma_bind
>> on CHT, which matches with I915_CACHE_NONE).
> 
> tbh, not what I expected, so good we've done the experiment to confirm :-)
> 
> With the above information added to the commit message (you can just paste
> your entire message, including test diff to the commit message) the
> original patch has my:
> 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Thanks, I've just send out a v2 with the amended commit message
and your r-b added. Once that has passed the CI I will push
it to dinq.

Regards,

Hans




> 
> Thanks for doing all the digging here.
> 
> Cheers, Daniel
> 
>>
>> Regards,
>>
>> Hans
>>
>>
>>  From 400f2b0d722236999a5706a1f2fa0bb70bf9c73f Mon Sep 17 00:00:00 2001
>> From: Hans de Goede <hdegoede@redhat.com>
>> Date: Thu, 19 Apr 2018 14:44:43 +0200
>> Subject: [PATCH resend] i915: i915_vma_bind for pre-allocated stolen test /
>>   debug
>>
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem_stolen.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> index 152487d91197..7db03f7d2c51 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>> @@ -648,6 +648,13 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
>>   	obj->bind_count++;
>>   	spin_unlock(&dev_priv->mm.obj_lock);
>>
>> +	pr_err("first ggtt entry before bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
>> +	ret = i915_vma_bind(vma,
>> +		    HAS_LLC(dev_priv) ? I915_CACHE_LLC : I915_CACHE_NONE,
>> +		    PIN_UPDATE);
>> +	pr_err("i915_vma_bind ret %d\n", ret);
>> +	pr_err("first ggtt entry after bind: 0x%016llx\n", readq(dev_priv->ggtt.gsm));
>> +
>>   	return obj;
>>
>>   err_pages:
>> -- 
>> 2.17.0
>>
>>
>>
>>>
>>> Cheers, Daniel
>>>>
>>>>
>>>>>
>>>>> Cheers, Daniel
>>>>>>
>>>>>> ---
>>>>>>     drivers/gpu/drm/i915/i915_gem_stolen.c | 15 ++++++---------
>>>>>>     1 file changed, 6 insertions(+), 9 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>>> b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>>> index af915d041281..ad949cc30928 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
>>>>>> @@ -51,6 +51,10 @@ int i915_gem_stolen_insert_node_in_range(struct
>>>>>> drm_i915_private *dev_priv,
>>>>>>           if (!drm_mm_initialized(&dev_priv->mm.stolen))
>>>>>>                   return -ENODEV;
>>>>>>     +     /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>>>> +       if (INTEL_GEN(dev_priv) >= 8 && start < 4096)
>>>>>> +               start = 4096;
>>>>>> +
>>>>>>           mutex_lock(&dev_priv->mm.stolen_lock);
>>>>>>           ret = drm_mm_insert_node_in_range(&dev_priv->mm.stolen, node,
>>>>>>                                             size, alignment, 0,
>>>>>> @@ -343,7 +347,6 @@ int i915_gem_init_stolen(struct drm_i915_private
>>>>>> *dev_priv)
>>>>>>     {
>>>>>>           resource_size_t reserved_base, stolen_top;
>>>>>>           resource_size_t reserved_total, reserved_size;
>>>>>> -       resource_size_t stolen_usable_start;
>>>>>>           mutex_init(&dev_priv->mm.stolen_lock);
>>>>>>     @@ -435,17 +438,11 @@ int i915_gem_init_stolen(struct drm_i915_private
>>>>>> *dev_priv)
>>>>>>                            (u64)resource_size(&dev_priv->dsm) >> 10,
>>>>>>                            ((u64)resource_size(&dev_priv->dsm) -
>>>>>> reserved_total) >> 10);
>>>>>>     -     stolen_usable_start = 0;
>>>>>> -       /* WaSkipStolenMemoryFirstPage:bdw+ */
>>>>>> -       if (INTEL_GEN(dev_priv) >= 8)
>>>>>> -               stolen_usable_start = 4096;
>>>>>> -
>>>>>>           dev_priv->stolen_usable_size =
>>>>>> -               resource_size(&dev_priv->dsm) - reserved_total -
>>>>>> stolen_usable_start;
>>>>>> +               resource_size(&dev_priv->dsm) - reserved_total;
>>>>>>           /* Basic memrange allocator for stolen space. */
>>>>>> -       drm_mm_init(&dev_priv->mm.stolen, stolen_usable_start,
>>>>>> -                   dev_priv->stolen_usable_size);
>>>>>> +       drm_mm_init(&dev_priv->mm.stolen, 0,
>>>>>> dev_priv->stolen_usable_size);
>>>>>>           return 0;
>>>>>>     }
>>>>>> --
>>>>>> 2.17.0.rc2
>>>>>>
>>>>>> _______________________________________________
>>>>>> Intel-gfx mailing list
>>>>>> Intel-gfx@lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2018-04-20 10:19 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-03-30 12:27 [PATCH] drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers Hans de Goede
2018-03-30 12:30 ` [Intel-gfx] " Chris Wilson
2018-03-30 12:37   ` Hans de Goede
2018-03-30 12:44     ` Chris Wilson
2018-03-30 13:25       ` Hans de Goede
2018-03-30 14:26         ` [Intel-gfx] " Hans de Goede
2018-04-04 15:50           ` Ville Syrjälä
2018-04-04 17:26             ` Rodrigo Vivi
2018-04-04 20:06             ` [Intel-gfx] " Hans de Goede
2018-04-04 20:49               ` Ville Syrjälä
2018-04-05 11:37                 ` Hans de Goede
2018-04-06 16:06                   ` [Intel-gfx] " Ville Syrjälä
2018-04-07  9:34                     ` Hans de Goede
2018-04-09  8:30                       ` Daniel Vetter
2018-03-30 12:48 ` ✓ Fi.CI.BAT: success for " Patchwork
2018-03-30 13:31 ` ✓ Fi.CI.IGT: " Patchwork
2018-04-05  7:14 ` [PATCH] " Daniel Vetter
2018-04-05 11:47   ` Hans de Goede
2018-04-05 13:26     ` Daniel Vetter
2018-04-05 13:31       ` [Intel-gfx] " Hans de Goede
2018-04-05 13:38         ` Daniel Vetter
2018-04-19 13:43       ` Hans de Goede
2018-04-20  7:27         ` Daniel Vetter
2018-04-20 10:19           ` Hans de Goede
2018-04-19 13:54 ` ✗ Fi.CI.BAT: failure for drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers (rev2) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.