From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: "Ceraolo Spurio, Daniele" <daniele.ceraolospurio@intel.com>,
intel-gfx@lists.freedesktop.org
Cc: Alexander Usyskin <alexander.usyskin@intel.com>,
alan.previn.teres.alexis@intel.com
Subject: Re: [Intel-gfx] [PATCH 00/15] HuC loading for DG2
Date: Tue, 14 Jun 2022 08:44:50 +0100 [thread overview]
Message-ID: <954db3a4-e8c9-e157-5211-aceec87dfd9d@linux.intel.com> (raw)
In-Reply-To: <e90f9dd1-7229-f958-d2e6-6fc4ec5a866b@intel.com>
On 13/06/2022 19:13, Ceraolo Spurio, Daniele wrote:
> On 6/13/2022 10:39 AM, Tvrtko Ursulin wrote:
>> On 13/06/2022 18:06, Ceraolo Spurio, Daniele wrote:
>>> On 6/13/2022 9:56 AM, Tvrtko Ursulin wrote:
>>>> On 13/06/2022 17:41, Ceraolo Spurio, Daniele wrote:
>>>>> On 6/13/2022 9:31 AM, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 13/06/2022 16:39, Ceraolo Spurio, Daniele wrote:
>>>>>>> On 6/13/2022 1:16 AM, Tvrtko Ursulin wrote:
>>>>>>>>
>>>>>>>> On 10/06/2022 00:19, Daniele Ceraolo Spurio wrote:
>>>>>>>>> On DG2, HuC loading is performed by the GSC, via a PXP command.
>>>>>>>>> The load
>>>>>>>>> operation itself is relatively simple (just send a message to
>>>>>>>>> the GSC
>>>>>>>>> with the physical address of the HuC in LMEM), but there are
>>>>>>>>> timing
>>>>>>>>> changes that requires special attention. In particular, to send
>>>>>>>>> a PXP
>>>>>>>>> command we need to first export the GSC driver and then wait
>>>>>>>>> for the
>>>>>>>>> mei-gsc and mei-pxp modules to start, which means that HuC load
>>>>>>>>> will
>>>>>>>>> complete after i915 load is complete. This means that there is
>>>>>>>>> a small
>>>>>>>>> window of time after i915 is registered and before HuC is loaded
>>>>>>>>> during which userspace could submit and/or checking the HuC
>>>>>>>>> load status,
>>>>>>>>> although this is quite unlikely to happen (HuC is usually
>>>>>>>>> loaded before
>>>>>>>>> kernel init/resume completes).
>>>>>>>>> We've consulted with the media team in regards to how to handle
>>>>>>>>> this and
>>>>>>>>> they've asked us to do the following:
>>>>>>>>>
>>>>>>>>> 1) Report HuC as loaded in the getparam IOCTL even if load is
>>>>>>>>> still in
>>>>>>>>> progress. The media driver uses the IOCTL as a way to check if
>>>>>>>>> HuC is
>>>>>>>>> enabled and then includes a secondary check in the batches to
>>>>>>>>> get the
>>>>>>>>> actual status, so doing it this way allows userspace to keep
>>>>>>>>> working
>>>>>>>>> without changes.
>>>>>>>>>
>>>>>>>>> 2) Stall all userspace VCS submission until HuC is loaded.
>>>>>>>>> Stalls are
>>>>>>>>> expected to be very rare (if any), due to the fact that HuC is
>>>>>>>>> usually
>>>>>>>>> loaded before kernel init/resume is completed.
>>>>>>>>
>>>>>>>> Motivation to add these complications into i915 are not clear to
>>>>>>>> me here. I mean there is no HuC on DG2 _yet_ is the premise of
>>>>>>>> the series, right? So no backwards compatibility concerns. In
>>>>>>>> this case why jump through the hoops and not let userspace
>>>>>>>> handle all of this by just leaving the getparam return the true
>>>>>>>> status?
>>>>>>>
>>>>>>> The main areas impacted by the fact that we can't guarantee that
>>>>>>> HuC load is complete when i915 starts accepting submissions are
>>>>>>> boot and suspend/resume, with the latter being the main problem;
>>>>>>> GT reset is not a concern because HuC now survives it. A
>>>>>>> suspend/resume can be transparent to userspace and therefore the
>>>>>>> HuC status can temporarily flip from loaded to not without
>>>>>>> userspace knowledge, especially if we start going into deeper
>>>>>>> suspend states and start causing HuC resets when we go into
>>>>>>> runtime suspend. Note that this is different from what happens
>>>>>>> during GT reset for older platforms, because in that scenario we
>>>>>>> guarantee that HuC reload is complete before we restart the
>>>>>>> submission back-end, so userspace doesn't notice that the HuC
>>>>>>> status change. We had an internal discussion about this problem
>>>>>>> with both media and i915 archs and the conclusion was that the
>>>>>>> best option is for i915 to stall media submission while HuC
>>>>>>> (re-)load is in progress.
>>>>>>
>>>>>> Resume is potentialy a good reason - I did not pick up on that
>>>>>> from the cover letter. I read the statement about the unlikely and
>>>>>> small window where HuC is not loaded during kernel init/resume and
>>>>>> I guess did not pick up on the resume part.
>>>>>>
>>>>>> Waiting for GSC to load HuC from i915 resume is not an option?
>>>>>
>>>>> GSC is an aux device exported by i915, so AFAIU GSC resume can't
>>>>> start until i915 resume completes.
>>>>
>>>> I'll dig into this in the next few days since I want to understand
>>>> how exactly it works. Or someone can help explain.
>>>>
>>>> If in the end conclusion will be that i915 resume indeed cannot wait
>>>> for GSC, then I think auto-blocking of queued up contexts on media
>>>> engines indeed sounds unavoidable. Otherwise, as you explained, user
>>>> experience post resume wouldn't be good.
>>>
>>> Even if we could implement a wait, I'm not sure we should. GSC resume
>>> and HuC reload takes ~300ms in most cases, I don't think we want to
>>> block within the i915 resume path for that long.
>>
>> Yeah maybe not. But entertaining the idea that it is technically
>> possible to block - we could perhaps add uapi for userspace to mark
>> contexts which want HuC access. Then track if there are any such
>> contexts with outstanding submissions and only wait in resume if there
>> are. If that would end up significantly less code on the i915 side to
>> maintain is an open.
>>
>> What would be the end result from users point of view in case where it
>> suspended during video playback? The proposed solution from this
>> series sees the video stuck after resume. Maybe compositor blocks as
>> well since I am not sure how well they handle one window not providing
>> new data. Probably depends on the compositor.
>>
>> And then with a simpler solution definitely the whole resume would be
>> delayed by 300ms.
>>
>> With my ChromeOS hat the stalled media engines does sound like a
>> better solution. But with the maintainer hat I'd like all options
>> evaluated since there is attractiveness if a good enough solution can
>> be achieved with significantly less kernel code.
>>
>> You say 300ms is typical time for HuC load. How long it is on other
>> platforms? If much faster then why is it so slow here?
>
> The GSC itself has to come out of suspend before it can perform the
> load, which takes a few tens of ms I believe. AFAIU the GSC is also
> slower in processing the HuC load and auth compared to the legacy path.
> The GSC FW team gave a 250ms limit for the time the GSC FW needs from
> start of the resume flow to HuC load complete, so I bumped that to
> ~300ms to account for all other SW interactions, plus a bit of buffer.
> Note that a bit of the SW overhead is caused by the fact that we have 2
> mei modules in play here: mei-gsc, which manages the GSC device itself
> (including resume), and mei-pxp, which owns the pxp messaging, including
> HuC load.
And how long on other platforms (not DG2) do you know? Presumably there
the wait is on the i915 resume path?
>>>> However, do we really need to lie in the getparam? How about extend
>>>> or add a new one to separate the loading vs loaded states? Since
>>>> userspace does not support DG2 HuC yet this should be doable.
>>>
>>> I don't really have a preference here. The media team asked us to do
>>> it this way because they wouldn't have a use for the different "in
>>> progress" and "done" states. If they're ok with having separate flags
>>> that's fine by me.
>>> Tony, any feedback here?
>>
>> We don't even have any docs in i915_drm.h in terms of what it means:
>>
>> #define I915_PARAM_HUC_STATUS 42
>>
>> Seems to be a boolean. Status false vs true? Could you add some docs?
>
> There is documentation above intel_huc_check_status(), which is also
> updated in this series. I can move that to i915_drm.h.
That would be great, thanks.
And with so rich return codes already documented and exposed via uapi -
would we really need to add anything new for DG2 apart for userspace to
know that if zero is returned (not a negative error value) it should
retry? I mean is there another negative error missing which would
prevent zero transitioning to one?
Regards,
Tvrtko
>
> Daniele
>
>>
>> Regards,
>>
>> Tvrtko
>>
>>>
>>> Thanks,
>>> Daniele
>>>
>>>>
>>>>>> Will there be runtime suspend happening on the GSC device behind
>>>>>> i915's back, or i915 and GSC will always be able to transition the
>>>>>> states in tandem?
>>>>>
>>>>> They're always in sync. The GSC is part of the same HW PCI device
>>>>> as the rest of the GPU, so they change HW state together.
>>>>
>>>> Okay thanks, I wasn't sure if it is the same or separate device.
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>>
>
next prev parent reply other threads:[~2022-06-14 7:45 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-09 23:19 [Intel-gfx] [PATCH 00/15] HuC loading for DG2 Daniele Ceraolo Spurio
2022-06-09 23:19 ` [Intel-gfx] [PATCH 01/15] HAX: mei: GSC support for XeHP SDV and DG2 platform Daniele Ceraolo Spurio
2022-06-09 23:19 ` [Intel-gfx] [PATCH 02/15] mei: add support to GSC extended header Daniele Ceraolo Spurio
2022-08-03 22:07 ` Teres Alexis, Alan Previn
2022-08-16 20:49 ` Winkler, Tomas
2022-06-09 23:19 ` [Intel-gfx] [PATCH 03/15] mei: bus: enable sending gsc commands Daniele Ceraolo Spurio
2022-06-09 23:19 ` [Intel-gfx] [PATCH 04/15] mei: bus: extend bus API to support command streamer API Daniele Ceraolo Spurio
2022-06-09 23:19 ` [Intel-gfx] [PATCH 05/15] mei: pxp: add command streamer API to the PXP driver Daniele Ceraolo Spurio
2022-07-27 1:42 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 06/15] mei: pxp: support matching with a gfx discrete card Daniele Ceraolo Spurio
2022-07-27 1:01 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 07/15] drm/i915/pxp: load the pxp module when we have a gsc-loaded huc Daniele Ceraolo Spurio
2022-06-18 7:27 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 08/15] drm/i915/pxp: implement function for sending tee stream command Daniele Ceraolo Spurio
2022-06-18 8:07 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 09/15] drm/i915/pxp: add huc authentication and loading command Daniele Ceraolo Spurio
2022-06-21 6:33 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 10/15] drm/i915/dg2: setup HuC loading via GSC Daniele Ceraolo Spurio
2022-07-05 22:35 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 11/15] drm/i915/huc: track delayed HuC load with a fence Daniele Ceraolo Spurio
2022-07-06 4:42 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 12/15] drm/i915/huc: stall media submission until HuC is loaded Daniele Ceraolo Spurio
2022-07-27 0:33 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 13/15] drm/i915/huc: report HuC as loaded even if load still in progress Daniele Ceraolo Spurio
2022-07-06 4:49 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 14/15] drm/i915/huc: define gsc-compatible HuC fw for DG2 Daniele Ceraolo Spurio
2022-06-22 17:55 ` Teres Alexis, Alan Previn
2022-06-22 18:16 ` Teres Alexis, Alan Previn
2022-06-09 23:19 ` [Intel-gfx] [PATCH 15/15] HAX: drm/i915: force INTEL_MEI_GSC and INTEL_MEI_PXP on for CI Daniele Ceraolo Spurio
2022-06-10 0:07 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for HuC loading for DG2 Patchwork
2022-06-10 0:07 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-06-10 8:01 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-06-11 8:01 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2022-06-13 8:16 ` [Intel-gfx] [PATCH 00/15] " Tvrtko Ursulin
2022-06-13 15:39 ` Ceraolo Spurio, Daniele
2022-06-13 16:31 ` Tvrtko Ursulin
2022-06-13 16:41 ` Ceraolo Spurio, Daniele
2022-06-13 16:56 ` Tvrtko Ursulin
2022-06-13 17:06 ` Ceraolo Spurio, Daniele
2022-06-13 17:39 ` Tvrtko Ursulin
2022-06-13 18:13 ` Ceraolo Spurio, Daniele
2022-06-14 7:44 ` Tvrtko Ursulin [this message]
2022-06-14 15:30 ` Ceraolo Spurio, Daniele
2022-06-14 23:15 ` Ye, Tony
2022-06-15 10:13 ` Tvrtko Ursulin
2022-06-15 14:35 ` Ceraolo Spurio, Daniele
2022-06-15 14:53 ` Tvrtko Ursulin
2022-06-15 16:14 ` Ye, Tony
2022-06-16 2:28 ` Zhang, Carl
2022-07-05 23:30 ` Ceraolo Spurio, Daniele
2022-07-06 17:26 ` Ye, Tony
2022-07-06 19:29 ` Ceraolo Spurio, Daniele
2022-07-06 20:11 ` Ye, Tony
2022-06-16 7:10 ` Tvrtko Ursulin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=954db3a4-e8c9-e157-5211-aceec87dfd9d@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=alan.previn.teres.alexis@intel.com \
--cc=alexander.usyskin@intel.com \
--cc=daniele.ceraolospurio@intel.com \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox