Intel-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: John Harrison <john.c.harrison@intel.com>
To: "Ceraolo Spurio, Daniele" <daniele.ceraolospurio@intel.com>,
	<Intel-GFX@Lists.FreeDesktop.Org>
Cc: DRI-Devel@Lists.FreeDesktop.Org
Subject: Re: [Intel-gfx] [PATCH 5/5] drm/i915/uc: Reject doplicate entries in firmware table
Date: Wed, 19 Apr 2023 10:59:33 -0700	[thread overview]
Message-ID: <688259d2-d29e-439f-44e7-7dca1a3f7037@intel.com> (raw)
In-Reply-To: <f102d291-6284-52aa-a92e-7b911e9470b1@intel.com>

On 4/19/2023 10:33, Ceraolo Spurio, Daniele wrote:
> On 4/19/2023 10:12 AM, John Harrison wrote:
>> On 4/19/2023 10:02, John Harrison wrote:
>>> On 4/18/2023 16:24, Ceraolo Spurio, Daniele wrote:
>>>> Typo doplicate in patch title
>>>>
>>>> On 4/14/2023 5:57 PM, John.C.Harrison@Intel.com wrote:
>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>
>>>>> It was noticed that duplicte entries in the firmware table could 
>>>>> cause
>>>>
>>>> typo duplicte
>>>>
>>>>> an infinite loop in the firmware loading code if that entry failed to
>>>>> load. Duplicate entries are a bug anyway and so should never happen.
>>>>> Ensure they don't by tweaking the table validation code to reject
>>>>> duplicates.
>>>>
>>>> Here you're not really rejecting anything though, just printing an 
>>>> error (and even that only if the SELFTEST kconfig is selected). 
>>>> This would allow our CI to catch issues with patches sent to our 
>>>> ML, but IIRC the reported bug was on a kernel fork. We could 
>>>> disable the FW loading is the table for that particular blob type 
>>>> is in an invalid state, as it wouldn't be safe to attempt a load in 
>>>> that case anyway.
>>> The validation code is rejecting duplicates. Whether the driver 
>>> loads or not after a failed validation is another matter.
>>>
>>> I was basically assuming that CI will fail on the error message and 
>>> thus prevent such code ever being merged. But yeah, I guess we don't 
>>> run CI on backports to stable kernels and such. Although, I would 
>>> hope that anyone pushing patches to a stable kernel would run some 
>>> testing on it first!
>>>
>>> Any thoughts on a good way to fail the load? We don't want to just 
>>> pretend that firmware is not wanted/required on the platform and 
>>> just load the i915 module without the firmware. Also, what about the 
>>> longer plan of moving the validation to a selftest. You can't fail 
>>> the load at all then.
>> Actually, forgot we already have a INTEL_UC_FIRMWARE_ERROR status. 
>> That works fine for aborting the load. So just go with that and drop 
>> the plan to move to a selftest?
>>
>> John.
>
> I do actually like the idea of moving this code to a mock selftest. 
> Maybe just add a comment above the tables making clear that duplicated 
> entries are not allowed and will break the loading flow?
The point is about accidental breakages. The 'issue' was not an actual 
failure but that a failure could potentially occur if, for example, a 
patch got applied twice due to some backport error/confusion. Adding a 
comment doesn't help with that.

Given that it is trivial to return FIRMWARE_ERROR on a validation 
failure, I'm inclined to go with that for now. The validation is still 
only being done once (and moving it to _early time means we don't 
actually need the static bool at all) and the driver is guaranteed to 
not try to process a broken table and get confused. It is also instantly 
evident if a backport is broken for any reason without worrying about 
what kind of explicit testing may or may not be run.

John.


>
> Daniele
>
>>
>>
>>>
>>> John.
>>>
>>>>
>>>>>
>>>>> For full m/m/p files, that can be done by simply tweaking the patch
>>>>> level check to reject matching values. For reduced version entries,
>>>>> the filename itself must be compared.
>>>>>
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>> ---
>>>>>   drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 27 
>>>>> +++++++++++++++++++++---
>>>>>   1 file changed, 24 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
>>>>> index c589782467265..44829247ef6bc 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
>>>>> @@ -319,7 +319,7 @@ static void validate_fw_table_type(struct 
>>>>> drm_i915_private *i915, enum intel_uc_
>>>>>   {
>>>>>       const struct uc_fw_platform_requirement *fw_blobs;
>>>>>       u32 fw_count;
>>>>> -    int i;
>>>>> +    int i, j;
>>>>>         if (type >= ARRAY_SIZE(blobs_all)) {
>>>>>           drm_err(&i915->drm, "No blob array for %s\n", 
>>>>> intel_uc_fw_type_repr(type));
>>>>> @@ -334,6 +334,27 @@ static void validate_fw_table_type(struct 
>>>>> drm_i915_private *i915, enum intel_uc_
>>>>>         /* make sure the list is ordered as expected */
>>>>>       for (i = 1; i < fw_count; i++) {
>>>>> +        /* Versionless file names must be unique per platform: */
>>>>> +        for (j = i + 1; j < fw_count; j++) {
>>>>> +            /* Same platform? */
>>>>> +            if (fw_blobs[i].p != fw_blobs[j].p)
>>>>> +                continue;
>>>>> +
>>>>> +            if (fw_blobs[i].blob.path != fw_blobs[j].blob.path)
>>>>> +                continue;
>>>>> +
>>>>> +            drm_err(&i915->drm, "Diplicaate %s blobs: %s r%u 
>>>>> %s%d.%d.%d [%s] matches %s r%u %s%d.%d.%d [%s]\n",
>>>>
>>>> Typo Diplicaate
>>>>
>>>> Daniele
>>>>
>>>>> + intel_uc_fw_type_repr(type),
>>>>> +                intel_platform_name(fw_blobs[j].p), fw_blobs[j].rev,
>>>>> +                fw_blobs[j].blob.legacy ? "L" : "v",
>>>>> +                fw_blobs[j].blob.major, fw_blobs[j].blob.minor,
>>>>> +                fw_blobs[j].blob.patch, fw_blobs[j].blob.path,
>>>>> +                intel_platform_name(fw_blobs[i].p), fw_blobs[i].rev,
>>>>> +                fw_blobs[i].blob.legacy ? "L" : "v",
>>>>> +                fw_blobs[i].blob.major, fw_blobs[i].blob.minor,
>>>>> +                fw_blobs[i].blob.patch, fw_blobs[i].blob.path);
>>>>> +        }
>>>>> +
>>>>>           /* Next platform is good: */
>>>>>           if (fw_blobs[i].p < fw_blobs[i - 1].p)
>>>>>               continue;
>>>>> @@ -377,8 +398,8 @@ static void validate_fw_table_type(struct 
>>>>> drm_i915_private *i915, enum intel_uc_
>>>>>           if (fw_blobs[i].blob.minor != fw_blobs[i - 1].blob.minor)
>>>>>               goto bad;
>>>>>   -        /* Patch versions must be in order: */
>>>>> -        if (fw_blobs[i].blob.patch <= fw_blobs[i - 1].blob.patch)
>>>>> +        /* Patch versions must be in order and unique: */
>>>>> +        if (fw_blobs[i].blob.patch < fw_blobs[i - 1].blob.patch)
>>>>>               continue;
>>>>>     bad:
>>>>
>>>
>>
>


  reply	other threads:[~2023-04-19 17:59 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-15  0:57 [Intel-gfx] [PATCH 0/5] Improvements to uc firmare management John.C.Harrison
2023-04-15  0:57 ` [Intel-gfx] [PATCH 1/5] drm/i915/guc: Decode another GuC load failure case John.C.Harrison
2023-04-18 18:41   ` Ceraolo Spurio, Daniele
2023-04-15  0:57 ` [Intel-gfx] [PATCH 2/5] drm/i915/guc: Print status register when waiting for GuC to load John.C.Harrison
2023-04-18 18:37   ` Ceraolo Spurio, Daniele
2023-04-15  0:57 ` [Intel-gfx] [PATCH 3/5] drm/i915/uc: Track patch level versions on reduced version firmware files John.C.Harrison
2023-04-18 22:46   ` Ceraolo Spurio, Daniele
2023-04-19 16:06     ` John Harrison
2023-04-15  0:57 ` [Intel-gfx] [PATCH 4/5] drm/i915/uc: Split firmware table validation to a separate function John.C.Harrison
2023-04-18 23:14   ` Ceraolo Spurio, Daniele
2023-04-19 15:45     ` John Harrison
2023-04-15  0:57 ` [Intel-gfx] [PATCH 5/5] drm/i915/uc: Reject doplicate entries in firmware table John.C.Harrison
2023-04-18 23:24   ` Ceraolo Spurio, Daniele
2023-04-19 17:02     ` John Harrison
2023-04-19 17:12       ` John Harrison
2023-04-19 17:33         ` Ceraolo Spurio, Daniele
2023-04-19 17:59           ` John Harrison [this message]
2023-04-15  1:28 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Improvements to uc firmare management Patchwork
2023-04-15  1:28 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-04-15  1:44 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-04-15  8:09 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=688259d2-d29e-439f-44e7-7dca1a3f7037@intel.com \
    --to=john.c.harrison@intel.com \
    --cc=DRI-Devel@Lists.FreeDesktop.Org \
    --cc=Intel-GFX@Lists.FreeDesktop.Org \
    --cc=daniele.ceraolospurio@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox