On 31-07-2024 23:26, Emil Velikov wrote:
Hey everyone,

On Wed, 31 Jul 2024 at 09:31, Bommu Krishnaiah
<krishnaiah.bommu@intel.com> wrote:
The `tweak_perm` function now uses `continue` instead of `break` when
encountering non-existent card paths. This change addresses the assumption
in existing userspace that card paths are contiguous, which may not be true
if cards are dynamically removed.

Problem Description:
The test fails after running the `core_hotunplug@hot*` subtest, where card0 is missing.
This leads to a failure in the subsequent `master-drop-set-user` test.

Command sequence for reproducing the issue:
- Check available cards: `ls /dev/dri/`
  - Output: by-path  card0  renderD128
- Run the test: `# ./core_hotunplug --r hotrebind-lateclose`
- Check again: ‘ls /dev/dri/’
  - Output after failure: by-path  card1  renderD129
- Run the test: `# ./core_setmaster --r master-drop-set-user`

This failures on both XE and i915 for above sequence.

The commit message is still a bit strange, but I'll refer that to your
colleague who already gave you good tips.

tests/core_setmaster: Handle the test even if cards are non-continuous

 

Although most of userspace assumes drm cards to be populated continously,

this assumption may not be always true. After hot unpluging and repluging of card

the card may be non-continous.  To address such scenarios where the cards can be non-continous

modify the tweak_perm function to loop all possible cardnumbers(card0-card255) to confirm are they populated or not.

 

Problem Description:

The test fails after running the `core_hotunplug@hot*` subtest, where card0 is missing.

This leads to a failure in the subsequent `master-drop-set-user` test.

 

Command sequence for reproducing the issue:

- Check available cards: `ls /dev/dri/`

- Output: by-path  card0  renderD128

- Run the test: `# ./core_hotunplug --r hotrebind-lateclose`

- Check again: ‘ls /dev/dri/’

- Output after failure: by-path  card1  renderD129

- Run the test: `# ./core_setmaster --r master-drop-set-user`

 

This failures on both XE and i915 for above sequence.

Fwiw the "existing usespace" refers to mesa, Xorg (both server and
drivers) as well as gstreamer, if I recall correctly. There may be
others assume there's no gaps.
here I am referring to IGT, may be as you said few may be continues, but this change works for continues also.

      
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
---
 tests/core_setmaster.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/core_setmaster.c b/tests/core_setmaster.c
index 9c2083f66..a2758e2f2 100644
--- a/tests/core_setmaster.c
+++ b/tests/core_setmaster.c
@@ -116,9 +116,10 @@ static unsigned tweak_perm(uint8_t *saved_perm, unsigned max_perm, bool save)
        for (i = 0; i < max_perm; i++) {
                snprintf(path, sizeof(path), "/dev/dri/card%u", i);

-               /* Existing userspace assumes there's no gaps, do the same. */
+               /* Userspace cannot always have continuous card0...N due to
+                * dynamic unloading or removal of cards */
Is this "dynamic unloading or removal of cards" a relatively new
thing? Back when I wrote there test, unloading DRM modules wasn't
supported (I believe) hence, it wasn't possible to have gaps.

I don't think dynamic unloading or removal of cards is new

core_hotunplug test verifies unloading of i915/xe driver module


Or does this feature/work that only affects IGT and normal day-to-day
userspace will never hit this case?

as I mentioned in commit message running sequence is not done before

Regards,

Krishna


Should probably mention a word or two about this in the commit
message/inline comment.

HTH,
Emil