public inbox for intel-gfx@lists.freedesktop.org
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: maarten.lankhorst@linux.intel.com, matthew.auld@intel.com,
	Kai Vehmanen <kai.vehmanen@linux.intel.com>,
	Imre Deak <imre.deak@intel.com>,
	Russell King <rmk+kernel@armlinux.org.uk>
Subject: [Intel-gfx] [PATCH v6 8/9] HAX: component: do not leave master devres group open after bind
Date: Wed, 22 Sep 2021 08:25:26 +0200
Message-ID: <20210922062527.865433-9-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20210922062527.865433-1-thomas.hellstrom@linux.intel.com>

From: Kai Vehmanen <kai.vehmanen@linux.intel.com>

In the current code, the devres group for the aggregate master is left
open after the call to component_master_add_*(). This causes problems
when the master later makes managed allocations of its own: they land
in the still-open group, so when any participating driver calls
component_del(), those resources are released immediately.

This came up when investigating a page fault that occurs on i915 DRM
driver unbind with the 5.15-rc1 kernel. The following sequence occurs:

 i915_pci_remove()
   -> intel_display_driver_unregister()
     -> i915_audio_component_cleanup()
       -> component_del()
         -> component.c:take_down_master()
           -> hdac_component_master_unbind() [via master->ops->unbind()]
           -> devres_release_group(master->parent, NULL)

With older kernels this has not caused issues, but now that the audio
driver uses managed interfaces for more of its allocations, this no
longer works. The devres log shows the following:

component_master_add_with_match()
[  126.886032] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000323ccdc5 devm_component_match_release (24 bytes)
[  126.886045] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000865cdb29 grp< (0 bytes)
[  126.886049] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 grp< (0 bytes)

audio driver completes its PCI probe()
[  126.892238] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 pcim_iomap_release (48 bytes)

component_del() called at DRM/i915 unbind()
[  137.579422] i915 0000:00:02.0: DEVRES REL 00000000ef44c293 grp< (0 bytes)
[  137.579445] snd_hda_intel 0000:00:1f.3: DEVRES REL 00000000865cdb29 grp< (0 bytes)
[  137.579458] snd_hda_intel 0000:00:1f.3: DEVRES REL 000000001b480725 pcim_iomap_release (48 bytes)

So the "devres_release_group(master->parent, NULL)" ends up freeing the
pcim_iomap allocation. On the next runtime resume, the audio driver
triggers a page fault because the iomap allocation was released without
the driver knowing about it.

Fix this issue by using the "struct master" pointer as the identifier
for the devres group, and by closing the devres group after the
master->ops->bind() call is done. This allows devres allocations done by
the driver acting as master to be isolated from the binding state of the
aggregate driver. This modifies the logic originally introduced in
commit 9e1ccb4a7700 ("drivers/base: fix devres handling for master
device").

BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/base/component.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/base/component.c b/drivers/base/component.c
index 5e79299f6c3f..870485cbbb87 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -246,7 +246,7 @@ static int try_to_bring_up_master(struct master *master,
 		return 0;
 	}
 
-	if (!devres_open_group(master->parent, NULL, GFP_KERNEL))
+	if (!devres_open_group(master->parent, master, GFP_KERNEL))
 		return -ENOMEM;
 
 	/* Found all components */
@@ -258,6 +258,7 @@ static int try_to_bring_up_master(struct master *master,
 		return ret;
 	}
 
+	devres_close_group(master->parent, NULL);
 	master->bound = true;
 	return 1;
 }
@@ -282,7 +283,7 @@ static void take_down_master(struct master *master)
 {
 	if (master->bound) {
 		master->ops->unbind(master->parent);
-		devres_release_group(master->parent, NULL);
+		devres_release_group(master->parent, master);
 		master->bound = false;
 	}
 }
-- 
2.31.1

