From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org,
Andi Shyti <andi.shyti@intel.com>,
Andrzej Hajda <andrzej.hajda@intel.com>
Subject: Re: [Intel-gfx] [PATCH 6/8] drm/i915/gt: Fix memory leaks in per-gt sysfs
Date: Thu, 12 May 2022 22:05:42 -0700 [thread overview]
Message-ID: <871qwy6mzd.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <eee5af37-c287-5c3d-d536-e6612d0fe8f8@linux.intel.com>
On Thu, 12 May 2022 00:48:08 -0700, Tvrtko Ursulin wrote:
Hi Tvrtko,
> On 12/05/2022 00:15, Dixit, Ashutosh wrote:
> > On Tue, 10 May 2022 03:41:57 -0700, Andrzej Hajda wrote:
> >> On 10.05.2022 11:48, Tvrtko Ursulin wrote:
> >>> On 10/05/2022 10:39, Andrzej Hajda wrote:
> >>>> On 10.05.2022 10:18, Tvrtko Ursulin wrote:
> >>>>>>>
> >>>>>>> Was there closure/agreement on the matter of whether or not there is
> >>>>>>> a potential race between "kfree(gt)" and sysfs access (last put from
> >>>>>>> sysfs that is)? I've noticed Andrzej and Ashutosh were discussing it
> >>>>>>> but did not read all the details.
> >>>>>>>
> >>>>>>
> >>>>>> Not really :)
> >>>>>> IMO docs are against this practice, Ashutosh shows examples of this
> >>>>>> practice in code and according to his analysis it is safe.
> >>>>>> I gave up looking for contradictions :) Either it is OK, kobject is
> >>>>>> not fully shared object, docs are obsolete and needs update, either
> >>>>>> the patch is wrong.
> >>>>>> Anyway finally I tend to accept this solution, I failed to prove it is
> >>>>>> wrong :)
> >>>>>
> >>>>> Like a question of whether hotunplug can be triggered while userspace
> >>>>> is sitting in a sysfs hook? Final kfree then has to be delayed until
> >>>>> userspace exists.
> >>>>>
> >>>>> Btw where is the "kfree(gt)" for the tiles on the PCI remove path? I
> >>>>> can't find it.. Do we have a leak?
> >>>>
> >>>> intel_gt_tile_cleanup ?
> >>>
> >>> Called from intel_gt_release_all, whose only caller is the failure path
> >>> of i915_driver_probe. Feels like something is missing?
> >>
> >> This is final proof this patch is safe - no kfree, no UAF :)
> >>
> >> Apparently it is broken in internal branch as well.
> >> Should I take care of it?
> >
> > See Daniele's comment here:
> >
> > https://patchwork.freedesktop.org/patch/478856/?series=101551&rev=1
>
> Yeah we found that same leak yesterday, or the day before in this thread.
>
> > We clean up the gt sysfs during PCI device remove (i915_driver_remove ->
> > i915_driver_unregister -> intel_gt_driver_unregister ->
> > intel_gt_sysfs_unregister (added in this patch)). But from Daniele's mail
> > it appears that "kfree(gt)" can only be done from i915_driver_release().
> >
> > So as long as i915_driver_release() always happens after
> > i915_driver_remove() (which seems to be the case though I couldn't figure
> > out why (i.e. who is putting the final reference of the drm device)) there
> > is no UAF and no race. Thanks!
>
> No worried by the unknown?
Well if release() happens before or during remove() then (as is clear from
Daniele's mail) we have a much bigger problem than sysfs on our hands and
will see UAF crashes during device remove/unbind. But as far as we know no
such crashes have been reported.
> I had a quick look whether core_hotunplug tests for sysfs interactions
> but couldn't spot it. What I had in mind is userspace stuck in a sysfs
> hook (say read into a userfaultfd buffer) with device hotunplug in
> parallel. Maybe it is all handled already, not claiming that it isn't.
This is the 20 year old issue mentioned by Andrzej here:
https://lwn.net/Articles/36850/
So I thought I'll try this out today and see what actually happens to
settle this. And you will see it makes perfect sense. So this is what I
did:
* Change IGT to add a 20 second sleep after opening a sysfs file
* In that 20 second period, with an open fd, unbind the device using:
echo -n "0000:03:00.0" > /sys/bus/pci/drivers/i915/unbind
And also rmmod i915.
So this is what we see when we do this:
* As soon as the device is unbound, the complete i915 sysfs tree (under
/sys/card/drm/card0) is cleanly removed (even with the open fd in IGT).
* The fd open in IGT is now orphan/invalid, so when IGT resumes and tries
to use that fd IGT crashes.
* So no problem with device unbind but if IGT is still hanging around rmmod
fails (saying module is in use, most likely due to the still open drm fd)
but after IGT is completely killed rmmod is also fine.
So this confirms all this is correctly handled.
Separately, note that kobject_put's introduced in this patch are only
needed for freeing the memory allocated for the kobject's themselves (or
their containing struct's). kobject_put's don't play a role in cleaning up
the sysfs hierarchy itself (which will get cleaned up even without the
kobject_put's). Further, child kobject's take a reference on their parents
so child kobjects need a kobject_put before the parent kobject_put to free
the memory allocated for the parent (i.e. doing a kobject_put on the parent
will not automatically free all the child kobjects).
Thanks.
--
Ashutosh
next prev parent reply other threads:[~2022-05-13 5:05 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-29 19:56 [Intel-gfx] [PATCH v4 0/8] drm/i915: Media freq factor and per-gt enhancements/fixes Ashutosh Dixit
2022-04-29 19:56 ` [Intel-gfx] [PATCH 1/8] drm/i915: Introduce has_media_ratio_mode Ashutosh Dixit
2022-04-29 19:56 ` [Intel-gfx] [PATCH 2/8] drm/i915/gt: Add media freq factor to per-gt sysfs Ashutosh Dixit
2022-05-10 7:24 ` Tvrtko Ursulin
2022-05-12 4:25 ` Dixit, Ashutosh
2022-04-29 19:56 ` [Intel-gfx] [PATCH 3/8] drm/i915/pcode: Extend pcode functions for multiple gt's Ashutosh Dixit
2022-05-02 12:54 ` Rodrigo Vivi
2022-05-10 6:51 ` Andi Shyti
2022-05-10 7:34 ` Tvrtko Ursulin
2022-05-10 7:43 ` Jani Nikula
2022-05-11 5:26 ` Dixit, Ashutosh
2022-05-11 8:18 ` Tvrtko Ursulin
2022-05-12 4:28 ` Dixit, Ashutosh
2022-04-29 19:56 ` [Intel-gfx] [PATCH 4/8] drm/i915/pcode: Add a couple of pcode helpers Ashutosh Dixit
2022-04-29 19:56 ` [Intel-gfx] [PATCH 5/8] drm/i915/gt: Add media RP0/RPn to per-gt sysfs Ashutosh Dixit
2022-05-10 7:37 ` Tvrtko Ursulin
2022-05-12 4:25 ` Dixit, Ashutosh
2022-04-29 19:56 ` [Intel-gfx] [PATCH 6/8] drm/i915/gt: Fix memory leaks in " Ashutosh Dixit
2022-05-10 6:02 ` Andi Shyti
2022-05-10 7:28 ` Tvrtko Ursulin
2022-05-10 7:58 ` Andrzej Hajda
2022-05-10 8:18 ` Tvrtko Ursulin
2022-05-10 9:39 ` Andrzej Hajda
2022-05-10 9:48 ` Tvrtko Ursulin
2022-05-10 10:41 ` Andrzej Hajda
2022-05-10 13:25 ` Tvrtko Ursulin
2022-05-11 23:15 ` Dixit, Ashutosh
2022-05-12 7:48 ` Tvrtko Ursulin
2022-05-13 5:05 ` Dixit, Ashutosh [this message]
2022-05-13 9:28 ` Tvrtko Ursulin
2022-04-29 19:56 ` [Intel-gfx] [PATCH 7/8] drm/i915/gt: Expose per-gt RPS defaults in sysfs Ashutosh Dixit
2022-05-10 7:53 ` Tvrtko Ursulin
2022-05-10 10:58 ` Andi Shyti
2022-05-26 19:10 ` Dixit, Ashutosh
2022-05-26 19:09 ` Dixit, Ashutosh
2022-04-29 19:56 ` [Intel-gfx] [PATCH 8/8] drm/i915/gt: Expose default value for media_freq_factor in per-gt sysfs Ashutosh Dixit
2022-04-29 20:37 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Media freq factor and per-gt enhancements/fixes (rev4) Patchwork
2022-04-29 20:37 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-04-29 21:10 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-04-29 23:38 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2022-04-30 0:44 ` Dixit, Ashutosh
2022-04-30 5:28 ` Vudum, Lakshminarayana
2022-04-30 5:09 ` [Intel-gfx] ✓ Fi.CI.IGT: success " Patchwork
-- strict thread matches above, loose matches on Subject: below --
2022-04-13 18:11 [Intel-gfx] [PATCH 0/8] drm/i915: Media freq factor and per-gt enhancements/fixes Ashutosh Dixit
2022-04-13 18:11 ` [Intel-gfx] [PATCH 6/8] drm/i915/gt: Fix memory leaks in per-gt sysfs Ashutosh Dixit
2022-04-13 19:14 ` Dixit, Ashutosh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=871qwy6mzd.wl-ashutosh.dixit@intel.com \
--to=ashutosh.dixit@intel.com \
--cc=andi.shyti@intel.com \
--cc=andrzej.hajda@intel.com \
--cc=intel-gfx@lists.freedesktop.org \
--cc=tvrtko.ursulin@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox