Date: Thu, 28 Apr 2022 21:25:53 -0700
Message-ID:
<871qxg5xda.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Andrzej Hajda
References: <9ed5af1177ad08c7c2d9c5d9b32ab0154dbd950f.1650430271.git.ashutosh.dixit@intel.com>
 <1339a2be-5fd0-cf65-d361-06c60d938ce5@intel.com>
 <87levzag3a.wl-ashutosh.dixit@intel.com>
 <87ee1i5k58.wl-ashutosh.dixit@intel.com>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue)
 FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0
 Emacs/27.2 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
Subject: Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs
List-Id: Intel graphics driver community testing & development
Cc: intel-gfx@lists.freedesktop.org

On Thu, 28 Apr 2022 07:36:14 -0700, Andrzej Hajda wrote:
> On 27.04.2022 22:46, Dixit, Ashutosh wrote:
> > On Sun, 24 Apr 2022 15:36:23 -0700, Andi Shyti wrote:
> >> Hi Andrzej and Ashutosh,
> >>
> >>>>>> b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> >>>>>> index 937b2e1a305e..4c72b4f983a6 100644
> >>>>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> >>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> >>>>>> @@ -222,6 +222,9 @@ struct intel_gt {
> >>>>>>  	} mocs;
> >>>>>>  	struct intel_pxp pxp;
> >>>>>> +
> >>>>>> +	/* gt/gtN sysfs */
> >>>>>> +	struct kobject sysfs_gtn;
> >>>>> If you put kobject as a part of intel_gt what assures you that lifetime of
> >>>>> kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
> >>>>> intel_gt?
> >>>> Because we are explicitly doing a kobject_put() in
> >>>> intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
> >>>> the previous code.
> >>>>
> >>>> Let me explain a bit about the previous code (but feel free to skip since
> >>>> the patch should speak for itself):
> >>>> * Previously we kzalloc a 'struct kobj_gt'
> >>>> * But we don't save a pointer to the 'struct kobj_gt' so we don't have the
> >>>>   pointer to the kobject to be able to do a kobject_put() on it later
> >>>> * Therefore we need to store the pointer in 'struct intel_gt'
> >>>> * But if we have to put the pointer in 'struct intel_gt' we might as well
> >>>>   put the kobject as part of 'struct intel_gt' and that also removes the
> >>>>   need to have a 'struct kobj_gt' (kobj_to_gt() can just use container_of()
> >>>>   to get gt from kobj).
> >>>> * So I think this patch simpler/cleaner than the original code if you take
> >>>>   the requirement for kobject_put() into account.
> >> This is my oversight. This was something I completely forgot to
> >> fix but it was my intention to do and actually I had some fixes
> >> ongoing. But because this patch took too long to get in I
> >> completely forgot about it (Sujaritha was actually the first who
> >> pointed this out).
> >>
> >> Thanks, Ashutosh for taking this.
> >>
> >>> I fully agree that previous code is incorrect but I am not convinced current
> >>> code is correct.
> >>> If some objects are kref-counted it means usually they can have multiple
> >>> concurrent users and kobject_put does not work as traditional
> >>> destructor/cleanup/unregister.
> >>> So in this particular case after calling kobject_init_and_add sysfs core can
> >>> get multiple references on the object. Later, during driver unregistration
> >>> kobject_put is called, but if the object is still in use by sysfs core, the
> >>> object will not be destroyed/released. If the driver unregistration
> >>> continues memory will be freed, leaving sysfs-core (or other users) with
> >>> dangling pointers. Unless there is some additional synchronization mechanism
> >>> I am not aware of.
> >> Thanks Andrzej for summarizing this and what you said is actually
> >> what happens. I had a similar solution developed and I had wrong
> >> pointer reference happening.
> > Hi Andrzej/Andi,
> >
> > I did do some research into kobject's and such before writing this patch
> > and based on that I believe the patch is correct. Presenting some evidence
> > below.
> >
> > The patch is verified by:
> >
> > a. Putting a printk in the release() method when it exists (it does for
> >    sysfs_gtn kobject)
> > b. Enabling dynamic prints for lib/kobject.c
> >
> > For example, with the following:
> >
> > # echo 'file kobject.c +p' > /sys/kernel/debug/dynamic_debug/control
> > # echo -n "0000:03:00.0" > /sys/bus/pci/drivers/i915/unbind
> >
> > We see this in dmesg (see kobject_cleanup() called from kobject_put()):
> >
> > [ 1034.930007] kobject: '.defaults' (ffff88817130a640): kobject_cleanup, parent ffff8882262b5778
> > [ 1034.930020] kobject: '.defaults' (ffff88817130a640): auto cleanup kobject_del
> > [ 1034.930336] kobject: '.defaults' (ffff88817130a640): calling ktype release
> > [ 1034.930340] kobject: (ffff88817130a640): dynamic_kobj_release
> > [ 1034.930354] kobject: '.defaults': free name
> > [ 1034.930366] kobject: 'gt0' (ffff8882262b5778): kobject_cleanup, parent ffff88817130a240
> > [ 1034.930371] kobject: 'gt0' (ffff8882262b5778): auto cleanup kobject_del
> > [ 1034.931930] kobject: 'gt0' (ffff8882262b5778): calling ktype release
> > [ 1034.931936] kobject: 'gt0': free name
> > [ 1034.958004] kobject: 'i915_0000_03_00.0' (ffff88810e1f8800): fill_kobj_path: path = '/devices/i915_0000_03_00.0'
> > [ 1034.958155] kobject: 'i915_0000_03_00.0' (ffff88810e1f8800): kobject_cleanup, parent 0000000000000000
> > [ 1034.958162] kobject: 'i915_0000_03_00.0' (ffff88810e1f8800): calling ktype release
> > [ 1034.958188] kobject: 'i915_0000_03_00.0': free name
> > [ 1034.958729] kobject: 'gt' (ffff88817130a240): kobject_cleanup, parent ffff8881160c5000
> > [ 1034.958736] kobject: 'gt' (ffff88817130a240): auto cleanup kobject_del
> > [ 1034.958762] kobject: 'gt' (ffff88817130a240): calling ktype release
> > [ 1034.958767] kobject: (ffff88817130a240): dynamic_kobj_release
> > [ 1034.958778] kobject: 'gt': free name
> >
> > We have the following directory structure (one of the patches is creating
> > /sys/class/drm/card0/gt/gt0/.defaults):
> >
> >       /sys/class/drm/card0/gt
> >                            |-gt0
> >                               |-.defaults
> >
> > And we see from dmesg .defaults, gt0 and gt kobjects being cleaned up in
> > that order.
> >
> > Looking at lib/kobject.c there are several interesting things:
> >
> > * Three subsystems are involved: kobject, sysfs and kernfs.
> >
> > * A child kobject takes a reference on the parent, so we must do a
> >   kobject_put() on the child before doing kobject_put() on the parent
> >   (creating a child kobject creates a corresponding sub-directory in sysfs).
> >
> > * Adding files to a sysfs directory does not take a reference on the
> >   kobject, only on the parent kernfs_node.
> >
> > * Since we do call sysfs_create_group() (for RC6) ordinarily we will need
> >   to call sysfs_remove_group() but this does not seem to be needed because
> >   we are not creating a directory for the group (by providing a name for
> >   the group). So sysfs_create_group() is equivalent to sysfs_create_files().
> >   So it seems we don't need sysfs_remove_group().
> >
> > * Similarly it appears files created by sysfs_create_files() do not need to
> >   be removed by sysfs_remove_files() because __kobject_del() and
> >   sysfs_remove_dir() called from kobject_cleanup() do that for us (the
> >   comment in kobject_cleanup() says "remove from sysfs if the caller did
> >   not do it").
> >
> > Based on the above it is clear that no one except a child kobject takes a
> > reference on the parent kobject and as long as we kobject_put() them in the
> > correct order (as we seem to be doing based on dmesg trace above) we should
> > be ok.
> >
> > Also what is followed in this patch is a fairly standard coding
> > pattern. Further, in case of any errors we generally see failure to unload
> > the module etc. and none of these things are being observed, module reload
> > works fine.
> >
> > I hope these points are helpful in completing review of the patch.
>
> See [1], it is quite old, so maybe it is not valid anymore, but I see no
> code proving sth has changed.

Hi Andrzej,

A lot has changed since that article from 2003 (written for the 2.5 kernel).
For instance there is now kernfs (as I mentioned above):

https://lwn.net/Articles/571590/

In my view, a process holding a sysfs file open today results in the
following:

* It takes a reference on the kernfs_node (not on the kobject, as was the
  case in kernel 2.5 in [1])
* The open file prevents the module from being unloaded (rather than the
  kernel crashing, as in 2.5 in [1])

So this is what I would expect with today's kernel. I am not seeing anything
we've done here which violates anything in [1] or [2].

> Also current doc says also [2] similar things, especially:
> "Once you registered your kobject via kobject_add(), you must never use
> kfree() to free it directly"

Correct, we are using kobject_put(), not kfree'ing the kobject.

Thanks.
--
Ashutosh

> [1]: https://lwn.net/Articles/36850/
> [2]: https://elixir.bootlin.com/linux/v5.18-rc4/source/Documentation/core-api/kobject.rst#L246
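
For illustration only, the parent/child reference ordering discussed above can
be sketched as a small userspace model. This is not kernel code and makes no
claim about how lib/kobject.c is implemented: every model_* name below is
hypothetical, and the model deliberately leaves out sysfs and kernfs (so it
says nothing about references taken through open files). It only shows two
things: a child pinning its parent, so that putting child before parent
releases them in the order seen in the dmesg trace, and a put on a
still-referenced object releasing nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; this is NOT the kernel's struct kobject. */
struct model_kobj {
	const char *name;
	int refcount;
	struct model_kobj *parent;
	void (*release)(struct model_kobj *kobj);
};

/* Mirrors the layout in the patch: the kobject is embedded in the gt. */
struct model_gt {
	int id;
	struct model_kobj sysfs_gtn;
};

/* Same idea as container_of(): go from a member pointer to its container. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct model_gt *model_kobj_to_gt(struct model_kobj *kobj)
{
	return model_container_of(kobj, struct model_gt, sysfs_gtn);
}

static char release_log[64];	/* names in the order release() ran */
static int release_count;

static void model_release(struct model_kobj *kobj)
{
	strcat(release_log, kobj->name);
	strcat(release_log, " ");
	release_count++;
}

static void model_kobj_init(struct model_kobj *kobj, const char *name,
			    struct model_kobj *parent)
{
	kobj->name = name;
	kobj->refcount = 1;
	kobj->parent = parent;
	kobj->release = model_release;
	if (parent)
		parent->refcount++;	/* the child pins its parent */
}

static void model_kobj_put(struct model_kobj *kobj)
{
	struct model_kobj *parent = kobj->parent;

	assert(kobj->refcount > 0);
	if (--kobj->refcount > 0)
		return;			/* still referenced: nothing is released */
	kobj->release(kobj);
	if (parent)
		model_kobj_put(parent);	/* drop the reference the child held */
}

static struct model_kobj gt_dir, defaults;
static struct model_gt gt0;

/* Build the gt -> gt0 -> .defaults hierarchy from the directory tree above. */
static void model_register(void)
{
	release_log[0] = '\0';
	release_count = 0;
	model_kobj_init(&gt_dir, "gt", NULL);
	model_kobj_init(&gt0.sysfs_gtn, "gt0", &gt_dir);
	model_kobj_init(&defaults, ".defaults", &gt0.sysfs_gtn);
}

/* Child first, then parent: returns the order in which release() ran. */
static const char *model_unregister_in_order(void)
{
	model_register();
	model_kobj_put(&defaults);
	model_kobj_put(&gt0.sysfs_gtn);
	model_kobj_put(&gt_dir);
	return release_log;
}

/* Put only gt0 while .defaults still pins it: counts releases that ran. */
static int model_releases_after_parent_put_only(void)
{
	model_register();
	model_kobj_put(&gt0.sysfs_gtn);
	return release_count;
}
```

Calling model_unregister_in_order() reports the releases as .defaults, gt0,
gt, while model_releases_after_parent_put_only() reports that no release ran,
which is the situation Andrzej is concerned about whenever some other user
still holds a reference.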