* [PATCH] powerpc: Fix device node refcounting
@ 2023-02-01 19:58 Brian King
2023-02-07 15:14 ` Nathan Lynch
0 siblings, 1 reply; 5+ messages in thread
From: Brian King @ 2023-02-01 19:58 UTC
To: linuxppc-dev; +Cc: Brian King, mmc, brking
While testing fixes to the hvcs hotplug code, kmemleak was reporting
potential memory leaks. This was tracked down to the struct device_node
object associated with the hvcs device. Looking at the leaked
object in crash showed that the kref in the kobject in the device_node
had a reference count of 1 still, and the release function was never
getting called as a result of this. This adds an of_node_put in
pSeries_reconfig_remove_node in order to balance the refcounting
so that we actually free the device_node in the case of it being
allocated in pSeries_reconfig_add_node.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---
arch/powerpc/platforms/pseries/reconfig.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/platforms/pseries/reconfig.c b/arch/powerpc/platforms/pseries/reconfig.c
index 599bd2c78514..8cb7309b19a4 100644
--- a/arch/powerpc/platforms/pseries/reconfig.c
+++ b/arch/powerpc/platforms/pseries/reconfig.c
@@ -77,6 +77,7 @@ static int pSeries_reconfig_remove_node(struct device_node *np)
 	}
 
 	of_detach_node(np);
+	of_node_put(np);
 	of_node_put(parent);
 	return 0;
 }
--
2.31.1
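To make the imbalance described in the commit message concrete, here is a minimal userspace sketch in C. It is not kernel code: `model_node`, `model_node_alloc`, `model_node_put` and `model_add_then_remove` are hypothetical names, and the model assumes that the attach and detach steps leave the count unchanged overall, so that the only outstanding reference is the initial one set when the node is initialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device_node; not the kernel type. */
struct model_node {
	int refcount;	/* models the kref inside the node's kobject */
	bool *released;	/* set to true when the release path runs */
};

/* Models allocation plus initialization: the count starts at 1. */
static struct model_node *model_node_alloc(bool *released)
{
	struct model_node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->refcount = 1;
	n->released = released;
	*released = false;
	return n;
}

/* Models a put: the release path runs only when the count reaches 0. */
static void model_node_put(struct model_node *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0) {
		*n->released = true;
		free(n);
	}
}

/*
 * Models an add followed by a remove. with_extra_put selects whether the
 * remove path drops the initial reference. Returns true if the release
 * path ran.
 */
static bool model_add_then_remove(bool with_extra_put)
{
	bool released = false;
	struct model_node *n = model_node_alloc(&released);

	if (!n)
		return false;
	/* Detach would happen here; it is count-neutral in this model. */
	if (with_extra_put)
		model_node_put(n);
	/* Demo-only cleanup. The kernel has no such fallback: this is the leak. */
	if (!released)
		free(n);
	return released;
}
```

In this model, `model_add_then_remove(false)` leaves the count at 1 and never reaches the release path, which matches the state seen in crash; `model_add_then_remove(true)` drops the count to 0 and releases the node.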
* Re: [PATCH] powerpc: Fix device node refcounting
From: Nathan Lynch @ 2023-02-07 15:14 UTC
To: Brian King, linuxppc-dev
Cc: Tyrel Datwyler, Scott Cheloha, mmc, nnac123, brking
(cc'ing a few possibly interested people)
Brian King <brking@linux.vnet.ibm.com> writes:
> While testing fixes to the hvcs hotplug code, kmemleak was reporting
> potential memory leaks. This was tracked down to the struct device_node
> object associated with the hvcs device. Looking at the leaked
> object in crash showed that the kref in the kobject in the device_node
> had a reference count of 1 still, and the release function was never
> getting called as a result of this. This adds an of_node_put in
> pSeries_reconfig_remove_node in order to balance the refcounting
> so that we actually free the device_node in the case of it being
> allocated in pSeries_reconfig_add_node.
My concern here would be whether the additional put is the right thing
to do in all cases. The questions it raises for me are:
- Is it safe for nodes that were present at boot, instead of added
dynamically?
- Is it correct for all types of nodes, or is there something specific
to hvcs that leaves a dangling refcount?
Just hoping we're not stepping into a situation where we're preventing
leaks in some situations but doing use-after-free in others. :-)
>
> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
> ---
> arch/powerpc/platforms/pseries/reconfig.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/platforms/pseries/reconfig.c b/arch/powerpc/platforms/pseries/reconfig.c
> index 599bd2c78514..8cb7309b19a4 100644
> --- a/arch/powerpc/platforms/pseries/reconfig.c
> +++ b/arch/powerpc/platforms/pseries/reconfig.c
> @@ -77,6 +77,7 @@ static int pSeries_reconfig_remove_node(struct device_node *np)
>  	}
>
>  	of_detach_node(np);
> +	of_node_put(np);
>  	of_node_put(parent);
>  	return 0;
In a situation like this where the of_node_put() call isn't obviously
connected to one of the of_ iterator APIs or similar, I would prefer a
comment indicating which "get" it balances. I suppose it corresponds to
the node initialization itself, i.e. the of_node_init() call sites in
pSeries_reconfig_add_node() and drivers/of/fdt.c::populate_node().
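One possible shape for such a comment is sketched below. The fragment is written so that it compiles outside the kernel: the `struct device_node` layout and the bodies of `of_node_put()` and `of_detach_node()` are stubs, and `remove_node_sketch()` is a hypothetical name that covers only the tail of the remove path shown in the hunk. The comment wording is a suggestion, on the assumption that the put does balance the initial reference from node initialization.

```c
#include <assert.h>
#include <stddef.h>

/* Stub type for illustration only; the kernel struct is far larger. */
struct device_node {
	int refcount;
	int detached;
};

/* Stub: drops one reference. The kernel version goes through kobject_put(). */
static void of_node_put(struct device_node *np)
{
	if (np) {
		assert(np->refcount > 0);
		np->refcount--;
	}
}

/* Stub: marks the node detached without touching the count. */
static int of_detach_node(struct device_node *np)
{
	np->detached = 1;
	return 0;
}

/*
 * Hypothetical tail of the remove path. The caller is assumed to hold
 * one reference on parent, taken earlier in the function.
 */
static int remove_node_sketch(struct device_node *np,
			      struct device_node *parent)
{
	of_detach_node(np);
	/*
	 * Balances the initial reference set when the node was
	 * initialized, i.e. of_node_init() in the add path.
	 */
	of_node_put(np);
	/* Balances the reference taken on the parent earlier. */
	of_node_put(parent);
	return 0;
}
```

Pairing each put with a named get in this way lets a reader audit the function without tracing every caller.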
* Re: [PATCH] powerpc: Fix device node refcounting
From: Brian King @ 2023-02-09 15:16 UTC
To: Nathan Lynch, linuxppc-dev
Cc: Tyrel Datwyler, Scott Cheloha, mmc, nnac123, brking
On 2/7/23 9:14 AM, Nathan Lynch wrote:
>
> (cc'ing a few possibly interested people)
>
> Brian King <brking@linux.vnet.ibm.com> writes:
>> While testing fixes to the hvcs hotplug code, kmemleak was reporting
>> potential memory leaks. This was tracked down to the struct device_node
>> object associated with the hvcs device. Looking at the leaked
>> object in crash showed that the kref in the kobject in the device_node
>> had a reference count of 1 still, and the release function was never
>> getting called as a result of this. This adds an of_node_put in
>> pSeries_reconfig_remove_node in order to balance the refcounting
>> so that we actually free the device_node in the case of it being
>> allocated in pSeries_reconfig_add_node.
>
> My concern here would be whether the additional put is the right thing
> to do in all cases. The questions it raises for me are:
>
> - Is it safe for nodes that were present at boot, instead of added
> dynamically?
Yes. of_node_release has a check to see if OF_DYNAMIC is set. If it is not set,
the release function is a noop.
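A small userspace sketch of that behavior, for illustration only: `boot_model_node`, `MODEL_OF_DYNAMIC`, `boot_model_release` and `boot_model_put` are hypothetical names, and the flag value is arbitrary rather than the kernel's. It models only the early return for nodes without the dynamic flag, not the rest of the release function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit; the value is arbitrary, not the kernel's. */
#define MODEL_OF_DYNAMIC 0x1u

/* Hypothetical stand-in for a device node. */
struct boot_model_node {
	unsigned int flags;
	int refcount;
	bool memory_freed;	/* records whether release freed anything */
};

/* Models the release function: a no-op unless the node is dynamic. */
static void boot_model_release(struct boot_model_node *n)
{
	if (!(n->flags & MODEL_OF_DYNAMIC))
		return;	/* node present at boot: nothing to free */
	n->memory_freed = true;
}

/* Models a put: release is invoked when the count reaches 0. */
static void boot_model_put(struct boot_model_node *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0)
		boot_model_release(n);
}
```

In this model, dropping the last reference on a node without the dynamic flag reaches the release function but frees nothing, while the same put on a dynamic node does free it.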
> - Is it correct for all types of nodes, or is there something specific
> to hvcs that leaves a dangling refcount?
I would welcome more testing and I shared the same concern. I did do some
DLPARs of a virtual ethernet device with the change along with CONFIG_PAGE_POISONING
enabled and did not run into any issues. However if I do a DLPAR remove of a virtual
ethernet device without the change with kmemleak enabled it does not detect any
leaked memory.
Thanks,
Brian
>
> Just hoping we're not stepping into a situation where we're preventing
> leaks in some situations but doing use-after-free in others. :-)
>
>>
>> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/platforms/pseries/reconfig.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/powerpc/platforms/pseries/reconfig.c b/arch/powerpc/platforms/pseries/reconfig.c
>> index 599bd2c78514..8cb7309b19a4 100644
>> --- a/arch/powerpc/platforms/pseries/reconfig.c
>> +++ b/arch/powerpc/platforms/pseries/reconfig.c
>> @@ -77,6 +77,7 @@ static int pSeries_reconfig_remove_node(struct device_node *np)
>>  	}
>>
>>  	of_detach_node(np);
>> +	of_node_put(np);
>>  	of_node_put(parent);
>>  	return 0;
>
> In a situation like this where the of_node_put() call isn't obviously
> connected to one of the of_ iterator APIs or similar, I would prefer a
> comment indicating which "get" it balances. I suppose it corresponds to
> the node initialization itself, i.e. the of_node_init() call sites in
> pSeries_reconfig_add_node() and drivers/of/fdt.c::populate_node().
--
Brian King
Power Linux I/O
IBM Linux Technology Center
* Re: [PATCH] powerpc: Fix device node refcounting
From: Nathan Lynch @ 2023-02-09 17:11 UTC
To: Brian King, linuxppc-dev
Cc: Tyrel Datwyler, Scott Cheloha, mmc, nnac123, brking
Brian King <brking@linux.vnet.ibm.com> writes:
> On 2/7/23 9:14 AM, Nathan Lynch wrote:
>> Brian King <brking@linux.vnet.ibm.com> writes:
>>> While testing fixes to the hvcs hotplug code, kmemleak was reporting
>>> potential memory leaks. This was tracked down to the struct device_node
>>> object associated with the hvcs device. Looking at the leaked
>>> object in crash showed that the kref in the kobject in the device_node
>>> had a reference count of 1 still, and the release function was never
>>> getting called as a result of this. This adds an of_node_put in
>>> pSeries_reconfig_remove_node in order to balance the refcounting
>>> so that we actually free the device_node in the case of it being
>>> allocated in pSeries_reconfig_add_node.
>>
>> My concern here would be whether the additional put is the right thing
>> to do in all cases. The questions it raises for me are:
>>
>> - Is it safe for nodes that were present at boot, instead of added
>> dynamically?
>
> Yes. of_node_release has a check to see if OF_DYNAMIC is set. If it is not set,
> the release function is a noop.
Yes, but to be more specific - does the additional of_node_put() risk
underflowing the refcount on nodes without the OF_DYNAMIC flag? I
suspect it's OK. If it's not, then I would expect to see warnings from
the refcount code when that case is exercised.
>
>> - Is it correct for all types of nodes, or is there something specific
>> to hvcs that leaves a dangling refcount?
>
> I would welcome more testing and I shared the same concern. I did do some
> DLPARs of a virtual ethernet device with the change along with CONFIG_PAGE_POISONING
> enabled and did not run into any issues. However if I do a DLPAR remove of a virtual
> ethernet device without the change with kmemleak enabled it does not detect any
> leaked memory.
Seems odd. If the change is generically correct, then without it applied
I would expect kmemleak to flag a leak on removal of any type of
dynamically-added node. On the other hand, if the change is for some
reason not correct for virtual ethernet devices, then I would expect it
to cause complaints from the refcount code and/or allocator debug
facilities. But if I understand correctly, neither of those things is
happening.
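The reasoning above can be laid out as a small userspace model. This is only a sketch: `checked_ref`, `checked_put`, `classify` and the `outcome` values are hypothetical names, and the underflow counter stands in for the warnings the kernel's refcount code would be expected to emit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical checked counter; warnings are counted instead of printed. */
struct checked_ref {
	int count;
	int underflow_warnings;
};

/* Drops one reference, recording a warning instead of going negative. */
static bool checked_put(struct checked_ref *r)
{
	if (r->count <= 0) {
		r->underflow_warnings++;
		return false;
	}
	return --r->count == 0;
}

enum outcome { OUTCOME_BALANCED, OUTCOME_LEAK, OUTCOME_UNDERFLOW };

/*
 * Classifies a node's lifetime given the references outstanding at
 * removal time and the number of puts the remove path performs.
 */
static enum outcome classify(int refs_at_removal, int puts)
{
	struct checked_ref r = { .count = refs_at_removal,
				 .underflow_warnings = 0 };

	assert(refs_at_removal >= 0 && puts >= 0);
	for (int i = 0; i < puts; i++)
		checked_put(&r);
	if (r.underflow_warnings > 0)
		return OUTCOME_UNDERFLOW;
	if (r.count > 0)
		return OUTCOME_LEAK;
	return OUTCOME_BALANCED;
}
```

With one outstanding reference, zero puts is a leak and one put is balanced; two puts is an underflow. A node type that shows neither a leak without the patch nor an underflow with it does not fit any single row of this model, which is the oddity noted above.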
* Re: [PATCH] powerpc: Fix device node refcounting
From: Brian King @ 2023-02-09 22:36 UTC
To: Nathan Lynch, linuxppc-dev
Cc: Tyrel Datwyler, Scott Cheloha, mmc, nnac123, brking
On 2/9/23 11:11 AM, Nathan Lynch wrote:
> Brian King <brking@linux.vnet.ibm.com> writes:
>> On 2/7/23 9:14 AM, Nathan Lynch wrote:
>>> Brian King <brking@linux.vnet.ibm.com> writes:
>>>> While testing fixes to the hvcs hotplug code, kmemleak was reporting
>>>> potential memory leaks. This was tracked down to the struct device_node
>>>> object associated with the hvcs device. Looking at the leaked
>>>> object in crash showed that the kref in the kobject in the device_node
>>>> had a reference count of 1 still, and the release function was never
>>>> getting called as a result of this. This adds an of_node_put in
>>>> pSeries_reconfig_remove_node in order to balance the refcounting
>>>> so that we actually free the device_node in the case of it being
>>>> allocated in pSeries_reconfig_add_node.
>>>
>>> My concern here would be whether the additional put is the right thing
>>> to do in all cases. The questions it raises for me are:
>>>
>>> - Is it safe for nodes that were present at boot, instead of added
>>> dynamically?
>>
>> Yes. of_node_release has a check to see if OF_DYNAMIC is set. If it is not set,
>> the release function is a noop.
>
> Yes, but to be more specific - does the additional of_node_put() risk
> underflowing the refcount on nodes without the OF_DYNAMIC flag? I
> suspect it's OK. If it's not, then I would expect to see warnings from
> the refcount code when that case is exercised.
Agreed. I have not seen any refcount underflow warnings in the testing I've done
so far.
>
>>
>>> - Is it correct for all types of nodes, or is there something specific
>>> to hvcs that leaves a dangling refcount?
>>
>> I would welcome more testing and I shared the same concern. I did do some
>> DLPARs of a virtual ethernet device with the change along with CONFIG_PAGE_POISONING
>> enabled and did not run into any issues. However if I do a DLPAR remove of a virtual
>> ethernet device without the change with kmemleak enabled it does not detect any
>> leaked memory.
>
> Seems odd. If the change is generically correct, then without it applied
> I would expect kmemleak to flag a leak on removal of any type of
> dynamically-added node. On the other hand, if the change is for some
> reason not correct for virtual ethernet devices, then I would expect it
> to cause complaints from the refcount code and/or allocator debug
> facilities. But if I understand correctly, neither of those things is
> happening.
Agreed. I'll do some more testing with and without the change and see
what that yields.
-Brian
--
Brian King
Power Linux I/O
IBM Linux Technology Center