[PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
@ 2024-11-28  5:39 Dmitry Torokhov
  2024-11-28  5:39 ` [PATCH 2/2] device property: fix UAF in device_get_next_child_node() Dmitry Torokhov
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Dmitry Torokhov @ 2024-11-28  5:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Andy Shevchenko, Rafael J . Wysocki,
	Sakari Ailus, Heikki Krogerus, Daniel Scally
  Cc: linux-acpi, linux-kernel

The documentation to various API calls that locate children for a given
fwnode (such as fwnode_get_next_available_child_node() or
device_get_next_child_node()) states that the reference to the node
passed in "child" argument is dropped unconditionally, however the
change that added checks for the main node to be NULL or error pointer
broke this promise.

Add missing fwnode_handle_put() calls to restore the documented
behavior.

Fixes: 002752af7b89 ("device property: Allow error pointer to be passed to fwnode APIs")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
---
 drivers/base/property.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/base/property.c b/drivers/base/property.c
index 837d77e3af2b..696ba43b8e8a 100644
--- a/drivers/base/property.c
+++ b/drivers/base/property.c
@@ -759,6 +759,12 @@ struct fwnode_handle *
 fwnode_get_next_child_node(const struct fwnode_handle *fwnode,
 			   struct fwnode_handle *child)
 {
+	if (IS_ERR_OR_NULL(fwnode) ||
+	    !fwnode_has_op(fwnode, get_next_child_node)) {
+		fwnode_handle_put(child);
+		return NULL;
+	}
+
 	return fwnode_call_ptr_op(fwnode, get_next_child_node, child);
 }
 EXPORT_SYMBOL_GPL(fwnode_get_next_child_node);
@@ -778,9 +784,6 @@ fwnode_get_next_available_child_node(const struct fwnode_handle *fwnode,
 {
 	struct fwnode_handle *next_child = child;
 
-	if (IS_ERR_OR_NULL(fwnode))
-		return NULL;
-
 	do {
 		next_child = fwnode_get_next_child_node(fwnode, next_child);
 		if (!next_child)
@@ -806,8 +809,10 @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
 	const struct fwnode_handle *fwnode = dev_fwnode(dev);
 	struct fwnode_handle *next;
 
-	if (IS_ERR_OR_NULL(fwnode))
+	if (IS_ERR_OR_NULL(fwnode)) {
+		fwnode_handle_put(child);
 		return NULL;
+	}
 
 	/* Try to find a child in primary fwnode */
 	next = fwnode_get_next_child_node(fwnode, child);
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/2] device property: fix UAF in device_get_next_child_node()
  2024-11-28  5:39 [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers Dmitry Torokhov
@ 2024-11-28  5:39 ` Dmitry Torokhov
  2024-11-28 13:20   ` Andy Shevchenko
  2024-11-28 11:49 ` [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers Greg Kroah-Hartman
  2024-11-28 13:13 ` Andy Shevchenko
  2 siblings, 1 reply; 17+ messages in thread
From: Dmitry Torokhov @ 2024-11-28  5:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Andy Shevchenko, Rafael J . Wysocki,
	Sakari Ailus, Heikki Krogerus, Daniel Scally
  Cc: linux-acpi, linux-kernel

fwnode_get_next_child_node() always drops reference to the node passed
as the "child" argument, which makes "child" pointer no longer valid
and we can not use it to scan the secondary node in case there are no
more children in primary one.

Also, it is not obvious whether it is safe to pass children of the
secondary node to fwnode_get_next_child_node() called on the primary
node in subsequent calls to device_get_next_child_node().

Fix the issue by checking whether the child node passed in is indeed a
child of primary or secondary node, and do not call
fwnode_get_next_child_node() for the wrong parent node. Also set the
"child" to NULL after unsuccessful call to fwnode_get_next_child_node()
on primary node to make sure secondary node's children are scanned from
the beginning.

Fixes: 114dbb4fa7c4 ("drivers property: When no children in primary, try secondary")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
---
 drivers/base/property.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/base/property.c b/drivers/base/property.c
index 696ba43b8e8a..0ca3c0908b0c 100644
--- a/drivers/base/property.c
+++ b/drivers/base/property.c
@@ -815,11 +815,26 @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
 	}
 
 	/* Try to find a child in primary fwnode */
-	next = fwnode_get_next_child_node(fwnode, child);
-	if (next)
-		return next;
+	if (!child || fwnode_get_parent(child) == fwnode) {
+		next = fwnode_get_next_child_node(fwnode, child);
+		if (next)
+			return next;
+		/*
+		 * We ran out of children in primary - reset the child
+		 * node to start from the beginning when scanning secondary
+		 * node.
+		 */
+		child = NULL;
+	}
 
 	/* When no more children in primary, continue with secondary */
+
+	if (IS_ERR_OR_NULL(fwnode->secondary) ||
+	    (child && fwnode_get_parent(child) != fwnode->secondary)) {
+		fwnode_handle_put(child);
+		return NULL;
+	}
+
 	return fwnode_get_next_child_node(fwnode->secondary, child);
 }
 EXPORT_SYMBOL_GPL(device_get_next_child_node);
-- 
2.47.0.338.g60cca15819-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-11-28  5:39 [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers Dmitry Torokhov
  2024-11-28  5:39 ` [PATCH 2/2] device property: fix UAF in device_get_next_child_node() Dmitry Torokhov
@ 2024-11-28 11:49 ` Greg Kroah-Hartman
  2024-11-28 13:13 ` Andy Shevchenko
  2 siblings, 0 replies; 17+ messages in thread
From: Greg Kroah-Hartman @ 2024-11-28 11:49 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Andy Shevchenko, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:
> The documentation to various API calls that locate children for a given
> fwnode (such as fwnode_get_next_available_child_node() or
> device_get_next_child_node()) states that the reference to the node
> passed in "child" argument is dropped unconditionally, however the
> change that added checks for the main node to be NULL or error pointer
> broke this promise.
> 
> Add missing fwnode_handle_put() calls to restore the documented
> behavior.
> 
> Fixes: 002752af7b89 ("device property: Allow error pointer to be passed to fwnode APIs")
> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> ---
>  drivers/base/property.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/base/property.c b/drivers/base/property.c
> index 837d77e3af2b..696ba43b8e8a 100644
> --- a/drivers/base/property.c
> +++ b/drivers/base/property.c
> @@ -759,6 +759,12 @@ struct fwnode_handle *
>  fwnode_get_next_child_node(const struct fwnode_handle *fwnode,
>  			   struct fwnode_handle *child)
>  {
> +	if (IS_ERR_OR_NULL(fwnode) ||
> +	    !fwnode_has_op(fwnode, get_next_child_node)) {
> +		fwnode_handle_put(child);
> +		return NULL;
> +	}
> +
>  	return fwnode_call_ptr_op(fwnode, get_next_child_node, child);
>  }
>  EXPORT_SYMBOL_GPL(fwnode_get_next_child_node);
> @@ -778,9 +784,6 @@ fwnode_get_next_available_child_node(const struct fwnode_handle *fwnode,
>  {
>  	struct fwnode_handle *next_child = child;
>  
> -	if (IS_ERR_OR_NULL(fwnode))
> -		return NULL;
> -
>  	do {
>  		next_child = fwnode_get_next_child_node(fwnode, next_child);
>  		if (!next_child)
> @@ -806,8 +809,10 @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
>  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
>  	struct fwnode_handle *next;
>  
> -	if (IS_ERR_OR_NULL(fwnode))
> +	if (IS_ERR_OR_NULL(fwnode)) {
> +		fwnode_handle_put(child);
>  		return NULL;
> +	}
>  
>  	/* Try to find a child in primary fwnode */
>  	next = fwnode_get_next_child_node(fwnode, child);
> -- 
> 2.47.0.338.g60cca15819-goog
> 

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- You have marked a patch with a "Fixes:" tag for a commit that is in an
  older released kernel, yet you do not have a cc: stable line in the
  signed-off-by area at all, which means that the patch will not be
  applied to any older kernel releases.  To properly fix this, please
  follow the documented rules in the
  Documentation/process/stable-kernel-rules.rst file for how to resolve
  this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-11-28  5:39 [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers Dmitry Torokhov
  2024-11-28  5:39 ` [PATCH 2/2] device property: fix UAF in device_get_next_child_node() Dmitry Torokhov
  2024-11-28 11:49 ` [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers Greg Kroah-Hartman
@ 2024-11-28 13:13 ` Andy Shevchenko
  2024-11-28 23:04   ` Dmitry Torokhov
  2 siblings, 1 reply; 17+ messages in thread
From: Andy Shevchenko @ 2024-11-28 13:13 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:
> The documentation to various API calls that locate children for a given
> fwnode (such as fwnode_get_next_available_child_node() or
> device_get_next_child_node()) states that the reference to the node
> passed in "child" argument is dropped unconditionally, however the
> change that added checks for the main node to be NULL or error pointer
> broke this promise.

This commit message doesn't explain a use case. Hence it might be just
a documentation issue, please elaborate.

> Add missing fwnode_handle_put() calls to restore the documented
> behavior.

...

While at it, please fix the kernel-doc (missing Return section).

>  {
> +	if (IS_ERR_OR_NULL(fwnode) ||

Unneeded check as fwnode_has_op() has it already.

> +	    !fwnode_has_op(fwnode, get_next_child_node)) {
> +		fwnode_handle_put(child);
> +		return NULL;
> +	}

>  	return fwnode_call_ptr_op(fwnode, get_next_child_node, child);

Now it's useless to call the macro, you can simply take the direct call.

>  }

...

> @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
>  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
>  	struct fwnode_handle *next;

> -	if (IS_ERR_OR_NULL(fwnode))
> +	if (IS_ERR_OR_NULL(fwnode)) {
> +		fwnode_handle_put(child);
>  		return NULL;
> +	}

>  	/* Try to find a child in primary fwnode */
>  	next = fwnode_get_next_child_node(fwnode, child);

So, why not just moving the original check (w/o dropping the reference) here?
Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] device property: fix UAF in device_get_next_child_node()
  2024-11-28  5:39 ` [PATCH 2/2] device property: fix UAF in device_get_next_child_node() Dmitry Torokhov
@ 2024-11-28 13:20   ` Andy Shevchenko
  2024-11-28 23:16     ` Dmitry Torokhov
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Shevchenko @ 2024-11-28 13:20 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Wed, Nov 27, 2024 at 09:39:35PM -0800, Dmitry Torokhov wrote:
> fwnode_get_next_child_node() always drops reference to the node passed
> as the "child" argument,

As commented previously,it might be just a documentation bug. So, please
elaborate on the use case before this patch that leads to an issue.

> which makes "child" pointer no longer valid
> and we can not use it to scan the secondary node in case there are no
> more children in primary one.
> 
> Also, it is not obvious whether it is safe to pass children of the
> secondary node to fwnode_get_next_child_node() called on the primary
> node in subsequent calls to device_get_next_child_node().
> 
> Fix the issue by checking whether the child node passed in is indeed a
> child of primary or secondary node, and do not call
> fwnode_get_next_child_node() for the wrong parent node. Also set the
> "child" to NULL after unsuccessful call to fwnode_get_next_child_node()
> on primary node to make sure secondary node's children are scanned from
> the beginning.

To me it sounds over complicated. Why not just take reference to the child once
more and put it after we find next in either cases? Current solution provides
a lot of duplication and makes function less understandable.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-11-28 13:13 ` Andy Shevchenko
@ 2024-11-28 23:04   ` Dmitry Torokhov
  2024-11-29 14:50     ` Andy Shevchenko
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Torokhov @ 2024-11-28 23:04 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:
> > The documentation to various API calls that locate children for a given
> > fwnode (such as fwnode_get_next_available_child_node() or
> > device_get_next_child_node()) states that the reference to the node
> > passed in "child" argument is dropped unconditionally, however the
> > change that added checks for the main node to be NULL or error pointer
> > broke this promise.
> 
> This commit message doesn't explain a use case. Hence it might be just
> a documentation issue, please elaborate.

I do not have a specific use case in mind, however the implementation
behavior does not match the stated one, and so it makes sense to get it
fixed. Otherwise callers would have to add checks to conditionally drop
the reference to "child" argument in certain cases, which will
complicate caller's code.

> 
> > Add missing fwnode_handle_put() calls to restore the documented
> > behavior.
> 
> ...
> 
> While at it, please fix the kernel-doc (missing Return section).

OK.

> 
> >  {
> > +	if (IS_ERR_OR_NULL(fwnode) ||
> 
> Unneeded check as fwnode_has_op() has it already.

Yes, it has, but that is not obvious nor it is a documented behavior of
fwnode_has_op(). It also different semantics: it checks whether a fwnode
implements a given operation, not whether fwnode is valid. That check is
incidental in fwnode_has_op().

They all are macros so compiler should collapse duplicate checks, but if
you feel really strongly about it I can drop IS_ERR_OR_NULL() check.

> 
> > +	    !fwnode_has_op(fwnode, get_next_child_node)) {
> > +		fwnode_handle_put(child);
> > +		return NULL;
> > +	}
> 
> >  	return fwnode_call_ptr_op(fwnode, get_next_child_node, child);
> 
> Now it's useless to call the macro, you can simply take the direct call.

OK, will change to a direct call.

> 
> >  }
> 
> ...
> 
> > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> >  	struct fwnode_handle *next;
> 
> > -	if (IS_ERR_OR_NULL(fwnode))
> > +	if (IS_ERR_OR_NULL(fwnode)) {
> > +		fwnode_handle_put(child);
> >  		return NULL;
> > +	}
> 
> >  	/* Try to find a child in primary fwnode */
> >  	next = fwnode_get_next_child_node(fwnode, child);
> 
> So, why not just moving the original check (w/o dropping the reference) here?
> Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?

Because if you rely on check in fwnode_get_next_child_node() you would
not know if it returned NULL because there are no more children or
because the node is invalid. In the latter case you can't dereference
fwnode->secondary.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] device property: fix UAF in device_get_next_child_node()
  2024-11-28 13:20   ` Andy Shevchenko
@ 2024-11-28 23:16     ` Dmitry Torokhov
  2024-12-09 18:11       ` Andy Shevchenko
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Torokhov @ 2024-11-28 23:16 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Thu, Nov 28, 2024 at 03:20:09PM +0200, Andy Shevchenko wrote:
> On Wed, Nov 27, 2024 at 09:39:35PM -0800, Dmitry Torokhov wrote:
> > fwnode_get_next_child_node() always drops reference to the node passed
> > as the "child" argument,
> 
> As commented previously,it might be just a documentation bug. So, please

No, absolutely not. Consider calling device_get_next_child_node() with
"child" pointing to the last child of the primary fwnode.
fwnode_get_next_child_node() will drop the reference to "child"
rendering it unusable, and return NULL. The code will go and call
fwnode_get_next_child_node(fwnode->secondary, child) with invalid child
pointer, resulting in UAF condition.

Also, consider what happens next. Let's say we did not crash and instead
returned first child of the secondary node (let's assume primary is an
OF node and secondary is a software node). On the next iteration of
device_get_next_child_node() we will call
fwnode_get_next_child_node(fwnode, child) which results in passing
swnode child to of_fwnode_get_next_child_node() which may or may not
work. It all is very fragile.

That is why it is best to check if the child argument is indeed a child
to a given parent before calling fwnode_get_next_child_node() on them.

> elaborate on the use case before this patch that leads to an issue.
> 
> > which makes "child" pointer no longer valid
> > and we can not use it to scan the secondary node in case there are no
> > more children in primary one.
> > 
> > Also, it is not obvious whether it is safe to pass children of the
> > secondary node to fwnode_get_next_child_node() called on the primary
> > node in subsequent calls to device_get_next_child_node().
> > 
> > Fix the issue by checking whether the child node passed in is indeed a
> > child of primary or secondary node, and do not call
> > fwnode_get_next_child_node() for the wrong parent node. Also set the
> > "child" to NULL after unsuccessful call to fwnode_get_next_child_node()
> > on primary node to make sure secondary node's children are scanned from
> > the beginning.
> 
> To me it sounds over complicated. Why not just take reference to the child once
> more and put it after we find next in either cases?

You want to "reset" when switching from primary node over to secondary
instead of hoping that passing child pointer which is not really a child
to secondary node will somehow cause fwnode_get_next_child_node() to
return first its child.

> Current solution provides
> a lot of duplication and makes function less understandable.

The simplicity of the old version is deceiving. See the explanation
above.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-11-28 23:04   ` Dmitry Torokhov
@ 2024-11-29 14:50     ` Andy Shevchenko
  2024-11-30  7:16       ` Dmitry Torokhov
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Shevchenko @ 2024-11-29 14:50 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:
> > > The documentation to various API calls that locate children for a given
> > > fwnode (such as fwnode_get_next_available_child_node() or
> > > device_get_next_child_node()) states that the reference to the node
> > > passed in "child" argument is dropped unconditionally, however the
> > > change that added checks for the main node to be NULL or error pointer
> > > broke this promise.
> > 
> > This commit message doesn't explain a use case. Hence it might be just
> > a documentation issue, please elaborate.
> 
> I do not have a specific use case in mind, however the implementation
> behavior does not match the stated one, and so it makes sense to get it
> fixed. Otherwise callers would have to add checks to conditionally drop
> the reference to "child" argument in certain cases, which will
> complicate caller's code.

Perhaps this should be somewhere between the cover letter / commit message?

> > > Add missing fwnode_handle_put() calls to restore the documented
> > > behavior.

...

> > >  {
> > > +	if (IS_ERR_OR_NULL(fwnode) ||
> > 
> > Unneeded check as fwnode_has_op() has it already.
> 
> Yes, it has, but that is not obvious nor it is a documented behavior of
> fwnode_has_op().

Would like to document that then?

> It also different semantics: it checks whether a fwnode
> implements a given operation, not whether fwnode is valid. That check is
> incidental in fwnode_has_op().

I kinda disagree on this. The invalid fwnode may not have any operations,
so it's implied and will always be like that.

> They all are macros so compiler should collapse duplicate checks, but if
> you feel really strongly about it I can drop IS_ERR_OR_NULL() check.

Yes, please drop it and rather we want fwnode_has_op() to be documented with
main purpose and guaranteed side effect (the latter makes no need of
duplication that I pointed out).

> > > +	    !fwnode_has_op(fwnode, get_next_child_node)) {
> > > +		fwnode_handle_put(child);
> > > +		return NULL;
> > > +	}

...

> > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > >  	struct fwnode_handle *next;
> > 
> > > -	if (IS_ERR_OR_NULL(fwnode))
> > > +	if (IS_ERR_OR_NULL(fwnode)) {
> > > +		fwnode_handle_put(child);
> > >  		return NULL;
> > > +	}
> > 
> > >  	/* Try to find a child in primary fwnode */
> > >  	next = fwnode_get_next_child_node(fwnode, child);
> > 
> > So, why not just moving the original check (w/o dropping the reference) here?
> > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
> 
> Because if you rely on check in fwnode_get_next_child_node() you would
> not know if it returned NULL because there are no more children or
> because the node is invalid. In the latter case you can't dereference
> fwnode->secondary.

Yes, so, how does it contradict my proposal?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-11-29 14:50     ` Andy Shevchenko
@ 2024-11-30  7:16       ` Dmitry Torokhov
  2024-11-30 21:44         ` Andy Shevchenko
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Torokhov @ 2024-11-30  7:16 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Fri, Nov 29, 2024 at 04:50:15PM +0200, Andy Shevchenko wrote:
> On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> > On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:
> > > > The documentation to various API calls that locate children for a given
> > > > fwnode (such as fwnode_get_next_available_child_node() or
> > > > device_get_next_child_node()) states that the reference to the node
> > > > passed in "child" argument is dropped unconditionally, however the
> > > > change that added checks for the main node to be NULL or error pointer
> > > > broke this promise.
> > > 
> > > This commit message doesn't explain a use case. Hence it might be just
> > > a documentation issue, please elaborate.
> > 
> > I do not have a specific use case in mind, however the implementation
> > behavior does not match the stated one, and so it makes sense to get it
> > fixed. Otherwise callers would have to add checks to conditionally drop
> > the reference to "child" argument in certain cases, which will
> > complicate caller's code.
> 
> Perhaps this should be somewhere between the cover letter / commit message?

OK, I thought that it was pretty obvious, but I will expand the commit
message to include this.

> 
> > > > Add missing fwnode_handle_put() calls to restore the documented
> > > > behavior.
> 
> ...
> 
> > > >  {
> > > > +	if (IS_ERR_OR_NULL(fwnode) ||
> > > 
> > > Unneeded check as fwnode_has_op() has it already.
> > 
> > Yes, it has, but that is not obvious nor it is a documented behavior of
> > fwnode_has_op().
> 
> Would like to document that then?
> 
> > It also different semantics: it checks whether a fwnode
> > implements a given operation, not whether fwnode is valid. That check is
> > incidental in fwnode_has_op().
> 
> I kinda disagree on this. The invalid fwnode may not have any operations,
> so it's implied and will always be like that.

Yeah, it is clear that we disagree. I agree that invalid fwnode will not
have an operation defined, still checking whether an operation is
supported and whether a node is invalid or not are 2 different
operations to me. But we do not need to argue further.

> 
> > They all are macros so compiler should collapse duplicate checks, but if
> > you feel really strongly about it I can drop IS_ERR_OR_NULL() check.
> 
> Yes, please drop it and rather we want fwnode_has_op() to be documented with
> main purpose and guaranteed side effect (the latter makes no need of
> duplication that I pointed out).

OK.

> 
> > > > +	    !fwnode_has_op(fwnode, get_next_child_node)) {
> > > > +		fwnode_handle_put(child);
> > > > +		return NULL;
> > > > +	}
> 
> ...
> 
> > > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > > >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > > >  	struct fwnode_handle *next;
> > > 
> > > > -	if (IS_ERR_OR_NULL(fwnode))
> > > > +	if (IS_ERR_OR_NULL(fwnode)) {
> > > > +		fwnode_handle_put(child);
> > > >  		return NULL;
> > > > +	}
> > > 
> > > >  	/* Try to find a child in primary fwnode */
> > > >  	next = fwnode_get_next_child_node(fwnode, child);
> > > 
> > > So, why not just moving the original check (w/o dropping the reference) here?
> > > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
> > 
> > Because if you rely on check in fwnode_get_next_child_node() you would
> > not know if it returned NULL because there are no more children or
> > because the node is invalid. In the latter case you can't dereference
> > fwnode->secondary.
> 
> Yes, so, how does it contradict my proposal?

I guess I misunderstood your proposal then. Could you please explain it
in more detail?

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-11-30  7:16       ` Dmitry Torokhov
@ 2024-11-30 21:44         ` Andy Shevchenko
  2024-12-03  5:49           ` Dmitry Torokhov
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Shevchenko @ 2024-11-30 21:44 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Fri, Nov 29, 2024 at 11:16:54PM -0800, Dmitry Torokhov wrote:
> On Fri, Nov 29, 2024 at 04:50:15PM +0200, Andy Shevchenko wrote:
> > On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> > > On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > > > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:

...

> > > > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > > > >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > > > >  	struct fwnode_handle *next;
> > > > 
> > > > > -	if (IS_ERR_OR_NULL(fwnode))
> > > > > +	if (IS_ERR_OR_NULL(fwnode)) {
> > > > > +		fwnode_handle_put(child);
> > > > >  		return NULL;
> > > > > +	}
> > > > 
> > > > >  	/* Try to find a child in primary fwnode */
> > > > >  	next = fwnode_get_next_child_node(fwnode, child);
> > > > 
> > > > So, why not just moving the original check (w/o dropping the reference) here?
> > > > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
> > > 
> > > Because if you rely on check in fwnode_get_next_child_node() you would
> > > not know if it returned NULL because there are no more children or
> > > because the node is invalid. In the latter case you can't dereference
> > > fwnode->secondary.
> > 
> > Yes, so, how does it contradict my proposal?
> 
> I guess I misunderstood your proposal then. Could you please explain it
> in more detail?


Current code (in steps):
	if (IS_ERR_OR_NULL()) check
	trying primary
	trying secondary if previous is NULL


My proposal

	trying primary
	return if not NULL
	if (IS_ERR_OR_NULL()) check in its current form (no put op)
	trying secondary

After your first patch IIUC this is possible as trying primary will put child uncoditionally.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-11-30 21:44         ` Andy Shevchenko
@ 2024-12-03  5:49           ` Dmitry Torokhov
  2024-12-03 13:27             ` Andy Shevchenko
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Torokhov @ 2024-12-03  5:49 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Sat, Nov 30, 2024 at 11:44:04PM +0200, Andy Shevchenko wrote:
> On Fri, Nov 29, 2024 at 11:16:54PM -0800, Dmitry Torokhov wrote:
> > On Fri, Nov 29, 2024 at 04:50:15PM +0200, Andy Shevchenko wrote:
> > > On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> > > > On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > > > > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:
> 
> ...
> 
> > > > > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > > > > >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > > > > >  	struct fwnode_handle *next;
> > > > > 
> > > > > > -	if (IS_ERR_OR_NULL(fwnode))
> > > > > > +	if (IS_ERR_OR_NULL(fwnode)) {
> > > > > > +		fwnode_handle_put(child);
> > > > > >  		return NULL;
> > > > > > +	}
> > > > > 
> > > > > >  	/* Try to find a child in primary fwnode */
> > > > > >  	next = fwnode_get_next_child_node(fwnode, child);
> > > > > 
> > > > > So, why not just moving the original check (w/o dropping the reference) here?
> > > > > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
> > > > 
> > > > Because if you rely on check in fwnode_get_next_child_node() you would
> > > > not know if it returned NULL because there are no more children or
> > > > because the node is invalid. In the latter case you can't dereference
> > > > fwnode->secondary.
> > > 
> > > Yes, so, how does it contradict my proposal?
> > 
> > I guess I misunderstood your proposal then. Could you please explain it
> > in more detail?
> 
> 
> Current code (in steps):
> 	if (IS_ERR_OR_NULL()) check
> 	trying primary
> 	trying secondary if previous is NULL
> 
> 
> My proposal
> 
> 	trying primary
> 	return if not NULL
> 	if (IS_ERR_OR_NULL()) check in its current form (no put op)
> 	trying secondary
> 
> After your first patch IIUC this is possible as trying primary will put child uncoditionally.

Ah, I see. No, I do not think this is a good idea: it will make the code
harder to understand for a casual reader: "Why do we check node validity
only after we used it for the first time?"

For the code not in a hot path there is a lot of value in simplicity.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-12-03  5:49           ` Dmitry Torokhov
@ 2024-12-03 13:27             ` Andy Shevchenko
  2024-12-03 22:45               ` Dmitry Torokhov
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Shevchenko @ 2024-12-03 13:27 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Mon, Dec 02, 2024 at 09:49:06PM -0800, Dmitry Torokhov wrote:
> On Sat, Nov 30, 2024 at 11:44:04PM +0200, Andy Shevchenko wrote:
> > On Fri, Nov 29, 2024 at 11:16:54PM -0800, Dmitry Torokhov wrote:
> > > On Fri, Nov 29, 2024 at 04:50:15PM +0200, Andy Shevchenko wrote:
> > > > On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> > > > > On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > > > > > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:

...

> > > > > > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > > > > > >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > > > > > >  	struct fwnode_handle *next;
> > > > > > 
> > > > > > > -	if (IS_ERR_OR_NULL(fwnode))
> > > > > > > +	if (IS_ERR_OR_NULL(fwnode)) {
> > > > > > > +		fwnode_handle_put(child);
> > > > > > >  		return NULL;
> > > > > > > +	}
> > > > > > 
> > > > > > >  	/* Try to find a child in primary fwnode */
> > > > > > >  	next = fwnode_get_next_child_node(fwnode, child);
> > > > > > 
> > > > > > So, why not just moving the original check (w/o dropping the reference) here?
> > > > > > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
> > > > > 
> > > > > Because if you rely on check in fwnode_get_next_child_node() you would
> > > > > not know if it returned NULL because there are no more children or
> > > > > because the node is invalid. In the latter case you can't dereference
> > > > > fwnode->secondary.
> > > > 
> > > > Yes, so, how does it contradict my proposal?
> > > 
> > > I guess I misunderstood your proposal then. Could you please explain it
> > > in more detail?
> > 
> > 
> > Current code (in steps):
> > 	if (IS_ERR_OR_NULL()) check
> > 	trying primary
> > 	trying secondary if previous is NULL
> > 
> > 
> > My proposal
> > 
> > 	trying primary
> > 	return if not NULL
> > 	if (IS_ERR_OR_NULL()) check in its current form (no put op)
> > 	trying secondary
> > 
> > After your first patch IIUC this is possible as trying primary will put child uncoditionally.
> 
> Ah, I see. No, I do not think this is a good idea: it will make the code
> harder to understand for a casual reader: "Why do we check node validity
> only after we used it for the first time?"

Theare a re already a few API calls there that are hard to understand, I spent
some time on them to get it through and still got it wrong as this series
shows. So, I don't think we anyhow change this.

> For the code not in a hot path there is a lot of value in simplicity.

If you really want to go to this rabbit hole, think how we can get rid of
repetitive checks of the secondary or more if any in the future nodes in the
list.

So the basic idea is to have this all hidden (to some extent) behind the macro
or alike. In the code it would be something as

  for node in primary, secondary, ...
    call the API
    if (okay)
	return result

  return error

This will indeed help.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-12-03 13:27             ` Andy Shevchenko
@ 2024-12-03 22:45               ` Dmitry Torokhov
  2024-12-04  1:16                 ` Andy Shevchenko
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Torokhov @ 2024-12-03 22:45 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Tue, Dec 03, 2024 at 03:27:31PM +0200, Andy Shevchenko wrote:
> On Mon, Dec 02, 2024 at 09:49:06PM -0800, Dmitry Torokhov wrote:
> > On Sat, Nov 30, 2024 at 11:44:04PM +0200, Andy Shevchenko wrote:
> > > On Fri, Nov 29, 2024 at 11:16:54PM -0800, Dmitry Torokhov wrote:
> > > > On Fri, Nov 29, 2024 at 04:50:15PM +0200, Andy Shevchenko wrote:
> > > > > On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> > > > > > On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > > > > > > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:
> 
> ...
> 
> > > > > > > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > > > > > > >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > > > > > > >  	struct fwnode_handle *next;
> > > > > > > 
> > > > > > > > -	if (IS_ERR_OR_NULL(fwnode))
> > > > > > > > +	if (IS_ERR_OR_NULL(fwnode)) {
> > > > > > > > +		fwnode_handle_put(child);
> > > > > > > >  		return NULL;
> > > > > > > > +	}
> > > > > > > 
> > > > > > > >  	/* Try to find a child in primary fwnode */
> > > > > > > >  	next = fwnode_get_next_child_node(fwnode, child);
> > > > > > > 
> > > > > > > So, why not just moving the original check (w/o dropping the reference) here?
> > > > > > > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
> > > > > > 
> > > > > > Because if you rely on check in fwnode_get_next_child_node() you would
> > > > > > not know if it returned NULL because there are no more children or
> > > > > > because the node is invalid. In the latter case you can't dereference
> > > > > > fwnode->secondary.
> > > > > 
> > > > > Yes, so, how does it contradict my proposal?
> > > > 
> > > > I guess I misunderstood your proposal then. Could you please explain it
> > > > in more detail?
> > > 
> > > 
> > > Current code (in steps):
> > > 	if (IS_ERR_OR_NULL()) check
> > > 	trying primary
> > > 	trying secondary if previous is NULL
> > > 
> > > 
> > > My proposal
> > > 
> > > 	trying primary
> > > 	return if not NULL
> > > 	if (IS_ERR_OR_NULL()) check in its current form (no put op)
> > > 	trying secondary
> > > 
> > > After your first patch IIUC this is possible as trying primary will put child uncoditionally.
> > 
> > Ah, I see. No, I do not think this is a good idea: it will make the code
> > harder to understand for a casual reader: "Why do we check node validity
> > only after we used it for the first time?"
> 
> Theare a re already a few API calls there that are hard to understand, I spent
> some time on them to get it through and still got it wrong as this series
> shows. So, I don't think we anyhow change this.

The fact that some code is confusing does not mean that we should add
more confusing code. We will not fix everything at once, but we can make
things better bit by bit.

Look, the check where it is now makes total sense, you added it there
yourself! It checks that we are dealing with a valid node and returns
early. The intent is very easy to understand and the only thing that is
missing is that "put" operation to satisfy the documented behavior.
Anything more just makes things more complex for no good reason.

> 
> > For the code not in a hot path there is a lot of value in simplicity.
> 
> If you really want to go to this rabbit hole, think how we can get rid of
> repetitive checks of the secondary or more if any in the future nodes in the
> list.
> 
> So the basic idea is to have this all hidden (to some extent) behind the macro
> or alike. In the code it would be something as
> 
>   for node in primary, secondary, ...
>     call the API
>     if (okay)
> 	return result
> 
>   return error
> 
> This will indeed help.

I think this will indeed help if we ever going to have more than primary
and secondary nodes. It is also tricky if you want to transition
seamlessly between different types of nodes (i.e. you have ACPI primary
with OF overlay secondary with swnode as tertiary etc). And you probably
want to add support for references between different typesof nodes
(i.e. swnode being able to reference OF device node for example).

This kind of rework is however out of scope of what I have time to do at
the moment.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-12-03 22:45               ` Dmitry Torokhov
@ 2024-12-04  1:16                 ` Andy Shevchenko
  2024-12-05 20:57                   ` Dmitry Torokhov
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Shevchenko @ 2024-12-04  1:16 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Tue, Dec 03, 2024 at 02:45:49PM -0800, Dmitry Torokhov wrote:
> On Tue, Dec 03, 2024 at 03:27:31PM +0200, Andy Shevchenko wrote:
> > On Mon, Dec 02, 2024 at 09:49:06PM -0800, Dmitry Torokhov wrote:
> > > On Sat, Nov 30, 2024 at 11:44:04PM +0200, Andy Shevchenko wrote:
> > > > On Fri, Nov 29, 2024 at 11:16:54PM -0800, Dmitry Torokhov wrote:
> > > > > On Fri, Nov 29, 2024 at 04:50:15PM +0200, Andy Shevchenko wrote:
> > > > > > On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> > > > > > > On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > > > > > > > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:

...

> > > > > > > > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > > > > > > > >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > > > > > > > >  	struct fwnode_handle *next;
> > > > > > > > 
> > > > > > > > > -	if (IS_ERR_OR_NULL(fwnode))
> > > > > > > > > +	if (IS_ERR_OR_NULL(fwnode)) {
> > > > > > > > > +		fwnode_handle_put(child);
> > > > > > > > >  		return NULL;
> > > > > > > > > +	}
> > > > > > > > 
> > > > > > > > >  	/* Try to find a child in primary fwnode */
> > > > > > > > >  	next = fwnode_get_next_child_node(fwnode, child);
> > > > > > > > 
> > > > > > > > So, why not just moving the original check (w/o dropping the reference) here?
> > > > > > > > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
> > > > > > > 
> > > > > > > Because if you rely on check in fwnode_get_next_child_node() you would
> > > > > > > not know if it returned NULL because there are no more children or
> > > > > > > because the node is invalid. In the latter case you can't dereference
> > > > > > > fwnode->secondary.
> > > > > > 
> > > > > > Yes, so, how does it contradict my proposal?
> > > > > 
> > > > > I guess I misunderstood your proposal then. Could you please explain it
> > > > > in more detail?
> > > > 
> > > > 
> > > > Current code (in steps):
> > > > 	if (IS_ERR_OR_NULL()) check
> > > > 	trying primary
> > > > 	trying secondary if previous is NULL
> > > > 
> > > > 
> > > > My proposal
> > > > 
> > > > 	trying primary
> > > > 	return if not NULL
> > > > 	if (IS_ERR_OR_NULL()) check in its current form (no put op)
> > > > 	trying secondary
> > > > 
> > > > After your first patch IIUC this is possible as trying primary will put child uncoditionally.
> > > 
> > > Ah, I see. No, I do not think this is a good idea: it will make the code
> > > harder to understand for a casual reader: "Why do we check node validity
> > > only after we used it for the first time?"
> > 
> > Theare a re already a few API calls there that are hard to understand, I spent
> > some time on them to get it through and still got it wrong as this series
> > shows. So, I don't think we anyhow change this.
> 
> The fact that some code is confusing does not mean that we should add
> more confusing code. We will not fix everything at once, but we can make
> things better bit by bit.
> 
> Look, the check where it is now makes total sense, you added it there
> yourself! It checks that we are dealing with a valid node and returns
> early. The intent is very easy to understand and the only thing that is
> missing is that "put" operation to satisfy the documented behavior.
> Anything more just makes things more complex for no good reason.

Right, that's why I think we need to go away from open coding the iteration
over the list of nodes (primary, secondary, etc).

> > > For the code not in a hot path there is a lot of value in simplicity.
> > 
> > If you really want to go to this rabbit hole, think how we can get rid of
> > repetitive checks of the secondary or more if any in the future nodes in the
> > list.
> > 
> > So the basic idea is to have this all hidden (to some extent) behind the macro
> > or alike. In the code it would be something as
> > 
> >   for node in primary, secondary, ...
> >     call the API
> >     if (okay)
> > 	return result
> > 
> >   return error
> > 
> > This will indeed help.
> 
> I think this will indeed help if we ever going to have more than primary
> and secondary nodes. It is also tricky if you want to transition
> seamlessly between different types of nodes (i.e. you have ACPI primary
> with OF overlay secondary with swnode as tertiary etc). And you probably
> want to add support for references between different typesof nodes
> (i.e. swnode being able to reference OF device node for example).
> 
> This kind of rework is however out of scope of what I have time to do at
> the moment.

I am not asking you to invest into big rework, the idea is to try to fold the
iterations to a kind of loop. Is it feasible?

Or maybe it can be partially done, so the above becomes something like

call_prmary_op(fwnode, ...)
	if (IS_ERR_OR_NULL(fwnode))
		...
	else
		call_op()

call_secondary_op(fwnode, ...)
	if (IS_ERR_OR_NULL(fwnode))
		...
	else
		call_op()

(with the potential of collapsing one into the other)

and then the above

	next = call_primary_op(wnode, ...);
	if (next)
		return next;

	return call_secondary_op(fwnode, ...);

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-12-04  1:16                 ` Andy Shevchenko
@ 2024-12-05 20:57                   ` Dmitry Torokhov
  2024-12-09 18:06                     ` Andy Shevchenko
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Torokhov @ 2024-12-05 20:57 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Wed, Dec 04, 2024 at 03:16:34AM +0200, Andy Shevchenko wrote:
> On Tue, Dec 03, 2024 at 02:45:49PM -0800, Dmitry Torokhov wrote:
> > On Tue, Dec 03, 2024 at 03:27:31PM +0200, Andy Shevchenko wrote:
> > > On Mon, Dec 02, 2024 at 09:49:06PM -0800, Dmitry Torokhov wrote:
> > > > On Sat, Nov 30, 2024 at 11:44:04PM +0200, Andy Shevchenko wrote:
> > > > > On Fri, Nov 29, 2024 at 11:16:54PM -0800, Dmitry Torokhov wrote:
> > > > > > On Fri, Nov 29, 2024 at 04:50:15PM +0200, Andy Shevchenko wrote:
> > > > > > > On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> > > > > > > > On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > > > > > > > > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:
> 
> ...
> 
> > > > > > > > > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > > > > > > > > >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > > > > > > > > >  	struct fwnode_handle *next;
> > > > > > > > > 
> > > > > > > > > > -	if (IS_ERR_OR_NULL(fwnode))
> > > > > > > > > > +	if (IS_ERR_OR_NULL(fwnode)) {
> > > > > > > > > > +		fwnode_handle_put(child);
> > > > > > > > > >  		return NULL;
> > > > > > > > > > +	}
> > > > > > > > > 
> > > > > > > > > >  	/* Try to find a child in primary fwnode */
> > > > > > > > > >  	next = fwnode_get_next_child_node(fwnode, child);
> > > > > > > > > 
> > > > > > > > > So, why not just moving the original check (w/o dropping the reference) here?
> > > > > > > > > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
> > > > > > > > 
> > > > > > > > Because if you rely on check in fwnode_get_next_child_node() you would
> > > > > > > > not know if it returned NULL because there are no more children or
> > > > > > > > because the node is invalid. In the latter case you can't dereference
> > > > > > > > fwnode->secondary.
> > > > > > > 
> > > > > > > Yes, so, how does it contradict my proposal?
> > > > > > 
> > > > > > I guess I misunderstood your proposal then. Could you please explain it
> > > > > > in more detail?
> > > > > 
> > > > > 
> > > > > Current code (in steps):
> > > > > 	if (IS_ERR_OR_NULL()) check
> > > > > 	trying primary
> > > > > 	trying secondary if previous is NULL
> > > > > 
> > > > > 
> > > > > My proposal
> > > > > 
> > > > > 	trying primary
> > > > > 	return if not NULL
> > > > > 	if (IS_ERR_OR_NULL()) check in its current form (no put op)
> > > > > 	trying secondary
> > > > > 
> > > > > After your first patch IIUC this is possible as trying primary will put child uncoditionally.
> > > > 
> > > > Ah, I see. No, I do not think this is a good idea: it will make the code
> > > > harder to understand for a casual reader: "Why do we check node validity
> > > > only after we used it for the first time?"
> > > 
> > > Theare a re already a few API calls there that are hard to understand, I spent
> > > some time on them to get it through and still got it wrong as this series
> > > shows. So, I don't think we anyhow change this.
> > 
> > The fact that some code is confusing does not mean that we should add
> > more confusing code. We will not fix everything at once, but we can make
> > things better bit by bit.
> > 
> > Look, the check where it is now makes total sense, you added it there
> > yourself! It checks that we are dealing with a valid node and returns
> > early. The intent is very easy to understand and the only thing that is
> > missing is that "put" operation to satisfy the documented behavior.
> > Anything more just makes things more complex for no good reason.
> 
> Right, that's why I think we need to go away from open coding the iteration
> over the list of nodes (primary, secondary, etc).
> 
> > > > For the code not in a hot path there is a lot of value in simplicity.
> > > 
> > > If you really want to go to this rabbit hole, think how we can get rid of
> > > repetitive checks of the secondary or more if any in the future nodes in the
> > > list.
> > > 
> > > So the basic idea is to have this all hidden (to some extent) behind the macro
> > > or alike. In the code it would be something as
> > > 
> > >   for node in primary, secondary, ...
> > >     call the API
> > >     if (okay)
> > > 	return result
> > > 
> > >   return error
> > > 
> > > This will indeed help.
> > 
> > I think this will indeed help if we ever going to have more than primary
> > and secondary nodes. It is also tricky if you want to transition
> > seamlessly between different types of nodes (i.e. you have ACPI primary
> > with OF overlay secondary with swnode as tertiary etc). And you probably
> > want to add support for references between different typesof nodes
> > (i.e. swnode being able to reference OF device node for example).
> > 
> > This kind of rework is however out of scope of what I have time to do at
> > the moment.
> 
> I am not asking you to invest into big rework, the idea is to try to fold the
> iterations to a kind of loop. Is it feasible?

We could potentially do something like below.

BTW, do you know why fwnode_property_get_reference_args() returns
-ENOENT for NULL or error fwnode instead of -EINVAL as the rest of them?
And would you object to unifying this?

Thanks.

-- 
Dmitry

index 0ca3c0908b0c..3b4c394138e2 100644
--- a/drivers/base/property.c
+++ b/drivers/base/property.c
@@ -18,6 +18,28 @@
 #include <linux/string.h>
 #include <linux/types.h>
 
+#define FWNODE_ITERATE(n, result, cont_val, op, ...)				\
+({										\
+	int __ret = -EINVAL;							\
+	typeof(result) __r;							\
+										\
+	for (const struct fwnode_handle *__node = n;				\
+	     !IS_ERR_OR_NULL(__node);						\
+	     __node = __node->secondary) {					\
+	        if (!__node->ops || !__node->ops->op) {				\
+			__ret = -ENXIO;						\
+			continue;						\
+		}								\
+		__r = __node->ops->op(__node, ## __VA_ARGS__);			\
+		if (__r != cont_val) {						\
+			result = __r;						\
+			__ret = 0;						\
+			break;							\
+		}								\
+	}									\
+	__ret;									\
+})
+
 struct fwnode_handle *__dev_fwnode(struct device *dev)
 {
 	return IS_ENABLED(CONFIG_OF) && dev->of_node ?
@@ -57,16 +79,14 @@ EXPORT_SYMBOL_GPL(device_property_present);
 bool fwnode_property_present(const struct fwnode_handle *fwnode,
 			     const char *propname)
 {
+	int error;
 	bool ret;
 
-	if (IS_ERR_OR_NULL(fwnode))
+	error = FWNODE_ITERATE(fwnode, ret, false, property_present, propname);
+	if (error)
 		return false;
 
-	ret = fwnode_call_bool_op(fwnode, property_present, propname);
-	if (ret)
-		return ret;
-
-	return fwnode_call_bool_op(fwnode->secondary, property_present, propname);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(fwnode_property_present);
 
@@ -259,18 +279,15 @@ static int fwnode_property_read_int_array(const struct fwnode_handle *fwnode,
 					  unsigned int elem_size, void *val,
 					  size_t nval)
 {
+	int error;
 	int ret;
 
-	if (IS_ERR_OR_NULL(fwnode))
-		return -EINVAL;
-
-	ret = fwnode_call_int_op(fwnode, property_read_int_array, propname,
-				 elem_size, val, nval);
-	if (ret != -EINVAL)
-		return ret;
+	error = FWNODE_ITERATE(fwnode, ret, -EINVAL, property_read_int_array,
+			       propname, elem_size, val, nval);
+	if (error)
+		return error;
 
-	return fwnode_call_int_op(fwnode->secondary, property_read_int_array, propname,
-				  elem_size, val, nval);
+	return ret;
 }
 
 /**
@@ -414,18 +431,15 @@ int fwnode_property_read_string_array(const struct fwnode_handle *fwnode,
 				      const char *propname, const char **val,
 				      size_t nval)
 {
+	int error;
 	int ret;
 
-	if (IS_ERR_OR_NULL(fwnode))
-		return -EINVAL;
-
-	ret = fwnode_call_int_op(fwnode, property_read_string_array, propname,
-				 val, nval);
-	if (ret != -EINVAL)
-		return ret;
+	error = FWNODE_ITERATE(fwnode, ret, -EINVAL, property_read_string_array,
+			       propname, val, nval);
+	if (error)
+		return error;
 
-	return fwnode_call_int_op(fwnode->secondary, property_read_string_array, propname,
-				  val, nval);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(fwnode_property_read_string_array);
 

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers
  2024-12-05 20:57                   ` Dmitry Torokhov
@ 2024-12-09 18:06                     ` Andy Shevchenko
  0 siblings, 0 replies; 17+ messages in thread
From: Andy Shevchenko @ 2024-12-09 18:06 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Thu, Dec 05, 2024 at 12:57:38PM -0800, Dmitry Torokhov wrote:
> On Wed, Dec 04, 2024 at 03:16:34AM +0200, Andy Shevchenko wrote:
> > On Tue, Dec 03, 2024 at 02:45:49PM -0800, Dmitry Torokhov wrote:
> > > On Tue, Dec 03, 2024 at 03:27:31PM +0200, Andy Shevchenko wrote:
> > > > On Mon, Dec 02, 2024 at 09:49:06PM -0800, Dmitry Torokhov wrote:
> > > > > On Sat, Nov 30, 2024 at 11:44:04PM +0200, Andy Shevchenko wrote:
> > > > > > On Fri, Nov 29, 2024 at 11:16:54PM -0800, Dmitry Torokhov wrote:
> > > > > > > On Fri, Nov 29, 2024 at 04:50:15PM +0200, Andy Shevchenko wrote:
> > > > > > > > On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> > > > > > > > > On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > > > > > > > > > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:

...

> > > > > > > > > > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > > > > > > > > > >  	const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > > > > > > > > > >  	struct fwnode_handle *next;
> > > > > > > > > > 
> > > > > > > > > > > -	if (IS_ERR_OR_NULL(fwnode))
> > > > > > > > > > > +	if (IS_ERR_OR_NULL(fwnode)) {
> > > > > > > > > > > +		fwnode_handle_put(child);
> > > > > > > > > > >  		return NULL;
> > > > > > > > > > > +	}
> > > > > > > > > > 
> > > > > > > > > > >  	/* Try to find a child in primary fwnode */
> > > > > > > > > > >  	next = fwnode_get_next_child_node(fwnode, child);
> > > > > > > > > > 
> > > > > > > > > > So, why not just moving the original check (w/o dropping the reference) here?
> > > > > > > > > > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
> > > > > > > > > 
> > > > > > > > > Because if you rely on check in fwnode_get_next_child_node() you would
> > > > > > > > > not know if it returned NULL because there are no more children or
> > > > > > > > > because the node is invalid. In the latter case you can't dereference
> > > > > > > > > fwnode->secondary.
> > > > > > > > 
> > > > > > > > Yes, so, how does it contradict my proposal?
> > > > > > > 
> > > > > > > I guess I misunderstood your proposal then. Could you please explain it
> > > > > > > in more detail?
> > > > > > 
> > > > > > 
> > > > > > Current code (in steps):
> > > > > > 	if (IS_ERR_OR_NULL()) check
> > > > > > 	trying primary
> > > > > > 	trying secondary if previous is NULL
> > > > > > 
> > > > > > 
> > > > > > My proposal
> > > > > > 
> > > > > > 	trying primary
> > > > > > 	return if not NULL
> > > > > > 	if (IS_ERR_OR_NULL()) check in its current form (no put op)
> > > > > > 	trying secondary
> > > > > > 
> > > > > > After your first patch IIUC this is possible as trying primary will put child uncoditionally.
> > > > > 
> > > > > Ah, I see. No, I do not think this is a good idea: it will make the code
> > > > > harder to understand for a casual reader: "Why do we check node validity
> > > > > only after we used it for the first time?"
> > > > 
> > > > Theare a re already a few API calls there that are hard to understand, I spent
> > > > some time on them to get it through and still got it wrong as this series
> > > > shows. So, I don't think we anyhow change this.
> > > 
> > > The fact that some code is confusing does not mean that we should add
> > > more confusing code. We will not fix everything at once, but we can make
> > > things better bit by bit.
> > > 
> > > Look, the check where it is now makes total sense, you added it there
> > > yourself! It checks that we are dealing with a valid node and returns
> > > early. The intent is very easy to understand and the only thing that is
> > > missing is that "put" operation to satisfy the documented behavior.
> > > Anything more just makes things more complex for no good reason.
> > 
> > Right, that's why I think we need to go away from open coding the iteration
> > over the list of nodes (primary, secondary, etc).
> > 
> > > > > For the code not in a hot path there is a lot of value in simplicity.
> > > > 
> > > > If you really want to go to this rabbit hole, think how we can get rid of
> > > > repetitive checks of the secondary or more if any in the future nodes in the
> > > > list.
> > > > 
> > > > So the basic idea is to have this all hidden (to some extent) behind the macro
> > > > or alike. In the code it would be something as
> > > > 
> > > >   for node in primary, secondary, ...
> > > >     call the API
> > > >     if (okay)
> > > > 	return result
> > > > 
> > > >   return error
> > > > 
> > > > This will indeed help.
> > > 
> > > I think this will indeed help if we ever going to have more than primary
> > > and secondary nodes. It is also tricky if you want to transition
> > > seamlessly between different types of nodes (i.e. you have ACPI primary
> > > with OF overlay secondary with swnode as tertiary etc). And you probably
> > > want to add support for references between different typesof nodes
> > > (i.e. swnode being able to reference OF device node for example).
> > > 
> > > This kind of rework is however out of scope of what I have time to do at
> > > the moment.
> > 
> > I am not asking you to invest into big rework, the idea is to try to fold the
> > iterations to a kind of loop. Is it feasible?
> 
> We could potentially do something like below.
> 
> BTW, do you know why fwnode_property_get_reference_args() returns
> -ENOENT for NULL or error fwnode instead of -EINVAL as the rest of them?

I think we need to ask author, but I believe it's due to the OF analogue.
(Haven't checked myself, though)

> And would you object to unifying this?

...

> +#define FWNODE_ITERATE(n, result, cont_val, op, ...)				\

for_each_fwnode() or alike.

> +({										\
> +	int __ret = -EINVAL;							\
> +	typeof(result) __r;							\
> +										\
> +	for (const struct fwnode_handle *__node = n;				\
> +	     !IS_ERR_OR_NULL(__node);						\
> +	     __node = __node->secondary) {					\
> +	        if (!__node->ops || !__node->ops->op) {				\
> +			__ret = -ENXIO;						\
> +			continue;						\
> +		}								\
> +		__r = __node->ops->op(__node, ## __VA_ARGS__);			\
> +		if (__r != cont_val) {						\
> +			result = __r;						\
> +			__ret = 0;						\
> +			break;							\
> +		}								\
> +	}									\
> +	__ret;									\

With a bit of polishing this may be the way to go.

> +})

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] device property: fix UAF in device_get_next_child_node()
  2024-11-28 23:16     ` Dmitry Torokhov
@ 2024-12-09 18:11       ` Andy Shevchenko
  0 siblings, 0 replies; 17+ messages in thread
From: Andy Shevchenko @ 2024-12-09 18:11 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Sakari Ailus,
	Heikki Krogerus, Daniel Scally, linux-acpi, linux-kernel

On Thu, Nov 28, 2024 at 03:16:04PM -0800, Dmitry Torokhov wrote:
> On Thu, Nov 28, 2024 at 03:20:09PM +0200, Andy Shevchenko wrote:
> > On Wed, Nov 27, 2024 at 09:39:35PM -0800, Dmitry Torokhov wrote:
> > > fwnode_get_next_child_node() always drops reference to the node passed
> > > as the "child" argument,
> > 
> > As commented previously,it might be just a documentation bug. So, please
> 
> No, absolutely not. Consider calling device_get_next_child_node() with
> "child" pointing to the last child of the primary fwnode.
> fwnode_get_next_child_node() will drop the reference to "child"
> rendering it unusable, and return NULL. The code will go and call
> fwnode_get_next_child_node(fwnode->secondary, child) with invalid child
> pointer, resulting in UAF condition.
> 
> Also, consider what happens next. Let's say we did not crash and instead
> returned first child of the secondary node (let's assume primary is an
> OF node and secondary is a software node). On the next iteration of
> device_get_next_child_node() we will call
> fwnode_get_next_child_node(fwnode, child) which results in passing
> swnode child to of_fwnode_get_next_child_node() which may or may not
> work. It all is very fragile.

It seems some of OF calls will crash on this.

> That is why it is best to check if the child argument is indeed a child
> to a given parent before calling fwnode_get_next_child_node() on them.

> > elaborate on the use case before this patch that leads to an issue.
> > 
> > > which makes "child" pointer no longer valid
> > > and we can not use it to scan the secondary node in case there are no
> > > more children in primary one.
> > > 
> > > Also, it is not obvious whether it is safe to pass children of the
> > > secondary node to fwnode_get_next_child_node() called on the primary
> > > node in subsequent calls to device_get_next_child_node().
> > > 
> > > Fix the issue by checking whether the child node passed in is indeed a
> > > child of primary or secondary node, and do not call
> > > fwnode_get_next_child_node() for the wrong parent node. Also set the
> > > "child" to NULL after unsuccessful call to fwnode_get_next_child_node()
> > > on primary node to make sure secondary node's children are scanned from
> > > the beginning.
> > 
> > To me it sounds over complicated. Why not just take reference to the child once
> > more and put it after we find next in either cases?
> 
> You want to "reset" when switching from primary node over to secondary
> instead of hoping that passing child pointer which is not really a child
> to secondary node will somehow cause fwnode_get_next_child_node() to
> return first its child.

At some point we need to switch from primary to a secondary, yes, we need
to "reset" it somehow to switch the type.

> > Current solution provides
> > a lot of duplication and makes function less understandable.
> 
> The simplicity of the old version is deceiving. See the explanation
> above.


-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-12-09 18:11 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-28  5:39 [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers Dmitry Torokhov
2024-11-28  5:39 ` [PATCH 2/2] device property: fix UAF in device_get_next_child_node() Dmitry Torokhov
2024-11-28 13:20   ` Andy Shevchenko
2024-11-28 23:16     ` Dmitry Torokhov
2024-12-09 18:11       ` Andy Shevchenko
2024-11-28 11:49 ` [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers Greg Kroah-Hartman
2024-11-28 13:13 ` Andy Shevchenko
2024-11-28 23:04   ` Dmitry Torokhov
2024-11-29 14:50     ` Andy Shevchenko
2024-11-30  7:16       ` Dmitry Torokhov
2024-11-30 21:44         ` Andy Shevchenko
2024-12-03  5:49           ` Dmitry Torokhov
2024-12-03 13:27             ` Andy Shevchenko
2024-12-03 22:45               ` Dmitry Torokhov
2024-12-04  1:16                 ` Andy Shevchenko
2024-12-05 20:57                   ` Dmitry Torokhov
2024-12-09 18:06                     ` Andy Shevchenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox