All of lore.kernel.org
 help / color / mirror / Atom feed
From: Luca Ceresoli <luca.ceresoli@bootlin.com>
To: Saravana Kannan <saravanak@google.com>
Cc: "Rob Herring" <robh+dt@kernel.org>,
	"Frank Rowand" <frowand.list@gmail.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Xu Yang" <xu.yang_2@nxp.com>,
	kernel-team@android.com, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	"Hervé Codina" <herve.codina@bootlin.com>,
	"Geert Uytterhoeven" <geert@linux-m68k.org>
Subject: [REGRESSION] Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
Date: Mon, 26 Feb 2024 12:52:26 +0100	[thread overview]
Message-ID: <20240226125226.705efef3@booty> (raw)
In-Reply-To: <CAGETcx99hhfOaEn1CH1OLDGp_pnrVeJ2nWb3X5=0j8tij4NR9w@mail.gmail.com>

Hello Saravana,

On Fri, 23 Feb 2024 17:35:24 -0800
Saravana Kannan <saravanak@google.com> wrote:

> On Fri, Feb 23, 2024 at 8:18 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> >
> > Hello Saravana,
> >
> > [+cc Hervé Codina]
> >
> > On Tue,  6 Feb 2024 17:18:01 -0800
> > Saravana Kannan <saravanak@google.com> wrote:
> >  
> > > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > > remote-endpoint properties created a fwnode link from the consumer device
> > > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > > trying to create device links or detecting cycles. So, improve this the
> > > same way we improved finding the consumer of a remote-endpoint property.
> > >
> > > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > > Signed-off-by: Saravana Kannan <saravanak@google.com>  
> >
> > After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> > getting unexpected warnings during device tree overlay removal. After a
> > somewhat painful bisection I identified this patch as the one that
> > triggers it all.  
> 
> Thanks for the report.
> 
> >  
> > > ---
> > > --- a/drivers/of/property.c
> > > +++ b/drivers/of/property.c
> > > @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> > >  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> > >  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> > >  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> > > -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
> > >  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
> > >  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
> > >  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> > > @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
> > >       return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> > >  }
> > >
> > > +static struct device_node *parse_remote_endpoint(struct device_node *np,
> > > +                                              const char *prop_name,
> > > +                                              int index)
> > > +{
> > > +     /* Return NULL for index > 0 to signify end of remote-endpoints. */
> > > +     if (!index || strcmp(prop_name, "remote-endpoint"))  
> >
> > There seem to be a bug here: "!index" should be "index > 0", as the
> > comment suggests. Otherwise NULL is always returned.  
> 
> Ah crap, I think you are right. It should have been "index". Not
> "!index". But I tested this! Sigh. I probably screwed up my testing.
> 
> Please send out a Fix for this.
> 
> Geert, we got excited too soon. :(
> 
> > I am going to send a quick patch for that, but haven't done so yet
> > because it still won't solve the problem, so I wanted to open the topic
> > here without further delay.
> >
> > Even with the 'index > 0' fix I'm still getting pretty much the same:  
> 
> This part is confusing though. If I read your DT correctly, there's a
> cycle between platform:panel-dsi-lvds and i2c:13-002c. And fw_devlink
> should not be enforcing any ordering between those devices ever.
> 
> I'm surprised that in your "working" case, fw_devlink didn't detect
> any cycle. It should have. If there's any debugging to do, that's the
> one we need to debug.
> 
> >
> > [   34.836781] ------------[ cut here ]------------
> > [   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
> > ...
> > [   35.024751] Call trace:
> > [   35.027199]  devm_kfree+0x8c/0xfc
> > [   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
> > [   35.036990]  devres_release_group+0xe0/0x164
> > [   35.041264]  i2c_device_remove+0x38/0x9c
> > [   35.045196]  device_remove+0x4c/0x80
> > [   35.048774]  device_release_driver_internal+0x1d4/0x230
> > [   35.054003]  device_release_driver+0x18/0x24
> > [   35.058279]  bus_remove_device+0xcc/0x10c
> > [   35.062292]  device_del+0x15c/0x41c
> > [   35.065786]  device_unregister+0x18/0x34
> > [   35.069714]  i2c_unregister_device+0x54/0x88
> > [   35.073988]  of_i2c_notify+0x98/0x224
> > [   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
> > [   35.082543]  __of_changeset_entry_notify+0x100/0x16c
> > [   35.087515]  __of_changeset_revert_notify+0x44/0x78
> > [   35.092398]  of_overlay_remove+0x114/0x1c4
> > ...
> >
> > By comparing the two versions I found that before removing the overlay:
> >
> >  * in the "working" case (with this patch reverted) I have:
> >
> >    # ls /sys/class/devlink/ | grep 002c
> >    platform:hpbr--i2c:13-002c
> >    platform:panel-dsi-lvds--i2c:13-002c  
> 
> Can you check the "status" and "sync_state_only" file in this folder
> and tell me what it says?
> 
> Since these devices have a cyclic dependency between them, it should
> have been something other than "not tracked" and "sync_state_only"
> should be "1". But my guess is you'll see "active" and "0".
> 
> >    platform:regulator-sys-1v8--i2c:13-002c
> >    regulator:regulator.31--i2c:13-002c
> >    #
> >
> >  * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):
> >
> >    # ls /sys/class/devlink/ | grep 002c
> >    platform:hpbr--i2c:13-002c
> >    platform:regulator-sys-1v8--i2c:13-002c
> >    regulator:regulator.30--i2c:13-002c
> >    #
> >
> > So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
> > I think it gets created but later on removed. Here's a snippet of the
> > kernel log when that happens:
> >
> > [    9.578279] ----- cycle: start -----
> > [    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > [    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > [    9.578329] ----- cycle: end -----
> > [    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > ...  
> 
> Somewhere in this area, I'm thinking you'll also see "device:
> 'i2c:13-002c--platform:panel-dsi-lvds': device_add" do you not? And if
> you enabled device link logs, you'll see that it was "sync state only"
> link.
> 
> > [    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > ...
> > [    9.597280] ----- cycle: start -----
> > [    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > [    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > [    9.607581] ----- cycle: end -----
> > [    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
> > [    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
> > ...
> > [    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
> > [    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
> > ...
> > [    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
> > [    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister  
> 
> Oh yeah, see. The "device_add" I expected earlier is getting removed here.
> 
> > [   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
> > [   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister
> >
> > And here's the relevant portion of my device tree overlay:
> >
> > --------------------8<--------------------
> >  
> 
> I think the eventual fix would be this series + adding a
> "post-init-providers" property to the device that's supposed to probe
> first and point it to the device that's supposed to probe next. Do
> this at the device node level, not the endpoint level.
> https://lore.kernel.org/lkml/20240221233026.2915061-1-saravanak@google.com/

I'm certainly going to look at this series in more detail and at the
debugging you asked for, however I'm afraid I won't have access to the
hardware this week and it's not going to be a quick task anyway.

So in this moment I think it's quite clear that this specific patch
creates a regression and there is no clear fix that is reasonably
likely to get merged before 6.8.

I propose reverting this patch immediately, unless you have a better
short-term solution.

Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

  reply	other threads:[~2024-02-26 11:52 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-07  1:17 [PATCH v2 0/3] Improve remote-endpoint parsing Saravana Kannan
2024-02-07  1:18 ` [PATCH v2 1/3] of: property: Improve finding the consumer of a remote-endpoint property Saravana Kannan
2024-02-07  1:18 ` [PATCH v2 2/3] of: property: Improve finding the supplier " Saravana Kannan
2024-02-23 16:18   ` Luca Ceresoli
2024-02-24  1:35     ` Saravana Kannan
2024-02-26 11:52       ` Luca Ceresoli [this message]
2024-02-28 21:56         ` [REGRESSION] " Rob Herring
2024-02-28 23:58           ` Saravana Kannan
2024-02-29  0:26             ` Rob Herring
2024-02-29  9:34               ` Luca Ceresoli
2024-02-29 22:10                 ` Rob Herring
2024-02-29 22:54                   ` Saravana Kannan
2024-03-01  9:59                   ` Luca Ceresoli
2024-02-28 22:01     ` Rob Herring
2024-02-07  1:18 ` [PATCH v2 3/3] of: property: Add in-ports/out-ports support to of_graph_get_port_parent() Saravana Kannan
2024-02-21  0:47   ` Saravana Kannan
2024-02-21  7:00     ` Greg Kroah-Hartman
2024-02-21  8:38       ` Greg Kroah-Hartman
2024-02-07  8:10 ` [PATCH v2 0/3] Improve remote-endpoint parsing Rob Herring
2024-02-07 21:07   ` Saravana Kannan
2024-02-09 10:31 ` Rob Herring

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240226125226.705efef3@booty \
    --to=luca.ceresoli@bootlin.com \
    --cc=devicetree@vger.kernel.org \
    --cc=frowand.list@gmail.com \
    --cc=geert@linux-m68k.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=herve.codina@bootlin.com \
    --cc=kernel-team@android.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=robh+dt@kernel.org \
    --cc=saravanak@google.com \
    --cc=xu.yang_2@nxp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.