All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tzung-Bi Shih <tzungbi@kernel.org>
To: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benson Leung <bleung@chromium.org>,
	linux-input@vger.kernel.org, chrome-platform@lists.linux.dev
Subject: Re: [PATCH 0/5] platform/chrome: Fix a race when probing drivers
Date: Fri, 5 Sep 2025 08:38:10 +0000	[thread overview]
Message-ID: <aLqhcu-zjpyeYMly@google.com> (raw)
In-Reply-To: <4gtrvxpo6zqk54uvavrox7hszszdpvdubz4w6iaks72zq3jjsw@b6cfvi5ysj2u>

On Thu, Sep 04, 2025 at 07:06:23AM -0700, Dmitry Torokhov wrote:
> On Tue, Sep 02, 2025 at 09:18:47PM +0800, Tzung-Bi Shih wrote:
> > On Fri, Aug 29, 2025 at 08:50:01PM +0800, Tzung-Bi Shih wrote:
> > > On Fri, Aug 29, 2025 at 11:28:55AM +0000, Dmitry Torokhov wrote:
> > > > On Thu, Aug 28, 2025 at 08:35:56AM +0000, Tzung-Bi Shih wrote:
> > > > > A race is observed when cros_ec_lpc and cros-ec-keyb are all built as
> > > > > modules.  cros_ec_lpc is cros-ec-keyb's parent.  However, they can be
> > > > > probed at the same time.
> > > > > 
> > > > > Example:
> > > > > 
> > > > > + -----------------------------------------------------------------+
> > > > > | Some init process (e.g. udevd) | deferred_probe_work_func worker |
> > > > > + -----------------------------------------------------------------+
> > > > > | Probe cros-ec-keyb.            |                                 |
> > > > > | - Decide to defer[1].          |                                 |
> > > > > |                                | A device bound to a driver[2].  |
> > > > > | Probe cros_ec_lpc.             |                                 |
> > > > > | - Init the struct[3].          |                                 |
> > > > > |                                | Retry cros-ec-keyb from the     |
> > > > > |                                | deferred list[4].               |
> > > > > |                                | - Won't defer again as [3].     |
> > > > > |                                | - Access uninitialized data in  |
> > > > > |                                |   the struct.                   |
> > > > > | - Register the device.         |                                 |
> > > > > + -----------------------------------------------------------------+
> > > > > 
> > > > > [1] https://elixir.bootlin.com/linux/v6.16/source/drivers/input/keyboard/cros_ec_keyb.c#L707
> > > > > [2] https://elixir.bootlin.com/linux/v6.16/source/drivers/base/dd.c#L405
> > > > > [3] https://elixir.bootlin.com/linux/v6.16/source/drivers/platform/chrome/cros_ec_lpc.c#L644
> > > > > [4] https://elixir.bootlin.com/linux/v6.16/source/drivers/base/dd.c#L418
> > > > > 
> > > > > Note that the device link[5] can't help as in the observed environment,
> > > > > the devices are already added via device_add()[6].
> > > > > 
> > > > > [5] https://www.kernel.org/doc/html/latest/driver-api/device_link.html#usage
> > > > > [6] https://elixir.bootlin.com/linux/v6.16/source/drivers/acpi/acpi_platform.c#L177
> > > > > 
> > > > > The series fixes the issue by ensuring the struct is ready for accessing
> > > > > before continuing to probe cros-ec-keyb.
> > > > 
> > > > Why is the keyboard platform device instantiated before the transport
> > > > (cros_ec_lpc) is done initializing? I think this is the root of the
> > > > issue...
> > > 
> > > I may misunderstand but it seems to me:
> > > 
> > > - The ACPI bus enumerated and instantiated the platform devices[6] first.
> > > 
> > > - The keyboard platform device was probed when `cros_ec_keyb_driver`
> > >   registered.  It deferred as its parent's drvdata was NULL[1].
> > > 
> > > - The transport platform device was probed when `cros_ec_lpc_driver`
> > >   registered.  It set the drvdata[3].
> > > 
> > > - The keyboard platform device was probed again from retrying the deferred
> > >   list, by another thread `deferred_probe_work_func`.  The parent's drvdata
> > >   wasn't NULL and cros_ec_register() for the transport device weren't
> > >   finished.  The race happened.
> > 
> > Hi Dmitry,
> > 
> > Does it make sense to you?
> 
> I'll have to research how MFD mixes up statically described and
> DT-described platform devices and makes sure that children are not
> probed before the parent is ready - I think we need to make cros_ec
> behave the same way.

I may misunderstand but FWIW:

I failed to find relevant code in MFD [7] that guarantees the probe order.
Also, I'm curious about wouldn't code at [7] results in duplicate platform
devices?  E.g., 1 populated from OF; 1 created by MFD.

Note: current cros_ec_dev.c doesn't use `of_compatible` in struct mfd_cell.

If we're looking at how cros_ec_dev.c guarantees the order:

- The transport platfrom device is probed first.  It calls cros_ec_register().
- In cros_ec_register(), it registers the MFD device "cros-ec-dev".  And the
  children devices are added via mfd_add_devices().


Back to the issue we observed:

- The platform devices are created when it scans the tree in ACPI[6].  We
  probably have no way to prevent the devices from adding unless specifying
  `enumeration_by_parent`[8].

- When some of them are modules, the driver registrations are tied to the
  module insertion.  They can be arrived by anytime unless we use something
  similar to soft dependency[9].  A Kconfig dependency will also
  be needed to prevent cros_ec_lpcs=m but cros_ec_keyb=y.  However,
  cros_ec_keyb would need to specify 2 possible dependencies "cros_ec_lpcs"
  and "cros_ec_spi"[10].  I'm not sure what would be happening if a system
  has no cros_ec_spi module at all.

[7] https://elixir.bootlin.com/linux/v6.16/source/drivers/mfd/mfd-core.c#L184
[8] https://elixir.bootlin.com/linux/v6.16/source/drivers/acpi/scan.c#L2204
[9] https://elixir.bootlin.com/linux/v6.16/source/include/linux/module.h#L176
[10] https://elixir.bootlin.com/linux/v6.16/source/drivers/input/keyboard/Kconfig#L743

  reply	other threads:[~2025-09-05  8:38 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-28  8:35 [PATCH 0/5] platform/chrome: Fix a race when probing drivers Tzung-Bi Shih
2025-08-28  8:35 ` [PATCH 1/5] platform/chrome: Centralize cros_ec_device allocation Tzung-Bi Shih
2025-08-28  8:35 ` [PATCH 2/5] platform/chrome: Centralize common cros_ec_device initialization Tzung-Bi Shih
2025-08-28  8:35 ` [PATCH 3/5] platform/chrome: cros_ec: Separate initialization from cros_ec_register() Tzung-Bi Shih
2025-08-28  8:36 ` [PATCH 4/5] platform/chrome: cros_ec: Add a flag to track registration state Tzung-Bi Shih
2025-08-28  8:36 ` [PATCH 5/5] Input: cros_ec_keyb - Defer probe until parent EC device is registered Tzung-Bi Shih
2025-09-14  1:06   ` Dmitry Torokhov
2025-08-29 11:28 ` [PATCH 0/5] platform/chrome: Fix a race when probing drivers Dmitry Torokhov
2025-08-29 12:50   ` Tzung-Bi Shih
2025-09-02 13:18     ` Tzung-Bi Shih
2025-09-04 14:06       ` Dmitry Torokhov
2025-09-05  8:38         ` Tzung-Bi Shih [this message]
2025-09-14  1:08           ` Dmitry Torokhov
2025-09-14  3:26             ` Tzung-Bi Shih
2025-09-14  3:47 ` Tzung-Bi Shih

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aLqhcu-zjpyeYMly@google.com \
    --to=tzungbi@kernel.org \
    --cc=bleung@chromium.org \
    --cc=chrome-platform@lists.linux.dev \
    --cc=dmitry.torokhov@gmail.com \
    --cc=linux-input@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.