From: Krzysztof Kozlowski <krzk@kernel.org>
To: Sebastian Reichel <sre@kernel.org>
Cc: "Pali Rohár" <pali@kernel.org>, "Andrew F. Davis" <afd@ti.com>,
"Ivaylo Dimitrov" <ivo.g.dimitrov.75@gmail.com>,
"Anton Vorontsov" <cbouatmailru@gmail.com>,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [RFC] power: supply: bq27xxx_battery: Fix polling interval after re-bind
Date: Mon, 22 Jun 2020 10:22:48 +0200
Message-ID: <20200622082248.GB28886@kozik-lap>
In-Reply-To: <20200619175521.xrcd7ahvjtc4zoqi@earth.universe>

On Fri, Jun 19, 2020 at 07:55:21PM +0200, Sebastian Reichel wrote:
> Hi,
>
> On Wed, May 27, 2020 at 09:42:54AM +0200, Pali Rohár wrote:
> > On Tuesday 26 May 2020 21:16:28 Andrew F. Davis wrote:
> > > On 5/25/20 7:32 AM, Krzysztof Kozlowski wrote:
> > > > This reverts commit 8cfaaa811894a3ae2d7360a15a6cfccff3ebc7db.
> > > >
> > > > If device was unbound and bound, the polling interval would be set to 0.
> > > > This is both unexpected and messes up with other bq27xxx devices (if
> > > > more than one battery device is used).
> > > >
> > > > This reset of polling interval was added in commit 8cfaaa811894
> > > > ("bq27x00_battery: Fix OOPS caused by unregistring bq27x00 driver")
> > > > stating that power_supply_unregister() calls get_property(). However in
> > > > Linux kernel v3.1 and newer, such call trace does not exist.
> > > > Unregistering power supply does not call get_property() on unregistered
> > > > power supply.
> > > >
> > > > Fixes: 8cfaaa811894 ("bq27x00_battery: Fix OOPS caused by unregistring bq27x00 driver")
> > > > Cc: <stable@vger.kernel.org>
> > > > Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> > > >
> > > > ---
> > > >
> > > > I really could not identify the issue being fixed in offending commit
> > > > 8cfaaa811894 ("bq27x00_battery: Fix OOPS caused by unregistring bq27x00
> > > > driver"), therefore maybe I missed here something important.
> > > >
> > > > Please share your thoughts on this.
> > >
> > > I'm having a hard time finding the OOPS also. Maybe there is a window
> > > where the poll function is running or about to run where
> > > cancel_delayed_work_sync() is called and cancels the work, only to have
> > > an interrupt or late get_property call in to the poll function and
> > > re-schedule it.
> > >
> > > What we really need to do is look at how we are handling the polling
> > > function. It gets called from the workqueue, from a threaded interrupt
> > > context, and from a power supply framework callback, possibly all at the
> > > same time. Sometimes it's protected by a lock, sometimes not. Updating
> > > the device's cached data should always be locked.
> > >
> > > What's more, the poll function is self-arming, so if we call
> > > cancel_delayed_work_sync() (remove it from the work queue, then wait
> > > for it to finish if running), are we sure it won't have just re-armed itself?
> > >
> > > We should make the only way we call the poll function be through the
> > > work queue, (plus make sure all accesses to the cache are locked).
> > >
> > > Andrew
> >
> > I do not remember the details either. It was a long time ago.
> >
> > CCing Ivaylo Dimitrov as he may remember something...
>
> Applying this revert introduces at least a race condition when
> userspace reads sysfs files while kernel removes the driver.
>
> So looking at the entrypoints for schedules:
>
> bq27xxx_battery_i2c_probe:
> Not relevant, probe has already finished by the time the battery is being removed.
>
> poll_interval_param_set:
> Can be avoided by unregistering from the list earlier. This
> is the right thing to do considering the battery is added to
> the list as last step in the probe routine, it should be removed
> first during teardown.

Yes, good point.
>
> bq27xxx_external_power_changed:
> This can happen at any time while the power-supply device is
> registered, because of the code in get_property.
>
> bq27xxx_battery_poll:
> This can happen at any time while the power-supply device is
> registered.
>
> As far as I can see the only thing in the delayed work needing
> the power-supply device is power_supply_changed(). If we add a
> check, that di->bat is not NULL, we should be able to reorder
> teardown like this:

Besides the power_supply structure, there is also the device state struct
bq27xxx_device_info 'di'. If bq27xxx_battery_poll() is called during the
unbind, it will access the 'di' which is being freed by the devm framework.
And just checking di->bat is also not thread safe (the accesses can be
reordered).

I think there is no easy few-line fix for this. Instead, the
workqueue scheduling should be guarded everywhere by the device-instance
mutex (bq27xxx_device_info.lock).
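
To illustrate what "guarded everywhere" could mean, here is a minimal
userspace sketch, not driver code. A pthread mutex stands in for
bq27xxx_device_info.lock. The struct model_di, its 'removed' flag and the
model_*() helpers are hypothetical names made up for this example; the
real driver has no such flag today.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for the device state; all names are hypothetical. */
struct model_di {
	pthread_mutex_t lock;	/* plays the role of bq27xxx_device_info.lock */
	bool removed;		/* assumed flag: set once unbind has started */
	int scheduled;		/* counts how often the poll work was armed */
};

static void model_init(struct model_di *di)
{
	pthread_mutex_init(&di->lock, NULL);
	di->removed = false;
	di->scheduled = 0;
}

/* Every path that wants to arm the poll work goes through this helper. */
static bool model_schedule_poll(struct model_di *di)
{
	bool armed = false;

	pthread_mutex_lock(&di->lock);
	if (!di->removed) {
		di->scheduled++;	/* kernel code would arm the delayed work here */
		armed = true;
	}
	pthread_mutex_unlock(&di->lock);

	return armed;
}

/* First step of unbind: once this returns, nobody can re-arm the work. */
static void model_begin_remove(struct model_di *di)
{
	pthread_mutex_lock(&di->lock);
	assert(!di->removed);	/* unbind is expected once per bound device */
	di->removed = true;
	pthread_mutex_unlock(&di->lock);
	/*
	 * Kernel code would call cancel_delayed_work_sync() at this point,
	 * after dropping the lock, because the work itself takes the lock.
	 */
}
```

Because the flag is tested and the work is armed under the same lock, a
late caller either arms the work before the flag is set (and the cancel
that follows catches it) or sees the flag and does nothing.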
>
> 1. remove from list
> 2. unregister power-supply device and set di->bat to NULL
> 3. cancel delayed work
> 4. destroy mutex
>
> Also I agree with Andrew, that the locking looks fishy. I think
> the lock needs to be moved, so that the call to
> bq27xxx_battery_update(di) in bq27xxx_battery_poll is protected.

Exactly.
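
As a rough userspace model of the four teardown steps quoted above,
combined with taking the lock around the update in the poll function,
consider the sketch below. It is only an illustration under stated
assumptions: struct sketch_di, its fields and the sketch_*() helpers are
hypothetical, and a pthread mutex again stands in for the device lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the driver state; all names are hypothetical. */
struct sketch_di {
	pthread_mutex_t lock;
	bool on_list;		/* member of the driver's device list */
	const char *bat;	/* stand-in for di->bat, NULL once unregistered */
	bool work_armed;	/* stand-in for a pending delayed work item */
	int cache;		/* stand-in for the cached battery readings */
	int notifications;	/* counts power_supply_changed()-style calls */
};

static void sketch_probe(struct sketch_di *di)
{
	pthread_mutex_init(&di->lock, NULL);
	di->bat = "bq27xxx-battery";
	di->cache = 0;
	di->notifications = 0;
	di->work_armed = true;
	di->on_list = true;	/* added to the list as the last probe step */
}

/* Body of the poll work: cache update and re-arm both happen under the lock. */
static void sketch_poll(struct sketch_di *di, int reading)
{
	pthread_mutex_lock(&di->lock);
	di->work_armed = false;
	if (di->bat != NULL) {
		if (reading != di->cache) {
			di->cache = reading;
			di->notifications++;
		}
		di->work_armed = true;	/* self-arming only while registered */
	}
	pthread_mutex_unlock(&di->lock);
}

static void sketch_remove(struct sketch_di *di)
{
	/* 1. remove from list (a real driver would hold its list lock here) */
	di->on_list = false;

	/* 2. unregister the power-supply device and set di->bat to NULL */
	pthread_mutex_lock(&di->lock);
	di->bat = NULL;
	pthread_mutex_unlock(&di->lock);

	/* 3. cancel delayed work (kernel: cancel_delayed_work_sync()) */
	di->work_armed = false;

	/* 4. destroy mutex; nothing may touch the lock after this point */
	assert(!di->on_list && di->bat == NULL && !di->work_armed);
	pthread_mutex_destroy(&di->lock);
}
```

The model is single-threaded, so it only shows the ordering and which
accesses sit under the lock; it does not prove anything about the races
in the real driver.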

Best regards,
Krzysztof