From: David Fries <david@fries.net>
To: Thorsten Bschorr <thorsten@bschorr.de>
Cc: ??????? ??????? <zbr@ioremap.net>,
Jonathan ALIBERT <jonathan.alibert@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] Avoid null-pointer access in w1/slaves/w1_therm
Date: Wed, 11 Mar 2015 19:44:37 -0500 [thread overview]
Message-ID: <20150312004437.GC23778@spacedout.fries.net> (raw)
In-Reply-To: <CA+1k2cEFBn-TbtOY59+1Rrb2FgEjioiwQUQ+QbmYEM5KPnvHjQ@mail.gmail.com>
On Tue, Mar 10, 2015 at 01:34:14AM +0100, Thorsten Bschorr wrote:
> > It looks like your patch runs into dead locks problems:
I was testing reading and removing the slave, I didn't test two
readers, I'll keep that in mind.
For some reason I was thinking it was a try lock after the sleep, it
isn't, a signal can cause it to return a value, but yes that's not
going to work. I have another version that uses an atomic counter in
place of this mutex, it requires a loop in the remove slave.
I have a one wire network with 14 temperature sensors on it. What I
do is I have a program that talks to the kernel over netlink, which is
now non-blocking for one wire after an earlier set of changes I made.
It sends out all the conversion request requests, waits, sends the
read request packets, then gets back the results, and relays the
results to the program that requested them. It doesn't use the
w1_therm and avoids all these problems. If I read them sequentially
with w1_therm, that would take more than 14*.750 or 10.5 seconds,
there's no way with w1_therm to read them at once without having 14
threads or programs doing so, the problem is it takes a lot more
programming to do netlink.
> > I have a cron job reading my sensors. If I read the sensors on another
> > thread (e.g. via cat), the 2nd thread can produce a dead lock:
> >
> > * thread 1 has bus & sl lock
> > * thread 1 drops bus lock, keeps sl locks and sleeps
> > * thread 2 get bus lock, waits for sl lock
> > * thread 1 returns from sleep, waits for bus lock, but this is help by thread 2
>
>
> Aquiring sl lock before the bus lock prevents this dead lock (no
> change of locking-order).
> With search enabled, removing a device by w1_search_process_cb then
> also worked as intended:
>
> Mar 10 01:29:04 pi kernel: [ 924.870893] w1_therm_remove_slave about
> to lock faily_data->lock
> Mar 10 01:29:04 pi kernel: [ 924.870935] w1_therm_remove_slave
> unlocked faily_data->lock
>
> Mar 10 01:29:04 pi kernel: [ 924.871115] w1_therm_remove_slave about
> to lock faily_data->lock
> Mar 10 01:29:05 pi kernel: [ 925.151242] start sleep d117a600 refcnt 1 **
> Mar 10 01:29:05 pi kernel: [ 925.151277] end sleep d117a600 refcnt 0 **
> Mar 10 01:29:11 pi kernel: [ 931.295344] w1_slave_driver
> 10-000802cc045a: Read failed CRC check
> Mar 10 01:29:11 pi kernel: [ 931.295437] w1_therm_remove_slave
> unlocked faily_data->lock
> ** I get the ref-cnt before and after the sleep and only log if they differ.
>
>
> But I am unable to judge if mobing the sl lock at the beginning of
> w1_slave_show can cause dead locks in other scenarios.
I'm not sure, I would probably switch back to the referencing counting
version I wrote earlier, or make the bus mutex lock a timed lock or
try lock first.
--
David Fries <david@fries.net> PGP pub CB1EE8F0
http://fries.net/~david/
next prev parent reply other threads:[~2015-03-12 0:44 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CA+1k2cH5Y+RngvEbgE3CW5g+bO3V1ytDJC=u6DLVDZvbEOhu5A@mail.gmail.com>
[not found] ` <CA+1k2cF1frcEUu-L_cSJxTp=GKExn6Vt2rdCeY=zrhM62FUggw@mail.gmail.com>
[not found] ` <CA+1k2cEwtY+U7TS3rpmMp5nEBokO8vwhcOiD0ExVQnuoB=XVLQ@mail.gmail.com>
2015-02-23 17:09 ` Fwd: w1/slaves/w1_therm: null-ptr access of sl->family_data Thorsten Bschorr
2015-02-24 1:37 ` David Fries
2015-02-25 9:28 ` Thorsten Bschorr
2015-02-27 8:43 ` [PATCH] Avoid null-pointer access in w1/slaves/w1_therm Thorsten.Bschorr
2015-02-28 20:17 ` David Fries
[not found] ` <369891425174502@web4m.yandex.ru>
2015-03-01 2:17 ` David Fries
2015-03-01 13:04 ` Thorsten Bschorr
2015-03-02 0:17 ` David Fries
2015-03-04 15:36 ` Евгений Поляков
2015-03-08 21:14 ` David Fries
2015-03-09 22:47 ` Thorsten Bschorr
2015-03-09 23:09 ` David Fries
2015-03-10 0:05 ` Thorsten Bschorr
2015-03-10 0:34 ` Thorsten Bschorr
2015-03-12 0:44 ` David Fries [this message]
2015-03-10 13:52 ` Evgeniy Polyakov
2015-03-12 0:54 ` David Fries
2015-03-14 20:55 ` Evgeniy Polyakov
2015-03-18 4:20 ` David Fries
2015-03-18 15:18 ` Evgeniy Polyakov
2015-03-19 0:09 ` David Fries
[not found] ` <CADq11+-tZmQEVUL=sHC64i4auC_5i=+y2yBcMTaJMdD5Z0dE6w@mail.gmail.com>
2015-04-16 3:51 ` David Fries
2015-04-16 11:57 ` Evgeniy Polyakov
[not found] ` <CA+1k2cFw+2NTOtbSaJ1S=kBAn2Mj62DTeZo68V9t1Wk-7m7GyA@mail.gmail.com>
2015-04-17 12:55 ` Evgeniy Polyakov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150312004437.GC23778@spacedout.fries.net \
--to=david@fries.net \
--cc=jonathan.alibert@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=thorsten@bschorr.de \
--cc=zbr@ioremap.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox