* [PATCH] i2c: core: Prevent race condition when removing i2c devices
@ 2017-06-08 2:43 David Thomson
2017-06-19 14:36 ` Wolfram Sang
0 siblings, 1 reply; 9+ messages in thread
From: David Thomson @ 2017-06-08 2:43 UTC (permalink / raw)
To: linux-i2c; +Cc: wsa, David Thomson
There's a race condition that occurs when deleting & closing multiple
i2c devices, which results in a kernel warning on an incorrect reference
count:
Call Trace:
[c00000007e2cfaf0] [c0000000000b84e4] .module_put+0x24/0xb0 (unreliable)
[c00000007e2cfb70] [c0000000005db830] .i2c_put_adapter+0x30/0x50
[c00000007e2cfbf0] [c0000000005deb74] .i2cdev_release+0x24/0x60
[c00000007e2cfc70] [c00000000014015c] .__fput+0xbc/0x290
[c00000007e2cfd10] [c000000000059b64] .task_work_run+0xf4/0x130
[c00000007e2cfdb0] [c00000000000a0a4] .do_notify_resume+0x74/0x80
[c00000007e2cfe30] [c000000000000acc] .ret_from_except_lite+0x78/0x7c
The race is seen when an i2c adapter is being deleted while
file descriptor on another i2c adapter is being closed. The following
shows two threads executing concurrently (the first number is the thread
id):
4580 i2c_del_mux_adapter adapter i2c-6-mux (chan_id 2)
4580 i2c_del_adapter adapter i2c-6-mux (chan_id 2)
4580 i2cdev_detach_adapter adapter i2c-6-mux (chan_id 2)
2627 i2cdev_release START adapter i2c-6-mux (chan_id 2)
2627 i2c_put_adapter START adapter i2c-6-mux (chan_id 2) owner (null)
2627 i2c_adapter_dev_release adapter i2c-6-mux (chan_id 2)
4580 i2c_del_mux_adapter adapter i2c-6-mux (chan_id 3)
4580 i2c_del_adapter adapter i2c-6-mux (chan_id 3)
4580 i2cdev_detach_adapter adapter i2c-6-mux (chan_id 3)
2627 i2c_put_adapter END adapter i2c-6-mux (chan_id 2) owner c0000000edf66400
2627 i2cdev_release END adapter i2c-6-mux (chan_id 2)
2627 i2cdev_release START adapter i2c-6-mux (chan_id 3)
2627 i2c_put_adapter START adapter i2c-6-mux (chan_id 3) owner (null)
2627 i2c_adapter_dev_release adapter i2c-6-mux (chan_id 3)
2627 i2c_put_adapter END adapter i2c-6-mux (chan_id 3) owner (null)
2627 i2cdev_release END adapter i2c-6-mux (chan_id 3)
When thread 4580 for chan_id 3 interrupts 2627 doing
i2c_adapter_dev_release, the 'owner' field for chan_id 2 becomes
non-null. This causes the code to attempt to decrement the reference for
this owner, but the current reference count is invalid. I'm not sure how
the owner is getting set, but adding a lock around the
put_device/module_put calls resolves the issue.
Signed-off-by: David Thomson <david.thomson@alliedtelesis.co.nz>
---
drivers/i2c/i2c-core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
index 82576aaccc90..8831d9b363a7 100644
--- a/drivers/i2c/i2c-core.c
+++ b/drivers/i2c/i2c-core.c
@@ -3151,8 +3151,12 @@ void i2c_put_adapter(struct i2c_adapter *adap)
if (!adap)
return;
+ mutex_lock(&core_lock);
+
put_device(&adap->dev);
module_put(adap->owner);
+
+ mutex_unlock(&core_lock);
}
EXPORT_SYMBOL(i2c_put_adapter);
--
2.13.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] i2c: core: Prevent race condition when removing i2c devices
2017-06-08 2:43 [PATCH] i2c: core: Prevent race condition when removing i2c devices David Thomson
@ 2017-06-19 14:36 ` Wolfram Sang
2017-06-19 16:34 ` Peter Rosin
0 siblings, 1 reply; 9+ messages in thread
From: Wolfram Sang @ 2017-06-19 14:36 UTC (permalink / raw)
To: David Thomson, Peter Rosin; +Cc: linux-i2c
[-- Attachment #1: Type: text/plain, Size: 3114 bytes --]
On Thu, Jun 08, 2017 at 02:43:44PM +1200, David Thomson wrote:
> There's a race condition that occurs when deleting & closing multiple
> i2c devices, which results in a kernel warning on an incorrect reference
> count:
Thanks for the report and patch!
CCing peda. Do you have time to look into it? Muxes and locking are
kinda calling for your expertise ;)
>
> Call Trace:
> [c00000007e2cfaf0] [c0000000000b84e4] .module_put+0x24/0xb0 (unreliable)
> [c00000007e2cfb70] [c0000000005db830] .i2c_put_adapter+0x30/0x50
> [c00000007e2cfbf0] [c0000000005deb74] .i2cdev_release+0x24/0x60
> [c00000007e2cfc70] [c00000000014015c] .__fput+0xbc/0x290
> [c00000007e2cfd10] [c000000000059b64] .task_work_run+0xf4/0x130
> [c00000007e2cfdb0] [c00000000000a0a4] .do_notify_resume+0x74/0x80
> [c00000007e2cfe30] [c000000000000acc] .ret_from_except_lite+0x78/0x7c
>
> The race is seen when an i2c adapter is being deleted while
> file descriptor on another i2c adapter is being closed. The following
> shows two threads executing concurrently (the first number is the thread
> id):
>
> 4580 i2c_del_mux_adapter adapter i2c-6-mux (chan_id 2)
> 4580 i2c_del_adapter adapter i2c-6-mux (chan_id 2)
> 4580 i2cdev_detach_adapter adapter i2c-6-mux (chan_id 2)
> 2627 i2cdev_release START adapter i2c-6-mux (chan_id 2)
> 2627 i2c_put_adapter START adapter i2c-6-mux (chan_id 2) owner (null)
> 2627 i2c_adapter_dev_release adapter i2c-6-mux (chan_id 2)
> 4580 i2c_del_mux_adapter adapter i2c-6-mux (chan_id 3)
> 4580 i2c_del_adapter adapter i2c-6-mux (chan_id 3)
> 4580 i2cdev_detach_adapter adapter i2c-6-mux (chan_id 3)
> 2627 i2c_put_adapter END adapter i2c-6-mux (chan_id 2) owner c0000000edf66400
> 2627 i2cdev_release END adapter i2c-6-mux (chan_id 2)
> 2627 i2cdev_release START adapter i2c-6-mux (chan_id 3)
> 2627 i2c_put_adapter START adapter i2c-6-mux (chan_id 3) owner (null)
> 2627 i2c_adapter_dev_release adapter i2c-6-mux (chan_id 3)
> 2627 i2c_put_adapter END adapter i2c-6-mux (chan_id 3) owner (null)
> 2627 i2cdev_release END adapter i2c-6-mux (chan_id 3)
>
> When thread 4580 for chan_id 3 interrupts 2627 doing
> i2c_adapter_dev_release, the 'owner' field for chan_id 2 becomes
> non-null. This causes the code to attempt to decrement the reference for
> this owner, but the current reference count is invalid. I'm not sure how
> the owner is getting set, but adding a lock around the
> put_device/module_put calls resolves the issue.
>
> Signed-off-by: David Thomson <david.thomson@alliedtelesis.co.nz>
> ---
> drivers/i2c/i2c-core.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
> index 82576aaccc90..8831d9b363a7 100644
> --- a/drivers/i2c/i2c-core.c
> +++ b/drivers/i2c/i2c-core.c
> @@ -3151,8 +3151,12 @@ void i2c_put_adapter(struct i2c_adapter *adap)
> if (!adap)
> return;
>
> + mutex_lock(&core_lock);
> +
> put_device(&adap->dev);
> module_put(adap->owner);
> +
> + mutex_unlock(&core_lock);
> }
> EXPORT_SYMBOL(i2c_put_adapter);
>
> --
> 2.13.0
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] i2c: core: Prevent race condition when removing i2c devices
2017-06-19 14:36 ` Wolfram Sang
@ 2017-06-19 16:34 ` Peter Rosin
2017-06-19 19:30 ` Peter Rosin
0 siblings, 1 reply; 9+ messages in thread
From: Peter Rosin @ 2017-06-19 16:34 UTC (permalink / raw)
To: Wolfram Sang, David Thomson; +Cc: linux-i2c
On 2017-06-19 16:36, Wolfram Sang wrote:
> On Thu, Jun 08, 2017 at 02:43:44PM +1200, David Thomson wrote:
>> There's a race condition that occurs when deleting & closing multiple
>> i2c devices, which results in a kernel warning on an incorrect reference
>> count:
>
> Thanks for the report and patch!
>
> CCing peda. Do you have time to look into it? Muxes and locking are
> kinda calling for your expertise ;)
>
>>
>> Call Trace:
>> [c00000007e2cfaf0] [c0000000000b84e4] .module_put+0x24/0xb0 (unreliable)
>> [c00000007e2cfb70] [c0000000005db830] .i2c_put_adapter+0x30/0x50
>> [c00000007e2cfbf0] [c0000000005deb74] .i2cdev_release+0x24/0x60
>> [c00000007e2cfc70] [c00000000014015c] .__fput+0xbc/0x290
>> [c00000007e2cfd10] [c000000000059b64] .task_work_run+0xf4/0x130
>> [c00000007e2cfdb0] [c00000000000a0a4] .do_notify_resume+0x74/0x80
>> [c00000007e2cfe30] [c000000000000acc] .ret_from_except_lite+0x78/0x7c
>>
>> The race is seen when an i2c adapter is being deleted while
>> file descriptor on another i2c adapter is being closed. The following
>> shows two threads executing concurrently (the first number is the thread
>> id):
>>
>> 4580 i2c_del_mux_adapter adapter i2c-6-mux (chan_id 2)
This function no longer exist in the kernel. I'm not saying that the
problem has been solved (it probably hasn't), but what version was this
and can you please redo this with a newer kernel?
>> 4580 i2c_del_adapter adapter i2c-6-mux (chan_id 2)
>> 4580 i2cdev_detach_adapter adapter i2c-6-mux (chan_id 2)
>> 2627 i2cdev_release START adapter i2c-6-mux (chan_id 2)
>> 2627 i2c_put_adapter START adapter i2c-6-mux (chan_id 2) owner (null)
>> 2627 i2c_adapter_dev_release adapter i2c-6-mux (chan_id 2)
>> 4580 i2c_del_mux_adapter adapter i2c-6-mux (chan_id 3)
>> 4580 i2c_del_adapter adapter i2c-6-mux (chan_id 3)
>> 4580 i2cdev_detach_adapter adapter i2c-6-mux (chan_id 3)
>> 2627 i2c_put_adapter END adapter i2c-6-mux (chan_id 2) owner c0000000edf66400
>> 2627 i2cdev_release END adapter i2c-6-mux (chan_id 2)
>> 2627 i2cdev_release START adapter i2c-6-mux (chan_id 3)
>> 2627 i2c_put_adapter START adapter i2c-6-mux (chan_id 3) owner (null)
>> 2627 i2c_adapter_dev_release adapter i2c-6-mux (chan_id 3)
>> 2627 i2c_put_adapter END adapter i2c-6-mux (chan_id 3) owner (null)
>> 2627 i2cdev_release END adapter i2c-6-mux (chan_id 3)
>>
>> When thread 4580 for chan_id 3 interrupts 2627 doing
>> i2c_adapter_dev_release, the 'owner' field for chan_id 2 becomes
>> non-null. This causes the code to attempt to decrement the reference for
>> this owner, but the current reference count is invalid. I'm not sure how
>> the owner is getting set, but adding a lock around the
>> put_device/module_put calls resolves the issue.
It would certainly be nice to know why .owner is clobbered, because I
don't see it. But then again, I don't know what sources I should be
reading...
Cheers,
peda
>> Signed-off-by: David Thomson <david.thomson@alliedtelesis.co.nz>
>> ---
>> drivers/i2c/i2c-core.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
>> index 82576aaccc90..8831d9b363a7 100644
>> --- a/drivers/i2c/i2c-core.c
>> +++ b/drivers/i2c/i2c-core.c
>> @@ -3151,8 +3151,12 @@ void i2c_put_adapter(struct i2c_adapter *adap)
>> if (!adap)
>> return;
>>
>> + mutex_lock(&core_lock);
>> +
>> put_device(&adap->dev);
>> module_put(adap->owner);
>> +
>> + mutex_unlock(&core_lock);
>> }
>> EXPORT_SYMBOL(i2c_put_adapter);
>>
>> --
>> 2.13.0
>>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] i2c: core: Prevent race condition when removing i2c devices
2017-06-19 16:34 ` Peter Rosin
@ 2017-06-19 19:30 ` Peter Rosin
2017-06-28 22:20 ` David Thomson
2017-07-31 13:38 ` Wolfram Sang
0 siblings, 2 replies; 9+ messages in thread
From: Peter Rosin @ 2017-06-19 19:30 UTC (permalink / raw)
To: Wolfram Sang, David Thomson; +Cc: linux-i2c
On 2017-06-19 18:34, Peter Rosin wrote:
> It would certainly be nice to know why .owner is clobbered, because I
> don't see it. But then again, I don't know what sources I should be
> reading...
BTW, one thing I noticed when reading the current code is that I see this
at the end of i2c_del_adapter:
/* Clear the device structure in case this adapter is ever going to be
added again */
memset(&adap->dev, 0, sizeof(adap->dev));
}
EXPORT_SYMBOL(i2c_del_adapter);
which is not very compatible with this part of the device_add() docs:
* Do not call this routine or device_register() more than once for
* any device structure. The driver model core is not designed to work
* with devices that get unregistered and then spring back to life.
* (Among other things, it's very hard to guarantee that all references
* to the previous incarnation of @dev have been dropped.) Allocate
* and register a fresh new struct device instead.
That, of course, probably have no bearing on the problem/patch in this
thread...
Or maybe struct device reuse is exactly what is going on???
Cheers,
peda
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] i2c: core: Prevent race condition when removing i2c devices
2017-06-19 19:30 ` Peter Rosin
@ 2017-06-28 22:20 ` David Thomson
2017-08-18 8:04 ` Peter Rosin
2017-07-31 13:38 ` Wolfram Sang
1 sibling, 1 reply; 9+ messages in thread
From: David Thomson @ 2017-06-28 22:20 UTC (permalink / raw)
To: Peter Rosin, Wolfram Sang; +Cc: linux-i2c@vger.kernel.org
Hi Peter
Thanks for your reply. Unfortunately we cannot reproduce the issue on a newer kernel as it requires userspace applications (for processing the hotswap) that rely on our (many) changes to the kernel. I appreciate you taking the time to consider the issue.
Thanks
David
________________________________________
From: Peter Rosin <peda@axentia.se>
Sent: Tuesday, 20 June 2017 7:30 a.m.
To: Wolfram Sang; David Thomson
Cc: linux-i2c@vger.kernel.org
Subject: Re: [PATCH] i2c: core: Prevent race condition when removing i2c devices
On 2017-06-19 18:34, Peter Rosin wrote:
> It would certainly be nice to know why .owner is clobbered, because I
> don't see it. But then again, I don't know what sources I should be
> reading...
BTW, one thing I noticed when reading the current code is that I see this
at the end of i2c_del_adapter:
/* Clear the device structure in case this adapter is ever going to be
added again */
memset(&adap->dev, 0, sizeof(adap->dev));
}
EXPORT_SYMBOL(i2c_del_adapter);
which is not very compatible with this part of the device_add() docs:
* Do not call this routine or device_register() more than once for
* any device structure. The driver model core is not designed to work
* with devices that get unregistered and then spring back to life.
* (Among other things, it's very hard to guarantee that all references
* to the previous incarnation of @dev have been dropped.) Allocate
* and register a fresh new struct device instead.
That, of course, probably have no bearing on the problem/patch in this
thread...
Or maybe struct device reuse is exactly what is going on???
Cheers,
peda
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] i2c: core: Prevent race condition when removing i2c devices
2017-06-28 22:20 ` David Thomson
@ 2017-08-18 8:04 ` Peter Rosin
0 siblings, 0 replies; 9+ messages in thread
From: Peter Rosin @ 2017-08-18 8:04 UTC (permalink / raw)
To: David Thomson, Wolfram Sang; +Cc: linux-i2c@vger.kernel.org
On 2017-06-29 00:20, David Thomson wrote:
> On 2017-06-19 18:34, Peter Rosin wrote:
>> It would certainly be nice to know why .owner is clobbered, because I
>> don't see it. But then again, I don't know what sources I should be
>> reading...
> Hi Peter
>
> Thanks for your reply. Unfortunately we cannot reproduce the issue on a newer kernel as it requires userspace applications (for processing the hotswap) that rely on our (many) changes to the kernel. I appreciate you taking the time to consider the issue.
Still interested in what base version you have and if there are any
patches of relevance to the i2c subsystem... The oldest mainline
that had i2c_del_mux_adapter is v4.6 (May 2016), so I had a second
look at that. Could you also tell us what i2c-mux driver this is
and whether it is expected that .owner is NULL in the first place? In
other words, is the i2c-mux driver in question built-in or a loaded
module?
If you have not added any new i2c-mux driver (which I doubt, you
would surely have mentioned that) the only way that this race can
happen is if the driver is removed twice since i2c_del_mux_adapter
is only called in .remove (and in .probe on failure, but I don't
think that can race). At least that's the case for the i2c-mux drivers
that I looked at...
It seems pretty serious that a driver can have its .remove called
twice so I suspect your patch just papers over a much bigger problem?
So, why is the i2c-mux driver being removed more than once? The
module refcounting must be badly screwed up for that to happen. Or?
Cheers,
Peter
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] i2c: core: Prevent race condition when removing i2c devices
2017-06-19 19:30 ` Peter Rosin
2017-06-28 22:20 ` David Thomson
@ 2017-07-31 13:38 ` Wolfram Sang
2017-07-31 14:02 ` Peter Rosin
1 sibling, 1 reply; 9+ messages in thread
From: Wolfram Sang @ 2017-07-31 13:38 UTC (permalink / raw)
To: Peter Rosin; +Cc: David Thomson, linux-i2c
[-- Attachment #1: Type: text/plain, Size: 1132 bytes --]
On Mon, Jun 19, 2017 at 09:30:06PM +0200, Peter Rosin wrote:
> On 2017-06-19 18:34, Peter Rosin wrote:
> > It would certainly be nice to know why .owner is clobbered, because I
> > don't see it. But then again, I don't know what sources I should be
> > reading...
>
> BTW, one thing I noticed when reading the current code is that I see this
> at the end of i2c_del_adapter:
>
> /* Clear the device structure in case this adapter is ever going to be
> added again */
> memset(&adap->dev, 0, sizeof(adap->dev));
> }
> EXPORT_SYMBOL(i2c_del_adapter);
It was added with commit bd4bc3dbded9cd ("i2c: Clear i2c_adapter.dev on
adapter removal") with this commit message:
i2c: Clear i2c_adapter.dev on adapter removal
Clear i2c_adapter.dev on adapter removal. This makes it possible to
re-add the adapter at a later point, which some drivers
(i2c-amd756-s4882, i2c-nforce2-s4985) actually do.
This fixes a bug reported by John Stultz here:
http://lkml.org/lkml/2008/7/15/720
and by Ingo Molar there:
http://lkml.org/lkml/2008/7/16/78
So, despite the docs, it used to be an issue actually...
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] i2c: core: Prevent race condition when removing i2c devices
2017-07-31 13:38 ` Wolfram Sang
@ 2017-07-31 14:02 ` Peter Rosin
2017-07-31 14:11 ` Wolfram Sang
0 siblings, 1 reply; 9+ messages in thread
From: Peter Rosin @ 2017-07-31 14:02 UTC (permalink / raw)
To: Wolfram Sang; +Cc: David Thomson, linux-i2c
On 2017-07-31 15:38, Wolfram Sang wrote:
> On Mon, Jun 19, 2017 at 09:30:06PM +0200, Peter Rosin wrote:
>> On 2017-06-19 18:34, Peter Rosin wrote:
>>> It would certainly be nice to know why .owner is clobbered, because I
>>> don't see it. But then again, I don't know what sources I should be
>>> reading...
>>
>> BTW, one thing I noticed when reading the current code is that I see this
>> at the end of i2c_del_adapter:
>>
>> /* Clear the device structure in case this adapter is ever going to be
>> added again */
>> memset(&adap->dev, 0, sizeof(adap->dev));
>> }
>> EXPORT_SYMBOL(i2c_del_adapter);
>
> It was added with commit bd4bc3dbded9cd ("i2c: Clear i2c_adapter.dev on
> adapter removal") with this commit message:
>
> i2c: Clear i2c_adapter.dev on adapter removal
>
> Clear i2c_adapter.dev on adapter removal. This makes it possible to
> re-add the adapter at a later point, which some drivers
> (i2c-amd756-s4882, i2c-nforce2-s4985) actually do.
>
> This fixes a bug reported by John Stultz here:
> http://lkml.org/lkml/2008/7/15/720
> and by Ingo Molar there:
> http://lkml.org/lkml/2008/7/16/78
>
> So, despite the docs, it used to be an issue actually...
What do you mean "despite the docs"? The docs says that you shouldn't
reuse struct device. When you did, you got splats. Zeroing out struct
device happens to work in this case probably because there is a
completion going on around the call to device_unregister(). Without
that completion it would have been seriously wrong to zero out the
struct device, I think. But even with it, I'm not sure it's correct
to do so and worry about rcu grace periods etc. Has any driver model
expert looked at this struct device reuse and blessed it as ok?
Cheers,
Peter
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-08-18 8:04 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-08 2:43 [PATCH] i2c: core: Prevent race condition when removing i2c devices David Thomson
2017-06-19 14:36 ` Wolfram Sang
2017-06-19 16:34 ` Peter Rosin
2017-06-19 19:30 ` Peter Rosin
2017-06-28 22:20 ` David Thomson
2017-08-18 8:04 ` Peter Rosin
2017-07-31 13:38 ` Wolfram Sang
2017-07-31 14:02 ` Peter Rosin
2017-07-31 14:11 ` Wolfram Sang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).