* [PATCH] libata: Fix devres handling
@ 2017-05-19 23:03 Linus Walleij
2017-05-23 21:16 ` Tejun Heo
0 siblings, 1 reply; 5+ messages in thread
From: Linus Walleij @ 2017-05-19 23:03 UTC (permalink / raw)
To: Tejun Heo, Bartlomiej Zolnierkiewicz, linux-ide; +Cc: Linus Walleij, stable
The ATA hosts are allocated using devres with:
host = devres_alloc(ata_host_release, sz, GFP_KERNEL);
However in the ata_host_release() function the host is retrieved
using dev_get_drvdata() which is not what other devres handlers
do, instead we should probably use the passed resource.
Before this my kernel crashes badly when I fail to start a host
in ata_host_start() and need to bail out, because dev_get_drvdata()
gets the wrong-but-almost-correct pointer (so on some systems it
may by chance be the right pointer what do I know).
On ARMv4 Gemini it is not:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at ../lib/refcount.c:184 refcount_sub_and_test+0x9c/0xac
refcount_t: underflow; use-after-free.
CPU: 0 PID: 1 Comm: swapper Not tainted 4.12.0-rc1+ #657
Hardware name: Gemini (Device Tree)
[<c0010f10>] (unwind_backtrace) from [<c000d8a4>] (show_stack+0x10/0x14)
[<c000d8a4>] (show_stack) from [<c0018720>] (__warn+0xcc/0xf4)
[<c0018720>] (__warn) from [<c0018780>] (warn_slowpath_fmt+0x38/0x48)
[<c0018780>] (warn_slowpath_fmt) from [<c01fffcc>] (refcount_sub_and_test+0x9c/0xac)
[<c01fffcc>] (refcount_sub_and_test) from [<c01e8a5c>] (kobject_put+0x28/0xe0)
[<c01e8a5c>] (kobject_put) from [<c029b294>] (ata_host_release+0xb0/0x144)
[<c029b294>] (ata_host_release) from [<c027326c>] (release_nodes+0x178/0x1fc)
[<c027326c>] (release_nodes) from [<c02707e4>] (driver_probe_device+0xd0/0x2dc)
[<c02707e4>] (driver_probe_device) from [<c0270aac>] (__driver_attach+0xbc/0xc0)
[<c0270aac>] (__driver_attach) from [<c026eeac>] (bus_for_each_dev+0x70/0xa0)
[<c026eeac>] (bus_for_each_dev) from [<c026f824>] (bus_add_driver+0x178/0x200)
[<c026f824>] (bus_add_driver) from [<c0271184>] (driver_register+0x78/0xf8)
[<c0271184>] (driver_register) from [<c05b2d90>] (do_one_initcall+0xac/0x174)
[<c05b2d90>] (do_one_initcall) from [<c05b2f6c>] (kernel_init_freeable+0x114/0x1cc)
[<c05b2f6c>] (kernel_init_freeable) from [<c04beeb4>] (kernel_init+0x8/0xf4)
[<c04beeb4>] (kernel_init) from [<c000a270>] (ret_from_fork+0x14/0x24)
---[ end trace 0a4570446a019085 ]---
Then there is a second (worse) crash when it tries to iterate
to the next port. But it is all because the host pointer is
wrong.
In this case, the host should be 0xc7a3f3d0 as it was when it got
allocated but instead what dev_get_drvdata() returns is 0xc7a3f370.
Using the passed resource gives the right pointer.
Cc: stable@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
drivers/ata/libata-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 2d83b8c75965..5487c4a29bc3 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5921,7 +5921,7 @@ struct ata_port *ata_port_alloc(struct ata_host *host)
static void ata_host_release(struct device *gendev, void *res)
{
- struct ata_host *host = dev_get_drvdata(gendev);
+ struct ata_host *host = res;
int i;
for (i = 0; i < host->n_ports; i++) {
--
2.9.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] libata: Fix devres handling
2017-05-19 23:03 [PATCH] libata: Fix devres handling Linus Walleij
@ 2017-05-23 21:16 ` Tejun Heo
2017-05-23 21:27 ` Tejun Heo
2017-05-30 9:21 ` Linus Walleij
0 siblings, 2 replies; 5+ messages in thread
From: Tejun Heo @ 2017-05-23 21:16 UTC (permalink / raw)
To: Linus Walleij; +Cc: Bartlomiej Zolnierkiewicz, linux-ide, stable
Hello, Linus.
On Sat, May 20, 2017 at 01:03:14AM +0200, Linus Walleij wrote:
> The ATA hosts are allocated using devres with:
> host = devres_alloc(ata_host_release, sz, GFP_KERNEL);
> However in the ata_host_release() function the host is retrieved
> using dev_get_drvdata() which is not what other devres handlers
> do, instead we should probably use the passed resource.
>
> Before this my kernel crashes badly when I fail to start a host
> in ata_host_start() and need to bail out, because dev_get_drvdata()
> gets the wrong-but-almost-correct pointer (so on some systems it
> may by chance be the right pointer what do I know).
>
> On ARMv4 Gemini it is not:
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1 at ../lib/refcount.c:184 refcount_sub_and_test+0x9c/0xac
> refcount_t: underflow; use-after-free.
> CPU: 0 PID: 1 Comm: swapper Not tainted 4.12.0-rc1+ #657
> Hardware name: Gemini (Device Tree)
> [<c0010f10>] (unwind_backtrace) from [<c000d8a4>] (show_stack+0x10/0x14)
> [<c000d8a4>] (show_stack) from [<c0018720>] (__warn+0xcc/0xf4)
> [<c0018720>] (__warn) from [<c0018780>] (warn_slowpath_fmt+0x38/0x48)
> [<c0018780>] (warn_slowpath_fmt) from [<c01fffcc>] (refcount_sub_and_test+0x9c/0xac)
> [<c01fffcc>] (refcount_sub_and_test) from [<c01e8a5c>] (kobject_put+0x28/0xe0)
> [<c01e8a5c>] (kobject_put) from [<c029b294>] (ata_host_release+0xb0/0x144)
> [<c029b294>] (ata_host_release) from [<c027326c>] (release_nodes+0x178/0x1fc)
> [<c027326c>] (release_nodes) from [<c02707e4>] (driver_probe_device+0xd0/0x2dc)
> [<c02707e4>] (driver_probe_device) from [<c0270aac>] (__driver_attach+0xbc/0xc0)
> [<c0270aac>] (__driver_attach) from [<c026eeac>] (bus_for_each_dev+0x70/0xa0)
> [<c026eeac>] (bus_for_each_dev) from [<c026f824>] (bus_add_driver+0x178/0x200)
> [<c026f824>] (bus_add_driver) from [<c0271184>] (driver_register+0x78/0xf8)
> [<c0271184>] (driver_register) from [<c05b2d90>] (do_one_initcall+0xac/0x174)
> [<c05b2d90>] (do_one_initcall) from [<c05b2f6c>] (kernel_init_freeable+0x114/0x1cc)
> [<c05b2f6c>] (kernel_init_freeable) from [<c04beeb4>] (kernel_init+0x8/0xf4)
> [<c04beeb4>] (kernel_init) from [<c000a270>] (ret_from_fork+0x14/0x24)
> ---[ end trace 0a4570446a019085 ]---
>
> Then there is a second (worse) crash when it tries to iterate
> to the next port. But it is all because the host pointer is
> wrong.
This is really weird. The two can't be different, well, at least
shouldn't.
> In this case, the host should be 0xc7a3f3d0 as it was when it got
> allocated but instead what dev_get_drvdata() returns is 0xc7a3f370.
> Using the passed resource gives the right pointer.
That's 96 bytes of difference, which seems too big for devres_node,
especially on 32bit machines. Can you check what gdb says on "print
((struct devres *)0)->data" or "print sizeof(struct devres_node)"?
There gotta be something else going on. devres_alloc() returns the
data pointer which is the same one which gets passed into the release
function.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] libata: Fix devres handling
2017-05-23 21:16 ` Tejun Heo
@ 2017-05-23 21:27 ` Tejun Heo
2017-05-30 9:21 ` Linus Walleij
1 sibling, 0 replies; 5+ messages in thread
From: Tejun Heo @ 2017-05-23 21:27 UTC (permalink / raw)
To: Linus Walleij; +Cc: Bartlomiej Zolnierkiewicz, linux-ide, stable
Hello, again.
On Tue, May 23, 2017 at 05:16:08PM -0400, Tejun Heo wrote:
> That's 96 bytes of difference, which seems too big for devres_node,
> especially on 32bit machines. Can you check what gdb says on "print
> ((struct devres *)0)->data" or "print sizeof(struct devres_node)"?
>
> There gotta be something else going on. devres_alloc() returns the
> data pointer which is the same one which gets passed into the release
> function.
Also, can you please add throw in some printks and see what's being
returned from devres_alloc() and getting set on driver data? This is
really weird. I can't think of a reason why they'd differ.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] libata: Fix devres handling
2017-05-23 21:16 ` Tejun Heo
2017-05-23 21:27 ` Tejun Heo
@ 2017-05-30 9:21 ` Linus Walleij
2017-05-30 17:59 ` Tejun Heo
1 sibling, 1 reply; 5+ messages in thread
From: Linus Walleij @ 2017-05-30 9:21 UTC (permalink / raw)
To: Tejun Heo; +Cc: Bartlomiej Zolnierkiewicz, linux-ide, stable
On Tue, May 23, 2017 at 11:16 PM, Tejun Heo <tj@kernel.org> wrote:
>> The ATA hosts are allocated using devres with:
>> host = devres_alloc(ata_host_release, sz, GFP_KERNEL);
>> However in the ata_host_release() function the host is retrieved
>> using dev_get_drvdata() which is not what other devres handlers
>> do, instead we should probably use the passed resource.
>>
>> Before this my kernel crashes badly when I fail to start a host
>> in ata_host_start() and need to bail out, because dev_get_drvdata()
>> gets the wrong-but-almost-correct pointer (so on some systems it
>> may by chance be the right pointer what do I know).
>>
>> On ARMv4 Gemini it is not:
(...)
> This is really weird. The two can't be different, well, at least
> shouldn't.
I found the problem.
This is because my driver issues platform_set_drvdata(pdev)
on the same struct device * overwriting the data with
its own. That function is just an alias for dev_set_drvdata().
Amazingly, libata survives this until release.
Maybe we should print a warning if dev_get_drvdata()
and res differ? It's a sign that something is wrong because
someone screwed with the drvdata behind the back of
libata.
It appears further that I am in bad company: there are a few
drivers in drivers/ata that have broken errorpath because
they do exactly this or variants of this. So it turns up in my
driver too because of copypaste.
Device drivers assume that they "own" drvdata inside the
device, but with libata they do not, as shown above.
It is used more as a rule than an exception to pass a
state container from probe() over to remove().
It is a common pattern to overwrite drvdata, the following
drivers now have this bug in one way or another:
pata_bf54x.c: platform_set_drvdata(pdev, host);
pata_ep93xx.c: platform_set_drvdata(pdev, drv_data);
sata_dwc_460ex.c: dev_set_drvdata(&ofdev->dev, host);
These drivers:
pata_rb532_cf.c: platform_set_drvdata(pdev, ah);
pata_samsung_cf.c: platform_set_drvdata(pdev, host);
sata_fsl.c: platform_set_drvdata(ofdev, host);
They set the ATA host as drvdata, essentially overwriting
the drvdata pointer with the same value.
I guess I will simply make a cleanup series for these,
making sure they use host->private_data instead and do not
double-write the drvdata.
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] libata: Fix devres handling
2017-05-30 9:21 ` Linus Walleij
@ 2017-05-30 17:59 ` Tejun Heo
0 siblings, 0 replies; 5+ messages in thread
From: Tejun Heo @ 2017-05-30 17:59 UTC (permalink / raw)
To: Linus Walleij; +Cc: Bartlomiej Zolnierkiewicz, linux-ide, stable
Hello, Linus.
On Tue, May 30, 2017 at 11:21:24AM +0200, Linus Walleij wrote:
> This is because my driver issues platform_set_drvdata(pdev)
> on the same struct device * overwriting the data with
> its own. That function is just an alias for dev_set_drvdata().
I see.
> Amazingly, libata survives this until release.
That is surprising given that libata does depend on that drvdata quite
a bit.
> Maybe we should print a warning if dev_get_drvdata()
> and res differ? It's a sign that something is wrong because
> someone screwed with the drvdata behind the back of
> libata.
Please feel free to submit a patch to add WARN_ON there.
> I guess I will simply make a cleanup series for these,
> making sure they use host->private_data instead and do not
> double-write the drvdata.
Great.
Thanks for working on this!
--
tejun
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2017-05-30 17:59 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-19 23:03 [PATCH] libata: Fix devres handling Linus Walleij
2017-05-23 21:16 ` Tejun Heo
2017-05-23 21:27 ` Tejun Heo
2017-05-30 9:21 ` Linus Walleij
2017-05-30 17:59 ` Tejun Heo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox