* [GIT PULL] Driver core fix for 4.5-rc4
@ 2016-02-14 19:02 Greg KH
2016-02-14 20:57 ` Linus Torvalds
0 siblings, 1 reply; 5+ messages in thread
From: Greg KH @ 2016-02-14 19:02 UTC (permalink / raw)
To: Linus Torvalds, Andrew Morton; +Cc: linux-kernel
The following changes since commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95:
Linux 4.5-rc3 (2016-02-07 15:38:30 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/ tags/driver-core-4.5-rc4
for you to fetch changes up to 00cd29b799e3449f0c68b1cc77cd4a5f95b42d17:
klist: fix starting point removed bug in klist iterators (2016-02-07 22:18:47 -0800)
----------------------------------------------------------------
driver core fix for 4.5-rc4
Here is one driver core, well klist, fix for 4.5-rc4. It fixes a
problem found in the scsi device list traversal that probably also could
be triggered by other subsystems.
The fix has been in linux-next for a while with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
----------------------------------------------------------------
James Bottomley (1):
klist: fix starting point removed bug in klist iterators
lib/klist.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GIT PULL] Driver core fix for 4.5-rc4
2016-02-14 19:02 [GIT PULL] Driver core fix for 4.5-rc4 Greg KH
@ 2016-02-14 20:57 ` Linus Torvalds
2016-02-14 21:21 ` James Bottomley
0 siblings, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2016-02-14 20:57 UTC (permalink / raw)
To: Greg KH, James Bottomley; +Cc: Andrew Morton, Linux Kernel Mailing List
On Sun, Feb 14, 2016 at 11:02 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
>
> Here is one driver core, well klist, fix for 4.5-rc4. It fixes a
> problem found in the scsi device list traversal that probably also could
> be triggered by other subsystems.
So I pulled this, but quite frankly, the fix smells bad to me.
If the n_ref kref can go down to zero at any time, how is that "struct
klist_node *n" safe to ever even touch in the caller?
IOW, what is it that protects that klist_node from not having entirely
been released, and any access to the kref might be a use-after-free
(and the use of "kref_get_unless_zero()" just hides the problem).
So it smells to me like if the kref can go down to zero, the caller is
basically passing in a random pointer.
Please make me feel better about my pull. I need a virtual hug.
(Also, rather than assigning i_dur twice like this:
+ i->i_cur = NULL;
+ if (n && kref_get_unless_zero(&n->n_ref))
+ i->i_cur = n;
I think it would have been cleaner to [in]validate "n" first (perhaps
with a comment about _why_ that is needed yet safe):
+ if (n && !kref_get_unless_zero(&n->n_ref))
+ n = NULL;
and then just do a simple:
+ i->i_cur = n;
afterwards).
But I care less about that small syntactic issue than I care about
understanding why it's safe to pass around a klist_node that might not
exist any more.
Linus
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GIT PULL] Driver core fix for 4.5-rc4
2016-02-14 20:57 ` Linus Torvalds
@ 2016-02-14 21:21 ` James Bottomley
2016-02-14 21:33 ` Linus Torvalds
0 siblings, 1 reply; 5+ messages in thread
From: James Bottomley @ 2016-02-14 21:21 UTC (permalink / raw)
To: Linus Torvalds, Greg KH; +Cc: Andrew Morton, Linux Kernel Mailing List
On Sun, 2016-02-14 at 12:57 -0800, Linus Torvalds wrote:
> On Sun, Feb 14, 2016 at 11:02 AM, Greg KH <gregkh@linuxfoundation.org
> > wrote:
> >
> > Here is one driver core, well klist, fix for 4.5-rc4. It fixes a
> > problem found in the scsi device list traversal that probably also
> > could
> > be triggered by other subsystems.
>
> So I pulled this, but quite frankly, the fix smells bad to me.
>
> If the n_ref kref can go down to zero at any time, how is that
> "struct klist_node *n" safe to ever even touch in the caller?
>
> IOW, what is it that protects that klist_node from not having
> entirely been released, and any access to the kref might be a use
> -after-free (and the use of "kref_get_unless_zero()" just hides the
> problem).
OK, so if I explain, don't shoot the messenger; I didn't write the
klist abstraction, I'm just trying to make it work. The design of the
abstraction is that the reference counting is kept internal to the
layer that uses it (in this case drivers/base). The klist_node is
designed to be self renewing, so you can genuinely remove the
klist_node from the list and when the last reference goes to zero, the
object is really removed from the list and the klist_node is ready to
be re-used, because the object could be added back to that (or another)
list. Since klist_node is embedded in another object, whether it's
being freed is the property of the lifetime rules of that embedding
object, which are not subordinate to the klist lifetime rules (although
the klist usually takes a reference on the embedding object).
This means that when you pass an object to a caller(in this case, the
bus_find_device), you pass it with an incremented refcount on the
embedding object, which is what the caller cares about. What happens
to the klist_node is entirely internal to the callee subsystem. So you
never have to worry about the klist_node being freed, because it's
embedded in the object the caller holds a reference to and thus can't
be freed.
However, because the upper layer is unaware of what's going on inside
the lower subsystem, the object may be removed from the list, which
means the internal klist_node n_ref may go to zero. The caller still
holds a reference on the embedding object, so the object itself isn't
going to be freed, but n_ref == 0 means it is no longer on any internal
list. This is the case we get in the SCSI problem. When you pass the
object back to bus_find_device() because it isn't on the klist anymore,
we can't begin with it as the starting place for the iterator. That's
what the patch is fixing.
> So it smells to me like if the kref can go down to zero, the caller
> is basically passing in a random pointer.
>
> Please make me feel better about my pull. I need a virtual hug.
This being Valentine's day, I'm fresh out of hugs, but my dog sends you
a lick.
> (Also, rather than assigning i_dur twice like this:
>
> + i->i_cur = NULL;
> + if (n && kref_get_unless_zero(&n->n_ref))
> + i->i_cur = n;
>
> I think it would have been cleaner to [in]validate "n" first (perhaps
> with a comment about _why_ that is needed yet safe):
>
> + if (n && !kref_get_unless_zero(&n->n_ref))
> + n = NULL;
>
> and then just do a simple:
>
> + i->i_cur = n;
>
> afterwards).
>
> But I care less about that small syntactic issue than I care about
> understanding why it's safe to pass around a klist_node that might
> not exist any more.
Yes, that looks fine too. I was basically assuming the compiler would
optimise away the double setting of i->i_cur.
James
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GIT PULL] Driver core fix for 4.5-rc4
2016-02-14 21:21 ` James Bottomley
@ 2016-02-14 21:33 ` Linus Torvalds
2016-02-14 22:06 ` James Bottomley
0 siblings, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2016-02-14 21:33 UTC (permalink / raw)
To: James Bottomley; +Cc: Greg KH, Andrew Morton, Linux Kernel Mailing List
On Sun, Feb 14, 2016 at 1:21 PM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> This means that when you pass an object to a caller(in this case, the
> bus_find_device), you pass it with an incremented refcount on the
> embedding object, which is what the caller cares about. What happens
> to the klist_node is entirely internal to the callee subsystem. So you
> never have to worry about the klist_node being freed, because it's
> embedded in the object the caller holds a reference to and thus can't
> be freed.
So in this case I didn't actually look at the caller, my reaction was
more to the klist code itself - it doesn't seem to use that
kref_get_unless_zero()" model anywhere else. So the new code just
looked a bit out-of-place which in turn made me worry.
As long as there's a reference there that means that things can't go
away, I guess I'm happy.
> Yes, that looks fine too. I was basically assuming the compiler would
> optimise away the double setting of i->i_cur.
Usually the compiler won't be able to. Things like
"kref_get_unless_zero()" end up using ordered atomic ops (ie there's a
memory clobber in there), and gcc will do "I had better make sure
everything written to memory is up-to-date because now we're going
atomics".
So even when things are inlined and gcc sees everything and could in
theory move things around, doing so around atomics and reference
counts is not going to happen (and would be very much not ok - think
about the compiler starting to reorder memory accesses around people
doing things like that, and shudder).
Linus
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [GIT PULL] Driver core fix for 4.5-rc4
2016-02-14 21:33 ` Linus Torvalds
@ 2016-02-14 22:06 ` James Bottomley
0 siblings, 0 replies; 5+ messages in thread
From: James Bottomley @ 2016-02-14 22:06 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Greg KH, Andrew Morton, Linux Kernel Mailing List
On Sun, 2016-02-14 at 13:33 -0800, Linus Torvalds wrote:
> On Sun, Feb 14, 2016 at 1:21 PM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> >
> > This means that when you pass an object to a caller(in this case,
> > the
> > bus_find_device), you pass it with an incremented refcount on the
> > embedding object, which is what the caller cares about. What
> > happens
> > to the klist_node is entirely internal to the callee subsystem. So
> > you
> > never have to worry about the klist_node being freed, because it's
> > embedded in the object the caller holds a reference to and thus
> > can't
> > be freed.
>
> So in this case I didn't actually look at the caller, my reaction was
> more to the klist code itself - it doesn't seem to use that
> kref_get_unless_zero()" model anywhere else. So the new code just
> looked a bit out-of-place which in turn made me worry.
The klist code actually pre-dates kref_get_unless_zero() ...
The problem the klist code doesn't contemplate is the inputs to some of
its functions could be derived from upper layers with no awareness of
what the state of the klist membership is. In this case
klist_iter_init_node() assumes the klist_node is actually on a klist,
but if it's passed in from a layer that has no awareness, this might be
untrue. Looking through all the prototypes, I think this is the only
place where the already on list assumption is incorrect, so I think
this is the only place in all of the klist code where we actually need
a kref_get_unless_zero().
>
> As long as there's a reference there that means that things can't go
> away, I guess I'm happy.
>
> > Yes, that looks fine too. I was basically assuming the compiler
> > would
> > optimise away the double setting of i->i_cur.
>
> Usually the compiler won't be able to. Things like
> "kref_get_unless_zero()" end up using ordered atomic ops (ie there's
> a
> memory clobber in there), and gcc will do "I had better make sure
> everything written to memory is up-to-date because now we're going
> atomics".
>
> So even when things are inlined and gcc sees everything and could in
> theory move things around, doing so around atomics and reference
> counts is not going to happen (and would be very much not ok - think
> about the compiler starting to reorder memory accesses around people
> doing things like that, and shudder).
Hm, OK, I'll do better next time.
James
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-02-14 22:06 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-14 19:02 [GIT PULL] Driver core fix for 4.5-rc4 Greg KH
2016-02-14 20:57 ` Linus Torvalds
2016-02-14 21:21 ` James Bottomley
2016-02-14 21:33 ` Linus Torvalds
2016-02-14 22:06 ` James Bottomley
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.