netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* bug fix patch lost: git problem or just incorrect merge?
@ 2010-05-21 15:41 James Bottomley
  2010-05-21 16:45 ` Linus Torvalds
  2010-05-21 22:02 ` David Miller
  0 siblings, 2 replies; 6+ messages in thread
From: James Bottomley @ 2010-05-21 15:41 UTC (permalink / raw)
  To: Linus Torvalds, David Miller; +Cc: linux-kernel, netdev, linux-scsi

The patch in question is this one (upstream for a while):

commit d7d05548a62c87ee55b0c81933669177f885aa8d
Author: Mike Christie <michaelc@cs.wisc.edu>
Date:   Wed Mar 31 14:41:35 2010 -0500

    [SCSI] iscsi_tcp: fix relogin/shutdown hang

It's a simple one line change in iscsi_tcp.c (diff clipped):

--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -599,7 +599,7 @@ static void iscsi_sw_tcp_conn_stop(struct iscsi_cls_conn *cls_conn, int flag)
-       if (sock->sk->sk_sleep && waitqueue_active(sock->sk->sk_sleep)) {
+       if (sock->sk->sk_sleep) {

It was killed by this merge commit in the net-next tree:

commit 278554bd6579206921f5d8a523649a7a57f8850d
Merge: 5a147e8 cea0d76
Author: David S. Miller <davem@davemloft.net>
Date:   Wed May 12 00:05:35 2010 -0700

However, the curious thing is that git seems to have lost trace of the
missing patch entirely:  if I try to find it in linus' tree with a git
log -- drivers/scsi/iscsi_tcp.c, it doesn't show up.  The merge commit
which killed it does list iscsi_tcp.c as a file conflict, but git show
on that commit doesn't list that file in the resolution diff ... even
though this is where it actually got killed.

Is this a git problem ... or is it just a mismerge in the net tree?

Either way, of course, we need the patch back ...

James

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: bug fix patch lost: git problem or just incorrect merge?
  2010-05-21 15:41 bug fix patch lost: git problem or just incorrect merge? James Bottomley
@ 2010-05-21 16:45 ` Linus Torvalds
  2010-05-21 17:04   ` Linus Torvalds
  2010-05-21 22:07   ` David Miller
  2010-05-21 22:02 ` David Miller
  1 sibling, 2 replies; 6+ messages in thread
From: Linus Torvalds @ 2010-05-21 16:45 UTC (permalink / raw)
  To: James Bottomley; +Cc: David Miller, linux-kernel, netdev, linux-scsi



On Fri, 21 May 2010, James Bottomley wrote:
>
> The patch in question is this one (upstream for a while):
> 
> commit d7d05548a62c87ee55b0c81933669177f885aa8d
> Author: Mike Christie <michaelc@cs.wisc.edu>
> Date:   Wed Mar 31 14:41:35 2010 -0500
> 
>     [SCSI] iscsi_tcp: fix relogin/shutdown hang
> 
> It's a simple one line change in iscsi_tcp.c (diff clipped):
> 
> --- a/drivers/scsi/iscsi_tcp.c
> +++ b/drivers/scsi/iscsi_tcp.c
> @@ -599,7 +599,7 @@ static void iscsi_sw_tcp_conn_stop(struct iscsi_cls_conn *cls_conn, int flag)
> -       if (sock->sk->sk_sleep && waitqueue_active(sock->sk->sk_sleep)) {
> +       if (sock->sk->sk_sleep) {
> 
> It was killed by this merge commit in the net-next tree:
> 
> commit 278554bd6579206921f5d8a523649a7a57f8850d
> Merge: 5a147e8 cea0d76
> Author: David S. Miller <davem@davemloft.net>
> Date:   Wed May 12 00:05:35 2010 -0700
> 
> However, the curious thing is that git seems to have lost trace of the
> missing patch entirely

No, it's there, and the bug is that David doesn't do merges well.

One of the reasons I ask people to let me merge is that it results in 
cleaner history to not have criss-cross merges. And another is that I'm 
pretty good at it, and letting me make merges also means that I am more 
aware of problem spots.

> if I try to find it in linus' tree with a git log -- 
> drivers/scsi/iscsi_tcp.c, it doesn't show up.

That is because "git log" will see the merge, see that _all_ history came 
from the other side, and ignore the side that was ignored. Because 
clearly, that other side didn't actually contribute anything.

Now, the _reason_ that other side didn't contribute anything is that David 
messed up the merge, and took just his own sides changes.

> The merge commit which killed it does list iscsi_tcp.c as a file 
> conflict

Yes. I re-did the merge, and the result looks like this (cut-and-paste 
whitespace damage, I'm just illustrating he point):

	diff --cc drivers/scsi/iscsi_tcp.c
	index 9eae04a,02143af..0000000
	--- a/drivers/scsi/iscsi_tcp.c
	+++ b/drivers/scsi/iscsi_tcp.c
	@@@ -599,9 -599,9 +599,13 @@@ static void iscsi_sw_tcp_conn_stop(stru
	        set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_rx);
	        write_unlock_bh(&tcp_sw_conn->sock->sk->sk_callback_lock);
	  
	++<<<<<<< HEAD
	 +      if (sk_sleep(sock->sk) && waitqueue_active(sk_sleep(sock->sk))) {
	++=======
	+       if (sock->sk->sk_sleep) {
	++>>>>>>> cea0d76
	                sock->sk->sk_err = EIO;
	 -              wake_up_interruptible(sock->sk->sk_sleep);
	 +              wake_up_interruptible(sk_sleep(sock->sk));
	        }
	  
	        iscsi_conn_stop(cls_conn, flag);

and David picked his side of things, not your side.

The _correct_ merge would have been to take both changes, as is quite 
obvious if you do

	gitk 5a147e8...cea0d76 drivers/scsi/iscsi_tcp.c

and see that the conflict comes because:

 - one side (David's) changed

	sock->sk->sk_sleep

   into

	sk_sleep(sock->sk)

   in commit aa395145165cb06a0d0885221bbe0ce4a564391d

 - the other side (your) removed the 'waitqueue_active()' part in commit 
   d7d05548a62c87ee55b0c81933669177f885aa8d.

So the end result _should_ have been this merge resolution:

	diff --cc drivers/scsi/iscsi_tcp.c
	index 9eae04a,02143af..0000000
	--- a/drivers/scsi/iscsi_tcp.c
	+++ b/drivers/scsi/iscsi_tcp.c
	@@@ -599,9 -599,9 +599,9 @@@ static void iscsi_sw_tcp_conn_stop(stru
	        set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_rx);
	        write_unlock_bh(&tcp_sw_conn->sock->sk->sk_callback_lock);
	  
	-       if (sk_sleep(sock->sk) && waitqueue_active(sk_sleep(sock->sk))) {
	 -      if (sock->sk->sk_sleep) {
	++      if (sk_sleep(sock->sk)) {
	                sock->sk->sk_err = EIO;
	 -              wake_up_interruptible(sock->sk->sk_sleep);
	 +              wake_up_interruptible(sk_sleep(sock->sk));
	        }
	  
	        iscsi_conn_stop(cls_conn, flag);

but David just picked his side entirely. And that is also the reason for:

> but git show on that commit doesn't list that file in the resolution 
> diff ... even though this is where it actually got killed.

A merge diff ("combined diff") only shows conflicts as defined by "you 
resolved it to something that didn't match either side". That's a _real_ 
conflict. If you just end up picking one side, there is no diff.

> Is this a git problem ... or is it just a mismerge in the net tree?

So it's a mis-merge. You can see the commit with

	gitk v2.6.34.. --full-history drivers/scsi/iscsi_tcp.c

which doesn't do the "ignore the side of a merge that didn't bring any new 
data in". Or, with any recent git, you can use "--simplify-merges" instead 
of full-history, which only simplifies trivial merges where neither side 
really touched things at all.

If you do that, you'll also see why git doesn't show the uninteresting 
side of a merge by default. Just for fun, compare the graphs of

	gitk v2.6.34.. drivers/scsi/iscsi_tcp.c
	gitk v2.6.34.. --simplify-merges drivers/scsi/iscsi_tcp.c
	gitk v2.6.34.. --full-history drivers/scsi/iscsi_tcp.c

and ask yourself: do you normally want to see _all_ the history, even 
stuff that didn't end up affecting the end result?

> Either way, of course, we need the patch back ...

I'll fix it up.

			Linus

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: bug fix patch lost: git problem or just incorrect merge?
  2010-05-21 16:45 ` Linus Torvalds
@ 2010-05-21 17:04   ` Linus Torvalds
  2010-05-21 17:23     ` James Bottomley
  2010-05-21 22:07   ` David Miller
  1 sibling, 1 reply; 6+ messages in thread
From: Linus Torvalds @ 2010-05-21 17:04 UTC (permalink / raw)
  To: James Bottomley; +Cc: David Miller, linux-kernel, netdev, linux-scsi



On Fri, 21 May 2010, Linus Torvalds wrote:
> 
> > Either way, of course, we need the patch back ...
> 
> I'll fix it up.

Hmm. Pushed that out as appended, since that is the correct resolve.

HOWEVER - the code still doesn't actually make any sense. It does

	if (sk_sleep(sock->sk)) {

and that sk_sleep() today is an inline function that just does

	return &sk->sk_wq->wait;

and testing the result of an address-of operation for NULL is almost 
certainly totally non-sensical. Sure, it _might_ work (maybe 'wait' is the 
first element in the 'sk_wq' data structure, and sk_wq is NULL), but that 
kind of code is always total and utterl crap regardless.

So I pushed it out because I had done the resolve already, and it's no 
worse than it was before, but it's still a steaming buggy pile of shit.

It being iscsi, I can't bring myself to care. But somebody who does, 
should really look at it. The most likely resolution is to remove the test 
entirely, since I don't think it's ever valid to have a socket that 
doesn't have a sk_wq (there's a _lot_ of unconditional use of sk_sleep()).

		Linus

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: bug fix patch lost: git problem or just incorrect merge?
  2010-05-21 17:04   ` Linus Torvalds
@ 2010-05-21 17:23     ` James Bottomley
  0 siblings, 0 replies; 6+ messages in thread
From: James Bottomley @ 2010-05-21 17:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Miller, linux-kernel, netdev, linux-scsi

On Fri, 2010-05-21 at 10:04 -0700, Linus Torvalds wrote:
> 
> On Fri, 21 May 2010, Linus Torvalds wrote:
> > 
> > > Either way, of course, we need the patch back ...
> > 
> > I'll fix it up.
> 
> Hmm. Pushed that out as appended, since that is the correct resolve.

Thanks!

> HOWEVER - the code still doesn't actually make any sense. It does
> 
> 	if (sk_sleep(sock->sk)) {
> 
> and that sk_sleep() today is an inline function that just does
> 
> 	return &sk->sk_wq->wait;
> 
> and testing the result of an address-of operation for NULL is almost 
> certainly totally non-sensical. Sure, it _might_ work (maybe 'wait' is the 
> first element in the 'sk_wq' data structure, and sk_wq is NULL), but that 
> kind of code is always total and utterl crap regardless.
> 
> So I pushed it out because I had done the resolve already, and it's no 
> worse than it was before, but it's still a steaming buggy pile of shit.

Yes, the problem was caused by this patch

commit 43815482370c510c569fd18edb57afcb0fa8cab6
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Thu Apr 29 11:01:49 2010 +0000

    net: sock_def_readable() and friends RCU conversion

Which moved sk_sleep() from returning the pointer to the waitqueue,
which may or may not be assigned to returning a pointer to an internal
waitqueue in the socket, which, obviously, can never be null.

I suspect what iscsi should be doing is always sending the wakeup ... in
which case with your resolution, the code is operating correctly even if
the form is suboptimal.

> It being iscsi, I can't bring myself to care. But somebody who does, 
> should really look at it. The most likely resolution is to remove the test 
> entirely, since I don't think it's ever valid to have a socket that 
> doesn't have a sk_wq (there's a _lot_ of unconditional use of sk_sleep()).

I'll have Mike look at it, but I think just removing the if() will be
the correct resolution.

James



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: bug fix patch lost: git problem or just incorrect merge?
  2010-05-21 15:41 bug fix patch lost: git problem or just incorrect merge? James Bottomley
  2010-05-21 16:45 ` Linus Torvalds
@ 2010-05-21 22:02 ` David Miller
  1 sibling, 0 replies; 6+ messages in thread
From: David Miller @ 2010-05-21 22:02 UTC (permalink / raw)
  To: James.Bottomley; +Cc: torvalds, linux-kernel, netdev, linux-scsi

From: James Bottomley <James.Bottomley@suse.de>
Date: Fri, 21 May 2010 10:41:55 -0500

> Is this a git problem ... or is it just a mismerge in the net tree?

Mismerge, because sk->sk_sleep() doesn't exist any more I mistakenly
updated the original line to do the sk_sleep() stuff.

Sorry about that.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: bug fix patch lost: git problem or just incorrect merge?
  2010-05-21 16:45 ` Linus Torvalds
  2010-05-21 17:04   ` Linus Torvalds
@ 2010-05-21 22:07   ` David Miller
  1 sibling, 0 replies; 6+ messages in thread
From: David Miller @ 2010-05-21 22:07 UTC (permalink / raw)
  To: torvalds; +Cc: James.Bottomley, linux-kernel, netdev, linux-scsi

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Fri, 21 May 2010 09:45:49 -0700 (PDT)

> One of the reasons I ask people to let me merge is that it results in 
> cleaner history to not have criss-cross merges. And another is that I'm 
> pretty good at it, and letting me make merges also means that I am more 
> aware of problem spots.

That wasn't possible in this case.

This happened more than a week ago, as I needed to merge your tree into
net-2.6 to resolve a conflict there.  That's what took in the iscsi
bug fix, and this is way before the merge window.

Then I needed to pull net-2.6 into net-next-2.6 to resolve conflicts
existing between those two trees.

And this is why I ended up having to do the merge :-)

>> Either way, of course, we need the patch back ...
> 
> I'll fix it up.

Thanks Linus.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-05-21 22:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-21 15:41 bug fix patch lost: git problem or just incorrect merge? James Bottomley
2010-05-21 16:45 ` Linus Torvalds
2010-05-21 17:04   ` Linus Torvalds
2010-05-21 17:23     ` James Bottomley
2010-05-21 22:07   ` David Miller
2010-05-21 22:02 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).