* {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
@ 2010-09-27 13:17 Boaz Harrosh
2010-09-27 14:06 ` Phil Turmel
2010-09-28 20:24 ` Linus Torvalds
0 siblings, 2 replies; 20+ messages in thread
From: Boaz Harrosh @ 2010-09-27 13:17 UTC (permalink / raw)
To: Linus Torvalds, Julia Lawall, David S. Miller, Andrew Morton,
uml-devel, linux-kernel
Cc: Stephen Hemminger
[bharrosh@fs2 ~/dev/git/pub/scsi-misc] 1115$ git bisect good
f25c80a4b2bf93c99820f470573626557db35202 is the first bad commit
commit f25c80a4b2bf93c99820f470573626557db35202
Author: Julia Lawall <julia@diku.dk>
Date: Tue Jul 20 12:25:17 2010 +0000
arch/um/drivers: remove duplicate structure field initialization
There are two initializations of ndo_set_mac_address, one to a local
function that is not used otherwise and one to a function that is defined
elsewhere.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
identifier I, s, fld;
position p0,p;
expression E;
@@
struct I s =@p0 { ... .fld@p = E, ...};
@s@
identifier I, s, r.fld;
position r.p0,p;
expression E;
@@
struct I s =@p0 { ... .fld@p = E, ...};
@script:python@
p0 << r.p0;
fld << r.fld;
ps << s.p;
pr << r.p;
@@
if int(ps[0].line)<int(pr[0].line) or int(ps[0].column)<int(pr[0].column):
cocci.print_main(fld,p0)
// </smpl>
akpm:
- Use the standard eth_mac_addr() in uml_net_set_mac()
- Remove unneeded and racy local set_ether_mac()
- Remove duplicated (and incorrect)
uml_netdev_ops.ndo_set_mac_address initializer.
Fixes 8bb95b39a16ed55226810596f92216c53329d2fe ("uml: convert network
device to netdevice ops").
[akpm@linux-foundation.org: rework as above]
Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
:040000 040000 b3ff18738a5de659b97ff6ad7ba3c23eb2736263 241657298557b14317378fbb08ad664cf6f59ad3 M arch
When this patch is applied my UML's networking works less then 50% of the times.
Then 30% of the times I see ethX where 1 < X < enf. Only 20% of boots will come up
with the regular eth0.
At 50% of the times when the nic fails to show up I get:
* Bringing up interface eth0: RTNETLINK answers: Cannot assign requested address
Failed to bring up eth0.
When I get the funny ethX I get:
* Bringing up interface eth0: Device eth0 does not seem to be present,
delaying initialization. [FAILED]
The patch Reverts cleanly on top of 2.6.36-rc5 and after Revert works perfectly as
before.
If indeed this is a cleanup. It can be included (fixed) again in next Kernel.
I can try any patches until this is fixed.
<RANT A HEAD CAN BE IGNORED>
It has become extremely hard to bisect a simple problem in latest Kernels!
Most mainline merges during a merge window are based on an rc1 of the previous
Kernel. In the last 5 Kernels there was a 90% chance of a BAD bug in systems
I use, at rc1. If a bug is found that needs bisecting. The other bugs creep
up during bisect and mask out the possibility to bisect.
For instance I found the bug I see was already present in 2.6.36-rc2
and that a good point was 2.6.35. Bisecting two bad(s) quickly took me
to some 2.6.35-rc1 Kernel that did not boot at all. So I was clever I
decided to merge in 2.6.35 at each bisect point. Then reset and continue.
But for some reason the bisect got mixed up and complained about an impossible
merge common bases. So out of desperation I did a very very^ stupid thing.
[]$ git rebase -i v2.6.35 v2.6.36-rc1
After endless merge conflicts, at about 3700/7100 I realised that I know
what is the patch that makes my UML refuse to boot. I have actually bisect
a problem with it's absence 2 Kernels ago. So I was again at my bisecting
manually cherry-picking the offending patch. Making sure to reset it before
every git bisect XXX step. But clearly I got lucky.
In short I wish at some 2.6.XX-rc[45] of every Kernel cycle. Maintainers
would rebase their next's tree of [XX+1] to a some what more stable rc.
Sure re-run all the tests. They still have time for the new tree in next
to be tested and verified before the next merge window.
(Hell one of my bisect points took me as back as 2.6.34)
Please remind me why maintainers should not rebase their trees once
committed, to the point that they don't rebase even for buggy patches
that are already in next, and apply fix patches, all within the same
merge window. The same is also done with merge conflicts with the
rc-cycle of their own code, instead of rebasing.
So in short this is a call for, possibly, cleaner History in main Kernel.
Please remind me why re-writing history is a bad thing.
</RANT A HEAD CAN BE IGNORED>
Thanks
Boaz
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-27 13:17 {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization Boaz Harrosh
@ 2010-09-27 14:06 ` Phil Turmel
2010-09-28 20:24 ` Linus Torvalds
1 sibling, 0 replies; 20+ messages in thread
From: Phil Turmel @ 2010-09-27 14:06 UTC (permalink / raw)
To: Boaz Harrosh
Cc: Linus Torvalds, Julia Lawall, David S. Miller, Andrew Morton,
uml-devel, linux-kernel, Stephen Hemminger
On 09/27/2010 09:17 AM, Boaz Harrosh wrote:
[snip /]
>
> <RANT A HEAD CAN BE IGNORED>
>
> It has become extremely hard to bisect a simple problem in latest Kernels!
>
> Most mainline merges during a merge window are based on an rc1 of the previous
> Kernel. In the last 5 Kernels there was a 90% chance of a BAD bug in systems
> I use, at rc1. If a bug is found that needs bisecting. The other bugs creep
> up during bisect and mask out the possibility to bisect.
I had similar problems when bisecting the recent USB HID regression. Once I
realized that "bisect skip" kept dropping me into a rats nest, I guessed on
-rc2 and was able to proceed from there.
...
> In short I wish at some 2.6.XX-rc[45] of every Kernel cycle. Maintainers
> would rebase their next's tree of [XX+1] to a some what more stable rc.
> Sure re-run all the tests. They still have time for the new tree in next
> to be tested and verified before the next merge window.
> (Hell one of my bisect points took me as back as 2.6.34)
>
> Please remind me why maintainers should not rebase their trees once
> committed, to the point that they don't rebase even for buggy patches
> that are already in next, and apply fix patches, all within the same
> merge window. The same is also done with merge conflicts with the
> rc-cycle of their own code, instead of rebasing.
>
> So in short this is a call for, possibly, cleaner History in main Kernel.
> Please remind me why re-writing history is a bad thing.
I can't comment on whether rebasing is reasonable at that level, but I
was wondering if it made sense to teach git bisect to automatically
cherry-pick known regression fixes. If I recall correctly, someone once
suggested a history tag of the form "Fixes: <git-commit-id>". By itself,
that's probably not sufficient, as I'm sure some relevant commits would
get through without that tag. A separate index file containing pairs of
commit-ids could supplement the main history.
If that sounds like a reasonable approach, I'm willing to take a stab at
implementing it. (Unless someone smarter than me beats me to it, of course.)
Phil
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-27 13:17 {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization Boaz Harrosh
2010-09-27 14:06 ` Phil Turmel
@ 2010-09-28 20:24 ` Linus Torvalds
2010-09-28 20:47 ` Andrew Morton
` (2 more replies)
1 sibling, 3 replies; 20+ messages in thread
From: Linus Torvalds @ 2010-09-28 20:24 UTC (permalink / raw)
To: Boaz Harrosh
Cc: Julia Lawall, David S. Miller, Andrew Morton, uml-devel,
linux-kernel, Stephen Hemminger
Ping, no comments?
On Mon, Sep 27, 2010 at 6:17 AM, Boaz Harrosh <bharrosh@panasas.com> wrote:
>
> [bharrosh@fs2 ~/dev/git/pub/scsi-misc] 1115$ git bisect good
> f25c80a4b2bf93c99820f470573626557db35202 is the first bad commit
> commit f25c80a4b2bf93c99820f470573626557db35202
It looks like that commit is indeed very misleading. The commit message says:
"arch/um/drivers: remove duplicate structure field initialization"
but it is in fact not duplicate: there's two field initializations,
but they are _different_. Looking at the patch, it has:
.ndo_set_mac_address = uml_net_set_mac,
- .ndo_set_mac_address = eth_mac_addr,
so it removes the later one, but it is not at all clear which one the
compiler actually used. My guess is that it used to use the later one
(the standard eth_mac_addr function), and the patch made it suddenly
use the uml_net_set_mac function.
I didn't check what gcc used to do, but this:
> The patch Reverts cleanly on top of 2.6.36-rc5 and after Revert works perfectly as
> before.
makes me suspect that nobody else checked it either.
> <RANT A HEAD CAN BE IGNORED>
>
> It has become extremely hard to bisect a simple problem in latest Kernels!
It's always been extremely hard, it just depends on luck how well it
works. Sometimes you never see any problems (except the one you
bisect, which is obviously the problem you _want_ to see), and then
sometimes bisection is really painful because there are multiple
independent ones you hit.
> For instance I found the bug I see was already present in 2.6.36-rc2
> and that a good point was 2.6.35. Bisecting two bad(s) quickly took me
> to some 2.6.35-rc1 Kernel that did not boot at all. So I was clever I
> decided to merge in 2.6.35 at each bisect point.
That really wasn't clever. Don't do it. It will cause untold pain. All
your problems resulted from that initial thing - you were basically
bisecting commits that weren't even part of the original "bad" state.
After a while, git-bisect ends up hitting those commits that aren't
even reachable from 'bad', and now you're screwed.
So all the other problems you had were due to that.
Now, admittedly, the thing that caused you to do this in the first
place (hitting a bad kernel that you couldn't even test) is painful.
And no, "git bisect skip" doesn't necessarily work all that well. But
what you tried really just made things worse.
If the real thing to do, just so you know next time, is to do the "git
bisect visualize", and try to pick a good point, and just select that
(with "git reset --hard xyzzy").
Now, that "good" is obviously a matter of judgement, because you don't
want to pick it too close to a known-bad or known-good commit (that
just makes bisection not work very efficiently), and seeing that can
be hard. But usually a good strategy is to pick something that looks
_reasonably_ central in the gitk view, and also looks like it's not in
the middle of some big upheaval (and preferably as far as reasonable
from the commit you know you can't test - pick a point on a different
branch before that got merged, for example)
> Then reset and continue.
> But for some reason the bisect got mixed up and complained about an impossible
> merge common bases. So out of desperation I did a very very^ stupid thing.
> []$ git rebase -i v2.6.35 v2.6.36-rc1
This _really_ won't work. I mean yes, it can work, but with any kind
of complex history, you're setting yourself up for more pain than it's
worth. It _can_ be worthwhile, but it's absolutely the last thing you
should try.
> In short I wish at some 2.6.XX-rc[45] of every Kernel cycle. Maintainers
> would rebase their next's tree of [XX+1] to a some what more stable rc.
> Sure re-run all the tests. They still have time for the new tree in next
> to be tested and verified before the next merge window.
> (Hell one of my bisect points took me as back as 2.6.34)
>
> Please remind me why maintainers should not rebase their trees once
> committed, to the point that they don't rebase even for buggy patches
> that are already in next, and apply fix patches, all within the same
> merge window. The same is also done with merge conflicts with the
> rc-cycle of their own code, instead of rebasing.
Umm. Rebasing often makes things much _worse_.
The real problem is that maintainers often pick random - and not at
all stable - points for their development to begin with. They just
pick some random "this is where Linus -git tree is today", and do
their development on top of that. THAT is the problem - they are
unaware that there's some nasty bug in that version.
It's actually made worse by rebasing. A lot of people end up rebasing
on top of such a random kernel (again, just because it's the 'most
recent'). It's very annoying.
Not to mention that rebasing easily results in bugs of its own, when
you do hit merge conflicts or double-apply something that already got
applied (and wasn't seen as a duplicate and merged away automatically,
because the code had been further modified since). We had that in the
DVB pull request just yesterday.
> So in short this is a call for, possibly, cleaner History in main Kernel.
> Please remind me why re-writing history is a bad thing.
Rebasing doesn't result in cleaner history. It just results in
_incorrect_ history that looks simpler.
To get cleaner history, people should try to keep their tree clean.
Not add random patches to random branches, and not start random
branches at random points in time that aren't necessarily stable.
Linus
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 20:24 ` Linus Torvalds
@ 2010-09-28 20:47 ` Andrew Morton
2010-09-28 20:51 ` David Miller
2010-09-28 20:57 ` Linus Torvalds
2010-09-28 21:11 ` Al Viro
2011-01-26 16:32 ` {painfullyBISECTED} " Emil Langrock
2 siblings, 2 replies; 20+ messages in thread
From: Andrew Morton @ 2010-09-28 20:47 UTC (permalink / raw)
To: Linus Torvalds
Cc: Boaz Harrosh, Julia Lawall, David S. Miller, uml-devel,
linux-kernel, Stephen Hemminger
On Tue, 28 Sep 2010 13:24:40 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Ping, no comments?
>
> On Mon, Sep 27, 2010 at 6:17 AM, Boaz Harrosh <bharrosh@panasas.com> wrote:
> >
> > [bharrosh@fs2 ~/dev/git/pub/scsi-misc] 1115$ git bisect good
> > f25c80a4b2bf93c99820f470573626557db35202 is the first bad commit
> > commit f25c80a4b2bf93c99820f470573626557db35202
>
> It looks like that commit is indeed very misleading. The commit message says:
>
> "arch/um/drivers: remove duplicate structure field initialization"
>
> but it is in fact not duplicate: there's two field initializations,
> but they are _different_. Looking at the patch, it has:
>
> .ndo_set_mac_address = uml_net_set_mac,
> - .ndo_set_mac_address = eth_mac_addr,
>
> so it removes the later one, but it is not at all clear which one the
> compiler actually used. My guess is that it used to use the later one
> (the standard eth_mac_addr function), and the patch made it suddenly
> use the uml_net_set_mac function.
>
> I didn't check what gcc used to do, but this:
>
> > The patch Reverts cleanly on top of 2.6.36-rc5 and after Revert works perfectly as
> > before.
>
> makes me suspect that nobody else checked it either.
I checked! gcc uses the second initialiser.
uml_net_set_mac() is:
static int uml_net_set_mac(struct net_device *dev, void *addr)
{
struct uml_net_private *lp = netdev_priv(dev);
struct sockaddr *hwaddr = addr;
spin_lock_irq(&lp->lock);
eth_mac_addr(dev, hwaddr->sa_data);
spin_unlock_irq(&lp->lock);
return 0;
}
And I misread that, assuming that it's just a wrapper around
eth_mac_addr(). Only it isn't, because it passes eth_mac_addr() the
MAC address's address directly (with ->sa_data). But eth_mac_addr()
expects a `struct sockaddr *'.
And for some wtf reason, eth_mac_addr() passes that `struct
sockaddr *' in a `void *', thus cunningly hiding the bug.
Yeah, please revert it.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 20:47 ` Andrew Morton
@ 2010-09-28 20:51 ` David Miller
2010-09-28 20:57 ` Linus Torvalds
1 sibling, 0 replies; 20+ messages in thread
From: David Miller @ 2010-09-28 20:51 UTC (permalink / raw)
To: akpm
Cc: torvalds, bharrosh, julia, user-mode-linux-devel, linux-kernel,
shemminger
From: Andrew Morton <akpm@linux-foundation.org>
Date: Tue, 28 Sep 2010 13:47:21 -0700
> Yeah, please revert it.
I'll take care of this, thanks everyone.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 20:47 ` Andrew Morton
2010-09-28 20:51 ` David Miller
@ 2010-09-28 20:57 ` Linus Torvalds
2010-09-28 21:00 ` David Miller
` (2 more replies)
1 sibling, 3 replies; 20+ messages in thread
From: Linus Torvalds @ 2010-09-28 20:57 UTC (permalink / raw)
To: Andrew Morton
Cc: Boaz Harrosh, Julia Lawall, David S. Miller, uml-devel,
linux-kernel, Stephen Hemminger
[-- Attachment #1: Type: text/plain, Size: 902 bytes --]
On Tue, Sep 28, 2010 at 1:47 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> And I misread that, assuming that it's just a wrapper around
> eth_mac_addr(). Only it isn't, because it passes eth_mac_addr() the
> MAC address's address directly (with ->sa_data). But eth_mac_addr()
> expects a `struct sockaddr *'.
>
> And for some wtf reason, eth_mac_addr() passes that `struct
> sockaddr *' in a `void *', thus cunningly hiding the bug.
>
> Yeah, please revert it.
Well, the alternative would be to just fix uml_net_set_mac(). The code
before the patch was clearly crap too, so reverting may make it work,
but gets us back to a stupid situation with two different initializers
for one field.
So perhaps something like the attached patch would make it work, and
have the locking in place that apparently people think it should have?
Boaz?
Linus
[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 730 bytes --]
arch/um/drivers/net_kern.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/um/drivers/net_kern.c b/arch/um/drivers/net_kern.c
index 2ab233b..e44fd87 100644
--- a/arch/um/drivers/net_kern.c
+++ b/arch/um/drivers/net_kern.c
@@ -257,14 +257,14 @@ static void uml_net_tx_timeout(struct net_device *dev)
static int uml_net_set_mac(struct net_device *dev, void *addr)
{
+ int ret;
struct uml_net_private *lp = netdev_priv(dev);
- struct sockaddr *hwaddr = addr;
spin_lock_irq(&lp->lock);
- eth_mac_addr(dev, hwaddr->sa_data);
+ ret = eth_mac_addr(dev, addr);
spin_unlock_irq(&lp->lock);
- return 0;
+ return ret;
}
static int uml_net_change_mtu(struct net_device *dev, int new_mtu)
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 20:57 ` Linus Torvalds
@ 2010-09-28 21:00 ` David Miller
2010-09-28 21:08 ` Linus Torvalds
2010-09-29 8:34 ` [PATCH] um: Proper Fix for f25c80a4: " Boaz Harrosh
2010-09-29 8:41 ` {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: " Boaz Harrosh
2 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2010-09-28 21:00 UTC (permalink / raw)
To: torvalds
Cc: akpm, bharrosh, julia, user-mode-linux-devel, linux-kernel,
shemminger
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue, 28 Sep 2010 13:57:04 -0700
> So perhaps something like the attached patch would make it work, and
> have the locking in place that apparently people think it should have?
I'm fine with this if it works:
Acked-by: David S. Miller <davem@davemloft.net>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 21:00 ` David Miller
@ 2010-09-28 21:08 ` Linus Torvalds
0 siblings, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2010-09-28 21:08 UTC (permalink / raw)
To: David Miller
Cc: akpm, bharrosh, julia, user-mode-linux-devel, linux-kernel,
shemminger
On Tue, Sep 28, 2010 at 2:00 PM, David Miller <davem@davemloft.net> wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Tue, 28 Sep 2010 13:57:04 -0700
>
>> So perhaps something like the attached patch would make it work, and
>> have the locking in place that apparently people think it should have?
>
> I'm fine with this if it works:
>
> Acked-by: David S. Miller <davem@davemloft.net>
And you can add my sign-off to it if somebody tests it.
It does look like the obvious fix. The "wrapper" is simply totally
wrong as-is, and passes the wrong thing entirely to eth_mac_addr()
afaik.
Linus
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 20:24 ` Linus Torvalds
2010-09-28 20:47 ` Andrew Morton
@ 2010-09-28 21:11 ` Al Viro
2010-09-28 21:24 ` Al Viro
2011-01-26 16:32 ` {painfullyBISECTED} " Emil Langrock
2 siblings, 1 reply; 20+ messages in thread
From: Al Viro @ 2010-09-28 21:11 UTC (permalink / raw)
To: Linus Torvalds
Cc: Boaz Harrosh, Julia Lawall, David S. Miller, Andrew Morton,
uml-devel, linux-kernel, Stephen Hemminger
On Tue, Sep 28, 2010 at 01:24:40PM -0700, Linus Torvalds wrote:
> It looks like that commit is indeed very misleading. The commit message says:
>
> "arch/um/drivers: remove duplicate structure field initialization"
>
> but it is in fact not duplicate: there's two field initializations,
> but they are _different_. Looking at the patch, it has:
>
> .ndo_set_mac_address = uml_net_set_mac,
> - .ndo_set_mac_address = eth_mac_addr,
>
> so it removes the later one, but it is not at all clear which one the
> compiler actually used. My guess is that it used to use the later one
> (the standard eth_mac_addr function), and the patch made it suddenly
> use the uml_net_set_mac function.
C99 6.7.8p19:
The initialization shall occur in initializer list order, each
initializer provided for a particular subobject overriding any previously
listed initializer for the same subobject[*]; all subobjects that are not
initialized explicitly shall be initialized implicitly the same as objects
that have static storage duration.
[*] Any initializer for subobject which is overridden and so not used to
initialize that subobject might not be evaluated at all.
IOW, it _must_ use the last one in such cases.
As for the driver, I smell an interface change (in eth_mac_addr() arguments)
that has been missed... FWIW, grep through the tree shows one more instance
of eth_mac_addr() called with such argument and it's also in net_kern.c; there
we simply want memcpy() instead, since device is definitely not running at
that point and we'd done the validity checks earlier.
Not sure if we need lp->lock around that eth_mac_addr() call - not familiar
with the driver in question. If we don't, we should switch to eth_mac_addr
for the method, indeed...
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 21:11 ` Al Viro
@ 2010-09-28 21:24 ` Al Viro
2010-09-28 21:42 ` David Miller
0 siblings, 1 reply; 20+ messages in thread
From: Al Viro @ 2010-09-28 21:24 UTC (permalink / raw)
To: Linus Torvalds
Cc: Boaz Harrosh, Julia Lawall, David S. Miller, Andrew Morton,
uml-devel, linux-kernel, Stephen Hemminger
On Tue, Sep 28, 2010 at 10:11:06PM +0100, Al Viro wrote:
> IOW, it _must_ use the last one in such cases.
>
> As for the driver, I smell an interface change (in eth_mac_addr() arguments)
> that has been missed... FWIW, grep through the tree shows one more instance
> of eth_mac_addr() called with such argument and it's also in net_kern.c; there
> we simply want memcpy() instead, since device is definitely not running at
> that point and we'd done the validity checks earlier.
>
> Not sure if we need lp->lock around that eth_mac_addr() call - not familiar
> with the driver in question. If we don't, we should switch to eth_mac_addr
> for the method, indeed...
FWIW, after looking at that code... I don't think we need lp->lock there,
but I really wonder if we need to update lp->mac as well, or, perhaps simply
remove it completely. Who maintains these drivers? It's not just
net_kern/net_user; there's a bunch of subdrivers for that sucker...
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 21:24 ` Al Viro
@ 2010-09-28 21:42 ` David Miller
2010-09-28 21:51 ` Al Viro
0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2010-09-28 21:42 UTC (permalink / raw)
To: viro
Cc: torvalds, bharrosh, julia, akpm, user-mode-linux-devel,
linux-kernel, shemminger
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 28 Sep 2010 22:24:17 +0100
> Who maintains these drivers?
Your guess is as good as mine, which probably means there is no
active maintainer right now.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 21:42 ` David Miller
@ 2010-09-28 21:51 ` Al Viro
2010-09-29 17:19 ` [uml-devel] " Renzo Davoli
0 siblings, 1 reply; 20+ messages in thread
From: Al Viro @ 2010-09-28 21:51 UTC (permalink / raw)
To: David Miller
Cc: torvalds, bharrosh, julia, akpm, user-mode-linux-devel,
linux-kernel, shemminger
On Tue, Sep 28, 2010 at 02:42:15PM -0700, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Tue, 28 Sep 2010 22:24:17 +0100
>
> > Who maintains these drivers?
>
> Your guess is as good as mine, which probably means there is no
> active maintainer right now.
Oh, joy... I'm regulary using uml, with some of these drivers used
from time to time, but I've no fscking clue on e.g. VDE - even on
the "what the hell it is" level ;-/
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH] um: Proper Fix for f25c80a4: remove duplicate structure field initialization
2010-09-28 20:57 ` Linus Torvalds
2010-09-28 21:00 ` David Miller
@ 2010-09-29 8:34 ` Boaz Harrosh
2010-09-29 15:05 ` Linus Torvalds
2010-09-29 8:41 ` {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: " Boaz Harrosh
2 siblings, 1 reply; 20+ messages in thread
From: Boaz Harrosh @ 2010-09-29 8:34 UTC (permalink / raw)
To: Linus Torvalds, David S. Miller
Cc: Andrew Morton, Julia Lawall, uml-devel, linux-kernel,
Stephen Hemminger, Al Viro
uml_net_set_mac() was broken and luckily it was never used, before.
What it was trying to do is spin_lock before memcopy the mac address.
Linus attempted to fix it in assumption that someone decided the
lock was needed. But since it was never ever used at all, and was
just dead code, I think we can assume that it is not needed, after
all.
On the other hand patch [f25c80a4] was trying to use eth_mac_addr()
in eth_configure(), *which was the real fallout*. Because of state
checks done inside eth_mac_addr() the address was never set. I have
not reintroduced the memcpy wrapper, but I've put a comment for future
cats.
The code now is back to exactly as it was before [f25c80a4]. With
the cleanup applied. If the spin_lock is indeed needed then a contender
should supply a test case that fails, then fix it with the proper
locking, as a separate unrelated patch.
CC: Julia Lawall <julia@diku.dk>
CC: David S. Miller <davem@davemloft.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Al Viro <viro@ZenIV.linux.org.uk>
Tested-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
arch/um/drivers/net_kern.c | 17 +++--------------
1 files changed, 3 insertions(+), 14 deletions(-)
diff --git a/arch/um/drivers/net_kern.c b/arch/um/drivers/net_kern.c
index 2ab233b..47d0c37 100644
--- a/arch/um/drivers/net_kern.c
+++ b/arch/um/drivers/net_kern.c
@@ -255,18 +255,6 @@ static void uml_net_tx_timeout(struct net_device *dev)
netif_wake_queue(dev);
}
-static int uml_net_set_mac(struct net_device *dev, void *addr)
-{
- struct uml_net_private *lp = netdev_priv(dev);
- struct sockaddr *hwaddr = addr;
-
- spin_lock_irq(&lp->lock);
- eth_mac_addr(dev, hwaddr->sa_data);
- spin_unlock_irq(&lp->lock);
-
- return 0;
-}
-
static int uml_net_change_mtu(struct net_device *dev, int new_mtu)
{
dev->mtu = new_mtu;
@@ -373,7 +361,7 @@ static const struct net_device_ops uml_netdev_ops = {
.ndo_start_xmit = uml_net_start_xmit,
.ndo_set_multicast_list = uml_net_set_multicast_list,
.ndo_tx_timeout = uml_net_tx_timeout,
- .ndo_set_mac_address = uml_net_set_mac,
+ .ndo_set_mac_address = eth_mac_addr,
.ndo_change_mtu = uml_net_change_mtu,
.ndo_validate_addr = eth_validate_addr,
};
@@ -472,7 +460,8 @@ static void eth_configure(int n, void *init, char *mac,
((*transport->user->init)(&lp->user, dev) != 0))
goto out_unregister;
- eth_mac_addr(dev, device->mac);
+ /* don't use eth_mac_addr, it will not work here */
+ memcpy(dev->dev_addr, device->mac, ETH_ALEN);
dev->mtu = transport->user->mtu;
dev->netdev_ops = ¨_netdev_ops;
dev->ethtool_ops = ¨_net_ethtool_ops;
--
1.7.2.3
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 20:57 ` Linus Torvalds
2010-09-28 21:00 ` David Miller
2010-09-29 8:34 ` [PATCH] um: Proper Fix for f25c80a4: " Boaz Harrosh
@ 2010-09-29 8:41 ` Boaz Harrosh
2010-09-29 15:01 ` Linus Torvalds
2 siblings, 1 reply; 20+ messages in thread
From: Boaz Harrosh @ 2010-09-29 8:41 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Julia Lawall, David S. Miller, uml-devel,
linux-kernel, Stephen Hemminger
On 09/28/2010 10:57 PM, Linus Torvalds wrote:
> On Tue, Sep 28, 2010 at 1:47 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
>>
>> And I misread that, assuming that it's just a wrapper around
>> eth_mac_addr(). Only it isn't, because it passes eth_mac_addr() the
>> MAC address's address directly (with ->sa_data). But eth_mac_addr()
>> expects a `struct sockaddr *'.
>>
>> And for some wtf reason, eth_mac_addr() passes that `struct
>> sockaddr *' in a `void *', thus cunningly hiding the bug.
>>
>> Yeah, please revert it.
>
> Well, the alternative would be to just fix uml_net_set_mac(). The code
> before the patch was clearly crap too, so reverting may make it work,
> but gets us back to a stupid situation with two different initializers
> for one field.
>
> So perhaps something like the attached patch would make it work, and
> have the locking in place that apparently people think it should have?
>
> Boaz?
>
OK I get the picture. We should attempt a proper fixing. Well since
I get to test it, then I get to fix it as well, right ;-)?
With this I'm now back to my usual: boots fine but half of the times
not halting. But I think that's a Fedora12 problem with interaction
with uml. (Which I do not have time to investigate)
> Linus
Thanks, have a good day
Boaz
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-29 8:41 ` {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: " Boaz Harrosh
@ 2010-09-29 15:01 ` Linus Torvalds
2010-09-30 2:27 ` David Miller
0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2010-09-29 15:01 UTC (permalink / raw)
To: Boaz Harrosh
Cc: Andrew Morton, Julia Lawall, David S. Miller, uml-devel,
linux-kernel, Stephen Hemminger
On Wed, Sep 29, 2010 at 1:41 AM, Boaz Harrosh <bharrosh@panasas.com> wrote:
>
> With this I'm now back to my usual: boots fine but half of the times
> not halting. But I think that's a Fedora12 problem with interaction
> with uml. (Which I do not have time to investigate)
So that "half of the time not halting" happened with the revert too?
If so, we should just do the fix. Thanks for testing.
Linus
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] um: Proper Fix for f25c80a4: remove duplicate structure field initialization
2010-09-29 8:34 ` [PATCH] um: Proper Fix for f25c80a4: " Boaz Harrosh
@ 2010-09-29 15:05 ` Linus Torvalds
2010-09-30 2:28 ` David Miller
0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2010-09-29 15:05 UTC (permalink / raw)
To: Boaz Harrosh
Cc: David S. Miller, Andrew Morton, Julia Lawall, uml-devel,
linux-kernel, Stephen Hemminger, Al Viro
On Wed, Sep 29, 2010 at 1:34 AM, Boaz Harrosh <bharrosh@panasas.com> wrote:
>
> On the other hand patch [f25c80a4] was trying to use eth_mac_addr()
> in eth_configure(), *which was the real fallout*. Because of state
> checks done inside eth_mac_addr() the address was never set. I have
> not reintroduced the memcpy wrapper, but I've put a comment for future
> cats.
Ahh. Ok, this looks fine. David, please use this instead of my broken
one that didn't do the whole thing.
Linus
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [uml-devel] {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 21:51 ` Al Viro
@ 2010-09-29 17:19 ` Renzo Davoli
0 siblings, 0 replies; 20+ messages in thread
From: Renzo Davoli @ 2010-09-29 17:19 UTC (permalink / raw)
To: Al Viro
Cc: David Miller, user-mode-linux-devel, linux-kernel, julia, akpm,
torvalds, shemminger, shammash
> Oh, joy... I'm regulary using uml, with some of these drivers used
> from time to time, but I've no fscking clue on e.g. VDE - even on
> the "what the hell it is" level ;-/
VDE is Virtual Distributed Ethernet.
http://wiki.virtualsquare.org/wiki/index.php/Introduction#Virtual_Square_Networking
I am the main author of VDE.
Was it just an example or there is a bug on the VDE support for user-mode Linux?
renzo
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-29 15:01 ` Linus Torvalds
@ 2010-09-30 2:27 ` David Miller
0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2010-09-30 2:27 UTC (permalink / raw)
To: torvalds
Cc: bharrosh, akpm, julia, user-mode-linux-devel, linux-kernel,
shemminger
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 29 Sep 2010 08:01:07 -0700
> On Wed, Sep 29, 2010 at 1:41 AM, Boaz Harrosh <bharrosh@panasas.com> wrote:
>>
>> With this I'm now back to my usual: boots fine but half of the times
>> not halting. But I think that's a Fedora12 problem with interaction
>> with uml. (Which I do not have time to investigate)
>
> So that "half of the time not halting" happened with the revert too?
> If so, we should just do the fix. Thanks for testing.
Linus, I've queued up your fix into my tree.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] um: Proper Fix for f25c80a4: remove duplicate structure field initialization
2010-09-29 15:05 ` Linus Torvalds
@ 2010-09-30 2:28 ` David Miller
0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2010-09-30 2:28 UTC (permalink / raw)
To: torvalds
Cc: bharrosh, akpm, julia, user-mode-linux-devel, linux-kernel,
shemminger, viro
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 29 Sep 2010 08:05:04 -0700
> On Wed, Sep 29, 2010 at 1:34 AM, Boaz Harrosh <bharrosh@panasas.com> wrote:
>>
>> On the other hand patch [f25c80a4] was trying to use eth_mac_addr()
>> in eth_configure(), *which was the real fallout*. Because of state
>> checks done inside eth_mac_addr() the address was never set. I have
>> not reintroduced the memcpy wrapper, but I've put a comment for future
>> cats.
>
> Ahh. Ok, this looks fine. David, please use this instead of my broken
> one that didn't do the whole thing.
Oops, I just noticed this, ok will do.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: {painfullyBISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization
2010-09-28 20:24 ` Linus Torvalds
2010-09-28 20:47 ` Andrew Morton
2010-09-28 21:11 ` Al Viro
@ 2011-01-26 16:32 ` Emil Langrock
2 siblings, 0 replies; 20+ messages in thread
From: Emil Langrock @ 2011-01-26 16:32 UTC (permalink / raw)
To: linux-kernel
Linus Torvalds <torvalds <at> linux-foundation.org> writes:
> The real problem is that maintainers often pick random - and not at
> all stable - points for their development to begin with. They just
> pick some random "this is where Linus -git tree is today", and do
> their development on top of that. THAT is the problem - they are
> unaware that there's some nasty bug in that version.
>
> It's actually made worse by rebasing. A lot of people end up rebasing
> on top of such a random kernel (again, just because it's the 'most
> recent'). It's very annoying.
>
> Not to mention that rebasing easily results in bugs of its own, when
> you do hit merge conflicts or double-apply something that already got
> applied (and wasn't seen as a duplicate and merged away automatically,
> because the code had been further modified since). We had that in the
> DVB pull request just yesterday.
>
> > So in short this is a call for, possibly, cleaner History in main Kernel.
> > Please remind me why re-writing history is a bad thing.
>
> Rebasing doesn't result in cleaner history. It just results in
> _incorrect_ history that looks simpler.
>
> To get cleaner history, people should try to keep their tree clean.
> Not add random patches to random branches, and not start random
> branches at random points in time that aren't necessarily stable.
I understand what you want to say, but I don't see how you define stable points.
Maybe you mean that a stable point is a non-rc release of Linux. This of course
sounds like a good idea when you start a complete new development without any
dependencies, but think about following events (ordered by time of event):
* v2.6.37 was released
* foo starts to code the feature bar
* foo asks a subtree maintainer to merge his stuff (subtree maintainer doesn't
merge it because 2.6.38-rc1 wasn't released yet)
* v2.6.38-rc1 was released
* subtree maintainer merges the commit because we need more bar in Linux
* v2.6.38 was released
* subtree maintainer asks you to merge his tree with the cool bar feature
* foo wants to extent the bar feature to superbar and get it in 2.6.40
* v2.6.39-rc1 gets released with the cool bar feature
* v2.6.39 gets released with the final and bugfixed bar feature
* (your merge window - time were foo will not be able to get new patches to the
subtree maintainer for v2.6.40-rc1)
* v2.6.40-rc1 gets released
Even if this is a relative small example and the usual problem is more complex -
what starting point should be used for his superbar feature? My obvious answer
would be 2.6.39-rc1 (but is it really a stable starting point?)
Kind regards,
Emil Langrock
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2011-01-26 17:05 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-27 13:17 {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: remove duplicate structure field initialization Boaz Harrosh
2010-09-27 14:06 ` Phil Turmel
2010-09-28 20:24 ` Linus Torvalds
2010-09-28 20:47 ` Andrew Morton
2010-09-28 20:51 ` David Miller
2010-09-28 20:57 ` Linus Torvalds
2010-09-28 21:00 ` David Miller
2010-09-28 21:08 ` Linus Torvalds
2010-09-29 8:34 ` [PATCH] um: Proper Fix for f25c80a4: " Boaz Harrosh
2010-09-29 15:05 ` Linus Torvalds
2010-09-30 2:28 ` David Miller
2010-09-29 8:41 ` {painfully BISECTED} Please revert f25c80a4b2: arch/um/drivers: " Boaz Harrosh
2010-09-29 15:01 ` Linus Torvalds
2010-09-30 2:27 ` David Miller
2010-09-28 21:11 ` Al Viro
2010-09-28 21:24 ` Al Viro
2010-09-28 21:42 ` David Miller
2010-09-28 21:51 ` Al Viro
2010-09-29 17:19 ` [uml-devel] " Renzo Davoli
2011-01-26 16:32 ` {painfullyBISECTED} " Emil Langrock
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox