* Re: 2.6.25-rc8: FTP transfer errors
From: Mark Lord @ 2008-04-11 0:16 UTC
To: David Miller
Cc: jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev

David Miller wrote:
> From: "Jesper Juhl" <jesper.juhl@gmail.com>
> Date: Fri, 11 Apr 2008 00:09:11 +0200
>
>> You can't expect users to know how to debug a problem or even bisect
>> it.
>
> [ The person you are replying to was being sarcastic, BTW. ]
>
> That's not the case we're talking about in this specific instance.  In
> this particular case the user is more than capable of bisecting, he
> just isn't willing to invest the time.
..

Duh.. more like, "If I take 5-8 hours to attempt a bisect (which may not
even work), then that's 5-8 hours I do not get paid for."

Gotta eat, dude.

Anyways, here's five hours of free consulting for you:

git-bisect start
# bad: [7180c4c9e09888db0a188f729c96c6d7bd61fa83] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6
git-bisect bad 7180c4c9e09888db0a188f729c96c6d7bd61fa83
# good: [49914084e797530d9baaf51df9eda77babc98fa8] Linux 2.6.24
git-bisect good 49914084e797530d9baaf51df9eda77babc98fa8
# bad: [e5dfb815181fcb186d6080ac3a091eadff2d98fe] [NET_SCHED]: Add flow classifier
git-bisect bad e5dfb815181fcb186d6080ac3a091eadff2d98fe
# good: [00e0b8cb74ed7c16b2bc41eb33a16eae5b6e2d5c] b43: reinit on too many PHY TX errors
git-bisect good 00e0b8cb74ed7c16b2bc41eb33a16eae5b6e2d5c
# good: [42d545c9a4c0d3faeab658a40165c3da2dda91b2] x86: remove depends on X86_32 from PARAVIRT & PARAVIRT_GUEST
git-bisect good 42d545c9a4c0d3faeab658a40165c3da2dda91b2
# good: [6232665040f9a23fafd9d94d4ae8d5a2dc850f65] Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
git-bisect good 6232665040f9a23fafd9d94d4ae8d5a2dc850f65
# good: [e5723b41abe559bafc52591dcf8ee19cc131d3a1] [ALSA] Remove sequencer instrument layer
git-bisect good e5723b41abe559bafc52591dcf8ee19cc131d3a1
# good: [461e2c78b153e38f284d09721c50c0cd3c47e073] [ALSA] hda-codec - Add Conexant 5051 codec support
git-bisect good 461e2c78b153e38f284d09721c50c0cd3c47e073
# good: [1987e7b4855fcb6a866d3279ee9f2890491bc34d] [AX25]: Kill ax25_bind() user triggable printk.
git-bisect good 1987e7b4855fcb6a866d3279ee9f2890491bc34d
# good: [58a3c9bb0c69f8517c2243cd0912b3f87b4f868c] [NETFILTER]: nf_conntrack: use RCU for conntrack helpers
git-bisect good 58a3c9bb0c69f8517c2243cd0912b3f87b4f868c
# good: [32948588ac4ec54300bae1037e839277fd4536e2] [NETFILTER]: nf_conntrack: annotate l3protos with const
git-bisect good 32948588ac4ec54300bae1037e839277fd4536e2
# bad: [e83a2ea850bf0c0c81c675444080970fc07798c6] [VLAN]: set_rx_mode support for unicast address list
git-bisect bad e83a2ea850bf0c0c81c675444080970fc07798c6
# good: [941b1d22cc035ad58b3d9b44a1c74efac2d7e499] [NETNS]: Make bind buckets live in net namespaces.
git-bisect good 941b1d22cc035ad58b3d9b44a1c74efac2d7e499
# bad: [23717795bee15470b96f9b7aa5ecf4efe14c8e32] [IPV6]: Update MSS even if MTU is unchanged.
git-bisect bad 23717795bee15470b96f9b7aa5ecf4efe14c8e32
# bad: [d86e0dac2ce412715181f792aa0749fe3effff11] [NETNS]: Tcp-v6 sockets per-net lookup.
git-bisect bad d86e0dac2ce412715181f792aa0749fe3effff11
# bad: [c67499c0e772064b37ad75eb69b28fc218752636] [NETNS]: Tcp-v4 sockets per-net lookup.
git-bisect bad c67499c0e772064b37ad75eb69b28fc218752636

[c67499c0e772064b37ad75eb69b28fc218752636 is first bad commit
commit c67499c0e772064b37ad75eb69b28fc218752636
Author: Pavel Emelyanov <xemul@openvz.org>
Date:   Thu Jan 31 05:06:40 2008 -0800

    [NETNS]: Tcp-v4 sockets per-net lookup.

    Add a net argument to inet_lookup and propagate it further
    into lookup calls. Plus tune the __inet_check_established.

    The dccp and inet_diag, which use that lookup functions
    pass the init_net into them.

    Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
---

diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 55532b9..c23c4ed 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -302,15 +302,17 @@ out:
 		wake_up(&hashinfo->lhash_wait);
 }
 
-extern struct sock *__inet_lookup_listener(struct inet_hashinfo *hashinfo,
+extern struct sock *__inet_lookup_listener(struct net *net,
+					   struct inet_hashinfo *hashinfo,
 					   const __be32 daddr,
 					   const unsigned short hnum,
 					   const int dif);
 
-static inline struct sock *inet_lookup_listener(struct inet_hashinfo *hashinfo,
-						__be32 daddr, __be16 dport, int dif)
+static inline struct sock *inet_lookup_listener(struct net *net,
+		struct inet_hashinfo *hashinfo,
+		__be32 daddr, __be16 dport, int dif)
 {
-	return __inet_lookup_listener(hashinfo, daddr, ntohs(dport), dif);
+	return __inet_lookup_listener(net, hashinfo, daddr, ntohs(dport), dif);
 }
 
 /* Socket demux engine toys. */
@@ -344,26 +346,26 @@ typedef __u64 __bitwise __addrpair;
 				   (((__force __u64)(__be32)(__daddr)) << 32) | \
 				   ((__force __u64)(__be32)(__saddr)));
 #endif /* __BIG_ENDIAN */
-#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
-	(((__sk)->sk_hash == (__hash)) && \
+#define INET_MATCH(__sk, __net, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
+	(((__sk)->sk_hash == (__hash)) && ((__sk)->sk_net == (__net)) && \
 	 ((*((__addrpair *)&(inet_sk(__sk)->daddr))) == (__cookie)) && \
 	 ((*((__portpair *)&(inet_sk(__sk)->dport))) == (__ports)) && \
 	 (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
-#define INET_TW_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
-	(((__sk)->sk_hash == (__hash)) && \
+#define INET_TW_MATCH(__sk, __net, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
+	(((__sk)->sk_hash == (__hash)) && ((__sk)->sk_net == (__net)) && \
 	 ((*((__addrpair *)&(inet_twsk(__sk)->tw_daddr))) == (__cookie)) && \
 	 ((*((__portpair *)&(inet_twsk(__sk)->tw_dport))) == (__ports)) && \
 	 (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
 #else /* 32-bit arch */
 #define INET_ADDR_COOKIE(__name, __saddr, __daddr)
-#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif) \
-	(((__sk)->sk_hash == (__hash)) && \
+#define INET_MATCH(__sk, __net, __hash, __cookie, __saddr, __daddr, __ports, __dif) \
+	(((__sk)->sk_hash == (__hash)) && ((__sk)->sk_net == (__net)) && \
 	 (inet_sk(__sk)->daddr == (__saddr)) && \
 	 (inet_sk(__sk)->rcv_saddr == (__daddr)) && \
 	 ((*((__portpair *)&(inet_sk(__sk)->dport))) == (__ports)) && \
 	 (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
-#define INET_TW_MATCH(__sk, __hash,__cookie, __saddr, __daddr, __ports, __dif) \
-	(((__sk)->sk_hash == (__hash)) && \
+#define INET_TW_MATCH(__sk, __net, __hash,__cookie, __saddr, __daddr, __ports, __dif) \
+	(((__sk)->sk_hash == (__hash)) && ((__sk)->sk_net == (__net)) && \
 	 (inet_twsk(__sk)->tw_daddr == (__saddr)) && \
 	 (inet_twsk(__sk)->tw_rcv_saddr == (__daddr)) && \
 	 ((*((__portpair *)&(inet_twsk(__sk)->tw_dport))) == (__ports)) && \
@@ -376,32 +378,36 @@ typedef __u64 __bitwise __addrpair;
  *
  * Local BH must be disabled here.
  */
-extern struct sock * __inet_lookup_established(struct inet_hashinfo *hashinfo,
+extern struct sock * __inet_lookup_established(struct net *net,
+		struct inet_hashinfo *hashinfo,
 		const __be32 saddr, const __be16 sport,
 		const __be32 daddr, const u16 hnum, const int dif);
 
 static inline struct sock *
-	inet_lookup_established(struct inet_hashinfo *hashinfo,
+	inet_lookup_established(struct net *net, struct inet_hashinfo *hashinfo,
 				const __be32 saddr, const __be16 sport,
 				const __be32 daddr, const __be16 dport,
 				const int dif)
 {
-	return __inet_lookup_established(hashinfo, saddr, sport, daddr,
+	return __inet_lookup_established(net, hashinfo, saddr, sport, daddr,
 					 ntohs(dport), dif);
 }
 
-static inline struct sock *__inet_lookup(struct inet_hashinfo *hashinfo,
+static inline struct sock *__inet_lookup(struct net *net,
+					 struct inet_hashinfo *hashinfo,
 					 const __be32 saddr, const __be16 sport,
 					 const __be32 daddr, const __be16 dport,
 					 const int dif)
 {
 	u16 hnum = ntohs(dport);
-	struct sock *sk = __inet_lookup_established(hashinfo, saddr, sport, daddr,
-						    hnum, dif);
-	return sk ? : __inet_lookup_listener(hashinfo, daddr, hnum, dif);
+	struct sock *sk = __inet_lookup_established(net, hashinfo,
+				saddr, sport, daddr, hnum, dif);
+
+	return sk ? : __inet_lookup_listener(net, hashinfo, daddr, hnum, dif);
 }
 
-static inline struct sock *inet_lookup(struct inet_hashinfo *hashinfo,
+static inline struct sock *inet_lookup(struct net *net,
+				       struct inet_hashinfo *hashinfo,
 				       const __be32 saddr, const __be16 sport,
 				       const __be32 daddr, const __be16 dport,
 				       const int dif)
@@ -409,7 +415,7 @@ static inline struct sock *inet_lookup(struct inet_hashinfo *hashinfo,
 	struct sock *sk;
 
 	local_bh_disable();
-	sk = __inet_lookup(hashinfo, saddr, sport, daddr, dport, dif);
+	sk = __inet_lookup(net, hashinfo, saddr, sport, daddr, dport, dif);
 	local_bh_enable();
 
 	return sk;
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 9e38b0d..c982ad8 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -218,7 +218,7 @@ static void dccp_v4_err(struct sk_buff *skb, u32 info)
 		return;
 	}
 
-	sk = inet_lookup(&dccp_hashinfo, iph->daddr, dh->dccph_dport,
+	sk = inet_lookup(&init_net, &dccp_hashinfo, iph->daddr, dh->dccph_dport,
 			 iph->saddr, dh->dccph_sport, inet_iif(skb));
 	if (sk == NULL) {
 		ICMP_INC_STATS_BH(ICMP_MIB_INERRORS);
@@ -436,7 +436,7 @@ static struct sock *dccp_v4_hnd_req(struct sock *sk, struct sk_buff *skb)
 	if (req != NULL)
 		return dccp_check_req(sk, skb, req, prev);
 
-	nsk = inet_lookup_established(&dccp_hashinfo,
+	nsk = inet_lookup_established(&init_net, &dccp_hashinfo,
 				      iph->saddr, dh->dccph_sport,
 				      iph->daddr, dh->dccph_dport,
 				      inet_iif(skb));
@@ -817,7 +817,7 @@ static int dccp_v4_rcv(struct sk_buff *skb)
 
 	/* Step 2:
 	 *	Look up flow ID in table and get corresponding socket */
-	sk = __inet_lookup(&dccp_hashinfo,
+	sk = __inet_lookup(&init_net, &dccp_hashinfo,
 			   iph->saddr, dh->dccph_sport,
 			   iph->daddr, dh->dccph_dport, inet_iif(skb));
 	/*
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 4cfb15c..95c9f14 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -268,7 +268,7 @@ static int inet_diag_get_exact(struct sk_buff *in_skb,
 	err = -EINVAL;
 
 	if (req->idiag_family == AF_INET) {
-		sk = inet_lookup(hashinfo, req->id.idiag_dst[0],
+		sk = inet_lookup(&init_net, hashinfo, req->id.idiag_dst[0],
 				 req->id.idiag_dport, req->id.idiag_src[0],
 				 req->id.idiag_sport, req->id.idiag_if);
 	}
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index db1e53a..48d4500 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -127,7 +127,8 @@ EXPORT_SYMBOL(inet_listen_wlock);
  * remote address for the connection. So always assume those are both
  * wildcarded during the search since they can never be otherwise.
  */
-static struct sock *inet_lookup_listener_slow(const struct hlist_head *head,
+static struct sock *inet_lookup_listener_slow(struct net *net,
+					      const struct hlist_head *head,
 					      const __be32 daddr,
 					      const unsigned short hnum,
 					      const int dif)
@@ -139,7 +140,8 @@ static struct sock *inet_lookup_listener_slow(const struct hlist_head *head,
 	sk_for_each(sk, node, head) {
 		const struct inet_sock *inet = inet_sk(sk);
 
-		if (inet->num == hnum && !ipv6_only_sock(sk)) {
+		if (sk->sk_net == net && inet->num == hnum &&
+				!ipv6_only_sock(sk)) {
 			const __be32 rcv_saddr = inet->rcv_saddr;
 			int score = sk->sk_family == PF_INET ? 1 : 0;
 
@@ -165,7 +167,8 @@ static struct sock *inet_lookup_listener_slow(const struct hlist_head *head,
 }
 
 /* Optimize the common listener case. */
-struct sock *__inet_lookup_listener(struct inet_hashinfo *hashinfo,
+struct sock *__inet_lookup_listener(struct net *net,
+				    struct inet_hashinfo *hashinfo,
 				    const __be32 daddr, const unsigned short hnum,
 				    const int dif)
 {
@@ -180,9 +183,9 @@ struct sock *__inet_lookup_listener(struct inet_hashinfo *hashinfo,
 		if (inet->num == hnum && !sk->sk_node.next &&
 		    (!inet->rcv_saddr || inet->rcv_saddr == daddr) &&
 		    (sk->sk_family == PF_INET || !ipv6_only_sock(sk)) &&
-		    !sk->sk_bound_dev_if)
+		    !sk->sk_bound_dev_if && sk->sk_net == net)
 			goto sherry_cache;
-		sk = inet_lookup_listener_slow(head, daddr, hnum, dif);
+		sk = inet_lookup_listener_slow(net, head, daddr, hnum, dif);
 	}
 	if (sk) {
 sherry_cache:
@@ -193,7 +196,8 @@ sherry_cache:
 }
 EXPORT_SYMBOL_GPL(__inet_lookup_listener);
 
-struct sock * __inet_lookup_established(struct inet_hashinfo *hashinfo,
+struct sock * __inet_lookup_established(struct net *net,
+				  struct inet_hashinfo *hashinfo,
 				  const __be32 saddr, const __be16 sport,
 				  const __be32 daddr, const u16 hnum,
 				  const int dif)
@@ -212,13 +216,15 @@ struct sock * __inet_lookup_established(struct inet_hashinfo *hashinfo,
 	prefetch(head->chain.first);
 	read_lock(lock);
 	sk_for_each(sk, node, &head->chain) {
-		if (INET_MATCH(sk, hash, acookie, saddr, daddr, ports, dif))
+		if (INET_MATCH(sk, net, hash, acookie,
+					saddr, daddr, ports, dif))
 			goto hit; /* You sunk my battleship! */
 	}
 
 	/* Must check for a TIME_WAIT'er before going to listener hash. */
 	sk_for_each(sk, node, &head->twchain) {
-		if (INET_TW_MATCH(sk, hash, acookie, saddr, daddr, ports, dif))
+		if (INET_TW_MATCH(sk, net, hash, acookie,
+					saddr, daddr, ports, dif))
 			goto hit;
 	}
 	sk = NULL;
@@ -249,6 +255,7 @@ static int __inet_check_established(struct inet_timewait_death_row *death_row,
 	struct sock *sk2;
 	const struct hlist_node *node;
 	struct inet_timewait_sock *tw;
+	struct net *net = sk->sk_net;
 
 	prefetch(head->chain.first);
 	write_lock(lock);
@@ -257,7 +264,8 @@ static int __inet_check_established(struct inet_timewait_death_row *death_row,
 	sk_for_each(sk2, node, &head->twchain) {
 		tw = inet_twsk(sk2);
 
-		if (INET_TW_MATCH(sk2, hash, acookie, saddr, daddr, ports, dif)) {
+		if (INET_TW_MATCH(sk2, net, hash, acookie,
+					saddr, daddr, ports, dif)) {
 			if (twsk_unique(sk, sk2, twp))
 				goto unique;
 			else
@@ -268,7 +276,8 @@ static int __inet_check_established(struct inet_timewait_death_row *death_row,
 
 	/* And established part... */
 	sk_for_each(sk2, node, &head->chain) {
-		if (INET_MATCH(sk2, hash, acookie, saddr, daddr, ports, dif))
+		if (INET_MATCH(sk2, net, hash, acookie,
+					saddr, daddr, ports, dif))
 			goto not_unique;
 	}
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 9aea88b..77c1939 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -369,8 +369,8 @@ void tcp_v4_err(struct sk_buff *skb, u32 info)
 		return;
 	}
 
-	sk = inet_lookup(&tcp_hashinfo, iph->daddr, th->dest, iph->saddr,
-			 th->source, inet_iif(skb));
+	sk = inet_lookup(skb->dev->nd_net, &tcp_hashinfo, iph->daddr, th->dest,
+			iph->saddr, th->source, inet_iif(skb));
 	if (!sk) {
 		ICMP_INC_STATS_BH(ICMP_MIB_INERRORS);
 		return;
@@ -1503,8 +1503,8 @@ static struct sock *tcp_v4_hnd_req(struct sock *sk, struct sk_buff *skb)
 	if (req)
 		return tcp_check_req(sk, skb, req, prev);
 
-	nsk = inet_lookup_established(&tcp_hashinfo, iph->saddr, th->source,
-				      iph->daddr, th->dest, inet_iif(skb));
+	nsk = inet_lookup_established(sk->sk_net, &tcp_hashinfo, iph->saddr,
+			th->source, iph->daddr, th->dest, inet_iif(skb));
 
 	if (nsk) {
 		if (nsk->sk_state != TCP_TIME_WAIT) {
@@ -1661,8 +1661,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags = iph->tos;
 	TCP_SKB_CB(skb)->sacked = 0;
 
-	sk = __inet_lookup(&tcp_hashinfo, iph->saddr, th->source,
-			   iph->daddr, th->dest, inet_iif(skb));
+	sk = __inet_lookup(skb->dev->nd_net, &tcp_hashinfo, iph->saddr,
+			th->source, iph->daddr, th->dest, inet_iif(skb));
 	if (!sk)
 		goto no_tcp_socket;
 
@@ -1735,7 +1735,8 @@ do_time_wait:
 	}
 	switch (tcp_timewait_state_process(inet_twsk(sk), skb, th)) {
 	case TCP_TW_SYN: {
-		struct sock *sk2 = inet_lookup_listener(&tcp_hashinfo,
+		struct sock *sk2 = inet_lookup_listener(skb->dev->nd_net,
+							&tcp_hashinfo,
 							iph->daddr, th->dest,
 							inet_iif(skb));
 		if (sk2) {

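For readers skimming the diff above: the recurring change is that every socket match now also compares the socket's namespace pointer with the one the caller passes in. The following is a minimal user-space sketch of that idea only. The names `toy_net`, `toy_sock`, `toy_match` and `toy_lookup` are hypothetical and invented for illustration; they are not kernel symbols, the hash-chain walk is reduced to a linear scan, and the sketch says nothing about what actually broke the FTP transfers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct net. */
struct toy_net {
	int id;
};

/* Hypothetical stand-in for a connected socket's lookup key. */
struct toy_sock {
	const struct toy_net *net;	/* namespace that owns the socket */
	uint32_t saddr;			/* remote IPv4 address */
	uint16_t sport;			/* remote port */
	uint32_t daddr;			/* local IPv4 address */
	uint16_t dport;			/* local port */
};

/*
 * Same shape as the patched INET_MATCH: the namespace pointer has to
 * match in addition to the address/port tuple.
 */
int toy_match(const struct toy_sock *sk, const struct toy_net *net,
	      uint32_t saddr, uint16_t sport,
	      uint32_t daddr, uint16_t dport)
{
	return sk->net == net &&
	       sk->saddr == saddr && sk->sport == sport &&
	       sk->daddr == daddr && sk->dport == dport;
}

/* Linear scan standing in for the walk over one hash chain. */
const struct toy_sock *toy_lookup(const struct toy_sock *tbl, size_t n,
				  const struct toy_net *net,
				  uint32_t saddr, uint16_t sport,
				  uint32_t daddr, uint16_t dport)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (toy_match(&tbl[i], net, saddr, sport, daddr, dport))
			return &tbl[i];
	return NULL;
}
```

With this shape, a lookup that passes a namespace pointer different from the one stored in the socket returns no socket even when addresses and ports are identical, so each caller has to supply the right namespace.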
* Re: 2.6.25-rc8: FTP transfer errors
From: David Miller @ 2008-04-11 0:24 UTC
To: lkml
Cc: jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev

From: Mark Lord <lkml@rtr.ca>
Date: Thu, 10 Apr 2008 20:16:11 -0400

> Duh.. more like, "If I take 5-8 hours to attempt a bisect (which may not
> even work), then that's 5-8 hours I do not get paid for."

And if I invest my spare time on your bug how does this statement
apply to me?  Or does it only apply to you?

Every single argument you make that supports why you should not be
investing the necessary time into the bug applies equally to the very
developers you are so quick to quip at and want help from.

* Re: 2.6.25-rc8: FTP transfer errors
From: Mark Lord @ 2008-04-11 0:27 UTC
To: David Miller
Cc: jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev

David Miller wrote:
> From: Mark Lord <lkml@rtr.ca>
> Date: Thu, 10 Apr 2008 20:16:11 -0400
>
>> Duh.. more like, "If I take 5-8 hours to attempt a bisect (which may not
>> even work), then that's 5-8 hours I do not get paid for."
>
> And if I invest my spare time on your bug
> how does this statement apply to me?
..

It's not "my bug".  I'm just the first person to notice,
take time to report it, and even hand it to you on a platter (bisect).

It's *your* bug -- you signed off on the commit.

Cheers

* Re: 2.6.25-rc8: FTP transfer errors
From: David Miller @ 2008-04-11 0:39 UTC
To: lkml
Cc: jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev

From: Mark Lord <lkml@rtr.ca>
Date: Thu, 10 Apr 2008 20:27:14 -0400

> It's *your* bug -- you signed off on the commit.

I sign off on basically every networking commit, does that mean I have
to fix every networking bug and every networking bug is "mine"?

Of course not, that doesn't scale at all.

What does scale is a combination of good fully formed bug reports from
users combined with the efforts of the global developer pool.

Linus signs off on every patch from Andrew Morton he puts into the
tree, which is a lot, but does Linus work on every bug introduced by
one of those patches and are such bugs "his" bugs?

Of course he doesn't, and of course not.  They get pushed up to the
person who wrote the patch once identified as such, and the patch is
reverted if the developer is unresponsive and this will have
consequences for patches they submit in the future.

I still think you have a very self-centered attitude about things.

This is about distributing effort, not forcing it upon individuals or
a constrained resource.  If I get hit by a bus, networking bugs would
still get fixed if handled properly.

And it's a win-win situation.  The incentive for a capable user to do
a bisect or whatever else is that if they do it their bug gets fixed
quickly.

That is the free market economy of Linux kernel bug reporting.  It
addresses the issue that in reality we'll never fix all bugs, and
therefore we prioritize.

And therefore if there is a bisected bug report and also another one
from a user who refuses to do that, guess which bug gets worked on
with a higher priority and which bug gets fixed first?

* Re: 2.6.25-rc8: FTP transfer errors
From: Mark Lord @ 2008-04-11 1:23 UTC
To: David Miller
Cc: jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev

David Miller wrote:
> From: Mark Lord <lkml@rtr.ca>
> Date: Thu, 10 Apr 2008 20:27:14 -0400
>
>> It's *your* bug -- you signed off on the commit.
>
> I sign off on basically every networking commit, does that mean I have
> to fix every networking bug and every networking bug is "mine"?
..

Absolutely, though to a varying degree.  That's the responsibility
that goes with the role of a subsystem maintainer.  I once had
such a role, and gave it up when I felt I could no longer keep up.

You still keep referring to it as "your (my) bug".
It's not.  I had nothing to do with it, other than stumbling over it.

When people stumble over a libata bug, I look hard to see if my code
could possibly cause it.  Jeff looks even harder, because he's the
current subsystem dude for libata.

I never suggest a user search through a mountain of unrelated commits
for something I've screwed up on.

I give more directed help, patches to collect more relevant information,
and patches to try and resolve it.

The last thing I'd ever do, is diss the reporter.

Regards.

* Re: 2.6.25-rc8: FTP transfer errors
From: Ilpo Järvinen @ 2008-04-11 6:40 UTC
To: Mark Lord
Cc: David Miller, jesper.juhl, tilman, yoshfuji, Jeff Garzik, rjw, LKML, Netdev

On Thu, 10 Apr 2008, Mark Lord wrote:

> David Miller wrote:
> > From: Mark Lord <lkml@rtr.ca>
> > Date: Thu, 10 Apr 2008 20:27:14 -0400
> >
> > > It's *your* bug -- you signed off on the commit.
> >
> > I sign off on basically every networking commit, does that mean I have
> > to fix every networking bug and every networking bug is "mine"?
> ..
>
> Absolutely, though to a varying degree.  That's the responsibility
> that goes with the role of a subsystem maintainer.  I once had
> such a role, and gave it up when I felt I could no longer keep up.
> You still keep referring to it as "your (my) bug".
> It's not.  I had nothing to do with it, other than stumbling over it.

This bug is perfect example where bisect clearly was useful :-). Nobody
knew whose bug it actually was until your bisect gave directions.

> When people stumble over a libata bug, I look hard to see if my code
> could possibly cause it.  Jeff looks even harder, because he's the
> current subsystem dude for libata.
>
> I never suggest a user search through a mountain of unrelated commits
> for something I've screwed up on.

But it is ok for you to ask an innocent net developer to do that (even
with your terms as I hadn't signed off _anything_ related to that one),
hmm?

...You had this pretty demanding tone earlier:

> Or I can ignore it, like the net developers, since I have a workaround.
> And then we'll see what other apps are broken upon 2.6.25 final release.
>
> Really, folks.  Bug reports are intended to *help* the developers,
> not something to be thrown back in their faces.
>
> There do seem to have been a *lot* of changes around the tcp closing/close
> code (as I see from diff'ing 2.6.24 against latest -git).
>
> *Somebody* is responsible for those changes.
> That particular *somebody* ought to volunteer some help here,
> reducing the mountain of commits to a big handful or two.

...and also...

> > Anyways, here's five hours of free consulting for you

...Sure I could use similar words, but you might use the not-mine bug
approach again to deflect... :-(

...No, I don't mind really :-). I well understand that I occasionally
end up chasing things which are bugs that other people have caused, that's
part of the game.

> I give more directed help, patches to collect more relevant information,
> and patches to try and resolve it.

Now that you have, as stated earlier, first looked the diffs (tcp*.c stuff
mainly I suppose?!?), and then bisected it and found the breaker, and even
patch is available already... Seriously, knowing all what's now available,
how could we have solved _this particular case_ without that very useful
help (bisect) from your side?

Yes, I went through the commit list (maybe you did as well), I'm not sure
if Dave did as well. In addition, I checked a number of individual diffs
too but this just isn't something very obvious (I have to admit though
that I don't really understand all those namespace things, so I didn't
even know how to look them too carefully).

--
 i.

* Re: 2.6.25-rc8: FTP transfer errors
From: Mark Lord @ 2008-04-11 13:19 UTC
To: Ilpo Järvinen
Cc: David Miller, jesper.juhl, tilman, yoshfuji, Jeff Garzik, rjw, LKML, Netdev

Ilpo Järvinen wrote:
> On Thu, 10 Apr 2008, Mark Lord wrote:
>
>> David Miller wrote:
>>> From: Mark Lord <lkml@rtr.ca>
>>> Date: Thu, 10 Apr 2008 20:27:14 -0400
>>>
>>>> It's *your* bug -- you signed off on the commit.
>>> I sign off on basically every networking commit, does that mean I have
>>> to fix every networking bug and every networking bug is "mine"?
>> ..
>>
>> Absolutely, though to a varying degree.  That's the responsibility
>> that goes with the role of a subsystem maintainer.  I once had
>> such a role, and gave it up when I felt I could no longer keep up.
>> You still keep referring to it as "your (my) bug".
>> It's not.  I had nothing to do with it, other than stumbling over it.
>
> This bug is perfect example where bisect clearly was useful :-). Nobody
> knew whose bug it actually was until your bisect gave directions.
>
>> When people stumble over a libata bug, I look hard to see if my code
>> could possibly cause it.  Jeff looks even harder, because he's the
>> current subsystem dude for libata.
>>
>> I never suggest a user search through a mountain of unrelated commits
>> for something I've screwed up on.
>
> But it is ok for you to ask an innocent net developer to do that (even
> with your terms as I hadn't signed off _anything_ related to that one),
> hmm?
>
> ...You had this pretty demanding tone earlier:
>
>> Or I can ignore it, like the net developers, since I have a workaround.
>> And then we'll see what other apps are broken upon 2.6.25 final release.
..

That's not demanding, that's quite relaxed.  I had a good workaround,
and didn't really care any more at that point.  Just thought it was rather
odd that none of the developers seemed interested in tracking it down.
I offered tons of help, gave it, and said I didn't have time for a full
bisect at that juncture.

For that, I get repeatedly slammed by the netdev folks.
Even after I put aside *paid* work to submit to your demands.

Next time around, I won't bother reporting bugs to you folks,
that's for damned sure.

Cheers

* Re: 2.6.25-rc8: FTP transfer errors
From: Evgeniy Polyakov @ 2008-04-11 14:35 UTC
To: Mark Lord
Cc: Ilpo Järvinen, David Miller, jesper.juhl, tilman, yoshfuji, Jeff Garzik, rjw, LKML, Netdev

On Fri, Apr 11, 2008 at 09:19:17AM -0400, Mark Lord (lkml@rtr.ca) wrote:
> >>Or I can ignore it, like the net developers, since I have a workaround.
> >>And then we'll see what other apps are broken upon 2.6.25 final release.
>
> That's not demanding, that's quite relaxed.  I had a good workaround,
> and didn't really care any more at that point.  Just thought it was rather
> odd that none of the developers seemed interested in tracking it down.
> I offered tons of help, gave it, and said I didn't have time for a full
> bisect at that juncture.
>
> For that, I get repeatedly slammed by the netdev folks.
> Even after I put aside *paid* work to submit to your demands.
>
> Next time around, I won't bother reporting bugs to you folks,
> that's for damned sure.

Actually that will be the best decision from evolutionary point of view.

Bugs, which 'are thrown back to your face' like what you did with this
one, are useless. Developers already know, that bugs exist.

If you do not care about bug, why did you ever bother filing it?
You expected that anyone will start running to fix it for you. You were
wrong. Developers only fix bugs, which do not require mind-reading and
magnetic quantification of your brain. If you do not want to help fixing
it, do not expect it will be fixed at all. Sentence, that you will
probably understand better: no one get paid to fix it. No one get fun
fixing something with description you provided (not sure about David
though, probably he has some masochistic propensities doing that and
trying to get some bits of information from reporters for years).

You were suggested some simple checks, they did not help. Developers can
not remotely control electrons in your wires, so next suggestion was
bisecting, which ended up with some crap from your point.

If you want bug got fixed, provide info and if it is not enough, help by
trying what you are being suggested, if you do not want, stop.

--
	Evgeniy Polyakov

* Re: 2.6.25-rc8: FTP transfer errors
From: Mark Lord @ 2008-04-11 14:59 UTC
To: Evgeniy Polyakov
Cc: Ilpo Järvinen, David Miller, jesper.juhl, tilman, yoshfuji, Jeff Garzik, rjw, LKML, Netdev

Evgeniy Polyakov wrote:
> On Fri, Apr 11, 2008 at 09:19:17AM -0400, Mark Lord (lkml@rtr.ca) wrote:
>>>> Or I can ignore it, like the net developers, since I have a workaround.
>>>> And then we'll see what other apps are broken upon 2.6.25 final release.
>> That's not demanding, that's quite relaxed.  I had a good workaround,
>> and didn't really care any more at that point.  Just thought it was rather
>> odd that none of the developers seemed interested in tracking it down.
>> I offered tons of help, gave it, and said I didn't have time for a full
>> bisect at that juncture.
>>
>> For that, I get repeatedly slammed by the netdev folks.
>> Even after I put aside *paid* work to submit to your demands.
>>
>> Next time around, I won't bother reporting bugs to you folks,
>> that's for damned sure.
>
> Actually that will be the best decision from evolutionary point of view.
>
> Bugs, which 'are thrown back to your face' like what you did with this
> one, are useless. Developers already know, that bugs exist.
>
> If you do not care about bug, why did you ever bother filing it?
..

Because I care, about Linux's reputation and performance.
I care about basic networking operations, and knew that this bug
would probably affect other applications once widely deployed.

I care about Linux.  That's why.

> If you do not want to help fixing it ...

Where the hell did I *ever* say that?
I did nothing but offer help, and respond quickly.

The one thing I did not have time for initially,
was a painstaking blunt instrument binary search of
every commit since v2.6.24.

There are other ways to debug things and find the causes
quickly, with less impact upon the reporters of bugs.

The current generation of kernel "code submitters" here
seems to have never learned those.  Bummer.

Cheers

* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 14:59 ` Mark Lord @ 2008-04-11 15:18 ` Evgeniy Polyakov 2008-04-11 18:07 ` David Miller 2008-04-11 19:58 ` Valdis.Kletnieks 1 sibling, 1 reply; 129+ messages in thread From: Evgeniy Polyakov @ 2008-04-11 15:18 UTC (permalink / raw) To: Mark Lord Cc: Ilpo Järvinen, David Miller, jesper.juhl, tilman, yoshfuji, Jeff Garzik, rjw, LKML, Netdev On Fri, Apr 11, 2008 at 10:59:57AM -0400, Mark Lord (lkml@rtr.ca) wrote: > >If you do not care about the bug, why did you ever bother filing it? > .. > > Because I care, about Linux's reputation and performance. > I care about basic networking operations, and knew that this bug > would probably affect other applications once widely deployed. > > I care about Linux. That's why. Blah-blah-blah, you care so much that you pissed off the people who suggested how you could really help Linux. And then you returned with bisect results. You helped, but if you really cared, you would have skipped the first part. > >If you do not want to help fixing it ... > > Where the hell did I *ever* say that? > I did nothing but offer help, and respond quickly. Citations: > So you will likely need to bisect. Or I can ignore it, like the net developers, since I have a workaround. And then we'll see what other apps are broken upon 2.6.25 final release. *Somebody* is responsible for those changes. That particular *somebody* ought to volunteer some help here, reducing the mountain of commits to a big handful or two. ----- I can go on - just reread the thread. > The one thing I did not have time for initially, > was a painstaking blunt instrument binary search of > every commit since v2.6.24. In case you do not know the math, binary search takes log2(N) steps, so you would only need to check around a dozen commits at most. That's a lot of time to run 'git bisect good/bad', especially for a man who "cares about Linux". > There are other ways to debug things and find the causes > quickly, with less impact upon the reporters of bugs. I know one.
It is guessing. I will start: did you start hearing voices after the 2.6.24 upgrade? Or did you observe meteorites hitting the wires between your machines? > The current generation of kernel "code submitters" here > seems to have never learned those. Bummer. Next time I will ask a soothsayer; she really knows how to debug a network bug with the following description: "it worked before I changed the kernel version. you have to give my puppets back". -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 15:18 ` Evgeniy Polyakov @ 2008-04-11 18:07 ` David Miller 2008-04-11 21:29 ` Evgeniy Polyakov 2008-04-12 8:44 ` Willy Tarreau 0 siblings, 2 replies; 129+ messages in thread From: David Miller @ 2008-04-11 18:07 UTC (permalink / raw) To: johnpol Cc: lkml, ilpo.jarvinen, jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev From: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Date: Fri, 11 Apr 2008 19:18:21 +0400 > On Fri, Apr 11, 2008 at 10:59:57AM -0400, Mark Lord (lkml@rtr.ca) wrote: > > >If you do not care about the bug, why did you ever bother filing it? > > .. > > > > Because I care, about Linux's reputation and performance. > > I care about basic networking operations, and knew that this bug > > would probably affect other applications once widely deployed. > > > > I care about Linux. That's why. > > Blah-blah-blah, you care so much that you pissed off the people who > suggested how you could really help Linux. And then you returned with bisect > results. You helped, but if you really cared, you would have skipped the first part. Every time I see someone play the "I care about Linux" card, they are typically being a hypocrite. It's a knee-jerk, defensive gesture, and usually has absolutely zero substance. > > The current generation of kernel "code submitters" here > > seems to have never learned those. Bummer. > > Next time I will ask a soothsayer; she really knows how to debug a network > bug with the following description: "it worked before I changed the kernel > version. you have to give my puppets back". Thanks for your support Evgeniy, it is truly appreciated. We had Mark's bug fixed in 15 minutes once the bisect result was known, even after Ilpo and myself had scanned through the changesets. This proves the utility of bisect and in fact that trying to intuit the cause by continuing to study changesets and code would have been a complete waste of time. Yes, Mark, we used to do things that way for every bug in the kernel.
And as a result many bugs sat unfixed for weeks if not months. Many of us have left the cave, feel free to join us. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 18:07 ` David Miller @ 2008-04-11 21:29 ` Evgeniy Polyakov 2008-04-12 8:44 ` Willy Tarreau 1 sibling, 0 replies; 129+ messages in thread From: Evgeniy Polyakov @ 2008-04-11 21:29 UTC (permalink / raw) To: David Miller Cc: lkml, ilpo.jarvinen, jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev On Fri, Apr 11, 2008 at 11:07:02AM -0700, David Miller (davem@davemloft.net) wrote: > Thanks for your support Evgeniy, it is truly appreciated. No problemo :) -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 18:07 ` David Miller 2008-04-11 21:29 ` Evgeniy Polyakov @ 2008-04-12 8:44 ` Willy Tarreau 2008-04-12 9:49 ` David Miller 1 sibling, 1 reply; 129+ messages in thread From: Willy Tarreau @ 2008-04-12 8:44 UTC (permalink / raw) To: David Miller Cc: johnpol, lkml, ilpo.jarvinen, jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev Hi guys, I've read quite a bunch of this thread, and I think there's some misunderstanding between both parts, as well as inappropriate expectations in both cases. On Fri, Apr 11, 2008 at 11:07:02AM -0700, David Miller wrote: > We had Mark's bug fixed in 15 minutes once the bisect result was > known, even after Ilpo and myself had scanned through the changesets. > > This proves the utility of bisect and in fact that trying to intuit > the cause by continuing to study changesets and code would have been a > complete waste of time. > > Yes, Mark, we used to do things that way for every bug in the kernel. > And as a result many bugs sat unfixed for weeks if not months. Many > of us have left the cave, feel free to join us. We should be very careful about git-bisect. First, it does not necessarily point to the bug, but to the commit which exhibits the bug, so simply reverting the commit might just hide the bug again. I want to ensure that people do not forget that it does not replace a brain, it enhances your eyes by pointing to a change related to the problem. While it is a powerful tool, we must accept that it cannot efficiently work in some circumstances, such as : - the machine cannot be rebooted often. I've been used to work for customers who plan changes once a week, and change absolutely nothing on their production if unplanned. This means one bisect step per week. 
Often, those people even require that your changes pass through a week of non-regression testing on a pre-production system (which was my case), with no overlap between changes, so then you can count on one git-bisect iteration every two weeks. - the problem only happens in peak traffic hours on production, and the loss of service has already gone far beyond the annual quota. The only case in which they will accept an upgrade is if you stake your full responsibility on the claim that it will definitely fix the problem. I've already been in such a situation, you say to the guy in front of you that you're putting your balls on the table, it will work (and sometimes you're only 90% confident). You obviously cannot do this to just check if the current bisect exhibits the problem or not. - the reporter has very little spare time. I do have friends in this situation. Basically, when your schedule is full of customer visits one month ahead, it's very hard to find several consecutive hours to track the problem down. Sometimes you're happy if you can spend two hours on it in a week. BTW, many developers are also in the same situation. Also most of the time, this must be done at the customer's and some of them do not accept people outside work hours. Then the problem may linger for weeks or months. - the problem is very reproducible but takes a lot of time before triggering (typically memory leaks). In these situations, either git-bisect will not be usable, or will take a lot of time to converge (up to several weeks), so will prove inefficient. So the reporter will either stay with the last known working version, or with the new one accompanied by a workaround. For this reason, we should not "force" reporters to git-bisect. Just ask them if they can do so, otherwise investigations on their bug will not progress until someone else reports the same one, with some time to bisect it. And there is nothing wrong with that IMHO.
If the problem only affects one person and this person has a solution, is that really much of a problem ? Sure it would be better fixed, but nobody suffers from it. On the other hand, being aware that there exists a person somewhere experiencing a specific bug is useful to the developers, because when they think they might have fixed it, they can ping him for validation. Now, from a developer's point of view, the reporters should not consider that development in free software is a public service and that developers have a strong obligation to find and fix new bugs. Mark said his time is paid for, but most of the people here will tend to take that as a customer-provider insult since their time is also paid for, and while the reporter's work may consist in consulting customers without much schedule freedom, the developer's work consists in delivering new features in a more or less agreed schedule. So everyone's time is valuable. Of course it's better when developers help, and we must keep in mind that they're the better placed to understand their code (even more when it's recent). But due to the long chain of contributors, the ones in direct contact with the reporter are not often the ones who will be able to debug the code. So they need to know a bit more to find whom to ping first. Both Mark and Ilpo said something true here. It's that they feel concerned when a bug is reported in an area they have worked on. It is possible that none of the people who have worked on this bug was responsible of it, and in this case it's important to insist on the code author about the fact that he's not only a code author but also has to support his code, and that next time he'll be welcome to check if his code might have caused the reported problem. But clearly, for scalability reasons, we cannot expect people in the middle of the chain to investigate all bugs. 
Their experience in the area is much better used in helping both reporters and code authors take the right direction though. So if I can conclude, both the reporter's and the developer's time is valuable and may not be spent on chasing every bug down. git-bisect is very good at saving the developer's time in exchange for approximately the same amount of time on the reporter's side, which makes the whole process scalable. Sometimes for various reasons the reporter cannot do this (or not efficiently). We should not call him names in this case, just tell him that we cannot go further on this bug without much more information, and that he'll be asked for tests when someone else reports it and debugs it. If the person expected more investigative support, he should have gone with a commercially supported distro. Now speaking for my case, I know that as a developer, I'm faster than many others at finding bugs in *my* code, but am of little help when it comes to external contributions to my code. As a user, I will not always be able to git-bisect (or that would be inefficient, see reasons above). But I know that a report is a report, and even if I have a workaround, I feel a moral obligation to report the bug, and I want to be able to do it without the fear of being attacked due to my lack of involvement in the fix. And no, Dave, I'm not being hypocritical when I say this. I really hate people who say "oh yes I know about this bug, I've already encountered it but did not care to report it". I just want to ensure that people will always report bugs, whatever the level of help they will be able to provide. It's important to know if a problem happens for the first time or is very widespread since version X or Y. And for such a case, I agree that bugzilla would at least help avoid losing those reports. Willy ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-12 8:44 ` Willy Tarreau @ 2008-04-12 9:49 ` David Miller 2008-04-13 18:15 ` Rafael J. Wysocki 0 siblings, 1 reply; 129+ messages in thread From: David Miller @ 2008-04-12 9:49 UTC (permalink / raw) To: w Cc: johnpol, lkml, ilpo.jarvinen, jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev From: Willy Tarreau <w@1wt.eu> Date: Sat, 12 Apr 2008 10:44:00 +0200 > While it is a powerful tool, we must accept that it cannot efficiently > work in some circumstances, such as : Everyone is well aware of all of this, that's why I specifically asked for a bisect, because I knew it would be crucial to pinpointing this particular bug. And lo' and behold in 15 minutes after the bisect results were available it got fixed. Yes it takes judgment, and nobody ever suggested that a revert is the way to go. Git bisect results must be parsed by a human brain. Nothing else was ever implied in any way shape or form. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-12 9:49 ` David Miller @ 2008-04-13 18:15 ` Rafael J. Wysocki 2008-04-13 18:51 ` Sergio Luis 0 siblings, 1 reply; 129+ messages in thread From: Rafael J. Wysocki @ 2008-04-13 18:15 UTC (permalink / raw) To: David Miller Cc: w, johnpol, lkml, ilpo.jarvinen, jesper.juhl, tilman, yoshfuji, jeff, linux-kernel, netdev On Saturday, 12 of April 2008, David Miller wrote: > From: Willy Tarreau <w@1wt.eu> > Date: Sat, 12 Apr 2008 10:44:00 +0200 > > > While it is a powerful tool, we must accept that it cannot efficiently > > work in some circumstances, such as : > > Everyone is well aware of all of this, that's why I specifically asked > for a bisect, because I knew it would be crucial to pinpointing this > particular bug. > > And lo' and behold in 15 minutes after the bisect results were > available it got fixed. I can't find the fix, btw. Can you please point me to it? Thanks, Rafael ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-13 18:15 ` Rafael J. Wysocki @ 2008-04-13 18:51 ` Sergio Luis 2008-04-13 19:24 ` Rafael J. Wysocki 0 siblings, 1 reply; 129+ messages in thread From: Sergio Luis @ 2008-04-13 18:51 UTC (permalink / raw) To: Rafael J. Wysocki Cc: David Miller, w, johnpol, lkml, ilpo.jarvinen, jesper.juhl, tilman, yoshfuji, jeff, linux-kernel, netdev On Sun, Apr 13, 2008 at 3:15 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Saturday, 12 of April 2008, David Miller wrote: > > From: Willy Tarreau <w@1wt.eu> > > Date: Sat, 12 Apr 2008 10:44:00 +0200 > > > > > While it is a powerful tool, we must accept that it cannot efficiently > > > work in some circumstances, such as : > > > > Everyone is well aware of all of this, that's why I specifically asked > > for a bisect, because I knew it would be crucial to pinpointing this > > particular bug. > > > > And lo' and behold in 15 minutes after the bisect results were > > available it got fixed. > > I can't find the fix, btw. Can you please point me to it? there you go http://lkml.org/lkml/2008/4/10/409 http://lkml.org/lkml/2008/4/10/410 > > Thanks, > Rafael -sergio ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-13 18:51 ` Sergio Luis @ 2008-04-13 19:24 ` Rafael J. Wysocki 0 siblings, 0 replies; 129+ messages in thread From: Rafael J. Wysocki @ 2008-04-13 19:24 UTC (permalink / raw) To: Sergio Luis Cc: David Miller, w, johnpol, lkml, ilpo.jarvinen, jesper.juhl, tilman, yoshfuji, jeff, linux-kernel, netdev On Sunday, 13 of April 2008, Sergio Luis wrote: > On Sun, Apr 13, 2008 at 3:15 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > On Saturday, 12 of April 2008, David Miller wrote: > > > From: Willy Tarreau <w@1wt.eu> > > > Date: Sat, 12 Apr 2008 10:44:00 +0200 > > > > > > > While it is a powerful tool, we must accept that it cannot efficiently > > > > work in some circumstances, such as : > > > > > > Everyone is well aware of all of this, that's why I specifically asked > > > for a bisect, because I knew it would be crucial to pinpointing this > > > particular bug. > > > > > > And lo' and behold in 15 minutes after the bisect results were > > > available it got fixed. > > > > I can't find the fix, btw. Can you please point me to it? > > there you go > http://lkml.org/lkml/2008/4/10/409 > http://lkml.org/lkml/2008/4/10/410 Thanks a lot, Rafael ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 14:59 ` Mark Lord 2008-04-11 15:18 ` Evgeniy Polyakov @ 2008-04-11 19:58 ` Valdis.Kletnieks 1 sibling, 0 replies; 129+ messages in thread From: Valdis.Kletnieks @ 2008-04-11 19:58 UTC (permalink / raw) To: Mark Lord Cc: Evgeniy Polyakov, Ilpo Järvinen, David Miller, jesper.juhl, tilman, yoshfuji, Jeff Garzik, rjw, LKML, Netdev [-- Attachment #1: Type: text/plain, Size: 401 bytes --] On Fri, 11 Apr 2008 10:59:57 EDT, Mark Lord said: > The one thing I did not have time for initially, > was a painstaking blunt instrument binary search of > every commit since v2.6.24. The nice thing about binary search is that it's by definition an O(log2(N)) operation, which isn't bad at all as far as algorithms go. The truly blunt instrument here would be a linear search of every commit... [-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --] ^ permalink raw reply [flat|nested] 129+ messages in thread
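The O(log2(N)) point made above can be checked with a small sketch. This is only an illustration under stated assumptions: it models the history as a linear list of commits (real kernel history is a DAG, so git-bisect's actual step count can differ somewhat), and the commit count of 10000 and the position of the first bad commit are hypothetical figures, not numbers taken from this thread. The names `bisect_steps` and `find_first_bad` are made up for this example.

```python
def bisect_steps(n_commits: int) -> int:
    """Worst-case number of test builds a binary search needs to
    isolate one bad commit among n_commits candidates: ceil(log2(n))."""
    if n_commits < 1:
        raise ValueError("need at least one candidate commit")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floats.
    return (n_commits - 1).bit_length()


def find_first_bad(n_commits: int, is_bad) -> tuple[int, int]:
    """Binary-search a linear history of n_commits (index 0 is oldest).

    Assumes is_bad(i) is False before the first bad commit and True from
    it onwards, and that the newest commit is bad.
    Returns (index of first bad commit, number of commits tested)."""
    lo, hi = 0, n_commits - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1                # one "build, boot, test" cycle
        if is_bad(mid):
            hi = mid              # first bad commit is mid or older
        else:
            lo = mid + 1          # first bad commit is newer than mid
    return lo, tests


if __name__ == "__main__":
    n = 10_000                    # hypothetical number of commits
    first_bad = 7_321             # hypothetical position of the regression
    idx, tests = find_first_bad(n, lambda i: i >= first_bad)
    print(f"binary search: found commit {idx} after {tests} tests "
          f"(bound: {bisect_steps(n)})")
    print(f"linear search would have needed {first_bad + 1} tests")
```

Under these assumptions the binary search never tests more commits than `bisect_steps(n)` reports, whereas testing every commit in order grows linearly with the position of the regression.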
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 14:35 ` Evgeniy Polyakov 2008-04-11 14:59 ` Mark Lord @ 2008-04-11 22:16 ` Tilman Schmidt 2008-04-11 22:25 ` Evgeniy Polyakov 2008-04-11 22:26 ` David Miller 1 sibling, 2 replies; 129+ messages in thread From: Tilman Schmidt @ 2008-04-11 22:16 UTC (permalink / raw) To: Evgeniy Polyakov Cc: Mark Lord, Ilpo Järvinen, David Miller, jesper.juhl, yoshfuji, Jeff Garzik, rjw, LKML, Netdev [-- Attachment #1: Type: text/plain, Size: 725 bytes --] On Fri, 11 Apr 2008 18:35:37 +0400, Evgeniy Polyakov wrote: > On Fri, Apr 11, 2008 at 09:19:17AM -0400, Mark Lord (lkml@rtr.ca) wrote: >> I offered tons of help, gave it, and said I didn't have time for a full >> bisect at that juncture. >> >> For that, I get repeatedly slammed by the netdev folks. >> Even after I put aside *paid* work to submit to your demands. >> >> Next time around, I won't bother reporting bugs to you folks, >> that's for damned sure. > > Actually that will be the best decision from evolutional point of view. So I was right after all? Bug reports from people who (for whatever reason, including having to earn their living) cannot do a bisect are not welcome? :-( T. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 254 bytes --] ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 22:16 ` Tilman Schmidt @ 2008-04-11 22:25 ` Evgeniy Polyakov 2008-04-11 22:27 ` David Miller 2008-04-11 23:23 ` Tilman Schmidt 2008-04-11 22:26 ` David Miller 1 sibling, 2 replies; 129+ messages in thread From: Evgeniy Polyakov @ 2008-04-11 22:25 UTC (permalink / raw) To: Tilman Schmidt Cc: Mark Lord, Ilpo Järvinen, David Miller, jesper.juhl, yoshfuji, Jeff Garzik, rjw, LKML, Netdev On Sat, Apr 12, 2008 at 12:16:28AM +0200, Tilman Schmidt (tilman@imap.cc) wrote: > > Actually that will be the best decision from evolutional point of view. > > So I was right after all? Bug reports from people who (for whatever > reason, including having to earn their living) cannot do a bisect are > not welcome? You got it wrong. If bug is subtle and developers can not reproduce it, there are only two ways out of the problem: to help developers or not to help. In the latter case bug report is useless (except that to show that it exists, since practically no one can fix it until some new details added). In the former case there is a discussion between developers and reporters, so things have progress. In this particular case there were no healthy discussion, that is why all this is about. Bisection was just an example of the help, reporter can provide. In this case there were no other suggestions remotely useful or they were already tried. If you can not proceed with what was suggested, then do not piss anyone off because you were told to do something to help. If you go to the doctor because of aching throat and he asks you to open a mouth, you will not blame him for asking you to do that. -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 22:25 ` Evgeniy Polyakov @ 2008-04-11 22:27 ` David Miller 2008-04-11 23:23 ` Tilman Schmidt 1 sibling, 0 replies; 129+ messages in thread From: David Miller @ 2008-04-11 22:27 UTC (permalink / raw) To: johnpol Cc: tilman, lkml, ilpo.jarvinen, jesper.juhl, yoshfuji, jeff, rjw, linux-kernel, netdev From: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Date: Sat, 12 Apr 2008 02:25:36 +0400 > If you go to the doctor because of aching throat and he asks you to > open a mouth, you will not blame him for asking you to do that. ROFL ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 22:25 ` Evgeniy Polyakov 2008-04-11 22:27 ` David Miller @ 2008-04-11 23:23 ` Tilman Schmidt 2008-04-12 5:37 ` Evgeniy Polyakov 2008-04-12 7:06 ` Ilpo Järvinen 1 sibling, 2 replies; 129+ messages in thread From: Tilman Schmidt @ 2008-04-11 23:23 UTC (permalink / raw) To: Evgeniy Polyakov Cc: Mark Lord, Ilpo Järvinen, David Miller, jesper.juhl, yoshfuji, Jeff Garzik, rjw, LKML, Netdev [-- Attachment #1: Type: text/plain, Size: 3358 bytes --] On Sat, 12 Apr 2008 02:25:36 +0400, Evgeniy Polyakov wrote: > On Sat, Apr 12, 2008 at 12:16:28AM +0200, Tilman Schmidt (tilman@imap.cc) wrote: >> So I was right after all? Bug reports from people who (for whatever >> reason, including having to earn their living) cannot do a bisect are >> not welcome? > > You got it wrong. Did I really? Let's see ... > If bug is subtle and developers can not reproduce it, there are only two > ways out of the problem: to help developers or not to help. > > In the latter case bug report is useless (except that to show that it > exists, since practically no one can fix it until some new details > added). Looks like you're saying I was right after all. Useless bug reports shouldn't be submitted. So please answer this simple question: If I know beforehand that I won't have the time to do a bisect (or other similarly time-consuming task the maintainers might ask from me), should I report the bug, or should I keep my knowledge to myself? This question is not theoretical. It's a situation I find myself in quite regularly, because I allow myself the luxury of building most rc kernels and even the odd mm kernel just for fun even though I have a daytime job and a family to feed. It would be quite easy to look the other way if I encounter a problem in one of those, hoping someone else with more time on his or her hands will also come across it and report it. So far my conscience told me not to do that. 
But if reporting it without being able to follow up on it is considered useless then my conscience was apparently wrong. Just say the word, and I'll stop what I'm doing. I'll have no problem finding other things to do with my time. > Bisection was just an example of the help, reporter can provide. Sure. It's not about bisection specifically, but about the time a reporter is able to invest in addition to what went into the report already. But bisection is a good example, because it's the most time-consuming of all the tasks routinely asked from bug reporters. > If you can not proceed with what was suggested, then do > not piss anyone off because you were told to do something to help. If a polite "sorry, I don't have the time" already counts as pissing off, the only choice left is to avoid the situation in which I'd have to say that. IOW, don't report bugs if I don't have the time to follow through. Again: if that is what you want, I have no problem with it. > If you go to the doctor because of aching throat and he asks you to > open a mouth, you will not blame him for asking you to do that. That analogy is wrong on so many accounts. It is not my throat that's aching. A doctor would not insult me for not wanting to open my mouth but rather ask if there was perhaps a valid reason for that. Not to mention that opening my mouth takes substantially less time than a Linux kernel bisection ... A better analogy would be if I see an object lying on the highway, and I stop at the next service area to call the police and alert them about the possible danger. If they'd ask me to drive back to the place where I saw it in order to describe precisely where it lay and what it looked like, I think I might indeed become a bit upset. HTH T. PS: I'll shut my big mouth now. The topic has been beaten to death. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 254 bytes --] ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 23:23 ` Tilman Schmidt @ 2008-04-12 5:37 ` Evgeniy Polyakov 2008-04-12 7:06 ` Ilpo Järvinen 1 sibling, 0 replies; 129+ messages in thread From: Evgeniy Polyakov @ 2008-04-12 5:37 UTC (permalink / raw) To: Tilman Schmidt Cc: Mark Lord, Ilpo Järvinen, David Miller, jesper.juhl, yoshfuji, Jeff Garzik, rjw, LKML, Netdev On Sat, Apr 12, 2008 at 01:23:17AM +0200, Tilman Schmidt (tilman@imap.cc) wrote: > > If you can not proceed with what was suggested, then do > > not piss anyone off because you were told to do something to help. > > If a polite "sorry, I don't have the time" already counts as pissing > off, the only choice left is to avoid the situation in which I'd have to > say that. IOW, don't report bugs if I don't have the time to follow > through. Again: if that is what you want, I have no problem with it. We want bug reports, but when we ask for additional help, we definitely do not want to listen to crap about how the reporter does not have to help and it is our problem. > > If you go to the doctor because of aching throat and he asks you to > > open a mouth, you will not blame him for asking you to do that. > > That analogy is wrong on so many accounts. It is not my throat that's > aching. A doctor would not insult me for not wanting to open my mouth > but rather ask if there was perhaps a valid reason for that. Not to > mention that opening my mouth takes substantially less time than a Linux > kernel bisection ... It is your throat, since the doctor's is fine, and no one else has come in with the same problem. The doctor will not insult you if you do not piss him off. Read the thread from the beginning. > A better analogy would be if I see an object lying on the highway, and I > stop at the next service area to call the police and alert them about > the possible danger.
If they'd ask me to drive back to the place where I > saw it in order to describe precisely where it lay and what it looked > like, I think I might indeed become a bit upset. Yeah, they first asked how it looked and where it was, then they asked you to go back there, and you told them that it is they who have to do that, that it is exactly their problem, and that you are not paid to do that. Did I miss something? Yeah, probably the part where you then say that you care about highways, so you went there and did what was asked. Although, no, first you tell the police that you are spending your paid time and will teach them how to do things, or similar crap. Only a few hours later, to some other people, you will say that you care about highways, global warming, Ugandan children and the danger of the hadron collider. Ugh, just removing that object while you were there 'takes substantially less time than a Linux kernel bisection'. Fortunately, flooding developers with tons of urine for the whole day is much more comfortable and ego-boosting, since it allows you to close your eyes and not see that this took much more time than a bisection. Hope you got it right: we want bug reports and help. If you do not want to provide any help, do not expect the bug to be fixed, although the bug's existence is a significant sign. So, be cool, and everything will be ok. -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 23:23 ` Tilman Schmidt 2008-04-12 5:37 ` Evgeniy Polyakov @ 2008-04-12 7:06 ` Ilpo Järvinen 1 sibling, 0 replies; 129+ messages in thread From: Ilpo Järvinen @ 2008-04-12 7:06 UTC (permalink / raw) To: Tilman Schmidt Cc: Evgeniy Polyakov, Mark Lord, David Miller, jesper.juhl, yoshfuji, Jeff Garzik, rjw, LKML, Netdev On Sat, 12 Apr 2008, Tilman Schmidt wrote: > On Sat, 12 Apr 2008 02:25:36 +0400, Evgeniy Polyakov wrote: > > On Sat, Apr 12, 2008 at 12:16:28AM +0200, Tilman Schmidt (tilman@imap.cc) wrote: > >> So I was right after all? Bug reports from people who (for whatever > >> reason, including having to earn their living) cannot do a bisect are > >> not welcome? > > > > You got it wrong. > > Did I really? Let's see ... > > > If bug is subtle and developers can not reproduce it, there are only two > > ways out of the problem: to help developers or not to help. > > > > In the latter case bug report is useless (except that to show that it > > exists, since practically no one can fix it until some new details > > added). > > Looks like you're saying I was right after all. Useless bug reports > shouldn't be submitted. ...No, useless bug reports don't lead to a solution, i.e., that particular bug won't get fixed as a result of the report! That's what these people are trying to say. Surely the point of bug reports is to get the bugs fixed, don't you think? :-/ ...Or do you think getting them fixed is only secondary? > So please answer this simple question: If I know beforehand that I won't > have the time to do a bisect (or other similarly time-consuming task the > maintainers might ask from me), should I report the bug, or should I > keep my knowledge to myself? > > > > Bisection was just an example of the help, reporter can provide. > > Sure. It's not about bisection specifically, but about the time a > reporter is able to invest in addition to what went into the report > already.
But bisection is a good example, because it's the most > time-consuming of all the tasks routinely asked from bug reporters. I'm asking the same thing of you as I did of Mark (it still remains unanswered)... What's your suggestion, how should we have solved this particular case? Do you join those who ask developers to "invest" time in repeatedly going through commits that are not guilty? ...One would never find the solution by that method :-/. Yes, I'm fine with it if you don't want to help (or would want to but cannot, as has been the case with many of the nearly impossible to reproduce TCP bugs lately), but the sole consequence is that the bug remains unsolved, plain and simple. That's until somebody else is affected and reports it and we get the necessary information. Or alternatively somebody just reads the offending code (possibly much later) and begins to wonder why there's this particular thing missing there (this is in fact not related to the bug reports at all, many bugs are found this way but it's not a thing one can force to happen in a timely manner :-)). > > If you can not proceed with what was suggested, then do > > not piss anyone off because you were told to do something to help. > > If a polite "sorry, I don't have the time" already counts as pissing > off, the only choice left is to avoid the situation in which I'd have to > say that. IOW, don't report bugs if I don't have the time to follow > through. Again: if that is what you want, I have no problem with it. Please reread the thread, this couldn't be farther from the truth... ...Dave had suggested Mark would have to bisect, I suppose this was after finding out that there wasn't anything in particular that should cause this kind of behavior, or at least he couldn't find anything even suspicious looking.
Mark, with a rather demanding tone, was _also_ asking for that "somebody" who did all those TCP fin/closing changes (that would be me) to be responsible for them, i.e., those parts that Dave had checked and found not suspicious (and bisect also proved them innocent later on). Yes, I then went through that "mountain of commits" which Mark "was not willing to do" himself. I invested the time even after Dave had also come to the same conclusion as I again did, that there is nothing wrong with the particular parts of the TCP. Funny, since it seems that even Mark himself had come to that same conclusion as both Dave and myself then came (though I cannot really speak for him). So I was the third one to check those parts which were then found not guilty; were I angry about it, I would call that a plain waste of time :-). Please note too that I would have gone through them without his remarks as well to just check the thing I already knew. So what's the problem with that? The fact that Mark wasn't very willing to go through that "mountain of commits" and made some accusations that one shouldn't ask a user to do that, yet he was doing the same thing, asking me to go through that "mountain of commits". I don't find that as polite as you do (and maybe you don't honestly either), especially as that was _already done by Dave_, yet it wasn't enough for him. [...doctor part snipped...] > A better analogy would be if I see an object lying on the highway, and I > stop at the next service area to call the police and alert them about > the possible danger. If they'd ask me to drive back to the place where I > saw it in order to describe precisely where it lay and what it looked > like, I think I might indeed become a bit upset. But you would feel qualified to tell how the police/doctor must handle it? ...That was the main problem. To conclude, I moved this case of yours down here... > This question is not theoretical. 
It's a situation I find myself in > quite regularly, because I allow myself the luxury of building most rc > kernels and even the odd mm kernel just for fun even though I have a > daytime job and a family to feed. It would be quite easy to look the > other way if I encounter a problem in one of those, hoping someone else > with more time on his or her hands will also come across it and report > it. So far my conscience told me not to do that. But if reporting it > without being able to follow up on it is considered useless then my > conscience was apparently wrong. Just say the word, and I'll stop what > I'm doing. I'll have no problem finding other things to do with my time. It's perfectly fine that you report bugs, even with little time :-). But then if you are asked to do something that is necessary to help developers and you are not willing to do that, please don't start adding demands (it might actually be quite hard to restrain oneself from adding hidden "attacks" to wording, at least for me). Also we then have to accept the consequences, i.e., the bug won't get fixed because of that report (unless something comes up later which connects the pieces). > PS: I'll shut my big mouth now. The topic has been beaten to death. I thought the same earlier, but I want to try to correct the misunderstanding you are trying really hard to get here :-). -- i. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 22:16 ` Tilman Schmidt 2008-04-11 22:25 ` Evgeniy Polyakov @ 2008-04-11 22:26 ` David Miller 1 sibling, 0 replies; 129+ messages in thread From: David Miller @ 2008-04-11 22:26 UTC (permalink / raw) To: tilman Cc: johnpol, lkml, ilpo.jarvinen, jesper.juhl, yoshfuji, jeff, rjw, linux-kernel, netdev From: Tilman Schmidt <tilman@imap.cc> Date: Sat, 12 Apr 2008 00:16:28 +0200 > So I was right after all? Bug reports from people who (for whatever > reason, including having to earn their living) cannot do a bisect are > not welcome? You need to qualify this with: when a bisect is asked of them. You seem to be quite eager to harp on this specific point, to make it seem as if a bug report is useless if the person cannot or will not bisect in all cases. And that simply is not what we are saying here. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 1:23 ` Mark Lord 2008-04-11 6:40 ` Ilpo Järvinen @ 2008-04-11 19:58 ` Valdis.Kletnieks 2008-04-11 22:27 ` Tilman Schmidt 1 sibling, 1 reply; 129+ messages in thread From: Valdis.Kletnieks @ 2008-04-11 19:58 UTC (permalink / raw) To: Mark Lord Cc: David Miller, jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev [-- Attachment #1: Type: text/plain, Size: 334 bytes --] On Thu, 10 Apr 2008 21:23:54 EDT, Mark Lord said: > You still keep referring to it as "your (my) bug". > It's not. I had nothing to do with it, other than stumbling over it. Like it or not, when you're the owner of the only box that can reliably reproduce an error condition, it's your bug. Been there, done that, plenty of times. [-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --] ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 19:58 ` Valdis.Kletnieks @ 2008-04-11 22:27 ` Tilman Schmidt 2008-04-13 18:40 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Rafael J. Wysocki 0 siblings, 1 reply; 129+ messages in thread From: Tilman Schmidt @ 2008-04-11 22:27 UTC (permalink / raw) To: Valdis.Kletnieks Cc: Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, rjw, linux-kernel, netdev [-- Attachment #1: Type: text/plain, Size: 507 bytes --] On Fri, 11 Apr 2008 15:58:42 -0400, Valdis.Kletnieks@vt.edu wrote: > On Thu, 10 Apr 2008 21:23:54 EDT, Mark Lord said: > >> You still keep refering to it as "your (my) bug". >> It's not. I had nothing to do with it, other than stumbling over it. > > Like it or not, when you're the owner of the only box that can reliably > reproduce an error condition, it's your bug. Thanks for the advice. I'll keep it in mind next time I have to decide whether to report a bug I'm stumbling over. T. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 254 bytes --] ^ permalink raw reply [flat|nested] 129+ messages in thread
* Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-11 22:27 ` Tilman Schmidt @ 2008-04-13 18:40 ` Rafael J. Wysocki 2008-04-13 18:47 ` Willy Tarreau 0 siblings, 1 reply; 129+ messages in thread From: Rafael J. Wysocki @ 2008-04-13 18:40 UTC (permalink / raw) To: Tilman Schmidt Cc: Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev, Andrew Morton On Saturday, 12 of April 2008, Tilman Schmidt wrote: > On Fri, 11 Apr 2008 15:58:42 -0400, Valdis.Kletnieks@vt.edu wrote: > > On Thu, 10 Apr 2008 21:23:54 EDT, Mark Lord said: > > > >> You still keep refering to it as "your (my) bug". > >> It's not. I had nothing to do with it, other than stumbling over it. > > > > Like it or not, when you're the owner of the only box that can reliably > > reproduce an error condition, it's your bug. > > Thanks for the advice. I'll keep it in mind next time I have to decide > whether to report a bug I'm stumbling over. Well, the fact is, reporting bugs is always welcome. However, it may not be immediately obvious what causes the bug to appear as well as the bug need not be readily reproducible on any other system than yours, at least at the moment. In which case whether or not the bug will be fixed depends on the reporter. Namely, if the reporter wants and has the time to provide developers with additional information, the bug has a good chance to be fixed. Otherwise, it'll probably stay there until there's a more persistent reporter or it's fixed as a result of a related change. So, if people ask you to do a bisection, they probably mean "we don't see what the problem is and can't reproduce it, so please get us more information, otherwise we won't know how to fix it". In that case, you could provide them with a reproducible test case just as well. 
That said, there may be some developers who just don't want to spend time on analysing code and put the burden of finding the offending change on the reporter, but I don't think it's common practice. Thanks, Rafael ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-13 18:40 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Rafael J. Wysocki @ 2008-04-13 18:47 ` Willy Tarreau 2008-04-13 19:18 ` Andrew Morton ` (2 more replies) 0 siblings, 3 replies; 129+ messages in thread From: Willy Tarreau @ 2008-04-13 18:47 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev, Andrew Morton On Sun, Apr 13, 2008 at 08:40:11PM +0200, Rafael J. Wysocki wrote: > On Saturday, 12 of April 2008, Tilman Schmidt wrote: > > On Fri, 11 Apr 2008 15:58:42 -0400, Valdis.Kletnieks@vt.edu wrote: > > > On Thu, 10 Apr 2008 21:23:54 EDT, Mark Lord said: > > > > > >> You still keep refering to it as "your (my) bug". > > >> It's not. I had nothing to do with it, other than stumbling over it. > > > > > > Like it or not, when you're the owner of the only box that can reliably > > > reproduce an error condition, it's your bug. > > > > Thanks for the advice. I'll keep it in mind next time I have to decide > > whether to report a bug I'm stumbling over. > > Well, the fact is, reporting bugs is always welcome. > > However, it may not be immediately obvious what causes the bug to appear > as well as the bug need not be readily reproducible on any other system than > yours, at least at the moment. > > In which case whether or not the bug will be fixed depends on the reporter. > Namely, if the reporter wants and has the time to provide developers with > additional information, the bug has a good chance to be fixed. Otherwise, > it'll probably stay there until there's a more persistent reporter or it's > fixed as a result of a related change. > > So, if people ask you to do a bisection, they probably mean "we don't see > what the problem is and can't reproduce it, so please get us more information, > otherwise we won't know how to fix it". 
In that case, you could provide them > with a reproducible test case just as well. > > That said, there may be some developers who just don't want to spend time on > analysing code and put the burden of finding the offending change on the > reporter, but I don't think it's common practice. Very true. One other thing which might get confusing/frustrating on the user side is that currently, Linux is the *only* product which requires the bug reporter to find the faulty change (yes, I know, it's scalable). All other products the reporter uses work differently: the reporter contacts the editor/author/support/... and briefly describes his problem. Support asks him for a bit more details, remains silent for some time, then comes up with a patched version to confirm that the bug is fixed. So it is understandable from the user's standpoint that reporting bugs to Linux appears quite complex. But we should remind users that LKML is *not* a place to get free kernel support, but a *development* mailing list, and that it is somewhat expected that developers ask reporters for more development-related contribution. But if the reporter does not want to/cannot do much more, we should not attack him, but point him to other places instead (eg: at least create an entry in bugzilla so that the report is not lost, and the reporter has a chance to get contacted when the fix is known). Regards, Willy ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-13 18:47 ` Willy Tarreau @ 2008-04-13 19:18 ` Andrew Morton 2008-04-13 19:27 ` Rafael J. Wysocki ` (3 more replies) 2008-04-13 20:10 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Adrian Bunk 2008-04-14 9:58 ` Reporting bugs and bisection Andi Kleen 2 siblings, 4 replies; 129+ messages in thread From: Andrew Morton @ 2008-04-13 19:18 UTC (permalink / raw) To: Willy Tarreau Cc: Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev On Sun, 13 Apr 2008 20:47:30 +0200 Willy Tarreau <w@1wt.eu> wrote: > One other thing which might get confusing/frustrating on the > user side is that currently, Linux is the *only* product which requires > the bug reporter to find the fault change That's because many (probably most) Linux bugs are dependent upon the hardware which they run on, and developers cannot reproduce the failure on their hardware. Other software products don't have that problem. That being said.. four or five years ago, developers would often work closely with the reporter working out why the reporter's failure was occurring. Several days of back-and-forth. We don't do that as much nowadays - there's a tendency to a) throw the problem back at the reporter, often asking them to bisect. If the reporter is running a distro kernel (eg: Fedora) then that's quite hard, and often isn't a thing they have the knowledge to do. So they'll just disappear. Or b) just ignore the report altogether. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-13 19:18 ` Andrew Morton @ 2008-04-13 19:27 ` Rafael J. Wysocki 2008-04-13 19:47 ` Reporting bugs and bisection David Miller ` (2 subsequent siblings) 3 siblings, 0 replies; 129+ messages in thread From: Rafael J. Wysocki @ 2008-04-13 19:27 UTC (permalink / raw) To: Andrew Morton Cc: Willy Tarreau, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev On Sunday, 13 of April 2008, Andrew Morton wrote: > On Sun, 13 Apr 2008 20:47:30 +0200 Willy Tarreau <w@1wt.eu> wrote: > > > One other thing which might get confusing/frustrating on the > > user side is that currently, Linux is the *only* product which requires > > the bug reporter to find the fault change > > That's because many (probably most) Linux bugs are dependent upon the > hardware which they run on, and developers cannot reproduce the failure on > their hardware. Other software products don't have that problem. > > > That being said.. four or five years ago, developers would often work > closely with the reporter working out why the reporter's failure was > occurring. Several days of back-and-forth. > > We dont' do that as much nowadays - there's a tendency to > > a) throw the problem back at the reporter, often asking them to bisect. > If the reporter is running a distro kernel (eg: Fedora) then that's > quite hard, and often isn't a think they have knowledge to do. So > they'll just disappear. Or > > b) just ignore the report altogether. IMHO we should try to make that difficult. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-13 19:18 ` Andrew Morton 2008-04-13 19:27 ` Rafael J. Wysocki @ 2008-04-13 19:47 ` David Miller 2008-04-13 20:21 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Evgeniy Polyakov 2008-04-14 10:18 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Ingo Molnar 3 siblings, 0 replies; 129+ messages in thread From: David Miller @ 2008-04-13 19:47 UTC (permalink / raw) To: akpm Cc: w, rjw, tilman, Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev From: Andrew Morton <akpm@linux-foundation.org> Date: Sun, 13 Apr 2008 12:18:31 -0700 > That being said.. four or five years ago, developers would often work > closely with the reporter working out why the reporter's failure was > occurring. Several days of back-and-forth. The ratio of bug reports to developers was significantly different back then. I pine for the "good ole' days" of kernel development sometimes too, but I try to be realistic and understand why things are different now. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-13 19:18 ` Andrew Morton 2008-04-13 19:27 ` Rafael J. Wysocki 2008-04-13 19:47 ` Reporting bugs and bisection David Miller @ 2008-04-13 20:21 ` Evgeniy Polyakov 2008-04-13 20:33 ` Rafael J. Wysocki 2008-04-13 20:35 ` David Miller 2008-04-14 10:18 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Ingo Molnar 3 siblings, 2 replies; 129+ messages in thread From: Evgeniy Polyakov @ 2008-04-13 20:21 UTC (permalink / raw) To: Andrew Morton Cc: Willy Tarreau, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev Hi. On Sun, Apr 13, 2008 at 12:18:31PM -0700, Andrew Morton (akpm@linux-foundation.org) wrote: > > One other thing which might get confusing/frustrating on the > > user side is that currently, Linux is the *only* product which requires > > the bug reporter to find the fault change > > That's because many (probably most) Linux bugs are dependent upon the > hardware which they run on, and developers cannot reproduce the failure on > their hardware. Other software products don't have that problem. Bugs are bugs, they either depend on hardware or do not. There is no perfect world where after reporting a subtle bug it will be fixed. It is not Linux, it is everywhere. Bugs are only fixed when they have major impact. Only. Either by having an exploit, or a crash, or a good testcase. Or a bisect result. This is just a tool to help both parties. And a huge help for regressions. If a bug has existed for years, bisection is unlikely to help. > That being said.. four or five years ago, developers would often work > closely with the reporter working out why the reporter's failure was > occurring. Several days of back-and-forth. Yeah, spent two weeks kicking all possible stuff around and eventually drop that namespace patch at all to find where the problem was. We started to move further. 
Bisect is just a tool. 
It is not something developers throw into user when they do not want to work. This _is_ a help, which allows both to solve problem in the fastest way. If the same would be done on developers machine and huge patches would be sent to jump between changesets, that would be a real 'work closely with the reporter working out why the reporter's failure was occurring'? You pointed it yourself: several days of back-and-forth. With this helping automation tool called bisect bug was resolved in 15 minutes after completion. Completion itself took couple of hours. > We dont' do that as much nowadays - there's a tendency to > > a) throw the problem back at the reporter, often asking them to bisect. > If the reporter is running a distro kernel (eg: Fedora) then that's > quite hard, and often isn't a think they have knowledge to do. So > they'll just disappear. Or > > b) just ignore the report altogether. There is also global warming tendency. IIRC. Bugs _are_ fixed, Andrew. And developers did not change suddenly to selfish bastards who do not care for users. They just developed a tool, which greatly helps to both and saves lots of users time, since regression gets fixed with this tool really quickly. Bisect is not asked to be performed without a reason. For subtle bug it is the fastest way, but otherwise there might be a long conversation. And even in this really subtle case there was a dialog. Bisect automation does not add kind relations though, but we can ask Linus to add couple of smiles into the output. -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-13 20:21 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Evgeniy Polyakov @ 2008-04-13 20:33 ` Rafael J. Wysocki 2008-04-13 20:54 ` Evgeniy Polyakov 2008-04-13 20:35 ` David Miller 1 sibling, 1 reply; 129+ messages in thread From: Rafael J. Wysocki @ 2008-04-13 20:33 UTC (permalink / raw) To: Evgeniy Polyakov Cc: Andrew Morton, Willy Tarreau, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev On Sunday, 13 of April 2008, Evgeniy Polyakov wrote: > Hi. > > On Sun, Apr 13, 2008 at 12:18:31PM -0700, Andrew Morton (akpm@linux-foundation.org) wrote: > > > One other thing which might get confusing/frustrating on the > > > user side is that currently, Linux is the *only* product which requires > > > the bug reporter to find the fault change > > > > That's because many (probably most) Linux bugs are dependent upon the > > hardware which they run on, and developers cannot reproduce the failure on > > their hardware. Other software products don't have that problem. > > Bugs are bugs, they either depend on hardware or do not. > There is no perfect world where after reporting subtle bug it will be > fixed. It is not Linux, it is everywhere. Bugs are only fixed when > they have major impact. Only. Either by having exploit, or crash, > or good testcase. Or bisect result. > > This just a tool to help both parties. And a huge help for regressions. > If bug would exist for years, bisection unlikely to help. > > > That being said.. four or five years ago, developers would often work > > closely with the reporter working out why the reporter's failure was > > occurring. Several days of back-and-forth. > > Yeah, spent two weeks kicking all possible stuff around and eventually > drop that namespace patch at all to find where the problem was. We > started to move further. > > Bisect is just a tool. 
It is not something developers throw into user > when they do not want to work. This _is_ a help, which allows both to > solve problem in the fastest way. > > If the same would be done on developers machine and huge patches would > be sent to jump between changesets, that would be a real 'work closely > with the reporter working out why the reporter's failure was occurring'? > > You pointed it yourself: several days of back-and-forth. > With this helping automation tool called bisect bug was resolved in 15 > minutes after completion. Completion itself took couple of hours. > > > We don't do that as much nowadays - there's a tendency to > > > > a) throw the problem back at the reporter, often asking them to bisect. > > If the reporter is running a distro kernel (eg: Fedora) then that's > > quite hard, and often isn't a thing they have the knowledge to do. So > > they'll just disappear. Or > > > > b) just ignore the report altogether. > > There is also global warming tendency. IIRC. > > Bugs _are_ fixed, Andrew. And developers did not change suddenly to > selfish bastards who do not care for users. They just developed a tool, > which greatly helps to both and saves lots of users time, since > regression gets fixed with this tool really quickly. Bisect is not asked > to be performed without a reason. To be honest, at least in one case no one reacted to my report(s) until I ran a bisection and then it turned up an obviously broken patch. The breakage was so obvious that if anyone had actually looked at the code in question, he would have seen it immediately. Things like this are very disappointing and have a very negative impact on bug reporters. We should do our best to avoid them. 
Thanks, Rafael ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-13 20:33 ` Rafael J. Wysocki @ 2008-04-13 20:54 ` Evgeniy Polyakov 2008-04-13 22:24 ` Reporting bugs and bisection Stephen Clark 0 siblings, 1 reply; 129+ messages in thread From: Evgeniy Polyakov @ 2008-04-13 20:54 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Andrew Morton, Willy Tarreau, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev On Sun, Apr 13, 2008 at 10:33:49PM +0200, Rafael J. Wysocki (rjw@sisk.pl) wrote: > Things like this are very disappointing and have a very negative impact on bug > reporters. We should do our best to avoid them. Shit happens. This is a matter of either bug report or those who were in the copy list. There are different people and different situations, in which they do not reply. -- Evgeniy Polyakov ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-13 20:54 ` Evgeniy Polyakov @ 2008-04-13 22:24 ` Stephen Clark 2008-04-13 22:41 ` Rafael J. Wysocki ` (2 more replies) 0 siblings, 3 replies; 129+ messages in thread From: Stephen Clark @ 2008-04-13 22:24 UTC (permalink / raw) To: Evgeniy Polyakov Cc: Rafael J. Wysocki, Andrew Morton, Willy Tarreau, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev Evgeniy Polyakov wrote: > On Sun, Apr 13, 2008 at 10:33:49PM +0200, Rafael J. Wysocki (rjw@sisk.pl) wrote: >> Things like this are very disappointing and have a very negative impact on bug >> reporters. We should do our best to avoid them. > > Shit happens. This is a matter of either bug report or those who were in > the copy list. There are different people and different situations, in > which they do not reply. > Well less shit would happen if developers would take the time to at least test their patches before they were submitted. It's like we will just have the poor user do our testing for us. What kind of testing do developers do? I've been a linux user and have followed the LKML for a number of years and have yet to see any test plans for any submitted patches. My $.02 Steve Clark -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-13 22:24 ` Reporting bugs and bisection Stephen Clark @ 2008-04-13 22:41 ` Rafael J. Wysocki 2008-04-13 23:51 ` david 2008-04-14 9:26 ` Andi Kleen 2 siblings, 0 replies; 129+ messages in thread From: Rafael J. Wysocki @ 2008-04-13 22:41 UTC (permalink / raw) To: sclark46 Cc: Evgeniy Polyakov, Andrew Morton, Willy Tarreau, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev On Monday, 14 of April 2008, Stephen Clark wrote: > Evgeniy Polyakov wrote: > > On Sun, Apr 13, 2008 at 10:33:49PM +0200, Rafael J. Wysocki (rjw@sisk.pl) wrote: > >> Things like this are very disappointing and have a very negative impact on bug > >> reporters. We should do our best to avoid them. > > > > Shit happens. This is a matter of either bug report or those who were in > > the copy list. There are different people and different situations, in > > which they do not reply. > > > Well less shit would happen if developers would take the time to at least test > their patches before they were submitted. It like we will just have the poor > user do our testing for us. What kind of testing do developers do. I been a > linux user and have followed the LKML for a number of years and have yet to see > any test plans for any submitted patches. The (apparent) lack of test plans doesn't imply the patches not being tested, actually. My experience indicates that they are tested in the majority of cases. Still, sometimes they are not and that's when the most damage is done. Thanks, Rafael ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-13 22:24 ` Reporting bugs and bisection Stephen Clark 2008-04-13 22:41 ` Rafael J. Wysocki @ 2008-04-13 23:51 ` david 2008-04-14 0:36 ` Jakub Narebski 2008-04-14 4:39 ` Willy Tarreau 2008-04-14 9:26 ` Andi Kleen 2 siblings, 2 replies; 129+ messages in thread From: david @ 2008-04-13 23:51 UTC (permalink / raw) To: Stephen Clark Cc: Evgeniy Polyakov, Rafael J. Wysocki, Andrew Morton, Willy Tarreau, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev cross-posted to git for the suggestion at the bottom On Sun, 13 Apr 2008, Stephen Clark wrote: > Evgeniy Polyakov wrote: >> On Sun, Apr 13, 2008 at 10:33:49PM +0200, Rafael J. Wysocki (rjw@sisk.pl) >> wrote: >>> Things like this are very disappointing and have a very negative impact on >>> bug >>> reporters. We should do our best to avoid them. >> >> Shit happens. This is a matter of either bug report or those who were in >> the copy list. There are different people and different situations, in >> which they do not reply. >> > Well less shit would happen if developers would take the time to at least > test their patches before they were submitted. It's like we will just have the > poor user do our testing for us. What kind of testing do developers do? I've > been a linux user and have followed the LKML for a number of years and have > yet to see > any test plans for any submitted patches. I've been reading LKML for 11 years now, I've tested kernels and reported a few bugs along the way. the expectation is that the submitter should have tested the patches before submitting them (where hardware allows). but that "where hardware allows" is a big problem. so many issues are dependent on hardware that it's not possible to test everything. there are people who download, compile and test the tree nightly (with farms of machines to test different configs), but they can't catch everything. 
expecting the patches to be tested to the point where there are no bugs is unreasonable. bisecting is a very powerful tool, but I do think that sometimes developers lean on it a bit much. taking the attitude (as some have) that 'if the reporter can't be bothered to do a bisection I can't be bothered to deal with the bug' is going way too far. if a bug can be reproduced reliably on a test system then bisecting it may reveal the patch that introduced or unmasked the bug (assuming that there aren't other problems along the way), but if the bug takes a long time to show up after a boot, or only happens under production loads, bisecting it may not be possible. that doesn't mean that the bug isn't real, it just means that the user is going to have to stick with an old version until there is a solution or work-around. even in the hard-to-test situations, the reporter is usually able to test a few fixes, but there's a big difference between going to management and saying "the kernel gurus think that this will help, can we test it this weekend" 2-3 times and doing a bisection that will take 10-15 cycles to find the problem. it's very reasonable to ask the reporter if they can bisect the problem, but if they say that they can't, declaring that they are out of luck is not reasonable, it just means that it's going to take more thinking to find the problem instead of being able to let the mechanical bisect process narrow things down for you. it may mean that the developer will need to make a patch to instrument an old (working) kernel that has minimal impact on that kernel so that the reporter can run this to gather information about what the load is so that the developer can try to simulate it on a new (non-working) kernel. in theory everyone has a test environment that lets them simulate everything in their production environment. 
in practice this is only true at the very low end (where it's easy to do) and the very high end (where it's so critical that it's done no matter how much it costs). Everyone else has a test environment that can test most things, but not everything. As such when they run into a problem they may not be able to do lots of essentially random testing. elsewhere in this thread someone said that the pre-git way was to do a manual bisect where the developer would send patches backing out specific changes to find the problem. one big difference between that and bisecting the problem is that the manual process was focused on the changes in the area that is suspected of causing the problem, while the git bisect process goes after all changes. this makes it much more likely that the tester will run into unrelated problems along the way. I wonder if it would be possible to make a variation of git bisect that only looked at a subset of the tree when picking bisect points (if you are looking for an e1000 bug, testing bisect points that haven't changed that driver won't help you for example). If this can be done it would speed up the reporter's efforts, but will require more assistance from the developers (who would need to tell the reporters what subtrees to test) so it's a tradeoff of efficiency vs simplicity. David Lang ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection
  2008-04-13 23:51 ` david
@ 2008-04-14 0:36 ` Jakub Narebski
  2008-04-14 4:39 ` Willy Tarreau
  1 sibling, 0 replies; 129+ messages in thread
From: Jakub Narebski @ 2008-04-14 0:36 UTC (permalink / raw)
  To: david
  Cc: Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Andrew Morton,
      Willy Tarreau, Tilman Schmidt, Valdis.Kletnieks, Mark Lord,
      David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev

david@lang.hm writes:

> cross-posted to git for the suggestion at the bottom

[...]
> Elsewhere in this thread someone said that the pre-git way was to do a
> manual bisect where the developer would send patches backing out
> specific changes to find the problem. one big difference between that
> and bisecting the problem is that the manual process was focused on
> the changes in the area that is suspected of causing the problem,
> while the git bisect process goes after all changes. this makes it
> much more likely that the tester will run into unrelated problems
> along the way.
>
> I wonder if it would be possible to make a variation of git bisect
> that only looked at a subset of the tree when picking bisect points
> (if you are looking for a e1000 bug, testing bisect points that
> haven't changed that driver won't help you for example). If this can
> be done it would speed up the reporters efforts, but will require more
> assistance from the developers (who would need to tell the reporters
> what subtrees to test) so it's a tradeoff of efficiency vs simplicity.

Errr... the synopsis of git-bisect contains the following:

    git bisect start [<bad> [<good>...]] [--] [<paths>...]

so you can limit bisection to commits affecting a specified subsystem.

P.S. Unfortunately git currently doesn't deal with directory renames,
so if there was some big code restructuring one has to provide all
historic pathspecs.

-- 
Jakub Narebski
Poland
ShadeHawk on #git
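A rough illustration of why the path-limited form helps the reporter.
This is a toy model, not how git itself walks history: it assumes a
strictly linear history, a bug that stays bad once introduced, and a
culprit commit that really does touch the named directory. The Commit
class, the file names, and the commit counts are all made up for the
example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    seq: int       # position in a linear toy history (hypothetical)
    paths: tuple   # files the commit touches


def bisect_first_bad(candidates, is_bad):
    """Binary-search a non-empty list (oldest first, last one known bad).

    Returns (first bad commit, number of kernels that had to be tested).
    """
    lo, hi, tests = 0, len(candidates) - 1, 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1                      # one build/boot/test cycle
        if is_bad(candidates[mid]):
            hi = mid                    # culprit is at mid or earlier
        else:
            lo = mid + 1                # culprit is after mid
    return candidates[lo], tests


def touching(candidates, prefix):
    """Keep only commits that touch at least one path under prefix."""
    return [c for c in candidates
            if any(p.startswith(prefix) for p in c.paths)]


# Toy history: 4096 commits, every 64th one touches the e1000 driver.
history = [
    Commit(i, ("drivers/net/e1000/e1000_main.c",) if i % 64 == 0
           else ("fs/some_other_file.c",))
    for i in range(4096)
]
CULPRIT = 2048                          # assumed to be a driver commit


def is_bad(commit):
    return commit.seq >= CULPRIT


full = bisect_first_bad(history, is_bad)
limited = bisect_first_bad(touching(history, "drivers/net/e1000/"), is_bad)
print("all commits: culprit", full[0].seq, "after", full[1], "tests")
print("driver only: culprit", limited[0].seq, "after", limited[1], "tests")
```

In this toy history the unrestricted search needs 12 test cycles and the
driver-only search needs 6. The saving depends entirely on the
assumption that the culprit touches the chosen paths; if the bug was
introduced elsewhere, a path-limited bisect can only point at the
nearest commit that does touch them. With real git the equivalent
invocation follows the synopsis quoted above, e.g.
"git bisect start <bad> <good> -- drivers/net/e1000".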
* Re: Reporting bugs and bisection
  2008-04-13 23:51 ` david
  2008-04-14 0:36 ` Jakub Narebski
@ 2008-04-14 4:39 ` Willy Tarreau
  2008-04-14 5:39 ` Al Viro
  1 sibling, 1 reply; 129+ messages in thread
From: Willy Tarreau @ 2008-04-14 4:39 UTC (permalink / raw)
  To: david
  Cc: Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Andrew Morton,
      Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller,
      jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev

On Sun, Apr 13, 2008 at 04:51:34PM -0700, david@lang.hm wrote:
> cross-posted to git for the suggestion at the bottom
>
> On Sun, 13 Apr 2008, Stephen Clark wrote:
>
> >Evgeniy Polyakov wrote:
> >>On Sun, Apr 13, 2008 at 10:33:49PM +0200, Rafael J. Wysocki (rjw@sisk.pl)
> >>wrote:
> >>>Things like this are very disappointing and have a very negative impact
> >>>on bug
> >>>reporters. We should do our best to avoid them.
> >>
> >>Shit happens. This is a matter of either bug report or those who were in
> >>the copy list. There are different people and different situations, in
> >>which they do not reply.
> >>
> >Well less shit would happen if developers would take the time to at least
> >test their patches before they were submitted. It like we will just have
> >the poor user do our testing for us. What kind of testing do developers
> >do. I been a linux user and have followed the LKML for a number of years
> >and have yet to see
> >any test plans for any submitted patches.
>
> I've been reading LKML for 11 years now, I've tested kernels and reported
> a few bugs along the way.
>
> the expectation is that the submitter should have tested the patches
> before submitting them (where hardware allows). but that "where hardware
> allows" is a big problem. so many issues are dependent on hardware that
> it's not possible to test everything.
>
> there are people who download, compile and test the tree nightly (with
> farms of machines to test different configs), but they can't catch
> everything.
>
> expecting the patches to be tested to the point where there are no bugs is
> unreasonable.

[...]

Agreed. The difficulty is that only the developer knows how confident he
is in his code. Even the subsystem maintainer does not know, which is the
real issue since as long as the code is not identified, he does not know
whom to ping.

And I think that it might help if we could add a "Trust" rating to the
patches we submit, similarly to "Tested-By" or "Signed-off-by". We could
use 1 to 5. Basically, when the patch was completed at 3am and just
builds, it's more likely 1/5. When it has been stressed for 1 week, it
would be 4/5. 5/5 would only be used in backports of known working code,
for some widely used external patches, or for trivial patches (eg:
doc/whitespace fixes).

The goal would clearly not be to just trust patches with a high rating
(since they might break when associated with others), but for the
subsystem maintainer to quickly check if there are some of them the
author does not 100% trust, in which case he could ping the author to
check if his patch *may* cause the reported problem.

What makes this rating system delicate is that the rating cannot be
changed afterwards. But after all, that's not much of a problem. A bug
may very well reveal itself one year after the code was merged, so it's
really the developer's estimation which matters.

For this to be efficiently used, we would need git-commit to accept a new
"-T <rating>" argument with the following possible values :

  0: untested (default)
  1: builds
  2: seems to be working
  3: passed basic non-regression tests
  4: survived stress testing at the developer's
  5: known to be working for a long time somewhere else

I'm sure many people would find this useless (or in fact reject the
idea because it would show that most code will be rated 1 or 2),
but I really think it can help subsystem maintainers make the relation
between a reported bug and a possible submitter.

Willy
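To make the proposal above concrete, here is a minimal sketch of how a
maintainer-side script might use such a rating. Everything in it is
hypothetical: git has no "-T" option, and the "Trust:" trailer name is
only an assumption about how the rating could be recorded in a commit
message next to "Signed-off-by". The level descriptions are copied from
the list in the message; the threshold of 3 is an arbitrary choice.

```python
import re

# Levels as listed in the proposal; 0 is the proposed default.
TRUST_LEVELS = {
    0: "untested",
    1: "builds",
    2: "seems to be working",
    3: "passed basic non-regression tests",
    4: "survived stress testing at the developer's",
    5: "known to be working for a long time somewhere else",
}

# Hypothetical trailer line, e.g. "Trust: 2" on a line of its own.
_TRAILER = re.compile(r"^Trust:[ \t]*([0-5])[ \t]*$", re.MULTILINE)


def trust_rating(message):
    """Return the rating from the last 'Trust:' line, or 0 if absent."""
    found = _TRAILER.findall(message)
    return int(found[-1]) if found else 0


def needs_ping(messages, threshold=3):
    """List (subject, rating) for commits rated below threshold."""
    result = []
    for message in messages:
        rating = trust_rating(message)
        if rating < threshold:
            subject = (message.splitlines() or [""])[0]
            result.append((subject, rating))
    return result


if __name__ == "__main__":
    log = [
        "net: rework foo handling\n\nTrust: 1\nSigned-off-by: A <a@example.org>\n",
        "doc: fix whitespace\n\nTrust: 5\n",
        "fs: tweak bar\n",
    ]
    for subject, rating in needs_ping(log):
        print(f"{subject}: {rating}/5 ({TRUST_LEVELS[rating]})")
```

The script only answers the question raised in the message, namely
which authors to ping first when a bug is reported against a range of
commits. It says nothing about whether a highly rated patch is actually
correct.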
* Re: Reporting bugs and bisection
  2008-04-14 4:39 ` Willy Tarreau
@ 2008-04-14 5:39 ` Al Viro
  2008-04-14 6:24 ` Andrew Morton
  0 siblings, 1 reply; 129+ messages in thread
From: Al Viro @ 2008-04-14 5:39 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki,
      Andrew Morton, Tilman Schmidt, Valdis.Kletnieks, Mark Lord,
      David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev

On Mon, Apr 14, 2008 at 06:39:39AM +0200, Willy Tarreau wrote:

[snip]

> I'm sure many people would find this useless (or in fact reject the
> idea because it would show that most code will be rated 1 or 2),
> but I really think it can help subsystem maintainers make the relation
> between a reported bug and a possible submitter.

I have a related proposal: let us require all patches to be stamped
with Discordian *and* Eternal September dates. In triplicate. While
we are at it, why don't we introduce new mandatory headers like, say
it,

X-checkpatch: {Yes,No}
X-checkpatch-why-not: <string>
X-pointless: <number from 1 to 69, going from "1: does something useful" all
the way to "68: aligns right ends of lines in comments">
X-arbitrary-rules-added-to-CodingStyle: <number> (should be present if
and only if X-pointless: 69 is present).

Come to think of that, we clearly need a new file in Documentation/*,
documenting such headers. Why don't we organize a subcommittee^Wnew maillist
devoted to that? That would provide another entry route for contributors,
lowering the overall entry barriers even further...

Seriously, looks like Andi is right - we've got ourselves a developing
bureaucracy. As in "more and more ways of generating activity without
doing anything even remotely useful". Complete with tendency to operate in
the ways that make sense only to bureaucracy in question and an ever-growing
set of bylaws...
* Re: Reporting bugs and bisection 2008-04-14 5:39 ` Al Viro @ 2008-04-14 6:24 ` Andrew Morton 2008-04-14 6:39 ` David Miller ` (2 more replies) 0 siblings, 3 replies; 129+ messages in thread From: Andrew Morton @ 2008-04-14 6:24 UTC (permalink / raw) To: Al Viro Cc: Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 14 Apr 2008 06:39:43 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote: > On Mon, Apr 14, 2008 at 06:39:39AM +0200, Willy Tarreau wrote: > > [snip] > > > I'm sure many people would find this useless (or in fact reject the > > idea because it would show that most code will be rated 1 or 2), > > but I really think it can help subsystem maintainers make the relation > > between a reported bug and a possible submitter. > > I have a related proposal: let us require all patches to be stamped > with Discordian *and* Eternal September dates. In triplicate. While > we are at it, why don't we introduce new mandatory headers like, say > it, > > X-checkpatch: {Yes,No} > X-checkpatch-why-not: <string> > X-pointless: <number from 1 to 69, going from "1: does something useful" all > the way to "68: aligns right ends of lines in comments"> > X-arbitrary-rules-added-to-CodingStyle: <number> (should be present if > and only if X-pointless: 69 is present). > > Come to think of that, we clearly need a new file in Documentation/*, > documenting such headers. Why don't we organize a subcommittee^Wnew maillist > devoted to that? That would provide another entry route for contributors, > lowering the overall entry barriers even further... > None of the above was particularly useful. > > Seriously, looks like Andi is right - we've got ourselves a developing > beaurocracy. As in "more and more ways of generating activity without > doing anything even remotely useful". 
Complete with tendency to operate in > the ways that make sense only to beaurocracy in question and an ever-growing > set of bylaws... No. The problem we're discussing here is the apparently-large number of bugs which are in the kernel, the apparently-large number of new bugs which we're adding to the kernel, and our apparent tardiness in addressing them. Do you agree with these impressions, or not? If you do agree, what would you propose we do about it? ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection
  2008-04-14 6:24 ` Andrew Morton
@ 2008-04-14 6:39 ` David Miller
  2008-04-14 6:43 ` David Miller
  2008-04-14 7:23 ` Al Viro
  2008-04-14 19:13 ` Rene Herman
  2 siblings, 1 reply; 129+ messages in thread
From: David Miller @ 2008-04-14 6:39 UTC (permalink / raw)
  To: akpm
  Cc: viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks,
      lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev

From: Andrew Morton <akpm@linux-foundation.org>
Date: Sun, 13 Apr 2008 23:24:41 -0700

> Do you agree with these impressions, or not?

I think things are improving. I wrote or merged in ~10 bugs in the
last hour, for example.

And I also agree with Al's point, which was embedded in his humorous
and obviously sarcastic suggestions, in that adding bureaucracy isn't
the answer. We already have too much and it scares developers away.

Sure you don't want crap getting into the tree (for too long), but it
is important to be careful to define crap properly.

For example, inundating patch submitters with more requirements,
especially ones involving automatons like checkpatch, is in the end
bad.

We can improve the quality of stuff going in and be flexible at the
same time.
* Re: Reporting bugs and bisection
  2008-04-14 6:39 ` David Miller
@ 2008-04-14 6:43 ` David Miller
  0 siblings, 0 replies; 129+ messages in thread
From: David Miller @ 2008-04-14 6:43 UTC (permalink / raw)
  To: akpm
  Cc: viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks,
      lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev

From: David Miller <davem@davemloft.net>
Date: Sun, 13 Apr 2008 23:39:59 -0700 (PDT)

> I wrote or merged in ~10 bugs in the last hour, for example.

Bug fixes! I meant "fixes" I swear!

That's quite a Freudian slip if I ever saw one.
* Re: Reporting bugs and bisection 2008-04-14 6:24 ` Andrew Morton 2008-04-14 6:39 ` David Miller @ 2008-04-14 7:23 ` Al Viro 2008-04-14 7:43 ` Al Viro ` (2 more replies) 2008-04-14 19:13 ` Rene Herman 2 siblings, 3 replies; 129+ messages in thread From: Al Viro @ 2008-04-14 7:23 UTC (permalink / raw) To: Andrew Morton Cc: Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Sun, Apr 13, 2008 at 11:24:41PM -0700, Andrew Morton wrote: > No. The problem we're discussing here is the apparently-large number of > bugs which are in the kernel, the apparently-large number of new bugs which > we're adding to the kernel, and our apparent tardiness in addressing them. > > Do you agree with these impressions, or not? > > If you do agree, what would you propose we do about it? In addition to obvious "we need testing and something better than bugzilla to keep track of bugs"? Real review of code in tree and patches getting into the tree. And the latter part _must_ be done on each entry point. Any git tree that acts as injection point really needs a working mechanism of some sort that would do that; afterwards it's too late, since review of the stuff getting into mainline on a massive merge is sadly impractical. I don't know any formal mechanism that could take care of that; no more than making sure that no backdoors are injected into the tree. It really has to be a matter of trust for tree maintainers and community around the subsystem. Git is damn good at killing the merge bottleneck. Too good, since it hides the review bottleneck. And we get equivalents of self-selected communities that had been problem for "here's our CVS, here's monthly dump from it, apply" kind of setups. It _is_ better, since one can get to commit history (modulo interesting issues with merge nodes and conflict resolution). 
But in practice it's not good enough - the patches going in during a merge (especially for a tree that collects from secondaries) are not visible enough. And it's too late at that point, since one has to do something monumentally ugly to get Linus revert a large merge. On the scale of Great IDE Mess in 2.5... linux-next might help with the last part, but I don't think it really deals with the first one. It certainly helps to some extent, but... We need higher S/N on l-k. We need people looking into the subsystem trees as those grow and causing a stench when bad things are found, with design issues getting brought to l-k if nothing else helps. We need tree maintainers understanding that review, including out-of-community one, is needed (the need of testing is generally better understood - I _hope_). We need more people reading the fscking source. Subsystem by subsystem. Without assumption that code is not broken. With mechanism collating the questions asked and answers given. Ideally we need growing documentation of core subsystems and data structures, with explicit goal of helping reviewers new to an area to find their way around it. And yes, I'm guilty of procrastinating on that - several half-finished pieces on VFS-related stuff are sitting locally ;-/ We need gregkh to get real and stop assuming that two Signed-off-by are equivalent to "reviewed at least twice", while we are at it ;-) We need people to realize that warnings are useful as triage tools - not as "Ug see warning. Warning bad. Ug fix that line. Warning go away. Ug changeset count grow. Ug happy.", but as machine-assisted part of finding confused areas of code. With human combining signals from different warnings to get statistically useful triage strategies (note that aforementioned making gcc/sparse/whatnot to STFU by local change has a lovely potential of distorting those signals and actually _hiding_ crap code). 
Maybe we need a list a-la linux-arch for tree maintainers to coordinate stuff - obviously open not only for those. We really need to get around to doing triage of remaining stuff in -mm, BTW - again, guilty for not getting through such on VFS-related stuff in there. Hopefully linux-next trees will eventually vacuum most of the pile in... As for the bug that got this thread started... I'd say that asking to bisect was reasonable in this particular case. The following DSW mixed into the thread very soon went the way of all DSW (OK, it hadn't godwinated yet, at least in the parts I've seen, so there's still way to go, but...) ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 7:23 ` Al Viro @ 2008-04-14 7:43 ` Al Viro 2008-04-14 8:04 ` Andrew Morton 2008-04-14 15:54 ` James Morris 2 siblings, 0 replies; 129+ messages in thread From: Al Viro @ 2008-04-14 7:43 UTC (permalink / raw) To: Andrew Morton Cc: Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, Apr 14, 2008 at 08:23:28AM +0100, Al Viro wrote: > And the latter part _must_ be done on each entry point. Any git tree > that acts as injection point really needs a working mechanism of some > sort that would do that; afterwards it's too late, since review of > the stuff getting into mainline on a massive merge is sadly impractical. PS: net/* is actually pretty sane in that respect - the huge volume being what it is, of course, but still, my impression is that it's pretty far from the worst sources of crap. OTOH, I might be missing secondary tree problems - e.g. net/sctp is much worse off in that respect, AFAICT; there might very well be more of such areas. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 7:23 ` Al Viro 2008-04-14 7:43 ` Al Viro @ 2008-04-14 8:04 ` Andrew Morton 2008-04-14 8:30 ` David Miller ` (2 more replies) 2008-04-14 15:54 ` James Morris 2 siblings, 3 replies; 129+ messages in thread From: Andrew Morton @ 2008-04-14 8:04 UTC (permalink / raw) To: Al Viro Cc: Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 14 Apr 2008 08:23:28 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote: > On Sun, Apr 13, 2008 at 11:24:41PM -0700, Andrew Morton wrote: > > > No. The problem we're discussing here is the apparently-large number of > > bugs which are in the kernel, the apparently-large number of new bugs which > > we're adding to the kernel, and our apparent tardiness in addressing them. > > > > Do you agree with these impressions, or not? > > > > If you do agree, what would you propose we do about it? > > In addition to obvious "we need testing and something better than bugzilla > to keep track of bugs"? Swapping out bugzilla for something else wouldn't help. We'd end up with lots of people ignoring a good bug tracking system just like they were ignoring a bad one. (And I don't think developers and maintainers _should_ spend time mucking in bug-tracking systems. They should have helpers who do all the triaging/tracking/routing/closing work for them, and then provide other developers with the results, letting them know what they should be spending time on. But there's a manpower problem). > Real review of code in tree and patches getting into > the tree. > > And the latter part _must_ be done on each entry point. Any git tree > that acts as injection point really needs a working mechanism of some > sort that would do that; afterwards it's too late, since review of > the stuff getting into mainline on a massive merge is sadly impractical. 
> > I don't know any formal mechanism that could take care of that; no more > than making sure that no backdoors are injected into the tree. It really > has to be a matter of trust for tree maintainers and community around > the subsystem. > > Git is damn good at killing the merge bottleneck. Too good, since it > hides the review bottleneck. And we get equivalents of self-selected > communities that had been problem for "here's our CVS, here's monthly > dump from it, apply" kind of setups. It _is_ better, since one can > get to commit history (modulo interesting issues with merge nodes and > conflict resolution). But in practice it's not good enough - the patches > going in during a merge (especially for a tree that collects from > secondaries) are not visible enough. And it's too late at that point, > since one has to do something monumentally ugly to get Linus revert > a large merge. On the scale of Great IDE Mess in 2.5... > > linux-next might help with the last part, but I don't think it really > deals with the first one. It certainly helps to some extent, but... > > We need higher S/N on l-k. We need people looking into the subsystem > trees as those grow and causing a stench when bad things are found, > with design issues getting brought to l-k if nothing else helps. We > need tree maintainers understanding that review, including out-of-community > one, is needed (the need of testing is generally better understood - I > _hope_). > > We need more people reading the fscking source. Subsystem by subsystem. > Without assumption that code is not broken. With mechanism collating > the questions asked and answers given. Ideally we need growing documentation > of core subsystems and data structures, with explicit goal of helping > reviewers new to an area to find their way around it. 
And yes, I'm > guilty of procrastinating on that - several half-finished pieces on > VFS-related stuff are sitting locally ;-/ > > We need gregkh to get real and stop assuming that two Signed-off-by are > equivalent to "reviewed at least twice", while we are at it ;-) > > We need people to realize that warnings are useful as triage tools - > not as "Ug see warning. Warning bad. Ug fix that line. Warning go away. > Ug changeset count grow. Ug happy.", but as machine-assisted part of > finding confused areas of code. With human combining signals from > different warnings to get statistically useful triage strategies (note > that aforementioned making gcc/sparse/whatnot to STFU by local change > has a lovely potential of distorting those signals and actually _hiding_ > crap code). > > Maybe we need a list a-la linux-arch for tree maintainers to coordinate > stuff - obviously open not only for those. > > We really need to get around to doing triage of remaining stuff in -mm, > BTW - again, guilty for not getting through such on VFS-related stuff > in there. Hopefully linux-next trees will eventually vacuum most of the > pile in... That all sounds good and I expect few would disagree. But if it is to happen, it clearly won't happen by itself, automatically. We will need to force it upon ourselves and the means by which we will do that is process changes. The thing which is being disparaged as "bureaucracy". The steps to be taken are: a) agree that we have a problem b) agree that we need to address it c) identify the day-to-day work practices which will help address it (as you have done) d) identify the process changes which will force us to adopt those practices e) implement those process changes. I have thus far failed to get us past step a). ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection
  2008-04-14 8:04 ` Andrew Morton
@ 2008-04-14 8:30 ` David Miller
  2008-04-14 9:06 ` Christoph Hellwig
  ` (2 more replies)
  2008-04-14 12:08 ` Adrian Bunk
  2008-04-14 14:43 ` Arjan van de Ven
  2 siblings, 3 replies; 129+ messages in thread
From: David Miller @ 2008-04-14 8:30 UTC (permalink / raw)
  To: akpm
  Cc: viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks,
      lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev

From: Andrew Morton <akpm@linux-foundation.org>
Date: Mon, 14 Apr 2008 01:04:12 -0700

> That all sounds good and I expect few would disagree. But if it is to
> happen, it clearly won't happen by itself, automatically. We will need to
> force it upon ourselves and the means by which we will do that is process
> changes. The thing which is being disparaged as "bureaucracy".
>
> The steps to be taken are:
>
> a) agree that we have a problem
 ...
> I have thus far failed to get us past step a).

A lot of people, myself included, subconsciously don't want to
get past step a) because the resulting "bureaucracy" or whatever
you want to call it is perceived to undercut the very thing
that makes the Linux kernel fun to work on.

It's still largely free form, loose, and flexible. And that's
a notable accomplishment considering how much things have changed.
That feeling is why I got involved in the first place, and I know
it's what gets other new people in and addicted too.

Nobody is "forced" to do anything, and I notice you used the
word "force" in d) :-)

And I realize this relaxed attitude goes hand in hand with reduced
quality and occasionally more bugs. In many ways, I'm happy with that
tradeoff at least wrt. how that works out for the subsystems I'm
responsible for.

We can ask more subsystem tree maintainers to run their trees more
strictly, review patches more closely, etc. But, be honest, good luck
getting that from the guys who do subsystem maintenance in their
spare time on the weekends. The remaining cases should know better,
or simply don't care.
* Re: Reporting bugs and bisection
  2008-04-14 8:30 ` David Miller
@ 2008-04-14 9:06 ` Christoph Hellwig
  2008-04-14 9:46 ` Andi Kleen
  2008-04-14 10:15 ` Andrew Morton
  2 siblings, 0 replies; 129+ messages in thread
From: Christoph Hellwig @ 2008-04-14 9:06 UTC (permalink / raw)
  To: David Miller
  Cc: akpm, viro, w, david, sclark46, johnpol, rjw, tilman,
      Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel,
      git, netdev

On Mon, Apr 14, 2008 at 01:30:58AM -0700, David Miller wrote:
> We can ask more subsystem tree maintainers to run their trees more
> strictly, review patches more closely, etc. But, be honest, good luck
> getting that from the guys who do subsystem maintenance in their
> spare time on the weekends. The remaining cases should know better,
> or simply don't care.

Actually my impression is that spare-time maintainers produce much better
code and subsystem trees than corporate drones. But of course there are
a lot of shades between those two extremes.
* Re: Reporting bugs and bisection
  2008-04-14 8:30 ` David Miller
  2008-04-14 9:06 ` Christoph Hellwig
@ 2008-04-14 9:46 ` Andi Kleen
  2008-04-15 5:25 ` Bill Fink
  2008-04-14 10:15 ` Andrew Morton
  2 siblings, 1 reply; 129+ messages in thread
From: Andi Kleen @ 2008-04-14 9:46 UTC (permalink / raw)
  To: David Miller
  Cc: akpm, viro, w, david, sclark46, johnpol, rjw, tilman,
      Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel,
      git, netdev

David Miller <davem@davemloft.net> writes:
>
> It's still largely free form, loose, and flexible.

I think Al's point was that we need far more "free form, loose and
flexible" work for reviewing code. As in people going over trees and
just checking it for anything suspicious and going over existing code
and checking it for anything suspicious and going also over mailing
list patch posts. And also maintainers who appreciate such review.

And checking it for anything suspicious does not mean running
only checkpatch.pl or even just sparse, but actually reading it
and trying to make sense of it.

I don't see that really as conflicting with your goals.

It would be some more work for the maintainers to handle more
such feedback because they would need to process comments from
such "free form reviewers". Some of them will undoubtedly be wrong
and that will take some time away from processing features (and bugs)
but I suspect it would be still worth it.

On the other hand it would also take some work away from processing
bugs, but as Andrew mentions earlier it looks like significant parts
of the boring areas of bug reports (like getting basic information from
reporter etc.) could be "out-sourced" to bug masters.

And I think being a bug master is an excellent way for someone
who isn't a great coder to contribute in excellent ways to Linux
(far more than someone e.g. running checkpatch.pl ever could)

The challenging thing is also to make sure that the quality
of comments stays high. That means more focus on logic and functionality
than on form. If the reviewer just goes over the coding style or
trivialities I don't think that will improve Linux really.

I think the problem is often that people think kernel code must be
very complicated and they don't even dare try to understand it. But
frankly a lot of the kernel code is not really that complicated
logic wise and also doesn't need too specialized knowledge to understand.
So I am optimistic that there are a lot of people out there who would
be qualified to do some logic review.

Really Linux needs a better "reviewing culture" and also a better
"bug processing culture"

> We can ask more subsystem tree maintainers to run their trees more
> strictly, review patches more closely, etc. But, be honest, good luck
> getting that from the guys who do subsystem maintenance in their
> spare time on the weekends. The remaining cases should know better,
> or simply don't care.

In my experience weekend maintainers tend to be better at sharing out
work. As in they usually (ok there are exceptions) do more work,
including review work, on the mailing lists, while my impression is
that paid-for maintainers tend towards more centralized "cathedral"
tree maintenance. That is, they try to keep everything under control,
and effectively much more stuff goes on in the background out of public
view. But the sharing out of work and less centralization is what
we really want here I think.

Anyways I'm not saying all paid-for maintainers are like this,
but there is certainly a trend I think.

I admit I personally went through both phases in several projects.
When you're really focussed on something it is tempting to do the
"keep things under control" central model, but in the end
it is the wrong way to go.

-Andi
* Re: Reporting bugs and bisection
  2008-04-14 9:46 ` Andi Kleen
@ 2008-04-15 5:25 ` Bill Fink
  0 siblings, 0 replies; 129+ messages in thread
From: Bill Fink @ 2008-04-15 5:25 UTC (permalink / raw)
  To: Andi Kleen
  Cc: David Miller, akpm, viro, w, david, sclark46, johnpol, rjw, tilman,
      Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel,
      git, netdev

On Mon, 14 Apr 2008, Andi Kleen wrote:

> David Miller <davem@davemloft.net> writes:
> >
> > It's still largely free form, loose, and flexible.
>
> I think Al's point was that we need far more "free form, loose and
> flexible" work for reviewing code. As in people going over trees and
> just checking it for anything suspicious and going over existing code
> and checking it for anything suspicious and going also over mailing
> list patch posts. And also maintainers who appreciate such review.
>
> And checking it for anything suspicious does not mean running
> only checkpatch.pl or even just sparse, but actually reading it
> and trying to make sense of it.

If you really want to get more such review, then it would be very
useful when someone asks about some obtuse portion of kernel code
or makes a suggested improvement, that the reviewer then not be
flamed as being dense for not understanding the code or some kernel
coding concept. It would be much better to treat it as an opportunity
to educate rather than belittle, thus eventually enlarging the base
of people who can assist with various aspects of kernel development.
For what's supposed to be an open, engaging community, and which
generally is, there sometimes seems to be some level of dismissal
of newcomers (not sure it's intended that way but nevertheless it
can tend to discourage newcomers from getting more involved).

-Bill
* Re: Reporting bugs and bisection 2008-04-14 8:30 ` David Miller 2008-04-14 9:06 ` Christoph Hellwig 2008-04-14 9:46 ` Andi Kleen @ 2008-04-14 10:15 ` Andrew Morton 2008-04-14 10:41 ` David Miller 2 siblings, 1 reply; 129+ messages in thread From: Andrew Morton @ 2008-04-14 10:15 UTC (permalink / raw) To: David Miller Cc: viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 14 Apr 2008 01:30:58 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: Andrew Morton <akpm@linux-foundation.org> > Date: Mon, 14 Apr 2008 01:04:12 -0700 > > > That all sounds good and I expect few would disagree. But if it is to > > happen, it clearly won't happen by itself, automatically. We will need to > > force it upon ourselves and the means by which we will do that is process > > changes. The thing which is being disparaged as "bureaucracy". > > > > The steps to be taken are: > > > > a) agree that we have a problem > ... > > I have thus far failed to get us past step a). > > A lot of people, myself included, subconsciously don't want to > get past step a) because the resulting "bureaucracy" or whatever > you want to call it is perceived to undercut the very thing > that makes the Linux kernel fun to work on. > > It's still largely free form, loose, and flexible. And that's > a notable accomplishment considering how much things have changed. > That feeling is why I got involved in the first place, and I know > it's what gets other new people in and addicted too. > > Nobody is "forced" to do anything, and I notice you used the > word "force" in d) :-) OK, I was going to let this pass, but I changed my mind. You carefully deleted my text so that you could misquote it, thereby flagrantly misrepresenting everything I said. 
Here it is again: : The steps to be taken are: : : a) agree that we have a problem : : b) agree that we need to address it : : c) identify the day-to-day work practices which will help address it (as : you have done) : : d) identify the process changes which will force us to adopt those practices : : e) implement those process changes. Forcing a discipline upon oneself is totally different from having it forced upon you by someone else. Each step will need general agreement and buyin, otherwise none of it will (or should) work. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 10:15 ` Andrew Morton @ 2008-04-14 10:41 ` David Miller 2008-04-14 17:35 ` Roman Shaposhnik 0 siblings, 1 reply; 129+ messages in thread From: David Miller @ 2008-04-14 10:41 UTC (permalink / raw) To: akpm Cc: viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev From: Andrew Morton <akpm@linux-foundation.org> Date: Mon, 14 Apr 2008 03:15:30 -0700 > You carefully deleted my text so that you could misquote it, thereby > flagrantly misrepresenting everything I said. Not the intention, but anyways: > Here it is again: > > : The steps to be taken are: > : > : a) agree that we have a problem > : > : b) agree that we need to address it > : > : c) identify the day-to-day work practices which will help address it (as > : you have done) > : > : d) identify the process changes which will force us to adopt those practices > : > : e) implement those process changes. > > Forcing a discipline upon oneself is totally different from having it > forced upon you by someone else. > > Each step will need general agreement and buyin, otherwise none of it will > (or should) work. The "force" is to "us" which is a group. And I imagine that newcomers will be expected to adopt these "practices". So in effect, they will be "forced" into the process changes as well. I'm getting more and more sensitive to issues on this level over time, because I realize that the fundamental issue in all human group issues is getting people to "want" to do things. And "force", in any form, tends to be incompatible with "want". And in particular, people will often even shun things they "want" when it is "forced" to them. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 10:41 ` David Miller @ 2008-04-14 17:35 ` Roman Shaposhnik 0 siblings, 0 replies; 129+ messages in thread From: Roman Shaposhnik @ 2008-04-14 17:35 UTC (permalink / raw) To: David Miller Cc: akpm, viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 2008-04-14 at 03:41 -0700, David Miller wrote: > I'm getting more and more sensitive to issues on this level over time, > because I realize that the fundamental issue in all human group issues > is getting people to "want" to do things. And "force", in any form, > tends to be incompatible with "want". And in particular, people will > often even shun things they "want" when it is "forced" to them. Just wanted to add my 2c by mentioning my favorite example of "virtual Tom Sawyering" as far as a tedious review process goes: http://en.wikipedia.org/wiki/Knuth_reward_check Which is also quite cheap too -- AFAIK very few of those have ever been cashed. Thanks, Roman. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 8:04 ` Andrew Morton 2008-04-14 8:30 ` David Miller @ 2008-04-14 12:08 ` Adrian Bunk 2008-04-14 14:43 ` Arjan van de Ven 2 siblings, 0 replies; 129+ messages in thread From: Adrian Bunk @ 2008-04-14 12:08 UTC (permalink / raw) To: Andrew Morton Cc: Al Viro, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, Apr 14, 2008 at 01:04:12AM -0700, Andrew Morton wrote: >... > (And I don't think developers and maintainers _should_ spend time mucking > in bug-tracking systems. They should have helpers who do all the > triaging/tracking/routing/closing work for them, and then provide other > developers with the results, letting them know what they should be spending > time on. But there's a manpower problem). >... Speaking as the one who was for a few years going again and again through all open bugs in the kernel Bugzilla: The manpower problem isn't in handling the bugs in Bugzilla. I'd claim that even if all bugs in the kernel would be reported in the kernel Bugzilla I alone would be able to do all the handling of incoming bugs, bug forwarding and doing all the cleanup stuff like asking submitters whether a bug is still present in the latest kernel. The manpower problem is at the developers and maintainers who could actually debug the problems. One problem are unmaintained areas. Do we have anyone who would debug e.g. APM bugs? And if I want to be really nasty, I'll ask whether we have anyone who understands our floppy driver... ;) And who would debug problems with old and unmaintained drivers, e.g. some old net or SCSI driver? Note that I do not blame James or Jeff or whoever else for the latter - they might simply not have the time to spend a day or two for debugging some obscure problem on some obscure hardware. 
And it could happen everywhere that maintainers simply don't have the time to cope with all incoming bug reports. We have many people who write new bugs^Wcode. But too few people who review code. And too few people willing to maintain the existing code. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 8:04 ` Andrew Morton 2008-04-14 8:30 ` David Miller 2008-04-14 12:08 ` Adrian Bunk @ 2008-04-14 14:43 ` Arjan van de Ven 2008-04-14 17:51 ` Andrew Morton 2 siblings, 1 reply; 129+ messages in thread From: Arjan van de Ven @ 2008-04-14 14:43 UTC (permalink / raw) To: Andrew Morton Cc: Al Viro, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 14 Apr 2008 01:04:12 -0700 > > The steps to be taken are: > > a) agree that we have a problem > I for one do not agree that we have a problem. Based on actual data on oopses (which very clearly excludes other kinds of bugs, so I know I only see part of the story) we are doing reasonably well. Lets look at the 2.6.25 cycle. We got a total of roughly 2700 reports of oopses/warn_ons from users. (This may sound high to those of you only reading lkml, but this includes automatically collected oopses from Fedora 9 beta testers). Out of these 2700, the top 20 issues account for 75% of the total reports. Out of these 20 issues, 9 were from still out of tree drivers (wireless.git and drm.git included in F9). These were caught before they even got close to mainline. The remaining 11 issues can be split in 1) The ones we caught and fixed 2) TCP/IP warnings that DaveM and co are chasing down hard (but have trouble finding reproducers) 3) An EXT3 bug that in theory can cause data corruption, but in practice seems to happen after you yank out a USB stick with an EXT3 filesystem on (so it can't corrupt the disk data). 
Ted is working on this 4) A bug (double free) that hits in the skb layer, probably caused by a bug in the ipv4 code (a first analysis + potential patch was mailed to netdev this weekend) 5) sysfs "existing file added" warning, mostly in the USB stack (gregkh claims he fixed this recently, I'm not entirely sure he got all cases) And when I look beyond the first 20, the same pattern arises, we fixed the majority of the issues before -rc9. At position 25 we have less than 20 reports per bug. At position 35 we have less than 10 reports per bug. At position 50 we have less than 5 reports per bug. Conclusion there: the bugs people actually hit fall off dramatically; there's a core set of issues that gets hit a lot, the rest quickly gets reduced to noise levels. To me this does not sound like we have a huge quality problem because 1) The distribution of the bugs is such that there is a relatively small set of core issues that are widely hit, and then there's a near exponential drop after that 2) We are fixing the important bugs by and large before they hit a release (important as defined by the number of people actually hitting the bug) I'll be writing a report with more details about this soon with more analysis and statistics (I'll be looking at more detail around the top 25 issues, when they got introduced, when they got fixed etc) -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 129+ messages in thread
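The kind of tally described in the message above (what share of all reports the top N issues account for) can be sketched in a few lines of Python. This is only an illustration: the per-issue counts are invented and are not the kerneloops.org data, and `top_n_share` is a hypothetical helper name.

```python
def top_n_share(report_counts, n):
    """Return the fraction of all reports covered by the n most-reported issues."""
    ranked = sorted(report_counts, reverse=True)  # most-reported issue first
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:n]) / total


# Invented per-issue report counts (NOT real kerneloops.org numbers).
counts = [400, 300, 100, 80, 60, 30, 20, 5, 3, 2]

for n in (1, 3, 5):
    print(f"top {n}: {top_n_share(counts, n):.0%} of reports")
```

A steep curve (a handful of issues covering most reports) is the shape the message argues for; a flat one would point the other way. As the replies note, such a tally only sees bugs that produce a backtrace.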
* Re: Reporting bugs and bisection 2008-04-14 14:43 ` Arjan van de Ven @ 2008-04-14 17:51 ` Andrew Morton 2008-04-14 18:24 ` Arjan van de Ven 2008-04-14 19:30 ` Ilpo Järvinen 0 siblings, 2 replies; 129+ messages in thread From: Andrew Morton @ 2008-04-14 17:51 UTC (permalink / raw) To: Arjan van de Ven Cc: Al Viro, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 14 Apr 2008 07:43:49 -0700 Arjan van de Ven <arjan@infradead.org> wrote: > On Mon, 14 Apr 2008 01:04:12 -0700 > > > > The steps to be taken are: > > > > a) agree that we have a problem > > > > > I for one do not agree that we have a problem. > > Based on actual data on oopses (which very clearly excludes other kinds of bugs, so I know I only see part of the story) > we are doing reasonably well. Lets look at the 2.6.25 cycle. > We got a total of roughly 2700 reports of oopses/warn_ons from users. (This may sound high to those of you only reading > lkml, but this includes automatically collected oopses from Fedora 9 beta testers). > Out of these 2700, the top 20 issues account for 75% of the total reports. > > Out of these 20 issues, 9 were from still out of tree drivers (wireless.git and drm.git included in F9). These were > caught before they even got close to mainline. > The remaining 11 issues can be split in > 1) The ones we caught and fixed > 2) TCP/IP warnings that DaveM and co are chasing down hard (but have trouble finding reproducers) > 3) An EXT3 bug that in theory can cause data corruption, but in practice seems to happen after you yank out a USB stick > with an EXT3 filesystem on (so it can't corrupt the disk data). 
Ted is working on this > 4) A bug (double free) that hits in the skb layer, probably caused by a bug in the ipv4 code > (a first analysis + potential patch was mailed to netdev this weekend) > 5) sysfs "existing file added" warning, mostly in the USB stack > (gregkh claims he fixed this recently, I'm not entirely sure he got all cases) > > And when I look beyond the first 20, the same pattern arises, we fixed the majority of the issues before -rc9. > At position 25 we have less than 20 reports per bug. At position 35 we have less than 10 reports per bug. > At position 50 we have less than 5 reports per bug. Conclusion there: the bugs people actually hit fall of dramatically; > there's a core set of issues that gets hit a lot, the rest quickly gets reduced to noise levels. > > > To me this does not sound like we have a huge quality problem because > 1) The distribution of the bugs is such that there is a relatively small set of core issues > that are widely hit, and then there's a near exponential drop after that > 2) We are fixing the important bugs by and large before they hit a release > (important as defined by the number of people actually hitting the bug) > > > > I'll be writing a report with more details about this soon with more analysis and statistics > (I'll be looking at more detail around the top 25 issues, when they got introduced, when they got fixed etc) Well OK. But I don't think we can generalise from oops-causing bugs all the way to all bugs. Very few bugs actually cause oopses, and oopses tend to be the thing which developers will zoom in on and pay attention to. If we had metrics on "time goes backwards" or anything containing "ASUS", things might be different. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 17:51 ` Andrew Morton @ 2008-04-14 18:24 ` Arjan van de Ven 2008-04-14 19:30 ` Ilpo Järvinen 1 sibling, 0 replies; 129+ messages in thread From: Arjan van de Ven @ 2008-04-14 18:24 UTC (permalink / raw) To: Andrew Morton Cc: Al Viro, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 14 Apr 2008 10:51:52 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > Well OK. But I don't think we can generalise from oops-causing bugs including all WARN_ON's and various other kernel backtrace-causing bugs. > all the way to all bugs. Very few bugs actually cause oopses, and > oopses tend to be the thing which developers will zoom in on and pay > attention to. maybe. > > If we had metrics on "time goes backwards" or anything containing > "ASUS", things might be different. Sounds really like we need to add more strategic WARN_ON's and other diagnostics in the kernel to track these issues down. Because another thing that I found so far is that what hits LKML is by far not representative on what happens for users. The most obvious example was the whole input layer refcounting disaster in 2.6.25-rc; this was about 1/3rd of TOTAL reports for a few weeks in a row, but there was hardly an LKML posting for it (in fact there was only 1 half one). We need diagnostics and stuff the kernel spits out so that automated tools can detect these, otherwise we'll very likely not get good information on what is actually wrong with the kernel. 
In case you want to see the 2.6.25-rc data, the top 100 list is at http://www.kerneloops.org/twentyfive.html (I'm still working on annotating the individual items, but since there's 100 that does take time) -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 17:51 ` Andrew Morton 2008-04-14 18:24 ` Arjan van de Ven @ 2008-04-14 19:30 ` Ilpo Järvinen 1 sibling, 0 replies; 129+ messages in thread From: Ilpo Järvinen @ 2008-04-14 19:30 UTC (permalink / raw) To: Andrew Morton, Arjan van de Ven Cc: Al Viro, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, Jeff Garzik, linux-kernel, git, Netdev On Mon, 14 Apr 2008, Andrew Morton wrote: > On Mon, 14 Apr 2008 07:43:49 -0700 Arjan van de Ven <arjan@infradead.org> wrote: > > > I'll be writing a report with more details about this soon with more analysis and statistics > > (I'll be looking at more detail around the top 25 issues, when they got introduced, when they got fixed etc) > > Well OK. But I don't think we can generalise from oops-causing bugs all > the way to all bugs. Very few bugs actually cause oopses, and oopses tend > to be the thing which developers will zoom in on and pay attention to. > > If we had metrics on "time goes backwards" or anything containing "ASUS", > things might be different. Even oopses have pitfalls, like in 25-rcs where those WARN_ON TCP backtraces were due to three different bugs (there might be fourth one still remaining). ...kerneloops.org didn't even make difference between different WARN_ONs in a function though that would have helped only little in the case of 25-rc TCP because of different bugs causing failures in the same invariant. -- i. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 7:23 ` Al Viro 2008-04-14 7:43 ` Al Viro 2008-04-14 8:04 ` Andrew Morton @ 2008-04-14 15:54 ` James Morris 2008-04-14 22:01 ` David Miller 2008-04-15 9:33 ` David Newall 2 siblings, 2 replies; 129+ messages in thread From: James Morris @ 2008-04-14 15:54 UTC (permalink / raw) To: Al Viro Cc: Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 14 Apr 2008, Al Viro wrote: > Real review of code in tree and patches getting into the tree. There is currently little incentive for developers to perform review. It's difficult work, and is generally not rewarded or recognized, except in often quite negative ways. There is a small handful of people who do a lot of review, but they are exceptional in various ways. OTOH, writing code is relatively simple, and is much more highly rewarded: - People tend to get paid to write kernel code, but not so much to review it. - Things like "who made the kernel" statistics and related articles ignore code review. - Creating new features is perceived as the highest form of contribution for general developers, and likely important as career currency (similar to the publish or perish model in the academic world). I don't know how to solve this, but suspect that encouraging the use of reviewed-by and also including it in things like analysis of who is contributing, selection for kernel summit invitations etc. would be a start. At least, better than nothing. - James -- James Morris <jmorris@namei.org> ^ permalink raw reply [flat|nested] 129+ messages in thread
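A rough sketch of the Reviewed-by accounting suggested above, in Python. It assumes the commit messages are already available as plain strings (for instance collected from `git log` output); the function name `count_reviewers` and the sample messages are hypothetical, not an existing tool or real commits.

```python
import re
from collections import Counter

# Matches a "Reviewed-by: Name <email>" trailer sitting on its own line.
TRAILER = re.compile(
    r"^[ \t]*Reviewed-by:[ \t]*(.+?)[ \t]*<[^>]*>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def count_reviewers(messages):
    """Count, per reviewer name, how many commits carry their Reviewed-by tag."""
    counts = Counter()
    for msg in messages:
        # Credit a reviewer once per commit, even if the tag is repeated.
        for name in {m.strip() for m in TRAILER.findall(msg)}:
            counts[name] += 1
    return counts


# Hypothetical commit messages, for illustration only.
sample = [
    "fix foo\n\nSigned-off-by: A Dev <a@example.org>\n"
    "Reviewed-by: R One <r1@example.org>\n",
    "add bar\n\nReviewed-by: R One <r1@example.org>\n"
    "Reviewed-by: R Two <r2@example.org>\n",
]

for name, n in count_reviewers(sample).most_common():
    print(f"{n:4d}  {name}")
```

Counting tags says nothing about how thorough each review was, so a number like this could at most supplement the existing commit statistics.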
* Re: Reporting bugs and bisection 2008-04-14 15:54 ` James Morris @ 2008-04-14 22:01 ` David Miller 2008-04-14 23:05 ` Andrew Morton 2008-04-15 9:33 ` David Newall 1 sibling, 1 reply; 129+ messages in thread From: David Miller @ 2008-04-14 22:01 UTC (permalink / raw) To: jmorris Cc: viro, akpm, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev From: James Morris <jmorris@namei.org> Date: Tue, 15 Apr 2008 01:54:00 +1000 (EST) > - Things like "who made the kernel" statistics and related articles ignore > code review. Note the apparent irony in that the person who ends up often on the top of those lists, Al Viro, is also someone who also does a significant amount of code review. I think this is no accident. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 22:01 ` David Miller @ 2008-04-14 23:05 ` Andrew Morton 2008-04-15 4:55 ` Willy Tarreau 0 siblings, 1 reply; 129+ messages in thread From: Andrew Morton @ 2008-04-14 23:05 UTC (permalink / raw) To: David Miller Cc: jmorris, viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 14 Apr 2008 15:01:05 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: James Morris <jmorris@namei.org> > Date: Tue, 15 Apr 2008 01:54:00 +1000 (EST) > > > - Things like "who made the kernel" statistics and related articles ignore > > code review. > > Note the apparent irony in that the person who ends up often on the > top of those lists, Al Viro, is also someone who also does a > significant amount of code review. > > I think this is no accident. "who made the kernel" was an interesting and useful exercise, but if you like irony then... - The way to boost your commit count is to submit buggy patches and to then fix your own bugs. - The way to lower your commit count is to fix things in other people's patches, then fold your fix into the base patch. I've lost over 1000 commits that way. Unless they are counting '^ [akpm' as a commit. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 23:05 ` Andrew Morton @ 2008-04-15 4:55 ` Willy Tarreau 2008-04-15 13:18 ` Work WAS(Re: " jamal 0 siblings, 1 reply; 129+ messages in thread From: Willy Tarreau @ 2008-04-15 4:55 UTC (permalink / raw) To: Andrew Morton Cc: David Miller, jmorris, viro, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, Apr 14, 2008 at 04:05:13PM -0700, Andrew Morton wrote: > On Mon, 14 Apr 2008 15:01:05 -0700 (PDT) > David Miller <davem@davemloft.net> wrote: > > > From: James Morris <jmorris@namei.org> > > Date: Tue, 15 Apr 2008 01:54:00 +1000 (EST) > > > > > - Things like "who made the kernel" statistics and related articles ignore > > > code review. > > > > Note the apparent irony in that the person who ends up often on the > > top of those lists, Al Viro, is also someone who also does a > > significant amount of code review. > > > > I think this is no accident. > > "who made the kernel" was an interesting and useful exercise, but if you > like irony then... > > - The way to boost your commit count is to submit buggy patches and to > then fix your own bugs. > > - The way to lower your commit count is to fix things in other people's > patches, then fold your fix into the base patch. I've lost over 1000 > commits that way. Unless they are counting '^ [akpm' as a commit. And if Dave speaks about these stats : http://lwn.net/Articles/237768/ then Al does not even appear in it, which proves your point. Willy ^ permalink raw reply [flat|nested] 129+ messages in thread
* Work WAS(Re: Reporting bugs and bisection 2008-04-15 4:55 ` Willy Tarreau @ 2008-04-15 13:18 ` jamal 0 siblings, 0 replies; 129+ messages in thread From: jamal @ 2008-04-15 13:18 UTC (permalink / raw) To: Willy Tarreau Cc: Andrew Morton, David Miller, jmorris, viro, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Tue, 2008-15-04 at 06:55 +0200, Willy Tarreau wrote: > And if Dave speaks about these stats : http://lwn.net/Articles/237768/ > then Al does not even appear in it, which proves your point. Stats such as those above, while useful, are flawed. IMO James Morris has (probably more than anybody else) hit on the core issue. To extend his view: there's more than just code review that deserves respect. Testing is one. Commenting, not necessarily on code, but on architecture is another. Documenting. Yes, running sparse or even Lindent or checkpatch. In the old/current Linux thinking (pun intended) work equates to churning code. That thought process derives from Linus actually then propagates down stream to other folks. I think the Linus approach is still excellent - but its definition of "work" is no longer valid. Work must include all these other things and visible credit is important if the revolution is to continue. If you look at it from a software engineering or production resource management, the Linux development model has gotta be one of the most inefficient[1] - with a reward system geared to developers mostly. If you want to look at it from an investment of time (ROI perspective), developers get way too much credit riding on everybody else's back. Why should Mark Lord report another bug to us? Put yourself in his shoes: - he is a clever guy who has already worked around the bug. So a proper fix is only a convenience for him. - Blessed as he was - he got to do more and more work after reporting. 
- he got slapped for claiming he had to go and get lunch and therefore didn't have time to do more bisect for a bug that wasn't just unique to his setup. - he spent a gazillion electrons responding to people and justifying his stance - he got no credit for his time whatsoever when the bug was fixed (he won't be showing up on lwn list). I think perspective and credit for people's time needs to change. cheers, jamal [1] With current momentum, there are infinite resources of developers and testers and documenters in Linux, i.e. resource management is only valid as a metric if you had finite resources. So the point I am making is moot - but I do strongly believe the momentum will dampen if the current trend of defining work continues. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 15:54 ` James Morris 2008-04-14 22:01 ` David Miller @ 2008-04-15 9:33 ` David Newall 2008-04-15 9:54 ` Michael Kerrisk 2008-04-16 12:15 ` Sverre Rabbelier 1 sibling, 2 replies; 129+ messages in thread From: David Newall @ 2008-04-15 9:33 UTC (permalink / raw) To: James Morris Cc: Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev James Morris wrote: > I don't know how to solve this, but suspect that encouraging the use of > reviewed-by and also including it in things like analysis of who is > contributing, selection for kernel summit invitations etc. would be a > start. At least, better than nothing. Would it be hard to keep count of the number of errors introduced by author and reviewer? ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-15 9:33 ` David Newall @ 2008-04-15 9:54 ` Michael Kerrisk 2008-04-15 14:04 ` David Newall 2008-04-16 12:15 ` Sverre Rabbelier 1 sibling, 1 reply; 129+ messages in thread From: Michael Kerrisk @ 2008-04-15 9:54 UTC (permalink / raw) To: David Newall Cc: James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On 4/15/08, David Newall <davidn@davidnewall.com> wrote: > James Morris wrote: > > I don't know how to solve this, but suspect that encouraging the use of > > reviewed-by and also including it in things like analysis of who is > > contributing, selection for kernel summit invitations etc. would be a > > start. At least, better than nothing. > > Would it be hard to keep count of the number of errors introduced by > author and reviewer? I've found quite a few errors in kernel-userland APIs, but I'm not sure that this sort of negative statistic would be helpful -- e.g., more productive developers probably also introduce more errors. -- I'll likely only see replies if they are CCed to mtk.manpages at gmail dot com ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-15 9:54 ` Michael Kerrisk @ 2008-04-15 14:04 ` David Newall 2008-04-15 20:51 ` Rafael J. Wysocki 0 siblings, 1 reply; 129+ messages in thread From: David Newall @ 2008-04-15 14:04 UTC (permalink / raw) To: Michael Kerrisk Cc: James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev Michael Kerrisk wrote: > On 4/15/08, David Newall <davidn@davidnewall.com> wrote: > >> James Morris wrote: >> > I don't know how to solve this, but suspect that encouraging the use of >> > reviewed-by and also including it in things like analysis of who is >> > contributing, selection for kernel summit invitations etc. would be a >> > start. At least, better than nothing. >> >> Would it be hard to keep count of the number of errors introduced by >> author and reviewer? >> > > I've found quite a few errors in kernel-userland APIs, but I'm not > sure that this sort of negative statistic would be helpful -- e.g., > more productive developers probably also introduce more errors. We can already see which developers are more active. What we can't see is who is careless, which would be useful to know. It would also be useful to know who is careless in approving changes, because they share responsibility for those changes. It would be a good thing if this highlighted that some people are behind frequent buggy changes. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-15 14:04 ` David Newall @ 2008-04-15 20:51 ` Rafael J. Wysocki 2008-04-16 2:34 ` David Newall 0 siblings, 1 reply; 129+ messages in thread From: Rafael J. Wysocki @ 2008-04-15 20:51 UTC (permalink / raw) To: David Newall Cc: Michael Kerrisk, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Tuesday, 15 of April 2008, David Newall wrote: > Michael Kerrisk wrote: > > On 4/15/08, David Newall <davidn@davidnewall.com> wrote: > > > >> James Morris wrote: > >> > I don't know how to solve this, but suspect that encouraging the use of > >> > reviewed-by and also including it in things like analysis of who is > >> > contributing, selection for kernel summit invitations etc. would be a > >> > start. At least, better than nothing. > >> > >> Would it be hard to keep count of the number of errors introduced by > >> author and reviewer? > >> > > > > I've found quite a few errors in kernel-userland APIs, but I'm not > > sure that this sort of negative statistic would be helpful -- e.g., > > more productive developers probably also introduce more errors. > > We can already see which developers are more active. What we can't see > is who is careless, which would be useful to know. It would also be > useful to know who is careless in approving changes, because they share > responsibility for those changes. It would be a good thing if this > highlighted that some people are behind frequent buggy changes. Well, even if someone introduces bugs relatively frequently, but then also works with the reporters and fixes the bugs timely, it's about okay IMO. The real problem is when patch submitters don't care for their changes any more once the patches have been merged. Thanks, Rafael ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-15 20:51 ` Rafael J. Wysocki @ 2008-04-16 2:34 ` David Newall 2008-04-16 3:53 ` david 2008-04-16 4:29 ` Willy Tarreau 0 siblings, 2 replies; 129+ messages in thread From: David Newall @ 2008-04-16 2:34 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Michael Kerrisk, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev Rafael J. Wysocki wrote: > Well, even if someone introduces bugs relatively frequently, but then also > works with the reporters and fixes the bugs timely, it's about okay IMO. > This really is not okay. Even if bugs are fixed a version or two later, the impact those bugs have on users makes the system look bad and drives them away. We do not, I believe, want Linux to top the list for "most bugs". It's unprofessional, unreliable and quite undesirable. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 2:34 ` David Newall @ 2008-04-16 3:53 ` david 2008-04-16 9:06 ` David Newall 2008-04-16 12:41 ` Stephen Clark 2008-04-16 4:29 ` Willy Tarreau 1 sibling, 2 replies; 129+ messages in thread From: david @ 2008-04-16 3:53 UTC (permalink / raw) To: David Newall Cc: Rafael J. Wysocki, Michael Kerrisk, James Morris, Al Viro, Andrew Morton, Willy Tarreau, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Wed, 16 Apr 2008, David Newall wrote: > Rafael J. Wysocki wrote: >> Well, even if someone introduces bugs relatively frequently, but then also >> works with the reporters and fixes the bugs timely, it's about okay IMO. >> > This really is not okay. Even if bugs are fixed a version or two later, > the impact those bugs have on users makes the system look bad and drives > them away. We do not, I believe, want Linux to top the list for "most > bugs". It's unprofessional, unreliable and quite undesirable. timely frequently means the code was merged in -rc1/2 and was fixed before the final release of the same version. given the huge variety of hardware and workloads, it's just too easy for there to be cases where any trade-off you make (code size, performance, memory usage, common case definitions) can turn around and bite you. In addition frequently hardware doesn't work quite the way the design specs say that it should (completely ignoring the fact that many drivers are reverse engineered). 
what's most important is that when a case shows up it gets addressed promptly. I'd rather have a developer/maintainer who introduces 100 bugs, but fixes them promptly, as opposed to one who only introduces one bug, but refuses to consider fixing the code 'because they don't make mistakes like that' (sadly a common attitude from people who produce very good code much of the time). best of all is a developer/maintainer who writes very good code and is willing to accept the fact that they make mistakes and fixes the code promptly, but those people are extremely rare, and usually they emerge from the pool of people who make more mistakes and fix them promptly, which is an added reason I'm more tolerant of that group. David Lang ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 3:53 ` david @ 2008-04-16 9:06 ` David Newall 2008-04-16 11:02 ` Andi Kleen 2008-04-16 12:41 ` Stephen Clark 1 sibling, 1 reply; 129+ messages in thread From: David Newall @ 2008-04-16 9:06 UTC (permalink / raw) To: david Cc: Rafael J. Wysocki, Michael Kerrisk, James Morris, Al Viro, Andrew Morton, Willy Tarreau, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev david@lang.hm wrote: > I'd rather have a developer/maintainer who introduces and fixed 100 > bug, but fixes them promptly, And I'd rather be able to see that that person introduced 100 bugs than to have no idea. As has been said before, the current situation rewards people for sloppy work. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 9:06 ` David Newall @ 2008-04-16 11:02 ` Andi Kleen 0 siblings, 0 replies; 129+ messages in thread From: Andi Kleen @ 2008-04-16 11:02 UTC (permalink / raw) To: David Newall Cc: david, Rafael J. Wysocki, Michael Kerrisk, James Morris, Al Viro, Andrew Morton, Willy Tarreau, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev David Newall <davidn@davidnewall.com> writes: > > And I'd rather be able to see that that person introduced 100 bugs than > to have no idea. As has been said before, the current situation rewards > people for sloppy work. A common issue in the kernel is code that works with a wide range of hardware and firmware of varying quality. The original code is written to spec but then in the real world the hardware and firmware have all kinds of interesting quirks not quite matching the spec that need additional updates to handle. I don't think it's fair to say in this case the original developer was sloppy. Then there is also code which is just hard to tune. Examples for this are the CPU scheduler and the VM, but also other areas. They have to handle a lot of different workloads with often subtle side effects. Lots of people have put a lot of excellent work into tuning these subsystems as users report issues with their workloads. Would you say the original developers were sloppy? I don't think that would be a fair description. Some problems are just hard and need many iterations to get right. And then often also the requirements change over time and need further updates. There are more such examples in the kernel. Grading programmers is a hard problem and I don't think the software industry has really solved it so far, even though there was a lot of effort trying to do it over several decades. I doubt it will be solved for the Linux kernel either. -Andi ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 3:53 ` david 2008-04-16 9:06 ` David Newall @ 2008-04-16 12:41 ` Stephen Clark 1 sibling, 0 replies; 129+ messages in thread From: Stephen Clark @ 2008-04-16 12:41 UTC (permalink / raw) To: david Cc: David Newall, Rafael J. Wysocki, Michael Kerrisk, James Morris, Al Viro, Andrew Morton, Willy Tarreau, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev david@lang.hm wrote: > On Wed, 16 Apr 2008, David Newall wrote: > >> Rafael J. Wysocki wrote: >>> Well, even if someone introduces bugs relatively frequently, but then >>> also >>> works with the reporters and fixes the bugs timely, it's about okay IMO. >>> >> This really is not okay. Even if bugs are fixed a version or two later, >> the impact those bugs have on users makes the system look bad and drives >> them away. We do not, I believe, want Linux to top the list for "most >> bugs". It's unprofessional, unreliable and quite undesirable. > > timely frequently means the code was merged in -rc1/2 and was fixed > before the final release of the same version. > > given the huge variety of hardware and workloads, it's just too easy for > there to be cases where any trade-off you make (code size, performance, > memory usage, common case definitions) can turn around and bite you. In > addition frequently hardware doesn't work quite the way the design specs > say that it should (completely ignoring the fact that many drivers are > reverse engineered). 
what's most important is that when a case shows up > it gets addressed promptly > > I'd rather have a developer/maintainer who introduces and fixed 100 bug, > but fixes them promptly, as opposed to one who only introduces one bug, > but refuses to consider fixing the code 'because they don't make > mistakes like that' (sadly a common attitude from people who produce > very good code much of the time) > > best of all is a developer/maintainer who writes very good code and is > willing to accept the fact that they make mistakes and fixes the code > promptly, but those people are extremely rare, and usually they emerge > from the pool of people who make more mistakes and fix them promptly, > which is an added reason I'm more tolerant of that group. > > David Lang > Having been a Linux user since the late 90's the problem I see is that developers decide to re-design stuff that is already working and then things that used to work don't work anymore. Libata is a good example. I had an older laptop that eventually got working again - but the old ide stuff wasn't studied enough to find out what had to be brought forward and supported in libata. Regards, Steve -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 2:34 ` David Newall 2008-04-16 3:53 ` david @ 2008-04-16 4:29 ` Willy Tarreau 2008-04-16 12:13 ` Rafael J. Wysocki 1 sibling, 1 reply; 129+ messages in thread From: Willy Tarreau @ 2008-04-16 4:29 UTC (permalink / raw) To: David Newall Cc: Rafael J. Wysocki, Michael Kerrisk, James Morris, Al Viro, Andrew Morton, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Wed, Apr 16, 2008 at 12:04:59PM +0930, David Newall wrote: > Rafael J. Wysocki wrote: > > Well, even if someone introduces bugs relatively frequently, but then also > > works with the reporters and fixes the bugs timely, it's about okay IMO. > > > This really is not okay. Even if bugs are fixed a version or two later, > the impact those bugs have on users makes the system look bad and drives > them away. We do not, I believe, want Linux to top the list for "most > bugs". It's unprofessional, unreliable and quite undesirable. that's what -rc are for, and it's unprofessional to use them in production :-) ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 4:29 ` Willy Tarreau @ 2008-04-16 12:13 ` Rafael J. Wysocki 0 siblings, 0 replies; 129+ messages in thread From: Rafael J. Wysocki @ 2008-04-16 12:13 UTC (permalink / raw) To: Willy Tarreau, David Newall Cc: Michael Kerrisk, James Morris, Al Viro, Andrew Morton, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev, Andi Kleen On Wednesday, 16 of April 2008, Willy Tarreau wrote: > On Wed, Apr 16, 2008 at 12:04:59PM +0930, David Newall wrote: > > Rafael J. Wysocki wrote: > > > Well, even if someone introduces bugs relatively frequently, but then also > > > works with the reporters and fixes the bugs timely, it's about okay IMO. > > > > > This really is not okay. Even if bugs are fixed a version or two later, > > the impact those bugs have on users makes the system look bad and drives > > them away. We do not, I believe, want Linux to top the list for "most > > bugs". It's unprofessional, unreliable and quite undesirable. > > that's what -rc are for, and it's unprofessional to use them in production :-) Exactly. And BTW, by saying "timely" I meant "in -rc" or "before the next major release". Thanks, Rafael ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-15 9:33 ` David Newall 2008-04-15 9:54 ` Michael Kerrisk @ 2008-04-16 12:15 ` Sverre Rabbelier 2008-04-16 13:26 ` Adrian Bunk 2008-04-16 21:17 ` Jesper Juhl 1 sibling, 2 replies; 129+ messages in thread From: Sverre Rabbelier @ 2008-04-16 12:15 UTC (permalink / raw) To: git, linux-kernel Cc: James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, netdev, David Newall I'm not subscribed to the kernel mailing list, so please include me in the cc if you don't reply to the git list (which I am subscribed to). Git is participating in Google Summer of Code this year and I've proposed to write a 'git statistics' command. This command would allow the user to gather data about a repository, ranging from "how active is dev x" to "what did x work on in the last 3 weeks". Its main feature however, would be an algorithm that ranks commits as being either 'buggy', 'bugfix' or 'enhancement'. (There are several clues that can aid in determining this, a commit msg along the lines of "fixes ..." being the most obvious.) In the light of this recent discussion, especially the part on "keeping count of the number of errors introduced by author and reviewer?", I thought it might be useful for the kernel mailing list to be aware of this. Also mentioned in this thread was that reviewers don't get enough credits. As long as patches are signed with, say, 'reviewed-by:', 'acked-by:' or 'signed-off-by:' the command I suggest to implement would be able to give more accurate statistics on who "works on the kernel". This way reviewers get the credit they deserve. The knife cuts on both sides of course, if someone reviews a patch that is later determined to introduce a bug, they can be recorded to have acked a buggy commit. 
This is especially interesting in determining who are the good reviewers, but also in determining who are the good contributors. A distinction could be made between parts of the source, say, a maintainer might excel in patches related to driver foo, but when they submit a patch for driver bar it usually contains bugs. Armed with these statistics reviewers might decide to be more careful before acking a patch from that maintainer if it's on driver bar, but when that same maintainer sends in a patch for driver foo it is probably ok and needs less attention. My application, and a more extended description, can be found here: http://alturin.googlepages.com/gsoc2008 I'm interested to know if the community is indeed as interested in my proposal as I hope and if I overlooked any obvious features that would make it an even better command. Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 129+ messages in thread
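The commit-message clue mentioned in this proposal (a message along the lines of "fixes ...") could be sketched roughly as below. This is only an illustration under stated assumptions: the name `classify_message` and both patterns are made up for this example and are not part of the proposal or of git. A message that matches neither pattern is reported as 'unknown' rather than guessed at.

```python
import re

# Hypothetical heuristics, chosen only for illustration.
BUGFIX_PATTERNS = [
    # "fix", "fixes", "fixed" as whole words (so "prefix" does not match)
    re.compile(r"\bfix(es|ed)?\b", re.IGNORECASE),
    # a reference to a kernel bugzilla entry
    re.compile(r"bugzilla\.kernel\.org/show_bug\.cgi\?id=\d+"),
]


def classify_message(message):
    """Return 'bugfix' if the message looks like a fix, else 'unknown'."""
    for pattern in BUGFIX_PATTERNS:
        if pattern.search(message):
            return "bugfix"
    return "unknown"
```

A message such as "Fixes oops in foo" would be labelled 'bugfix', while a short subject that never uses the word "fix" stays 'unknown', which is where message-only heuristics run out.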
* Re: Reporting bugs and bisection 2008-04-16 12:15 ` Sverre Rabbelier @ 2008-04-16 13:26 ` Adrian Bunk 2008-04-16 19:02 ` Andrew Morton ` (2 more replies) 2008-04-16 21:17 ` Jesper Juhl 1 sibling, 3 replies; 129+ messages in thread From: Adrian Bunk @ 2008-04-16 13:26 UTC (permalink / raw) To: sverre Cc: git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, netdev, David Newall On Wed, Apr 16, 2008 at 02:15:22PM +0200, Sverre Rabbelier wrote: > I'm not subscribed to the kernel mailing list, so please include me in > the cc if you don't reply to the git list (which I am subscribed to). > > Git is participating in Google Summer of Code this year and I've > proposed to write a 'git statistics' command. This command would allow > the user to gather data about a repository, ranging from "how active > is dev x" to "what did x work on in the last 3 weeks". It's main > feature however, would be an algorithm that ranks commits as being > either 'buggy', 'bugfix' or 'enhancement'. (There are several clues > that can aid in determining this, a commit msg along the lines of > "fixes ..." being the most obvious.) >... At least with the data we have currently in git it's impossible to figure that out automatically. E.g. if you look at commit f743d04dcfbeda7439b78802d35305781999aa11 (ide/legacy/q40ide.c: add MODULE_LICENSE), how could you determine automatically that it is a bugfix, and the commit that introduced the bug? You can always get some data, but if you want to get usable statistics you need explicit tags in the commits, not some algorithm that tries to guess. > Cheers, > > Sverre Rabbelier cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. 
Buck - Dragon Seed ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 13:26 ` Adrian Bunk @ 2008-04-16 19:02 ` Andrew Morton 2008-04-16 19:43 ` Sverre Rabbelier ` (3 more replies) 2008-04-16 19:39 ` Sverre Rabbelier 2008-04-16 20:04 ` Willy Tarreau 2 siblings, 4 replies; 129+ messages in thread From: Andrew Morton @ 2008-04-16 19:02 UTC (permalink / raw) To: Adrian Bunk Cc: sverre, git, linux-kernel, jmorris, viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, davem, jesper.juhl, yoshfuji, jeff, netdev, davidn On Wed, 16 Apr 2008 16:26:34 +0300 Adrian Bunk <bunk@kernel.org> wrote: > On Wed, Apr 16, 2008 at 02:15:22PM +0200, Sverre Rabbelier wrote: > > I'm not subscribed to the kernel mailing list, so please include me in > > the cc if you don't reply to the git list (which I am subscribed to). > > > > Git is participating in Google Summer of Code this year and I've > > proposed to write a 'git statistics' command. This command would allow > > the user to gather data about a repository, ranging from "how active > > is dev x" to "what did x work on in the last 3 weeks". It's main > > feature however, would be an algorithm that ranks commits as being > > either 'buggy', 'bugfix' or 'enhancement'. (There are several clues > > that can aid in determining this, a commit msg along the lines of > > "fixes ..." being the most obvious.) > >... Sounds like an interesting project. > At least with the data we have currently in git it's impossible to > figure that out automatically. > > E.g. if you look at commit f743d04dcfbeda7439b78802d35305781999aa11 > (ide/legacy/q40ide.c: add MODULE_LICENSE), how could you determine > automatically that it is a bugfix, and the commit that introduced > the bug? > > You can always get some data, but if you want to get usable statistics > you need explicit tags in the commits, not some algorithm that tries > to guess. Well yes. 
One outcome of the project would be to tell us what changes we'd need to make to our processes to make such data gathering more effective. Of course, we may not actually implement such changes. That would depend upon how useful the output is to us. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 19:02 ` Andrew Morton @ 2008-04-16 19:43 ` Sverre Rabbelier 2008-04-16 19:55 ` Adrian Bunk ` (2 subsequent siblings) 3 siblings, 0 replies; 129+ messages in thread From: Sverre Rabbelier @ 2008-04-16 19:43 UTC (permalink / raw) To: Andrew Morton Cc: Adrian Bunk, git, linux-kernel, jmorris, viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, davem, jesper.juhl, yoshfuji, jeff, netdev, davidn On Wed, Apr 16, 2008 at 9:02 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > Sounds like an interesting project. Thank you :). > Well yes. One outcome of the project would be to tell us what changes we'd > need to make to our processes to make such data gathering more effective. I definitely agree here, the command's reliability could be increased by always specifying bugfixes in a certain way. 'fixed-bug:' for example should be very recognizable. > Of course, we may not actually implement such changes. That would depend > upon how useful the output is to us. Ah yes, free will and whatnot. Then again, everybody already does 'signed-off-by:', if there's an easy command in git to mark a bugfix, it would increase the odds of people using it. Perhaps with something like 'git commit -b 10256', which would then automagically append a predefined message to the commit, users would feel more inclined? Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 129+ messages in thread
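If bugfixes were marked with an explicit line such as the 'fixed-bug:' example suggested here, picking those markers out of a commit message would be straightforward. The sketch below is a minimal illustration, not an existing git feature: `extract_trailers` is a hypothetical helper, and 'fixed-bug' is only the tag name floated in this message.

```python
def extract_trailers(message,
                     keys=("signed-off-by", "reviewed-by", "acked-by", "fixed-bug")):
    """Collect 'Key: value' lines from a commit message for the given keys.

    The key list is an assumption for this sketch; 'fixed-bug' is the
    hypothetical tag proposed in the thread, not an established convention.
    """
    found = {}
    for line in message.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip().lower()
        if sep and key in keys:
            found.setdefault(key, []).append(value.strip())
    return found
```

For a message ending in "Fixed-bug: 10256" and a "Signed-off-by:" line, this returns both values keyed by the lowercased tag name, so a statistics tool would not need to guess.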
* Re: Reporting bugs and bisection 2008-04-16 19:02 ` Andrew Morton 2008-04-16 19:43 ` Sverre Rabbelier @ 2008-04-16 19:55 ` Adrian Bunk 2008-04-17 13:50 ` J. Bruce Fields 2008-04-16 19:58 ` Alexey Dobriyan 2008-04-16 20:01 ` Arjan van de Ven 3 siblings, 1 reply; 129+ messages in thread From: Adrian Bunk @ 2008-04-16 19:55 UTC (permalink / raw) To: Andrew Morton Cc: sverre, git, linux-kernel, jmorris, viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, davem, jesper.juhl, yoshfuji, jeff, netdev, davidn On Wed, Apr 16, 2008 at 12:02:47PM -0700, Andrew Morton wrote: > On Wed, 16 Apr 2008 16:26:34 +0300 > Adrian Bunk <bunk@kernel.org> wrote: > > > On Wed, Apr 16, 2008 at 02:15:22PM +0200, Sverre Rabbelier wrote: > > > I'm not subscribed to the kernel mailing list, so please include me in > > > the cc if you don't reply to the git list (which I am subscribed to). > > > > > > Git is participating in Google Summer of Code this year and I've > > > proposed to write a 'git statistics' command. This command would allow > > > the user to gather data about a repository, ranging from "how active > > > is dev x" to "what did x work on in the last 3 weeks". It's main > > > feature however, would be an algorithm that ranks commits as being > > > either 'buggy', 'bugfix' or 'enhancement'. (There are several clues > > > that can aid in determining this, a commit msg along the lines of > > > "fixes ..." being the most obvious.) > > >... > > Sounds like an interesting project. > > > At least with the data we have currently in git it's impossible to > > figure that out automatically. > > > > E.g. if you look at commit f743d04dcfbeda7439b78802d35305781999aa11 > > (ide/legacy/q40ide.c: add MODULE_LICENSE), how could you determine > > automatically that it is a bugfix, and the commit that introduced > > the bug? 
> > > > You can always get some data, but if you want to get usable statistics > > you need explicit tags in the commits, not some algorithm that tries > > to guess. > > Well yes. One outcome of the project would be to tell us what changes we'd > need to make to our processes to make such data gathering more effective. > > Of course, we may not actually implement such changes. That would depend > upon how useful the output is to us. That you can add this information through tags is clear, but according to his SoC application that's not what he wants to do. According to his application he wants to determine automatically whether a commit was a fix or whether a commit introduced a bug by doing stuff like tracking whether a changed line was modified again shortly after a commit. This plan of his will simply not result in accurate numbers. Sure, you will get some numbers, but if anyone would e.g. wrongly accuse me that 2% of my commits last year introduced bugs I would get ***really*** angry. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 19:55 ` Adrian Bunk @ 2008-04-17 13:50 ` J. Bruce Fields 2008-04-17 15:26 ` Adrian Bunk 0 siblings, 1 reply; 129+ messages in thread From: J. Bruce Fields @ 2008-04-17 13:50 UTC (permalink / raw) To: Adrian Bunk Cc: Andrew Morton, sverre, git, linux-kernel, jmorris, viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, davem, jesper.juhl, yoshfuji, jeff, netdev, davidn On Wed, Apr 16, 2008 at 10:55:03PM +0300, Adrian Bunk wrote: > On Wed, Apr 16, 2008 at 12:02:47PM -0700, Andrew Morton wrote: > > On Wed, 16 Apr 2008 16:26:34 +0300 > > Adrian Bunk <bunk@kernel.org> wrote: > > > > > On Wed, Apr 16, 2008 at 02:15:22PM +0200, Sverre Rabbelier wrote: > > > > I'm not subscribed to the kernel mailing list, so please include me in > > > > the cc if you don't reply to the git list (which I am subscribed to). > > > > > > > > Git is participating in Google Summer of Code this year and I've > > > > proposed to write a 'git statistics' command. This command would allow > > > > the user to gather data about a repository, ranging from "how active > > > > is dev x" to "what did x work on in the last 3 weeks". It's main > > > > feature however, would be an algorithm that ranks commits as being > > > > either 'buggy', 'bugfix' or 'enhancement'. (There are several clues > > > > that can aid in determining this, a commit msg along the lines of > > > > "fixes ..." being the most obvious.) > > > >... > > > > Sounds like an interesting project. > > > > > At least with the data we have currently in git it's impossible to > > > figure that out automatically. > > > > > > E.g. if you look at commit f743d04dcfbeda7439b78802d35305781999aa11 > > > (ide/legacy/q40ide.c: add MODULE_LICENSE), how could you determine > > > automatically that it is a bugfix, and the commit that introduced > > > the bug? 
> > > > > > You can always get some data, but if you want to get usable statistics > > > you need explicit tags in the commits, not some algorithm that tries > > > to guess. > > > > Well yes. One outcome of the project would be to tell us what changes we'd > > need to make to our processes to make such data gathering more effective. > > > > Of course, we may not actually implement such changes. That would depend > > upon how useful the output is to us. > > That you can add this information through tags is clear, but according > to his SoC application that's not what he wants to do. > > According to his application he wants to determine automatically whether > a commit was a fix or whether a commit introduced a bug by doing stuff > like tracking whether a changed line was modified again shortly after a > commit. > > This plan of him will simply not result in accurate numbers. They won't be completely accurate, but who knows, maybe they'd turn out to have a higher rate of accuracy than we'd expect. (I assume you could do a closer manual study of a small random sample of the results to estimate the accuracy.) Seems worth a try. > Sure, you will get some numbers, but if anyone would e.g. wrongly accuse > me that 2% of my commits last year introduced bugs I would get > ***really*** angry. It's just an experiment; reasonable people won't take it as the final word. --b. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-17 13:50 ` J. Bruce Fields @ 2008-04-17 15:26 ` Adrian Bunk 0 siblings, 0 replies; 129+ messages in thread From: Adrian Bunk @ 2008-04-17 15:26 UTC (permalink / raw) To: J. Bruce Fields Cc: Andrew Morton, sverre, git, linux-kernel, jmorris, viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, davem, jesper.juhl, yoshfuji, jeff, netdev, davidn On Thu, Apr 17, 2008 at 09:50:13AM -0400, J. Bruce Fields wrote: > On Wed, Apr 16, 2008 at 10:55:03PM +0300, Adrian Bunk wrote: >... > > Sure, you will get some numbers, but if anyone would e.g. wrongly accuse > > me that 2% of my commits last year introduced bugs I would get > > ***really*** angry. > > It's just an experiment; reasonable people won't take it as the final > word. Take e.g. [1] as an example how git statistics about the Linux kernel are already used to "prove" things that aren't true. > --b. cu Adrian [1] http://digitalvampire.org/blog/index.php/2008/04/11/lies-d-oh-forget-it/ -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 19:02 ` Andrew Morton 2008-04-16 19:43 ` Sverre Rabbelier 2008-04-16 19:55 ` Adrian Bunk @ 2008-04-16 19:58 ` Alexey Dobriyan 2008-04-16 20:01 ` Arjan van de Ven 3 siblings, 0 replies; 129+ messages in thread From: Alexey Dobriyan @ 2008-04-16 19:58 UTC (permalink / raw) To: Andrew Morton Cc: Adrian Bunk, sverre, git, linux-kernel, jmorris, viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, davem, jesper.juhl, yoshfuji, jeff, netdev, davidn On Wed, Apr 16, 2008 at 12:02:47PM -0700, Andrew Morton wrote: > On Wed, 16 Apr 2008 16:26:34 +0300 > Adrian Bunk <bunk@kernel.org> wrote: > > > On Wed, Apr 16, 2008 at 02:15:22PM +0200, Sverre Rabbelier wrote: > > > I'm not subscribed to the kernel mailing list, so please include me in > > > the cc if you don't reply to the git list (which I am subscribed to). > > > > > > Git is participating in Google Summer of Code this year and I've > > > proposed to write a 'git statistics' command. This command would allow > > > the user to gather data about a repository, ranging from "how active > > > is dev x" to "what did x work on in the last 3 weeks". These are pointy-hairy questions. > > > It's main > > > feature however, would be an algorithm that ranks commits as being > > > either 'buggy', 'bugfix' or 'enhancement'. (There are several clues > > > that can aid in determining this, a commit msg along the lines of > > > "fixes ..." being the most obvious.) > > >... > > Sounds like an interesting project. The interesting (and answerable) questions are: 1) How many bugs one non-merge commit brings on average 2) What is average time between buggy commit entering Linus's tree and fix entering the same tree. 3) Graphs of #1 and #2 over time. 4) rough division of bugs a-la refcounting, locking, hw, hw workaround. 5) if other OS have such statistics, comparison with them (little finger for this) #1 alone can shred OSDL and LWN induced PDFs into innumerable pieces! 
^ permalink raw reply [flat|nested] 129+ messages in thread
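Question 2 in the list above (the average time between a buggy commit entering the tree and its fix entering the same tree) is simple arithmetic once the pairs are known. The sketch below assumes such pairs are already available as timestamps; `average_fix_delay` is a hypothetical name, and how the pairs would be identified is exactly the open question in this thread.

```python
from datetime import datetime, timedelta


def average_fix_delay(pairs):
    """Mean delay between a buggy commit and its fix.

    `pairs` is a list of (introduced, fixed) datetime tuples. Returns a
    timedelta, or None when there is no data to average.
    """
    if not pairs:
        return None
    total = sum((fixed - introduced for introduced, fixed in pairs), timedelta())
    return total / len(pairs)
```

The result is only as good as the pairing that feeds it: a wrongly attributed bug shifts the average just as much as a real one.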
* Re: Reporting bugs and bisection 2008-04-16 19:02 ` Andrew Morton ` (2 preceding siblings ...) 2008-04-16 19:58 ` Alexey Dobriyan @ 2008-04-16 20:01 ` Arjan van de Ven 3 siblings, 0 replies; 129+ messages in thread From: Arjan van de Ven @ 2008-04-16 20:01 UTC (permalink / raw) To: Andrew Morton Cc: Adrian Bunk, sverre, git, linux-kernel, jmorris, viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, davem, jesper.juhl, yoshfuji, jeff, netdev, davidn On Wed, 16 Apr 2008 12:02:47 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > > > At least with the data we have currently in git it's impossible to > > figure that out automatically. > > > > E.g. if you look at commit f743d04dcfbeda7439b78802d35305781999aa11 > > (ide/legacy/q40ide.c: add MODULE_LICENSE), how could you determine > > automatically that it is a bugfix, and the commit that introduced > > the bug? > > > > You can always get some data, but if you want to get usable > > statistics you need explicit tags in the commits, not some > > algorithm that tries to guess. > > Well yes. One outcome of the project would be to tell us what > changes we'd need to make to our processes to make such data > gathering more effective. also.. "what is a bugfix" is an interesting thing... for some things it's very easy. For others.. it's really hard to draw a solid line where bugs stop and features start. (for example, is a missing cpu id in oprofile a bugfix ("oprofile doesn't work") or a feature ("new cpu support"). This one is one of the more simple ones even...) -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 13:26 ` Adrian Bunk 2008-04-16 19:02 ` Andrew Morton @ 2008-04-16 19:39 ` Sverre Rabbelier 2008-04-16 20:16 ` Adrian Bunk 2008-04-16 20:04 ` Willy Tarreau 2 siblings, 1 reply; 129+ messages in thread From: Sverre Rabbelier @ 2008-04-16 19:39 UTC (permalink / raw) To: Adrian Bunk Cc: git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, netdev, David Newall On Wed, Apr 16, 2008 at 3:26 PM, Adrian Bunk <bunk@kernel.org> wrote: > On Wed, Apr 16, 2008 at 02:15:22PM +0200, Sverre Rabbelier wrote: > At least with the data we have currently in git it's impossible to > figure that out automatically. I don't quite agree, as I explained in my proposal there are several ways to detect that a commit was a bugfix. From thereon you can deduce that if it was a bugfix, that the commit that introduced the fixed change was a bug! From thereon you can start sifting and get more confirmations. Junio has made several suggestions as to how this could be implemented and I'm confident that an algorithm can be devised that is at least capable of 'guessing' what type a commit is. Aside from the guessing part I think a lot of information can be gathered from commit msgs. Of course, some commits might not be able to be typed (as there might not be any 'follow up' information on them). Those commits can be marked as 'unknown' and be ignored. Agreed, should all commits be 'unknown' then the command wouldn't be very useful, but especially on large repos there is a very large dataset. As the size of the dataset increases I estimate that the correlation between commits increases (fewer commits that add new code which then is never changed thereafter). The higher the degree of correlation between individual commits the more we can determine about the nature of a commit. > E.g. 
if you look at commit f743d04dcfbeda7439b78802d35305781999aa11 > (ide/legacy/q40ide.c: add MODULE_LICENSE), how could you determine > automatically that it is a bugfix, and the commit that introduced > the bug? Well, a dead giveaway would be: "http://bugzilla.kernel.org/show_bug.cgi?id=10124" > You can always get some data, but if you want to get usable statistics > you need explicit tags in the commits, not some algorithm that tries > to guess. As said above, I don't agree, you can 'guess' very reliably on a large dataset. Also, most commits are already 'tagged' in some way or another. The trick is to find the pattern in this tagging and use it. I hope this clears things up a bit, Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 129+ messages in thread
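A minimal sketch of the kind of commit-message guessing discussed above, assuming only two clues: a "fix"-style word and a bugzilla.kernel.org link. The pattern list and the function name guess_commit_type are hypothetical and are not taken from the proposal. As the replies in this thread point out, a Bugzilla link can just as well be a feature request, so a hit is a hint rather than proof, and messages without any clue stay 'unknown'.

```python
import re

# Hypothetical clue patterns, chosen for illustration only.
BUGFIX_CLUES = [
    re.compile(r"\bfix(es|ed)?\b", re.IGNORECASE),
    re.compile(r"bugzilla\.kernel\.org/show_bug\.cgi\?id=\d+"),
]


def guess_commit_type(message):
    """Return 'bugfix' if any clue matches the message, else 'unknown'."""
    hits = sum(1 for pattern in BUGFIX_CLUES if pattern.search(message))
    return "bugfix" if hits else "unknown"
```

A subject such as "ide/legacy/q40ide.c: add MODULE_LICENSE" on its own carries neither clue and is left as 'unknown'; only the extra link in the message body would change the guess.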
* Re: Reporting bugs and bisection 2008-04-16 19:39 ` Sverre Rabbelier @ 2008-04-16 20:16 ` Adrian Bunk 2008-04-16 20:53 ` Adrian Bunk 0 siblings, 1 reply; 129+ messages in thread From: Adrian Bunk @ 2008-04-16 20:16 UTC (permalink / raw) To: sverre Cc: git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, netdev, David Newall On Wed, Apr 16, 2008 at 09:39:41PM +0200, Sverre Rabbelier wrote: > On Wed, Apr 16, 2008 at 3:26 PM, Adrian Bunk <bunk@kernel.org> wrote: >... > > E.g. if you look at commit f743d04dcfbeda7439b78802d35305781999aa11 > > (ide/legacy/q40ide.c: add MODULE_LICENSE), how could you determine > > automatically that it is a bugfix, and the commit that introduced > > the bug? > > Well, a dead giveaway would be: > "http://bugzilla.kernel.org/show_bug.cgi?id=10124" Which could be "There is no driver for my TV card in the kernel." > > You can always get some data, but if you want to get usable statistics > > you need explicit tags in the commits, not some algorithm that tries > > to guess. > > As said above, I don't agree, you can 'guess' very reliably on a large > dataset. Also, most commits are already 'tagged' in some way or > another. The trick is to find the pattern in this tagging and use it. > > I hope this clears things up a bit, I hope you are aware of the non-technical implications if the results don't match reality? E.g. I am proud that my commits do virtually never introduce bugs, so any results someone publishes about what I do should better be right or my first thoughts are somewhere between "fist" and "lawyer". [1] > Cheers, > > Sverre Rabbelier cu Adrian [1] my actual reaction might only be an angry email, but I hope you get the point that wrong results can really piss off people -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. 
There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 20:16 ` Adrian Bunk @ 2008-04-16 20:53 ` Adrian Bunk 2008-04-16 21:05 ` Sverre Rabbelier 0 siblings, 1 reply; 129+ messages in thread From: Adrian Bunk @ 2008-04-16 20:53 UTC (permalink / raw) To: sverre Cc: git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, netdev, David Newall On Wed, Apr 16, 2008 at 11:16:06PM +0300, Adrian Bunk wrote: >... > E.g. I am proud that my commits do virtually never introduce bugs, so > any results someone publishes about what I do should better be right > or my first thoughts are somewhere between "fist" and "lawyer". [1] >... To avoid any misunderstandings: This is not in any way meant against you personally. But saying things like " X% of your commits introduced bugs" is not a friendly thing, and wrong data could be quite hurting. Especially in the open source world where much motivation comes from people being proud of their work. Even correct data can do harm. And bad data can have really bad effects. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 20:53 ` Adrian Bunk @ 2008-04-16 21:05 ` Sverre Rabbelier 2008-04-16 21:25 ` Adrian Bunk 0 siblings, 1 reply; 129+ messages in thread From: Sverre Rabbelier @ 2008-04-16 21:05 UTC (permalink / raw) To: Adrian Bunk Cc: git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, netdev, David Newall On Wed, Apr 16, 2008 at 10:53 PM, Adrian Bunk <bunk@kernel.org> wrote: > To avoid any misunderstandings: > > This is not in any way meant against you personally. Thanks for pointing it out, I wasn't quite sure, but assumed that :). > But saying things like " X% of your commits introduced bugs" is not a > friendly thing, and wrong data could be quite hurting. Yes, it could be, and I agree that conclusions shouldn't be based on the details, but on the bigger picture. Also, I think it should (at first) be used mainly as an indicator, of where attention might be required. I mean, if it points out that one contributor almost always commits buggy code, you don't have to present them with those statistics right away. Instead you can ask the program where it bases it's conclusions on, and research them yourself. If it does indeed turn out that they are slacking that much you have good ground to have a talk with them. > Especially in the open source world where much motivation comes from > people being proud of their work. Yes, that is very true, I very much agree with that, but on the other hand it might also point out contributors that are particularly skillful in a certain section that was previously not noted. As with all statistics, it's up to interpretation, misinterpreting statistics could -always- have bad effects. > Even correct data can do harm. > > And bad data can have really bad effects. True, both, but as said, if properly interpreted it could be very useful. 
Cheers, Sverre Rabbelier ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 21:05 ` Sverre Rabbelier @ 2008-04-16 21:25 ` Adrian Bunk 0 siblings, 0 replies; 129+ messages in thread From: Adrian Bunk @ 2008-04-16 21:25 UTC (permalink / raw) To: sverre Cc: git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, netdev, David Newall On Wed, Apr 16, 2008 at 11:05:17PM +0200, Sverre Rabbelier wrote: > On Wed, Apr 16, 2008 at 10:53 PM, Adrian Bunk <bunk@kernel.org> wrote: > > To avoid any misunderstandings: > > > > This is not in any way meant against you personally. > > Thanks for pointing it out, I wasn't quite sure, but assumed that :). Sorry, I was overreacting a bit, since too often I see people putting some data into some statistics or graph and drawing conclusions without paying attention to whether their data allows these conclusions at all. > > But saying things like " X% of your commits introduced bugs" is not a > > friendly thing, and wrong data could be quite hurting. > > Yes, it could be, and I agree that conclusions shouldn't be based on > the details, but on the bigger picture. Also, I think it should (at > first) be used mainly as an indicator, of where attention might be > required. I mean, if it points out that one contributor almost always > commits buggy code, I would assume that in all projects the main maintainers already have an impression of how good the quality of the patches of each main contributor is. In much more complex ways than a number could express. > you don't have to present them with those > statistics right away. Instead you can ask the program where it bases > it's conclusions on, and research them yourself. Sooner or later someone will run the program for the Linux kernel, write a paper about the results, and publish his research somewhere. >...
> Cheers, > > Sverre Rabbelier cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-16 13:26 ` Adrian Bunk 2008-04-16 19:02 ` Andrew Morton 2008-04-16 19:39 ` Sverre Rabbelier @ 2008-04-16 20:04 ` Willy Tarreau 2008-04-16 20:55 ` Jakub Narebski 2 siblings, 1 reply; 129+ messages in thread From: Willy Tarreau @ 2008-04-16 20:04 UTC (permalink / raw) To: Adrian Bunk Cc: sverre, git, linux-kernel, James Morris, Al Viro, Andrew Morton, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, netdev, David Newall On Wed, Apr 16, 2008 at 04:26:34PM +0300, Adrian Bunk wrote: > On Wed, Apr 16, 2008 at 02:15:22PM +0200, Sverre Rabbelier wrote: > > I'm not subscribed to the kernel mailing list, so please include me in > > the cc if you don't reply to the git list (which I am subscribed to). > > > > Git is participating in Google Summer of Code this year and I've > > proposed to write a 'git statistics' command. This command would allow > > the user to gather data about a repository, ranging from "how active > > is dev x" to "what did x work on in the last 3 weeks". It's main > > feature however, would be an algorithm that ranks commits as being > > either 'buggy', 'bugfix' or 'enhancement'. (There are several clues > > that can aid in determining this, a commit msg along the lines of > > "fixes ..." being the most obvious.) > >... > > At least with the data we have currently in git it's impossible to > figure that out automatically. > > E.g. if you look at commit f743d04dcfbeda7439b78802d35305781999aa11 > (ide/legacy/q40ide.c: add MODULE_LICENSE), how could you determine > automatically that it is a bugfix, and the commit that introduced > the bug? > > You can always get some data, but if you want to get usable statistics > you need explicit tags in the commits, not some algorithm that tries > to guess. 
yes, and doing that would get back to the bureaucracy some people are trying to reduce in order to save time to do the real work. However, in another project of mine, I've got used to systematically indicating the type of change in the subject line. It does not get any slower for the author, and it appears in shortlogs. And quite amazingly the principle has immediately been adopted by several contributors:

-----
Note to contributors: it's very handy when patches come with a properly
formatted subject. Try to put one of the following words between brackets
to indicate the importance of the patch followed by a short description:

  [MINOR]    minor fix, very low risk of impact
  [MEDIUM]   medium risk, may cause unexpected regressions of low
             importance or which may quickly be discovered
  [MAJOR]    major risk of hidden regression. This happens when I
             rearrange large parts of code, when I play with timeouts,
             with variable initializations, etc...
  [BUG]      fix for a minor or medium-level bug.
  [CRITICAL] medium-term reliability or security is at risk, an upgrade
             is absolutely required.
  [RELEASE]  release a new version
  [BUILD]    fix build issues. If you could build, no upgrade required.
  [CLEANUP]  code cleanup, silence of warnings, etc... theoretically no
             impact
  [TESTS]    added regression testing configuration files or scripts
  [DOC]      documentation updates, no need to upgrade
  [LICENSE]  licensing updates (may impact distro packagers)

Example: "[DOC] document options forwardfor to logasap"
-----

Nothing is mandatory, and I (as the maintainer) can still choose to adjust the prefix if I want. But in fact, I only had to do it when contributors did not classify their patch themselves. Several other tags may be added for LKML, such as "RFC" which is already used, etc... The advantages of this usage are multiple.
Nothing needs to be changed in the tools, no header needs to be added, it's still very compatible with the mailing-list usages (and helps focusing on specific patches), it's absolutely not mandatory and easily tweakable. I'd like people in this thread not to forget that what we need is not a fantastic tool to work around some developers' weaknesses, but cheap (if any) help from the developers to help reviewers. I think that such a proposal falls exactly in this category. I'm quite ready to use it already (though I do not post often), and think that it would still feel natural to many developers since most of them are already used to such a format. I think it just requires a few starters to get most of us to progressively use such a scheme by default. Regards, Willy ^ permalink raw reply [flat|nested] 129+ messages in thread
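A subject-line convention like the one described above is cheap to consume from a script. The following is only a sketch: the tag names are copied from the note to contributors, while the regular expression and the helper name split_subject are assumptions made for illustration and do not come from any existing tool.

```python
import re

# Tag names as listed in the note to contributors above.
KNOWN_TAGS = {
    "MINOR", "MEDIUM", "MAJOR", "BUG", "CRITICAL", "RELEASE",
    "BUILD", "CLEANUP", "TESTS", "DOC", "LICENSE",
}

# One bracketed, upper-case word at the very start of the subject.
SUBJECT_RE = re.compile(r"^\[(?P<tag>[A-Z]+)\]\s*(?P<summary>.*)$")


def split_subject(subject):
    """Return (tag, summary); tag is None when no known tag is present."""
    subject = subject.strip()
    match = SUBJECT_RE.match(subject)
    if match and match.group("tag") in KNOWN_TAGS:
        return match.group("tag"), match.group("summary")
    return None, subject
```

Subjects without a recognised tag are passed through unchanged, which matches the "nothing is mandatory" spirit of the scheme.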
* Re: Reporting bugs and bisection 2008-04-16 20:04 ` Willy Tarreau @ 2008-04-16 20:55 ` Jakub Narebski 0 siblings, 0 replies; 129+ messages in thread From: Jakub Narebski @ 2008-04-16 20:55 UTC (permalink / raw) To: git; +Cc: linux-kernel, netdev

Willy Tarreau wrote:

> Note to contributors: it's very handy when patches come with a properly
> formatted subject. Try to put one of the following words between brackets
> to indicate the importance of the patch followed by a short description:
>
> [MINOR] minor fix, very low risk of impact
> [MEDIUM] medium risk, may cause unexpected regressions of low importance or
> which may quickly be discovered
[...]

And git-am strips such prefixes because of [PATCH] and [PATCH n/m]
which should be stripped.

--
Jakub Narebski Warsaw, Poland ShadeHawk on #git
^ permalink raw reply [flat|nested] 129+ messages in thread
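To illustrate the objection above, here is a rough Python approximation of "strip every leading bracketed string from the subject". It is not git's actual code; the regular expression and the name strip_bracket_prefixes are assumptions made for this sketch. It shows why a [MINOR] or [DOC] tag would disappear together with [PATCH n/m].

```python
import re

# One or more "[...]" groups at the start, with optional surrounding spaces.
LEADING_BRACKETS = re.compile(r"^(?:\s*\[[^\]]*\])+\s*")


def strip_bracket_prefixes(subject):
    """Drop all leading bracketed prefixes; later brackets are kept."""
    return LEADING_BRACKETS.sub("", subject)
```

git am does have a -k (--keep) option that leaves the subject untouched, but that keeps the [PATCH n/m] prefix as well, so it does not by itself separate the two kinds of prefix.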
* Re: Reporting bugs and bisection 2008-04-16 12:15 ` Sverre Rabbelier 2008-04-16 13:26 ` Adrian Bunk @ 2008-04-16 21:17 ` Jesper Juhl 2008-04-17 17:04 ` David Newall 1 sibling, 1 reply; 129+ messages in thread From: Jesper Juhl @ 2008-04-16 21:17 UTC (permalink / raw) To: sverre Cc: git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, yoshfuji, jeff, netdev, David Newall On 16/04/2008, Sverre Rabbelier <alturin@gmail.com> wrote: ... > Git is participating in Google Summer of Code this year and I've > proposed to write a 'git statistics' command. This command would allow > the user to gather data about a repository, ranging from "how active > is dev x" to "what did x work on in the last 3 weeks". It's main > feature however, would be an algorithm that ranks commits as being > either 'buggy', 'bugfix' or 'enhancement'. Interesting. Just be careful results are produced for the big picture and not used to point fingers at individuals. >(There are several clues > that can aid in determining this, a commit msg along the lines of > "fixes ..." being the most obvious.) One thing I thought of is that the more "Acked-by", "Reviewed-by" and "Signed-off-by" lines a patch has, the better reviewed we can probably assume it to be and thus the probability of it having introduced a bug probably drops slightly compared to other less-reviewed patches... or maybe not, but at least it's something to think about :-) -- Jesper Juhl <jesper.juhl@gmail.com> Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html ^ permalink raw reply [flat|nested] 129+ messages in thread
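The review-tag idea above could be measured with something as small as the following sketch. It only counts the three tags named in the message; the helper name count_review_trailers is hypothetical, and whether a higher count really correlates with fewer bugs is exactly the open question raised above.

```python
import re
from collections import Counter

# Match a tag at the start of a line, followed by at least one non-space.
TRAILER_RE = re.compile(
    r"^(Acked-by|Reviewed-by|Signed-off-by):\s*\S", re.MULTILINE
)


def count_review_trailers(message):
    """Count Acked-by / Reviewed-by / Signed-off-by lines in a commit message."""
    return Counter(match.group(1) for match in TRAILER_RE.finditer(message))
```

A Counter returns 0 for tags that never occur, so callers can compare commits without checking for missing keys.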
* Re: Reporting bugs and bisection 2008-04-16 21:17 ` Jesper Juhl @ 2008-04-17 17:04 ` David Newall 2008-04-17 19:09 ` Rafael J. Wysocki 0 siblings, 1 reply; 129+ messages in thread From: David Newall @ 2008-04-17 17:04 UTC (permalink / raw) To: Jesper Juhl Cc: sverre, git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, yoshfuji, jeff, netdev

Jesper Juhl wrote:
> Interesting. Just be careful results are produced for the big picture
> and not used to point fingers at individuals.
>

If there are individuals at whom a finger needs to be pointed, this system will highlight them, and fingers will (and should) be pointed. Contributors of poor-quality code need to be weeded out. Finger-pointing, in these extreme cases, gives incentive to improve quality. It's a positive thing.

^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-17 17:04 ` David Newall @ 2008-04-17 19:09 ` Rafael J. Wysocki 2008-04-17 19:35 ` Ray Lee 0 siblings, 1 reply; 129+ messages in thread From: Rafael J. Wysocki @ 2008-04-17 19:09 UTC (permalink / raw) To: David Newall Cc: Jesper Juhl, sverre, git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, yoshfuji, jeff, netdev On Thursday, 17 of April 2008, David Newall wrote: > Jesper Juhl wrote: > > Interesting. Just be careful results are produced for the big picture > > and not used to point fingers at individuals. > > > > If there are individuals at whom a finger needs to be pointed, this > system will highlight them, and fingers will (and should) be pointed. > Contributors of poor-quality code need to be weeded-out. Define poor quality. > Finger-pointing, in these extreme cases, gives incentive to improve > quality. It's a positive thing. Sorry, but I have to disagree. Negative finger-pointing is never a good thing. Also, it doesn't give any incentive to anyone. It only makes people feel bad and finally discourages them from contributing anything. If you want to give people incentives, reward them for doing things you'd like them to do. Thanks, Rafael ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-17 19:09 ` Rafael J. Wysocki @ 2008-04-17 19:35 ` Ray Lee 2008-04-17 19:57 ` Sverre Rabbelier 2008-04-17 20:16 ` Al Viro 0 siblings, 2 replies; 129+ messages in thread From: Ray Lee @ 2008-04-17 19:35 UTC (permalink / raw) To: Rafael J. Wysocki Cc: David Newall, Jesper Juhl, sverre, git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, yoshfuji, jeff, netdev On Thu, Apr 17, 2008 at 12:09 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > Finger-pointing, in these extreme cases, gives incentive to improve > > quality. It's a positive thing. > > Sorry, but I have to disagree. Negative finger-pointing is never a good thing. Correct, but let's be careful here. The original suggestion was, effectively, to get better metrics on the quality of contributions. Those metrics *could* be used for finger pointing, or (my preference) they could be used to direct and allocate our scarce resources: code reviews and mentoring. There's no way to know what the metrics will tell us until we have them. Arguing against metrics because they *may* be used to point fingers at people is a silly argument; anything can be subverted to do that. Let's get some measurements and see what they say. In the meantime, try to believe that they could be put to good purposes, such as identifying code areas that are tricky for contributors to get right (independent of contributor), or contributors that could benefit from code reviews, etc. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-17 19:35 ` Ray Lee @ 2008-04-17 19:57 ` Sverre Rabbelier 2008-04-17 20:16 ` Al Viro 1 sibling, 0 replies; 129+ messages in thread From: Sverre Rabbelier @ 2008-04-17 19:57 UTC (permalink / raw) To: Ray Lee Cc: Rafael J. Wysocki, David Newall, Jesper Juhl, git, linux-kernel, James Morris, Al Viro, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, yoshfuji, jeff, netdev On Thu, Apr 17, 2008 at 9:35 PM, Ray Lee <ray-lk@madrabbit.org> wrote: > On Thu, Apr 17, 2008 at 12:09 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > Finger-pointing, in these extreme cases, gives incentive to improve > > > quality. It's a positive thing. > > > > Sorry, but I have to disagree. Negative finger-pointing is never a good thing. > > Correct, but let's be careful here. The original suggestion was, > effectively, to get better metrics on the quality of contributions. > Those metrics *could* be used for finger pointing, or (my preference) > they could be used to direct and allocate our scarce resources: code > reviews and mentoring. Exactly! > There's no way to know what the metrics will tell us until we have > them. Arguing against metrics because they *may* be used to point > fingers at people is a silly argument; anything can be subverted to do > that. Thank you, that should have been said before, you worded it perfectly. > Let's get some measurements and see what they say. In the meantime, > try to believe that they could be put to good purposes, such as > identifying code areas that are tricky for contributors to get right > (independent of contributor), or contributors that could benefit from > code reviews, etc. This especially is an area that I plan to focus on and should be very reliable when finished. As can be read in my application, I plan to look at how often a piece of code is changed, in what timespan and by how many different authors. 
Thanks for the reply! Cheers, Sverre ^ permalink raw reply [flat|nested] 129+ messages in thread
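The per-area measurement mentioned above (how often a piece of code changes, over what timespan, and by how many different authors) can be sketched as follows. The input format, a list of (path, author, unix_timestamp) tuples, and the function name churn_by_path are assumptions made for illustration; how such records would be extracted from a repository is left open here and is not taken from the actual application.

```python
from collections import defaultdict


def churn_by_path(changes):
    """Summarise change records per path.

    changes: iterable of (path, author, unix_timestamp) tuples
             (a hypothetical input format).
    Returns {path: {"changes", "authors", "span_seconds"}}.
    """
    per_path = defaultdict(list)
    for path, author, timestamp in changes:
        per_path[path].append((author, timestamp))

    report = {}
    for path, entries in per_path.items():
        stamps = [timestamp for _, timestamp in entries]
        report[path] = {
            "changes": len(entries),
            "authors": len({author for author, _ in entries}),
            "span_seconds": max(stamps) - min(stamps),
        }
    return report
```

Sorting the result by "changes" or "authors" would point at areas of the tree rather than at people, which is the use argued for in the message above.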
* Re: Reporting bugs and bisection 2008-04-17 19:35 ` Ray Lee 2008-04-17 19:57 ` Sverre Rabbelier @ 2008-04-17 20:16 ` Al Viro 2008-04-17 20:38 ` Ray Lee 1 sibling, 1 reply; 129+ messages in thread From: Al Viro @ 2008-04-17 20:16 UTC (permalink / raw) To: Ray Lee Cc: Rafael J. Wysocki, David Newall, Jesper Juhl, sverre, git, linux-kernel, James Morris, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, yoshfuji, jeff, netdev On Thu, Apr 17, 2008 at 12:35:12PM -0700, Ray Lee wrote: > On Thu, Apr 17, 2008 at 12:09 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > Finger-pointing, in these extreme cases, gives incentive to improve > > > quality. It's a positive thing. > > > > Sorry, but I have to disagree. Negative finger-pointing is never a good thing. > > Correct, but let's be careful here. The original suggestion was, > effectively, to get better metrics on the quality of contributions. There already is one: reputation with people working on the tree, be it actively modifying/reviewing/bug hunting/etc. _We_ _already_ _know_; generally one gets a decent idea of what to expect pretty soon. And frankly, that's the only thing that matters anyway; I suspect I'd do rather well by proposed criteria, but you know what? I don't give a flying f*ck through the rolling doughnut for self-appointed PHBs and their idea of performance reviews. Think of it as a modified Turing test: convince me that you are not a script piped through an Eng.Lit. wanker or an MBA, then I might care for your opinion. Al, who never had problems with pointing fingers and laughing, but likes an informed human brain to be the source of it... ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-17 20:16 ` Al Viro @ 2008-04-17 20:38 ` Ray Lee 2008-04-17 20:53 ` Al Viro 0 siblings, 1 reply; 129+ messages in thread From: Ray Lee @ 2008-04-17 20:38 UTC (permalink / raw) To: Al Viro Cc: Rafael J. Wysocki, David Newall, Jesper Juhl, sverre, git, linux-kernel, James Morris, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, yoshfuji, jeff, netdev On Thu, Apr 17, 2008 at 1:16 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Thu, Apr 17, 2008 at 12:35:12PM -0700, Ray Lee wrote: > > On Thu, Apr 17, 2008 at 12:09 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > > Finger-pointing, in these extreme cases, gives incentive to improve > > > > quality. It's a positive thing. > > > > > > Sorry, but I have to disagree. Negative finger-pointing is never a good thing. > > > > Correct, but let's be careful here. The original suggestion was, > > effectively, to get better metrics on the quality of contributions. > > There already is one: reputation with people working on the tree, > be it actively modifying/reviewing/bug hunting/etc. _We_ _already_ _know_; Sigh. No, you already know. I don't. This is not a rhetorical point. I've just bid out another project that'd involve getting linux running on another embedded hardware platform. If that happens, I get to spend paid time to work on the kernel, and as a by-product spend more time looking at patches and code coming across the list. So, where would it be best to spend my time? Or anyone else's? > generally one gets a decent idea of what to expect pretty soon. > > And frankly, that's the only thing that matters anyway; I suspect > I'd do rather well by proposed criteria, but you know what? I don't give > a flying f*ck through the rolling doughnut for self-appointed PHBs and > their idea of performance reviews. (Geez, conflate the issue much?) No one is saying you should. 
But also, I haven't seen anyone saying it'd be used for performance reviews other than you. > Think of it as a modified Turing test: convince me that you are > not a script piped through an Eng.Lit. wanker or an MBA, then I might care > for your opinion. <shrug> Shockingly enough, I actually don't care. I'm just trying to scratch my own itch, which is figure out where in the kernel (if anywhere!) it'd be best to donate my time. And your point is likely about the metrics, and yes, they'll be computer generated. So? Perhaps they'll be crap. Who knows until we look at them and match them up with what everyone already knows? If, by some one in a thousand chance, they turn out to be good and useful, then it'll either be a one-off eye-opener, or perhaps something useful more than once. Who knows? And to the larger point, why put effort into stopping someone else from finding out? > Al, who never had problems with pointing fingers and laughing, but > likes an informed human brain to be the source of it... <shrug> Shame and Guilt, two major motivators of human behavior, it's true. But, one last time, *you're* the one saying the stats would be used for finger pointing at people. Perhaps, instead, the stats will show that we should all collectively point our fingers at some random area in the tree, where everyone, despite their track record, ends up making mistakes. Let the kid find out, that's all I'm saying. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-17 20:38 ` Ray Lee @ 2008-04-17 20:53 ` Al Viro 2008-04-17 21:01 ` Ray Lee 0 siblings, 1 reply; 129+ messages in thread From: Al Viro @ 2008-04-17 20:53 UTC (permalink / raw) To: Ray Lee Cc: Rafael J. Wysocki, David Newall, Jesper Juhl, sverre, git, linux-kernel, James Morris, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, yoshfuji, jeff, netdev On Thu, Apr 17, 2008 at 01:38:18PM -0700, Ray Lee wrote: > > And frankly, that's the only thing that matters anyway; I suspect > > I'd do rather well by proposed criteria, but you know what? I don't give > > a flying f*ck through the rolling doughnut for self-appointed PHBs and > > their idea of performance reviews. > > (Geez, conflate the issue much?) No one is saying you should. But > also, I haven't seen anyone saying it'd be used for performance > reviews other than you. || If there are individuals at whom a finger needs to be pointed, this || system will highlight them, and fingers will (and should) be pointed. || Contributors of poor-quality code need to be weeded-out. in this thread (From: David Newall). > <shrug> Shame and Guilt, two major motivators of human behavior, it's > true. But, one last time, *you're* the one saying the stats would be > used for finger pointing at people. Not really. Unless you are trying to imply that David is my sock puppet, that is... ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-17 20:53 ` Al Viro @ 2008-04-17 21:01 ` Ray Lee 0 siblings, 0 replies; 129+ messages in thread From: Ray Lee @ 2008-04-17 21:01 UTC (permalink / raw) To: Al Viro Cc: Rafael J. Wysocki, David Newall, Jesper Juhl, sverre, git, linux-kernel, James Morris, Andrew Morton, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, yoshfuji, jeff, netdev On Thu, Apr 17, 2008 at 1:53 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Thu, Apr 17, 2008 at 01:38:18PM -0700, Ray Lee wrote: > > > And frankly, that's the only thing that matters anyway; I suspect > > > I'd do rather well by proposed criteria, but you know what? I don't give > > > a flying f*ck through the rolling doughnut for self-appointed PHBs and > > > their idea of performance reviews. > > > > (Geez, conflate the issue much?) No one is saying you should. But > > also, I haven't seen anyone saying it'd be used for performance > > reviews other than you. > > > || If there are individuals at whom a finger needs to be pointed, this > || system will highlight them, and fingers will (and should) be pointed. > || Contributors of poor-quality code need to be weeded-out. > > in this thread (From: David Newall). Ah, I failed reading comprehension, yet again. Well, sounds like you have a beef to take up with David, then. That's still not an argument against trying to gather statistics and to see if they're worth anything. > > <shrug> Shame and Guilt, two major motivators of human behavior, it's > > true. But, one last time, *you're* the one saying the stats would be > > used for finger pointing at people. > > Not really. Unless you are trying to imply that David is my sock puppet, that > is... Momentarily amusing to think so, but no :-). ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 6:24 ` Andrew Morton 2008-04-14 6:39 ` David Miller 2008-04-14 7:23 ` Al Viro @ 2008-04-14 19:13 ` Rene Herman 2008-04-14 20:38 ` Andrew Morton 2 siblings, 1 reply; 129+ messages in thread From: Rene Herman @ 2008-04-14 19:13 UTC (permalink / raw) To: Andrew Morton Cc: Al Viro, Willy Tarreau, david, Stephen Clark, Evgeniy Polyakov, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On 14-04-08 08:24, Andrew Morton wrote: > On Mon, 14 Apr 2008 06:39:43 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote: >> I have a related proposal: let us require all patches to be stamped >> with Discordian *and* Eternal September dates. In triplicate. While >> we are at it, why don't we introduce new mandatory headers like, say >> it, >> >> X-checkpatch: {Yes,No} >> X-checkpatch-why-not: <string> >> X-pointless: <number from 1 to 69, going from "1: does something useful" all >> the way to "68: aligns right ends of lines in comments"> >> X-arbitrary-rules-added-to-CodingStyle: <number> (should be present if >> and only if X-pointless: 69 is present). >> >> Come to think of that, we clearly need a new file in Documentation/*, >> documenting such headers. Why don't we organize a subcommittee^Wnew maillist >> devoted to that? That would provide another entry route for contributors, >> lowering the overall entry barriers even further... >> > > None of the above was particularly useful. Does that mean you're not going to take patches that align the right end of lines in comments? :-( Rene. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 19:13 ` Rene Herman @ 2008-04-14 20:38 ` Andrew Morton 2008-04-14 22:18 ` Rene Herman 0 siblings, 1 reply; 129+ messages in thread From: Andrew Morton @ 2008-04-14 20:38 UTC (permalink / raw) To: Rene Herman Cc: viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, davem, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On Mon, 14 Apr 2008 21:13:41 +0200 Rene Herman <rene.herman@keyaccess.nl> wrote: > Does that mean you're not going to take patches that align the right end of > lines in comments? :-( erm, was that ":-(" supposed to be a ":-)"? I don't like to merge patches which fix typos and spellos and grammaros in comments, simply because I'd be buried in the things. I do take such fixes for user-visible text (Documentation/, kerneldoc comments and printks). Right-justification of comments would fall rather a long way below spelling fixes. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 20:38 ` Andrew Morton @ 2008-04-14 22:18 ` Rene Herman 0 siblings, 0 replies; 129+ messages in thread From: Rene Herman @ 2008-04-14 22:18 UTC (permalink / raw) To: Andrew Morton Cc: viro, w, david, sclark46, johnpol, rjw, tilman, Valdis.Kletnieks, lkml, davem, jesper.juhl, yoshfuji, jeff, linux-kernel, git, netdev On 14-04-08 22:38, Andrew Morton wrote: > On Mon, 14 Apr 2008 21:13:41 +0200 > Rene Herman <rene.herman@keyaccess.nl> wrote: > >> Does that mean you're not going to take patches that align the right end of >> lines in comments? :-( > > erm, was that ":-(" supposed to be a ":-)"? The ":-(" was supposed to add to the implicitly obvious ":-)". That is, was indeed joking (Al mentioned them) but with a slightly serious undertone: > I don't like to merge patches which fix typos and spellos and grammaros > in comments, simply because I'd be buried in the things. I do take such > fixes for user-visible text (Documentation/, kerneldoc comments and > printks). > > Right-justification of comments would fall rather a long way below > spelling fixes. You, particularly, seem to be very good at picking up trivia. I've posted completely trivial patches from time to time for small things I encounter while looking at something else. Things at the "are people going to look funny at me for even bothering or..." level but you picking them up means it's still useful to post, so I sometimes do. Now, in fact, Linux as a _whole_ doesn't seem bad at accepting that kind of small janitorial stuff but I have been noticing some backlash to it as well. I'm not sure it's worse or better than historically, but the "checkpatch syndrome" certainly triggers more of it. Al specifically wanted more new eyes but the way to reward those new eyes is accepting their small changes. Al also specifically doesn't like those small changes when at the level of the automated and semi-brainless checkpatch level. 
I believe the janitorial work has been over-organized, both through the kernel-janitors and checkpatch, since while these are very useful in guiding a newbie in _what_ to do, they cause "automated" huge tree-wide trivia storms which people then don't react overly favourably to, and the new eyes who did all that work of generating it all dim again...

Frankly, the kernel really is fairly complex these days when starting at 0. Much more complex certainly than, say, back in 2.0 or 2.2 days, and while Al's scenario of per-subsystem reviews might be good, I don't believe it's very realistic. Companies don't pay to have those done, and for newbies it's generally too complex since understanding most parts of the kernel fully requires understanding most of the rest of the kernel rather well also.

So you get the really promising newbies? Yeah, that, or you don't get anyone, and if some promising newbies are building up 137 part checkpatch inspired patchsets that don't help none.

So, what am I saying (what _am_ I saying?!?) ... I seemed to observe somewhat of an internal contradiction in Al's message about new eyes and his dislike of the trivial stuff, but the contradiction only exists if the dislike wouldn't be limited to these kinds of huge trivia storms. I believe it is, and I furthermore believe that yes, it's over-organization that causes many new eyes to focus on the brainless aspects.

Now, do those new eyes have many other options when very few (to none) of the core crowd ever does things like answer questions on the kernelnewbies list? From the established names, I only remember ever seeing Greg KH and Adrian Bunk there. And I'm _still_ pissed that no one would or could tell me what was wrong with the legacy CD-ROM driver I and Pekka Enberg were toying around with a while ago. Frankly, I care a whole lot less about a hundred sparse warning fixes.

In short -- the kernel in its current state is already quite complex and if new eyes are wanted they'll need to be coached more.
I'm seeing very little of that. Rene. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-13 22:24 ` Reporting bugs and bisection Stephen Clark 2008-04-13 22:41 ` Rafael J. Wysocki 2008-04-13 23:51 ` david @ 2008-04-14 9:26 ` Andi Kleen 2 siblings, 0 replies; 129+ messages in thread From: Andi Kleen @ 2008-04-14 9:26 UTC (permalink / raw) To: sclark46 Cc: Evgeniy Polyakov, Rafael J. Wysocki, Andrew Morton, Willy Tarreau, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev Stephen Clark <sclark46@earthlink.net> writes: > I been a linux user and have followed the LKML for a > number of years and have yet to see > any test plans for any submitted patches. You haven't looked closely then. While it's not very common, there is a non-trivial number of patches which describe how they got tested in the patch description. -Andi ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-13 20:21 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Evgeniy Polyakov 2008-04-13 20:33 ` Rafael J. Wysocki @ 2008-04-13 20:35 ` David Miller 1 sibling, 0 replies; 129+ messages in thread From: David Miller @ 2008-04-13 20:35 UTC (permalink / raw) To: johnpol Cc: akpm, w, rjw, tilman, Valdis.Kletnieks, lkml, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev From: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Date: Mon, 14 Apr 2008 00:21:18 +0400 > If the same would be done on developers machine and huge patches would > be sent to jump between changesets, that would be a real 'work closely > with the reporter working out why the reporter's failure was occurring'? In fact, this is what Andrew's so-called "back and forth with the bug reporter" used to mainly consist of. Asking the user to try this patch or that patch, which most of the time were reverts of suspect changes. Which, surprise surprise, means we were spending lots of time bisecting things by hand. We're able to automate this now and it's not a bad thing. ^ permalink raw reply [flat|nested] 129+ messages in thread
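The "try this revert, now that one" cycle described above and the automated one are the same search. What follows is only a minimal sketch of that search, not git's actual implementation: it assumes a linear history in which the first entry is known good, the last is known bad, and every commit after the culprit is also bad. The names `first_bad` and `is_bad` are made up for the illustration; `is_bad` stands in for "build, boot, and run the failing test".

```python
def first_bad(commits, is_bad):
    """Return (index of the first bad commit, number of builds tested).

    Assumes commits[0] is good, commits[-1] is bad, and that badness is
    monotonic: once a commit is bad, every later commit is bad as well.
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(commits[mid]):
            bad = mid      # culprit is at mid or earlier
        else:
            good = mid     # culprit is after mid
    return bad, steps


if __name__ == "__main__":
    history = list(range(1000))          # stand-in for 1000 commits
    print(first_bad(history, lambda c: c >= 613))
```

Under those assumptions the number of builds grows with the logarithm of the range, which is why each build-and-boot cycle, not the search itself, dominates the hours a reporter spends.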
* Re: Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-13 19:18 ` Andrew Morton ` (2 preceding siblings ...) 2008-04-13 20:21 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Evgeniy Polyakov @ 2008-04-14 10:18 ` Ingo Molnar 2008-04-14 10:29 ` Reporting bugs and bisection Andi Kleen 3 siblings, 1 reply; 129+ messages in thread From: Ingo Molnar @ 2008-04-14 10:18 UTC (permalink / raw) To: Andrew Morton Cc: Willy Tarreau, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev * Andrew Morton <akpm@linux-foundation.org> wrote: > We dont' do that as much nowadays - there's a tendency to > > a) throw the problem back at the reporter, often asking them to > bisect. If the reporter is running a distro kernel (eg: Fedora) > then that's quite hard, and often isn't a think they have knowledge > to do. So they'll just disappear. Or > > b) just ignore the report altogether. hm, who does this - i've seen networking folks do it but does anyone else do it? Such cases are _clear_ abuse of users and they'll do the obvious thing: vote with their feet. I only ask people to bisect it when all other avenues fail - and even then i try to make it clear that bisection is just something they can _optionally_ do to speed things up (it's never required), and that it's a pure opt-in. doing _kernel_ bisection is totally hard at the moment - it disrupts the user way too much and causes many hours of work for most users. [ Requiring bisection for userspace projects might be more doable. (but even there's it's wrong when it's not automated completely and where a failure pattern is not deterministic.) ] Ingo ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 10:18 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Ingo Molnar @ 2008-04-14 10:29 ` Andi Kleen 0 siblings, 0 replies; 129+ messages in thread From: Andi Kleen @ 2008-04-14 10:29 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, Willy Tarreau, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev Ingo Molnar <mingo@elte.hu> writes: > > doing _kernel_ bisection is totally hard at the moment - it disrupts the > user way too much and causes many hours of work for most users. It depends. Sometimes the bisection can be done in qemu/kvm/xen or similar tools. At least if the problem is not too hardware dependent. And more and more people actually run in such environments. I can also do it faster with autoboot or nfs root/powerswitch, but admittedly that's a very specialized setup most people don't have. Still I agree with your basic point that it should be only last resort. -Andi ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-13 18:47 ` Willy Tarreau 2008-04-13 19:18 ` Andrew Morton @ 2008-04-13 20:10 ` Adrian Bunk 2008-04-14 9:58 ` Reporting bugs and bisection Andi Kleen 2 siblings, 0 replies; 129+ messages in thread From: Adrian Bunk @ 2008-04-13 20:10 UTC (permalink / raw) To: Willy Tarreau Cc: Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev, Andrew Morton On Sun, Apr 13, 2008 at 08:47:30PM +0200, Willy Tarreau wrote: >... > Very true. One other thing which might get confusing/frustrating on the > user side is that currently, Linux is the *only* product which requires > the bug reporter to find the fault change (yes, I know, it's scalable). >... That's not true, for several regressions I reported to the Wine Bugzilla I had been asked to git bisect for the commit that broke it. And I'd actually assume that it's quite common for git using open source projects to ask the user to bisect regressions. > Regards, > Willy cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-13 18:47 ` Willy Tarreau 2008-04-13 19:18 ` Andrew Morton 2008-04-13 20:10 ` Reporting bugs and bisection (was: Re: 2.6.25-rc8: FTP transfer errors) Adrian Bunk @ 2008-04-14 9:58 ` Andi Kleen 2008-04-14 10:00 ` Willy Tarreau 2 siblings, 1 reply; 129+ messages in thread From: Andi Kleen @ 2008-04-14 9:58 UTC (permalink / raw) To: Willy Tarreau Cc: Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev, Andrew Morton Willy Tarreau <w@1wt.eu> writes: > Linux is the *only* product which requires > the bug reporter to find the fault change (yes, I know, it's scalable). It's a pretty common procedure for compilers (gcc, llvm) too, although they have the advantage that given a test case usually someone else can run the bisect procedure because they do not depend on the underlying hardware. That's unfortunately not the case for most kernel bugs, although sometimes it is possible given a hardware independent test case. And while most of the kernel code is drivers and arch, a lot of it is still pretty hardware independent, so at least in some cases it is possible to submit test cases and then let someone else (like a bug master) do the bisect. Of course it is unclear if producing a submittable test case will be actually any faster than just running bisect for the user. That said, I agree it's a big burden to run bisect for everything because it can take very long (especially if the problem is not trivially reproducible). It would be fair at least if maintainers always gave some candidate commit ids when asking for bisect for likely changes that could have matched the bug. Then those could be checked quickly first before doing the full run. While that will not always work it would be still a useful short cut and save a lot of time for the reporter. -Andi ^ permalink raw reply [flat|nested] 129+ messages in thread
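The short cut suggested above, checking a few suspect commits before the full run, can be sketched as follows. This is an illustration only, under the same assumptions as any bisection (linear history, index 0 good, last index bad, deterministic test); `locate_regression`, `candidates` and `is_bad` are hypothetical names, and commits are represented by their position in the history.

```python
def locate_regression(n_commits, is_bad, candidates=()):
    """Return (index of first bad commit, number of builds tested).

    `candidates` are suspect commit indices supplied up front; each is
    confirmed by testing it and its parent. If none is confirmed, fall
    back to a plain binary search that reuses the results gathered so far.
    """
    tested = {}

    def check(i):
        # Cache results so no commit is built and booted twice.
        if i not in tested:
            tested[i] = is_bad(i)
        return tested[i]

    for c in candidates:
        # A candidate is the culprit if it is bad and its parent is good.
        if 0 < c < n_commits and check(c) and not check(c - 1):
            return c, len(tested)

    # Narrow the search window with whatever the failed guesses told us.
    good, bad = 0, n_commits - 1
    for i, result in tested.items():
        if result:
            bad = min(bad, i)
        else:
            good = max(good, i)

    while bad - good > 1:
        mid = (good + bad) // 2
        if check(mid):
            bad = mid
        else:
            good = mid
    return bad, len(tested)
```

A correct guess costs two builds instead of roughly log2 of the range; a wrong guess costs one or two extra builds but still narrows the window, so in this model the reporter loses little by trying the maintainer's suspects first.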
* Re: Reporting bugs and bisection 2008-04-14 9:58 ` Reporting bugs and bisection Andi Kleen @ 2008-04-14 10:00 ` Willy Tarreau 2008-04-14 10:16 ` Andi Kleen 0 siblings, 1 reply; 129+ messages in thread From: Willy Tarreau @ 2008-04-14 10:00 UTC (permalink / raw) To: Andi Kleen Cc: Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev, Andrew Morton On Mon, Apr 14, 2008 at 11:58:08AM +0200, Andi Kleen wrote: > Willy Tarreau <w@1wt.eu> writes: > > > Linux is the *only* product which requires > > the bug reporter to find the fault change (yes, I know, it's scalable). > > It's a pretty common procedure for compilers (gcc, llvm) too, although > they have the advantage that given a test case usually someone else > can run the bisect procedure because they do not depend on the underlying > hardware > > That's unfortunately not the case for most kernel bugs, although > sometimes it is possible given a hardware independent test case. And > while most of the kernel code is drivers and arch, a lot of it is > still pretty hardware independent, so at least in some cases it is > possible to submit test cases and then let someone else (like a bug > master) do the bisect. > > Of course it is unclear if producing a submittable test case will be > actually any faster than just running bisect for the user. > > That said I agree it's a big burden to run bisect for everything > because it can take very long (especially if the problem > is not trivially reproducable) > > It would be fair at least if maintainers always gave some candidate > commit ids when asking for bisect for likely changes that could > have matched the bug. Then those could be checked quickly first > before doing the full run. > > While that will not always work it would be still a useful short cut > and save a lot of time for the reporter. And most of all, the reporter would not feel like the bisection is the default response ! 
> -Andi Willy ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: Reporting bugs and bisection 2008-04-14 10:00 ` Willy Tarreau @ 2008-04-14 10:16 ` Andi Kleen 0 siblings, 0 replies; 129+ messages in thread From: Andi Kleen @ 2008-04-14 10:16 UTC (permalink / raw) To: Willy Tarreau Cc: Andi Kleen, Rafael J. Wysocki, Tilman Schmidt, Valdis.Kletnieks, Mark Lord, David Miller, jesper.juhl, yoshfuji, jeff, linux-kernel, netdev, Andrew Morton > And most of all, the reporter would not feel like the bisection is > the default response ! Well, it is proportional to the quality of the bug report. If it is vague enough, often there is no other good answer. If it already comes with some debugging or good logs or a good test case etc., I agree just saying "please bisect" is not very nice (but sometimes it might be still needed if code review doesn't find anything). Perhaps there should be a document somewhere explaining this which can be easily pointed to. -Andi ^ permalink raw reply [flat|nested] 129+ messages in thread
* about bisections (was: Re: 2.6.25-rc8: FTP transfer errors) 2008-04-11 0:39 ` David Miller 2008-04-11 1:23 ` Mark Lord @ 2008-04-15 21:53 ` Ingo Molnar 2008-04-15 22:30 ` about bisections David Miller 1 sibling, 1 reply; 129+ messages in thread From: Ingo Molnar @ 2008-04-15 21:53 UTC (permalink / raw) To: David Miller Cc: lkml, jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev, corbet, Linus Torvalds * David Miller <davem@davemloft.net> wrote: > And it's a win-win situation. The incentive for a capable user to do > a bisect or whatever else is that if they do it their bug gets fixed > quickly. That is the free market economy of Linux kernel bug > reporting. > > It addresses the issue that in reality we'll never fix all bugs, and > therefore we prioritize. And therefore if there is a bisected bug > report and also another one from a user who refuses to do that, guess > which bug gets worked on with a higher priority and which bug gets > fixed first? this argument is a fallacy because it assumes that the Linux kernel is a closed ecosystem and i'm really surprised to see you advance this economic argument. i remind you: Linux is very much not a closed ecosystem. ... and hence, your "free market economy of bugs" that in essence strongly suggests users to do bisections when they find bugs in networking, works exactly the way you did not intend it to work: it pushes users towards other OSs. It pushes them towards Solaris, FreeBSD, MacOS and even Windows. That happens because the barrier to getting bugs fixed is _increased_ - and users might find it easier to participate in the ecosystem of other OSs - instead of having to compete with "each other" for the attention of the head honcho (you). 
You have a unique position within Linux: through a decade of hard and excellent work you have built a quasi-monopoly to all things networking commits: if you say about something that it should go into networking it will, if you say that it should stay out, it wont go in. So it is fundamentally _you_ who determines the feature/fix ratio in the networking code, and it is _you_ who determines the amount of bugs users have to find! There's no real competition for your position - it would take years for anyone to replace you. (and it would be a shame and a loss - you do your job so well) No doubt about it: bisection is very nice, it's one of the best things that happened to Linux debuggability in the past 2 years, i use it heavily myself, but please do _not_ require it from testers and users. They dont have nice 32-way Niagara's to build a kernel in 1 minute. They dont have nice virtualization to do easy bisection. Take bisection as an additional gift/tool but dont make it a semi-required aspect of your subsystem. Pretty please. And _PLEASE_ realize that the networking bug-count has been created primarily by _you_, because it is you who throttles the amount of new code in new kernel releases. If you cannot cope with the resulting bug-count via your network of users - it might be perhaps because you let too much new stuff in to begin with? Ingo ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: about bisections 2008-04-15 21:53 ` about bisections (was: Re: 2.6.25-rc8: FTP transfer errors) Ingo Molnar @ 2008-04-15 22:30 ` David Miller 2008-04-15 22:48 ` Ingo Molnar 0 siblings, 1 reply; 129+ messages in thread From: David Miller @ 2008-04-15 22:30 UTC (permalink / raw) To: mingo Cc: lkml, jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev, corbet, torvalds From: Ingo Molnar <mingo@elte.hu> Date: Tue, 15 Apr 2008 23:53:06 +0200 > but please do _not_ require it from testers and users. I don't. I ask for a bisection when it is appropriate and I think other avenues will not bear fruit in a reasonable amount of time. Thanks for the arbitrary diatribe about my contributions over the years and accusations that I have some kind of monopoly over the networking code and fixes to it. I really appreciate that. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: about bisections 2008-04-15 22:30 ` about bisections David Miller @ 2008-04-15 22:48 ` Ingo Molnar 0 siblings, 0 replies; 129+ messages in thread From: Ingo Molnar @ 2008-04-15 22:48 UTC (permalink / raw) To: David Miller Cc: lkml, jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev, corbet, torvalds * David Miller <davem@davemloft.net> wrote: > From: Ingo Molnar <mingo@elte.hu> > Date: Tue, 15 Apr 2008 23:53:06 +0200 > > > but please do _not_ require it from testers and users. > > I don't. I ask for a bisection when it is appropriate and I think > other avenues will not bear fruit in a reasonable amount of time. i'm glad i misunderstood you. My impression from reading this thread was that you preferred reporters who do bisection (which is fine so far), to the level of sometimes ignoring reporters who dont (which is not). > Thanks for the arbitrary diatribe about my contributions over the > years and accusations that I have some kind of monopoly over the > networking code and fixes to it. I really appreciate that. you certainly do have a fair amount of exclusivity in determining the dosage of networking commits. Dont get me wrong, you earned it and you deserve it - not the least because you do it best. Ingo ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 0:24 ` David Miller 2008-04-11 0:27 ` Mark Lord @ 2008-04-11 0:56 ` Tilman Schmidt 2008-04-11 1:08 ` David Miller 1 sibling, 1 reply; 129+ messages in thread From: Tilman Schmidt @ 2008-04-11 0:56 UTC (permalink / raw) To: David Miller; +Cc: lkml, jesper.juhl, yoshfuji, jeff, rjw, linux-kernel, netdev [-- Attachment #1: Type: text/plain, Size: 1409 bytes --] David Miller schrieb: > From: Mark Lord <lkml@rtr.ca> > Date: Thu, 10 Apr 2008 20:16:11 -0400 > >> Duh.. more like, "If I take 5-8 hours to attempt a bisect (which may not >> even work), then that's 5-8 hours I do not get paid for." > > And if I invest my spare time on your bug how does this statement > apply to me? Or does it only apply to you? > > Every single argument you make that supports why you should not be > investing the necessary time into the bug applies equally to the > very developers you are so quickly to quip at and want help from. I think you got it backwards. Mark and other bug reporters (including, at times, yours truly) are helping you and other developers to make Linux better. Most of the times I report a bug, I am not asking for help - I have no personal need to get it fixed, as I can easily avoid it, and I only report it to give developers like you a chance to fix it before it really hurts someone - and I gather that Mark has been in a similar position wrt to the bug in question. So what would you have us do? Not report the bugs we find so that you don't have to invest your spare time on "our" bugs? Report them and accept a rebuke for our "unwillingness" to do even more benevolent work than we already did? Report only those for which we really need a fix, and are consequently willing to invest additional time? Thanks, Tilman [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 254 bytes --] ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 0:56 ` 2.6.25-rc8: FTP transfer errors Tilman Schmidt @ 2008-04-11 1:08 ` David Miller 0 siblings, 0 replies; 129+ messages in thread From: David Miller @ 2008-04-11 1:08 UTC (permalink / raw) To: tilman; +Cc: lkml, jesper.juhl, yoshfuji, jeff, rjw, linux-kernel, netdev From: Tilman Schmidt <tilman@imap.cc> Date: Fri, 11 Apr 2008 02:56:49 +0200 > I think you got it backwards. Mark and other bug reporters (including, > at times, yours truly) are helping you and other developers to make > Linux better. I appreciate the bug reports, believe me. The issue is which of the limited developer resources get put onto which bugs. A developer who does this for fun is going to prioritize things that are pleasant and interesting to work on, and also a good effective use of their time. So people prioritize. Therefore, my point is, the net result is that users have a direct influence on which bugs get worked on with the highest priority and thus get fixed faster. And those are the ones that have the most information available, and in particular bisect results when appropriate. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 0:16 ` 2.6.25-rc8: FTP transfer errors Mark Lord 2008-04-11 0:24 ` David Miller @ 2008-04-11 0:26 ` David Miller 2008-04-11 0:29 ` Mark Lord 1 sibling, 1 reply; 129+ messages in thread From: David Miller @ 2008-04-11 0:26 UTC (permalink / raw) To: lkml; +Cc: jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev, xemul From: Mark Lord <lkml@rtr.ca> Date: Thu, 10 Apr 2008 20:16:11 -0400 > [c67499c0e772064b37ad75eb69b28fc218752636 is first bad commit > commit c67499c0e772064b37ad75eb69b28fc218752636 > Author: Pavel Emelyanov <xemul@openvz.org> > Date: Thu Jan 31 05:06:40 2008 -0800 > > [NETNS]: Tcp-v4 sockets per-net lookup. > > Add a net argument to inet_lookup and propagate it further > into lookup calls. Plus tune the __inet_check_established. > > The dccp and inet_diag, which use that lookup functions > pass the init_net into them. > > Signed-off-by: Pavel Emelyanov <xemul@openvz.org> > Signed-off-by: David S. Miller <davem@davemloft.net> Thanks Mark. Pavel can you take a look? I suspect that the namespace changes or gets NULL'd out somehow and this leads to the resets because the socket can no longer be found. Perhaps it's even a problem with time-wait socket namespace propagation. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 0:26 ` David Miller @ 2008-04-11 0:29 ` Mark Lord 2008-04-11 2:59 ` YOSHIFUJI Hideaki / 吉藤英明 0 siblings, 1 reply; 129+ messages in thread From: Mark Lord @ 2008-04-11 0:29 UTC (permalink / raw) To: David Miller Cc: jesper.juhl, tilman, yoshfuji, jeff, rjw, linux-kernel, netdev, xemul David Miller wrote: > From: Mark Lord <lkml@rtr.ca> > Date: Thu, 10 Apr 2008 20:16:11 -0400 > >> [c67499c0e772064b37ad75eb69b28fc218752636 is first bad commit >> commit c67499c0e772064b37ad75eb69b28fc218752636 >> Author: Pavel Emelyanov <xemul@openvz.org> >> Date: Thu Jan 31 05:06:40 2008 -0800 >> >> [NETNS]: Tcp-v4 sockets per-net lookup. >> >> Add a net argument to inet_lookup and propagate it further >> into lookup calls. Plus tune the __inet_check_established. >> >> The dccp and inet_diag, which use that lookup functions >> pass the init_net into them. >> >> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> >> Signed-off-by: David S. Miller <davem@davemloft.net> > > Thanks Mark. > > Pavel can you take a look? I suspect that the namespace > changes or gets NULL'd out somehow and this leads to the > resets because the socket can no longer be found. Perhaps > it's even a problem with time-wait socket namespace > propagation. .. My system here is now set up for quick/easy retest, if you have any suggestions or patches to try out. Thanks guys. ^ permalink raw reply [flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors 2008-04-11 0:29 ` Mark Lord @ 2008-04-11 2:59 ` YOSHIFUJI Hideaki / 吉藤英明 2008-04-11 3:18 ` [PATCH 2.6.25] net sockets: fix timewait namespace regression Mark Lord 2008-04-11 7:50 ` 2.6.25-rc8: FTP transfer errors Pavel Emelyanov 0 siblings, 2 replies; 129+ messages in thread From: YOSHIFUJI Hideaki / 吉藤英明 @ 2008-04-11 2:59 UTC (permalink / raw) To: lkml, davem Cc: jesper.juhl, tilman, jeff, rjw, linux-kernel, netdev, xemul, yoshfuji In article <47FEB0E3.8080507@rtr.ca> (at Thu, 10 Apr 2008 20:29:23 -0400), Mark Lord <lkml@rtr.ca> says: > David Miller wrote: > > From: Mark Lord <lkml@rtr.ca> > > Date: Thu, 10 Apr 2008 20:16:11 -0400 > > > >> [c67499c0e772064b37ad75eb69b28fc218752636 is first bad commit > >> commit c67499c0e772064b37ad75eb69b28fc218752636 > >> Author: Pavel Emelyanov <xemul@openvz.org> > >> Date: Thu Jan 31 05:06:40 2008 -0800 > >> > >> [NETNS]: Tcp-v4 sockets per-net lookup. > >> > >> Add a net argument to inet_lookup and propagate it further > >> into lookup calls. Plus tune the __inet_check_established. > >> > >> The dccp and inet_diag, which use that lookup functions > >> pass the init_net into them. > >> > >> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> > >> Signed-off-by: David S. Miller <davem@davemloft.net> > > > > Thanks Mark. > > > > Pavel can you take a look? I suspect that the namespace > > changes or gets NULL'd out somehow and this leads to the > > resets because the socket can no longer be found. Perhaps > > it's even a problem with time-wait socket namespace > > propagation. > .. > > My system here is now set up for quick/easy retest, if you have any > suggestions or patches to try out. Please try this, from net-2.6.26 tree. 
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

----
From 8d9f1744cab50acb0c6c9553be533621e01f178b Mon Sep 17 00:00:00 2001
From: Daniel Lezcano <dlezcano@fr.ibm.com>
Date: Fri, 21 Mar 2008 04:12:54 -0700
Subject: [PATCH] [NETNS][IPV6] tcp - assign the netns for timewait sockets

Copy the network namespace from the socket to the timewait socket.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv4/inet_timewait_sock.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index 876169f..717c411 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -124,6 +124,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk, const int stat
 		tw->tw_hash	    = sk->sk_hash;
 		tw->tw_ipv6only	    = 0;
 		tw->tw_prot	    = sk->sk_prot_creator;
+		tw->tw_net	    = sk->sk_net;
 		atomic_set(&tw->tw_refcnt, 1);
 		inet_twsk_dead_node_init(tw);
 		__module_get(tw->tw_prot->owner);
-- 
1.4.4.4

-- 
YOSHIFUJI Hideaki @ USAGI Project <yoshfuji@linux-ipv6.org>
GPG-FP : 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

^ permalink raw reply related [flat|nested] 129+ messages in thread
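Why a single missing assignment would produce the resets suspected earlier in the thread can be shown with a toy model. This is a sketch, not kernel code: it assumes the per-net lookup introduced by the bisected commit only matches a socket whose namespace equals the namespace of the incoming packet. Every name here (`Sock`, `TimewaitSock`, `twsk_alloc`, `lookup`, the `copy_net` switch) is invented for the illustration; only the commented assignment mirrors the patch above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sock:
    key: tuple          # (saddr, sport, daddr, dport)
    net: str            # stand-in for the network namespace pointer


@dataclass
class TimewaitSock:
    key: tuple
    net: Optional[str] = None   # left unset unless explicitly copied


def twsk_alloc(sk, copy_net):
    """Create the timewait stand-in for a closing socket."""
    tw = TimewaitSock(key=sk.key)
    if copy_net:
        tw.net = sk.net         # mirrors: tw->tw_net = sk->sk_net
    return tw


def lookup(table, net, key):
    """Per-namespace lookup: the key and the namespace must both match."""
    for s in table:
        if s.key == key and s.net == net:
            return s
    return None                 # no socket found for this packet


if __name__ == "__main__":
    sk = Sock(key=("10.0.0.1", 21, "10.0.0.2", 40000), net="init_net")
    print(lookup([twsk_alloc(sk, copy_net=False)], "init_net", sk.key))
    print(lookup([twsk_alloc(sk, copy_net=True)], "init_net", sk.key))
```

In this model the timewait entry created without the copy is invisible to any namespace-aware lookup, which is consistent with the "socket can no longer be found" guess; whether the real lookup path behaves exactly this way is something only the kernel source can confirm.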
* [PATCH 2.6.25] net sockets: fix timewait namespace regression
  2008-04-11  2:59 ` YOSHIFUJI Hideaki / 吉藤英明
@ 2008-04-11  3:18   ` Mark Lord
  2008-04-11  3:51     ` David Miller
  2008-04-11  7:50   ` 2.6.25-rc8: FTP transfer errors Pavel Emelyanov
  1 sibling, 1 reply; 129+ messages in thread
From: Mark Lord @ 2008-04-11  3:18 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki / 吉藤英明
  Cc: davem, jesper.juhl, tilman, jeff, rjw, linux-kernel, netdev, xemul,
	Linus Torvalds, Andrew Morton

YOSHIFUJI Hideaki / 吉藤英明 wrote:
> In article <47FEB0E3.8080507@rtr.ca> (at Thu, 10 Apr 2008 20:29:23 -0400), Mark Lord <lkml@rtr.ca> says:
>
>> David Miller wrote:
>>> From: Mark Lord <lkml@rtr.ca>
>>> Date: Thu, 10 Apr 2008 20:16:11 -0400
>>>
>>>> [c67499c0e772064b37ad75eb69b28fc218752636 is first bad commit
>>>> commit c67499c0e772064b37ad75eb69b28fc218752636
>>>> Author: Pavel Emelyanov <xemul@openvz.org>
>>>> Date:   Thu Jan 31 05:06:40 2008 -0800
>>>>
>>>>     [NETNS]: Tcp-v4 sockets per-net lookup.
>>>>
>>>>     Add a net argument to inet_lookup and propagate it further
>>>>     into lookup calls. Plus tune the __inet_check_established.
>>>>
>>>>     The dccp and inet_diag, which use that lookup functions
>>>>     pass the init_net into them.
>>>>
>>>>     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
>>>>     Signed-off-by: David S. Miller <davem@davemloft.net>
>>> Thanks Mark.
>>>
>>> Pavel can you take a look?  I suspect that the namespace
>>> changes or gets NULL'd out somehow and this leads to the
>>> resets because the socket can no longer be found.  Perhaps
>>> it's even a problem with time-wait socket namespace
>>> propagation.
>> ..
>>
>> My system here is now set up for quick/easy retest, if you have any
>> suggestions or patches to try out.
>
> Please try this, from net-2.6.26 tree.
..

Works perfectly, thanks.  Looks obvious, too.
Push it out to Linus now for 2.6.25.

Thanks!

Acked-by: Mark Lord <mlord@pobox.com>

> ----
>>From 8d9f1744cab50acb0c6c9553be533621e01f178b Mon Sep 17 00:00:00 2001
> From: Daniel Lezcano <dlezcano@fr.ibm.com>
> Date: Fri, 21 Mar 2008 04:12:54 -0700
> Subject: [PATCH] [NETNS][IPV6] tcp - assign the netns for timewait sockets
>
> Copy the network namespace from the socket to the timewait socket.
>
> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
..
> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Acked-by: Mark Lord <mlord@pobox.com>

> ---
>  net/ipv4/inet_timewait_sock.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
> index 876169f..717c411 100644
> --- a/net/ipv4/inet_timewait_sock.c
> +++ b/net/ipv4/inet_timewait_sock.c
> @@ -124,6 +124,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk, const int stat
>  	tw->tw_hash	    = sk->sk_hash;
>  	tw->tw_ipv6only	    = 0;
>  	tw->tw_prot	    = sk->sk_prot_creator;
> +	tw->tw_net	    = sk->sk_net;
>  	atomic_set(&tw->tw_refcnt, 1);
>  	inet_twsk_dead_node_init(tw);
>  	__module_get(tw->tw_prot->owner);

^ permalink raw reply	[flat|nested] 129+ messages in thread
* Re: [PATCH 2.6.25] net sockets: fix timewait namespace regression
  2008-04-11  3:18   ` [PATCH 2.6.25] net sockets: fix timewait namespace regression Mark Lord
@ 2008-04-11  3:51     ` David Miller
  0 siblings, 0 replies; 129+ messages in thread
From: David Miller @ 2008-04-11  3:51 UTC (permalink / raw)
  To: lkml
  Cc: yoshfuji, jesper.juhl, tilman, jeff, rjw, linux-kernel, netdev,
	xemul, torvalds, akpm

From: Mark Lord <lkml@rtr.ca>
Date: Thu, 10 Apr 2008 23:18:10 -0400

> Works perfectly, thanks.  Looks obvious, too.
> Push it out to Linus now for 2.6.25.
>
> Acked-by: Mark Lord <mlord@pobox.com>

Will do, thanks for testing.

^ permalink raw reply	[flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors
  2008-04-11  2:59 ` YOSHIFUJI Hideaki / 吉藤英明
  2008-04-11  3:18   ` [PATCH 2.6.25] net sockets: fix timewait namespace regression Mark Lord
@ 2008-04-11  7:50   ` Pavel Emelyanov
  1 sibling, 0 replies; 129+ messages in thread
From: Pavel Emelyanov @ 2008-04-11  7:50 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki / 吉藤英明
  Cc: lkml, davem, jesper.juhl, tilman, jeff, rjw, linux-kernel, netdev

YOSHIFUJI Hideaki / 吉藤英明 wrote:
> In article <47FEB0E3.8080507@rtr.ca> (at Thu, 10 Apr 2008 20:29:23 -0400), Mark Lord <lkml@rtr.ca> says:
>
>> David Miller wrote:
>>> From: Mark Lord <lkml@rtr.ca>
>>> Date: Thu, 10 Apr 2008 20:16:11 -0400
>>>
>>>> [c67499c0e772064b37ad75eb69b28fc218752636 is first bad commit
>>>> commit c67499c0e772064b37ad75eb69b28fc218752636
>>>> Author: Pavel Emelyanov <xemul@openvz.org>
>>>> Date:   Thu Jan 31 05:06:40 2008 -0800
>>>>
>>>>     [NETNS]: Tcp-v4 sockets per-net lookup.
>>>>
>>>>     Add a net argument to inet_lookup and propagate it further
>>>>     into lookup calls. Plus tune the __inet_check_established.
>>>>
>>>>     The dccp and inet_diag, which use that lookup functions
>>>>     pass the init_net into them.
>>>>
>>>>     Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
>>>>     Signed-off-by: David S. Miller <davem@davemloft.net>
>>> Thanks Mark.
>>>
>>> Pavel can you take a look?  I suspect that the namespace
>>> changes or gets NULL'd out somehow and this leads to the
>>> resets because the socket can no longer be found.  Perhaps
>>> it's even a problem with time-wait socket namespace
>>> propagation.
>> ..
>>
>> My system here is now set up for quick/easy retest, if you have any
>> suggestions or patches to try out.
>
> Please try this, from net-2.6.26 tree.
>
> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Too late, but still
Acked-by: Pavel Emelyanov <xemul@openvz.org>

Sorry, guys, but my timezone does not allow me to react in time
to found bugs :( So, when I wake up in the morning I usually just
find out that someone has caught a BUG made by me and someone else
has fixed it already...

> ----
>>From 8d9f1744cab50acb0c6c9553be533621e01f178b Mon Sep 17 00:00:00 2001
> From: Daniel Lezcano <dlezcano@fr.ibm.com>
> Date: Fri, 21 Mar 2008 04:12:54 -0700
> Subject: [PATCH] [NETNS][IPV6] tcp - assign the netns for timewait sockets
>
> Copy the network namespace from the socket to the timewait socket.
>
> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
>  net/ipv4/inet_timewait_sock.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
> index 876169f..717c411 100644
> --- a/net/ipv4/inet_timewait_sock.c
> +++ b/net/ipv4/inet_timewait_sock.c
> @@ -124,6 +124,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk, const int stat
>  	tw->tw_hash	    = sk->sk_hash;
>  	tw->tw_ipv6only	    = 0;
>  	tw->tw_prot	    = sk->sk_prot_creator;
> +	tw->tw_net	    = sk->sk_net;
>  	atomic_set(&tw->tw_refcnt, 1);
>  	inet_twsk_dead_node_init(tw);
>  	__module_get(tw->tw_prot->owner);

^ permalink raw reply	[flat|nested] 129+ messages in thread
[parent not found: <1207869029.19683.13.camel@localhost>]
[parent not found: <20080410.161453.52032573.davem@davemloft.net>]
[parent not found: <1207870334.13150.11.camel@localhost>]
* Re: 2.6.25-rc8: FTP transfer errors
       [not found]     ` <1207870334.13150.11.camel@localhost>
@ 2008-04-10 23:41       ` David Miller
  2008-04-10 23:51         ` vincent-perrier
  2008-04-18  8:32         ` David Miller
  0 siblings, 2 replies; 129+ messages in thread
From: David Miller @ 2008-04-10 23:41 UTC (permalink / raw)
  To: vincent-perrier
  Cc: jesper.juhl, tilman, lkml, yoshfuji, jeff, rjw, linux-kernel, netdev

From: vincent-perrier <vincent-perrier@club-internet.fr>
Date: Fri, 11 Apr 2008 01:32:14 +0200

[ Please use netdev@vger.kernel.org so that this discussion
  reaches the networking developers. ]

> Even if the patch is not good, the line dst_free(&rt->u.dst);
> when rt is still in tree leads to a crash, but when you do not
> do the dst_free, when rt is in tree, then it may have hidden
> other bugs, but at least I can keep working.
>
>
> I never said my patch was good, but it does the minimum to avoid my bug:
>
>
>         if (fn->leaf == NULL) {
>                 bug_8895_clownix_provisional_workaround = 1;
>                 fn->leaf = rt;
>                 atomic_inc(&rt->rt6i_ref);
>         }
> ...
>
> ip6_fib.c, line 796:
>
>         if (!bug_8895_clownix_provisional_workaround)
>                 dst_free(&rt->u.dst);
>
> That way at least it does not crash.
>
> I cannot provide more than the line and the reason for the crash, I am
> one of the numerous brainless users.

Now that the discussion has reached the mailing list, it won't die in
bugzilla like most such bugs do, and very likely will get fixed
quickly as a result.

Thank you.

^ permalink raw reply	[flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors
  2008-04-10 23:41       ` David Miller
@ 2008-04-10 23:51         ` vincent-perrier
  2008-04-18  8:32         ` David Miller
  1 sibling, 0 replies; 129+ messages in thread
From: vincent-perrier @ 2008-04-10 23:51 UTC (permalink / raw)
  To: David Miller
  Cc: jesper.juhl, tilman, lkml, yoshfuji, jeff, rjw, linux-kernel, netdev

Thanks to you, and also to Jesper for the "git bisect" explanation,
you have powerful tools, it is all for the best, millions of users
are relying on you!

On Thu, 2008-04-10 at 16:41 -0700, David Miller wrote:
> From: vincent-perrier <vincent-perrier@club-internet.fr>
> Date: Fri, 11 Apr 2008 01:32:14 +0200
>
> [ Please use netdev@vger.kernel.org so that this discussion
>   reaches the networking developers. ]
>
> > Even if the patch is not good, the line dst_free(&rt->u.dst);
> > when rt is still in tree leads to a crash, but when you do not
> > do the dst_free, when rt is in tree, then it may have hidden
> > other bugs, but at least I can keep working.
> >
> >
> > I never said my patch was good, but it does the minimum to avoid my bug:
> >
> >
> >         if (fn->leaf == NULL) {
> >                 bug_8895_clownix_provisional_workaround = 1;
> >                 fn->leaf = rt;
> >                 atomic_inc(&rt->rt6i_ref);
> >         }
> > ...
> >
> > ip6_fib.c, line 796:
> >
> >         if (!bug_8895_clownix_provisional_workaround)
> >                 dst_free(&rt->u.dst);
> >
> > That way at least it does not crash.
> >
> > I cannot provide more than the line and the reason for the crash, I am
> > one of the numerous brainless users.
>
> Now that the discussion has reached the mailing list, it won't die in
> bugzilla like most such bugs do, and very likely will get fixed
> quickly as a result.
>
> Thank you.

^ permalink raw reply	[flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors
  2008-04-10 23:41       ` David Miller
  2008-04-10 23:51         ` vincent-perrier
@ 2008-04-18  8:32         ` David Miller
  2008-04-19  8:07           ` vincent-perrier
  1 sibling, 1 reply; 129+ messages in thread
From: David Miller @ 2008-04-18  8:32 UTC (permalink / raw)
  To: vincent-perrier; +Cc: yoshfuji, netdev

From: David Miller <davem@davemloft.net>
Date: Thu, 10 Apr 2008 16:41:06 -0700 (PDT)

> From: vincent-perrier <vincent-perrier@club-internet.fr>
> Date: Fri, 11 Apr 2008 01:32:14 +0200
>
> > Even if the patch is not good, the line dst_free(&rt->u.dst);
> > when rt is still in tree leads to a crash, but when you do not
> > do the dst_free, when rt is in tree, then it may have hidden
> > other bugs, but at least I can keep working.
> >
> >
> > I never said my patch was good, but it does the minimum to avoid my bug:
> >
> >
> >         if (fn->leaf == NULL) {
> >                 bug_8895_clownix_provisional_workaround = 1;
> >                 fn->leaf = rt;
> >                 atomic_inc(&rt->rt6i_ref);
> >         }
> > ...
> >
> > ip6_fib.c, line 796:
> >
> >         if (!bug_8895_clownix_provisional_workaround)
> >                 dst_free(&rt->u.dst);
> >
> > That way at least it does not crash.

I started looking actively at this.

There are a lot of complicated side effects here, especially when
subtrees are enabled as it is in your case.

The main issue is whether we added any references to 'rt' into
the routing tree.  If we get an error, we have to undo any
such added references.

And that's not being done when the "if (fn->leaf == NULL)" code
runs and fib6_add_rt2node() returns an error.

I think this patch will fix it, could you please test it out?

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index b3f6e03..50f3f8f 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -772,6 +772,10 @@ out:
 		 * If fib6_add_1 has cleared the old leaf pointer in the
 		 * super-tree leaf node we have to find a new one for it.
 		 */
+		if (pn != fn && pn->leaf == rt) {
+			pn->leaf = NULL;
+			atomic_dec(&rt->rt6i_ref);
+		}
 		if (pn != fn && !pn->leaf && !(pn->fn_flags & RTN_RTINFO)) {
 			pn->leaf = fib6_find_prefix(info->nl_net, pn);
 #if RT6_DEBUG >= 2

^ permalink raw reply related	[flat|nested] 129+ messages in thread
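The rule stated in the message above ("if we get an error, we have to undo any such added references") can be sketched outside the kernel. The following is a minimal user-space model under simplifying assumptions: `struct route`, `struct node`, `add_route()` and the `fail_insert` flag are hypothetical names made up for illustration, a plain `int` stands in for the atomic reference count, and the real `fib6_add()` control flow is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct route {
	int refcnt;		/* plain int instead of atomic_t */
};

struct node {
	struct route *leaf;
};

/*
 * Model of adding a route below a parent node pn into a subtree node fn.
 * If the parent has no leaf yet, it temporarily points at rt and takes a
 * reference.  fail_insert simulates the insertion step returning an error.
 * On that error path the parent's pointer and its reference are undone,
 * so the caller can free rt without leaving a dangling leaf behind.
 */
int add_route(struct node *pn, struct node *fn, struct route *rt,
	      int fail_insert)
{
	assert(pn != NULL && fn != NULL && rt != NULL);

	if (pn->leaf == NULL) {
		pn->leaf = rt;
		rt->refcnt++;
	}

	if (!fail_insert) {
		fn->leaf = rt;
		rt->refcnt++;
		return 0;
	}

	/* Error path: drop the reference the parent took on rt above. */
	if (pn != fn && pn->leaf == rt) {
		pn->leaf = NULL;
		rt->refcnt--;
	}
	return -1;
}
```

After a failed call the parent no longer points at `rt` and the count is back where it started, which is the state the proposed patch tries to restore before the route is freed.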
* Re: 2.6.25-rc8: FTP transfer errors
  2008-04-18  8:32         ` David Miller
@ 2008-04-19  8:07           ` vincent-perrier
  0 siblings, 0 replies; 129+ messages in thread
From: vincent-perrier @ 2008-04-19  8:07 UTC (permalink / raw)
  To: David Miller; +Cc: yoshfuji, netdev

Hello,
I am very sorry, for the moment I cannot reproduce it and I am going on
holidays tomorrow (we have a lot of holidays in France) and I have no
internet where I go.
Nevertheless, I have tried to reproduce the bug, but all my config has
changed a lot, the user soft as well as the kernel I use now.
I will try again during the week, but I will not be able to send mail
before next week.
Regards

On Fri, 2008-04-18 at 01:32 -0700, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Thu, 10 Apr 2008 16:41:06 -0700 (PDT)
>
> > From: vincent-perrier <vincent-perrier@club-internet.fr>
> > Date: Fri, 11 Apr 2008 01:32:14 +0200
> >
> > > Even if the patch is not good, the line dst_free(&rt->u.dst);
> > > when rt is still in tree leads to a crash, but when you do not
> > > do the dst_free, when rt is in tree, then it may have hidden
> > > other bugs, but at least I can keep working.
> > >
> > >
> > > I never said my patch was good, but it does the minimum to avoid my bug:
> > >
> > >
> > >         if (fn->leaf == NULL) {
> > >                 bug_8895_clownix_provisional_workaround = 1;
> > >                 fn->leaf = rt;
> > >                 atomic_inc(&rt->rt6i_ref);
> > >         }
> > > ...
> > >
> > > ip6_fib.c, line 796:
> > >
> > >         if (!bug_8895_clownix_provisional_workaround)
> > >                 dst_free(&rt->u.dst);
> > >
> > > That way at least it does not crash.
>
> I started looking actively at this.
>
> There are a lot of complicated side effects here, especially when
> subtrees are enabled as it is in your case.
>
> The main issue is whether we added any references to 'rt' into
> the routing tree.  If we get an error, we have to undo any
> such added references.
>
> And that's not being done when the "if (fn->leaf == NULL)" code
> runs and fib6_add_rt2node() returns an error.
>
> I think this patch will fix it, could you please test it out?
>
> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> index b3f6e03..50f3f8f 100644
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -772,6 +772,10 @@ out:
>  		 * If fib6_add_1 has cleared the old leaf pointer in the
>  		 * super-tree leaf node we have to find a new one for it.
>  		 */
> +		if (pn != fn && pn->leaf == rt) {
> +			pn->leaf = NULL;
> +			atomic_dec(&rt->rt6i_ref);
> +		}
>  		if (pn != fn && !pn->leaf && !(pn->fn_flags & RTN_RTINFO)) {
>  			pn->leaf = fib6_find_prefix(info->nl_net, pn);
>  #if RT6_DEBUG >= 2

^ permalink raw reply	[flat|nested] 129+ messages in thread
[parent not found: <47FCF9DD.6080007@rtr.ca>]
[parent not found: <20080410.023045.16227424.yoshfuji@linux-ipv6.org>]
[parent not found: <47FD138B.2060801@rtr.ca>]
[parent not found: <20080409.152933.132174258.davem@davemloft.net>]
[parent not found: <47FD590C.5020003@rtr.ca>]
* Re: 2.6.25-rc8: FTP transfer errors
       [not found]         ` <47FD590C.5020003@rtr.ca>
@ 2008-04-10 20:46           ` Ilpo Järvinen
  2008-04-10 21:05             ` Mark Lord
  0 siblings, 1 reply; 129+ messages in thread
From: Ilpo Järvinen @ 2008-04-10 20:46 UTC (permalink / raw)
  To: Mark Lord
  Cc: David Miller, yoshfuji, Jeff Garzik, rjw, LKML, linux-net, Netdev

On Wed, 9 Apr 2008, Mark Lord wrote:

> David Miller wrote:
> > From: Mark Lord <lkml@rtr.ca>
> > Date: Wed, 09 Apr 2008 15:05:47 -0400
> >
> > > But it would be far more useful for whoever has been working on the
> > > stack to suggest some possible/likely commits to look at instead.
> >
> > Personally all I see is that one side closes the socket before all
> > data packets received have been read into the application, resulting
> > in a (correct) reset going out.
> >
> > I can't think of any change we've made over the course of this
> > release that would change behavior in that area.
> >
> > So you will likely need to bisect.
> ..
>
> Or I can ignore it, like the net developers, since I have a workaround.
> And then we'll see what other apps are broken upon 2.6.25 final release.
>
> Really, folks.  Bug reports are intended to *help* the developers,
> not something to be thrown back in their faces.
>
> There do seem to have been a *lot* of changes around the tcp closing/close
> code (as I see from diff'ing 2.6.24 against latest -git).

Sure, if you count in all whitespace/indentation/code moving changes to
that as well... :-)

> *Somebody* is responsible for those changes.
> That particular *somebody* ought to volunteer some help here,

It might help if you would add netdev on cc list in case you really want
to reach net developers, otherwise they might just end up "ignoring
it"... ;-)

> reducing the mountain of commits to a big handful or two.

Those touching fin/close are mostly whitespace/move things, so I doubt
that you find these useful but in case you insist, here's the list:

056834d9f6f6eaf4cc7268569e53acab957aac27 [TCP]: cleanup tcp_{in,out}put.c style
058dc3342b71ffb3531c4f9df7c35f943f392b8d [TCP]: reduce tcp_output's indentation levels a bit
490d5046930276aae50dd16942649bfc626056f7 [TCP]: Uninline tcp_set_state

In addition, there's this one (...though I have read it number of times
through and still cannot catch something that would cause the wrongness
you're seeing):

e870a8efcddaaa3da7e180b6ae21239fb96aa2bb [TCP]: Perform setting of common control fields in one place

There's very little really on interesting side I can think of, mostly
things are congestion control related changes... ...maybe either one of
these could cause something unpleasant in some corner case:

bd515c3e48ececd774eb3128e81b669dbbd32637 [TCP]: Fix TSO deferring
0e3a4803aa06cd7bc2cfc1d04289df4f6027640a [TCP]: Force TSO splits to MSS boundaries

...e.g., if the latter causes a return with zero limit under some
conditions, tso_fragment might generate, well, interesting packets and
never finish if the condition persists but.

-- 
 i.

^ permalink raw reply	[flat|nested] 129+ messages in thread
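The behavior David Miller describes in the quoted text above (a socket closed while received data is still unread produces a reset rather than an orderly shutdown) can be observed from user space. The sketch below is an illustration under stated assumptions, not code from this thread: it assumes a Linux host with a working loopback interface, and the function name `reset_after_close_with_unread_data()` is hypothetical. It returns 1 if the peer saw `ECONNRESET`, 0 if it saw anything else, and -1 if the test connection could not be set up.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

int reset_after_close_with_unread_data(void)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	struct pollfd pfd;
	char buf[16];
	int lfd, cfd = -1, sfd = -1, result = -1;
	ssize_t n;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;		/* let the kernel pick a free port */
	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 1) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
		goto out;

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0 ||
	    connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	sfd = accept(lfd, NULL, NULL);
	if (sfd < 0)
		goto out;

	/* The client sends data that the accepting side never reads. */
	if (write(cfd, "unread", 6) != 6)
		goto out;

	/* Wait until that data is queued on the accepting side. */
	pfd.fd = sfd;
	pfd.events = POLLIN;
	if (poll(&pfd, 1, 2000) != 1)
		goto out;

	/* Close with unread data pending: the connection is aborted. */
	close(sfd);
	sfd = -1;

	/* The client's next read is expected to fail with ECONNRESET. */
	n = read(cfd, buf, sizeof(buf));
	result = (n < 0 && errno == ECONNRESET) ? 1 : 0;
out:
	if (sfd >= 0)
		close(sfd);
	if (cfd >= 0)
		close(cfd);
	close(lfd);
	return result;
}
```

This only demonstrates the normal, correct reset; it says nothing about why the FTP transfers in this thread ran into one.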
* Re: 2.6.25-rc8: FTP transfer errors
  2008-04-10 20:46           ` Ilpo Järvinen
@ 2008-04-10 21:05             ` Mark Lord
  2008-04-10 21:43               ` Ilpo Järvinen
  0 siblings, 1 reply; 129+ messages in thread
From: Mark Lord @ 2008-04-10 21:05 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Miller, yoshfuji, Jeff Garzik, rjw, LKML, linux-net, Netdev

Ilpo Järvinen wrote:
> On Wed, 9 Apr 2008, Mark Lord wrote:
>
>> David Miller wrote:
>>> From: Mark Lord <lkml@rtr.ca>
>>> Date: Wed, 09 Apr 2008 15:05:47 -0400
>>>
>>>> But it would be far more useful for whoever has been working on the
>>>> stack to suggest some possible/likely commits to look at instead.
>>> Personally all I see is that one side closes the socket before all
>>> data packets received have been read into the application, resulting
>>> in a (correct) reset going out.
>>>
>>> I can't think of any change we've made over the course of this
>>> release that would change behavior in that area.
>>>
>>> So you will likely need to bisect.
>> ..
>>
>> Or I can ignore it, like the net developers, since I have a workaround.
>> And then we'll see what other apps are broken upon 2.6.25 final release.
>>
>> Really, folks.  Bug reports are intended to *help* the developers,
>> not something to be thrown back in their faces.
>>
>> There do seem to have been a *lot* of changes around the tcp closing/close
>> code (as I see from diff'ing 2.6.24 against latest -git).
..
> It might help if you would add netdev on cc list in case you really want to
> reach net developers, otherwise they might just end up "ignoring it"... ;-)
..

Oh.. I didn't know about that list.  How does that differ from linux-net ?
(Thanks)

>
>> reducing the mountain of commits to a big handful or two.
>
> Those touching fin/close are mostly whitespace/move things, so I doubt
> that you find these useful but in case you insist, here's the list:
>
> 056834d9f6f6eaf4cc7268569e53acab957aac27 [TCP]: cleanup tcp_{in,out}put.c style
> 058dc3342b71ffb3531c4f9df7c35f943f392b8d [TCP]: reduce tcp_output's indentation levels a bit
> 490d5046930276aae50dd16942649bfc626056f7 [TCP]: Uninline tcp_set_state
>
> In addition, there's this one (...though I have read it number of times
> through and still cannot catch something that would cause the wrongness
> you're seeing):
>
> e870a8efcddaaa3da7e180b6ae21239fb96aa2bb [TCP]: Perform setting of common
> control fields in one place
>
> There's very little really on interesting side I can think of, mostly
> things are congestion control related changes... ...maybe either one of
> these could cause something unpleasant in some corner case:
>
> bd515c3e48ececd774eb3128e81b669dbbd32637 [TCP]: Fix TSO deferring
> 0e3a4803aa06cd7bc2cfc1d04289df4f6027640a [TCP]: Force TSO splits to MSS boundaries
>
> ...e.g., if the latter causes a return with zero limit under some
> conditions, tso_fragment might generate, well, interesting packets and
> never finish if the condition persists but.
..

That matches my own assessment there, too:  lots of whitespace changes,
and not much real code difference on most paths.  Bummer.  :)

-ml

^ permalink raw reply	[flat|nested] 129+ messages in thread
* Re: 2.6.25-rc8: FTP transfer errors
  2008-04-10 21:05             ` Mark Lord
@ 2008-04-10 21:43               ` Ilpo Järvinen
  0 siblings, 0 replies; 129+ messages in thread
From: Ilpo Järvinen @ 2008-04-10 21:43 UTC (permalink / raw)
  To: Mark Lord
  Cc: David Miller, yoshfuji, Jeff Garzik, rjw, LKML, linux-net, Netdev

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2915 bytes --]

On Thu, 10 Apr 2008, Mark Lord wrote:

> Ilpo Järvinen wrote:
>
> > It might help if you would add netdev on cc list in case you really want to
> > reach net developers, otherwise they might just end up "ignoring it"... ;-)
> ..
>
> Oh.. I didn't know about that list.  How does that differ from linux-net ?
> (Thanks)

(Somebody please correct me if I'm wrong) if I've understood it correctly,
linux-net is meant for users discussions while the developers hang at
netdev, some do read lkml but not all. And in fact, I wouldn't have
noticed this thread for some time except that I was currently trying to
see if there are some new tcp warn_on reports showing up.

...In case you have a regression, bug etc. related to networking, netdev
should definitely be included.

> > > reducing the mountain of commits to a big handful or two.
> >
> > Those touching fin/close are mostly whitespace/move things, so I doubt that
> > you find these useful but in case you insist, here's the list:
> >
> > 056834d9f6f6eaf4cc7268569e53acab957aac27 [TCP]: cleanup tcp_{in,out}put.c
> > style
> > 058dc3342b71ffb3531c4f9df7c35f943f392b8d [TCP]: reduce tcp_output's
> > indentation levels a bit
> > 490d5046930276aae50dd16942649bfc626056f7 [TCP]: Uninline tcp_set_state
> >
> > In addition, there's this one (...though I have read it number of times
> > through and still cannot catch something that would cause the wrongness
> > you're seeing):
> >
> > e870a8efcddaaa3da7e180b6ae21239fb96aa2bb [TCP]: Perform setting of common
> > control fields in one place
> >
> > There's very little really on interesting side I can think of, mostly things
> > are congestion control related changes... ...maybe either one of these could
> > cause something unpleasant in some corner case:
> >
> > bd515c3e48ececd774eb3128e81b669dbbd32637 [TCP]: Fix TSO deferring
> > 0e3a4803aa06cd7bc2cfc1d04289df4f6027640a [TCP]: Force TSO splits to MSS
> > boundaries
> >
> > ...e.g., if the latter causes a return with zero limit under some
> > conditions, tso_fragment might generate, well, interesting packets and never
> > finish if the condition persists but.
> ..
> That matches my own assessment there, too:  lots of whitespace changes,
> and not much real code difference on most paths.  Bummer.  :)

...I just got tired of seeing all those braindamaged line splits and other
"legacy" formatting over and over again... :-) Much of those things
predate even 2.4, luckily there isn't yet an agency which would prevent
changing lines of code with that long historic encumbrance :-).

That last TSO change seems the most potential one from the list of all
net/ipv4/tcp*.c include/net/tcp.h touching commits, trying
0e3a4803aa06cd7bc2cfc1d04289df4f6027640a^ might be worthwhile (^ = a
commit before the "quoted one") and you would be able to reuse its result
anyway if there's a need to bisect it because that commit is around the
halfway point.

-- 
 i.

^ permalink raw reply	[flat|nested] 129+ messages in thread