From mboxrd@z Thu Jan 1 00:00:00 1970
From: Arnd Bergmann
Date: Mon, 08 Dec 2014 15:39:30 +0000
Subject: Re: stable boot: 93 boots: 92 pass, 1 fail (v3.17.6)
Message-Id: <2389005.vUt7gzqoxX@wuerfel>
List-Id:
References: <2553650.3KSUKtTdo3@wuerfel>
In-Reply-To: <2553650.3KSUKtTdo3@wuerfel>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-sh@vger.kernel.org

On Monday 08 December 2014 14:48:20 Geert Uytterhoeven wrote:
> On Mon, Dec 8, 2014 at 12:35 PM, Arnd Bergmann wrote:
> > On Sunday 07 December 2014 21:49:05 Kevin's boot bot wrote:
> >> Full Build report: http://status.armcloud.us/build/stable/kernel/v3.17.6/
> >> Full Boot report: http://status.armcloud.us/boot/all/job/stable/kernel/v3.17.6/
> >>
> >> Tree/Branch: stable
> >> Git describe: v3.17.6
> >>
> >> Failed boot tests
> >> =================
> >> emev2-kzm9d: FAIL: arm-shmobile_defconfig
> >> http://storage.armcloud.us/kernel-ci/stable/v3.17.6/arm-shmobile_defconfig/boot-emev2-kzm9d.html
> >>
> >
> > I tried to look at the reports, but the links don't work. Going
> > back in the archive, I see that it was still working until v3.17.2,
> > broken in v3.17.4/.5/.6 and the boot report for v3.17.3 seemed to run
> > into an unrelated issue. Mainline was always fine since 3.17, only
> > the stable kernels had problems:
> >
> > http://storage.armcloud.us/kernel-ci/stable/v3.17.2/arm-shmobile_defconfig/boot-emev2-kzm9d.html
> > http://storage.armcloud.us/kernel-ci/stable/v3.17.3/arm-shmobile_defconfig/boot-emev2-kzm9d.html
> > http://storage.armcloud.us/kernel-ci/stable/v3.17.4/arm-shmobile_defconfig/boot-emev2-kzm9d.html
> >
> > It may be worth having the sh team investigate the problem some more,
> > to see if a bad patch made it into stable kernels.
>
> Thanks for telling us!
>
> I don't have a kzm9d, so I'm just guessing based on the evidence.
>
> 1.
> When comparing successful v3.17.2 with failed v3.17.4, I see:
>
>  Sending DHCP requests ., OK
> -IP-Config: Got DHCP answer from 192.168.1.3, my address is 192.168.1.189
> +IP-Config: Got DHCP answer from 192.168.1.254, my address is 192.168.1.188
>  IP-Config: Complete:
> -     device=eth0, hwaddr=00:01:9b:04:03:cd, ipaddr=192.168.1.189,
>       mask=255.255.255.0, gw=192.168.1.254
> -     host=kzm9d, domain=lan, nis-domain=(none)
> -     bootserver=192.168.1.2, rootserver=192.168.1.2,
>       rootpath=/opt/kjh/rootfs/debian/armel,rsize=4096,wsize=4096
> -     nameserver0=192.168.1.254, nameserver1=66.93.87.2,
>       nameserver2=216.231.41.2
> +     device=eth0, hwaddr=00:01:9b:04:03:cd, ipaddr=192.168.1.188,
>       mask=255.255.255.0, gw=192.168.1.254
> +     host=192.168.1.188, domain=lan, nis-domain=(none)
> +     bootserver=192.168.1.254, rootserver=192.168.1.254, rootpath=
> +     nameserver0=192.168.1.254
>  ALSA device list:
>    No soundcards found.
>
> So both the client and server IP addresses have changed. Is there a proper NFS
> root file system available? Mounting of NFS root will time out, but the boot
> farm management software may time out earlier, so we don't get to see the
> panic message?

The 3.18-rc7 boot got these:

IP-Config: Got DHCP answer from 192.168.1.254, my address is 192.168.1.185
IP-Config: Complete:
     device=eth0, hwaddr=00:01:9b:04:03:cd, ipaddr=192.168.1.185,
     mask=255.255.255.0, gw=192.168.1.254
     host=192.168.1.185, domain=lan, nis-domain=(none)
     bootserver=192.168.1.254, rootserver=192.168.1.254, rootpath=
     nameserver0=192.168.1.254

So it's the same server as the failing config, and yet another client
address. I'd say it's unlikely to be related.

> Could this be a configuration issue?
> Do you have more logs, e.g. from successful v3.18-rc* builds?

http://lists.linaro.org/pipermail/kernel-build-reports/2014-November/author.html
has all the build reports from November, and you can navigate the
archives to find other months.
See http://storage.armcloud.us/kernel-ci/mainline/v3.18-rc7/arm-shmobile_defconfig/boot-emev2-kzm9d.txt
for another example of a successful log, or navigate the directories
on that server for others.

> 2. v3.17.4 has
>
> commit a54857a74cf6724a872217477fa5827d6b9d26c8
> Author: Enric Balletbo i Serra
> Date:   Thu Nov 13 09:14:34 2014 +0100
>
>     smsc911x: power-up phydev before doing a software reset.
>
>     [ Upstream commit ccf899a27c08038db91765ff12bb0380dcd85887 ]
>
> Seems unlikely the above is the culprit, but please note that upstream does have
> a few more fixes in this area:
>
> 6ff53fd37175e35d net/smsc911x: Fix delays in the PHY enable/disable routines
> 242bcd5ba1dcea80 net/smsc911x: Fix rare soft reset timeout issue due
> to PHY power-down mode

If you think it helps, we could try reverting this commit and have the
boot farm try the latest 3.17.6 without this patch.

	Arnd
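
For the IP-Config comparison in point 1, a small script can pull the
key=value fields out of two boot logs and list only the ones that differ.
This is a rough sketch, not something the boot farm provides: the names
parse_ip_config and diff_fields are made up for illustration, and it
assumes the "IP-Config: Complete:" block looks like the excerpts quoted
in this thread.

```python
import re


def parse_ip_config(log_text):
    """Collect key=value fields following 'IP-Config: Complete:' in a boot log.

    Assumption: the block ends at the first line that holds no key=value pair.
    """
    fields = {}
    in_block = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("IP-Config: Complete:"):
            in_block = True
            continue
        if in_block:
            # Keys may contain '-', e.g. nis-domain; values run to ',' or space.
            pairs = re.findall(r"(\w[\w-]*)=([^,\s]*)", stripped)
            if not pairs:
                break
            fields.update(pairs)
    return fields


def diff_fields(good, bad):
    """Return {key: (good_value, bad_value)} for every field that differs."""
    keys = sorted(set(good) | set(bad))
    return {k: (good.get(k), bad.get(k)) for k in keys if good.get(k) != bad.get(k)}
```

Feeding it the text of a passing and a failing log and printing the
result of diff_fields() shows at a glance whether anything beyond the
client address changed, such as bootserver or rootpath.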