From mboxrd@z Thu Jan 1 00:00:00 1970
From: tomasz.figa@gmail.com (Tomasz Figa)
Date: Fri, 17 Aug 2012 22:23:25 +0200
Subject: Issues with all kernels after 3.3.7
In-Reply-To:
References:
Message-ID: <2094430.QvfuTg7dJ7@flatron>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Friday 17 of August 2012 21:19:08 Alan M Butler wrote:
> On 17 August 2012 21:08, Tomasz Figa wrote:
> > Hi Alan,
> >
> > On Friday 17 of August 2012 20:37:08 Alan M Butler wrote:
> >> On 17 August 2012 19:12, Alan M Butler wrote:
> >> > From: Alan M Butler
> >> > Date: 17 August 2012 18:09
> >> > Subject: Re: Issues with all kernels after 3.3.7
> >> > To: Uwe Kleine-König
> >> >
> >> > On 17 August 2012 17:46, Uwe Kleine-König wrote:
> >> >> Hello,
> >> >>
> >> >> On Fri, Aug 17, 2012 at 05:30:27PM +0000, Alan M Butler wrote:
> >> >>> On 17 August 2012 14:01, Alan M Butler wrote:
> >> >>> > On 17 August 2012 13:26, Uwe Kleine-König wrote:
> >> >>> >> On Fri, Aug 17, 2012 at 01:47:06PM +0100, alan butler wrote:
> >> >>> >>> I have been trying all kernels after 3.3.7 (except for the
> >> >>> >>> 3.5 series) on my ix2-200 and have found that on every
> >> >>> >>> kernel 3.3.8, 3.4, 3.4.2, 3.4.4 =>3.4.7 and now even on
> >> >>> >>> 3.6 rc1 and the next dated today 17 August that when I use
> >> >>> >>> git, for example:
> >> >>> >>>
> >> >>> >>> git clone git://git.videolan.org/x264
> >> >>> >>>
> >> >>> >>> my system just crashes / hangs and even through the serial
> >> >>> >>> port I get no response.
> >> >>> >>> This also happens when I try to access a web interface, for
> >> >>> >>> example the serviio java webui.
> >> >>> >>>
> >> >>> >>> The second issue I have noticed is that my raid 0 array
> >> >>> >>> hangs / pauses when being mounted at system startup
> >> >>> >>> (for approximately 30 seconds, maybe more, maybe less, I'm
> >> >>> >>> not certain).
> >> >>> >>> But again on the 3.3.7 kernel there was no issue and no hang.
> >> >>> >>>
> >> >>> >>> If I return to 3.3.7 everything works fine with no other
> >> >>> >>> modifications to the OS or kernels.
> >> >>> >>> I am using Debian wheezy at the moment but also tried Debian
> >> >>> >>> squeeze, so it does not seem to be related to the specific
> >> >>> >>> Linux version, just the kernel.
> >> >>> >>
> >> >>> >> Can you bisect your problem? Doing that between 3.3.7 and
> >> >>> >> 3.3.8 seems to be the obvious range to test.
> >> >>> >>
> >> >>> >> You can also try to enable the various debugging options like
> >> >>> >>
> >> >>> >>         CONFIG_DETECT_HUNG_TASK
> >> >>> >>         CONFIG_PROVE_LOCKING
> >> >>> >>         CONFIG_DEBUG_ATOMIC_SLEEP
> >> >>> >>         CONFIG_MAGIC_SYSRQ
> >> >>> >>
> >> >>> >> or try https://lkml.org/lkml/2012/5/26/83.
> >> >>> >>
> >> >>> >> Best regards
> >> >>> >> Uwe
> >> >>> >>
> >> >>> >> --
> >> >>> >> Pengutronix e.K.           | Uwe Kleine-König           |
> >> >>> >> Industrial Linux Solutions | http://www.pengutronix.de/ |
> >> >>> >
> >> >>> > I enabled the options you mentioned and I see a lot of the
> >> >>> > following popping up while connected through serial:
> >> >>> >
> >> >>> > BUG: sleeping function called from invalid context at include/linux/freezer.h:46
> >> >>> > [ 126.896958] in_atomic(): 0, irqs_disabled(): 128, pid: 2180, name: console-kit-dae
> >> >>> > [ 126.904566] no locks held by console-kit-dae/2180.
> >> >>> > [ 126.909378] irq event stamp: 27643
> >> >>> > [ 126.912797] hardirqs last enabled at (27642): [] _raw_spin_unlock_irqrestore+0x3c/0x5c
> >> >>> > [ 126.921735] hardirqs last disabled at (27643): [] ret_fast_syscall+0xc/0x38
> >> >>> > [ 126.929618] softirqs last enabled at (25565): [] irq_exit+0x54/0xb8
> >> >>> > [ 126.936890] softirqs last disabled at (25558): [] irq_exit+0x54/0xb8
> >> >>> > [ 126.944185] [] (unwind_backtrace+0x0/0xe0) from [] (do_signal+0x84/0x5c0)
> >> >>> > [ 126.952764] [] (do_signal+0x84/0x5c0) from [] (do_notify_resume+0x18/0x60)
> >> >>> > [ 126.961430] [] (do_notify_resume+0x18/0x60) from [] (work_pending+0x24/0x28)
> >> >>> >
> >> >>> > There seem to be a lot more of them when I have the serviio
> >> >>> > upnp server running.
> >> >>>
> >> >>> After a little testing with those config options enabled that you
> >> >>> suggested, I've found that the problem with git first appears in
> >> >>> the 3.4.1 kernel.
> >> >>> For example:
> >> >>> In the 3.4.0 kernel I can use 'git clone
> >> >>> git://git.videolan.org/x264' successfully.
> >> >>>
> >> >>> From the 3.4.1 kernel on I cannot use 'git clone
> >> >>> git://git.videolan.org/x264'; the system hangs / crashes with no
> >> >>> output at all.
> >> >>>
> >> >>> The other issue, the hang / stall while mounting my ext4 raid 0,
> >> >>> is actually much more recent than I remembered. I have tested
> >> >>> each kernel from 3.4.0 all the way to 3.4.9 with the config
> >> >>> options enabled as suggested before, and the stall first starts
> >> >>> in kernel 3.4.8, and the following bug keeps popping up
> >> >>> repeatedly until the raid is mounted and then any time a disk is
> >> >>> accessed, it seems. I was certain it had been popping up before
> >> >>> 3.4.8.
> >> >>>
> >> >>> The following is one of what pops up:
> >> >>>
> >> >>> BUG: sleeping function called from invalid context at include/linux/freezer.h:46
> >> >>> in_atomic(): 0, irqs_disabled(): 128, pid: 2166, name: minissdpd
> >> >>> no locks held by minissdpd/2166.
> >> >>> irq event stamp: 2081
> >> >>> hardirqs last enabled at (2080): [] _raw_spin_unlock_irq+0x24/0x4c
> >> >>> hardirqs last disabled at (2081): [] ret_fast_syscall+0xc/0x38
> >> >>> softirqs last enabled at (0): [] copy_process+0x3f8/0xfe8
> >> >>> softirqs last disabled at (0): [< (null)>] (null)
> >> >>> [] (unwind_backtrace+0x0/0xe0) from [] (do_signal+0x84/0x554)
> >> >>> [] (do_signal+0x84/0x554) from [] (do_notify_resume+0x18/0x60)
> >> >>> [] (do_notify_resume+0x18/0x60) from [] (work_pending+0x24/0x28)
> >> >>
> >> >> I think this is an unrelated issue that I think is fixed in later
> >> >> kernels. So I'd disable CONFIG_DEBUG_ATOMIC_SLEEP for further
> >> >> testing.
> >> >> Can you try a bisection, i.e.
> >> >>
> >> >>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> >> >>         cd linux
> >> >>         git remote add -f stable git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
> >> >>         git bisect start v3.4.1 v3.4
> >> >>
> >> >> and test if the kernel that is checked out then results in a
> >> >> freeze.
> >> >>
> >> >> Depending on the test result either do:
> >> >>
> >> >>         git bisect good # i.e. problem doesn't occur
> >> >>
> >> >> or
> >> >>
> >> >>         git bisect bad # problem is reproducible
> >> >>
> >> >> Repeat that until git points out the first bad commit and report
> >> >> that.
> >> >>
> >> >> Best regards
> >> >> Uwe
> >> >>
> >> >> --
> >> >> Pengutronix e.K.           | Uwe Kleine-König           |
> >> >> Industrial Linux Solutions | http://www.pengutronix.de/ |
> >> >
> >> > I don't think it has, as I have recently been building the latest
> >> > 3.6 rc kernels while creating a patch for my NAS and I had the
> >> > same issues on them. That's what prompted me to actually come to
> >> > the mailing list with it. I have tried 3.6-rc1 and the latest
> >> > linux-next dated 17th of August.
> >>
> >> I'm not sure if this is the right commit, as I don't know how it
> >> works, but basically the very first commit caused the issue of me
> >> not being able to use git, and when I typed git bisect bad it said
> >> this:
> >>
> >> ownerx35@ownerx35-VirtualBox:~/Desktop/linux$ git bisect bad
> >> Bisecting: 21 revisions left to test after this (roughly 5 steps)
> >> [774a93aa647f8939867c8ff956847bc63dd51cb3] usbhid: prevent deadlock during timeout
> >
> > After calling git bisect bad it checks out files from a revision
> > between the one currently checked out and the one marked as good
> > (imagine binary search).
> >
> > You have to compile and test resulting kernels and tell git whether
> > they're good or bad until it tells you that it found the problematic
> > commit.
> >
> > Best regards,
> > Tomasz Figa
> >
> >> _______________________________________________
> >> linux-arm-kernel mailing list
> >> linux-arm-kernel at lists.infradead.org
> >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
> It said something about doing it 5 more times or such. And if I'm
> understanding what you're saying, basically I have to build it 5 more
> times and then it will tell me the commit that was actually bad, so
> the one I mentioned there is not the problematic one?

Yes, that's it.

P.S. Next time please reply to the mailing list, not to individual posters.
I have forwarded your message to the list this time.

Best regards,
Tomasz Figa
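
For anyone reading this in the archive: the "imagine binary search" remark can be made concrete with a small simulation. This is only a sketch under simplifying assumptions, not how git is implemented: it treats the history between the good and the bad revision as a straight line (real kernel history is a graph with merges), and the commit names, the 43-commit length and the function name find_first_bad are all made up for illustration.

```python
def find_first_bad(commits, is_bad):
    """Return (first_bad_commit, builds_tested).

    `commits` is a linear history, oldest first, that starts right after a
    known-good revision and ends at a known-bad one, so commits[-1] is
    assumed to be bad. `is_bad` stands in for "build, boot and test".
    """
    lo, hi = 0, len(commits) - 1
    tested = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tested += 1                # one build-and-test cycle
        if is_bad(commits[mid]):
            hi = mid               # corresponds to "git bisect bad"
        else:
            lo = mid + 1           # corresponds to "git bisect good"
    return commits[lo], tested


if __name__ == "__main__":
    # Hypothetical history of 43 commits; pretend c30 introduced the hang.
    history = [f"c{i:02d}" for i in range(1, 44)]
    found, builds = find_first_bad(history, lambda c: c >= "c30")
    print(found, builds)
```

Every answer halves the remaining range, which is why git can estimate the number of steps left after each "good" or "bad", and why the commit it prints in between is only the next candidate to test, not the culprit.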