From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Serge E. Hallyn"
Subject: Re: 2.6.33-rc4 i686 clone function looping (seems real!)
Date: Tue, 19 Jan 2010 16:10:47 -0600
Message-ID: <20100119221047.GA12806@us.ibm.com>
References: <1263852243.4745.363.camel@Mercier.safe.ca> <20100119150931.GA7708@us.ibm.com> <1263916851.4745.386.camel@Mercier.safe.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <1263916851.4745.386.camel-4BUXZ/Ty1v7iqR6jatDSCA@public.gmane.org>
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: Jean-Marc Pigeon
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
List-Id: containers.vger.kernel.org

Quoting Jean-Marc Pigeon (jmp-4qkeo2rQ0gg@public.gmane.org):
> Hello Serge,
>
> Thanks for the small test, I do confirm
> there is the same problem here with it,
> as soon started the program use ALL available
> CPU cycle (minus some few %) and NEVER
> EVER come back from "clone" function.
> The ONLY way I found to recover the
> system is to power it down (sic!).
>
> See attachment, your program, the
> .config file, and cpu information
> (I put back your test too, (done
> cosmetic changes only)).
>
> Once again, clone call on 2.6.32.3
> is working fine.
>
> Sorry to bother the list, I was "expecting"
> a Stack size problem, but I increase
> your value from 4 to 10, with the same result.
> Hopefully it could be something I overlooked
> with my Kernel config file, if someone
> want to have a look I attached it.
>
> My guess for now it is something within
> the clone code specific to i386.
>
> I 'll try to pin point within the
> clone code.
>
> (could someone check tstclone.c under
> i386 arch and confirm trouble?)

I just tried it on an x86-32 kvm image with no hang (cut-n-paste from
your copy of testclone.c).  Could you:

	1. use cpusets and the memory limiter to limit the shell from
	   which you run testclone to 1 cpu and 1/3 of your ram,
	2. fire up testclone,
	3. run gdb -se testclone -p `pidof testclone` in another shell

and see whether anything is hung in userspace?  If you need more
specific hints please let me know.

Also, you might try with selinux disabled ('setenforce 0'), though I
don't see how it can be responsible.

-serge
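
For readers without the attachment, a clone test of the kind discussed
in this thread can be sketched roughly as below.  This is not the
attached testclone.c.  The names run_clone_test() and child_fn() are
hypothetical, the stack size is assumed to be counted in pages (the
thread only says the value went "from 4 to 10"), and the only clone
flag assumed is SIGCHLD, with no namespace flags.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs in the child; its return value becomes the child's exit status. */
static int child_fn(void *arg)
{
    (void)arg;
    return 0;
}

/*
 * Hypothetical helper, not taken from the original test program:
 * clone one child on a stack of stack_pages pages (assumed unit) and
 * wait for it.  Returns the child's exit status, or -1 on failure.
 */
int run_clone_test(int stack_pages)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);               /* page size is always positive on Linux */

    if (stack_pages <= 0)
        return -1;

    size_t size = (size_t)stack_pages * (size_t)page;
    char *stack = malloc(size);
    if (stack == NULL)
        return -1;

    /* The stack grows down on x86, so clone() gets the top of the buffer. */
    pid_t pid = clone(child_fn, stack + size, SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    /* If clone() never returned, as reported, we would not get here. */
    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);

    if (waited != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_clone_test(4) and then run_clone_test(10) from a small
main() mirrors the two stack sizes mentioned above.  On a kernel where
clone works, both calls should return promptly.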