Date: Fri, 2 Jun 2006 09:51:50 +0200
From: Ingo Molnar
To: Jan Beulich
Cc: jeff@garzik.org, htejun@gmail.com, Andrew Morton, reuben-lkml@reub.net, linux-kernel@vger.kernel.org
Subject: Re: 2.6.17-rc5-mm2
Message-ID: <20060602075150.GA12212@elte.hu>
References: <20060601014806.e86b3cc0.akpm@osdl.org> <447EB4AD.4060101@reub.net>
 <20060601025632.6683041e.akpm@osdl.org> <447EBD46.7010607@reub.net>
 <20060601103315.GA1865@elte.hu> <20060601105300.GA2985@elte.hu>
 <447EF7A8.76E4.0078.0@novell.com> <448006F6.76E4.0078.0@novell.com>
In-Reply-To: <448006F6.76E4.0078.0@novell.com>

* Jan Beulich wrote:

> >firstly, i'd suggest to use another magic value for 'bottom of call
> >stacks' - it is way too common to jump or call a NULL pointer. Something
> >like 0xfedcba9876543210 would be better.
>
> That's contrary to common use (outside of the kernel). I'm opposed to
> this. Detecting an initial bad EIP isn't a problem, and the old code
> can be used easily in that case.

but 0 is pretty much the worst choice for something that needs to be
reliable - it's the most common type of machine word in existence,
amongst all the 18446744073709551616 possibilities. And we need not care
about userspace's prior choices, this code and data is totally under the
kernel's control.

> >for the RIP/EIP to get corrupted is a common occurrence. So is stack
> >corruption. So the fallback mechanism shouldnt be a 'short while'
> >side-thought, it must be part of the design.
>
> RIP/EIP corruption, as said above, can be easily handled. RSP/ESP
> corruption, as I understand it, isn't being handled in the old code,
> and so I can't see what improvements the new code could do here (given
> that instruction and stack pointers serve as the anchors for kicking
> off an unwind).

i'm not only talking about RSP/ESP corruption, but about stack
corruption. I.e. some area of the stack is corrupted. With the scanning
method we at least get some other entries out - while with the unwind
method we only say 'sorry'.

anyway, i think that handling a bad initial RIP/EIP would be a good
first step and it should solve the problem at hand. (it will also serve
as a basis for whatever other heuristics we might want to apply later
on)

> >In all other cases (if we go outside of the stack page(s)) we _must_
> >fall back to the dump 'scan the stack pages for interesting entries'
> >method, to get the information out! "Uh oh the unwind info somehow got
> >corrupted, sorry" is not enough to debug a kernel bug.
>
> Again, you miss the point that the very last unwind operation must
> always be expected to move the stack pointer outside the stack
> boundaries, which would mean triggering the fallback path in all
> cases. With this, we could as well leave out the entire unwind code
> and keep everyone of us manually do the separation of good and bad
> entries in the trace shown.

no, i dont miss that point at all.
What _you_ are missing is the obvious solution: stacks on x86_64 are
already linked to each other, via fixed-position pointers at the end of
the stackpages. So the unwinder can easily check whether the 'next
stack' as suggested by the link at the end of the page is indeed the
same as the unwind jumpout does. If not => fallback.

same for i386 - there too the stacks are linked via non-unwind data. The
unwinder can do a pretty good verification of the jumpout.

	Ingo
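
a rough sketch of the jumpout check and the scanning fallback described above, in plain C. The struct stack_info layout and the names on_stack(), unwind_step_plausible() and scan_stack() are hypothetical, made up for illustration; they are not the kernel's actual unwinder interfaces, and where the link really lives is arch-specific.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one stack; not a real kernel structure. */
struct stack_info {
    uintptr_t low;   /* lowest valid address of this stack */
    uintptr_t high;  /* one past the highest valid address */
    uintptr_t link;  /* stack pointer stored in the fixed slot at the end of
                        the stack, i.e. where the previous stack resumes */
};

/* True if sp lies within the bounds of stack s. */
static bool on_stack(const struct stack_info *s, uintptr_t sp)
{
    return sp >= s->low && sp < s->high;
}

/*
 * Decide whether the stack pointer produced by one unwind step can be
 * trusted. Staying on the current stack is fine; leaving it is accepted
 * only if the unwinder lands exactly where the non-unwind link points.
 */
static bool unwind_step_plausible(const struct stack_info *cur,
                                  uintptr_t next_sp)
{
    if (on_stack(cur, next_sp))
        return true;
    return next_sp == cur->link;
}

/*
 * Fallback: scan every word of the stack and collect the ones that fall
 * inside the text range. Returns the number of entries written to out[].
 */
static size_t scan_stack(const uintptr_t *words, size_t nwords,
                         uintptr_t text_start, uintptr_t text_end,
                         uintptr_t *out, size_t max_out)
{
    size_t n = 0;

    for (size_t i = 0; i < nwords && n < max_out; i++) {
        if (words[i] >= text_start && words[i] < text_end)
            out[n++] = words[i];
    }
    return n;
}
```

a caller would run unwind_step_plausible() after each unwind step and switch to scan_stack() over the remaining stack words as soon as it returns false, so that some entries still get printed instead of just 'sorry'.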