Date: Fri, 30 Mar 2012 16:29:38 +0200 (CEST)
From: Thomas Gleixner
To: David Henningsson
cc: Arun Raghavan, LKML, Peter Zijlstra
Subject: Re: [PATCH] [RESEND] rlimits: Print more information when limits are exceeded
In-Reply-To: <4F75BF5B.7000306@canonical.com>
References: <1330025378-26075-1-git-send-email-arun.raghavan@collabora.co.uk> <4F75BF5B.7000306@canonical.com>

On Fri, 30 Mar 2012, David Henningsson wrote:
> On 03/30/2012 03:39 PM, Thomas Gleixner wrote:
> > On Fri, 24 Feb 2012, Arun Raghavan wrote:
> >
> > > This dumps some information in logs when a process exceeds its CPU or RT
> > > limits (soft and hard). Makes debugging easier when userspace triggers
> > > these limits.
> >
> > Why do we need to spam the logs with such information?
> >
> > SIGXCPU is only ever sent by this code. If there is a signal handler
> > in the application it's easy to debug. If not it's even easier, the
> > thing will simply be killed and you get the reason printed.
>
> I'm not totally sure, but don't we log SIGSEGVs? If so, the same reasoning
> would apply to SIGSEGV.

I think so. Dunno why this was added in the first place. core dumps or
proper signal handlers are telling you usually more than that single
line in dmesg.

> > For the SIGKILL case there only a limited number of reasons why a
> > SIGKILL is sent. So no, I rather commit a patch which removes that
> > ugly printk which is already there instead of adding more of them.
>
> The reason I proposed some kind of printk for SIGKILL, was to get some
> diagnostic information out of the SIGKILL. E g, if you have two threads both
> running on rtprio rlimits in the same process, it would be very interesting to
> know which one of them was causing the kernel to send SIGKILL.

Usually the one which ignored SIGXCPU for quite a while. There is a
reason why SIGXCPU can be handled by the application.

> Also, it could be useful to know whether the SIGKILL was actually sent by the
> kernel, or by some other process feeling evil (e g "kill -9").

Agreed, but instead of adding that printk to the rlimit code I prefer
a generic infrastructure which can be used by all call sites which
issue SIGKILL. Something like:

	[__]kill_it(flags, task, "Reason");

Thanks,

	tglx