From: Fernando Luis Vazquez Cao
Subject: Re: [PATCH] fsfreeze: tell hung_task about processes put to sleep
Date: Mon, 15 Oct 2012 15:51:34 +0900
Message-ID: <507BB276.8020502@lab.ntt.co.jp>
In-Reply-To: <20121015063608.GW2739@dastard>
References: <1350035252.6500.2.camel@nexus.lab.ntt.co.jp> <20121013010613.GP2739@dastard> <507B820B.3000908@lab.ntt.co.jp> <20121015063608.GW2739@dastard>
To: Dave Chinner
Cc: Al Viro, Ingo Molnar, Jan Kara, linux-fsdevel@vger.kernel.org

On 2012/10/15 15:36, Dave Chinner wrote:
> On Mon, Oct 15, 2012 at 12:24:59PM +0900, Fernando Luis Vazquez Cao wrote:
>> On 2012/10/13 10:06, Dave Chinner wrote:
>>> On Fri, Oct 12, 2012 at 06:47:32PM +0900, Fernando Luis Vázquez Cao wrote:
>>>> Any process attempting to write to a frozen filesystem uninterruptibly and
>>>> unkillably waits for the filesystem to be thawed. This wait is of unbounded
>>>> length. Ignore such waits in the hung_task detector.
>>> Filesystems should not be frozen for long enough to trigger the hung
>>> task detector under normal usage. IMO, if you are freezing a
>>> filesystem for that long, then you're either doing something wrong
>>> or something has gone wrong, and in either case I think we should be
>>> emitting warnings...
>> The problem is that in production systems situations where
>> a filesystem remains frozen for long periods are not uncommon.
>> A typical example is as follows: the control daemon or script that
>> controls the freeze/thaw using the fsfreeze ioctls dies, the next
> There's your problem. Fix that, don't turn off useful warnings that
> indicate something has gone wrong.

It is not my problem. It is the enterprise distro users' problem. As I
mentioned in my previous email, if you want to emit a warning, do it in
the right place and make sure that it is something informative.
hung_check certainly isn't the right place to do it.

A failure in a user space script should not lead to a kernel panic or to
a flood of process stack dumps in the system log that administrators
cannot interpret (a common complaint from our customers). This is the
behaviour this patch is trying to fix.

Thanks,
Fernando