From: Dave Chinner
Subject: Re: [PATCH v3] fs: record task name which froze superblock
Date: Sun, 1 Mar 2015 08:31:26 +1100
Message-ID: <20150228213126.GM4251@dastard>
References: <20150214185524.GA16579@p183.telecom.by>
 <20150216093852.GB4749@quack.suse.cz>
 <20150218073455.GA1752@p183.telecom.by>
 <20150218091323.GA4614@quack.suse.cz>
 <20150220121522.GC6293@quack.suse.cz>
 <20150228142235.GA19552@p183.telecom.by>
 <20150228142557.GB19552@p183.telecom.by>
In-Reply-To: <20150228142557.GB19552@p183.telecom.by>
To: Alexey Dobriyan
Cc: Jan Kara, Andrew Morton, Al Viro, Linux Kernel, linux-fsdevel,
 swhiteho@redhat.com, cluster-devel@redhat.com

On Sat, Feb 28, 2015 at 05:25:57PM +0300, Alexey Dobriyan wrote:
> Freezing and thawing are separate system calls, task which is supposed
> to thaw filesystem/superblock can disappear due to crash or not thaw
> due to a bug. At least record task name (we can't take task_struct
> reference) to make support engineer's life easier.
>
> Hopefully 16 bytes per superblock isn't much.
>
> TASK_COMM_LEN definition (which is userspace ABI, see prctl(PR_SET_NAME)) is
> moved to userspace exported header to not drag sched.h into every fs.h inclusion.
>
> Signed-off-by: Alexey Dobriyan

Freeze/thaw can be nested at the block level. That means the
sb->s_writers.freeze_comm can point at the wrong process. i.e.

Task A                          Task B
freeze_bdev
  freeze_super
    freeze_comm = A
                                freeze_bdev
.....
thaw_bdev

At this point, the block device will never be unthawed, but
the debug field is now pointing to the wrong task. i.e.
The debug helper has not recorded the process that is actually causing
the problem, and leads us all off on a wild goose chase down the wrong
path. IMO, debug code is only useful if it's reliable.....

> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -303,9 +303,6 @@ extern char ___assert_task_state[1 - 2*!!(
>
>  #endif
>
> -/* Task command name length */
> -#define TASK_COMM_LEN 16
> -
>  #include
>
>  /*
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -49,4 +49,7 @@
>   */
>  #define SCHED_FLAG_RESET_ON_FORK 0x01
>
> +/* Task command name length */
> +#define TASK_COMM_LEN 16
> +
>  #endif /* _UAPI_LINUX_SCHED_H */

That should be a separate patch, sent to the scheduler maintainers
for review. AFAICT, it isn't part of the user API - it's not defined
in the man page which just says "can be up to 16 bytes".

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
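
For reference, the nested freeze scenario described above can be
reproduced with a small userspace model. This is only a sketch and not
kernel code: it assumes that nesting at the block level is tracked by a
plain counter and that the task name is recorded only when the
superblock is actually frozen. Every toy_* name and the "task-A" /
"task-B" strings are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16

/* Hypothetical stand-ins for the superblock and block device. */
struct toy_sb {
	int frozen;
	char freeze_comm[TASK_COMM_LEN];	/* the proposed debug field */
};

struct toy_bdev {
	int fsfreeze_count;			/* assumed nesting counter */
	struct toy_sb sb;
};

/* Freeze the superblock and record who did it. */
static void toy_freeze_super(struct toy_sb *sb, const char *comm)
{
	sb->frozen = 1;
	snprintf(sb->freeze_comm, sizeof(sb->freeze_comm), "%s", comm);
}

static void toy_thaw_super(struct toy_sb *sb)
{
	sb->frozen = 0;
	sb->freeze_comm[0] = '\0';
}

/* Only the first (outermost) freeze reaches the superblock. */
static void toy_freeze_bdev(struct toy_bdev *bdev, const char *comm)
{
	if (++bdev->fsfreeze_count > 1)
		return;		/* nested: freeze_comm is not updated */
	toy_freeze_super(&bdev->sb, comm);
}

/* Only the last thaw reaches the superblock. */
static void toy_thaw_bdev(struct toy_bdev *bdev)
{
	assert(bdev->fsfreeze_count > 0);
	if (--bdev->fsfreeze_count > 0)
		return;		/* another freezer is still outstanding */
	toy_thaw_super(&bdev->sb);
}

/* Replay the sequence from the diagram: A freezes, B freezes, A thaws. */
static void toy_nested_scenario(struct toy_bdev *bdev)
{
	memset(bdev, 0, sizeof(*bdev));
	toy_freeze_bdev(bdev, "task-A");
	toy_freeze_bdev(bdev, "task-B");
	toy_thaw_bdev(bdev);		/* issued by task A */
}
```

After toy_nested_scenario() the model is still frozen with one freeze
outstanding, which belongs to task B, yet freeze_comm still reads
"task-A". That is the misleading state the debug field would report.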