From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wu Fengguang
Subject: Re: [PATCH 2/2] fs: Make write(2) interruptible by a signal
Date: Wed, 23 Nov 2011 17:05:33 +0800
Message-ID: <20111123090533.GA22420@localhost>
References: <1321441935-6802-1-git-send-email-jack@suse.cz> <1321441935-6802-3-git-send-email-jack@suse.cz> <20111116114421.GA9098@localhost> <20111122142805.4e59faae.akpm@linux-foundation.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Jan Kara, Christoph Hellwig, Al Viro, "linux-fsdevel@vger.kernel.org"
To: Andrew Morton
Return-path:
Received: from mga01.intel.com ([192.55.52.88]:41557 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752216Ab1KWJFf (ORCPT ); Wed, 23 Nov 2011 04:05:35 -0500
Content-Disposition: inline
In-Reply-To: <20111122142805.4e59faae.akpm@linux-foundation.org>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Wed, Nov 23, 2011 at 06:28:05AM +0800, Andrew Morton wrote:
> On Wed, 16 Nov 2011 19:44:21 +0800
> Wu Fengguang wrote:
>
> > Due to the (very low) possibility of data loss by partial writes, IMHO
> > it would safer to test this patch in linux-next until next merge window,
>
> Any such bugs will not be discovered in linux-next testing.

Yup, I'm afraid.

> The only way to find these things in a reasonable period of time is to
> go in and find them. For example, intensive fsx-linux testing with
> concurrent heavy memory pressure on various filesystems with various
> block sizes. And of course concurrent signalling. If you're talking
> about O_DIRECT then iirc I hacked support for that into fsx-linux. I
> think.

How are we going to measure success/failure? By checking whether it
eventually results in filesystem corruption or whatever? On receiving
SIGKILL, fsx-linux itself will just die.

> Anyway, what _are_ the scenarios in which we think data can be lost?

The concern is that there may be partial writes on SIGKILL.
Before the patch, the write would either succeed as a whole or not start
at all, depending on the timing of write/SIGKILL. That is a kind of
atomic operation. Now, however, the write can be left half done.

If the application really cares about atomic behavior, it can do
create-and-rename. However, there is always the possibility of broken
applications. Maybe this is not that big a problem, as SIGKILL is
already considered to be destructive.

Thanks,
Fengguang
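
For reference, the create-and-rename approach mentioned above could look
roughly like the userspace sketch below. It is only an illustration under
stated assumptions: the helper names write_all() and atomic_replace() and
the ".tmp.XXXXXX" suffix are made up here and are not part of the patch.
The new contents go to a temporary file in the same directory, and
rename(2) then replaces the target in one step, so a SIGKILL in the middle
leaves either the old file or the new one, never a half-written one.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write all len bytes, retrying on short writes and on EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical helper: replace the contents of `path` with `data`.
 * The temporary file lives next to the target so that rename(2) stays
 * within one filesystem. Returns 0 on success, -1 with errno set.
 */
int atomic_replace(const char *path, const char *data, size_t len)
{
	char tmp[4096];
	int fd, n, saved;

	assert(path != NULL && data != NULL);

	n = snprintf(tmp, sizeof(tmp), "%s.tmp.XXXXXX", path);
	if (n < 0 || (size_t)n >= sizeof(tmp)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	fd = mkstemp(tmp);	/* creates the file with mode 0600 */
	if (fd < 0)
		return -1;

	/* Get the data onto stable storage before exposing the new name. */
	if (write_all(fd, data, len) < 0 || fsync(fd) < 0) {
		saved = errno;
		close(fd);
		unlink(tmp);
		errno = saved;
		return -1;
	}

	if (close(fd) < 0 || rename(tmp, path) < 0) {
		saved = errno;
		unlink(tmp);
		errno = saved;
		return -1;
	}
	return 0;
}
```

A caller would use atomic_replace("config.txt", buf, len) in place of
open/write on the target itself. The sketch leaves out fsync() on the
containing directory and does not preserve the old file's mode or owner,
both of which a careful application would probably want.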