From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 2/2] Make write(2) interruptible by a signal
Date: Wed, 23 Nov 2011 12:29:48 -0800
Message-ID: <20111123122948.4aa7ddfa.akpm@linux-foundation.org>
References: <1321441935-6802-1-git-send-email-jack@suse.cz>
	<1321441935-6802-3-git-send-email-jack@suse.cz>
	<20111116114421.GA9098@localhost>
	<20111122142805.4e59faae.akpm@linux-foundation.org>
	<20111123090533.GA22420@localhost>
	<20111123015005.8f366566.akpm@linux-foundation.org>
	<BE9807BA-82BE-451D-AAEC-9D51010C16C6@mit.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: Wu Fengguang <fengguang.wu@intel.com>, Jan Kara <jack@suse.cz>,
	Christoph Hellwig <hch@infradead.org>,
	Al Viro <viro@ZenIV.linux.org.uk>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
To: Theodore Tso <tytso@mit.edu>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from mail.linuxfoundation.org ([140.211.169.12]:34657 "EHLO
	mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753120Ab1KWU3u (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Wed, 23 Nov 2011 15:29:50 -0500
In-Reply-To: <BE9807BA-82BE-451D-AAEC-9D51010C16C6@mit.edu>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On Wed, 23 Nov 2011 07:27:43 -0500
Theodore Tso <tytso@mit.edu> wrote:

> >> 
> >> Maybe this is not that big a problem as SIGKILL is considered to be
> >> destructive already.
> > 
> > Yeah, I have dim dark memories that there are subtle problems with
> > interrupting write().  Linus might remember.

(err, you're sending 600-column emails)

> The big one is that you're lucky if application programmers check the
> return values of write(2), and if they do, it's likely they will only
> check for error returns and not necessarily for partial writes ---
> since no other Unix-like or Linux-like system has ever returned partial
> reads or writes for files except in error conditions.  We've barely
> gotten them trained to check for partial writes and reads with TCP
> connections and character devices, but I wouldn't bet on application
> programmers getting things right for files.
> 
> Still, if it's ***only*** for SIGKILL, we'll probably be OK, since
> for that one case there's no chance userspace can intercept the signal,
> so it can't do any recovery anyway.  (I could imagine some HPC program
> doing a massive 2GB write, and some user of that program depending on
> the fact that he can kill it at a predefined place by sending a SIGKILL
> and knowing that the file would be written up to that 2GB chunk --- but
> that's clearly an edge situation, as opposed to something that would
> affect most GNOME and KDE apps.) We just need to make sure we never try
> to do this for any other signal that could be caught, such as SIGINT or
> SIGQUIT or (worse yet) SIGTSTP.

That it is a fatal SIGKILL means that the *current* application doesn't
care.  But other processes will sometimes notice this change. 
Previously if an app did write(file, 128k) and was hit with SIGKILL, it
would write either 0 bytes or 128k bytes.  Now, it can write 36k bytes,
yes?  If the target file consisted of a stream of 128k records then the
user will claim, with some justification, that Linux corrupted it.
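
To make the 128k-record case concrete, here is a rough sketch of how
some other process might notice the damage.  It assumes an append-only
regular file made of fixed 128k records; torn_tail_bytes() and
RECORD_SIZE are names invented for this illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>

#define RECORD_SIZE (128 * 1024)	/* record size from the example above */

/*
 * Hypothetical check: return the number of trailing bytes that do not
 * form a complete record, or -1 if fstat(2) fails.  A non-zero result
 * means the last record was only partly written.
 */
static off_t torn_tail_bytes(int fd)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	return st.st_size % RECORD_SIZE;
}
```

With the old behaviour this would always return 0 for such a file.  If
a killed writer got 36k of its last record out, it would return 36k.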

Dunno.  People do lots of weird and flaky things.  I have a suspicion
that we'll be hearing back from them about this change.