From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christoph Hellwig Subject: Re: [RFC] write(2) semantics wrt return values and current position Date: Tue, 7 Apr 2015 08:25:31 -0700 Message-ID: <20150407152531.GA21108@infradead.org> References: <20150406153641.GL889@ZenIV.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: linux-fsdevel@vger.kernel.org, Linus Torvalds , Trond Myklebust , Christoph Hellwig , Dave Chinner , Theodore Ts'o , Miklos Szeredi , Oleg Drokin To: Al Viro Return-path: Received: from bombadil.infradead.org ([198.137.202.9]:42477 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754546AbbDGPZc (ORCPT ); Tue, 7 Apr 2015 11:25:32 -0400 Content-Disposition: inline In-Reply-To: <20150406153641.GL889@ZenIV.linux.org.uk> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Mon, Apr 06, 2015 at 05:02:31PM +0100, Al Viro wrote: > 6) XFS seems to have fun bugs in O_DIRECT handling. Consider > the following scenario: > * O_DIRECT write() is called, we hit xfs_file_dio_aio_write(). > * we check alignment and make decision whether to do > xfs_rw_ilock exclusive (which will include i_mutex) or shared (which will > not). Suppose it takes that shared. > * we call xfs_file_aio_write_checks(), which, for starters, might > modify position (on O_APPEND) and size (on rlimit). Which renders the > alignment checks useless, of course, but what's worse, it proceeds to > calling xfs_break_layouts(), which might drop and retake XFS part of what's > taken by xfs_rw_iolock(). Retake it exclusive, and update the iolock flag > passed to it by reference accordingly. And when we return to > xfs_file_aio_write_checks(), and do xfs_rw_iunlock(), we'll end up dropping > exclusively taken XFS part of things *and* ->i_mutex we'd never taken. > I might be misreading that code (it sure as hell wouldn't be > the first time when xfs_{rw_,}_ilock() is involved), but it looks dubious > to me... It's not just dubious, it's broken. I've forgotten to drop and retake i_mutex there (depending on EXCL) flag. It's been hitting me by making another bug worse. I've got an RFC patches for a few days, just need to get around to send it out, it's proably 4.0 material. And yes, alignment checks really should be past xfs_file_aio_write_checks, or at least be re-checked there.