Date: Tue, 3 Nov 2015 07:10:29 +1100
From: Dave Chinner
To: Jeff Moyer
Cc: Ross Zwisler, linux-kernel@vger.kernel.org, "H. Peter Anvin", "J. Bruce Fields", "Theodore Ts'o", Alexander Viro, Andreas Dilger, Dan Williams, Ingo Molnar, Jan Kara, Jeff Layton, Matthew Wilcox, Thomas Gleixner, linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org, x86@kernel.org, xfs@oss.sgi.com, Andrew Morton, Matthew Wilcox
Subject: Re: [RFC 00/11] DAX fsynx/msync support
Message-ID: <20151102201029.GI10656@dastard>
References: <1446149535-16200-1-git-send-email-ross.zwisler@linux.intel.com> <20151030035533.GU19199@dastard> <20151030183938.GC24643@linux.intel.com> <20151101232948.GF10656@dastard>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 02, 2015 at 09:22:15AM -0500, Jeff Moyer wrote:
> Dave Chinner writes:
>
> > Further, REQ_FLUSH/REQ_FUA are more than just "put the data on stable
> > storage" commands. They are also IO barriers that affect scheduling
> > of IOs in progress and in the request queues.
> > A REQ_FLUSH/REQ_FUA IO cannot be dispatched before all prior IO has
> > been dispatched and drained from the request queue, and IO submitted
> > after a queued REQ_FLUSH/REQ_FUA cannot be scheduled ahead of the
> > queued REQ_FLUSH/REQ_FUA operation.
> >
> > IOWs, REQ_FUA/REQ_FLUSH not only guarantee data is on stable
> > storage, they also guarantee the order of IO dispatch and
> > completion when concurrent IO is in progress.
>
> This hasn't been the case for several years, now. It used to work that
> way, and that was deemed a big performance problem. Since file systems
> already issued and waited for all I/O before sending down a barrier, we
> decided to get rid of the I/O ordering pieces of barriers (and stop
> calling them barriers).
>
> See commit 28e7d184521 (block: drop barrier ordering by queue draining).

Yes, I realise that, even if I wasn't very clear about how I wrote
it. ;)

Correct me if I'm wrong: AFAIA, dispatch ordering (i.e. the "IO
barrier") is still enforced by the scheduler via REQ_FUA|REQ_FLUSH
-> ELEVATOR_INSERT_FLUSH -> REQ_SOFTBARRIER and subsequent IO
scheduler calls to elv_dispatch_sort() that don't pass
REQ_SOFTBARRIER in the queue.

IOWs, if we queue a bunch of REQ_WRITE IOs followed by a
REQ_WRITE|REQ_FLUSH IO, all of the prior REQ_WRITE IOs will be
dispatched before the REQ_WRITE|REQ_FLUSH IO and hence be captured
by the cache flush.

Hence once the filesystem has waited on the REQ_WRITE|REQ_FLUSH IO
to complete, we know that all the earlier REQ_WRITE IOs are on
stable storage, too. Hence there's no need for the elevator to drain
the queue to guarantee completion ordering - the dispatch ordering
and flush/fua write semantics guarantee that when the flush/fua
completes, all the IOs dispatched prior to that flush/fua write are
also on stable storage...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
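
To illustrate the dispatch-ordering argument above, here is a toy model in Python. It is a sketch of the behaviour as described in this mail, not kernel code: the Request class and the dispatch_sort() and dispatch_barrier() helpers are hypothetical names, and the soft_barrier field only stands in for REQ_SOFTBARRIER. The assumption being modelled is that a sorted insert scans back from the tail of the dispatch queue and stops at the first soft barrier it meets, so later writes cannot be sorted ahead of a queued flush.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    sector: int
    soft_barrier: bool = False  # stand-in for REQ_SOFTBARRIER (assumption)


def dispatch_sort(queue, rq):
    """Insert rq in sector order, scanning back from the tail.

    The scan stops at a soft barrier, so rq never moves ahead of it.
    """
    pos = len(queue)
    while pos > 0:
        prev = queue[pos - 1]
        if prev.soft_barrier or rq.sector >= prev.sector:
            break
        pos -= 1
    queue.insert(pos, rq)


def dispatch_barrier(queue, rq):
    """Queue a flush-style request at the tail and mark it as a barrier."""
    rq.soft_barrier = True
    queue.append(rq)


queue = []

# Three plain writes, submitted out of sector order.
for name, sector in [("w1", 300), ("w2", 100), ("w3", 200)]:
    dispatch_sort(queue, Request(name, sector))

# The flush write; everything queued so far stays ahead of it.
dispatch_barrier(queue, Request("flush", 0))

# Later writes with low sector numbers still cannot pass the flush.
for name, sector in [("w4", 50), ("w5", 10)]:
    dispatch_sort(queue, Request(name, sector))

order = [r.name for r in queue]
print(order)
```

In this model the writes on either side of the flush are still sorted among themselves, but nothing crosses it, so waiting for the flush to complete is enough to cover w1 to w3 without draining the queue first.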