From: Philipp Reisner
Organization: LINBIT
To: Bart Van Assche
Cc: James Bottomley, david@lang.hm, Willy Tarreau, Andrew Morton, linux-kernel@vger.kernel.org, Jens Axboe, Greg KH, Neil Brown, Sam Ravnborg, Dave Jones, Nikanth Karthikesan, "Lars Marowsky-Bree", Kyle Moffett, Lars Ellenberg
Subject: Re: [PATCH 00/16] DRBD: a block device for HA clusters
Date: Tue, 5 May 2009 17:57:15 +0200
Message-Id: <200905051757.17100.philipp.reisner@linbit.com>
References: <1241090812-13516-1-git-send-email-philipp.reisner@linbit.com> <200905051021.33461.philipp.reisner@linbit.com>

On Tuesday 05 May 2009 17:03:13 Bart Van Assche wrote:
> On Tue, May 5, 2009 at 10:21 AM, Philipp Reisner wrote:
> > What we have in DRBD boils down to:
> >
> > * We obey all possible write after write dependencies in the stream of
> >   writes we get from the upper layers. And generate DRBD internal
> >   reorder barriers for the packet stream.
>
> Hello Philipp,
>
> I couldn't find a call to blk_queue_ordered() in the DRBD 8.3.1 source
> code.
> This made me wonder how DRBD obtains information about barriers
> that is generated by filesystems like ext3 with the option barrier=1 ?
>

Hi Bart,

I was referring to the implicit write-after-write dependencies that one
needs to obey when doing asynchronous replication.

Up to now we do not offer barrier support to the layers above us. That
will follow sooner or later.

Here is an example of why it is not completely trivial:

Imagine DRBD on top of a dm-linear on both nodes. When you start, both
dm-linear mappings sit on top of something that supports barriers
itself. Then the user replaces the backing device below the dm-linear
on the secondary node with something that does not support barriers.

When we get a write request with the BIO_RW_BARRIER flag set from the
FS, we submit it locally, ship it over to the peer and submit it there.
Unfortunately it now fails with ENOTSUP on the peer. We cannot ship
that error back to the upper layer, because our mirror is already
inconsistent. We have to resubmit it with BIO_RW_BARRIER cleared and
use other means to enforce write ordering. Then we tell the other node
that we prefer to no longer accept BIO_RW_BARRIER, etc.

-Phil
-- 
: Dipl-Ing Philipp Reisner
: LINBIT | Your Way to High Availability
: Tel: +43-1-8178292-50, Fax: +43-1-8178292-82
: http://www.linbit.com

DRBD(R) and LINBIT(R) are registered trademarks of LINBIT, Austria.
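
For illustration, the fallback described above could be modelled roughly
as follows. This is a toy user-space sketch in Python, not DRBD code:
Disk, Peer, replicate(), drain() and the bit value chosen for
BIO_RW_BARRIER are all made up for the example, and drain() merely
stands in for whatever other means are used to enforce write ordering.

```python
import errno

BIO_RW_BARRIER = 1 << 0  # illustrative bit value, not the kernel's definition


class Disk:
    """Toy backing device; records writes in the order they are submitted."""

    def __init__(self, supports_barriers):
        self.supports_barriers = supports_barriers
        self.log = []

    def submit(self, data, flags):
        # A barrier write fails if the device cannot honour barriers.
        if flags & BIO_RW_BARRIER and not self.supports_barriers:
            return -errno.ENOTSUP
        self.log.append((data, bool(flags & BIO_RW_BARRIER)))
        return 0

    def drain(self):
        # Placeholder for "other means to enforce write ordering",
        # e.g. waiting until all earlier writes have completed.
        self.log.append(("<drain>", False))


class Peer:
    """Hypothetical secondary node: applies replicated writes to its disk."""

    def __init__(self, disk):
        self.disk = disk
        self.accepts_barriers = True  # what we advertise to the other node

    def receive_write(self, data, flags):
        wants_barrier = bool(flags & BIO_RW_BARRIER)
        if wants_barrier and self.accepts_barriers:
            rc = self.disk.submit(data, flags)
            if rc != -errno.ENOTSUP:
                return rc
            # The primary has already written this block locally, so the
            # error cannot be passed back up. Stop using barriers instead.
            self.accepts_barriers = False
        if wants_barrier:
            self.disk.drain()
        # Resubmit (or submit) with the barrier flag cleared.
        return self.disk.submit(data, flags & ~BIO_RW_BARRIER)


def replicate(local_disk, peer, data, flags):
    """Primary side: submit locally, then ship the write to the peer."""
    rc = local_disk.submit(data, flags)
    if rc != 0:
        return rc
    return peer.receive_write(data, flags)


local = Disk(supports_barriers=True)
peer = Peer(Disk(supports_barriers=False))  # backing device was swapped out
rc1 = replicate(local, peer, "A", 0)
rc2 = replicate(local, peer, "B", BIO_RW_BARRIER)
```

In this model both writes succeed from the point of view of the upper
layer, the peer ends up with a drain in place of the barrier, and
peer.accepts_barriers is left False so the other node can be told not to
rely on BIO_RW_BARRIER any more. What the primary then does with that
information is left open here, as it is in the description above.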