From: Benny Halevy <bhalevy@tonian.com>
To: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Cc: "Mora, Jorge" <Jorge.Mora@netapp.com>,
	"Isaman, Fred" <Fred.Isaman@netapp.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] pnfs: do not reset to mds if wb_offset != wb_pgbase
Date: Mon, 18 Mar 2013 18:45:49 +0200
Message-ID: <514744BD.6000205@tonian.com>
In-Reply-To: <1363624756.4351.30.camel@leira.trondhjem.org>

On 2013-03-18 18:39, Myklebust, Trond wrote:
> On Mon, 2013-03-18 at 18:22 +0200, Benny Halevy wrote:
>> On 2013-03-18 17:55, Myklebust, Trond wrote:
>>> On Mon, 2013-03-18 at 16:38 +0200, Benny Halevy wrote:
>>>> We're seeing roughly 20% of the I/Os going to the MDS
>>>> when installing a VM over KVM in "none" caching mode (O_DIRECT).
>>>> Instrumenting the client revealed that this is caused by buffer
>>>> alignment vs. file offset alignment.
>>>> Besides being a performance problem, when the MDS caches data
>>>> this is also manifested as data corruption when data is written
>>>> first via the MDS, then via the DS, eventually the stale data is
>>>> read back from the MDS.
>>>
>>> That's why we should return the layout.
>>
>> We are not in this case.
> 
> Doh! I was thinking it was a case where we need to fence...
> 
> Actually, it shouldn't be needed: we will always do a _stable_ write of
> the data before we try to read it back in from the server, so MDS
> caching shouldn't be a problem.
> 

Writing stable to the MDS does not solve all cases.
The corruption we've seen happens like this:

write(A) to MDS
write(B) to DS
read from MDS returns A (stale), since the MDS is caching the last data written to it.
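
To make the sequence concrete, here is a minimal toy model of it, not kernel code. It assumes an MDS that keeps its own cache of whatever was last written through it and is never told about writes that go directly to the DS. The names CachingMDS, DataServer and replay are hypothetical and exist only for this illustration.

```python
class CachingMDS:
    """Toy MDS: writes through to shared storage and caches what it wrote.

    Assumption for illustration: the cache is never invalidated by
    writes that reach the storage through a data server.
    """

    def __init__(self, storage):
        self.storage = storage
        self.cache = {}

    def write(self, offset, data):
        # Even a stable write lands on storage *and* stays in the cache.
        self.storage[offset] = data
        self.cache[offset] = data

    def read(self, offset):
        # Serve from the cache when possible; DS writes are invisible here.
        if offset in self.cache:
            return self.cache[offset]
        return self.storage.get(offset)


class DataServer:
    """Toy DS: reads and writes the shared storage directly."""

    def __init__(self, storage):
        self.storage = storage

    def write(self, offset, data):
        self.storage[offset] = data

    def read(self, offset):
        return self.storage.get(offset)


def replay():
    """Replay write(A) via MDS, write(B) via DS, then read via MDS."""
    storage = {}
    mds = CachingMDS(storage)
    ds = DataServer(storage)

    mds.write(0, b"A")   # unaligned request falls back to the MDS
    ds.write(0, b"B")    # later aligned request goes to the DS
    return mds.read(0), ds.read(0), storage[0]


if __name__ == "__main__":
    via_mds, via_ds, on_storage = replay()
    print("read via MDS:", via_mds)
    print("read via DS: ", via_ds)
    print("on storage:  ", on_storage)
```

In this model the write to the MDS is stable in the sense that it reaches storage immediately, yet the later read through the MDS still returns A while the storage holds B. That is why a stable write alone does not cover this case.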

>>>> Note that this check exists also for the file layout specific
>>>> pg_init_* functions.  The objects (ORE) and block
>>>> (bl_{read,write}_pagelist) layouts seem to deal correctly with
>>>> splitting IOs in the case where req->wb_offset != req->wb_pgbase
>>>> though this hasn't been tested when submitting this patch.
>>>>
>>> NACK. I see no evidence that we've addressed the issues that were raised
>>> by Fred in commit 1825a0d08f22463e5a8f4b1636473efd057a3479 (NFS: prepare
>>> coalesce testing for directio).
>>> If you think that his concerns about the coalescing assumptions are no
>>> longer true, then please point to why this is the case. AFAICR that
>>> patch was added to fix corruption issues.
>>>
>>
>> We see no problems with this patch with the workloads we're testing.
>> Do you have a test that reproduces the original problem that we can try running?
> 
> I suspect it was one of the nfstests. (see
> git://git.linux-nfs.org/projects/mora/nfstest.git ) since Fred was
> working with Jorge to do the O_DIRECT testing.
> 
> Fred, Jorge?
> 
> 
