All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: virtio-fs@redhat.com
Subject: Re: [Virtio-fs] [PATCH RFC] virtiofs: use fine-grained lock for dmap reclaim
Date: Thu, 11 Jul 2019 09:49:58 +0100	[thread overview]
Message-ID: <20190711084958.GC3971@work-vm> (raw)
In-Reply-To: <20190710203714.GA25146@redhat.com>

* Vivek Goyal (vgoyal@redhat.com) wrote:
> On Wed, Jul 10, 2019 at 11:41:29AM -0700, Liu Bo wrote:
> > On Mon, Jul 08, 2019 at 04:43:01PM -0400, Vivek Goyal wrote:
> > > On Thu, Jul 04, 2019 at 03:25:31PM +0800, Liu Bo wrote:
> > > > With free fuse dax mapping reducing, read performance is impacted
> > > > significantly because reads need to wait for a free fuse dax mapping.
> > > > 
> > > > Although reads will trigger reclaim work to try to reclaim fuse dax
> > > > mapping, reclaim code can barely make any progress if most of fuse dax
> > > > mappings are used by the file we're reading since inode lock is required
> > > > by reclaim code.
> > > > 
> > > > However, we don't have to take inode lock for reclaiming if dax mapping
> > > > has its own reference count, reference counting is to tell reclaim code to
> > > > skip those in use dax mappings, such that we can avoid the risk of
> > > > accidentally reclaiming a dax mapping that other readers are using.
> > > > 
> > > > On the other hand, holding ->i_dmap_sem during reclaim can be used to
> > > > prevent the follwing reads to get a dax mapping under reclaim.
> > > > 
> > > > Another reason is that reads/writes only use fuse dax mapping within
> > > > dax_iomap_rw(), so we can do such a trick, while mmap/faulting is a
> > > > different story and we have to take ->i_mmap_sem prior to reclaiming a dax
> > > > mapping in order to avoid the race.
> > > 
> > > Hi Liu Bo,
> > > 
> > > Not sure why we can get rid of inode lock but not ->i_mmap_sem. Holding
> > > this lock only prevents further page fault. But existing mapped pages
> > > will continue to be accessed by process.
> > >
> > 
> > The idea is that for reads/writes, it always goes thru fuse_iomap_{begin,end}
> > during the whole IO process, while for mmap, once fault-in finishes, it can
> > read/write to the area without fuse_iomap_{begin,end}.
> > 
> > > And in theory we could drop inode lock and be safe only with ->dmap_sem
> > > lock, can't we do something similar for ->i_mmap_sem lock.
> > > 
> > > Also little worried about races w.r.t truncation and i_size update. Dax
> > > code assumes that i_size is stable  and filesystem is holding enough
> > > locks to ensure that.
> > >
> > 
> > typically i_mmap_sem is held by truncate to avoid the race with dax code.
> > 
> > but the race between truncate and isize changes seems to be unrelated to this
> > patch, as the patch only does updates how the background reclaim worker
> > manipulates locks.
> > 
> > I went thru fuse_setattr(), we don't remove dax mapping ranges in truncate, either.
> > 
> > > What kind of testing you have done to make sure this is safe. Try
> > > running blogbench. Possibly mix of read/write/mmap workload along
> > > with heavy truncation/punch hole operation in parallel as well.
> > >
> > 
> > Not yet, but surely I'm planning to run a whole round of regression test against
> > it.
> 
> I ran bunch of fio jobs and compared the results with vanilla kernel and
> with your patch. Looks like randwrite jobs are showing some regression. 

But...

> Here are my scripts.
> 
> https://github.com/rhvgoyal/virtiofs-tests
> 
> I rand tests with cache=always, cache size=2G, dax enabled.
> 
> 
> NAME                    I/O Operation           BW(Read/Write)
> virtiofs-vanilla        seqread                 164(MiB/s)
> virtiofs-liubo-patch    seqread                 163(MiB/s)
> 
> virtiofs-vanilla        seqread-mmap-single     200(MiB/s)
> virtiofs-liubo-patch    seqread-mmap-single     220(MiB/s)
> 
> virtiofs-vanilla        seqread-mmap-multi      736(MiB/s)
> virtiofs-liubo-patch    seqread-mmap-multi      771(MiB/s)
> 
> virtiofs-vanilla        randread                1406(KiB/s)
> virtiofs-liubo-patch    randread                15(MiB/s)

That's a hell of an improvement for randread; perhaps the write
regression is worth it?

Dave

> virtiofs-vanilla        randread-mmap-single    13(MiB/s)
> virtiofs-liubo-patch    randread-mmap-single    11(MiB/s)
> 
> virtiofs-vanilla        randread-mmap-multi     8934(KiB/s)
> virtiofs-liubo-patch    randread-mmap-multi     9264(KiB/s)
> 
> virtiofs-vanilla        seqwrite                120(MiB/s)
> virtiofs-liubo-patch    seqwrite                110(MiB/s)
> 
> virtiofs-vanilla        seqwrite-mmap-single    242(MiB/s)
> virtiofs-liubo-patch    seqwrite-mmap-single    250(MiB/s)
> 
> virtiofs-vanilla        seqwrite-mmap-multi     595(MiB/s)
> virtiofs-liubo-patch    seqwrite-mmap-multi     646(MiB/s)
> 
> virtiofs-vanilla        randwrite               20(MiB/s)
> virtiofs-liubo-patch    randwrite               15(MiB/s)
> 
> virtiofs-vanilla        randwrite-mmap-single   12(MiB/s)
> virtiofs-liubo-patch    randwrite-mmap-single   10(MiB/s)
> 
> virtiofs-vanilla        randwrite-mmap-multi    11(MiB/s)
> virtiofs-liubo-patch    randwrite-mmap-multi    8246(KiB/s)
> 
> Vivek
> 
> _______________________________________________
> Virtio-fs mailing list
> Virtio-fs@redhat.com
> https://www.redhat.com/mailman/listinfo/virtio-fs
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


  reply	other threads:[~2019-07-11  8:49 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-04  7:25 [Virtio-fs] [PATCH RFC] virtiofs: use fine-grained lock for dmap reclaim Liu Bo
2019-07-08 20:43 ` Vivek Goyal
2019-07-10 18:41   ` Liu Bo
2019-07-10 20:37     ` Vivek Goyal
2019-07-11  8:49       ` Dr. David Alan Gilbert [this message]
2019-07-10 13:38 ` Vivek Goyal
2019-07-10 18:59   ` Liu Bo
2019-07-15 20:37 ` Vivek Goyal
2019-07-15 21:38   ` Liu Bo
2019-07-16 18:36     ` Vivek Goyal
2019-07-16 19:07       ` Liu Bo
2019-07-16 19:15         ` Vivek Goyal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190711084958.GC3971@work-vm \
    --to=dgilbert@redhat.com \
    --cc=vgoyal@redhat.com \
    --cc=virtio-fs@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.