From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: virtio-fs@redhat.com, qemu-devel@nongnu.org,
	Vivek Goyal <vgoyal@redhat.com>
Subject: Re: [Virtio-fs] [PATCH 0/4] virtiofsd: multithreading preparation part 3
Date: Thu, 8 Aug 2019 10:53:16 +0100
Message-ID: <20190808095316.GC2852@work-vm>
In-Reply-To: <20190808090213.GD31476@stefanha-x1.localdomain>

* Stefan Hajnoczi (stefanha@redhat.com) wrote:
> On Wed, Aug 07, 2019 at 04:57:15PM -0400, Vivek Goyal wrote:
> > Kernel also serializes MAP/UNMAP on one inode. So you will need to run
> > multiple jobs operating on different inodes to see parallel MAP/UNMAP
> > (at least from the kernel's point of view).
> 
> Okay, there is still room to experiment with how MAP and UNMAP are
> handled by virtiofsd and QEMU even if the host kernel ultimately becomes
> the bottleneck.
> 
> One possible optimization is to eliminate REMOVEMAPPING requests when
> the guest driver knows a SETUPMAPPING will follow immediately.  I see
> the following request pattern in a fio randread iodepth=64 job:
> 
>   unique: 995348, opcode: SETUPMAPPING (48), nodeid: 135, insize: 80, pid: 1351
>   lo_setupmapping(ino=135, fi=0x(nil), foffset=3860856832, len=2097152, moffset=859832320, flags=0)
>      unique: 995348, success, outsize: 16
>   unique: 995350, opcode: REMOVEMAPPING (49), nodeid: 135, insize: 60, pid: 12
>      unique: 995350, success, outsize: 16
>   unique: 995352, opcode: SETUPMAPPING (48), nodeid: 135, insize: 80, pid: 1351
>   lo_setupmapping(ino=135, fi=0x(nil), foffset=16777216, len=2097152, moffset=861929472, flags=0)
>      unique: 995352, success, outsize: 16
>   unique: 995354, opcode: REMOVEMAPPING (49), nodeid: 135, insize: 60, pid: 12
>      unique: 995354, success, outsize: 16
>   virtio_send_msg: elem 9: with 1 in desc of length 16
>   unique: 995356, opcode: SETUPMAPPING (48), nodeid: 135, insize: 80, pid: 1351
>   lo_setupmapping(ino=135, fi=0x(nil), foffset=383778816, len=2097152, moffset=864026624, flags=0)
>      unique: 995356, success, outsize: 16
>   unique: 995358, opcode: REMOVEMAPPING (49), nodeid: 135, insize: 60, pid: 12
> 
> The REMOVEMAPPING requests are unnecessary since we can map over the top
> of the old mapping instead of taking the extra step of removing it
> first.

Yep, those should go - I think Vivek likes to keep them for testing
since they make things fail more completely if there's a screwup.
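
As a rough way to quantify this from a virtiofsd debug trace, the sketch
below counts REMOVEMAPPING requests that are immediately followed by a
SETUPMAPPING on the same nodeid. It assumes the request line format shown
in the trace above ("unique: N, opcode: NAME (N), nodeid: N, ..."); the
function name and the regex are illustrative and not part of virtiofsd.
The trace does not show which window offset each REMOVEMAPPING targets,
so the result is at best an upper bound on the requests that could be
dropped.

```python
import re

# Matches request lines such as:
#   unique: 995350, opcode: REMOVEMAPPING (49), nodeid: 135, insize: 60, pid: 12
# Reply lines ("unique: N, success, ...") have no opcode and are skipped.
REQUEST_RE = re.compile(r"unique: (\d+), opcode: (\w+) \(\d+\), nodeid: (\d+)")


def count_redundant_removes(lines):
    """Return (redundant, total) REMOVEMAPPING counts for a trace.

    A REMOVEMAPPING is counted as redundant when the next request in the
    trace is a SETUPMAPPING for the same nodeid.
    """
    requests = []
    for line in lines:
        m = REQUEST_RE.search(line)
        if m:
            requests.append((m.group(2), int(m.group(3))))

    total = redundant = 0
    for i, (opcode, nodeid) in enumerate(requests):
        if opcode != "REMOVEMAPPING":
            continue
        total += 1
        if i + 1 < len(requests) and requests[i + 1] == ("SETUPMAPPING", nodeid):
            redundant += 1
    return redundant, total
```

Feeding it the lines of a captured log (for example, the lines read from a
saved trace file) gives a quick ratio of back-to-back remove/setup pairs.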

> Some more questions to consider for DAX performance optimization:
> 
> 1. Is FUSE_READ/FUSE_WRITE more efficient than DAX for some I/O patterns?

Probably, for cases where the data is only accessed once and you can't
preemptively map it.
Another variant on (1) is whether we could do reads/writes while the mmap
is in progress, to absorb the latency.

> 2. Can MAP/UNMAP be performed directly in QEMU via a separate virtqueue?

I think there are two things to solve here that I don't currently know
the answer to:
  2a) We'd need to get the fd of the file to be mmap'd across to QEMU;
      we might be able to cache the fd on the QEMU side for existing
      mappings, so that a request for a new mapping of an already-mapped
      file would find the fd already there.

  2b) Running a device with a mix of queues inside QEMU and on
      vhost-user; I don't think we have anything with that mix.
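
To illustrate the caching idea in (2a) only, here is a minimal sketch of
an fd cache keyed by FUSE nodeid, so that a repeat mapping request for a
file that has been seen before does not need the fd passed again. The
FdCache class, its methods, and the LRU eviction policy are all
hypothetical; nothing here reflects existing QEMU or virtiofsd code.

```python
import os
from collections import OrderedDict


class FdCache:
    """Hypothetical nodeid -> fd cache with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._fds = OrderedDict()  # nodeid -> fd, least recently used first

    def get(self, nodeid):
        """Return the cached fd for nodeid, or None on a miss."""
        fd = self._fds.get(nodeid)
        if fd is not None:
            self._fds.move_to_end(nodeid)  # mark as most recently used
        return fd

    def put(self, nodeid, fd):
        """Cache fd for nodeid, closing any fd it replaces or evicts."""
        old = self._fds.pop(nodeid, None)
        if old is not None and old != fd:
            os.close(old)
        self._fds[nodeid] = fd
        while len(self._fds) > self.capacity:
            _, evicted = self._fds.popitem(last=False)
            os.close(evicted)
```

A real design would also have to drop entries when the guest forgets the
inode or the file is closed; that invalidation path is not modelled here.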
 
> 3. Can READ/WRITE be performed directly in QEMU via a separate virtqueue
>    to eliminate the bad address problem?

Are you thinking of doing all reads/writes that way, or just the corner
cases? It doesn't seem worth it for the corner cases unless you're
finding them cropping up in real workloads.

> 4. Can OPEN+MAP be fused into a single request for small files, avoiding
>    the 2nd request?

Sounds possible.

> I'm not going to tackle DAX optimization myself right now but wanted to
> share these ideas.

One idea I was thinking about, which feels easier than (2), is to change
the vhost slave protocol to be split transaction; it wouldn't do anything
for the latency of a single request, but it would allow several requests
to be handled in parallel if we can get the kernel to feed it.
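
A toy model of what a split-transaction channel could look like: each
request carries an id, the sender does not block waiting for the reply,
and replies are matched to requests by id in whatever order they arrive.
The SplitChannel class, its callbacks, and the message shapes are
hypothetical and are not the vhost-user wire format.

```python
import itertools


class SplitChannel:
    """Hypothetical request/reply channel with out-of-order completion."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # request id -> completion callback

    def send(self, payload, on_reply, transmit):
        """Tag the request with a fresh id, send it, and return at once."""
        req_id = next(self._ids)
        self._pending[req_id] = on_reply
        transmit(req_id, payload)
        return req_id

    def handle_reply(self, req_id, result):
        """Complete the request matching req_id, in any arrival order."""
        on_reply = self._pending.pop(req_id)
        on_reply(result)

    def in_flight(self):
        """Number of requests sent but not yet answered."""
        return len(self._pending)
```

With this shape, two map requests can both be outstanding and the second
may complete before the first; the latency of each one is unchanged.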

Dave

> Stefan



> _______________________________________________
> Virtio-fs mailing list
> Virtio-fs@redhat.com
> https://www.redhat.com/mailman/listinfo/virtio-fs

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



