From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Wed, 17 Apr 2019 15:51:21 +0100 From: "Dr. David Alan Gilbert" Message-ID: <20190417145121.GG2839@work-vm> References: <20190416190858.16833-1-bo.liu@linux.alibaba.com> <20190416190858.16833-4-bo.liu@linux.alibaba.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190416190858.16833-4-bo.liu@linux.alibaba.com> Subject: Re: [Virtio-fs] [PATCH 3/4] virtiofsd: use file-backend memory region for virtiofsd's cache area List-Id: Development discussions about virtio-fs List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Liu Bo , vgoyal@redhat.com, stefanha@redhat.com Cc: virtio-fs@redhat.com * Liu Bo (bo.liu@linux.alibaba.com) wrote: > From: Xiaoguang Wang > > When running xfstests test case generic/413, we found such issue: > 1, create a file in one virtiofsd mount point with dax enabled > 2, mmap this file, get virtual addr: A > 3, write(fd, A, len), here fd comes from another file in another > virtiofsd mount point without dax enabled, also note here write(2) > is direct io. > 4, this direct io will hang forever, because the virtiofsd has crashed. > Here is the stack: > [ 247.166276] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 247.167171] t_mmap_dio D 0 2335 2102 0x00000000 > [ 247.168006] Call Trace: > [ 247.169067] ? __schedule+0x3d0/0x830 > [ 247.170219] schedule+0x32/0x80 > [ 247.171328] schedule_timeout+0x1e2/0x350 > [ 247.172416] ? fuse_direct_io+0x2e5/0x6b0 [fuse] > [ 247.173516] wait_for_completion+0x123/0x190 > [ 247.174593] ? wake_up_q+0x70/0x70 > [ 247.175640] fuse_direct_IO+0x265/0x310 [fuse] > [ 247.176724] generic_file_read_iter+0xaa/0xd20 > [ 247.177824] fuse_file_read_iter+0x81/0x130 [fuse] > [ 247.178938] ? fuse_simple_request+0x104/0x1b0 [fuse] > [ 247.180041] ? fuse_fsync_common+0xad/0x240 [fuse] > [ 247.181136] __vfs_read+0x108/0x190 > [ 247.181930] vfs_read+0x91/0x130 > [ 247.182671] ksys_read+0x52/0xc0 > [ 247.183454] do_syscall_64+0x55/0x170 > [ 247.184200] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > And virtiofsd crashed because vu_gpa_to_va() can not handle guest physical > address correctly. For a memory mapped area in dax mode, indeed the page > for this area points virtiofsd's cache area, or rather virtio pci device's > cache bar. In qemu, currently this cache bar is implemented with an anonymous > memory and will not pass this cache bar's address info to vhost-user backend, > so vu_gpa_to_va() will fail. > > To fix this issue, we create this vhost cache area with a file backend > memory area. Thanks, I know there was another case of the daemon trying to access the buffer that Stefan and Vivek hit, but fixed by persuading the kernel not to do it; Stefan/Vivek: What do you think? It worries me a little exposing the area back to the daemon; the guest can write the BAR and change the mapping, I doubt anything would notice that (but also I doubt it happens much). Dave > Reviewed-by: Liu Bo > Signed-off-by: Xiaoguang Wang > --- > hw/virtio/vhost-user-fs.c | 59 +++++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 49 insertions(+), 10 deletions(-) > > diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c > index a5110a9..fc2ed38 100644 > --- a/hw/virtio/vhost-user-fs.c > +++ b/hw/virtio/vhost-user-fs.c > @@ -300,6 +300,35 @@ static bool vuf_guest_notifier_pending(VirtIODevice *vdev, int idx) > return vhost_virtqueue_pending(&fs->vhost_dev, idx); > } > > +static int vuf_get_tmpfile(size_t size) > +{ > + const char *tmpdir = g_get_tmp_dir(); > + gchar *fname; > + int fd; > + > + fname = g_strdup_printf("%s/virtio-fs-cache-XXXXXX", tmpdir); > + fd = mkstemp(fname); > + if (fd < 0) { > + fprintf(stderr, "%s: mkstemp(%s) failed\n", __func__, fname); > + g_free(fname); > + return -1; > + } > + > + /* > + * Note: this temp file would be deleted until file is closed or > + * process exits. > + */ > + unlink(fname); > + if (ftruncate(fd, size) == -1) { > + fprintf(stderr, "%s: ftruncate(%s) failed\n", __func__, fname); > + close(fd); > + fd = -1; > + } > + > + g_free(fname); > + return fd; > +} > + > static void vuf_device_realize(DeviceState *dev, Error **errp) > { > VirtIODevice *vdev = VIRTIO_DEVICE(dev); > @@ -349,16 +378,26 @@ static void vuf_device_realize(DeviceState *dev, Error **errp) > "no smaller than the page size"); > return; > } > - /* We need a region with some host memory, 'ram' is the easiest */ > - memory_region_init_ram_nomigrate(&fs->cache, OBJECT(vdev), > - "virtio-fs-cache", > - fs->conf.cache_size, NULL); > - /* > - * But we don't actually want anyone reading/writing the raw > - * area with no cache data. > - */ > - mprotect(memory_region_get_ram_ptr(&fs->cache), fs->conf.cache_size, > - PROT_NONE); > + > + if (fs->conf.cache_size) { > + int fd; > + > + fd = vuf_get_tmpfile(fs->conf.cache_size); > + if (fd < 0) { > + error_setg(errp, "failed to initialize virtio-fs-cache"); > + return; > + } > + memory_region_init_ram_from_fd(&fs->cache, OBJECT(vdev), > + "virtio-fs-cache", > + fs->conf.cache_size, true, fd, NULL); > + > + /* > + * But we don't actually want anyone reading/writing the raw > + * area with no cache data. > + */ > + mprotect(memory_region_get_ram_ptr(&fs->cache), > + fs->conf.cache_size, PROT_NONE); > + } > > if (!vhost_user_init(&fs->vhost_user, &fs->conf.chardev, errp)) { > return; > -- > 1.8.3.1 > > _______________________________________________ > Virtio-fs mailing list > Virtio-fs@redhat.com > https://www.redhat.com/mailman/listinfo/virtio-fs -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK