Re: [PATCH v4 15/16] virtio-fs: add virtiofs filesystem

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>,
	Miklos Szeredi <mszeredi@redhat.com>,
	virtualization@lists.linux-foundation.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH v4 15/16] virtio-fs: add virtiofs filesystem
Date: Fri, 6 Sep 2019 09:52:42 -0400	[thread overview]
Message-ID: <20190906095225-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20190906102209.GD5900@stefanha-x1.localdomain>

On Fri, Sep 06, 2019 at 11:22:09AM +0100, Stefan Hajnoczi wrote:
> On Thu, Sep 05, 2019 at 03:15:15PM -0400, Vivek Goyal wrote:
> > On Tue, Sep 03, 2019 at 09:55:49AM -0400, Michael S. Tsirkin wrote:
> > [..]
> > > What's with all of the TODOs? Some of these are really scary,
> > > looks like they need to be figured out before this is merged.
> > 
> > Hi Michael,
> > 
> > One of the issue I noticed is races w.r.t device removal and super
> > block initialization. I am about to post a set of patches which
> > take care of these races and also get rid of some of the scary
> > TODOs. Other TODOs like suspend/restore, multiqueue support etc
> > are improvements which we can do over a period of time.
> > 
> > [..]
> > > > +/* Per-virtqueue state */
> > > > +struct virtio_fs_vq {
> > > > +	spinlock_t lock;
> > > > +	struct virtqueue *vq;     /* protected by ->lock */
> > > > +	struct work_struct done_work;
> > > > +	struct list_head queued_reqs;
> > > > +	struct delayed_work dispatch_work;
> > > > +	struct fuse_dev *fud;
> > > > +	bool connected;
> > > > +	long in_flight;
> > > > +	char name[24];
> > > 
> > > I'd keep names somewhere separate as they are not used on data path.
> > 
> > Ok, this sounds like a nice to have. Will take care of this once base
> > patch gets merged.
> > 
> > [..]
> > > > +struct virtio_fs_forget {
> > > > +	struct fuse_in_header ih;
> > > > +	struct fuse_forget_in arg;
> > > 
> > > These structures are all native endian.
> > > 
> > > Passing them to host will make cross-endian setups painful to support,
> > > and hardware implementations impossible.
> > > 
> > > How about converting everything to LE?
> > 
> > So looks like endianness issue is now resolved (going by the other
> > emails). So I will not worry about it.
> > 
> > [..]
> > > > +/* Add a new instance to the list or return -EEXIST if tag name exists*/
> > > > +static int virtio_fs_add_instance(struct virtio_fs *fs)
> > > > +{
> > > > +	struct virtio_fs *fs2;
> > > > +	bool duplicate = false;
> > > > +
> > > > +	mutex_lock(&virtio_fs_mutex);
> > > > +
> > > > +	list_for_each_entry(fs2, &virtio_fs_instances, list) {
> > > > +		if (strcmp(fs->tag, fs2->tag) == 0)
> > > > +			duplicate = true;
> > > > +	}
> > > > +
> > > > +	if (!duplicate)
> > > > +		list_add_tail(&fs->list, &virtio_fs_instances);
> > > 
> > > 
> > > This is O(N^2) as it's presumably called for each istance.
> > > Doesn't scale - please switch to a tree or such.
> > 
> > This is O(N) and not O(N^2) right? Addition of device is O(N), search
> > during mount is O(N).
> > 
> > This is not a frequent event at all and number of virtiofs instances
> > per guest are expected to be fairly small (say less than 10). So I 
> > really don't think there is any value in converting this into a tree
> > (instead of a list).
> > 
> > [..]
> > > > +static void virtio_fs_free_devs(struct virtio_fs *fs)
> > > > +{
> > > > +	unsigned int i;
> > > > +
> > > > +	/* TODO lock */
> > > 
> > > Doesn't inspire confidence, does it?
> > 
> > Agreed. Getting rid of this in set of fixes I am about to post.
> > 
> > > 
> > > > +
> > > > +	for (i = 0; i < fs->nvqs; i++) {
> > > > +		struct virtio_fs_vq *fsvq = &fs->vqs[i];
> > > > +
> > > > +		if (!fsvq->fud)
> > > > +			continue;
> > > > +
> > > > +		flush_work(&fsvq->done_work);
> > > > +		flush_delayed_work(&fsvq->dispatch_work);
> > > > +
> > > > +		/* TODO need to quiesce/end_requests/decrement dev_count */
> > > 
> > > Indeed. Won't this crash if we don't?
> > 
> > Took care of this as well.
> > 
> > [..]
> > > > +static void virtio_fs_hiprio_dispatch_work(struct work_struct *work)
> > > > +{
> > > > +	struct virtio_fs_forget *forget;
> > > > +	struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq,
> > > > +						 dispatch_work.work);
> > > > +	struct virtqueue *vq = fsvq->vq;
> > > > +	struct scatterlist sg;
> > > > +	struct scatterlist *sgs[] = {&sg};
> > > > +	bool notify;
> > > > +	int ret;
> > > > +
> > > > +	pr_debug("virtio-fs: worker %s called.\n", __func__);
> > > > +	while (1) {
> > > > +		spin_lock(&fsvq->lock);
> > > > +		forget = list_first_entry_or_null(&fsvq->queued_reqs,
> > > > +					struct virtio_fs_forget, list);
> > > > +		if (!forget) {
> > > > +			spin_unlock(&fsvq->lock);
> > > > +			return;
> > > > +		}
> > > > +
> > > > +		list_del(&forget->list);
> > > > +		if (!fsvq->connected) {
> > > > +			spin_unlock(&fsvq->lock);
> > > > +			kfree(forget);
> > > > +			continue;
> > > > +		}
> > > > +
> > > > +		sg_init_one(&sg, forget, sizeof(*forget));
> > > 
> > > This passes to host a structure including "struct list_head list";
> > > 
> > > Not a good idea.
> > 
> > Ok, host does not have to see "struct list_head list". Its needed for
> > guest. Will look into getting rid of this.
> > 
> > > 
> > > 
> > > > +
> > > > +		/* Enqueue the request */
> > > > +		dev_dbg(&vq->vdev->dev, "%s\n", __func__);
> > > > +		ret = virtqueue_add_sgs(vq, sgs, 1, 0, forget, GFP_ATOMIC);
> > > 
> > > 
> > > This is easier as add_outbuf.
> > 
> > Will look into it.
> > 
> > > 
> > > Also - why GFP_ATOMIC?
> > 
> > Hmm..., may be it can be GFP_KERNEL. I don't see atomic context here. Will
> > look into it.
> > 
> > > 
> > > > +		if (ret < 0) {
> > > > +			if (ret == -ENOMEM || ret == -ENOSPC) {
> > > > +				pr_debug("virtio-fs: Could not queue FORGET: err=%d. Will try later\n",
> > > > +					 ret);
> > > > +				list_add_tail(&forget->list,
> > > > +						&fsvq->queued_reqs);
> > > > +				schedule_delayed_work(&fsvq->dispatch_work,
> > > > +						msecs_to_jiffies(1));
> > > 
> > > Can't we we requeue after some buffers get consumed?
> > 
> > That's what dispatch work is doing. It tries to requeue the request after
> > a while.
> > 
> > [..]
> > > > +static int virtio_fs_probe(struct virtio_device *vdev)
> > > > +{
> > > > +	struct virtio_fs *fs;
> > > > +	int ret;
> > > > +
> > > > +	fs = devm_kzalloc(&vdev->dev, sizeof(*fs), GFP_KERNEL);
> > > > +	if (!fs)
> > > > +		return -ENOMEM;
> > > > +	vdev->priv = fs;
> > > > +
> > > > +	ret = virtio_fs_read_tag(vdev, fs);
> > > > +	if (ret < 0)
> > > > +		goto out;
> > > > +
> > > > +	ret = virtio_fs_setup_vqs(vdev, fs);
> > > > +	if (ret < 0)
> > > > +		goto out;
> > > > +
> > > > +	/* TODO vq affinity */
> > > > +	/* TODO populate notifications vq */
> > > 
> > > what's notifications vq?
> > 
> > It has not been implemented yet. At some point of time we want to have
> > a notion of notification queue so that host can send notifications to
> > guest. Will get rid of this comment for now.
> > 
> > [..]
> > > > +#ifdef CONFIG_PM_SLEEP
> > > > +static int virtio_fs_freeze(struct virtio_device *vdev)
> > > > +{
> > > > +	return 0; /* TODO */
> > > > +}
> > > > +
> > > > +static int virtio_fs_restore(struct virtio_device *vdev)
> > > > +{
> > > > +	return 0; /* TODO */
> > > > +}
> > > 
> > > Is this really a good idea? I'd rather it was implemented,
> > > but if not possible at all disabling PM seems better than just
> > > keep going.
> > 
> > I agree. Will look into disabling it.
> > 
> > > 
> > > > +#endif /* CONFIG_PM_SLEEP */
> > > > +
> > > > +const static struct virtio_device_id id_table[] = {
> > > > +	{ VIRTIO_ID_FS, VIRTIO_DEV_ANY_ID },
> > > > +	{},
> > > > +};
> > > > +
> > > > +const static unsigned int feature_table[] = {};
> > > > +
> > > > +static struct virtio_driver virtio_fs_driver = {
> > > > +	.driver.name		= KBUILD_MODNAME,
> > > > +	.driver.owner		= THIS_MODULE,
> > > > +	.id_table		= id_table,
> > > > +	.feature_table		= feature_table,
> > > > +	.feature_table_size	= ARRAY_SIZE(feature_table),
> > > > +	/* TODO validate config_get != NULL */
> > > 
> > > Why?
> > 
> > Don't know. Stefan, do you remember why did you put this comment? If not,
> > I will get rid of it.
> 
> This comment can be removed.
> 
> > > > +static void virtio_fs_wake_pending_and_unlock(struct fuse_iqueue *fiq)
> > > > +__releases(fiq->waitq.lock)
> > > > +{
> > > > +	unsigned int queue_id = VQ_REQUEST; /* TODO multiqueue */
> > > > +	struct virtio_fs *fs;
> > > > +	struct fuse_conn *fc;
> > > > +	struct fuse_req *req;
> > > > +	struct fuse_pqueue *fpq;
> > > > +	int ret;
> > > > +
> > > > +	WARN_ON(list_empty(&fiq->pending));
> > > > +	req = list_last_entry(&fiq->pending, struct fuse_req, list);
> > > > +	clear_bit(FR_PENDING, &req->flags);
> > > > +	list_del_init(&req->list);
> > > > +	WARN_ON(!list_empty(&fiq->pending));
> > > > +	spin_unlock(&fiq->waitq.lock);
> > > > +
> > > > +	fs = fiq->priv;
> > > > +	fc = fs->vqs[queue_id].fud->fc;
> > > > +
> > > > +	dev_dbg(&fs->vqs[queue_id].vq->vdev->dev,
> > > > +		"%s: opcode %u unique %#llx nodeid %#llx in.len %u out.len %u\n",
> > > > +		__func__, req->in.h.opcode, req->in.h.unique, req->in.h.nodeid,
> > > > +		req->in.h.len, fuse_len_args(req->out.numargs, req->out.args));
> > > > +
> > > > +	fpq = &fs->vqs[queue_id].fud->pq;
> > > > +	spin_lock(&fpq->lock);
> > > > +	if (!fpq->connected) {
> > > > +		spin_unlock(&fpq->lock);
> > > > +		req->out.h.error = -ENODEV;
> > > > +		pr_err("virtio-fs: %s disconnected\n", __func__);
> > > > +		fuse_request_end(fc, req);
> > > > +		return;
> > > > +	}
> > > > +	list_add_tail(&req->list, fpq->processing);
> > > > +	spin_unlock(&fpq->lock);
> > > > +	set_bit(FR_SENT, &req->flags);
> > > > +	/* matches barrier in request_wait_answer() */
> > > > +	smp_mb__after_atomic();
> > > > +	/* TODO check for FR_INTERRUPTED? */
> > > 
> > > 
> > > ?
> > 
> > hmm... we don't support FR_INTERRUPTED. Stefan, do you remember why
> > this TODO is here. If not, I will get rid of it.
> 
> We don't support FUSE_INTERRUPT yet.  The purpose of this comment is
> that when we do support FUSE_INTERRUPT we'll need to follow
> fuse_dev_do_read() in queuing a FUSE_INTERRUPT here.
> 
> Stefan



OK so pls write this explicitly in the comment.

-- 
MST

next prev parent reply	other threads:[~2019-09-06 13:52 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-03 11:36 [PATCH v4 00/16] virtio-fs: shared file system for virtual machines Miklos Szeredi
2019-09-03 11:36 ` [PATCH v4 01/16] vfs: Create fs_context-aware mount_bdev() replacement Miklos Szeredi
2019-09-03 11:36 ` [PATCH v4 02/16] fuse: convert to use the new mount API Miklos Szeredi
2019-09-03 11:36 ` [PATCH v4 03/16] vfs: subtype handling moved to fuse Miklos Szeredi
2019-09-03 11:36 ` [PATCH v4 04/16] fuse: export fuse_end_request() Miklos Szeredi
2019-09-03 11:40 ` [PATCH v4 05/16] fuse: export fuse_len_args() Miklos Szeredi
2019-09-03 11:41 ` [PATCH v4 06/16] fuse: export fuse_send_init_request() Miklos Szeredi
2019-09-03 11:41 ` [PATCH v4 07/16] fuse: export fuse_get_unique() Miklos Szeredi
2019-09-03 11:41 ` [PATCH v4 08/16] fuse: export fuse_dequeue_forget() function Miklos Szeredi
2019-09-03 11:41 ` [PATCH v4 09/16] fuse: extract fuse_fill_super_common() Miklos Szeredi
2019-09-03 11:41 ` [PATCH v4 10/16] fuse: add fuse_iqueue_ops callbacks Miklos Szeredi
2019-09-03 11:41 ` [PATCH v4 11/16] fuse: separate fuse device allocation and installation in fuse_conn Miklos Szeredi
2019-09-03 11:41 ` [PATCH v4 12/16] fuse: delete dentry if timeout is zero Miklos Szeredi
2019-09-03 11:42 ` [PATCH v4 13/16] fuse: dissociate DESTROY from fuseblk Miklos Szeredi
2019-09-03 11:42 ` [PATCH v4 14/16] fuse: allow skipping control interface and forced unmount Miklos Szeredi
2019-09-03 11:42 ` [PATCH v4 15/16] virtio-fs: add virtiofs filesystem Miklos Szeredi
2019-09-03 13:55   ` Michael S. Tsirkin
2019-09-03 13:55   ` Michael S. Tsirkin
2019-09-04 18:16     ` Stefan Hajnoczi
2019-09-04 18:16     ` Stefan Hajnoczi
2019-09-04 18:58       ` Michael S. Tsirkin
2019-09-04 18:58       ` Michael S. Tsirkin
2019-09-05 19:15     ` Vivek Goyal
2019-09-05 19:15       ` Vivek Goyal
2019-09-06 10:22       ` Stefan Hajnoczi
2019-09-06 10:22       ` Stefan Hajnoczi
2019-09-06 13:52         ` Michael S. Tsirkin
2019-09-06 13:52         ` Michael S. Tsirkin [this message]
2019-09-03 11:42 ` [PATCH v4 16/16] virtio-fs: add Documentation/filesystems/virtiofs.rst Miklos Szeredi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190906095225-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mszeredi@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=vgoyal@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.