* Re: [PATCH 1/3] virtio_ring: always warn when descriptor chain exceeds queue size
       [not found] ` <20210318135223.1342795-2-ckuehl@redhat.com>
@ 2021-03-22  3:22   ` Jason Wang
  2021-03-22  3:41     ` Jason Wang
  2021-03-22  8:17     ` Michael S. Tsirkin
  0 siblings, 2 replies; 12+ messages in thread
From: Jason Wang @ 2021-03-22  3:22 UTC (permalink / raw)
  To: Connor Kuehl, virtio-fs
  Cc: miklos, mst, linux-kernel, virtualization, stefanha, linux-fsdevel, vgoyal

On 2021/3/18 9:52 PM, Connor Kuehl wrote:
> From section 2.6.5.3.1 (Driver Requirements: Indirect Descriptors)
> of the virtio spec:
>
>   "A driver MUST NOT create a descriptor chain longer than the Queue
>   Size of the device."
>
> This text suggests that the warning should trigger even if
> indirect descriptors are in use.

So I think at least the commit log needs some tweaking.

For the split virtqueue, we have:

2.6.5.2 Driver Requirements: The Virtqueue Descriptor Table

Drivers MUST NOT add a descriptor chain longer than 2^32 bytes in total;
this implies that loops in the descriptor chain are forbidden!

2.6.5.3.1 Driver Requirements: Indirect Descriptors

A driver MUST NOT create a descriptor chain longer than the Queue Size of
the device.

If I understand the spec correctly, the check is only needed for a single
indirect descriptor table?

For the packed virtqueue, we have:

2.7.17 Driver Requirements: Scatter-Gather Support

A driver MUST NOT create a descriptor list longer than allowed by the
device.

A driver MUST NOT create a descriptor list longer than the Queue Size.

2.7.19 Driver Requirements: Indirect Descriptors

A driver MUST NOT create a descriptor chain longer than allowed by the
device.

So it looks to me like the packed part is fine.

Note that if I understand the spec correctly, 2.7.17 implies 2.7.19.
Thanks

>
> Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
> ---
>  drivers/virtio/virtio_ring.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 71e16b53e9c1..1bc290f9ba13 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -444,11 +444,12 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
>
>  	head = vq->free_head;
>
> +	WARN_ON_ONCE(total_sg > vq->split.vring.num);
> +
>  	if (virtqueue_use_indirect(_vq, total_sg))
>  		desc = alloc_indirect_split(_vq, total_sg, gfp);
>  	else {
>  		desc = NULL;
> -		WARN_ON_ONCE(total_sg > vq->split.vring.num && !vq->indirect);
>  	}
>
>  	if (desc) {
> @@ -1118,6 +1119,8 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
>
>  	BUG_ON(total_sg == 0);
>
> +	WARN_ON_ONCE(total_sg > vq->packed.vring.num);
> +
>  	if (virtqueue_use_indirect(_vq, total_sg))
>  		return virtqueue_add_indirect_packed(vq, sgs, total_sg,
>  				out_sgs, in_sgs, data, gfp);
> @@ -1125,8 +1128,6 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
>  	head = vq->packed.next_avail_idx;
>  	avail_used_flags = vq->packed.avail_used_flags;
>
> -	WARN_ON_ONCE(total_sg > vq->packed.vring.num && !vq->indirect);
> -
>  	desc = vq->packed.vring.desc;
>  	i = head;
>  	descs_used = total_sg;

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 12+ messages in thread
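The behavioural change in patch 1/3 can be modelled in a few lines. This is only a userspace sketch, not kernel code: warns_before(), warns_after() and self_check() are hypothetical names, and the `indirect` parameter stands in for vq->indirect from the diff above.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the old check: warn only when the chain is longer than the
 * queue AND indirect descriptors were not negotiated. */
bool warns_before(unsigned int total_sg, unsigned int queue_size, bool indirect)
{
	return total_sg > queue_size && !indirect;
}

/* Model of the check after the patch: warn whenever the chain is
 * longer than the queue, regardless of indirect descriptor support. */
bool warns_after(unsigned int total_sg, unsigned int queue_size)
{
	return total_sg > queue_size;
}

/* Small illustration: a 300-element request on a 256-entry queue. */
void self_check(void)
{
	/* With indirect descriptors the old code stayed silent... */
	assert(!warns_before(300, 256, true));
	/* ...while the patched code warns. */
	assert(warns_after(300, 256));
	/* Without indirect descriptors both versions warn. */
	assert(warns_before(300, 256, false));
	/* A request that exactly fills the queue never warns. */
	assert(!warns_after(256, 256));
}
```

The only case where the two versions disagree is an oversized request on a queue that uses indirect descriptors, which is exactly the case the spec discussion in this thread is about.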
* Re: [PATCH 1/3] virtio_ring: always warn when descriptor chain exceeds queue size
  2021-03-22  3:22   ` [PATCH 1/3] virtio_ring: always warn when descriptor chain exceeds queue size Jason Wang
@ 2021-03-22  3:41     ` Jason Wang
  2021-03-22  8:17     ` Michael S. Tsirkin
  1 sibling, 0 replies; 12+ messages in thread
From: Jason Wang @ 2021-03-22  3:41 UTC (permalink / raw)
  To: virtualization

On 2021/3/22 11:22 AM, Jason Wang wrote:
>
> On 2021/3/18 9:52 PM, Connor Kuehl wrote:
>> From section 2.6.5.3.1 (Driver Requirements: Indirect Descriptors)
>> of the virtio spec:
>>
>>   "A driver MUST NOT create a descriptor chain longer than the Queue
>>   Size of the device."
>>
>> This text suggests that the warning should trigger even if
>> indirect descriptors are in use.
>
>
> So I think at least the commit log needs some tweaking.
>
> For the split virtqueue, we have:
>
> 2.6.5.2 Driver Requirements: The Virtqueue Descriptor Table
>
> Drivers MUST NOT add a descriptor chain longer than 2^32 bytes in
> total; this implies that loops in the descriptor chain are forbidden!
>
> 2.6.5.3.1 Driver Requirements: Indirect Descriptors
>
> A driver MUST NOT create a descriptor chain longer than the Queue Size
> of the device.
>
> If I understand the spec correctly, the check is only needed for a
> single indirect descriptor table?
>
> For the packed virtqueue, we have:
>
> 2.7.17 Driver Requirements: Scatter-Gather Support
>
> A driver MUST NOT create a descriptor list longer than allowed by the
> device.
>
> A driver MUST NOT create a descriptor list longer than the Queue Size.
>
> 2.7.19 Driver Requirements: Indirect Descriptors
>
> A driver MUST NOT create a descriptor chain longer than allowed by the
> device.
>
> So it looks to me like the packed part is fine.
>
> Note that if I understand the spec correctly, 2.7.17 implies 2.7.19.

Actually not. In 2.7.7, the spec says:

"
Some devices benefit by concurrently dispatching a large number of large
requests. The VIRTIO_F_INDIRECT_DESC feature allows this. To increase
ring capacity the driver can store a (read-only by the device) table of
indirect descriptors anywhere in memory, and insert a descriptor in the
main virtqueue (with Flags bit VIRTQ_DESC_F_INDIRECT on) that refers to
a buffer element containing this indirect descriptor table; addr and len
refer to the indirect table address and length in bytes, respectively.
"

And in 2.7.5, the spec says:

"
While unusual (most implementations either create all lists solely using
non-indirect descriptors, or always use a single indirect element), if
both features have been negotiated, mixing indirect and non-indirect
descriptors in a ring is valid, as long as each list only contains
descriptors of a given type.
"

So my understanding is that the indirect descriptor is used to submit a
request whose number of buffers is greater than the virtqueue size. The
spec allows the driver to create a list of indirect descriptors; it just
needs to make sure that the number of indirect descriptors in this list
does not exceed the size of the virtqueue (2.7.17). And for each
indirect descriptor, the number of chained descriptors must not exceed
the virtqueue size.

So actually this aligns with the split virtqueue.

So if I understand the spec correctly, what we need to do is make sure
that the descriptors chained in the indirect descriptor table do not
exceed the virtqueue size. That means we probably need to chain indirect
descriptors instead of warning here.
Thanks > > Thanks > > >> >> Reported-by: Stefan Hajnoczi <stefanha@redhat.com> >> Signed-off-by: Connor Kuehl <ckuehl@redhat.com> >> --- >> drivers/virtio/virtio_ring.c | 7 ++++--- >> 1 file changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c >> index 71e16b53e9c1..1bc290f9ba13 100644 >> --- a/drivers/virtio/virtio_ring.c >> +++ b/drivers/virtio/virtio_ring.c >> @@ -444,11 +444,12 @@ static inline int virtqueue_add_split(struct >> virtqueue *_vq, >> head = vq->free_head; >> + WARN_ON_ONCE(total_sg > vq->split.vring.num); >> + >> if (virtqueue_use_indirect(_vq, total_sg)) >> desc = alloc_indirect_split(_vq, total_sg, gfp); >> else { >> desc = NULL; >> - WARN_ON_ONCE(total_sg > vq->split.vring.num && !vq->indirect); >> } >> if (desc) { >> @@ -1118,6 +1119,8 @@ static inline int virtqueue_add_packed(struct >> virtqueue *_vq, >> BUG_ON(total_sg == 0); >> + WARN_ON_ONCE(total_sg > vq->packed.vring.num); >> + >> if (virtqueue_use_indirect(_vq, total_sg)) >> return virtqueue_add_indirect_packed(vq, sgs, total_sg, >> out_sgs, in_sgs, data, gfp); >> @@ -1125,8 +1128,6 @@ static inline int virtqueue_add_packed(struct >> virtqueue *_vq, >> head = vq->packed.next_avail_idx; >> avail_used_flags = vq->packed.avail_used_flags; >> - WARN_ON_ONCE(total_sg > vq->packed.vring.num && !vq->indirect); >> - >> desc = vq->packed.vring.desc; >> i = head; >> descs_used = total_sg; > > _______________________________________________ > Virtualization mailing list > Virtualization@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/virtualization _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 12+ messages in thread
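The arithmetic behind Jason's reading in the message above can be sketched as follows. This is a userspace illustration under his stated interpretation only (each indirect table holds at most Queue Size descriptors, and at most Queue Size indirect descriptors may be chained); the thread itself leaves that interpretation to the TC mailing list. The function names are hypothetical and do not exist in virtio_ring.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of indirect tables needed when each table may hold at most
 * queue_size descriptors (hypothetical helper, not a kernel function). */
unsigned int indirect_tables_needed(unsigned int total_sg,
				    unsigned int queue_size)
{
	assert(queue_size > 0);
	/* Ceiling division written to avoid overflow on large total_sg. */
	return total_sg / queue_size + (total_sg % queue_size != 0);
}

/* Under the interpretation above, a request fits if the list of
 * indirect descriptors itself is no longer than the queue size. */
bool fits_with_chained_indirect(unsigned int total_sg,
				unsigned int queue_size)
{
	return indirect_tables_needed(total_sg, queue_size) <= queue_size;
}
```

For a 256-entry queue, a 300-element request would need two indirect tables and would fit, while anything above 256 * 256 = 65536 elements would not.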
* Re: [PATCH 1/3] virtio_ring: always warn when descriptor chain exceeds queue size
  2021-03-22  3:22   ` [PATCH 1/3] virtio_ring: always warn when descriptor chain exceeds queue size Jason Wang
  2021-03-22  3:41     ` Jason Wang
@ 2021-03-22  8:17     ` Michael S. Tsirkin
  2021-03-23  2:38       ` Jason Wang
  1 sibling, 1 reply; 12+ messages in thread
From: Michael S. Tsirkin @ 2021-03-22  8:17 UTC (permalink / raw)
  To: Jason Wang
  Cc: miklos, linux-kernel, virtualization, virtio-fs, stefanha, linux-fsdevel, vgoyal

On Mon, Mar 22, 2021 at 11:22:15AM +0800, Jason Wang wrote:
>
> On 2021/3/18 9:52 PM, Connor Kuehl wrote:
> > From section 2.6.5.3.1 (Driver Requirements: Indirect Descriptors)
> > of the virtio spec:
> >
> >    "A driver MUST NOT create a descriptor chain longer than the Queue
> >    Size of the device."
> >
> > This text suggests that the warning should trigger even if
> > indirect descriptors are in use.
>
>
> So I think at least the commit log needs some tweaking.
>
> For the split virtqueue, we have:
>
> 2.6.5.2 Driver Requirements: The Virtqueue Descriptor Table
>
> Drivers MUST NOT add a descriptor chain longer than 2^32 bytes in total;
> this implies that loops in the descriptor chain are forbidden!
>
> 2.6.5.3.1 Driver Requirements: Indirect Descriptors
>
> A driver MUST NOT create a descriptor chain longer than the Queue Size of
> the device.
>
> If I understand the spec correctly, the check is only needed for a single
> indirect descriptor table?
>
> For the packed virtqueue, we have:
>
> 2.7.17 Driver Requirements: Scatter-Gather Support
>
> A driver MUST NOT create a descriptor list longer than allowed by the
> device.
>
> A driver MUST NOT create a descriptor list longer than the Queue Size.
>
> 2.7.19 Driver Requirements: Indirect Descriptors
>
> A driver MUST NOT create a descriptor chain longer than allowed by the
> device.
>
> So it looks to me like the packed part is fine.
>
> Note that if I understand the spec correctly, 2.7.17 implies 2.7.19.
> > Thanks It would be quite strange for packed and split to differ here: so for packed would you say there's no limit on # of descriptors at all? I am guessing I just forgot to move this part from the format specific to the common part of the spec. This needs discussion in the TC mailing list - want to start a thread there? > > > > > Reported-by: Stefan Hajnoczi <stefanha@redhat.com> > > Signed-off-by: Connor Kuehl <ckuehl@redhat.com> > > --- > > drivers/virtio/virtio_ring.c | 7 ++++--- > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index 71e16b53e9c1..1bc290f9ba13 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -444,11 +444,12 @@ static inline int virtqueue_add_split(struct virtqueue *_vq, > > head = vq->free_head; > > + WARN_ON_ONCE(total_sg > vq->split.vring.num); > > + > > if (virtqueue_use_indirect(_vq, total_sg)) > > desc = alloc_indirect_split(_vq, total_sg, gfp); > > else { > > desc = NULL; > > - WARN_ON_ONCE(total_sg > vq->split.vring.num && !vq->indirect); > > } > > if (desc) { > > @@ -1118,6 +1119,8 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq, > > BUG_ON(total_sg == 0); > > + WARN_ON_ONCE(total_sg > vq->packed.vring.num); > > + > > if (virtqueue_use_indirect(_vq, total_sg)) > > return virtqueue_add_indirect_packed(vq, sgs, total_sg, > > out_sgs, in_sgs, data, gfp); > > @@ -1125,8 +1128,6 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq, > > head = vq->packed.next_avail_idx; > > avail_used_flags = vq->packed.avail_used_flags; > > - WARN_ON_ONCE(total_sg > vq->packed.vring.num && !vq->indirect); > > - > > desc = vq->packed.vring.desc; > > i = head; > > descs_used = total_sg; _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply 
* Re: [PATCH 1/3] virtio_ring: always warn when descriptor chain exceeds queue size
  2021-03-22  8:17     ` Michael S. Tsirkin
@ 2021-03-23  2:38       ` Jason Wang
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Wang @ 2021-03-23  2:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: miklos, linux-kernel, virtualization, virtio-fs, stefanha, linux-fsdevel, vgoyal

On 2021/3/22 4:17 PM, Michael S. Tsirkin wrote:
> On Mon, Mar 22, 2021 at 11:22:15AM +0800, Jason Wang wrote:
>> On 2021/3/18 9:52 PM, Connor Kuehl wrote:
>>> From section 2.6.5.3.1 (Driver Requirements: Indirect Descriptors)
>>> of the virtio spec:
>>>
>>>     "A driver MUST NOT create a descriptor chain longer than the Queue
>>>     Size of the device."
>>>
>>> This text suggests that the warning should trigger even if
>>> indirect descriptors are in use.
>>
>> So I think at least the commit log needs some tweaking.
>>
>> For the split virtqueue, we have:
>>
>> 2.6.5.2 Driver Requirements: The Virtqueue Descriptor Table
>>
>> Drivers MUST NOT add a descriptor chain longer than 2^32 bytes in total;
>> this implies that loops in the descriptor chain are forbidden!
>>
>> 2.6.5.3.1 Driver Requirements: Indirect Descriptors
>>
>> A driver MUST NOT create a descriptor chain longer than the Queue Size of
>> the device.
>>
>> If I understand the spec correctly, the check is only needed for a single
>> indirect descriptor table?
>>
>> For the packed virtqueue, we have:
>>
>> 2.7.17 Driver Requirements: Scatter-Gather Support
>>
>> A driver MUST NOT create a descriptor list longer than allowed by the
>> device.
>>
>> A driver MUST NOT create a descriptor list longer than the Queue Size.
>>
>> 2.7.19 Driver Requirements: Indirect Descriptors
>>
>> A driver MUST NOT create a descriptor chain longer than allowed by the
>> device.
>>
>> So it looks to me like the packed part is fine.
>>
>> Note that if I understand the spec correctly, 2.7.17 implies 2.7.19.
>> >> Thanks > It would be quite strange for packed and split to differ here: > so for packed would you say there's no limit on # of descriptors at all? > > I am guessing I just forgot to move this part from > the format specific to the common part of the spec. > > This needs discussion in the TC mailing list - want to start a thread > there? Will do. Thanks > > > >>> Reported-by: Stefan Hajnoczi <stefanha@redhat.com> >>> Signed-off-by: Connor Kuehl <ckuehl@redhat.com> >>> --- >>> drivers/virtio/virtio_ring.c | 7 ++++--- >>> 1 file changed, 4 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c >>> index 71e16b53e9c1..1bc290f9ba13 100644 >>> --- a/drivers/virtio/virtio_ring.c >>> +++ b/drivers/virtio/virtio_ring.c >>> @@ -444,11 +444,12 @@ static inline int virtqueue_add_split(struct virtqueue *_vq, >>> head = vq->free_head; >>> + WARN_ON_ONCE(total_sg > vq->split.vring.num); >>> + >>> if (virtqueue_use_indirect(_vq, total_sg)) >>> desc = alloc_indirect_split(_vq, total_sg, gfp); >>> else { >>> desc = NULL; >>> - WARN_ON_ONCE(total_sg > vq->split.vring.num && !vq->indirect); >>> } >>> if (desc) { >>> @@ -1118,6 +1119,8 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq, >>> BUG_ON(total_sg == 0); >>> + WARN_ON_ONCE(total_sg > vq->packed.vring.num); >>> + >>> if (virtqueue_use_indirect(_vq, total_sg)) >>> return virtqueue_add_indirect_packed(vq, sgs, total_sg, >>> out_sgs, in_sgs, data, gfp); >>> @@ -1125,8 +1128,6 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq, >>> head = vq->packed.next_avail_idx; >>> avail_used_flags = vq->packed.avail_used_flags; >>> - WARN_ON_ONCE(total_sg > vq->packed.vring.num && !vq->indirect); >>> - >>> desc = vq->packed.vring.desc; >>> i = head; >>> descs_used = total_sg; _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization 
[parent not found: <20210318135223.1342795-4-ckuehl@redhat.com>]
* Re: [PATCH 3/3] fuse: fix typo for fuse_conn.max_pages comment
       [not found] ` <20210318135223.1342795-4-ckuehl@redhat.com>
@ 2021-03-22  3:42   ` Jason Wang
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Wang @ 2021-03-22  3:42 UTC (permalink / raw)
  To: Connor Kuehl, virtio-fs
  Cc: miklos, mst, linux-kernel, virtualization, stefanha, linux-fsdevel, vgoyal

On 2021/3/18 9:52 PM, Connor Kuehl wrote:
> 'Maxmum' -> 'Maximum'

This needs a better commit log.

With the commit log fixed:

Acked-by: Jason Wang <jasowang@redhat.com>

>
> Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
> ---
>  fs/fuse/fuse_i.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index f0e4ee906464..8bdee79ba593 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -552,7 +552,7 @@ struct fuse_conn {
>  	/** Maximum write size */
>  	unsigned max_write;
>
> -	/** Maxmum number of pages that can be used in a single request */
> +	/** Maximum number of pages that can be used in a single request */
>  	unsigned int max_pages;
>
>  #if IS_ENABLED(CONFIG_VIRTIO_FS)
[parent not found: <20210318135223.1342795-3-ckuehl@redhat.com>]
* Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size [not found] ` <20210318135223.1342795-3-ckuehl@redhat.com> @ 2021-03-19 13:49 ` Vivek Goyal 2021-03-19 14:16 ` Connor Kuehl 2021-03-22 15:47 ` Stefan Hajnoczi [not found] ` <YFNvH8w4l7WyEMyr@miu.piliscsaba.redhat.com> 2 siblings, 1 reply; 12+ messages in thread From: Vivek Goyal @ 2021-03-19 13:49 UTC (permalink / raw) To: Connor Kuehl Cc: miklos, mst, linux-kernel, virtualization, virtio-fs, stefanha, linux-fsdevel On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote: > If an incoming FUSE request can't fit on the virtqueue, the request is > placed onto a workqueue so a worker can try to resubmit it later where > there will (hopefully) be space for it next time. > > This is fine for requests that aren't larger than a virtqueue's maximum > capacity. However, if a request's size exceeds the maximum capacity of > the virtqueue (even if the virtqueue is empty), it will be doomed to a > life of being placed on the workqueue, removed, discovered it won't fit, > and placed on the workqueue yet again. > > Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect > Descriptors) of the virtio spec: > > "A driver MUST NOT create a descriptor chain longer than the Queue > Size of the device." > > To fix this, limit the number of pages FUSE will use for an overall > request. This way, each request can realistically fit on the virtqueue > when it is decomposed into a scattergather list and avoid violating > section 2.6.5.3.1 of the virtio spec. Hi Connor, So as of now if a request is bigger than what virtqueue can support, it never gets dispatched and caller waits infinitely? So this patch will fix it by forcing fuse to split the request. That sounds good. [..] 
> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> index 8868ac31a3c0..a6ffba85d59a 100644
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -18,6 +18,12 @@
>  #include <linux/uio.h>
>  #include "fuse_i.h"
>
> +/* Used to help calculate the FUSE connection's max_pages limit for a request's
> + * size. Parts of the struct fuse_req are sliced into scattergather lists in
> + * addition to the pages used, so this can help account for that overhead.
> + */
> +#define FUSE_HEADER_OVERHEAD    4

How did you arrive at this overhead? Is it the following?

- One sg element for fuse_in_header.
- One sg element for input arguments.
- One sg element for fuse_out_header.
- One sg element for output args.

Thanks
Vivek
* Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size 2021-03-19 13:49 ` [PATCH 2/3] virtiofs: split requests that exceed virtqueue size Vivek Goyal @ 2021-03-19 14:16 ` Connor Kuehl 0 siblings, 0 replies; 12+ messages in thread From: Connor Kuehl @ 2021-03-19 14:16 UTC (permalink / raw) To: Vivek Goyal Cc: miklos, mst, linux-kernel, virtualization, virtio-fs, stefanha, linux-fsdevel On 3/19/21 8:49 AM, Vivek Goyal wrote: > On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote: >> If an incoming FUSE request can't fit on the virtqueue, the request is >> placed onto a workqueue so a worker can try to resubmit it later where >> there will (hopefully) be space for it next time. >> >> This is fine for requests that aren't larger than a virtqueue's maximum >> capacity. However, if a request's size exceeds the maximum capacity of >> the virtqueue (even if the virtqueue is empty), it will be doomed to a >> life of being placed on the workqueue, removed, discovered it won't fit, >> and placed on the workqueue yet again. >> >> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect >> Descriptors) of the virtio spec: >> >> "A driver MUST NOT create a descriptor chain longer than the Queue >> Size of the device." >> >> To fix this, limit the number of pages FUSE will use for an overall >> request. This way, each request can realistically fit on the virtqueue >> when it is decomposed into a scattergather list and avoid violating >> section 2.6.5.3.1 of the virtio spec. > > Hi Connor, > > So as of now if a request is bigger than what virtqueue can support, > it never gets dispatched and caller waits infinitely? So this patch > will fix it by forcing fuse to split the request. That sounds good. Right, in theory. 
Certain configurations make it easier to keep this from happening, such
as using indirect descriptors; however, in that case, the virtio spec
says that even if indirect descriptors are used, the descriptor chain
length shouldn't exceed the queue size anyway. So having FUSE split the
request also helps to uphold that property.

This is my reading of the potential looping problem:

virtio_fs_wake_pending_and_unlock
  calls
    virtio_fs_enqueue_req
      calls
        virtqueue_add_sgs

virtqueue_add_sgs can return -ENOSPC if there aren't enough descriptors
available.

This error gets propagated back up to virtio_fs_wake_pending_and_unlock,
which checks for this exact issue and places the request on a workqueue
to retry submission later.

Resubmission occurs in virtio_fs_request_dispatch_work, which does a
similar dance: if the request fails with -ENOSPC, it just puts it back
in the queue. However, for a sufficiently large request that would
exceed the capacity of the virtqueue (even when empty), no amount of
retrying will ever make it fit.

>
>
> [..]
>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
>> index 8868ac31a3c0..a6ffba85d59a 100644
>> --- a/fs/fuse/virtio_fs.c
>> +++ b/fs/fuse/virtio_fs.c
>> @@ -18,6 +18,12 @@
>>  #include <linux/uio.h>
>>  #include "fuse_i.h"
>>
>> +/* Used to help calculate the FUSE connection's max_pages limit for a request's
>> + * size. Parts of the struct fuse_req are sliced into scattergather lists in
>> + * addition to the pages used, so this can help account for that overhead.
>> + */
>> +#define FUSE_HEADER_OVERHEAD    4
>
> How did you arrive at this overhead? Is it the following?
>
> - One sg element for fuse_in_header.
> - One sg element for input arguments.
> - One sg element for fuse_out_header.
> - One sg element for output args.

Yes, that's exactly how I got to that number.
Connor _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 12+ messages in thread
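The retry problem Connor describes above can be shown with a toy model. This is a userspace sketch, not the virtiofs code: toy_enqueue() stands in for virtqueue_add_sgs(), toy_dispatch() for the requeue-and-retry path, and the max_attempts budget is an assumption added only so the model terminates (the loop described in the thread has no such bound).

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for virtqueue_add_sgs(): succeeds only when enough free
 * descriptors are available, otherwise reports -ENOSPC. */
int toy_enqueue(unsigned int needed, unsigned int free_descs)
{
	return needed <= free_descs ? 0 : -ENOSPC;
}

/* Toy stand-in for the retry path. Each failed attempt is followed by
 * one descriptor being freed (a completed request), up to an empty
 * queue. Returns the attempt number that succeeded, or -1 if the
 * budget ran out. */
int toy_dispatch(unsigned int needed, unsigned int queue_size,
		 unsigned int free_descs, int max_attempts)
{
	assert(free_descs <= queue_size);

	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (toy_enqueue(needed, free_descs) == 0)
			return attempt;

		/* -ENOSPC: put the request back and try again later. */
		if (free_descs < queue_size)
			free_descs++;
	}
	return -1;
}
```

A request needing 8 descriptors on a 16-entry queue with 4 free succeeds once enough descriptors drain, whereas a request needing 20 descriptors on the same queue exhausts any budget, because even an empty queue offers only 16.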
* Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size [not found] ` <20210318135223.1342795-3-ckuehl@redhat.com> 2021-03-19 13:49 ` [PATCH 2/3] virtiofs: split requests that exceed virtqueue size Vivek Goyal @ 2021-03-22 15:47 ` Stefan Hajnoczi [not found] ` <YFNvH8w4l7WyEMyr@miu.piliscsaba.redhat.com> 2 siblings, 0 replies; 12+ messages in thread From: Stefan Hajnoczi @ 2021-03-22 15:47 UTC (permalink / raw) To: Connor Kuehl Cc: miklos, mst, linux-kernel, virtualization, virtio-fs, linux-fsdevel, vgoyal [-- Attachment #1.1: Type: text/plain, Size: 1397 bytes --] On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote: > If an incoming FUSE request can't fit on the virtqueue, the request is > placed onto a workqueue so a worker can try to resubmit it later where > there will (hopefully) be space for it next time. > > This is fine for requests that aren't larger than a virtqueue's maximum > capacity. However, if a request's size exceeds the maximum capacity of > the virtqueue (even if the virtqueue is empty), it will be doomed to a > life of being placed on the workqueue, removed, discovered it won't fit, > and placed on the workqueue yet again. > > Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect > Descriptors) of the virtio spec: > > "A driver MUST NOT create a descriptor chain longer than the Queue > Size of the device." > > To fix this, limit the number of pages FUSE will use for an overall > request. This way, each request can realistically fit on the virtqueue > when it is decomposed into a scattergather list and avoid violating > section 2.6.5.3.1 of the virtio spec. > > Signed-off-by: Connor Kuehl <ckuehl@redhat.com> > --- > fs/fuse/fuse_i.h | 5 +++++ > fs/fuse/inode.c | 7 +++++++ > fs/fuse/virtio_fs.c | 14 ++++++++++++++ > 3 files changed, 26 insertions(+) Nice that FUSE already has max_pages :-). 
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] [-- Attachment #2: Type: text/plain, Size: 183 bytes --] _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <YFNvH8w4l7WyEMyr@miu.piliscsaba.redhat.com>]
[parent not found: <00c5dce8-fc2d-6e68-e3bc-a958ca5d2342@redhat.com>]
* Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size [not found] ` <00c5dce8-fc2d-6e68-e3bc-a958ca5d2342@redhat.com> @ 2021-03-20 20:04 ` Michael S. Tsirkin 0 siblings, 0 replies; 12+ messages in thread From: Michael S. Tsirkin @ 2021-03-20 20:04 UTC (permalink / raw) To: Connor Kuehl Cc: Miklos Szeredi, linux-kernel, virtualization, virtio-fs, stefanha, linux-fsdevel, vgoyal On Thu, Mar 18, 2021 at 10:52:14AM -0500, Connor Kuehl wrote: > On 3/18/21 10:17 AM, Miklos Szeredi wrote: > > I removed the conditional compilation and renamed the limit. Also made > > virtio_fs_get_tree() bail out if it hit the WARN_ON(). Updated patch below. > > Thanks, Miklos. I think it looks better with those changes. > > > The virtio_ring patch in this series should probably go through the respective > > subsystem tree. > > Makes sense. I've CC'd everyone that ./scripts/get_maintainers.pl suggested > for that patch on this entire series as well. Should I resend patch #1 > through just that subsystem to avoid confusion or wait to see if it gets > picked out of this series? Yes pls post separately. Thanks! > Thanks again, > > Connor _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size [not found] ` <YFNvH8w4l7WyEMyr@miu.piliscsaba.redhat.com> [not found] ` <00c5dce8-fc2d-6e68-e3bc-a958ca5d2342@redhat.com> @ 2021-03-22 19:01 ` Vivek Goyal 2021-03-24 15:09 ` Connor Kuehl 2 siblings, 0 replies; 12+ messages in thread From: Vivek Goyal @ 2021-03-22 19:01 UTC (permalink / raw) To: Miklos Szeredi Cc: mst, linux-kernel, virtualization, virtio-fs, stefanha, linux-fsdevel On Thu, Mar 18, 2021 at 04:17:51PM +0100, Miklos Szeredi wrote: > On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote: > > If an incoming FUSE request can't fit on the virtqueue, the request is > > placed onto a workqueue so a worker can try to resubmit it later where > > there will (hopefully) be space for it next time. > > > > This is fine for requests that aren't larger than a virtqueue's maximum > > capacity. However, if a request's size exceeds the maximum capacity of > > the virtqueue (even if the virtqueue is empty), it will be doomed to a > > life of being placed on the workqueue, removed, discovered it won't fit, > > and placed on the workqueue yet again. > > > > Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect > > Descriptors) of the virtio spec: > > > > "A driver MUST NOT create a descriptor chain longer than the Queue > > Size of the device." > > > > To fix this, limit the number of pages FUSE will use for an overall > > request. This way, each request can realistically fit on the virtqueue > > when it is decomposed into a scattergather list and avoid violating > > section 2.6.5.3.1 of the virtio spec. > > I removed the conditional compilation and renamed the limit. Also made > virtio_fs_get_tree() bail out if it hit the WARN_ON(). Updated patch below. > > The virtio_ring patch in this series should probably go through the respective > subsystem tree. 
>
>
> Thanks,
> Miklos
>
> ---
> From: Connor Kuehl <ckuehl@redhat.com>
> Subject: virtiofs: split requests that exceed virtqueue size
> Date: Thu, 18 Mar 2021 08:52:22 -0500
>
> If an incoming FUSE request can't fit on the virtqueue, the request is
> placed onto a workqueue so a worker can try to resubmit it later where
> there will (hopefully) be space for it next time.
>
> This is fine for requests that aren't larger than a virtqueue's maximum
> capacity.  However, if a request's size exceeds the maximum capacity of the
> virtqueue (even if the virtqueue is empty), it will be doomed to a life of
> being placed on the workqueue, removed, discovered it won't fit, and placed
> on the workqueue yet again.
>
> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
> Descriptors) of the virtio spec:
>
>   "A driver MUST NOT create a descriptor chain longer than the Queue
>   Size of the device."
>
> To fix this, limit the number of pages FUSE will use for an overall
> request.  This way, each request can realistically fit on the virtqueue
> when it is decomposed into a scattergather list and avoid violating section
> 2.6.5.3.1 of the virtio spec.
>
> Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---

Looks good to me.

Reviewed-by: Vivek Goyal <vgoyal@redhat.com>

Vivek

>  fs/fuse/fuse_i.h    |    3 +++
>  fs/fuse/inode.c     |    3 ++-
>  fs/fuse/virtio_fs.c |   19 +++++++++++++++++--
>  3 files changed, 22 insertions(+), 3 deletions(-)
>
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -555,6 +555,9 @@ struct fuse_conn {
>  	/** Maxmum number of pages that can be used in a single request */
>  	unsigned int max_pages;
>
> +	/** Constrain ->max_pages to this value during feature negotiation */
> +	unsigned int max_pages_limit;
> +
>  	/** Input queue */
>  	struct fuse_iqueue iq;
>
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -712,6 +712,7 @@ void fuse_conn_init(struct fuse_conn *fc
>  	fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
>  	fc->user_ns = get_user_ns(user_ns);
>  	fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
> +	fc->max_pages_limit = FUSE_MAX_MAX_PAGES;
>
>  	INIT_LIST_HEAD(&fc->mounts);
>  	list_add(&fm->fc_entry, &fc->mounts);
> @@ -1040,7 +1041,7 @@ static void process_init_reply(struct fu
>  			fc->abort_err = 1;
>  		if (arg->flags & FUSE_MAX_PAGES) {
>  			fc->max_pages =
> -				min_t(unsigned int, FUSE_MAX_MAX_PAGES,
> +				min_t(unsigned int, fc->max_pages_limit,
>  				max_t(unsigned int, arg->max_pages, 1));
>  		}
>  		if (IS_ENABLED(CONFIG_FUSE_DAX) &&
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -18,6 +18,12 @@
>  #include <linux/uio.h>
>  #include "fuse_i.h"
>
> +/* Used to help calculate the FUSE connection's max_pages limit for a request's
> + * size. Parts of the struct fuse_req are sliced into scattergather lists in
> + * addition to the pages used, so this can help account for that overhead.
> + */
> +#define FUSE_HEADER_OVERHEAD    4
> +
>  /* List of virtio-fs device instances and a lock for the list. Also provides
>   * mutual exclusion in device removal and mounting path
>   */
> @@ -1413,9 +1419,10 @@ static int virtio_fs_get_tree(struct fs_
>  {
>  	struct virtio_fs *fs;
>  	struct super_block *sb;
> -	struct fuse_conn *fc;
> +	struct fuse_conn *fc = NULL;
>  	struct fuse_mount *fm;
> -	int err;
> +	unsigned int virtqueue_size;
> +	int err = -EIO;
>
>  	/* This gets a reference on virtio_fs object. This ptr gets installed
>  	 * in fc->iq->priv. Once fuse_conn is going away, it calls ->put()
> @@ -1427,6 +1434,10 @@ static int virtio_fs_get_tree(struct fs_
>  		return -EINVAL;
>  	}
>
> +	virtqueue_size = virtqueue_get_vring_size(fs->vqs[VQ_REQUEST].vq);
> +	if (WARN_ON(virtqueue_size <= FUSE_HEADER_OVERHEAD))
> +		goto out_err;
> +
>  	err = -ENOMEM;
>  	fc = kzalloc(sizeof(struct fuse_conn), GFP_KERNEL);
>  	if (!fc)
> @@ -1442,6 +1453,10 @@ static int virtio_fs_get_tree(struct fs_
>  	fc->delete_stale = true;
>  	fc->auto_submounts = true;
>
> +	/* Tell FUSE to split requests that exceed the virtqueue's size */
> +	fc->max_pages_limit = min_t(unsigned int, fc->max_pages_limit,
> +				    virtqueue_size - FUSE_HEADER_OVERHEAD);
> +
>  	fsc->s_fs_info = fm;
>  	sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
>  	if (fsc->s_fs_info) {
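
For readers following along, the clamp arithmetic in the patch quoted in this thread can be tried out in userspace. This is only a rough sketch, not kernel code: EXAMPLE_MAX_MAX_PAGES is an assumed stand-in for FUSE_MAX_MAX_PAGES (the real value is defined in fs/fuse/fuse_i.h), and the helper names min_uint(), max_uint(), max_pages_limit_for_queue() and negotiate_max_pages() are made up for illustration. Only FUSE_HEADER_OVERHEAD (4) is taken from the patch itself.

```c
#include <stdio.h>

/* Taken from the patch: descriptors reserved for the non-page parts of a request. */
#define FUSE_HEADER_OVERHEAD 4

/* ASSUMPTION: illustrative stand-in for FUSE_MAX_MAX_PAGES in fs/fuse/fuse_i.h. */
#define EXAMPLE_MAX_MAX_PAGES 256

/* Hypothetical userspace replacements for the kernel's min_t()/max_t(). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static unsigned int max_uint(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Mirrors the virtio_fs_get_tree() hunk: a queue that cannot hold more than
 * the header overhead is rejected (the patch bails out with -EIO there; this
 * sketch returns 0 instead). Otherwise the limit is the smaller of the
 * generic cap and the descriptors left over for pages.
 */
static unsigned int max_pages_limit_for_queue(unsigned int virtqueue_size)
{
	if (virtqueue_size <= FUSE_HEADER_OVERHEAD)
		return 0;
	return min_uint(EXAMPLE_MAX_MAX_PAGES,
			virtqueue_size - FUSE_HEADER_OVERHEAD);
}

/*
 * Mirrors the process_init_reply() hunk: the value requested by the server
 * is raised to at least 1 and then capped by the per-connection limit.
 */
static unsigned int negotiate_max_pages(unsigned int limit,
					unsigned int requested)
{
	return min_uint(limit, max_uint(requested, 1));
}

/* Small helper that prints the outcome for one queue size. */
static void show_queue(unsigned int virtqueue_size, unsigned int requested)
{
	unsigned int limit = max_pages_limit_for_queue(virtqueue_size);

	if (limit == 0) {
		printf("queue %u: too small, mount would fail\n", virtqueue_size);
		return;
	}
	printf("queue %u: limit %u, negotiated max_pages %u\n",
	       virtqueue_size, limit, negotiate_max_pages(limit, requested));
}
```

Calling show_queue(128, 256) from a main() shows the interesting case: the queue leaves 124 descriptors for pages, so a server asking for 256 pages ends up with 124, and the whole request (pages plus overhead) fits in the 128-entry queue.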
* Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size
       [not found]   ` <YFNvH8w4l7WyEMyr@miu.piliscsaba.redhat.com>
       [not found]     ` <00c5dce8-fc2d-6e68-e3bc-a958ca5d2342@redhat.com>
  2021-03-22 19:01     ` Vivek Goyal
@ 2021-03-24 15:09     ` Connor Kuehl
       [not found]       ` <CAJfpeguzdPV13LhXFL0U_bfKcpOHQCvg2wfxF6ryksp==tjVWA@mail.gmail.com>
  2 siblings, 1 reply; 12+ messages in thread
From: Connor Kuehl @ 2021-03-24 15:09 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: mst, linux-kernel, virtualization, virtio-fs, stefanha,
	linux-fsdevel, vgoyal

On 3/18/21 10:17 AM, Miklos Szeredi wrote:
> I removed the conditional compilation and renamed the limit.  Also made
> virtio_fs_get_tree() bail out if it hit the WARN_ON().  Updated patch below.

Hi Miklos,

Has this patch been queued?

Connor
[parent not found: <CAJfpeguzdPV13LhXFL0U_bfKcpOHQCvg2wfxF6ryksp==tjVWA@mail.gmail.com>]
* Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size
       [not found]       ` <CAJfpeguzdPV13LhXFL0U_bfKcpOHQCvg2wfxF6ryksp==tjVWA@mail.gmail.com>
@ 2021-03-24 15:31         ` Connor Kuehl
  0 siblings, 0 replies; 12+ messages in thread
From: Connor Kuehl @ 2021-03-24 15:31 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Michael S. Tsirkin, linux-kernel, virtualization, virtio-fs-list,
	Stefan Hajnoczi, linux-fsdevel, Vivek Goyal

On 3/24/21 10:30 AM, Miklos Szeredi wrote:
> On Wed, Mar 24, 2021 at 4:09 PM Connor Kuehl <ckuehl@redhat.com> wrote:
>>
>> On 3/18/21 10:17 AM, Miklos Szeredi wrote:
>>> I removed the conditional compilation and renamed the limit.  Also made
>>> virtio_fs_get_tree() bail out if it hit the WARN_ON().  Updated patch below.
>>
>> Hi Miklos,
>>
>> Has this patch been queued?
>
> It's in my internal patch queue at the moment.  Will push to
> fuse.git#for-next in a couple of days.

Cool! Thank you :-)

Connor
Thread overview: 12+ messages
[not found] <20210318135223.1342795-1-ckuehl@redhat.com>
[not found] ` <20210318135223.1342795-2-ckuehl@redhat.com>
2021-03-22 3:22 ` [PATCH 1/3] virtio_ring: always warn when descriptor chain exceeds queue size Jason Wang
2021-03-22 3:41 ` Jason Wang
2021-03-22 8:17 ` Michael S. Tsirkin
2021-03-23 2:38 ` Jason Wang
[not found] ` <20210318135223.1342795-4-ckuehl@redhat.com>
2021-03-22 3:42 ` [PATCH 3/3] fuse: fix typo for fuse_conn.max_pages comment Jason Wang
[not found] ` <20210318135223.1342795-3-ckuehl@redhat.com>
2021-03-19 13:49 ` [PATCH 2/3] virtiofs: split requests that exceed virtqueue size Vivek Goyal
2021-03-19 14:16 ` Connor Kuehl
2021-03-22 15:47 ` Stefan Hajnoczi
[not found] ` <YFNvH8w4l7WyEMyr@miu.piliscsaba.redhat.com>
[not found] ` <00c5dce8-fc2d-6e68-e3bc-a958ca5d2342@redhat.com>
2021-03-20 20:04 ` Michael S. Tsirkin
2021-03-22 19:01 ` Vivek Goyal
2021-03-24 15:09 ` Connor Kuehl
[not found] ` <CAJfpeguzdPV13LhXFL0U_bfKcpOHQCvg2wfxF6ryksp==tjVWA@mail.gmail.com>
2021-03-24 15:31 ` Connor Kuehl