* Re: [PATCH] ethernet: renesas: convert to SPDX identifiers
From: Sergei Shtylyov @ 2018-09-07 9:05 UTC (permalink / raw)
To: Kuninori Morimoto, David S. Miller, Mark Brown
Cc: Geert Uytterhoeven, Robin Murphy, netdev, linux-renesas-soc
In-Reply-To: <875zziqdem.wl-kuninori.morimoto.gx@renesas.com>
Hello!
On 9/7/2018 5:02 AM, Kuninori Morimoto wrote:
> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>
> This patch updates license to use SPDX-License-Identifier
> instead of verbose license text.
>
> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> ---
> drivers/net/ethernet/renesas/Kconfig | 1 +
> drivers/net/ethernet/renesas/Makefile | 1 +
> drivers/net/ethernet/renesas/ravb_ptp.c | 6 +-----
> 3 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/renesas/Kconfig b/drivers/net/ethernet/renesas/Kconfig
> index f3f7477..bb0ebdf 100644
> --- a/drivers/net/ethernet/renesas/Kconfig
> +++ b/drivers/net/ethernet/renesas/Kconfig
> @@ -1,3 +1,4 @@
> +# SPDX-License-Identifier: GPL-2.0
> #
> # Renesas device configuration
> #
There was no license at all on this file...
> diff --git a/drivers/net/ethernet/renesas/Makefile b/drivers/net/ethernet/renesas/Makefile
> index a05102a..f21ab8c 100644
> --- a/drivers/net/ethernet/renesas/Makefile
> +++ b/drivers/net/ethernet/renesas/Makefile
> @@ -1,3 +1,4 @@
> +# SPDX-License-Identifier: GPL-2.0
> #
> # Makefile for the Renesas device drivers.
> #
Likewise.
> diff --git a/drivers/net/ethernet/renesas/ravb_ptp.c b/drivers/net/ethernet/renesas/ravb_ptp.c
> index eede70e..0721b5c 100644
> --- a/drivers/net/ethernet/renesas/ravb_ptp.c
> +++ b/drivers/net/ethernet/renesas/ravb_ptp.c
> @@ -1,13 +1,9 @@
> +// SPDX-License-Identifier: GPL-2.0+
> /* PTP 1588 clock using the Renesas Ethernet AVB
> *
> * Copyright (C) 2013-2015 Renesas Electronics Corporation
> * Copyright (C) 2015 Renesas Solutions Corp.
> * Copyright (C) 2015-2016 Cogent Embedded, Inc. <source@cogentembedded.com>
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License as published by
> - * the Free Software Foundation; either version 2 of the License, or
> - * (at your option) any later version.
> */
>
> #include "ravb.h"
This part looks valid, don't know why this file was missed by Wolfram who
did the previous patch...
MBR, Sergei
^ permalink raw reply
* Re: [PATCH] xen/netfront: fix waiting for xenbus state change
From: Boris Ostrovsky @ 2018-09-07 13:46 UTC (permalink / raw)
To: Juergen Gross, linux-kernel, xen-devel, netdev; +Cc: davem, stable
In-Reply-To: <20180907122130.30810-1-jgross@suse.com>
On 09/07/2018 08:21 AM, Juergen Gross wrote:
> Commit 822fb18a82aba ("xen-netfront: wait xenbus state change when load
> module manually") added a new wait queue to wait on for a state change
> when the module is loaded manually. Unfortunately there is no wakeup
> anywhere to stop that waiting.
>
> Instead of introducing a new wait queue rename the existing
> module_unload_q to module_wq and use it for both purposes (loading and
> unloading).
>
> As any state change of the backend might be intended to stop waiting
> do the wake_up_all() in any case when netback_changed() is called.
>
> Fixes: 822fb18a82aba ("xen-netfront: wait xenbus state change when load module manually")
> Cc: <stable@vger.kernel.org> #4.18
> Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
^ permalink raw reply
* Re: [PATCH net-next v2 3/5] virtio_ring: add packed ring support
From: Michael S. Tsirkin @ 2018-09-07 13:49 UTC (permalink / raw)
To: Tiwei Bie
Cc: jasowang, virtualization, linux-kernel, netdev, virtio-dev, wexu,
jfreimann
In-Reply-To: <20180711022711.7090-4-tiwei.bie@intel.com>
On Wed, Jul 11, 2018 at 10:27:09AM +0800, Tiwei Bie wrote:
> This commit introduces the support (without EVENT_IDX) for
> packed ring.
>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
> drivers/virtio/virtio_ring.c | 495 ++++++++++++++++++++++++++++++++++-
> 1 file changed, 487 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index c4f8abc7445a..f317b485ba54 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -55,12 +55,21 @@
> #define END_USE(vq)
> #endif
>
> +#define _VRING_DESC_F_AVAIL(b) ((__u16)(b) << 7)
> +#define _VRING_DESC_F_USED(b) ((__u16)(b) << 15)
Underscore followed by an upper case letter isn't a good idea
for a symbol. And it's not nice that this does not
use the VRING_DESC_F_USED from the header.
How about ((b) ? VRING_DESC_F_USED : 0) instead?
Is produced code worse then?
> +
> struct vring_desc_state {
> void *data; /* Data for callback. */
> struct vring_desc *indir_desc; /* Indirect descriptor, if any. */
> };
>
> struct vring_desc_state_packed {
> + void *data; /* Data for callback. */
> + struct vring_packed_desc *indir_desc; /* Indirect descriptor, if any. */
Include vring_desc_state for these?
> + int num; /* Descriptor list length. */
> + dma_addr_t addr; /* Buffer DMA addr. */
> + u32 len; /* Buffer length. */
> + u16 flags; /* Descriptor flags. */
This seems only to be used for indirect. Check indirect field above
instead and drop this.
> int next; /* The next desc state. */
Packing things into 16 bit integers will help reduce
cache pressure here.
Also, this is only used for unmap, so when dma API is not used,
maintaining addr and len list management is unnecessary. How about we
maintain addr/len in a separate array, avoiding accesses on unmap in that case?
Also, lots of architectures have a nop unmap in the DMA API.
See a proposed patch at the end for optimizing that case.
> };
>
> @@ -660,7 +669,6 @@ static bool virtqueue_poll_split(struct virtqueue *_vq, unsigned last_used_idx)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - virtio_mb(vq->weak_barriers);
> return (u16)last_used_idx != virtio16_to_cpu(_vq->vdev, vq->vring.used->idx);
> }
why is this changing the split queue implementation?
>
> @@ -757,6 +765,72 @@ static inline unsigned vring_size_packed(unsigned int num, unsigned long align)
> & ~(align - 1)) + sizeof(struct vring_packed_desc_event) * 2;
> }
>
> +static void vring_unmap_state_packed(const struct vring_virtqueue *vq,
> + struct vring_desc_state_packed *state)
> +{
> + u16 flags;
> +
> + if (!vring_use_dma_api(vq->vq.vdev))
> + return;
> +
> + flags = state->flags;
> +
> + if (flags & VRING_DESC_F_INDIRECT) {
> + dma_unmap_single(vring_dma_dev(vq),
> + state->addr, state->len,
> + (flags & VRING_DESC_F_WRITE) ?
> + DMA_FROM_DEVICE : DMA_TO_DEVICE);
> + } else {
> + dma_unmap_page(vring_dma_dev(vq),
> + state->addr, state->len,
> + (flags & VRING_DESC_F_WRITE) ?
> + DMA_FROM_DEVICE : DMA_TO_DEVICE);
> + }
> +}
> +
> +static void vring_unmap_desc_packed(const struct vring_virtqueue *vq,
> + struct vring_packed_desc *desc)
> +{
> + u16 flags;
> +
> + if (!vring_use_dma_api(vq->vq.vdev))
> + return;
> +
> + flags = virtio16_to_cpu(vq->vq.vdev, desc->flags);
I see no reason to use virtioXX wrappers for the packed ring.
That's there to support legacy devices. Just use leXX.
> +
> + if (flags & VRING_DESC_F_INDIRECT) {
> + dma_unmap_single(vring_dma_dev(vq),
> + virtio64_to_cpu(vq->vq.vdev, desc->addr),
> + virtio32_to_cpu(vq->vq.vdev, desc->len),
> + (flags & VRING_DESC_F_WRITE) ?
> + DMA_FROM_DEVICE : DMA_TO_DEVICE);
> + } else {
> + dma_unmap_page(vring_dma_dev(vq),
> + virtio64_to_cpu(vq->vq.vdev, desc->addr),
> + virtio32_to_cpu(vq->vq.vdev, desc->len),
> + (flags & VRING_DESC_F_WRITE) ?
> + DMA_FROM_DEVICE : DMA_TO_DEVICE);
> + }
> +}
> +
> +static struct vring_packed_desc *alloc_indirect_packed(struct virtqueue *_vq,
> + unsigned int total_sg,
> + gfp_t gfp)
> +{
> + struct vring_packed_desc *desc;
> +
> + /*
> + * We require lowmem mappings for the descriptors because
> + * otherwise virt_to_phys will give us bogus addresses in the
> + * virtqueue.
Where is virt_to_phys used? I don't see it in this patch.
> + */
> + gfp &= ~__GFP_HIGHMEM;
> +
> + desc = kmalloc(total_sg * sizeof(struct vring_packed_desc), gfp);
> +
> + return desc;
> +}
> +
> static inline int virtqueue_add_packed(struct virtqueue *_vq,
> struct scatterlist *sgs[],
> unsigned int total_sg,
> @@ -766,47 +840,449 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
> void *ctx,
> gfp_t gfp)
> {
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + struct vring_packed_desc *desc;
> + struct scatterlist *sg;
> + unsigned int i, n, descs_used, uninitialized_var(prev), err_idx;
> + __virtio16 uninitialized_var(head_flags), flags;
> + u16 head, avail_wrap_counter, id, curr;
> + bool indirect;
> +
> + START_USE(vq);
> +
> + BUG_ON(data == NULL);
> + BUG_ON(ctx && vq->indirect);
> +
> + if (unlikely(vq->broken)) {
> + END_USE(vq);
> + return -EIO;
> + }
> +
> +#ifdef DEBUG
> + {
> + ktime_t now = ktime_get();
> +
> + /* No kick or get, with .1 second between? Warn. */
> + if (vq->last_add_time_valid)
> + WARN_ON(ktime_to_ms(ktime_sub(now, vq->last_add_time))
> + > 100);
> + vq->last_add_time = now;
> + vq->last_add_time_valid = true;
> + }
> +#endif
Add incline helpers for this debug stuff rather than
duplicate it from split ring?
> +
> + BUG_ON(total_sg == 0);
> +
> + head = vq->next_avail_idx;
> + avail_wrap_counter = vq->avail_wrap_counter;
> +
> + if (virtqueue_use_indirect(_vq, total_sg))
> + desc = alloc_indirect_packed(_vq, total_sg, gfp);
> + else {
> + desc = NULL;
> + WARN_ON_ONCE(total_sg > vq->vring_packed.num && !vq->indirect);
> + }
Apparently this attempts to treat indirect descriptors same as
direct ones. This is how it is in the split ring, but not in
the packed one. I think you want two separate functions, for
direct and indirect.
> +
> + if (desc) {
> + /* Use a single buffer which doesn't continue */
> + indirect = true;
> + /* Set up rest to use this indirect table. */
> + i = 0;
> + descs_used = 1;
> + } else {
> + indirect = false;
> + desc = vq->vring_packed.desc;
> + i = head;
> + descs_used = total_sg;
> + }
> +
> + if (vq->vq.num_free < descs_used) {
> + pr_debug("Can't add buf len %i - avail = %i\n",
> + descs_used, vq->vq.num_free);
> + /* FIXME: for historical reasons, we force a notify here if
> + * there are outgoing parts to the buffer. Presumably the
> + * host should service the ring ASAP. */
> + if (out_sgs)
> + vq->notify(&vq->vq);
> + if (indirect)
> + kfree(desc);
> + END_USE(vq);
> + return -ENOSPC;
> + }
> +
> + id = vq->free_head;
> + BUG_ON(id == vq->vring_packed.num);
> +
> + curr = id;
> + for (n = 0; n < out_sgs + in_sgs; n++) {
> + for (sg = sgs[n]; sg; sg = sg_next(sg)) {
> + dma_addr_t addr = vring_map_one_sg(vq, sg, n < out_sgs ?
> + DMA_TO_DEVICE : DMA_FROM_DEVICE);
> + if (vring_mapping_error(vq, addr))
> + goto unmap_release;
> +
> + flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT |
> + (n < out_sgs ? 0 : VRING_DESC_F_WRITE) |
> + _VRING_DESC_F_AVAIL(vq->avail_wrap_counter) |
> + _VRING_DESC_F_USED(!vq->avail_wrap_counter));
Spec says:
The VIRTQ_DESC_F_WRITE flags bit is the only valid flag for descriptors in the
indirect table.
All this logic isn't needed for indirect.
Also, why re-calculate avail/used flags every time? They only change
when you wrap around.
> + if (!indirect && i == head)
> + head_flags = flags;
> + else
> + desc[i].flags = flags;
> +
> + desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
> + desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
> + i++;
> + if (!indirect) {
> + if (vring_use_dma_api(_vq->vdev)) {
> + vq->desc_state_packed[curr].addr = addr;
> + vq->desc_state_packed[curr].len =
> + sg->length;
> + vq->desc_state_packed[curr].flags =
> + virtio16_to_cpu(_vq->vdev,
> + flags);
> + }
> + curr = vq->desc_state_packed[curr].next;
> +
> + if (i >= vq->vring_packed.num) {
> + i = 0;
> + vq->avail_wrap_counter ^= 1;
> + }
> + }
> + }
> + }
> +
> + prev = (i > 0 ? i : vq->vring_packed.num) - 1;
> + desc[prev].id = cpu_to_virtio16(_vq->vdev, id);
Is it easier to write this out in all descriptors, to avoid the need to
calculate prev?
> +
> + /* Last one doesn't continue. */
> + if (total_sg == 1)
> + head_flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);
> + else
> + desc[prev].flags &= cpu_to_virtio16(_vq->vdev,
> + ~VRING_DESC_F_NEXT);
Wouldn't it be easier to avoid setting VRING_DESC_F_NEXT
in the first place?
if (n != in_sgs - 1 && n != out_sgs - 1)
must be better than writing descriptor an extra time.
> +
> + if (indirect) {
> + /* Now that the indirect table is filled in, map it. */
> + dma_addr_t addr = vring_map_single(
> + vq, desc, total_sg * sizeof(struct vring_packed_desc),
> + DMA_TO_DEVICE);
> + if (vring_mapping_error(vq, addr))
> + goto unmap_release;
> +
> + head_flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_INDIRECT |
> + _VRING_DESC_F_AVAIL(avail_wrap_counter) |
> + _VRING_DESC_F_USED(!avail_wrap_counter));
> + vq->vring_packed.desc[head].addr = cpu_to_virtio64(_vq->vdev,
> + addr);
> + vq->vring_packed.desc[head].len = cpu_to_virtio32(_vq->vdev,
> + total_sg * sizeof(struct vring_packed_desc));
> + vq->vring_packed.desc[head].id = cpu_to_virtio16(_vq->vdev, id);
> +
> + if (vring_use_dma_api(_vq->vdev)) {
> + vq->desc_state_packed[id].addr = addr;
> + vq->desc_state_packed[id].len = total_sg *
> + sizeof(struct vring_packed_desc);
> + vq->desc_state_packed[id].flags =
> + virtio16_to_cpu(_vq->vdev, head_flags);
> + }
> + }
> +
> + /* We're using some buffers from the free list. */
> + vq->vq.num_free -= descs_used;
> +
> + /* Update free pointer */
> + if (indirect) {
> + n = head + 1;
> + if (n >= vq->vring_packed.num) {
> + n = 0;
> + vq->avail_wrap_counter ^= 1;
> + }
> + vq->next_avail_idx = n;
> + vq->free_head = vq->desc_state_packed[id].next;
> + } else {
> + vq->next_avail_idx = i;
> + vq->free_head = curr;
> + }
> +
> + /* Store token and indirect buffer state. */
> + vq->desc_state_packed[id].num = descs_used;
> + vq->desc_state_packed[id].data = data;
> + if (indirect)
> + vq->desc_state_packed[id].indir_desc = desc;
> + else
> + vq->desc_state_packed[id].indir_desc = ctx;
> +
> + /* A driver MUST NOT make the first descriptor in the list
> + * available before all subsequent descriptors comprising
> + * the list are made available. */
> + virtio_wmb(vq->weak_barriers);
> + vq->vring_packed.desc[head].flags = head_flags;
> + vq->num_added += descs_used;
> +
> + pr_debug("Added buffer head %i to %p\n", head, vq);
> + END_USE(vq);
> +
> + return 0;
> +
> +unmap_release:
> + err_idx = i;
> + i = head;
> +
> + for (n = 0; n < total_sg; n++) {
> + if (i == err_idx)
> + break;
> + vring_unmap_desc_packed(vq, &desc[i]);
> + i++;
> + if (!indirect && i >= vq->vring_packed.num)
> + i = 0;
> + }
> +
> + vq->avail_wrap_counter = avail_wrap_counter;
> +
> + if (indirect)
> + kfree(desc);
> +
> + END_USE(vq);
> return -EIO;
> }
>
> static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> {
> - return false;
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + u16 flags;
> + bool needs_kick;
> + u32 snapshot;
> +
> + START_USE(vq);
> + /* We need to expose the new flags value before checking notification
> + * suppressions. */
> + virtio_mb(vq->weak_barriers);
> +
> + snapshot = READ_ONCE(*(u32 *)vq->vring_packed.device);
> + flags = virtio16_to_cpu(_vq->vdev, (__virtio16)(snapshot >> 16)) & 0x3;
What are all these hard-coded things? Also either we do READ_ONCE
everywhere or nowhere. Why is this place special? Any why read 32 bit
if you only want the 16?
And doesn't sparse complain about cast to __virtio16?
> +
> +#ifdef DEBUG
> + if (vq->last_add_time_valid) {
> + WARN_ON(ktime_to_ms(ktime_sub(ktime_get(),
> + vq->last_add_time)) > 100);
> + }
> + vq->last_add_time_valid = false;
> +#endif
> +
> + needs_kick = (flags != VRING_EVENT_F_DISABLE);
> + END_USE(vq);
> + return needs_kick;
> +}
> +
> +static void detach_buf_packed(struct vring_virtqueue *vq,
> + unsigned int id, void **ctx)
> +{
> + struct vring_desc_state_packed *state = NULL;
> + struct vring_packed_desc *desc;
> + unsigned int curr, i;
> +
> + /* Clear data ptr. */
> + vq->desc_state_packed[id].data = NULL;
> +
> + curr = id;
> + for (i = 0; i < vq->desc_state_packed[id].num; i++) {
> + state = &vq->desc_state_packed[curr];
> + vring_unmap_state_packed(vq, state);
> + curr = state->next;
> + }
> +
> + BUG_ON(state == NULL);
> + vq->vq.num_free += vq->desc_state_packed[id].num;
> + state->next = vq->free_head;
> + vq->free_head = id;
> +
> + if (vq->indirect) {
> + u32 len;
> +
> + /* Free the indirect table, if any, now that it's unmapped. */
> + desc = vq->desc_state_packed[id].indir_desc;
> + if (!desc)
> + return;
> +
> + if (vring_use_dma_api(vq->vq.vdev)) {
> + len = vq->desc_state_packed[id].len;
> + for (i = 0; i < len / sizeof(struct vring_packed_desc);
> + i++)
> + vring_unmap_desc_packed(vq, &desc[i]);
> + }
> + kfree(desc);
> + vq->desc_state_packed[id].indir_desc = NULL;
> + } else if (ctx) {
> + *ctx = vq->desc_state_packed[id].indir_desc;
> + }
> +}
> +
> +static inline bool is_used_desc_packed(const struct vring_virtqueue *vq,
> + u16 idx, bool used_wrap_counter)
> +{
> + u16 flags;
> + bool avail, used;
> +
> + flags = virtio16_to_cpu(vq->vq.vdev,
> + vq->vring_packed.desc[idx].flags);
> + avail = !!(flags & VRING_DESC_F_AVAIL);
> + used = !!(flags & VRING_DESC_F_USED);
> +
> + return avail == used && used == used_wrap_counter;
I think that you don't need to look at avail flag to detect a used
descriptor. The reason device writes it is to avoid confusing
*device* next time descriptor wraps.
> }
>
> static inline bool more_used_packed(const struct vring_virtqueue *vq)
> {
> - return false;
> + return is_used_desc_packed(vq, vq->last_used_idx,
> + vq->used_wrap_counter);
> }
>
> static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> unsigned int *len,
> void **ctx)
> {
> - return NULL;
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + u16 last_used, id;
> + void *ret;
> +
> + START_USE(vq);
> +
> + if (unlikely(vq->broken)) {
> + END_USE(vq);
> + return NULL;
> + }
> +
> + if (!more_used_packed(vq)) {
> + pr_debug("No more buffers in queue\n");
> + END_USE(vq);
> + return NULL;
> + }
> +
> + /* Only get used elements after they have been exposed by host. */
> + virtio_rmb(vq->weak_barriers);
> +
> + last_used = vq->last_used_idx;
> + id = virtio16_to_cpu(_vq->vdev, vq->vring_packed.desc[last_used].id);
> + *len = virtio32_to_cpu(_vq->vdev, vq->vring_packed.desc[last_used].len);
> +
> + if (unlikely(id >= vq->vring_packed.num)) {
> + BAD_RING(vq, "id %u out of range\n", id);
> + return NULL;
> + }
> + if (unlikely(!vq->desc_state_packed[id].data)) {
> + BAD_RING(vq, "id %u is not a head!\n", id);
> + return NULL;
> + }
> +
> + vq->last_used_idx += vq->desc_state_packed[id].num;
> + if (vq->last_used_idx >= vq->vring_packed.num) {
> + vq->last_used_idx -= vq->vring_packed.num;
> + vq->used_wrap_counter ^= 1;
> + }
> +
> + /* detach_buf_packed clears data, so grab it now. */
> + ret = vq->desc_state_packed[id].data;
> + detach_buf_packed(vq, id, ctx);
> +
> +#ifdef DEBUG
> + vq->last_add_time_valid = false;
> +#endif
> +
> + END_USE(vq);
> + return ret;
> }
>
> static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
> {
> + struct vring_virtqueue *vq = to_vvq(_vq);
> +
> + if (vq->event_flags_shadow != VRING_EVENT_F_DISABLE) {
> + vq->event_flags_shadow = VRING_EVENT_F_DISABLE;
> + vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> + vq->event_flags_shadow);
> + }
> }
>
> static unsigned virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
> {
> - return 0;
> + struct vring_virtqueue *vq = to_vvq(_vq);
> +
> + START_USE(vq);
> +
> + /* We optimistically turn back on interrupts, then check if there was
> + * more to do. */
> +
> + if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> + vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> + vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> + vq->event_flags_shadow);
> + }
> +
> + END_USE(vq);
> + return vq->last_used_idx | ((u16)vq->used_wrap_counter << 15);
> }
>
> -static bool virtqueue_poll_packed(struct virtqueue *_vq, unsigned last_used_idx)
> +static bool virtqueue_poll_packed(struct virtqueue *_vq, unsigned off_wrap)
> {
> - return false;
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + bool wrap_counter;
> + u16 used_idx;
> +
> + wrap_counter = off_wrap >> 15;
> + used_idx = off_wrap & ~(1 << 15);
> +
> + return is_used_desc_packed(vq, used_idx, wrap_counter);
These >> 15 << 15 all over the place duplicate info.
Pls put 15 in the header.
Also can you maintain the counters properly shifted?
Then just use them.
> }
>
> static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
> {
> - return false;
> + struct vring_virtqueue *vq = to_vvq(_vq);
> +
> + START_USE(vq);
> +
> + /* We optimistically turn back on interrupts, then check if there was
> + * more to do. */
> +
> + if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> + vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> + vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> + vq->event_flags_shadow);
> + /* We need to enable interrupts first before re-checking
> + * for more used buffers. */
> + virtio_mb(vq->weak_barriers);
> + }
> +
> + if (more_used_packed(vq)) {
> + END_USE(vq);
> + return false;
> + }
> +
> + END_USE(vq);
> + return true;
> }
>
> static void *virtqueue_detach_unused_buf_packed(struct virtqueue *_vq)
> {
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + unsigned int i;
> + void *buf;
> +
> + START_USE(vq);
> +
> + for (i = 0; i < vq->vring_packed.num; i++) {
> + if (!vq->desc_state_packed[i].data)
> + continue;
> + /* detach_buf clears data, so grab it now. */
> + buf = vq->desc_state_packed[i].data;
> + detach_buf_packed(vq, i, NULL);
> + END_USE(vq);
> + return buf;
> + }
> + /* That should have freed everything. */
> + BUG_ON(vq->vq.num_free != vq->vring_packed.num);
> +
> + END_USE(vq);
> return NULL;
> }
>
> @@ -1083,6 +1559,9 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> + /* We need to enable interrupts first before re-checking
> + * for more used buffers. */
> + virtio_mb(vq->weak_barriers);
> return vq->packed ? virtqueue_poll_packed(_vq, last_used_idx) :
> virtqueue_poll_split(_vq, last_used_idx);
> }
Possible optimization for when dma API is in use:
--->
dma: detecting nop unmap
drivers need to maintain the dma address for unmap purposes,
but these cycles are wasted when unmap callback is not
defined. Add an API for drivers to check that and avoid
unmap completely. Debug builds still have unmap.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
----
diff --git a/include/linux/dma-debug.h b/include/linux/dma-debug.h
index a785f2507159..38b2c27c8449 100644
--- a/include/linux/dma-debug.h
+++ b/include/linux/dma-debug.h
@@ -42,6 +42,11 @@ extern void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
extern void debug_dma_unmap_page(struct device *dev, dma_addr_t addr,
size_t size, int direction, bool map_single);
+static inline bool has_debug_dma_unmap(struct device *dev)
+{
+ return true;
+}
+
extern void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
int nents, int mapped_ents, int direction);
@@ -121,6 +126,11 @@ static inline void debug_dma_unmap_page(struct device *dev, dma_addr_t addr,
{
}
+static inline bool has_debug_dma_unmap(struct device *dev)
+{
+ return false;
+}
+
static inline void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
int nents, int mapped_ents, int direction)
{
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 1db6a6b46d0d..656f3e518166 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -241,6 +241,14 @@ static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
return addr;
}
+static inline bool has_dma_unmap(struct device *dev)
+{
+ const struct dma_map_ops *ops = get_dma_ops(dev);
+
+ return ops->unmap_page || ops->unmap_sg || ops->unmap_resource ||
+ has_dma_unmap(dev);
+}
+
static inline void dma_unmap_single_attrs(struct device *dev, dma_addr_t addr,
size_t size,
enum dma_data_direction dir,
^ permalink raw reply related
* Re: [PATCH net-next] net: sched: cls_flower: dump offload count value
From: Jakub Kicinski @ 2018-09-07 9:11 UTC (permalink / raw)
To: Vlad Buslov; +Cc: netdev, jhs, xiyou.wangcong, jiri, davem
In-Reply-To: <1536248243-24092-1-git-send-email-vladbu@mellanox.com>
On Thu, 6 Sep 2018 18:37:23 +0300, Vlad Buslov wrote:
> Change flower in_hw_count type to fixed-size u32 and dump it as
> TCA_FLOWER_IN_HW_COUNT. This change is necessary to properly test shared
> blocks and re-offload functionality.
>
> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> ---
> include/net/sch_generic.h | 2 +-
> include/uapi/linux/pkt_cls.h | 2 ++
> net/sched/cls_flower.c | 5 ++++-
> 3 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index a6d00093f35e..d68ac55539a5 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -362,7 +362,7 @@ static inline void tcf_block_offload_dec(struct tcf_block *block, u32 *flags)
> }
>
> static inline void
> -tc_cls_offload_cnt_update(struct tcf_block *block, unsigned int *cnt,
> +tc_cls_offload_cnt_update(struct tcf_block *block, u32 *cnt,
> u32 *flags, bool add)
> {
> if (add) {
> diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
> index be382fb0592d..2824fb7ed1c9 100644
> --- a/include/uapi/linux/pkt_cls.h
> +++ b/include/uapi/linux/pkt_cls.h
> @@ -483,6 +483,8 @@ enum {
> TCA_FLOWER_KEY_ENC_OPTS,
> TCA_FLOWER_KEY_ENC_OPTS_MASK,
>
> + TCA_FLOWER_IN_HW_COUNT, /* be32 */
Why be32?
I wish there was a good way to share this attribute between
classifiers :(
> __TCA_FLOWER_MAX,
> };
>
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 6fd9bdd93796..4b8dd37dd4f8 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -98,7 +98,7 @@ struct cls_fl_filter {
> struct list_head list;
> u32 handle;
> u32 flags;
> - unsigned int in_hw_count;
> + u32 in_hw_count;
> struct rcu_work rwork;
> struct net_device *hw_dev;
> };
> @@ -1880,6 +1880,9 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
> if (f->flags && nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags))
> goto nla_put_failure;
>
> + if (nla_put_u32(skb, TCA_FLOWER_IN_HW_COUNT, f->in_hw_count))
> + goto nla_put_failure;
> +
> if (tcf_exts_dump(skb, &f->exts))
> goto nla_put_failure;
>
^ permalink raw reply
* Re: [PATCH net-next v2 1/5] virtio: add packed ring definitions
From: Michael S. Tsirkin @ 2018-09-07 13:51 UTC (permalink / raw)
To: Tiwei Bie
Cc: jasowang, virtualization, linux-kernel, netdev, virtio-dev, wexu,
jfreimann
In-Reply-To: <20180711022711.7090-2-tiwei.bie@intel.com>
On Wed, Jul 11, 2018 at 10:27:07AM +0800, Tiwei Bie wrote:
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
> include/uapi/linux/virtio_config.h | 3 +++
> include/uapi/linux/virtio_ring.h | 43 ++++++++++++++++++++++++++++++
> 2 files changed, 46 insertions(+)
>
> diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
> index 449132c76b1c..1196e1c1d4f6 100644
> --- a/include/uapi/linux/virtio_config.h
> +++ b/include/uapi/linux/virtio_config.h
> @@ -75,6 +75,9 @@
> */
> #define VIRTIO_F_IOMMU_PLATFORM 33
>
> +/* This feature indicates support for the packed virtqueue layout. */
> +#define VIRTIO_F_RING_PACKED 34
> +
> /*
> * Does the device support Single Root I/O Virtualization?
> */
> diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
> index 6d5d5faa989b..0254a2ba29cf 100644
> --- a/include/uapi/linux/virtio_ring.h
> +++ b/include/uapi/linux/virtio_ring.h
> @@ -44,6 +44,10 @@
> /* This means the buffer contains a list of buffer descriptors. */
> #define VRING_DESC_F_INDIRECT 4
>
> +/* Mark a descriptor as available or used. */
> +#define VRING_DESC_F_AVAIL (1ul << 7)
> +#define VRING_DESC_F_USED (1ul << 15)
> +
> /* The Host uses this in used->flags to advise the Guest: don't kick me when
> * you add a buffer. It's unreliable, so it's simply an optimization. Guest
> * will still kick if it's out of buffers. */
> @@ -53,6 +57,17 @@
> * optimization. */
> #define VRING_AVAIL_F_NO_INTERRUPT 1
>
> +/* Enable events. */
> +#define VRING_EVENT_F_ENABLE 0x0
> +/* Disable events. */
> +#define VRING_EVENT_F_DISABLE 0x1
> +/*
> + * Enable events for a specific descriptor
> + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter).
> + * Only valid if VIRTIO_RING_F_EVENT_IDX has been negotiated.
> + */
> +#define VRING_EVENT_F_DESC 0x2
> +
> /* We support indirect buffer descriptors */
> #define VIRTIO_RING_F_INDIRECT_DESC 28
>
These are for the packed ring, right? Pls prefix accordingly.
Also, you likely need macros for the wrap counters.
> @@ -171,4 +186,32 @@ static inline int vring_need_event(__u16 event_idx, __u16 new_idx, __u16 old)
> return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);
> }
>
> +struct vring_packed_desc_event {
> + /* Descriptor Ring Change Event Offset/Wrap Counter. */
> + __virtio16 off_wrap;
> + /* Descriptor Ring Change Event Flags. */
> + __virtio16 flags;
> +};
> +
> +struct vring_packed_desc {
> + /* Buffer Address. */
> + __virtio64 addr;
> + /* Buffer Length. */
> + __virtio32 len;
> + /* Buffer ID. */
> + __virtio16 id;
> + /* The flags depending on descriptor type. */
> + __virtio16 flags;
> +};
Don't use __virtioXX types, just __leXX ones.
> +
> +struct vring_packed {
> + unsigned int num;
> +
> + struct vring_packed_desc *desc;
> +
> + struct vring_packed_desc_event *driver;
> +
> + struct vring_packed_desc_event *device;
> +};
> +
> #endif /* _UAPI_LINUX_VIRTIO_RING_H */
> --
> 2.18.0
^ permalink raw reply
* Re: [PATCH] ethernet: renesas: convert to SPDX identifiers
From: Geert Uytterhoeven @ 2018-09-07 9:11 UTC (permalink / raw)
To: Sergei Shtylyov
Cc: Kuninori Morimoto, David S. Miller, Mark Brown, Robin Murphy,
netdev, Linux-Renesas
In-Reply-To: <d55d4d25-9966-cf34-7258-97009a9fa5ce@cogentembedded.com>
Hi Sergei,
On Fri, Sep 7, 2018 at 11:05 AM Sergei Shtylyov
<sergei.shtylyov@cogentembedded.com> wrote:
> On 9/7/2018 5:02 AM, Kuninori Morimoto wrote:
> > From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> >
> > This patch updates license to use SPDX-License-Identifier
> > instead of verbose license text.
> >
> > Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> > ---
> > drivers/net/ethernet/renesas/Kconfig | 1 +
> > drivers/net/ethernet/renesas/Makefile | 1 +
> > drivers/net/ethernet/renesas/ravb_ptp.c | 6 +-----
> > 3 files changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/renesas/Kconfig b/drivers/net/ethernet/renesas/Kconfig
> > index f3f7477..bb0ebdf 100644
> > --- a/drivers/net/ethernet/renesas/Kconfig
> > +++ b/drivers/net/ethernet/renesas/Kconfig
> > @@ -1,3 +1,4 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > #
> > # Renesas device configuration
> > #
>
> There was no license at all on this file...
So it's licensed under GPL-2.0.
Cfr. e.g. commit 7328c8f48d1895b3 ("PCI: Add SPDX GPL-2.0 when no license
was specified")
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply
* Re: [PATCH] ethernet: renesas: convert to SPDX identifiers
From: Sergei Shtylyov @ 2018-09-07 9:16 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Kuninori Morimoto, David S. Miller, Mark Brown, Robin Murphy,
netdev, Linux-Renesas
In-Reply-To: <CAMuHMdVJ+3cX5hMntJ=YJwA48gju+PEcsh-NQmpDRMCnAVG=zw@mail.gmail.com>
On 9/7/2018 12:11 PM, Geert Uytterhoeven wrote:
>> On 9/7/2018 5:02 AM, Kuninori Morimoto wrote:
>>> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>>>
>>> This patch updates license to use SPDX-License-Identifier
>>> instead of verbose license text.
>>>
>>> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
>>> ---
>>> drivers/net/ethernet/renesas/Kconfig | 1 +
>>> drivers/net/ethernet/renesas/Makefile | 1 +
>>> drivers/net/ethernet/renesas/ravb_ptp.c | 6 +-----
>>> 3 files changed, 3 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/renesas/Kconfig b/drivers/net/ethernet/renesas/Kconfig
>>> index f3f7477..bb0ebdf 100644
>>> --- a/drivers/net/ethernet/renesas/Kconfig
>>> +++ b/drivers/net/ethernet/renesas/Kconfig
>>> @@ -1,3 +1,4 @@
>>> +# SPDX-License-Identifier: GPL-2.0
>>> #
>>> # Renesas device configuration
>>> #
>>
>> There was no license at all on this file...
>
> So it's licensed under GPL-2.0.
> Cfr. e.g. commit 7328c8f48d1895b3 ("PCI: Add SPDX GPL-2.0 when no license
> was specified")
OK, but then does the patch description match what is being done?
> Gr{oetje,eeting}s,
>
> Geert
MBR, Sergei
^ permalink raw reply
* GREETINGS FROM TRACY WILLIAM
From: Miss Tracy William @ 2018-09-07 9:20 UTC (permalink / raw)
In-Reply-To: <1094400474.1597284.1536312012984.ref@mail.yahoo.com>
Hello Dear,
how are you today,I hope you are doing great.
It is my great pleasure to
contact you and i hope you don't mind,I was just surfing
through the Internet search when i found your email address,I want to make a new
and special friend,I hope you don't mind.
My name is Tracy William,I am from the United States of America but
presently I live
and work in England,
I will give you pictures and details of me as soon as i hear from you
bye
Tracy
^ permalink raw reply
* Re: [PATCH net-next v2 2/5] virtio_ring: support creating packed ring
From: Michael S. Tsirkin @ 2018-09-07 14:03 UTC (permalink / raw)
To: Tiwei Bie
Cc: jasowang, virtualization, linux-kernel, netdev, virtio-dev, wexu,
jfreimann
In-Reply-To: <20180711022711.7090-3-tiwei.bie@intel.com>
On Wed, Jul 11, 2018 at 10:27:08AM +0800, Tiwei Bie wrote:
> This commit introduces the support for creating packed ring.
> All split ring specific functions are added _split suffix.
> Some necessary stubs for packed ring are also added.
>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
I'd rather have a patch just renaming split functions, then
add all packed stuff in as a separate patch on top.
> ---
> drivers/virtio/virtio_ring.c | 801 +++++++++++++++++++++++------------
> include/linux/virtio_ring.h | 8 +-
> 2 files changed, 546 insertions(+), 263 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 814b395007b2..c4f8abc7445a 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -60,11 +60,15 @@ struct vring_desc_state {
> struct vring_desc *indir_desc; /* Indirect descriptor, if any. */
> };
>
> +struct vring_desc_state_packed {
> + int next; /* The next desc state. */
So this can go away with IN_ORDER?
> +};
> +
> struct vring_virtqueue {
> struct virtqueue vq;
>
> - /* Actual memory layout for this queue */
> - struct vring vring;
> + /* Is this a packed ring? */
> + bool packed;
>
> /* Can we use weak barriers? */
> bool weak_barriers;
> @@ -86,11 +90,39 @@ struct vring_virtqueue {
> /* Last used index we've seen. */
> u16 last_used_idx;
>
> - /* Last written value to avail->flags */
> - u16 avail_flags_shadow;
> + union {
> + /* Available for split ring */
> + struct {
> + /* Actual memory layout for this queue. */
> + struct vring vring;
>
> - /* Last written value to avail->idx in guest byte order */
> - u16 avail_idx_shadow;
> + /* Last written value to avail->flags */
> + u16 avail_flags_shadow;
> +
> + /* Last written value to avail->idx in
> + * guest byte order. */
> + u16 avail_idx_shadow;
> + };
Name this field split so it's easier to detect misuse of e.g.
packed fields in split code?
> +
> + /* Available for packed ring */
> + struct {
> + /* Actual memory layout for this queue. */
> + struct vring_packed vring_packed;
> +
> + /* Driver ring wrap counter. */
> + bool avail_wrap_counter;
> +
> + /* Device ring wrap counter. */
> + bool used_wrap_counter;
> +
> + /* Index of the next avail descriptor. */
> + u16 next_avail_idx;
> +
> + /* Last written value to driver->flags in
> + * guest byte order. */
> + u16 event_flags_shadow;
> + };
> + };
>
> /* How to notify other side. FIXME: commonalize hcalls! */
> bool (*notify)(struct virtqueue *vq);
> @@ -110,11 +142,24 @@ struct vring_virtqueue {
> #endif
>
> /* Per-descriptor state. */
> - struct vring_desc_state desc_state[];
> + union {
> + struct vring_desc_state desc_state[1];
> + struct vring_desc_state_packed desc_state_packed[1];
> + };
> };
>
> #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
>
> +static inline bool virtqueue_use_indirect(struct virtqueue *_vq,
> + unsigned int total_sg)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> +
> + /* If the host supports indirect descriptor tables, and we have multiple
> + * buffers, then go indirect. FIXME: tune this threshold */
> + return (vq->indirect && total_sg > 1 && vq->vq.num_free);
> +}
> +
> /*
> * Modern virtio devices have feature bits to specify whether they need a
> * quirk and bypass the IOMMU. If not there, just use the DMA API.
> @@ -200,8 +245,17 @@ static dma_addr_t vring_map_single(const struct vring_virtqueue *vq,
> cpu_addr, size, direction);
> }
>
> -static void vring_unmap_one(const struct vring_virtqueue *vq,
> - struct vring_desc *desc)
> +static int vring_mapping_error(const struct vring_virtqueue *vq,
> + dma_addr_t addr)
> +{
> + if (!vring_use_dma_api(vq->vq.vdev))
> + return 0;
> +
> + return dma_mapping_error(vring_dma_dev(vq), addr);
> +}
> +
> +static void vring_unmap_one_split(const struct vring_virtqueue *vq,
> + struct vring_desc *desc)
> {
> u16 flags;
>
> @@ -225,17 +279,9 @@ static void vring_unmap_one(const struct vring_virtqueue *vq,
> }
> }
>
> -static int vring_mapping_error(const struct vring_virtqueue *vq,
> - dma_addr_t addr)
> -{
> - if (!vring_use_dma_api(vq->vq.vdev))
> - return 0;
> -
> - return dma_mapping_error(vring_dma_dev(vq), addr);
> -}
> -
> -static struct vring_desc *alloc_indirect(struct virtqueue *_vq,
> - unsigned int total_sg, gfp_t gfp)
> +static struct vring_desc *alloc_indirect_split(struct virtqueue *_vq,
> + unsigned int total_sg,
> + gfp_t gfp)
> {
> struct vring_desc *desc;
> unsigned int i;
> @@ -256,14 +302,14 @@ static struct vring_desc *alloc_indirect(struct virtqueue *_vq,
> return desc;
> }
>
> -static inline int virtqueue_add(struct virtqueue *_vq,
> - struct scatterlist *sgs[],
> - unsigned int total_sg,
> - unsigned int out_sgs,
> - unsigned int in_sgs,
> - void *data,
> - void *ctx,
> - gfp_t gfp)
> +static inline int virtqueue_add_split(struct virtqueue *_vq,
> + struct scatterlist *sgs[],
> + unsigned int total_sg,
> + unsigned int out_sgs,
> + unsigned int in_sgs,
> + void *data,
> + void *ctx,
> + gfp_t gfp)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> struct scatterlist *sg;
> @@ -299,10 +345,8 @@ static inline int virtqueue_add(struct virtqueue *_vq,
>
> head = vq->free_head;
>
> - /* If the host supports indirect descriptor tables, and we have multiple
> - * buffers, then go indirect. FIXME: tune this threshold */
> - if (vq->indirect && total_sg > 1 && vq->vq.num_free)
> - desc = alloc_indirect(_vq, total_sg, gfp);
> + if (virtqueue_use_indirect(_vq, total_sg))
> + desc = alloc_indirect_split(_vq, total_sg, gfp);
> else {
> desc = NULL;
> WARN_ON_ONCE(total_sg > vq->vring.num && !vq->indirect);
> @@ -423,7 +467,7 @@ static inline int virtqueue_add(struct virtqueue *_vq,
> for (n = 0; n < total_sg; n++) {
> if (i == err_idx)
> break;
> - vring_unmap_one(vq, &desc[i]);
> + vring_unmap_one_split(vq, &desc[i]);
> i = virtio16_to_cpu(_vq->vdev, vq->vring.desc[i].next);
> }
>
> @@ -434,6 +478,355 @@ static inline int virtqueue_add(struct virtqueue *_vq,
> return -EIO;
> }
>
> +static bool virtqueue_kick_prepare_split(struct virtqueue *_vq)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + u16 new, old;
> + bool needs_kick;
> +
> + START_USE(vq);
> + /* We need to expose available array entries before checking avail
> + * event. */
> + virtio_mb(vq->weak_barriers);
> +
> + old = vq->avail_idx_shadow - vq->num_added;
> + new = vq->avail_idx_shadow;
> + vq->num_added = 0;
> +
> +#ifdef DEBUG
> + if (vq->last_add_time_valid) {
> + WARN_ON(ktime_to_ms(ktime_sub(ktime_get(),
> + vq->last_add_time)) > 100);
> + }
> + vq->last_add_time_valid = false;
> +#endif
> +
> + if (vq->event) {
> + needs_kick = vring_need_event(virtio16_to_cpu(_vq->vdev, vring_avail_event(&vq->vring)),
> + new, old);
> + } else {
> + needs_kick = !(vq->vring.used->flags & cpu_to_virtio16(_vq->vdev, VRING_USED_F_NO_NOTIFY));
> + }
> + END_USE(vq);
> + return needs_kick;
> +}
> +
> +static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
> + void **ctx)
> +{
> + unsigned int i, j;
> + __virtio16 nextflag = cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_NEXT);
> +
> + /* Clear data ptr. */
> + vq->desc_state[head].data = NULL;
> +
> + /* Put back on free list: unmap first-level descriptors and find end */
> + i = head;
> +
> + while (vq->vring.desc[i].flags & nextflag) {
> + vring_unmap_one_split(vq, &vq->vring.desc[i]);
> + i = virtio16_to_cpu(vq->vq.vdev, vq->vring.desc[i].next);
> + vq->vq.num_free++;
> + }
> +
> + vring_unmap_one_split(vq, &vq->vring.desc[i]);
> + vq->vring.desc[i].next = cpu_to_virtio16(vq->vq.vdev, vq->free_head);
> + vq->free_head = head;
> +
> + /* Plus final descriptor */
> + vq->vq.num_free++;
> +
> + if (vq->indirect) {
> + struct vring_desc *indir_desc = vq->desc_state[head].indir_desc;
> + u32 len;
> +
> + /* Free the indirect table, if any, now that it's unmapped. */
> + if (!indir_desc)
> + return;
> +
> + len = virtio32_to_cpu(vq->vq.vdev, vq->vring.desc[head].len);
> +
> + BUG_ON(!(vq->vring.desc[head].flags &
> + cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_INDIRECT)));
> + BUG_ON(len == 0 || len % sizeof(struct vring_desc));
> +
> + for (j = 0; j < len / sizeof(struct vring_desc); j++)
> + vring_unmap_one_split(vq, &indir_desc[j]);
> +
> + kfree(indir_desc);
> + vq->desc_state[head].indir_desc = NULL;
> + } else if (ctx) {
> + *ctx = vq->desc_state[head].indir_desc;
> + }
> +}
> +
> +static inline bool more_used_split(const struct vring_virtqueue *vq)
> +{
> + return vq->last_used_idx != virtio16_to_cpu(vq->vq.vdev, vq->vring.used->idx);
> +}
> +
> +static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> + unsigned int *len,
> + void **ctx)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + void *ret;
> + unsigned int i;
> + u16 last_used;
> +
> + START_USE(vq);
> +
> + if (unlikely(vq->broken)) {
> + END_USE(vq);
> + return NULL;
> + }
> +
> + if (!more_used_split(vq)) {
> + pr_debug("No more buffers in queue\n");
> + END_USE(vq);
> + return NULL;
> + }
> +
> + /* Only get used array entries after they have been exposed by host. */
> + virtio_rmb(vq->weak_barriers);
> +
> + last_used = (vq->last_used_idx & (vq->vring.num - 1));
> + i = virtio32_to_cpu(_vq->vdev, vq->vring.used->ring[last_used].id);
> + *len = virtio32_to_cpu(_vq->vdev, vq->vring.used->ring[last_used].len);
> +
> + if (unlikely(i >= vq->vring.num)) {
> + BAD_RING(vq, "id %u out of range\n", i);
> + return NULL;
> + }
> + if (unlikely(!vq->desc_state[i].data)) {
> + BAD_RING(vq, "id %u is not a head!\n", i);
> + return NULL;
> + }
> +
> + /* detach_buf_split clears data, so grab it now. */
> + ret = vq->desc_state[i].data;
> + detach_buf_split(vq, i, ctx);
> + vq->last_used_idx++;
> + /* If we expect an interrupt for the next entry, tell host
> + * by writing event index and flush out the write before
> + * the read in the next get_buf call. */
> + if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> + virtio_store_mb(vq->weak_barriers,
> + &vring_used_event(&vq->vring),
> + cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> +
> +#ifdef DEBUG
> + vq->last_add_time_valid = false;
> +#endif
> +
> + END_USE(vq);
> + return ret;
> +}
> +
> +static void virtqueue_disable_cb_split(struct virtqueue *_vq)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> +
> + if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> + vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> + if (!vq->event)
> + vq->vring.avail->flags = cpu_to_virtio16(_vq->vdev, vq->avail_flags_shadow);
> + }
> +}
> +
> +static unsigned virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + u16 last_used_idx;
> +
> + START_USE(vq);
> +
> + /* We optimistically turn back on interrupts, then check if there was
> + * more to do. */
> + /* Depending on the VIRTIO_RING_F_EVENT_IDX feature, we need to
> + * either clear the flags bit or point the event index at the next
> + * entry. Always do both to keep code simple. */
> + if (vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) {
> + vq->avail_flags_shadow &= ~VRING_AVAIL_F_NO_INTERRUPT;
> + if (!vq->event)
> + vq->vring.avail->flags = cpu_to_virtio16(_vq->vdev, vq->avail_flags_shadow);
> + }
> + vring_used_event(&vq->vring) = cpu_to_virtio16(_vq->vdev, last_used_idx = vq->last_used_idx);
> + END_USE(vq);
> + return last_used_idx;
> +}
> +
> +static bool virtqueue_poll_split(struct virtqueue *_vq, unsigned last_used_idx)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> +
> + virtio_mb(vq->weak_barriers);
> + return (u16)last_used_idx != virtio16_to_cpu(_vq->vdev, vq->vring.used->idx);
> +}
> +
> +static bool virtqueue_enable_cb_delayed_split(struct virtqueue *_vq)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + u16 bufs;
> +
> + START_USE(vq);
> +
> + /* We optimistically turn back on interrupts, then check if there was
> + * more to do. */
> + /* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> + * either clear the flags bit or point the event index at the next
> + * entry. Always update the event index to keep code simple. */
> + if (vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) {
> + vq->avail_flags_shadow &= ~VRING_AVAIL_F_NO_INTERRUPT;
> + if (!vq->event)
> + vq->vring.avail->flags = cpu_to_virtio16(_vq->vdev, vq->avail_flags_shadow);
> + }
> + /* TODO: tune this threshold */
> + bufs = (u16)(vq->avail_idx_shadow - vq->last_used_idx) * 3 / 4;
> +
> + virtio_store_mb(vq->weak_barriers,
> + &vring_used_event(&vq->vring),
> + cpu_to_virtio16(_vq->vdev, vq->last_used_idx + bufs));
> +
> + if (unlikely((u16)(virtio16_to_cpu(_vq->vdev, vq->vring.used->idx) - vq->last_used_idx) > bufs)) {
> + END_USE(vq);
> + return false;
> + }
> +
> + END_USE(vq);
> + return true;
> +}
> +
> +static void *virtqueue_detach_unused_buf_split(struct virtqueue *_vq)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + unsigned int i;
> + void *buf;
> +
> + START_USE(vq);
> +
> + for (i = 0; i < vq->vring.num; i++) {
> + if (!vq->desc_state[i].data)
> + continue;
> + /* detach_buf clears data, so grab it now. */
> + buf = vq->desc_state[i].data;
> + detach_buf_split(vq, i, NULL);
> + vq->avail_idx_shadow--;
> + vq->vring.avail->idx = cpu_to_virtio16(_vq->vdev, vq->avail_idx_shadow);
> + END_USE(vq);
> + return buf;
> + }
> + /* That should have freed everything. */
> + BUG_ON(vq->vq.num_free != vq->vring.num);
> +
> + END_USE(vq);
> + return NULL;
> +}
> +
> +/*
> + * The layout for the packed ring is a continuous chunk of memory
> + * which looks like this.
> + *
> + * struct vring_packed {
> + * // The actual descriptors (16 bytes each)
> + * struct vring_packed_desc desc[num];
> + *
> + * // Padding to the next align boundary.
> + * char pad[];
> + *
> + * // Driver Event Suppression
> + * struct vring_packed_desc_event driver;
> + *
> + * // Device Event Suppression
> + * struct vring_packed_desc_event device;
> + * };
> + */
Why not just allocate event structures separately?
Is it a win to have them share a cache line for some reason?
> +static inline void vring_init_packed(struct vring_packed *vr, unsigned int num,
> + void *p, unsigned long align)
> +{
> + vr->num = num;
> + vr->desc = p;
> + vr->driver = (void *)ALIGN(((uintptr_t)p +
> + sizeof(struct vring_packed_desc) * num), align);
> + vr->device = vr->driver + 1;
> +}
What's all this about alignment? Where does it come from?
> +
> +static inline unsigned vring_size_packed(unsigned int num, unsigned long align)
> +{
> + return ((sizeof(struct vring_packed_desc) * num + align - 1)
> + & ~(align - 1)) + sizeof(struct vring_packed_desc_event) * 2;
> +}
> +
> +static inline int virtqueue_add_packed(struct virtqueue *_vq,
> + struct scatterlist *sgs[],
> + unsigned int total_sg,
> + unsigned int out_sgs,
> + unsigned int in_sgs,
> + void *data,
> + void *ctx,
> + gfp_t gfp)
> +{
> + return -EIO;
> +}
> +
> +static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> +{
> + return false;
> +}
> +
> +static inline bool more_used_packed(const struct vring_virtqueue *vq)
> +{
> + return false;
> +}
> +
> +static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> + unsigned int *len,
> + void **ctx)
> +{
> + return NULL;
> +}
> +
> +static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
> +{
> +}
> +
> +static unsigned virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
> +{
> + return 0;
> +}
> +
> +static bool virtqueue_poll_packed(struct virtqueue *_vq, unsigned last_used_idx)
> +{
> + return false;
> +}
> +
> +static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
> +{
> + return false;
> +}
> +
> +static void *virtqueue_detach_unused_buf_packed(struct virtqueue *_vq)
> +{
> + return NULL;
> +}
> +
> +static inline int virtqueue_add(struct virtqueue *_vq,
> + struct scatterlist *sgs[],
> + unsigned int total_sg,
> + unsigned int out_sgs,
> + unsigned int in_sgs,
> + void *data,
> + void *ctx,
> + gfp_t gfp)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> +
> + return vq->packed ? virtqueue_add_packed(_vq, sgs, total_sg, out_sgs,
> + in_sgs, data, ctx, gfp) :
> + virtqueue_add_split(_vq, sgs, total_sg, out_sgs,
> + in_sgs, data, ctx, gfp);
> +}
> +
> /**
> * virtqueue_add_sgs - expose buffers to other end
> * @vq: the struct virtqueue we're talking about.
> @@ -550,34 +943,9 @@ EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_ctx);
> bool virtqueue_kick_prepare(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> - u16 new, old;
> - bool needs_kick;
>
> - START_USE(vq);
> - /* We need to expose available array entries before checking avail
> - * event. */
> - virtio_mb(vq->weak_barriers);
> -
> - old = vq->avail_idx_shadow - vq->num_added;
> - new = vq->avail_idx_shadow;
> - vq->num_added = 0;
> -
> -#ifdef DEBUG
> - if (vq->last_add_time_valid) {
> - WARN_ON(ktime_to_ms(ktime_sub(ktime_get(),
> - vq->last_add_time)) > 100);
> - }
> - vq->last_add_time_valid = false;
> -#endif
> -
> - if (vq->event) {
> - needs_kick = vring_need_event(virtio16_to_cpu(_vq->vdev, vring_avail_event(&vq->vring)),
> - new, old);
> - } else {
> - needs_kick = !(vq->vring.used->flags & cpu_to_virtio16(_vq->vdev, VRING_USED_F_NO_NOTIFY));
> - }
> - END_USE(vq);
> - return needs_kick;
> + return vq->packed ? virtqueue_kick_prepare_packed(_vq) :
> + virtqueue_kick_prepare_split(_vq);
> }
> EXPORT_SYMBOL_GPL(virtqueue_kick_prepare);
>
> @@ -625,58 +993,9 @@ bool virtqueue_kick(struct virtqueue *vq)
> }
> EXPORT_SYMBOL_GPL(virtqueue_kick);
>
> -static void detach_buf(struct vring_virtqueue *vq, unsigned int head,
> - void **ctx)
> -{
> - unsigned int i, j;
> - __virtio16 nextflag = cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_NEXT);
> -
> - /* Clear data ptr. */
> - vq->desc_state[head].data = NULL;
> -
> - /* Put back on free list: unmap first-level descriptors and find end */
> - i = head;
> -
> - while (vq->vring.desc[i].flags & nextflag) {
> - vring_unmap_one(vq, &vq->vring.desc[i]);
> - i = virtio16_to_cpu(vq->vq.vdev, vq->vring.desc[i].next);
> - vq->vq.num_free++;
> - }
> -
> - vring_unmap_one(vq, &vq->vring.desc[i]);
> - vq->vring.desc[i].next = cpu_to_virtio16(vq->vq.vdev, vq->free_head);
> - vq->free_head = head;
> -
> - /* Plus final descriptor */
> - vq->vq.num_free++;
> -
> - if (vq->indirect) {
> - struct vring_desc *indir_desc = vq->desc_state[head].indir_desc;
> - u32 len;
> -
> - /* Free the indirect table, if any, now that it's unmapped. */
> - if (!indir_desc)
> - return;
> -
> - len = virtio32_to_cpu(vq->vq.vdev, vq->vring.desc[head].len);
> -
> - BUG_ON(!(vq->vring.desc[head].flags &
> - cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_INDIRECT)));
> - BUG_ON(len == 0 || len % sizeof(struct vring_desc));
> -
> - for (j = 0; j < len / sizeof(struct vring_desc); j++)
> - vring_unmap_one(vq, &indir_desc[j]);
> -
> - kfree(indir_desc);
> - vq->desc_state[head].indir_desc = NULL;
> - } else if (ctx) {
> - *ctx = vq->desc_state[head].indir_desc;
> - }
> -}
> -
> static inline bool more_used(const struct vring_virtqueue *vq)
> {
> - return vq->last_used_idx != virtio16_to_cpu(vq->vq.vdev, vq->vring.used->idx);
> + return vq->packed ? more_used_packed(vq) : more_used_split(vq);
> }
>
> /**
> @@ -699,57 +1018,9 @@ void *virtqueue_get_buf_ctx(struct virtqueue *_vq, unsigned int *len,
> void **ctx)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> - void *ret;
> - unsigned int i;
> - u16 last_used;
>
> - START_USE(vq);
> -
> - if (unlikely(vq->broken)) {
> - END_USE(vq);
> - return NULL;
> - }
> -
> - if (!more_used(vq)) {
> - pr_debug("No more buffers in queue\n");
> - END_USE(vq);
> - return NULL;
> - }
> -
> - /* Only get used array entries after they have been exposed by host. */
> - virtio_rmb(vq->weak_barriers);
> -
> - last_used = (vq->last_used_idx & (vq->vring.num - 1));
> - i = virtio32_to_cpu(_vq->vdev, vq->vring.used->ring[last_used].id);
> - *len = virtio32_to_cpu(_vq->vdev, vq->vring.used->ring[last_used].len);
> -
> - if (unlikely(i >= vq->vring.num)) {
> - BAD_RING(vq, "id %u out of range\n", i);
> - return NULL;
> - }
> - if (unlikely(!vq->desc_state[i].data)) {
> - BAD_RING(vq, "id %u is not a head!\n", i);
> - return NULL;
> - }
> -
> - /* detach_buf clears data, so grab it now. */
> - ret = vq->desc_state[i].data;
> - detach_buf(vq, i, ctx);
> - vq->last_used_idx++;
> - /* If we expect an interrupt for the next entry, tell host
> - * by writing event index and flush out the write before
> - * the read in the next get_buf call. */
> - if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> - virtio_store_mb(vq->weak_barriers,
> - &vring_used_event(&vq->vring),
> - cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> -
> -#ifdef DEBUG
> - vq->last_add_time_valid = false;
> -#endif
> -
> - END_USE(vq);
> - return ret;
> + return vq->packed ? virtqueue_get_buf_ctx_packed(_vq, len, ctx) :
> + virtqueue_get_buf_ctx_split(_vq, len, ctx);
> }
> EXPORT_SYMBOL_GPL(virtqueue_get_buf_ctx);
>
> @@ -771,12 +1042,10 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> - vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> - if (!vq->event)
> - vq->vring.avail->flags = cpu_to_virtio16(_vq->vdev, vq->avail_flags_shadow);
> - }
> -
> + if (vq->packed)
> + virtqueue_disable_cb_packed(_vq);
> + else
> + virtqueue_disable_cb_split(_vq);
> }
> EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
>
> @@ -795,23 +1064,9 @@ EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
> unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> - u16 last_used_idx;
>
> - START_USE(vq);
> -
> - /* We optimistically turn back on interrupts, then check if there was
> - * more to do. */
> - /* Depending on the VIRTIO_RING_F_EVENT_IDX feature, we need to
> - * either clear the flags bit or point the event index at the next
> - * entry. Always do both to keep code simple. */
> - if (vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) {
> - vq->avail_flags_shadow &= ~VRING_AVAIL_F_NO_INTERRUPT;
> - if (!vq->event)
> - vq->vring.avail->flags = cpu_to_virtio16(_vq->vdev, vq->avail_flags_shadow);
> - }
> - vring_used_event(&vq->vring) = cpu_to_virtio16(_vq->vdev, last_used_idx = vq->last_used_idx);
> - END_USE(vq);
> - return last_used_idx;
> + return vq->packed ? virtqueue_enable_cb_prepare_packed(_vq) :
> + virtqueue_enable_cb_prepare_split(_vq);
> }
> EXPORT_SYMBOL_GPL(virtqueue_enable_cb_prepare);
>
> @@ -828,8 +1083,8 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - virtio_mb(vq->weak_barriers);
> - return (u16)last_used_idx != virtio16_to_cpu(_vq->vdev, vq->vring.used->idx);
> + return vq->packed ? virtqueue_poll_packed(_vq, last_used_idx) :
> + virtqueue_poll_split(_vq, last_used_idx);
> }
> EXPORT_SYMBOL_GPL(virtqueue_poll);
>
> @@ -867,34 +1122,9 @@ EXPORT_SYMBOL_GPL(virtqueue_enable_cb);
> bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> - u16 bufs;
>
> - START_USE(vq);
> -
> - /* We optimistically turn back on interrupts, then check if there was
> - * more to do. */
> - /* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
> - * either clear the flags bit or point the event index at the next
> - * entry. Always update the event index to keep code simple. */
> - if (vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) {
> - vq->avail_flags_shadow &= ~VRING_AVAIL_F_NO_INTERRUPT;
> - if (!vq->event)
> - vq->vring.avail->flags = cpu_to_virtio16(_vq->vdev, vq->avail_flags_shadow);
> - }
> - /* TODO: tune this threshold */
> - bufs = (u16)(vq->avail_idx_shadow - vq->last_used_idx) * 3 / 4;
> -
> - virtio_store_mb(vq->weak_barriers,
> - &vring_used_event(&vq->vring),
> - cpu_to_virtio16(_vq->vdev, vq->last_used_idx + bufs));
> -
> - if (unlikely((u16)(virtio16_to_cpu(_vq->vdev, vq->vring.used->idx) - vq->last_used_idx) > bufs)) {
> - END_USE(vq);
> - return false;
> - }
> -
> - END_USE(vq);
> - return true;
> + return vq->packed ? virtqueue_enable_cb_delayed_packed(_vq) :
> + virtqueue_enable_cb_delayed_split(_vq);
> }
> EXPORT_SYMBOL_GPL(virtqueue_enable_cb_delayed);
>
> @@ -909,27 +1139,9 @@ EXPORT_SYMBOL_GPL(virtqueue_enable_cb_delayed);
> void *virtqueue_detach_unused_buf(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> - unsigned int i;
> - void *buf;
>
> - START_USE(vq);
> -
> - for (i = 0; i < vq->vring.num; i++) {
> - if (!vq->desc_state[i].data)
> - continue;
> - /* detach_buf clears data, so grab it now. */
> - buf = vq->desc_state[i].data;
> - detach_buf(vq, i, NULL);
> - vq->avail_idx_shadow--;
> - vq->vring.avail->idx = cpu_to_virtio16(_vq->vdev, vq->avail_idx_shadow);
> - END_USE(vq);
> - return buf;
> - }
> - /* That should have freed everything. */
> - BUG_ON(vq->vq.num_free != vq->vring.num);
> -
> - END_USE(vq);
> - return NULL;
> + return vq->packed ? virtqueue_detach_unused_buf_packed(_vq) :
> + virtqueue_detach_unused_buf_split(_vq);
> }
> EXPORT_SYMBOL_GPL(virtqueue_detach_unused_buf);
>
> @@ -954,7 +1166,8 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
> EXPORT_SYMBOL_GPL(vring_interrupt);
>
> struct virtqueue *__vring_new_virtqueue(unsigned int index,
> - struct vring vring,
> + union vring_union vring,
> + bool packed,
> struct virtio_device *vdev,
> bool weak_barriers,
> bool context,
> @@ -962,19 +1175,22 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> void (*callback)(struct virtqueue *),
> const char *name)
> {
> - unsigned int i;
> struct vring_virtqueue *vq;
> + unsigned int num, i;
> + size_t size;
>
> - vq = kmalloc(sizeof(*vq) + vring.num * sizeof(struct vring_desc_state),
> - GFP_KERNEL);
> + num = packed ? vring.vring_packed.num : vring.vring_split.num;
> + size = packed ? num * sizeof(struct vring_desc_state_packed) :
> + num * sizeof(struct vring_desc_state);
> +
> + vq = kmalloc(sizeof(*vq) + size, GFP_KERNEL);
> if (!vq)
> return NULL;
>
> - vq->vring = vring;
> vq->vq.callback = callback;
> vq->vq.vdev = vdev;
> vq->vq.name = name;
> - vq->vq.num_free = vring.num;
> + vq->vq.num_free = num;
> vq->vq.index = index;
> vq->we_own_ring = false;
> vq->queue_dma_addr = 0;
> @@ -983,9 +1199,8 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> vq->weak_barriers = weak_barriers;
> vq->broken = false;
> vq->last_used_idx = 0;
> - vq->avail_flags_shadow = 0;
> - vq->avail_idx_shadow = 0;
> vq->num_added = 0;
> + vq->packed = packed;
> list_add_tail(&vq->vq.list, &vdev->vqs);
> #ifdef DEBUG
> vq->in_use = false;
> @@ -996,19 +1211,48 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> !context;
> vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
>
> + if (vq->packed) {
> + vq->vring_packed = vring.vring_packed;
> + vq->next_avail_idx = 0;
> + vq->avail_wrap_counter = 1;
> + vq->used_wrap_counter = 1;
> + vq->event_flags_shadow = 0;
> +
> + memset(vq->desc_state_packed, 0,
> + num * sizeof(struct vring_desc_state_packed));
> +
> + /* Put everything in free lists. */
> + vq->free_head = 0;
> + for (i = 0; i < num-1; i++)
> + vq->desc_state_packed[i].next = i + 1;
> + } else {
> + vq->vring = vring.vring_split;
> + vq->avail_flags_shadow = 0;
> + vq->avail_idx_shadow = 0;
> +
> + /* Put everything in free lists. */
> + vq->free_head = 0;
> + for (i = 0; i < num-1; i++)
> + vq->vring.desc[i].next = cpu_to_virtio16(vdev, i + 1);
> +
> + memset(vq->desc_state, 0,
> + num * sizeof(struct vring_desc_state));
> + }
> +
> /* No callback? Tell other side not to bother us. */
> if (!callback) {
> - vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> - if (!vq->event)
> - vq->vring.avail->flags = cpu_to_virtio16(vdev, vq->avail_flags_shadow);
> + if (packed) {
> + vq->event_flags_shadow = VRING_EVENT_F_DISABLE;
> + vq->vring_packed.driver->flags = cpu_to_virtio16(vdev,
> + vq->event_flags_shadow);
> + } else {
> + vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> + if (!vq->event)
> + vq->vring.avail->flags = cpu_to_virtio16(vdev,
> + vq->avail_flags_shadow);
> + }
> }
>
> - /* Put everything in free lists. */
> - vq->free_head = 0;
> - for (i = 0; i < vring.num-1; i++)
> - vq->vring.desc[i].next = cpu_to_virtio16(vdev, i + 1);
> - memset(vq->desc_state, 0, vring.num * sizeof(struct vring_desc_state));
> -
> return &vq->vq;
> }
> EXPORT_SYMBOL_GPL(__vring_new_virtqueue);
> @@ -1055,6 +1299,12 @@ static void vring_free_queue(struct virtio_device *vdev, size_t size,
> }
> }
>
> +static inline int
> +__vring_size(unsigned int num, unsigned long align, bool packed)
> +{
> + return packed ? vring_size_packed(num, align) : vring_size(num, align);
> +}
> +
> struct virtqueue *vring_create_virtqueue(
> unsigned int index,
> unsigned int num,
> @@ -1071,7 +1321,8 @@ struct virtqueue *vring_create_virtqueue(
> void *queue = NULL;
> dma_addr_t dma_addr;
> size_t queue_size_in_bytes;
> - struct vring vring;
> + union vring_union vring;
> + bool packed;
>
> /* We assume num is a power of 2. */
> if (num & (num - 1)) {
> @@ -1079,9 +1330,13 @@ struct virtqueue *vring_create_virtqueue(
> return NULL;
> }
>
> + packed = virtio_has_feature(vdev, VIRTIO_F_RING_PACKED);
> +
> /* TODO: allocate each queue chunk individually */
> - for (; num && vring_size(num, vring_align) > PAGE_SIZE; num /= 2) {
> - queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
> + for (; num && __vring_size(num, vring_align, packed) > PAGE_SIZE;
> + num /= 2) {
> + queue = vring_alloc_queue(vdev, __vring_size(num, vring_align,
> + packed),
> &dma_addr,
> GFP_KERNEL|__GFP_NOWARN|__GFP_ZERO);
> if (queue)
> @@ -1093,17 +1348,21 @@ struct virtqueue *vring_create_virtqueue(
>
> if (!queue) {
> /* Try to get a single page. You are my only hope! */
> - queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
> + queue = vring_alloc_queue(vdev, __vring_size(num, vring_align,
> + packed),
> &dma_addr, GFP_KERNEL|__GFP_ZERO);
> }
> if (!queue)
> return NULL;
>
> - queue_size_in_bytes = vring_size(num, vring_align);
> - vring_init(&vring, num, queue, vring_align);
> + queue_size_in_bytes = __vring_size(num, vring_align, packed);
> + if (packed)
> + vring_init_packed(&vring.vring_packed, num, queue, vring_align);
> + else
> + vring_init(&vring.vring_split, num, queue, vring_align);
>
> - vq = __vring_new_virtqueue(index, vring, vdev, weak_barriers, context,
> - notify, callback, name);
> + vq = __vring_new_virtqueue(index, vring, packed, vdev, weak_barriers,
> + context, notify, callback, name);
> if (!vq) {
> vring_free_queue(vdev, queue_size_in_bytes, queue,
> dma_addr);
> @@ -1129,10 +1388,17 @@ struct virtqueue *vring_new_virtqueue(unsigned int index,
> void (*callback)(struct virtqueue *vq),
> const char *name)
> {
> - struct vring vring;
> - vring_init(&vring, num, pages, vring_align);
> - return __vring_new_virtqueue(index, vring, vdev, weak_barriers, context,
> - notify, callback, name);
> + union vring_union vring;
> + bool packed;
> +
> + packed = virtio_has_feature(vdev, VIRTIO_F_RING_PACKED);
> + if (packed)
> + vring_init_packed(&vring.vring_packed, num, pages, vring_align);
> + else
> + vring_init(&vring.vring_split, num, pages, vring_align);
vring_init in the UAPI header is more or less a bug.
I'd just stop using it, keep it around for legacy userspace.
> +
> + return __vring_new_virtqueue(index, vring, packed, vdev, weak_barriers,
> + context, notify, callback, name);
> }
> EXPORT_SYMBOL_GPL(vring_new_virtqueue);
>
> @@ -1142,7 +1408,9 @@ void vring_del_virtqueue(struct virtqueue *_vq)
>
> if (vq->we_own_ring) {
> vring_free_queue(vq->vq.vdev, vq->queue_size_in_bytes,
> - vq->vring.desc, vq->queue_dma_addr);
> + vq->packed ? (void *)vq->vring_packed.desc :
> + (void *)vq->vring.desc,
> + vq->queue_dma_addr);
> }
> list_del(&_vq->list);
> kfree(vq);
> @@ -1184,7 +1452,7 @@ unsigned int virtqueue_get_vring_size(struct virtqueue *_vq)
>
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - return vq->vring.num;
> + return vq->packed ? vq->vring_packed.num : vq->vring.num;
> }
> EXPORT_SYMBOL_GPL(virtqueue_get_vring_size);
>
> @@ -1227,6 +1495,10 @@ dma_addr_t virtqueue_get_avail_addr(struct virtqueue *_vq)
>
> BUG_ON(!vq->we_own_ring);
>
> + if (vq->packed)
> + return vq->queue_dma_addr + ((char *)vq->vring_packed.driver -
> + (char *)vq->vring_packed.desc);
> +
> return vq->queue_dma_addr +
> ((char *)vq->vring.avail - (char *)vq->vring.desc);
> }
> @@ -1238,11 +1510,16 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *_vq)
>
> BUG_ON(!vq->we_own_ring);
>
> + if (vq->packed)
> + return vq->queue_dma_addr + ((char *)vq->vring_packed.device -
> + (char *)vq->vring_packed.desc);
> +
> return vq->queue_dma_addr +
> ((char *)vq->vring.used - (char *)vq->vring.desc);
> }
> EXPORT_SYMBOL_GPL(virtqueue_get_used_addr);
>
> +/* Only available for split ring */
> const struct vring *virtqueue_get_vring(struct virtqueue *vq)
> {
> return &to_vvq(vq)->vring;
> diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> index fab02133a919..992142b35f55 100644
> --- a/include/linux/virtio_ring.h
> +++ b/include/linux/virtio_ring.h
> @@ -60,6 +60,11 @@ static inline void virtio_store_mb(bool weak_barriers,
> struct virtio_device;
> struct virtqueue;
>
> +union vring_union {
> + struct vring vring_split;
> + struct vring_packed vring_packed;
> +};
> +
> /*
> * Creates a virtqueue and allocates the descriptor ring. If
> * may_reduce_num is set, then this may allocate a smaller ring than
> @@ -79,7 +84,8 @@ struct virtqueue *vring_create_virtqueue(unsigned int index,
>
> /* Creates a virtqueue with a custom layout. */
> struct virtqueue *__vring_new_virtqueue(unsigned int index,
> - struct vring vring,
> + union vring_union vring,
> + bool packed,
> struct virtio_device *vdev,
> bool weak_barriers,
> bool ctx,
> --
> 2.18.0
^ permalink raw reply
* Re: [PATCH net-next] net: sched: cls_flower: dump offload count value
From: Vlad Buslov @ 2018-09-07 9:28 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: netdev, jhs, xiyou.wangcong, jiri, davem
In-Reply-To: <20180907111113.100dfa97@cakuba>
On Fri 07 Sep 2018 at 09:11, Jakub Kicinski <jakub.kicinski@netronome.com> wrote:
> On Thu, 6 Sep 2018 18:37:23 +0300, Vlad Buslov wrote:
>> Change flower in_hw_count type to fixed-size u32 and dump it as
>> TCA_FLOWER_IN_HW_COUNT. This change is necessary to properly test shared
>> blocks and re-offload functionality.
>>
>> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>> Acked-by: Jiri Pirko <jiri@mellanox.com>
>> ---
>> include/net/sch_generic.h | 2 +-
>> include/uapi/linux/pkt_cls.h | 2 ++
>> net/sched/cls_flower.c | 5 ++++-
>> 3 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>> index a6d00093f35e..d68ac55539a5 100644
>> --- a/include/net/sch_generic.h
>> +++ b/include/net/sch_generic.h
>> @@ -362,7 +362,7 @@ static inline void tcf_block_offload_dec(struct tcf_block *block, u32 *flags)
>> }
>>
>> static inline void
>> -tc_cls_offload_cnt_update(struct tcf_block *block, unsigned int *cnt,
>> +tc_cls_offload_cnt_update(struct tcf_block *block, u32 *cnt,
>> u32 *flags, bool add)
>> {
>> if (add) {
>> diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
>> index be382fb0592d..2824fb7ed1c9 100644
>> --- a/include/uapi/linux/pkt_cls.h
>> +++ b/include/uapi/linux/pkt_cls.h
>> @@ -483,6 +483,8 @@ enum {
>> TCA_FLOWER_KEY_ENC_OPTS,
>> TCA_FLOWER_KEY_ENC_OPTS_MASK,
>>
>> + TCA_FLOWER_IN_HW_COUNT, /* be32 */
>
> Why be32?
This comment is wrong.
Thanks for catching this.
>
> I wish there was a good way to share this attribute between
> classifiers :(
>
>> __TCA_FLOWER_MAX,
>> };
>>
>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>> index 6fd9bdd93796..4b8dd37dd4f8 100644
>> --- a/net/sched/cls_flower.c
>> +++ b/net/sched/cls_flower.c
>> @@ -98,7 +98,7 @@ struct cls_fl_filter {
>> struct list_head list;
>> u32 handle;
>> u32 flags;
>> - unsigned int in_hw_count;
>> + u32 in_hw_count;
>> struct rcu_work rwork;
>> struct net_device *hw_dev;
>> };
>> @@ -1880,6 +1880,9 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
>> if (f->flags && nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags))
>> goto nla_put_failure;
>>
>> + if (nla_put_u32(skb, TCA_FLOWER_IN_HW_COUNT, f->in_hw_count))
>> + goto nla_put_failure;
>> +
>> if (tcf_exts_dump(skb, &f->exts))
>> goto nla_put_failure;
>>
^ permalink raw reply
* Re: [PATCH net-next v2 4/5] virtio_ring: add event idx support in packed ring
From: Michael S. Tsirkin @ 2018-09-07 14:10 UTC (permalink / raw)
To: Tiwei Bie
Cc: jasowang, virtualization, linux-kernel, netdev, virtio-dev, wexu,
jfreimann
In-Reply-To: <20180711022711.7090-5-tiwei.bie@intel.com>
On Wed, Jul 11, 2018 at 10:27:10AM +0800, Tiwei Bie wrote:
> This commit introduces the EVENT_IDX support in packed ring.
>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Besides the usual comment about hard-coded constants like <<15:
does this actually do any good for performance? We don't
have to if we do not want to.
> ---
> drivers/virtio/virtio_ring.c | 73 ++++++++++++++++++++++++++++++++----
> 1 file changed, 65 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index f317b485ba54..f79a1e17f7d1 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -1050,7 +1050,7 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
> static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> - u16 flags;
> + u16 new, old, off_wrap, flags, wrap_counter, event_idx;
> bool needs_kick;
> u32 snapshot;
>
> @@ -1059,9 +1059,19 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> * suppressions. */
> virtio_mb(vq->weak_barriers);
>
> + old = vq->next_avail_idx - vq->num_added;
> + new = vq->next_avail_idx;
> + vq->num_added = 0;
> +
> snapshot = READ_ONCE(*(u32 *)vq->vring_packed.device);
> + off_wrap = virtio16_to_cpu(_vq->vdev, (__virtio16)(snapshot & 0xffff));
> flags = virtio16_to_cpu(_vq->vdev, (__virtio16)(snapshot >> 16)) & 0x3;
some kind of struct union would be helpful to make this readable.
>
> + wrap_counter = off_wrap >> 15;
> + event_idx = off_wrap & ~(1 << 15);
> + if (wrap_counter != vq->avail_wrap_counter)
> + event_idx -= vq->vring_packed.num;
> +
> #ifdef DEBUG
> if (vq->last_add_time_valid) {
> WARN_ON(ktime_to_ms(ktime_sub(ktime_get(),
> @@ -1070,7 +1080,10 @@ static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
> vq->last_add_time_valid = false;
> #endif
>
> - needs_kick = (flags != VRING_EVENT_F_DISABLE);
> + if (flags == VRING_EVENT_F_DESC)
> + needs_kick = vring_need_event(event_idx, new, old);
> + else
> + needs_kick = (flags != VRING_EVENT_F_DISABLE);
> END_USE(vq);
> return needs_kick;
> }
> @@ -1185,6 +1198,15 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> ret = vq->desc_state_packed[id].data;
> detach_buf_packed(vq, id, ctx);
>
> + /* If we expect an interrupt for the next entry, tell host
> + * by writing event index and flush out the write before
> + * the read in the next get_buf call. */
> + if (vq->event_flags_shadow == VRING_EVENT_F_DESC)
> + virtio_store_mb(vq->weak_barriers,
> + &vq->vring_packed.driver->off_wrap,
> + cpu_to_virtio16(_vq->vdev, vq->last_used_idx |
> + ((u16)vq->used_wrap_counter << 15)));
> +
> #ifdef DEBUG
> vq->last_add_time_valid = false;
> #endif
> @@ -1213,8 +1235,18 @@ static unsigned virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
> /* We optimistically turn back on interrupts, then check if there was
> * more to do. */
>
> + if (vq->event) {
> + vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> + vq->last_used_idx |
> + ((u16)vq->used_wrap_counter << 15));
> + /* We need to update event offset and event wrap
> + * counter first before updating event flags. */
> + virtio_wmb(vq->weak_barriers);
> + }
> +
> if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> - vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> + vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> + VRING_EVENT_F_ENABLE;
> vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> vq->event_flags_shadow);
> }
> @@ -1238,22 +1270,47 @@ static bool virtqueue_poll_packed(struct virtqueue *_vq, unsigned off_wrap)
> static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> + u16 bufs, used_idx, wrap_counter;
>
> START_USE(vq);
>
> /* We optimistically turn back on interrupts, then check if there was
> * more to do. */
>
> + if (vq->event) {
> + /* TODO: tune this threshold */
> + bufs = (vq->vring_packed.num - _vq->num_free) * 3 / 4;
> + wrap_counter = vq->used_wrap_counter;
> +
> + used_idx = vq->last_used_idx + bufs;
> + if (used_idx >= vq->vring_packed.num) {
> + used_idx -= vq->vring_packed.num;
> + wrap_counter ^= 1;
> + }
> +
> + vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> + used_idx | (wrap_counter << 15));
> +
> + /* We need to update event offset and event wrap
> + * counter first before updating event flags. */
> + virtio_wmb(vq->weak_barriers);
> + } else {
> + used_idx = vq->last_used_idx;
> + wrap_counter = vq->used_wrap_counter;
> + }
> +
> if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> - vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> + vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> + VRING_EVENT_F_ENABLE;
> vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> vq->event_flags_shadow);
> - /* We need to enable interrupts first before re-checking
> - * for more used buffers. */
> - virtio_mb(vq->weak_barriers);
> }
>
> - if (more_used_packed(vq)) {
> + /* We need to update event suppression structure first
> + * before re-checking for more used buffers. */
> + virtio_mb(vq->weak_barriers);
> +
mb is expensive. We should not do it if we changed nothing.
> + if (is_used_desc_packed(vq, used_idx, wrap_counter)) {
> END_USE(vq);
> return false;
> }
> --
> 2.18.0
^ permalink raw reply
* Re: [PATCH net-next 06/11] tuntap: split out XDP logic
From: Michael S. Tsirkin @ 2018-09-07 14:16 UTC (permalink / raw)
To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <84ae17c1-f34b-610d-a5a1-ba8dce1732f3@redhat.com>
On Fri, Sep 07, 2018 at 11:29:34AM +0800, Jason Wang wrote:
> > > + if (act != XDP_PASS)
> > > + goto out;
> > likely?
>
> It depends on the XDP program, so I tend not to use it.
Point is XDP_PASS is already slow.
--
MST
^ permalink raw reply
* Re: [PATCH net-next 04/11] tuntap: simplify error handling in tun_build_skb()
From: Michael S. Tsirkin @ 2018-09-07 14:17 UTC (permalink / raw)
To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <e1f17f51-5f99-df18-4185-6c6e4dfaba89@redhat.com>
On Fri, Sep 07, 2018 at 11:22:00AM +0800, Jason Wang wrote:
> > > @@ -1668,6 +1668,9 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
> > > if (copied != len)
> > > return ERR_PTR(-EFAULT);
> > > + get_page(alloc_frag->page);
> > > + alloc_frag->offset += buflen;
> > > +
> > This adds an atomic op on XDP_DROP which is a data path
> > operation for some workloads.
>
> Yes, I have patch on top to amortize this, the idea is to have a very big
> refcount once after the frag was allocated and maintain a bias and decrease
> them all when allocating new frags.'
Why bother with refcounting for a drop though? It should be simple.
--
MST
^ permalink raw reply
* Re: kernels > v4.12 oops/crash with ipsec-traffic: bisected to b838d5e1c5b6e57b10ec8af2268824041e3ea911: ipv4: mark DST_NOGC and remove the operation of dst_free()
From: Wolfgang Walter @ 2018-09-07 9:53 UTC (permalink / raw)
To: Steffen Klassert; +Cc: netdev, Wei Wang
In-Reply-To: <20180831065024.GG23674@gauss3.secunet.de>
Hello,
didn't respond as I've been on vacation.
Am Freitag, 31. August 2018, 08:50:24 schrieb Steffen Klassert:
> On Thu, Aug 30, 2018 at 08:53:50PM +0200, Wolfgang Walter wrote:
> > Hello,
> >
> > kernels > 4.12 do not work on one of our main routers. They crash as soon
> > as ipsec-tunnels are configured and ipsec-traffic actually flows.
>
> Can you please send the backtrace of this crash?
>
I'll try today. The oops quickly disappears because other problems arising
from it pop up. The machine crashes and no logs are logged. I try to make foto
or try to log to the serial console.
At the moment I only see that there is xfrm_???? stuff in the call trace as
xfrm_lookup, xfrm_route_????, and it is while routing a packet.
With later kernels (4.18.5) the machine seems to crash without a call trace on
console.
Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
^ permalink raw reply
* Re: [PATCH 3/4] of: Convert to using %pOFn instead of device_node.name
From: Rob Herring @ 2018-09-07 14:58 UTC (permalink / raw)
To: Thierry Reding
Cc: Frank Rowand, devicetree, linux-kernel@vger.kernel.org,
Andrew Lunn, Florian Fainelli, netdev
In-Reply-To: <20180907122928.GA5821@ulmo>
On Fri, Sep 7, 2018 at 7:29 AM Thierry Reding <thierry.reding@gmail.com> wrote:
>
> On Tue, Aug 28, 2018 at 10:52:53AM -0500, Rob Herring wrote:
> > In preparation to remove the node name pointer from struct device_node,
> > convert printf users to use the %pOFn format specifier.
> >
> > Cc: Frank Rowand <frowand.list@gmail.com>
> > Cc: Andrew Lunn <andrew@lunn.ch>
> > Cc: Florian Fainelli <f.fainelli@gmail.com>
> > Cc: devicetree@vger.kernel.org
> > Cc: netdev@vger.kernel.org
> > Signed-off-by: Rob Herring <robh@kernel.org>
> > ---
> > drivers/of/device.c | 4 ++--
> > drivers/of/of_mdio.c | 12 ++++++------
> > drivers/of/of_numa.c | 4 ++--
> > drivers/of/overlay.c | 4 ++--
> > drivers/of/platform.c | 8 ++++----
> > drivers/of/unittest.c | 12 ++++++------
> > 6 files changed, 22 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/of/device.c b/drivers/of/device.c
> > index 5957cd4fa262..daa075d87317 100644
> > --- a/drivers/of/device.c
> > +++ b/drivers/of/device.c
> > @@ -219,7 +219,7 @@ static ssize_t of_device_get_modalias(struct device *dev, char *str, ssize_t len
> > return -ENODEV;
> >
> > /* Name & Type */
> > - csize = snprintf(str, len, "of:N%sT%s", dev->of_node->name,
> > + csize = snprintf(str, len, "of:N%pOFnT%s", dev->of_node,
> > dev->of_node->type);
> > tsize = csize;
> > len -= csize;
>
> This seems to cause the modalias to be improperly constructed. As a
> consequence, automatic module loading at boot time is now broken. I
> think the reason why this fails is because vsnprintf() will skip all
> alpha-numeric characters after a call to pointer(). Presumably this
> is meant to be a generic way of skipping whatever specifiers we throw
> at it.
>
> Unfortunately for the case of OF modaliases, this means that the 'T'
> character gets eaten, so we end up with something like this:
>
> # udevadm info /sys/bus/platform/devices/54200000.dc
> [...]
> E: MODALIAS=of:Ndc<NULL>Cnvidia,tegra124-dc
> [...]
>
> instead of this:
>
> # udevadm info /sys/bus/platform/devices/54200000.dc
> [...]
> E: MODALIAS=of:NdcT<NULL>Cnvidia,tegra124-dc
> [...]
Oops. Thanks for finding this.
> Everything is back to normal if I revert this patch. However, since
> that's obviously not what we want, I think perhaps what we need is a
> way for pointer() (and its implementations) to report back how many
> characters in the format string it consumed so that we can support
> these kinds of back-to-back strings.
I don't think we can change it because if I have something like
%pOFMoreWords and then later on want to add 'M' as a new modifier we'd
break any existing cases with 'M'. Of course, I could search the tree
for that case and find unused characters, but that seems fragile
(though silently throwing away the characters does too).
> If nobody else has the time I can look into coding up a fix, but in the
> meantime it might be best to back this one out until we can handle the
> OF modalias format string.
There's an easy fix though. Just replace the 'T' with a '%c'.
I found one other case in the clock code and one in soundbus (which I
haven't posted yet).
Rob
^ permalink raw reply
* Re: [PATCH] PCI: Reprogram bridge prefetch registers on resume
From: Peter Wu @ 2018-09-07 15:05 UTC (permalink / raw)
To: Daniel Drake
Cc: mika.westerberg-VuQAYsv1563Yd54FQh9/CA,
linux-pm-u79uwXL29TY76Z2rM5mHXA, linux-pci-u79uwXL29TY76Z2rM5mHXA,
rafael.j.wysocki-ral2JQCrhuEAvxtiuMwx3w,
nic_swsd-Rasf1IRRPZFBDgjK7y7TUQ,
keith.busch-ral2JQCrhuEAvxtiuMwx3w, netdev-u79uwXL29TY76Z2rM5mHXA,
nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
bhelgaas-hpIqsD4AKlfQT0dZR+AlfA,
andriy.shevchenko-VuQAYsv1563Yd54FQh9/CA,
linux-6IF/jdPJHihWk0Htik3J/w, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
jonathan.derrick-ral2JQCrhuEAvxtiuMwx3w,
hkallweit1-Re5JQEeQqe8AvxtiuMwx3w
In-Reply-To: <20180907053614.6540-1-drake-6IF/jdPJHihWk0Htik3J/w@public.gmane.org>
On Fri, Sep 07, 2018 at 01:36:14PM +0800, Daniel Drake wrote:
<..>
> Thomas Martitz reports that this workaround also solves an issue where
> the AMD Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive
> after S3 suspend/resume.
Where was this claimed? It is not stated in the linked bug:
(https://bugs.freedesktop.org/show_bug.cgi?id=105760
> On resume, reprogram the PCI bridge prefetch registers, including the
> magic register mentioned above.
>
> This matches Win10 behaviour, which also rewrites these registers
> during S3 resume (checked with qemu tracing).
Windows 10 unconditionally rewrites these registers (BAR, I/O Base +
Limit, Memory Base + Limit, etc. from top to bottom), see annotations:
https://www.spinics.net/lists/linux-pci/msg75856.html
Linux has a generic "restore" operation that works backwards from the
end of the PCI config space to the beginning, see
pci_restore_config_space. Do you have a dmesg where you see the
"restoring config space at offset" messages?
Would it be reasonable to unconditionally write these registers in
pci_restore_config_dword, like Windows does?
Kind regards,
Peter
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
^ permalink raw reply
* [PATCH net-next] cxgb4: impose mandatory VLAN usage when non-zero TAG ID
From: Ganesh Goudar @ 2018-09-07 10:29 UTC (permalink / raw)
To: netdev, davem; +Cc: nirranjan, indranil, dt, Casey Leedom, Ganesh Goudar
From: Casey Leedom <leedom@chelsio.com>
When a non-zero VLAN Tag ID is passed to t4_set_vlan_acl()
then impose mandatory VLAN Usage with that VLAN ID.
I.e any other VLAN ID should result in packets getting
dropped.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 3 +++
drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h | 1 +
2 files changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 5fe5d16..6f1bd7b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -10210,6 +10210,9 @@ int t4_set_vlan_acl(struct adapter *adap, unsigned int mbox, unsigned int vf,
vlan_cmd.en_to_len16 = cpu_to_be32(enable | FW_LEN16(vlan_cmd));
/* Drop all packets that donot match vlan id */
vlan_cmd.dropnovlan_fm = FW_ACL_VLAN_CMD_FM_F;
+ vlan_cmd.dropnovlan_fm = (enable
+ ? (FW_ACL_VLAN_CMD_DROPNOVLAN_F |
+ FW_ACL_VLAN_CMD_FM_F) : 0);
if (enable != 0) {
vlan_cmd.nvlan = 1;
vlan_cmd.vlanid[0] = cpu_to_be16(vlan);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h b/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
index 5dc6c41..6d2bc87 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
@@ -2464,6 +2464,7 @@ struct fw_acl_vlan_cmd {
#define FW_ACL_VLAN_CMD_DROPNOVLAN_S 7
#define FW_ACL_VLAN_CMD_DROPNOVLAN_V(x) ((x) << FW_ACL_VLAN_CMD_DROPNOVLAN_S)
+#define FW_ACL_VLAN_CMD_DROPNOVLAN_F FW_ACL_VLAN_CMD_DROPNOVLAN_V(1U)
#define FW_ACL_VLAN_CMD_FM_S 6
#define FW_ACL_VLAN_CMD_FM_M 0x1
--
2.1.0
^ permalink raw reply related
* Re: [PATCH v2 net-next 3/4] net: make listified RX functions return number of good packets
From: Edward Cree @ 2018-09-07 10:44 UTC (permalink / raw)
To: Eric Dumazet, davem; +Cc: linux-net-drivers, netdev
In-Reply-To: <2ed84271-06da-b5c2-3f23-940357880002@gmail.com>
On 07/09/18 03:27, Eric Dumazet wrote:
> On 09/06/2018 07:26 AM, Edward Cree wrote:
>> Signed-off-by: Edward Cree <ecree@solarflare.com>
> Lack of changelog here ?
>
> I do not know what is a good packet.
The comment on netif_receive_skb_list() defines this as "skbs for which
netif_receive_skb() would have returned %NET_RX_SUCCESS". But I shall put
that into the changelog as well.
> You are adding a lot of conditional expressions, that cpu
> will mispredict quite often.
I don't see an alternative, since this is needed by patch #4 and I daresay
there are other drivers that will also want to get NET_RX_SUCCESS-like
information (possibly from regular receives as well as GRO; I'm not quite
sure why sfc only cares about the latter).
> Typical micro benchmarks wont really notice.
Any suggestions on how to construct a test that will?
^ permalink raw reply
* RE: [PATCH v3 2/2] net: ethernet: i40evf: fix underlying build error
From: Wyborny, Carolyn @ 2018-09-07 15:26 UTC (permalink / raw)
To: Wang, Dongsheng, Kirsher, Jeffrey T,
sergei.shtylyov@cogentembedded.com
Cc: Keller, Jacob E, davem@davemloft.net,
intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <77898840b49847a7a835b20b383e21a2@HXTBJIDCEMVIW02.hxtcorp.net>
> -----Original Message-----
> From: Wang, Dongsheng [mailto:dongsheng.wang@hxt-semitech.com]
> Sent: Friday, September 07, 2018 5:34 AM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>;
> sergei.shtylyov@cogentembedded.com
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; davem@davemloft.net;
> intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org; Wyborny, Carolyn <carolyn.wyborny@intel.com>
> Subject: Re: [PATCH v3 2/2] net: ethernet: i40evf: fix underlying build error
>
> Hello Jacob,
>
> Since Carolyn' team is working this, I think we don't need this patch
> anymore because this header file is only for ethtool.c.
> [..]
Hello Dongsheng,
The commonization effort we're working on is prioritizing our newest drivers. The i40e work is still being scoped, so we should fix this problem as needed now and not wait.
I apologize for any miscommunication. Was trying to let people know that we aware of the issue and are trying to make progress in that direction.
Thanks,
Carolyn
^ permalink raw reply
* RE: [PATCH v3 1/2] net: ethernet: i40e: fix build error
From: Keller, Jacob E @ 2018-09-07 15:27 UTC (permalink / raw)
To: Wang Dongsheng, Kirsher, Jeffrey T,
sergei.shtylyov@cogentembedded.com
Cc: davem@davemloft.net, intel-wired-lan@lists.osuosl.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <1536319175-3660-2-git-send-email-dongsheng.wang@hxt-semitech.com>
> -----Original Message-----
> From: Wang Dongsheng [mailto:dongsheng.wang@hxt-semitech.com]
> Sent: Friday, September 07, 2018 4:20 AM
> To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>;
> sergei.shtylyov@cogentembedded.com
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; davem@davemloft.net; intel-
> wired-lan@lists.osuosl.org; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org; Wang Dongsheng <dongsheng.wang@hxt-
> semitech.com>
> Subject: [PATCH v3 1/2] net: ethernet: i40e: fix build error
>
> Remove "inline" from __i40e_add_stat_strings and move the function.
>
> In file included from
> drivers/net/ethernet/intel/i40e/i40e_ethtool.c:9:0:
> drivers/net/ethernet/intel/i40e/i40e_ethtool.c: In function
> __i40e_add_stat_string:
> drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h:193:20: error:
> function __i40e_add_stat_strings can never be inlined because it uses
> variable argument lists
> static inline void __i40e_add_stat_strings(u8 **p, const struct
> i40e_stats stats[],
>
> Tested on: x86_64, make ARCH=i386
>
> Modules section .text:
> i40e: 00019380 <__i40e_add_stat_strings>:
> i40evf: 00006b00 <__i40e_add_stat_strings>:
>
> Buildin section .text:
> i40e: c351ca60 <__i40e_add_stat_strings>:
> i40evf: c354f2c0 <__i40e_add_stat_strings>:
>
> Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
> ---
> V3: add static
> V2: Move function
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
This patch fixes the build issue, and is fine with me.
Thanks,
Jake
> ---
> .../net/ethernet/intel/i40e/i40e_ethtool.c | 24 ++++++++++++++++++
> .../ethernet/intel/i40e/i40e_ethtool_stats.h | 25 ++-----------------
> 2 files changed, 26 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> index d7d3974beca2..34121a72f2e7 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> @@ -1821,6 +1821,30 @@ static void i40e_get_ethtool_stats(struct net_device
> *netdev,
> "ethtool stats count mismatch!");
> }
>
> +/**
> + * __i40e_add_stat_strings - copy stat strings into ethtool buffer
> + * @p: ethtool supplied buffer
> + * @stats: stat definitions array
> + * @size: size of the stats array
> + *
> + * Format and copy the strings described by stats into the buffer pointed at
> + * by p.
> + **/
> +static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
> + const unsigned int size, ...)
> +{
> + unsigned int i;
> +
> + for (i = 0; i < size; i++) {
> + va_list args;
> +
> + va_start(args, size);
> + vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
> + *p += ETH_GSTRING_LEN;
> + va_end(args);
> + }
> +}
> +
> /**
> * i40e_get_stat_strings - copy stat strings into supplied buffer
> * @netdev: the netdev to collect strings for
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
> b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
> index bba1cb0b658f..553b0d720839 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
> +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
> @@ -181,29 +181,8 @@ i40e_add_queue_stats(u64 **data, struct i40e_ring
> *ring)
> *data += size;
> }
>
> -/**
> - * __i40e_add_stat_strings - copy stat strings into ethtool buffer
> - * @p: ethtool supplied buffer
> - * @stats: stat definitions array
> - * @size: size of the stats array
> - *
> - * Format and copy the strings described by stats into the buffer pointed at
> - * by p.
> - **/
> -static inline void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
> - const unsigned int size, ...)
> -{
> - unsigned int i;
> -
> - for (i = 0; i < size; i++) {
> - va_list args;
> -
> - va_start(args, size);
> - vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
> - *p += ETH_GSTRING_LEN;
> - va_end(args);
> - }
> -}
> +static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
> + const unsigned int size, ...);
>
> /**
> * 40e_add_stat_strings - copy stat strings into ethtool buffer
> --
> 2.18.0
^ permalink raw reply
* RE: linux-next: build failure after merge of the net-next tree
From: Keller, Jacob E @ 2018-09-07 15:30 UTC (permalink / raw)
To: Stephen Rothwell, David Miller, Networking
Cc: Linux-Next Mailing List, Linux Kernel Mailing List,
Kirsher, Jeffrey T, Bowers, AndrewX
In-Reply-To: <20180907102055.001777b8@canb.auug.org.au>
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Stephen Rothwell
> Sent: Thursday, September 06, 2018 5:21 PM
> To: David Miller <davem@davemloft.net>; Networking
> <netdev@vger.kernel.org>
> Cc: Linux-Next Mailing List <linux-next@vger.kernel.org>; Linux Kernel Mailing
> List <linux-kernel@vger.kernel.org>; Keller, Jacob E <jacob.e.keller@intel.com>;
> Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; Bowers, AndrewX
> <andrewx.bowers@intel.com>
> Subject: Re: linux-next: build failure after merge of the net-next tree
>
> Hi all,
>
> On Mon, 3 Sep 2018 09:47:02 +1000 Stephen Rothwell <sfr@canb.auug.org.au>
> wrote:
> >
> > After merging the net-next tree, today's linux-next build (powerpc
> > ppc64_defconfig) failed like this:
> >
> > In file included from drivers/net/ethernet/intel/i40e/i40e_ethtool.c:9:
> > drivers/net/ethernet/intel/i40e/i40e_ethtool.c: In function
> '__i40e_add_stat_strings':
> > drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h:193:20: error: function
> '__i40e_add_stat_strings' can never be inlined because it uses variable argument
> lists
> > static inline void __i40e_add_stat_strings(u8 **p, const struct i40e_stats
> stats[],
> > ^~~~~~~~~~~~~~~~~~~~~~~
> >
> > Caused by commit
> >
> > 8fd75c58a09a ("i40e: move ethtool stats boiler plate code to
> i40e_ethtool_stats.h")
> >
> > It is not clear this patch has any value anyway as the moved functions
> > are only used in the file they were moved from.
> >
> > I reverted that commit for today.
> >
> > The same problem would exist in drivers/net/ethernet/intel/i40evf (where
> > a lot of code is duplicated from drivers/net/ethernet/intel/i40e) except
> > that this function is not declared inline there.
> > Luckily, i40e_ethtool_stats.h is only included my one file
> > drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c, otherwise there
> > would be multiple copies of __i40e_add_stat_strings().
> >
> > Surely there is some scope for factoring out some common code between
> > these two drivers?
>
> Ping?
>
> --
> Cheers,
> Stephen Rothwell
There's some discussion about this going on in the intel-wired-lan mailing list.
Thanks,
Jake
^ permalink raw reply
* Re: [PATCH v2 net-next 0/4] net: batched receive in GRO path
From: Edward Cree @ 2018-09-07 10:51 UTC (permalink / raw)
To: Eric Dumazet, davem; +Cc: linux-net-drivers, netdev
In-Reply-To: <01bc9bf5-1780-2650-958f-961bd24b8c26@gmail.com>
On 07/09/18 03:32, Eric Dumazet wrote:
> Your performance numbers are not convincing, since TCP stream test should
> get nominal GRO gains.
I'm not quite sure what you mean here, could you explain a bit more?
> Adding this complexity and icache pressure needs more experimental results.
>
> What about RPC workloads (eg 100 concurrent netperf -t TCP_RR -- -r 8000,8000 )
I'll try that. Any other tests that would be worthwhile?
-Ed
^ permalink raw reply
* Re: [PATCH v3 2/2] net: ethernet: i40evf: fix underlying build error
From: Sergei Shtylyov @ 2018-09-07 15:33 UTC (permalink / raw)
To: Wang Dongsheng, jeffrey.t.kirsher
Cc: jacob.e.keller, davem, intel-wired-lan, netdev, linux-kernel
In-Reply-To: <1536319175-3660-3-git-send-email-dongsheng.wang@hxt-semitech.com>
On 09/07/2018 02:19 PM, Wang Dongsheng wrote:
> Can't have non-inline function in a header file.
> There is a risk of "Multiple definition" from cross-including.
>
> Tested on: x86_64, make ARCH=i386
>
> Modules section .text:
> i40e: 00019380 <__i40e_add_stat_strings>:
> i40evf: 00006b00 <__i40e_add_stat_strings>:
>
> Buildin section .text:
> i40e: c351ca60 <__i40e_add_stat_strings>:
> i40evf: c354f2c0 <__i40e_add_stat_strings>:
>
> Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
> ---
> V3: add static
> ---
> .../intel/i40evf/i40e_ethtool_stats.h | 23 +-----------------
> .../ethernet/intel/i40evf/i40evf_ethtool.c | 24 +++++++++++++++++++
> 2 files changed, 25 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h b/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h
> index 60b595dd8c39..62ab67a77753 100644
> --- a/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h
> +++ b/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h
> @@ -181,29 +181,8 @@ i40evf_add_queue_stats(u64 **data, struct i40e_ring *ring)
> *data += size;
> }
>
> -/**
> - * __i40e_add_stat_strings - copy stat strings into ethtool buffer
> - * @p: ethtool supplied buffer
> - * @stats: stat definitions array
> - * @size: size of the stats array
> - *
> - * Format and copy the strings described by stats into the buffer pointed at
> - * by p.
> - **/
> static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
There's no point to keeping *static* function in the header file (unless it's
also *inline*).
> - const unsigned int size, ...)
> -{
> - unsigned int i;
> -
> - for (i = 0; i < size; i++) {
> - va_list args;
> -
> - va_start(args, size);
> - vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
> - *p += ETH_GSTRING_LEN;
> - va_end(args);
> - }
> -}
> + const unsigned int size, ...);
>
> /**
> * 40e_add_stat_strings - copy stat strings into ethtool buffer
[...]
MBR, Sergei
^ permalink raw reply
* [PATCH v3 1/2] net: ethernet: i40e: fix build error
From: Wang Dongsheng @ 2018-09-07 11:19 UTC (permalink / raw)
To: jeffrey.t.kirsher, sergei.shtylyov
Cc: jacob.e.keller, davem, intel-wired-lan, netdev, linux-kernel,
Wang Dongsheng
In-Reply-To: <1536319175-3660-1-git-send-email-dongsheng.wang@hxt-semitech.com>
Remove "inline" from __i40e_add_stat_strings and move the function.
In file included from
drivers/net/ethernet/intel/i40e/i40e_ethtool.c:9:0:
drivers/net/ethernet/intel/i40e/i40e_ethtool.c: In function
__i40e_add_stat_string:
drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h:193:20: error:
function __i40e_add_stat_strings can never be inlined because it uses
variable argument lists
static inline void __i40e_add_stat_strings(u8 **p, const struct
i40e_stats stats[],
Tested on: x86_64, make ARCH=i386
Modules section .text:
i40e: 00019380 <__i40e_add_stat_strings>:
i40evf: 00006b00 <__i40e_add_stat_strings>:
Buildin section .text:
i40e: c351ca60 <__i40e_add_stat_strings>:
i40evf: c354f2c0 <__i40e_add_stat_strings>:
Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
---
V3: add static
V2: Move function
---
.../net/ethernet/intel/i40e/i40e_ethtool.c | 24 ++++++++++++++++++
.../ethernet/intel/i40e/i40e_ethtool_stats.h | 25 ++-----------------
2 files changed, 26 insertions(+), 23 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index d7d3974beca2..34121a72f2e7 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -1821,6 +1821,30 @@ static void i40e_get_ethtool_stats(struct net_device *netdev,
"ethtool stats count mismatch!");
}
+/**
+ * __i40e_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ * @size: size of the stats array
+ *
+ * Format and copy the strings described by stats into the buffer pointed at
+ * by p.
+ **/
+static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
+ const unsigned int size, ...)
+{
+ unsigned int i;
+
+ for (i = 0; i < size; i++) {
+ va_list args;
+
+ va_start(args, size);
+ vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
+ *p += ETH_GSTRING_LEN;
+ va_end(args);
+ }
+}
+
/**
* i40e_get_stat_strings - copy stat strings into supplied buffer
* @netdev: the netdev to collect strings for
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
index bba1cb0b658f..553b0d720839 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
@@ -181,29 +181,8 @@ i40e_add_queue_stats(u64 **data, struct i40e_ring *ring)
*data += size;
}
-/**
- * __i40e_add_stat_strings - copy stat strings into ethtool buffer
- * @p: ethtool supplied buffer
- * @stats: stat definitions array
- * @size: size of the stats array
- *
- * Format and copy the strings described by stats into the buffer pointed at
- * by p.
- **/
-static inline void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
- const unsigned int size, ...)
-{
- unsigned int i;
-
- for (i = 0; i < size; i++) {
- va_list args;
-
- va_start(args, size);
- vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
- *p += ETH_GSTRING_LEN;
- va_end(args);
- }
-}
+static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
+ const unsigned int size, ...);
/**
* 40e_add_stat_strings - copy stat strings into ethtool buffer
--
2.18.0
^ permalink raw reply related
* Re: [PATCH net-next 11/11] vhost_net: batch submitting XDP buffers to underlayer sockets
From: Michael S. Tsirkin @ 2018-09-07 16:13 UTC (permalink / raw)
To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <ffb802e6-1505-e01e-f6d5-11cde8dace9b@redhat.com>
On Fri, Sep 07, 2018 at 03:41:52PM +0800, Jason Wang wrote:
> > > @@ -556,10 +667,14 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
> > > size_t len, total_len = 0;
> > > int err;
> > > int sent_pkts = 0;
> > > + bool bulking = (sock->sk->sk_sndbuf == INT_MAX);
> > What does bulking mean?
>
> The name is misleading, it means whether we can do batching. For simplicity,
> I disable batching is sndbuf is not INT_MAX.
But what does batching have to do with sndbuf?
> > > for (;;) {
> > > bool busyloop_intr = false;
> > > + if (nvq->done_idx == VHOST_NET_BATCH)
> > > + vhost_tx_batch(net, nvq, sock, &msg);
> > > +
> > > head = get_tx_bufs(net, nvq, &msg, &out, &in, &len,
> > > &busyloop_intr);
> > > /* On error, stop handling until the next kick. */
> > > @@ -577,14 +692,34 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
> > > break;
> > > }
> > > - vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
> > > - vq->heads[nvq->done_idx].len = 0;
> > > -
> > > total_len += len;
> > > - if (tx_can_batch(vq, total_len))
> > > - msg.msg_flags |= MSG_MORE;
> > > - else
> > > - msg.msg_flags &= ~MSG_MORE;
> > > +
> > > + /* For simplicity, TX batching is only enabled if
> > > + * sndbuf is unlimited.
> > What if sndbuf changes while this processing is going on?
>
> We will get the correct sndbuf in the next run of handle_tx(). I think this
> is safe.
If it's safe why bother with special-casing INT_MAX?
--
MST
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox