Netdev List

Netdev List
 help / color / mirror / Atom feed

* RE: [PATCH net-next 5/6] net/ncsi: Reset channel state in ncsi_start_dev()
From: Justin.Lee1 @ 2018-11-06 17:27 UTC (permalink / raw)
  To: sam, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <de9816e6c9cb31fdae1bb3d4a38da65b8f3a7694.camel@mendozajonas.com>


> On Mon, 2018-11-05 at 18:01 +0000, Justin.Lee1@Dell.com wrote:
> > > On Tue, 2018-10-30 at 21:26 +0000, Justin.Lee1@Dell.com wrote:
> > > > > +int ncsi_reset_dev(struct ncsi_dev *nd)
> > > > > +{
> > > > > +	struct ncsi_dev_priv *ndp = TO_NCSI_DEV_PRIV(nd);
> > > > > +	struct ncsi_channel *nc, *active;
> > > > > +	struct ncsi_package *np;
> > > > > +	unsigned long flags;
> > > > > +	bool enabled;
> > > > > +	int state;
> > > > > +
> > > > > +	active = NULL;
> > > > > +	NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > > > > +		NCSI_FOR_EACH_CHANNEL(np, nc) {
> > > > > +			spin_lock_irqsave(&nc->lock, flags);
> > > > > +			enabled = nc->monitor.enabled;
> > > > > +			state = nc->state;
> > > > > +			spin_unlock_irqrestore(&nc->lock, flags);
> > > > > +
> > > > > +			if (enabled)
> > > > > +				ncsi_stop_channel_monitor(nc);
> > > > > +			if (state == NCSI_CHANNEL_ACTIVE) {
> > > > > +				active = nc;
> > > > > +				break;
> > > > 
> > > > Is the original intention to process the channel one by one?
> > > > If it is the case, there are two loops and we might need to use
> > > > "goto found" instead.
> > > 
> > > Yes we'll need to break out of the package loop here as well.
> > > 
> > > > > +			}
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > 
> > > > found: ?
> > > > 
> > > > > +	if (!active) {
> > > > > +		/* Done */
> > > > > +		spin_lock_irqsave(&ndp->lock, flags);
> > > > > +		ndp->flags &= ~NCSI_DEV_RESET;
> > > > > +		spin_unlock_irqrestore(&ndp->lock, flags);
> > > > > +		return ncsi_choose_active_channel(ndp);
> > > > > +	}
> > > > > +
> > > > > +	spin_lock_irqsave(&ndp->lock, flags);
> > > > > +	ndp->flags |= NCSI_DEV_RESET;
> > > > > +	ndp->active_channel = active;
> > > > > +	ndp->active_package = active->package;
> > > > > +	spin_unlock_irqrestore(&ndp->lock, flags);
> > > > > +
> > > > > +	nd->state = ncsi_dev_state_suspend;
> > > > > +	schedule_work(&ndp->work);
> > > > > +	return 0;
> > > > > +}
> > > > 
> > > > Also similar issue in ncsi_choose_active_channel() function below.
> > > > 
> > > > > @@ -916,32 +1045,49 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> > > > >  
> > > > >  			ncm = &nc->modes[NCSI_MODE_LINK];
> > > > >  			if (ncm->data[2] & 0x1) {
> > > > > -				spin_unlock_irqrestore(&nc->lock, flags);
> > > > >  				found = nc;
> > > > > -				goto out;
> > > > > +				with_link = true;
> > > > >  			}
> > > > >  
> > > > > -			spin_unlock_irqrestore(&nc->lock, flags);
> > > > > +			/* If multi_channel is enabled configure all valid
> > > > > +			 * channels whether or not they currently have link
> > > > > +			 * so they will have AENs enabled.
> > > > > +			 */
> > > > > +			if (with_link || np->multi_channel) {
> > > > 
> > > > I notice that there is a case that we will misconfigure the interface.
> > > > For example below, multi-channel is not enable for package 1.
> > > > But we enable the channel for ncsi2 below (package 1 channel 0) as that interface is the first
> > > > channel for that package with link.
> > > 
> > > I don't think I see the issue here; multi-channel is not set on package
> > > 1, but both channels are in the channel whitelist. Channel 0 is
> > > configured since it's the first found on package 1, and channel 1 is not
> > > since channel 0 is already found. Are you expecting something different?
> > >  
> > 
> > The setting is that multi-package is enable for both package 0 and 1.
> > Multi-channel is only enabled for package 0.
> > 
> > > > cat /sys/kernel/debug/ncsi_protocol/ncsi_device_
> > > > IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC CS PS LS RU CR NQ HA
> > > > =====================================================================
> > > >   2   eth2   ncsi0  000 000 1  1  1  1  1  1  0  2  1  1  1  1  0  1
> > > >   2   eth2   ncsi1  000 001 1  0  1  1  1  1  0  2  1  1  1  1  0  1
> > > >   2   eth2   ncsi2  001 000 1  0  1  0  1  1  0  2  1  1  1  1  0  1
> > 
> > I was replying to the wrong old email and it might cause a bit confusion.
> > The first 1 meaning channel is enabled for package 1 channel 0 (ncsi2). 
> > For eth2, we already has ncsi0 as the active channel with TX enable.
> > I would think that package doesn't have the multi-channel enabled and
> > we should not enable the channel for ncsi2. The problem is that package 1 doesn't
> > enable the multi-channel and it believes it needs to enable one channel for its package 
> > but it doesn't aware that the other package already has one active channel.
> 
> Ah, maybe the confusion here is that multi_channel is a per-package
> setting; it determines what a package does with its own channels.
> 
> So you have package 0 with multi-channel enabled so it enables channels 0
> & 1.
> Then you have package 1 without multi-channel so it enables only channel
> 0.
> There is still only one Tx channel (package 0, channel 0).
> 
> Does that sound right, or have I missed something?

Yes, you are right. There is only one TX enabled. 
If we can hold off a few seconds before applying, then we will not see 
these configuration changes in between the back to back netlink commands.

Thanks, 
Justin

^ permalink raw reply

* Re: [PATCH] cw1200: fix small typo
From: Kalle Valo @ 2018-11-06 17:04 UTC (permalink / raw)
  To: Yangtao Li; +Cc: pizza, davem, linux-wireless, netdev, linux-kernel, Yangtao Li
In-Reply-To: <20181101153319.22830-1-tiny.windzz@gmail.com>

Yangtao Li <tiny.windzz@gmail.com> wrote:

> Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>

Patch applied to wireless-drivers-next.git, thanks.

f4bd758f3f20 cw1200: fix small typo

-- 
https://patchwork.kernel.org/patch/10664149/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH] rtlwifi: Remove same duplicated includes
From: Kalle Valo @ 2018-11-06 17:02 UTC (permalink / raw)
  To: zhong jiang; +Cc: pkshih, davem, linux-wireless, netdev, linux-kernel
In-Reply-To: <1540361256-678-1-git-send-email-zhongjiang@huawei.com>

zhong jiang <zhongjiang@huawei.com> wrote:

> Just remove same duplicated includes.
> 
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>

Patch applied to wireless-drivers-next.git, thanks.

963b307361bd rtlwifi: Remove same duplicated includes

-- 
https://patchwork.kernel.org/patch/10654183/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCHv3] rtlwifi: rtl8723ae: Remove set but not used variables and #defines
From: Kalle Valo @ 2018-11-06 17:01 UTC (permalink / raw)
  To: zhong jiang; +Cc: pkshih, davem, linux-wireless, joe, netdev, linux-kernel
In-Reply-To: <1540360329-65013-1-git-send-email-zhongjiang@huawei.com>

zhong jiang <zhongjiang@huawei.com> wrote:

> radiob_array_table' and 'radiob_arraylen' are not used after setting its value.
> It is safe to remove the unused variable. Meanwhile, radio B array should be
> removed as well. because it will no longer be referenced.
> 
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
> Acked-by: Ping-Ke Shih <pkshih@realtek.com>

Patch applied to wireless-drivers-next.git, thanks.

90e3243d16ad rtlwifi: rtl8723ae: Remove set but not used variables and #defines

-- 
https://patchwork.kernel.org/patch/10654181/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH 07/20] iwlegacy: 4965-mac: mark expected switch fall-through
From: Kalle Valo @ 2018-11-06 17:00 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: Stanislaw Gruszka, linux-wireless, David S. Miller, netdev,
	linux-kernel, Gustavo A. R. Silva
In-Reply-To: <8d72b8a6f1529906672ac458e96d272bc55a1410.1540239684.git.gustavo@embeddedor.com>

"Gustavo A. R. Silva" <gustavo@embeddedor.com> wrote:

> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

14 patches applied to wireless-drivers-next.git, thanks.

e9904084dd1b iwlegacy: 4965-mac: mark expected switch fall-through
af71f8fef45c iwlegacy: common: mark expected switch fall-throughs
d56b26801e1d orinoco_usb: mark expected switch fall-through
d22b8fadd08e prism54: isl_38xx: Mark expected switch fall-through
3d238b9d5048 prism54: isl_ioctl: mark expected switch fall-through
38a0792d08e9 prism54: islpci_dev: mark expected switch fall-through
63fdc952df36 mwifiex: Mark expected switch fall-through
6eba8fd22352 rt2x00: rt2400pci: mark expected switch fall-through
10bb92217747 rt2x00: rt2500pci: mark expected switch fall-through
916e6bbcfcff rt2x00: rt2800lib: mark expected switch fall-throughs
641dd8068ecb rt2x00: rt61pci: mark expected switch fall-through
d22d2492a35d ray_cs: mark expected switch fall-throughs
89e54fa4562e rtlwifi: rtl8821ae: phy: Mark expected switch fall-through
7cbbe1597e44 zd1201: mark expected switch fall-through

-- 
https://patchwork.kernel.org/patch/10652563/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH 1/2] rtl8xxxu: Mark expected switch fall-throughs
From: Kalle Valo @ 2018-11-06 16:59 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: linux-kernel, Jes Sorensen, linux-wireless, David S. Miller,
	netdev, Gustavo A. R. Silva
In-Reply-To: <08817d137b32d5d091eacc6fee0a3eca68d49d94.1540208577.git.gustavo@embeddedor.com>

"Gustavo A. R. Silva" <gustavo@embeddedor.com> wrote:

> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
> 
> Addresses-Coverity-ID: 1357355 ("Missing break in switch")
> Addresses-Coverity-ID: 1357378 ("Missing break in switch")
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

2 patches applied to wireless-drivers-next.git, thanks.

e20c50cdca19 rtl8xxxu: Mark expected switch fall-throughs
307b00c5e695 rtl8xxxu: Fix missing break in switch

-- 
https://patchwork.kernel.org/patch/10651953/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH] rtlwifi: btcoex: remove set but not used variable 'ppsc'
From: Kalle Valo @ 2018-11-06 16:58 UTC (permalink / raw)
  To: YueHaibing
  Cc: Ping-Ke Shih, Larry Finger, Colin Ian King, Nathan Chancellor,
	YueHaibing, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	kernel-janitors-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1540194675-65562-1-git-send-email-yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

YueHaibing <yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:

> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c: In function 'halbtc_leave_lps':
> drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:295:21: warning:
>  variable 'ppsc' set but not used [-Wunused-but-set-variable]
> 
> drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c: In function 'halbtc_enter_lps':
> drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:318:21: warning:
>  variable 'ppsc' set but not used [-Wunused-but-set-variable]
> 
> It never used since introduction in
> commit aa45a673b291 ("rtlwifi: btcoexist: Add new mini driver")
> 
> Signed-off-by: YueHaibing <yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Acked-by: Ping-Ke Shih <pkshih-Rasf1IRRPZFBDgjK7y7TUQ@public.gmane.org>

Patch applied to wireless-drivers-next.git, thanks.

9198f460ec9d rtlwifi: btcoex: remove set but not used variable 'ppsc'

-- 
https://patchwork.kernel.org/patch/10651825/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH] brcmfmac: fix spelling mistake "Retreiving" -> "Retrieving"
From: Kalle Valo @ 2018-11-06 16:57 UTC (permalink / raw)
  To: Colin King
  Cc: Arend van Spriel, Franky Lin, Hante Meuleman, Chi-Hsien Lin,
	Wright Feng, David S . Miller, Pieter-Paul Giesberts,
	linux-wireless, brcm80211-dev-list.pdl, brcm80211-dev-list,
	netdev, kernel-janitors, linux-kernel
In-Reply-To: <20181016174342.1867-1-colin.king@canonical.com>

Colin King <colin.king@canonical.com> wrote:

> From: Colin Ian King <colin.king@canonical.com>
> 
> Trivial fix to spelling mistake in brcmf_err error message.
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>

Patch applied to wireless-drivers-next.git, thanks.

e966a79c2f76 brcmfmac: fix spelling mistake "Retreiving" -> "Retrieving"

-- 
https://patchwork.kernel.org/patch/10643927/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH] wlcore: Fix the return value in case of error in 'wlcore_vendor_cmd_smart_config_start()'
From: Kalle Valo @ 2018-11-06 16:53 UTC (permalink / raw)
  To: Christophe JAILLET
  Cc: davem, tony, linux-wireless, netdev, linux-kernel,
	kernel-janitors, Christophe JAILLET
In-Reply-To: <20181016073940.26797-1-christophe.jaillet@wanadoo.fr>

Christophe JAILLET <christophe.jaillet@wanadoo.fr> wrote:

> We return 0 unconditionally at the end of
> 'wlcore_vendor_cmd_smart_config_start()'.
> However, 'ret' is set to some error codes in several error handling paths
> and we already return some error codes at the beginning of the function.
> 
> Return 'ret' instead to propagate the error code.
> 
> Fixes: 80ff8063e87c ("wlcore: handle smart config vendor commands")
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

Patch applied to wireless-drivers-next.git, thanks.

3419348a97bc wlcore: Fix the return value in case of error in 'wlcore_vendor_cmd_smart_config_start()'

-- 
https://patchwork.kernel.org/patch/10643195/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH] libertas: remove set but not used variable 'int_type'
From: Kalle Valo @ 2018-11-06 16:52 UTC (permalink / raw)
  To: YueHaibing
  Cc: Lubomir Rintel, YueHaibing,
	libertas-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	kernel-janitors-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1539569795-176889-1-git-send-email-yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

YueHaibing <yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:

> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> drivers/net/wireless/marvell/libertas/if_spi.c: In function 'if_spi_h2c':
> drivers/net/wireless/marvell/libertas/if_spi.c:799:6: warning:
>  variable 'int_type' set but not used [-Wunused-but-set-variable]
> 
> It never used since introduction in
> commit d2b21f191753 ("libertas: if_spi, driver for libertas GSPI devices")
> 
> Signed-off-by: YueHaibing <yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Tested-by: Lubomir Rintel <lkundrak-NGH9Lh4a5iE@public.gmane.org>

Patch applied to wireless-drivers-next.git, thanks.

937a13091cbd libertas: remove set but not used variable 'int_type'

-- 
https://patchwork.kernel.org/patch/10641005/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH] rsi: fix spelling mistake "Initialzing" -> "Initializing"
From: Kalle Valo @ 2018-11-06 16:52 UTC (permalink / raw)
  To: Colin King
  Cc: David S . Miller, Amitkumar Karwar, Prameela Rani Garnepudi,
	linux-wireless, netdev, kernel-janitors, linux-kernel
In-Reply-To: <20181013153730.5761-1-colin.king@canonical.com>

Colin King <colin.king@canonical.com> wrote:

> From: Colin Ian King <colin.king@canonical.com>
> 
> Trivial fix to spelling mistake in rsi_dbg debug message
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Patch applied to wireless-drivers-next.git, thanks.

55930d2bf79b rsi: fix spelling mistake "Initialzing" -> "Initializing"

-- 
https://patchwork.kernel.org/patch/10640377/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH 3/5] VSOCK: support receive mergeable rx buffer in guest
From: jiangyiwen @ 2018-11-06  6:41 UTC (permalink / raw)
  To: Jason Wang, stefanha; +Cc: netdev, kvm, virtualization
In-Reply-To: <791d67e7-ad95-38e4-0d38-2b7c54d68040@redhat.com>

On 2018/11/6 12:00, Jason Wang wrote:
> 
> On 2018/11/5 下午3:47, jiangyiwen wrote:
>> Guest receive mergeable rx buffer, it can merge
>> scatter rx buffer into a big buffer and then copy
>> to user space.
>>
>> Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
>> ---
>>   include/linux/virtio_vsock.h            |  9 ++++
>>   net/vmw_vsock/virtio_transport.c        | 75 +++++++++++++++++++++++++++++----
>>   net/vmw_vsock/virtio_transport_common.c | 59 ++++++++++++++++++++++----
>>   3 files changed, 127 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>> index da9e1fe..6be3cd7 100644
>> --- a/include/linux/virtio_vsock.h
>> +++ b/include/linux/virtio_vsock.h
>> @@ -13,6 +13,8 @@
>>   #define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE    (1024 * 4)
>>   #define VIRTIO_VSOCK_MAX_BUF_SIZE        0xFFFFFFFFUL
>>   #define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE        (1024 * 64)
>> +/* virtio_vsock_pkt + max_pkt_len(default MAX_PKT_BUF_SIZE) */
>> +#define VIRTIO_VSOCK_MAX_MRG_BUF_NUM ((VIRTIO_VSOCK_MAX_PKT_BUF_SIZE / PAGE_SIZE) + 1)
>>
>>   /* Virtio-vsock feature */
>>   #define VIRTIO_VSOCK_F_MRG_RXBUF 0 /* Host can merge receive buffers. */
>> @@ -48,6 +50,11 @@ struct virtio_vsock_sock {
>>       struct list_head rx_queue;
>>   };
>>
>> +struct virtio_vsock_mrg_rxbuf {
>> +    void *buf;
>> +    u32 len;
>> +};
>> +
>>   struct virtio_vsock_pkt {
>>       struct virtio_vsock_hdr    hdr;
>>       struct virtio_vsock_mrg_rxbuf_hdr mrg_rxbuf_hdr;
>> @@ -59,6 +66,8 @@ struct virtio_vsock_pkt {
>>       u32 len;
>>       u32 off;
>>       bool reply;
>> +    bool mergeable;
>> +    struct virtio_vsock_mrg_rxbuf mrg_rxbuf[VIRTIO_VSOCK_MAX_MRG_BUF_NUM];
>>   };
> 
> 
> It's better to use iov here I think, and drop buf completely.
> 
> And this is better to be done in an independent patch.
> 

You're right, I can use kvec instead of customized structure,
in addition, I don't understand about drop buf completely and
an independent patch.

Thanks.

> 
>>
>>   struct virtio_vsock_pkt_info {
>> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>> index 2040a9e..3557ad3 100644
>> --- a/net/vmw_vsock/virtio_transport.c
>> +++ b/net/vmw_vsock/virtio_transport.c
>> @@ -359,11 +359,62 @@ static bool virtio_transport_more_replies(struct virtio_vsock *vsock)
>>       return val < virtqueue_get_vring_size(vq);
>>   }
>>
>> +static struct virtio_vsock_pkt *receive_mergeable(struct virtqueue *vq,
>> +        struct virtio_vsock *vsock, unsigned int *total_len)
>> +{
>> +    struct virtio_vsock_pkt *pkt;
>> +    u16 num_buf;
>> +    void *page;
>> +    unsigned int len;
>> +    int i = 0;
>> +
>> +    page = virtqueue_get_buf(vq, &len);
>> +    if (!page)
>> +        return NULL;
>> +
>> +    *total_len = len;
>> +    vsock->rx_buf_nr--;
>> +
>> +    pkt = page;
>> +    num_buf = le16_to_cpu(pkt->mrg_rxbuf_hdr.num_buffers);
>> +    if (!num_buf || num_buf > VIRTIO_VSOCK_MAX_MRG_BUF_NUM)
>> +        goto err;
>> +
>> +    pkt->mergeable = true;
>> +    if (!le32_to_cpu(pkt->hdr.len))
>> +        return pkt;
>> +
>> +    len -= sizeof(struct virtio_vsock_pkt);
>> +    pkt->mrg_rxbuf[i].buf = page + sizeof(struct virtio_vsock_pkt);
>> +    pkt->mrg_rxbuf[i].len = len;
>> +    i++;
>> +
>> +    while (--num_buf) {
>> +        page = virtqueue_get_buf(vq, &len);
>> +        if (!page)
>> +            goto err;
>> +
>> +        *total_len += len;
>> +        vsock->rx_buf_nr--;
>> +
>> +        pkt->mrg_rxbuf[i].buf = page;
>> +        pkt->mrg_rxbuf[i].len = len;
>> +        i++;
>> +    }
>> +
>> +    return pkt;
>> +err:
>> +    virtio_transport_free_pkt(pkt);
>> +    return NULL;
>> +}
> 
> 
> Similar logic could be found at virtio-net driver.
> 
> Haven't thought this deeply, but it looks to me use virtio-net driver is also possible, e.g for data-path, just register vsock specific callbacks.
> 
> 
>> +
>>   static void virtio_transport_rx_work(struct work_struct *work)
>>   {
>>       struct virtio_vsock *vsock =
>>           container_of(work, struct virtio_vsock, rx_work);
>>       struct virtqueue *vq;
>> +    size_t vsock_hlen = vsock->mergeable ? sizeof(struct virtio_vsock_pkt) :
>> +            sizeof(struct virtio_vsock_hdr);
>>
>>       vq = vsock->vqs[VSOCK_VQ_RX];
>>
>> @@ -383,21 +434,29 @@ static void virtio_transport_rx_work(struct work_struct *work)
>>                   goto out;
>>               }
>>
>> -            pkt = virtqueue_get_buf(vq, &len);
>> -            if (!pkt) {
>> -                break;
>> -            }
>> +            if (likely(vsock->mergeable)) {
>> +                pkt = receive_mergeable(vq, vsock, &len);
>> +                if (!pkt)
>> +                    break;
>>
>> -            vsock->rx_buf_nr--;
>> +                pkt->len = le32_to_cpu(pkt->hdr.len);
>> +            } else {
>> +                pkt = virtqueue_get_buf(vq, &len);
>> +                if (!pkt) {
>> +                    break;
>> +                }
>> +
>> +                vsock->rx_buf_nr--;
>> +            }
>>
>>               /* Drop short/long packets */
>> -            if (unlikely(len < sizeof(pkt->hdr) ||
>> -                     len > sizeof(pkt->hdr) + pkt->len)) {
>> +            if (unlikely(len < vsock_hlen ||
>> +                     len > vsock_hlen + pkt->len)) {
>>                   virtio_transport_free_pkt(pkt);
>>                   continue;
>>               }
>>
>> -            pkt->len = len - sizeof(pkt->hdr);
>> +            pkt->len = len - vsock_hlen;
>>               virtio_transport_deliver_tap_pkt(pkt);
>>               virtio_transport_recv_pkt(pkt);
>>           }
>> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>> index 3ae3a33..7bef1d5 100644
>> --- a/net/vmw_vsock/virtio_transport_common.c
>> +++ b/net/vmw_vsock/virtio_transport_common.c
>> @@ -272,14 +272,49 @@ static int virtio_transport_send_credit_update(struct vsock_sock *vsk,
>>            */
>>           spin_unlock_bh(&vvs->rx_lock);
>>
>> -        err = memcpy_to_msg(msg, pkt->buf + pkt->off, bytes);
>> -        if (err)
>> -            goto out;
>> +        if (pkt->mergeable) {
>> +            struct virtio_vsock_mrg_rxbuf *buf = pkt->mrg_rxbuf;
>> +            size_t mrg_copy_bytes, last_buf_total = 0, rxbuf_off;
>> +            size_t tmp_bytes = bytes;
>> +            int i;
>> +
>> +            for (i = 0; i < le16_to_cpu(pkt->mrg_rxbuf_hdr.num_buffers); i++) {
>> +                if (pkt->off > last_buf_total + buf[i].len) {
>> +                    last_buf_total += buf[i].len;
>> +                    continue;
>> +                }
>> +
>> +                rxbuf_off = pkt->off - last_buf_total;
>> +                mrg_copy_bytes = min(buf[i].len - rxbuf_off, tmp_bytes);
>> +                err = memcpy_to_msg(msg, buf[i].buf + rxbuf_off, mrg_copy_bytes);
>> +                if (err)
>> +                    goto out;
>> +
>> +                tmp_bytes -= mrg_copy_bytes;
>> +                pkt->off += mrg_copy_bytes;
>> +                last_buf_total += buf[i].len;
>> +                if (!tmp_bytes)
>> +                    break;
>> +            }
> 
> 
> After switch to use iov, you can user iov_iter helper to avoid the above open-coding I believe.
> 
> And you can also drop the if (mergeable) condition.
> 
> Thanks
> 

Thanks, I will try to use iov instead.

> 
>> +
>> +            if (tmp_bytes) {
>> +                printk(KERN_WARNING "WARNING! bytes = %llu, "
>> +                        "bytes = %llu\n",
>> +                        (unsigned long long)bytes,
>> +                        (unsigned long long)tmp_bytes);
>> +            }
>> +
>> +            total += (bytes - tmp_bytes);
>> +        } else {
>> +            err = memcpy_to_msg(msg, pkt->buf + pkt->off, bytes);
>> +            if (err)
>> +                goto out;
>> +
>> +            total += bytes;
>> +            pkt->off += bytes;
>> +        }
>>
>>           spin_lock_bh(&vvs->rx_lock);
>> -
>> -        total += bytes;
>> -        pkt->off += bytes;
>>           if (pkt->off == pkt->len) {
>>               virtio_transport_dec_rx_pkt(vvs, pkt);
>>               list_del(&pkt->list);
>> @@ -1050,8 +1085,16 @@ void virtio_transport_recv_pkt(struct virtio_vsock_pkt *pkt)
>>
>>   void virtio_transport_free_pkt(struct virtio_vsock_pkt *pkt)
>>   {
>> -    kfree(pkt->buf);
>> -    kfree(pkt);
>> +    int i;
>> +
>> +    if (pkt->mergeable) {
>> +        for (i = 1; i < le16_to_cpu(pkt->mrg_rxbuf_hdr.num_buffers); i++)
>> +            free_page((unsigned long)pkt->mrg_rxbuf[i].buf);
>> +        free_page((unsigned long)(void *)pkt);
>> +    } else {
>> +        kfree(pkt->buf);
>> +        kfree(pkt);
>> +    }
>>   }
>>   EXPORT_SYMBOL_GPL(virtio_transport_free_pkt);
>>
> 
> .
> 

^ permalink raw reply

* Re: [PATCH] ath10k: fix some typo
From: Kalle Valo @ 2018-11-06 16:01 UTC (permalink / raw)
  To: Yangtao Li
  Cc: Yangtao Li, netdev, linux-wireless, linux-kernel, ath10k, davem
In-Reply-To: <20181101152932.22684-1-tiny.windzz@gmail.com>

Yangtao Li <tiny.windzz@gmail.com> wrote:

> Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

Patch applied to ath-next branch of ath.git, thanks.

c8cb09644c6c ath10k: fix some typo

-- 
https://patchwork.kernel.org/patch/10664139/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH] ath10k: fix some typo
From: Kalle Valo @ 2018-11-06 16:01 UTC (permalink / raw)
  To: Yangtao Li
  Cc: davem, ath10k, linux-wireless, netdev, linux-kernel, Yangtao Li
In-Reply-To: <20181101152932.22684-1-tiny.windzz@gmail.com>

Yangtao Li <tiny.windzz@gmail.com> wrote:

> Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

Patch applied to ath-next branch of ath.git, thanks.

c8cb09644c6c ath10k: fix some typo

-- 
https://patchwork.kernel.org/patch/10664139/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH 2/5] VSOCK: support fill data to mergeable rx buffer in host
From: jiangyiwen @ 2018-11-06  6:30 UTC (permalink / raw)
  To: Jason Wang, stefanha; +Cc: netdev, kvm, virtualization
In-Reply-To: <485c2c5d-d73e-e679-9549-aad3de02f0ab@redhat.com>

On 2018/11/6 11:43, Jason Wang wrote:
> 
> On 2018/11/5 下午3:45, jiangyiwen wrote:
>> When vhost support VIRTIO_VSOCK_F_MRG_RXBUF feature,
>> it will merge big packet into rx vq.
>>
>> Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
>> ---
>>   drivers/vhost/vsock.c             | 117 +++++++++++++++++++++++++++++++-------
>>   include/linux/virtio_vsock.h      |   1 +
>>   include/uapi/linux/virtio_vsock.h |   5 ++
>>   3 files changed, 102 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>> index 34bc3ab..648be39 100644
>> --- a/drivers/vhost/vsock.c
>> +++ b/drivers/vhost/vsock.c
>> @@ -22,7 +22,8 @@
>>   #define VHOST_VSOCK_DEFAULT_HOST_CID    2
>>
>>   enum {
>> -    VHOST_VSOCK_FEATURES = VHOST_FEATURES,
>> +    VHOST_VSOCK_FEATURES = VHOST_FEATURES |
>> +            (1ULL << VIRTIO_VSOCK_F_MRG_RXBUF),
>>   };
>>
>>   /* Used to track all the vhost_vsock instances on the system. */
>> @@ -80,6 +81,68 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>>       return vsock;
>>   }
>>
>> +static int get_rx_bufs(struct vhost_virtqueue *vq,
>> +        struct vring_used_elem *heads, int datalen,
>> +        unsigned *iovcount, unsigned int quota)
>> +{
>> +    unsigned int out, in;
>> +    int seg = 0;
>> +    int headcount = 0;
>> +    unsigned d;
>> +    int ret;
>> +    /*
>> +     * len is always initialized before use since we are always called with
>> +     * datalen > 0.
>> +     */
>> +    u32 uninitialized_var(len);
>> +
>> +    while (datalen > 0 && headcount < quota) {
>> +        if (unlikely(seg >= UIO_MAXIOV)) {
>> +            ret = -ENOBUFS;
>> +            goto err;
>> +        }
>> +
>> +        ret = vhost_get_vq_desc(vq, vq->iov + seg,
>> +                ARRAY_SIZE(vq->iov) - seg, &out,
>> +                &in, NULL, NULL);
>> +        if (unlikely(ret < 0))
>> +            goto err;
>> +
>> +        d = ret;
>> +        if (d == vq->num) {
>> +            ret = 0;
>> +            goto err;
>> +        }
>> +
>> +        if (unlikely(out || in <= 0)) {
>> +            vq_err(vq, "unexpected descriptor format for RX: "
>> +                    "out %d, in %d\n", out, in);
>> +            ret = -EINVAL;
>> +            goto err;
>> +        }
>> +
>> +        heads[headcount].id = cpu_to_vhost32(vq, d);
>> +        len = iov_length(vq->iov + seg, in);
>> +        heads[headcount].len = cpu_to_vhost32(vq, len);
>> +        datalen -= len;
>> +        ++headcount;
>> +        seg += in;
>> +    }
>> +
>> +    heads[headcount - 1].len = cpu_to_vhost32(vq, len + datalen);
>> +    *iovcount = seg;
>> +
>> +    /* Detect overrun */
>> +    if (unlikely(datalen > 0)) {
>> +        ret = UIO_MAXIOV + 1;
>> +        goto err;
>> +    }
>> +    return headcount;
>> +err:
>> +    vhost_discard_vq_desc(vq, headcount);
>> +    return ret;
>> +}
> 
> 
> Seems duplicated with the one used by vhost-net.
> 
> In packed virtqueue implementation, I plan to move this to vhost.c.
> 

Yes, this code is full copied from vhost-net, if it can be packed into
vhost.c, it would be great.

> 
>> +
>>   static void
>>   vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
>>                   struct vhost_virtqueue *vq)
>> @@ -87,22 +150,34 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>>       struct vhost_virtqueue *tx_vq = &vsock->vqs[VSOCK_VQ_TX];
>>       bool added = false;
>>       bool restart_tx = false;
>> +    int mergeable;
>> +    size_t vsock_hlen;
>>
>>       mutex_lock(&vq->mutex);
>>
>>       if (!vq->private_data)
>>           goto out;
>>
>> +    mergeable = vhost_has_feature(vq, VIRTIO_VSOCK_F_MRG_RXBUF);
>> +    /*
>> +     * Guest fill page for rx vq in mergeable case, so it will not
>> +     * allocate pkt structure, we should reserve size of pkt in advance.
>> +     */
>> +    if (likely(mergeable))
>> +        vsock_hlen = sizeof(struct virtio_vsock_pkt);
>> +    else
>> +        vsock_hlen = sizeof(struct virtio_vsock_hdr);
>> +
>>       /* Avoid further vmexits, we're already processing the virtqueue */
>>       vhost_disable_notify(&vsock->dev, vq);
>>
>>       for (;;) {
>>           struct virtio_vsock_pkt *pkt;
>>           struct iov_iter iov_iter;
>> -        unsigned out, in;
>> +        unsigned out = 0, in = 0;
>>           size_t nbytes;
>>           size_t len;
>> -        int head;
>> +        s16 headcount;
>>
>>           spin_lock_bh(&vsock->send_pkt_list_lock);
>>           if (list_empty(&vsock->send_pkt_list)) {
>> @@ -116,16 +191,9 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>>           list_del_init(&pkt->list);
>>           spin_unlock_bh(&vsock->send_pkt_list_lock);
>>
>> -        head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
>> -                     &out, &in, NULL, NULL);
>> -        if (head < 0) {
>> -            spin_lock_bh(&vsock->send_pkt_list_lock);
>> -            list_add(&pkt->list, &vsock->send_pkt_list);
>> -            spin_unlock_bh(&vsock->send_pkt_list_lock);
>> -            break;
>> -        }
>> -
>> -        if (head == vq->num) {
>> +        headcount = get_rx_bufs(vq, vq->heads, vsock_hlen + pkt->len,
>> +                &in, likely(mergeable) ? UIO_MAXIOV : 1);
>> +        if (headcount <= 0) {
>>               spin_lock_bh(&vsock->send_pkt_list_lock);
>>               list_add(&pkt->list, &vsock->send_pkt_list);
>>               spin_unlock_bh(&vsock->send_pkt_list_lock);
>> @@ -133,19 +201,13 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>>               /* We cannot finish yet if more buffers snuck in while
>>                * re-enabling notify.
>>                */
>> -            if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
>> +            if (!headcount && unlikely(vhost_enable_notify(&vsock->dev, vq))) {
>>                   vhost_disable_notify(&vsock->dev, vq);
>>                   continue;
>>               }
>>               break;
>>           }
>>
>> -        if (out) {
>> -            virtio_transport_free_pkt(pkt);
>> -            vq_err(vq, "Expected 0 output buffers, got %u\n", out);
>> -            break;
>> -        }
>> -
>>           len = iov_length(&vq->iov[out], in);
>>           iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
>>
>> @@ -156,6 +218,19 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>>               break;
>>           }
>>
>> +        if (likely(mergeable)) {
>> +            pkt->mrg_rxbuf_hdr.num_buffers = cpu_to_le16(headcount);
>> +            nbytes = copy_to_iter(&pkt->mrg_rxbuf_hdr,
>> +                    sizeof(pkt->mrg_rxbuf_hdr), &iov_iter);
>> +            if (nbytes != sizeof(pkt->mrg_rxbuf_hdr)) {
>> +                virtio_transport_free_pkt(pkt);
>> +                vq_err(vq, "Faulted on copying rxbuf hdr\n");
>> +                break;
>> +            }
>> +            iov_iter_advance(&iov_iter, (vsock_hlen -
>> +                    sizeof(pkt->mrg_rxbuf_hdr) - sizeof(pkt->hdr)));
>> +        }
>> +
>>           nbytes = copy_to_iter(pkt->buf, pkt->len, &iov_iter);
>>           if (nbytes != pkt->len) {
>>               virtio_transport_free_pkt(pkt);
>> @@ -163,7 +238,7 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>>               break;
>>           }
>>
>> -        vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
>> +        vhost_add_used_n(vq, vq->heads, headcount);
>>           added = true;
> 
> 
> Looks rather similar to vhost-net mergeable rx buffer implementation. Another proof of using vhost-net.
> 
> Thanks

Yes.

> 
> 
>>
>>           if (pkt->reply) {
>> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>> index bf84418..da9e1fe 100644
>> --- a/include/linux/virtio_vsock.h
>> +++ b/include/linux/virtio_vsock.h
>> @@ -50,6 +50,7 @@ struct virtio_vsock_sock {
>>
>>   struct virtio_vsock_pkt {
>>       struct virtio_vsock_hdr    hdr;
>> +    struct virtio_vsock_mrg_rxbuf_hdr mrg_rxbuf_hdr;
>>       struct work_struct work;
>>       struct list_head list;
>>       /* socket refcnt not held, only use for cancellation */
>> diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
>> index 1d57ed3..2292f30 100644
>> --- a/include/uapi/linux/virtio_vsock.h
>> +++ b/include/uapi/linux/virtio_vsock.h
>> @@ -63,6 +63,11 @@ struct virtio_vsock_hdr {
>>       __le32    fwd_cnt;
>>   } __attribute__((packed));
>>
>> +/* It add mergeable rx buffers feature */
>> +struct virtio_vsock_mrg_rxbuf_hdr {
>> +    __le16  num_buffers;    /* number of mergeable rx buffers */
>> +} __attribute__((packed));
>> +
>>   enum virtio_vsock_type {
>>       VIRTIO_VSOCK_TYPE_STREAM = 1,
>>   };
> 
> .
> 

^ permalink raw reply

* Re: [PATCH bpf-next v2 02/13] bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO
From: Alexei Starovoitov @ 2018-11-06  6:29 UTC (permalink / raw)
  To: Edward Cree
  Cc: Martin Lau, Yonghong Song, Alexei Starovoitov,
	daniel@iogearbox.net, netdev@vger.kernel.org, Kernel Team
In-Reply-To: <dffb0db2-521c-0adb-ff3d-68a379d17b26@solarflare.com>

On Thu, Nov 01, 2018 at 09:08:37PM +0000, Edward Cree wrote:
> I've spent a bit more time thinking about / sleeping on this, and I
>  still think there's a major disagreement here.  Basically it seems
>  like I'm saying "the design of BTF is wrong" and you're saying "but
>  it's the design" (with the possible implication — I'm not entirely
>  sure — of "but that's what DWARF does").
> So let's back away from the details about FUNC/PROTO, and talk in
>  more general terms about what a BTF record means.
> There are two classes of things we might want to put in debug-info:
>  * There exists a type T
>  * I have an instance X (variable, subprogram, etc.) of type T
> Both of these may need to reference other types, and have the same
>  space of possible things T could be, but there the similarity ends:
>  they are semantically different things.
> Indeed, the only reason for any record of the first class is to
>  define types referenced by records of the second class.  Some
>  concrete examples of records of the second class are:
> 1) I have a map named "foo" with key-type T1 and value-type T2
> 2) I have a subprogram named "bar" with prototype T3
> 3) I am using stack slot fp-8 to store a value of type T4
> 4) I am casting ctx+8 to a pointer type T5 before dereferencing it
> Currently we have (1) and this patch series adds (2), both done
>  through records that look like they are just defining a type (i.e.
>  the first class of record) but have 'magic' semantics (in the case
>  of (1), special names of the form ____btf_map_foo.  How anyone
>  thought that was a clean and tasteful design is beyond me.)
> What IMHO the design *should* be, is that we have a 'types'
>  subsection that *only* contains records of the first class, and
>  then other subsections to hold records of the second class that
>  reference records of the first class by ID.

Such pure type approach wouldn't be practical.
BTF is not pure type information. BTF is everything that verifier needs
to know to make safety decisions that bpf instruction set doesn't have.
For example two anonymous structs:
struct {
 int a;
 int b;
} var1;
struct {
  int c;
  int d;
} var2;
from C point of view have the same type, but from BTF point of view
they are different. Names of the fields are essential part of the BTF
because the purpose of BTF is to provide information about bpf objects
for debugging and safety reasons.
Similarly
int (*) (void *src, void *dst, int len);
and
int (*) (void *dst, void *src, int len);
are the same from C and compiler point of view,
but they are different in BTF, because names carry information
that needs to be preserved.
Same goes for function declarations. The function name and argument
names are part of the 'type description'.
We shouldn't be using word 'type' in pure form otherwise
it will cause confusion like this thread demonstrated.
Beyond prog names expressed in BTF we're adding global variables support.
They will be expressed in BTF as a new KIND.
Think of all global variables in a single .c file as fields of
anonymous structure. They have offsets from beginning of .bss,
sizes, further references into btf_type_ids and most importantly names.
Another thing we're working on is spin_lock support.
There we also have to rely on BTF to make sure that the certain
bytes of map's value or cgroup local storage that belong to spin_lock
will be masked for lookup/update calls.
typedef u32 bpf_spin_lock;
will be recognized by the verifier by its name.
May be we will introduce new KIND_ for spin_lock too and convert name
into KIND in btf writer. That is TBD. The main point that
names of types, variables, functions has to be expressed in BTF
as one coherent graph of information.
Splitting pure types into one section, variables into another,
functions into yet another is not practical, since the same
modifiers (like const or volatile) need to be applied to
variables and functions. At the end all sections will have
the same style of encoding, hence no need to duplicate
the encoding three times and instead it's cleaner to encode
all of them BTF-style via different KINDs.
vmlinux's BTF will include all global variables and functions,
so that tracing scripts can reference to particular variable
or function argument by name making kernel debuging easier
and less error prone.
Consider that right now typical bcc's trace.py script looks like:
trace.py -I linux/skbuff.h -I net/sock.h \
  'skb_recv_datagram(struct sock *sk, unsigned int flags, int noblock, int *err) \
    "sk_recv_q:%llx next:%llx prev:%llx" \
    &sk->sk_receive_queue,sk->sk_receive_queue.next,sk->sk_receive_queue.prev'
where func declaration is copy pasted from the C source
and there are no guarantees that argument names and their order match
what vmlinux actually has.
With fully named BTF of vmlinux the tracing scripts will become more robust.
Note that BTF_KIND_FUNC is pretty much the same as BTF_KIND_STRUCT with
an addition of return type.
Kernel specific things like per-cpu attribute of the variable will
be BTF KIND too. This is information that tooling and kernel needs
and current BTF graph design is perfectly suited to carry such info.

^ permalink raw reply

* [PATCH] net: skbuff.h: remove unnecessary unlikely()
From: Yangtao Li @ 2018-11-06 15:45 UTC (permalink / raw)
  To: davem, edumazet, dja, willemb, ast, sbrivio, posk, pabeni, borisp
  Cc: linux-kernel, netdev, Yangtao Li

WARN_ON() already contains an unlikely(), so it's not necessary to use
unlikely.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
---
 include/linux/skbuff.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 0ba687454267..7dcfb5591dc3 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2508,10 +2508,8 @@ int ___pskb_trim(struct sk_buff *skb, unsigned int len);
 
 static inline void __skb_set_length(struct sk_buff *skb, unsigned int len)
 {
-	if (unlikely(skb_is_nonlinear(skb))) {
-		WARN_ON(1);
+	if (WARN_ON(skb_is_nonlinear(skb)))
 		return;
-	}
 	skb->len = len;
 	skb_set_tail_pointer(skb, len);
 }
-- 
2.17.0

^ permalink raw reply related

* Re: [GIT] Networking
From: Linus Torvalds @ 2018-11-06 15:45 UTC (permalink / raw)
  To: David Miller; +Cc: Andrew Morton, netdev, Linux Kernel Mailing List
In-Reply-To: <20181105.174242.2248944127095829834.davem@davemloft.net>

On Mon, Nov 5, 2018 at 5:42 PM David Miller <davem@davemloft.net> wrote:
>
> 1) Handle errors mid-stream [...]

Pulled,

                Linus

^ permalink raw reply

* Re: [PATCH V2 2/7] net: lorawan: Add LoRaWAN socket module
From: Jian-Hong Pan @ 2018-11-06 14:28 UTC (permalink / raw)
  To: David S. Miller
  Cc: Andreas Färber, netdev,
	<linux-arm-kernel@lists.infradead.org\,
	linux-kernel@vger.kernel.org>,, Marcel Holtmann, Dollar Chen,
	Ken Yu, linux-wpan - ML, Stefan Schmidt
In-Reply-To: <20181105.101610.1437737564548154497.davem@davemloft.net>

David Miller <davem@davemloft.net> 於 2018年11月6日 週二 上午2:16寫道：
>
> From: Jian-Hong Pan <starnight@g.ncu.edu.tw>
> Date: Tue,  6 Nov 2018 00:55:40 +0800
>
> > +static inline struct lrw_mac_cb * mac_cb(struct sk_buff *skb)
>
> "mac_cb()" is pretty generic for a name, and leads to namespace pollution,
> please use lrw_mac_cb() or similar.
>
> > +static inline struct dgram_sock *
> > +dgram_sk(const struct sock *sk)
> > +{
> > +     return container_of(sk, struct dgram_sock, sk);
> > +}
> > +
> > +static inline struct net_device *
> > +lrw_get_dev_by_addr(struct net *net, u32 devaddr)
>
> Never use inline for functions in a foo.c file, let the compiler decide.
>
> > +{
> > +     struct net_device *ndev = NULL;
> > +     __be32 be_addr = cpu_to_be32(devaddr);
>
> Always order local variables from longest to shortest line.
>
> > +static int
> > +dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> > +           int noblock, int flags, int *addr_len)
> > +{
> > +     struct sk_buff *skb;
> > +     size_t copied = 0;
> > +     DECLARE_SOCKADDR(struct sockaddr_lorawan *, saddr, msg->msg_name);
> > +     int err;
>
> Likewise.
>
> I'm not going to point out every single place where you have made these
> two errors.
>
> Please audit your entire submission and fix the problems wherever they
> occur.

Thanks for the reviewing.  I will check the submission again.

Jian-Hong Pan

^ permalink raw reply

* We have processed payment for freight charges for our shipment
From: salman.ahmed @ 2018-11-06  4:29 UTC (permalink / raw)
  To: Recipients

[-- Attachment #1: Mail message body --]
[-- Type: text/plain, Size: 261 bytes --]

Goodday!!!

We have processed payment for freight charges for our shipment. Enclosed below is a copy of swift confirmation for your reference.

Send us original invoice, B/L and packing list as soon as you confirm transfer details.

Account department

[-- Attachment #2: Swift confirmation.doc --]
[-- Type: application/msword, Size: 218277 bytes --]

^ permalink raw reply

* Re: [bindings][PATCH] bindings/net: DPAA Backplane Device Bindings
From: Andrew Lunn @ 2018-11-06 13:29 UTC (permalink / raw)
  To: Florinel Iordache
  Cc: robh+dt@kernel.org, mark.rutland@arm.com, broonie@kernel.org,
	horms+renesas@verge.net.au, geert+renesas@glider.be,
	linus.walleij@linaro.org, devicetree@vger.kernel.org,
	davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <1541504900-30091-1-git-send-email-florinel.iordache@nxp.com>

On Tue, Nov 06, 2018 at 11:48:30AM +0000, Florinel Iordache wrote:
> Device Tree Bindings for DPAA backplane available on Layerscape
>  communications processors.
> 
> Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>
> ---
>  .../devicetree/bindings/net/dpaa-backplane.txt     | 105 +++++++++++++++++++++
>  1 file changed, 105 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/dpaa-backplane.txt
> 
> diff --git a/Documentation/devicetree/bindings/net/dpaa-backplane.txt b/Documentation/devicetree/bindings/net/dpaa-backplane.txt
> new file mode 100644
> index 0000000..f147c84
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/dpaa-backplane.txt
> @@ -0,0 +1,105 @@
> +=============================================================================
> +DPAA Backplane Device Bindings
> +
> +CONTENTS
> +  - SerDes Node
> +  - PCS Phy Node
> +
> +=============================================================================
> +SerDes Node
> +
> +DESCRIPTION
> +
> +SerDes (Serializer/Deserializer) HW peripheral
> +

Hi Florinel

It is normal to indicate which properties are required, and which are
optional using the section headers. There is a defacto standard for
the format of a device tree binding, and this text does not follow it.

> +PROPERTIES
> +
> +- compatible
> +		Usage: required
> +		Value type: <stringlist>
> +		Definition: Specifies the type of SerDes.
> +		Must include the prefix "fsl,serdes"
> +		SerDes can be of different types:
> +		- 10G SerDes must be specified as: "fsl,serdes-10g"
> +		- 28G SerDes must be specified as: "fsl,serdes-28g"

Does a different driver get loaded depending on the compatible? I'm
just wondering if there should be one compatible string, and a speed
property.

> +
> +- reg
> +		Usage: required
> +		Value type: <prop-encoded-array>
> +		Definition: Specifies the offset of the SerDes configuration registers
> +
> +- little-endian
> +		Usage: optional
> +		Value type: <Boolean>
> +		Definition: Specifies endianness access to SerDes registers.
> +		If omitted, big-endian will be used
> +		See common-properties.txt for complete definition
> +
> +EXAMPLE
> +
> +Example of 10G SerDes node:
> +
> +serdes1: serdes@1ea0000 {
> +		compatible = "fsl,serdes-10g";
> +		reg = <0x0 0x1ea0000 0 0x00002000>;
> +		little-endian;
> +};
> +
> +=============================================================================
> +PCS Phy Node
> +
> +DESCRIPTION
> +
> +PCS Phy (Physical Coding Sublayer / Physical layer) node
> +
> +PROPERTIES
> +
> +- compatible
> +		Usage: required

This is normally optional, and defaults to c22 if not present.

> +		Value type: <stringlist>
> +		Definition: A standard property. Specifies the IEEE 802.3 Clause
> +		Different IEEE 802.3 Clauses can be specified:
> +		- Clause 22 must be specified as: "ethernet-phy-ieee802.3-c22"
> +		- Clause 45 must be specified as: "ethernet-phy-ieee802.3-c45"
> +		For complete definition see:
> +		Documentation/devicetree/bindings/net/phy.txt
> +
> +- reg
> +		Usage: required
> +		Value type: <prop-encoded-array>
> +		Definition: A standard property.
> +		Specifies the offset of the PCS Phy configuration registers
> +		For complete definition see:
> +		Documentation/devicetree/bindings/net/phy.txt

Rather than state what is already in phy.txt, just say that the
following additional properties can be used:

> +
> +- backplane-mode
> +		Usage: required

So your PHYs can only be used in backplane-mode? Base-T is never
supported?

> +		Value type: <stringlist>
> +		Definition: Specifies the speed and type of the protocol used
> +		Different speeds and backplane protocol types can be used:
> +		- 10GBase-KR must be specified as: "10gbase-kr"
> +		- 40GBase-KR must be specified as: "40gbase-kr"

I don't see why this is needed. The PHY driver should know what speeds
the PHY supports. You might want to make use of the standard max-speed
property to prevent it doing a high speed, because of layout issues,
etc.

> +
> +- fsl,lane-handle
> +		Usage: required
> +		Value type: <phandle>
> +		Definition: Specifies the reference to a node representing the SerDes
> +		device
> +
> +- fsl,lane-reg
> +		Usage: required
> +		Value type: <prop-encoded-array>
> +		Definition: Specifies the offsets of the SerDes lanes configuration
> +		registers

No other PHY driver we have needs such properties. Why are these
needed?

It is also a bit ununsed to post a binding without the code. Maybe the
code for these two drivers will help me understand.

     Thanks
	Andrew

^ permalink raw reply

* Re: [PATCH 2/5] VSOCK: support fill data to mergeable rx buffer in host
From: Jason Wang @ 2018-11-06  3:43 UTC (permalink / raw)
  To: jiangyiwen, stefanha; +Cc: netdev, kvm, virtualization
In-Reply-To: <5BDFF537.3050806@huawei.com>


On 2018/11/5 下午3:45, jiangyiwen wrote:
> When vhost support VIRTIO_VSOCK_F_MRG_RXBUF feature,
> it will merge big packet into rx vq.
>
> Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
> ---
>   drivers/vhost/vsock.c             | 117 +++++++++++++++++++++++++++++++-------
>   include/linux/virtio_vsock.h      |   1 +
>   include/uapi/linux/virtio_vsock.h |   5 ++
>   3 files changed, 102 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index 34bc3ab..648be39 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -22,7 +22,8 @@
>   #define VHOST_VSOCK_DEFAULT_HOST_CID	2
>
>   enum {
> -	VHOST_VSOCK_FEATURES = VHOST_FEATURES,
> +	VHOST_VSOCK_FEATURES = VHOST_FEATURES |
> +			(1ULL << VIRTIO_VSOCK_F_MRG_RXBUF),
>   };
>
>   /* Used to track all the vhost_vsock instances on the system. */
> @@ -80,6 +81,68 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>   	return vsock;
>   }
>
> +static int get_rx_bufs(struct vhost_virtqueue *vq,
> +		struct vring_used_elem *heads, int datalen,
> +		unsigned *iovcount, unsigned int quota)
> +{
> +	unsigned int out, in;
> +	int seg = 0;
> +	int headcount = 0;
> +	unsigned d;
> +	int ret;
> +	/*
> +	 * len is always initialized before use since we are always called with
> +	 * datalen > 0.
> +	 */
> +	u32 uninitialized_var(len);
> +
> +	while (datalen > 0 && headcount < quota) {
> +		if (unlikely(seg >= UIO_MAXIOV)) {
> +			ret = -ENOBUFS;
> +			goto err;
> +		}
> +
> +		ret = vhost_get_vq_desc(vq, vq->iov + seg,
> +				ARRAY_SIZE(vq->iov) - seg, &out,
> +				&in, NULL, NULL);
> +		if (unlikely(ret < 0))
> +			goto err;
> +
> +		d = ret;
> +		if (d == vq->num) {
> +			ret = 0;
> +			goto err;
> +		}
> +
> +		if (unlikely(out || in <= 0)) {
> +			vq_err(vq, "unexpected descriptor format for RX: "
> +					"out %d, in %d\n", out, in);
> +			ret = -EINVAL;
> +			goto err;
> +		}
> +
> +		heads[headcount].id = cpu_to_vhost32(vq, d);
> +		len = iov_length(vq->iov + seg, in);
> +		heads[headcount].len = cpu_to_vhost32(vq, len);
> +		datalen -= len;
> +		++headcount;
> +		seg += in;
> +	}
> +
> +	heads[headcount - 1].len = cpu_to_vhost32(vq, len + datalen);
> +	*iovcount = seg;
> +
> +	/* Detect overrun */
> +	if (unlikely(datalen > 0)) {
> +		ret = UIO_MAXIOV + 1;
> +		goto err;
> +	}
> +	return headcount;
> +err:
> +	vhost_discard_vq_desc(vq, headcount);
> +	return ret;
> +}


Seems duplicated with the one used by vhost-net.

In packed virtqueue implementation, I plan to move this to vhost.c.


> +
>   static void
>   vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
>   			    struct vhost_virtqueue *vq)
> @@ -87,22 +150,34 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>   	struct vhost_virtqueue *tx_vq = &vsock->vqs[VSOCK_VQ_TX];
>   	bool added = false;
>   	bool restart_tx = false;
> +	int mergeable;
> +	size_t vsock_hlen;
>
>   	mutex_lock(&vq->mutex);
>
>   	if (!vq->private_data)
>   		goto out;
>
> +	mergeable = vhost_has_feature(vq, VIRTIO_VSOCK_F_MRG_RXBUF);
> +	/*
> +	 * Guest fill page for rx vq in mergeable case, so it will not
> +	 * allocate pkt structure, we should reserve size of pkt in advance.
> +	 */
> +	if (likely(mergeable))
> +		vsock_hlen = sizeof(struct virtio_vsock_pkt);
> +	else
> +		vsock_hlen = sizeof(struct virtio_vsock_hdr);
> +
>   	/* Avoid further vmexits, we're already processing the virtqueue */
>   	vhost_disable_notify(&vsock->dev, vq);
>
>   	for (;;) {
>   		struct virtio_vsock_pkt *pkt;
>   		struct iov_iter iov_iter;
> -		unsigned out, in;
> +		unsigned out = 0, in = 0;
>   		size_t nbytes;
>   		size_t len;
> -		int head;
> +		s16 headcount;
>
>   		spin_lock_bh(&vsock->send_pkt_list_lock);
>   		if (list_empty(&vsock->send_pkt_list)) {
> @@ -116,16 +191,9 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>   		list_del_init(&pkt->list);
>   		spin_unlock_bh(&vsock->send_pkt_list_lock);
>
> -		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> -					 &out, &in, NULL, NULL);
> -		if (head < 0) {
> -			spin_lock_bh(&vsock->send_pkt_list_lock);
> -			list_add(&pkt->list, &vsock->send_pkt_list);
> -			spin_unlock_bh(&vsock->send_pkt_list_lock);
> -			break;
> -		}
> -
> -		if (head == vq->num) {
> +		headcount = get_rx_bufs(vq, vq->heads, vsock_hlen + pkt->len,
> +				&in, likely(mergeable) ? UIO_MAXIOV : 1);
> +		if (headcount <= 0) {
>   			spin_lock_bh(&vsock->send_pkt_list_lock);
>   			list_add(&pkt->list, &vsock->send_pkt_list);
>   			spin_unlock_bh(&vsock->send_pkt_list_lock);
> @@ -133,19 +201,13 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>   			/* We cannot finish yet if more buffers snuck in while
>   			 * re-enabling notify.
>   			 */
> -			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
> +			if (!headcount && unlikely(vhost_enable_notify(&vsock->dev, vq))) {
>   				vhost_disable_notify(&vsock->dev, vq);
>   				continue;
>   			}
>   			break;
>   		}
>
> -		if (out) {
> -			virtio_transport_free_pkt(pkt);
> -			vq_err(vq, "Expected 0 output buffers, got %u\n", out);
> -			break;
> -		}
> -
>   		len = iov_length(&vq->iov[out], in);
>   		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
>
> @@ -156,6 +218,19 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>   			break;
>   		}
>
> +		if (likely(mergeable)) {
> +			pkt->mrg_rxbuf_hdr.num_buffers = cpu_to_le16(headcount);
> +			nbytes = copy_to_iter(&pkt->mrg_rxbuf_hdr,
> +					sizeof(pkt->mrg_rxbuf_hdr), &iov_iter);
> +			if (nbytes != sizeof(pkt->mrg_rxbuf_hdr)) {
> +				virtio_transport_free_pkt(pkt);
> +				vq_err(vq, "Faulted on copying rxbuf hdr\n");
> +				break;
> +			}
> +			iov_iter_advance(&iov_iter, (vsock_hlen -
> +					sizeof(pkt->mrg_rxbuf_hdr) - sizeof(pkt->hdr)));
> +		}
> +
>   		nbytes = copy_to_iter(pkt->buf, pkt->len, &iov_iter);
>   		if (nbytes != pkt->len) {
>   			virtio_transport_free_pkt(pkt);
> @@ -163,7 +238,7 @@ static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
>   			break;
>   		}
>
> -		vhost_add_used(vq, head, sizeof(pkt->hdr) + pkt->len);
> +		vhost_add_used_n(vq, vq->heads, headcount);
>   		added = true;


Looks rather similar to vhost-net mergeable rx buffer implementation. 
Another proof of using vhost-net.

Thanks


>
>   		if (pkt->reply) {
> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> index bf84418..da9e1fe 100644
> --- a/include/linux/virtio_vsock.h
> +++ b/include/linux/virtio_vsock.h
> @@ -50,6 +50,7 @@ struct virtio_vsock_sock {
>
>   struct virtio_vsock_pkt {
>   	struct virtio_vsock_hdr	hdr;
> +	struct virtio_vsock_mrg_rxbuf_hdr mrg_rxbuf_hdr;
>   	struct work_struct work;
>   	struct list_head list;
>   	/* socket refcnt not held, only use for cancellation */
> diff --git a/include/uapi/linux/virtio_vsock.h b/include/uapi/linux/virtio_vsock.h
> index 1d57ed3..2292f30 100644
> --- a/include/uapi/linux/virtio_vsock.h
> +++ b/include/uapi/linux/virtio_vsock.h
> @@ -63,6 +63,11 @@ struct virtio_vsock_hdr {
>   	__le32	fwd_cnt;
>   } __attribute__((packed));
>
> +/* It add mergeable rx buffers feature */
> +struct virtio_vsock_mrg_rxbuf_hdr {
> +	__le16  num_buffers;    /* number of mergeable rx buffers */
> +} __attribute__((packed));
> +
>   enum virtio_vsock_type {
>   	VIRTIO_VSOCK_TYPE_STREAM = 1,
>   };

^ permalink raw reply

* Re: [PATCH 1/5] VSOCK: support fill mergeable rx buffer in guest
From: Jason Wang @ 2018-11-06  3:38 UTC (permalink / raw)
  To: jiangyiwen, stefanha; +Cc: netdev, kvm, virtualization
In-Reply-To: <5BDFF4FE.1080500@huawei.com>


On 2018/11/5 下午3:45, jiangyiwen wrote:
> In driver probing, if virtio has VIRTIO_VSOCK_F_MRG_RXBUF feature,
> it will fill mergeable rx buffer, support for host send mergeable
> rx buffer. It will fill a page everytime to compact with small
> packet and big packet.
>
> Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
> ---
>   include/linux/virtio_vsock.h     |  3 ++
>   net/vmw_vsock/virtio_transport.c | 72 +++++++++++++++++++++++++++++-----------
>   2 files changed, 56 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
> index e223e26..bf84418 100644
> --- a/include/linux/virtio_vsock.h
> +++ b/include/linux/virtio_vsock.h
> @@ -14,6 +14,9 @@
>   #define VIRTIO_VSOCK_MAX_BUF_SIZE		0xFFFFFFFFUL
>   #define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE		(1024 * 64)
>
> +/* Virtio-vsock feature */
> +#define VIRTIO_VSOCK_F_MRG_RXBUF 0 /* Host can merge receive buffers. */
> +
>   enum {
>   	VSOCK_VQ_RX     = 0, /* for host to guest data */
>   	VSOCK_VQ_TX     = 1, /* for guest to host data */
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> index 5d3cce9..2040a9e 100644
> --- a/net/vmw_vsock/virtio_transport.c
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -64,6 +64,7 @@ struct virtio_vsock {
>   	struct virtio_vsock_event event_list[8];
>
>   	u32 guest_cid;
> +	bool mergeable;
>   };
>
>   static struct virtio_vsock *virtio_vsock_get(void)
> @@ -256,6 +257,25 @@ static int virtio_transport_send_pkt_loopback(struct virtio_vsock *vsock,
>   	return 0;
>   }
>
> +static int fill_mergeable_rx_buff(struct virtqueue *vq)
> +{
> +	void *page = NULL;
> +	struct scatterlist sg;
> +	int err;
> +
> +	page = (void *)get_zeroed_page(GFP_KERNEL);


Any reason to use zeroed page?


> +	if (!page)
> +		return -ENOMEM;
> +
> +	sg_init_one(&sg, page, PAGE_SIZE);


FYI, for virtio-net we have several optimizations for mergeable rx buffer:

- skb_page_frag_refill() which can use high order page and reduce the 
stress of page allocator

- we don't use fixed buffer size, instead we use EWMA to estimate the 
possible rx buffer size to avoid internal fragmentation


If we can try to reuse virtio-net driver, we will get those nice features.


Thanks


> +
> +	err = virtqueue_add_inbuf(vq, &sg, 1, page, GFP_KERNEL);
> +	if (err < 0)
> +		free_page((unsigned long) page);
> +
> +	return err;
> +}
> +
>   static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
>   {
>   	int buf_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
> @@ -267,27 +287,33 @@ static void virtio_vsock_rx_fill(struct virtio_vsock *vsock)
>   	vq = vsock->vqs[VSOCK_VQ_RX];
>
>   	do {
> -		pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
> -		if (!pkt)
> -			break;
> +		if (vsock->mergeable) {
> +			ret = fill_mergeable_rx_buff(vq);
> +			if (ret)
> +				break;
> +		} else {
> +			pkt = kzalloc(sizeof(*pkt), GFP_KERNEL);
> +			if (!pkt)
> +				break;
>
> -		pkt->buf = kmalloc(buf_len, GFP_KERNEL);
> -		if (!pkt->buf) {
> -			virtio_transport_free_pkt(pkt);
> -			break;
> -		}
> +			pkt->buf = kmalloc(buf_len, GFP_KERNEL);
> +			if (!pkt->buf) {
> +				virtio_transport_free_pkt(pkt);
> +				break;
> +			}
>
> -		pkt->len = buf_len;
> +			pkt->len = buf_len;
>
> -		sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
> -		sgs[0] = &hdr;
> +			sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
> +			sgs[0] = &hdr;
>
> -		sg_init_one(&buf, pkt->buf, buf_len);
> -		sgs[1] = &buf;
> -		ret = virtqueue_add_sgs(vq, sgs, 0, 2, pkt, GFP_KERNEL);
> -		if (ret) {
> -			virtio_transport_free_pkt(pkt);
> -			break;
> +			sg_init_one(&buf, pkt->buf, buf_len);
> +			sgs[1] = &buf;
> +			ret = virtqueue_add_sgs(vq, sgs, 0, 2, pkt, GFP_KERNEL);
> +			if (ret) {
> +				virtio_transport_free_pkt(pkt);
> +				break;
> +			}
>   		}
>   		vsock->rx_buf_nr++;
>   	} while (vq->num_free);
> @@ -588,6 +614,9 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
>   	if (ret < 0)
>   		goto out_vqs;
>
> +	if (virtio_has_feature(vdev, VIRTIO_VSOCK_F_MRG_RXBUF))
> +		vsock->mergeable = true;
> +
>   	vsock->rx_buf_nr = 0;
>   	vsock->rx_buf_max_nr = 0;
>   	atomic_set(&vsock->queued_replies, 0);
> @@ -640,8 +669,12 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>   	vdev->config->reset(vdev);
>
>   	mutex_lock(&vsock->rx_lock);
> -	while ((pkt = virtqueue_detach_unused_buf(vsock->vqs[VSOCK_VQ_RX])))
> -		virtio_transport_free_pkt(pkt);
> +	while ((pkt = virtqueue_detach_unused_buf(vsock->vqs[VSOCK_VQ_RX]))) {
> +		if (vsock->mergeable)
> +			free_page((unsigned long)(void *)pkt);
> +		else
> +			virtio_transport_free_pkt(pkt);
> +	}
>   	mutex_unlock(&vsock->rx_lock);
>
>   	mutex_lock(&vsock->tx_lock);
> @@ -683,6 +716,7 @@ static void virtio_vsock_remove(struct virtio_device *vdev)
>   };
>
>   static unsigned int features[] = {
> +	VIRTIO_VSOCK_F_MRG_RXBUF,
>   };
>
>   static struct virtio_driver virtio_vsock_driver = {

^ permalink raw reply

* Re: [PATCH 0/5] VSOCK: support mergeable rx buffer in vhost-vsock
From: Jason Wang @ 2018-11-06  3:32 UTC (permalink / raw)
  To: jiangyiwen, stefanha; +Cc: netdev, kvm, virtualization
In-Reply-To: <5BE107B5.2050900@huawei.com>


On 2018/11/6 上午11:17, jiangyiwen wrote:
> On 2018/11/6 10:41, Jason Wang wrote:
>> On 2018/11/6 上午10:17, jiangyiwen wrote:
>>> On 2018/11/5 17:21, Jason Wang wrote:
>>>> On 2018/11/5 下午3:43, jiangyiwen wrote:
>>>>> Now vsock only support send/receive small packet, it can't achieve
>>>>> high performance. As previous discussed with Jason Wang, I revisit the
>>>>> idea of vhost-net about mergeable rx buffer and implement the mergeable
>>>>> rx buffer in vhost-vsock, it can allow big packet to be scattered in
>>>>> into different buffers and improve performance obviously.
>>>>>
>>>>> I write a tool to test the vhost-vsock performance, mainly send big
>>>>> packet(64K) included guest->Host and Host->Guest. The result as
>>>>> follows:
>>>>>
>>>>> Before performance:
>>>>>                  Single socket            Multiple sockets(Max Bandwidth)
>>>>> Guest->Host   ~400MB/s                 ~480MB/s
>>>>> Host->Guest   ~1450MB/s                ~1600MB/s
>>>>>
>>>>> After performance:
>>>>>                  Single socket            Multiple sockets(Max Bandwidth)
>>>>> Guest->Host   ~1700MB/s                ~2900MB/s
>>>>> Host->Guest   ~1700MB/s                ~2900MB/s
>>>>>
>>>>>    From the test results, the performance is improved obviously, and guest
>>>>> memory will not be wasted.
>>>> Hi:
>>>>
>>>> Thanks for the patches and the numbers are really impressive.
>>>>
>>>> But instead of duplicating codes between sock and net. I was considering to use virtio-net as a transport of vsock. Then we may have all existed features likes batching, mergeable rx buffers and multiqueue. Want to consider this idea? Thoughts?
>>>>
>>>>
>>> Hi Jason,
>>>
>>> I am not very familiar with virtio-net, so I am afraid I can't give too
>>> much effective advice. Then I have several problems:
>>>
>>> 1. If use virtio-net as a transport, guest should see a virtio-net
>>> device instead of virtio-vsock device, right? Is vsock only as a
>>> transport between socket and net_device? User should still use
>>> AF_VSOCK type to create socket, right?
>>
>> Well, there're many choices. What you need is just to keep the socket API and hide the implementation. For example, you can keep the vosck device in guest and switch to use vhost-net in host. We probably need a new feature bit or header to let vhost know we are passing vsock packet. And vhost-net could forward the packet to vsock core on host.
>>
>>
>>> 2. I want to know if this idea has already started, and how is
>>> the current progress?
>>
>> Not yet started.  Just want to listen from the community. If this sounds good, do you have interest in implementing this?
>>
>>
>>> 3. And what is stefan's idea?
>>
>> Talk with Stefan a little on this during KVM Forum. I think he tends to agree on this idea. Anyway, let's wait for his reply.
>>
>>
>> Thanks
>>
>>
> Hi Jason,
>
> Thanks your reply, what you want is try to avoid duplicate code, and still
> use the existed features with virtio-net.


Yes, technically we can use virtio-net driver is guest as well but we 
could do it step by step.


> Yes, if this sounds good and most people can recognize this idea, I am very
> happy to implement this.


Cool, thanks.


>
> In addition, I hope you can review these patches before the new idea is
> implemented, after all the performance can be improved. :-)


Ok.


So the patch actually did three things:

- mergeable buffer implementation

- increase the default rx buffer size

- add used and signal guest in a batch

It would be helpful if you can measure the performance improvement 
independently. This can give reviewer a better understanding on how much 
did each part help.

Thanks


>
> Thanks,
> Yiwen.
>
>>> Thanks,
>>> Yiwen.
>>>
>> .
>>
>

^ permalink raw reply

* [PATCH iproute2] tc: f_u32: allow skip_hw and skip_sw flags to be last
From: Jakub Kicinski @ 2018-11-06  3:23 UTC (permalink / raw)
  To: stephen, dsahern; +Cc: netdev, oss-drivers, Jakub Kicinski

u32 uses NEXT_ARG() incorrectly when parsing skip_hw and skip_sw
flags.  NEXT_ARG() ensures there is another argument on the command
line, and is used in handling <keyword> <value> syntax to move past
<keyword> and ensure there is a <value> to read.

Commit 5e5b3008d1fb ("tc: f_u32: Add support for skip_hw and skip_sw
flags") seems to have copy pasted the handling from the previous
command - "police", which needs an extra parameter and is kind of
special due to the use of parse_police() helper.

The combination of NEXT_ARG() and continue worked fine as long as
skip_sw/skip_hw wasn't last, e.g.:

$ tc filter add dev dummy0 ingress prio 101 protocol ipv6 \
    u32 match ip6 priority 0xa0 0xe0 skip_hw action pass

But would fail if it was last:

$ tc filter add dev dummy0 ingress prio 101 protocol ipv6 \
    u32 match ip6 priority 0xa0 0xe0 flowid :1 skip_hw
Command line is not complete. Try option "help"

Remove the NEXT_ARG()s and the continues, and let the argc--; argv++;
at the end of the loop do its job.

Fixes: 5e5b3008d1fb ("tc: f_u32: Add support for skip_hw and skip_sw flags")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 tc/f_u32.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/tc/f_u32.c b/tc/f_u32.c
index bff4be637728..e0a322d5a11c 100644
--- a/tc/f_u32.c
+++ b/tc/f_u32.c
@@ -1147,13 +1147,9 @@ static int u32_parse_opt(struct filter_util *qu, char *handle,
 			terminal_ok++;
 			continue;
 		} else if (strcmp(*argv, "skip_hw") == 0) {
-			NEXT_ARG();
 			flags |= TCA_CLS_FLAGS_SKIP_HW;
-			continue;
 		} else if (strcmp(*argv, "skip_sw") == 0) {
-			NEXT_ARG();
 			flags |= TCA_CLS_FLAGS_SKIP_SW;
-			continue;
 		} else if (strcmp(*argv, "help") == 0) {
 			explain();
 			return -1;
-- 
2.17.1

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox