Linux-HyperV List
* [PATCH 1/2] clocksource/Hyper-v: Allocate Hyper-V tsc page statically
From: lantianyu1986 @ 2019-07-29  7:52 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, hpa, x86, kys, haiyangz, sthemmin, sashal,
	daniel.lezcano, arnd, michael.h.kelley, ashal
  Cc: Tianyu Lan, linux-kernel, linux-hyperv, linux-arch
In-Reply-To: <20190729075243.22745-1-Tianyu.Lan@microsoft.com>

From: Tianyu Lan <Tianyu.Lan@microsoft.com>

Prepare to add a Hyper-V sched clock callback and to move the
Hyper-V reference TSC initialization much earlier in the boot
process, when the timestamp is 0, so that no discontinuity is
observed when pv_ops.time.sched_clock calculates its offset. This
earlier initialization requires that the Hyper-V TSC page be allocated
statically instead of with vmalloc(), so fix up the references
to the TSC page and the method of getting its physical address.

Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
---
 arch/x86/entry/vdso/vma.c          |  2 +-
 drivers/clocksource/hyperv_timer.c | 12 ++++--------
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 349a61d8bf34..f5937742b290 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -122,7 +122,7 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
 
 		if (tsc_pg && vclock_was_used(VCLOCK_HVCLOCK))
 			return vmf_insert_pfn(vma, vmf->address,
-					vmalloc_to_pfn(tsc_pg));
+					virt_to_phys(tsc_pg) >> PAGE_SHIFT);
 	}
 
 	return VM_FAULT_SIGBUS;
diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index ba2c79e6a0ee..86764ec9a854 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -214,17 +214,17 @@ EXPORT_SYMBOL_GPL(hyperv_cs);
 
 #ifdef CONFIG_HYPERV_TSCPAGE
 
-static struct ms_hyperv_tsc_page *tsc_pg;
+static struct ms_hyperv_tsc_page tsc_pg __aligned(PAGE_SIZE);
 
 struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
 {
-	return tsc_pg;
+	return &tsc_pg;
 }
 EXPORT_SYMBOL_GPL(hv_get_tsc_page);
 
 static u64 notrace read_hv_sched_clock_tsc(void)
 {
-	u64 current_tick = hv_read_tsc_page(tsc_pg);
+	u64 current_tick = hv_read_tsc_page(&tsc_pg);
 
 	if (current_tick == U64_MAX)
 		hv_get_time_ref_count(current_tick);
@@ -280,12 +280,8 @@ static bool __init hv_init_tsc_clocksource(void)
 	if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
 		return false;
 
-	tsc_pg = vmalloc(PAGE_SIZE);
-	if (!tsc_pg)
-		return false;
-
 	hyperv_cs = &hyperv_cs_tsc;
-	phys_addr = page_to_phys(vmalloc_to_page(tsc_pg));
+	phys_addr = virt_to_phys(&tsc_pg) & PAGE_MASK;
 
 	/*
 	 * The Hyper-V TLFS specifies to preserve the value of reserved
-- 
2.14.5
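
The address arithmetic in the patch above relies on the static TSC page being page-aligned. A minimal user-space sketch of that arithmetic follows. It assumes 4 KiB pages, and it uses an ordinary virtual address as a stand-in for the physical address that virt_to_phys() would return in the kernel. All names in it (SKETCH_PAGE_SHIFT, static_page, addr_to_pfn, page_base) are hypothetical and are not kernel symbols.

```c
#include <stdint.h>

/* Assumed page geometry for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical page-sized object, standing in for struct ms_hyperv_tsc_page. */
struct page_sized {
	unsigned char bytes[SKETCH_PAGE_SIZE];
};

/* Statically allocated and page-aligned, like tsc_pg after the patch. */
static struct page_sized static_page __attribute__((aligned(SKETCH_PAGE_SIZE)));

/* Page frame number: the address with the in-page offset bits shifted out. */
static uintptr_t addr_to_pfn(uintptr_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Base address of the page containing addr. */
static uintptr_t page_base(uintptr_t addr)
{
	return addr & SKETCH_PAGE_MASK;
}

/* Address of the static page, as an integer. */
static uintptr_t static_page_addr(void)
{
	return (uintptr_t)&static_page;
}
```

Because the object is aligned to the page size, page_base(static_page_addr()) returns the address unchanged, and shifting the page frame number back left by SKETCH_PAGE_SHIFT recovers the same address.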



* [PATCH 0/2] clocksource/Hyper-V: Add Hyper-V specific sched clock function
From: lantianyu1986 @ 2019-07-29  7:52 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, hpa, x86, kys, haiyangz, sthemmin, sashal,
	daniel.lezcano, arnd, michael.h.kelley, ashal
  Cc: Tianyu Lan, linux-arch, linux-hyperv, linux-kernel

From: Tianyu Lan <Tianyu.Lan@microsoft.com>

Hyper-V guests use the default native_sched_clock() in pv_ops.time.sched_clock
on x86. But native_sched_clock() directly uses the raw TSC value, which
can be discontinuous in a Hyper-V VM. Add the generic hv_setup_sched_clock()
to set the sched clock function appropriately. On x86, this sets
pv_ops.time.sched_clock to read the Hyper-V reference TSC value that is
scaled and adjusted to be continuous.

Also move the Hyper-V reference TSC initialization much earlier in the boot
process so no discontinuity is observed when pv_ops.time.sched_clock
calculates its offset. This earlier initialization requires that the Hyper-V
TSC page be allocated statically instead of with vmalloc(), so fix up the
references to the TSC page and the method of getting its physical address.

Tianyu Lan (2):
  clocksource/Hyper-v: Allocate Hyper-V tsc page statically
  clocksource/Hyper-V: Add Hyper-V specific sched clock function

 arch/x86/entry/vdso/vma.c          |  2 +-
 arch/x86/hyperv/hv_init.c          |  2 --
 arch/x86/kernel/cpu/mshyperv.c     |  8 ++++++++
 drivers/clocksource/hyperv_timer.c | 34 ++++++++++++++++------------------
 include/asm-generic/mshyperv.h     |  1 +
 5 files changed, 26 insertions(+), 21 deletions(-)

-- 
2.14.5
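
To illustrate what "scaled and adjusted" means in the cover letter above, here is a minimal user-space sketch of a reference TSC page read. It assumes the layout and formula from the Hyper-V TLFS (reference time = ((tsc * scale) >> 64) + offset, with a sequence value of 0 marking the page as invalid). The struct and function names are hypothetical and are not the kernel's. The sketch omits the sequence re-check loop that a real reader needs while the hypervisor may be updating the page, and it relies on the GCC/Clang unsigned __int128 extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the reference TSC page fields (TLFS layout assumed). */
struct tsc_page_sketch {
	uint32_t sequence;	/* 0 means the page contents are not valid */
	uint32_t reserved;
	uint64_t scale;		/* 64.64 fixed-point multiplier */
	int64_t offset;		/* added after scaling */
};

/*
 * Convert a raw TSC value to reference time.
 * Returns UINT64_MAX when the page is invalid, so the caller can fall back
 * to another time source, as read_hv_sched_clock_tsc() does with U64_MAX.
 */
static uint64_t read_ref_time(const struct tsc_page_sketch *pg, uint64_t tsc)
{
	unsigned __int128 product;

	assert(pg != NULL);
	if (pg->sequence == 0)
		return UINT64_MAX;

	/* Keep the upper 64 bits of the 128-bit product, then add the offset. */
	product = (unsigned __int128)tsc * pg->scale;
	return (uint64_t)(product >> 64) + (uint64_t)pg->offset;
}
```

With scale set to 1ULL << 63 (a multiplier of 0.5) and offset set to 100, a raw TSC value of 1000 maps to 600. The hypervisor can change scale and offset together, which is how the reference time can stay continuous when the raw TSC does not.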



* [PATCH net] hv_sock: Fix hang when a connection is closed
From: Dexuan Cui @ 2019-07-28 18:32 UTC (permalink / raw)
  To: Sunil Muthuswamy, David Miller, netdev@vger.kernel.org
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, olaf@aepfle.de, apw@canonical.com,
	jasowang@redhat.com, vkuznets, marcelo.cerri@canonical.com


hvs_do_close_lock_held() may decrease the reference count to 0 and free the
sk struct completely, and then the following release_sock(sk) may hang.

Fixes: a9eeb998c28d ("hv_sock: Add support for delayed close")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: stable@vger.kernel.org

---
With the proper kernel debugging options enabled, first a warning can
appear:

kworker/1:0/4467 is freeing memory ..., with a lock still held there!
stack backtrace:
Workqueue: events vmbus_onmessage_work [hv_vmbus]
Call Trace:
 dump_stack+0x67/0x90
 debug_check_no_locks_freed.cold.52+0x78/0x7d
 slab_free_freelist_hook+0x85/0x140
 kmem_cache_free+0xa5/0x380
 __sk_destruct+0x150/0x260
 hvs_close_connection+0x24/0x30 [hv_sock]
 vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
 process_one_work+0x241/0x600
 worker_thread+0x3c/0x390
 kthread+0x11b/0x140
 ret_from_fork+0x24/0x30

and then the following release_sock(sk) can hang:

watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:4467]
...
irq event stamp: 62890
CPU: 1 PID: 4467 Comm: kworker/1:0 Tainted: G        W         5.2.0+ #39
Workqueue: events vmbus_onmessage_work [hv_vmbus]
RIP: 0010:queued_spin_lock_slowpath+0x2b/0x1e0
...
Call Trace:
 do_raw_spin_lock+0xab/0xb0
 release_sock+0x19/0xb0
 vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
 process_one_work+0x241/0x600
 worker_thread+0x3c/0x390
 kthread+0x11b/0x140
 ret_from_fork+0x24/0x30

 net/vmw_vsock/hyperv_transport.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index f2084e3f7aa4..efbda8ef1eff 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -309,9 +309,16 @@ static void hvs_close_connection(struct vmbus_channel *chan)
 {
 	struct sock *sk = get_per_channel_state(chan);
 
+	/* Grab an extra reference since hvs_do_close_lock_held() may decrease
+	 * the reference count to 0 by calling sock_put(sk).
+	 */
+	sock_hold(sk);
+
 	lock_sock(sk);
 	hvs_do_close_lock_held(vsock_sk(sk), true);
 	release_sock(sk);
+
+	sock_put(sk);
 }
 
 static void hvs_open_connection(struct vmbus_channel *chan)
-- 
2.19.1
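
The hold/put pairing in the fix above can be modeled in user space. The sketch below is only an illustration under stated assumptions: struct obj, obj_hold(), obj_put(), do_close_lock_held() and close_connection() are hypothetical stand-ins for struct sock, sock_hold(), sock_put(), hvs_do_close_lock_held() and hvs_close_connection(). Instead of freeing memory, the sketch sets a flag, so the unsafe path can be observed without actually touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock: a refcount, a lock flag, a freed flag. */
struct obj {
	int refcnt;
	bool locked;
	bool freed;
};

static void obj_hold(struct obj *o)
{
	o->refcnt++;
}

static void obj_put(struct obj *o)
{
	assert(o->refcnt > 0);
	if (--o->refcnt == 0)
		o->freed = true;	/* real code would free the object here */
}

/* Stand-in for hvs_do_close_lock_held(): may drop the last reference. */
static void do_close_lock_held(struct obj *o)
{
	obj_put(o);
}

/*
 * Returns true if the object was still alive when the lock was released.
 * With extra_ref == false this models the code before the fix.
 */
static bool close_connection(struct obj *o, bool extra_ref)
{
	bool alive_at_unlock;

	if (extra_ref)
		obj_hold(o);

	o->locked = true;
	do_close_lock_held(o);

	/* Releasing a lock inside a freed object is a use-after-free. */
	alive_at_unlock = !o->freed;
	if (alive_at_unlock)
		o->locked = false;

	if (extra_ref)
		obj_put(o);

	return alive_at_unlock;
}
```

Starting from a reference count of 1, close_connection(o, false) returns false because the close path drops the last reference before the unlock. close_connection(o, true) returns true, and the object is only marked freed by the final obj_put() after the lock has been released.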



* Re: [PATCH] hv_sock: use HV_HYP_PAGE_SIZE instead of PAGE_SIZE_4K
From: kbuild test robot @ 2019-07-28  4:06 UTC (permalink / raw)
  To: Himadri Pandya
  Cc: kbuild-all, mikelley, kys, haiyangz, sthemmin, sashal, davem,
	linux-hyperv, netdev, linux-kernel, Himadri Pandya
In-Reply-To: <20190725051125.10605-1-himadri18.07@gmail.com>

Hi Himadri,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[cannot apply to v5.3-rc1 next-20190726]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Himadri-Pandya/hv_sock-use-HV_HYP_PAGE_SIZE-instead-of-PAGE_SIZE_4K/20190726-085229
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.1-rc1-7-g2b96cd8-dirty
        make ARCH=x86_64 allmodconfig
        make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add the following tag
Reported-by: kbuild test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)

   include/linux/sched.h:609:43: sparse: sparse: bad integer constant expression
   include/linux/sched.h:609:73: sparse: sparse: invalid named zero-width bitfield `value'
   include/linux/sched.h:610:43: sparse: sparse: bad integer constant expression
   include/linux/sched.h:610:67: sparse: sparse: invalid named zero-width bitfield `bucket_id'
   net/vmw_vsock/hyperv_transport.c:214:39: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:214:39: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
>> net/vmw_vsock/hyperv_transport.c:214:39: sparse: sparse: incompatible types for operation (-)
>> net/vmw_vsock/hyperv_transport.c:214:39: sparse:    left side has type bad type
>> net/vmw_vsock/hyperv_transport.c:214:39: sparse:    right side has type int
   net/vmw_vsock/hyperv_transport.c:214:39: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
>> net/vmw_vsock/hyperv_transport.c:214:39: sparse: sparse: incompatible types for operation (-)
>> net/vmw_vsock/hyperv_transport.c:214:39: sparse:    left side has type bad type
>> net/vmw_vsock/hyperv_transport.c:214:39: sparse:    right side has type int
   net/vmw_vsock/hyperv_transport.c:65:17: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
>> net/vmw_vsock/hyperv_transport.c:65:17: sparse: sparse: bad constant expression type
   net/vmw_vsock/hyperv_transport.c:387:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:388:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
>> net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
>> net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
>> net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
>> net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
>> net/vmw_vsock/hyperv_transport.c:390:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:391:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:392:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:392:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:392:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:392:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:393:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:394:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:395:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:395:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:395:26: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:395:26: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:465:25: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:466:25: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:666:9: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: cast from unknown type
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: undefined identifier 'HV_HYP_PAGE_SIZE'
   net/vmw_vsock/hyperv_transport.c:681:28: sparse: sparse: cast from unknown type

vim +214 net/vmw_vsock/hyperv_transport.c

ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   59  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   60  struct hvs_send_buf {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   61  	/* The header before the payload data */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   62  	struct vmpipe_proto_header hdr;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   63  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   64  	/* The payload */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  @65  	u8 data[HVS_SEND_BUF_SIZE];
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   66  };
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   67  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   68  #define HVS_HEADER_LEN	(sizeof(struct vmpacket_descriptor) + \
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   69  			 sizeof(struct vmpipe_proto_header))
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   70  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   71  /* See 'prev_indices' in hv_ringbuffer_read(), hv_ringbuffer_write(), and
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   72   * __hv_pkt_iter_next().
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   73   */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   74  #define VMBUS_PKT_TRAILER_SIZE	(sizeof(u64))
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   75  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   76  #define HVS_PKT_LEN(payload_len)	(HVS_HEADER_LEN + \
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   77  					 ALIGN((payload_len), 8) + \
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   78  					 VMBUS_PKT_TRAILER_SIZE)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   79  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   80  union hvs_service_id {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   81  	uuid_le	srv_id;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   82  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   83  	struct {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   84  		unsigned int svm_port;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   85  		unsigned char b[sizeof(uuid_le) - sizeof(unsigned int)];
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   86  	};
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   87  };
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   88  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   89  /* Per-socket state (accessed via vsk->trans) */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   90  struct hvsock {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   91  	struct vsock_sock *vsk;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   92  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   93  	uuid_le vm_srv_id;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   94  	uuid_le host_srv_id;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   95  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   96  	struct vmbus_channel *chan;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   97  	struct vmpacket_descriptor *recv_desc;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   98  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26   99  	/* The length of the payload not delivered to userland yet */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  100  	u32 recv_data_len;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  101  	/* The offset of the payload */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  102  	u32 recv_data_off;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  103  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  104  	/* Have we sent the zero-length packet (FIN)? */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  105  	bool fin_sent;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  106  };
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  107  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  108  /* In the VM, we support Hyper-V Sockets with AF_VSOCK, and the endpoint is
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  109   * <cid, port> (see struct sockaddr_vm). Note: cid is not really used here:
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  110   * when we write apps to connect to the host, we can only use VMADDR_CID_ANY
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  111   * or VMADDR_CID_HOST (both are equivalent) as the remote cid, and when we
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  112   * write apps to bind() & listen() in the VM, we can only use VMADDR_CID_ANY
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  113   * as the local cid.
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  114   *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  115   * On the host, Hyper-V Sockets are supported by Winsock AF_HYPERV:
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  116   * https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/user-
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  117   * guide/make-integration-service, and the endpoint is <VmID, ServiceId> with
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  118   * the below sockaddr:
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  119   *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  120   * struct SOCKADDR_HV
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  121   * {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  122   *    ADDRESS_FAMILY Family;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  123   *    USHORT Reserved;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  124   *    GUID VmId;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  125   *    GUID ServiceId;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  126   * };
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  127   * Note: VmID is not used by Linux VM and actually it isn't transmitted via
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  128   * VMBus, because here it's obvious the host and the VM can easily identify
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  129   * each other. Though the VmID is useful on the host, especially in the case
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  130   * of Windows container, Linux VM doesn't need it at all.
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  131   *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  132   * To make use of the AF_VSOCK infrastructure in Linux VM, we have to limit
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  133   * the available GUID space of SOCKADDR_HV so that we can create a mapping
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  134   * between AF_VSOCK port and SOCKADDR_HV Service GUID. The rule of writing
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  135   * Hyper-V Sockets apps on the host and in Linux VM is:
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  136   *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  137   ****************************************************************************
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  138   * The only valid Service GUIDs, from the perspectives of both the host and *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  139   * Linux VM, that can be connected by the other end, must conform to this   *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  140   * format: <port>-facb-11e6-bd58-64006a7986d3, and the "port" must be in    *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  141   * this range [0, 0x7FFFFFFF].                                              *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  142   ****************************************************************************
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  143   *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  144   * When we write apps on the host to connect(), the GUID ServiceID is used.
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  145   * When we write apps in Linux VM to connect(), we only need to specify the
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  146   * port and the driver will form the GUID and use that to request the host.
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  147   *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  148   * From the perspective of Linux VM:
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  149   * 1. the local ephemeral port (i.e. the local auto-bound port when we call
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  150   * connect() without explicit bind()) is generated by __vsock_bind_stream(),
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  151   * and the range is [1024, 0xFFFFFFFF).
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  152   * 2. the remote ephemeral port (i.e. the auto-generated remote port for
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  153   * a connect request initiated by the host's connect()) is generated by
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  154   * hvs_remote_addr_init() and the range is [0x80000000, 0xFFFFFFFF).
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  155   */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  156  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  157  #define MAX_LISTEN_PORT			((u32)0x7FFFFFFF)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  158  #define MAX_VM_LISTEN_PORT		MAX_LISTEN_PORT
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  159  #define MAX_HOST_LISTEN_PORT		MAX_LISTEN_PORT
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  160  #define MIN_HOST_EPHEMERAL_PORT		(MAX_HOST_LISTEN_PORT + 1)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  161  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  162  /* 00000000-facb-11e6-bd58-64006a7986d3 */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  163  static const uuid_le srv_id_template =
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  164  	UUID_LE(0x00000000, 0xfacb, 0x11e6, 0xbd, 0x58,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  165  		0x64, 0x00, 0x6a, 0x79, 0x86, 0xd3);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  166  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  167  static bool is_valid_srv_id(const uuid_le *id)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  168  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  169  	return !memcmp(&id->b[4], &srv_id_template.b[4], sizeof(uuid_le) - 4);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  170  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  171  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  172  static unsigned int get_port_by_srv_id(const uuid_le *svr_id)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  173  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  174  	return *((unsigned int *)svr_id);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  175  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  176  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  177  static void hvs_addr_init(struct sockaddr_vm *addr, const uuid_le *svr_id)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  178  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  179  	unsigned int port = get_port_by_srv_id(svr_id);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  180  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  181  	vsock_addr_init(addr, VMADDR_CID_ANY, port);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  182  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  183  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  184  static void hvs_remote_addr_init(struct sockaddr_vm *remote,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  185  				 struct sockaddr_vm *local)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  186  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  187  	static u32 host_ephemeral_port = MIN_HOST_EPHEMERAL_PORT;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  188  	struct sock *sk;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  189  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  190  	vsock_addr_init(remote, VMADDR_CID_ANY, VMADDR_PORT_ANY);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  191  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  192  	while (1) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  193  		/* Wrap around ? */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  194  		if (host_ephemeral_port < MIN_HOST_EPHEMERAL_PORT ||
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  195  		    host_ephemeral_port == VMADDR_PORT_ANY)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  196  			host_ephemeral_port = MIN_HOST_EPHEMERAL_PORT;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  197  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  198  		remote->svm_port = host_ephemeral_port++;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  199  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  200  		sk = vsock_find_connected_socket(remote, local);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  201  		if (!sk) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  202  			/* Found an available ephemeral port */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  203  			return;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  204  		}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  205  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  206  		/* Release refcnt got in vsock_find_connected_socket */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  207  		sock_put(sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  208  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  209  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  210  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  211  static void hvs_set_channel_pending_send_size(struct vmbus_channel *chan)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  212  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  213  	set_channel_pending_send_size(chan,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26 @214  				      HVS_PKT_LEN(HVS_SEND_BUF_SIZE));
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  215  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  216  	virt_mb();
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  217  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  218  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  219  static bool hvs_channel_readable(struct vmbus_channel *chan)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  220  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  221  	u32 readable = hv_get_bytes_to_read(&chan->inbound);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  222  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  223  	/* 0-size payload means FIN */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  224  	return readable >= HVS_PKT_LEN(0);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  225  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  226  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  227  static int hvs_channel_readable_payload(struct vmbus_channel *chan)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  228  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  229  	u32 readable = hv_get_bytes_to_read(&chan->inbound);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  230  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  231  	if (readable > HVS_PKT_LEN(0)) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  232  		/* At least we have 1 byte to read. We don't need to return
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  233  		 * the exact readable bytes: see vsock_stream_recvmsg() ->
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  234  		 * vsock_stream_has_data().
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  235  		 */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  236  		return 1;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  237  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  238  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  239  	if (readable == HVS_PKT_LEN(0)) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  240  		/* 0-size payload means FIN */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  241  		return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  242  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  243  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  244  	/* No payload or FIN */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  245  	return -1;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  246  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  247  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  248  static size_t hvs_channel_writable_bytes(struct vmbus_channel *chan)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  249  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  250  	u32 writeable = hv_get_bytes_to_write(&chan->outbound);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  251  	size_t ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  252  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  253  	/* The ringbuffer mustn't be 100% full, and we should reserve a
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  254  	 * zero-length-payload packet for the FIN: see hv_ringbuffer_write()
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  255  	 * and hvs_shutdown().
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  256  	 */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  257  	if (writeable <= HVS_PKT_LEN(1) + HVS_PKT_LEN(0))
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  258  		return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  259  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  260  	ret = writeable - HVS_PKT_LEN(1) - HVS_PKT_LEN(0);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  261  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  262  	return round_down(ret, 8);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  263  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  264  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  265  static int hvs_send_data(struct vmbus_channel *chan,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  266  			 struct hvs_send_buf *send_buf, size_t to_write)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  267  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  268  	send_buf->hdr.pkt_type = 1;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  269  	send_buf->hdr.data_size = to_write;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  270  	return vmbus_sendpacket(chan, &send_buf->hdr,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  271  				sizeof(send_buf->hdr) + to_write,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  272  				0, VM_PKT_DATA_INBAND, 0);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  273  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  274  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  275  static void hvs_channel_cb(void *ctx)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  276  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  277  	struct sock *sk = (struct sock *)ctx;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  278  	struct vsock_sock *vsk = vsock_sk(sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  279  	struct hvsock *hvs = vsk->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  280  	struct vmbus_channel *chan = hvs->chan;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  281  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  282  	if (hvs_channel_readable(chan))
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  283  		sk->sk_data_ready(sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  284  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  285  	if (hv_get_bytes_to_write(&chan->outbound) > 0)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  286  		sk->sk_write_space(sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  287  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  288  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  289  static void hvs_do_close_lock_held(struct vsock_sock *vsk,
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  290  				   bool cancel_timeout)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  291  {
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  292  	struct sock *sk = sk_vsock(vsk);
b4562ca7925a3be Dexuan Cui       2017-10-19  293  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  294  	sock_set_flag(sk, SOCK_DONE);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  295  	vsk->peer_shutdown = SHUTDOWN_MASK;
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  296  	if (vsock_stream_has_data(vsk) <= 0)
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  297  		sk->sk_state = TCP_CLOSING;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  298  	sk->sk_state_change(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  299  	if (vsk->close_work_scheduled &&
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  300  	    (!cancel_timeout || cancel_delayed_work(&vsk->close_work))) {
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  301  		vsk->close_work_scheduled = false;
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  302  		vsock_remove_sock(vsk);
b4562ca7925a3be Dexuan Cui       2017-10-19  303  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  304  		/* Release the reference taken while scheduling the timeout */
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  305  		sock_put(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  306  	}
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  307  }
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  308  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  309  static void hvs_close_connection(struct vmbus_channel *chan)
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  310  {
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  311  	struct sock *sk = get_per_channel_state(chan);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  312  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  313  	lock_sock(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  314  	hvs_do_close_lock_held(vsock_sk(sk), true);
b4562ca7925a3be Dexuan Cui       2017-10-19  315  	release_sock(sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  316  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  317  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  318  static void hvs_open_connection(struct vmbus_channel *chan)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  319  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  320  	uuid_le *if_instance, *if_type;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  321  	unsigned char conn_from_host;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  322  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  323  	struct sockaddr_vm addr;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  324  	struct sock *sk, *new = NULL;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  325  	struct vsock_sock *vnew = NULL;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  326  	struct hvsock *hvs = NULL;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  327  	struct hvsock *hvs_new = NULL;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  328  	int rcvbuf;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  329  	int ret;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  330  	int sndbuf;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  331  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  332  	if_type = &chan->offermsg.offer.if_type;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  333  	if_instance = &chan->offermsg.offer.if_instance;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  334  	conn_from_host = chan->offermsg.offer.u.pipe.user_def[0];
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  335  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  336  	/* The host or the VM should only listen on a port in
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  337  	 * [0, MAX_LISTEN_PORT]
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  338  	 */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  339  	if (!is_valid_srv_id(if_type) ||
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  340  	    get_port_by_srv_id(if_type) > MAX_LISTEN_PORT)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  341  		return;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  342  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  343  	hvs_addr_init(&addr, conn_from_host ? if_type : if_instance);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  344  	sk = vsock_find_bound_socket(&addr);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  345  	if (!sk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  346  		return;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  347  
b4562ca7925a3be Dexuan Cui       2017-10-19  348  	lock_sock(sk);
3b4477d2dcf2709 Stefan Hajnoczi  2017-10-05  349  	if ((conn_from_host && sk->sk_state != TCP_LISTEN) ||
3b4477d2dcf2709 Stefan Hajnoczi  2017-10-05  350  	    (!conn_from_host && sk->sk_state != TCP_SYN_SENT))
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  351  		goto out;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  352  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  353  	if (conn_from_host) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  354  		if (sk->sk_ack_backlog >= sk->sk_max_ack_backlog)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  355  			goto out;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  356  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  357  		new = __vsock_create(sock_net(sk), NULL, sk, GFP_KERNEL,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  358  				     sk->sk_type, 0);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  359  		if (!new)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  360  			goto out;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  361  
3b4477d2dcf2709 Stefan Hajnoczi  2017-10-05  362  		new->sk_state = TCP_SYN_SENT;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  363  		vnew = vsock_sk(new);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  364  		hvs_new = vnew->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  365  		hvs_new->chan = chan;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  366  	} else {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  367  		hvs = vsock_sk(sk)->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  368  		hvs->chan = chan;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  369  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  370  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  371  	set_channel_read_mode(chan, HV_CALL_DIRECT);
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  372  
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  373  	/* Use the socket buffer sizes as hints for the VMBUS ring size. For
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  374  	 * server side sockets, 'sk' is the parent socket and thus, this will
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  375  	 * allow the child sockets to inherit the size from the parent. Keep
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  376  	 * the mins to the default value and align to page size as per VMBUS
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  377  	 * requirements.
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  378  	 * For the max, the socket core library will limit the socket buffer
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  379  	 * size that can be set by the user, but, since currently, the hv_sock
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  380  	 * VMBUS ring buffer is physically contiguous allocation, restrict it
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  381  	 * further.
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  382  	 * Older versions of hv_sock host side code cannot handle bigger VMBUS
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  383  	 * ring buffer size. Use the version number to limit the change to newer
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  384  	 * versions.
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  385  	 */
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  386  	if (vmbus_proto_version < VERSION_WIN10_V5) {
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  387  		sndbuf = RINGBUFFER_HVS_SND_SIZE;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  388  		rcvbuf = RINGBUFFER_HVS_RCV_SIZE;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  389  	} else {
ac383f58f3c98de Sunil Muthuswamy 2019-05-22 @390  		sndbuf = max_t(int, sk->sk_sndbuf, RINGBUFFER_HVS_SND_SIZE);
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  391  		sndbuf = min_t(int, sndbuf, RINGBUFFER_HVS_MAX_SIZE);
31113cc83e30924 Himadri Pandya   2019-07-25  392  		sndbuf = ALIGN(sndbuf, HV_HYP_PAGE_SIZE);
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  393  		rcvbuf = max_t(int, sk->sk_rcvbuf, RINGBUFFER_HVS_RCV_SIZE);
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  394  		rcvbuf = min_t(int, rcvbuf, RINGBUFFER_HVS_MAX_SIZE);
31113cc83e30924 Himadri Pandya   2019-07-25  395  		rcvbuf = ALIGN(rcvbuf, HV_HYP_PAGE_SIZE);
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  396  	}
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  397  
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  398  	ret = vmbus_open(chan, sndbuf, rcvbuf, NULL, 0, hvs_channel_cb,
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  399  			 conn_from_host ? new : sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  400  	if (ret != 0) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  401  		if (conn_from_host) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  402  			hvs_new->chan = NULL;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  403  			sock_put(new);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  404  		} else {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  405  			hvs->chan = NULL;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  406  		}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  407  		goto out;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  408  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  409  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  410  	set_per_channel_state(chan, conn_from_host ? new : sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  411  	vmbus_set_chn_rescind_callback(chan, hvs_close_connection);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  412  
cb359b60416701c Sunil Muthuswamy 2019-06-17  413  	/* Set the pending send size to max packet size to always get
cb359b60416701c Sunil Muthuswamy 2019-06-17  414  	 * notifications from the host when there is enough writable space.
cb359b60416701c Sunil Muthuswamy 2019-06-17  415  	 * The host is optimized to send notifications only when the pending
cb359b60416701c Sunil Muthuswamy 2019-06-17  416  	 * size boundary is crossed, and not always.
cb359b60416701c Sunil Muthuswamy 2019-06-17  417  	 */
cb359b60416701c Sunil Muthuswamy 2019-06-17  418  	hvs_set_channel_pending_send_size(chan);
cb359b60416701c Sunil Muthuswamy 2019-06-17  419  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  420  	if (conn_from_host) {
3b4477d2dcf2709 Stefan Hajnoczi  2017-10-05  421  		new->sk_state = TCP_ESTABLISHED;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  422  		sk->sk_ack_backlog++;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  423  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  424  		hvs_addr_init(&vnew->local_addr, if_type);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  425  		hvs_remote_addr_init(&vnew->remote_addr, &vnew->local_addr);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  426  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  427  		hvs_new->vm_srv_id = *if_type;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  428  		hvs_new->host_srv_id = *if_instance;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  429  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  430  		vsock_insert_connected(vnew);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  431  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  432  		vsock_enqueue_accept(sk, new);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  433  	} else {
3b4477d2dcf2709 Stefan Hajnoczi  2017-10-05  434  		sk->sk_state = TCP_ESTABLISHED;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  435  		sk->sk_socket->state = SS_CONNECTED;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  436  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  437  		vsock_insert_connected(vsock_sk(sk));
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  438  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  439  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  440  	sk->sk_state_change(sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  441  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  442  out:
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  443  	/* Release refcnt obtained when we called vsock_find_bound_socket() */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  444  	sock_put(sk);
b4562ca7925a3be Dexuan Cui       2017-10-19  445  
b4562ca7925a3be Dexuan Cui       2017-10-19  446  	release_sock(sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  447  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  448  
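The ring-size selection in hvs_open_connection() above (clamp the socket buffer hint between a default and a maximum, then page-align it) can be sketched in userspace C as follows. The concrete values are assumptions for illustration only: a 4096-byte page, a 6-page default and a 64-page maximum. The real RINGBUFFER_HVS_* and HV_HYP_PAGE_SIZE definitions are not part of this listing, and sketch_ring_size() is a hypothetical helper.

```c
/* Assumed values; the real constants are defined outside this listing. */
#define SKETCH_PAGE_SIZE     4096
#define SKETCH_DEFAULT_SIZE  (SKETCH_PAGE_SIZE * 6)
#define SKETCH_MAX_SIZE      (SKETCH_PAGE_SIZE * 64)

/* sk_buf:   socket buffer size used as a hint (sk_sndbuf or sk_rcvbuf)
 * new_host: non-zero when the host can handle larger rings
 */
static int sketch_ring_size(int sk_buf, int new_host)
{
	int size;

	/* Older hosts always get the default ring size. */
	if (!new_host)
		return SKETCH_DEFAULT_SIZE;

	/* max_t(): never go below the default. */
	size = sk_buf > SKETCH_DEFAULT_SIZE ? sk_buf : SKETCH_DEFAULT_SIZE;
	/* min_t(): never exceed the contiguous-allocation cap. */
	size = size < SKETCH_MAX_SIZE ? size : SKETCH_MAX_SIZE;
	/* ALIGN(): round up to a whole number of pages. */
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

With these assumed constants, a hint of 30000 bytes would become 32768 (8 pages), a hint of 1000 bytes would stay at the 24576-byte default, and an oversized hint would be capped at 262144 bytes.
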
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  449  static u32 hvs_get_local_cid(void)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  450  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  451  	return VMADDR_CID_ANY;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  452  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  453  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  454  static int hvs_sock_init(struct vsock_sock *vsk, struct vsock_sock *psk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  455  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  456  	struct hvsock *hvs;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  457  	struct sock *sk = sk_vsock(vsk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  458  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  459  	hvs = kzalloc(sizeof(*hvs), GFP_KERNEL);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  460  	if (!hvs)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  461  		return -ENOMEM;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  462  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  463  	vsk->trans = hvs;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  464  	hvs->vsk = vsk;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  465  	sk->sk_sndbuf = RINGBUFFER_HVS_SND_SIZE;
ac383f58f3c98de Sunil Muthuswamy 2019-05-22  466  	sk->sk_rcvbuf = RINGBUFFER_HVS_RCV_SIZE;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  467  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  468  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  469  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  470  static int hvs_connect(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  471  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  472  	union hvs_service_id vm, host;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  473  	struct hvsock *h = vsk->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  474  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  475  	vm.srv_id = srv_id_template;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  476  	vm.svm_port = vsk->local_addr.svm_port;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  477  	h->vm_srv_id = vm.srv_id;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  478  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  479  	host.srv_id = srv_id_template;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  480  	host.svm_port = vsk->remote_addr.svm_port;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  481  	h->host_srv_id = host.srv_id;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  482  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  483  	return vmbus_send_tl_connect_request(&h->vm_srv_id, &h->host_srv_id);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  484  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  485  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  486  static void hvs_shutdown_lock_held(struct hvsock *hvs, int mode)
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  487  {
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  488  	struct vmpipe_proto_header hdr;
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  489  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  490  	if (hvs->fin_sent || !hvs->chan)
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  491  		return;
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  492  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  493  	/* It can't fail: see hvs_channel_writable_bytes(). */
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  494  	(void)hvs_send_data(hvs->chan, (struct hvs_send_buf *)&hdr, 0);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  495  	hvs->fin_sent = true;
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  496  }
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  497  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  498  static int hvs_shutdown(struct vsock_sock *vsk, int mode)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  499  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  500  	struct sock *sk = sk_vsock(vsk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  501  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  502  	if (!(mode & SEND_SHUTDOWN))
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  503  		return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  504  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  505  	lock_sock(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  506  	hvs_shutdown_lock_held(vsk->trans, mode);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  507  	release_sock(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  508  	return 0;
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  509  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  510  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  511  static void hvs_close_timeout(struct work_struct *work)
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  512  {
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  513  	struct vsock_sock *vsk =
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  514  		container_of(work, struct vsock_sock, close_work.work);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  515  	struct sock *sk = sk_vsock(vsk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  516  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  517  	sock_hold(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  518  	lock_sock(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  519  	if (!sock_flag(sk, SOCK_DONE))
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  520  		hvs_do_close_lock_held(vsk, false);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  521  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  522  	vsk->close_work_scheduled = false;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  523  	release_sock(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  524  	sock_put(sk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  525  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  526  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  527  /* Returns true, if it is safe to remove socket; false otherwise */
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  528  static bool hvs_close_lock_held(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  529  {
b4562ca7925a3be Dexuan Cui       2017-10-19  530  	struct sock *sk = sk_vsock(vsk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  531  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  532  	if (!(sk->sk_state == TCP_ESTABLISHED ||
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  533  	      sk->sk_state == TCP_CLOSING))
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  534  		return true;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  535  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  536  	if ((sk->sk_shutdown & SHUTDOWN_MASK) != SHUTDOWN_MASK)
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  537  		hvs_shutdown_lock_held(vsk->trans, SHUTDOWN_MASK);
b4562ca7925a3be Dexuan Cui       2017-10-19  538  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  539  	if (sock_flag(sk, SOCK_DONE))
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  540  		return true;
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  541  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  542  	/* This reference will be dropped by the delayed close routine */
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  543  	sock_hold(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  544  	INIT_DELAYED_WORK(&vsk->close_work, hvs_close_timeout);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  545  	vsk->close_work_scheduled = true;
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  546  	schedule_delayed_work(&vsk->close_work, HVS_CLOSE_TIMEOUT);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  547  	return false;
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  548  }
b4562ca7925a3be Dexuan Cui       2017-10-19  549  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  550  static void hvs_release(struct vsock_sock *vsk)
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  551  {
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  552  	struct sock *sk = sk_vsock(vsk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  553  	bool remove_sock;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  554  
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  555  	lock_sock(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  556  	remove_sock = hvs_close_lock_held(vsk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  557  	release_sock(sk);
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  558  	if (remove_sock)
a9eeb998c28d550 Sunil Muthuswamy 2019-05-15  559  		vsock_remove_sock(vsk);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  560  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  561  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  562  static void hvs_destruct(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  563  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  564  	struct hvsock *hvs = vsk->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  565  	struct vmbus_channel *chan = hvs->chan;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  566  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  567  	if (chan)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  568  		vmbus_hvsock_device_unregister(chan);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  569  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  570  	kfree(hvs);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  571  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  572  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  573  static int hvs_dgram_bind(struct vsock_sock *vsk, struct sockaddr_vm *addr)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  574  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  575  	return -EOPNOTSUPP;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  576  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  577  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  578  static int hvs_dgram_dequeue(struct vsock_sock *vsk, struct msghdr *msg,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  579  			     size_t len, int flags)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  580  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  581  	return -EOPNOTSUPP;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  582  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  583  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  584  static int hvs_dgram_enqueue(struct vsock_sock *vsk,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  585  			     struct sockaddr_vm *remote, struct msghdr *msg,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  586  			     size_t dgram_len)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  587  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  588  	return -EOPNOTSUPP;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  589  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  590  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  591  static bool hvs_dgram_allow(u32 cid, u32 port)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  592  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  593  	return false;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  594  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  595  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  596  static int hvs_update_recv_data(struct hvsock *hvs)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  597  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  598  	struct hvs_recv_buf *recv_buf;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  599  	u32 payload_len;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  600  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  601  	recv_buf = (struct hvs_recv_buf *)(hvs->recv_desc + 1);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  602  	payload_len = recv_buf->hdr.data_size;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  603  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  604  	if (payload_len > HVS_MTU_SIZE)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  605  		return -EIO;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  606  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  607  	if (payload_len == 0)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  608  		hvs->vsk->peer_shutdown |= SEND_SHUTDOWN;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  609  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  610  	hvs->recv_data_len = payload_len;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  611  	hvs->recv_data_off = 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  612  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  613  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  614  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  615  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  616  static ssize_t hvs_stream_dequeue(struct vsock_sock *vsk, struct msghdr *msg,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  617  				  size_t len, int flags)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  618  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  619  	struct hvsock *hvs = vsk->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  620  	bool need_refill = !hvs->recv_desc;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  621  	struct hvs_recv_buf *recv_buf;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  622  	u32 to_read;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  623  	int ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  624  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  625  	if (flags & MSG_PEEK)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  626  		return -EOPNOTSUPP;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  627  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  628  	if (need_refill) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  629  		hvs->recv_desc = hv_pkt_iter_first(hvs->chan);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  630  		ret = hvs_update_recv_data(hvs);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  631  		if (ret)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  632  			return ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  633  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  634  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  635  	recv_buf = (struct hvs_recv_buf *)(hvs->recv_desc + 1);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  636  	to_read = min_t(u32, len, hvs->recv_data_len);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  637  	ret = memcpy_to_msg(msg, recv_buf->data + hvs->recv_data_off, to_read);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  638  	if (ret != 0)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  639  		return ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  640  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  641  	hvs->recv_data_len -= to_read;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  642  	if (hvs->recv_data_len == 0) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  643  		hvs->recv_desc = hv_pkt_iter_next(hvs->chan, hvs->recv_desc);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  644  		if (hvs->recv_desc) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  645  			ret = hvs_update_recv_data(hvs);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  646  			if (ret)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  647  				return ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  648  		}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  649  	} else {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  650  		hvs->recv_data_off += to_read;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  651  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  652  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  653  	return to_read;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  654  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  655  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  656  static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  657  				  size_t len)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  658  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  659  	struct hvsock *hvs = vsk->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  660  	struct vmbus_channel *chan = hvs->chan;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  661  	struct hvs_send_buf *send_buf;
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  662  	ssize_t to_write, max_writable;
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  663  	ssize_t ret = 0;
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  664  	ssize_t bytes_written = 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  665  
31113cc83e30924 Himadri Pandya   2019-07-25  666  	BUILD_BUG_ON(sizeof(*send_buf) != HV_HYP_PAGE_SIZE);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  667  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  668  	send_buf = kmalloc(sizeof(*send_buf), GFP_KERNEL);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  669  	if (!send_buf)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  670  		return -ENOMEM;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  671  
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  672  	/* Reader(s) could be draining data from the channel as we write.
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  673  	 * Maximize bandwidth, by iterating until the channel is found to be
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  674  	 * full.
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  675  	 */
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  676  	while (len) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  677  		max_writable = hvs_channel_writable_bytes(chan);
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  678  		if (!max_writable)
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  679  			break;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  680  		to_write = min_t(ssize_t, len, max_writable);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  681  		to_write = min_t(ssize_t, to_write, HVS_SEND_BUF_SIZE);
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  682  		/* memcpy_from_msg is safe for loop as it advances the offsets
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  683  		 * within the message iterator.
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  684  		 */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  685  		ret = memcpy_from_msg(send_buf->data, msg, to_write);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  686  		if (ret < 0)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  687  			goto out;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  688  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  689  		ret = hvs_send_data(hvs->chan, send_buf, to_write);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  690  		if (ret < 0)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  691  			goto out;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  692  
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  693  		bytes_written += to_write;
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  694  		len -= to_write;
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  695  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  696  out:
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  697  	/* If any data has been sent, return that */
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  698  	if (bytes_written)
14a1eaa8820e8f3 Sunil Muthuswamy 2019-05-22  699  		ret = bytes_written;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  700  	kfree(send_buf);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  701  	return ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  702  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  703  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  704  static s64 hvs_stream_has_data(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  705  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  706  	struct hvsock *hvs = vsk->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  707  	s64 ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  708  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  709  	if (hvs->recv_data_len > 0)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  710  		return 1;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  711  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  712  	switch (hvs_channel_readable_payload(hvs->chan)) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  713  	case 1:
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  714  		ret = 1;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  715  		break;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  716  	case 0:
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  717  		vsk->peer_shutdown |= SEND_SHUTDOWN;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  718  		ret = 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  719  		break;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  720  	default: /* -1 */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  721  		ret = 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  722  		break;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  723  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  724  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  725  	return ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  726  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  727  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  728  static s64 hvs_stream_has_space(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  729  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  730  	struct hvsock *hvs = vsk->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  731  
cb359b60416701c Sunil Muthuswamy 2019-06-17  732  	return hvs_channel_writable_bytes(hvs->chan);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  733  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  734  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  735  static u64 hvs_stream_rcvhiwat(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  736  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  737  	return HVS_MTU_SIZE + 1;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  738  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  739  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  740  static bool hvs_stream_is_active(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  741  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  742  	struct hvsock *hvs = vsk->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  743  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  744  	return hvs->chan != NULL;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  745  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  746  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  747  static bool hvs_stream_allow(u32 cid, u32 port)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  748  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  749  	/* The host's port range [MIN_HOST_EPHEMERAL_PORT, 0xFFFFFFFF) is
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  750  	 * reserved as ephemeral ports, which are used as the host's ports
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  751  	 * when the host initiates connections.
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  752  	 *
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  753  	 * Perform this check in the guest so an immediate error is produced
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  754  	 * instead of a timeout.
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  755  	 */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  756  	if (port > MAX_HOST_LISTEN_PORT)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  757  		return false;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  758  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  759  	if (cid == VMADDR_CID_HOST)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  760  		return true;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  761  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  762  	return false;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  763  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  764  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  765  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  766  int hvs_notify_poll_in(struct vsock_sock *vsk, size_t target, bool *readable)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  767  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  768  	struct hvsock *hvs = vsk->trans;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  769  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  770  	*readable = hvs_channel_readable(hvs->chan);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  771  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  772  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  773  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  774  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  775  int hvs_notify_poll_out(struct vsock_sock *vsk, size_t target, bool *writable)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  776  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  777  	*writable = hvs_stream_has_space(vsk) > 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  778  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  779  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  780  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  781  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  782  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  783  int hvs_notify_recv_init(struct vsock_sock *vsk, size_t target,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  784  			 struct vsock_transport_recv_notify_data *d)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  785  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  786  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  787  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  788  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  789  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  790  int hvs_notify_recv_pre_block(struct vsock_sock *vsk, size_t target,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  791  			      struct vsock_transport_recv_notify_data *d)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  792  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  793  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  794  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  795  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  796  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  797  int hvs_notify_recv_pre_dequeue(struct vsock_sock *vsk, size_t target,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  798  				struct vsock_transport_recv_notify_data *d)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  799  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  800  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  801  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  802  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  803  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  804  int hvs_notify_recv_post_dequeue(struct vsock_sock *vsk, size_t target,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  805  				 ssize_t copied, bool data_read,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  806  				 struct vsock_transport_recv_notify_data *d)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  807  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  808  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  809  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  810  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  811  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  812  int hvs_notify_send_init(struct vsock_sock *vsk,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  813  			 struct vsock_transport_send_notify_data *d)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  814  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  815  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  816  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  817  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  818  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  819  int hvs_notify_send_pre_block(struct vsock_sock *vsk,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  820  			      struct vsock_transport_send_notify_data *d)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  821  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  822  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  823  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  824  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  825  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  826  int hvs_notify_send_pre_enqueue(struct vsock_sock *vsk,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  827  				struct vsock_transport_send_notify_data *d)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  828  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  829  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  830  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  831  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  832  static
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  833  int hvs_notify_send_post_enqueue(struct vsock_sock *vsk, ssize_t written,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  834  				 struct vsock_transport_send_notify_data *d)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  835  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  836  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  837  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  838  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  839  static void hvs_set_buffer_size(struct vsock_sock *vsk, u64 val)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  840  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  841  	/* Ignored. */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  842  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  843  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  844  static void hvs_set_min_buffer_size(struct vsock_sock *vsk, u64 val)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  845  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  846  	/* Ignored. */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  847  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  848  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  849  static void hvs_set_max_buffer_size(struct vsock_sock *vsk, u64 val)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  850  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  851  	/* Ignored. */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  852  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  853  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  854  static u64 hvs_get_buffer_size(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  855  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  856  	return -ENOPROTOOPT;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  857  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  858  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  859  static u64 hvs_get_min_buffer_size(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  860  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  861  	return -ENOPROTOOPT;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  862  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  863  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  864  static u64 hvs_get_max_buffer_size(struct vsock_sock *vsk)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  865  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  866  	return -ENOPROTOOPT;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  867  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  868  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  869  static struct vsock_transport hvs_transport = {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  870  	.get_local_cid            = hvs_get_local_cid,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  871  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  872  	.init                     = hvs_sock_init,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  873  	.destruct                 = hvs_destruct,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  874  	.release                  = hvs_release,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  875  	.connect                  = hvs_connect,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  876  	.shutdown                 = hvs_shutdown,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  877  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  878  	.dgram_bind               = hvs_dgram_bind,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  879  	.dgram_dequeue            = hvs_dgram_dequeue,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  880  	.dgram_enqueue            = hvs_dgram_enqueue,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  881  	.dgram_allow              = hvs_dgram_allow,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  882  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  883  	.stream_dequeue           = hvs_stream_dequeue,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  884  	.stream_enqueue           = hvs_stream_enqueue,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  885  	.stream_has_data          = hvs_stream_has_data,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  886  	.stream_has_space         = hvs_stream_has_space,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  887  	.stream_rcvhiwat          = hvs_stream_rcvhiwat,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  888  	.stream_is_active         = hvs_stream_is_active,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  889  	.stream_allow             = hvs_stream_allow,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  890  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  891  	.notify_poll_in           = hvs_notify_poll_in,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  892  	.notify_poll_out          = hvs_notify_poll_out,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  893  	.notify_recv_init         = hvs_notify_recv_init,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  894  	.notify_recv_pre_block    = hvs_notify_recv_pre_block,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  895  	.notify_recv_pre_dequeue  = hvs_notify_recv_pre_dequeue,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  896  	.notify_recv_post_dequeue = hvs_notify_recv_post_dequeue,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  897  	.notify_send_init         = hvs_notify_send_init,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  898  	.notify_send_pre_block    = hvs_notify_send_pre_block,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  899  	.notify_send_pre_enqueue  = hvs_notify_send_pre_enqueue,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  900  	.notify_send_post_enqueue = hvs_notify_send_post_enqueue,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  901  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  902  	.set_buffer_size          = hvs_set_buffer_size,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  903  	.set_min_buffer_size      = hvs_set_min_buffer_size,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  904  	.set_max_buffer_size      = hvs_set_max_buffer_size,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  905  	.get_buffer_size          = hvs_get_buffer_size,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  906  	.get_min_buffer_size      = hvs_get_min_buffer_size,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  907  	.get_max_buffer_size      = hvs_get_max_buffer_size,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  908  };
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  909  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  910  static int hvs_probe(struct hv_device *hdev,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  911  		     const struct hv_vmbus_device_id *dev_id)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  912  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  913  	struct vmbus_channel *chan = hdev->channel;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  914  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  915  	hvs_open_connection(chan);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  916  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  917  	/* Always return success to suppress the unnecessary error message
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  918  	 * in vmbus_probe(): on error the host will rescind the device in
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  919  	 * 30 seconds and we can do cleanup at that time in
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  920  	 * vmbus_onoffer_rescind().
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  921  	 */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  922  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  923  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  924  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  925  static int hvs_remove(struct hv_device *hdev)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  926  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  927  	struct vmbus_channel *chan = hdev->channel;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  928  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  929  	vmbus_close(chan);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  930  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  931  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  932  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  933  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  934  /* This isn't really used. See vmbus_match() and vmbus_probe() */
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  935  static const struct hv_vmbus_device_id id_table[] = {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  936  	{},
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  937  };
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  938  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  939  static struct hv_driver hvs_drv = {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  940  	.name		= "hv_sock",
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  941  	.hvsock		= true,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  942  	.id_table	= id_table,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  943  	.probe		= hvs_probe,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  944  	.remove		= hvs_remove,
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  945  };
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  946  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  947  static int __init hvs_init(void)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  948  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  949  	int ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  950  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  951  	if (vmbus_proto_version < VERSION_WIN10)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  952  		return -ENODEV;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  953  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  954  	ret = vmbus_driver_register(&hvs_drv);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  955  	if (ret != 0)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  956  		return ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  957  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  958  	ret = vsock_core_init(&hvs_transport);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  959  	if (ret) {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  960  		vmbus_driver_unregister(&hvs_drv);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  961  		return ret;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  962  	}
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  963  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  964  	return 0;
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  965  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  966  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  967  static void __exit hvs_exit(void)
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  968  {
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  969  	vsock_core_exit();
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  970  	vmbus_driver_unregister(&hvs_drv);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  971  }
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  972  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  973  module_init(hvs_init);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  974  module_exit(hvs_exit);
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  975  
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  976  MODULE_DESCRIPTION("Hyper-V Sockets");
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  977  MODULE_VERSION("1.0.0");
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  978  MODULE_LICENSE("GPL");
ae0078fcf0a5eb3 Dexuan Cui       2017-08-26  979  MODULE_ALIAS_NETPROTO(PF_VSOCK);

:::::: The code at line 214 was first introduced by commit
:::::: ae0078fcf0a5eb3a8623bfb5f988262e0911fdb9 hv_sock: implements Hyper-V transport for Virtual Sockets (AF_VSOCK)

:::::: TO: Dexuan Cui <decui@microsoft.com>
:::::: CC: David S. Miller <davem@davemloft.net>
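
For readers following the listing above, the chunking loop in
hvs_stream_enqueue() can be sketched in plain userspace C. This is only an
illustration under stated assumptions: struct demo_chan, demo_min() and
demo_stream_enqueue() are hypothetical stand-ins, and the 4096-byte page
size and 8-byte header size are assumed values, not taken from the driver.

```c
#include <stddef.h>
#include <string.h>

/* Assumed values for illustration only; the driver derives these from
 * HV_HYP_PAGE_SIZE and struct vmpipe_proto_header.
 */
#define DEMO_PAGE_SIZE     4096u
#define DEMO_HDR_SIZE      8u
#define DEMO_SEND_BUF_SIZE (DEMO_PAGE_SIZE - DEMO_HDR_SIZE)

/* Hypothetical stand-in for the VMBus ring buffer. */
struct demo_chan {
	size_t writable;	/* bytes the ring can still accept */
	size_t sent;		/* bytes accepted so far */
};

static size_t demo_min(size_t a, size_t b)
{
	return a < b ? a : b;
}

/* Mirrors the shape of the loop in hvs_stream_enqueue(): keep sending
 * chunks until the message is consumed or the ring reports no space.
 * Returns the number of bytes written (0 if the ring was already full).
 */
static long demo_stream_enqueue(struct demo_chan *chan, const char *msg,
				size_t len)
{
	char buf[DEMO_SEND_BUF_SIZE];
	size_t bytes_written = 0;

	while (len) {
		size_t to_write = chan->writable;

		if (!to_write)
			break;
		/* Clamp to the remaining length and to one send buffer. */
		to_write = demo_min(to_write, len);
		to_write = demo_min(to_write, DEMO_SEND_BUF_SIZE);

		/* Stage the chunk, then "send" it by consuming ring space. */
		memcpy(buf, msg + bytes_written, to_write);
		chan->writable -= to_write;
		chan->sent += to_write;

		bytes_written += to_write;
		len -= to_write;
	}
	return (long)bytes_written;
}
```

With 5000 writable bytes and a 10000-byte message, the sketch sends one
chunk of 4088 bytes and one of 912 bytes, then stops because the ring is
full.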

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


* Re: [PATCH] hv_sock: use HV_HYP_PAGE_SIZE instead of PAGE_SIZE_4K
From: Himadri Pandya @ 2019-07-27 11:50 UTC (permalink / raw)
  To: kbuild test robot
  Cc: kbuild-all, mikelley, kys, haiyangz, sthemmin, sashal, davem,
	linux-hyperv, netdev, linux-kernel, Himadri Pandya
In-Reply-To: <201907271302.tDRkl9uU%lkp@intel.com>


On 7/27/2019 10:50 AM, kbuild test robot wrote:
> Hi Himadri,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on linus/master]
> [cannot apply to v5.3-rc1 next-20190726]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

This patch should be applied to linux-next git tree.

Thank you.

- Himadri

>
> url:    https://github.com/0day-ci/linux/commits/Himadri-Pandya/hv_sock-use-HV_HYP_PAGE_SIZE-instead-of-PAGE_SIZE_4K/20190726-085229
> config: x86_64-allyesconfig (attached as .config)
> compiler: gcc-7 (Debian 7.4.0-10) 7.4.0
> reproduce:
>          # save the attached .config to linux build tree
>          make ARCH=x86_64
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot <lkp@intel.com>
>
> All error/warnings (new ones prefixed by >>):
>
>>> net/vmw_vsock/hyperv_transport.c:58:28: error: 'HV_HYP_PAGE_SIZE' undeclared here (not in a function); did you mean 'HV_MESSAGE_SIZE'?
>      #define HVS_SEND_BUF_SIZE (HV_HYP_PAGE_SIZE - sizeof(struct vmpipe_proto_header))
>                                 ^
>>> net/vmw_vsock/hyperv_transport.c:65:10: note: in expansion of macro 'HVS_SEND_BUF_SIZE'
>       u8 data[HVS_SEND_BUF_SIZE];
>               ^~~~~~~~~~~~~~~~~
>     In file included from include/linux/list.h:9:0,
>                      from include/linux/module.h:9,
>                      from net/vmw_vsock/hyperv_transport.c:11:
>     net/vmw_vsock/hyperv_transport.c: In function 'hvs_open_connection':
>>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
>       __builtin_choose_expr(__safe_cmp(x, y), \
>       ^
>     include/linux/kernel.h:921:27: note: in expansion of macro '__careful_cmp'
>      #define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >)
>                                ^~~~~~~~~~~~~
>>> net/vmw_vsock/hyperv_transport.c:390:12: note: in expansion of macro 'max_t'
>        sndbuf = max_t(int, sk->sk_sndbuf, RINGBUFFER_HVS_SND_SIZE);
>                 ^~~~~
>>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
>       __builtin_choose_expr(__safe_cmp(x, y), \
>       ^
>     include/linux/kernel.h:913:27: note: in expansion of macro '__careful_cmp'
>      #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <)
>                                ^~~~~~~~~~~~~
>>> net/vmw_vsock/hyperv_transport.c:391:12: note: in expansion of macro 'min_t'
>        sndbuf = min_t(int, sndbuf, RINGBUFFER_HVS_MAX_SIZE);
>                 ^~~~~
>>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
>       __builtin_choose_expr(__safe_cmp(x, y), \
>       ^
>     include/linux/kernel.h:921:27: note: in expansion of macro '__careful_cmp'
>      #define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >)
>                                ^~~~~~~~~~~~~
>     net/vmw_vsock/hyperv_transport.c:393:12: note: in expansion of macro 'max_t'
>        rcvbuf = max_t(int, sk->sk_rcvbuf, RINGBUFFER_HVS_RCV_SIZE);
>                 ^~~~~
>>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
>       __builtin_choose_expr(__safe_cmp(x, y), \
>       ^
>     include/linux/kernel.h:913:27: note: in expansion of macro '__careful_cmp'
>      #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <)
>                                ^~~~~~~~~~~~~
>     net/vmw_vsock/hyperv_transport.c:394:12: note: in expansion of macro 'min_t'
>        rcvbuf = min_t(int, rcvbuf, RINGBUFFER_HVS_MAX_SIZE);
>                 ^~~~~
>     net/vmw_vsock/hyperv_transport.c: In function 'hvs_stream_enqueue':
>>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
>       __builtin_choose_expr(__safe_cmp(x, y), \
>       ^
>     include/linux/kernel.h:913:27: note: in expansion of macro '__careful_cmp'
>      #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <)
>                                ^~~~~~~~~~~~~
>     net/vmw_vsock/hyperv_transport.c:681:14: note: in expansion of macro 'min_t'
>        to_write = min_t(ssize_t, to_write, HVS_SEND_BUF_SIZE);
>                   ^~~~~
>
> vim +58 net/vmw_vsock/hyperv_transport.c
>
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


* Re: [PATCH] hv_sock: use HV_HYP_PAGE_SIZE instead of PAGE_SIZE_4K
From: kbuild test robot @ 2019-07-27  5:20 UTC (permalink / raw)
  To: Himadri Pandya
  Cc: kbuild-all, mikelley, kys, haiyangz, sthemmin, sashal, davem,
	linux-hyperv, netdev, linux-kernel, Himadri Pandya
In-Reply-To: <20190725051125.10605-1-himadri18.07@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4160 bytes --]

Hi Himadri,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc1 next-20190726]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Himadri-Pandya/hv_sock-use-HV_HYP_PAGE_SIZE-instead-of-PAGE_SIZE_4K/20190726-085229
config: x86_64-allyesconfig (attached as .config)
compiler: gcc-7 (Debian 7.4.0-10) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

>> net/vmw_vsock/hyperv_transport.c:58:28: error: 'HV_HYP_PAGE_SIZE' undeclared here (not in a function); did you mean 'HV_MESSAGE_SIZE'?
    #define HVS_SEND_BUF_SIZE (HV_HYP_PAGE_SIZE - sizeof(struct vmpipe_proto_header))
                               ^
>> net/vmw_vsock/hyperv_transport.c:65:10: note: in expansion of macro 'HVS_SEND_BUF_SIZE'
     u8 data[HVS_SEND_BUF_SIZE];
             ^~~~~~~~~~~~~~~~~
   In file included from include/linux/list.h:9:0,
                    from include/linux/module.h:9,
                    from net/vmw_vsock/hyperv_transport.c:11:
   net/vmw_vsock/hyperv_transport.c: In function 'hvs_open_connection':
>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
     __builtin_choose_expr(__safe_cmp(x, y), \
     ^
   include/linux/kernel.h:921:27: note: in expansion of macro '__careful_cmp'
    #define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >)
                              ^~~~~~~~~~~~~
>> net/vmw_vsock/hyperv_transport.c:390:12: note: in expansion of macro 'max_t'
      sndbuf = max_t(int, sk->sk_sndbuf, RINGBUFFER_HVS_SND_SIZE);
               ^~~~~
>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
     __builtin_choose_expr(__safe_cmp(x, y), \
     ^
   include/linux/kernel.h:913:27: note: in expansion of macro '__careful_cmp'
    #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <)
                              ^~~~~~~~~~~~~
>> net/vmw_vsock/hyperv_transport.c:391:12: note: in expansion of macro 'min_t'
      sndbuf = min_t(int, sndbuf, RINGBUFFER_HVS_MAX_SIZE);
               ^~~~~
>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
     __builtin_choose_expr(__safe_cmp(x, y), \
     ^
   include/linux/kernel.h:921:27: note: in expansion of macro '__careful_cmp'
    #define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >)
                              ^~~~~~~~~~~~~
   net/vmw_vsock/hyperv_transport.c:393:12: note: in expansion of macro 'max_t'
      rcvbuf = max_t(int, sk->sk_rcvbuf, RINGBUFFER_HVS_RCV_SIZE);
               ^~~~~
>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
     __builtin_choose_expr(__safe_cmp(x, y), \
     ^
   include/linux/kernel.h:913:27: note: in expansion of macro '__careful_cmp'
    #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <)
                              ^~~~~~~~~~~~~
   net/vmw_vsock/hyperv_transport.c:394:12: note: in expansion of macro 'min_t'
      rcvbuf = min_t(int, rcvbuf, RINGBUFFER_HVS_MAX_SIZE);
               ^~~~~
   net/vmw_vsock/hyperv_transport.c: In function 'hvs_stream_enqueue':
>> include/linux/kernel.h:845:2: error: first argument to '__builtin_choose_expr' not a constant
     __builtin_choose_expr(__safe_cmp(x, y), \
     ^
   include/linux/kernel.h:913:27: note: in expansion of macro '__careful_cmp'
    #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <)
                              ^~~~~~~~~~~~~
   net/vmw_vsock/hyperv_transport.c:681:14: note: in expansion of macro 'min_t'
      to_write = min_t(ssize_t, to_write, HVS_SEND_BUF_SIZE);
                 ^~~~~

vim +58 net/vmw_vsock/hyperv_transport.c
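
The first error above looks like the root cause: once HV_HYP_PAGE_SIZE is
undeclared, every macro built on it stops being a constant expression, which
would explain the follow-on min_t()/max_t() errors. The userspace sketch
below shows the same definitions compiling once the macro exists. The value
4096, the fallback #ifndef guard, the header layout and the struct name
hvs_send_buf are assumptions made for illustration; on the tree the patch
targets, the real definition is expected to come from a Hyper-V header.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4096 stands in for the Hyper-V page size. */
#ifndef HV_HYP_PAGE_SIZE
#define HV_HYP_PAGE_SIZE 4096
#endif

/* Layout assumed for illustration: two 32-bit fields, 8 bytes total. */
struct vmpipe_proto_header {
	uint32_t pkt_type;
	uint32_t data_size;
};

#define HVS_SEND_BUF_SIZE \
	(HV_HYP_PAGE_SIZE - sizeof(struct vmpipe_proto_header))

/* The array bound must be a compile-time constant, which it only is
 * when HV_HYP_PAGE_SIZE is declared.
 */
struct hvs_send_buf {
	struct vmpipe_proto_header hdr;
	uint8_t data[HVS_SEND_BUF_SIZE];
};

/* Header plus payload should fill exactly one page in this sketch. */
_Static_assert(sizeof(struct hvs_send_buf) == HV_HYP_PAGE_SIZE,
	       "send buffer should be exactly one page");

/* Hypothetical helper so the value can be inspected at run time. */
static size_t demo_send_buf_size(void)
{
	return HVS_SEND_BUF_SIZE;
}
```

Under these assumptions demo_send_buf_size() yields 4088, i.e. one page
minus the 8-byte header.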

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 69531 bytes --]


* [PATCH] clocksource/drivers: hyperv_timer: Fix CPU offlining by unbinding the timer
From: Dexuan Cui @ 2019-07-27  5:07 UTC (permalink / raw)
  To: tglx@linutronix.de, daniel.lezcano@linaro.org,
	gregkh@linuxfoundation.org, sashal@kernel.org, Stephen Hemminger,
	Haiyang Zhang, KY Srinivasan, Michael Kelley,
	linux-hyperv@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, Dexuan Cui

Commit fd1fea6834d0 says "No behavior is changed", but it actually
removes the clockevents_unbind_device() call from hv_synic_cleanup().

In the discussion earlier this month, I thought the unbind call was
unnecessary (see https://www.spinics.net/lists/arm-kernel/msg739888.html).
However, further investigation shows that when a VM runs on Hyper-V the
unbind call must be kept; otherwise CPU offlining may not work, because a
per-cpu timer device is still needed after hv_synic_cleanup() disables
the per-cpu Hyper-V timer device.

The issue was found in a hibernation test. These are the details:

1. CPU0 hangs in wait_for_ap_thread(), when trying to offline CPU1:

hibernation_snapshot
  create_image
    suspend_disable_secondary_cpus
      freeze_secondary_cpus
        _cpu_down(1, 1, CPUHP_OFFLINE)
          cpuhp_kick_ap_work
            cpuhp_kick_ap
              __cpuhp_kick_ap
                wait_for_ap_thread()

2. CPU0 hangs because CPU1 hangs this way: after CPU1 disables the per-cpu
Hyper-V timer device in hv_synic_cleanup(), CPU1 sets a timer... Please
read on to see how this can happen.

2.1 By "_cpu_down(1, 1, CPUHP_OFFLINE)", CPU0 first tries to move CPU1 to
the CPUHP_TEARDOWN_CPU state and this wakes up the cpuhp/1 thread on CPU1;
the thread is basically a loop of executing various callbacks defined in
the global array cpuhp_hp_states[]: see smpboot_thread_fn().

2.2 This is how a callback is called on CPU1:
  smpboot_thread_fn
    ht->thread_fn(td->cpu), i.e. cpuhp_thread_fun
      cpuhp_invoke_callback
        state = st->state
        st->state--
        cpuhp_get_step(state)->teardown.single()

2.3 At first, the state of CPU1 is CPUHP_ONLINE, which defines a
.teardown.single of NULL, so the execution of the code returns to the loop
in smpboot_thread_fn(), and then reruns cpuhp_invoke_callback() with a
smaller st->state.

2.4 The .teardown.single of every state between CPUHP_ONLINE and
CPUHP_TEARDOWN_CPU runs one by one.

2.5 When it comes to the CPUHP_AP_ONLINE_DYN range, hv_synic_cleanup()
runs: see vmbus_bus_init(). It calls hv_stimer_cleanup() ->
hv_ce_shutdown() to disable the per-cpu timer device, so timer interrupts
will no longer happen on CPU1.

2.6 Later, the .teardown.single of CPUHP_AP_SMPBOOT_THREADS, i.e.
smpboot_park_threads(), starts to run, trying to park all the other
hotplug_threads, e.g. ksoftirqd/1 and rcuc/1; here a timer can be set up
this way, and the timer will never fire since CPU1 no longer has an
active timer device, so CPU1 hangs and cannot be offlined:
  smpboot_park_threads
    smpboot_park_thread
      kthread_park
        wait_task_inactive
          schedule_hrtimeout(&to, HRTIMER_MODE_REL)

With this patch, when the per-cpu Hyper-V timer device is disabled, the
system switches to the Local APIC timer, and the hang can no longer
happen.
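
The ordering problem described in steps 2.3 to 2.6 can be illustrated with
a small userspace toy model. This is only a sketch under stated
assumptions: every name prefixed with toy_ or TOY_ is hypothetical and is
not a kernel API, the state list is reduced to the steps mentioned above,
and "hang" is modelled as a teardown step returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced state list; higher values are torn down first. */
enum toy_state {
	TOY_TEARDOWN_CPU,
	TOY_SMPBOOT_THREADS,	/* stands in for CPUHP_AP_SMPBOOT_THREADS */
	TOY_ONLINE_DYN,		/* stands in for the CPUHP_AP_ONLINE_DYN range */
	TOY_ONLINE,
};

struct toy_cpu {
	bool hv_timer_active;	/* per-cpu Hyper-V timer still delivers interrupts */
	bool fallback_timer;	/* another timer device has taken over */
	bool unbind_on_cleanup;	/* models whether the unbind call is present */
};

static bool toy_timer_can_fire(const struct toy_cpu *c)
{
	return c->hv_timer_active || c->fallback_timer;
}

/* Models hv_stimer_cleanup(): the Hyper-V timer is shut down on this CPU. */
static bool toy_hv_cleanup(struct toy_cpu *c)
{
	if (c->unbind_on_cleanup)
		c->fallback_timer = true;	/* assumption: unbinding lets another device take over */
	c->hv_timer_active = false;
	return true;
}

/* Models smpboot_park_threads(): parking waits on a timer, so one must be able to fire. */
static bool toy_park_threads(struct toy_cpu *c)
{
	return toy_timer_can_fire(c);
}

typedef bool (*toy_teardown_fn)(struct toy_cpu *);

static const toy_teardown_fn toy_teardown[] = {
	[TOY_TEARDOWN_CPU]	= NULL,
	[TOY_SMPBOOT_THREADS]	= toy_park_threads,
	[TOY_ONLINE_DYN]	= toy_hv_cleanup,
	[TOY_ONLINE]		= NULL,	/* no teardown callback, as in step 2.3 */
};

/* Walk from TOY_ONLINE down towards TOY_TEARDOWN_CPU; false means a step would hang. */
static bool toy_cpu_down(struct toy_cpu *c)
{
	assert(c != NULL);

	for (int state = TOY_ONLINE; state > TOY_TEARDOWN_CPU; state--) {
		toy_teardown_fn fn = toy_teardown[state];

		if (fn && !fn(c))
			return false;
	}
	return true;
}
```

In this model, calling toy_cpu_down() on a CPU with unbind_on_cleanup
cleared fails at the thread-parking step, while the same call with
unbind_on_cleanup set completes.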

Fixes: fd1fea6834d0 ("clocksource/drivers: Make Hyper-V clocksource ISA agnostic")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/clocksource/hyperv_timer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index 41c31a7ac0e4..8f3422c66cbb 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -139,6 +139,7 @@ void hv_stimer_cleanup(unsigned int cpu)
 	/* Turn off clockevent device */
 	if (ms_hyperv.features & HV_MSR_SYNTIMER_AVAILABLE) {
 		ce = per_cpu_ptr(hv_clock_event, cpu);
+		clockevents_unbind_device(ce, cpu);
 		hv_ce_shutdown(ce);
 	}
 }
-- 
2.19.1


^ permalink raw reply related

* Re: [PATCH 1/2] Drivers: hv: Specify receive buffer size using Hyper-V page size
From: Stephen Hemminger @ 2019-07-26 16:07 UTC (permalink / raw)
  To: Himadri Pandya
  Cc: Michael Kelley, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal, linux-hyperv, linux-kernel, himadri18.07
In-Reply-To: <20190725050315.6935-2-himadri18.07@gmail.com>

On Wed, 24 Jul 2019 22:03:14 -0700
"Himadri Pandya" <himadrispandya@gmail.com> wrote:

> The recv_buffer is used to retrieve data from the VMbus ring buffer.
> VMbus ring buffers are sized based on the guest page size which
> Hyper-V assumes to be 4KB. But it may be different on some
> architectures. So use the Hyper-V page size to allocate the
> recv_buffer and set the maximum size to receive.
> 
> Signed-off-by: Himadri Pandya <himadri18.07@gmail.com>

If pagesize is 64K, then doing it this way will waste lots of
memory.

^ permalink raw reply

* Re: [PATCH 2/2] Drivers: hv: util: Specify ring buffer size using Hyper-V page size
From: Stephen Hemminger @ 2019-07-26 16:06 UTC (permalink / raw)
  To: Himadri Pandya
  Cc: Michael Kelley, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal, linux-hyperv, linux-kernel, himadri18.07
In-Reply-To: <20190725050315.6935-3-himadri18.07@gmail.com>

On Wed, 24 Jul 2019 22:03:15 -0700
"Himadri Pandya" <himadrispandya@gmail.com> wrote:

> VMbus ring buffers are sized based on the 4K page size used by
> Hyper-V. The Linux guest page size may not be 4K on all architectures
> so use the Hyper-V page size to specify the ring buffer size.
> 
> Signed-off-by: Himadri Pandya <himadri18.07@gmail.com>
> ---
>  drivers/hv/hv_util.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
> index c2c08f26bd5f..766bd8457346 100644
> --- a/drivers/hv/hv_util.c
> +++ b/drivers/hv/hv_util.c
> @@ -413,8 +413,9 @@ static int util_probe(struct hv_device *dev,
>  
>  	hv_set_drvdata(dev, srv);
>  
> -	ret = vmbus_open(dev->channel, 4 * PAGE_SIZE, 4 * PAGE_SIZE, NULL, 0,
> -			srv->util_cb, dev->channel);
> +	ret = vmbus_open(dev->channel, 4 * HV_HYP_PAGE_SIZE,
> +			 4 * HV_HYP_PAGE_SIZE, NULL, 0, srv->util_cb,
> +			 dev->channel);
>  	if (ret)
>  		goto error;
>  

hv_util doesn't need lots of buffering. Why not define a fixed
value across all architectures? Maybe with some roundup to HV_HYP_PAGE_SIZE.
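
One way to read the suggestion above, as a userspace sketch only: pick a
fixed byte count and round it up to the hypervisor page size. The value of
HV_HYP_PAGE_SIZE is assumed to be 4096 here (the thread describes it as
4K), and TOY_UTIL_RING_BYTES, toy_round_up() and toy_util_ring_size() are
hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the Hyper-V page size is 4K, as stated in the thread. */
#define HV_HYP_PAGE_SIZE	4096u

/* Hypothetical fixed buffer requirement, identical on every architecture. */
#define TOY_UTIL_RING_BYTES	10000u

/* Round bytes up to the next multiple of align (align must be a power of two). */
static size_t toy_round_up(size_t bytes, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (bytes + align - 1) & ~(align - 1);
}

/* The ring size no longer depends on the guest PAGE_SIZE at all. */
static size_t toy_util_ring_size(void)
{
	return toy_round_up(TOY_UTIL_RING_BYTES, HV_HYP_PAGE_SIZE);
}
```

With these assumed values, toy_util_ring_size() yields 12288 bytes (three
4K pages) regardless of the guest page size.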

^ permalink raw reply

* Re: [PATCH v3 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently
From: Juergen Gross @ 2019-07-26  7:28 UTC (permalink / raw)
  To: Nadav Amit, Andy Lutomirski, Dave Hansen
  Cc: Borislav Petkov, Peter Zijlstra, Sasha Levin, x86,
	Thomas Gleixner, virtualization, xen-devel, Haiyang Zhang,
	K. Y. Srinivasan, Stephen Hemminger, Boris Ostrovsky, Ingo Molnar,
	Paolo Bonzini, kvm, linux-hyperv, linux-kernel
In-Reply-To: <20190719005837.4150-5-namit@vmware.com>

On 19.07.19 02:58, Nadav Amit wrote:
> To improve TLB shootdown performance, flush the remote and local TLBs
> concurrently. Introduce flush_tlb_multi() that does so. Introduce
> paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen
> and hyper-v are only compile-tested).
> 
> While the updated smp infrastructure is capable of running a function on
> a single local core, it is not optimized for this case. The multiple
> function calls and the indirect branch introduce some overhead, and
> might make local TLB flushes slower than they were before the recent
> changes.
> 
> Before calling the SMP infrastructure, check if only a local TLB flush
> is needed to restore the lost performance in this common case. This
> requires checking mm_cpumask() one more time, but unless this mask is
> updated very frequently, this should not impact performance negatively.
> 
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Stephen Hemminger <sthemmin@microsoft.com>
> Cc: Sasha Levin <sashal@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: x86@kernel.org
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: linux-hyperv@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: virtualization@lists.linux-foundation.org
> Cc: kvm@vger.kernel.org
> Cc: xen-devel@lists.xenproject.org
> Signed-off-by: Nadav Amit <namit@vmware.com>
> ---
>   arch/x86/hyperv/mmu.c                 | 10 +++---
>   arch/x86/include/asm/paravirt.h       |  6 ++--
>   arch/x86/include/asm/paravirt_types.h |  4 +--
>   arch/x86/include/asm/tlbflush.h       |  8 ++---
>   arch/x86/include/asm/trace/hyperv.h   |  2 +-
>   arch/x86/kernel/kvm.c                 | 11 +++++--
>   arch/x86/kernel/paravirt.c            |  2 +-
>   arch/x86/mm/tlb.c                     | 47 ++++++++++++++++++---------
>   arch/x86/xen/mmu_pv.c                 | 11 +++----
>   include/trace/events/xen.h            |  2 +-
>   10 files changed, 62 insertions(+), 41 deletions(-)

Xen and paravirt parts: Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

^ permalink raw reply

* Re: [PATCH] hv_sock: use HV_HYP_PAGE_SIZE instead of PAGE_SIZE_4K
From: David Miller @ 2019-07-26  0:26 UTC (permalink / raw)
  To: himadrispandya
  Cc: mikelley, kys, haiyangz, sthemmin, sashal, linux-hyperv, netdev,
	linux-kernel, himadri18.07
In-Reply-To: <20190725051125.10605-1-himadri18.07@gmail.com>

From: Himadri Pandya <himadrispandya@gmail.com>
Date: Thu, 25 Jul 2019 05:11:25 +0000

> Older Windows hosts require the hv_sock ring buffer to be defined
> using 4K pages. This was achieved by using the symbol PAGE_SIZE_4K
> defined specifically for this purpose. But now we have a new symbol
> HV_HYP_PAGE_SIZE defined in hyperv-tlfs which can be used for this.
> 
> This patch removes the definition of symbol PAGE_SIZE_4K and replaces
> its usage with the symbol HV_HYP_PAGE_SIZE. This patch also aligns
> sndbuf and rcvbuf to the Hyper-V specific page size using HV_HYP_PAGE_SIZE
> instead of the guest page size (PAGE_SIZE), as Hyper-V expects the page
> size to be 4K, which might not be the case on the ARM64 architecture.
> 
> Signed-off-by: Himadri Pandya <himadri18.07@gmail.com>

This doesn't compile:

  CC [M]  net/vmw_vsock/hyperv_transport.o
net/vmw_vsock/hyperv_transport.c:58:28: error: ‘HV_HYP_PAGE_SIZE’ undeclared here (not in a function); did you mean ‘HV_MESSAGE_SIZE’?
 #define HVS_SEND_BUF_SIZE (HV_HYP_PAGE_SIZE - sizeof(struct vmpipe_proto_header))
                            ^~~~~~~~~~~~~~~~

^ permalink raw reply

* Re: [PATCH net-next] Name NICs based on vmbus offer and enable async probe by default
From: David Miller @ 2019-07-25 18:46 UTC (permalink / raw)
  To: haiyangz
  Cc: sashal, linux-hyperv, netdev, kys, sthemmin, olaf, vkuznets,
	linux-kernel
In-Reply-To: <1563908517-55735-1-git-send-email-haiyangz@microsoft.com>


1) Subject: line lacks proper subsystem prefix

2) No module parameters in networking drivers, sorry.  Find some generic way to do
   this via devlink or similar.


^ permalink raw reply

* [PATCH] hv_sock: use HV_HYP_PAGE_SIZE instead of PAGE_SIZE_4K
From: Himadri Pandya @ 2019-07-25  5:11 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal, davem
  Cc: linux-hyperv, netdev, linux-kernel, Himadri Pandya

Older Windows hosts require the hv_sock ring buffer to be defined
using 4K pages. This was achieved by using the symbol PAGE_SIZE_4K
defined specifically for this purpose. But now we have a new symbol
HV_HYP_PAGE_SIZE defined in hyperv-tlfs which can be used for this.

This patch removes the definition of symbol PAGE_SIZE_4K and replaces
its usage with the symbol HV_HYP_PAGE_SIZE. This patch also aligns
sndbuf and rcvbuf to the Hyper-V specific page size using HV_HYP_PAGE_SIZE
instead of the guest page size (PAGE_SIZE), as Hyper-V expects the page
size to be 4K, which might not be the case on the ARM64 architecture.
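
The effect of the alignment change can be seen in a small userspace
sketch of the clamp-then-align sequence. It is an illustration, not the
driver code: the TOY_ names and helper functions are hypothetical, the
hypervisor page size is assumed to be 4096, and the minimum and maximum
mirror the six-page and 64-page limits mentioned in this patch.

```c
#include <assert.h>

#define TOY_HV_PAGE	4096			/* assumed value of HV_HYP_PAGE_SIZE */
#define TOY_RING_MIN	(TOY_HV_PAGE * 6)	/* mirrors RINGBUFFER_HVS_SND_SIZE */
#define TOY_RING_MAX	(TOY_HV_PAGE * 64)	/* mirrors RINGBUFFER_HVS_MAX_SIZE */

/* Round x up to a multiple of a (a must be a positive power of two). */
static int toy_align(int x, int a)
{
	assert(a > 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Clamp the requested size to [TOY_RING_MIN, TOY_RING_MAX], then align it. */
static int toy_ring_size(int requested, int align_to)
{
	int size = requested;

	if (size < TOY_RING_MIN)
		size = TOY_RING_MIN;
	if (size > TOY_RING_MAX)
		size = TOY_RING_MAX;
	return toy_align(size, align_to);
}
```

For a requested size of 30000 bytes, aligning to a 4096-byte page gives
32768, whereas aligning to a 65536-byte guest page would give 65536; the
second result is no longer tied to the 4K granularity the host expects.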

Signed-off-by: Himadri Pandya <himadri18.07@gmail.com>
---
 net/vmw_vsock/hyperv_transport.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index f2084e3f7aa4..ecb5d72d8010 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -13,15 +13,16 @@
 #include <linux/hyperv.h>
 #include <net/sock.h>
 #include <net/af_vsock.h>
+#include <asm/hyperv-tlfs.h>
 
 /* Older (VMBUS version 'VERSION_WIN10' or before) Windows hosts have some
- * stricter requirements on the hv_sock ring buffer size of six 4K pages. Newer
- * hosts don't have this limitation; but, keep the defaults the same for compat.
+ * stricter requirements on the hv_sock ring buffer size of six 4K pages.
+ * hyperv-tlfs defines HV_HYP_PAGE_SIZE as 4K. Newer hosts don't have this
+ * limitation; but, keep the defaults the same for compat.
  */
-#define PAGE_SIZE_4K		4096
-#define RINGBUFFER_HVS_RCV_SIZE (PAGE_SIZE_4K * 6)
-#define RINGBUFFER_HVS_SND_SIZE (PAGE_SIZE_4K * 6)
-#define RINGBUFFER_HVS_MAX_SIZE (PAGE_SIZE_4K * 64)
+#define RINGBUFFER_HVS_RCV_SIZE (HV_HYP_PAGE_SIZE * 6)
+#define RINGBUFFER_HVS_SND_SIZE (HV_HYP_PAGE_SIZE * 6)
+#define RINGBUFFER_HVS_MAX_SIZE (HV_HYP_PAGE_SIZE * 64)
 
 /* The MTU is 16KB per the host side's design */
 #define HVS_MTU_SIZE		(1024 * 16)
@@ -54,7 +55,7 @@ struct hvs_recv_buf {
  * ringbuffer APIs that allow us to directly copy data from userspace buffer
  * to VMBus ringbuffer.
  */
-#define HVS_SEND_BUF_SIZE (PAGE_SIZE_4K - sizeof(struct vmpipe_proto_header))
+#define HVS_SEND_BUF_SIZE (HV_HYP_PAGE_SIZE - sizeof(struct vmpipe_proto_header))
 
 struct hvs_send_buf {
 	/* The header before the payload data */
@@ -388,10 +389,10 @@ static void hvs_open_connection(struct vmbus_channel *chan)
 	} else {
 		sndbuf = max_t(int, sk->sk_sndbuf, RINGBUFFER_HVS_SND_SIZE);
 		sndbuf = min_t(int, sndbuf, RINGBUFFER_HVS_MAX_SIZE);
-		sndbuf = ALIGN(sndbuf, PAGE_SIZE);
+		sndbuf = ALIGN(sndbuf, HV_HYP_PAGE_SIZE);
 		rcvbuf = max_t(int, sk->sk_rcvbuf, RINGBUFFER_HVS_RCV_SIZE);
 		rcvbuf = min_t(int, rcvbuf, RINGBUFFER_HVS_MAX_SIZE);
-		rcvbuf = ALIGN(rcvbuf, PAGE_SIZE);
+		rcvbuf = ALIGN(rcvbuf, HV_HYP_PAGE_SIZE);
 	}
 
 	ret = vmbus_open(chan, sndbuf, rcvbuf, NULL, 0, hvs_channel_cb,
@@ -662,7 +663,7 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
 	ssize_t ret = 0;
 	ssize_t bytes_written = 0;
 
-	BUILD_BUG_ON(sizeof(*send_buf) != PAGE_SIZE_4K);
+	BUILD_BUG_ON(sizeof(*send_buf) != HV_HYP_PAGE_SIZE);
 
 	send_buf = kmalloc(sizeof(*send_buf), GFP_KERNEL);
 	if (!send_buf)
-- 
2.17.1


^ permalink raw reply related

* [PATCH 2/2] Drivers: hv: util: Specify ring buffer size using Hyper-V page size
From: Himadri Pandya @ 2019-07-25  5:03 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal
  Cc: linux-hyperv, linux-kernel, Himadri Pandya
In-Reply-To: <20190725050315.6935-1-himadri18.07@gmail.com>

VMbus ring buffers are sized based on the 4K page size used by
Hyper-V. The Linux guest page size may not be 4K on all architectures
so use the Hyper-V page size to specify the ring buffer size.

Signed-off-by: Himadri Pandya <himadri18.07@gmail.com>
---
 drivers/hv/hv_util.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index c2c08f26bd5f..766bd8457346 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -413,8 +413,9 @@ static int util_probe(struct hv_device *dev,
 
 	hv_set_drvdata(dev, srv);
 
-	ret = vmbus_open(dev->channel, 4 * PAGE_SIZE, 4 * PAGE_SIZE, NULL, 0,
-			srv->util_cb, dev->channel);
+	ret = vmbus_open(dev->channel, 4 * HV_HYP_PAGE_SIZE,
+			 4 * HV_HYP_PAGE_SIZE, NULL, 0, srv->util_cb,
+			 dev->channel);
 	if (ret)
 		goto error;
 
-- 
2.17.1


^ permalink raw reply related

* [PATCH 1/2] Drivers: hv: Specify receive buffer size using Hyper-V page size
From: Himadri Pandya @ 2019-07-25  5:03 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal
  Cc: linux-hyperv, linux-kernel, Himadri Pandya
In-Reply-To: <20190725050315.6935-1-himadri18.07@gmail.com>

The recv_buffer is used to retrieve data from the VMbus ring buffer.
VMbus ring buffers are sized based on the guest page size which
Hyper-V assumes to be 4KB. But it may be different on some
architectures. So use the Hyper-V page size to allocate the
recv_buffer and set the maximum size to receive.

Signed-off-by: Himadri Pandya <himadri18.07@gmail.com>
---
 drivers/hv/hv_fcopy.c    | 3 ++-
 drivers/hv/hv_kvp.c      | 3 ++-
 drivers/hv/hv_snapshot.c | 3 ++-
 drivers/hv/hv_util.c     | 8 ++++----
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
index 7e30ae0635cc..08fa4a5de644 100644
--- a/drivers/hv/hv_fcopy.c
+++ b/drivers/hv/hv_fcopy.c
@@ -13,6 +13,7 @@
 #include <linux/workqueue.h>
 #include <linux/hyperv.h>
 #include <linux/sched.h>
+#include <asm/hyperv-tlfs.h>
 
 #include "hyperv_vmbus.h"
 #include "hv_utils_transport.h"
@@ -234,7 +235,7 @@ void hv_fcopy_onchannelcallback(void *context)
 	if (fcopy_transaction.state > HVUTIL_READY)
 		return;
 
-	vmbus_recvpacket(channel, recv_buffer, PAGE_SIZE * 2, &recvlen,
+	vmbus_recvpacket(channel, recv_buffer, HV_HYP_PAGE_SIZE * 2, &recvlen,
 			 &requestid);
 	if (recvlen <= 0)
 		return;
diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
index 5054d1105236..ae7c028dc5a8 100644
--- a/drivers/hv/hv_kvp.c
+++ b/drivers/hv/hv_kvp.c
@@ -27,6 +27,7 @@
 #include <linux/connector.h>
 #include <linux/workqueue.h>
 #include <linux/hyperv.h>
+#include <asm/hyperv-tlfs.h>
 
 #include "hyperv_vmbus.h"
 #include "hv_utils_transport.h"
@@ -661,7 +662,7 @@ void hv_kvp_onchannelcallback(void *context)
 	if (kvp_transaction.state > HVUTIL_READY)
 		return;
 
-	vmbus_recvpacket(channel, recv_buffer, PAGE_SIZE * 4, &recvlen,
+	vmbus_recvpacket(channel, recv_buffer, HV_HYP_PAGE_SIZE * 4, &recvlen,
 			 &requestid);
 
 	if (recvlen > 0) {
diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c
index 20ba95b75a94..03b6454268b3 100644
--- a/drivers/hv/hv_snapshot.c
+++ b/drivers/hv/hv_snapshot.c
@@ -12,6 +12,7 @@
 #include <linux/connector.h>
 #include <linux/workqueue.h>
 #include <linux/hyperv.h>
+#include <asm/hyperv-tlfs.h>
 
 #include "hyperv_vmbus.h"
 #include "hv_utils_transport.h"
@@ -297,7 +298,7 @@ void hv_vss_onchannelcallback(void *context)
 	if (vss_transaction.state > HVUTIL_READY)
 		return;
 
-	vmbus_recvpacket(channel, recv_buffer, PAGE_SIZE * 2, &recvlen,
+	vmbus_recvpacket(channel, recv_buffer, HV_HYP_PAGE_SIZE * 2, &recvlen,
 			 &requestid);
 
 	if (recvlen > 0) {
diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index e32681ee7b9f..c2c08f26bd5f 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -136,7 +136,7 @@ static void shutdown_onchannelcallback(void *context)
 	struct icmsg_hdr *icmsghdrp;
 
 	vmbus_recvpacket(channel, shut_txf_buf,
-			 PAGE_SIZE, &recvlen, &requestid);
+			 HV_HYP_PAGE_SIZE, &recvlen, &requestid);
 
 	if (recvlen > 0) {
 		icmsghdrp = (struct icmsg_hdr *)&shut_txf_buf[
@@ -284,7 +284,7 @@ static void timesync_onchannelcallback(void *context)
 	u8 *time_txf_buf = util_timesynch.recv_buffer;
 
 	vmbus_recvpacket(channel, time_txf_buf,
-			 PAGE_SIZE, &recvlen, &requestid);
+			 HV_HYP_PAGE_SIZE, &recvlen, &requestid);
 
 	if (recvlen > 0) {
 		icmsghdrp = (struct icmsg_hdr *)&time_txf_buf[
@@ -346,7 +346,7 @@ static void heartbeat_onchannelcallback(void *context)
 	while (1) {
 
 		vmbus_recvpacket(channel, hbeat_txf_buf,
-				 PAGE_SIZE, &recvlen, &requestid);
+				 HV_HYP_PAGE_SIZE, &recvlen, &requestid);
 
 		if (!recvlen)
 			break;
@@ -390,7 +390,7 @@ static int util_probe(struct hv_device *dev,
 		(struct hv_util_service *)dev_id->driver_data;
 	int ret;
 
-	srv->recv_buffer = kmalloc(PAGE_SIZE * 4, GFP_KERNEL);
+	srv->recv_buffer = kmalloc(HV_HYP_PAGE_SIZE * 4, GFP_KERNEL);
 	if (!srv->recv_buffer)
 		return -ENOMEM;
 	srv->channel = dev->channel;
-- 
2.17.1


^ permalink raw reply related

* [PATCH 0/2] Drivers: hv: Specify buffer size using Hyper-V page size
From: Himadri Pandya @ 2019-07-25  5:03 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal
  Cc: linux-hyperv, linux-kernel, Himadri Pandya

recv_buffer and VMbus ring buffers are sized based on guest page size
which Hyper-V assumes to be 4KB. It might not be the case for some
architectures. Hence instead use the Hyper-V page size.

Himadri Pandya (2):
  Drivers: hv: Specify receive buffer size using Hyper-V page size
  Drivers: hv: util: Specify ring buffer size using Hyper-V page size

 drivers/hv/hv_fcopy.c    |  3 ++-
 drivers/hv/hv_kvp.c      |  3 ++-
 drivers/hv/hv_snapshot.c |  3 ++-
 drivers/hv/hv_util.c     | 13 +++++++------
 4 files changed, 13 insertions(+), 9 deletions(-)

-- 
2.17.1


^ permalink raw reply

* Re: [PATCH v3] locking/spinlocks, paravirt, hyperv: Correct the hv_nopvspin case
From: Zhenzhong Duan @ 2019-07-24  7:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, Sasha Levin,
	Juergen Gross, Boris Ostrovsky, Peter Zijlstra, Waiman Long,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, linux-hyperv
In-Reply-To: <1562120635-9806-1-git-send-email-zhenzhong.duan@oracle.com>

Hi Maintainers,

Any further comments on this? Thanks

Zhenzhong

On 2019/7/3 10:23, Zhenzhong Duan wrote:
> With the boot parameter "hv_nopvspin" specified, a Hyper-V guest should
> not make use of paravirt spinlocks, but behave as if running on bare
> metal. This is not true, however, as the qspinlock code will fall back
> to a test-and-set scheme when it detects a hypervisor.
>
> To avoid this, disable the virt_spin_lock_key.
>
> Same change for XEN is already in Commit e6fd28eb3522
> ("locking/spinlocks, paravirt, xen: Correct the xen_nopvspin case")
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Stephen Hemminger <sthemmin@microsoft.com>
> Cc: Sasha Levin <sashal@kernel.org>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Waiman Long <longman@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: linux-hyperv@vger.kernel.org
> ---
> v3: remove unlikely() as suggested by Sasha
>
>   arch/x86/hyperv/hv_spinlock.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/hyperv/hv_spinlock.c b/arch/x86/hyperv/hv_spinlock.c
> index 07f21a0..210495b 100644
> --- a/arch/x86/hyperv/hv_spinlock.c
> +++ b/arch/x86/hyperv/hv_spinlock.c
> @@ -64,6 +64,9 @@ __visible bool hv_vcpu_is_preempted(int vcpu)
>   
>   void __init hv_init_spinlocks(void)
>   {
> +	if (!hv_pvspin)
> +		static_branch_disable(&virt_spin_lock_key);
> +
>   	if (!hv_pvspin || !apic ||
>   	    !(ms_hyperv.hints & HV_X64_CLUSTER_IPI_RECOMMENDED) ||
>   	    !(ms_hyperv.features & HV_X64_MSR_GUEST_IDLE_AVAILABLE)) {

^ permalink raw reply

* Re: [PATCH v1] hv_sock: Use consistent types for UUIDs
From: David Miller @ 2019-07-23 20:58 UTC (permalink / raw)
  To: andriy.shevchenko; +Cc: haiyangz, kys, sthemmin, sashal, linux-hyperv, netdev
In-Reply-To: <20190723163943.65991-1-andriy.shevchenko@linux.intel.com>

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Tue, 23 Jul 2019 19:39:43 +0300

> The rest of Hyper-V code is using new types for UUID handling.
> Convert hv_sock as well.
> 
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Applied to net-next.

^ permalink raw reply

* [PATCH net-next] Name NICs based on vmbus offer and enable async probe by default
From: Haiyang Zhang @ 2019-07-23 19:02 UTC (permalink / raw)
  To: sashal@kernel.org, linux-hyperv@vger.kernel.org,
	netdev@vger.kernel.org
  Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de,
	vkuznets, davem@davemloft.net, linux-kernel@vger.kernel.org

Previously, async probing caused NICs to be named in random order.

The patch adds a dev_num field to the vmbus channel structure. It is
assigned the first available number when the channel is offered, so netvsc
can use it for NIC naming based on channel offer sequence. Now we re-enable
the async probing mode by default for faster probing.

Also add a module parameter, probe_type, to select sync probing mode if
a user wants it.
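
The "first available number" rule can be sketched in userspace as
follows. This is a simplified illustration, not the kernel code in the
patch: toy_first_free_num() is a hypothetical helper that works on a
plain array of already assigned numbers instead of walking the channel
list, and unassigned channels are represented by -1 entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the smallest non-negative number that does not appear in
 * used[0..n). Entries of -1 stand for channels without a number and
 * never match a candidate.
 */
static int toy_first_free_num(const int *used, size_t n)
{
	int candidate = 0;
	bool taken;

	assert(used != NULL || n == 0);

	do {
		taken = false;
		for (size_t i = 0; i < n; i++) {
			if (used[i] == candidate) {
				/* Number already in use: try the next one. */
				taken = true;
				candidate++;
				break;
			}
		}
	} while (taken);

	return candidate;
}
```

If channels holding 0 and 2 exist, a new offer gets 1, which reuses the
gap left by a removed device rather than always counting upwards.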

Fixes: af0a5646cb8d ("use the new async probing feature for the hyperv drivers")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/hv/channel_mgmt.c       | 46 +++++++++++++++++++++++++++++++++++++++--
 drivers/net/hyperv/netvsc_drv.c | 33 ++++++++++++++++++++++++++---
 include/linux/hyperv.h          |  4 ++++
 3 files changed, 78 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index addcef5..ab7c05b 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -304,6 +304,8 @@ bool vmbus_prep_negotiate_resp(struct icmsg_hdr *icmsghdrp,
 
 EXPORT_SYMBOL_GPL(vmbus_prep_negotiate_resp);
 
+#define HV_DEV_NUM_INVALID (-1)
+
 /*
  * alloc_channel - Allocate and initialize a vmbus channel object
  */
@@ -315,6 +317,8 @@ static struct vmbus_channel *alloc_channel(void)
 	if (!channel)
 		return NULL;
 
+	channel->dev_num = HV_DEV_NUM_INVALID;
+
 	spin_lock_init(&channel->lock);
 	init_completion(&channel->rescind_event);
 
@@ -533,6 +537,42 @@ static void vmbus_add_channel_work(struct work_struct *work)
 }
 
 /*
+ * Get the first available device number of its type, then
+ * record it in the channel structure.
+ */
+static void hv_set_devnum(struct vmbus_channel *newchannel)
+{
+	struct vmbus_channel *channel;
+	unsigned int i = 0;
+	bool found;
+
+	BUG_ON(!mutex_is_locked(&vmbus_connection.channel_mutex));
+
+	/* Only HV_NIC uses this number for now */
+	if (hv_get_dev_type(newchannel) != HV_NIC)
+		return;
+
+next:
+	found = false;
+
+	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
+		if (i == channel->dev_num &&
+		    guid_equal(&channel->offermsg.offer.if_type,
+			       &newchannel->offermsg.offer.if_type)) {
+			found = true;
+			break;
+		}
+	}
+
+	if (found) {
+		i++;
+		goto next;
+	}
+
+	newchannel->dev_num = i;
+}
+
+/*
  * vmbus_process_offer - Process the offer by creating a channel/device
  * associated with this offer
  */
@@ -561,10 +601,12 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
 		}
 	}
 
-	if (fnew)
+	if (fnew) {
+		hv_set_devnum(newchannel);
+
 		list_add_tail(&newchannel->listentry,
 			      &vmbus_connection.chn_list);
-	else {
+	} else {
 		/*
 		 * Check to see if this is a valid sub-channel.
 		 */
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index afdcc56..af53690 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -57,6 +57,10 @@
 module_param(debug, int, 0444);
 MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
 
+static unsigned int probe_type __ro_after_init = PROBE_PREFER_ASYNCHRONOUS;
+module_param(probe_type, uint, 0444);
+MODULE_PARM_DESC(probe_type, "Probe type: 1=async(default), 2=sync");
+
 static LIST_HEAD(netvsc_dev_list);
 
 static void netvsc_change_rx_flags(struct net_device *net, int change)
@@ -2233,10 +2237,19 @@ static int netvsc_probe(struct hv_device *dev,
 	struct net_device_context *net_device_ctx;
 	struct netvsc_device_info *device_info = NULL;
 	struct netvsc_device *nvdev;
+	char name[IFNAMSIZ];
 	int ret = -ENOMEM;
 
-	net = alloc_etherdev_mq(sizeof(struct net_device_context),
-				VRSS_CHANNEL_MAX);
+	if (probe_type == PROBE_PREFER_ASYNCHRONOUS) {
+		snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
+		net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
+				       NET_NAME_ENUM, ether_setup,
+				       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
+	} else {
+		net = alloc_etherdev_mq(sizeof(struct net_device_context),
+					VRSS_CHANNEL_MAX);
+	}
+
 	if (!net)
 		goto no_net;
 
@@ -2323,6 +2336,14 @@ static int netvsc_probe(struct hv_device *dev,
 		net->max_mtu = ETH_DATA_LEN;
 
 	ret = register_netdevice(net);
+
+	if (ret == -EEXIST) {
+		pr_info("NIC name %s exists, request another name.\n",
+			net->name);
+		strlcpy(net->name, "eth%d", IFNAMSIZ);
+		ret = register_netdevice(net);
+	}
+
 	if (ret != 0) {
 		pr_err("Unable to register netdev.\n");
 		goto register_failed;
@@ -2407,7 +2428,7 @@ static int netvsc_remove(struct hv_device *dev)
 	.probe = netvsc_probe,
 	.remove = netvsc_remove,
 	.driver = {
-		.probe_type = PROBE_FORCE_SYNCHRONOUS,
+		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
 	},
 };
 
@@ -2473,6 +2494,12 @@ static int __init netvsc_drv_init(void)
 	}
 	netvsc_ring_bytes = ring_size * PAGE_SIZE;
 
+	if (probe_type != PROBE_PREFER_ASYNCHRONOUS)
+		probe_type = PROBE_FORCE_SYNCHRONOUS;
+
+	netvsc_drv.driver.probe_type = probe_type;
+	pr_info("probe_type: %u\n", probe_type);
+
 	ret = vmbus_driver_register(&netvsc_drv);
 	if (ret)
 		return ret;
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6256cc3..12fc5ea 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -841,6 +841,10 @@ struct vmbus_channel {
 	 */
 	struct vmbus_channel *primary_channel;
 	/*
+	 * Used for device naming based on channel offer sequence.
+	 */
+	int dev_num;
+	/*
 	 * Support per-channel state for use by vmbus drivers.
 	 */
 	void *per_channel_state;
-- 
1.8.3.1


^ permalink raw reply related

* RE: [PATCH v1] hv_sock: Use consistent types for UUIDs
From: Dexuan Cui @ 2019-07-23 16:57 UTC (permalink / raw)
  To: Andy Shevchenko, Haiyang Zhang, KY Srinivasan, Stephen Hemminger,
	Sasha Levin, linux-hyperv@vger.kernel.org, David S. Miller,
	netdev@vger.kernel.org
In-Reply-To: <20190723163943.65991-1-andriy.shevchenko@linux.intel.com>

> From: linux-hyperv-owner@vger.kernel.org
> <linux-hyperv-owner@vger.kernel.org> On Behalf Of Andy Shevchenko
> Sent: Tuesday, July 23, 2019 9:40 AM
> 
> The rest of Hyper-V code is using new types for UUID handling.
> Convert hv_sock as well.
> 
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Reviewed-by: Dexuan Cui <decui@microsoft.com>

Looks good to me. Thanks, Andy!

Thanks,
-- Dexuan

^ permalink raw reply

* [PATCH v1] hv_sock: Use consistent types for UUIDs
From: Andy Shevchenko @ 2019-07-23 16:39 UTC (permalink / raw)
  To: Haiyang Zhang, K. Y. Srinivasan, Stephen Hemminger, Sasha Levin,
	linux-hyperv, David S. Miller, netdev
  Cc: Andy Shevchenko

The rest of Hyper-V code is using new types for UUID handling.
Convert hv_sock as well.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 net/vmw_vsock/hyperv_transport.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index f2084e3f7aa4..2a1719c0f8d2 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -77,11 +77,11 @@ struct hvs_send_buf {
 					 VMBUS_PKT_TRAILER_SIZE)
 
 union hvs_service_id {
-	uuid_le	srv_id;
+	guid_t	srv_id;
 
 	struct {
 		unsigned int svm_port;
-		unsigned char b[sizeof(uuid_le) - sizeof(unsigned int)];
+		unsigned char b[sizeof(guid_t) - sizeof(unsigned int)];
 	};
 };
 
@@ -89,8 +89,8 @@ union hvs_service_id {
 struct hvsock {
 	struct vsock_sock *vsk;
 
-	uuid_le vm_srv_id;
-	uuid_le host_srv_id;
+	guid_t vm_srv_id;
+	guid_t host_srv_id;
 
 	struct vmbus_channel *chan;
 	struct vmpacket_descriptor *recv_desc;
@@ -159,21 +159,21 @@ struct hvsock {
 #define MIN_HOST_EPHEMERAL_PORT		(MAX_HOST_LISTEN_PORT + 1)
 
 /* 00000000-facb-11e6-bd58-64006a7986d3 */
-static const uuid_le srv_id_template =
-	UUID_LE(0x00000000, 0xfacb, 0x11e6, 0xbd, 0x58,
-		0x64, 0x00, 0x6a, 0x79, 0x86, 0xd3);
+static const guid_t srv_id_template =
+	GUID_INIT(0x00000000, 0xfacb, 0x11e6, 0xbd, 0x58,
+		  0x64, 0x00, 0x6a, 0x79, 0x86, 0xd3);
 
-static bool is_valid_srv_id(const uuid_le *id)
+static bool is_valid_srv_id(const guid_t *id)
 {
-	return !memcmp(&id->b[4], &srv_id_template.b[4], sizeof(uuid_le) - 4);
+	return !memcmp(&id->b[4], &srv_id_template.b[4], sizeof(guid_t) - 4);
 }
 
-static unsigned int get_port_by_srv_id(const uuid_le *svr_id)
+static unsigned int get_port_by_srv_id(const guid_t *svr_id)
 {
 	return *((unsigned int *)svr_id);
 }
 
-static void hvs_addr_init(struct sockaddr_vm *addr, const uuid_le *svr_id)
+static void hvs_addr_init(struct sockaddr_vm *addr, const guid_t *svr_id)
 {
 	unsigned int port = get_port_by_srv_id(svr_id);
 
@@ -316,7 +316,7 @@ static void hvs_close_connection(struct vmbus_channel *chan)
 
 static void hvs_open_connection(struct vmbus_channel *chan)
 {
-	uuid_le *if_instance, *if_type;
+	guid_t *if_instance, *if_type;
 	unsigned char conn_from_host;
 
 	struct sockaddr_vm addr;
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH v3 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently
From: Peter Zijlstra @ 2019-07-22 19:32 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Andy Lutomirski, Dave Hansen, the arch/x86 maintainers, LKML,
	Thomas Gleixner, Ingo Molnar, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Sasha Levin, Borislav Petkov, Juergen Gross,
	Paolo Bonzini, Boris Ostrovsky, linux-hyperv@vger.kernel.org,
	virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
	xen-devel@lists.xenproject.org
In-Reply-To: <58DA0841-33C2-4D16-A671-08064A15001C@vmware.com>

On Mon, Jul 22, 2019 at 07:27:09PM +0000, Nadav Amit wrote:
> > On Jul 22, 2019, at 12:14 PM, Peter Zijlstra <peterz@infradead.org> wrote:

> > But then we can still do something like the below, which doesn't change
> > things and still gets rid of that dual function crud, simplifying
> > smp_call_function_many again.

> Nice! I will add it on top, if you don’t mind (instead of squashing it).

Not at all.

> The original decision to have local/remote functions was mostly to provide
> generality.
> 
> I would change the last argument of __smp_call_function_many() from “wait”
> to “flags” that would indicate whether to run the function locally, since I
> don’t want to change the semantics of smp_call_function_many() and decide
> whether to run the function locally purely based on the mask. Let me know if
> you disagree.

Agreed.

^ permalink raw reply

* Re: [PATCH v3 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently
From: Nadav Amit @ 2019-07-22 19:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Lutomirski, Dave Hansen, the arch/x86 maintainers, LKML,
	Thomas Gleixner, Ingo Molnar, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Sasha Levin, Borislav Petkov, Juergen Gross,
	Paolo Bonzini, Boris Ostrovsky, linux-hyperv@vger.kernel.org,
	virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
	xen-devel@lists.xenproject.org
In-Reply-To: <20190722191433.GD6698@worktop.programming.kicks-ass.net>

> On Jul 22, 2019, at 12:14 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Thu, Jul 18, 2019 at 05:58:32PM -0700, Nadav Amit wrote:
>> @@ -709,8 +716,9 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
>> 	 * doing a speculative memory access.
>> 	 */
>> 	if (info->freed_tables) {
>> -		smp_call_function_many(cpumask, flush_tlb_func_remote,
>> -			       (void *)info, 1);
>> +		__smp_call_function_many(cpumask, flush_tlb_func_remote,
>> +					 flush_tlb_func_local,
>> +					 (void *)info, 1);
>> 	} else {
>> 		/*
>> 		 * Although we could have used on_each_cpu_cond_mask(),
>> @@ -737,7 +745,8 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
>> 			if (tlb_is_not_lazy(cpu))
>> 				__cpumask_set_cpu(cpu, cond_cpumask);
>> 		}
>> -		smp_call_function_many(cond_cpumask, flush_tlb_func_remote,
>> +		__smp_call_function_many(cond_cpumask, flush_tlb_func_remote,
>> +					 flush_tlb_func_local,
>> 					 (void *)info, 1);
>> 	}
>> }
> 
> Do we really need that _local/_remote distinction? ISTR you had a patch
> that frobbed flush_tlb_info into the csd and that gave space
> constraints, but I'm not seeing that here (probably wise, get stuff
> merged etc.).
> 
> struct __call_single_data {
>        struct llist_node          llist;                /*     0     8 */
>        smp_call_func_t            func;                 /*     8     8 */
>        void *                     info;                 /*    16     8 */
>        unsigned int               flags;                /*    24     4 */
> 
>        /* size: 32, cachelines: 1, members: 4 */
>        /* padding: 4 */
>        /* last cacheline: 32 bytes */
> };
> 
> struct flush_tlb_info {
>        struct mm_struct *         mm;                   /*     0     8 */
>        long unsigned int          start;                /*     8     8 */
>        long unsigned int          end;                  /*    16     8 */
>        u64                        new_tlb_gen;          /*    24     8 */
>        unsigned int               stride_shift;         /*    32     4 */
>        bool                       freed_tables;         /*    36     1 */
> 
>        /* size: 40, cachelines: 1, members: 6 */
>        /* padding: 3 */
>        /* last cacheline: 40 bytes */
> };
> 
> IIRC what you did was make void *__call_single_data::info the last
> member and a union until the full cacheline size (64). Given the above
> that would get us 24 bytes for csd, leaving us 40 for that
> flush_tlb_info.
> 
> But then we can still do something like the below, which doesn't change
> things and still gets rid of that dual function crud, simplifying
> smp_call_function_many again.
> 
> Index: linux-2.6/arch/x86/include/asm/tlbflush.h
> ===================================================================
> --- linux-2.6.orig/arch/x86/include/asm/tlbflush.h
> +++ linux-2.6/arch/x86/include/asm/tlbflush.h
> @@ -546,8 +546,9 @@ struct flush_tlb_info {
> 	unsigned long		start;
> 	unsigned long		end;
> 	u64			new_tlb_gen;
> -	unsigned int		stride_shift;
> -	bool			freed_tables;
> +	unsigned int		cpu;
> +	unsigned short		stride_shift;
> +	unsigned char		freed_tables;
> };
> 
> #define local_flush_tlb() __flush_tlb()
> Index: linux-2.6/arch/x86/mm/tlb.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/mm/tlb.c
> +++ linux-2.6/arch/x86/mm/tlb.c
> @@ -659,6 +659,27 @@ static void flush_tlb_func_remote(void *
> 	flush_tlb_func_common(f, false, TLB_REMOTE_SHOOTDOWN);
> }
> 
> +static void flush_tlb_func(void *info)
> +{
> +	const struct flush_tlb_info *f = info;
> +	enum tlb_flush_reason reason = TLB_REMOTE_SHOOTDOWN;
> +	bool local = false;
> +
> +	if (f->cpu == smp_processor_id()) {
> +		local = true;
> +		reason = (f->mm == NULL) ? TLB_LOCAL_SHOOTDOWN : TLB_LOCAL_MM_SHOOTDOWN;
> +	} else {
> +		inc_irq_stat(irq_tlb_count);
> +
> +		if (f->mm && f->mm != this_cpu_read(cpu_tlbstate.loaded_mm))
> +			return;
> +
> +		count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
> +	}
> +
> +	flush_tlb_func_common(f, local, reason);
> +}
> +
> static bool tlb_is_not_lazy(int cpu)
> {
> 	return !per_cpu(cpu_tlbstate_shared.is_lazy, cpu);

Nice! I will add it on top, if you don’t mind (instead of squashing it).

The original decision to have local/remote functions was mostly to provide
generality.

I would change the last argument of __smp_call_function_many() from “wait”
to “flags” that would indicate whether to run the function locally, since I
don’t want to change the semantics of smp_call_function_many() and decide
whether to run the function locally purely based on the mask. Let me know if
you disagree.
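
A minimal userspace sketch of the proposed "flags" argument follows. It models only the dispatch decision, not IPIs or waiting. The flag names (CALL_WAIT, CALL_RUN_LOCAL), the bitmask standing in for struct cpumask, and the helper names are illustrative assumptions, not existing kernel symbols.

```c
#include <assert.h>

#define CALL_WAIT	(1U << 0)	/* hypothetical: wait for remote completion */
#define CALL_RUN_LOCAL	(1U << 1)	/* hypothetical: also run on the calling CPU */

static_assert((CALL_WAIT & CALL_RUN_LOCAL) == 0, "flag bits must not overlap");

typedef void (*call_func_t)(void *info, int cpu);

/* Test helper: counts how many times it was invoked. */
static void count_call(void *info, int cpu)
{
	(void)cpu;
	(*(int *)info)++;
}

/*
 * Model of the dispatch decision. A plain 32-bit mask stands in for
 * struct cpumask. Returns the mask of CPUs on which func ran.
 */
static unsigned int call_function_many(unsigned int mask, int this_cpu,
				       call_func_t func, void *info,
				       unsigned int flags)
{
	unsigned int ran = 0;
	int cpu;

	for (cpu = 0; cpu < 32; cpu++) {
		if (!(mask & (1U << cpu)))
			continue;
		/* The calling CPU is skipped unless the caller asked for it. */
		if (cpu == this_cpu && !(flags & CALL_RUN_LOCAL))
			continue;
		func(info, cpu);
		ran |= 1U << cpu;
	}
	return ran;
}
```

With this shape, a caller that passes only the wait flag keeps the existing "everyone in the mask except me" behaviour, and the local run is an explicit opt-in rather than something inferred from the mask.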

^ permalink raw reply

* Re: [PATCH v3 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently
From: Peter Zijlstra @ 2019-07-22 19:14 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Andy Lutomirski, Dave Hansen, x86, linux-kernel, Thomas Gleixner,
	Ingo Molnar, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Sasha Levin, Borislav Petkov, Juergen Gross, Paolo Bonzini,
	Boris Ostrovsky, linux-hyperv, virtualization, kvm, xen-devel
In-Reply-To: <20190719005837.4150-5-namit@vmware.com>

On Thu, Jul 18, 2019 at 05:58:32PM -0700, Nadav Amit wrote:
> @@ -709,8 +716,9 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
>  	 * doing a speculative memory access.
>  	 */
>  	if (info->freed_tables) {
> -		smp_call_function_many(cpumask, flush_tlb_func_remote,
> -			       (void *)info, 1);
> +		__smp_call_function_many(cpumask, flush_tlb_func_remote,
> +					 flush_tlb_func_local,
> +					 (void *)info, 1);
>  	} else {
>  		/*
>  		 * Although we could have used on_each_cpu_cond_mask(),
> @@ -737,7 +745,8 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
>  			if (tlb_is_not_lazy(cpu))
>  				__cpumask_set_cpu(cpu, cond_cpumask);
>  		}
> -		smp_call_function_many(cond_cpumask, flush_tlb_func_remote,
> +		__smp_call_function_many(cond_cpumask, flush_tlb_func_remote,
> +					 flush_tlb_func_local,
>  					 (void *)info, 1);
>  	}
>  }

Do we really need that _local/_remote distinction? ISTR you had a patch
that frobbed flush_tlb_info into the csd and that gave space
constraints, but I'm not seeing that here (probably wise, get stuff
merged etc.).

struct __call_single_data {
        struct llist_node          llist;                /*     0     8 */
        smp_call_func_t            func;                 /*     8     8 */
        void *                     info;                 /*    16     8 */
        unsigned int               flags;                /*    24     4 */

        /* size: 32, cachelines: 1, members: 4 */
        /* padding: 4 */
        /* last cacheline: 32 bytes */
};

struct flush_tlb_info {
        struct mm_struct *         mm;                   /*     0     8 */
        long unsigned int          start;                /*     8     8 */
        long unsigned int          end;                  /*    16     8 */
        u64                        new_tlb_gen;          /*    24     8 */
        unsigned int               stride_shift;         /*    32     4 */
        bool                       freed_tables;         /*    36     1 */

        /* size: 40, cachelines: 1, members: 6 */
        /* padding: 3 */
        /* last cacheline: 40 bytes */
};

IIRC what you did was make void *__call_single_data::info the last
member and a union until the full cacheline size (64). Given the above
that would get us 24 bytes for csd, leaving us 40 for that
flush_tlb_info.

But then we can still do something like the below, which doesn't change
things and still gets rid of that dual function crud, simplifying
smp_call_function_many again.

Index: linux-2.6/arch/x86/include/asm/tlbflush.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/tlbflush.h
+++ linux-2.6/arch/x86/include/asm/tlbflush.h
@@ -546,8 +546,9 @@ struct flush_tlb_info {
 	unsigned long		start;
 	unsigned long		end;
 	u64			new_tlb_gen;
-	unsigned int		stride_shift;
-	bool			freed_tables;
+	unsigned int		cpu;
+	unsigned short		stride_shift;
+	unsigned char		freed_tables;
 };
 
 #define local_flush_tlb() __flush_tlb()
Index: linux-2.6/arch/x86/mm/tlb.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/tlb.c
+++ linux-2.6/arch/x86/mm/tlb.c
@@ -659,6 +659,27 @@ static void flush_tlb_func_remote(void *
 	flush_tlb_func_common(f, false, TLB_REMOTE_SHOOTDOWN);
 }
 
+static void flush_tlb_func(void *info)
+{
+	const struct flush_tlb_info *f = info;
+	enum tlb_flush_reason reason = TLB_REMOTE_SHOOTDOWN;
+	bool local = false;
+
+	if (f->cpu == smp_processor_id()) {
+		local = true;
+		reason = (f->mm == NULL) ? TLB_LOCAL_SHOOTDOWN : TLB_LOCAL_MM_SHOOTDOWN;
+	} else {
+		inc_irq_stat(irq_tlb_count);
+
+		if (f->mm && f->mm != this_cpu_read(cpu_tlbstate.loaded_mm))
+			return;
+
+		count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
+	}
+
+	flush_tlb_func_common(f, local, reason);
+}
+
 static bool tlb_is_not_lazy(int cpu)
 {
 	return !per_cpu(cpu_tlbstate_shared.is_lazy, cpu);
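
The size arithmetic above, and the fact that the reshuffled struct does not grow, can be checked with a small userspace sketch. It assumes an LP64 target (x86-64 Linux) and 64-byte cache lines; the stand-in types and the csd_head/csd_inline names are hypothetical and only mirror the pahole output, they are not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64	/* assumption: 64-byte cache lines */

/* Current layout, as in the pahole dump: 37 bytes of members, 3 of padding. */
struct flush_tlb_info_old {
	void		*mm;		/* stands in for struct mm_struct * */
	unsigned long	start;
	unsigned long	end;
	uint64_t	new_tlb_gen;
	unsigned int	stride_shift;
	bool		freed_tables;
};

/* Layout from the diff: cpu is added, the last two members are narrowed. */
struct flush_tlb_info_new {
	void		*mm;
	unsigned long	start;
	unsigned long	end;
	uint64_t	new_tlb_gen;
	unsigned int	cpu;
	unsigned short	stride_shift;
	unsigned char	freed_tables;
};

/* csd without the info pointer: llist, func and flags pad out to 24 bytes. */
struct csd_head {
	void		*llist_next;	/* stands in for struct llist_node */
	void		(*func)(void *);
	unsigned int	flags;
};

/* Hypothetical combined object: csd header followed by the inline payload. */
struct csd_inline {
	struct csd_head			head;
	struct flush_tlb_info_new	info;
};

static_assert(sizeof(struct flush_tlb_info_old) == 40, "old layout is 40 bytes");
static_assert(sizeof(struct flush_tlb_info_new) == 40, "adding cpu must not grow it");
static_assert(sizeof(struct csd_head) == 24, "csd minus info is 24 bytes");
static_assert(sizeof(struct csd_inline) == CACHELINE, "24 + 40 fills one cache line");
```

The extra cpu field fits because stride_shift and freed_tables shrink to two bytes and one byte, so 32 + 4 + 2 + 1 = 39 bytes of members still round up to 40.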


^ permalink raw reply

* Re: [PATCH] hv: Use the correct style for SPDX License Identifier
From: Greg Kroah-Hartman @ 2019-07-22 14:08 UTC (permalink / raw)
  To: Nishad Kamdar
  Cc: K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, Sasha Levin,
	Joe Perches, Uwe Kleine-König, linux-hyperv, linux-kernel
In-Reply-To: <20190722133112.GA7990@nishad>

On Mon, Jul 22, 2019 at 07:01:17PM +0530, Nishad Kamdar wrote:
> This patch corrects the SPDX License Identifier style
> in the trace header file related to Microsoft Hyper-V
> client drivers.
> For C header files, Documentation/process/license-rules.rst
> mandates C-like comments (as opposed to C source files, where
> C++ style should be used).
> 
> Changes made by using a script provided by Joe Perches here:
> https://lkml.org/lkml/2019/2/7/46
> 
> Suggested-by: Joe Perches <joe@perches.com>
> Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
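
For reference, the rule quoted above can be sketched as a small userspace check. The helper names are hypothetical, the input is assumed to be the first line of the file without its trailing newline, and only .c and .h files are covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SPDX_TAG "SPDX-License-Identifier: "

static bool has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

static bool has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && strcmp(s + n - m, suffix) == 0;
}

/*
 * Hypothetical helper: returns true if first_line uses the comment style
 * that license-rules.rst asks for, given the file type of filename.
 */
static bool spdx_style_ok(const char *filename, const char *first_line)
{
	assert(filename && first_line);

	/* C headers: C-like comment. */
	if (has_suffix(filename, ".h"))
		return has_prefix(first_line, "/* " SPDX_TAG) &&
		       has_suffix(first_line, "*/");

	/* C sources: C++-style comment. */
	if (has_suffix(filename, ".c"))
		return has_prefix(first_line, "// " SPDX_TAG);

	return false;	/* other file types are outside this sketch */
}
```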

^ permalink raw reply

