From: Dexuan Cui <decui@microsoft.com>
To: gregkh@linuxfoundation.org, davem@davemloft.net,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
devel@linuxdriverproject.org, olaf@aepfle.de, apw@canonical.com,
jasowang@redhat.com, cavery@redhat.com, kys@microsoft.com,
haiyangz@microsoft.com
Cc: joe@perches.com, vkuznets@redhat.com
Subject: [PATCH v11 net-next 0/1] introduce Hyper-V VM Sockets(hv_sock)
Date: Sun, 15 May 2016 09:52:42 -0700 [thread overview]
Message-ID: <1463331162-6679-1-git-send-email-decui@microsoft.com> (raw)
Hyper-V Sockets (hv_sock) supplies a byte-stream based communication
mechanism between the host and the guest. It's somewhat like TCP over
VMBus, but the transportation layer (VMBus) is much simpler than IP.
With Hyper-V Sockets, applications between the host and the guest can talk
to each other directly by the traditional BSD-style socket APIs.
Hyper-V Sockets is only available on new Windows hosts, like Windows Server
2016. More info is in this article "Make your own integration services":
https://msdn.microsoft.com/en-us/virtualization/hyperv_on_windows/develop/make_mgmt_service
The patch implements the necessary support in the guest side by
introducing a new socket address family AF_HYPERV.
You can also get the patch by:
https://github.com/dcui/linux/commits/decui/hv_sock/net-next/20160512_v10
Note: the VMBus driver side's supporting patches have been in the mainline
tree.
I know the kernel has already had a VM Sockets driver (AF_VSOCK) based
on VMware VMCI (net/vmw_vsock/, drivers/misc/vmw_vmci), and KVM is
proposing AF_VSOCK of virtio version:
http://marc.info/?l=linux-netdev&m=145952064004765&w=2
However, though Hyper-V Sockets may seem conceptually similar to
AF_VOSCK, there are differences in the transportation layer, and IMO these
make the direct code reusing impractical:
1. In AF_VSOCK, the endpoint type is: <u32 ContextID, u32 Port>, but in
AF_HYPERV, the endpoint type is: <GUID VM_ID, GUID ServiceID>. Here GUID
is 128-bit.
2. AF_VSOCK supports SOCK_DGRAM, while AF_HYPERV doesn't.
3. AF_VSOCK supports some special sock opts, like SO_VM_SOCKETS_BUFFER_SIZE,
SO_VM_SOCKETS_BUFFER_MIN/MAX_SIZE and SO_VM_SOCKETS_CONNECT_TIMEOUT.
These are meaningless to AF_HYPERV.
4. Some AF_VSOCK's VMCI transportation ops are meanless to AF_HYPERV/VMBus,
like .notify_recv_init
.notify_recv_pre_block
.notify_recv_pre_dequeue
.notify_recv_post_dequeue
.notify_send_init
.notify_send_pre_block
.notify_send_pre_enqueue
.notify_send_post_enqueue
etc.
So I think we'd better introduce a new address family: AF_HYPERV.
Please review the patch.
Looking forward to your comments, especially comments from David. :-)
Changes since v1:
- updated "[PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature"
- added __init and __exit for the module init/exit functions
- net/hv_sock/Kconfig: "default m" -> "default m if HYPERV"
- MODULE_LICENSE: "Dual MIT/GPL" -> "Dual BSD/GPL"
Changes since v2:
- fixed various coding issue pointed out by David Miller
- fixed indentation issues
- removed pr_debug in net/hv_sock/af_hvsock.c
- used reverse-Chrismas-tree style for local variables.
- EXPORT_SYMBOL -> EXPORT_SYMBOL_GPL
Changes since v3:
- fixed a few coding issue pointed by Vitaly Kuznetsov and Dan Carpenter
- fixed the ret value in vmbus_recvpacket_hvsock on error
- fixed the style of multi-line comment: vmbus_get_hvsock_rw_status()
Changes since v4 (https://lkml.org/lkml/2015/7/28/404):
- addressed all the comments about V4.
- treat the hvsock offers/channels as special VMBus devices
- add a mechanism to pass hvsock events to the hvsock driver
- fixed some corner cases with proper locking when a connection is closed
- rebased to the latest Greg's tree
Changes since v5 (https://lkml.org/lkml/2015/12/24/103):
- addressed the coding style issues (Vitaly Kuznetsov & David Miller, thanks!)
- used a better coding for the per-channel rescind callback (Thank Vitaly!)
- avoided the introduction of new VMBUS driver APIs vmbus_sendpacket_hvsock()
and vmbus_recvpacket_hvsock() and used vmbus_sendpacket()/vmbus_recvpacket()
in the higher level (i.e., the vmsock driver). Thank Vitaly!
Changes since v6 (http://lkml.iu.edu/hypermail/linux/kernel/1601.3/01813.html)
- only a few minor changes of coding style and comments
Changes since v7
- a few minor changes of coding style: thanks, Joe Perches!
- added some lines of comments about GUID/UUID before the struct sockaddr_hv.
Changes since v8
- removed the unnecessary __packed for some definitions: thanks, David!
- hvsock_open_connection: use offer.u.pipe.user_def[0] to know the connection
and reorganized the function
direction
- reorganized the code according to suggestions from Cathy Avery: split big
functions into small ones, set .setsockopt and getsockopt to
sock_no_setsockopt/sock_no_getsockopt
- inline'd some small list helper functions
Changes since v9
- minimized struct hvsock_sock by making the send/recv buffers pointers.
the buffers are allocated by kmalloc() in __hvsock_create() now.
- minimized the sizes of the send/recv buffers and the vmbus ringbuffers.
Changes since v10
1) add module params: send_ring_page, recv_ring_page. They can be used to
enlarge the ringbuffer size to get better performance, e.g.,
# modprobe hv_sock recv_ring_page=16 send_ring_page=16
By default, recv_ring_page is 3 and send_ring_page is 2.
2) add module param max_socket_number (the default is 1024).
A user can enlarge the number to create more than 1024 hv_sock sockets.
By default, 1024 sockets take about 1024 * (3+2+1+1) * 4KB = 28M bytes.
(Here 1+1 means 1 page for send/recv buffers per connection, respectively.)
3) implement the TODO in hvsock_shutdown().
4) fix a bug in hvsock_close_connection():
I remove "sk->sk_socket->state = SS_UNCONNECTED;" -- actually this line
is not really useful. For a connection triggered by a host app’s connect(),
sk->sk_socket remains NULL before the connection is accepted by the server
app (in Linux VM): see hvsock_accept() -> hvsock_accept_wait() ->
sock_graft(connected, newsock). If the host app exits before the server
app’s accept() returns, the host can send a rescind-message to close the
connection and later in the Linux VM’s message handler
i.e. vmbus_onoffer_rescind()), Linux will get a NULL de-referencing crash.
5) fix a bug in hvsock_open_connection()
I move the vmbus_set_chn_rescind_callback() to a later place, because
when vmbus_open() fails, hvsock_close_connection() can do nothing and we
count on vmbus_onoffer_rescind() -> vmbus_device_unregister() to clean up
the device.
6) some stylistic modificiation.
Dexuan Cui (1):
hv_sock: introduce Hyper-V Sockets
MAINTAINERS | 2 +
include/linux/hyperv.h | 14 +
include/linux/socket.h | 4 +-
include/net/af_hvsock.h | 78 +++
include/uapi/linux/hyperv.h | 25 +
net/Kconfig | 1 +
net/Makefile | 1 +
net/hv_sock/Kconfig | 10 +
net/hv_sock/Makefile | 3 +
net/hv_sock/af_hvsock.c | 1520 +++++++++++++++++++++++++++++++++++++++++++
10 files changed, 1657 insertions(+), 1 deletion(-)
create mode 100644 include/net/af_hvsock.h
create mode 100644 net/hv_sock/Kconfig
create mode 100644 net/hv_sock/Makefile
create mode 100644 net/hv_sock/af_hvsock.c
--
2.7.4
WARNING: multiple messages have this Message-ID (diff)
From: Dexuan Cui <decui@microsoft.com>
To: gregkh@linuxfoundation.org, davem@davemloft.net,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
devel@linuxdriverproject.org, olaf@aepfle.de, apw@canonical.com,
jasowang@redhat.com, cavery@redhat.com, kys@microsoft.com,
haiyangz@microsoft.com
Cc: joe@perches.com
Subject: [PATCH v11 net-next 0/1] introduce Hyper-V VM Sockets(hv_sock)
Date: Sun, 15 May 2016 09:52:42 -0700 [thread overview]
Message-ID: <1463331162-6679-1-git-send-email-decui@microsoft.com> (raw)
Hyper-V Sockets (hv_sock) supplies a byte-stream based communication
mechanism between the host and the guest. It's somewhat like TCP over
VMBus, but the transportation layer (VMBus) is much simpler than IP.
With Hyper-V Sockets, applications between the host and the guest can talk
to each other directly by the traditional BSD-style socket APIs.
Hyper-V Sockets is only available on new Windows hosts, like Windows Server
2016. More info is in this article "Make your own integration services":
https://msdn.microsoft.com/en-us/virtualization/hyperv_on_windows/develop/make_mgmt_service
The patch implements the necessary support in the guest side by
introducing a new socket address family AF_HYPERV.
You can also get the patch by:
https://github.com/dcui/linux/commits/decui/hv_sock/net-next/20160512_v10
Note: the VMBus driver side's supporting patches have been in the mainline
tree.
I know the kernel has already had a VM Sockets driver (AF_VSOCK) based
on VMware VMCI (net/vmw_vsock/, drivers/misc/vmw_vmci), and KVM is
proposing AF_VSOCK of virtio version:
http://marc.info/?l=linux-netdev&m=145952064004765&w=2
However, though Hyper-V Sockets may seem conceptually similar to
AF_VOSCK, there are differences in the transportation layer, and IMO these
make the direct code reusing impractical:
1. In AF_VSOCK, the endpoint type is: <u32 ContextID, u32 Port>, but in
AF_HYPERV, the endpoint type is: <GUID VM_ID, GUID ServiceID>. Here GUID
is 128-bit.
2. AF_VSOCK supports SOCK_DGRAM, while AF_HYPERV doesn't.
3. AF_VSOCK supports some special sock opts, like SO_VM_SOCKETS_BUFFER_SIZE,
SO_VM_SOCKETS_BUFFER_MIN/MAX_SIZE and SO_VM_SOCKETS_CONNECT_TIMEOUT.
These are meaningless to AF_HYPERV.
4. Some AF_VSOCK's VMCI transportation ops are meanless to AF_HYPERV/VMBus,
like .notify_recv_init
.notify_recv_pre_block
.notify_recv_pre_dequeue
.notify_recv_post_dequeue
.notify_send_init
.notify_send_pre_block
.notify_send_pre_enqueue
.notify_send_post_enqueue
etc.
So I think we'd better introduce a new address family: AF_HYPERV.
Please review the patch.
Looking forward to your comments, especially comments from David. :-)
Changes since v1:
- updated "[PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature"
- added __init and __exit for the module init/exit functions
- net/hv_sock/Kconfig: "default m" -> "default m if HYPERV"
- MODULE_LICENSE: "Dual MIT/GPL" -> "Dual BSD/GPL"
Changes since v2:
- fixed various coding issue pointed out by David Miller
- fixed indentation issues
- removed pr_debug in net/hv_sock/af_hvsock.c
- used reverse-Chrismas-tree style for local variables.
- EXPORT_SYMBOL -> EXPORT_SYMBOL_GPL
Changes since v3:
- fixed a few coding issue pointed by Vitaly Kuznetsov and Dan Carpenter
- fixed the ret value in vmbus_recvpacket_hvsock on error
- fixed the style of multi-line comment: vmbus_get_hvsock_rw_status()
Changes since v4 (https://lkml.org/lkml/2015/7/28/404):
- addressed all the comments about V4.
- treat the hvsock offers/channels as special VMBus devices
- add a mechanism to pass hvsock events to the hvsock driver
- fixed some corner cases with proper locking when a connection is closed
- rebased to the latest Greg's tree
Changes since v5 (https://lkml.org/lkml/2015/12/24/103):
- addressed the coding style issues (Vitaly Kuznetsov & David Miller, thanks!)
- used a better coding for the per-channel rescind callback (Thank Vitaly!)
- avoided the introduction of new VMBUS driver APIs vmbus_sendpacket_hvsock()
and vmbus_recvpacket_hvsock() and used vmbus_sendpacket()/vmbus_recvpacket()
in the higher level (i.e., the vmsock driver). Thank Vitaly!
Changes since v6 (http://lkml.iu.edu/hypermail/linux/kernel/1601.3/01813.html)
- only a few minor changes of coding style and comments
Changes since v7
- a few minor changes of coding style: thanks, Joe Perches!
- added some lines of comments about GUID/UUID before the struct sockaddr_hv.
Changes since v8
- removed the unnecessary __packed for some definitions: thanks, David!
- hvsock_open_connection: use offer.u.pipe.user_def[0] to know the connection
and reorganized the function
direction
- reorganized the code according to suggestions from Cathy Avery: split big
functions into small ones, set .setsockopt and getsockopt to
sock_no_setsockopt/sock_no_getsockopt
- inline'd some small list helper functions
Changes since v9
- minimized struct hvsock_sock by making the send/recv buffers pointers.
the buffers are allocated by kmalloc() in __hvsock_create() now.
- minimized the sizes of the send/recv buffers and the vmbus ringbuffers.
Changes since v10
1) add module params: send_ring_page, recv_ring_page. They can be used to
enlarge the ringbuffer size to get better performance, e.g.,
# modprobe hv_sock recv_ring_page=16 send_ring_page=16
By default, recv_ring_page is 3 and send_ring_page is 2.
2) add module param max_socket_number (the default is 1024).
A user can enlarge the number to create more than 1024 hv_sock sockets.
By default, 1024 sockets take about 1024 * (3+2+1+1) * 4KB = 28M bytes.
(Here 1+1 means 1 page for send/recv buffers per connection, respectively.)
3) implement the TODO in hvsock_shutdown().
4) fix a bug in hvsock_close_connection():
I remove "sk->sk_socket->state = SS_UNCONNECTED;" -- actually this line
is not really useful. For a connection triggered by a host app’s connect(),
sk->sk_socket remains NULL before the connection is accepted by the server
app (in Linux VM): see hvsock_accept() -> hvsock_accept_wait() ->
sock_graft(connected, newsock). If the host app exits before the server
app’s accept() returns, the host can send a rescind-message to close the
connection and later in the Linux VM’s message handler
i.e. vmbus_onoffer_rescind()), Linux will get a NULL de-referencing crash.
5) fix a bug in hvsock_open_connection()
I move the vmbus_set_chn_rescind_callback() to a later place, because
when vmbus_open() fails, hvsock_close_connection() can do nothing and we
count on vmbus_onoffer_rescind() -> vmbus_device_unregister() to clean up
the device.
6) some stylistic modificiation.
Dexuan Cui (1):
hv_sock: introduce Hyper-V Sockets
MAINTAINERS | 2 +
include/linux/hyperv.h | 14 +
include/linux/socket.h | 4 +-
include/net/af_hvsock.h | 78 +++
include/uapi/linux/hyperv.h | 25 +
net/Kconfig | 1 +
net/Makefile | 1 +
net/hv_sock/Kconfig | 10 +
net/hv_sock/Makefile | 3 +
net/hv_sock/af_hvsock.c | 1520 +++++++++++++++++++++++++++++++++++++++++++
10 files changed, 1657 insertions(+), 1 deletion(-)
create mode 100644 include/net/af_hvsock.h
create mode 100644 net/hv_sock/Kconfig
create mode 100644 net/hv_sock/Makefile
create mode 100644 net/hv_sock/af_hvsock.c
--
2.7.4
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
next reply other threads:[~2016-05-15 15:12 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-15 16:52 Dexuan Cui [this message]
2016-05-15 16:52 ` [PATCH v11 net-next 0/1] introduce Hyper-V VM Sockets(hv_sock) Dexuan Cui
2016-05-15 17:16 ` David Miller
2016-05-15 17:16 ` David Miller
2016-05-17 2:45 ` Dexuan Cui
2016-05-17 2:45 ` Dexuan Cui
2016-05-19 0:59 ` Dexuan Cui
2016-05-19 0:59 ` Dexuan Cui
2016-05-19 1:05 ` gregkh
2016-05-19 1:05 ` gregkh
2016-05-19 4:12 ` David Miller
2016-05-19 4:12 ` David Miller
2016-05-19 5:41 ` Dexuan Cui
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1463331162-6679-1-git-send-email-decui@microsoft.com \
--to=decui@microsoft.com \
--cc=apw@canonical.com \
--cc=cavery@redhat.com \
--cc=davem@davemloft.net \
--cc=devel@linuxdriverproject.org \
--cc=gregkh@linuxfoundation.org \
--cc=haiyangz@microsoft.com \
--cc=jasowang@redhat.com \
--cc=joe@perches.com \
--cc=kys@microsoft.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=olaf@aepfle.de \
--cc=vkuznets@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.