From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757242AbcC2M1X (ORCPT ); Tue, 29 Mar 2016 08:27:23 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37630 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757212AbcC2M1V (ORCPT ); Tue, 29 Mar 2016 08:27:21 -0400 From: Vitaly Kuznetsov To: "K. Y. Srinivasan" Cc: gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org, devel@linuxdriverproject.org, olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com Subject: Re: [PATCH 1/6] Drivers: hv: kvp: fix IP Failover References: <1457559302-10365-1-git-send-email-kys@microsoft.com> <1457559325-10406-1-git-send-email-kys@microsoft.com> Date: Tue, 29 Mar 2016 14:27:16 +0200 In-Reply-To: <1457559325-10406-1-git-send-email-kys@microsoft.com> (K. Y. Srinivasan's message of "Wed, 9 Mar 2016 13:35:20 -0800") Message-ID: <87y491tqor.fsf@vitty.brq.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org "K. Y. Srinivasan" writes: > From: Vitaly Kuznetsov > > Hyper-V VMs can be replicated to another hosts and there is a feature to > set different IP for replicas, it is called 'Failover TCP/IP'. When > such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as soon > as we finish negotiation procedure. The problem is that it can happen (and > it actually happens) before userspace daemon connects and we reply with > HV_E_FAIL to the message. As there are no repetitions we fail to set the > requested IP. > > Solve the issue by postponing our reply to the negotiation message till > userspace daemon is connected. We can't wait too long as there is a > host-side timeout (cca. 75 seconds) and if we fail to reply in this time > frame the whole KVP service will become inactive. The solution is not > ideal - if it takes userspace daemon more than 60 seconds to connect > IP Failover will still fail but I don't see a solution with our current > separation between kernel and userspace parts. > > Other two modules (VSS and FCOPY) don't require such delay, leave them > untouched. > > Signed-off-by: Vitaly Kuznetsov > Signed-off-by: K. Y. Srinivasan An issue was found with this patch: we don't cancel kvp_host_handshake_work on module unload and if the work is still pending we end up crashing. As far as I can see this series didn't make it to char-misc tree so instead of sending a patch to this patch I'll send v2 for the original issue. Sorry for the inconvenience. > --- > drivers/hv/hv_kvp.c | 30 ++++++++++++++++++++++++++++++ > drivers/hv/hyperv_vmbus.h | 5 +++++ > 2 files changed, 35 insertions(+), 0 deletions(-) > > diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c > index 9b9b370..0d3fcd6 100644 > --- a/drivers/hv/hv_kvp.c > +++ b/drivers/hv/hv_kvp.c > @@ -78,9 +78,11 @@ static void kvp_send_key(struct work_struct *dummy); > > static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error); > static void kvp_timeout_func(struct work_struct *dummy); > +static void kvp_host_handshake_func(struct work_struct *dummy); > static void kvp_register(int); > > static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); > +static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func); > static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); > > static const char kvp_devname[] = "vmbus/hv_kvp"; > @@ -130,6 +132,11 @@ static void kvp_timeout_func(struct work_struct *dummy) > hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); > } > > +static void kvp_host_handshake_func(struct work_struct *dummy) > +{ > + hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); > +} > + > static int kvp_handle_handshake(struct hv_kvp_msg *msg) > { > switch (msg->kvp_hdr.operation) { > @@ -154,6 +161,12 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) > pr_debug("KVP: userspace daemon ver. %d registered\n", > KVP_OP_REGISTER); > kvp_register(dm_reg_value); > + > + /* > + * If we're still negotiating with the host cancel the timeout > + * work to not poll the channel twice. > + */ > + cancel_delayed_work_sync(&kvp_host_handshake_work); > hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); > > return 0; > @@ -594,7 +607,22 @@ void hv_kvp_onchannelcallback(void *context) > struct icmsg_negotiate *negop = NULL; > int util_fw_version; > int kvp_srv_version; > + static enum {NEGO_NOT_STARTED, > + NEGO_IN_PROGRESS, > + NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; > > + if (host_negotiatied == NEGO_NOT_STARTED && > + kvp_transaction.state < HVUTIL_READY) { > + /* > + * If userspace daemon is not connected and host is asking > + * us to negotiate we need to delay to not lose messages. > + * This is important for Failover IP setting. > + */ > + host_negotiatied = NEGO_IN_PROGRESS; > + schedule_delayed_work(&kvp_host_handshake_work, > + HV_UTIL_NEGO_TIMEOUT * HZ); > + return; > + } > if (kvp_transaction.state > HVUTIL_READY) > return; > > @@ -672,6 +700,8 @@ void hv_kvp_onchannelcallback(void *context) > vmbus_sendpacket(channel, recv_buffer, > recvlen, requestid, > VM_PKT_DATA_INBAND, 0); > + > + host_negotiatied = NEGO_FINISHED; > } > > } > diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h > index a64b176..28e9df9 100644 > --- a/drivers/hv/hyperv_vmbus.h > +++ b/drivers/hv/hyperv_vmbus.h > @@ -36,6 +36,11 @@ > #define HV_UTIL_TIMEOUT 30 > > /* > + * Timeout for guest-host handshake for services. > + */ > +#define HV_UTIL_NEGO_TIMEOUT 60 > + > +/* > * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent > * is set by CPUID(HVCPUID_VERSION_FEATURES). > */ -- Vitaly