From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: KY Srinivasan <kys@microsoft.com>
Cc: "devel\@linuxdriverproject.org" <devel@linuxdriverproject.org>,
Haiyang Zhang <haiyangz@microsoft.com>,
Dexuan Cui <decui@microsoft.com>,
"linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] Drivers: hv: vmbus: prevent new subchannel creation on device shutdown
Date: Mon, 13 Jul 2015 17:28:31 +0200 [thread overview]
Message-ID: <87380syt40.fsf@vitty.brq.redhat.com> (raw)
In-Reply-To: <BY2PR0301MB165497CA99CEA01E5B54FEFDA09C0@BY2PR0301MB1654.namprd03.prod.outlook.com> (KY Srinivasan's message of "Mon, 13 Jul 2015 15:10:31 +0000")
KY Srinivasan <kys@microsoft.com> writes:
>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
>> Sent: Monday, July 13, 2015 5:19 AM
>> To: devel@linuxdriverproject.org
>> Cc: KY Srinivasan; Haiyang Zhang; Dexuan Cui; linux-kernel@vger.kernel.org
>> Subject: [PATCH] Drivers: hv: vmbus: prevent new subchannel creation on
>> device shutdown
>>
>> When a new subchannel offer from host comes during device shutdown
>> (e.g.
>> when a netvsc/storvsc module is unloadedshortly after it was loaded) a
>> crash can happen as vmbus_process_offer() is not anyhow serialized with
>> vmbus_remove(). As an example we can try calling subchannel create
>> callback when the module is already unloaded.
>> The following crash was observed while keeping loading/unloading
>> hv_netvsc
>> module on 64-CPU guest:
>>
>> hv_netvsc vmbus_14: net device safe to remove
>> BUG: unable to handle kernel paging request at 000000000000a118
>> IP: [<ffffffffa01c1b21>] netvsc_sc_open+0x31/0xb0 [hv_netvsc]
>> PGD 1f3946a067 PUD 1f38a5f067 PMD 0
>> Oops: 0000 [#1] SMP
>> ...
>> Call Trace:
>> [<ffffffffa0055cd7>] vmbus_onoffer+0x477/0x530 [hv_vmbus]
>> [<ffffffff81092e1f>] ? move_linked_works+0x5f/0x80
>> [<ffffffffa0055fd3>] vmbus_onmessage+0x33/0xa0 [hv_vmbus]
>> [<ffffffffa0052f81>] vmbus_onmessage_work+0x21/0x30 [hv_vmbus]
>> [<ffffffff81095cde>] process_one_work+0x18e/0x4e0
>> ...
>>
>> The issue cannot be solved by just resetting sc_creation_callback on
>> driver removal as while we search for the parent channel with channel_lock
>> held we release it after the channel was found and it can disapper beneath
>> our feet while we're still in vmbus_process_offer();
>>
>> Introduce new sc_create_lock mutex and take it in vmbus_remove() to
>> ensure
>> no new subchannels are created after we started the removal procedure.
>> Check its state with mutex_trylock in vmbus_process_offer().
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>
> Thanks Vitaly; strangely enough, I too am looking at this exact problem. I was planning to
> address this problem a little differently: I was planning to hold the context that performed
> the initial (primary) channel open until all of the "open activity" was completed and this would include
> opening of the sub-channels for multi-channel devices.
(not sure it can actually happen with current implementation of Hyper-V
host but) in case a subchannel is rescinded and reopened again later on
such 'full open lock' won't probably help but in other cases both
approaches should give us equal results. This is not a hotpath, any
syncronization should do the job.
>
> Regards,
>
> K. Y
>
>> ---
>> drivers/hv/channel.c | 3 +++
>> drivers/hv/channel_mgmt.c | 20 ++++++++++++++++++--
>> drivers/hv/vmbus_drv.c | 6 ++++++
>> include/linux/hyperv.h | 4 ++++
>> 4 files changed, 31 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
>> index 603ce97..faa91fe 100644
>> --- a/drivers/hv/channel.c
>> +++ b/drivers/hv/channel.c
>> @@ -555,6 +555,9 @@ static int vmbus_close_internal(struct vmbus_channel
>> *channel)
>> if (channel->rescind)
>> hv_process_channel_removal(channel,
>> channel->offermsg.child_relid);
>> + else if (mutex_is_locked(&channel->sc_create_lock))
>> + mutex_unlock(&channel->sc_create_lock);
>> +
>> return ret;
>> }
>>
>> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
>> index 4506a66..96f8b96 100644
>> --- a/drivers/hv/channel_mgmt.c
>> +++ b/drivers/hv/channel_mgmt.c
>> @@ -150,6 +150,8 @@ static struct vmbus_channel *alloc_channel(void)
>> INIT_LIST_HEAD(&channel->sc_list);
>> INIT_LIST_HEAD(&channel->percpu_list);
>>
>> + mutex_init(&channel->sc_create_lock);
>> +
>> return channel;
>> }
>>
>> @@ -158,6 +160,7 @@ static struct vmbus_channel *alloc_channel(void)
>> */
>> static void free_channel(struct vmbus_channel *channel)
>> {
>> + mutex_destroy(&channel->sc_create_lock);
>> kfree(channel);
>> }
>>
>> @@ -247,8 +250,18 @@ static void vmbus_process_offer(struct
>> vmbus_channel *newchannel)
>> newchannel->offermsg.offer.if_type) &&
>> !uuid_le_cmp(channel->offermsg.offer.if_instance,
>> newchannel->offermsg.offer.if_instance)) {
>> - fnew = false;
>> - break;
>> + if (mutex_trylock(&channel->sc_create_lock)) {
>> + fnew = false;
>> + break;
>> + }
>> + /*
>> + * If we fail to acquire mutex here it means we're
>> + * closing parent channel at this moment. We can't
>> + * create new subchannel in this case.
>> + */
>> +
>> spin_unlock_irqrestore(&vmbus_connection.channel_lock,
>> + flags);
>> + goto err_free_chan;
>> }
>> }
>>
>> @@ -297,6 +310,7 @@ static void vmbus_process_offer(struct
>> vmbus_channel *newchannel)
>> if (!fnew) {
>> if (channel->sc_creation_callback != NULL)
>> channel->sc_creation_callback(newchannel);
>> + mutex_unlock(&channel->sc_create_lock);
>> return;
>> }
>>
>> @@ -340,6 +354,8 @@ err_deq_chan:
>> }
>>
>> err_free_chan:
>> + if (!fnew)
>> + mutex_unlock(&channel->sc_create_lock);
>> free_channel(newchannel);
>> }
>>
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
>> index cf20400..7ad7fcc 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -539,6 +539,12 @@ static int vmbus_remove(struct device
>> *child_device)
>> struct hv_device *dev = device_to_hv_device(child_device);
>> u32 relid = dev->channel->offermsg.child_relid;
>>
>> + /*
>> + * Device is being removed, prevent creating new subchannels.
>> Mutex
>> + * will be released in vmbus_close_internal().
>> + */
>> + mutex_lock(&dev->channel->sc_create_lock);
>> +
>> if (child_device->driver) {
>> drv = drv_to_hv_drv(child_device->driver);
>> if (drv->remove)
>> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
>> index 30d3a1f..f1af05c 100644
>> --- a/include/linux/hyperv.h
>> +++ b/include/linux/hyperv.h
>> @@ -748,6 +748,10 @@ struct vmbus_channel {
>> */
>> struct vmbus_channel *primary_channel;
>> /*
>> + * Protects sub-channel create path.
>> + */
>> + struct mutex sc_create_lock;
>> + /*
>> * Support per-channel state for use by vmbus drivers.
>> */
>> void *per_channel_state;
>> --
>> 2.4.3
--
Vitaly
next prev parent reply other threads:[~2015-07-13 15:28 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-13 12:18 [PATCH] Drivers: hv: vmbus: prevent new subchannel creation on device shutdown Vitaly Kuznetsov
2015-07-13 15:10 ` KY Srinivasan
2015-07-13 15:28 ` Vitaly Kuznetsov [this message]
2015-07-14 11:41 ` Dexuan Cui
2015-07-14 16:02 ` Vitaly Kuznetsov
2015-07-16 0:31 ` KY Srinivasan
2015-07-16 13:42 ` KY Srinivasan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87380syt40.fsf@vitty.brq.redhat.com \
--to=vkuznets@redhat.com \
--cc=decui@microsoft.com \
--cc=devel@linuxdriverproject.org \
--cc=haiyangz@microsoft.com \
--cc=kys@microsoft.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.