From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Dexuan Cui <decui@microsoft.com>
Cc: gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org,
	driverdev-devel@linuxdriverproject.org, olaf@aepfle.de,
	apw@canonical.com, jasowang@redhat.com, kys@microsoft.com,
	haiyangz@microsoft.com, rolf.neugebauer@docker.com,
	dave.scott@docker.com, ian.campbell@docker.com
Subject: Re: [PATCH v3] Drivers: hv: vmbus: fix the race when querying & updating the percpu list
Date: Tue, 31 May 2016 18:26:54 +0200
Message-ID: <87twhefb1t.fsf@vitty.brq.redhat.com>
In-Reply-To: <1463896966-4745-1-git-send-email-decui@microsoft.com> (Dexuan Cui's message of "Sat, 21 May 2016 23:02:46 -0700")

Dexuan Cui <decui@microsoft.com> writes:

> There is a rare race when we remove an entry from the global list
> hv_context.percpu_list[cpu] in hv_process_channel_removal() ->
> percpu_channel_deq() -> list_del(): at this time, if vmbus_on_event() ->
> process_chn_event() -> pcpu_relid2channel() is trying to query the list,
> we can get the kernel fault.
>
> Similarly, we also have the issue in the code path: vmbus_process_offer() ->
> percpu_channel_enq().
>
> We can resolve the issue by disabling the tasklet when updating the list.
>
> The patch also moves vmbus_release_relid() to a later place where
> the channel has been removed from the per-cpu and the global lists.
>
> Reported-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

I tested 4.7-rc1 with this patch applied and the kernel always crashes on
boot (WS2016TP5, 12 CPU SMP guest, Generation 2):

[    5.464251] hv_vmbus: Hyper-V Host Build:14300-10.0-1-0.1006; Vmbus version:4.0
[    5.471666] hv_vmbus: Unknown GUID: f8e65716-3cb3-4a06-9a60-1889c5cccab5
[    5.472143] BUG: unable to handle kernel paging request at 000000079fff5288
[    5.477107] IP: [<ffffffffa0004b91>] vmbus_onoffer+0x311/0x570 [hv_vmbus]
[    5.477107] PGD 0 
[    5.477107] Oops: 0000 [#1] SMP
[    5.477107] Modules linked in: hv_vmbus
[    5.477107] CPU: 11 PID: 189 Comm: kworker/11:1 Not tainted 4.7.0-rc1_dc1_test+ #262
[    5.477107] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[    5.477107] Workqueue: hv_vmbus_con vmbus_onmessage_work [hv_vmbus]
[    5.477107] task: ffff8801796e4480 ti: ffff8801796e8000 task.ti: ffff8801796e8000
[    5.477107] RIP: 0010:[<ffffffffa0004b91>]  [<ffffffffa0004b91>] vmbus_onoffer+0x311/0x570 [hv_vmbus]
[    5.477107] RSP: 0018:ffff8801796ebc50  EFLAGS: 00010286
[    5.477107] RAX: 00000000ffff8801 RBX: ffff880032641000 RCX: 0000000000000050
[    5.477107] RDX: 0000000000040000 RSI: 0000000000000000 RDI: ffff880032641000
[    5.477107] RBP: ffff8801796ebd10 R08: 0000000000000001 R09: 0000000000000001
[    5.477107] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000010
[    5.477107] R13: 4a063cb3f8e65716 R14: b5caccc58918609a R15: ffffffffa0008b60
[    5.477107] FS:  0000000000000000(0000) GS:ffff88017c000000(0000) knlGS:0000000000000000
[    5.477107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.477107] CR2: 000000079fff5288 CR3: 0000000032613000 CR4: 00000000001406e0
[    5.477107] Stack:
[    5.477107]  ffff880032641780 ffff88003264102c 0010010000000046 ffffffffa000646e
[    5.477107]  ffff8801796e5090 ffff8801796e4480 00000000004f827d 0000000000000001
[    5.477107]  0000000000000000 ffff8801796ebce8 ffffffff810eaebc 00000000796e5058
[    5.477107] Call Trace:
[    5.477107]  [<ffffffff810eaebc>] ? __lock_acquire+0x3dc/0x730
[    5.477107]  [<ffffffffa0005263>] vmbus_onmessage+0x33/0xa0 [hv_vmbus]
[    5.477107]  [<ffffffffa0001371>] vmbus_onmessage_work+0x21/0x30 [hv_vmbus]
[    5.653321]  [<ffffffff810abd1f>] process_one_work+0x1ff/0x6d0
[    5.653321]  [<ffffffff810abca1>] ? process_one_work+0x181/0x6d0
[    5.653321]  [<ffffffff810ac23e>] worker_thread+0x4e/0x490
[    5.653321]  [<ffffffff810ac1f0>] ? process_one_work+0x6d0/0x6d0
[    5.653321]  [<ffffffff810ac1f0>] ? process_one_work+0x6d0/0x6d0
[    5.653321]  [<ffffffff810b31b1>] kthread+0x101/0x120
[    5.653321]  [<ffffffff81739cef>] ret_from_fork+0x1f/0x40
[    5.653321]  [<ffffffff810b30b0>] ? kthread_create_on_node+0x250/0x250
[    5.653321] Code: 74 24 08 48 c7 c7 60 6c 00 a0 e8 0a 9e 1b e1 b8 10 00 00 00 66 89 44 24 16 44 89 e6 48 89 df e8 f6 f9 ff ff 41 8b 87 f4 02 00 00 <48> 8b 14 c5 80 12 03 a0 f0 ff 42 10 48 8b 42 08 a8 02 75 f8 0f 
[    5.653321] RIP  [<ffffffffa0004b91>] vmbus_onoffer+0x311/0x570 [hv_vmbus]
[    5.653321]  RSP <ffff8801796ebc50>
[    5.653321] CR2: 000000079fff5288
[    5.653321] ---[ end trace 62df6070997f1f10 ]---
[    5.653321] Kernel panic - not syncing: Fatal exception
[    5.653321] Kernel Offset: disabled
[    5.653321] ---[ end Kernel panic - not syncing: Fatal exception
[    5.653480] ------------[ cut here ]------------

I can investigate it tomorrow if this doesn't reproduce for you.
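
A quick sanity check on the oops, offered as a sketch rather than a
diagnosis: by my reading, the bytes at the faulting instruction in the
Code: line (<48> 8b 14 c5 80 12 03 a0) decode to a load from rax*8 plus a
sign-extended 32-bit displacement. Plugging in RAX from the register dump
reproduces CR2. The function names below are made up for illustration;
only the constants come from the log.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of "mov rdx, [rax*8 + disp32]".
 * The 32-bit displacement is sign-extended to 64 bits and the sum
 * wraps modulo 2^64, as x86-64 address arithmetic does.
 */
static uint64_t fault_address(uint64_t rax, uint32_t disp32)
{
	uint64_t disp = (uint64_t)(int64_t)(int32_t)disp32;

	return rax * 8 + disp;
}

/* Check the computed address against the values printed in the oops. */
static void verify_oops_values(void)
{
	const uint64_t rax = 0x00000000ffff8801ULL;  /* RAX in the dump */
	const uint32_t disp32 = 0xa0031280u;         /* bytes 80 12 03 a0 */
	const uint64_t cr2 = 0x000000079fff5288ULL;  /* CR2 in the dump */

	assert(fault_address(rax, disp32) == cr2);
}
```

If that decoding is right, RAX was used as an array index here, and
0xffff8801 looks more like the upper half of a kernel pointer than a valid
index. That last part is a guess and still needs to be confirmed against
the source.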

<skip>

-- 
  Vitaly
