From: Daniel De Graaf <dgdegra@tycho.nsa.gov>
To: Roderick Colenbrander <thunderbird2k@gmail.com>
Cc: Konrad Rzeszutek Wilk <konrad@darnok.org>, xen-devel@lists.xensource.com
Subject: Re: Race condition in gntdev driver
Date: Thu, 26 Jan 2012 09:27:34 -0500 [thread overview]
Message-ID: <4F2162D6.40701@tycho.nsa.gov> (raw)
In-Reply-To: <CAEc3jaAJP7g6hA8_QC5vLq7vd0tKKwWR9ZR3ckkHf-izgjFceQ@mail.gmail.com>
On 01/25/2012 08:49 PM, Roderick Colenbrander wrote:
> Hi Daniel and Konrad,
>
> In an email thread at the end of last year ('Questions about GPLPV
> stability tests'), I mentioned a certain bug in the gntdev driver. I
> think I found the bug and would like your opinion. Since Daniel did a
> lot of work on gntdev recently, I think he is most familiar with the
> code.
>
> The systems on which I see this bug run Xen 4.1.2 and Linux 2.6.32.48
> (pvops), but the same bug should be in Linux 3.x since the code is
> similar.
>
> Let me summarize the problem. At some point in time a DomU is
> shutdown. Apparently shutdown takes too long according to Xend,
> causing it to SIGKILL qemu-dm:
> [2012-01-16 14:54:34 3419] INFO (XendDomainInfo:2078) Domain has
> shutdown: name=win7-1 id=7581 reason=poweroff.
> [2012-01-16 14:54:34 3419] DEBUG (XendDomainInfo:3071)
> XendDomainInfo.destroy: domid=7581
> [2012-01-16 14:54:36 3419] DEBUG (XendDomainInfo:2401) Destroying device model
> [2012-01-16 14:54:46 3419] WARNING (image:649) DeviceModel 31742 took
> more than 10s to terminate: sending SIGKILL
>
> This triggers the following Oops:
> [533804.785589] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000008
> [533804.793913] IP: [<ffffffff814bdba9>] _spin_lock+0x5/0x15
> [533804.799556] PGD 0
> [533804.801896] Oops: 0002 [#1] SMP
> [533804.805448] last sysfs file:
> /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/class
> [533804.813736] CPU 0
> [533804.816066] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf
> [533804.822914] Pid: 21632, comm: qemu-dm Not tainted 2.6.32.48 #1 X8STi
> [533804.829595] RIP: e030:[<ffffffff814bdba9>] [<ffffffff814bdba9>]
> _spin_lock+0x5/0x15
> [533804.837873] RSP: e02b:ffff88005f187c50 EFLAGS: 00010202
> [533804.843513] RAX: 0000000000000100 RBX: ffff88007d6923c0 RCX:
> 0000000000000003
> [533804.851192] RDX: ffff88007deb3180 RSI: ffff88007d6923c0 RDI:
> 0000000000000008
> [533804.858871] RBP: ffff88007e260128 R08: 0000000000000000 R09:
> 0000000000000000
> [533804.866552] R10: ffff88007121eb40 R11: ffffffff811b862b R12:
> ffff88007fac1e40
> [533804.874541] R13: ffff88007e3c3340 R14: ffff88007e3c3340 R15:
> ffff88007fdfbc00
> [533804.882243] FS: 00007f5cdcefe6f0(0000) GS:ffff880028038000(0000)
> knlGS:0000000000000000
> [533804.890874] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [533804.896948] CR2: 0000000000000008 CR3: 0000000001001000 CR4:
> 0000000000002660
> [533804.904627] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [533804.912306] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [533804.919985] Process qemu-dm (pid: 21632, threadinfo
> ffff88005f186000, task ffff880074f4e270)
> [533804.928971] Stack:
> [533804.931312] ffffffff810a9ad0 ffff88007deb3180 ffff88007e260100
> ffff88007e260100
> [533804.938762] <0> ffffffff8124222d 0000000000000008 0000000000000008
> ffff88007deb3180
> [533804.946900] <0> ffffffff810b245e ffff88007fac1e40 ffff88007deb3180
> 0000000000000000
> [533804.955465] Call Trace:
> [533804.958244] [<ffffffff810a9ad0>] ? mmu_notifier_unregister+0x27/0xa5
> [533804.965019] [<ffffffff8124222d>] ? gntdev_release+0xc3/0xd1
> [533804.971007] [<ffffffff810b245e>] ? __fput+0x100/0x1af
> [533804.976477] [<ffffffff810af998>] ? filp_close+0x5b/0x62
> [533804.982119] [<ffffffff81042989>] ? put_files_struct+0x64/0xc1
> [533804.988280] [<ffffffff810441f0>] ? do_exit+0x209/0x68d
> [533804.993836] [<ffffffff810441f8>] ? do_exit+0x211/0x68d
> [533804.999390] [<ffffffff810446e9>] ? do_group_exit+0x75/0x9c
> [533805.005294] [<ffffffff8104cf1d>] ? get_signal_to_deliver+0x30d/0x338
> [533805.012063] [<ffffffff81010f7a>] ? do_notify_resume+0x8a/0x6b1
> [533805.018310] [<ffffffff810bdb3a>] ? poll_select_copy_remaining+0xd0/0xf3
> [533805.025339] [<ffffffff81011c8e>] ? int_signal+0x12/0x17
> [533805.030980] Code: 00 00 00 01 74 05 e8 67 1c d2 ff 48 89 d0 5e c3
> ff 14 25 20 2d 6a 81 f0 81 2f 00 00 00 01 74 05 e8 4d 1c d2 ff c3 b8
> 00 01 00 00 <f0> 66 0f c1 07 38 e0 74 06 f3 90 8a 07 eb f6 c3 f0 81 2f
> 00 00
> [533805.050454] RIP [<ffffffff814bdba9>] _spin_lock+0x5/0x15
> [533805.056182] RSP <ffff88005f187c50>
> [533805.059997] CR2: 0000000000000008
> [533805.063638] ---[ end trace 85ee7cbec9ce72ac ]---
> [533805.068584] Fixing recursive fault but reboot is needed!
>
>
> Below I attached the in my opinion relevant part of the gntdev driver:
> static int gntdev_open(struct inode *inode, struct file *flip)
> {
> ...
> ...
> priv->mm = get_task_mm(current);
> if (!priv->mm) {
> kfree(priv);
> return -ENOMEM;
> }
> priv->mn.ops = &gntdev_mmu_ops;
> mmu_notifier_register(&priv->mn, priv->mm);
> mmput(priv->mm);
>
> flip->private_data = priv;
> ...
> }
>
> static int gntdev_release(struct inode *inode, struct file *flip)
> {
> struct gntdev_priv *priv = flip->private_data;
> ...
> mmu_notifier_unregister(&priv->mn, priv->mm);
> kfree(priv);
> return 0;
> }
>
> I think the bug is due to gntdev not handling reference counting
> (mm->mm_users) in combination with a race condition. The function
> gntdev_open calls get_task_mm (which increases mm_users) but straight
> after storing 'mm' in 'priv' it performs a mmput which again decreases
> the reference count. Since mmu_notifier_register also increases the
> reference count, the count is definitely positive after gntdev_open. I
> suspect that due to a race condition (after 10 seconds, xend performed
> a SIGKILL) the mm structure already got freed somehow. This causes a
> bug in mmu_notifier_unregister.
>
> If this is correct then the the 'mmput' line should be moved to
> gntdev_release and placed after mmu_notifier_unregister. What do you
> think? If this is correct, I will send a patch for this.
>
> Regards,
> Roderick Colenbrander
If this is correct, I think it's a bug in mmu_notifier_register - it
already increments mm->mm_count and there is a note in the function
description suggesting that this suffices to run mmu_notifier_unregister
later.
I'm not very familiar with the MMU notifier (my changes were made for the
case where it was not used) so it might be assuming the extra reference
you are proposing - especially if that change fixes the bug.
--
Daniel De Graaf
National Security Agency
prev parent reply other threads:[~2012-01-26 14:27 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-26 1:49 Race condition in gntdev driver Roderick Colenbrander
2012-01-26 14:27 ` Daniel De Graaf [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F2162D6.40701@tycho.nsa.gov \
--to=dgdegra@tycho.nsa.gov \
--cc=konrad@darnok.org \
--cc=thunderbird2k@gmail.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).