Message-ID: <55B841FF.2000102@oracle.com>
Date: Tue, 28 Jul 2015 23:01:19 -0400
From: Boris Ostrovsky
To: Andrew Cooper, Andy Lutomirski
CC: "security@kernel.org", Peter Zijlstra, X86 ML,
 "linux-kernel@vger.kernel.org", Steven Rostedt, xen-devel,
 Borislav Petkov, Jan Beulich, Sasha Levin
Subject: Re: [Xen-devel] [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option
References: <55B64FEA.70204@oracle.com> <55B659EC.5030009@oracle.com>
 <55B75993.90909@citrix.com> <55B7AE39.7000101@citrix.com>
 <55B7B791.2050208@oracle.com> <55B822B8.3090608@citrix.com>
In-Reply-To: <55B822B8.3090608@citrix.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/28/2015 08:47 PM, Andrew Cooper wrote:
> On 29/07/2015 01:21, Andy Lutomirski wrote:
>> On Tue, Jul 28, 2015 at 10:10 AM, Boris Ostrovsky wrote:
>>> On 07/28/2015 01:07 PM, Andy Lutomirski wrote:
>>>> On Tue, Jul 28, 2015 at 9:30 AM, Andrew Cooper wrote:
>>>>> I suspect that the set_ldt(NULL, 0) call hasn't reached Xen before
>>>>> xen_free_ldt() is attempting to nab back the pages which Xen still has
>>>>> mapped as an LDT.
>>>>>
>>>> I just instrumented it with yet more LSL instructions.  I'm pretty
>>>> sure that set_ldt really is clearing at least LDT entry zero.
>>>> Nonetheless the free_ldt call still oopses.
>>>>
>>> Yes, I added some instrumentation to the hypervisor and we definitely set
>>> LDT to NULL before failing.
>>>
>>> -boris
>> Looking at map_ldt_shadow_page: what keeps shadow_ldt_mapcnt from
>> getting incremented once on each CPU at the same time if both CPUs
>> fault in the same shadow LDT page at the same time?
> Nothing, but that is fine.  If a page is in use in two vcpus LDTs, it is
> expected to have a type refcount of 2.
>
>> Similarly, what
>> keeps both CPUs from calling get_page_type at the same time and
>> therefore losing track of the page type reference count?
> a cmpxchg() loop in the depths of __get_page_type().
>
>> I don't see why vmalloc or vm_unmap_aliases would have anything to do
>> with this, though.

So just for kicks I made lazy_max_pages() return 0 to free vmaps
immediately and the problem went away.

I also saw this warning, BTW:

[ 178.686542] ------------[ cut here ]------------
[ 178.686554] WARNING: CPU: 0 PID: 16440 at ./arch/x86/include/asm/mmu_context.h:96 load_mm_ldt+0x70/0x76()
[ 178.686558] DEBUG_LOCKS_WARN_ON(!irqs_disabled())
[ 178.686561] Modules linked in:
[ 178.686566] CPU: 0 PID: 16440 Comm: kworker/u2:1 Not tainted 4.1.0-32b #80
[ 178.686570] 00000000 00000000 ea4e3df8 c1670e71 00000000 ea4e3e28 c106ac1e c1814e43
[ 178.686577] ea4e3e54 00004038 c181bc2c 00000060 c166fd3b c166fd3b e6705dc0 00000000
[ 178.686583] ea665000 ea4e3e40 c106ad03 00000009 ea4e3e38 c1814e43 ea4e3e54 ea4e3e5c
[ 178.686589] Call Trace:
[ 178.686594] [] dump_stack+0x41/0x52
[ 178.686598] [] warn_slowpath_common+0x8e/0xd0
[ 178.686602] [] ? load_mm_ldt+0x70/0x76
[ 178.686609] [] ? load_mm_ldt+0x70/0x76
[ 178.686612] [] warn_slowpath_fmt+0x33/0x40
[ 178.686615] [] load_mm_ldt+0x70/0x76
[ 178.686619] [] flush_old_exec+0x6f9/0x750
[ 178.686626] [] load_elf_binary+0x2b4/0x1040
[ 178.686630] [] ? page_address+0x15/0xf0
[ 178.686633] [] ? kunmap+0x1f/0x70
[ 178.686636] [] search_binary_handler+0x89/0x1c0
[ 178.686639] [] do_execveat_common+0x4c0/0x620
[ 178.686653] [] ? kmemdup+0x33/0x50
[ 178.686659] [] ? __call_rcu.constprop.66+0xbb/0x220
[ 178.686673] [] do_execve+0x24/0x30
[ 178.686679] [] ____call_usermodehelper+0xde/0x120
[ 178.686684] [] ret_from_kernel_thread+0x21/0x30
[ 178.686696] [] ? __request_module+0x240/0x240
[ 178.686701] ---[ end trace 8b3f5341f50e6c88 ]---

-boris
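
For reference, the kind of cmpxchg() retry loop Andrew refers to can be
sketched in portable C11.  This is only an illustration of why two CPUs
taking a reference at the same time do not lose a count.  It is not Xen's
__get_page_type(); the names demo_get_ref, demo_put_ref and DEMO_COUNT_MASK,
and the bit layout, are made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the low 16 bits of the word hold the refcount. */
#define DEMO_COUNT_MASK 0x0000ffffu

/* Take one reference.  Returns false if the count is already saturated. */
bool demo_get_ref(_Atomic uint32_t *type_info)
{
    uint32_t old = atomic_load(type_info);
    uint32_t next;

    do {
        if ((old & DEMO_COUNT_MASK) == DEMO_COUNT_MASK)
            return false;
        next = old + 1;
        /*
         * If another CPU changed *type_info in the meantime, the exchange
         * fails, 'old' is refreshed with the current value, and we retry.
         */
    } while (!atomic_compare_exchange_weak(type_info, &old, next));

    return true;
}

/* Drop one reference.  The caller must hold a reference. */
void demo_put_ref(_Atomic uint32_t *type_info)
{
    uint32_t old = atomic_load(type_info);
    uint32_t next;

    do {
        assert((old & DEMO_COUNT_MASK) != 0);
        next = old - 1;
    } while (!atomic_compare_exchange_weak(type_info, &old, next));
}
```

With a loop of this shape, two vcpus faulting in the same page each end up
adding exactly one to the count, which matches the "type refcount of 2"
described above.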