All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: M A Young <m.a.young@durham.ac.uk>
Cc: xen-devel@lists.xen.org, Jan Beulich <JBeulich@suse.com>,
	Jason Fritcher <jkf@wolfnet.org>
Subject: Re: Xen BUG at page_alloc.c:1738 (Xen 4.5)
Date: Sat, 30 May 2015 23:43:12 +0100	[thread overview]
Message-ID: <556A3D00.1030701@citrix.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1505302303570.23215@procyon.dur.ac.uk>

On 30/05/2015 23:07, M A Young wrote:
> On Fri, 29 May 2015, Andrew Cooper wrote:
>
>> On 29/05/15 12:17, M A Young wrote:
>>>>> I did a bit of testing - xen-4.5.1-rc1 built on Fedora 22 (gcc5) doesn't 
>>>>> boot for me, but if I replace xen.gz with one from the same code built on 
>>>>> Fedora 21 (gcc4) then it does boot. There are rpms and build logs 
>>>>> available via 
>>>>> http://copr.fedoraproject.org/coprs/myoung/xentest/build/93366/
>>>>> if anyone else wants to do some testing.
>>>>>
>>>>> 	Michael Young
>>>> Do you have easy access to xen-syms from each build?
>>> Yes.
>>>
>> Thankyou very much.
>>
>> GCC 5 is indeed miscompiling the code. Comparing the fc21 vs fc22 builds:
>>
>> The C snippet from mmio_ro_do_page_fault():
>>
>> struct page_info *page = mfn_to_page(mfn);
>> struct domain *owner = page_get_owner_and_reference(page);
>> if ( owner )
>>     put_page(page);
>>
>> In fc21 is:
>>
>> movabs $0xffff82e000000000,%rbp
>> shr    %cl,%rax
>> or     %rdx,%rax
>> shl    $0x5,%rax
>> add    %rax,%rbp
>> mov    %rbp,%rdi
>> callq  ffff82d080186900 <page_get_owner_and_reference>
>> test   %rax,%rax
>> mov    %rax,%r12
>> je     ffff82d080189c4e <mmio_ro_do_page_fault+0x11e>
>> mov    %rbp,%rdi
>> callq  ffff82d080188ec0 <put_page>
>>
>> and in fc22 is:
>>
>> movabs $0xffff82e000000000,%r8
>> shr    %cl,%rax
>> or     %rdx,%rax
>> shl    $0x5,%rax
>> lea    (%r8,%rax,1),%rdi
>> callq  ffff82d0801874f0 <page_get_owner_and_reference>
>> test   %rax,%rax
>> mov    %rax,%rbp
>> je     ffff82d08018ca14 <mmio_ro_do_page_fault+0x114>
>> mov    %r8,%rdi
>> callq  ffff82d080189a90 <put_page>
>>
>> "lea (%r8,%rax,1),%rdi" in FC22 is slightly shorter than "add %rax,%rbp;
>> mov %rbp,%rdi" in FC21.  In both cases %rdi is now 'page' from the C
>> snippet.
>>
>> In FC21, the result is stored in %rbp, then reloaded from %rbp into %rdi
>> for call to put_page().
>>
>> However, in FC22, the result of the calculation is only held in %rdi,
>> and clobbered by the call to page_get_owner_and_reference().  When it
>> comes to call put_page(), %r8 is reloaded, which is still a pointer to
>> the base of the frametable, not the page we actually took a reference on.
>>
>> FC22 is miscompiling the C to:
>>
>> struct page_info *page = mfn_to_page(mfn);
>> struct domain *owner = page_get_owner_and_reference(page);
>> if ( owner )
>>     put_page(mfn_to_page(0));
>>
>> which is wrong, and why free_domheap_pages() does legitimately complain
>> about the wonky refcount.
> With a bit of experimentation I have found that compiling with the 
> -fno-caller-saves flag gets this code segment back to the Fedora 21 
> version, thus avoiding the bug.

After sending this email, I wondered whether the optimiser as assuming
that %rdi was preserved.  Indeed, it turns out that the generated code
for page_get_owner_and_reference leaves %rdi unmodified, and safe for
reuse after return.

If the 'mov %r8,%rdi' were simply omitted, the code would work, as %rdi
still contains the correct result of the original calculation.

Therefore, I suspect that the bug is in the -fcaller-saves optimisation
code.

~Andrew

  reply	other threads:[~2015-05-30 22:43 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-29  6:24 Xen BUG at page_alloc.c:1738 (Xen 4.5) Jason Fritcher
2015-05-29 10:09 ` Andrew Cooper
2015-05-29 10:50   ` M A Young
2015-05-29 10:57     ` Andrew Cooper
2015-05-29 11:17       ` M A Young
2015-05-29 18:12         ` Andrew Cooper
2015-05-30 22:07           ` M A Young
2015-05-30 22:43             ` Andrew Cooper [this message]
2015-06-01  7:40               ` Jan Beulich
2015-06-01  7:47                 ` M A Young
2015-06-06 21:00                   ` M A Young
  -- strict thread matches above, loose matches on Subject: below --
2015-05-19 18:06 Major Hayden
2015-05-19 18:16 ` Andrew Cooper
2015-05-20  2:11   ` Major Hayden
2015-05-20 10:41 ` Jan Beulich
2015-05-20 16:52   ` Major Hayden
2015-05-20 19:51     ` M A Young

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=556A3D00.1030701@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=jkf@wolfnet.org \
    --cc=m.a.young@durham.ac.uk \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.