* [Qemu-devel] javac crash in user-mode emulation: races on page_unprotect()
@ 2017-11-24 17:18 Peter Maydell
From: Peter Maydell @ 2017-11-24 17:18 UTC (permalink / raw)
To: QEMU Developers; +Cc: Alex Bennée, Richard Henderson
I managed to track down the crash running javac in user-mode
emulation. The problem is that we can have several threads which
race in page_unprotect():
* threads A & B both try, at the same time, to write to a page with
code in it (i.e. a page which we've made non-writeable, so the write SEGVs)
* they race into the signal handler with this faulting address
* thread A happens to get to page_unprotect() first and takes the
mmap lock, so thread B sits waiting for it to be done
* A then finds the page, marks it PAGE_WRITE and mprotect()s it writable
* A can then continue OK (returns from signal handler to retry the
memory access)
* ...but when B gets the mmap lock it finds that the page is already
PAGE_WRITE, and so it exits page_unprotect() via the "not due to
protected translation" code path, and wrongly delivers the signal
to the guest rather than just retrying the access
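The sequence above can be sketched with a small single-threaded model. This is only an illustration of the ordering problem, not the QEMU code: `struct model_page`, its two fields, and `model_page_unprotect()` are hypothetical stand-ins, and the two calls are made back to back to mimic A and B taking the mmap lock one after the other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page state; not QEMU's PageDesc. */
struct model_page {
    bool guest_writable;  /* the guest mapped this page writable */
    bool page_write;      /* models PAGE_WRITE being currently set */
};

/*
 * Models the check described above, assumed to run with the mmap lock held.
 * Returns true if the fault was caused by our own write protection (the
 * caller retries the access), false if it is to be delivered to the guest.
 */
static bool model_page_unprotect(struct model_page *p)
{
    assert(p != NULL);
    if (p->guest_writable && !p->page_write) {
        p->page_write = true;  /* stands in for the mprotect() call */
        return true;
    }
    return false;              /* "not due to protected translation" */
}
```

Calling `model_page_unprotect()` twice on the same page gives `true` for the first caller (thread A) and `false` for the second (thread B), even though both faults had the same cause.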
I'm not sure how best to fix this. We could make page_unprotect()
say "if PAGE_WRITE is set, assume this call raced with another one
and say 'this was caused by protected translation' without doing
anything". But I have a feeling that will mean we could end up looping
endlessly if we get a SEGV for a write to a writeable page (not
sure when this could happen, but maybe alignment issues?).
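A minimal sketch of that "assume it raced" variant, under the same caveats: `struct race_page` and `model_page_unprotect_assume_race()` are hypothetical names for illustration, and the real function does more than flip a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page state; not QEMU's PageDesc. */
struct race_page {
    bool guest_writable;  /* the guest mapped this page writable */
    bool page_write;      /* models PAGE_WRITE being currently set */
};

/*
 * Variant that treats "PAGE_WRITE already set" as a lost race.
 * Returns true to retry the access, false to deliver the fault to the guest.
 */
static bool model_page_unprotect_assume_race(struct race_page *p)
{
    assert(p != NULL);
    if (!p->guest_writable) {
        return false;          /* guest never had write access: real fault */
    }
    if (p->page_write) {
        return true;           /* assume another thread just unprotected it */
    }
    p->page_write = true;      /* stands in for the mprotect() call */
    return true;
}
```

The model also shows the worry: once the page is writable, every later call returns `true`, so a fault that is not caused by our protection would be retried on each delivery and never reach the guest.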
Anybody got a better idea?
thanks
-- PMM
* Re: [Qemu-devel] javac crash in user-mode emulation: races on page_unprotect()
From: Paolo Bonzini @ 2017-11-27 14:38 UTC (permalink / raw)
To: Peter Maydell, QEMU Developers; +Cc: Alex Bennée, Richard Henderson
On 24/11/2017 18:18, Peter Maydell wrote:
> * threads A & B both try to do a write to a page with code in it at
> the same time (ie which we've made non-writeable, so SEGV)
> * they race into the signal handler with this faulting address
> * thread A happens to get to page_unprotect() first and takes the
> mmap lock, so thread B sits waiting for it to be done
> * A then finds the page, marks it PAGE_WRITE and mprotect()s it writable
> * A can then continue OK (returns from signal handler to retry the
> memory access)
> * ...but when B gets the mmap lock it finds that the page is already
> PAGE_WRITE, and so it exits page_unprotect() via the "not due to
> protected translation" code path, and wrongly delivers the signal
> to the guest rather than just retrying the access
>
> I'm not sure how best to fix this. We could make page_unprotect()
> say "if PAGE_WRITE is set, assume this call raced with another one
> and say 'this was caused by protected translation' without doing
> anything".
Yes, I think this is the only solution since SIGSEGV is raised
asynchronously. Even using a trylock would only narrow the race window
but not fix it.
> But I have a feeling that will mean we could end up looping
> endlessly if we get a SEGV for a write to a writeable page (not
> sure when this could happen, but maybe alignment issues?).
Those would have to be detected via si_code (for the specific case of
invalid address alignment, that would be a SIGBUS with
si_code==BUS_ADRALN, not a SIGSEGV).
In general, I think that only SIGSEGV/SEGV_ACCERR needs to go down the
page_unprotect path.
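As a rough sketch of that filter, assuming the signal number and `si_code` are taken from the `siginfo_t` passed to the host signal handler (the helper name `fault_may_be_write_protect()` is made up for illustration):

```c
/* Request the POSIX siginfo constants when compiling in strict ISO C mode. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stdbool.h>

/*
 * Hypothetical helper: only a permission fault (SIGSEGV with SEGV_ACCERR)
 * can be the result of a page we write-protected ourselves. Anything else,
 * such as SIGSEGV/SEGV_MAPERR or SIGBUS/BUS_ADRALN, goes to the guest.
 */
static bool fault_may_be_write_protect(int signo, int si_code)
{
    return signo == SIGSEGV && si_code == SEGV_ACCERR;
}
```

A handler would call this with `info->si_signo` and `info->si_code` before deciding whether to try page_unprotect() at all.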
Paolo
* Re: [Qemu-devel] javac crash in user-mode emulation: races on page_unprotect()
From: Peter Maydell @ 2017-11-27 14:47 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: QEMU Developers, Alex Bennée, Richard Henderson
On 27 November 2017 at 15:38, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 24/11/2017 18:18, Peter Maydell wrote:
>> * threads A & B both try to do a write to a page with code in it at
>> the same time (ie which we've made non-writeable, so SEGV)
>> * they race into the signal handler with this faulting address
>> * thread A happens to get to page_unprotect() first and takes the
>> mmap lock, so thread B sits waiting for it to be done
>> * A then finds the page, marks it PAGE_WRITE and mprotect()s it writable
>> * A can then continue OK (returns from signal handler to retry the
>> memory access)
>> * ...but when B gets the mmap lock it finds that the page is already
>> PAGE_WRITE, and so it exits page_unprotect() via the "not due to
>> protected translation" code path, and wrongly delivers the signal
>> to the guest rather than just retrying the access
>>
>> I'm not sure how best to fix this. We could make page_unprotect()
>> say "if PAGE_WRITE is set, assume this call raced with another one
>> and say 'this was caused by protected translation' without doing
>> anything".
>
> Yes, I think this is the only solution since SIGSEGV is raised
> asynchronously. Even using a trylock would only narrow the race window
> but not fix it.
I have a patch from rth based on an idea he and I came up with:
we add a field to the PageDesc struct to store the thread id of
the thread that last touched the flags. If you come into the
segv handler and the page flags/last-modified-by field say "should be
writeable and somebody else updated it" then you mark the page as
"last modified by this thread" and retry the access. If the
flags say "should be writeable, last modified by this thread"
then you know the page state hasn't changed since this thread
last saw it as "definitely not causing segvs because of cached TBs",
and so that should be passed on as a guest SEGV.
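A small model of the scheme as described in this mail, not of the actual patch: `struct owner_page`, its `last_tid` field, and `model_page_unprotect_tid()` are hypothetical names, thread ids are plain integers, and the mmap lock is assumed to be held by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page state; not QEMU's PageDesc. */
struct owner_page {
    bool guest_writable;  /* the guest mapped this page writable */
    bool page_write;      /* models PAGE_WRITE being currently set */
    int last_tid;         /* thread that last touched the flags, 0 = none */
};

/* Returns true to retry the access, false to deliver the fault to the guest. */
static bool model_page_unprotect_tid(struct owner_page *p, int tid)
{
    assert(p != NULL);
    if (!p->guest_writable) {
        return false;          /* guest never had write access: real fault */
    }
    if (!p->page_write) {
        p->page_write = true;  /* stands in for the mprotect() call */
        p->last_tid = tid;
        return true;
    }
    if (p->last_tid != tid) {
        p->last_tid = tid;     /* somebody else changed it: take note, retry */
        return true;
    }
    return false;              /* unchanged since we last looked: guest SEGV */
}
```

With thread ids 1 and 2, the calls `(1)`, `(2)`, `(2)` return `true`, `true`, `false`: the racing thread gets one retry, and a fault that persists after that retry is passed to the guest instead of looping.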
thanks
-- PMM
* Re: [Qemu-devel] javac crash in user-mode emulation: races on page_unprotect()
From: Paolo Bonzini @ 2017-11-27 14:53 UTC (permalink / raw)
To: Peter Maydell; +Cc: QEMU Developers, Alex Bennée, Richard Henderson
On 27/11/2017 15:47, Peter Maydell wrote:
> On 27 November 2017 at 15:38, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 24/11/2017 18:18, Peter Maydell wrote:
>>> * threads A & B both try to do a write to a page with code in it at
>>> the same time (ie which we've made non-writeable, so SEGV)
>>> * they race into the signal handler with this faulting address
>>> * thread A happens to get to page_unprotect() first and takes the
>>> mmap lock, so thread B sits waiting for it to be done
>>> * A then finds the page, marks it PAGE_WRITE and mprotect()s it writable
>>> * A can then continue OK (returns from signal handler to retry the
>>> memory access)
>>> * ...but when B gets the mmap lock it finds that the page is already
>>> PAGE_WRITE, and so it exits page_unprotect() via the "not due to
>>> protected translation" code path, and wrongly delivers the signal
>>> to the guest rather than just retrying the access
>>>
>>> I'm not sure how best to fix this. We could make page_unprotect()
>>> say "if PAGE_WRITE is set, assume this call raced with another one
>>> and say 'this was caused by protected translation' without doing
>>> anything".
>>
>> Yes, I think this is the only solution since SIGSEGV is raised
>> asynchronously. Even using a trylock would only narrow the race window
>> but not fix it.
>
> I have a patch from rth based on an idea he and I came up with:
> we add a field to the PageDesc struct to store the thread id of
> the thread that last touches the flags. If you come into the
> segv handler and the page flags/last-modified-by field say "should be
> writeable and somebody else updated it" then you mark the page as
> "last modified by this thread" and retry the access. If the
> flags say "should be writeable, last modified by this thread"
> then you know the page state hasn't changed since this thread
> last saw it as "definitely not causing segvs because of cached TBs",
> and so that should be passed on as a guest SEGV.
Clever, but why would si_code not work?...
Paolo
* Re: [Qemu-devel] javac crash in user-mode emulation: races on page_unprotect()
From: Peter Maydell @ 2017-11-27 15:06 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: QEMU Developers, Alex Bennée, Richard Henderson
On 27 November 2017 at 15:53, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 27/11/2017 15:47, Peter Maydell wrote:
>> I have a patch from rth based on an idea he and I came up with:
>> we add a field to the PageDesc struct to store the thread id of
>> the thread that last touches the flags. If you come into the
>> segv handler and the page flags/last-modified-by field say "should be
>> writeable and somebody else updated it" then you mark the page as
>> "last modified by this thread" and retry the access. If the
>> flags say "should be writeable, last modified by this thread"
>> then you know the page state hasn't changed since this thread
>> last saw it as "definitely not causing segvs because of cached TBs",
>> and so that should be passed on as a guest SEGV.
>
> Clever, but why would si_code not work?...
Do we have a guarantee that it's absolutely never the case that
you can get a SEGV with si_code SEGV_ACCERR for an access to memory
that's mapped writeable (and conversely that we'll always get
SEGV_ACCERR for the "mapped nonwriteable" case)? If that is ever possible
then the guest will go into an infinite loop, taking segfaults that
should be delivered to the guest but for which we instead just retry
the failing access forever...
thanks
-- PMM
* Re: [Qemu-devel] javac crash in user-mode emulation: races on page_unprotect()
From: Paolo Bonzini @ 2017-11-27 15:14 UTC (permalink / raw)
To: Peter Maydell; +Cc: QEMU Developers, Alex Bennée, Richard Henderson
On 27/11/2017 16:06, Peter Maydell wrote:
> On 27 November 2017 at 15:53, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 27/11/2017 15:47, Peter Maydell wrote:
>>> I have a patch from rth based on an idea he and I came up with:
>>> we add a field to the PageDesc struct to store the thread id of
>>> the thread that last touches the flags. If you come into the
>>> segv handler and the page flags/last-modified-by field say "should be
>>> writeable and somebody else updated it" then you mark the page as
>>> "last modified by this thread" and retry the access. If the
>>> flags say "should be writeable, last modified by this thread"
>>> then you know the page state hasn't changed since this thread
>>> last saw it as "definitely not causing segvs because of cached TBs",
>>> and so that should be passed on as a guest SEGV.
>> Clever, but why would si_code not work?...
> Do we have a guarantee that it's absolutely never the case that
> you can get a SEGV with si_code SEGV_ACCERR for an access to memory
> that's mapped writeable (and conversely that we'll always get
> SEGV_ACCERR for the "mapped nonwriteable" case)? If it's ever possible
> then the guest will go into an infinite loop of taking segfaults that
> should be delivered to the guest but which we just retry the failing
> access for forever...
At least for x86, yes.
For ARM, the syndrome values that trigger SEGV_ACCERR are 9-11 and
13-15, which seems sane to me but I know much less about ARM than x86.
So I would say "yes except for kernel bugs".
Paolo