* [uml-devel] a question about sigsetjmp() in copy_from/to_user()
@ 2005-09-27 14:06 Young Koh
2005-09-27 17:28 ` Jeff Dike
2005-09-28 8:41 ` Blaisorblade
0 siblings, 2 replies; 20+ messages in thread
From: Young Koh @ 2005-09-27 14:06 UTC (permalink / raw)
To: user-mode-linux-devel
Hi,
I have a question about the copy_from/to_user() implementation in skas mode.
As I understand it,
when copy_from/to_user() is invoked, before the address translation
happens, the UML kernel calls sigsetjmp() so it can come back if there is a
segmentation fault. If there is one, it seems that the system call the
application triggered eventually returns EFAULT. So it seems to me
that sigsetjmp() is there to catch the error when the application gives an
invalid user space address.
my question is, if so, shouldn't the error be caught when UML kernel
translates the user space address to the kernel space address? i mean,
UML kernel must know the valid memory regions and if the address is
out of the valid regions, then it knows the address is invalid before
UML tries to access the address. why should it use sigsetjmp() and let
a segfault occur?
Thank you in advance,
-Young
-------------------------------------------------------
This SF.Net email is sponsored by:
Power Architecture Resource Center: Free content, downloads, discussions,
and more. http://solutions.newsforge.com/ibmarch.tmpl
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-27 14:06 [uml-devel] a question about sigsetjmp() in copy_from/to_user() Young Koh
@ 2005-09-27 17:28 ` Jeff Dike
2005-09-28 11:59 ` Blaisorblade
2005-09-28 8:41 ` Blaisorblade
1 sibling, 1 reply; 20+ messages in thread
From: Jeff Dike @ 2005-09-27 17:28 UTC (permalink / raw)
To: Young Koh; +Cc: user-mode-linux-devel
On Tue, Sep 27, 2005 at 10:06:53AM -0400, Young Koh wrote:
> my question is, if so, shouldn't the error be caught when UML kernel
> translates the user space address to the kernel space address? i mean,
> UML kernel must know the valid memory regions and if the address is
> out of the valid regions, then it knows the address is invalid before
> UML tries to access the address. why should it use sigsetjmp() and let
> a segfault occur?
Because the address may be fine, and an access may still cause a segfault.
UML memory is backed by a file on the host. You can map anything from
the file you want, but if you access it when the host filesystem is full
or you've exceeded your disk quota, the access will segfault.
Jeff
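A minimal user-space sketch of the pattern discussed here, not UML's actual code: arm a jump point with sigsetjmp(), perform the access, and turn a SIGSEGV/SIGBUS into -EFAULT instead of checking the address up front. All names (fault_env, guarded_copy, inaccessible_page, ...) are made up for illustration, and a PROT_NONE mapping stands in for memory the host refuses to back.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>

static sigjmp_buf fault_env;               /* where to resume after a fault */
static volatile sig_atomic_t fault_armed;  /* set only while a guarded copy runs */

/* Signal handler: jump back into guarded_copy() if one is in progress. */
static void fault_handler(int sig)
{
    if (fault_armed)
        siglongjmp(fault_env, 1);
    signal(sig, SIG_DFL);   /* not ours: let the retried access kill us */
}

static int install_fault_catcher(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = fault_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) < 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);
}

/* Copy n bytes; return 0 on success or -EFAULT if the access faulted. */
static int guarded_copy(void *to, const void *from, size_t n)
{
    /* volatile reads keep the accesses between arming and disarming */
    const volatile unsigned char *src = from;
    unsigned char *dst = to;
    int ret = 0;
    size_t i;

    assert(to != NULL && from != NULL);
    fault_armed = 1;
    if (sigsetjmp(fault_env, 1) == 0) {   /* 1: save the signal mask too */
        for (i = 0; i < n; i++)
            dst[i] = src[i];
    } else {
        ret = -EFAULT;                    /* reached via siglongjmp() */
    }
    fault_armed = 0;
    return ret;
}

/* Demo helper: a mapped page that faults on any access. */
static void *inaccessible_page(void)
{
    return mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

After install_fault_catcher(), guarded_copy(dst, src, n) returns 0 for ordinary buffers and -EFAULT when the source is the page returned by inaccessible_page(). The address in the second case is perfectly "valid" as far as the mapping goes; only the access itself reveals the problem.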
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-27 14:06 [uml-devel] a question about sigsetjmp() in copy_from/to_user() Young Koh
2005-09-27 17:28 ` Jeff Dike
@ 2005-09-28 8:41 ` Blaisorblade
2005-09-28 14:22 ` Young Koh
1 sibling, 1 reply; 20+ messages in thread
From: Blaisorblade @ 2005-09-28 8:41 UTC (permalink / raw)
To: user-mode-linux-devel, Young Koh
On Tuesday 27 September 2005 16:06, Young Koh wrote:
> Hi,
> i have a question about copy_from/to_user() implementation in skas mode.
Ok, and here I'll explain also about TT mode, since they're reasonably
similar, and TT mode is more similar to i386.
> as my understanding,
> when copy_from/to_user() is invoked, before the address translation
> happens, UML kernel calls sigsetjmp() to come back when there is a
> segmentation fault. and if there is, it seems that the system call an
> application triggered eventually returns EFAULT.
Yes, copy_*_user returns a failure code and the calling code is supposed to
check and return EFAULT.
That assumes the fault is a *real* fault, i.e. an unfixable one - otherwise we
may simply need to call handle_page_fault() and load the page from swap.
> then, it seems to me
> that sigsetjmp() is to catch the error when the application gave the
> invalid user space address.
Exactly.
> my question is, if so, shouldn't the error be caught when UML kernel
> translates the user space address to the kernel space address? i mean,
> UML kernel must know the valid memory regions
Well, saying "regions" is a bit confusing - in fact, you are dealing with
installed mappings (mmap()s done on the host), which are like page tables
(populated on fault).
> and if the address is
> out of the valid regions, then it knows the address is invalid before
> UML tries to access the address.
> why should it use sigsetjmp() and let
> a segfault occur?
In general, because it is faster: you must optimize for the fast
path, where a segfault won't occur and checking beforehand would only add overhead.
1) Background: TT mode and i386 implementation. They can access the user
address directly, so they do, and catch the error afterwards. Why? Because
in the cases where we care about performance, the application will pass a
correct address. So it's better to optimize the performance of the fast path
(correct address) than that of the slow path. The hardware walking of page
tables is faster than the software one (also thanks to TLBs, which are
processor caches of page tables).
Whenever a fault occurs, the i386 exception handler (see
search_exception_tables() and
grep ".section __ex_table" include/asm-i386/*)
and/or the TT mode fault catcher make copy_*_user return an error.
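To make the exception-table idea concrete, here is a simplified model, not the kernel's code: each entry pairs the address of an instruction that is allowed to fault with the address of its fixup code, and the fault handler looks up the faulting address in a table sorted by instruction address. The struct and function names below are assumptions for illustration; a result of 0 stands for "no fixup, this is a genuine kernel bug".

```c
#include <assert.h>
#include <stddef.h>

/* One entry: "if the instruction at insn faults, resume at fixup". */
struct extable_entry {
    unsigned long insn;
    unsigned long fixup;
};

/*
 * Binary search over a table sorted by insn.
 * Returns the fixup address, or 0 if the faulting address is not listed.
 */
static unsigned long find_fixup(const struct extable_entry *tbl, size_t n,
                                unsigned long addr)
{
    size_t lo = 0, hi = n;

    assert(tbl != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tbl[mid].insn == addr)
            return tbl[mid].fixup;
        if (tbl[mid].insn < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

The point of the design is that the fast path pays nothing: the table is only consulted after a fault has already happened, and the fixup code is what makes copy_*_user return an error.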
2) SKAS instead. SKAS is like 4G/4G on the host (it is actually a 3G/3G).
In SKAS mode, we actually walk the page tables, because we cannot access the
host mapping - we're using a different mapping set.
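As a rough illustration of what "walking the page tables in software" means, here is a toy two-level walk using the classic i386 non-PAE split (10 bits directory index, 10 bits table index, 12 bits page offset). The data structures and names are invented for the sketch and are much simpler than the kernel's pgd/pmd/pte types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1u << TOY_PAGE_SHIFT)
#define TOY_PTRS       1024u
#define TOY_PRESENT    0x1u

/* Second level: (frame address | flags) per page. */
struct toy_pt  { uint32_t pte[TOY_PTRS]; };
/* First level: pointers to second-level tables, NULL if absent. */
struct toy_pgd { struct toy_pt *pt[TOY_PTRS]; };

/*
 * Translate a virtual address to a "physical" offset.
 * Returns -1 if any level is missing, i.e. the access would fault.
 */
static int64_t toy_translate(const struct toy_pgd *pgd, uint32_t vaddr)
{
    const struct toy_pt *pt;
    uint32_t pte;

    assert(pgd != NULL);
    pt = pgd->pt[vaddr >> 22];                       /* top 10 bits */
    if (pt == NULL)
        return -1;
    pte = pt->pte[(vaddr >> TOY_PAGE_SHIFT) & (TOY_PTRS - 1)];
    if (!(pte & TOY_PRESENT))
        return -1;
    return (int64_t)((pte & ~(TOY_PAGE_SIZE - 1)) |
                     (vaddr & (TOY_PAGE_SIZE - 1)));
}
```

A -1 result here corresponds to the case where the real code would have to call the page fault handler or fail the copy, without ever touching the user address directly.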
In fact, what you see doesn't catch wrong user space addresses.
It catches faults on kernelspace addresses - which can legitimately happen, because
the i386 implementation catches any fault without making a distinction. It
happens when you do things like "cat /dev/kmem" - you're trying
to do copy_to_user(to, offset /* which is 0 */, size).
In fact, that sigsetjmp() was added back in 2.4.24-?um (IIRC) and then around
~2.6.7-um after Jeff and I analyzed this.
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-27 17:28 ` Jeff Dike
@ 2005-09-28 11:59 ` Blaisorblade
2005-09-28 13:47 ` Young Koh
2005-09-28 16:09 ` Jeff Dike
0 siblings, 2 replies; 20+ messages in thread
From: Blaisorblade @ 2005-09-28 11:59 UTC (permalink / raw)
To: user-mode-linux-devel; +Cc: Jeff Dike, Young Koh
On Tuesday 27 September 2005 19:28, Jeff Dike wrote:
> On Tue, Sep 27, 2005 at 10:06:53AM -0400, Young Koh wrote:
> > my question is, if so, shouldn't the error be caught when UML kernel
> > translates the user space address to the kernel space address? i mean,
> > UML kernel must know the valid memory regions and if the address is
> > out of the valid regions, then it knows the address is invalid before
> > UML tries to access the address. why should it use sigsetjmp() and let
> > a segfault occur?
>
> Because the address may be fine, and an access may still cause a segfault.
>
> UML memory is backed by a file on the host. You can map anything from
> the file you want, but if you access it when the host filesystem is full
> or you've exceeded your disk quota, the access will segfault.
That wasn't the original reason - this is fine too, but as I explained in the
other mail, cat /dev/kmem will cause a copy_to_user() with invalid kernel
("from") address. I remember because I discussed this with you at length.
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-28 11:59 ` Blaisorblade
@ 2005-09-28 13:47 ` Young Koh
2005-09-28 14:50 ` Jeff Dike
2005-09-28 16:09 ` Jeff Dike
1 sibling, 1 reply; 20+ messages in thread
From: Young Koh @ 2005-09-28 13:47 UTC (permalink / raw)
To: Jeff Dike; +Cc: user-mode-linux-devel
Thank you for your reply, but still have one more.
(i think i forgot to reply to the mailing list with the previous
email, so, i'm attaching the text)
On 9/28/05, Jeff Dike <jdike@addtoit.com> wrote:
> On Tue, Sep 27, 2005 at 08:56:51PM -0400, Young Koh wrote:
> > 1) if the address is fine, shouldn't the access that causes a segfault
> > be regarded as a page fault? that is, shouldn't it be handled by UML
> > kernel and UML proceeds normally instead of returning an error to the
> > application? (because the app gave the proper address)
>
> No.
>
> It's akin to a piece of memory all of a sudden going bad. You allocated the
> memory and thought you could use it, but when you try, it turns out not
> to be there.
yes, i think memory can go bad if 1) it cannot be allocated because of
filesystem full or similar reasons as Jeff described, or 2) it was
allocated once but could have been swapped.
in case of 1) it should be more like kernel panic rather than just a
system call error, i think? because kernel cannot allocate any more
memory, which kernel is supposed to use.
in case of 2) this is a real fault, so, the seg fault handler of UML
kernel is supposed to load the swapped page and UML kernel proceeds
normally? (as Blaisorblade described in the other mail, by calling
handle_page_fault()?)
Thank you,
-Young
>
> > 2) i thought that the file used for UML memory is created when a UML
> > process is initialized.
> > then, the memory file is not created for the fixed size at first, but
> > it changes the size according to the UML memory usage?
>
> It's not fully allocated. It starts off sparse and gets allocated on
> the host as the guest's memory usage increases.
>
> > 3) is it the only case sigsetjmp() protected from?
>
> I believe so, but am not positive.
>
> Jeff
>
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-28 8:41 ` Blaisorblade
@ 2005-09-28 14:22 ` Young Koh
2005-09-28 16:43 ` Blaisorblade
0 siblings, 1 reply; 20+ messages in thread
From: Young Koh @ 2005-09-28 14:22 UTC (permalink / raw)
To: Blaisorblade; +Cc: user-mode-linux-devel
so, as I understand it, sigsetjmp() is used for returning an error
when an access to a userspace and/or kernelspace address faults, in both
skas and tt modes. and the i386 implementation works the same way, i
guess.
my one quick question is (it could sound stupid, but) why would there
be a kernelspace fault at all? the kernel must be correct and shouldn't
access a bad address, i guess, and if it does, shouldn't it be a kernel
panic?
> In fact, what you see doesn't catch user space wrong addresses.
>
> It catches kernelspace faulting addresses - which is legal to happen, because
> i386 implementation catches any fault, and doesn't make a distinction, and
> which happens, when you try to do things like "cat /dev/kmem" - you're trying
> to do copy_to_user(to, offset /* which is 0 */, size).
>
> In fact, that sigsegjmp() was added back in 2.4.24-?um (IIRC) and then around
> ~2.6.7-um after I and Jeff analyzed this.
i'm using 2.4.26 and 2.6.12 and i think both versions include sigsetjmp().
Thank you,
-Young
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-28 13:47 ` Young Koh
@ 2005-09-28 14:50 ` Jeff Dike
2005-09-28 19:25 ` Young Koh
0 siblings, 1 reply; 20+ messages in thread
From: Jeff Dike @ 2005-09-28 14:50 UTC (permalink / raw)
To: Young Koh; +Cc: user-mode-linux-devel
On Wed, Sep 28, 2005 at 09:47:41AM -0400, Young Koh wrote:
> yes, i think memory can go bad if 1) it cannot be allocated because of
> filesystem full or similar reasons as Jeff described, or 2) it was
> allocated once but could have been swapped.
>
> in case of 1) it should be more like kernel panic rather than just a
> system call error, i think? because kernel cannot allocate any more
> memory, which kernel is supposed to use.
No, just that page is bad. Another page could have been dirtied and thus
allocated on the host, and it would be usable. So, it's not a fatal problem.
> in case of 2) this is a real fault, so, the seg fault handler of UML
> kernel is supposed to load the swapped page and UML kernel proceeds
> normally? (as Blaisorblade described in the other mail, by calling
> handle_page_fault()?)
Yes, this happens in the call from maybe_map.
Jeff
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-28 11:59 ` Blaisorblade
2005-09-28 13:47 ` Young Koh
@ 2005-09-28 16:09 ` Jeff Dike
2005-09-28 17:26 ` Blaisorblade
1 sibling, 1 reply; 20+ messages in thread
From: Jeff Dike @ 2005-09-28 16:09 UTC (permalink / raw)
To: Blaisorblade; +Cc: user-mode-linux-devel, Young Koh
On Wed, Sep 28, 2005 at 01:59:48PM +0200, Blaisorblade wrote:
> That wasn't the original reason - this is fine too, but as I explained in the
> other mail, cat /dev/kmem will cause a copy_to_user() with invalid kernel
> ("from") address. I remember because I discussed this with you at length.
Oh yeah. I was thinking there was a different (and better) reason, but I
couldn't remember what it was.
Also, my reason isn't that good anyway. I had a different fix a while ago,
but it got lost somewhere. I added an arch hook to get_free_pages which
touched each allocated page under the cover of a setjmp. If a page couldn't
be allocated on the host, then it is put on a "bad pages" list, and another
page is allocated instead.
Jeff
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-28 14:22 ` Young Koh
@ 2005-09-28 16:43 ` Blaisorblade
0 siblings, 0 replies; 20+ messages in thread
From: Blaisorblade @ 2005-09-28 16:43 UTC (permalink / raw)
To: user-mode-linux-devel, Young Koh
On Wednesday 28 September 2005 16:22, Young Koh wrote:
> so, as my understanding, sigsetjmp() is used for returning an error
> when there is a userspace and/or kernelspace address faulting in both
> skas and tt modes. and i386 implementation works the same way, i
> guess.
> my one quick question is (it could sound stupid,
Not at all.
> but) that why there
> may be a kernelspace faulting? kernel must correct and shouldn't
> access bad address, i guess, and if so, shouldn't it be a kernel
> panic?
cat /dev/kmem, as I already said (won't repeat the whole story here). Yes,
yes, yes, the kmem driver could check the address manually, but (same story
as the rest):
*) checking by hand is slower
*) it's not needed, because it works on i386 and other archs conform to i386.
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-28 16:09 ` Jeff Dike
@ 2005-09-28 17:26 ` Blaisorblade
2005-09-28 18:43 ` Jeff Dike
0 siblings, 1 reply; 20+ messages in thread
From: Blaisorblade @ 2005-09-28 17:26 UTC (permalink / raw)
To: user-mode-linux-devel; +Cc: Jeff Dike, Young Koh
On Wednesday 28 September 2005 18:09, Jeff Dike wrote:
> On Wed, Sep 28, 2005 at 01:59:48PM +0200, Blaisorblade wrote:
> > That wasn't the original reason - this is fine too, but as I explained in
> > the other mail, cat /dev/kmem will cause a copy_to_user() with invalid
> > kernel ("from") address. I remember because I discussed this with you at
> > length.
> Oh yeah. I was thinking there was a different (and better) reason, but I
> couldn't remember what it was.
> Also, my reason isn't that good anyway. I had a different fix a while ago,
> but it got lost somewhere. I added an arch hook to get_free_pages which
> touched each allocated page under the cover of a setjmp. If a page
> couldn't be allocated on the host, then it is put on a "bad pages" list,
> and another page is allocated instead.
I'm not sure that would help anyway - if the host memory is full, it's full.
It's just a matter of waiting and retrying.
I don't think the host would SIGBUS again on the same page specifically - or
better, I'm almost sure this is not done.
So, I don't see the reason for that. Catching SIGSEGV/SIGBUS is ok, taking
another page is bad.
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-28 17:26 ` Blaisorblade
@ 2005-09-28 18:43 ` Jeff Dike
0 siblings, 0 replies; 20+ messages in thread
From: Jeff Dike @ 2005-09-28 18:43 UTC (permalink / raw)
To: Blaisorblade; +Cc: user-mode-linux-devel, Young Koh
On Wed, Sep 28, 2005 at 07:26:43PM +0200, Blaisorblade wrote:
> I'm not sure that would help anyway - if the host memory is full, it's full.
> It's just matter of waiting and retrying.
This isn't a matter of waiting and retrying. We use pages that are already
known to be good, and the rest go on a "bad pages" list and are not freed, so
no one will try to use them again.
> I don't think the host would SIGBUS again on the same page specifically - or
> better, I'm almost sure this is not done.
> So, I don't see the reason for that. Catching SIGSEGV/SIGBUS is ok, taking
> another page is bad.
Catching SIGSEGV/SIGBUS at some random place in the kernel after it kmalloced
a bad page is OK? I don't think so.
The only way to keep the kernel running at that point is to get another
page and hope that it's OK. And if it's not, you try again.
Jeff
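A user-space sketch of the "touch each page under the cover of a setjmp" idea, under stated assumptions: it is not the lost patch, the names (page_is_usable, alloc_checked_page, bad_pages, raw_page) are invented, and an mmap() with PROT_NONE stands in for a page the host cannot allocate.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

static sigjmp_buf probe_env;

static void probe_fault(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Touch the page once; return 1 if the write worked, 0 if it faulted. */
static int page_is_usable(void *page)
{
    struct sigaction sa, old_segv, old_bus;
    int ok;

    assert(page != NULL);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = probe_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(probe_env, 1) == 0) {
        *(volatile char *)page = 0;   /* dirtying forces host allocation */
        ok = 1;
    } else {
        ok = 0;                       /* the host refused the page */
    }
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return ok;
}

#define MAX_BAD 16
static void *bad_pages[MAX_BAD];      /* never handed out, never freed */
static size_t n_bad;

/* Stand-in for the raw allocator; PROT_NONE simulates an unbackable page. */
static void *raw_page(int prot)
{
    return mmap(NULL, 4096, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Walk the candidates, parking unusable pages on the bad list. */
static void *alloc_checked_page(void **candidates, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (page_is_usable(candidates[i]))
            return candidates[i];
        if (n_bad < MAX_BAD)
            bad_pages[n_bad++] = candidates[i];
    }
    return NULL;                      /* caller has to cope with failure */
}
```

The difference from catching the fault at the point of use is where the failure surfaces: here it is confined to the allocator, so code that received a page never sees a signal for it later.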
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-28 14:50 ` Jeff Dike
@ 2005-09-28 19:25 ` Young Koh
2005-09-29 12:09 ` Blaisorblade
0 siblings, 1 reply; 20+ messages in thread
From: Young Koh @ 2005-09-28 19:25 UTC (permalink / raw)
To: Jeff Dike; +Cc: user-mode-linux-devel
hi,
On 9/28/05, Jeff Dike <jdike@addtoit.com> wrote:
> On Wed, Sep 28, 2005 at 09:47:41AM -0400, Young Koh wrote:
> > yes, i think memory can go bad if 1) it cannot be allocated because of
> > filesystem full or similar reasons as Jeff described, or 2) it was
> > allocated once but could have been swapped.
> >
> > in case of 1) it should be more like kernel panic rather than just a
> > system call error, i think? because kernel cannot allocate any more
> > memory, which kernel is supposed to use.
>
> No, just that page is bad. Another page could have been dirtied and thus
> allocated on the host, and it would be usable. So, it's not a fatal problem.
>
then, if just that page is bad, shouldn't the UML kernel wait until
another page is usable (or force another page to be swapped out),
allocate the free page, and proceed normally? i may still be confused.
Ok, my thought/idea/suggestion is: what if UML uses a TLB-like
table before it does the address translation? i mean, once there is a
valid mapping, UML inserts the address mapping (or page mapping) into a
software TLB. after that, for that userspace address, UML can search
the software TLB and use the mapping without calling sigsetjmp()
and walking through the page tables. it seems that sigsetjmp() has
relatively large overhead, so we could reduce some overhead by not
calling it. but surely the problem is that the mapping can become invalid.
for that, UML may invalidate a TLB entry if the corresponding page is
swapped out or any change is made. what do you think?
thank you,
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-28 19:25 ` Young Koh
@ 2005-09-29 12:09 ` Blaisorblade
2005-09-30 15:08 ` Young Koh
2005-10-02 1:03 ` Jeff Dike
0 siblings, 2 replies; 20+ messages in thread
From: Blaisorblade @ 2005-09-29 12:09 UTC (permalink / raw)
To: user-mode-linux-devel, Young Koh; +Cc: Jeff Dike
On Wednesday 28 September 2005 21:25, Young Koh wrote:
> hi,
>
> On 9/28/05, Jeff Dike <jdike@addtoit.com> wrote:
> > On Wed, Sep 28, 2005 at 09:47:41AM -0400, Young Koh wrote:
> > No, just that page is bad.
Again, that page is not bad. There is no page yet for this address, and the
host won't allocate one for now.
> > Another page could have been dirtied and thus
> > allocated on the host, and it would be usable. So, it's not a fatal
> > problem.
Ok, this makes a bit of sense. Even if IMHO it doesn't work, I now see your
point (but I still stand by what I said above).
However, even a dirtied page could be "bad" if it has been swapped out. If we're
getting a SIGBUS, it means that the host didn't succeed in freeing any memory.
And, frankly, unless the UML RAM file is kept on ramfs (which is RAM-only), it
can be swapped (both for disk-based filesystems and for tmpfs).
So, I don't think what you suggest could work.
> then, if just that page is bad, shouldn't UML kernel wait until
> another page is usable (or force another page to be swapped out)
We're talking about the host, and if the host is on OOM, you can't help it.
The best you can do is to reclaim cache memory. But that must be done when the
host is starting to swap, not at SIGBUS time.
> and
> allocate the free page? and proceed normal? i may be still confused.
Jeff's point is that once we have dirtied a page, and the host hasn't yet
swapped it, it could be in memory - and accessing it would work. Having
dirtied it means we allocated it with (say) kmalloc and then dirtied it.
> Ok, my thought/idea/suggestion is that what if UML uses a TLB-like
> table before it does the address translation?
> i mean, once there is a
> valid mapping, UML inserts the address mapping(or page mapping) into a
> software TLB. after that, for that userspace address, UML can search
> the software TLB table and use the mapping without calling sigsetjmp()
> and walking through page tables.
> it seems that sigsetjmp() has
> relatively large overhead, we could reduce some overhead by not
> calling it.
How do you measure it? I'm curious myself - I know there's the possibility to
use gprof, but I've never used that myself.
Surely the "sig" thing is heavy (some syscalls like sigprocmask() for
blocking/unblocking signals). Or better, it's the only heavy thing - the rest
consists only of saving a couple of registers (6, IIRC) in memory, there's no
interest in optimizing it away (I assume).
But that part will go away - there's a "softints" patch for this (i.e. moving
the blocking/unblocking to userspace - the signal handler notices the signal
is stopped and queues the handling via userspace mechanisms). It's at:
http://user-mode-linux.sourceforge.net/patches.html
> but surely the problem is the mapping can go corrupted.
> for that, UML may invalidate a TLB entry if the corresponding page is
> swapped out or any change is made. what do you think?
In short, it's a really interesting idea.
The kernel (arch-independent) infrastructure for this exists, for managing the
real TLBs. You already need to invalidate TLB entries when you swap a page and
such. Reading Documentation/cachetlb.txt is definitely worth the time spent.
And actually, currently that is used to update the host mappings.
Using TLBs to save the page table walk is interesting, especially since that
would avoid taking a spinlock on SMP (the current implementation of
maybe_map() doesn't, but it should), and more importantly because the TLB
entries would likely be hotter than the page tables as a whole, so the TLB
would probably fit in the L2 cache (while, to walk page tables, we're likely
going to have an L2 miss - they're too big). Hotter means "more likely to be
accessed, and thus more worth keeping in cache". The cache usage discussion in
the Reiser4 whitepaper (www.namesys.com) is really enlightening on this point.
Don't know if we can optimize the locking on the TLBs, though - we could maybe
use atomic ops, or have per-processor TLBs (which is the way it's
implemented in hardware - you get IPIs on flush, though). On i386,
atomic_read and atomic_set have no additional cost over their non-atomic
counterparts, so maybe shared TLBs are ok - they'd need to be tagged,
however.
I've not yet thought about an efficient data structure - an array means that
invalidation checks each entry, and I'd like to avoid that. The other way is
to empty the TLB on flushing a single entry.
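One possible data structure, sketched here purely as an assumption and not as anything UML implements: a direct-mapped table indexed by the low bits of the virtual page number. A lookup and a single-entry invalidation each touch exactly one slot, so invalidation does not have to scan the array; the price is that two pages mapping to the same slot evict each other. All names and the table size are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STLB_PAGE_SHIFT 12
#define STLB_BITS       6
#define STLB_SIZE       (1u << STLB_BITS)

struct stlb_entry {
    uintptr_t vpn;     /* virtual page number of the user address */
    void *kaddr;       /* kernel-visible address of that page */
    int valid;
};

struct stlb {
    struct stlb_entry e[STLB_SIZE];
};

/* Each virtual page can live in exactly one slot. */
static struct stlb_entry *stlb_slot(struct stlb *t, uintptr_t vaddr)
{
    assert(t != NULL);
    return &t->e[(vaddr >> STLB_PAGE_SHIFT) & (STLB_SIZE - 1)];
}

/* Record a translation, evicting whatever shared the slot. */
static void stlb_insert(struct stlb *t, uintptr_t vaddr, void *page_kaddr)
{
    struct stlb_entry *e = stlb_slot(t, vaddr);

    e->vpn = vaddr >> STLB_PAGE_SHIFT;
    e->kaddr = page_kaddr;
    e->valid = 1;
}

/* Return the translated address, or NULL on a miss (walk the tables then). */
static void *stlb_lookup(struct stlb *t, uintptr_t vaddr)
{
    struct stlb_entry *e = stlb_slot(t, vaddr);
    uintptr_t off = vaddr & (((uintptr_t)1 << STLB_PAGE_SHIFT) - 1);

    if (!e->valid || e->vpn != (vaddr >> STLB_PAGE_SHIFT))
        return NULL;
    return (char *)e->kaddr + off;
}

/* Drop one translation, e.g. when the page is swapped out or remapped. */
static void stlb_invalidate(struct stlb *t, uintptr_t vaddr)
{
    struct stlb_entry *e = stlb_slot(t, vaddr);

    if (e->valid && e->vpn == (vaddr >> STLB_PAGE_SHIFT))
        e->valid = 0;
}

/* Drop everything, e.g. on an address space switch. */
static void stlb_flush_all(struct stlb *t)
{
    assert(t != NULL);
    memset(t->e, 0, sizeof(t->e));
}
```

This sketch ignores locking and tagging entirely; a shared table would additionally need an address-space tag in each entry, and a per-processor one would need some cross-CPU flush mechanism.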
The only problem is when there is a fault on the kernelspace address.
However, if setjmp() is still costly, we may implement some checking instead:
only kernelspace addresses above TASK_SIZE are valid (or something of
this sort).
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-29 12:09 ` Blaisorblade
@ 2005-09-30 15:08 ` Young Koh
2005-09-30 15:44 ` Geert Uytterhoeven
2005-10-02 1:03 ` Jeff Dike
1 sibling, 1 reply; 20+ messages in thread
From: Young Koh @ 2005-09-30 15:08 UTC (permalink / raw)
To: Blaisorblade; +Cc: user-mode-linux-devel, Jeff Dike
> > it seems that sigsetjmp() has
> > relatively large overhead, we could reduce some overhead by not
> > calling it.
>
> How do you measure it? I'm curious myself - I know there's the possibility to
> use gprof, but I've never used that myself.
i usually use the pentium rdtsc (read time stamp counter) instruction to
measure timing and latency (sometimes by instrumenting the kernel, sometimes
by measuring test programs). i ran a test program and measured the
overhead of sigsetjmp() and setjmp(). it showed sigsetjmp() takes
around 1350 cycles (which is around 0.45 us on a 3.0GHz machine), and
setjmp() only 21 cycles (< 0.01us). maybe sigsetjmp() is
implemented with a system call to cope with signal blocking/unblocking, while
setjmp() is not a system call? (getpid() itself takes more than 1000
cycles)
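A comparable measurement can be sketched as below. This is not the original test program: it swaps rdtsc for the portable clock_gettime(CLOCK_MONOTONIC), so it reports nanoseconds rather than cycles, and the function names are invented. The savemask argument of 1 is what makes sigsetjmp() save the signal mask, which is presumably where the extra cost comes from. cycles_to_us() just reproduces the arithmetic quoted above (1350 cycles at 3.0 GHz is 0.45 us).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds. */
static double now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Average cost of one setjmp() call, in ns (loop overhead included). */
static double time_setjmp_ns(long iters)
{
    jmp_buf env;
    double start;
    long i;

    assert(iters > 0);
    start = now_ns();
    for (i = 0; i < iters; i++)
        (void)setjmp(env);
    return (now_ns() - start) / (double)iters;
}

/* Same for sigsetjmp() with savemask = 1 (signal mask is saved too). */
static double time_sigsetjmp_ns(long iters)
{
    sigjmp_buf env;
    double start;
    long i;

    assert(iters > 0);
    start = now_ns();
    for (i = 0; i < iters; i++)
        (void)sigsetjmp(env, 1);
    return (now_ns() - start) / (double)iters;
}

/* Convert a cycle count to microseconds at a given clock rate in Hz. */
static double cycles_to_us(double cycles, double hz)
{
    return cycles / hz * 1e6;
}
```

Calling time_setjmp_ns(1000000) and time_sigsetjmp_ns(1000000) and comparing the two averages gives a rough picture; the absolute numbers depend heavily on the machine, the C library and the kernel.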
thanks,
>
> Surely the "sig" thing is heavy (some syscalls like sigprocmask() for
> blocking/unblocking signals). Or better, it's the only heavy thing - the rest
> consists only of saving a couple of registers (6, IIRC) in memory, there's no
> interest in optimizing it away (I assume).
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-30 15:08 ` Young Koh
@ 2005-09-30 15:44 ` Geert Uytterhoeven
0 siblings, 0 replies; 20+ messages in thread
From: Geert Uytterhoeven @ 2005-09-30 15:44 UTC (permalink / raw)
To: Young Koh; +Cc: Blaisorblade, user-mode-linux-devel, Jeff Dike
On Fri, 30 Sep 2005, Young Koh wrote:
> > > it seems that sigsetjmp() has
> > > relatively large overhead, we could reduce some overhead by not
> > > calling it.
> >
> > How do you measure it? I'm curious myself - I know there's the possibility to
> > use gprof, but I've never used that myself.
>
> I usually use the Pentium rdtsc (read time stamp counter) instruction to
> measure timing and latency, sometimes by instrumenting the kernel and
> sometimes by measuring test programs. I ran a test program and measured the
> overhead of sigsetjmp() and setjmp(). It showed that sigsetjmp() takes
> around 1350 cycles (about 0.45 us on a 3.0 GHz machine), while
> setjmp() takes only 21 cycles (< 0.01 us). Maybe sigsetjmp() makes a
> system call to cope with signal blocking/unblocking, while
> setjmp() does not? (getpid() by itself takes more than 1000
> cycles.)
Indeed, setjmp() is not a system call. It just saves the registers to the
passed env structure.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-09-29 12:09 ` Blaisorblade
2005-09-30 15:08 ` Young Koh
@ 2005-10-02 1:03 ` Jeff Dike
2005-10-02 10:23 ` Blaisorblade
1 sibling, 1 reply; 20+ messages in thread
From: Jeff Dike @ 2005-10-02 1:03 UTC (permalink / raw)
To: Blaisorblade; +Cc: user-mode-linux-devel, Young Koh
On Thu, Sep 29, 2005 at 02:09:27PM +0200, Blaisorblade wrote:
> Again, that page is not bad. There is no page yet for this address, and the
> host won't allocate one for now.
It is bad in the sense that, unless some space is freed on that mount, a
reference to the page will always fault.
> Ok, this makes a bit of sense; even if IMHO it doesn't work, I now see your
> point (but I still insist on what I said above).
Explain why it doesn't work.
> However, even a dirtied page could be "bad", if it has been swapped. If
> we're getting a SIGBUS, it meant that it didn't succeed in freeing any
> memory.
No it can't. A swapped page still counts as occupying space in the filesystem.
If a page was successfully allocated, then accesses to it will always succeed,
even if it needs to be swapped in.
> And, frankly, unless the UML ram file is kept on ramfs (which is RAM-only),
> it can be swapped (both for disk-based filesystem and for tmpfs).
> So, I don't think what you suggest could work.
Swapping makes no difference.
Jeff
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-10-02 1:03 ` Jeff Dike
@ 2005-10-02 10:23 ` Blaisorblade
2005-10-02 18:31 ` Jeff Dike
0 siblings, 1 reply; 20+ messages in thread
From: Blaisorblade @ 2005-10-02 10:23 UTC (permalink / raw)
To: Jeff Dike; +Cc: user-mode-linux-devel, Young Koh
On Sunday 02 October 2005 03:03, Jeff Dike wrote:
> On Thu, Sep 29, 2005 at 02:09:27PM +0200, Blaisorblade wrote:
> > Again, that page is not bad. There is no page yet for this address, and
> > the host won't allocate one for now.
> It is bad in the sense that, unless some space is freed on that mount, a
> reference to the page will always fault.
Sorry, any reference will fault, unless it is done on an allocated, present
page, which the UML kernel freed but the host didn't. And remember, btw,
you've planned to make this impossible...
> > Ok, this makes a bit of sense; even if IMHO it doesn't work, I now see
> > your point (but I still insist on what I said above).
> Explain why it doesn't work.
Below.
> > However, even a dirtied page could be "bad", if it has been swapped. If
> > we're getting a SIGBUS, it meant that it didn't succeed in freeing any
> > memory.
> No it can't. A swapped page still counts as occupying space in the
> filesystem. If a page was successfully allocated, then accesses to it will
> always succeed, even if it needs to be swapped in.
Sorry, Jeff, which page are you going to evict? It can be a dirty page. Unless
you mean that since that page is still accounted in the FS, Linux will leave
a RAM page free to allow it to be re-read, while still swapping the page.
You obviously didn't mean this absurdity (why swap it in the first place?),
but I don't see what you are missing.
> > And, frankly, unless the UML ram file is kept on ramfs (which is
> > RAM-only), it can be swapped (both for disk-based filesystem and for
> > tmpfs). So, I don't think what you suggest could work.
> Swapping makes no difference.
Reloading pages means freeing RAM to make room for them.
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-10-02 10:23 ` Blaisorblade
@ 2005-10-02 18:31 ` Jeff Dike
2005-10-03 18:35 ` Blaisorblade
0 siblings, 1 reply; 20+ messages in thread
From: Jeff Dike @ 2005-10-02 18:31 UTC (permalink / raw)
To: Blaisorblade; +Cc: user-mode-linux-devel, Young Koh
On Sun, Oct 02, 2005 at 12:23:14PM +0200, Blaisorblade wrote:
> Sorry, any reference will fault, unless it is done on an allocated, present
> page, which the UML kernel freed but the host didn't. And remember, btw,
> you've planned to make this impossible...
You are not making any sense to me here.
> Sorry, Jeff, which page are you going to evict? It can be a dirty page. Unless
> you mean that since that page is still accounted in the FS, Linux will leave
> a RAM page free to allow it to be re-read, while still swapping the page.
Swapped pages are accounted in the FS all the time and there's obviously
no dedicated page left free for them when they are next pulled in. All
disk-based filesystems account pages that are on disk and not in memory.
tmpfs is no different, except that its disk is the swap partition.
Jeff
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-10-02 18:31 ` Jeff Dike
@ 2005-10-03 18:35 ` Blaisorblade
2005-10-03 20:38 ` Jeff Dike
0 siblings, 1 reply; 20+ messages in thread
From: Blaisorblade @ 2005-10-03 18:35 UTC (permalink / raw)
To: Jeff Dike; +Cc: user-mode-linux-devel, Young Koh
On Sunday 02 October 2005 20:31, Jeff Dike wrote:
> On Sun, Oct 02, 2005 at 12:23:14PM +0200, Blaisorblade wrote:
> > Sorry, any reference will fault, unless it is done on an allocated, present
> > page, which the UML kernel freed but the host didn't. And remember, btw,
> > you've planned to make this impossible...
> You are not making any sense to me here.
/dev/anon - when the UML kernel frees a page, we ask the host to free it too.
> > Sorry, Jeff, which page are you going to evict? It can be a dirty page.
> > Unless you mean that since that page is still accounted in the FS, Linux
> > will leave a RAM page free to allow it to be re-read, while still
> > swapping the page.
> Swapped pages are accounted in the FS all the time and there's obviously
> no dedicated page left free for them when they are next pulled in. All
> disk-based filesystems account pages that are on disk and not in memory.
> tmpfs is no different, except that its disk is the swap partition.
Exactly what I knew...
However, I was in error...
I just saw that filling the disk, or tmpfs, is rather different than going
OOM. OOM causes SIGKILL, while SIGBUS (as you correctly said) comes from full
disk/partition, or filled disk quota. I only checked tmpfs, but that gives at
least a feeling.
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [uml-devel] a question about sigsetjmp() in copy_from/to_user()
2005-10-03 18:35 ` Blaisorblade
@ 2005-10-03 20:38 ` Jeff Dike
0 siblings, 0 replies; 20+ messages in thread
From: Jeff Dike @ 2005-10-03 20:38 UTC (permalink / raw)
To: Blaisorblade; +Cc: user-mode-linux-devel, Young Koh
On Mon, Oct 03, 2005 at 08:35:54PM +0200, Blaisorblade wrote:
> /dev/anon - when the UML kernel frees a page, we ask the host to free it too.
Yeah, /dev/anon is a completely different story, but I thought we were talking
about normal tmpfs.
Jeff
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2005-10-03 21:17 UTC | newest]
Thread overview: 20+ messages
2005-09-27 14:06 [uml-devel] a question about sigsetjmp() in copy_from/to_user() Young Koh
2005-09-27 17:28 ` Jeff Dike
2005-09-28 11:59 ` Blaisorblade
2005-09-28 13:47 ` Young Koh
2005-09-28 14:50 ` Jeff Dike
2005-09-28 19:25 ` Young Koh
2005-09-29 12:09 ` Blaisorblade
2005-09-30 15:08 ` Young Koh
2005-09-30 15:44 ` Geert Uytterhoeven
2005-10-02 1:03 ` Jeff Dike
2005-10-02 10:23 ` Blaisorblade
2005-10-02 18:31 ` Jeff Dike
2005-10-03 18:35 ` Blaisorblade
2005-10-03 20:38 ` Jeff Dike
2005-09-28 16:09 ` Jeff Dike
2005-09-28 17:26 ` Blaisorblade
2005-09-28 18:43 ` Jeff Dike
2005-09-28 8:41 ` Blaisorblade
2005-09-28 14:22 ` Young Koh
2005-09-28 16:43 ` Blaisorblade