* Re: [parisc-linux] 2.4.19-pa24 (uaccess.h patch)
From: John Marvin @ 2002-10-29 0:12 UTC (permalink / raw)
To: parisc-linux
>
> ex_table is used for recovering from a page fault. i don't see how we
> can take a page fault when copying to kernel ram.
>
Let's wait on removing this support. This mechanism provides a way of
testing whether or not an address is a valid kernel address (in addition
to range checking the address). I'd like to consider implementing the
virtual mem map support (currently implemented on ia64) on parisc as an
alternative mechanism (to willy's idea of using kmap/kunmap) for getting
back the 256MB of memory we currently ignore for Astro based machines with
more than 3.75 GB of memory. The virtual mem map code uses this exact
mechanism to determine whether or not a struct page pointer is pointing
into a sparse (unallocated) region of the virtual mem map array (i.e. on
ia64 the ia64_page_valid() routine does a get_user on the first byte and
checks the return from get_user to see if it fails/succeeds).
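To make the probing idea concrete, here is a minimal user-space sketch of it, not the ia64 code. Everything in it is a hypothetical model: the chunk size, ENTRY_SIZE, the PROBE_INSN/PROBE_FIXUP addresses and the chunk_mapped[] array stand in for real page tables and a real exception table. It only shows the shape of the mechanism: a load that "faults" is recovered through an exception table entry and reported as -EFAULT, and the validity check is just "did the one-byte probe succeed".

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SHIFT 12          /* model: mem map is backed in 4 KB chunks */
#define NCHUNKS     8
#define ENTRY_SIZE  64UL        /* hypothetical size of one mem map entry */

/* Simplified exception table entry: faulting insn and its fixup. */
struct ex_entry { unsigned long insn, fixup; };

/* Made-up addresses for the single probing load in this model. */
#define PROBE_INSN  0x1000UL
#define PROBE_FIXUP 0x2000UL
static const struct ex_entry ex_table[] = { { PROBE_INSN, PROBE_FIXUP } };

/* 1 if the chunk of the virtual mem map is backed by memory. */
static unsigned char chunk_mapped[NCHUNKS];

/* Return the fixup address for insn, or 0 if there is none. */
static unsigned long ex_search(unsigned long insn)
{
    size_t i;

    for (i = 0; i < sizeof(ex_table) / sizeof(ex_table[0]); i++)
        if (ex_table[i].insn == insn)
            return ex_table[i].fixup;
    return 0;
}

/* Model of a one-byte get_user: 0 on success, -EFAULT on a fixed-up fault. */
static int probe_byte(unsigned long offset)
{
    unsigned long chunk = offset >> CHUNK_SHIFT;

    if (chunk < NCHUNKS && chunk_mapped[chunk])
        return 0;               /* the load would have succeeded */
    if (ex_search(PROBE_INSN))
        return -EFAULT;         /* fault recovered through the fixup */
    abort();                    /* no fixup: the real kernel would oops */
}

/* An entry is valid if the first byte of it can be read. */
static int page_valid(unsigned long index)
{
    return probe_byte(index * ENTRY_SIZE) == 0;
}
```

With only chunk 0 marked as mapped, entries 0 through 63 test as valid and entry 64 (the first one in chunk 1) does not. Removing the exception table entry turns that same probe into the abort() path, which is the reason for keeping the ex_table support around.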
John
* Re: [parisc-linux] 2.4.19-pa24 (uaccess.h patch)
From: Matthew Wilcox @ 2002-10-29 2:14 UTC (permalink / raw)
To: John Marvin; +Cc: parisc-linux
On Mon, Oct 28, 2002 at 05:12:03PM -0700, John Marvin wrote:
> > ex_table is used for recovering from a page fault. i don't see how we
> > can take a page fault when copying to kernel ram.
>
> Let's wait on removing this support. This mechanism provides a way of
> testing whether or not an address is a valid kernel address (in addition
> to range checking the address). I'd like to consider implementing the
> virtual mem map support (currently implemented on ia64) on parisc as an
> alternative mechanism (to willy's idea of using kmap/kunmap) for getting
> back the 256MB of memory we currently ignore for Astro based machines with
> more than 3.75 GB of memory. The virtual mem map code uses this exact
> mechanism to determine whether or not a struct page pointer is pointing
> into a sparse (unallocated) region of the virtual mem map array (i.e. on
> ia64 the ia64_page_valid() routine does a get_user on the first byte and
> checks the return from get_user to see if it fails/succeeds).
Well.. if you're interested in working on PA again, I have 3 ideas which
kind of overlap (you seem to have confused two of them, so it's worth
talking about them all):
(1) get/put_user & copy_to/from_user should always copy to %sr3.
set_fs et al should manipulate sr3. Last time this came up, you said
this should work but would need testing. That gets rid of the duplicate
exception tables.
(2) The kmap/kunmap idea was to avoid cache aliasing. We always kmap a
pagecache page before we access it, so we can map it to an address that
is "equivalent" to the userspace address before accessing it. Not sure
we get away with making flush_dcache_page() a no-op, though.
(3) For getting back the 256MB of memory mapped at 64GB, I think the
DISCONTIG code in 2.5 is now suitable. It's now based on zones,
not nodes, so we can have a ZONE_DMA from 0-3.75GB, ZONE_NORMAL from
4-xGB and ZONE_HIGHMEM from 64GB to 64GB+256MB. I suspect this is
the right thing to do on ia64 too, for 2.5.
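As a rough illustration of the layout in (3), here is a small sketch that classifies a physical address into the zones described above. The enum and function names are local to this example (they are not the kernel's definitions), and ram_top, the end of ordinary RAM above 4GB, is a hypothetical parameter assumed to be at most 64GB.

```c
#include <stdint.h>

#define MB (1ULL << 20)
#define GB (1ULL << 30)

/* Names local to this sketch, not the kernel's zone constants. */
enum model_zone { MZ_DMA, MZ_NORMAL, MZ_HIGHMEM, MZ_NONE };

/* ram_top: end of RAM above 4GB (assumed <= 64GB in this model). */
static enum model_zone zone_for(uint64_t phys, uint64_t ram_top)
{
    if (phys < 3840 * MB)                       /* 0 - 3.75GB */
        return MZ_DMA;
    if (phys >= 64 * GB && phys < 64 * GB + 256 * MB)
        return MZ_HIGHMEM;                      /* the relocated 256MB */
    if (phys >= 4 * GB && phys < ram_top)       /* 4GB - xGB */
        return MZ_NORMAL;
    return MZ_NONE;                             /* hole, no RAM here */
}
```

Under these assumptions the range from 3.75GB to 4GB belongs to no zone at all, which is why the 256MB shows up at 64GB in the first place.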
--
Revolutions do not require corporate support.
* Re: [parisc-linux] 2.4.19-pa24 (uaccess.h patch)
From: John Marvin @ 2002-10-29 5:55 UTC (permalink / raw)
To: parisc-linux
> Well.. if you're interested in working on PA again, I have 3 ideas which
> kind of overlap (you seem to have confused two of them, so it's worth
> talking about them all):
I didn't get them confused in the way you think. I knew that you were
proposing using kmap/kunmap to solve the flushing/aliasing issues,
especially on stretch. But I saw your todo message suggesting the use of
zone_highmem for getting back the 256MB. My 2.4 based understanding is
that the kernel kmaps memory in that zone in order to use it. That may no
longer be true on 2.5 (or may not ever have been true in the first place).
I'll be looking at that immediately, since I have to either merge the
virtual mem map stuff into 2.5, or see if the vm changes with respect to
zones will solve the problem as you suggest. Whatever works for ia64
will probably also work for parisc, since the problem is similar (although
a little more extreme on ia64).
On another note, I am still looking at the I bit issue with respect to
handle_interruption. What I forgot was that we always turn the I bit
off when we switch to virtual mode before calling handle_interruption.
So, the I bit needs to be on for user faults at the very least. That is
why I put that code in there.
However, Grant is right in that there is a hole if the kernel faults with
the I bit off. Normally that would be a bug, i.e. there are few valid
reasons for the kernel to fault with the I bit off (what I mean by fault
in this case is something that makes it to handle_interruption, not
something that gets handled at a lower level, like a tlb miss). But, I
can think of a few possible scenarios where it might happen
legitimately, although I don't know if any actually occur.
The right solution is to restore the I bit to whatever it was at the time
of the fault. That is probably more appropriately handled at virt_map
time (add a register argument to the macro holding the desired I bit
state, call with r0 for intr_extint, call with previous masked value from
ipsw for intr_save). I believe if done right, we can also set things up
properly in hpmc.S so that when it calls intr_save, the I bit won't be
turned on, and we can remove the special case code from handle_interruption.
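In C terms, the idea amounts to something like the sketch below. The real change would be in the virt_map assembly macro; this is only a model of the bit manipulation. The PSW_I mask value is an assumption taken from my reading of the parisc psw.h header, and the function names are made up for the example.

```c
/* Assumption: mask of the PSW I bit as defined in parisc's psw.h. */
#define PSW_I 0x00000001UL

/* Build the PSW to switch to, taking the I bit from i_state. */
static unsigned long virt_map_psw(unsigned long kernel_psw,
                                  unsigned long i_state)
{
    return (kernel_psw & ~PSW_I) | (i_state & PSW_I);
}

/* External interrupt path: always enter with the I bit off (the r0 case). */
static unsigned long psw_for_extint(unsigned long kernel_psw)
{
    return virt_map_psw(kernel_psw, 0);
}

/* intr_save path: carry over the I bit that was in the saved ipsw. */
static unsigned long psw_for_save(unsigned long kernel_psw,
                                  unsigned long ipsw)
{
    return virt_map_psw(kernel_psw, ipsw & PSW_I);
}
```

The point of passing the state in is that a fault taken with the I bit off stays off, with no special case needed later in handle_interruption.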
I'm also looking at a potential problem in parisc's return from
syscall/faults. I'll hopefully fix all the above soon. And yes, I'll
merge it into 2.5 also.
John
* Re: [parisc-linux] 2.4.19-pa24 (uaccess.h patch)
From: Matthew Wilcox @ 2002-10-29 13:51 UTC (permalink / raw)
To: John Marvin; +Cc: parisc-linux
On Mon, Oct 28, 2002 at 10:55:02PM -0700, John Marvin wrote:
> I didn't get them confused in the way you think. I knew that you were
> proposing using kmap/kunmap to solve the flushing/aliasing issues,
> especially on stretch. But I saw your todo message suggesting the use of
> zone_highmem for getting back the 256MB. My 2.4 based understanding is
> that the kernel kmaps memory in that zone in order to use it. That may no
> longer be true on 2.5 (or may not ever have been true in the first place).
That's true. It's only available for allocation to requests that have
__GFP_HIGHMEM set, ie those that specify GFP_HIGHUSER. And all those
users kmap it to ensure that it's addressable. But that includes almost
all the allocations done to give ram to userspace, so I suspect this zone
will be exhausted long before the other zones.
As an existence proof, people really do run x86 boxes with 15GB of
ZONE_HIGHMEM and 800MB of ZONE_NORMAL, and it works for most workloads.
The exceptions are things like 1000 oracle processes mmaping 2GB each.
They run out of ZONE_NORMAL because ptes are still allocated from there.
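A back-of-the-envelope version of that last case, as a sketch. The page size (4KB) and pte size (8 bytes, as with PAE) are assumptions, only the lowest level of the page tables is counted, and every mapped page is assumed to be touched.

```c
#include <stdint.h>

/* Bytes of last-level page table needed to map mapped_bytes. */
static uint64_t pte_bytes(uint64_t mapped_bytes, uint64_t page_size,
                          uint64_t pte_size)
{
    return (mapped_bytes / page_size) * pte_size;
}
```

With those assumptions, pte_bytes(2GB, 4096, 8) is 4MB per process, so 1000 such processes want roughly 4000MB of ptes, several times the 800MB of ZONE_NORMAL. Even with 4 byte ptes it is about 2000MB.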
> I'll be looking at that immediately, since I have to either merge the
> virtual mem map stuff into 2.5, or see if the vm changes with respect to
> zones will solve the problem as you suggest. Whatever works for ia64
> will probably also work for parisc, since the problem is similar (although
> a little more extreme on ia64).
Yep. The worst case is 1GB of ZONE_DMA and 3GB of ZONE_HIGHMEM, which
is still not as bad as x86 gets. Of course, kmap is still a nop since
we can still address the "highmem". I do think the zones need to be
redesigned a bit; they're still too x86-centric. There might still be
time for that before 2.6...
> I'm also looking at a potential problem in parisc's return from
> syscall/faults. I'll hopefully fix all the above soon. And yes, I'll
> merge it into 2.5 also.
Great, thanks!
--
Revolutions do not require corporate support.
* Re: [parisc-linux] 2.4.19-pa24 (uaccess.h patch)
From: Grant Grundler @ 2002-10-29 17:41 UTC (permalink / raw)
To: John Marvin; +Cc: parisc-linux
John Marvin wrote:
> However, Grant is right in that there is a hole if the kernel faults with
> the I bit off.
yes - I guess that's another way of looking at it.
My concern was around all the places in entry.S that return to
the intr_return label. That results in the I-bit getting unconditionally
set (thus re-enabling interrupts) until rfir restores it.
It opens a window where
> Normally that would be a bug, i.e. there are few valid
> reasons for the kernel to fault with the I bit off (what I mean by fault
> in this case is something that makes it to handle_interruption, not
> something that gets handled at a lower level, like a tlb miss).
hmm...not sure about that. We have lots of misc reasons for traps/faults
where we just might not be handling the CPU correctly. In "Group 2"
Interrupt Class, only LPMC might be expected - but we don't protect
against the others either. I need to add code to handle_interruption()
to see if I'm hitting that code path and how.
> But, I can think of a few possible scenarios where it might happen
> legitimately, although I don't know if any actually occur.
I guess I should convince myself 100% it is happening and which trap/fault
is the offending bit.
> The right solution is to restore the I bit to whatever it was at the time
> of the fault.
I'm thinking the right solution is that *only* the intr_extint code
path should touch the I-bit after it has handled the external interrupt.
rfir will restore it to what it should be. In practice, this would
mean that only the intr_extint label will return to intr_return.
Everyone else will return to intr_restore.
And we need to find instances of local_irq_disable() when
I-bit is already off. Non-trivial since some code will save flags,
disable IRQ, and restore flags later.
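A toy model of why the save/restore pattern is harmless while an unconditional enable is not. This is plain user-space C with a fake PSW variable. The names are made up and the PSW_I mask value is an assumption taken from my reading of the parisc psw.h header.

```c
/* Assumption: mask of the PSW I bit as defined in parisc's psw.h. */
#define PSW_I 0x00000001UL

static unsigned long model_psw;     /* stand-in for the CPU's PSW */

/* Save the current I bit, then disable interrupts. */
static unsigned long model_irq_save(void)
{
    unsigned long flags = model_psw & PSW_I;

    model_psw &= ~PSW_I;
    return flags;
}

/* Put the I bit back to whatever it was when flags was saved. */
static void model_irq_restore(unsigned long flags)
{
    model_psw = (model_psw & ~PSW_I) | (flags & PSW_I);
}

/* What a path that sets the I bit regardless of prior state does. */
static void model_irq_enable_unconditional(void)
{
    model_psw |= PSW_I;
}
```

If the I-bit is already off, a save/disable/restore sequence leaves it off, so nesting those is fine. Only the unconditional enable turns interrupts back on inside a region that expected them off, which is the window I'm worried about.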
> That is probably more appropriately handled at virt_map
> time (add a register argument to the macro holding the desired I bit
> state, call with r0 for intr_extint, call with previous masked value from
> ipsw for intr_save). I believe if done right, we can also set things up
> properly in hpmc.S so that when it calls intr_save, the I bit won't be
> turned on, and we can remove the special case code from handle_interruption.
sounds like you understand this part of the code a lot better than I do.
> I'm also looking at a potential problem in parisc's return from
> syscall/faults. I'll hopefully fix all the above soon. And yes, I'll
> merge it into 2.5 also.
cool - I'm testing my proposal to change the others to use the intr_restore path
but it's still deadlocking on io_request_lock at boot. Either some other
code must still be mucking with the I-bit or something is corrupting
the io_request_lock.
thanks,
grant