* Kernel support for peer-to-peer protection models...
@ 2004-03-27 4:23 Ivan Godard
2004-03-29 0:17 ` Paul Mackerras
0 siblings, 1 reply; 21+ messages in thread
From: Ivan Godard @ 2004-03-27 4:23 UTC (permalink / raw)
To: Linux Kernel Mailing List
We're a processor startup with a new architecture that we will be porting
Linux to. The bulk of the port will be straightforward (well, you know what
I mean), except for the protection model supported by the hardware. How
would you extend/mod the kernel if you had hardware that:
1) had a large number of distinguishable address spaces
2) any running code had two of these (code and data environment) it could
use arbitrarily, but access to addresses in others was arbitrarily protected
3) flat, unified virtual addresses (64 bit) so that pointers, including
inter-space pointers, have the same representation in all spaces
4) no "supervisor mode"
5) inter-space references require grant of access (transitive) by the
accessed space; grants can be entire space or any contiguous subspace
6) inter-space reference has same performance as intra-space
7) you can call across space boundaries in a way that changes the governing
space and consequently the protection environment. Thus both code and data
of a server process or DLL may be non-addressable by an app process, but the
app can call a function (controllably) in the server/DLL and that function
will run in the protection environment of the server/DLL, not that of the
app, though it will run in the app's stack (in the app's address space) with
no task switch. Return unwinds the change. This comes with full protection
both ways - neither the app nor the server/DLL can clobber the other.
Performance is the same as a conventional intra-space call/return.
8) OS calls are not traps, just inter-space calls, again with full
protection.
9) Hardware interrupts are involuntary inter-space calls. They do not
require locking (assuming the handler is re-entrant - and if not then only
from themselves), nor task switch, nor disabling other interrupts. The
handler runs in the stack of whoever got interrupted, which (depending on
interrupt priorities) could be another interrupt, on an interrupt, ... on an
app, all mutually protected.
10) Drivers can have their own individual space(s) distinct from those of
the kernel and the apps. Buggy drivers cannot crash the kernel.
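As a rough illustration of point 5 only, the grant check could be modelled in
software as below. This is a minimal sketch under stated assumptions, not a
description of the actual hardware: the struct grant layout, the flat grant
table and the access_ok() name are all hypothetical, and transitive grants are
not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical grant record: space `owner` lets space `grantee` touch
 * offsets [base, base + len) inside `owner`. All names are illustrative. */
struct grant {
    uint32_t owner;
    uint32_t grantee;
    uint64_t base;
    uint64_t len;
};

/* An access is allowed if it stays inside the accessor's own space, or if
 * some grant issued by the target space covers the whole byte range. */
bool access_ok(const struct grant *tbl, size_t n,
               uint32_t accessor, uint32_t target,
               uint64_t off, uint64_t size)
{
    assert(tbl != NULL || n == 0);

    if (accessor == target)
        return true;

    for (size_t i = 0; i < n; i++) {
        const struct grant *g = &tbl[i];

        if (g->owner != target || g->grantee != accessor)
            continue;
        /* Containment test written so that it cannot overflow. */
        if (off >= g->base && size <= g->len &&
            off - g->base <= g->len - size)
            return true;
    }
    return false;
}
```

A grant of the entire space is just the case where base is 0 and len covers
the whole space; a contiguous subspace uses a narrower range.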
We could of course ignore the hardware model and emulate a conventional
processor (and might, as a first step of the port), but this would discard
significant reliability/performance improvements that the hardware could
provide.
Is this model so alien to the existing Kernel that the best approach is to
peel off a kernel tarball and create a new kernel, to be maintained in
isolation forever? Or would this work fit into planned/expected kernel work
dealing with protection models, interrupts, trap handling and the like? What
about partitioning the kernel into disjoint (and mutually protected)
components like IP stack, password/security, FS etc?
We will not be beginning this port for six months or so - the compiler has
to come up first :-) So this question is more to get some sense of the lay
of the land rather than any immediate help. We know that there is a lot of
know-how and strong opinions concentrated in the kernel development crew,
and we want to gain from that and also to contribute what is currently
budgeted as some 10 engineer-years of work upcoming to the general
improvement if it can be done. Alternatively we might be too weird to be
worth bothering with for the group, and so we should do it on our own and
not try to fit with the "Linux way of doing things".
Comments please?
Ivan
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Kernel support for peer-to-peer protection models...
[not found] <048e01c413b3$3c3cae60$fc82c23f@pc21.suse.lists.linux.kernel>
@ 2004-03-27 6:29 ` Andi Kleen
2004-03-28 20:21 ` Ivan Godard
0 siblings, 1 reply; 21+ messages in thread
From: Andi Kleen @ 2004-03-27 6:29 UTC (permalink / raw)
To: Ivan Godard; +Cc: linux-kernel
"Ivan Godard" <igodard@pacbell.net> writes:
> We're a processor startup with a new architecture that we will be porting
> Linux to. The bulk of the port will be straightforward (well, you know what
> I mean), except for the protection model supported by the hardware. How
> would you extend/mod the kernel if you had hardware that:
>
> 1) had a large number of distinguishable address spaces
Large or unlimited? If not unlimited you may still run into
problems when you give each process such an address space.
Limiting the number of processes is probably not an option.
> 2) any running code had two of these (code and data environment) it could
> use arbitrarily, but access to addresses in others was arbitrarily protected
> 3) flat, unified virtual addresses (64 bit) so that pointers, including
> inter-space pointers, have the same representation in all spaces
You mean Address 0 is only accessible from Address space 0, but not
from Space 1 ?
Maybe you can give each process a different address range, but AFAIK
the only people who have done this before are users of non-MMU architectures.
It will probably require some changes in the portable part of the code.
Also, porting glibc's ld.so to this will likely be no fun.
> 4) no "supervisor mode"
> 5) inter-space references require grant of access (transitive) by the
> accessed space; grants can be entire space or any contiguous subspace
Sounds like the only sane way to handle (4) would be to give the kernel
its own address space with the necessary grants to access everything.
However, this will require an address space switch for every system call.
But there is no way around it: Linux requires a "shared" kernel mapping
at least for part of the kernel memory ("lowmem").
Overall it sounds like your architecture is not very well suited to
run Linux.
> 10) Drivers can have their own individual space(s) distinct from those of
> the kernel and the apps. Buggy drivers cannot crash the kernel.
At least you would need to use your own drivers (I believe the IBM
iSeries and s390/VM port does it kind of). If your CPU has generic PCI
slots this will be a lot of work. Without it it will be lots of work too,
but at least the number of drivers required is limited.
> Is this model so alien to the existing Kernel that the best approach is to
It is definitely alien.
-Andi
* Re: Kernel support for peer-to-peer protection models...
[not found] <048e01c413b3_3c3cae60_fc82c23f@pc21>
@ 2004-03-27 10:34 ` Pavel Machek
2004-03-28 1:32 ` Ivan Godard
0 siblings, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2004-03-27 10:34 UTC (permalink / raw)
To: Ivan Godard; +Cc: Linux Kernel Mailing List
Hi!
> 1) had a large number of distinguishable address spaces
> 2) any running code had two of these (code and data environment) it could
> use arbitrarily, but access to addresses in others was arbitrarily protected
> 3) flat, unified virtual addresses (64 bit) so that pointers, including
> inter-space pointers, have the same representation in all spaces
Hmm, will it be possible to have UML?
> 4) no "supervisor mode"
Is all your i/o memory mapped?
> 5) inter-space references require grant of access (transitive) by the
> accessed space; grants can be entire space or any contiguous subspace
> 6) inter-space reference has same performance as intra-space
Huh? Does it mean that all the accesses are horribly slow?
> 9) Hardware interrupts are involuntary inter-space calls. They do not
> require locking (assuming the handler is re-entrant - and if not then only
> from themselves), nor task switch, nor disabling other interrupts. The
> handler runs in the stack of whoever got interrupted, which (depending on
> interrupt priorities) could be another interrupt, on an interrupt, ... on an
> app, all mutually protected.
How do you implement ptrace if apps are protected from kernel?
> 10) Drivers can have their own individual space(s) distinct from those of
> the kernel and the apps. Buggy drivers cannot crash the kernel.
Well... buggy drivers can usually program DMA to do crashing for them.
How is your architecture called?
> dealing with protection models, interrupts, trap handling and the like? What
> about partitioning the kernel into disjoint (and mutually protected)
> components like IP stack, password/security, FS etc?
That would be pretty big rewrite...
Anyway, I believe you *do* want linux on it, if only as a test load.
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms
* Re: Kernel support for peer-to-peer protection models...
2004-03-27 10:34 ` Pavel Machek
@ 2004-03-28 1:32 ` Ivan Godard
2004-03-28 6:24 ` Pavel Machek
0 siblings, 1 reply; 21+ messages in thread
From: Ivan Godard @ 2004-03-28 1:32 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux Kernel Mailing List
Interlinear
----- Original Message -----
From: "Pavel Machek" <pavel@ucw.cz>
To: "Ivan Godard" <igodard@pacbell.net>
Cc: "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Sent: Saturday, March 27, 2004 2:34 AM
Subject: Re: Kernel support for peer-to-peer protection models...
> Hi!
>
> > 1) had a large number of distinguishable address spaces
> > 2) any running code had two of these (code and data environment) it could
> > use arbitrarily, but access to addresses in others was arbitrarily
> > protected
> > 3) flat, unified virtual addresses (64 bit) so that pointers, including
> > inter-space pointers, have the same representation in all spaces
>
> Hmm, will it be possible to have UML?
If by UML you mean Uniform Modelling Language, I don't understand where the
protection model has any impact. UML models flow, permissions are somewhat
superimposed, just like file permissions in a UML model on any machine.
> > 4) no "supervisor mode"
>
> Is all your i/o memory mapped?
Yes, and also those few machine operations that are "privileged".
> > 5) inter-space references require grant of access (transitive) by the
> > accessed space; grants can be entire space or any contiguous subspace
> > 6) inter-space reference has same performance as intra-space
>
> Huh? Does it mean that all the accesses are horribly slow?
No, they run at full regular memory speed, at all levels of the memory
hierarchy. Because of the flat unified address space, all caches can be
virtually addressed (this is common in 64-bit address systems) because a
pointer means the same thing no matter who uses it. A consequence is that you
don't have to scrub the caches on task switch, which is a big time win.
> > 9) Hardware interrupts are involuntary inter-space calls. They do not
> > require locking (assuming the handler is re-entrant - and if not then only
> > from themselves), nor task switch, nor disabling other interrupts. The
> > handler runs in the stack of whoever got interrupted, which (depending on
> > interrupt priorities) could be another interrupt, on an interrupt, ... on
> > an app, all mutually protected.
>
> How do you implement ptrace if apps are protected from kernel?
Anybody can make all or part of themselves visible to anybody else. If you
start up an app in your space, you can grant visibility to a debugger in
another space. But this applies only to you. For example, suppose that your
app calls a paranoid server DLL passing in a function, and the DLL in turn
calls back your function. Then your stack will have a hunk of you (that you
can see and expose to the debugger), then a hunk of DLL function activations
(which are opaque to you AND the debugger unless you can talk the DLL into
exposing itself), and then another hunk of you again (and again visible to
you and the debugger). The DLL and you (and your debugger) are mutually
protected.
To do this on a conventional system requires that the DLL runs as a server
process, and getting it to do something for you requires a roundtrip through
the dispatcher. For us it's a simple subroutine call, just as if the DLL
were un-paranoid and had been linked into your code. Clearer?
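To make the mixed stack concrete, here is a minimal sketch of what a debugger
walking such a stack might be allowed to see. Everything in it is an
assumption made for illustration: the struct frame descriptor, the single
granted_by space and the backtrace_view() helper are hypothetical and say
nothing about how the real hardware or a real debugger would record frames.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame descriptor: which space owns this activation. */
struct frame {
    uint32_t owner_space;
    const char *func;
};

/* Toy visibility rule: a viewer sees its own frames, plus the frames of the
 * one space that has granted it access (`granted_by`). */
bool frame_visible(const struct frame *f, uint32_t viewer, uint32_t granted_by)
{
    return f->owner_space == viewer || f->owner_space == granted_by;
}

/* Walk the stack and store, for each frame, either its function name or the
 * string "<opaque>" in out[]. Returns the number of visible frames. */
size_t backtrace_view(const struct frame *stack, size_t depth,
                      uint32_t viewer, uint32_t granted_by,
                      const char **out)
{
    size_t visible = 0;

    assert(stack != NULL && out != NULL);

    for (size_t i = 0; i < depth; i++) {
        if (frame_visible(&stack[i], viewer, granted_by)) {
            out[i] = stack[i].func;
            visible++;
        } else {
            out[i] = "<opaque>";
        }
    }
    return visible;
}
```

With a stack of app, DLL, DLL, app frames and a debugger that only the app has
granted access to, the two DLL frames in the middle come back as "<opaque>".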
> > 10) Drivers can have their own individual space(s) distinct from those of
> > the kernel and the apps. Buggy drivers cannot crash the kernel.
>
> Well... buggy drivers can usually program DMA to do crashing for them.
Nope. The DMA has the same permissions as the driver that starts it.
> How is your architecture called?
"Mill"
> > dealing with protection models, interrupts, trap handling and the like?
> > What about partitioning the kernel into disjoint (and mutually protected)
> > components like IP stack, password/security, FS etc?
>
> That would be pretty big rewrite...
>
> Anyway, I believe you *do* want linux on it, if only as a test load.
We definitely want Linux. The question is whether Linux will want the result
of our port, or (in finer detail) what parts could we do in what way to be
useful to other people.
Ivan
* Re: Kernel support for peer-to-peer protection models...
2004-03-28 1:32 ` Ivan Godard
@ 2004-03-28 6:24 ` Pavel Machek
2004-03-28 6:32 ` Ivan Godard
0 siblings, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2004-03-28 6:24 UTC (permalink / raw)
To: Ivan Godard; +Cc: Linux Kernel Mailing List
Hi!
On So 27-03-04 17:32:06, Ivan Godard wrote:
> > > 1) had a large number of distinguishable address spaces
> > > 2) any running code had two of these (code and data environment) it
> > > could use arbitrarily, but access to addresses in others was arbitrarily
> > > protected
> > > 3) flat, unified virtual addresses (64 bit) so that pointers, including
> > > inter-space pointers, have the same representation in all spaces
> >
> > Hmm, will it be possible to have UML?
>
> If by UML you mean Uniform Modelling Language, I don't understand where the
> protection model has any impact. UML models flow, permissions are somewhat
> superimposed, just like file permissions in a UML model on any
> machine.
I meant "User Mode Linux" == linux running under linux. Someone
probably has an URL.
> > > 9) Hardware interrupts are involuntary inter-space calls. They do not
> > > require locking (assuming the handler is re-entrant - and if not then
> > > only from themselves), nor task switch, nor disabling other interrupts.
> > > The handler runs in the stack of whoever got interrupted, which
> > > (depending on interrupt priorities) could be another interrupt, on an
> > > interrupt, ... on an app, all mutually protected.
> >
> > How do you implement ptrace if apps are protected from kernel?
>
> Anybody can make all or part of themselves visible to anybody else. If you
> start up an app in your space, you can grant visibility to a debugger in
> another space. But this applies only to you. For example, suppose that your
> app calls a paranoid server DLL passing in a function, and the DLL in turn
> calls back your function. Then your stack will have a hunk of you (that you
> can see and expose to the debugger), then a hunk of DLL function activations
> (which are opaque to you AND the debugger unless you can talk the DLL into
> exposing itself), and then another hunk of you again (and again visible to
> you and the debugger). The DLL and you (and your debugger) are mutually
> protected.
>
> To do this on a conventional system requires that the DLL runs as a server
> process, and getting it to do something for you requires a roundtrip through
> the dispatcher. For us it's a simple subroutine call, just as if the DLL
> were un-paranoid and had been linked into your code. Clearer?
Strange system.... If an application does not grant kernel access to
its space, how is kernel supposed to do its job? For example, that
"paranoid DLL" becomes unswappable, then?
If you have "enough" paranoid DLLs, you can then bring the machine
down due to lack of real memory :-).
> > > 10) Drivers can have their own individual space(s) distinct from those
> > > of the kernel and the apps. Buggy drivers cannot crash the kernel.
> >
> > Well... buggy drivers can usually program DMA to do crashing for them.
>
> Nope. The DMA has the same permissions as the driver that starts it.
So normal PCI cards are not allowed, or do you have some kind of
IOMMU?
> > That would be pretty big rewrite...
> >
> > Anyway, I believe you *do* want linux on it, if only as a test load.
>
> We definitely want Linux. The question is whether Linux will want the result
> of our port, or (in finer detail) what parts could we do in what way to be
> useful to other people.
If most changes are in arch/, it should be acceptable...
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]
* Re: Kernel support for peer-to-peer protection models...
2004-03-28 6:24 ` Pavel Machek
@ 2004-03-28 6:32 ` Ivan Godard
2004-03-28 18:54 ` Pavel Machek
0 siblings, 1 reply; 21+ messages in thread
From: Ivan Godard @ 2004-03-28 6:32 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux Kernel Mailing List
----- Original Message -----
From: "Pavel Machek" <pavel@ucw.cz>
To: "Ivan Godard" <igodard@pacbell.net>
Cc: "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Sent: Saturday, March 27, 2004 10:24 PM
Subject: Re: Kernel support for peer-to-peer protection models...
> Hi!
>
> On So 27-03-04 17:32:06, Ivan Godard wrote:
> > > > 1) had a large number of distinguishable address spaces
> > > > 2) any running code had two of these (code and data environment) it
> > > > could use arbitrarily, but access to addresses in others was
> > > > arbitrarily protected
> > > > 3) flat, unified virtual addresses (64 bit) so that pointers,
> > > > including inter-space pointers, have the same representation in all
> > > > spaces
> > >
> > > Hmm, will it be possible to have UML?
> >
> > If by UML you mean Uniform Modelling Language, I don't understand where
> > the protection model has any impact. UML models flow, permissions are
> > somewhat superimposed, just like file permissions in a UML model on any
> > machine.
>
> I meant "User Mode Linux" == linux running under linux. Someone
> probably has an URL.
Sorry - I plead ignorance :-) As the protection is recursive and
transitive, I suppose that you could do this. When the UMK (user mode
kernel) went to change the "real" machine it would get a protection fault
that would be handled by the KMK, emulating the effect. Getting it right and
also performant would be tricky though - is UML a necessary feature?
> > > > 9) Hardware interrupts are involuntary inter-space calls. They do not
> > > > require locking (assuming the handler is re-entrant - and if not then
> > > > only from themselves), nor task switch, nor disabling other
> > > > interrupts. The handler runs in the stack of whoever got interrupted,
> > > > which (depending on interrupt priorities) could be another interrupt,
> > > > on an interrupt, ... on an app, all mutually protected.
> > >
> > > How do you implement ptrace if apps are protected from kernel?
> >
> > Anybody can make all or part of themselves visible to anybody else. If
> > you start up an app in your space, you can grant visibility to a debugger
> > in another space. But this applies only to you. For example, suppose that
> > your app calls a paranoid server DLL passing in a function, and the DLL in
> > turn calls back your function. Then your stack will have a hunk of you
> > (that you can see and expose to the debugger), then a hunk of DLL function
> > activations (which are opaque to you AND the debugger unless you can talk
> > the DLL into exposing itself), and then another hunk of you again (and
> > again visible to you and the debugger). The DLL and you (and your
> > debugger) are mutually protected.
> >
> > To do this on a conventional system requires that the DLL runs as a server
> > process, and getting it to do something for you requires a roundtrip
> > through the dispatcher. For us it's a simple subroutine call, just as if
> > the DLL were un-paranoid and had been linked into your code. Clearer?
>
> Strange system.... If an application does not grant kernel access to
> its space, how is kernel supposed to do its job? For example, that
> "paranoid DLL" becomes unswappable, then?
Protection is in the *virtual* space, not physical. The physical-page
manager (who has the TLB and underlying mapping tables in its space) can see
and deal with any physical address, which in turn has the usual aliasing
relationship with virtual addresses. Of course, physical is just one of the
virtual spaces (and is distinguished solely by the one-to-one
virtual-physical mapping). So the protection can be penetrated by anyone who
can see the underlying physical page - but that's always true.
> If you have "enough" paranoid DLLs, you can then bring the machine
> down due to lack of real memory :-).
No.
> > > > 10) Drivers can have their own individual space(s) distinct from
> > > > those of the kernel and the apps. Buggy drivers cannot crash the
> > > > kernel.
> > >
> > > Well... buggy drivers can usually program DMA to do crashing for them.
> >
> > Nope. The DMA has the same permissions as the driver that starts it.
>
> So normal PCI cards are not allowed, or do you have some kind of
> IOMMU?
Nope. Apertures in the memory controller. We looked at doing I/O in virtual,
but the extra traffic on the pins was too expensive (although it did permit
I/O onto paged memory). But it's easy to give each IO its own aperture base,
just by adding a few more high-order bits to the address, and the controller
can make sure that each DMA stays in its aperture. Sort of a poor-man's MMU
:-)
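A minimal sketch of the aperture idea follows, assuming an aperture is simply a
base-plus-size window of physical memory assigned to one device. The struct
aperture type and dma_translate() are hypothetical names for illustration; the
real check would sit in the memory controller, not in C code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device aperture: a window of physical memory. */
struct aperture {
    uint64_t base;   /* physical address where the window starts */
    uint64_t size;   /* window length in bytes */
};

/* Translate a device-relative DMA offset into a physical address.
 * Returns false (and leaves *phys untouched) if any byte of the transfer
 * would fall outside the aperture. */
bool dma_translate(const struct aperture *ap, uint64_t dev_off,
                   uint64_t len, uint64_t *phys)
{
    assert(ap != NULL && phys != NULL);

    /* Written as subtractions so the bounds test cannot overflow. */
    if (len == 0 || len > ap->size || dev_off > ap->size - len)
        return false;

    *phys = ap->base + dev_off;
    return true;
}
```

A buggy driver that programs an out-of-range offset gets a rejected transfer
instead of a write into someone else's memory, which is the property claimed
above.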
> > > That would be pretty big rewrite...
> > >
> > > Anyway, I believe you *do* want linux on it, if only as a test load.
> >
> > We definitely want Linux. The question is whether Linux will want the
> > result of our port, or (in finer detail) what parts could we do in what
> > way to be useful to other people.
>
> If most changes are in arch/, it should be acceptable...
I fear that it might be more extensive than that :-)
Ivan
* Re: Kernel support for peer-to-peer protection models...
2004-03-28 6:32 ` Ivan Godard
@ 2004-03-28 18:54 ` Pavel Machek
2004-03-28 19:56 ` Ivan Godard
0 siblings, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2004-03-28 18:54 UTC (permalink / raw)
To: Ivan Godard; +Cc: Linux Kernel Mailing List
Hi!
> > I meant "User Mode Linux" == linux running under linux. Someone
> > probably has an URL.
>
> Sorry - I plead ignorance :-) As the protection is recursive and
> transitive, I suppose that you could do this. When the UMK (user mode
> kernel) went to change the "real" machine it would get a protection fault
> that would be handled by the KMK, emulating the effect. Getting it right and
> also performant would be tricky though - is UML a necessary feature?
No. It's just "nice to have", and it does not support too many
architectures.
> > Strange system.... If an application does not grant kernel access to
> > its space, how is kernel supposed to do its job? For example, that
> > "paranoid DLL" becomes unswappable, then?
>
> Protection is in the *virtual* space, not physical. The physical-page
> manager (who has the TLB and underlying mapping tables in its space) can see
> and deal with any physical address, which in turn has the usual aliasing
> relationship with virtual addresses. Of course, physical is just one of the
> virtual spaces (and is distinguished solely by the one-to-one
> virtual-physical mapping). So the protection can be penetrated by anyone who
> can see the underlying physical page - but that's always true.
Aha, so some part of kernel exist that has "absolute right". Ok, now I
can imagine that it can work.
> > If most changes are in arch/, it should be acceptable...
>
> I fear that it might be more extensive than that :-)
Well, make patch and lets see... That means that 2.8 needs to be your
target. If impact outside of arch is not "total rewrite", you might
have a chance. If it is "total rewrite".... well you just need to be
very clever.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]
* Re: Kernel support for peer-to-peer protection models...
2004-03-28 18:54 ` Pavel Machek
@ 2004-03-28 19:56 ` Ivan Godard
2004-03-28 20:35 ` Pavel Machek
0 siblings, 1 reply; 21+ messages in thread
From: Ivan Godard @ 2004-03-28 19:56 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux Kernel Mailing List
----- Original Message -----
From: "Pavel Machek" <pavel@ucw.cz>
To: "Ivan Godard" <igodard@pacbell.net>
Cc: "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Sent: Sunday, March 28, 2004 10:54 AM
Subject: Re: Kernel support for peer-to-peer protection models...
> Hi!
>
> > > Strange system.... If an application does not grant kernel access to
> > > its space, how is kernel supposed to do its job? For example, that
> > > "paranoid DLL" becomes unswappable, then?
> >
> > Protection is in the *virtual* space, not physical. The physical-page
> > manager (who has the TLB and underlying mapping tables in its space) can
> > see and deal with any physical address, which in turn has the usual
> > aliasing relationship with virtual addresses. Of course, physical is just
> > one of the virtual spaces (and is distinguished solely by the one-to-one
> > virtual-physical mapping). So the protection can be penetrated by anyone
> > who can see the underlying physical page - but that's always true.
>
> Aha, so some part of kernel exist that has "absolute right". Ok, now I
> can imagine that it can work.
The "boot process" comes up with unlimited access to everything and the
virt2phys direct mapped. As it forks processes it can arbitrarily restrict
their vision, transitively, and set the translation tables any way it wants.
What I've sketched is one model, where a particular virtual space is used to
map physical and the kernel is broken up into distinct address spaces with
protection boundaries between, and each driver and app in its own space. But
you could emulate a conventional system, with the kernel and the drivers all
in one space (and mutually vulnerable), or others.
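The "restrict their vision, transitively" part can be pictured with a toy
model in which a process's vision is a bitmask of the spaces it may address.
This is only a sketch under that assumption: vision_t, fork_vision() and
can_see() are hypothetical names, and a real implementation would track
address ranges rather than 64 whole spaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: bit i set means "may address toy space i". */
typedef uint64_t vision_t;

#define VISION_ALL UINT64_MAX   /* what the boot process starts with */

/* A child can be given at most what its parent has, so restrictions
 * carry through every later generation. */
vision_t fork_vision(vision_t parent, vision_t requested)
{
    return parent & requested;
}

bool can_see(vision_t v, unsigned space)
{
    assert(space < 64);
    return (v >> space) & 1u;
}
```

Starting from VISION_ALL, a child forked with mask 0x0f and a grandchild
forked with mask 0x3c ends up with 0x0c: the grandchild asked for spaces 4 and
5 but cannot get them, because its parent never had them.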
> > > If most changes are in arch/, it should be acceptable...
> >
> > I fear that it might be more extensive than that :-)
>
> Well, make patch and lets see... That means that 2.8 needs to be your
> target. If impact outside of arch is not "total rewrite", you might
> have a chance. If it is "total rewrite".... well you just need to be
> very clever.
How badly would the average driver break if it did not have direct data
access to kernel data structures? Calls into the kernel and direct access by
the called functions are OK.
Ivan
* Re: Kernel support for peer-to-peer protection models...
2004-03-27 6:29 ` Kernel support for peer-to-peer protection models Andi Kleen
@ 2004-03-28 20:21 ` Ivan Godard
2004-03-28 23:14 ` Andi Kleen
0 siblings, 1 reply; 21+ messages in thread
From: Ivan Godard @ 2004-03-28 20:21 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel
----- Original Message -----
From: "Andi Kleen" <ak@suse.de>
To: "Ivan Godard" <igodard@pacbell.net>
Cc: <linux-kernel@vger.kernel.org>
Sent: Friday, March 26, 2004 10:29 PM
Subject: Re: Kernel support for peer-to-peer protection models...
> "Ivan Godard" <igodard@pacbell.net> writes:
>
> > We're a processor startup with a new architecture that we will be porting
> > Linux to. The bulk of the port will be straightforward (well, you know
> > what I mean), except for the protection model supported by the hardware.
> > How would you extend/mod the kernel if you had hardware that:
> >
> > 1) had a large number of distinguishable address spaces
>
> Large or unlimited? If not unlimited you may still run into
> problems when you give each process such an address space.
> Limiting the number of processes is probably not an option.
Large but not unlimited - tens of thousands. Think PIDs. The number of
threads (sharing common address spaces) is not limited.
> > 2) any running code had two of these (code and data environment) it could
> > use arbitrarily, but access to addresses in others was arbitrarily
> > protected
> > 3) flat, unified virtual addresses (64 bit) so that pointers, including
> > inter-space pointers, have the same representation in all spaces
>
> You mean Address 0 is only accessible from Address space 0, but not
> from Space 1 ?
A field in a (64 bit) pointer gives the address space number. Each space has
an "address 0" which is actually the zeroth byte *in that space* - an actual
pointer is a space number and an index in that space. You (i.e. a process)
run with a native space that determines your privileges. If you are running
as space 0 then you may or may not be able to address space 1 - it depends
on the access you inherited or have obtained. Think mmap. You (almost)
always can address all your native space, although there might not be
anything there.
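A minimal sketch of such a pointer layout is below. The field widths are an
assumption made for illustration (16 bits of space number in the high bits,
48 bits of offset); the message above only says that some field of the 64-bit
pointer holds the space number. make_ptr(), ptr_space() and ptr_offset() are
hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split of a 64-bit pointer; the real widths are not stated. */
#define SPACE_BITS  16u
#define OFFSET_BITS (64u - SPACE_BITS)
#define OFFSET_MASK ((UINT64_C(1) << OFFSET_BITS) - 1)

/* Build a pointer from a space number and an offset within that space. */
uint64_t make_ptr(uint32_t space, uint64_t offset)
{
    assert(space < (1u << SPACE_BITS));
    assert(offset <= OFFSET_MASK);
    return ((uint64_t)space << OFFSET_BITS) | offset;
}

/* Which space does this pointer refer to? */
uint32_t ptr_space(uint64_t p)
{
    return (uint32_t)(p >> OFFSET_BITS);
}

/* Byte index within that space. */
uint64_t ptr_offset(uint64_t p)
{
    return p & OFFSET_MASK;
}
```

Under this layout "address 0 of space 1" and "address 0 of space 0" are
different 64-bit values, so the same pointer means the same thing no matter
which space dereferences it.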
> Maybe you can give each process a different address range, but AFAIK
> the only people who have done this before are users of non-MMU
> architectures.
> It will probably require some changes in the portable part of the code.
> Also, porting glibc's ld.so to this will likely be no fun.
Each process gets a different range because each process gets a different
native space. Within that space processes can use the same offsets, and
typically will so as to avoid pointless relocation.
> > 4) no "supervisor mode"
> > 5) inter-space references require grant of access (transitive) by the
> > accessed space; grants can be entire space or any contiguous subspace
>
> Sounds like the only sane way to handle (4) would be to give the kernel
> its own address space with the necessary grants to access everything.
> However, this will require an address space switch for every system call.
> But there is no way around it: Linux requires a "shared" kernel mapping
> at least for part of the kernel memory ("lowmem").
We could (trivially) emulate a monolithic kernel in a single space. But that
loses the reliability improvement available if the kernel subsystems ran in
their own spaces with grants of access to those common structures they
individually needed. BTW, there's nothing to be gained by minimizing address
switches - it's in hardware, and inter-space references and calls run at the
same speed as same-space references and calls.
> Overall it sounds like your architecture is not very well suited to
> run Linux.
We believe we can adopt the Linux protection model (i.e. the 386 protection
model) with no more work than any other port to a new architecture (ahem).
But the result would then be as prone to bugs and exploits as a 386.
> > 10) Drivers can have their own individual space(s) distinct from those of
> > the kernel and the apps. Buggy drivers cannot crash the kernel.
>
> At least you would need to use your own drivers (I believe the IBM
> iSeries and s390/VM port does it kind of). If your CPU has generic PCI
> slots this will be a lot of work. Without it it will be lots of work too,
> but at least the number of drivers required is limited.
So long as a driver 1) has a region of working data space defined at driver
load time; 2) has a defined code region; 3) gets its buffer addresses etc. as
arguments; 4) calls defined OS APIs; and 5) never touches anything except
its private code and data, its arguments, and syscalls, then it can run in
one of our protected environments and be none the wiser. That is, if the
driver has been coded to look like a well-behaved server process then all is
well. If it has hard references to shared kernel data structures then it
will break, because those shared spaces are not visible and must be accessed
through a service call to someone who owns that structure.
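The difference between the two coding styles can be sketched as follows. This
is an illustration under stated assumptions, not real kernel code: struct
kstats, ksvc_account_rx(), ksvc_rx_total() and drv_receive() are hypothetical
names, and in a single C file the isolation is only a convention, whereas in
the model described above the hardware would enforce it.

```c
#include <assert.h>
#include <stddef.h>

/* Kernel-owned state. In the protected model this would live in a space
 * the driver cannot address, so the driver never touches it directly. */
struct kstats {
    unsigned long rx_packets;
};
static struct kstats kernel_stats;

/* Hypothetical service calls exported by the owner of the structure. */
void ksvc_account_rx(unsigned long n)
{
    kernel_stats.rx_packets += n;
}

unsigned long ksvc_rx_total(void)
{
    return kernel_stats.rx_packets;
}

/* Driver-private working data, set up when the driver is loaded. */
struct drv_state {
    size_t bytes_seen;
};

/* Driver entry point: the buffer arrives as an argument, private state is
 * updated directly, and shared state is updated only via a service call. */
int drv_receive(struct drv_state *st, const unsigned char *buf, size_t len)
{
    assert(st != NULL);

    if (buf == NULL || len == 0)
        return -1;

    st->bytes_seen += len;   /* private data: direct access is fine */
    ksvc_account_rx(1);      /* shared data: ask the owner to do it */
    return 0;
}
```

A driver that instead wrote kernel_stats.rx_packets++ itself is the kind that
would break, since that structure would not be visible from the driver's
space.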
> > Is this model so alien to the existing Kernel that the best approach is to
>
> It is definitely alien.
You don't know the half of it! This is just the protection model :-)
Ivan
* Re: Kernel support for peer-to-peer protection models...
2004-03-28 19:56 ` Ivan Godard
@ 2004-03-28 20:35 ` Pavel Machek
0 siblings, 0 replies; 21+ messages in thread
From: Pavel Machek @ 2004-03-28 20:35 UTC (permalink / raw)
To: Ivan Godard; +Cc: Linux Kernel Mailing List
Hi!
> > > > If most changes are in arch/, it should be acceptable...
> > >
> > > I fear that it might be more extensive than that :-)
> >
> > Well, make patch and lets see... That means that 2.8 needs to be your
> > target. If impact outside of arch is not "total rewrite", you might
> > have a chance. If it is "total rewrite".... well you just need to be
> > very clever.
>
> How badly would the average driver break if it did not have direct data
> access to kernel data structures? Calls into the kernel and direct access by
> the called functions are OK.
The kernel likes to pass drivers pointers to internal data structures, and
drivers walk the pointers in those structures pretty often.
Actually, I'm not so sure. Perhaps for simple drivers something like
that would be possible.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]
* Re: Kernel support for peer-to-peer protection models...
2004-03-28 20:21 ` Ivan Godard
@ 2004-03-28 23:14 ` Andi Kleen
2004-03-29 8:09 ` Ivan Godard
2004-03-29 15:36 ` Pavel Machek
0 siblings, 2 replies; 21+ messages in thread
From: Andi Kleen @ 2004-03-28 23:14 UTC (permalink / raw)
To: Ivan Godard; +Cc: linux-kernel
On Sun, 28 Mar 2004 12:21:36 -0800
"Ivan Godard" <igodard@pacbell.net> wrote:
>
> > Maybe you can give each process a different address range, but AFAIK
> > the only people who have done this before are users of non-MMU
> > architectures.
> > It will probably require some changes in the portable part of the code.
> > Also porting glibc's ld.so to this will likely be no fun.
>
> Each process gets a different range because each process gets a different
> native space. Within that space processes can use the same offsets, and
> typically will so as to avoid pointless relocation.
fork() will be hard and/or inefficient this way.
> > Overall it sounds like your architecture is not very well suited to
> > run Linux.
>
> We believe we can adopt the Linux protection model (i.e. the 386 protection
> model) with no more work than any other port to a new architecture (ahem).
Just FYI - Linux has been ported to several architectures with similar SASOS
capabilities in hardware (IA64 or ppc64 on iseries) and they have all opted to use
a conventional protection model.
> So long as 1) a driver has a driver-load-time defined region of working data
> space; 2) has a defined code region; 3) gets its buffer addresses etc. as
Just (1) alone is an illusion - Linux drivers generally work on the shared
page pool, just like all other subsystems.
-Andi
* Re: Kernel support for peer-to-peer protection models...
2004-03-27 4:23 Ivan Godard
@ 2004-03-29 0:17 ` Paul Mackerras
2004-03-29 3:18 ` Ivan Godard
0 siblings, 1 reply; 21+ messages in thread
From: Paul Mackerras @ 2004-03-29 0:17 UTC (permalink / raw)
To: Ivan Godard; +Cc: Linux Kernel Mailing List
Ivan Godard writes:
> 3) flat, unified virtual addresses (64 bit) so that pointers, including
> inter-space pointers, have the same representation in all spaces
How are you going to implement fork() ?
Paul.
* Re: Kernel support for peer-to-peer protection models...
2004-03-29 0:17 ` Paul Mackerras
@ 2004-03-29 3:18 ` Ivan Godard
2004-03-29 3:48 ` Davide Libenzi
0 siblings, 1 reply; 21+ messages in thread
From: Ivan Godard @ 2004-03-29 3:18 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Linux Kernel Mailing List
----- Original Message -----
From: "Paul Mackerras" <paulus@samba.org>
To: "Ivan Godard" <igodard@pacbell.net>
Cc: "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Sent: Sunday, March 28, 2004 4:17 PM
Subject: Re: Kernel support for peer-to-peer protection models...
> Ivan Godard writes:
>
> > 3) flat, unified virtual addresses (64 bit) so that pointers, including
> > inter-space pointers, have the same representation in all spaces
>
> How are you going to implement fork() ?
The usual COW using the page tables. The child keeps the same code space but
gets a new data space. I expect that specialized versions of fork will give
explicit control over which space the child gets, but in common usage no one
cares, just as no one cares which PID it gets.
Ivan
* Re: Kernel support for peer-to-peer protection models...
2004-03-29 3:18 ` Ivan Godard
@ 2004-03-29 3:48 ` Davide Libenzi
2004-03-29 7:52 ` Ivan Godard
0 siblings, 1 reply; 21+ messages in thread
From: Davide Libenzi @ 2004-03-29 3:48 UTC (permalink / raw)
To: Ivan Godard; +Cc: Paul Mackerras, Linux Kernel Mailing List
On Sun, 28 Mar 2004, Ivan Godard wrote:
> ----- Original Message -----
> From: "Paul Mackerras" <paulus@samba.org>
> To: "Ivan Godard" <igodard@pacbell.net>
> Cc: "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
> Sent: Sunday, March 28, 2004 4:17 PM
> Subject: Re: Kernel support for peer-to-peer protection models...
>
>
> > Ivan Godard writes:
> >
> > > 3) flat, unified virtual addresses (64 bit) so that pointers, including
> > > inter-space pointers, have the same representation in all spaces
> >
> > How are you going to implement fork() ?
>
> The usual COW using the page tables. The child keeps the same code space but
> gets a new data space. I expect that specialized versions of fork will give
> explicit control over which space the child gets, but in common usage no one
> cares, just as no one cares which PID it gets.
Uh?
int myexec(char const *cmd) {
    if (!fork()) {
        exit(exec(cmd));
    }
    ...
}
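
For readers who want to try the pattern, here is a runnable variant of the sketch above, assuming a POSIX system with `/bin/sh`. Since `exec(cmd)` is shorthand rather than a real call, `execl()` through the shell is substituted, and the name `myexec_status` and the wait-for-child step are additions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run cmd through /bin/sh in a child; return its exit status, or -1. */
static int myexec_status(const char *cmd)
{
    assert(cmd != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                       /* fork failed */

    if (pid == 0) {
        /* Child: on success execl() never returns. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                      /* reached only if exec failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The point of the pattern is that the child runs the parent's code between `fork()` and the exec, so it needs a working copy of the parent's address space for that short window.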
- Davide
* Re: Kernel support for peer-to-peer protection models...
2004-03-29 3:48 ` Davide Libenzi
@ 2004-03-29 7:52 ` Ivan Godard
2004-03-29 18:45 ` Davide Libenzi
0 siblings, 1 reply; 21+ messages in thread
From: Ivan Godard @ 2004-03-29 7:52 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Paul Mackerras, Linux Kernel Mailing List
----- Original Message -----
From: "Davide Libenzi" <davidel@xmailserver.org>
To: "Ivan Godard" <igodard@pacbell.net>
Cc: "Paul Mackerras" <paulus@samba.org>; "Linux Kernel Mailing List"
<linux-kernel@vger.kernel.org>
Sent: Sunday, March 28, 2004 7:48 PM
Subject: Re: Kernel support for peer-to-peer protection models...
> > > Ivan Godard writes:
> > >
> > > > 3) flat, unified virtual addresses (64 bit) so that pointers, including
> > > > inter-space pointers, have the same representation in all spaces
> > >
> > > How are you going to implement fork() ?
> >
> > The usual COW using the page tables. The child keeps the same code space but
> > gets a new data space. I expect that specialized versions of fork will give
> > explicit control over which space the child gets, but in common usage no one
> > cares, just as no one cares which PID it gets.
>
> Uh?
>
> int myexec(char const *cmd) {
>
> if (!fork()) {
> exit(exec(cmd));
> }
> ...
> }
>
Ah! you wanted to know about exec, not fork. A true fork() is pretty rare
these days anyway. Still, the answer is pretty much the same: the fork()
gets you a new data space, retaining the old code space, and the exec()
finds (or creates) the code space that cmd's code is in and switches the
active code space to that space. Heritable data, such as file descriptors,
won't have been in the old data space anyway, so the child references them
through syscalls just as in a conventional.
Perhaps I'm missing your question here, but in general we see no problem
with fork/exec in our model - it's one of the least changed things. You
always get a new address space on a conventional, and you also do with a
Mill; the only difference is that you don't have to shoot down the cache or
TLB, so a fork/exec should be quite a bit faster.
Ivan
* Re: Kernel support for peer-to-peer protection models...
2004-03-28 23:14 ` Andi Kleen
@ 2004-03-29 8:09 ` Ivan Godard
2004-03-29 15:36 ` Pavel Machek
1 sibling, 0 replies; 21+ messages in thread
From: Ivan Godard @ 2004-03-29 8:09 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel
interlinear
----- Original Message -----
From: "Andi Kleen" <ak@suse.de>
To: "Ivan Godard" <igodard@pacbell.net>
Cc: <linux-kernel@vger.kernel.org>
Sent: Sunday, March 28, 2004 3:14 PM
Subject: Re: Kernel support for peer-to-peer protection models...
> On Sun, 28 Mar 2004 12:21:36 -0800
> "Ivan Godard" <igodard@pacbell.net> wrote:
>
> >
> > > Maybe you can give each process a different address range, but AFAIK
> > > the only people who have done this before are users of non-MMU
> > > architectures.
> > > It will probably require some changes in the portable part of the code.
> > > Also porting glibc's ld.so to this will likely be no fun.
> >
> > Each process gets a different range because each process gets a different
> > native space. Within that space processes can use the same offsets, and
> > typically will so as to avoid pointless relocation.
>
> fork() will be hard and/or inefficient this way.
Why? The load image for the new process does not require relocation, so all
that's necessary to spawn a process is to allocate a spaceID, map the
executable to that ID in the page tables, push one entry into the TLB, set a
couple of hardware registers, and insert it into the readyq. When it gets
control it will promptly fault in its code pages (if not already present),
and execution from there on is normal. This process is essentially identical
to what happens on a conventional, except that because there is no aliasing of
addresses (flat unified 64-bit model) you don't have to scrub the cache or
TLB.
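
The spawn steps listed above can be sketched as a toy bookkeeping model in C. This is not kernel or hardware code: the structures, the `spawn()` function, and the field names are all hypothetical, and the page-table and TLB steps are reduced to comments because the thread gives no details about them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TASKS 8

struct task {
    uint32_t space_id;    /* hypothetical hardware space ID */
    uint64_t image_off;   /* offset of the load image inside its space */
    uint64_t entry_pc;    /* initial program counter within the space */
};

struct readyq {
    struct task *slots[MAX_TASKS];
    size_t count;
};

static uint32_t next_space_id = 1;

/* Returns 0 on success, -1 if the ready queue is full. */
static int spawn(struct readyq *q, struct task *t,
                 uint64_t image_off, uint64_t entry_off)
{
    assert(q != NULL && t != NULL);
    if (q->count >= MAX_TASKS)
        return -1;
    t->space_id = next_space_id++;          /* 1. allocate a spaceID */
    t->image_off = image_off;               /* 2. map the executable to that ID
                                                  (page-table update omitted) */
                                            /* 3. push one TLB entry (omitted) */
    t->entry_pc = image_off + entry_off;    /* 4. set the start registers */
    q->slots[q->count++] = t;               /* 5. insert into the readyq */
    return 0;
}
```

Two tasks spawned from the same image get different space IDs but identical offsets, which is why no relocation pass is needed in this model.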
> > > Overall it sounds like your architecture is not very well suited to
> > > run Linux.
> >
> > We believe we can adopt the Linux protection model (i.e. the 386 protection
> > model) with no more work than any other port to a new architecture (ahem).
>
> Just FYI - Linux has been ported to several architectures with similar SASOS
> capabilities in hardware (IA64 or ppc64 on iseries) and they have all opted to use
> a conventional protection model.
Do you know why? Can you point me to the people who did these ports so I can
ask?
> > So long as 1) a driver has a driver-load-time defined region of working data
> > space; 2) has a defined code region; 3) gets its buffer addresses etc. as
>
> Just (1) alone is an illusion - Linux drivers generally work on the shared
> page pool, just like all other subsystems.
In 1) I'm talking about the driver's local state, not the pages it's trying
to fill. That local state will be in the driver's space, and protected from
interference by everybody else.
The pages (I assume) are arguments to the driver, i.e. "Fill *here*", and
the owner of the page will have exposed the page to the driver before or as
part of making the call. Or the call was "Fill some page and return it", and
the driver calls the physmem manager who allocates the page, exposes it to
the driver, and returns the address to the driver. When the driver is done it
will hand off ownership of the page to the client. I fully expect that the
present code for this mechanism will have to be mangled, but I suspect that
the kernel already implements some concept of "owner" for a physpage and we
can hook the ownership transfer into our model.
I hope :-)
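
The expose-then-hand-off idea can be modelled in a few lines of C. This is a sketch under stated assumptions: the `phys_page` structure, the single-grantee limit, and the rule that only the owner may transfer ownership are all invented for illustration and do not describe the real kernel page allocator or the hardware grant mechanism.

```c
#include <assert.h>
#include <stddef.h>

#define NO_SPACE (-1)

struct phys_page {
    int owner;     /* hypothetical space ID that owns the page */
    int grantee;   /* space ID currently granted access, or NO_SPACE */
};

/* The owner exposes the page to one other space. Returns 0 on success. */
static int page_expose(struct phys_page *pg, int caller, int grantee)
{
    assert(pg != NULL);
    if (caller != pg->owner)
        return -1;
    pg->grantee = grantee;
    return 0;
}

/* A space may touch the page if it owns it or holds the grant. */
static int page_can_access(const struct phys_page *pg, int space)
{
    return space == pg->owner || space == pg->grantee;
}

/* The owner hands the page to new_owner; any outstanding grant is dropped. */
static int page_handoff(struct phys_page *pg, int caller, int new_owner)
{
    assert(pg != NULL);
    if (caller != pg->owner)
        return -1;
    pg->owner = new_owner;
    pg->grantee = NO_SPACE;
    return 0;
}
```

In the "Fill *here*" case the client would call `page_expose()` before invoking the driver; in the "Fill some page and return it" case the allocator would expose the page to the driver and later call `page_handoff()` toward the client.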
Ivan
* Re: Kernel support for peer-to-peer protection models...
2004-03-28 23:14 ` Andi Kleen
2004-03-29 8:09 ` Ivan Godard
@ 2004-03-29 15:36 ` Pavel Machek
2004-03-30 14:06 ` Andi Kleen
2004-03-30 15:09 ` Ivan Godard
1 sibling, 2 replies; 21+ messages in thread
From: Pavel Machek @ 2004-03-29 15:36 UTC (permalink / raw)
To: Andi Kleen; +Cc: Ivan Godard, linux-kernel
Hi!
> > > Overall it sounds like your architecture is not very well suited to
> > > run Linux.
> >
> > We believe we can adopt the Linux protection model (i.e. the 386 protection
> > model) with no more work than any other port to a new architecture (ahem).
>
> Just FYI - Linux has been ported to several architectures with similar SASOS
> capabilities in hardware (IA64 or ppc64 on iseries) and they have all opted to use
> a conventional protection model.
>
It might actually be a plus for Ivan: if ia64 and ppc64 benefit from the
changes for Mill, it makes them more acceptable.
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms
* Re: Kernel support for peer-to-peer protection models...
2004-03-29 7:52 ` Ivan Godard
@ 2004-03-29 18:45 ` Davide Libenzi
2004-03-29 20:53 ` Ivan Godard
0 siblings, 1 reply; 21+ messages in thread
From: Davide Libenzi @ 2004-03-29 18:45 UTC (permalink / raw)
To: Ivan Godard; +Cc: Paul Mackerras, Linux Kernel Mailing List
On Sun, 28 Mar 2004, Ivan Godard wrote:
> Ah! you wanted to know about exec, not fork. A true fork() is pretty rare
> these days anyway. Still, the answer is pretty much the same: the fork()
> gets you a new data space, retaining the old code space, and the exec()
> finds (or creates) the code space that cmd's code is in and switches the
> active code space to that space. Heritable data, such as file descriptors,
> won't have been in the old data space anyway, so the child references them
> through syscalls just as in a conventional.
>
> Perhaps I'm missing your question here, but in general we see no problem
> with fork/exec in our model - it's one of the least changed things. You
> always get a new address space on a conventional, and you also do with a
> Mill; the only difference is that you don't have to shoot down the cache or
> TLB, so a fork/exec should be quite a bit faster.
No, sorry. Lossy email compression (sometimes being concise is a virtue
but I usually go way too far). Reading through the previous messages is
confusing to me. On one side you say that fork() uses a std COW, on the
other side you say that the unified virtual address space combined with
a virtually tagged cache lets you avoid cache flushes. As soon as you COW,
you have one virtual address that refers to two different physical addresses.
Does the virtual address have extra tag bits to identify the task?
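
The aliasing in question is easy to observe on a conventional POSIX system. The sketch below (function and variable names are made up for illustration) forks, lets the child write to a variable, and has the child report the variable's address back through a pipe; the parent then checks that the address matches its own while its value is unchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;   /* lives at the same virtual address in both tasks */

/* Returns 0 if the child wrote to the same virtual address without
 * affecting the parent's copy, 1 if not, -1 on a system error. */
static int cow_same_address_demo(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                          /* child */
        close(fds[0]);
        counter = 2;                         /* the write forces a private copy */
        uintptr_t addr = (uintptr_t)&counter;
        ssize_t n = write(fds[1], &addr, sizeof addr);
        _exit(n == (ssize_t)sizeof addr ? 0 : 1);
    }

    close(fds[1]);
    uintptr_t child_addr = 0;
    ssize_t n = read(fds[0], &child_addr, sizeof child_addr);
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || n != (ssize_t)sizeof child_addr)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* Same virtual address in both tasks, yet the parent still sees 1. */
    return (child_addr == (uintptr_t)&counter && counter == 1) ? 0 : 1;
}
```

On a conventional machine the two copies are told apart by per-process page tables; with a single flat address space some other discriminator is needed, hence the question about tag bits.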
- Davide
* Re: Kernel support for peer-to-peer protection models...
2004-03-29 18:45 ` Davide Libenzi
@ 2004-03-29 20:53 ` Ivan Godard
0 siblings, 0 replies; 21+ messages in thread
From: Ivan Godard @ 2004-03-29 20:53 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Paul Mackerras, Linux Kernel Mailing List
----- Original Message -----
From: "Davide Libenzi" <davidel@xmailserver.org>
To: "Ivan Godard" <igodard@pacbell.net>
Cc: "Paul Mackerras" <paulus@samba.org>; "Linux Kernel Mailing List"
<linux-kernel@vger.kernel.org>
Sent: Monday, March 29, 2004 10:45 AM
Subject: Re: Kernel support for peer-to-peer protection models...
> On Sun, 28 Mar 2004, Ivan Godard wrote:
>
> > Ah! you wanted to know about exec, not fork. A true fork() is pretty rare
> > these days anyway. Still, the answer is pretty much the same: the fork()
> > gets you a new data space, retaining the old code space, and the exec()
> > finds (or creates) the code space that cmd's code is in and switches the
> > active code space to that space. Heritable data, such as file descriptors,
> > won't have been in the old data space anyway, so the child references them
> > through syscalls just as in a conventional.
> >
> > Perhaps I'm missing your question here, but in general we see no problem
> > with fork/exec in our model - it's one of the least changed things. You
> > always get a new address space on a conventional, and you also do with a
> > Mill; the only difference is that you don't have to shoot down the cache or
> > TLB, so a fork/exec should be quite a bit faster.
>
> No, sorry. Lossy email compression (sometimes being concise is a virtue
> but I usually fall way too far). Reading thru previous messages seems
> confusing to me. On one side you say that fork() uses a std COW, on the
> other side you say that the unified virtual address space combined with
> virtually tagged cache let you avoid cache flushes. As soon as you COW,
> you have one virtual address that refer to two different physical addresses.
> Does the virtual address have extra tag bits to identify the task?
No, but it has a mechanism (unfortunately I can't explain yet how it works)
which amounts to the same thing. In fact, the hardware can distinguish
between the case where a particular address in the child is supposed to
refer to the child, and the case where it's supposed to refer back into the
parent. This is a novel feature that is not supported by the conventional
model. We expect to expose it through an API. For example a server can spawn
worker processes such that each worker has its own space (and is protected
from other workers) but can communicate with the server via data structures
in the server, where the new worker COW comes up with the back-references
already created. This gives the convenience of thread-workers with the
protection of process-workers using a mmap communication region, but without
the overhead of creating the region.
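
For comparison, here is what the conventional alternative mentioned above looks like: a process-worker that talks to its server through an explicitly created shared mmap region. This is a minimal sketch assuming Linux or another POSIX system that supports `MAP_ANONYMOUS`; the function name `worker_roundtrip` is made up, and it shows only the conventional baseline, not the hardware mechanism described in this message.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The server creates a shared region, forks a worker that doubles value
 * into it, and reads the result back. Returns the result, or -1 on error. */
static int worker_roundtrip(int value)
{
    int *region = mmap(NULL, sizeof *region, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;
    *region = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(region, sizeof *region);
        return -1;
    }
    if (pid == 0) {              /* worker: private copy of everything else */
        *region = value * 2;     /* only this page is shared with the server */
        _exit(0);
    }

    int status = 0;
    int result = -1;
    if (waitpid(pid, &status, 0) >= 0 && WIFEXITED(status))
        result = *region;
    munmap(region, sizeof *region);
    return result;
}
```

The explicit `mmap()`/`munmap()` pair is the setup cost that the back-reference scheme is claimed to avoid.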
Ivan
* Re: Kernel support for peer-to-peer protection models...
2004-03-29 15:36 ` Pavel Machek
@ 2004-03-30 14:06 ` Andi Kleen
2004-03-30 15:09 ` Ivan Godard
1 sibling, 0 replies; 21+ messages in thread
From: Andi Kleen @ 2004-03-30 14:06 UTC (permalink / raw)
To: Pavel Machek; +Cc: igodard, linux-kernel
On Mon, 29 Mar 2004 17:36:06 +0200
Pavel Machek <pavel@suse.cz> wrote:
> >
>
> It might actually be a plus for Ivan: if ia64 and ppc64 benefit from the
> changes for Mill, it makes them more acceptable.
I doubt either of them will completely redesign their port at this point.
-Andi
* Re: Kernel support for peer-to-peer protection models...
2004-03-29 15:36 ` Pavel Machek
2004-03-30 14:06 ` Andi Kleen
@ 2004-03-30 15:09 ` Ivan Godard
1 sibling, 0 replies; 21+ messages in thread
From: Ivan Godard @ 2004-03-30 15:09 UTC (permalink / raw)
To: Pavel Machek, Andi Kleen; +Cc: linux-kernel
----- Original Message -----
From: "Pavel Machek" <pavel@suse.cz>
To: "Andi Kleen" <ak@suse.de>
Cc: "Ivan Godard" <igodard@pacbell.net>; <linux-kernel@vger.kernel.org>
Sent: Monday, March 29, 2004 7:36 AM
Subject: Re: Kernel support for peer-to-peer protection models...
> Hi!
>
> > > > Overall it sounds like your architecture is not very well suited to
> > > > run Linux.
> > >
> > > We believe we can adopt the Linux protection model (i.e. the 386 protection
> > > model) with no more work than any other port to a new architecture (ahem).
> >
> > Just FYI - Linux has been ported to several architectures with similar SASOS
> > capabilities in hardware (IA64 or ppc64 on iseries) and they have all opted to use
> > a conventional protection model.
> >
>
> It might actually be a plus for Ivan: if ia64 and ppc64 benefit from the
> changes for Mill, it makes them more acceptable.
Interesting point. Although it's not clear how long ia64 will still exist.
Remember the Alpha?
Ivan