How linear address translate to physical address in kernel space?

public inbox for linux-kernel@vger.kernel.org

* How linear address translate to physical address in kernel space?
@ 2005-07-19 11:34 liyu@WAN
  2005-07-19 12:12 ` Steven Rostedt
  0 siblings, 1 reply; 4+ messages in thread
From: liyu@WAN @ 2005-07-19 11:34 UTC (permalink / raw)
  To: LKML

Hi, everyone on LKML.

    I have a question about something I cannot understand.

    In kernel space, how is a linear address translated to a physical
address? Many kernel books say the kernel is "directly mapped". I
think I see what they mean: they mean using the __pa()/__va() macro
pair.

    (My platform is i386.)

    But these macros are for cases that need a physical address
explicitly. In most cases the kernel needs the translation to happen
implicitly. In user space, this translation is handled by the MMU and
the kernel's page fault exception handler.

    I think the kernel cannot use the CR3 register directly for this
purpose because, for example, when the kernel switches between user
space tasks, it must change CR3 to switch the task's address space. If
the kernel also relied on CR3, this change would break the kernel's
control flow.

   
    I don't know if I have stated my question clearly; my English is
poor. I am waiting for any answer.

    Thanks in advance.



                                                                         
   liyu/NOW~


* Re: How linear address translate to physical address in kernel space?
  2005-07-19 11:34 How linear address translate to physical address in kernel space? liyu@WAN
@ 2005-07-19 12:12 ` Steven Rostedt
  2005-07-19 13:39   ` liyu@WAN
  0 siblings, 1 reply; 4+ messages in thread
From: Steven Rostedt @ 2005-07-19 12:12 UTC (permalink / raw)
  To: liyu@WAN; +Cc: LKML

On Tue, 2005-07-19 at 19:34 +0800, liyu@WAN wrote:
> Hi, everyone on LKML.
> 
>     I have a question about something I cannot understand.
> 
>     In kernel space, how is a linear address translated to a physical
> address? Many kernel books say the kernel is "directly mapped". I
> think I see what they mean: they mean using the __pa()/__va() macro
> pair.
> 
>     (My platform is i386.)

Hi Liyu,

I'm not that strong in the Intel world, but after all the fancy
registers, Intel is not much different than, say, PPC. So I'll keep this
more of a generic platform discussion, and only talk about physical and
virtual address space.

The kernel is usually loaded at the lower end of memory, which on
most platforms starts at physical address zero (I've worked with
platforms that don't do this, but that's off topic). Then the kernel maps
this physical location to some upper virtual address (with Intel it's
usually 0xc0000000).

All the user space addresses are mapped below this kernel address. Now
the magic here, and it probably confuses you, is that all
tasks/processes have the kernel address mapped to the same location.  So
on context switches, the kernel is still in the same location in virtual
address.  It's just that when the CPU is in user mode, the kernel
address is protected from being read or written to. So if a user space
process tries to write or read from it (like *(char*)(0xc0000000) = 1;)
it will get a page fault. But when the CPU switches to kernel mode
(through a system call or interrupt), it has full access to this area.
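
To make the protection check concrete, here is a minimal sketch in C. It
is a user-space model, not kernel code: the flag values are the i386
page-table entry bits from the Intel manuals, while access_allowed() is
a hypothetical helper invented for this illustration. It also simplifies
by ignoring CR0.WP, so supervisor writes to read-only pages are treated
as allowed.

```c
#include <stdbool.h>
#include <stdint.h>

/* i386 page-table entry flag bits. */
#define PTE_PRESENT 0x001u  /* the mapping is valid */
#define PTE_RW      0x002u  /* the page is writable */
#define PTE_USER    0x004u  /* the page is accessible from user mode */

/*
 * Hypothetical helper: returns true if the access would succeed and
 * false if the MMU would raise a page fault instead.
 * Simplification: CR0.WP is ignored, so kernel-mode writes are allowed
 * even when PTE_RW is clear.
 */
static bool access_allowed(uint32_t pte, bool user_mode, bool is_write)
{
    if (!(pte & PTE_PRESENT))
        return false;               /* nothing mapped here */
    if (user_mode && !(pte & PTE_USER))
        return false;               /* kernel-only page touched from user mode */
    if (user_mode && is_write && !(pte & PTE_RW))
        return false;               /* read-only page written from user mode */
    return true;
}
```

With a kernel mapping that has PTE_PRESENT | PTE_RW but not PTE_USER,
the same address faults when user_mode is true and works when it is
false, which is the behaviour described above.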

So you have this mapping:

             Physical               Virtual
0x00000000  ------------          -------------
            | kernel   | --+      |  user      |
            |          |   |      |            | 
            +----------+   |      |            |
            | general  |   |      |            |
            | memory   |   |      |            |
            .          .   |      .            .
            .          .   |      .            . 
            | end of   |   |      |            |
            | memory   |   |      |            |
            +----------+   |      |            |
            |          |   +--->  +------------+ 0xc0000000
            |          |          |  kernel    |
            .          .          .            .
            .          .          .            .

Usually all of memory is mapped starting at the address 0xc0000000. But
this becomes a problem when you have a gig or more of RAM, since you run
out of virtual space to map it there. And you still need room to map
device memory as well (you can't have user space conflicting with
devices). So if you have a lot of RAM, you need to turn on highmem
support, which then plays around to get at memory above a certain point.
But that's another discussion.
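
The arithmetic behind "a gig or more" can be sketched as follows. This
assumes the usual i386 layout of that era: the kernel owns the top 1 GB
of virtual space (from 0xc0000000 up) and reserves 128 MB of it for
vmalloc and device mappings. The function names are hypothetical and the
reserve size is an assumption, not something stated in this thread.

```c
#include <stdint.h>

#define PAGE_OFFSET     0xC0000000u   /* start of the kernel's virtual range */
#define VMALLOC_RESERVE (128u << 20)  /* assumed: 128 MB kept for vmalloc/ioremap */

/* Size of the kernel's virtual range: 2^32 - PAGE_OFFSET = 1 GB.
 * Unsigned subtraction wraps modulo 2^32, which gives exactly that. */
static uint32_t kernel_virtual_span(void)
{
    return 0u - PAGE_OFFSET;
}

/* Largest amount of RAM that can be mapped directly. */
static uint32_t max_direct_mapped(void)
{
    return kernel_virtual_span() - VMALLOC_RESERVE;
}

/* Non-zero if some of the RAM cannot be part of the direct mapping. */
static int needs_highmem(uint64_t ram_bytes)
{
    return ram_bytes > max_direct_mapped();
}
```

Under these assumptions the direct mapping tops out at 896 MB, so a
machine with 1 GB of RAM already has some memory that the kernel cannot
reach without highmem support.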

So to get the physical address of memory from the kernel, you can use
__pa, and go back with __va.  These are usually used to communicate with
devices, which are also mapped to some location.  But these only work
when the mapping is direct as shown above. When highmem support is on,
you can't get to memory that is not mapped in. But there's no need to use
__pa or __va to get to memory just for itself.  Usually they are used
when dealing with devices that have DMA or some other need to find a
physical address. If a device needs to write to memory (usually only
knowing about the physical location of the memory) you get that memory
with the GFP_DMA flag, which gives you low memory from the DMA zone, and
that memory is always mapped directly.
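
For the directly mapped region, the conversion is plain arithmetic. The
sketch below models it in user-space C, assuming the i386 value of
PAGE_OFFSET (0xc0000000). The names sketch_pa() and sketch_va() are
made up for this example so they are not mistaken for the kernel's own
__pa() and __va() macros.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u  /* where physical address 0 appears in kernel virtual space */

/* Model of __pa(): kernel virtual address -> physical address.
 * Only meaningful for addresses inside the direct mapping. */
static inline uint32_t sketch_pa(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET;
}

/* Model of __va(): physical address -> kernel virtual address.
 * Only meaningful for physical memory that is directly mapped. */
static inline uint32_t sketch_va(uint32_t paddr)
{
    return paddr + PAGE_OFFSET;
}
```

For example, sketch_pa(0xC0100000) gives 0x00100000. Note that this
subtraction does not perform the translation the CPU uses at run time;
the MMU still walks the page tables, and the kernel has simply filled
them in so that the result matches this offset.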

-- Steve


* Re: How linear address translate to physical address in kernel space?
  2005-07-19 12:12 ` Steven Rostedt
@ 2005-07-19 13:39   ` liyu@WAN
  2005-07-19 14:20     ` Steven Rostedt
  0 siblings, 1 reply; 4+ messages in thread
From: liyu@WAN @ 2005-07-19 13:39 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML

Hi,

    Thanks, Steven, but I think you did not understand the real
intention of my question.

    First, please allow me to explain a bit about the Intel
architecture. If it includes any errors, please tell me. Thanks.

    On i386, the addresses a program uses directly are called "logical
addresses". A logical address consists of a 16-bit segment selector (an
index into a segment descriptor table) and a 32-bit offset. A logical
address must first pass through segment translation, which uses a
segment descriptor in the GDT or LDT (Global/Local Descriptor Table).
Each segment descriptor holds the base address of the segment; the
offset in the logical address is added to this base address to form the
"linear address". If paging is disabled, this linear address is the
physical address. If paging is enabled, the linear address still has to
go through paging translation to become a physical address. In paging,
the CR3 register plays an important role: it holds the base address of
the page directory.

    On i386 Linux, segment translation is just a dummy step: every
logical address translates to a linear address with the same value. In
other words, all segments (KERNEL_CS, KERNEL_DS, USER_DS, USER_CS) have
a base address of zero.
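
    The two translation steps described above can be sketched in C as
follows. This is a user-space model that assumes classic two-level i386
paging with 4 KB pages (no PAE, no 4 MB pages); the struct and function
names are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* The three fields the MMU extracts from a 32-bit linear address. */
struct linear_parts {
    uint32_t pgd_index;  /* bits 31..22: index into the page directory */
    uint32_t pte_index;  /* bits 21..12: index into the page table */
    uint32_t offset;     /* bits 11..0:  byte offset inside the 4 KB page */
};

/* Segmentation step: linear address = segment base + offset.
 * With the zero-based segments Linux uses, this returns the offset unchanged. */
static uint32_t logical_to_linear(uint32_t segment_base, uint32_t offset)
{
    return segment_base + offset;
}

/* Paging step, first half: split the linear address into its indices. */
static struct linear_parts split_linear(uint32_t linear)
{
    struct linear_parts p;

    p.pgd_index = linear >> 22;
    p.pte_index = (linear >> 12) & 0x3FFu;
    p.offset    = linear & 0xFFFu;
    return p;
}
```

    For the linear address 0xC0101234 this gives page directory index
768, page table index 257 and offset 0x234. Index 768 is the first
page directory slot that covers addresses at or above 0xc0000000.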

    Note that the terms "logical address" and "linear address" above
come from the Intel architecture manuals; they are not exactly
equivalent to the Linux kernel's terms, although "linear address" is the
closest match between the two.

    Paging on the i386 architecture must use the CR3 register, as far
as we know. So if the Linux kernel is going to map its "linear
addresses" to low physical addresses, it also needs to use CR3. But it
cannot use CR3 directly, at least not in the switch_mm() function,
because that function changes the CR3 value to switch the user task's
memory address space.

    I know the kernel often does not set up page tables for itself,
except in some special cases, for example vmalloc. This suggests the
kernel uses the page tables (and CR3) in reverse.

    I want to know how the kernel translates its own addresses. In
particular, how does the code that runs after the kernel changes CR3
keep working? Does it use CR3 or not? As Steven said, I am really
confused here.

    It is as if the kernel has many secrets I don't know. These secrets
drive me to study it.

    Thanks in advance.


                                                                         
         liyu/NOW~

* Re: How linear address translate to physical address in kernel space?
  2005-07-19 13:39   ` liyu@WAN
@ 2005-07-19 14:20     ` Steven Rostedt
  0 siblings, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2005-07-19 14:20 UTC (permalink / raw)
  To: liyu@WAN; +Cc: LKML

On Tue, 2005-07-19 at 21:39 +0800, liyu@WAN wrote:
> Hi,
> 
>     Thanks, Steven, but I think you did not understand the real
> intention of my question.

I think I actually do. I just haven't expressed myself well enough for
you to understand :-)

> 
>     First, please allow me to explain a bit about the Intel
> architecture. If it includes any errors, please tell me. Thanks.
> 

[snip explanation of Intel]

> 
>     On i386 Linux, segment translation is just a dummy step: every
> logical address translates to a linear address with the same value. In
> other words, all segments (KERNEL_CS, KERNEL_DS, USER_DS, USER_CS) have
> a base address of zero.

This is done to make Intel look more like other architectures.

> 
>     Note that the terms "logical address" and "linear address" above
> come from the Intel architecture manuals; they are not exactly
> equivalent to the Linux kernel's terms, although "linear address" is the
> closest match between the two.

I always get confused with Intel's terminology.  That's why I tried to
stay with the generic physical (what the bus sees) and the virtual (what
the CPU sees).

> 
>     Paging on the i386 architecture must use the CR3 register, as far
> as we know. So if the Linux kernel is going to map its "linear
> addresses" to low physical addresses, it also needs to use CR3. But it
> cannot use CR3 directly, at least not in the switch_mm() function,
> because that function changes the CR3 value to switch the user task's
> memory address space.

OK, I'm too lazy to go open up my Intel books (they're buried someplace,
and my online documentation is too big), so I'm assuming that the CR3
register is the pointer to the Global Page Table (GPT).

> 
>     I know the kernel often does not set up page tables for itself,
> except in some special cases, for example vmalloc. This suggests the
> kernel uses the page tables (and CR3) in reverse.

Let's, for the sake of simplicity, forget about how the kernel fills in
the entries in the GPT (can be done from page faults).

> 
>     I want to know how the kernel translates its own addresses. In
> particular, how does the code that runs after the kernel changes CR3
> keep working? Does it use CR3 or not? As Steven said, I am really
> confused here.
> 
>     It is as if the kernel has many secrets I don't know. These secrets
> drive me to study it.

OK, the CR3 (If I was right in my above statement) points to the page
tables. So we have

         GPT
CR3 -> +--------+
       |        | -> user page tables.
       |        |
       |        |
       |        |
       |0xc0000 | ->  Page table of kernel
       +--------+

If every user task's Global Page Table (GPT) has a pointer to the kernel
page table for the addresses 0xc0000000 and above, then there's no
problem in switching the CR3 register.  What happens is that the user's
page tables will change. But all the user tasks have the upper addresses
pointing to the same place (thus the kernel), with the proper
protection bits as described earlier that say that only when the CPU is
in kernel mode does it have access to this.
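
Here is a small user-space model of that idea in C. What is called the
GPT above is, in Intel's terminology, the page directory (Linux calls it
the PGD); the sketch assumes two-level i386 paging with 1024 directory
entries, of which entry 768 and up cover 0xc0000000 and above. All type
and function names are hypothetical, and copying the kernel entries from
a master directory is how I understand Linux to do it, not something
confirmed in this thread.

```c
#include <stdint.h>
#include <string.h>

#define PGD_ENTRIES      1024u
#define KERNEL_PGD_START 768u   /* 0xC0000000 >> 22 */

/* One top-level page directory (the "GPT" in the message above). */
typedef struct {
    uint32_t entry[PGD_ENTRIES];
} pgd_sketch;

/* Simulated CR3: points at the page directory currently in use. */
static const pgd_sketch *cr3;

/* Set up a task's directory: empty user part, kernel part copied from
 * the master so every task sees the kernel at the same addresses. */
static void pgd_init_for_task(pgd_sketch *task, const pgd_sketch *master)
{
    memset(task->entry, 0, KERNEL_PGD_START * sizeof(uint32_t));
    memcpy(&task->entry[KERNEL_PGD_START],
           &master->entry[KERNEL_PGD_START],
           (PGD_ENTRIES - KERNEL_PGD_START) * sizeof(uint32_t));
}

/* Model of the CR3 reload done on a context switch. */
static void switch_address_space(const pgd_sketch *next)
{
    cr3 = next;
}

/* Look up the directory entry the MMU would use for a linear address. */
static uint32_t lookup_pde(uint32_t linear)
{
    return cr3->entry[linear >> 22];
}
```

After switching from one task's directory to another, lookup_pde() for
any address at or above 0xc0000000 returns the same entry as before,
while lookups below it can differ. That is why kernel code keeps running
across the CR3 change: the instruction after the reload is fetched
through an identical mapping.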

I hope this helps, since I'm just about to leave to Ottawa. (See
everyone there ;-)

-- Steve
