* Page Cache Address Space Concept
@ 2011-02-14 10:59 piyush moghe
2011-02-14 13:27 ` Rajat Sharma
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: piyush moghe @ 2011-02-14 10:59 UTC (permalink / raw)
To: kernelnewbies
While going through the page cache explanation in the "Professional Linux Kernel"
book, I came across a term called "address space" (not related to the virtual
or physical address space).
I did not understand what this address space means. The following is the
verbatim description:
"To manage the various target objects that can be processed and cached in
whole pages, the kernel uses an abstraction of
the "address space" that associates the pages in memory with a specific
block device (or any other system unit or part of a
system unit).
This type of address space must not be confused with the virtual and
physical address spaces provided by the
system or processor. It is a separate abstraction of the Linux kernel that
unfortunately bears the same name.
Initially, we are interested in only one aspect. Each address space has a
"host" from which it obtains its data. In most
cases, these are inodes that represent just one file.[2] Because all
existing inodes are linked with their superblock (as
discussed in Chapter 8), all the kernel need do is scan a list of all
superblocks and follow their associated inodes to obtain
a list of cached pages"
Can anyone please explain what is the use of this and what this is all
about?
Regards,
Piyush
* Page Cache Address Space Concept
From: Rajat Sharma @ 2011-02-14 13:27 UTC (permalink / raw)
To: kernelnewbies
One vital use of the address_space object is by a filesystem, to manage
the page cache of a file:
- cache recently accessed data in the page cache
- read ahead (prefetch) sequentially read data into the page cache
- support memory-mapped I/O through the page cache.
Look at the address_space_operations vector to see how this is achieved:
its readpage, readpages, writepage and writepages methods populate and
flush the page cache of an inode.
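
To make the idea of an operations vector concrete, here is a minimal userspace sketch, not kernel code. All names (sketch_page, sketch_mapping, sketch_aops, ram_readpage, ram_writepage) are hypothetical, and the "disk" is assumed to be a plain memory buffer. It only illustrates how a table of function pointers lets generic cache code populate and flush pages without knowing the filesystem behind them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical, simplified stand-in for a cached page. */
struct sketch_page {
    unsigned long index;            /* page index within the file */
    int uptodate;                   /* 1 once the data has been read in */
    int dirty;                      /* 1 if modified since the last flush */
    char data[SKETCH_PAGE_SIZE];
};

struct sketch_mapping;

/* Function-pointer table, loosely modelled on address_space_operations. */
struct sketch_aops {
    int (*readpage)(struct sketch_mapping *m, struct sketch_page *p);
    int (*writepage)(struct sketch_mapping *m, struct sketch_page *p);
};

struct sketch_mapping {
    const struct sketch_aops *a_ops;
    char *backing;                  /* assumption: a buffer plays the disk */
    size_t backing_size;
};

/* Populate a page from the backing buffer, zero-filling past its end. */
static int ram_readpage(struct sketch_mapping *m, struct sketch_page *p)
{
    size_t off, n;

    assert(m != NULL && p != NULL);
    off = (size_t)p->index * SKETCH_PAGE_SIZE;
    memset(p->data, 0, SKETCH_PAGE_SIZE);
    if (off < m->backing_size) {
        n = m->backing_size - off;
        if (n > SKETCH_PAGE_SIZE)
            n = SKETCH_PAGE_SIZE;
        memcpy(p->data, m->backing + off, n);
    }
    p->uptodate = 1;
    p->dirty = 0;
    return 0;
}

/* Flush a page back to the backing buffer and clear its dirty flag. */
static int ram_writepage(struct sketch_mapping *m, struct sketch_page *p)
{
    size_t off, n;

    assert(m != NULL && p != NULL);
    off = (size_t)p->index * SKETCH_PAGE_SIZE;
    if (off >= m->backing_size)
        return -1;                  /* page lies beyond the backing store */
    n = m->backing_size - off;
    if (n > SKETCH_PAGE_SIZE)
        n = SKETCH_PAGE_SIZE;
    memcpy(m->backing + off, p->data, n);
    p->dirty = 0;
    return 0;
}

static const struct sketch_aops ram_aops = {
    .readpage = ram_readpage,
    .writepage = ram_writepage,
};
```

A caller would only ever go through m->a_ops->readpage(m, p) and m->a_ops->writepage(m, p), so a different backing store just means a different table.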
Thanks,
Rajat
* Page Cache Address Space Concept
From: Mulyadi Santosa @ 2011-02-14 20:16 UTC (permalink / raw)
To: kernelnewbies
Hi :)
On Mon, Feb 14, 2011 at 17:59, piyush moghe <pmkernel@gmail.com> wrote:
> While going through Page Cache explanation in "Professional Linux Kernel"
> book I came across one term called "address space" ( not related to virtual
> or physical address space )
> I did not get what is the meaning of this address space
I'll try to help (although I am not really into filesystem and block
device things):
think of it as "which 'device' (or, to be precise, which partition
or anything alike) a page belongs to". In practice, AFAIK, it is used to
tell whether a page caches something from an underlying
device or holds anonymous memory (which doesn't have any backing device).
Compare it with other kinds of "space":
(memory) address space --> layout of memory in the virtual memory space
(the way the MMU partitions memory)
name space --> in which root filesystem a file or directory (or
anything alike) belongs, for example chroot jails
pid space --> list of PIDs that belong to a certain "machine"; it is
done to cope with containers (lightweight virtual machines), so that the
PIDs created by each of them belong to a different "space"
So, continuing the same analogy, you can imagine what address
space means in this case.
--
regards,
Mulyadi Santosa
Freelance Linux trainer and consultant
blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
* Page Cache Address Space Concept
From: Miguel Telleria de Esteban @ 2011-02-15 18:16 UTC (permalink / raw)
To: kernelnewbies
My 2 cents on this topic, since I have recently read some literature
about it.
On Mon, 14 Feb 2011 16:29:42 +0530 piyush moghe wrote:
> While going through Page Cache explanation in "Professional Linux
> Kernel" book I came across one term called "address space" ( not
> related to virtual or physical address space )
First of all, I assume that you are referring to the structure
struct address_space
defined in include/linux/fs.h. For kernel 2.6.37, see the following link:
http://lxr.linux.no/linux+*/include/linux/fs.h#L632
> Can anyone please explain what is the use of this and what this is all
> about?
This structure is at the core of the implementation of the "page cache"
mechanism, which I see, coarsely, as a disk cache (conceptually of the
same kind as smartdrv in Windows 3.1).
The idea is that in most cases (direct I/O via O_DIRECT being the exception)
all filesystem reads and writes are performed in memory first. Data is
fetched from disk only the first time it is read, and writes are flushed to
disk at certain intervals.
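
As a rough illustration of that read-once, write-back behaviour, here is a toy userspace model. It is only a sketch under stated assumptions: the names (toy_cache, toy_read, toy_write, toy_flush), the tiny 16-byte "pages" and the in-memory "disk" are all made up for the example and have nothing to do with the real kernel data structures.

```c
#include <assert.h>
#include <string.h>

#define TOY_NPAGES 4
#define TOY_PAGE_SZ 16  /* tiny pages keep the example easy to follow */

/* Hypothetical toy model: a "disk" plus cached copies of its pages. */
struct toy_cache {
    char disk[TOY_NPAGES][TOY_PAGE_SZ];  /* assumed backing store */
    char mem[TOY_NPAGES][TOY_PAGE_SZ];   /* cached copies */
    int present[TOY_NPAGES];             /* page already read in? */
    int dirty[TOY_NPAGES];               /* modified since last flush? */
    int disk_reads;                      /* counts pages fetched from disk */
    int disk_writes;                     /* counts pages flushed to disk */
};

/* Return the cached page, reading it from "disk" only on the first use. */
static char *toy_get_page(struct toy_cache *c, int idx)
{
    assert(c != NULL && idx >= 0 && idx < TOY_NPAGES);
    if (!c->present[idx]) {
        memcpy(c->mem[idx], c->disk[idx], TOY_PAGE_SZ);
        c->present[idx] = 1;
        c->disk_reads++;
    }
    return c->mem[idx];
}

static char toy_read(struct toy_cache *c, int idx, int off)
{
    assert(off >= 0 && off < TOY_PAGE_SZ);
    return toy_get_page(c, idx)[off];
}

/* Writes touch memory only; the page is merely marked dirty. */
static void toy_write(struct toy_cache *c, int idx, int off, char v)
{
    assert(off >= 0 && off < TOY_PAGE_SZ);
    toy_get_page(c, idx)[off] = v;
    c->dirty[idx] = 1;
}

/* Write all dirty pages back; returns how many pages were flushed. */
static int toy_flush(struct toy_cache *c)
{
    int i, n = 0;

    assert(c != NULL);
    for (i = 0; i < TOY_NPAGES; i++) {
        if (c->present[i] && c->dirty[i]) {
            memcpy(c->disk[i], c->mem[i], TOY_PAGE_SZ);
            c->dirty[i] = 0;
            c->disk_writes++;
            n++;
        }
    }
    return n;
}
```

In this model, two consecutive toy_read() calls on the same page bump disk_reads only once, and a toy_write() does not change the "disk" until toy_flush() runs.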
The most typical use of address_space is with regular files. In this
case the structure is embedded inside the inode (typically in the
i_data field) and accessed through the i_mapping pointer. It can also be
reached through the f_mapping pointer of the struct file passed
to read() and write().
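
The relationship between these pointers can be sketched in userspace C as follows. This is a simplified model, not the kernel definitions: the struct names (model_inode, model_file, model_address_space) and the helper functions are hypothetical, and only the field names host, i_data, i_mapping and f_mapping mirror the ones discussed here.

```c
#include <assert.h>
#include <stddef.h>

struct model_inode;

/* Simplified model of an address_space: it knows its owner ("host"). */
struct model_address_space {
    struct model_inode *host;
    unsigned long nrpages;          /* number of cached pages */
};

struct model_inode {
    unsigned long i_ino;
    struct model_address_space *i_mapping;  /* normally points at i_data */
    struct model_address_space i_data;      /* embedded in the inode */
};

/* Simplification: the real struct file reaches the inode indirectly. */
struct model_file {
    struct model_inode *inode;
    struct model_address_space *f_mapping;
    long long f_pos;
};

/* Set up an inode so that i_mapping refers to its own embedded i_data. */
static void model_inode_init(struct model_inode *inode, unsigned long ino)
{
    assert(inode != NULL);
    inode->i_ino = ino;
    inode->i_data.host = inode;
    inode->i_data.nrpages = 0;
    inode->i_mapping = &inode->i_data;
}

/* "Open" a file: the file caches the inode's mapping pointer. */
static void model_file_open(struct model_file *f, struct model_inode *inode)
{
    assert(f != NULL && inode != NULL);
    f->inode = inode;
    f->f_mapping = inode->i_mapping;
    f->f_pos = 0;
}
```

After model_file_open(), f->f_mapping and inode->i_mapping both lead to the same embedded i_data, whose host points back at the inode.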
Whenever you want to access a certain offset of a file, the procedure
goes as follows (skipping access permissions, file locking, reference
counting, spin locks, page flags and other issues):
1. Find the address_space of the file through the f_mapping pointer
   inside struct file.
2. Compute the "page index" of the data to read based on:
   - the memory page size (e.g. 4 KiB on i386)
   - the file pointer location (ppos) left by previous file operations
   - the offset specified in the syscall
3. Invoke find_get_page() on the address_space object, giving the index.
   This function is the actual page-cache lookup operation. It searches
   the page_tree member of struct address_space (a radix tree) and, if
   the page is present, returns a pointer to the corresponding
   struct page.
   If the page is not yet present, or is not up to date, a block
   I/O operation is scheduled.
   Let's assume that the page is present and up to date, so we have a
   struct page representing the page that holds the cached data.
   A few more steps are still needed to reach the actual data,
   although they fall outside address_space.
4. From the struct page, the "private" pointer leads to a
   circular singly linked list of struct buffer_head.
5. Each buffer head represents an I/O block (as opposed to a memory page)
   of data. The size of the block is stored in b_size.
   So we need to locate, inside the linked list, the I/O block(s)
   (buffer_head(s)) that match the data to read.
6. Once the corresponding buffer_head is selected, the ACTUAL DATA is
   available through the b_data pointer.
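
Steps 2 and 3 can be sketched in userspace C as below. This is only an illustration under assumptions: the page size is fixed at 4 KiB, the names (demo_page, demo_mapping, demo_find_page, ...) are hypothetical, and a small array with a linear search stands in for the radix tree that the kernel actually uses.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12                       /* assumed 4 KiB pages */
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)
#define DEMO_MAX_PAGES  8

struct demo_page {
    unsigned long index;    /* which page of the file this one caches */
    int uptodate;
};

struct demo_mapping {
    /* Stand-in for the radix tree: a tiny unsorted array. */
    struct demo_page *pages[DEMO_MAX_PAGES];
    size_t nrpages;
};

/* Step 2: file position -> page index and offset inside that page. */
static unsigned long demo_page_index(unsigned long long pos)
{
    return (unsigned long)(pos >> DEMO_PAGE_SHIFT);
}

static unsigned long demo_page_offset(unsigned long long pos)
{
    return (unsigned long)(pos & (DEMO_PAGE_SIZE - 1));
}

/* Step 3: look the page up by index; NULL means "not cached". */
static struct demo_page *demo_find_page(const struct demo_mapping *m,
                                        unsigned long index)
{
    size_t i;

    assert(m != NULL);
    for (i = 0; i < m->nrpages; i++)
        if (m->pages[i]->index == index)
            return m->pages[i];
    return NULL;
}

/* Insert a page; fails if the cache is full or the index is taken. */
static int demo_add_page(struct demo_mapping *m, struct demo_page *p)
{
    assert(m != NULL && p != NULL);
    if (m->nrpages == DEMO_MAX_PAGES || demo_find_page(m, p->index))
        return -1;
    m->pages[m->nrpages++] = p;
    return 0;
}
```

For example, file position 10000 falls in page index 2 at offset 1808 within that page, since 2 * 4096 = 8192 and 10000 - 8192 = 1808.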
In these 6 steps I have given you an overview of the roles of
- struct address_space
- struct page (yes, the same one used for memory pages everywhere in the
kernel)
- struct buffer_head
in filesystem access. Of course there is much more to say
about spin locks, reference counts, page flags and the like. If
you want to go deeper into the subject (it took me 3 weeks to understand it),
you can read chapters 12, 15 and 16 of the book Understanding the Linux
Kernel.
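
For steps 4 to 6, the walk over the buffer heads of one page could look roughly like the userspace sketch below. The field names b_this_page, b_size and b_data follow the kernel's struct buffer_head, but the struct itself (mini_bh) and both helper functions are hypothetical simplifications that assume all blocks of a page are laid out back to back.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of a buffer head describing one I/O block. */
struct mini_bh {
    struct mini_bh *b_this_page;  /* next buffer of the same page (circular) */
    size_t b_size;                /* size of the block in bytes */
    char *b_data;                 /* start of the block inside the page */
};

/* Carve a page into n equal blocks and link them into a circular list. */
static struct mini_bh *mini_attach_buffers(struct mini_bh *bhs, size_t n,
                                           char *page_data, size_t block_size)
{
    size_t i;

    assert(bhs != NULL && page_data != NULL && n > 0);
    for (i = 0; i < n; i++) {
        bhs[i].b_size = block_size;
        bhs[i].b_data = page_data + i * block_size;
        bhs[i].b_this_page = &bhs[(i + 1) % n];  /* last links to first */
    }
    return &bhs[0];
}

/* Find the buffer covering a byte offset within the page, or NULL. */
static struct mini_bh *mini_find_buffer(struct mini_bh *head, size_t offset)
{
    struct mini_bh *bh = head;
    size_t start = 0;             /* page offset where bh begins */

    assert(head != NULL);
    do {
        if (offset < start + bh->b_size)
            return bh;
        start += bh->b_size;
        bh = bh->b_this_page;
    } while (bh != head);         /* stop after one full lap */
    return NULL;
}
```

With a 4096-byte page split into four 1024-byte blocks, offset 2500 lands in the third buffer, and the byte itself is at b_data[2500 - 2048].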
Hoping it Helps,
Miguel
PS: Corrections welcome :)
--
-----------------------------------------------------------------------
Miguel TELLERIA DE ESTEBAN Grupo de computadores y tiempo real
telleriam ENSAIMADA unican.es Dept. Electrónica y Computadores
(change "ENSAIMADA" for @) Universidad de Cantabria
http://www.ctr.unican.es Tel trabajo: +34 942 201477
-----------------------------------------------------------------------