* avoiding dirty code pages with fixups
@ 2004-02-03 22:54 Andy Isaacson
  2004-02-07  0:13 ` Jamie Lokier
  0 siblings, 1 reply; 4+ messages in thread
From: Andy Isaacson @ 2004-02-03 22:54 UTC
  To: linux-kernel

The discussion of vsyscall got me thinking along a slightly different
line.  Since disks are *so* slow, and CPUs are so damn fast, the
tradeoff between computation and IO is different today than it was ten
years ago.  This means it can be fruitful to revisit issues which were
solved in a particular way, back then.

I think I just came up with a novel way to handle code fixups.  First, I
need a way to make a private "throwaway" copy of a page.  Define a new
mmap flag, MAP_SCRATCH.  This is sorta like MAP_PRIVATE, but instead of
writing the dirty page out to the swapfile, I want the page never to
leave RAM.  If the kernel decides to evict the page, it should just drop
it and invalidate the virtual address range.  When my program faults it
back in, provide me with the contents of the page *as they exist in the
backing file*.

Next, I need a way to read and write sizeof(void*) bytes atomically
(with respect to the above eviction protocol).  Ideally, all icache
reads and all aligned sizeof(void*) reads would be atomic WRT MAP_SCRATCH.


The technique is as follows.  Suppose we have a library which contains a
page which contains a jmp that needs a fixup.  Map the library with
MAP_SCRATCH, and have the un-fixed-up code contain a jmp to the
appropriate code in ld.so which will perform the fixup.  The first time
the code is executed, ld.so writes into the page, triggering a COW, and
cloning a scratch copy of the page, which can then be populated with all
the needed fixups.  Until the kernel evicts the page, further
invocations of the code will result in direct jumps to the fixed-up
addresses.

When the kernel evicts the page, all the fixups are lost.

When the app faults the page back in, the fixups are gone.  The fixup
code executes again, exactly as it did the first time the app invoked
that library page, giving correct results.
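
The fixup / evict / re-fixup cycle described above can be sketched in
user space.  This is only an approximation under stated assumptions:
MAP_SCRATCH does not exist, so the sketch uses an ordinary MAP_PRIVATE
file mapping and stands in for kernel eviction with
madvise(MADV_DONTNEED), which on Linux drops the private copy so that
the next access sees the backing file again.  The names
map_library_page, call_through, simulate_eviction, UNRESOLVED and
RESOLVED_TARGET are hypothetical, and an integer slot stands in for a
patched jmp target.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define UNRESOLVED      ((uintptr_t)0)   /* slot value stored in the backing file */
#define RESOLVED_TARGET ((uintptr_t)42)  /* hypothetical stand-in for a fixed-up address */

static int resolver_calls;               /* how many times the slow path ran */

/* Create a one-page temporary "library" file (zero-filled, so the slot
 * starts out UNRESOLVED) and map it MAP_PRIVATE.  Returns NULL on error. */
static uintptr_t *map_library_page(void)
{
    char path[] = "/tmp/scratch-demo-XXXXXX";
    long pagesz = sysconf(_SC_PAGESIZE);
    int fd = mkstemp(path);
    if (fd < 0)
        return NULL;
    unlink(path);                        /* file lives only as long as the mapping */
    if (ftruncate(fd, pagesz) != 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : (uintptr_t *)p;
}

/* "Call" through the slot.  If the page holds the pristine file contents,
 * run the resolver and patch the slot with one aligned pointer-sized
 * store (this write triggers the COW).  Otherwise take the fast path. */
static uintptr_t call_through(volatile uintptr_t *slot)
{
    if (*slot == UNRESOLVED) {
        resolver_calls++;
        *slot = RESOLVED_TARGET;
    }
    return *slot;
}

/* Stand-in for the kernel evicting the scratch page: discard the private
 * copy so the next access faults the file contents back in. */
static int simulate_eviction(void *page)
{
    return madvise(page, (size_t)sysconf(_SC_PAGESIZE), MADV_DONTNEED);
}
```

Calling call_through() twice runs the resolver once; after
simulate_eviction() the slot reads as UNRESOLVED again and the next call
re-runs the resolver, which is the self-healing property the scheme
relies on.  The sketch says nothing about the atomicity requirement,
which would need kernel support.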

The benefit over standard MAP_PRIVATE fixups is that we can avoid IO
when many processes have fixups on the same page of libc.so (for
example).  With the current MAP_PRIVATE scheme, the page is written out
to swap *once for every process that has it fixed-up*.  With
MAP_SCRATCH, the page is never written out to swap.  Instead it's read
off the filesystem, once, when the first process faults it in; the
cached copy is used for every process that needs it.

The downside is the additional computation on page-in.  It is a function
of how many fixups there are per page, and of how much work ld.so does
to satisfy a fixup.  I don't have a good feel for how expensive ld.so's
fixup mechanism is... any comments?

(Now I just need to find somebody to code this.)

-andy


* Re: avoiding dirty code pages with fixups
  2004-02-03 22:54 avoiding dirty code pages with fixups Andy Isaacson
@ 2004-02-07  0:13 ` Jamie Lokier
  2004-02-07  3:03   ` Andy Isaacson
  0 siblings, 1 reply; 4+ messages in thread
From: Jamie Lokier @ 2004-02-07  0:13 UTC
  To: Andy Isaacson; +Cc: linux-kernel

Andy Isaacson wrote:
> This is sorta like MAP_PRIVATE, but instead of writing the dirty
> page out to the swapfile, I want the page never to leave RAM.  If
> the kernel decides to evict the page, it should just drop it and
> invalidate the virtual address range.  When my program faults it
> back in, provide me with the contents of the page *as they exist in
> the backing file*.

That idea has come up about a thousand million times.  Well, three.
It's a good one :)

It has lots of uses, not just the one you describe.  For example,
caching generated image data.

It would also be nice for a memory allocator to be able to convert a
region from MAP_PRIVATE to MAP_SCRATCH and back, so that freed blocks
of memory can be reclaimed by the system but only when there is memory
pressure.
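
A rough sketch of the allocator use case, under assumptions: there is
no MAP_SCRATCH to convert to, so this uses madvise(MADV_FREE), which
Linux gained later (4.5) and which lets the kernel reclaim private
anonymous pages lazily under memory pressure.  It is only an
approximation of the conversion described above (writing to the block
again cancels the pending free, which plays the role of converting
back).  The names arena_alloc and arena_mark_reclaimable are
hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Allocate a private anonymous region for a hypothetical allocator arena.
 * Returns NULL on failure. */
static void *arena_alloc(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Tell the kernel a freed block may be reclaimed when memory is tight.
 * Until reclaim happens the old contents may still be visible; after
 * reclaim the pages read as zeros.  Falls back to MADV_DONTNEED (which
 * drops the pages immediately) where MADV_FREE is unavailable. */
static int arena_mark_reclaimable(void *block, size_t len)
{
#ifdef MADV_FREE
    if (madvise(block, len, MADV_FREE) == 0)
        return 0;
    if (errno != EINVAL)
        return -1;
#endif
    return madvise(block, len, MADV_DONTNEED);
}
```

After arena_mark_reclaimable() the allocator must treat the block's
contents as undefined (old data or zeros) and simply write to it to
start using it again.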

> The downside is the additional computation on page-in.

> It is a function of how many fixups there are per page, and of how
> much work ld.so does to satisfy a fixup.  I don't have a good feel
> for how expensive ld.so's fixup mechanism is... any comments?

The other downside of your idea is that every instance of a program
has more dirty pages.  While it is true that the pages do not require
disk I/O, they still take up RAM that could be used for other page
cache things.

-- Jamie


* Re: avoiding dirty code pages with fixups
  2004-02-07  0:13 ` Jamie Lokier
@ 2004-02-07  3:03   ` Andy Isaacson
  2004-02-07  3:33     ` Jamie Lokier
  0 siblings, 1 reply; 4+ messages in thread
From: Andy Isaacson @ 2004-02-07  3:03 UTC
  To: Jamie Lokier; +Cc: linux-kernel

On Sat, Feb 07, 2004 at 12:13:17AM +0000, Jamie Lokier wrote:
> > The downside is the additional computation on page-in.
> 
> > It is a function of how many fixups there are per page, and of how
> > much work ld.so does to satisfy a fixup.  I don't have a good feel
> > for how expensive ld.so's fixup mechanism is... any comments?
> 
> The other downside of your idea is that every instance of a program
> has more dirty pages.  While it is true that the pages do not require
> disk I/O, they still take up RAM that could be used for other page
> cache things.

Well, in the case I describe, currently they're done with MAP_PRIVATE
mappings, so it's no net loss.

-andy


* Re: avoiding dirty code pages with fixups
  2004-02-07  3:03   ` Andy Isaacson
@ 2004-02-07  3:33     ` Jamie Lokier
  0 siblings, 0 replies; 4+ messages in thread
From: Jamie Lokier @ 2004-02-07  3:33 UTC
  To: Andy Isaacson; +Cc: linux-kernel

Andy Isaacson wrote:
> On Sat, Feb 07, 2004 at 12:13:17AM +0000, Jamie Lokier wrote:
> > > The downside is the additional computation on page-in.
> > 
> > > It is a function of how many fixups there are per page, and of how
> > > much work ld.so does to satisfy a fixup.  I don't have a good feel
> > > for how expensive ld.so's fixup mechanism is... any comments?
> > 
> > The other downside of your idea is that every instance of a program
> > has more dirty pages.  While it is true that the pages do not require
> > disk I/O, they still take up RAM that could be used for other page
> > cache things.
> 
> Well, in the case I describe, currently they're done with MAP_PRIVATE
> mappings, so it's no net loss.

Ok, that's a good point.

When you brought it up in the context of our vsyscall fla^H^H^Hdebate,
I assumed you meant to use this as a technique to help fixing up more
code pointers at run time, to convert indirect jumps to direct ones.
That does dirty more pages.

Your idea of the reverted pages conveniently containing the right code
to get them patched again is quite clever, imho.

-- Jamie
