public inbox for linux-kernel@vger.kernel.org
From: Daniel Phillips <phillips@arcor.de>
To: Andrew Morton <akpm@zip.com.au>,
	Linus Torvalds <torvalds@transmeta.com>,
	Marcelo Tosatti <marcelo@conectiva.com.br>
Cc: linux-kernel@vger.kernel.org, Christian Ehrhardt <ulcae@in-ulm.de>
Subject: [RFC] Alternative raceless page free
Date: Thu, 5 Sep 2002 06:42:12 +0200
Message-ID: <E17moT6-00064X-00@starship>
In-Reply-To: <E17kunE-0003IO-00@starship>

For completeness, I implemented the atomic_dec_and_lock version of raceless 
page freeing suggested by Manfred Spraul.  The atomic_dec_and_lock approach 
eliminates the free race by ensuring that when a page's count drops to zero 
the lru list lock is taken atomically, leaving no window where the page can 
also be found and manipulated on the lru list.[1]  Both this and the 
extra-lru-count version are supported in the linked patch:

   http://people.nl.linux.org/~phillips/patches/lru.race-2.4.19-2
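
A minimal userspace sketch of the dec-and-lock idea, for readers who want
to see the shape of it.  This is not code from the patch: struct page, the
sketch_* lock helpers, dec_and_lock and put_page_sketch are hypothetical
stand-ins built on C11 atomics, and the "free" is just a flag.  The point
is only that the zero transition and the lock acquisition happen together.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel's types or lru lock. */
struct sketch_lock { atomic_flag held; };

static void sketch_lock(struct sketch_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* spin until the holder clears the flag */
}

static bool sketch_trylock(struct sketch_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void sketch_unlock(struct sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct page {
    atomic_int count;   /* reference count */
    bool on_lru;        /* true while linked on the (imaginary) lru list */
    bool freed;         /* set once the page has been handed back */
};

static struct sketch_lock lru_lock = { ATOMIC_FLAG_INIT };

/*
 * Decrement *count.  Returns true with the lock held if the count reached
 * zero, otherwise returns false with the lock not held.
 */
static bool dec_and_lock(atomic_int *count, struct sketch_lock *lock)
{
    int old = atomic_load(count);

    /* Fast path: while the count stays above one, drop it with a cmpxchg. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return false;
        /* On failure 'old' holds the current value; retry or fall through. */
    }

    /* Slow path: the count may reach zero, so decide under the lock. */
    sketch_lock(lock);
    if (atomic_fetch_sub(count, 1) == 1)
        return true;            /* caller now owns the lock */
    sketch_unlock(lock);
    return false;
}

/* Drop one reference.  Must not be called with lru_lock held: it would deadlock. */
static void put_page_sketch(struct page *page)
{
    if (!dec_and_lock(&page->count, &lru_lock))
        return;
    page->on_lru = false;       /* unlink while nobody else can find the page */
    sketch_unlock(&lru_lock);
    page->freed = true;         /* stands in for returning the page to the allocator */
}
```

The fast path is the "optimized" form: a cmpxchg loop that never touches
the lock while other references remain.  Dropping the fast path and always
taking the lock gives the unoptimized form.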

The atomic_dec_and_lock version is slightly simpler, but was actually more 
work to implement because of the need to locate and eliminate all uses of 
page_cache_release where the lru lock is known to be held, as these will 
deadlock.  That had the side effect of eliminating a number of ifdefs vs the 
lru count version, and rooting out some hidden redundancy.

The patch exposes __free_pages_ok, which must be called directly by the 
atomic_dec_and_lock variant.  In the process it got a less confusing name - 
recover_pages.  (The incumbent name is confusing because all other 'free' 
variants also manipulate the page count.)

It's a close call which version is faster.  I suspect the atomic_dec_and_lock 
version will not scale quite as well because of the bus-locked cmpxchg on the
page count (optimized version; unoptimized version always takes the spinlock) 
but neither version really lacks in the speed department.

I have a slight preference for the extra-lru-count version, because of the 
trylock in page_cache_release.  This means that nobody will have to spin when 
shrink_cache is active.  Instead, freed pages that collide with the lru lock 
can just be left on the lru list to be picked up efficiently later.  The 
trylock also allows the lru lock to be acquired speculatively from interrupt 
context, without a requirement that lru lock holders disable interrupts.  
Both versions are provably correct, modulo implementation gaffes.
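
A rough userspace sketch of the trylock behaviour described above, under
loudly stated assumptions: every page sits on the lru, lru membership
holds exactly one count, and new references to an otherwise unused page
can only be taken under the lru lock.  None of this is taken from the
patch; struct page, the sketch_* helpers, reap_locked, release_sketch and
scan_sketch are hypothetical names, and scan_sketch merely stands in for
whatever later pass picks up the leftovers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's types or lru lock. */
struct sketch_lock { atomic_flag held; };

static void sketch_lock(struct sketch_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* spin until the holder clears the flag */
}

static bool sketch_trylock(struct sketch_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void sketch_unlock(struct sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct page {
    atomic_int count;   /* users + 1 for lru membership (assumption) */
    bool on_lru;
    bool freed;
};

static struct sketch_lock lru_lock = { ATOMIC_FLAG_INIT };

/* Caller holds lru_lock.  Frees the page if only the lru count remains. */
static bool reap_locked(struct page *page)
{
    int expected = 1;

    if (!page->on_lru ||
        !atomic_compare_exchange_strong(&page->count, &expected, 0))
        return false;
    page->on_lru = false;
    page->freed = true;     /* stands in for returning the page to the allocator */
    return true;
}

/* Drop one user reference without ever spinning on the lru lock. */
static void release_sketch(struct page *page)
{
    if (atomic_fetch_sub(&page->count, 1) != 2)
        return;             /* other users remain */
    if (!sketch_trylock(&lru_lock))
        return;             /* collision: leave the page on the lru for later */
    reap_locked(page);      /* rechecks the count under the lock */
    sketch_unlock(&lru_lock);
}

/* Later pass over the lru that frees pages left behind by collisions. */
static int scan_sketch(struct page **pages, size_t n)
{
    int reaped = 0;

    sketch_lock(&lru_lock);
    for (size_t i = 0; i < n; i++)
        reaped += reap_locked(pages[i]);
    sketch_unlock(&lru_lock);
    return reaped;
}
```

Because release_sketch only ever uses trylock, it cannot deadlock against
a lock holder it interrupts; the worst case is that the page stays on the
lru a little longer.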

The linked patch defaults to the atomic_dec_and_lock version.  To change to
the extra count version, define LRU_PLUS_CACHE as 2 instead of 1.

Christian, can you please run this one through your race detector?

[1] As a corollary, pages with zero count can never be found on the lru list, 
so that is treated as a bug.  

-- 
Daniel
