linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Minchan Kim <minchan.kim@gmail.com>
To: "Jörn Engel" <joern@logfs.org>
Cc: Evgeniy Polyakov <zbr@ioremap.net>,
	"Robert P. J. Day" <rpjday@crashcourse.ca>,
	linux-mm@kvack.org
Subject: Re: why are some low-level MM routines being exported?
Date: Mon, 5 Apr 2010 17:26:58 +0900	[thread overview]
Message-ID: <m2w28c262361004050126mbcbed77cha6f1085394802cb2@mail.gmail.com> (raw)
In-Reply-To: <20100405071344.GC23515@logfs.org>

On Mon, Apr 5, 2010 at 4:13 PM, Jörn Engel <joern@logfs.org> wrote:
> On Mon, 5 April 2010 15:20:36 +0900, Minchan Kim wrote:
>>
>> Previously I said, what I have a concern is that if file systems or
>> some modules abuses
>> add_to_page_cache_lru, it might system LRU list wrong so then system
>> go to hell.
>> Of course, if we use it carefully, it can be good but how do you make sure it?
>
> Having access to the source code means you only have to read all
> callers.  This is not java, we don't have to add layers of anti-abuse
> wrappers.  We can simply flame the first offender to a crisp. :)
>
>> I am not a file system expert but as I read comment of read_cache_pages
>> "Hides the details of the LRU cache etc from the filesystem", I
>> thought it is not good that
>> file system handle LRU list directly. At least, we have been trying for years.
>
> Only speaking for logfs, I need some variant of find_or_create_page
> where I can replace lock_page() with a custom function.  Whether that
> function lives in fs/logfs/ or mm/filemap.c doesn't matter much.
>
> What we could do something roughly like the patch below, at least
> semantically.  I know the patch is crap in its current form, but it
> illustrates the general idea.
>
> Jörn
>
> --
> The key to performance is elegance, not battalions of special cases.
> -- Jon Bentley and Doug McIlroy
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 045b31c..6d452eb 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -646,27 +646,19 @@ repeat:
>  }
>  EXPORT_SYMBOL(find_get_page);
>
> -/**
> - * find_lock_page - locate, pin and lock a pagecache page
> - * @mapping: the address_space to search
> - * @offset: the page index
> - *
> - * Locates the desired pagecache page, locks it, increments its reference
> - * count and returns its address.
> - *
> - * Returns zero if the page was not present. find_lock_page() may sleep.
> - */
> -struct page *find_lock_page(struct address_space *mapping, pgoff_t offset)
> +static struct page *__find_lock_page(struct address_space *mapping,
> +               pgoff_t offset, void(*lock)(struct page *),
> +               void(*unlock)(struct page *))
>  {
>        struct page *page;
>
>  repeat:
>        page = find_get_page(mapping, offset);
>        if (page) {
> -               lock_page(page);
> +               lock(page);
>                /* Has the page been truncated? */
>                if (unlikely(page->mapping != mapping)) {
> -                       unlock_page(page);
> +                       unlock(page);
>                        page_cache_release(page);
>                        goto repeat;
>                }
> @@ -674,32 +666,31 @@ repeat:
>        }
>        return page;
>  }
> -EXPORT_SYMBOL(find_lock_page);
>
>  /**
> - * find_or_create_page - locate or add a pagecache page
> - * @mapping: the page's address_space
> - * @index: the page's index into the mapping
> - * @gfp_mask: page allocation mode
> - *
> - * Locates a page in the pagecache.  If the page is not present, a new page
> - * is allocated using @gfp_mask and is added to the pagecache and to the VM's
> - * LRU list.  The returned page is locked and has its reference count
> - * incremented.
> + * find_lock_page - locate, pin and lock a pagecache page
> + * @mapping: the address_space to search
> + * @offset: the page index
>  *
> - * find_or_create_page() may sleep, even if @gfp_flags specifies an atomic
> - * allocation!
> + * Locates the desired pagecache page, locks it, increments its reference
> + * count and returns its address.
>  *
> - * find_or_create_page() returns the desired page's address, or zero on
> - * memory exhaustion.
> + * Returns zero if the page was not present. find_lock_page() may sleep.
>  */
> -struct page *find_or_create_page(struct address_space *mapping,
> -               pgoff_t index, gfp_t gfp_mask)
> +struct page *find_lock_page(struct address_space *mapping, pgoff_t offset)
> +{
> +       return __find_lock_page(mapping, offset, lock_page, unlock_page);
> +}
> +EXPORT_SYMBOL(find_lock_page);
> +
> +static struct page *__find_or_create_page(struct address_space *mapping,
> +               pgoff_t index, gfp_t gfp_mask, void(*lock)(struct page *),
> +               void(*unlock)(struct page *))
>  {
>        struct page *page;
>        int err;
>  repeat:
> -       page = find_lock_page(mapping, index);
> +       page = __find_lock_page(mapping, index, lock, unlock);
>        if (!page) {
>                page = __page_cache_alloc(gfp_mask);
>                if (!page)
> @@ -721,6 +712,31 @@ repeat:
>        }
>        return page;
>  }
> +EXPORT_SYMBOL(__find_or_create_page);
> +
> +/**
> + * find_or_create_page - locate or add a pagecache page
> + * @mapping: the page's address_space
> + * @index: the page's index into the mapping
> + * @gfp_mask: page allocation mode
> + *
> + * Locates a page in the pagecache.  If the page is not present, a new page
> + * is allocated using @gfp_mask and is added to the pagecache and to the VM's
> + * LRU list.  The returned page is locked and has its reference count
> + * incremented.
> + *
> + * find_or_create_page() may sleep, even if @gfp_flags specifies an atomic
> + * allocation!
> + *
> + * find_or_create_page() returns the desired page's address, or zero on
> + * memory exhaustion.
> + */
> +struct page *find_or_create_page(struct address_space *mapping,
> +               pgoff_t index, gfp_t gfp_mask)
> +{
> +       return __find_or_create_page(mapping, index, gfp_mask, lock_page,
> +                       unlock_page);
> +}
>  EXPORT_SYMBOL(find_or_create_page);
>
>  /**
>

Seem to be not bad idea. :)
But we have to justify new interface before. For doing it, we have to say
why we can't do it by current functions(find_get_page,
add_to_page_cache and pagevec_lru_add_xxx)

Pagevec_lru_add_xxx does batch so that it can reduce calling path and
some overhead(ex, page_is_file_cache comparison,
get/put_cpu_var(lru_add_pvecs)).

At least, it would be rather good than old for performance.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2010-04-05  8:27 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-04 15:27 why are some low-level MM routines being exported? Robert P. J. Day
2010-04-04 15:59 ` Minchan Kim
2010-04-04 16:03   ` Evgeniy Polyakov
2010-04-04 16:17     ` Minchan Kim
2010-04-04 16:21     ` Minchan Kim
2010-04-04 18:15       ` Evgeniy Polyakov
2010-04-05  0:36         ` Minchan Kim
2010-04-05 12:47           ` Evgeniy Polyakov
2010-04-05 14:31             ` Minchan Kim
2010-04-04 19:55       ` Jörn Engel
2010-04-05  0:59         ` Minchan Kim
2010-04-05  5:30           ` Jörn Engel
2010-04-05  6:20             ` Minchan Kim
2010-04-05  6:22               ` Minchan Kim
2010-04-05  7:13               ` Jörn Engel
2010-04-05  8:26                 ` Minchan Kim [this message]
2010-04-05 11:19                   ` Jörn Engel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m2w28c262361004050126mbcbed77cha6f1085394802cb2@mail.gmail.com \
    --to=minchan.kim@gmail.com \
    --cc=joern@logfs.org \
    --cc=linux-mm@kvack.org \
    --cc=rpjday@crashcourse.ca \
    --cc=zbr@ioremap.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).