From: Minchan Kim <minchan.kim@gmail.com>
To: "Jörn Engel" <joern@logfs.org>
Cc: Evgeniy Polyakov <zbr@ioremap.net>,
"Robert P. J. Day" <rpjday@crashcourse.ca>,
linux-mm@kvack.org
Subject: Re: why are some low-level MM routines being exported?
Date: Mon, 5 Apr 2010 17:26:58 +0900 [thread overview]
Message-ID: <m2w28c262361004050126mbcbed77cha6f1085394802cb2@mail.gmail.com> (raw)
In-Reply-To: <20100405071344.GC23515@logfs.org>
On Mon, Apr 5, 2010 at 4:13 PM, Jörn Engel <joern@logfs.org> wrote:
> On Mon, 5 April 2010 15:20:36 +0900, Minchan Kim wrote:
>>
>> Previously I said, what I have a concern is that if file systems or
>> some modules abuses
>> add_to_page_cache_lru, it might system LRU list wrong so then system
>> go to hell.
>> Of course, if we use it carefully, it can be good but how do you make sure it?
>
> Having access to the source code means you only have to read all
> callers. This is not java, we don't have to add layers of anti-abuse
> wrappers. We can simply flame the first offender to a crisp. :)
>
>> I am not a file system expert but as I read comment of read_cache_pages
>> "Hides the details of the LRU cache etc from the filesystem", I
>> thought it is not good that
>> file system handle LRU list directly. At least, we have been trying for years.
>
> Only speaking for logfs, I need some variant of find_or_create_page
> where I can replace lock_page() with a custom function. Whether that
> function lives in fs/logfs/ or mm/filemap.c doesn't matter much.
>
> What we could do something roughly like the patch below, at least
> semantically. I know the patch is crap in its current form, but it
> illustrates the general idea.
>
> Jörn
>
> --
> The key to performance is elegance, not battalions of special cases.
> -- Jon Bentley and Doug McIlroy
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 045b31c..6d452eb 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -646,27 +646,19 @@ repeat:
> }
> EXPORT_SYMBOL(find_get_page);
>
> -/**
> - * find_lock_page - locate, pin and lock a pagecache page
> - * @mapping: the address_space to search
> - * @offset: the page index
> - *
> - * Locates the desired pagecache page, locks it, increments its reference
> - * count and returns its address.
> - *
> - * Returns zero if the page was not present. find_lock_page() may sleep.
> - */
> -struct page *find_lock_page(struct address_space *mapping, pgoff_t offset)
> +static struct page *__find_lock_page(struct address_space *mapping,
> + pgoff_t offset, void(*lock)(struct page *),
> + void(*unlock)(struct page *))
> {
> struct page *page;
>
> repeat:
> page = find_get_page(mapping, offset);
> if (page) {
> - lock_page(page);
> + lock(page);
> /* Has the page been truncated? */
> if (unlikely(page->mapping != mapping)) {
> - unlock_page(page);
> + unlock(page);
> page_cache_release(page);
> goto repeat;
> }
> @@ -674,32 +666,31 @@ repeat:
> }
> return page;
> }
> -EXPORT_SYMBOL(find_lock_page);
>
> /**
> - * find_or_create_page - locate or add a pagecache page
> - * @mapping: the page's address_space
> - * @index: the page's index into the mapping
> - * @gfp_mask: page allocation mode
> - *
> - * Locates a page in the pagecache. If the page is not present, a new page
> - * is allocated using @gfp_mask and is added to the pagecache and to the VM's
> - * LRU list. The returned page is locked and has its reference count
> - * incremented.
> + * find_lock_page - locate, pin and lock a pagecache page
> + * @mapping: the address_space to search
> + * @offset: the page index
> *
> - * find_or_create_page() may sleep, even if @gfp_flags specifies an atomic
> - * allocation!
> + * Locates the desired pagecache page, locks it, increments its reference
> + * count and returns its address.
> *
> - * find_or_create_page() returns the desired page's address, or zero on
> - * memory exhaustion.
> + * Returns zero if the page was not present. find_lock_page() may sleep.
> */
> -struct page *find_or_create_page(struct address_space *mapping,
> - pgoff_t index, gfp_t gfp_mask)
> +struct page *find_lock_page(struct address_space *mapping, pgoff_t offset)
> +{
> + return __find_lock_page(mapping, offset, lock_page, unlock_page);
> +}
> +EXPORT_SYMBOL(find_lock_page);
> +
> +static struct page *__find_or_create_page(struct address_space *mapping,
> + pgoff_t index, gfp_t gfp_mask, void(*lock)(struct page *),
> + void(*unlock)(struct page *))
> {
> struct page *page;
> int err;
> repeat:
> - page = find_lock_page(mapping, index);
> + page = __find_lock_page(mapping, index, lock, unlock);
> if (!page) {
> page = __page_cache_alloc(gfp_mask);
> if (!page)
> @@ -721,6 +712,31 @@ repeat:
> }
> return page;
> }
> +EXPORT_SYMBOL(__find_or_create_page);
> +
> +/**
> + * find_or_create_page - locate or add a pagecache page
> + * @mapping: the page's address_space
> + * @index: the page's index into the mapping
> + * @gfp_mask: page allocation mode
> + *
> + * Locates a page in the pagecache. If the page is not present, a new page
> + * is allocated using @gfp_mask and is added to the pagecache and to the VM's
> + * LRU list. The returned page is locked and has its reference count
> + * incremented.
> + *
> + * find_or_create_page() may sleep, even if @gfp_flags specifies an atomic
> + * allocation!
> + *
> + * find_or_create_page() returns the desired page's address, or zero on
> + * memory exhaustion.
> + */
> +struct page *find_or_create_page(struct address_space *mapping,
> + pgoff_t index, gfp_t gfp_mask)
> +{
> + return __find_or_create_page(mapping, index, gfp_mask, lock_page,
> + unlock_page);
> +}
> EXPORT_SYMBOL(find_or_create_page);
>
> /**
>
Seem to be not bad idea. :)
But we have to justify new interface before. For doing it, we have to say
why we can't do it by current functions(find_get_page,
add_to_page_cache and pagevec_lru_add_xxx)
Pagevec_lru_add_xxx does batch so that it can reduce calling path and
some overhead(ex, page_is_file_cache comparison,
get/put_cpu_var(lru_add_pvecs)).
At least, it would be rather good than old for performance.
--
Kind regards,
Minchan Kim
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2010-04-05 8:27 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-04 15:27 why are some low-level MM routines being exported? Robert P. J. Day
2010-04-04 15:59 ` Minchan Kim
2010-04-04 16:03 ` Evgeniy Polyakov
2010-04-04 16:17 ` Minchan Kim
2010-04-04 16:21 ` Minchan Kim
2010-04-04 18:15 ` Evgeniy Polyakov
2010-04-05 0:36 ` Minchan Kim
2010-04-05 12:47 ` Evgeniy Polyakov
2010-04-05 14:31 ` Minchan Kim
2010-04-04 19:55 ` Jörn Engel
2010-04-05 0:59 ` Minchan Kim
2010-04-05 5:30 ` Jörn Engel
2010-04-05 6:20 ` Minchan Kim
2010-04-05 6:22 ` Minchan Kim
2010-04-05 7:13 ` Jörn Engel
2010-04-05 8:26 ` Minchan Kim [this message]
2010-04-05 11:19 ` Jörn Engel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m2w28c262361004050126mbcbed77cha6f1085394802cb2@mail.gmail.com \
--to=minchan.kim@gmail.com \
--cc=joern@logfs.org \
--cc=linux-mm@kvack.org \
--cc=rpjday@crashcourse.ca \
--cc=zbr@ioremap.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).