* [PATCH] ext3: avoid sending down non-refcounted pages
@ 2005-12-08 9:09 FUJITA Tomonori
2005-12-08 10:18 ` Andreas Dilger
0 siblings, 1 reply; 13+ messages in thread
From: FUJITA Tomonori @ 2005-12-08 9:09 UTC (permalink / raw)
To: michaelc, hch, linux-fsdevel, ext2-devel, open-iscsi
If file systems don't send down non-refcounted pages, it makes life
much easier for open-iscsi because it uses tcp_sendpage.
Can we reach agreement?
Christoph said that he'll take care of xfs. This patch makes ext3 use
normal pages instead of kmalloc'ed pages.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c
index 002ad2b..01b8a6a 100644
--- a/fs/jbd/commit.c
+++ b/fs/jbd/commit.c
@@ -261,7 +261,7 @@ void journal_commit_transaction(journal_
struct buffer_head *bh = jh2bh(jh);
jbd_lock_bh_state(bh);
- kfree(jh->b_committed_data);
+ free_page((unsigned long) jh->b_committed_data);
jh->b_committed_data = NULL;
jbd_unlock_bh_state(bh);
}
@@ -745,14 +745,14 @@ restart_loop:
* Otherwise, we can just throw away the frozen data now.
*/
if (jh->b_committed_data) {
- kfree(jh->b_committed_data);
+ free_page((unsigned long) jh->b_committed_data);
jh->b_committed_data = NULL;
if (jh->b_frozen_data) {
jh->b_committed_data = jh->b_frozen_data;
jh->b_frozen_data = NULL;
}
} else if (jh->b_frozen_data) {
- kfree(jh->b_frozen_data);
+ free_page((unsigned long) jh->b_frozen_data);
jh->b_frozen_data = NULL;
}
diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c
index e4b516a..cab8c31 100644
--- a/fs/jbd/journal.c
+++ b/fs/jbd/journal.c
@@ -328,10 +328,10 @@ repeat:
char *tmp;
jbd_unlock_bh_state(bh_in);
- tmp = jbd_rep_kmalloc(bh_in->b_size, GFP_NOFS);
+ tmp = (char *) __get_free_page(GFP_NOFS | __GFP_NOFAIL);
jbd_lock_bh_state(bh_in);
if (jh_in->b_frozen_data) {
- kfree(tmp);
+ free_page((unsigned long) tmp);
goto repeat;
}
@@ -1799,13 +1799,13 @@ static void __journal_remove_journal_hea
printk(KERN_WARNING "%s: freeing "
"b_frozen_data\n",
__FUNCTION__);
- kfree(jh->b_frozen_data);
+ free_page((unsigned long) jh->b_frozen_data);
}
if (jh->b_committed_data) {
printk(KERN_WARNING "%s: freeing "
"b_committed_data\n",
__FUNCTION__);
- kfree(jh->b_committed_data);
+ free_page((unsigned long) jh->b_committed_data);
}
bh->b_private = NULL;
jh->b_bh = NULL; /* debug, really */
diff --git a/fs/jbd/transaction.c b/fs/jbd/transaction.c
index 429f4b2..16e741d 100644
--- a/fs/jbd/transaction.c
+++ b/fs/jbd/transaction.c
@@ -665,8 +665,8 @@ repeat:
if (!frozen_buffer) {
JBUFFER_TRACE(jh, "allocate memory for buffer");
jbd_unlock_bh_state(bh);
- frozen_buffer = jbd_kmalloc(jh2bh(jh)->b_size,
- GFP_NOFS);
+ frozen_buffer =
+ (char *) jbd_get_free_page(GFP_NOFS);
if (!frozen_buffer) {
printk(KERN_EMERG
"%s: OOM for frozen_buffer\n",
@@ -724,7 +724,8 @@ done:
journal_cancel_revoke(handle, jh);
out:
- kfree(frozen_buffer);
+ if (frozen_buffer)
+ free_page((unsigned long) frozen_buffer);
JBUFFER_TRACE(jh, "exit");
return error;
@@ -877,7 +878,7 @@ int journal_get_undo_access(handle_t *ha
repeat:
if (!jh->b_committed_data) {
- committed_data = jbd_kmalloc(jh2bh(jh)->b_size, GFP_NOFS);
+ committed_data = (char *) jbd_get_free_page(GFP_NOFS);
if (!committed_data) {
printk(KERN_EMERG "%s: No memory for committed data\n",
__FUNCTION__);
@@ -903,7 +904,8 @@ repeat:
jbd_unlock_bh_state(bh);
out:
journal_put_journal_head(jh);
- kfree(committed_data);
+ if (committed_data)
+ free_page((unsigned long) committed_data);
return err;
}
diff --git a/include/linux/jbd.h b/include/linux/jbd.h
index dcde7ad..5b72dc8 100644
--- a/include/linux/jbd.h
+++ b/include/linux/jbd.h
@@ -70,8 +70,8 @@ extern int journal_enable_debug;
extern void * __jbd_kmalloc (const char *where, size_t size, gfp_t flags, int retry);
#define jbd_kmalloc(size, flags) \
__jbd_kmalloc(__FUNCTION__, (size), (flags), journal_oom_retry)
-#define jbd_rep_kmalloc(size, flags) \
- __jbd_kmalloc(__FUNCTION__, (size), (flags), 1)
+#define jbd_get_free_page(flags) \
+ __get_free_page((flags) | (journal_oom_retry ? __GFP_NOFAIL : 0))
#define JFS_MIN_JOURNAL_BLOCKS 1024
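One detail worth checking in a macro like jbd_get_free_page: in C, `|` binds more tightly than `?:`, so the conditional needs its own parentheses or the caller's flags are swallowed by the test. Below is a minimal userspace sketch of the two groupings. The DEMO_GFP_* values and both function names are made up for illustration and are not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical stand-in values; the real kernel flag values differ. */
#define DEMO_GFP_NOFS   0x10u
#define DEMO_GFP_NOFAIL 0x800u

/*
 * How the compiler groups "flags | retry ? NOFAIL : 0":
 * the OR happens first and its result is only used as the condition,
 * so the caller's flags never reach the result.
 */
unsigned int flags_as_parsed(unsigned int flags, int retry)
{
	return (flags | (unsigned int)retry) ? DEMO_GFP_NOFAIL : 0u;
}

/* Intended grouping: keep the flags, OR in NOFAIL only when retry is set. */
unsigned int flags_intended(unsigned int flags, int retry)
{
	assert(retry == 0 || retry == 1);
	return flags | (retry ? DEMO_GFP_NOFAIL : 0u);
}
```

With non-zero flags, the first grouping yields DEMO_GFP_NOFAIL alone regardless of retry; the second preserves DEMO_GFP_NOFS in both cases.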
* Re: [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-08 9:09 [PATCH] ext3: avoid sending down non-refcounted pages FUJITA Tomonori
@ 2005-12-08 10:18 ` Andreas Dilger
2005-12-08 12:39 ` [Ext2-devel] " FUJITA Tomonori
2005-12-08 13:42 ` allowed pages in the block later, was " Christoph Hellwig
0 siblings, 2 replies; 13+ messages in thread
From: Andreas Dilger @ 2005-12-08 10:18 UTC (permalink / raw)
To: FUJITA Tomonori; +Cc: michaelc, hch, linux-fsdevel, ext2-devel, open-iscsi
On Dec 08, 2005 18:09 +0900, FUJITA Tomonori wrote:
> If file systems don't send down non-refcounted pages, it makes life
> much easier for open-iscsi because it uses tcp_sendpage.
>
> Christoph said that he'll take care of xfs. This patch makes ext3 use
> normal pages instead of kmalloc'ed pages.
>
> --- a/fs/jbd/transaction.c
> +++ b/fs/jbd/transaction.c
> - frozen_buffer = jbd_kmalloc(jh2bh(jh)->b_size,
> - GFP_NOFS);
> + frozen_buffer =
> + (char *) jbd_get_free_page(GFP_NOFS);
> @@ -877,7 +878,7 @@ int journal_get_undo_access(handle_t *ha
> - committed_data = jbd_kmalloc(jh2bh(jh)->b_size, GFP_NOFS);
> + committed_data = (char *) jbd_get_free_page(GFP_NOFS);
What happens on 1kB or 2kB block filesystems (i.e. b_size != PAGE_SIZE)?
This will allocate a whole page for each block (which may be considerable
overhead on e.g. a 64kB PAGE_SIZE ia64 or PPC system).
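To put rough numbers on that overhead, here is a small sketch of the arithmetic, assuming one whole page is allocated per frozen or committed block copy. The helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes left unused when one whole page backs a single block copy. */
size_t wasted_bytes(size_t page_size, size_t block_size)
{
	assert(block_size > 0 && block_size <= page_size);
	return page_size - block_size;
}

/* Unused share of the page in whole percent, rounded down. */
unsigned int wasted_percent(size_t page_size, size_t block_size)
{
	return (unsigned int)(wasted_bytes(page_size, block_size) * 100 / page_size);
}
```

For a 1kB block on a 64kB page this gives 64512 unused bytes (98%); on a 4kB page the same block leaves 3072 bytes unused (75%).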
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems? Stop! Download the new AJAX search engine that makes
searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
* Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-08 10:18 ` Andreas Dilger
@ 2005-12-08 12:39 ` FUJITA Tomonori
2005-12-08 13:42 ` allowed pages in the block later, was " Christoph Hellwig
1 sibling, 0 replies; 13+ messages in thread
From: FUJITA Tomonori @ 2005-12-08 12:39 UTC (permalink / raw)
To: adilger; +Cc: fujita.tomonori, michaelc, hch, linux-fsdevel, ext2-devel
From: Andreas Dilger <adilger@clusterfs.com>
Subject: Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
Date: Thu, 8 Dec 2005 03:18:33 -0700
> On Dec 08, 2005 18:09 +0900, FUJITA Tomonori wrote:
> > If file systems don't send down non-refcounted pages, it makes life
> > much easier for open-iscsi because it uses tcp_sendpage.
> >
> > Christoph said that he'll take care of xfs. This patch makes ext3 use
> > normal pages instead of kmalloc'ed pages.
> >
> > --- a/fs/jbd/transaction.c
> > +++ b/fs/jbd/transaction.c
> > - frozen_buffer = jbd_kmalloc(jh2bh(jh)->b_size,
> > - GFP_NOFS);
> > + frozen_buffer =
> > + (char *) jbd_get_free_page(GFP_NOFS);
> > @@ -877,7 +878,7 @@ int journal_get_undo_access(handle_t *ha
> > - committed_data = jbd_kmalloc(jh2bh(jh)->b_size, GFP_NOFS);
> > + committed_data = (char *) jbd_get_free_page(GFP_NOFS);
>
> What happens on 1kB or 2kB block filesystems (i.e. b_size != PAGE_SIZE)?
> This will allocate a whole page for each block (which may be considerable
> overhead on e.g. a 64kB PAGE_SIZE ia64 or PPC system).
Yes, it may be. But from the iSCSI initiator's standpoint, testing every
page in the scatterlist and falling back to sendmsg adds overhead on all
architectures and for all block sizes. And frozen_buffer is only a
temporary buffer.
Is there a better solution that meets both requirements?
* allowed pages in the block later, was Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-08 10:18 ` Andreas Dilger
2005-12-08 12:39 ` [Ext2-devel] " FUJITA Tomonori
@ 2005-12-08 13:42 ` Christoph Hellwig
2005-12-08 13:58 ` Pekka Enberg
` (2 more replies)
1 sibling, 3 replies; 13+ messages in thread
From: Christoph Hellwig @ 2005-12-08 13:42 UTC (permalink / raw)
To: FUJITA Tomonori, michaelc, hch, linux-fsdevel, ext2-devel,
open-iscsi, linux-mm, linux-kernel
On Thu, Dec 08, 2005 at 03:18:33AM -0700, Andreas Dilger wrote:
> What happens on 1kB or 2kB block filesystems (i.e. b_size != PAGE_SIZE)?
> This will allocate a whole page for each block (which may be considerable
> overhead on e.g. a 64kB PAGE_SIZE ia64 or PPC system).
Yes. How often do we trigger this codepath?
The problem we're trying to solve here is how to implement network block
devices (nbd, iscsi) efficiently. The zero copy codepath in the networking
layer does need to grab additional references to pages. So to use sendpage
we need a refcountable page. Pages used by the slab allocator are not
normally refcounted so trying to do get_page/put_page on them will break.
One way to work around that would be to detect kmalloced pages and use
a slowpath for that. The major issue with that is that we don't have a
reliable way to detect if a given struct page comes from the slab allocator
or not. The minor problem is that even with such an indicator it means
having a separate and lightly tested slowpath for this rare case.
All in all I think we should document that the block layer only accepts
properly refcounted pages, which is everything but kmalloced pages (even
vmalloc is totally fine).
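The "detect and fall back" workaround described above can be sketched as a userspace model. Everything here is hypothetical: struct demo_page, its from_slab flag (standing in for some PageSlab()-style test) and demo_send() are illustrations, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct demo_page {
	bool from_slab;	/* models a PageSlab()-style test */
	int refcount;	/* models the page reference count */
	char data[64];
};

enum demo_path { DEMO_ZERO_COPY, DEMO_COPY };

/*
 * Fast path: take a reference and hand the page itself to the "network".
 * Slow path: copy the bytes into a caller-provided bounce buffer instead,
 * so the slab page is never referenced after this call returns.
 */
enum demo_path demo_send(struct demo_page *page, char *bounce, size_t len)
{
	assert(len <= sizeof(page->data));
	if (page->from_slab) {
		memcpy(bounce, page->data, len);
		return DEMO_COPY;
	}
	page->refcount++;	/* dropped later, when transmission completes */
	return DEMO_ZERO_COPY;
}
```

The second branch is exactly the rarely exercised slow path the message warns about: it only runs for the uncommon slab-backed buffer, so it gets far less testing than the fast path.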
* Re: allowed pages in the block later, was Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-08 13:42 ` allowed pages in the block later, was " Christoph Hellwig
@ 2005-12-08 13:58 ` Pekka Enberg
2005-12-12 17:27 ` allowed pages in the block later, was " Christoph Hellwig
2005-12-08 18:18 ` allowed pages in the block later, was Re: [Ext2-devel] " Mike Christie
2005-12-11 0:47 ` Andrew Morton
2 siblings, 1 reply; 13+ messages in thread
From: Pekka Enberg @ 2005-12-08 13:58 UTC (permalink / raw)
To: Christoph Hellwig, FUJITA Tomonori, michaelc, linux-fsdevel,
ext2-devel, open-iscsi, linux-mm, linux-kernel
Hi Christoph,
On 12/8/05, Christoph Hellwig <hch@infradead.org> wrote:
> One way to work around that would be to detect kmalloced pages and use
> a slowpath for that. The major issue with that is that we don't have a
> reliable way to detect if a given struct page comes from the slab allocator
> or not.
Why doesn't PageSlab work for you?
Pekka
* Re: allowed pages in the block later, was Re: [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-08 13:58 ` Pekka Enberg
@ 2005-12-12 17:27 ` Christoph Hellwig
0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2005-12-12 17:27 UTC (permalink / raw)
To: Pekka Enberg
Cc: Christoph Hellwig, FUJITA Tomonori, michaelc, linux-fsdevel,
ext2-devel, open-iscsi, linux-mm, linux-kernel
On Thu, Dec 08, 2005 at 03:58:46PM +0200, Pekka Enberg wrote:
> Hi Christoph,
>
> On 12/8/05, Christoph Hellwig <hch@infradead.org> wrote:
> > One way to work around that would be to detect kmalloced pages and use
> > a slowpath for that. The major issue with that is that we don't have a
> > reliable way to detect if a given struct page comes from the slab allocator
> > or not.
>
> Why doesn't PageSlab work for you?
When I last looked it was a no-op without slab debugging enabled,
but that's no longer the case in current mainline.
If the VM people agree with that usage we could at least use it to fall
back to a slow path. Even better would be to require normal pages, though.
* Re: allowed pages in the block later, was Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-08 13:42 ` allowed pages in the block later, was " Christoph Hellwig
2005-12-08 13:58 ` Pekka Enberg
@ 2005-12-08 18:18 ` Mike Christie
2005-12-08 18:22 ` Mike Christie
2005-12-11 0:47 ` Andrew Morton
2 siblings, 1 reply; 13+ messages in thread
From: Mike Christie @ 2005-12-08 18:18 UTC (permalink / raw)
To: Christoph Hellwig
Cc: FUJITA Tomonori, linux-fsdevel, ext2-devel, open-iscsi, linux-mm,
linux-kernel
Christoph Hellwig wrote:
> On Thu, Dec 08, 2005 at 03:18:33AM -0700, Andreas Dilger wrote:
>
>>What happens on 1kB or 2kB block filesystems (i.e. b_size != PAGE_SIZE)?
>>This will allocate a whole page for each block (which may be considerable
>>overhead on e.g. a 64kB PAGE_SIZE ia64 or PPC system).
>
>
> Yes. How often do we trigger this codepath?
>
> The problem we're trying to solve here is how to implement network block
> devices (nbd, iscsi) efficiently. The zero copy codepath in the networking
> layer does need to grab additional references to pages. So to use sendpage
> we need a refcountable page. Pages used by the slab allocator are not
> normally refcounted so trying to do get_page/put_page on them will break.
>
> One way to work around that would be to detect kmalloced pages and use
> a slowpath for that. The major issue with that is that we don't have a
> reliable way to detect if a given struct page comes from the slab allocator
> or not. The minor problem is that even with such an indicator it means
> having a separate and lightly tested slowpath for this rare case.
>
> All in all I think we should document that the block layer only accepts
> properly refcounted pages, which is everything but kmalloced pages (even
> vmalloc is totally fine)
Is that any time kmalloc is used? For SCSI, when it uses scsi_execute* for
something like scanning (the REPORT LUNS result is kmalloc'd), would this
be a problem?
If PageSlab() does work, could we have a request queue flag that bounces
those pages for all block layer drivers? It would be pretty slow and yucky,
but if we otherwise have to convert SCSI and maybe other parts of the
block layer, it may be the easiest option for now.
* Re: allowed pages in the block later, was Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-08 18:18 ` allowed pages in the block later, was Re: [Ext2-devel] " Mike Christie
@ 2005-12-08 18:22 ` Mike Christie
2005-12-08 19:20 ` Pekka Enberg
0 siblings, 1 reply; 13+ messages in thread
From: Mike Christie @ 2005-12-08 18:22 UTC (permalink / raw)
To: open-iscsi
Cc: Christoph Hellwig, FUJITA Tomonori, linux-fsdevel, ext2-devel,
linux-mm, linux-kernel
Mike Christie wrote:
>
> Christoph Hellwig wrote:
>
>> On Thu, Dec 08, 2005 at 03:18:33AM -0700, Andreas Dilger wrote:
>>
>>> What happens on 1kB or 2kB block filesystems (i.e. b_size != PAGE_SIZE)?
>>> This will allocate a whole page for each block (which may be
>>> considerable
>>> overhead on e.g. a 64kB PAGE_SIZE ia64 or PPC system).
>>
>>
>>
>> Yes. How often do we trigger this codepath?
>>
>> The problem we're trying to solve here is how to implement network block
>> devices (nbd, iscsi) efficiently. The zero copy codepath in the networking
>> layer does need to grab additional references to pages. So to use sendpage
>> we need a refcountable page. Pages used by the slab allocator are not
>> normally refcounted so trying to do get_page/put_page on them will break.
>>
>> One way to work around that would be to detect kmalloced pages and use
>> a slowpath for that. The major issue with that is that we don't have a
>> reliable way to detect if a given struct page comes from the slab allocator
>> or not. The minor problem is that even with such an indicator it means
>> having a separate and lightly tested slowpath for this rare case.
>>
>> All in all I think we should document that the block layer only accepts
>> properly refcounted pages, which is everything but kmalloced pages (even
>> vmalloc is totally fine)
>
>
> Is it anytime kmalloc is used? For scsi when it uses scsi_execute* for
> something like scanning (report luns result is kmallocd) would this be a
> problem?
>
> If PageSlab() does work, then could we have a request queue flag that
> bounces those pages for all block layer drivers. Pretty slow and yucky
> but if we have to convert SCSI and maybe other parts of the block layer
> maybe it will be easiest for now.
>
Or is there no way to do a kmalloc(GFP_BLK) that gives us the right
type of memory?
* Re: allowed pages in the block later, was Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-08 18:22 ` Mike Christie
@ 2005-12-08 19:20 ` Pekka Enberg
0 siblings, 0 replies; 13+ messages in thread
From: Pekka Enberg @ 2005-12-08 19:20 UTC (permalink / raw)
To: Mike Christie
Cc: open-iscsi, Christoph Hellwig, FUJITA Tomonori, linux-fsdevel,
ext2-devel, linux-mm, linux-kernel
Hi,
On 12/8/05, Mike Christie <michaelc@cs.wisc.edu> wrote:
> Or is there no way to do a kmalloc(GFP_BLK) that gives us the right
> type of memory?
The slab allocator uses page->lru for special purposes. See
page_{set|get}_{cache|slab} in mm/slab.c. They are used by kfree(),
ksize() and the slab debugging code to look up the cache and slab a void
pointer belongs to.
But, if you just need put_page and get_page, couldn't you do something
like the following?
Pekka
Index: 2.6/mm/swap.c
===================================================================
--- 2.6.orig/mm/swap.c
+++ 2.6/mm/swap.c
@@ -36,6 +36,9 @@ int page_cluster;
void put_page(struct page *page)
{
+ if (unlikely(PageSlab(page)))
+ return;
+
if (unlikely(PageCompound(page))) {
page = (struct page *)page_private(page);
if (put_page_testzero(page)) {
Index: 2.6/include/linux/mm.h
===================================================================
--- 2.6.orig/include/linux/mm.h
+++ 2.6/include/linux/mm.h
@@ -322,6 +322,9 @@ static inline int page_count(struct page
static inline void get_page(struct page *page)
{
+ if (unlikely(PageSlab(page)))
+ return;
+
if (unlikely(PageCompound(page)))
page = (struct page *)page_private(page);
atomic_inc(&page->_count);
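To make the effect of the two hunks above concrete, here is a small userspace model. It is only a sketch: struct demo_pg and the demo_* functions are hypothetical stand-ins, and compound-page handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the fields the model needs. */
struct demo_pg {
	bool from_slab;	/* models PageSlab(page) */
	int count;	/* models the page reference count */
};

/* Take a reference, except on slab pages, which are left untouched. */
void demo_get_page(struct demo_pg *page)
{
	if (page->from_slab)
		return;
	page->count++;
}

/* Drop a reference; returns true when the last one went away. */
bool demo_put_page(struct demo_pg *page)
{
	if (page->from_slab)
		return false;
	assert(page->count > 0);
	return --page->count == 0;
}
```

In this model a caller can invoke get/put on a slab page without corrupting anything, but it also gains no hold on the memory: the count never changes, so nothing keeps the data alive for the caller.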
* Re: allowed pages in the block later, was Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-08 13:42 ` allowed pages in the block later, was " Christoph Hellwig
2005-12-08 13:58 ` Pekka Enberg
2005-12-08 18:18 ` allowed pages in the block later, was Re: [Ext2-devel] " Mike Christie
@ 2005-12-11 0:47 ` Andrew Morton
2005-12-11 8:44 ` allowed pages in the block later, was " Arjan van de Ven
2005-12-12 17:25 ` allowed pages in the block later, was Re: [Ext2-devel] " Christoph Hellwig
2 siblings, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2005-12-11 0:47 UTC (permalink / raw)
To: Christoph Hellwig
Cc: fujita.tomonori, michaelc, hch, linux-fsdevel, ext2-devel,
open-iscsi, linux-mm, linux-kernel
Christoph Hellwig <hch@infradead.org> wrote:
>
> The problem we're trying to solve here is how to implement network block
> devices (nbd, iscsi) efficiently. The zero copy codepath in the networking
> layer does need to grab additional references to pages. So to use sendpage
> we need a refcountable page. Pages used by the slab allocator are not
> normally refcounted so trying to do get_page/put_page on them will break.
I don't get it. Doing get_page/put_page on a slab-allocated page should do
the right thing?
* Re: allowed pages in the block later, was Re: [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-11 0:47 ` Andrew Morton
@ 2005-12-11 8:44 ` Arjan van de Ven
2005-12-12 17:25 ` allowed pages in the block later, was Re: [Ext2-devel] " Christoph Hellwig
1 sibling, 0 replies; 13+ messages in thread
From: Arjan van de Ven @ 2005-12-11 8:44 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, open-iscsi, ext2-devel, linux-fsdevel,
michaelc, fujita.tomonori, Christoph Hellwig
On Sat, 2005-12-10 at 16:47 -0800, Andrew Morton wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
> >
> > The problem we're trying to solve here is how to implement network block
> > devices (nbd, iscsi) efficiently. The zero copy codepath in the networking
> > layer does need to grab additional references to pages. So to use sendpage
> > we need a refcountable page. Pages used by the slab allocator are not
> > normally refcounted so trying to do get_page/put_page on them will break.
>
> I don't get it. Doing get_page/put_page on a slab-allocated page should do
> the right thing?
But it doesn't stop the kfree from freeing the memory; zero copy needs
the content of the memory to stay around afterwards, i.e. it wants to
delay the kfree until the data is over the wire, which is an
asynchronous event relative to the actual send command in a zero-copy
situation.
* Re: allowed pages in the block later, was Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-11 0:47 ` Andrew Morton
2005-12-11 8:44 ` allowed pages in the block later, was " Arjan van de Ven
@ 2005-12-12 17:25 ` Christoph Hellwig
2005-12-12 20:12 ` Andrew Morton
1 sibling, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2005-12-12 17:25 UTC (permalink / raw)
To: Andrew Morton
Cc: Christoph Hellwig, fujita.tomonori, michaelc, linux-fsdevel,
ext2-devel, open-iscsi, linux-mm, linux-kernel
On Sat, Dec 10, 2005 at 04:47:36PM -0800, Andrew Morton wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
> >
> > The problem we're trying to solve here is how to implement network block
> > devices (nbd, iscsi) efficiently. The zero copy codepath in the networking
> > layer does need to grab additional references to pages. So to use sendpage
> > we need a refcountable page. Pages used by the slab allocator are not
> > normally refcounted so trying to do get_page/put_page on them will break.
>
> I don't get it. Doing get_page/put_page on a slab-allocated page should do
> the right thing?
As Arjan mentioned, what would be the right thing? Delaying returning the
page to the page pool and disallow reuse until page count reaches zero?
All this seems highly impractical.
* Re: allowed pages in the block later, was Re: [Ext2-devel] [PATCH] ext3: avoid sending down non-refcounted pages
2005-12-12 17:25 ` allowed pages in the block later, was Re: [Ext2-devel] " Christoph Hellwig
@ 2005-12-12 20:12 ` Andrew Morton
0 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2005-12-12 20:12 UTC (permalink / raw)
To: Christoph Hellwig
Cc: hch, fujita.tomonori, michaelc, linux-fsdevel, ext2-devel,
open-iscsi, linux-mm, linux-kernel
Christoph Hellwig <hch@infradead.org> wrote:
>
> On Sat, Dec 10, 2005 at 04:47:36PM -0800, Andrew Morton wrote:
> > Christoph Hellwig <hch@infradead.org> wrote:
> > >
> > > The problem we're trying to solve here is how to implement network block
> > > devices (nbd, iscsi) efficiently. The zero copy codepath in the networking
> > > layer does need to grab additional references to pages. So to use sendpage
> > > we need a refcountable page. Pages used by the slab allocator are not
> > > normally refcounted so trying to do get_page/put_page on them will break.
> >
> > I don't get it. Doing get_page/put_page on a slab-allocated page should do
> > the right thing?
>
> As Arjan mentioned, what would be the right thing? Delaying returning the
> page to the page pool and disallow reuse until page count reaches zero?
Yes, that's what'll happen. slab will put its final ref to the page, so
whoever did that intervening get_page() ends up owning the page.
> All this seems highly impractical.
Well, as Arjan points out, doing get_page() won't prevent slab from
"freeing" a part of the page and reusing it for another object of the same
type.
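The distinction between pinning the page and pinning the object can be shown with a toy userspace model of a slab page. This is a sketch under stated assumptions: struct demo_slab_page, demo_alloc() and demo_free() are hypothetical and far simpler than mm/slab.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SLOTS 4

/* Hypothetical model: one page carved into fixed-size object slots. */
struct demo_slab_page {
	int page_refs;			/* page-level reference count */
	bool in_use[DEMO_SLOTS];	/* per-object allocation state */
	int objects[DEMO_SLOTS];	/* object payloads */
};

/* Hand out the first free slot, or NULL when the page is full. */
int *demo_alloc(struct demo_slab_page *p)
{
	for (size_t i = 0; i < DEMO_SLOTS; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = true;
			return &p->objects[i];
		}
	}
	return NULL;
}

/* Free one object; note that page_refs is never consulted. */
void demo_free(struct demo_slab_page *p, int *obj)
{
	ptrdiff_t i = obj - p->objects;

	assert(i >= 0 && i < DEMO_SLOTS && p->in_use[i]);
	p->in_use[i] = false;
}
```

If a sender bumps page_refs while an object is in flight and the owner then calls demo_free(), the next demo_alloc() returns the same slot and can overwrite the data, even though the page itself is still referenced.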