public inbox for linux-mmc@vger.kernel.org
* [PATCH 2/7] mmc: Don't use PF_MEMALLOC
       [not found] <20091117161551.3DD4.A69D9226@jp.fujitsu.com>
@ 2009-11-17  7:17 ` KOSAKI Motohiro
  2009-11-17 10:29   ` Alan Cox
  0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-17  7:17 UTC (permalink / raw)
  To: LKML; +Cc: kosaki.motohiro, linux-mm, Andrew Morton, linux-mmc

Subsystems outside the MM must not use PF_MEMALLOC. Memory reclaim
needs a small amount of memory to make progress, and nothing must be
allowed to deplete it; otherwise the system can hit mysterious
hang-ups and/or OOM-killer invocations.

Cc: linux-mmc@vger.kernel.org
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 drivers/mmc/card/queue.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 49e5823..5deb996 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -46,8 +46,6 @@ static int mmc_queue_thread(void *d)
 	struct mmc_queue *mq = d;
 	struct request_queue *q = mq->queue;
 
-	current->flags |= PF_MEMALLOC;
-
 	down(&mq->thread_sem);
 	do {
 		struct request *req = NULL;
-- 
1.6.2.5



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-17  7:17 ` [PATCH 2/7] mmc: Don't use PF_MEMALLOC KOSAKI Motohiro
@ 2009-11-17 10:29   ` Alan Cox
  2009-11-17 10:32     ` Minchan Kim
  2009-11-17 11:58     ` KOSAKI Motohiro
  0 siblings, 2 replies; 12+ messages in thread
From: Alan Cox @ 2009-11-17 10:29 UTC (permalink / raw)
  Cc: LKML, kosaki.motohiro, linux-mm, Andrew Morton, linux-mmc

On Tue, 17 Nov 2009 16:17:50 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> Subsystems outside the MM must not use PF_MEMALLOC. Memory reclaim
> needs a small amount of memory to make progress, and nothing must be
> allowed to deplete it; otherwise the system can hit mysterious
> hang-ups and/or OOM-killer invocations.

So now what happens if we are paging and all our memory is tied up for
writeback to a device or CIFS etc. which can no longer allocate the
memory to complete the write-out so the MM can reclaim?

Am I missing something, or is this patch set not addressing the case
where the writeback thread needs to inherit PF_MEMALLOC somehow (at
least for the I/O in question and those blocking it)?

Alan


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-17 10:29   ` Alan Cox
@ 2009-11-17 10:32     ` Minchan Kim
  2009-11-17 10:38       ` Oliver Neukum
  2009-11-17 11:58     ` KOSAKI Motohiro
  1 sibling, 1 reply; 12+ messages in thread
From: Minchan Kim @ 2009-11-17 10:32 UTC (permalink / raw)
  To: Alan Cox; +Cc: KOSAKI Motohiro, LKML, linux-mm, Andrew Morton, linux-mmc

On Tue, Nov 17, 2009 at 7:29 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> On Tue, 17 Nov 2009 16:17:50 +0900 (JST)
> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>
>> Subsystems outside the MM must not use PF_MEMALLOC. Memory reclaim
>> needs a small amount of memory to make progress, and nothing must be
>> allowed to deplete it; otherwise the system can hit mysterious
>> hang-ups and/or OOM-killer invocations.
>
> So now what happens if we are paging and all our memory is tied up for
> writeback to a device or CIFS etc which can no longer allocate the memory
> to complete the write out so the MM can reclaim ?
>
> Am I missing something or is this patch set not addressing the case where
> the writeback thread needs to inherit PF_MEMALLOC somehow (at least for
> the I/O in question and those blocking it)
>

I agree.
At the very least, drivers on the writeout path are legitimate users
of PF_MEMALLOC, I think.


> Alan
>



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-17 10:32     ` Minchan Kim
@ 2009-11-17 10:38       ` Oliver Neukum
  0 siblings, 0 replies; 12+ messages in thread
From: Oliver Neukum @ 2009-11-17 10:38 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Alan Cox, KOSAKI Motohiro, LKML, linux-mm, Andrew Morton,
	linux-mmc

On Tuesday, 17 November 2009 at 11:32:36, Minchan Kim wrote:
> On Tue, Nov 17, 2009 at 7:29 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> > On Tue, 17 Nov 2009 16:17:50 +0900 (JST)
> >
> > KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> >> Subsystems outside the MM must not use PF_MEMALLOC. Memory reclaim
> >> needs a small amount of memory to make progress, and nothing must
> >> be allowed to deplete it; otherwise the system can hit mysterious
> >> hang-ups and/or OOM-killer invocations.
> >
> > So now what happens if we are paging and all our memory is tied up for
> > writeback to a device or CIFS etc which can no longer allocate the memory
> > to complete the write out so the MM can reclaim ?
> >
> > Am I missing something or is this patch set not addressing the case where
> > the writeback thread needs to inherit PF_MEMALLOC somehow (at least for
> > the I/O in question and those blocking it)
> 
> I agree.
> At the very least, drivers on the writeout path are legitimate users
> of PF_MEMALLOC, I think.

For the same reason, error handling should also use it, shouldn't it?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-17 10:29   ` Alan Cox
  2009-11-17 10:32     ` Minchan Kim
@ 2009-11-17 11:58     ` KOSAKI Motohiro
  2009-11-17 12:51       ` Minchan Kim
  1 sibling, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-17 11:58 UTC (permalink / raw)
  To: Alan Cox; +Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, linux-mmc

> On Tue, 17 Nov 2009 16:17:50 +0900 (JST)
> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> 
> > Subsystems outside the MM must not use PF_MEMALLOC. Memory reclaim
> > needs a small amount of memory to make progress, and nothing must be
> > allowed to deplete it; otherwise the system can hit mysterious
> > hang-ups and/or OOM-killer invocations.
> 
> So now what happens if we are paging and all our memory is tied up for
> writeback to a device or CIFS etc which can no longer allocate the memory
> to complete the write out so the MM can reclaim ?

My answer is probably not so simple, sorry.

Reason 1: MM reclaim both drops clean memory and writes out dirty pages.
Reason 2: if all memory is exhausted, we may not be able to recover;
that is a fundamental limitation of the virtual memory subsystem.
Also, the min watermark is sized from the amount of physical memory in
the system, but the number of outstanding I/Os (i.e. the number of
pages used by the writeback thread) is mainly determined by the number
of devices, so we can't guarantee that the min watermark is sufficient
on every system.
The only reasonable solution is a mempool-like reservation, I think.
In other words, reserved memory shouldn't be shared with unrelated
subsystems; otherwise we lose any guarantee.

So I think we need to hear why so many developers use PF_MEMALLOC
instead of mempools.

> Am I missing something or is this patch set not addressing the case where
> the writeback thread needs to inherit PF_MEMALLOC somehow (at least for
> the I/O in question and those blocking it)

Yes, my patch set probably isn't perfect. Honestly, I still don't
understand why so many developers prefer to use PF_MEMALLOC.




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-17 11:58     ` KOSAKI Motohiro
@ 2009-11-17 12:51       ` Minchan Kim
  2009-11-17 20:47         ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Minchan Kim @ 2009-11-17 12:51 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Alan Cox, LKML, linux-mm, Andrew Morton, linux-mmc

Sorry for the noise.
While I was still typing, my mail client sent the mail. :(
This is the real reply.

KOSAKI Motohiro wrote:
>> On Tue, 17 Nov 2009 16:17:50 +0900 (JST)
>> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>>
>>> Subsystems outside the MM must not use PF_MEMALLOC. Memory reclaim
>>> needs a small amount of memory to make progress, and nothing must be
>>> allowed to deplete it; otherwise the system can hit mysterious
>>> hang-ups and/or OOM-killer invocations.
>> So now what happens if we are paging and all our memory is tied up for
>> writeback to a device or CIFS etc which can no longer allocate the memory
>> to complete the write out so the MM can reclaim ?
> 
> My answer is probably not so simple, sorry.
> 
> Reason 1: MM reclaim both drops clean memory and writes out dirty pages.

Who writes out the dirty pages?
If the block driver can't allocate pages for flushing, it means the VM
can't reclaim dirty pages after all.

> Reason 2: if all memory is exhausted, we may not be able to recover;
> that is a fundamental limitation of the virtual memory subsystem.
> Also, the min watermark is sized from the amount of physical memory
> in the system, but the number of outstanding I/Os (i.e. the number of
> pages used by the writeback thread) is mainly determined by the
> number of devices, so we can't guarantee that the min watermark is
> sufficient on every system.
> The only reasonable solution is a mempool-like reservation, I think.

I think it's because a mempool reserves memory, and the number of
outstanding I/Os is hard to predict.
How do we determine the mempool size for each block driver?
For example, a server may issue little I/O to NAND,
but an embedded system issues a lot.

Do we need another knob for each block driver?

I understand your point, but it's not simple.
I think that, to protect the VM's pages, the block driver needs to
distinguish the normal flush path from the flush path used for
reclaiming: when flushing for reclaim, the block driver has to set
PF_MEMALLOC; otherwise it should not set it.


> In other words, reserved memory shouldn't be shared with unrelated
> subsystems; otherwise we lose any guarantee.
> 
> So I think we need to hear why so many developers use PF_MEMALLOC
> instead of mempools.
> 
>> Am I missing something or is this patch set not addressing the case where
>> the writeback thread needs to inherit PF_MEMALLOC somehow (at least for
>> the I/O in question and those blocking it)
> 
> Yes, my patch set probably isn't perfect. Honestly, I still don't
> understand why so many developers prefer to use PF_MEMALLOC.
> 
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-17 12:51       ` Minchan Kim
@ 2009-11-17 20:47         ` Peter Zijlstra
  2009-11-18  0:01           ` Minchan Kim
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2009-11-17 20:47 UTC (permalink / raw)
  To: Minchan Kim
  Cc: KOSAKI Motohiro, Alan Cox, LKML, linux-mm, Andrew Morton,
	linux-mmc

On Tue, 2009-11-17 at 21:51 +0900, Minchan Kim wrote:
> I think it's because a mempool reserves memory, and the number of
> outstanding I/Os is hard to predict.
> How do we determine the mempool size for each block driver?
> For example, a server may issue little I/O to NAND,
> but an embedded system issues a lot.

No, you scale the mempool to the minimum amount required to make
progress -- this includes limiting the 'concurrency' when handing out
mempool objects.

If you run into such tight corners often enough to notice it, there's
something else wrong.

I fully agree with ripping out PF_MEMALLOC from pretty much everything,
including the VM, getting rid of the various abuse outside of the VM
seems like a very good start.




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-17 20:47         ` Peter Zijlstra
@ 2009-11-18  0:01           ` Minchan Kim
  2009-11-18  9:56             ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Minchan Kim @ 2009-11-18  0:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: KOSAKI Motohiro, Alan Cox, LKML, linux-mm, Andrew Morton,
	linux-mmc

Hi, Peter.

First of all, thanks for the comments.

On Wed, Nov 18, 2009 at 5:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, 2009-11-17 at 21:51 +0900, Minchan Kim wrote:
>> I think it's because a mempool reserves memory, and the number of
>> outstanding I/Os is hard to predict.
>> How do we determine the mempool size for each block driver?
>> For example, a server may issue little I/O to NAND,
>> but an embedded system issues a lot.
>
> No, you scale the mempool to the minimum amount required to make
> progress -- this includes limiting the 'concurrency' when handing out
> mempool objects.
>
> If you run into such tight corners often enough to notice it, there's
> something else wrong.
>
> I fully agree with ripping out PF_MEMALLOC from pretty much everything,
> including the VM, getting rid of the various abuse outside of the VM
> seems like a very good start.
>

I am not against removing PF_MEMALLOC.
I fully agree with preventing abuse of PF_MEMALLOC.

What concerns me is the per-block-driver mempool.
Even if each mempool holds only the minimum amount, the total grows
with every new block driver, and I am not sure how many block drivers
we will end up with.

Also, anyone who develops a new driver then always has to use a
mempool and work out its minimum size.
I think that is the problem with mempools today.

How about this: sized according to system memory, the kernel keeps
just one I/O mempool, shared by the block drivers.

We then add a new API the block drivers can use. As usual it
allocates memory dynamically, and it falls back to the mempool only
when the system is short of free memory.

In that case we can also distinguish the read and write paths:
read I/O can't help memory reclaim, so I think read I/O shouldn't
use the mempool, though I am not sure. :)


-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-18  0:01           ` Minchan Kim
@ 2009-11-18  9:56             ` Peter Zijlstra
  2009-11-18 10:31               ` Minchan Kim
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2009-11-18  9:56 UTC (permalink / raw)
  To: Minchan Kim
  Cc: KOSAKI Motohiro, Alan Cox, LKML, linux-mm, Andrew Morton,
	linux-mmc

On Wed, 2009-11-18 at 09:01 +0900, Minchan Kim wrote:
> Hi, Peter.
> 
> First of all, Thanks for the commenting.
> 
> On Wed, Nov 18, 2009 at 5:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Tue, 2009-11-17 at 21:51 +0900, Minchan Kim wrote:
> >> I think it's because a mempool reserves memory, and the number of
> >> outstanding I/Os is hard to predict.
> >> How do we determine the mempool size for each block driver?
> >> For example, a server may issue little I/O to NAND,
> >> but an embedded system issues a lot.
> >
> > No, you scale the mempool to the minimum amount required to make
> > progress -- this includes limiting the 'concurrency' when handing out
> > mempool objects.
> >
> > If you run into such tight corners often enough to notice it, there's
> > something else wrong.
> >
> > I fully agree with ripping out PF_MEMALLOC from pretty much everything,
> > including the VM, getting rid of the various abuse outside of the VM
> > seems like a very good start.
> >
> 
> I am not against removing PF_MEMALLOC.
> I fully agree with preventing abuse of PF_MEMALLOC.
>
> What concerns me is the per-block-driver mempool.
> Even if each mempool holds only the minimum amount, the total grows
> with every new block driver, and I am not sure how many block drivers
> we will end up with.
>
> Also, anyone who develops a new driver then always has to use a
> mempool and work out its minimum size.
> I think that is the problem with mempools today.
>
> How about this: sized according to system memory, the kernel keeps
> just one I/O mempool, shared by the block drivers.
>
> We then add a new API the block drivers can use. As usual it
> allocates memory dynamically, and it falls back to the mempool only
> when the system is short of free memory.
>
> In that case we can also distinguish the read and write paths:
> read I/O can't help memory reclaim, so I think read I/O shouldn't
> use the mempool, though I am not sure. :)

Sure, some generic block-level infrastructure might work, _but_ you cannot
take away the responsibility of determining the amount of memory needed,
nor does any of this have any merit if you do not limit yourself to that
amount.

Current PF_MEMALLOC usage in the VM is utterly broken in that a
basically unlimited number of tasks can hit direct reclaim, and all of
them will then consume PF_MEMALLOC, which means we can easily run out
of memory.

( unless I missed the direct reclaim throttle patches going in, which
isn't at all impossible )




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-18  9:56             ` Peter Zijlstra
@ 2009-11-18 10:31               ` Minchan Kim
  2009-11-18 10:54                 ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Minchan Kim @ 2009-11-18 10:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: KOSAKI Motohiro, Alan Cox, LKML, linux-mm, Andrew Morton,
	linux-mmc

On Wed, Nov 18, 2009 at 6:56 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, 2009-11-18 at 09:01 +0900, Minchan Kim wrote:
>> Hi, Peter.
>>
>> First of all, Thanks for the commenting.
>>
>> On Wed, Nov 18, 2009 at 5:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Tue, 2009-11-17 at 21:51 +0900, Minchan Kim wrote:
>> >> I think it's because a mempool reserves memory, and the number of
>> >> outstanding I/Os is hard to predict.
>> >> How do we determine the mempool size for each block driver?
>> >> For example, a server may issue little I/O to NAND,
>> >> but an embedded system issues a lot.
>> >
>> > No, you scale the mempool to the minimum amount required to make
>> > progress -- this includes limiting the 'concurrency' when handing out
>> > mempool objects.
>> >
>> > If you run into such tight corners often enough to notice it, there's
>> > something else wrong.
>> >
>> > I fully agree with ripping out PF_MEMALLOC from pretty much everything,
>> > including the VM, getting rid of the various abuse outside of the VM
>> > seems like a very good start.
>> >
>>
>> I am not against removing PF_MEMALLOC.
>> I fully agree with preventing abuse of PF_MEMALLOC.
>>
>> What concerns me is the per-block-driver mempool.
>> Even if each mempool holds only the minimum amount, the total grows
>> with every new block driver, and I am not sure how many block drivers
>> we will end up with.
>>
>> Also, anyone who develops a new driver then always has to use a
>> mempool and work out its minimum size.
>> I think that is the problem with mempools today.
>>
>> How about this: sized according to system memory, the kernel keeps
>> just one I/O mempool, shared by the block drivers.
>>
>> We then add a new API the block drivers can use. As usual it
>> allocates memory dynamically, and it falls back to the mempool only
>> when the system is short of free memory.
>>
>> In that case we can also distinguish the read and write paths:
>> read I/O can't help memory reclaim, so I think read I/O shouldn't
>> use the mempool, though I am not sure. :)
>
> Sure some generic blocklevel infrastructure might work, _but_ you cannot
> take away the responsibility of determining the amount of memory needed,
> nor does any of this have any merit if you do not limit yourself to that
> amount.

Yes. Someone has to take the responsibility.

The intention was that we could take the responsibility away from the
block driver; the VM would take it instead.

You mean that although the VM could take the responsibility, it is
hard to predict the number of pages needed by the block drivers?

Yes, I agree.

>
> Current PF_MEMALLOC usage in the VM is utterly broken in that a
> basically unlimited number of tasks can hit direct reclaim, and all
> of them will then consume PF_MEMALLOC, which means we can easily run
> out of memory.
>
> ( unless I missed the direct reclaim throttle patches going in, which
> isn't at all impossible )

I think we can at least prevent that. Kosaki has already submitted
patches for it (the too_many_isolated() checks). :)

I am looking forward to Kosaki's next version.

Thanks for the careful comments, Peter.
Thanks for raising a good issue, Kosaki. :)




-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-18 10:31               ` Minchan Kim
@ 2009-11-18 10:54                 ` Peter Zijlstra
  2009-11-18 11:15                   ` Minchan Kim
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2009-11-18 10:54 UTC (permalink / raw)
  To: Minchan Kim
  Cc: KOSAKI Motohiro, Alan Cox, LKML, linux-mm, Andrew Morton,
	linux-mmc

On Wed, 2009-11-18 at 19:31 +0900, Minchan Kim wrote:
> >
> > Sure some generic blocklevel infrastructure might work, _but_ you cannot
> > take away the responsibility of determining the amount of memory needed,
> > nor does any of this have any merit if you do not limit yourself to that
> > amount.
> 
> Yes. Someone has to take the responsibility.
>
> The intention was that we could take the responsibility away from the
> block driver; the VM would take it instead.
>
> You mean that although the VM could take the responsibility, it is
> hard to predict the number of pages needed by the block drivers?

Correct, it's nearly impossible for the VM to accurately guess the amount
of memory needed for a driver, or limit the usage of the driver.

The driver could be very simple in that it'll just start a DMA on the
page and get an interrupt when done, not consuming much (if any) memory
beyond the generic BIO structure, but it could also be some iSCSI
monstrosity which involves the full network stack and userspace.

That is why I generally prefer the user of PF_MEMALLOC to take
responsibility, because it knows its own consumption and can limit its
own consumption.

Now, I don't think (but I could be wrong here) that you need to bother
with PF_MEMALLOC unless you're swapping. File-based pages should always
be able to free some memory due to the dirty limit, which basically
guarantees that there are some clean file pages for every dirty file
page.

My swap-over-nfs series used to have a block-layer hook to expose the
swap-over-block behaviour:

http://programming.kicks-ass.net/kernel-patches/vm_deadlock/v12.99/blk_queue_swapdev.patch

That gives block devices the power to refuse being swapped over, which
could be an alternative to using PF_MEMALLOC.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/7] mmc: Don't use PF_MEMALLOC
  2009-11-18 10:54                 ` Peter Zijlstra
@ 2009-11-18 11:15                   ` Minchan Kim
  0 siblings, 0 replies; 12+ messages in thread
From: Minchan Kim @ 2009-11-18 11:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: KOSAKI Motohiro, Alan Cox, LKML, linux-mm, Andrew Morton,
	linux-mmc

On Wed, Nov 18, 2009 at 7:54 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, 2009-11-18 at 19:31 +0900, Minchan Kim wrote:
>> >
>> > Sure some generic blocklevel infrastructure might work, _but_ you cannot
>> > take away the responsibility of determining the amount of memory needed,
>> > nor does any of this have any merit if you do not limit yourself to that
>> > amount.
>>
>> Yes. Some one have to take a responsibility.
>>
>> The intention was we could take away the responsibility from block driver.
>> Instead of driver, VM would take the responsibility.
>>
>> You mean althgouth VM could take the responsiblity, it is hard to
>> expect amout of pages needed by block drivers?
>
> Correct, its near impossible for the VM to accurately guess the amount
> of memory needed for a driver, or limit the usage of the driver.
>
> The driver could be very simple in that it'll just start a DMA on the
> page and get an interrupt when done, not consuming much (if any) memory
> beyond the generic BIO structure, but it could also be some iSCSI
> monstrosity which involves the full network stack and userspace.

Wow, thanks for the good example.
Until now I didn't know iSCSI was such a memory-hungry driver.

> That is why I generally prefer the user of PF_MEMALLOC to take
> responsibility, because it knows its own consumption and can limit its
> own consumption.

Okay. I understand your point now; thanks for the good explanation.

> Now, I don't think (but I could be wrong here) that you need to bother
> with PF_MEMALLOC unless you're swapping. File-based pages should
> always be able to free some memory due to the dirty limit, which
> basically guarantees that there are some clean file pages for every
> dirty file page.
>
> My swap-over-nfs series used to have a block-layer hook to expose the
> swap-over-block behaviour:
>
> http://programming.kicks-ass.net/kernel-patches/vm_deadlock/v12.99/blk_queue_swapdev.patch
>
> That gives block devices the power to refuse being swapped over, which
> could be an alternative to using PF_MEMALLOC.
>

Thanks for the pointer.
I will look at your patches.
Thanks again, Peter.





-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2009-11-18 11:15 UTC | newest]

Thread overview: 12+ messages
     [not found] <20091117161551.3DD4.A69D9226@jp.fujitsu.com>
2009-11-17  7:17 ` [PATCH 2/7] mmc: Don't use PF_MEMALLOC KOSAKI Motohiro
2009-11-17 10:29   ` Alan Cox
2009-11-17 10:32     ` Minchan Kim
2009-11-17 10:38       ` Oliver Neukum
2009-11-17 11:58     ` KOSAKI Motohiro
2009-11-17 12:51       ` Minchan Kim
2009-11-17 20:47         ` Peter Zijlstra
2009-11-18  0:01           ` Minchan Kim
2009-11-18  9:56             ` Peter Zijlstra
2009-11-18 10:31               ` Minchan Kim
2009-11-18 10:54                 ` Peter Zijlstra
2009-11-18 11:15                   ` Minchan Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox