linux-mm.kvack.org archive mirror
* why the kmalloc return fail when there is free physical address but return success after dropping page caches
@ 2016-05-18  2:38 baotiao
  2016-05-18  8:45 ` Vlastimil Babka
  0 siblings, 1 reply; 5+ messages in thread
From: baotiao @ 2016-05-18  2:38 UTC (permalink / raw)
  To: linux-mm


Hello everyone, I have run into an interesting kernel memory problem. Can anyone help me understand what is happening inside the kernel?

The machine's status is described below:

The machine has 96G of physical memory. About 64G is actually in use, and the page cache uses about 32G. We also use swap; at that time about 10G of swap was in use (the maximum swap size is set to 32G). At that moment, XFS reported:

Apr 29 21:54:31 w-openstack86 kernel: XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

After reading the source code, I found that this message is printed from this code:

    ptr = kmalloc(size, lflags);
    if (ptr || (flags & (KM_MAYFAIL|KM_NOSLEEP)))
        return ptr;
    if (!(++retries % 100))
        xfs_err(NULL,
            "possible memory allocation deadlock in %s (mode:0x%x)",
            __func__, lflags);
    congestion_wait(BLK_RW_ASYNC, HZ/50);

The error is caused by kmalloc() failing, as if there were not enough memory in the system. But there is still 32G of page cache.

So I ran

echo 3 > /proc/sys/vm/drop_caches

to drop the page cache.

Then the system was fine. But I really don't understand why. Why does kmalloc() succeed after I run drop_caches? As I see it, even though all of physical memory is in use, only 64G is really used; the other 32G is page cache, and we also have enough swap space. So why doesn't the kernel flush the page cache or swap pages out to satisfy the kmalloc() request?


----------------------------------------
 
Github: https://github.com/baotiao
Blog: http://baotiao.github.io/
Stackoverflow: http://stackoverflow.com/users/634415/baotiao 
Linkedin: http://www.linkedin.com/profile/view?id=145231990



* Re: why the kmalloc return fail when there is free physical address but return success after dropping page caches
  2016-05-18  2:38 why the kmalloc return fail when there is free physical address but return success after dropping page caches baotiao
@ 2016-05-18  8:45 ` Vlastimil Babka
  2016-05-18  8:58   ` baotiao
  0 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka @ 2016-05-18  8:45 UTC (permalink / raw)
  To: baotiao; +Cc: linux-mm, Dave Chinner

[+CC Dave]

On 05/18/2016 04:38 AM, baotiao wrote:
> Hello everyone, I have run into an interesting kernel memory problem. Can
> anyone help me understand what is happening inside the kernel?

Which kernel version is that?

> The machine's status is described below:
>
> The machine has 96G of physical memory. About 64G is actually in use, and
> the page cache uses about 32G. We also use swap; at that time about 10G of
> swap was in use (the maximum swap size is set to 32G). At that moment, XFS
> reported:
>
> Apr 29 21:54:31 w-openstack86 kernel: XFS: possible memory allocation
> deadlock in kmem_alloc (mode:0x250)

Just once, or many times?

> After reading the source code, I found that this message is printed from
> this code:
>
>     ptr = kmalloc(size, lflags);
>     if (ptr || (flags & (KM_MAYFAIL|KM_NOSLEEP)))
>         return ptr;
>     if (!(++retries % 100))
>         xfs_err(NULL,
>             "possible memory allocation deadlock in %s (mode:0x%x)",
>             __func__, lflags);
>     congestion_wait(BLK_RW_ASYNC, HZ/50);

Any indication what is the size used here?

> The error is caused by kmalloc() failing, as if there were not enough
> memory in the system. But there is still 32G of page cache.
>
> So I ran
>
>     echo 3 > /proc/sys/vm/drop_caches
>
> to drop the page cache.
>
> Then the system was fine.

Are you saying that the error message was repeated infinitely until you 
did the drop_caches?

> But I really don't understand why. Why does kmalloc() succeed after I run
> drop_caches? As I see it, even though all of physical memory is in use,
> only 64G is really used; the other 32G is page cache, and we also have
> enough swap space. So why doesn't the kernel flush the page cache or swap
> pages out to satisfy the kmalloc() request?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: why the kmalloc return fail when there is free physical address but return success after dropping page caches
  2016-05-18  8:45 ` Vlastimil Babka
@ 2016-05-18  8:58   ` baotiao
  2016-05-18 14:41     ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: baotiao @ 2016-05-18  8:58 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: linux-mm, Dave Chinner


Thanks for your reply

>> Hello everyone, I have run into an interesting kernel memory problem. Can
>> anyone help me understand what is happening inside the kernel?
> 
> Which kernel version is that?

The kernel version is 3.10.0-327.4.5.el7.x86_64

>> The machine's status is described below:
>>
>> The machine has 96G of physical memory. About 64G is actually in use, and
>> the page cache uses about 32G. We also use swap; at that time about 10G of
>> swap was in use (the maximum swap size is set to 32G). At that moment, XFS
>> reported:
>>
>> Apr 29 21:54:31 w-openstack86 kernel: XFS: possible memory allocation
>> deadlock in kmem_alloc (mode:0x250)
> 
> Just once, or many times?

The message appeared many times.
From the code, I know that XFS will try kmalloc() 100 times.
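
A user-space sketch of the loop's control flow may help here. It is not the XFS code: try_alloc_fn, retry_until_success and the max_retries cap are hypothetical names added so that the sketch terminates, whereas the quoted kernel snippet shows no cap. What it illustrates is that the message is printed on every 100th failed attempt, not after a final 100th attempt. Since each retry calls congestion_wait(BLK_RW_ASYNC, HZ/50), which waits up to 1/50 s, that works out to roughly one message every 2 seconds while the allocation keeps failing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical allocator callback: returns NULL on failure. */
typedef void *(*try_alloc_fn)(size_t size, void *ctx);

/*
 * Mimics the control flow of the quoted loop. max_retries is an
 * assumption of this sketch so that it can terminate.
 * *warnings counts how often the message would be logged.
 */
static void *retry_until_success(try_alloc_fn try_alloc, void *ctx,
                                 size_t size, unsigned max_retries,
                                 unsigned *warnings)
{
    unsigned retries = 0;

    assert(try_alloc != NULL && warnings != NULL);
    *warnings = 0;
    while (retries < max_retries) {
        void *ptr = try_alloc(size, ctx);

        if (ptr)
            return ptr;
        if (!(++retries % 100)) {   /* true on every 100th failure */
            fprintf(stderr, "possible memory allocation deadlock "
                    "(retries so far: %u)\n", retries);
            (*warnings)++;
        }
        /* the kernel waits here: congestion_wait(BLK_RW_ASYNC, HZ/50) */
    }
    return NULL;
}

/* Example callback: fails until the counter behind ctx reaches zero. */
static void *fail_n_times(size_t size, void *ctx)
{
    static char buf[64];
    unsigned *remaining = ctx;

    (void)size;
    if (*remaining > 0) {
        (*remaining)--;
        return NULL;
    }
    return buf;
}
```

With a callback that fails 250 times before succeeding, the sketch logs twice (at retries 100 and 200) and then returns the allocation.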

>> After reading the source code, I found that this message is printed from
>> this code:
>>
>>     ptr = kmalloc(size, lflags);
>>     if (ptr || (flags & (KM_MAYFAIL|KM_NOSLEEP)))
>>         return ptr;
>>     if (!(++retries % 100))
>>         xfs_err(NULL,
>>             "possible memory allocation deadlock in %s (mode:0x%x)",
>>             __func__, lflags);
>>     congestion_wait(BLK_RW_ASYNC, HZ/50);
> 
> Any indication what is the size used here?

I don't know the size, since the call comes from inside XFS.

>> The error is caused by kmalloc() failing, as if there were not enough
>> memory in the system. But there is still 32G of page cache.
>>
>> So I ran
>>
>>     echo 3 > /proc/sys/vm/drop_caches
>>
>> to drop the page cache.
>>
>> Then the system was fine.
> 
> Are you saying that the error message was repeated infinitely until you did the drop_caches?


No. The error message no longer appeared after I ran drop_caches.

Is it possible that the reason is this: we have enough physical pages, but they are used for the page cache. When a caller invokes kmalloc(), it asks the kernel for pages. The kernel finds that there are not enough free pages, but since some pages are used for the page cache, free pages can be reclaimed from it, so the kernel wakes kswapd to evict some page cache. But it takes too long to get the free pages, and kmem_alloc() in XFS does not set the __GFP_WAIT flag. So the allocation in kmem_alloc() keeps failing for lack of memory, and the error message is printed.


* Re: why the kmalloc return fail when there is free physical address but return success after dropping page caches
  2016-05-18  8:58   ` baotiao
@ 2016-05-18 14:41     ` Dave Chinner
  2016-05-25  9:25       ` 陈宗志
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2016-05-18 14:41 UTC (permalink / raw)
  To: baotiao; +Cc: Vlastimil Babka, linux-mm

On Wed, May 18, 2016 at 04:58:31PM +0800, baotiao wrote:
> Thanks for your reply
> 
> >> Hello everyone, I have run into an interesting kernel memory problem. Can
> >> anyone help me understand what is happening inside the kernel?
> > 
> > Which kernel version is that?
> 
> The kernel version is 3.10.0-327.4.5.el7.x86_64

That's a RHEL7 kernel. Best you report the problem to your RH support
contact - the RHEL7 kernels are far different to upstream kernels.

> >> The machine's status is described below:
> >>
> >> The machine has 96G of physical memory. About 64G is actually in use,
> >> and the page cache uses about 32G. We also use swap; at that time about
> >> 10G of swap was in use (the maximum swap size is set to 32G). At that
> >> moment, XFS reported:
> >>
> >> Apr 29 21:54:31 w-openstack86 kernel: XFS: possible memory allocation
> >> deadlock in kmem_alloc (mode:0x250)

Pretty sure that's a GFP_NOFS allocation context.
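
The mode value can be checked against the flag bits. The sketch below decodes 0x250 using the low GFP bit values as recalled from the upstream 3.10 include/linux/gfp.h. They are an assumption here, because the values have changed between kernel versions and the RHEL7 tree may differ. The X_ names and decode_gfp() are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed bit values (upstream 3.10 era); verify against your tree. */
#define X_GFP_WAIT   0x10u
#define X_GFP_HIGH   0x20u
#define X_GFP_IO     0x40u
#define X_GFP_FS     0x80u
#define X_GFP_COLD   0x100u
#define X_GFP_NOWARN 0x200u

/* GFP_NOFS under the same assumption: may sleep, may do I/O, no FS. */
#define X_GFP_NOFS   (X_GFP_WAIT | X_GFP_IO)

/* Writes the names of the known bits set in mode into out. */
static void decode_gfp(unsigned mode, char *out, size_t outlen)
{
    static const struct { unsigned bit; const char *name; } tbl[] = {
        { X_GFP_WAIT, "__GFP_WAIT" }, { X_GFP_HIGH, "__GFP_HIGH" },
        { X_GFP_IO, "__GFP_IO" },     { X_GFP_FS, "__GFP_FS" },
        { X_GFP_COLD, "__GFP_COLD" }, { X_GFP_NOWARN, "__GFP_NOWARN" },
    };
    size_t i;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
        if (!(mode & tbl[i].bit))
            continue;
        if (out[0] != '\0')
            strncat(out, "|", outlen - strlen(out) - 1);
        strncat(out, tbl[i].name, outlen - strlen(out) - 1);
    }
}
```

Under these assumed values, 0x250 decodes to __GFP_WAIT|__GFP_IO|__GFP_NOWARN, that is GFP_NOFS plus __GFP_NOWARN. __GFP_WAIT is set, so the allocation is allowed to sleep; only __GFP_FS is missing.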

> > Just once, or many times?
> 
> The message appeared many times.
> From the code, I know that XFS will try kmalloc() 100 times.

The current upstream kernels report much more information - process,
size of allocation, etc.

In general, the cause of such problems is memory fragmentation
preventing a large contiguous allocation from taking place (e.g.
when you try to read a file with millions of extents).
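
To make the fragmentation point concrete: the page allocator hands out blocks of 2^order contiguous pages, so a large physically contiguous request needs a free block of a high enough order. The sketch below computes that order for a request size, assuming 4 KiB pages (typical for x86_64). order_for_size() is a hypothetical user-space helper written for this illustration, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
static unsigned order_for_size(size_t size)
{
    unsigned order = 0;
    size_t block = SKETCH_PAGE_SIZE;

    /* The upper bound keeps the doubling below from overflowing. */
    assert(size > 0 && size <= (size_t)-1 / 2);
    while (block < size) {
        block <<= 1;
        order++;
    }
    return order;
}
```

A 64 KiB request, for example, needs an order-4 block, i.e. 16 physically contiguous free pages. Free memory that exists only as scattered single pages cannot satisfy it.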

> >> in the system. But there is still 32G of page cache.
> >>
> >> So I ran
> >>
> >>     echo 3 > /proc/sys/vm/drop_caches
> >>
> >> to drop the page cache.
> >>
> >> Then the system was fine.
> >
> > Are you saying that the error message was repeated infinitely until you did the drop_caches?
>
> No. The error message no longer appeared after I ran drop_caches.

Of course - freeing memory will cause contiguous free space to
re-form. Then the allocation will succeed.

IIRC, the reason the system can't recover itself is that memory
compaction is not triggered from GFP_NOFS allocation context, which
means memory reclaim won't try to create contiguous regions by
moving things around and hence the allocation will not succeed until
a significant amount of memory is freed by some other trigger....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: why the kmalloc return fail when there is free physical address but return success after dropping page caches
  2016-05-18 14:41     ` Dave Chinner
@ 2016-05-25  9:25       ` 陈宗志
  0 siblings, 0 replies; 5+ messages in thread
From: 陈宗志 @ 2016-05-25  9:25 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Vlastimil Babka, linux-mm


Hi Dave

> > >> The machine's status is described below:
> > >>
> > >> The machine has 96G of physical memory. About 64G is actually in use,
> > >> and the page cache uses about 32G. We also use swap; at that time about
> > >> 10G of swap was in use (the maximum swap size is set to 32G). At that
> > >> moment, XFS reported:
> > >>
> > >> Apr 29 21:54:31 w-openstack86 kernel: XFS: possible memory allocation
> > >> deadlock in kmem_alloc (mode:0x250)
>
> Pretty sure that's a GFP_NOFS allocation context.

You are right, it is a GFP_NOFS allocation from XFS. XFS uses the GFP_NOFS
flag to avoid recursive calls into the filesystem.


> > > Just once, or many times?
> >
> > The message appeared many times.
> > From the code, I know that XFS will try kmalloc() 100 times.
>
> The current upstream kernels report much more information - process,
> size of allocation, etc.
>
> In general, the cause of such problems is memory fragmentation
> preventing a large contiguous allocation from taking place (e.g.
> when you try to read a file with millions of extents).
>
> > >> in the system. But there is still 32G of page cache.
> > >>
> > >> So I ran
> > >>
> > >>     echo 3 > /proc/sys/vm/drop_caches
> > >>
> > >> to drop the page cache.
> > >>
> > >> Then the system was fine.
> > >
> > > Are you saying that the error message was repeated infinitely until
> > > you did the drop_caches?
> >
> > No. The error message no longer appeared after I ran drop_caches.


Yes, you are right. Before I ran echo 3 > /proc/sys/vm/drop_caches,
/proc/buddyinfo was as listed below:

Node 0, zone      DMA      0      0      0      1      2      1      1      0      1      1      3
Node 0, zone    DMA32   2983   2230   1037    290    121     63     47     61     16      0      0
Node 0, zone   Normal  13707   1126    285    268    291    160     64     21     11      0      0
Node 1, zone   Normal  10678   5041   1167    705    316    158     61     22      0      0      0


After the operation, /proc/buddyinfo was as listed below:

Node 0, zone      DMA      0      0      0      1      2      1      1      0      1      1      3
Node 0, zone    DMA32  61091  22791   3659    348    169     81     89     63     16      0      0
Node 0, zone   Normal 781723 532596 246195  57076   9853   4061   1922    799    217     19      0
Node 1, zone   Normal 334903 138984  49608   6929   2770   1603    843    447    232      2      0


We can see that after the operation there are many more large free blocks.
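
Each column of /proc/buddyinfo is the number of free blocks of order 0, 1, 2, and so on, where a block of a given order is 2^order contiguous pages. Here is a small sketch that parses one unwrapped line, totals the free pages, and counts the free blocks at or above a chosen order. The function names are hypothetical, and 11 orders are assumed to match the output above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_ORDERS 11  /* assumption: 11 columns, as in the output above */

/*
 * Parses "Node N, zone NAME c0 c1 ... c10" into counts[].
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_buddyinfo_line(const char *line,
                                unsigned long counts[NR_ORDERS])
{
    char zone[32];
    int node, consumed = 0, i;
    const char *p;

    assert(line != NULL && counts != NULL);
    if (sscanf(line, "Node %d, zone %31s%n", &node, zone, &consumed) != 2)
        return -1;
    p = line + consumed;
    for (i = 0; i < NR_ORDERS; i++) {
        char *end;

        counts[i] = strtoul(p, &end, 10);
        if (end == p)
            return -1;          /* fewer columns than expected */
        p = end;
    }
    return 0;
}

/* Total free pages: sum of counts[order] * 2^order. */
static unsigned long total_free_pages(const unsigned long counts[NR_ORDERS])
{
    unsigned long total = 0;
    int i;

    for (i = 0; i < NR_ORDERS; i++)
        total += counts[i] << i;
    return total;
}

/* Number of free blocks whose order is at least min_order. */
static unsigned long blocks_at_or_above(const unsigned long counts[NR_ORDERS],
                                        int min_order)
{
    unsigned long n = 0;
    int i;

    assert(min_order >= 0);
    for (i = min_order; i < NR_ORDERS; i++)
        n += counts[i];
    return n;
}
```

For Node 0, zone Normal, the line before the drop gives 38619 free pages (about 151 MiB with 4 KiB pages) and no free block of order 9 or above; the line after the drop shows 19 such blocks.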

Besides /proc/buddyinfo, is there any other way to get memory
fragmentation information?
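
One number that can be derived from the buddyinfo counts alone is the fraction of free memory sitting in blocks too small for a request of a given order. As far as I recall, this is the idea behind the unusable_index file under /sys/kernel/debug/extfrag/ on kernels built with debugfs and compaction support, and /proc/pagetypeinfo breaks the free counts down further by migrate type. Treat both pointers as things to check on your kernel. The helper name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fraction (0.0 to 1.0) of free pages that cannot serve a request of
 * the given order because they sit in smaller blocks.
 * counts[i] is the number of free blocks of order i.
 */
static double unusable_fraction(const unsigned long *counts,
                                int nr_orders, int order)
{
    unsigned long free_pages = 0, usable_pages = 0;
    int i;

    assert(counts != NULL && order >= 0 && order < nr_orders);
    for (i = 0; i < nr_orders; i++) {
        unsigned long pages = counts[i] << i;

        free_pages += pages;
        if (i >= order)
            usable_pages += pages;
    }
    if (free_pages == 0)
        return 1.0;   /* nothing free: no request can be served */
    return (double)(free_pages - usable_pages) / (double)free_pages;
}
```

For Node 1, zone Normal before the drop, the counts above give 1.0 for order 8: there is free memory, but none of it in blocks of 256 or more contiguous pages.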

And besides drop_caches, is there any other command that can reduce
memory fragmentation?
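
One candidate, on kernels built with CONFIG_COMPACTION, is /proc/sys/vm/compact_memory: writing 1 to it asks the kernel to compact all zones, which moves pages around rather than discarding cache. Whether that would have been enough in this case is not something this thread establishes. A minimal sketch follows, with write_sysctl() as a hypothetical helper; writing the real file needs root.

```c
#include <assert.h>
#include <stdio.h>

/* Writes value to a sysctl-style file. Returns 0 on success, -1 on error. */
static int write_sysctl(const char *path, const char *value)
{
    FILE *f;
    int ok;

    assert(path != NULL && value != NULL);
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)         /* write errors may only show up here */
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Usage (as root, assuming CONFIG_COMPACTION):
 *     write_sysctl("/proc/sys/vm/compact_memory", "1\n");
 */
```

Comparing /proc/buddyinfo before and after such a write would show whether higher-order free blocks appear without dropping the page cache.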




> IIRC, the reason the system can't recover itself is that memory
> compaction is not triggered from GFP_NOFS allocation context, which
> means memory reclaim won't try to create contiguous regions by
> moving things around and hence the allocation will not succeed until
> a significant amount of memory is freed by some other trigger....


If GFP_NOFS does not trigger memory compaction, where can I find that
logic in the kernel source code?

Thank you
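
As far as I recall, in upstream 3.10 the check lives in try_to_compact_pages() in mm/compaction.c, which is reached from __alloc_pages_direct_compact() in mm/page_alloc.c: compaction is skipped for order-0 requests and whenever the mask lacks __GFP_FS or __GFP_IO. Please verify this against the actual source, especially since the RHEL7 tree may differ. Below is a user-space model of that condition, with assumed 3.10-era flag values and a hypothetical function name.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed 3.10-era bit values; check include/linux/gfp.h in your tree. */
#define X_GFP_WAIT 0x10u
#define X_GFP_IO   0x40u
#define X_GFP_FS   0x80u

/*
 * Models the gate as recalled: direct compaction is attempted only
 * for order > 0 requests that may both enter the FS and perform I/O.
 */
static bool may_direct_compact(unsigned order, unsigned gfp_mask)
{
    bool may_enter_fs = (gfp_mask & X_GFP_FS) != 0;
    bool may_perform_io = (gfp_mask & X_GFP_IO) != 0;

    assert(order < 11);   /* sketch assumes 11 orders, as in buddyinfo */
    return order > 0 && may_enter_fs && may_perform_io;
}
```

With mode 0x250 from the log, the __GFP_FS bit is clear under these assumed values, so this model returns false for every order.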




-- 
---
Blog: http://www.chenzongzhi.info
Twitter: https://twitter.com/baotiao <https://twitter.com/#%21/baotiao>
Git: https://github.com/baotiao
