kernelnewbies.kernelnewbies.org archive mirror
* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
@ 2014-07-09 12:23 Amit Agarwal
  2014-07-10  5:03 ` Greg KH
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Amit Agarwal @ 2014-07-09 12:23 UTC (permalink / raw)
  To: kernelnewbies

Hi All,

We are running a 32 bit application on RHEL6.3-64 bit OS with kernel 
version 2.6.32-279.el6.x86_64.

While running this application we see the following when running under 
strace:
mmap(offset=33230848, len=2068480) failed with errno=12 for the file 
<so file name>
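
For reference, the numeric errno in the strace line can be decoded with a few lines of Python. This is only a small sketch; the helper name describe_errno is made up for illustration, and the message text depends on the C library in use.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable message) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# errno value reported by the failing mmap() call above
name, message = describe_errno(12)
print(f"errno 12 = {name}: {message}")
```

On Linux this reports ENOMEM, which mmap() can return for several reasons, not only when physical RAM is exhausted.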

The system has enough RAM, total 16GB and about 12 GB free. We checked 
buddyinfo on the system, when the application is running and see the 
below trend:
Node 0, zone   Normal   5200  21396  21389  21516  16202  12770   9054   4459   1430    168    313
Node 0, zone   Normal   5231  21395  21389  21516  16202  12770   9054   4459   1430    168    313
Node 0, zone   Normal   5128  21401  21389  21516  16202  12770   9054   4459   1430    168    313
Node 0, zone   Normal   5060  21405  21391  21516  16202  12770   9054   4459   1430    168    313
..............
Node 0, zone   Normal    483  17946  21342  21516  16202  12770   9054   4459   1430    168    313
Node 0, zone   Normal    315  17937  21342  21516  16202  12770   9054   4459   1430    168    313
Node 0, zone   Normal    345  17891  21352  21516  16202  12770   9054   4459   1430    168    313
Node 0, zone   Normal    278  17785  21352  21516  16202  12770   9054   4459   1430    168    313


At this point the application crashes with mmap error.
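
To put the buddyinfo numbers in perspective, the free block counts can be converted into pages and bytes. The sketch below is an illustration only, assuming 4 KiB base pages; the function names are hypothetical. Each column of /proc/buddyinfo is the number of free blocks of a given order, and a block of order n holds 2**n pages.

```python
PAGE_SIZE = 4096  # bytes; assumes 4 KiB base pages (the usual x86_64 setting)


def parse_buddyinfo_line(line):
    """Split one /proc/buddyinfo line into (node, zone, [free blocks per order])."""
    head, rest = line.split("zone", 1)
    node = int(head.replace("Node", "").replace(",", "").strip())
    fields = rest.split()
    return node, fields[0], [int(x) for x in fields[1:]]


def free_pages(counts):
    """Total free base pages: a block of order n holds 2**n pages."""
    return sum(count << order for order, count in enumerate(counts))


# Last snapshot from the failing system, as posted above.
line = ("Node 0, zone   Normal    278  17785  21352  21516  16202  12770"
        "   9054   4459   1430    168    313")
node, zone, counts = parse_buddyinfo_line(line)
total = free_pages(counts)
print(f"node {node} zone {zone}: {total} free pages, "
      f"{total * PAGE_SIZE / 2**30:.1f} GiB")
print(f"order-10 blocks alone: {counts[10] * (PAGE_SIZE << 10) // 2**20} MiB")
```

A low order-0 count is not a shortage in itself, because the buddy allocator splits higher-order blocks on demand when smaller ones run out.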


Page Type information on the system:
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
Node    0, zone      DMA, type    Unmovable      1      0      0      1      0      0      1      0      0      0      0
Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone      DMA, type      Movable      0      0      0      0      0      0      0      0      0      0      3
Node    0, zone      DMA, type      Reserve      0      0      0      0      0      0      0      0      0      1      0
Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone    DMA32, type    Unmovable    410    261    133     51     24     14      7      2      0      0      0
Node    0, zone    DMA32, type  Reclaimable   2573   1806   1108    889    522    175     31      0      0      0      0
Node    0, zone    DMA32, type      Movable    408    352    345    341    328    312    310    260    216    180    356
Node    0, zone    DMA32, type      Reserve      0      0      0      0      0      0      0      0      0      2      0
Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type    Unmovable    434    788    518    137     58     24     13      4      5      0      0
Node    0, zone   Normal, type  Reclaimable      1    110   2136   3212    414     25      8      7      8      0      0
Node    0, zone   Normal, type      Movable   8135  20535  17946  18171  15732  12719   9033   4448   1417    169    312
Node    0, zone   Normal, type      Reserve      0      0      0      0      0      0      0      0      0      0      1
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0

Number of blocks type     Unmovable  Reclaimable      Movable      Reserve      Isolate
Node 0, zone      DMA            1            0            6            1            0
Node 0, zone    DMA32           11           79         1436            2            0
Node 0, zone   Normal          188          337         6129            2            0


When we are running the same application on other system, it comes up. 
So, it has to do with memory allocation setting of some sort.

So, is there some setting that would let the kernel use the 313 free 4 MB
blocks (order 10) to satisfy 4 KB allocations? If so, what should we change
on the system?

Will setting vm.zone_reclaim_mode to 1 help here?

Do I need to provide more information, if yes, what?

-- 
Thanks,
-aka
http://blog.amit-agarwal.co.in

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-09 12:23 Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64 Amit Agarwal
@ 2014-07-10  5:03 ` Greg KH
  2014-07-11  4:38   ` Amit Agarwal
  2014-07-10  5:42 ` shhuiw
  2014-07-10  6:08 ` Rik van Riel
  2 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2014-07-10  5:03 UTC (permalink / raw)
  To: kernelnewbies

On Wed, Jul 09, 2014 at 05:53:29PM +0530, Amit Agarwal wrote:
> Hi All,
> 
> We are running a 32 bit application on RHEL6.3-64 bit OS with kernel 
> version 2.6.32-279.el6.x86_64.

Great.  So ask them for support for this as you are paying for it,
nothing that we can do about it here, in a community forum, sorry.

Best of luck,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-09 12:23 Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64 Amit Agarwal
  2014-07-10  5:03 ` Greg KH
@ 2014-07-10  5:42 ` shhuiw
  2014-07-11  8:01   ` Amit Agarwal
  2014-07-10  6:08 ` Rik van Riel
  2 siblings, 1 reply; 17+ messages in thread
From: shhuiw @ 2014-07-10  5:42 UTC (permalink / raw)
  To: kernelnewbies

Hi,

How about setting /proc/sys/vm/min_free_kbytes to a small value, e.g. 128?
And what is the output on the other, working system?



--

Regards,
shhuiw





^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-09 12:23 Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64 Amit Agarwal
  2014-07-10  5:03 ` Greg KH
  2014-07-10  5:42 ` shhuiw
@ 2014-07-10  6:08 ` Rik van Riel
  2014-07-10  7:06   ` shhuiw
  2014-07-11  4:50   ` Amit Agarwal
  2 siblings, 2 replies; 17+ messages in thread
From: Rik van Riel @ 2014-07-10  6:08 UTC (permalink / raw)
  To: kernelnewbies


On 07/09/2014 08:23 AM, Amit Agarwal wrote:
> Hi All,
> 
> We are running a 32 bit application on RHEL6.3-64 bit OS with
> kernel version 2.6.32-279.el6.x86_64.
> 
> While running this application we see the following when running
> under strace: mmap(offset=33230848, len=2068480) failed with
> errno=12 for the file <so file name>
> 
> The system has enough RAM, total 16GB and about 12 GB free.

The system may have enough memory, but your 32 bit application
is limited to slightly less than 4GB of virtual memory.

Errno 12 corresponds to ENOMEM. The process running out of its
slightly-less-than-4GB of virtual address space is consistent with
your system still having 12GB of free memory.

This suggests you have run out of virtual memory space in the
process.

If your program needs more than 4GB of memory, e.g. because you have
a large data set, you need to use a 64 bit version of the program.

This is not a kernel problem.
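
To see how close a process is to that limit, its mappings can be inspected in /proc/<pid>/maps. The following is a minimal Python sketch, not something posted in the thread: the helper names, the sample mappings, and the 4 GiB limit are illustrative assumptions. It sums the mapped ranges and finds the largest unmapped gap, since an mmap of len bytes needs one contiguous free range at least that large.

```python
def parse_maps(text):
    """Return sorted (start, end) address ranges from /proc/<pid>/maps-style text."""
    ranges = []
    for line in text.splitlines():
        if not line.strip():
            continue
        start, end = line.split()[0].split("-")
        ranges.append((int(start, 16), int(end, 16)))
    return sorted(ranges)


def address_space_summary(ranges, limit=1 << 32):
    """Return (bytes mapped, largest unmapped gap) below `limit`."""
    mapped = 0
    largest_gap = 0
    cursor = 0
    for start, end in ranges:
        largest_gap = max(largest_gap, start - cursor)
        mapped += end - start
        cursor = max(cursor, end)
    largest_gap = max(largest_gap, limit - cursor)
    return mapped, largest_gap


# Hypothetical mappings of a 32 bit process; paths and addresses are made up.
SAMPLE = """\
08048000-08100000 r-xp 00000000 08:01 1234 /opt/app/bin/app
08100000-a8100000 rw-p 00000000 00:00 0 [heap]
a8200000-f7000000 rw-p 00000000 00:00 0
f7100000-f7200000 r-xp 00000000 08:01 5678 /lib/libc.so.6
ffe00000-ffe40000 rw-p 00000000 00:00 0 [stack]
"""

mapped, gap = address_space_summary(parse_maps(SAMPLE))
request = 2068480  # len of the failing mmap() in the original report
print(f"mapped: {mapped / 2**30:.2f} GiB, largest gap: {gap / 2**20:.1f} MiB")
print("request fits in largest gap:", request <= gap)
```

The gap starting at address 0 is counted here for simplicity, although the lowest addresses are normally not available to mmap. On a real system the text would come from /proc/<pid>/maps of the failing process, captured shortly before the crash.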

- -- 
All rights reversed.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-10  6:08 ` Rik van Riel
@ 2014-07-10  7:06   ` shhuiw
  2014-07-11  4:52     ` Amit Agarwal
  2014-07-11  4:50   ` Amit Agarwal
  1 sibling, 1 reply; 17+ messages in thread
From: shhuiw @ 2014-07-10  7:06 UTC (permalink / raw)
  To: kernelnewbies





At 2014-07-10 02:08:04, "Rik van Riel" <riel@surriel.com> wrote:
>On 07/09/2014 08:23 AM, Amit Agarwal wrote:
>> Hi All,
>>
>> We are running a 32 bit application on RHEL6.3-64 bit OS with
>> kernel version 2.6.32-279.el6.x86_64.
>>
>> While running this application we see the following when running
>> under strace: mmap(offset=33230848, len=2068480) failed with
>> errno=12 for the file <so file name>
>>
>> The system has enough RAM, total 16GB and about 12 GB free.
>
>The system may have enough memory, but your 32 bit application
>is limited to slightly less than 4GB of virtual memory.
>
>Errno 12 corresponds to -ENOMEM. The process running out of its
>slightly-less-than-4GB of memory corresponds nicely with your
>system still having 12GB of free memory.
>
>This suggests you have run out of virtual memory space in the
>process.
>
>If your program needs more than 4GB of memory, eg. because you have
>a large data set, you need to use a 64 bit version of the program.
>
>This is not a kernel problem.

I am confused by the statement "When we are running the same application on other system, it comes up."
How can the program run on other systems?

Regards,
shhuiw


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-10  5:03 ` Greg KH
@ 2014-07-11  4:38   ` Amit Agarwal
  0 siblings, 0 replies; 17+ messages in thread
From: Amit Agarwal @ 2014-07-11  4:38 UTC (permalink / raw)
  To: kernelnewbies

On 2014-07-10 10:33, Greg KH wrote:
> On Wed, Jul 09, 2014 at 05:53:29PM +0530, Amit Agarwal wrote:
>> Hi All,
>>
>> We are running a 32 bit application on RHEL6.3-64 bit OS with kernel
>> version 2.6.32-279.el6.x86_64.
>
> Great.  So ask them for support for this as you are paying for it,
> nothing that we can do about it here, in a community forum, sorry.

I understand that, but my question was not related to the OS. I wanted
to understand why the kernel is not able to allocate memory to the
process when there is enough memory. Is there some kernel parameter
that is set differently? I provided the OS information just in case it
is useful.

-- 
Thanks,
-aka
http://blog.amit-agarwal.co.in

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-10  6:08 ` Rik van Riel
  2014-07-10  7:06   ` shhuiw
@ 2014-07-11  4:50   ` Amit Agarwal
  1 sibling, 0 replies; 17+ messages in thread
From: Amit Agarwal @ 2014-07-11  4:50 UTC (permalink / raw)
  To: kernelnewbies

Hi Rik,

On 10-07-2014 11:38, Rik van Riel wrote:
> On 07/09/2014 08:23 AM, Amit Agarwal wrote:
>> Hi All,
>>
>> We are running a 32 bit application on RHEL6.3-64 bit OS with
>> kernel version 2.6.32-279.el6.x86_64.
>>
>> While running this application we see the following when running
>> under strace: mmap(offset=33230848, len=2068480) failed with
>> errno=12 for the file <so file name>
>>
>> The system has enough RAM, total 16GB and about 12 GB free.
> The system may have enough memory, but your 32 bit application
> is limited to slightly less than 4GB of virtual memory.
I know that. On another system with the same operating system and kernel
version, the application comes up with about 3.5GB, so it should be able
to come up on this system as well. But on this particular system, the
application crashes in "mmap" for a dynamic library at about 2.8GB of
memory usage, which seems unusual to me.

--
Thanks,
-aka

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-10  7:06   ` shhuiw
@ 2014-07-11  4:52     ` Amit Agarwal
  2014-07-11  5:42       ` shhuiw
  0 siblings, 1 reply; 17+ messages in thread
From: Amit Agarwal @ 2014-07-11  4:52 UTC (permalink / raw)
  To: kernelnewbies

Hi Shhuiw,

On 10-07-2014 12:36, shhuiw wrote:
> .
> Confused.
> The words "When we are running the same application on other system, 
> it comes up. "
> How can the program run on other systems?

By another system, I meant another host with the same OS and kernel.

--
Thanks,
-aka
http://blog.amit-agarwal.co.in

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-11  4:52     ` Amit Agarwal
@ 2014-07-11  5:42       ` shhuiw
  2014-07-11  5:46         ` Amit Agarwal
  0 siblings, 1 reply; 17+ messages in thread
From: shhuiw @ 2014-07-11  5:42 UTC (permalink / raw)
  To: kernelnewbies

What's your workable OS and kernel on the same system, please?


--

Regards,
shhuiw





^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-11  5:42       ` shhuiw
@ 2014-07-11  5:46         ` Amit Agarwal
  2014-07-11  6:59           ` Dave Tian
  0 siblings, 1 reply; 17+ messages in thread
From: Amit Agarwal @ 2014-07-11  5:46 UTC (permalink / raw)
  To: kernelnewbies

Hi Shhuiw,

On 11-07-2014 11:12, shhuiw wrote:
> What's your workable OS and kernel on the same system, please?
>
We can't change the OS and kernel on the system having issues, as it has
other live services running that we cannot disturb. So we tried the same
OS and kernel on another system, where the application was able to come
up. I am trying to get the buddyinfo from that working system and will
post it.

-- 
Thanks,
-aka
http://blog.amit-agarwal.co.in

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-11  5:46         ` Amit Agarwal
@ 2014-07-11  6:59           ` Dave Tian
  2014-07-11 12:02             ` Amit Agarwal
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Tian @ 2014-07-11  6:59 UTC (permalink / raw)
  To: kernelnewbies

Comparing the .config and "sysctl -a" output of the two running systems may give a hint from the kernel configuration and runtime parameters. "sysctl -a" may be much more useful in this case, I guess.
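
One way to make such a comparison repeatable is to save the "sysctl -a" output from each host and diff the parsed settings. The following is a rough Python sketch under that assumption; the function names are hypothetical, the two differing values are taken from the diff posted later in this thread, and the unchanged entry is made up.

```python
def parse_sysctl(text):
    """Parse `sysctl -a` output ("key = value" per line) into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            # Collapse runs of tabs/spaces so multi-value entries compare cleanly.
            settings[key.strip()] = " ".join(value.split())
    return settings


def diff_sysctl(left, right):
    """Return {key: (left value, right value)} for keys that differ.

    A value of None marks a key that is missing on that side.
    """
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k))
            for k in keys if left.get(k) != right.get(k)}


# Left: system having the issue. Right: working system.
LEFT = """\
kernel.pid_max = 65536
vm.min_free_kbytes = 128
kernel.msgmax = 65536
"""
RIGHT = """\
kernel.pid_max = 32768
vm.min_free_kbytes = 65535
kernel.msgmax = 65536
"""
# kernel.msgmax is a hypothetical unchanged entry, included to show filtering.

for key, (a, b) in diff_sysctl(parse_sysctl(LEFT), parse_sysctl(RIGHT)).items():
    print(f"{key}: {a} <> {b}")
```

Counters that change from moment to moment (for example fs.file-nr or kernel.random.entropy_avail) will always differ between hosts and can be ignored when reading the result.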

Dave Tian
dave.jing.tian at gmail.com




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-10  5:42 ` shhuiw
@ 2014-07-11  8:01   ` Amit Agarwal
  2014-07-14  6:50     ` shhuiw
  0 siblings, 1 reply; 17+ messages in thread
From: Amit Agarwal @ 2014-07-11  8:01 UTC (permalink / raw)
  To: kernelnewbies

Hi,


On 10-07-2014 11:12, shhuiw wrote:
> Hi,
>
> How about setting /proc/sys/vm/min_free_kbytes to small values, e.g 128?
> And what's the output on other workable system?
>
I tried changing min_free_kbytes. With this change the application comes up,
but it dies a little later while allocating memory.

Output on the work-able system is as follows:
Node 0, zone Normal 2974 3405 2898 2237 1671 1021 462 164 23 2 0
Node 0, zone Normal 2878 3384 2898 2237 1671 1021 462 164 23 2 0
Node 0, zone Normal 2827 3365 2898 2237 1671 1021 462 164 23 2 0
Node 0, zone Normal 2899 3358 2899 2237 1671 1021 462 164 23 2 0
Node 0, zone Normal 2930 3363 2899 2237 1671 1021 462 164 23 2 0
Node 0, zone Normal 2897 3362 2898 2236 1672 1021 462 164 23 2 0
Node 0, zone Normal 2860 3329 2898 2236 1672 1021 462 164 23 2 0
Node 0, zone Normal 2831 3327 2883 2236 1672 1021 462 164 23 2 0
Node 0, zone Normal 2869 3330 2872 2236 1672 1021 462 164 23 2 0
Node 0, zone Normal 2869 3331 2874 2236 1672 1021 462 164 23 2 0
Node 0, zone Normal 2844 3328 2872 2236 1672 1021 462 164 23 2 0
Node 0, zone Normal 2896 3332 2872 2236 1672 1021 462 164 23 2 0
Node 0, zone Normal 2859 3336 2872 2236 1672 1021 462 164 23 2 0
Node 0, zone Normal 283 208 132 77 46 816 462 164 23 0 0
Node 0, zone Normal 295 204 132 77 46 21 31 163 23 0 0
Node 0, zone Normal 280 204 132 77 45 21 31 163 23 0 0
Node 0, zone Normal 372 213 135 79 46 20 31 163 23 0 0
Node 0, zone Normal 273 205 133 78 46 21 32 164 22 0 0
Node 0, zone Normal 336 210 133 79 47 21 30 164 22 0 0

Here we clearly see that once the count for order 0 comes down, the
counts for order 1 and higher decrease as well, but that does not happen
on the system with the issue.
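
The drop between two snapshots can be quantified per order. Below is a small Python sketch using two consecutive lines from the working system above; the function names are hypothetical and 4 KiB pages are assumed. Note that buddyinfo only describes free physical pages, so it does not show how much virtual address space a single process has left.

```python
PAGE_SIZE = 4096  # bytes; assumes 4 KiB base pages


def order_counts(line):
    """Return the per-order free block counts from one /proc/buddyinfo line."""
    # Fields 0..3 are "Node", "<n>,", "zone", "<zone name>".
    return [int(x) for x in line.split()[4:]]


def delta_per_order(before, after):
    """Change in free block count for each order (negative means consumed)."""
    return [a - b for b, a in zip(before, after)]


def pages_consumed(delta):
    """Net number of base pages that left the free lists."""
    return -sum(d * (1 << order) for order, d in enumerate(delta))


before = order_counts("Node 0, zone Normal 2859 3336 2872 2236 1672 1021 462 164 23 2 0")
after = order_counts("Node 0, zone Normal 283 208 132 77 46 816 462 164 23 0 0")

delta = delta_per_order(before, after)
pages = pages_consumed(delta)
print("delta per order:", delta)
print(f"pages consumed: {pages} ({pages * PAGE_SIZE / 2**20:.0f} MiB)")
```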

-- 
Thanks,
-aka
http://blog.amit-agarwal.co.in

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-11  6:59           ` Dave Tian
@ 2014-07-11 12:02             ` Amit Agarwal
  0 siblings, 0 replies; 17+ messages in thread
From: Amit Agarwal @ 2014-07-11 12:02 UTC (permalink / raw)
  To: kernelnewbies

Hi Dave,

On 11-07-2014 12:29, Dave Tian wrote:
> Compare the .config and "sysctl -a" of the two running system may give a hint from the kernel configuration and running parameters. "sysctl -a" may be much more useful in this case, I guess.
>
I compared the config files and they are identical. For sysctl,
there are some differences which I felt should not matter in this case.
However, I am attaching the diff in case I missed something.

-- 
Thanks,
-aka
http://blog.amit-agarwal.co.in

-------------- next part --------------
Left file: System having issue
Right file: Working system
21   kernel.core_pattern = core.%p                                     <> 20   kernel.core_pattern = |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e
22   kernel.core_pipe_limit = 0                                           21   kernel.core_pipe_limit = 4
37   kernel.threads-max = 254259                                       <> 36   kernel.threads-max = 192117
39   kernel.random.entropy_avail = 129                                 <> 38   kernel.random.entropy_avail = 3468
48   kernel.pid_max = 65536                                            <> 47   kernel.pid_max = 32768
83   kernel.slow-work.max-threads = 64                                 <> 82   kernel.slow-work.max-threads = 32
381  kernel.hostname = <hostname stripped>                             <> 668  kernel.hostname = <hostname stripped>
388  kernel.msgmni = 256                                               <> 675  kernel.msgmni = 24044
389  kernel.msgmnb = 780000                                               676  kernel.msgmnb = 65536
391  kernel.auto_msgmni = 0                                            <> 678  kernel.auto_msgmni = 1
393  kernel.pty.nr = 859                                               <> 680  kernel.pty.nr = 13
416  vm.drop_caches = 1                                                <> 703  vm.drop_caches = 0
417  vm.min_free_kbytes = 128                                             704  vm.min_free_kbytes = 65535
434  fs.inode-nr = 285525    0                                         <> 721  fs.inode-nr = 13704     2128
435  fs.inode-state = 285525 0       0       0       0       0       0    722  fs.inode-state = 13704  2128    0       0       0       0       0
436  fs.file-nr = 1760       0       1606964                              723  fs.file-nr = 1216       0       1214127
437  fs.file-max = 1606964                                                724  fs.file-max = 1214127
439  fs.dentry-state = 1498730       1490314 45      0       0       0 <> 726  fs.dentry-state = 11667 3756    45      0       0       0
445  fs.aio-nr = 896                                                   <> 732  fs.aio-nr = 0
450  fs.epoll.max_user_watches = 3332628                               <> 737  fs.epoll.max_user_watches = 2518118
451  fs.suid_dumpable = 1                                                 738  fs.suid_dumpable = 0
467  fs.nfs.nlm_grace_period = 0                                       +-
468  fs.nfs.nlm_timeout = 10
469  fs.nfs.nlm_udpport = 0
470  fs.nfs.nlm_tcpport = 0
471  fs.nfs.nsm_use_hostnames = 0
472  fs.nfs.nsm_local_state = 0
549  net.netfilter.nf_conntrack_count = 23                             <> 830  net.netfilter.nf_conntrack_count = 164
560  net.core.rmem_max = 16777216                                      <> 841  net.core.rmem_max = 131071
797  net.ipv4.tcp_mem = 1525536      2034048 3051072                   <> 1036 net.ipv4.tcp_mem = 1152672      1536896 2305344
798  net.ipv4.tcp_wmem = 4096        65535   16777216                     1037 net.ipv4.tcp_wmem = 4096        16384   4194304
799  net.ipv4.tcp_rmem = 4096        65535   16777216                     1038 net.ipv4.tcp_rmem = 4096        87380   4194304
814  net.ipv4.tcp_dma_copybreak = 262144                               <> 1053 net.ipv4.tcp_dma_copybreak = 4096
825  net.ipv4.udp_mem = 1525536      2034048 3051072                   <> 1064 net.ipv4.udp_mem = 1152672      1536896 2305344

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-11  8:01   ` Amit Agarwal
@ 2014-07-14  6:50     ` shhuiw
  2014-07-14  7:04       ` Amit Agarwal
  0 siblings, 1 reply; 17+ messages in thread
From: shhuiw @ 2014-07-14  6:50 UTC (permalink / raw)
  To: kernelnewbies

Could you please show us your application code, if possible, or at least the mmap part? That would let us work out a test case to reproduce this on our boxes.


--

Regards,
shhuiw





^ permalink raw reply	[flat|nested] 17+ messages in thread

* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-14  6:50     ` shhuiw
@ 2014-07-14  7:04       ` Amit Agarwal
  2014-07-14  9:40         ` shhuiw
  0 siblings, 1 reply; 17+ messages in thread
From: Amit Agarwal @ 2014-07-14  7:04 UTC (permalink / raw)
  To: kernelnewbies

On 14-07-2014 12:20, shhuiw wrote:
> Will you please show us your application code, if possible? Or at 
> least the mmap part.
The application code is proprietary, so I cannot disclose that.

For the mmap part, we are not doing mmap explicitly in our code, rather 
the binary is linked to some dynamic library (here in the strace that I 
took, it was oracle client).  Application crashes while doing mmap for 
the oracle library, here is the trace:


----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
mmap(offset=33230848, len=2068480) failed with errno=12 for the file ./libclntsh.so.11.1
mmap(offset=33230848, len=2068480) failed with errno=12 for the file ./libclntsh.so.11.1
kpedbg_dmp_stack()+  call     556FC244             FFE3D814 ? 0 ?
219
kpeDbgCrash()+72     call     556F06E4             0 ? 5 ? FFE3D8E4 ? A42C90 ?
                                                   4 ? FFE3D890 ?
56AA4FDF             call     55701184             0 ? 5 ? 56E2AEFC ? 2 ? 4 ?
                                                   50 ? 4 ? FFE3E8FD ?
56785C74             call     00000000             FFE3D8F4 ? 56F27E60 ?
<stripped>  signal   00000000             B ? FFE3EDAC ? FFE3EE2C ?
PKc()+255
<stripped>  call     _ZN11CThreadDataC1E  FFE3F2C8 ? FFE3F32E ? 13 ?
<stripped>           PKc()                5720C14A ? 57144297 ? 0 ?
)+127
start_thread()+201   call     00000000             92ECDD8 ? FFE3FB70 ?
                                                   FFE3FB70 ? FFE3FB70 ?
clone()+94           call     00000000             FFE3FB70 ? 0 ? 0 ? 0 ? 0 ?
                                                   0 ?

We tried reducing the application's footprint by decreasing its caching 
of some data. The application comes up properly as long as VIRT in the 
top output stays below about 2.8GB; beyond that it starts crashing.
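
One thing that may be worth checking here, as an assumption this thread has
not confirmed, is whether the 32-bit process is running out of contiguous
virtual address space rather than physical memory; mmap reports errno 12
(ENOMEM) in both situations. The sketch below parses text in the
/proc/<pid>/maps format and reports the total mapped size and the largest
unmapped gap, which can be compared with the failing len=2068480. The
function names, the 4 GiB ceiling, and the sample mappings are all
illustrative, not taken from the failing system.

```python
def parse_maps(text):
    """Return sorted (start, end) address pairs from /proc/<pid>/maps text."""
    regions = []
    for line in text.splitlines():
        if not line.strip():
            continue
        addr_range = line.split()[0]  # e.g. "08048000-08052000"
        start, end = (int(x, 16) for x in addr_range.split("-"))
        regions.append((start, end))
    return sorted(regions)


def address_space_summary(regions, limit=1 << 32):
    """Return (bytes_mapped, largest_gap) below `limit`.

    The gap below the first mapping is counted too, although the kernel
    will not hand out the very lowest addresses, so treat the result as
    an upper bound.
    """
    mapped = sum(end - start for start, end in regions)
    largest_gap = 0
    cursor = 0
    for start, end in regions:
        largest_gap = max(largest_gap, start - cursor)
        cursor = max(cursor, end)
    largest_gap = max(largest_gap, limit - cursor)
    return mapped, largest_gap


# Hypothetical sample; on a live system read /proc/<pid>/maps instead.
SAMPLE = """\
08048000-08052000 r-xp 00000000 08:01 1234 /opt/app/bin/app
08052000-08053000 rw-p 00009000 08:01 1234 /opt/app/bin/app
55555000-55575000 r-xp 00000000 08:01 5678 /lib/ld-2.12.so
ffe1f000-ffe40000 rw-p 00000000 00:00 0 [stack]
"""

mapped, gap = address_space_summary(parse_maps(SAMPLE))
print("mapped bytes:", mapped, "largest gap:", gap)
print("room for a 2068480-byte mapping:", gap >= 2068480)
```

If the largest gap on the non-working system is smaller than the requested
length while plenty of RAM is free, that would point at address-space layout
rather than the page allocator.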

> So that we can figure out the testcase to reproduce on our boxes.

-- 
Thanks,
-aka
http://blog.amit-agarwal.co.in


* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-14  7:04       ` Amit Agarwal
@ 2014-07-14  9:40         ` shhuiw
  2014-07-14 12:16           ` Amit Agarwal
  0 siblings, 1 reply; 17+ messages in thread
From: shhuiw @ 2014-07-14  9:40 UTC (permalink / raw)
  To: kernelnewbies

Some info needed:
*  Output of 'getconf PAGESIZE' [or PAGE_SIZE] on 2 systems
*  Output of 'free' before and after the application run on the 2 systems
*  Output of 'top' on the 2 systems
*  Does the application call mlock on the mmaped area?
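
The page size and free-memory figures requested above could also be captured
in one step on each system. This is a minimal sketch, assuming a Linux
/proc/meminfo in the usual "Key:   value kB" layout; the function names and
the sample text are hypothetical.

```python
import os


def parse_meminfo(text):
    """Map /proc/meminfo keys to their integer values (kB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def snapshot(meminfo_path="/proc/meminfo"):
    """Collect page size plus total and free memory for one system."""
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    return {
        "page_size": os.sysconf("SC_PAGE_SIZE"),
        "MemTotal_kB": info.get("MemTotal"),
        "MemFree_kB": info.get("MemFree"),
    }


# Hypothetical sample text, used so the parser can be tried anywhere.
SAMPLE = "MemTotal:       16331712 kB\nMemFree:        12582912 kB\n"
print(parse_meminfo(SAMPLE))
```

Calling snapshot() before and after the application run on both systems
gives directly comparable numbers.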

--

Regards,
shhuiw





* Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64
  2014-07-14  9:40         ` shhuiw
@ 2014-07-14 12:16           ` Amit Agarwal
  0 siblings, 0 replies; 17+ messages in thread
From: Amit Agarwal @ 2014-07-14 12:16 UTC (permalink / raw)
  To: kernelnewbies


On 14-07-2014 15:10, shhuiw wrote:
> Some info needed:
> * Output of 'getconf PAGESIZE' [or PAGE_SIZE] on 2 systems
On both systems, the output is:
4096

> * Output of 'free' before and after the application run on the 2 systems
> * Output of 'top' on the 2 systems
The data is attached in a zip file. The filenames indicate the working 
and non-working systems.
> * Does the application call mlock on the mmaped area?
No, our application does not call mlock explicitly.
Also, strace on the application does not show any mlock call on the 
non-working system between application start and the time it dies.
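
A check like this can be repeated mechanically over a saved strace log. The
sketch below is only an illustration: it assumes the common strace line
shape "name(args) = result [ERRNO ...]", the helper name is hypothetical,
and the sample lines are made up rather than copied from the real trace.

```python
import re

# Optional "[pid N] " or "N " prefix, then name(args) = result [ERRNO]
SYSCALL_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\((.*)\)\s+=\s+(-?\w+)(?:\s+(E[A-Z]+))?"
)


def find_calls(text, names):
    """Return (name, result, errno) for every logged call whose name is in `names`."""
    hits = []
    for line in text.splitlines():
        m = SYSCALL_RE.match(line)
        if m and m.group(1) in names:
            hits.append((m.group(1), m.group(3), m.group(4)))
    return hits


# Hypothetical sample lines in strace's usual format.
SAMPLE = """\
open("./libclntsh.so.11.1", O_RDONLY) = 3
mmap2(NULL, 2068480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_DENYWRITE, 3, 0x1fb1000) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7700000
"""

mlock_calls = find_calls(SAMPLE, {"mlock", "mlockall"})
failed_maps = [c for c in find_calls(SAMPLE, {"mmap", "mmap2"}) if c[2] == "ENOMEM"]
print("mlock calls:", len(mlock_calls))
print("mmap calls failing with ENOMEM:", len(failed_maps))
```

An empty mlock list together with at least one ENOMEM mmap would match what
is described above.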

-- 
Thanks,
-aka
http://blog.amit-agarwal.co.in

-------------- next part --------------
A non-text attachment was scrubbed...
Name: stats.zip
Type: application/x-zip-compressed
Size: 22096 bytes
Desc: not available
Url : http://lists.kernelnewbies.org/pipermail/kernelnewbies/attachments/20140714/5d28393e/attachment-0001.bin 


Thread overview: 17+ messages
2014-07-09 12:23 Memory allocation problems on RHEL 6.3 kernel version 2.6.32-279.el6.x86_64 Amit Agarwal
2014-07-10  5:03 ` Greg KH
2014-07-11  4:38   ` Amit Agarwal
2014-07-10  5:42 ` shhuiw
2014-07-11  8:01   ` Amit Agarwal
2014-07-14  6:50     ` shhuiw
2014-07-14  7:04       ` Amit Agarwal
2014-07-14  9:40         ` shhuiw
2014-07-14 12:16           ` Amit Agarwal
2014-07-10  6:08 ` Rik van Riel
2014-07-10  7:06   ` shhuiw
2014-07-11  4:52     ` Amit Agarwal
2014-07-11  5:42       ` shhuiw
2014-07-11  5:46         ` Amit Agarwal
2014-07-11  6:59           ` Dave Tian
2014-07-11 12:02             ` Amit Agarwal
2014-07-11  4:50   ` Amit Agarwal
