* Problem with mlockall() and Threads: memory usage
From: Terry Barnaby @ 2004-05-18 10:10 UTC
To: linux-kernel
Hi,
We have a problem with a soft real-time program that uses mlockall
to improve its latency.
The basic problem, which can be seen with a simple test example, is
that if we have a program that uses a large amount of memory, uses multiple
threads and uses mlockall() the physical memory usage goes through the
roof. This problem/feature is present using RedHat 7.3 (2.4.x libc user level
threads), RedHat 9 (2.4.20 kernel threads) and Fedora Core 2 (2.6.5).
Our simple test program first calls mlockall(MCL_CURRENT | MCL_FUTURE),
mallocs 10 MBytes and then creates 8 threads, all of which pause.
The memory usage with the mlockall() call is:
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 2251 pts/1    SL     0:00      0     2 95921 95924 37.3 ./t2 8
The memory usage without the mlockall() call is:
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 2275 pts/1    S      0:00      0     2 95929 11152  4.3 ./t2 8
It appears that the kernel is allocating physical memory for each
of the threads' shared data areas rather than allocating just
the one shared area.
Are we doing something wrong?
Is this the correct behaviour?
Is this a kernel or glibc bug?
Example code follows:
Terry
/*******************************************************************************
* T2.c Test Threads
* T.Barnaby, BEAM Ltd, 18/5/04
*******************************************************************************
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/mman.h>

const int memSize = (10 * 1024*1024);

// Each thread prints once and then sleeps until a signal arrives
void* threadFunc(void* arg){
    (void)arg;
    while(1){
        printf("Thread::function: loop: Pid(%d)\n", getpid());
        pause();
    }
}

void test1(int n){
    pthread_t* threads;
    void* mem;
    int i;

    threads = (pthread_t*)malloc(n * sizeof(pthread_t));
    mem = malloc(memSize);
    if(!threads || !mem){
        fprintf(stderr, "Error: out of memory\n");
        exit(1);
    }
    // Touch every page so that the 10 MBytes really is resident
    memset(mem, 0, memSize);
    printf("Mem: %p\n", mem);

    for(i = 0; i < n; i++){
        if(pthread_create(&threads[i], 0, threadFunc, 0) != 0)
            fprintf(stderr, "Warning: unable to create thread %d\n", i);
    }
    pause();
}

int main(int argc, char** argv){
    if(argc != 2){
        fprintf(stderr, "Usage: t2 <numberOfThreads>\n");
        return 1;
    }

#ifndef ZAP
    // Lock in all of the pages of this application
    if(mlockall(MCL_CURRENT | MCL_FUTURE) < 0)
        fprintf(stderr, "Warning: unable to lock in memory pages\n");
#endif

    test1(atoi(argv[1]));
    return 0;
}
--
Dr Terry Barnaby BEAM Ltd
Phone: +44 1454 324512 Northavon Business Center, Dean Rd
Fax: +44 1454 313172 Yate, Bristol, BS37 5NH, UK
Email: terry@beam.ltd.uk Web: www.beam.ltd.uk
BEAM for: Visually Impaired X-Terminals, Parallel Processing, Software
"Tandems are twice the fun !"
* Re: Problem with mlockall() and Threads: memory usage
From: Terry Barnaby @ 2004-05-18 12:51 UTC
To: Mike Black; +Cc: Terry Barnaby, linux-kernel

Thanks for that.
I have done some more investigating, and on my system (standard RedHat 9)
the stack ulimit is set to 8192 KBytes. So it appears that the thread
library/kernel threads preallocate, and write to, 8192 KBytes of stack per
thread, and mlockall() then locks all of this in memory.

Shouldn't the thread library grow the stack rather than preallocate it all,
even with mlockall(), as malloc does?

I also notice that if I set the per-thread stack with
pthread_attr_setstacksize(), this sets the hard limit for the stack size
rather than the initial stack size as stated in the pthread.h include file.
Maybe there is another way to set the initial stack size per thread?

Anyway, I presume this stack manipulation is done in the user-level threads
library rather than the kernel (even on NPTL). So I guess I should move this
question to the list for Linux Threads. Any ideas where this list is?

Cheers

Terry

Mike Black wrote:
> I compiled your program on my system and it behaves like you would expect.
> Looks like about 2Meg per thread overhead..
> t5 is with mlock and t5a is without -- I've attached a static compile of t5
> so you can test it on your system.
> That way it will tell whether it's just your compiler/library setup or the OS.
> I'm running Linux 2.6.6, libc-2.3.2, libpthread-0.10
>
> The zip file password is "t5"
* RE: Problem with mlockall() and Threads: memory usage
From: David Schwartz @ 2004-05-18 20:38 UTC
To: Mike Black; +Cc: linux-kernel

> Thanks for that.
> I have done some more investigating, and on my system (Standard RedHat 9)
> the stack ulimit is set to 8192 KBytes. So it appears that the thread
> library/kernel threads pre-allocates, and writes to, 8192 KBytes
> of stack per thread and so then mlockall() locks all of this in memory.
>
> Shouldn't the Thread library grow the stack rather than
> preallocate it all even with mlockall() like malloc ?

I thought you wanted improved latency. Surely having to find a page for you
when your stack grows will add unpredictable latency. So, no, the thread
library should reserve the stack when 'mlockall(MCL_FUTURE)' is specified.

I do agree that having an 'initial stack size' in addition to a 'maximum
stack size' would be a good idea. The former is good for applications that
are concerned about physical memory usage and the latter for applications
concerned about virtual memory usage.

DS
* Re: Problem with mlockall() and Threads: memory usage
From: Terry Barnaby @ 2004-05-19 8:45 UTC
To: davids; +Cc: Mike Black, linux-kernel

Hi David,

We do want improved latency, but with reasonable memory usage. This is
a soft real-time system. At the moment the memory usage is far too
high in our application.

With 20 threads running, the system will lock 160 MBytes of memory just
for stack space (8 MBytes each), although the application probably
only needs 2 MBytes in total. We can reduce the maximum stack size per
thread, but then if a thread grows its stack beyond this limit the
application will crash with a segmentation fault, not good ...

For our use, mapping in physical memory as required for a growing
stack would be a good compromise between latency and memory usage.
Once the system has run the worker threads for a short time, all of the
needed stack memory will be locked in and latency will be controlled.
If a thread needs more memory for stack in a particular instance,
there will be a latency hit, but this would be acceptable and much
better than a crash.

Terry
* Re: Problem with mlockall() and Threads: memory usage
From: Elladan @ 2004-05-20 0:23 UTC
To: Terry Barnaby; +Cc: davids, Mike Black, linux-kernel

On Wed, May 19, 2004 at 09:45:27AM +0100, Terry Barnaby wrote:
> Hi David,
>
> We do want improved latency, but with reasonable memory usage. This is
> a soft real-time system. At the moment the memory usage is far too
> high in our application.
>
> With 20 threads running the system will lock 160MBytes of memory just
> for stack space (8 MBytes each), although the application probably
> only needs 2MByte in total. We can reduce the maximum stack size per
> thread, but then if a thread increases its stack size beyond this the
> application will crash with a segment fault, not good ...
>
> For our use, mapping in physical memory as required for a growing
> stack would be a good compromise between latency and memory usage.
> Once the system has run the worker threads for a short time all of the
> needed stack memory will be locked in and latency will be controlled.
> If a thread needs more memory for stack in a particular instance,
> there will be a latency hit but this would be acceptable and much
> better than a crash.

It sounds to me like you have a really special-purpose situation here.
You want to minimize the amount of memory used, but you may have deep
stacks of unknown depth and you can't grow them safely without incurring
latency.

It seems to me that you really should just figure out how much stack
your app really needs and set your limits appropriately. If your
program requires indeterminate stack depth, you should fix it so it
doesn't.

If you really, really want random memory allocations and memory locking
at the same time, you could implement your own mlockall solution with
your own stack manager. You could do an mlockall(MCL_CURRENT) with
small stack reserves, and then manually go and remap your stack space
the way you want it. Of course, you'd need your own memory allocator if
you ever allocate more non-stack memory, but you'll need that anyway.

-J
* Re: Problem with mlockall() and Threads: memory usage
From: Terry Barnaby @ 2004-05-21 14:28 UTC
To: Elladan; +Cc: Terry Barnaby, davids, Mike Black, linux-kernel

Hi Elladan,

Thanks for the response. I don't think we have a really special-purpose
situation here. The application is relatively complex in that it uses
CORBA as well as many other libraries. We cannot hope to calculate the
necessary total stack usage, just make an educated guess based on
observed stack usage during running. It is very likely that the stack
usage will be small, and we can safely set the stack size small, however
we cannot guarantee this. Normally VM allows programmers flexibility with
memory size limitations.

It seems inconsistent that when using mlockall() newly malloced memory
will be paged in as required but stacks are fixed into memory. Both of
these mechanisms involve dynamically extending the process's memory.
I would have thought it better that stacks followed the malloc() model,
in that memory pages are only allocated as required. A system that
needed to guarantee no page faults could preallocate the stack easily,
as it would have to with the heap. I do understand that this could be
difficult to achieve, however.

Anyway, cheers for the response and pointers.

Cheers

Terry

Some info gained for those reading this thread:

1. If you have a threaded application and you use
   mlockall(MCL_CURRENT | MCL_FUTURE) then the full amount of each
   thread's preallocated stack will be mapped into physical memory.

2. If you use pthread_create(&t, NULL, func, 0) then RedHat 9 will
   allocate 8 MBytes of stack per thread. (Possibly the amount set by
   the process's stack ulimit?)

3. If you use pthread_create(&t, &a, func, 0) and set up the attributes
   with pthread_attr_init(&a) then RedHat 9 will allocate 2 MBytes to
   each thread. The pthreads manual states that using pthread_create()
   with attributes set using pthread_attr_init() is the same as passing
   NULL to pthread_create(); this is WRONG. Also, using
   pthread_attr_init() will set the thread's scheduler to SCHED_OTHER
   rather than "inherit" the parent's scheduler config, as passing NULL
   to pthread_create() does.