* Naïve question wrt ibv_reg_mr(), pinned pages and Linux VM
@ 2010-03-13 2:36 Tim Wright
[not found] ` <C7C03A40.9C2%tim.wright-c+vHNXOUHMmEK/hMebVsMw@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Tim Wright @ 2010-03-13 2:36 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Hello everybody,
I have a question regarding the behavior of memory pinning using
ibv_reg_mr() on RHEL/CentOS 5.3/OFED-1.4.1. This may be more a Linux
feature/issue, and I apologize if that is the case. Here is the scenario:
1. An application runs which allocates a significant fraction of the system
memory as RDMA buffers (e.g. 3GB of RAM on a 4GB system). These are set up
using ibv_reg_mr(). It is clear that the pages are pinned from the kernel
perspective. With just this program running, the resident set size of the
program approaches the allocation size.
2. If other memory-intensive processes are now started, the resident set
size of the RDMA-using program shrinks dramatically.
3. Even if the other memory-intensive programs are stopped, and the
RDMA-using program is forced to read its memory, the resident set never
grows to a "reasonable" size again.
My potentially foolish assumptions are/were that:
i) Since the memory is pinned anyway, it would be locked into the process
address space, and
ii) even if that were not the case, that the process would be able to regain
a large RSS when any competing processes stopped.
For ii), it almost seems that the VM doesn't realize that the pages it would
be grabbing back are already resident and therefore won't actually take any
more memory.
Clearly, I can use mlock() to avoid the issue, but I was wondering if I have
missed something obvious here. Any clues/brickbats gratefully received!
Regards,
Tim Wright
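The resident set size described in steps 1 to 3 can be sampled from inside the process. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper name resident_set_bytes() is hypothetical and is not part of libibverbs or of the original report.

```c
#include <stdio.h>
#include <unistd.h>

/* Return the calling process's resident set size in bytes, or -1 on
 * failure. Assumption: Linux, where the second field of
 * /proc/self/statm is the number of resident pages. */
long resident_set_bytes(void)
{
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f)
        return -1;

    long total_pages = 0, resident_pages = 0;
    int fields = fscanf(f, "%ld %ld", &total_pages, &resident_pages);
    fclose(f);
    if (fields != 2)
        return -1;

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    return resident_pages * page_size;
}
```

Logging this value after registration, while the competing processes run, and again after they exit would show whether the RSS recovers.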
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Naïve question wrt ibv_reg_mr(), pinned pages and Linux VM
[not found] ` <C7C03A40.9C2%tim.wright-c+vHNXOUHMmEK/hMebVsMw@public.gmane.org>
@ 2010-03-14 10:22 ` Eli Cohen
[not found] ` <20100314102239.GA23358-8YAHvHwT2UEvbXDkjdHOrw/a8Rv0c6iv@public.gmane.org>
[not found] ` <C7C25E98.10DB%tim.wright@rnanetworks.com>
0 siblings, 2 replies; 4+ messages in thread
From: Eli Cohen @ 2010-03-14 10:22 UTC (permalink / raw)
To: Tim Wright; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Tim,
the memory you register using ibv_reg_mr() is pinned and the pages
cannot be used for anything else until you release them through a
call to ibv_dereg_mr() (or the process terminates). There is no need
to use mlock.
Did you check that the call to ibv_reg_mr() succeeded?
* Re: Naïve question wrt ibv_reg_mr(), pinned pages and Linux VM
[not found] ` <20100314102239.GA23358-8YAHvHwT2UEvbXDkjdHOrw/a8Rv0c6iv@public.gmane.org>
@ 2010-03-14 19:56 ` Tim Wright
0 siblings, 0 replies; 4+ messages in thread
From: Tim Wright @ 2010-03-14 19:56 UTC (permalink / raw)
To: Eli Cohen; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Hi Eli,
Thank you for replying. Yes, the call succeeds, the memory is clearly
pinned, at least from the perspective of the kernel, and the RDMA
operations do happen on the correct memory. The problem is that the pages
get stolen from the resident set of the program. That is inefficient and
does not seem to make much sense, because the memory is locked and
stealing the pages frees almost nothing. What does seem to happen under
load is that the process page tables can get kicked out, making access to
the pinned memory very slow. Of course, other parts of the application are
also subject to paging, so mlock() is really necessary anyway if my goal is
to avoid paging, but I was surprised that the memory was only pinned into
physical memory and not also locked into the calling process's address
space.
For my purposes, it seems mlockall() is the solution, and it appears to
interact just fine with the underlying get_user_pages() operations.
Regards,
Tim
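The mlockall() approach mentioned in this message can be sketched as follows. This is a minimal illustration, not code from the application discussed here: the wrapper name lock_all_memory() is hypothetical, and the choice to pass both MCL_CURRENT and MCL_FUTURE is an assumption about what "avoid paging" requires.

```c
#include <errno.h>
#include <sys/mman.h>

/* Lock every page that is currently mapped, and every page mapped in
 * the future, into RAM. Returns 0 on success, or the errno value on
 * failure (typically ENOMEM or EPERM when RLIMIT_MEMLOCK is too low
 * and the process lacks CAP_IPC_LOCK). */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return errno;
    return 0;
}
```

A process that wants to lock several gigabytes usually needs a raised RLIMIT_MEMLOCK (for example via `ulimit -l`) or CAP_IPC_LOCK, so the return value should be checked rather than assumed.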
* Re: Naïve question wrt ibv_reg_mr(), pinned pages and Linux VM
[not found] ` <C7C25E98.10DB%tim.wright-c+vHNXOUHMmEK/hMebVsMw@public.gmane.org>
@ 2010-03-15 7:52 ` Eli Cohen
0 siblings, 0 replies; 4+ messages in thread
From: Eli Cohen @ 2010-03-15 7:52 UTC (permalink / raw)
To: Tim Wright; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Sun, Mar 14, 2010 at 12:36:26PM -0400, Tim Wright wrote:
> Hi Eli,
> Thank you for replying. Yes the call succeeds and the memory is clearly
> pinned at least from the perspective of the kernel, and the RDMA
> operations do happen on the correct memory. The problem is that the pages
> get stolen from the resident set of the program, which is inefficient, and
> doesn't seem to make a lot of sense since it doesn't free much of anything
> since the memory is locked. What does seem to happen under load is that
> the process page tables can get kicked out making access to the pinned
> memory very slow. Of course other parts of the application are also
> subject to paging, so mlock is really necessary anyway, but I was
> surprised that the memory was only pinned into physical memory and not
> also locked into the calling process' address space.
>
> For my purposes, it seems mlockall() is the solution, and it appears to
> interact just fine with the underlying get_user_pages() operations.
I see your point and I think you're right. I believe the reason for this
is that the kernel does not know in advance that it is bound to fail to
release the page (because get_user_pages() takes a reference on it), so it
tries to swap out the process's page anyway. mlocking the memory seems to
do the job. If you had more memory, you might not have hit this problem in
the first place.
In the early days of IB we used mlock to lock the pages, but it was not
trivial to find the physical pages. Now we call get_user_pages(), which
gives us the list of pages.
I wonder how bad the effect of paging the pages in again is. After all,
the correct page should be found quite fast, since it does not have to be
allocated again, only found.
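One way to put a number on the cost of paging the pages in again is to count page faults while touching the buffer. The sketch below is an illustration under assumptions: the helper name count_faults_on_touch() is hypothetical, and it relies on getrusage() reporting minor faults (serviced without disk I/O) separately from major faults (requiring I/O). A pinned page that only has to be found again would be expected to show up as a minor fault.

```c
#include <stddef.h>
#include <sys/resource.h>
#include <unistd.h>

/* Read one byte from every page of buf and report how many minor and
 * major page faults the process took while doing so.
 * Returns 0 on success, -1 if sysconf() or getrusage() fails. */
int count_faults_on_touch(const volatile unsigned char *buf, size_t len,
                          long *minor, long *major)
{
    struct rusage before, after;
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || getrusage(RUSAGE_SELF, &before) != 0)
        return -1;

    /* The volatile qualifier keeps the compiler from dropping the reads. */
    unsigned char sink = 0;
    for (size_t off = 0; off < len; off += (size_t)page)
        sink ^= buf[off];
    (void)sink;

    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;

    *minor = after.ru_minflt - before.ru_minflt;
    *major = after.ru_majflt - before.ru_majflt;
    return 0;
}
```

Running this over the registered buffer after the competing processes have exited, and timing the call, would separate the cost of re-establishing mappings from the cost of real swap-in.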