public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
* Naïve question wrt ibv_reg_mr(), pinned pages and Linux VM
@ 2010-03-13  2:36 Tim Wright
       [not found] ` <C7C03A40.9C2%tim.wright-c+vHNXOUHMmEK/hMebVsMw@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Tim Wright @ 2010-03-13  2:36 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Hello everybody,
I have a question regarding the behavior of memory pinning with
ibv_reg_mr() on RHEL/CentOS 5.3 with OFED 1.4.1. This may be more of a
Linux feature/issue, and I apologize if that is the case. Here is the
scenario:

1. An application runs which allocates a significant fraction of system
memory as RDMA buffers (e.g. 3 GB of RAM on a 4 GB system). These are set
up using ibv_reg_mr(). It is clear that the pages are pinned from the
kernel's perspective. With just this program running, the resident set
size of the program approaches the allocation size.
2. If other memory-intensive processes are now started, the resident set
size of the RDMA-using program shrinks dramatically.
3. Even if the other memory-intensive programs are stopped, and the
RDMA-using program is forced to read its memory, the resident set never
grows to a "reasonable" size again.

My potentially foolish assumptions are/were that:
i) Since the memory is pinned anyway, it would be locked into the process
address space, and
ii) even if that were not the case, that the process would be able to regain
a large RSS when any competing processes stopped.

For ii), it almost seems that the VM doesn't realize that the pages it
would be grabbing back are already resident and therefore won't actually
take any more memory.

Clearly, I can use mlock() to avoid the issue, but I was wondering if I have
missed something obvious here. Any clues/brickbats gratefully received!

Regards,

Tim Wright

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Naïve question wrt ibv_reg_mr(), pinned pages and Linux VM
       [not found] ` <C7C03A40.9C2%tim.wright-c+vHNXOUHMmEK/hMebVsMw@public.gmane.org>
@ 2010-03-14 10:22   ` Eli Cohen
       [not found]     ` <20100314102239.GA23358-8YAHvHwT2UEvbXDkjdHOrw/a8Rv0c6iv@public.gmane.org>
       [not found]     ` <C7C25E98.10DB%tim.wright@rnanetworks.com>
  0 siblings, 2 replies; 4+ messages in thread
From: Eli Cohen @ 2010-03-14 10:22 UTC (permalink / raw)
  To: Tim Wright; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Tim,
the memory you register with ibv_reg_mr() is pinned, and the pages
cannot be used for anything else until you release them through a
call to ibv_dereg_mr() (or the process terminates). There is no need
to use mlock().
Did you check that the call to ibv_reg_mr() succeeded?

On Fri, Mar 12, 2010 at 09:36:48PM -0500, Tim Wright wrote:
> Hello everybody,
> I have a question regarding behavior of the memory pinning using
> ibv_reg_mr() on RHEL/Centos 5.3/OFED-1.4.1. This may be more a Linux
> feature/issue, and I apologize if that is the case. Here is the scenario
> 
> 1. An application runs which allocates a significant fraction of the system
> memory as RDMA buffers (e.g. 3GB of ram on a 4GB system). These are setup
> using ibv_reg_mr(). It is clear that the pages are pinned from the kernel
> perspective. With just this program running, the resident set size of the
> program approaches the allocation size.
> 2. If other memory-intensive processes are now started, the resident set
> size of the RDMA-using program shrinks dramatically.
> 3. Even if the other memory-intensive programs are stopped, and the
> RDMA-using program is forced to read its memory, the resident set never
> grows to a "reasonable" size again.
> 
> My potentially foolish assumptions are/were that:
> i) Since the memory is pinned anyway, it would be locked into the process
> address space, and
> ii) even if that were not the case, that the process would be able to regain
> a large RSS when any competing processes stopped.
> 
> For ii), it almost seems that the VM doesn't realize that the pages it would
> be grabbing back are already resident and therefore won't actually take any
> more memory.
> 
> Clearly, I can use mlock() to avoid the issue, but I was wondering if I have
> missed something obvious here. Any clues/brickbats gratefully received!
> 
> Regards,
> 
> Tim Wright
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Naïve question wrt ibv_reg_mr(), pinned pages and Linux VM
       [not found]     ` <20100314102239.GA23358-8YAHvHwT2UEvbXDkjdHOrw/a8Rv0c6iv@public.gmane.org>
@ 2010-03-14 19:56       ` Tim Wright
  0 siblings, 0 replies; 4+ messages in thread
From: Tim Wright @ 2010-03-14 19:56 UTC (permalink / raw)
  To: Eli Cohen; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Hi Eli,
Thank you for replying. Yes, the call succeeds, the memory is clearly
pinned at least from the perspective of the kernel, and the RDMA
operations do happen on the correct memory. The problem is that the pages
get stolen from the program's resident set, which is inefficient and
doesn't make much sense, since evicting locked pages doesn't actually
free anything. What does seem to happen under load is that the process
page tables can get kicked out, making access to the pinned memory very
slow. Of course, other parts of the application are also subject to
paging, so mlock() is really necessary anyway if my goal is to avoid
paging, but I was surprised that the memory was only pinned into physical
memory and not also locked into the calling process's address space.

For my purposes, it seems mlockall() is the solution, and it appears to
interact just fine with the underlying get_user_pages() operations.

Regards,

Tim



On 3/14/10 3:22 AM, "Eli Cohen" <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:

> Tim,
> the memory you register using ibv_reg_mr() is pinned and the pages
> cannot be used for anything else until you release them through a
> call to ibv_dereg_mr() (or the process terminates). There is no need
> to use mlock.
> Did you check that the call to ibv_reg_mr() succeeded?
> 
> On Fri, Mar 12, 2010 at 09:36:48PM -0500, Tim Wright wrote:
>> [original message quoted above, trimmed]
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Naïve question wrt ibv_reg_mr(), pinned pages and Linux VM
       [not found]       ` <C7C25E98.10DB%tim.wright-c+vHNXOUHMmEK/hMebVsMw@public.gmane.org>
@ 2010-03-15  7:52         ` Eli Cohen
  0 siblings, 0 replies; 4+ messages in thread
From: Eli Cohen @ 2010-03-15  7:52 UTC (permalink / raw)
  To: Tim Wright; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Sun, Mar 14, 2010 at 12:36:26PM -0400, Tim Wright wrote:
> Hi Eli,
> Thank you for replying. Yes the call succeeds and the memory is clearly
> pinned at least from the perspective of the kernel, and the RDMA
> operations do happen on the correct memory. The problem is that the
> pages get stolen from the resident set of the program, which is
> inefficient, and doesn't seem to make a lot of sense since it doesn't
> free much of anything since the memory is locked. What does seem to
> happen under load is that the process page tables can get kicked out
> making access to the pinned memory very slow. Of course other parts of
> the application are also subject to paging, so mlock is really
> necessary anyway, but I was surprised that the memory was only pinned
> into physical memory and not also locked into the calling process'
> address space.
> 
> For my purposes, it seems mlockall() is the solution, and it appears to
> interact just fine with the underlying get_user_pages() operations.

I see your point and I think you're right. I believe the reason for
this is that the kernel does not know in advance that it is bound to
fail to release the page (because get_user_pages() has taken a
reference on it), so it tries to swap out the process's pages anyway.
mlock()ing the memory seems to do the job. If you had more memory,
perhaps you would not have hit this problem in the first place.
In the early days of IB we used mlock() to lock the pages, but it was
not trivial to find the physical pages. Now we call get_user_pages(),
which gives us the list of pages.
I wonder how bad the effect of paging the pages back in really is.
After all, the correct page should be found quite quickly, since it
does not have to be allocated again, only looked up.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2010-03-15  7:52 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-13  2:36 Naïve question wrt ibv_reg_mr(), pinned pages and Linux VM Tim Wright
     [not found] ` <C7C03A40.9C2%tim.wright-c+vHNXOUHMmEK/hMebVsMw@public.gmane.org>
2010-03-14 10:22   ` Eli Cohen
     [not found]     ` <20100314102239.GA23358-8YAHvHwT2UEvbXDkjdHOrw/a8Rv0c6iv@public.gmane.org>
2010-03-14 19:56       ` Tim Wright
     [not found]     ` <C7C25E98.10DB%tim.wright@rnanetworks.com>
     [not found]       ` <C7C25E98.10DB%tim.wright-c+vHNXOUHMmEK/hMebVsMw@public.gmane.org>
2010-03-15  7:52         ` Eli Cohen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox