From: Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
To: Alexander Schmidt
<alexs-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Cc: of-ewg <ewg-ZwoEplunGu1OwGhvXhtEPSCwEArCW2h5@public.gmane.org>,
Linux RDMA <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Hoang-Nam Nguyen
<HNGUYEN-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
Stefan Roscher
<stefan.roscher-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>,
Joachim Fenkes <fenkes-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>,
Christoph Raisch <raisch-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>,
Alex Vainman
<alexonlists-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC] libibverbs: ibv_fork_init() and libhugetlbfs
Date: Thu, 06 May 2010 13:55:31 -0700
Message-ID: <adavdb092d8.fsf@roland-alpha.cisco.com>
In-Reply-To: <20100506093949.55916ab0@alex-laptop> (Alexander Schmidt's message of "Thu, 6 May 2010 09:39:49 +0200")
> When fork support is enabled in libibverbs, madvise() is called for every
> memory page that is registered as a memory region. Memory ranges that
> are passed to madvise() must be page aligned and the size must be a
> multiple of the page size. libibverbs uses sysconf(_SC_PAGESIZE) to find
> out the system page size and rounds all ranges passed to reg_mr() according
> to this page size. When memory from libhugetlbfs is passed to reg_mr(), this
> does not work as the page size for this memory range might be different
> (e.g. 16Mb). So libibverbs would have to use the huge page size to
> calculate a page aligned range for madvise.
Yes, Alex Vainman raised this same issue a while ago.
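To make the alignment requirement in the quoted text concrete, here is a minimal sketch of rounding a registered range outward to a given page size. The names align_range and struct aligned_range are hypothetical and are not taken from libibverbs; the sketch also assumes the page size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not part of libibverbs. */
struct aligned_range {
    uintptr_t start;   /* rounded down to a page boundary */
    size_t    length;  /* multiple of page_size covering the input range */
};

/*
 * Expand [addr, addr + len) outward so that it starts and ends on a
 * page_size boundary. Assumes page_size is a power of two.
 */
struct aligned_range align_range(uintptr_t addr, size_t len, size_t page_size)
{
    struct aligned_range r;
    uintptr_t mask, end;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    assert(len > 0);

    mask = (uintptr_t)page_size - 1;
    r.start = addr & ~mask;                 /* round start down */
    end = (addr + len + mask) & ~mask;      /* round end up */
    r.length = (size_t)(end - r.start);
    return r;
}
```

With a 4 KB page size a small buffer maps to a single 4 KB range, while the same buffer inside a 16 MB huge page needs the whole 16 MB range, which is why the value returned by sysconf(_SC_PAGESIZE) is not sufficient for such memory.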
> The patch below demonstrates a possible solution for this. It parses the
> /proc/PID/maps file when registering a memory region and decides if the
> memory that is to be registered is part of a libhugetlbfs range or not. If so,
> a page size of 16Mb is used to align the memory range passed to madvise().
>
> We see two problems with this: it is not a very elegant solution to parse the
> procfs file and the 16Mb are hardcoded currently. The latter point could be
> solved by calling gethugepagesize() from libhugetlbfs, which would add a new
> dependency to libibverbs.
I think that we cannot assume huge pages only come from libhugetlbfs --
we should support an application directly enabling huge pages (possibly
via another library too, so we can't assume that an application knows
the page size for a memory range it is about to register).
And also the 16 MB page size constant is of course not feasible -- with
all due respect, the x86 page size of 2 MB is much more likely in
practice :) (Although perhaps the much slower PowerPC TLB refill makes
users more likely to try and use hugetlb pages ;)
Alex suggested parsing files in the same way as libhugetlbfs does to get
the page size, and that seems to be the best solution, since I don't
think the libhugetlbfs license is compatible with the BSD license for
libibverbs.
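As a rough illustration of getting the huge page size by parsing a file, the sketch below reads the "Hugepagesize:" line in the format used by /proc/meminfo. Whether this is the file Alex had in mind is an assumption, and parse_hugepagesize is a hypothetical name. Note that this only yields the system default huge page size, not the page size of a particular mapping.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: scan meminfo-formatted text for a line such as
 * "Hugepagesize:       2048 kB" and return the size in bytes.
 * Returns 0 if no such line is found.
 */
unsigned long parse_hugepagesize(FILE *f)
{
    char line[256];
    unsigned long kb;

    assert(f != NULL);
    while (fgets(line, sizeof line, f)) {
        /* Whitespace in the format matches any run of blanks. */
        if (sscanf(line, "Hugepagesize: %lu kB", &kb) == 1)
            return kb * 1024UL;
    }
    return 0;
}
```

A caller would typically pass the result of fopen("/proc/meminfo", "r") and fall back to sysconf(_SC_PAGESIZE) when the function returns 0.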
But your trick of using /proc/*/maps looks nice. Does that only work
for libhugetlbfs or can we recognize direct mmap of hugetlb pages?
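For reference, the /proc/PID/maps approach amounts to finding the line whose address range contains the buffer and inspecting that mapping. The following is only a sketch under assumptions: find_mapping_path is a hypothetical name, it is not the code from the RFC patch, and it assumes addresses fit in an unsigned long. Deciding whether the returned pathname actually refers to hugetlb memory is left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan maps-formatted text
 * ("start-end perms offset dev inode [pathname]") for the mapping that
 * contains addr. On success, return 1 and copy the pathname (empty for
 * mappings without one) into path. Return 0 if no mapping contains addr.
 */
int find_mapping_path(FILE *maps, unsigned long addr,
                      char *path, size_t path_len)
{
    char line[512];

    assert(maps != NULL && path != NULL && path_len > 0);
    while (fgets(line, sizeof line, maps)) {
        unsigned long start, end;
        char name[256] = "";
        /* Skip perms, offset, dev and inode; keep the rest of the line. */
        int n = sscanf(line, "%lx-%lx %*s %*s %*s %*s %255[^\n]",
                       &start, &end, name);

        if (n < 2)
            continue;               /* not a well-formed maps line */
        if (addr >= start && addr < end) {
            strncpy(path, name, path_len - 1);
            path[path_len - 1] = '\0';
            return 1;
        }
    }
    return 0;
}
```

A caller would pass the result of fopen("/proc/self/maps", "r") together with the start address of the buffer being registered.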
- R.
--
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html