From: "Tan, Jianfeng" <jianfeng.tan@intel.com>
To: "Walker, Benjamin" <benjamin.walker@intel.com>,
"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: Running DPDK as an unprivileged user
Date: Wed, 4 Jan 2017 19:39:18 +0800
Message-ID: <685186b4-e50e-c122-459b-e4635404c3f8@intel.com>
In-Reply-To: <1483044080.11975.1.camel@intel.com>
Hi Benjamin,
On 12/30/2016 4:41 AM, Walker, Benjamin wrote:
> Hi all,
>
> I've been digging in to what it would take to run DPDK as an
> unprivileged user and I have some findings that I thought
> were worthy of discussion. The assumptions here are that I'm
> using a very recent Linux kernel (4.8.15 to be specific) and
> I'm using vfio with my IOMMU enabled. I'm only interested in
> making it possible to run as an unprivileged user in this
> type of environment.
>
> There are a few key things that DPDK needs to do in order to
> run as an unprivileged user:
>
> 1) Allocate hugepages
> 2) Map device resources
> 3) Map hugepage virtual addresses to DMA addresses.
>
> For #1 and #2, DPDK works just fine today. You simply chown
> the relevant resources in sysfs to the desired user and
> everything is happy.
>
> The problem is #3. This currently relies on looking up the
> mappings in /proc/self/pagemap, but the ability to get
> physical addresses in /proc/self/pagemap as an unprivileged
> user was removed from the kernel in the 4.x timeframe due to
> the Rowhammer vulnerability. At this time, it is not
> possible to run DPDK as an unprivileged user on a 4.x Linux
> kernel.
>
> There is a way to make this work though, which I'll outline
> now. Unfortunately, I think it is going to require some very
> significant changes to the initialization flow in the EAL.
> One bit of background before I go into how to fix this -
> there are three types of memory addresses - virtual
> addresses, physical addresses, and DMA addresses. Sometimes
> DMA addresses are called bus addresses or I/O addresses, but
> I'll call them DMA addresses because I think that's the
> clearest name. In a system without an IOMMU, DMA addresses
> and physical addresses are equivalent, but in a system with
> an IOMMU any arbitrary DMA address can be chosen by the user
> to map to a given physical address. For security reasons
> (rowhammer), it is no longer considered safe to expose
> physical addresses to userspace, but it is perfectly fine to
> expose DMA addresses when an IOMMU is present.
>
> DPDK today begins by allocating all of the required
> hugepages, then finds all of the physical addresses for
> those hugepages using /proc/self/pagemap, sorts the
> hugepages by physical address, then remaps the pages to
> contiguous virtual addresses. Later on and if vfio is
> enabled, it asks vfio to pin the hugepages and to set their
> DMA addresses in the IOMMU to be the physical addresses
> discovered earlier. Of course, running as an unprivileged
> user means all of the physical addresses in
> /proc/self/pagemap are just 0, so this doesn't end up
> working. Further, there is no real reason to choose the
> physical address as the DMA address in the IOMMU - it would
> be better to just count up starting at 0.
Why not just use the virtual address as the DMA address in this case, to
avoid maintaining another kind of address?
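To make the suggestion concrete, here is a minimal sketch of a vfio
type1 mapping request in which the DMA address (IOVA) is simply the
process virtual address. The struct and ioctl come from <linux/vfio.h>;
the function names and the already configured container_fd are
assumptions for illustration, not existing DPDK code. It also presumes
the IOMMU can address the full range of user virtual addresses, which
this thread does not establish.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hypothetical helper: fill a type1 map request so that the IOVA
 * (DMA address) equals the virtual address. Makes no system calls. */
void fill_va_as_iova(struct vfio_iommu_type1_dma_map *req,
                     void *va, uint64_t len)
{
    assert(req != NULL);
    memset(req, 0, sizeof(*req));
    req->argsz = sizeof(*req);
    req->flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    req->vaddr = (uint64_t)(uintptr_t)va;  /* process virtual address */
    req->iova  = req->vaddr;               /* DMA address == virtual address */
    req->size  = len;
}

/* Hypothetical wrapper: container_fd is assumed to be a VFIO container
 * that already has a group attached and a type1 IOMMU selected.
 * Returns 0 on success, -1 with errno set on failure. */
int map_va_as_iova(int container_fd, void *va, uint64_t len)
{
    struct vfio_iommu_type1_dma_map req;

    fill_va_as_iova(&req, va, len);
    /* vfio pins the pages and programs the IOMMU in this one call. */
    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &req);
}
```

With this approach no lookup in /proc/self/pagemap is needed at all,
since the physical address never has to be known in userspace.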
> Also, because the
> pages are pinned after the virtual to physical mapping is
> looked up, there is a window where a page could be moved.
> Hugepage mappings can be moved on more recent kernels (at
> least 4.x), and the reliability of hugepages having static
> mappings decreases with every kernel release.
Do you mean the kernel might take back a physical page after mapping it
to a virtual page (perhaps copying the data to another physical page)?
Could you please point to some links or kernel commits?
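For reference, a rough sketch of the lookup being discussed: reading the
/proc/self/pagemap entry for a virtual address and extracting the page
frame number (PFN). The bit layout (bit 63 = present, bits 0-54 = PFN)
follows the kernel's pagemap documentation; the function and macro names
are made up for this example. With sufficient privilege, sampling the
PFN twice for the same address would show whether the page has moved;
as an unprivileged user the PFN field just reads as 0, as described
above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PM_PRESENT  (1ULL << 63)        /* page is present in RAM */
#define PM_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54: page frame number */

/* Decode one 64-bit pagemap entry. Returns the PFN, or 0 if the page
 * is not present (or if the kernel hid the PFN from this user). */
uint64_t pagemap_pfn(uint64_t entry)
{
    if (!(entry & PM_PRESENT))
        return 0;
    return entry & PM_PFN_MASK;
}

/* Read the pagemap entry covering va in the calling process.
 * Returns 0 on success, -1 on any error. */
int read_pagemap_entry(const void *va, uint64_t *entry)
{
    long page_size = sysconf(_SC_PAGESIZE);
    off_t off;
    ssize_t n;
    int fd;

    assert(entry != NULL);
    if (page_size <= 0)
        return -1;
    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;
    /* One 8-byte entry per virtual page, indexed by virtual page number. */
    off = (off_t)((uintptr_t)va / (uintptr_t)page_size)
          * (off_t)sizeof(uint64_t);
    if (lseek(fd, off, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    n = read(fd, entry, sizeof(*entry));
    close(fd);
    return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

Nothing in this lookup pins the page, so any address derived from the
PFN is only a snapshot taken at the time of the read.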
> Note that this
> probably means that using uio on recent kernels is subtly
> broken and cannot be supported going forward because there
> is no uio mechanism to pin the memory.
>
> The first open question I have is whether DPDK should allow
> uio at all on recent (4.x) kernels. My current understanding
> is that there is no way to pin memory and hugepages can now
> be moved around, so uio would be unsafe. What does the
> community think here?
>
> My second question is whether the user should be allowed to
> mix uio and vfio usage simultaneously. For vfio, the
> physical addresses are really DMA addresses and are best
> when arbitrarily chosen to appear sequential relative to
> their virtual addresses.
Why "sequential relative to their virtual addresses"? The IOMMU table
holds the DMA addr -> physical addr mapping, so wouldn't we need DMA
addresses that are "sequential relative to their physical addresses"?
Based on your analysis above of how hugepages are initialized, wouldn't
the virtual address be a good candidate for the DMA address?
Thanks,
Jianfeng