From: Walker, Benjamin <benjamin.walker at intel.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] Running SPDK As Non-root User
Date: Tue, 27 Dec 2016 21:19:19 +0000	[thread overview]
Message-ID: <1482873557.6198.1.camel@intel.com> (raw)
In-Reply-To: 8046EFA3-9F5C-4BB1-B540-51C0ED39D968@cloudsimple.com

On Fri, 2016-12-23 at 16:46 +0530, karthi wrote:
> Hi All,
>  
> I’m trying to run SPDK (NVMf target) in a non-root user mode. I’m ending up in
> rte_mempool_init always returns 0MB Available memory. But if I run the same as
> a root user, everything works fine.  Can Someone help me out on this and I
> have come across some DPDK mail thread for the same issue.
>  
> http://dpdk.org/ml/archives/users/2016-July/000709.html
>  
> Can DPDK currently run in non-root user mode?

Yes, it can, although it isn't well tested or heavily used as far as I can tell.

>  
> Is someone tried experimenting this in either DPDK or SPDK?

SPDK's scripts/setup.sh will do all of the things necessary to run your
application as an unprivileged user if your system has the IOMMU enabled in the
kernel boot parameters. The script will automatically use vfio instead of uio
and correctly set the permissions on all of the devices and hugepage files to
the user you ran the script as. Note that you do need to run the script as the
user you want to grant permission, but under 'sudo' because it needs root
permission to do the initial setup.

I'd like to dive into the details here a bit so that everyone knows where we
really stand on this. I see running as an unprivileged user as a key feature for
production deployments, so it's one of those things that we really do want to
get functioning as a first-class citizen in our test pool and documentation as
soon as possible.

DPDK and SPDK (and user space drivers in general) need two features from the
kernel to function.

First, they must be able to allocate memory that has a fixed virtual to physical
address mapping, which is usually called 'pinned' memory. This memory is used
for DMA operations that happen asynchronously and off of the CPU, e.g. the data
for reads and writes from the SSD or NIC. DPDK/SPDK accomplish this by
allocating the memory from hugepages, which happen to meet this requirement
today.
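
A minimal sketch for checking how much hugepage memory the kernel has reserved,
which is a reasonable first thing to look at when the memory pool reports 0MB
available. It only reads the standard HugePages_* and Hugepagesize fields of
/proc/meminfo. The function names are made up for this example, and it says
nothing about whether the current user is allowed to use the hugetlbfs mount.

```python
def parse_hugepages(meminfo_text):
    """Extract the hugepage counters from the text of /proc/meminfo."""
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Counter lines look like "HugePages_Free:      512";
            # the size line looks like "Hugepagesize:    2048 kB".
            info[key] = int(rest.split()[0])
    return info


def free_hugepage_mb(info):
    """Free hugepage memory in MB (Hugepagesize is reported in kB)."""
    return info.get("HugePages_Free", 0) * info.get("Hugepagesize", 0) // 1024


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(free_hugepage_mb(parse_hugepages(f.read())), "MB free in hugepages")
```

If this reports free memory while the application still sees none as a non-root
user, the ownership and mode of the hugepage files are the next thing to check.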

Second, they must be able to map the memory region described by the base address
register (BAR) in the PCI header into a virtual address accessible by the
process they are running in. DPDK/SPDK can do this using one of two Linux kernel
mechanisms - uio and vfio - where vfio is the newer of the two. Both mechanisms
present the user with a file in sysfs that can be mmap'd for this purpose.
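
To make the BAR mapping concrete, here is a rough sketch that uses the generic
PCI files in sysfs: the `resource` file lists start, end and flags for each
region, and `resource0` can be mmap'd to reach BAR 0. This is only the simplest
case and normally needs root. The vfio path goes through /dev/vfio and ioctls
instead and is not shown. The function names and the example device address are
placeholders.

```python
import mmap
import os


def parse_bars(resource_text):
    """Return a (start, size) pair per region line of a sysfs `resource` file."""
    bars = []
    for line in resource_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, _flags = (int(field, 16) for field in fields)
        # Unused regions are reported as all zeros.
        size = end - start + 1 if end else 0
        bars.append((start, size))
    return bars


def map_bar0(bdf):
    """mmap BAR 0 of the PCI device at `bdf`, e.g. "0000:01:00.0" (placeholder)."""
    dev = "/sys/bus/pci/devices/" + bdf
    with open(dev + "/resource") as f:
        _start, size = parse_bars(f.read())[0]
    fd = os.open(dev + "/resource0", os.O_RDWR | os.O_SYNC)
    try:
        # The mapping keeps its own duplicate of the descriptor.
        return mmap.mmap(fd, size, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        os.close(fd)
```

Do not call map_bar0() on a device that a kernel driver is still using; poking
at its registers from user space can hang the device or the machine.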

Running SPDK as a non-root user, then, is a matter of finding ways to accomplish
the above two tasks without root. Hugepages are documented here:

https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt

The BAR is harder and it wasn't possible to map this as an unprivileged user
until the introduction of vfio. With vfio, a privileged user can simply grant
access to the resource file in sysfs to any other unprivileged user. See:

https://www.kernel.org/doc/Documentation/vfio.txt

Note that SPDK's scripts/setup.sh does all of this automatically if it detects
that your system has the IOMMU enabled. The IOMMU is a critical piece
security-wise for unprivileged execution because it will prevent your otherwise
unprivileged user from having free rein to DMA to any memory address.
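
One common way to make this kind of check by hand is to look for IOMMU groups in
sysfs; whether setup.sh does exactly this is an assumption, not something stated
in this mail. The sketch below also resolves the /dev/vfio/<group> node for a
device, which is the file whose ownership a privileged user would change. The
function names are hypothetical.

```python
import os


def iommu_enabled(groups_dir="/sys/kernel/iommu_groups"):
    """True if the kernel has created at least one IOMMU group."""
    try:
        return len(os.listdir(groups_dir)) > 0
    except FileNotFoundError:
        return False


def vfio_group_path(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the /dev/vfio/<group> node for the PCI device at `bdf`."""
    # <device>/iommu_group is a symlink whose last component is the group number.
    link = os.path.join(sysfs_root, bdf, "iommu_group")
    return "/dev/vfio/" + os.path.basename(os.readlink(link))


if __name__ == "__main__":
    print("IOMMU groups present:", iommu_enabled())
```

Granting access is then an ordinary chown of the path returned by
vfio_group_path() to the unprivileged user, done once as root.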

The above all sounds great, except I just actually tried this and when I run the
NVMe identify example at ./examples/nvme/identify, DPDK gives me an error! After
much debugging, it turns out that in kernel 4.0 the kernel devs removed the
ability to get physical addresses for pages from /proc/self/pagemap without
elevated privileges in response to the Rowhammer exploit. Everything should work
fine on a 3.x series kernel, but I don't have access to a machine to test it out
right now.
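
The lookup that fails can be reproduced with a short sketch. Each virtual page
has one 64-bit entry in /proc/self/pagemap: bit 63 says the page is present and
bits 0-54 hold the page frame number (PFN). Without privileges the open is
refused on the kernels described above; later kernels are understood to return
an entry with the PFN zeroed instead, so the sketch treats both as "unknown".
The function names are made up for the example.

```python
import ctypes
import os
import struct

PFN_MASK = (1 << 55) - 1   # bits 0-54: page frame number
PRESENT_BIT = 1 << 63      # bit 63: page is present in RAM


def decode_pagemap_entry(entry):
    """Split a 64-bit pagemap entry into (present, pfn)."""
    present = bool(entry & PRESENT_BIT)
    return present, (entry & PFN_MASK if present else 0)


def virt_to_phys(vaddr):
    """Physical address behind `vaddr` in this process, or None if unknown."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open("/proc/self/pagemap", "rb", buffering=0) as f:
        f.seek((vaddr // page_size) * 8)          # one 8-byte entry per page
        (entry,) = struct.unpack("=Q", f.read(8))
    present, pfn = decode_pagemap_entry(entry)
    if not present or pfn == 0:                   # PFN hidden or page not mapped
        return None
    return pfn * page_size + vaddr % page_size


if __name__ == "__main__":
    buf = ctypes.create_string_buffer(4096)
    buf[0] = b"x"                                 # touch the page so it is backed
    try:
        print(virt_to_phys(ctypes.addressof(buf)))
    except PermissionError:
        print("pagemap is not readable without elevated privileges")
```

Run as root and as a normal user to see the difference on a given kernel.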

The simplest solution would be to make your program start up with elevated
permissions and initialize DPDK. That provides the required virtual to physical
memory mappings, which should be static for hugepages. After initialization, the
privilege can be dropped back down. That's the strategy I would pursue for your
application today. We'll keep looking at this and working with the kernel
community to come up with a better long term solution.
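
A minimal sketch of that start-privileged-then-drop ordering, assuming the
program is launched under sudo so that the standard SUDO_UID and SUDO_GID
variables identify the user to fall back to. The function names are
hypothetical, and the privileged initialization itself is left as a comment. A
real SPDK application would do the same thing in C around its environment
initialization.

```python
import os


def sudo_caller_ids(environ):
    """Return (uid, gid) of the user who invoked sudo, or None if unknown."""
    try:
        return int(environ["SUDO_UID"]), int(environ["SUDO_GID"])
    except (KeyError, ValueError):
        return None


def drop_privileges(uid, gid):
    """Permanently switch from root to the given unprivileged uid/gid."""
    if uid == 0:
        raise ValueError("refusing to 'drop' to root")
    os.setgroups([])   # clear supplementary groups while still root
    os.setgid(gid)     # group first: after setuid we may no longer change it
    os.setuid(uid)
    try:
        os.setuid(0)   # must fail if the drop really happened
    except PermissionError:
        return
    raise RuntimeError("privileges were not dropped")


if __name__ == "__main__":
    ids = sudo_caller_ids(os.environ)
    if os.geteuid() == 0 and ids is not None:
        # Privileged initialization (hugepage and device setup) would go here.
        drop_privileges(*ids)
    print("running as uid", os.getuid())
```

Clearing the supplementary groups is the conservative choice; if access to the
hugepage or vfio files is granted through a group, that group has to be kept.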

Thanks,
Ben

>  
>  
> Regards,
>  
> Karthi | +91 9036339210
> CloudSimple Inc.
>  
