From: Thomas Monjalon <thomas@monjalon.net>
To: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Cc: dev@dpdk.org, Anatoly Burakov <anatoly.burakov@intel.com>,
	david.marchand@redhat.com, bruce.richardson@intel.com
Subject: Re: [PATCH v1 1/6] doc: add hugepage mapping details
Date: Mon, 17 Jan 2022 10:20:59 +0100
Message-ID: <3969190.6PsWsQAL7t@thomas>
In-Reply-To: <20220117080801.481568-2-dkozlyuk@nvidia.com>

Thanks for this nice addition to the documentation; it is really needed.
Some comments below.

17/01/2022 09:07, Dmitry Kozlyuk:
> --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> -    Memory reservations done using the APIs provided by rte_malloc are also backed by pages from the hugetlbfs filesystem.
> +    Memory reservations done using the APIs provided by rte_malloc are also backed by hugepages.

Should we mention the exception when --no-huge is used?

> +Hugepage Mapping
> +^^^^^^^^^^^^^^^^
> +
> +Below is an overview of methods used for each OS to obtain hugepages,
> +explaining why certain limitations and options exist in EAL.
> +See the user guide for a specific OS for configuration details.
> +
> +FreeBSD uses ``contigmem`` kernel module
> +to reserve a fixed number of hugepages at system start,
> +which are mapped by EAL at initialization using a specific ``sysctl()``.
> +
> +Windows EAL allocates hugepages from the OS as needed using Win32 API,
> +so available amount depends on the system load.
> +It uses ``virt2phys`` kernel module to obtain physical addresses,
> +unless running in IOVA-as-VA mode (e.g. forced with ``--iova-mode=va``).
> +
> +Linux implements a variety of methods:
> +
> +* mapping each hugepage from its own file in hugetlbfs;
> +* mapping multiple hugepages from a shared file in hugetlbfs;
> +* anonymous mapping.
> +
> +Mapping hugepages from files in hugetlbfs is essential for multi-process,
> +because secondary processes need to map the same hugepages.
> +EAL creates files like ``rtemap_0``
> +in directories specified with ``--huge-dir`` option
> +(or in the mount point for a specific hugepage size).
> +The ``rtemap_`` prefix can be changed using ``--file-prefix``.
> +This may be needed for running multiple primary processes
> +that share a hugetlbfs mount point.
> +Each backing file by default corresponds to one hugepage,
> +it is opened and locked for the entire time the hugepage is used.
> +See :ref:`segment-file-descriptors` section
> +on how the number of open backing file descriptors can be reduced.
> +
> +Backing files may persist after the corresponding hugepage is freed
> +and even after the application terminates,
> +reducing the number of hugepages available to other processes.
> +EAL removes existing files at startup
> +and can remove newly created files before mapping them with ``--huge-unlink``.

This sentence requires more explanation, as it is not clear when and why the files are removed.
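
To make the question concrete, here is a minimal Python sketch (not DPDK
code) of the general "unlink before mapping" pattern on an ordinary
filesystem. The helper name map_then_unlink and the rtemap_sketch_ prefix
are hypothetical, and a regular temporary directory stands in for a
hugetlbfs mount. It only illustrates that a mapping stays usable after
its backing file name is gone, which is also why no other process can
later open the same file by path.

```python
import mmap
import os
import tempfile


def map_then_unlink(directory, size):
    """Create a backing file, remove its name, then map it.

    Returns the (now nonexistent) path and the live mapping.
    """
    fd, path = tempfile.mkstemp(prefix="rtemap_sketch_", dir=directory)
    try:
        os.ftruncate(fd, size)         # give the file its final size
        os.unlink(path)                # the name disappears from the directory
        mapping = mmap.mmap(fd, size)  # shared mapping; mmap keeps its own fd copy
    finally:
        os.close(fd)
    return path, mapping


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path, mapping = map_then_unlink(directory, 4096)
        mapping[:5] = b"hello"
        print(os.path.exists(path), bytes(mapping[:5]))
        mapping.close()
```

Whether EAL follows exactly this order in every case is for the patch text
to spell out; the sketch only shows the mechanism being asked about.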

> +However, since it disables multi-process anyway,
> +using anonymous mapping (``--in-memory``) is recommended instead.
> +
> +:ref:`EAL memory allocator <malloc>` relies on hugepages being zero-filled.
> +Hugepages are cleared by the kernel when a file in hugetlbfs or its part
> +is mapped for the first time system-wide
> +to prevent data leaks from previous users of the same hugepage.
> +EAL ensures this behavior by removing existing backing files at startup
> +and by recreating them before opening for mapping (as a precaution).
> +
> +Anonymous mapping does not allow multi-process architecture,
> +but it is free of filename conflicts and leftover files on hugetlbfs.

It is also easier to run as non-root.

> +If memfd_create(2) is supported both at build and run time,
> +DPDK memory manager can provide file descriptors for memory segments,
> +which are required for VirtIO with vhost-user backend.
> +This means open file descriptor issues may also affect this mode,
> +with the same solution.

This is not clear. Which issues? Which mode? Which solution?
