From: Olivier Matz <olivier.matz@6wind.com>
To: Ilya Matveychikov <matvejchikov@gmail.com>
Cc: dev@dpdk.org,
"adrien.mazarguil@6wind.com" <adrien.mazarguil@6wind.com>,
Jan Blunck <jblunck@infradead.org>,
Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Subject: Re: A (possible) problem with `--no-huge` option
Date: Fri, 9 Jun 2017 10:27:27 +0200
Message-ID: <20170609102727.0eb7f39d@platinum>
In-Reply-To: <1A9D36CE-8B6D-43A7-BE0C-9F232DFDA263@gmail.com>
Hi Ilya,
On Sun, 14 May 2017 14:34:14 +0400, Ilya Matveychikov <matvejchikov@gmail.com> wrote:
> Hi guys,
>
> I have a problem running DPDK with the `--no-huge` option. The problem seems to have appeared with commit cdc242f260e766bd95a658b5e0686a62ec04f5b0, and this is the change that affects me:
>
> + if ((page & 0x7fffffffffffffULL) == 0)
> + return RTE_BAD_PHYS_ADDR;
> +
>
> What I did was try to create a memory pool using rte_pktmbuf_pool_create(). I dug into the issue and found that in my case the "page" value is 0x0080000000000000, which means that the page is not present and is "soft-dirty" (according to the kernel's documentation):
>
> * Bits 0-54 page frame number (PFN) if present
> * Bits 0-4 swap type if swapped
> * Bits 5-54 swap offset if swapped
> * Bit 55 pte is soft-dirty (see Documentation/vm/soft-dirty.txt)
> * Bit 56 page exclusively mapped (since 4.2)
> * Bits 57-60 zero
> * Bit 61 page is file-page or shared-anon (since 3.5)
> * Bit 62 page swapped
> * Bit 63 page present
>
> So, before the change mentioned above, everything "worked" fine and such pages were not handled. But now the check causes rte_mempool_populate_default() to fail with -EINVAL...
> Can anyone familiar with memory pool allocation help with this issue?
>
> Thanks in advance,
> Ilya Matveychikov.
>

I can reproduce the issue:

  make config T=x86_64-native-linuxapp-gcc
  make -j32 EXTRA_CFLAGS="-O0 -g"
  mkdir -p /mnt/huge
  mount -t hugetlbfs nodev /mnt/huge
  echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

  # ok
  ./build/app/testpmd -l 2,4 --log-level 8 --vdev=eth_null0 -- --no-numa --total-num-mbufs=4096 -i --port-topology=chained

  # fail
  ./build/app/testpmd --no-huge -l 2,4 --log-level 8 --vdev=eth_null0 -- --no-numa --total-num-mbufs=4096 -i --port-topology=chained

I confirm that rte_mem_virt2phy() returns RTE_BAD_PHYS_ADDR,
which causes rte_mempool_populate_virt() to fail.
Reverting cdc242f260e7 ("eal/linux: support running as unprivileged user")
fixes the problem. Actually, it makes rte_mem_virt2phy() return 0 instead
of RTE_BAD_PHYS_ADDR, which is seen as a valid address.
I think querying the physical address when using --no-huge does not make
sense because the memory is not locked, and could be swapped.
Another oddity: with --no-huge, the physical address returned when
allocating a memzone is actually the virtual address.
I see several solutions to fix the issue:
1/ Always set physical addresses to RTE_BAD_PHYS_ADDR when started
with --no-huge. We consider that the physical address is invalid
in that case and must not be used.
This impacts rte_mem_virt2phy() and memzone_reserve*() functions.
In rte_mempool_populate_virt(), don't expect a physical address
if the application is started with --no-huge.
2/ Change rte_mem_virt2phy() to return the virtual address in place of
the physical address when started with --no-huge. This is
wrong, but consistent with what is done in memzones today.
In rte_mem_virt2phy(), add at the beginning:
  if (!rte_eal_has_hugepages())
      return (intptr_t)virtaddr;
3/ Lock pages in memory by reverting
729f17a932dd ("mem: revert page locking when not using hugepages")
This would make the physical address available.
As explained in the commit log, this would also break the ability to
start DPDK with --no-huge for non-root users.
I think 1/ is better. I'm sending a patch in reply to this mail.
Ilya, please let me know if it fixes your issue.
Regards,
Olivier