* [PATCH v2] eal: change default per socket memory allocation
@ 2014-05-09 13:30 David Marchand
  [not found] ` <1399642242-19725-1-git-send-email-david.marchand-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: David Marchand @ 2014-05-09 13:30 UTC (permalink / raw)
  To: dev-VfR2kkLFssw

From: Didier Pallard <didier.pallard-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>

Currently, if there is more memory available in hugepages than the amount
requested by the dpdk application, memory is allocated by taking as much as
possible from each socket, starting from the first one.
For example, if a system is configured with 8 GB in 2 sockets (4 GB per
socket) and dpdk requests only 4 GB of memory, all of it is taken from
socket 0 (which has exactly 4 GB of free hugepages), even if some cores are
configured on socket 1 and free hugepages remain on socket 1.

Change this behaviour to allocate memory on all sockets where cores are
configured, spreading the memory amongst sockets using the following ratio
per socket:

  N° of cores configured on the socket / total number of configured cores
  * requested memory

This algorithm is used when the memory amount is specified globally with the
-m option. Per socket memory allocation can always be done with the
--socket-mem option.

Changes included in v2:
- only update the linux implementation, as bsd does not look ready for numa
- if the new algorithm fails, default to the previous behaviour

Signed-off-by: Didier Pallard <didier.pallard-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
Signed-off-by: David Marchand <david.marchand-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 50 +++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 73a6394..471dcfd 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -881,13 +881,53 @@ calc_num_pages_per_socket(uint64_t * memory,
 	if (num_hp_info == 0)
 		return -1;
 
-	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
-		/* if specific memory amounts per socket weren't requested */
-		if (internal_config.force_sockets == 0) {
+	/* if specific memory amounts per socket weren't requested */
+	if (internal_config.force_sockets == 0) {
+		int cpu_per_socket[RTE_MAX_NUMA_NODES];
+		size_t default_size, total_size;
+		unsigned lcore_id;
+
+		/* Compute number of cores per socket */
+		memset(cpu_per_socket, 0, sizeof(cpu_per_socket));
+		RTE_LCORE_FOREACH(lcore_id) {
+			cpu_per_socket[rte_lcore_to_socket_id(lcore_id)]++;
+		}
+
+		/*
+		 * Automatically spread requested memory amongst detected sockets
+		 * according to number of cores from cpu mask present on each socket
+		 */
+		total_size = internal_config.memory;
+		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
+
+			/* Set memory amount per socket */
+			default_size = (internal_config.memory * cpu_per_socket[socket])
+					/ rte_lcore_count();
+
+			/* Limit to maximum available memory on socket */
+			default_size = RTE_MIN(default_size, get_socket_mem_size(socket));
+
+			/* Update sizes */
+			memory[socket] = default_size;
+			total_size -= default_size;
+		}
+
+		/*
+		 * If some memory is remaining, try to allocate it by getting all
+		 * available memory from sockets, one after the other
+		 */
+		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
+			/* take whatever is available */
+			default_size = RTE_MIN(get_socket_mem_size(socket) - memory[socket],
+					       total_size);
+
+			/* Update sizes */
+			memory[socket] += default_size;
+			total_size -= default_size;
+		}
+	}
+
+	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
 		/* skips if the memory on specific socket wasn't requested */
 		for (i = 0; i < num_hp_info && memory[socket] != 0; i++){
 			hp_used[i].hugedir = hp_info[i].hugedir;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread
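To make the spreading arithmetic concrete, here is a minimal standalone
sketch of the two-pass algorithm from the patch, written against a made-up
topology (6 lcores on socket 0, 2 on socket 1, 4096 MB of free hugepages on
each socket). It is plain C with no DPDK headers; cpu_per_socket,
socket_free and min_u64() are illustrative stand-ins for the EAL internals
(internal_config, get_socket_mem_size(), RTE_MIN) used by the real code:

#include <stdio.h>
#include <stdint.h>

#define MAX_NUMA_NODES 2

static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }

int main(void)
{
	/* hypothetical topology: 6 lcores on socket 0, 2 on socket 1 */
	unsigned cpu_per_socket[MAX_NUMA_NODES] = { 6, 2 };
	unsigned total_cores = 8;
	/* free hugepage memory per socket, in MB */
	uint64_t socket_free[MAX_NUMA_NODES] = { 4096, 4096 };
	uint64_t requested = 4096;	/* i.e. -m 4096 */

	uint64_t memory[MAX_NUMA_NODES] = { 0, 0 };
	uint64_t remaining = requested;
	int socket;

	/* pass 1: spread proportionally to the cores on each socket,
	 * capped by what the socket can actually provide */
	for (socket = 0; socket < MAX_NUMA_NODES && remaining != 0; socket++) {
		uint64_t share = requested * cpu_per_socket[socket] / total_cores;
		share = min_u64(share, socket_free[socket]);
		memory[socket] = share;
		remaining -= share;
	}

	/* pass 2: any remainder (integer-division rounding, or a socket
	 * capped in pass 1) is taken wherever free hugepages are left,
	 * one socket after the other */
	for (socket = 0; socket < MAX_NUMA_NODES && remaining != 0; socket++) {
		uint64_t extra = min_u64(socket_free[socket] - memory[socket],
					 remaining);
		memory[socket] += extra;
		remaining -= extra;
	}

	for (socket = 0; socket < MAX_NUMA_NODES; socket++)
		printf("socket %d: %llu MB\n", socket,
		       (unsigned long long)memory[socket]);
	return 0;
}

With these assumed numbers the sketch prints 3072 MB for socket 0 and
1024 MB for socket 1 (6/8 and 2/8 of the 4 GB requested with -m 4096).
Under the same assumptions, that placement could instead be forced
explicitly with --socket-mem 3072,1024, which bypasses the heuristic
entirely.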
[parent not found: <1399642242-19725-1-git-send-email-david.marchand-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH v2] eal: change default per socket memory allocation
  [not found] ` <1399642242-19725-1-git-send-email-david.marchand-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
@ 2014-05-13 16:27   ` Thomas Monjalon
  2014-05-13 16:33   ` Venkatesan, Venky
  1 sibling, 0 replies; 4+ messages in thread
From: Thomas Monjalon @ 2014-05-13 16:27 UTC (permalink / raw)
  To: venky.venkatesan-ral2JQCrhuEAvxtiuMwx3w; +Cc: dev-VfR2kkLFssw

Hi Venky,

There were comments on the first version of this patch and you suggested
trying this new implementation.
So do you acknowledge this patch?

Thanks for your review

2014-05-09 15:30, David Marchand:
> From: Didier Pallard <didier.pallard-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
> 
> Currently, if there is more memory available in hugepages than the amount
> requested by the dpdk application, memory is allocated by taking as much as
> possible from each socket, starting from the first one.
> For example, if a system is configured with 8 GB in 2 sockets (4 GB per
> socket) and dpdk requests only 4 GB of memory, all of it is taken from
> socket 0 (which has exactly 4 GB of free hugepages), even if some cores are
> configured on socket 1 and free hugepages remain on socket 1.
> 
> Change this behaviour to allocate memory on all sockets where cores are
> configured, spreading the memory amongst sockets using the following ratio
> per socket:
> 
>   N° of cores configured on the socket / total number of configured cores
>   * requested memory
> 
> This algorithm is used when the memory amount is specified globally with the
> -m option. Per socket memory allocation can always be done with the
> --socket-mem option.
> 
> Changes included in v2:
> - only update the linux implementation, as bsd does not look ready for numa
> - if the new algorithm fails, default to the previous behaviour
> 
> Signed-off-by: Didier Pallard <didier.pallard-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: David Marchand <david.marchand-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
> ---
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 50 +++++++++++++++++++++++++++---
>  1 file changed, 45 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index 73a6394..471dcfd 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -881,13 +881,53 @@ calc_num_pages_per_socket(uint64_t * memory,
>  	if (num_hp_info == 0)
>  		return -1;
>  
> -	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
> -		/* if specific memory amounts per socket weren't requested */
> -		if (internal_config.force_sockets == 0) {
> +	/* if specific memory amounts per socket weren't requested */
> +	if (internal_config.force_sockets == 0) {
> +		int cpu_per_socket[RTE_MAX_NUMA_NODES];
> +		size_t default_size, total_size;
> +		unsigned lcore_id;
> +
> +		/* Compute number of cores per socket */
> +		memset(cpu_per_socket, 0, sizeof(cpu_per_socket));
> +		RTE_LCORE_FOREACH(lcore_id) {
> +			cpu_per_socket[rte_lcore_to_socket_id(lcore_id)]++;
> +		}
> +
> +		/*
> +		 * Automatically spread requested memory amongst detected sockets
> +		 * according to number of cores from cpu mask present on each socket
> +		 */
> +		total_size = internal_config.memory;
> +		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
> +
> +			/* Set memory amount per socket */
> +			default_size = (internal_config.memory * cpu_per_socket[socket])
> +					/ rte_lcore_count();
> +
> +			/* Limit to maximum available memory on socket */
> +			default_size = RTE_MIN(default_size, get_socket_mem_size(socket));
> +
> +			/* Update sizes */
> +			memory[socket] = default_size;
> +			total_size -= default_size;
> +		}
> +
> +		/*
> +		 * If some memory is remaining, try to allocate it by getting all
> +		 * available memory from sockets, one after the other
> +		 */
> +		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
> +			/* take whatever is available */
> -			memory[socket] = RTE_MIN(get_socket_mem_size(socket),
> -					total_mem);
> +			default_size = RTE_MIN(get_socket_mem_size(socket) - memory[socket],
> +					       total_size);
> +
> +			/* Update sizes */
> +			memory[socket] += default_size;
> +			total_size -= default_size;
> 		}
> +	}
> +
> +	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
>  		/* skips if the memory on specific socket wasn't requested */
>  		for (i = 0; i < num_hp_info && memory[socket] != 0; i++){
>  			hp_used[i].hugedir = hp_info[i].hugedir;

^ permalink raw reply	[flat|nested] 4+ messages in thread
* Re: [PATCH v2] eal: change default per socket memory allocation
  [not found] ` <1399642242-19725-1-git-send-email-david.marchand-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
  2014-05-13 16:27   ` Thomas Monjalon
@ 2014-05-13 16:33   ` Venkatesan, Venky
  [not found]     ` <1FD9B82B8BF2CF418D9A1000154491D9740AADF4-P5GAC/sN6hlcIJlls4ac1rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 4+ messages in thread
From: Venkatesan, Venky @ 2014-05-13 16:33 UTC (permalink / raw)
  To: David Marchand, dev-VfR2kkLFssw@public.gmane.org

From: Didier Pallard <didier.pallard@6wind.com>

Currently, if there is more memory available in hugepages than the amount
requested by the dpdk application, memory is allocated by taking as much as
possible from each socket, starting from the first one.
For example, if a system is configured with 8 GB in 2 sockets (4 GB per
socket) and dpdk requests only 4 GB of memory, all of it is taken from
socket 0 (which has exactly 4 GB of free hugepages), even if some cores are
configured on socket 1 and free hugepages remain on socket 1.

Change this behaviour to allocate memory on all sockets where cores are
configured, spreading the memory amongst sockets using the following ratio
per socket:

  N° of cores configured on the socket / total number of configured cores
  * requested memory

This algorithm is used when the memory amount is specified globally with the
-m option. Per socket memory allocation can always be done with the
--socket-mem option.

Changes included in v2:
- only update the linux implementation, as bsd does not look ready for numa
- if the new algorithm fails, default to the previous behaviour

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 50 +++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 73a6394..471dcfd 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -881,13 +881,53 @@ calc_num_pages_per_socket(uint64_t * memory,
 	if (num_hp_info == 0)
 		return -1;
 
-	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
-		/* if specific memory amounts per socket weren't requested */
-		if (internal_config.force_sockets == 0) {
+	/* if specific memory amounts per socket weren't requested */
+	if (internal_config.force_sockets == 0) {
+		int cpu_per_socket[RTE_MAX_NUMA_NODES];
+		size_t default_size, total_size;
+		unsigned lcore_id;
+
+		/* Compute number of cores per socket */
+		memset(cpu_per_socket, 0, sizeof(cpu_per_socket));
+		RTE_LCORE_FOREACH(lcore_id) {
+			cpu_per_socket[rte_lcore_to_socket_id(lcore_id)]++;
+		}
+
+		/*
+		 * Automatically spread requested memory amongst detected sockets
+		 * according to number of cores from cpu mask present on each socket
+		 */
+		total_size = internal_config.memory;
+		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
+
+			/* Set memory amount per socket */
+			default_size = (internal_config.memory * cpu_per_socket[socket])
+					/ rte_lcore_count();
+
+			/* Limit to maximum available memory on socket */
+			default_size = RTE_MIN(default_size, get_socket_mem_size(socket));
+
+			/* Update sizes */
+			memory[socket] = default_size;
+			total_size -= default_size;
+		}
+
+		/*
+		 * If some memory is remaining, try to allocate it by getting all
+		 * available memory from sockets, one after the other
+		 */
+		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
+			/* take whatever is available */
-			memory[socket] = RTE_MIN(get_socket_mem_size(socket),
-					total_mem);
+			default_size = RTE_MIN(get_socket_mem_size(socket) - memory[socket],
+					       total_size);
+
+			/* Update sizes */
+			memory[socket] += default_size;
+			total_size -= default_size;
		}
+	}
+
+	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
 		/* skips if the memory on specific socket wasn't requested */
 		for (i = 0; i < num_hp_info && memory[socket] != 0; i++){
 			hp_used[i].hugedir = hp_info[i].hugedir;
-- 
1.7.10.4

Acked-by: Venky Venkatesan <Venky.venkatesan@intel.com>

^ permalink raw reply related	[flat|nested] 4+ messages in thread
[parent not found: <1FD9B82B8BF2CF418D9A1000154491D9740AADF4-P5GAC/sN6hlcIJlls4ac1rfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: [PATCH v2] eal: change default per socket memory allocation
  [not found] ` <1FD9B82B8BF2CF418D9A1000154491D9740AADF4-P5GAC/sN6hlcIJlls4ac1rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2014-05-14  9:15   ` Thomas Monjalon
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Monjalon @ 2014-05-14 9:15 UTC (permalink / raw)
  To: Didier Pallard; +Cc: dev-VfR2kkLFssw

> Currently, if there is more memory available in hugepages than the amount
> requested by the dpdk application, memory is allocated by taking as much as
> possible from each socket, starting from the first one.
> For example, if a system is configured with 8 GB in 2 sockets (4 GB per
> socket) and dpdk requests only 4 GB of memory, all of it is taken from
> socket 0 (which has exactly 4 GB of free hugepages), even if some cores are
> configured on socket 1 and free hugepages remain on socket 1.
> 
> Change this behaviour to allocate memory on all sockets where cores are
> configured, spreading the memory amongst sockets using the following ratio
> per socket:
> 
>   N° of cores configured on the socket / total number of configured cores
>   * requested memory
> 
> This algorithm is used when the memory amount is specified globally with the
> -m option. Per socket memory allocation can always be done with the
> --socket-mem option.
> 
> Changes included in v2:
> - only update the linux implementation, as bsd does not look ready for numa
> - if the new algorithm fails, default to the previous behaviour
> 
> Signed-off-by: Didier Pallard <didier.pallard-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: David Marchand <david.marchand-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
> 
> Acked-by: Venky Venkatesan <Venky.venkatesan-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Applied for version 1.7.0.

Thanks
-- 
Thomas

^ permalink raw reply	[flat|nested] 4+ messages in thread
end of thread, other threads:[~2014-05-14  9:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-09 13:30 [PATCH v2] eal: change default per socket memory allocation David Marchand
     [not found] ` <1399642242-19725-1-git-send-email-david.marchand-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
2014-05-13 16:27   ` Thomas Monjalon
2014-05-13 16:33   ` Venkatesan, Venky
     [not found]     ` <1FD9B82B8BF2CF418D9A1000154491D9740AADF4-P5GAC/sN6hlcIJlls4ac1rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-05-14  9:15       ` Thomas Monjalon