From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bruce Richardson
Subject: Re: [PATCH v2] add one option memory-only for secondary processes
Date: Thu, 22 Jan 2015 11:17:33 +0000
Message-ID: <20150122111732.GA4580@bricha3-MOBL3>
References: <1417601518-16852-1-git-send-email-xiaobo.chi@nsn.com>
 <7F861DC0615E0C47A872E6F3C5FCDDBD05DCB1C7@BPXM14GP.gisp.nec.co.jp>
 <7F861DC0615E0C47A872E6F3C5FCDDBD05DD6E68@BPXM14GP.gisp.nec.co.jp>
 <20141216100344.GA9152@bricha3-MOBL3>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Cc: "dev-VfR2kkLFssw@public.gmane.org"
To: "Chi, Xiaobo (NSN - CN/Hangzhou)", thomas.monjalon-pdR9zngts4EAvxtiuMwx3w@public.gmane.org
Content-Disposition: inline
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces-VfR2kkLFssw@public.gmane.org
Sender: "dev"

On Thu, Jan 22, 2015 at 09:05:34AM +0000, Chi, Xiaobo (NSN - CN/Hangzhou) wrote:
> Hi, Bruce,
> Since the DPDK2.0 merge window is opened now, so is it possible for this patch to be one candidate for v2.0?
> I searched in the DPDK patchwork(http://www.dpdk.org/dev/patchwork/project/dpdk/list/?state=*&q=memory-only&archive=both ), but can not find this V2 patch. Can you please help to check why? Thanks a lot.
>
> Filters: Search = memory-only remove filter
> Patch Date Submitter Delegate State
> [dpdk-dev] add one option memory-only for those secondary PRBs 2014-12-02 chixiaobo Not Applicable
> [dpdk-dev] add one option memory-only for those secondary PRBs 2014-12-02 chixiaobo Changes Requested
>
> Brgs,
> Chi Xiaobo
>

That's a question that Thomas is better able to answer than me, since he is
the man with control over patchwork! :-)
Thomas, any feedback here?

Thanks,
/Bruce

>
> -----Original Message-----
>
> From: ext Bruce Richardson [mailto:bruce.richardson-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org]
> Sent: Tuesday, December 16, 2014 6:04 PM
> To: Chi, Xiaobo (NSN - CN/Hangzhou)
> Cc: ext Hiroshi Shimamoto; dev-VfR2kkLFssw@public.gmane.org
> Subject: Re: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
>
> On Tue, Dec 16, 2014 at 09:26:48AM +0000, Chi, Xiaobo (NSN - CN/Hangzhou) wrote:
> > Hi, Bruce,
> > How about this patch, can it be merged to master branch? Thanks.
> >
> > Brgs,
> > Chi Xiaobo
> >
>
> At this point, I think we are well past code-freeze for new features for 1.8,
> but this looks a good candidate for 2.0 once the merge window for that opens.
>
> /Bruce
>
> >
> > -----Original Message-----
> > From: Chi, Xiaobo (NSN - CN/Hangzhou)
> > Sent: Monday, December 15, 2014 5:58 PM
> > To: 'ext Hiroshi Shimamoto'; dev-VfR2kkLFssw@public.gmane.org
> > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> >
> > Hi, Hiroshi,
> > Yes, there should be performance degradation, not only due to the mempool cache, but also due to process scheduling overhead (caused by the lack of CPU pinning).
> > I have not done the performance testing. In my project scenarios, those SECONDARY processes only send/receive messages to/from the PRIMARY process via mempool/ring, and the throughput is not so high, so the performance degradation is not critical to us. But there are dozens of SECONDARY processes in our system, and it would be hard to pin them manually and properly to different CPU cores; what we want is to let the standard Linux scheduling mechanism do the load balancing between CPU cores.
> >
> > Brgs,
> > Chi Xiaobo
> >
> >
> > -----Original Message-----
> > From: ext Hiroshi Shimamoto [mailto:h-shimamoto-ehU+Cx/zZe18UrSeD/g0lQ@public.gmane.org]
> > Sent: Thursday, December 11, 2014 11:03 AM
> > To: Chi, Xiaobo (NSN - CN/Hangzhou); dev-VfR2kkLFssw@public.gmane.org
> > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> >
> > Hi,
> >
> > sorry for the delay.
> >
> > > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > >
> > > Hi, Hiroshi,
> > > Yes, you are right, in order to avoid such problem, while create the mempool, which shall be shared between the primary
> > > process and those secondary Processes, we need to assign the cache_size param value to be zero. And in order to make the
> > > system more stable, it's better to define the RTE_MEMPOOL_CACHE_MAX_SIZE to be 0 in rte_config.h.
> >
> > Yes, it prevents the data corruption, but it also hurts the performance.
> > I think, if we use the mbuf w/o cache for PMD, we will see the performance degradation.
> >
> > Don't you have any number?
> >
> > thanks,
> > Hiroshi
> >
> > >
> > > /* create the mempool */
> > > struct rte_mempool *
> > > rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
> > >                    unsigned cache_size, unsigned private_data_size,
> > >                    rte_mempool_ctor_t *mp_init, void *mp_init_arg,
> > >                    rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
> > >                    int socket_id, unsigned flags);
> > >
> > >
> > > Brgs,
> > > Chi xiaobo
> > >
> > >
> > > -----Original Message-----
> > > From: ext Hiroshi Shimamoto [mailto:h-shimamoto-ehU+Cx/zZe18UrSeD/g0lQ@public.gmane.org]
> > > Sent: Wednesday, December 03, 2014 6:54 PM
> > > To: Chi, Xiaobo (NSN - CN/Hangzhou); dev-VfR2kkLFssw@public.gmane.org
> > > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > >
> > > Hi,
> > >
> > > > Subject: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > > >
> > > > From: Chi Xiaobo
> > > >
> > > > Problem: There is one normal DPDK processes deployment scenarios: one primary process and several (even hundreds) secondary
> > > > processes; all outside packets/messages are sent/received by primary process and then distribute them to those secondary
> > > > processes by DPDK's ring/sharedmemory mechanism. In such scenarios, those SECONDARY processes need only hugepage based
> > > > sharememory mechanism and its upper libs (such as ring, mempool, etc.), they need not cpu core pinning, iopl privilege
> > > > changing , pci device, timer, alarm, interrupt, shared_driver_list, core_info, threads for each core, etc. Then, for
> > > > such kind of SECONDARY processes, the current rte_eal_init() is too heavy.
> > > >
> > > > Solution:One new EAL initializing argument, --memory-only, is added. It is only for those SECONDARY processes which only
> > > > want to share memory with other processes.
> > > > if this argument is defined, users need not define those mandatory arguments,
> > > > such as -c and -n, due to we don't want to pin such kind of processes to any CPUs.
> > >
> > > however, we need the lcore_id per thread to use mempool.
> > > If the lcore_id is not initialized, it must be 0, and multiple threads will break
> > > mempool caches per thread, because of race condition.
> > > We have to assign lcore_id per thread, these ids must not be overlapped, or disable
> > > mempool handling in SECONDARY process.
> > >
> > > thanks,
> > > Hiroshi
> > >
> > > > Signed-off-by: Chi Xiaobo
> > > > ---
> > > >  lib/librte_eal/common/eal_common_options.c | 17 ++++++++++++---
> > > >  lib/librte_eal/common/eal_internal_cfg.h   |  1 +
> > > >  lib/librte_eal/common/eal_options.h        |  2 ++
> > > >  lib/librte_eal/linuxapp/eal/eal.c          | 34 +++++++++++++++++-------------
> > > >  4 files changed, 36 insertions(+), 18 deletions(-)
> > > >
> > > > diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> > > > index e2810ab..7b18498 100644
> > > > --- a/lib/librte_eal/common/eal_common_options.c
> > > > +++ b/lib/librte_eal/common/eal_common_options.c
> > > > @@ -85,6 +85,7 @@ eal_long_options[] = {
> > > >      {OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
> > > >      {OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
> > > >      {OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> > > > +    {OPT_MEMORY_ONLY, 0, NULL, OPT_MEMORY_ONLY_NUM},
> > > >      {0, 0, 0, 0}
> > > >  };
> > > >
> > > > @@ -126,6 +127,7 @@ eal_reset_internal_config(struct internal_config *internal_cfg)
> > > >      internal_cfg->no_hpet = 1;
> > > >  #endif
> > > >      internal_cfg->vmware_tsc_map = 0;
> > > > +    internal_cfg->memory_only= 0;
> > > >  }
> > > >
> > > >  /*
> > > > @@ -454,6 +456,10 @@ eal_parse_common_option(int opt, const char *optarg,
> > > >          conf->process_type = eal_parse_proc_type(optarg);
> > > >          break;
> > > >
> > > > +    case OPT_MEMORY_ONLY_NUM:
> > > > +        conf->memory_only= 1;
> > > > +        break;
> > > > +
> > > >      case OPT_MASTER_LCORE_NUM:
> > > >          if (eal_parse_master_lcore(optarg) < 0) {
> > > >              RTE_LOG(ERR, EAL, "invalid parameter for --"
> > > > @@ -525,9 +531,9 @@ eal_check_common_options(struct internal_config *internal_cfg)
> > > >  {
> > > >      struct rte_config *cfg = rte_eal_get_configuration();
> > > >
> > > > -    if (!lcores_parsed) {
> > > > -        RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
> > > > -            "-c or -l\n");
> > > > +    if (!lcores_parsed && !(internal_cfg->process_type == RTE_PROC_SECONDARY&& internal_cfg->memory_only) ) {
> > > > +        RTE_LOG(ERR, EAL, "For those processes without memory-only option, CPU cores "
> > > > +            "must be enabled with options -c or -l\n");
> > > >          return -1;
> > > >      }
> > > >      if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
> > > > @@ -545,6 +551,10 @@ eal_check_common_options(struct internal_config *internal_cfg)
> > > >              "specified\n");
> > > >          return -1;
> > > >      }
> > > > +    if ( internal_cfg->process_type != RTE_PROC_SECONDARY && internal_cfg->memory_only ) {
> > > > +        RTE_LOG(ERR, EAL, "only secondary processes can specify memory-only option.\n");
> > > > +        return -1;
> > > > +    }
> > > >      if (index(internal_cfg->hugefile_prefix, '%') != NULL) {
> > > >          RTE_LOG(ERR, EAL, "Invalid char, '%%', in --"OPT_FILE_PREFIX" "
> > > >              "option\n");
> > > > @@ -590,6 +600,7 @@ eal_common_usage(void)
> > > >          " --"OPT_SYSLOG" : set syslog facility\n"
> > > >          " --"OPT_LOG_LEVEL" : set default log level\n"
> > > >          " --"OPT_PROC_TYPE" : type of this process\n"
> > > > +        " --"OPT_MEMORY_ONLY": only use shared memory, valid only for secondary process.\n"
> > > >          " --"OPT_PCI_BLACKLIST", -b: add a PCI device in black list.\n"
> > > >          " Prevent EAL from using this PCI device. The argument\n"
> > > >          " format is .\n"
> > > > diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
> > > > index aac6abf..f51f0a2 100644
> > > > --- a/lib/librte_eal/common/eal_internal_cfg.h
> > > > +++ b/lib/librte_eal/common/eal_internal_cfg.h
> > > > @@ -85,6 +85,7 @@ struct internal_config {
> > > >
> > > >      unsigned num_hugepage_sizes; /**< how many sizes on this system */
> > > >      struct hugepage_info hugepage_info[MAX_HUGEPAGE_SIZES];
> > > > +    volatile unsigned memory_only; /**
> > > >  };
> > > >  extern struct internal_config internal_config; /**< Global EAL configuration. */
> > > >
> > > > diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
> > > > index e476f8d..87cc5db 100644
> > > > --- a/lib/librte_eal/common/eal_options.h
> > > > +++ b/lib/librte_eal/common/eal_options.h
> > > > @@ -77,6 +77,8 @@ enum {
> > > >      OPT_CREATE_UIO_DEV_NUM,
> > > >  #define OPT_VFIO_INTR "vfio-intr"
> > > >      OPT_VFIO_INTR_NUM,
> > > > +#define OPT_MEMORY_ONLY "memory-only"
> > > > +    OPT_MEMORY_ONLY_NUM,
> > > >      OPT_LONG_MAX_NUM
> > > >  };
> > > >
> > > > diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> > > > index 89f3b5e..c160771 100644
> > > > --- a/lib/librte_eal/linuxapp/eal/eal.c
> > > > +++ b/lib/librte_eal/linuxapp/eal/eal.c
> > > > @@ -752,14 +752,6 @@ rte_eal_init(int argc, char **argv)
> > > >
> > > >      rte_config_init();
> > > >
> > > > -    if (rte_eal_pci_init() < 0)
> > > > -        rte_panic("Cannot init PCI\n");
> > > > -
> > > > -#ifdef RTE_LIBRTE_IVSHMEM
> > > > -    if (rte_eal_ivshmem_init() < 0)
> > > > -        rte_panic("Cannot init IVSHMEM\n");
> > > > -#endif
> > > > -
> > > >      if (rte_eal_memory_init() < 0)
> > > >          rte_panic("Cannot init memory\n");
> > > >
> > > > @@ -772,14 +764,30 @@ rte_eal_init(int argc, char **argv)
> > > >      if (rte_eal_tailqs_init() < 0)
> > > >          rte_panic("Cannot init tail queues for objects\n");
> > > >
> > > > +    if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
> > > > +        rte_panic("Cannot init logs\n");
> > > > +
> > > > +    eal_check_mem_on_local_socket();
> > > > +
> > > > +    rte_eal_mcfg_complete();
> > > > +
> > > > +    /*with memory-only option, we need not cpu affinity, pci device, alarm, external devices, interrupt, etc. */
> > > > +    if( internal_config.memory_only ){
> > > > +        RTE_LOG (DEBUG, EAL, "memory-only defined, so only memory being initialized.\n");
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    if (rte_eal_pci_init() < 0)
> > > > +        rte_panic("Cannot init PCI\n");
> > > > +
> > > >  #ifdef RTE_LIBRTE_IVSHMEM
> > > > +    if (rte_eal_ivshmem_init() < 0)
> > > > +        rte_panic("Cannot init IVSHMEM\n");
> > > > +
> > > >      if (rte_eal_ivshmem_obj_init() < 0)
> > > >          rte_panic("Cannot init IVSHMEM objects\n");
> > > >  #endif
> > > >
> > > > -    if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
> > > > -        rte_panic("Cannot init logs\n");
> > > > -
> > > >      if (rte_eal_alarm_init() < 0)
> > > >          rte_panic("Cannot init interrupt-handling thread\n");
> > > >
> > > > @@ -789,10 +797,6 @@ rte_eal_init(int argc, char **argv)
> > > >      if (rte_eal_timer_init() < 0)
> > > >          rte_panic("Cannot init HPET or TSC timers\n");
> > > >
> > > > -    eal_check_mem_on_local_socket();
> > > > -
> > > > -    rte_eal_mcfg_complete();
> > > > -
> > > >      TAILQ_FOREACH(solib, &solib_list, next) {
> > > >          RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
> > > >          solib->lib_handle = dlopen(solib->name, RTLD_NOW);
> > > > --
> > > > 1.9.4.msysgit.2
> >
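
For anyone reading the two new conditions in eal_check_common_options() in the
patch quoted above, the accepted and rejected option combinations can be
sketched as a small standalone C function. This is only an illustration under
stated assumptions: `enum proc_type`, `struct opts` and `check_opts()` are
hypothetical stand-ins and not DPDK APIs, and the unrelated checks that sit
between the two conditions in the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the EAL process type (not the DPDK enum). */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

/* Hypothetical, reduced view of the options the patch looks at. */
struct opts {
    enum proc_type process_type;
    bool memory_only;    /* --memory-only was given */
    bool lcores_parsed;  /* -c or -l was given */
};

/* Returns 0 if the combination is accepted, -1 if it is rejected. */
static int
check_opts(const struct opts *o)
{
    assert(o != NULL);

    /* -c/-l stays mandatory unless this is a memory-only secondary. */
    if (!o->lcores_parsed &&
        !(o->process_type == PROC_SECONDARY && o->memory_only))
        return -1;

    /* --memory-only is rejected for anything but a secondary process. */
    if (o->process_type != PROC_SECONDARY && o->memory_only)
        return -1;

    return 0;
}
```

Under these assumptions, a secondary process with --memory-only and no core
mask is accepted, a primary process with --memory-only is rejected even when a
core mask is given, and a secondary process without --memory-only still needs
-c or -l.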