From mboxrd@z Thu Jan 1 00:00:00 1970 From: Neil Horman Subject: Re: [PATCH] add free hugepage function Date: Wed, 29 Oct 2014 11:32:12 -0400 Message-ID: <20141029153212.GB14253@localhost.localdomain> References: <1414551269-5820-1-git-send-email-haifeng.lin@huawei.com> <533710CFB86FA344BFBF2D6802E60286C7CAAB@SHSMSX101.ccr.corp.intel.com> <20141029034437.GA29486@mhcomputing.net> <533710CFB86FA344BFBF2D6802E60286C7CB42@SHSMSX101.ccr.corp.intel.com> <54508DE1.9090908@huawei.com> <20141029102635.GB8292@BRICHA3-MOBL> <20141029142745.GA14253@localhost.localdomain> <682698A055A0F44AA47192B20141149711B6074C@BGSMSX102.gar.corp.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: "dev-VfR2kkLFssw@public.gmane.org" To: "Ramia, Kannan Babu" Return-path: Content-Disposition: inline In-Reply-To: <682698A055A0F44AA47192B20141149711B6074C-yHIBzpp8AekFyVwBAnZdSLfspsVTdybXVpNB7YpNyf8@public.gmane.org> List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces-VfR2kkLFssw@public.gmane.org Sender: "dev" On Wed, Oct 29, 2014 at 03:22:25PM +0000, Ramia, Kannan Babu wrote: > The problem still remains if one of the process gets abruptly killed, t= hen the corresponding refcount will not get decremented. There should be = some scavenging function to be implemented to check the process livelines= s. =20 >=20 Well, abnormal termination results in abnormal consequences. You expect garbage to get left behind of a program crashes, so I wouldn't really wor= ry about that too much. If you really wanted to you can register chained ha= ndlers for SIGSEGV/SIGBUS/etc to catch those conditions, but honestly, that seem= s like overkill. If a program that uses shared resources terminates abnormally,= its well understood that those shared resources may not get released properly= , and manual intervention is required to clean them up Neil > regards > Kannan Babu >=20 > -----Original Message----- > From: dev [mailto:dev-bounces-VfR2kkLFssw@public.gmane.org] On Behalf Of Neil Horman > Sent: Wednesday, October 29, 2014 7:58 PM > To: Richardson, Bruce > Cc: dev-VfR2kkLFssw@public.gmane.org > Subject: Re: [dpdk-dev] [PATCH] add free hugepage function >=20 > On Wed, Oct 29, 2014 at 10:26:35AM +0000, Bruce Richardson wrote: > > On Wed, Oct 29, 2014 at 02:49:05PM +0800, Linhaifeng wrote: > > >=20 > > >=20 > > > On 2014/10/29 13:26, Qiu, Michael wrote: > > > > =E5=9C=A8 10/29/2014 11:46 AM, Matthew Hall =E5=86=99=E9=81=93: > > > >> On Wed, Oct 29, 2014 at 03:27:58AM +0000, Qiu, Michael wrote: > > > >>> I just saw one return path with value '0', and no any other=20 > > > >>> place return a negative value, so it is better to be designed= =20 > > > >>> as one non-return function, > > > >>> > > > >>> +void > > > >>> +rte_eal_hugepage_free(void) > > > >>> +{ > > > >>> + struct hugepage_file *hugepg_tbl =3D g_hugepage_table.hugepg_= tbl; > > > >>> + unsigned i; > > > >>> + unsigned nr_hugefiles =3D g_hugepage_table.nr_hugefiles; > > > >>> + > > > >>> + RTE_LOG(INFO, EAL, "unlink %u hugepage files\n",=20 > > > >>> +nr_hugefiles); > > > >>> + > > > >>> + for (i =3D 0; i < nr_hugefiles; i++) { > > > >>> + unlink(hugepg_tbl[i].filepath); > > > >>> + hugepg_tbl[i].orig_va =3D NULL; > > > >>> + } > > > >>> +} > > > >>> + > > > >>> > > > >>> Thanks, > > > >>> Michael > > > >> Actually, I don't think that's quite right. > > > >> > > > >> http://linux.die.net/man/2/unlink > > > >> > > > >> "On success, zero is returned. On error, -1 is returned, and=20 > > > >> errno is set appropriately." So it should be returning an error,= =20 > > > >> and logging a message for a file it cannot unlink or people will= be surprised with weird failures. > > > >=20 > > > > Really need one message for unlink failed, but I'm afraid that if= =20 > > > > it make sense for return an error code when application exit. > > > >=20 > > > > Thanks > > > > Michael > > > >> It also had some minor typos / English in the comments but we ca= n fix that too. > > > >> > > > >> Matthew. > > > >> > > > >=20 > > > >=20 > > > >=20 > > > Agree.May be it is not need to return error? > > > -- > > > Regards, > > > Haifeng > > >=20 > >=20 > > May I throw out a few extra ideas. > >=20 > > The basic problem with DPDK and hugepages on exit, as you have alread= y=20 > > diagnosed, is not that the hugepages aren't released for re-use, its=20 > > that the files inside the hugepage directory aren't unlinked. There=20 > > are two > I agree, this patch seems rather unbalanced. Hugepages are initalized = on startup, not allocated explicitly, and so should not be freed on exit.= Instead, they should be globally refcounted and (potentially per config= uration) freed when the last application to exit drops the refcount to ze= ro. >=20 > > considerations here that prevented us from coming up previously with = a=20 > > solution to this that we were fully happy with. > > 1. There is no automatic way (at least none that I'm aware of), to=20 > > have the kernel automatically delete a file on exit. This means that=20 > > any scheme to delete the files on application termination will have t= o=20 > > be a voluntary one that won't happen if an app crashed or is forcibly= =20 > > terminated. That is why the EAL right now does the clean-up on the re= start of the app instead. > Thats not true. The common POSIX compliant method is to use the atexit= () function to register a function to be called when a process terminates= . It can be used from rte_eth_hugepage_init so that a refcount is increa= sed there and decreased when the process exits. The atexit registered fu= nction can also check the refcount for zero and, dependent on configurati= on, unlink the hugepage files. That way the application can be completel= y unaware of a need to do freeing/unlinking >=20 > > 2. The other consideration is multi-process. In a multi-process=20 > > environment, it is unclear when exactly an app is finished - just=20 > > because one process terminates does not mean that the whole app is=20 > > finished with the hugepage files. If the files are deleted before a=20 > > DPDK secondary process starts up, it won't be able to start as it=20 > > won't be able to mmap the hugepage memory it needs. > >=20 > Refcounting fixes that. If every process takes a reference count on th= e hugepage set, then you unlink it when the refcount hits zero. >=20 > > Now, that being said, I think we could see about having automatic=20 > > hugepage deletion controlled by an EAL parameter. That way, the=20 > > parameter could be omitted in the multi-process case, but could be=20 > > used for those who want to have a clean hugepage dir after app termin= ation. > >=20 > > What I think we could do is, instead of having the app save the list=20 > > of hugepages it uses as the app initializes, is to have the app delet= e=20 > > the hugepage files as soon as it closes the filehandle for them. If m= y=20 > > understanding is correct, the kernel will actually keep the hugepage=20 > > memory around in the app as usual from that point onwards, and will=20 > > only release it back [automatically] on app termination. New processe= s=20 > > won't be able to mmap the file, but the running process will be=20 > > unaffected. The best thing about this approach is that the hugepage=20 > > file entries will be deleted whether or not the application terminate= s normally or crashes. > >=20 > That doesn't really work in a use case in which processes are created a= nd removed (i.e. if you have two processes, one exits, then another one i= s forked, that third process won't find the hugepage file, because the se= cond process will have unlinked it on exit). >=20 > > So, any thoughts? Would such a scheme work? I haven't prototyped it,=20 > > yet, but I think it should work. > >=20 > > Regards, > > /Bruce > >=20 > > PS: As well as multi-process not being able to use this scheme, I'd=20 > > also note that this would prevent the dump_cfg utility app - or any=20 > > other DPDK analysis app like it - from being used to inspect the=20 > > memory area of the running process. That is another reason why I thin= k=20 > > the flag should not be set by default. > >=20 > >=20