From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Monjalon
Subject: Re: [PATCH] eal: add option to not store segment fd's
Date: Fri, 29 Mar 2019 14:34:27 +0100
Message-ID: <4406705.fyK4ph7NJL@xps>
References: <07f664c33ddedaa5dcfe82ecb97d931e68b7e33a.1550855529.git.anatoly.burakov@intel.com> <3255576.YcZt162MTL@xps>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7Bit
Cc: David Marchand, dev, John McNamara, Marko Kovacevic, iain.barker@oracle.com, edwin.leung@oracle.com, maxime.coquelin@redhat.com
To: "Burakov, Anatoly"
Received: from out1-smtp.messagingengine.com (out1-smtp.messagingengine.com [66.111.4.25]) by dpdk.org (Postfix) with ESMTP id CFF402BF4; Fri, 29 Mar 2019 14:34:30 +0100 (CET)
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

29/03/2019 14:24, Burakov, Anatoly:
> On 29-Mar-19 12:40 PM, Thomas Monjalon wrote:
> > 29/03/2019 13:05, Burakov, Anatoly:
> >> On 29-Mar-19 11:34 AM, Thomas Monjalon wrote:
> >>> 29/03/2019 11:33, Burakov, Anatoly:
> >>>> On 29-Mar-19 9:50 AM, David Marchand wrote:
> >>>>> On Fri, Feb 22, 2019 at 6:12 PM Anatoly Burakov wrote:
> >>>>>
> >>>>>     Due to internal glibc limitations [1], DPDK may exhaust internal
> >>>>>     file descriptor limits when using smaller page sizes, which results
> >>>>>     in inability to use system calls such as select() by user
> >>>>>     applications.
> >>>>>
> >>>>>     While the problem can be worked around using --single-file-segments
> >>>>>     option, it does not work if --legacy-mem mode is also used. Add a
> >>>>>     (yet another) EAL flag to disable storing fd's internally. This
> >>>>>     will sacrifice compatibility with Virtio with vhost-backend, but
> >>>>>     at least select() and friends will work.
> >>>>>
> >>>>>     [1] https://mails.dpdk.org/archives/dev/2019-February/124386.html
> >>>>>
> >>>>>
> >>>>> Sorry, I am a bit lost and I never took the time to look in the new
> >>>>> memory allocation system.
> >>>>> This gives the impression that we are accumulating workarounds, between
> >>>>> legacy-mem, single-file-segments, now no-seg-fds.
> >>>>
> >>>> Yep. I don't like this any more than you do, but I think there are users
> >>>> of all of these, so we can't just drop them willy-nilly. My great hope
> >>>> was that by now everyone would move on to use VFIO so legacy mem
> >>>> wouldn't be needed (the only reason it exists is to provide
> >>>> compatibility for use cases where lots of IOVA-contiguous memory is
> >>>> required, and VFIO cannot be used), but apparently that is too much to
> >>>> ask :/
> >>>>
> >>>>>
> >>>>> Iiuc, everything revolves around the need for per page locks.
> >>>>> Can you summarize why we need them?
> >>>>
> >>>> The short answer is multiprocess. We have to be able to map and unmap
> >>>> pages individually, and for that we need to be sure that we can, in
> >>>> fact, remove a page because no one else uses it. We also need to store
> >>>> fd's because virtio with vhost-user backend needs them to work, because
> >>>> it relies on sharing memory between processes using fd's.
> >>>
> >>> It's a pity adding an option to work around a limitation of a corner case.
> >>> It adds complexity that we will have to support forever,
> >>> and it's not even perfect because of vhost.
> >>>
> >>> Might there be another solution?
> >>>
> >>
> >> If there is one, I'm all ears. I don't see any solutions aside from
> >> adding limitations.
> >>
> >> For example, we could drop the single/multi file segments mode and just
> >> make single file segments a default and the only available mode, but
> >> this has certain risks because older kernels do not support fallocate()
> >> on hugetlbfs.
> >>
> >> We could further draw a line in the sand, and say that, for example,
> >> 19.11 (or 20.11) will not have legacy mem mode, and everyone should use
> >> VFIO by now and if you don't it's your own fault.
> >>
> >> We could also cut down on the number of fd's we use in single-file
> >> segments mode by not using locks and simply deleting pages in the
> >> primary, but yanking out hugepages from under secondaries' feet makes me
> >> feel uneasy, even if technically by the time that happens, they're not
> >> supposed to be used anyway. This could mean that the patch is no longer
> >> necessary because we don't use that many fd's any more.
> >
> > This last option is interesting. Is it realistic?
> >
>
> I can do it in current release cycle, but I'm not sure if it's too late
> to do such changes. I guess it's OK since the validation cycle is just
> starting? I'll throw something together and see if it crashes and burns.

OK let's try that.