Message-ID: <588BA9AA.8010805@iogearbox.net>
Date: Fri, 27 Jan 2017 21:12:26 +0100
From: Daniel Borkmann
To: Michal Hocko
CC: Alexei Starovoitov, Andrew Morton, Vlastimil Babka, Mel Gorman,
    Johannes Weiner, linux-mm, LKML, "netdev@vger.kernel.org",
    marcelo.leitner@gmail.com
Subject: Re: [PATCH 0/6 v3] kvmalloc
In-Reply-To: <20170127100544.GF4143@dhcp22.suse.cz>

On 01/27/2017 11:05 AM, Michal Hocko wrote:
> On Thu 26-01-17 21:34:04, Daniel Borkmann wrote:
>> On 01/26/2017 02:40 PM, Michal Hocko wrote:
> [...]
>>> But realistically, how big is this problem really? Is it really worth
>>> it? You said this is an admin only interface and admin can kill the
>>> machine by OOM and other means already.
>>>
>>> Moreover and I should probably mention it explicitly, your d407bd25a204b
>>> reduced the likelihood of OOM for another reason.
>>> kmalloc used GFP_USER
>>> previously and with order > 0 && order <= PAGE_ALLOC_COSTLY_ORDER this
>>> could indeed hit the OOM e.g. due to memory fragmentation. It would be
>>> much harder to hit the OOM killer from vmalloc which doesn't issue
>>> higher order allocation requests. Or have you ever seen the OOM killer
>>> pointing to the vmalloc fallback path?
>>
>> The case I was concerned about was from vmalloc() path, not kmalloc().
>> That was where the stack trace indicating OOM pointed to. As an example,
>> there could be really large allocation requests for maps where the map
>> has pre-allocated memory for its elements. Thus, if we get to the point
>> where we need to kill others due to shortage of mem for satisfying this,
>> I'd much much rather prefer to just not let vmalloc() work really hard
>> and fail early on instead.
>
> I see, but as already mentioned, chances are that by the time you get
> close to the OOM somebody else will hit the OOM before the vmalloc path
> manages to free the allocated memory.
>
>> In my (crafted) test case, I was connected
>> via ssh and it each time reliably killed my connection, which is really
>> suboptimal.
>>
>> F.e., I could also imagine a buggy or miscalculated map definition for
>> a prog that is provisioned to multiple places, which then accidentally
>> triggers this. Or if large on purpose, but we crossed the line, it
>> could be handled more gracefully, f.e. I could imagine an option to
>> fall back to a non-pre-allocated map flavor from the application
>> loading the program. Trade-off for sure, but still allowing it to
>> operate up to a certain extent. Granted, if vmalloc() succeeded without
>> trying hard and we then OOM elsewhere, too bad, but we don't have much
>> control over that one anyway, only about our own request. Reason I
>> asked above was whether having __GFP_NORETRY in would be fatal
>> somewhere down the path, but seems not as you say.
>>
>> So to answer your second email with the bpf and netfilter hunks, why
>> not replace them with kvmalloc() and __GFP_NORETRY flag and add that
>> big fat FIXME comment above there, saying explicitly that __GFP_NORETRY
>> is not harmful though has only /partial/ effect right now and that full
>> support needs to be implemented in future. That would still be better
>> than not having it, imo, and the FIXME would make expectations clear
>> to anyone reading that code.
>
> Well, we can do that, I just would like to prevent this (ab)use
> if there is no _real_ and _sensible_ usecase for it. Having a real bug

Understandable.

> report or a fallback mechanism you are mentioning above would justify
> the (ab)use IMHO. But that abuse would be documented properly and have a
> real reason to exist. That sounds like a better approach to me.
>
> But if you absolutely _insist_ I can change that.

Yeah, please do (with a big FIXME comment as mentioned), this originally
came from a real bug report. Anyway, feel free to add my Acked-by then.

Thanks again,
Daniel
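
For illustration only, here is a rough userspace sketch of the fallback
idea mentioned above: the loading application first asks for the
pre-allocated map flavor and, if that request fails early with -ENOMEM
instead of pushing the system into OOM, retries with a non-pre-allocated
flavor. Every name in it (MAP_F_NO_PREALLOC, map_create_fn,
map_create_with_fallback, stub_create) is hypothetical and stands in for
whatever the application really uses to create maps; it is not code from
this patch set or from the bpf API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a "do not pre-allocate elements" map flag. */
#define MAP_F_NO_PREALLOC 0x1u

/* Hypothetical creation callback: returns a handle >= 0, or -errno. */
typedef int (*map_create_fn)(size_t max_entries, unsigned int flags);

/*
 * Try the pre-allocated flavor first. If that fails with -ENOMEM,
 * retry once without pre-allocation. Any other result (success or a
 * different error) is returned unchanged.
 * *used_flags reports the flags of the last attempt that was made.
 */
static int map_create_with_fallback(map_create_fn create, size_t max_entries,
                                    unsigned int *used_flags)
{
    int fd;

    assert(create != NULL && used_flags != NULL);

    *used_flags = 0;
    fd = create(max_entries, *used_flags);
    if (fd != -ENOMEM)
        return fd;

    *used_flags = MAP_F_NO_PREALLOC;
    return create(max_entries, *used_flags);
}

/* Demo stub: pre-allocation "fails" once the request exceeds a budget. */
static int stub_create(size_t max_entries, unsigned int flags)
{
    const size_t prealloc_budget = 1024;

    if (!(flags & MAP_F_NO_PREALLOC) && max_entries > prealloc_budget)
        return -ENOMEM;
    return 3; /* pretend handle */
}
```

With the stub, a small request is served by the pre-allocated flavor,
while an oversized one comes back with MAP_F_NO_PREALLOC set in
*used_flags. This only makes sense if the allocation underneath fails
early rather than retrying hard, which is the point of asking for
__GFP_NORETRY in the first place.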