From: Uladzislau Rezki
Date: Wed, 22 Sep 2021 00:13:37 +0200
To: David Hildenbrand
Cc: Uladzislau Rezki, LKML, Ping Fang, Andrew Morton, Roman Gushchin, Michal Hocko, Oscar Salvador, Linux Memory Management List
Subject: Re: [PATCH v1] mm/vmalloc: fix exact allocations with an alignment > 1
Message-ID: <20210921221337.GA60191@pc638.lan>
References: <20210908132727.16165-1-david@redhat.com> <20210916193403.GA1940@pc638.lan> <221e38c1-4b8a-8608-455a-6bde544adaf0@redhat.com>
In-Reply-To: <221e38c1-4b8a-8608-455a-6bde544adaf0@redhat.com>

On Fri, Sep 17, 2021 at 10:47:54AM +0200, David Hildenbrand wrote:
> > > > This patch looks like a KASAN-specific change to me. So I would like
> > > > to avoid such changes to the vmalloc internals, in order to simplify
> > > > further maintenance and keep the code generic instead.
> > >
> > > There is nothing KASAN specific in there :) It's specific to exact
> > > applications -- and KASAN may be one of the only users for now.
> > >
> > Well, you can name it either way :) But it should not be specific by
> > design; otherwise, as I mentioned, the allocator becomes a piece of
> > code that handles corner cases, which is not acceptable.
>
> Let's not overstress the situation of adding essentially 3 LOC in order to
> fix a sane use case of the vmalloc allocator that was not considered
> properly by internal changes due to 68ad4a330433 ("mm/vmalloc.c: keep track
> of free blocks for vmap allocation").
>
> > > >
> > > > Currently find_vmap_lowest_match() adjusts the search-size criteria
> > > > for any alignment value, even for a PAGE_SIZE alignment. That is not
> > > > optimal, because one page is the minimum granularity we operate on
> > > > for vmalloc allocations; thus all free blocks are page aligned.
> > > >
> > > > All that means we should adjust the search length only if an
> > > > alignment request is bigger than one page. In the PAGE_SIZE case
> > > > there is no reason for such an extra adjustment, because every
> > > > free->va_start sits on a page boundary anyway.
> > > >
> > > > Could you please test the patch below, which is a generic improvement?
> > >
> > > I played with the exact approach below (almost exactly the same code in
> > > find_vmap_lowest_match()), and it might make sense as a general
> > > improvement -- if we can guarantee that the start and end of ranges are
> > > always page aligned; I was not able to come to that conclusion...
> >
> > All free blocks are page aligned, that is how it has to be.
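[Editor's note] The search-length adjustment described above can be sketched in userspace as follows. This is a hypothetical simplification with an assumed 4 KiB page size, not the actual mm/vmalloc.c code:

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/*
 * Sketch of the proposed improvement: because every free block's
 * va_start is page aligned, the search length needs worst-case
 * alignment padding only when the requested alignment exceeds one page.
 */
static unsigned long search_length(unsigned long size, unsigned long align)
{
	if (align <= PAGE_SIZE)
		return size;		/* free block start already satisfies it */
	return size + align - 1;	/* worst-case padding for larger alignment */
}
```

With this, a page-aligned request against an exact-size free block is no longer rejected, while larger alignments still reserve room to shift the start address.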
> > A vstart also should be aligned, otherwise __vunmap() will complain
> > about freeing a bad address:
> >
> > if (WARN(!PAGE_ALIGNED(addr), "Trying to vfree() bad address (%p)\n",
> > 		addr))
> > 	return;
> >
> > BTW, we should add an extra check to alloc_vmap_area(), something like
> > below:
> >
> > if (!PAGE_ALIGNED(ALIGN(vstart, align))) {
> > 	WARN_ONCE(1, "vmalloc: vstart or align are not page aligned (0x%lx, 0x%lx)\n",
> > 		vstart, align);
> > 	return ERR_PTR(-EBUSY);
> > }
> >
> > to check that the passed pair of vstart and align is correct.
>
> Yes, we should.
>
> > > vmap_init_free_space() shows me that our existing alignment code/checks
> > > might be quite fragile.
> >
> > It is not important to page align the first address. As I mentioned
> > before, vstart and align have to be correct, and we should add such a
> > check.
> >
> > > But I mainly decided to go with my patch instead because it will also
> > > work for exact allocations with align > PAGE_SIZE: most notably, when
> > > we try population of huge pages instead of base pages in
> > > __vmalloc_node_range(), by increasing the alignment. If the exact
> > > range allows for using huge pages, placing huge pages will work just
> > > fine with my modifications, while it won't with your approach.
> > >
> > > Long story short: my approach handles exact allocations also for
> > > bigger alignments. Your optimization makes sense as a general
> > > improvement for any vmalloc allocation.
> > >
> > > Thanks for having a look!
> >
> > The problem is that there are no users (seemingly only KASAN) who set a
> > range that corresponds to the exact size. If you add an alignment
> > overhead on top of
>
> So there is one user, and it was broken by 68ad4a330433 ("mm/vmalloc.c:
> keep track of free blocks for vmap allocation").
>
> > it, it means that the search size should include it, because we may end
> > up with an exact-fit free block that, after applying the alignment,
> > will not fit.
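[Editor's note] The "will not fit" case can be illustrated with concrete, hypothetical numbers: a 64 KiB free block whose start is page aligned but not 64 KiB aligned cannot hold a 64 KiB allocation once the start is aligned up:

```c
#include <assert.h>

/* Round x up to the next multiple of the power-of-two alignment a. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical example: a 64 KiB free block at 0x21000 is page aligned
 * but not 64 KiB aligned. Aligning the start up to 64 KiB gives 0x30000,
 * leaving only 0x1000 bytes before the block end (0x31000), so an
 * exact-fit 64 KiB request no longer fits unless the search size
 * already includes the alignment overhead.
 */
static int exact_block_fits(unsigned long start, unsigned long size,
			    unsigned long align)
{
	return align_up(start, align) + size <= start + size;
}
```

The predicate succeeds only when the block start already satisfies the alignment, which is exactly the situation an exact-size search length relies on.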
> > This is how allocators handle alignment.
>
> This is an implementation detail of the vmalloc allocator ever since
> 68ad4a330433 ("mm/vmalloc.c: keep track of free blocks for vmap
> allocation").
>
> If callers pass an exact range, and the alignment they specify applies,
> why should we reject such an allocation? It's leaking an implementation
> detail, easily fixed internally, into callers.
>
> > Another approach is (you mentioned it in your commit message):
> >
> > urezki@pc638:~/data/raid0/coding/linux-next.git$ git diff mm/kasan/shadow.c
> > diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> > index 082ee5b6d9a1..01d3bd76c851 100644
> > --- a/mm/kasan/shadow.c
> > +++ b/mm/kasan/shadow.c
> > @@ -200,7 +200,7 @@ static int __meminit kasan_mem_notifier(struct notifier_block *nb,
> >  		if (shadow_mapped(shadow_start))
> >  			return NOTIFY_OK;
> >
> > -		ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
> > +		ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
> >  					shadow_end, GFP_KERNEL,
> >  					PAGE_KERNEL, VM_NO_GUARD,
> >  					pfn_to_nid(mem_data->start_pfn),
> > urezki@pc638:~/data/raid0/coding/linux-next.git$
> >
> > why not? Also, you could increase the range in the KASAN code.
>
> No, that's leaking implementation details to the caller. And no, increasing
> the range and eventually allocating something bigger (e.g., placing a huge
> page where it might not have been possible) is not acceptable for KASAN.
>
> If you're terribly unhappy with this patch,

Sorry to say, but it simply does not make sense.

> please suggest something reasonable to fix exact allocations:
> a) Fixes the KASAN use case.
> b) Allows for automatic placement of huge pages for exact allocations.
> c) Doesn't leak implementation details into the caller.
>

I am looking at it.

--
Vlad Rezki
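[Editor's note] David's point about an exact range where "the alignment they specify applies" can be summed up as an acceptance test: the request needs no padding at all when the range start already satisfies the alignment. A minimal userspace sketch, assuming a power-of-two alignment and hypothetical helper naming (not kernel code):

```c
#include <assert.h>

/*
 * Hypothetical acceptance test for an exact allocation: with no room to
 * shift the start, the request can only succeed if vstart already
 * satisfies the (power-of-two) alignment and the range covers the size.
 */
static int exact_range_satisfies(unsigned long vstart, unsigned long vend,
				 unsigned long size, unsigned long align)
{
	return (vstart & (align - 1)) == 0 && vend - vstart >= size;
}
```

Under this view, KASAN's shadow ranges (page aligned, exact size) pass for align == PAGE_SIZE, which is the case the discussed patch restores.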