From: Uladzislau Rezki
Date: Wed, 17 May 2023 13:26:21 +0200
To: Thomas Gleixner
Cc: Uladzislau Rezki, "Russell King (Oracle)", Andrew Morton,
 linux-mm@kvack.org, Christoph Hellwig, Lorenzo Stoakes, Peter Zijlstra,
 Baoquan He, John Ogness, linux-arm-kernel@lists.infradead.org,
 Mark Rutland, Marc Zyngier, x86@kernel.org
Subject: Re: Excessive TLB flush ranges
References: <87r0rg93z5.ffs@tglx> <87cz308y3s.ffs@tglx> <87y1lo7a0z.ffs@tglx> <87o7mk733x.ffs@tglx> <87leho6wd9.ffs@tglx>
In-Reply-To: <87leho6wd9.ffs@tglx>

On Tue, May 16, 2023 at 07:04:34PM +0200, Thomas Gleixner wrote:
> On Tue, May 16 2023 at 17:01, Uladzislau Rezki wrote:
> > On Tue, May 16, 2023 at 04:38:58PM +0200, Thomas Gleixner wrote:
> >> There is a world outside of x86, but even on x86 it's borderline silly
> >> to take the whole TLB out when you can flush 3 TLB entries one by one
> >> with exactly the same number of IPIs, i.e. _one_. No?
> >>
> > I meant if we invoke flush_tlb_kernel_range() on each VA's individual
> > range:
> >
> > void flush_tlb_kernel_range(unsigned long start, unsigned long end)
> > {
> > 	if (tlb_ops_need_broadcast()) {
> > 		struct tlb_args ta;
> > 		ta.ta_start = start;
> > 		ta.ta_end = end;
> > 		on_each_cpu(ipi_flush_tlb_kernel_range, &ta, 1);
> > 	} else
> > 		local_flush_tlb_kernel_range(start, end);
> > 	broadcast_tlb_a15_erratum();
> > }
> >
> > we should IPI and wait, no?
>
> The else clause does not do an IPI, but that's irrelevant.
>
> The proposed flush_tlb_kernel_vas(list, num_pages) mechanism
> achieves:
>
> 1) It batches multiple ranges to _one_ invocation
>
> 2) It lets the architecture decide based on the number of pages
>    whether it does a tlb_flush_all() or a flush of individual ranges.
>
> Whether the architecture uses IPIs or flushes only locally and the
> hardware propagates that is completely irrelevant.
>
> Right now any coalesced range, which is huge due to massive holes, takes
> decision #2 away.
>
> If you want to flush individual VAs from the core vmalloc code then you
> lose #1, as the aggregated number of pages might justify a
> tlb_flush_all().
>
> That's a pure architecture decision and all the core code needs to do is
> to provide appropriate information and not some completely bogus request
> to flush 17312759359 pages, i.e. a ~64.5 TB range, while in reality
> there are exactly _three_ distinct pages to flush.
>
1. I think both cases (i.e. the logic) should be moved into arch code, so
the decision of how to flush, either fully, if supported, or page by page,
which requires walking the list, is _not_ made by the vmalloc code. As for
the vmalloc interface, we can provide the list (we keep it short because
of the merging property) plus the number of pages to flush.

2. It looks like your problem is because of

void vfree(const void *addr)
{
	...
	if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
		vm_reset_perms(vm); <----
	...
}

so all purged areas are drained in the caller's context, i.e. the caller
is blocked until the drain, including the flushing, is done. I am not sure
why it is done from the caller's context. IMHO, it should be deferred the
same way as we do in:

static void free_vmap_area_noflush(struct vmap_area *va)

unless I miss the point of why vfree() has to do it directly.

--
Uladzislau Rezki
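For illustration, the arch-side decision Thomas describes in #1/#2 could be sketched as below. This is a standalone userspace mock, not kernel code: the simplified list linkage, the FLUSH_ALL_THRESHOLD value, and the flush_one_page()/tlb_flush_all() counters are all made up here; only the flush_tlb_kernel_vas(list, num_pages) shape comes from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for struct vmap_area; va_start/va_end mirror the
 * kernel's field names, the plain next pointer is a simplification. */
struct vmap_area {
	unsigned long va_start;
	unsigned long va_end;
	struct vmap_area *next;
};

#define PAGE_SHIFT	12
/* Hypothetical arch tuning knob: above this many pages a full flush is
 * assumed cheaper than flushing entries one by one. */
#define FLUSH_ALL_THRESHOLD	32

static int full_flushes;
static int page_flushes;

static void tlb_flush_all(void) { full_flushes++; }
static void flush_one_page(unsigned long addr) { (void)addr; page_flushes++; }

/*
 * Sketch of the proposed hook: core vmalloc code hands over the short
 * list of merged VAs plus the aggregated page count, and the
 * architecture, not the core code, picks full flush vs. per-page flush.
 */
static void flush_tlb_kernel_vas(struct vmap_area *list, unsigned long num_pages)
{
	struct vmap_area *va;
	unsigned long addr;

	if (num_pages > FLUSH_ALL_THRESHOLD) {
		tlb_flush_all();
		return;
	}

	for (va = list; va; va = va->next)
		for (addr = va->va_start; addr < va->va_end;
		     addr += 1UL << PAGE_SHIFT)
			flush_one_page(addr);
}
```

With three distinct pages spread across a huge hole, this issues three per-page flushes instead of a bogus multi-terabyte range flush; past the threshold it degrades gracefully to one tlb_flush_all().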