Date: Mon, 3 Oct 2022 18:39:43 +0100
From: Catalin Marinas
To: Linus Torvalds
Cc: Ard Biesheuvel, Isaac Manjarres, Herbert Xu, Will Deacon, Marc Zyngier, Arnd Bergmann, Greg Kroah-Hartman, Andrew Morton, Linux Memory Management List, Linux ARM, Linux Kernel Mailing List, "David S. Miller", Saravana Kannan, kernel-team@android.com
Subject: Re: [PATCH 07/10] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Oct 02, 2022 at 03:24:57PM -0700, Linus Torvalds wrote:
> On Sun, Oct 2, 2022 at 3:09 PM Ard Biesheuvel wrote:
> > Non-coherent DMA for networking is going to be fun, though.
>
> I agree that networking is likely the main performance issue, but I
> suspect 99% of the cases would come from __alloc_skb().

The problem is not the allocation but rather having a generic enough
dma_needs_bounce() check. It won't be able to tell whether some 1500-byte
range is for network or for crypto code that uses a small
ARCH_KMALLOC_MINALIGN. Getting the actual object size (e.g. with ksize())
doesn't tell the full story on how safe the DMA is.

> Similarly, that code already has magic stuff to try to be
> cacheline-aligned for accesses, but it's not really for DMA coherency
> reasons, just purely for performance reasons (trying to make sure that
> the header accesses stay in one cacheline etc).

Yeah, __alloc_skb() ends up using SMP_CACHE_BYTES for data alignment (via
SKB_DATA_ALIGN). I have a suspicion this may break on SoCs with a 128-byte
cache line but I haven't seen any report yet (there aren't many such
systems).

--
Catalin
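
To illustrate the SKB_DATA_ALIGN concern above, here is a minimal userspace sketch of the alignment arithmetic. It is not kernel code: the `sketch_` names, the 64-byte value standing in for SMP_CACHE_BYTES, and the 128-byte line size are all assumptions chosen for illustration. It checks whether a buffer owns every cache line it touches, which is roughly the property non-coherent DMA depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only, not taken from any particular kernel config. */
#define SKETCH_SMP_CACHE_BYTES 64u   /* assumed alignment applied to skb data */
#define SKETCH_DMA_LINE_BYTES  128u  /* assumed cache line size of the SoC */

/* Round x up to a multiple of a; a must be a non-zero power of two. */
static inline uintptr_t sketch_align_up(uintptr_t x, uintptr_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Return true if [addr, addr + len) starts and ends on cache-line
 * boundaries, i.e. the buffer shares no cache line with a neighbouring
 * object. line must be a non-zero power of two.
 */
static inline bool sketch_owns_whole_lines(uintptr_t addr, size_t len,
					   uintptr_t line)
{
	assert(line != 0 && (line & (line - 1)) == 0);
	return (addr & (line - 1)) == 0 && (len & (line - 1)) == 0;
}
```

With these helpers, a 1500-byte payload rounded up to the 64-byte alignment becomes 1536 bytes. Placed at address 0x1040, that buffer passes the check for a 64-byte line but fails it for a 128-byte line, because its first 64 bytes share a 128-byte line with whatever precedes it.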