From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 527D21ED38 for ; Tue, 25 Jul 2023 11:32:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7F5FBC433C7; Tue, 25 Jul 2023 11:32:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1690284729; bh=aeKFfjKwT/mX9zvk9wCighil8LV1zahjBSlt/uG75Nw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=vu6RhSncfYsJF73GPayF7FSR38wYecO7bRz+w8sdlAndMCG1Xc1EbTN+YuUL6xh8t bvlQHglJBhXAoffNwnUdFeiw+fYx8y1AqXzoJKDrXNUQRxeARR5cnZ26H2DtvTenRH xE6ckrqYkWYfpw7dQhVhSDnpi+0Hh1nQ1AixckU0= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Christoph Hellwig , David Sterba , Sasha Levin Subject: [PATCH 5.10 460/509] btrfs: add xxhash to fast checksum implementations Date: Tue, 25 Jul 2023 12:46:39 +0200 Message-ID: <20230725104614.814806446@linuxfoundation.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230725104553.588743331@linuxfoundation.org> References: <20230725104553.588743331@linuxfoundation.org> User-Agent: quilt/0.67 Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: David Sterba [ Upstream commit efcfcbc6a36195c42d98e0ee697baba36da94dc8 ] The implementation of XXHASH is now CPU only but still fast enough to be considered for the synchronous checksumming, like non-generic crc32c. A userspace benchmark comparing it to various implementations (patched hash-speedtest from btrfs-progs): Block size: 4096 Iterations: 1000000 Implementation: builtin Units: CPU cycles NULL-NOP: cycles: 73384294, cycles/i 73 NULL-MEMCPY: cycles: 228033868, cycles/i 228, 61664.320 MiB/s CRC32C-ref: cycles: 24758559416, cycles/i 24758, 567.950 MiB/s CRC32C-NI: cycles: 1194350470, cycles/i 1194, 11773.433 MiB/s CRC32C-ADLERSW: cycles: 6150186216, cycles/i 6150, 2286.372 MiB/s CRC32C-ADLERHW: cycles: 626979180, cycles/i 626, 22427.453 MiB/s CRC32C-PCL: cycles: 466746732, cycles/i 466, 30126.699 MiB/s XXHASH: cycles: 860656400, cycles/i 860, 16338.188 MiB/s Comparing purely software implementation (ref), current outdated accelerated using crc32q instruction (NI), optimized implementations by M. Adler (https://stackoverflow.com/questions/17645167/implementing-sse-4-2s-crc32c-in-software/17646775#17646775) and the best one that was taken from kernel using the PCLMULQDQ instruction (PCL). Reviewed-by: Christoph Hellwig Signed-off-by: David Sterba Signed-off-by: Sasha Levin --- fs/btrfs/disk-io.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 5a114cad988a6..608b939a4d287 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2256,6 +2256,9 @@ static int btrfs_init_csum_hash(struct btrfs_fs_info *fs_info, u16 csum_type) if (!strstr(crypto_shash_driver_name(csum_shash), "generic")) set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags); break; + case BTRFS_CSUM_TYPE_XXHASH: + set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags); + break; default: break; } -- 2.39.2