From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from phobos.denx.de (phobos.denx.de [85.214.62.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2E3C0EB64D9 for ; Thu, 6 Jul 2023 17:00:50 +0000 (UTC) Received: from h2850616.stratoserver.net (localhost [IPv6:::1]) by phobos.denx.de (Postfix) with ESMTP id A9D70861C2; Thu, 6 Jul 2023 19:00:48 +0200 (CEST) Authentication-Results: phobos.denx.de; dmarc=pass (p=none dis=none) header.from=konsulko.com Authentication-Results: phobos.denx.de; spf=pass smtp.mailfrom=u-boot-bounces@lists.denx.de Authentication-Results: phobos.denx.de; dkim=pass (1024-bit key; unprotected) header.d=konsulko.com header.i=@konsulko.com header.b="XcuU4ACn"; dkim-atps=neutral Received: by phobos.denx.de (Postfix, from userid 109) id D81C986313; Thu, 6 Jul 2023 19:00:46 +0200 (CEST) Received: from mail-yw1-x1133.google.com (mail-yw1-x1133.google.com [IPv6:2607:f8b0:4864:20::1133]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)) (No client certificate requested) by phobos.denx.de (Postfix) with ESMTPS id B601E861C2 for ; Thu, 6 Jul 2023 19:00:43 +0200 (CEST) Authentication-Results: phobos.denx.de; dmarc=pass (p=none dis=none) header.from=konsulko.com Authentication-Results: phobos.denx.de; spf=pass smtp.mailfrom=trini@konsulko.com Received: by mail-yw1-x1133.google.com with SMTP id 00721157ae682-579e5d54e68so11979237b3.1 for ; Thu, 06 Jul 2023 10:00:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=konsulko.com; s=google; t=1688662842; x=1691254842; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=yTuT9Aj71UB6HJvAQj+1OBmPm3h0dzeuAY96qcoiFA0=; b=XcuU4ACnWxOtzmOQqVmuH/ZmYpu0eZBThzfMj12g+KlqOjNnbG+GXmSF0mtFFdsnWA WqhRpA+9Nq2MMqQ7Pi7KTnZWRCCLTSfFkn9WO/zPXYwaRNG7A3L1Dze8O+tDq14fsDzJ lbijOp6u9ogX8tvxOEod0eBfCs4OaNmyK7GIQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688662842; x=1691254842; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=yTuT9Aj71UB6HJvAQj+1OBmPm3h0dzeuAY96qcoiFA0=; b=EC+E/dW6OH0roXO3yagVmcbFBxZQWkcNmRN7NwqeShchTGtPeeyZkc9a9QOh+5M83k dqZ8HhFpnXpqKTNx2+1t4R5LD84E43KILsUT9AocdQRCZlyG/OmF0VliSCf4yHk4qVMF j3TluNkNRUiW25L1DGd4aQQD6uYNQ/ioSKHCJqZsku7HEvroHH5kPIL/f8bNMNmVRo1E rzErGz2bQkeOdMiZmoHDXtebhNCS7O4GbI5Cqi34fsa1ApewMnksPQrhEIYwhfipZC+y u/68HrFza/c2SYxRu6dU5gJf5p3l0G8d3XFzb+mC7Aasy6XSAqexrBW9yMJViMI02BZi 4VPg== X-Gm-Message-State: ABy/qLa9v2Yb99cBxtJne7sJIUPe1dx83B7LQk6QEJtlO+ABccRv6UZz GUb9AbN0evLuhGFR1pzUWtmp8Q== X-Google-Smtp-Source: APBJJlGEH4z28hCqgUxt389U/tNETpxm5M7glalAJLowUb/Ht2U1Vsug43nbMeYYCLelYWgv5VxT0Q== X-Received: by 2002:a81:6ad7:0:b0:577:3b66:70b with SMTP id f206-20020a816ad7000000b005773b66070bmr2488549ywc.47.1688662842345; Thu, 06 Jul 2023 10:00:42 -0700 (PDT) Received: from bill-the-cat (2603-6081-7b00-6400-8731-f9e3-ab03-9278.res6.spectrum.com. [2603:6081:7b00:6400:8731:f9e3:ab03:9278]) by smtp.gmail.com with ESMTPSA id w7-20020a81a207000000b005706c3e5dfcsm461617ywg.48.2023.07.06.10.00.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jul 2023 10:00:41 -0700 (PDT) Date: Thu, 6 Jul 2023 13:00:40 -0400 From: Tom Rini To: Mattijs Korpershoek Cc: qianfan , Sean Anderson , Marek Vasut , Guillaume La Roque , Gary Bisson , Troy Kisky , Neil Armstrong , u-boot@lists.denx.de Subject: Re: [PATCH] lib: sparse: allocate blkcnt instead of arbitrary small number Message-ID: <20230706170040.GA7930@bill-the-cat> References: <20230616-sparse-flash-fix-v1-1-6bafeacc567b@baylibre.com> <877crz7tgc.fsf@baylibre.com> <871qhle5ou.fsf@baylibre.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="eH93xeDI7s9an2mU" Content-Disposition: inline In-Reply-To: <871qhle5ou.fsf@baylibre.com> X-Clacks-Overhead: GNU Terry Pratchett X-BeenThere: u-boot@lists.denx.de X-Mailman-Version: 2.1.39 Precedence: list List-Id: U-Boot discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: u-boot-bounces@lists.denx.de Sender: "U-Boot" X-Virus-Scanned: clamav-milter 0.103.8 at phobos.denx.de X-Virus-Status: Clean --eH93xeDI7s9an2mU Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Jul 06, 2023 at 11:43:13AM +0200, Mattijs Korpershoek wrote: > On lun., juin 19, 2023 at 10:21, Mattijs Korpershoek wrote: >=20 > > Hi Qianfan, > > > > Thank you for your review. > > > > On lun., juin 19, 2023 at 14:19, qianfan wrote: > > > >> =E5=9C=A8 2023/6/16 21:26, Mattijs Korpershoek =E5=86=99=E9=81=93: > >>> Commit 62649165cb02 ("lib: sparse: Make CHUNK_TYPE_RAW buffer aligned= ") > >>> fixed cache alignment for systems with a D-CACHE. > >>> > >>> However it introduced some performance regressions [1] on system > >>> flashing huge images, such as Android. > >>> > >>> On AM62x SK EVM, we also observe such performance penalty: > >>> Sending sparse 'super' 1/2 (768793 KB) OKAY [ 23.954s] > >>> Writing 'super' OKAY [ 75.926s] > >>> Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.641s] > >>> Writing 'super' OKAY [ 62.849s] > >>> Finished. Total time: 182.474s > >>> > >>> The reason for this is that we use an arbitrary small buffer > >>> (info->blksz * 100) for transferring. > >>> > >>> Fix it by using a bigger buffer (info->blksz * blkcnt) as suggested in > >>> the original's patch review [2]. > >>> > >>> With this patch, performance impact is mitigated: > >>> Sending sparse 'super' 1/2 (768793 KB) OKAY [ 24.006s] > >>> Writing 'super' OKAY [ 15.920s] > >>> Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.651s] > >>> Writing 'super' OKAY [ 14.665s] > >>> Finished. Total time: 74.346s > >>> > >>> [1] https://lore.kernel.org/r/20221118121323.4009193-1-gary.bisson@bo= undarydevices.com > >>> [2] https://lore.kernel.org/r/all/43e4c17c-4483-ec8e-f843-9b4c5569bd1= 8@seco.com/ > >>> > >>> Fixes: 62649165cb02 ("lib: sparse: Make CHUNK_TYPE_RAW buffer aligned= ") > >>> Signed-off-by: Mattijs Korpershoek > >>> --- > >>> lib/image-sparse.c | 2 +- > >>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> diff --git a/lib/image-sparse.c b/lib/image-sparse.c > >>> index 5ec0f94ab3eb..25aed0604192 100644 > >>> --- a/lib/image-sparse.c > >>> +++ b/lib/image-sparse.c > >>> @@ -55,7 +55,7 @@ static lbaint_t write_sparse_chunk_raw(struct spars= e_storage *info, > >>> void *data, > >>> char *response) > >>> { > >>> - lbaint_t n =3D blkcnt, write_blks, blks =3D 0, aligned_buf_blks =3D= 100; > >>> + lbaint_t n =3D blkcnt, write_blks, blks =3D 0, aligned_buf_blks =3D= blkcnt; > >> Hi: > >> > >> It's a good point that this code report the performance was affected by > >> write large small > >> mmc blks, not memory copy. > > > > I believe memory copy also affects performance, but in my case, > > it has less impact than small mmc blks. > > > > With 62649165cb02 reverted: > > Sending sparse 'super' 1/2 (768793 KB) OKAY [ 23.947s] > > Writing 'super' OKAY [ 12.983s] > > Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.600s] > > Writing 'super' OKAY [ 12.796s] > > Finished. Total time: 69.430s > > > > With aligned_buf_blks =3D blkcnt: > > Sending sparse 'super' 1/2 (768793 KB) OKAY [ 24.072s] > > Writing 'super' OKAY [ 16.177s] > > Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.681s] > > Writing 'super' OKAY [ 14.845s] > > Finished. Total time: 74.919s > > > >> > >> And I can not make sure whether memalign can always alloc such huge > >> memory when we change the > >> aligned_buf_blks to blkcnt. > > > > Could you clarify the concern here? I've dumped blkcnt for my board > > (AM62x SK EVK) and the biggest blkcnt I found was: 131072 > > > > With info->blksz =3D 512, this gives me: 512 * 131072 =3D 67108864 > > > > Which is a memalign (memory alloc) of 64MB. Is 64MB really that big? (I > > don't realize it's that much) > > > >> > >> Could you please set aligned_buf_blks to FASTBOOT_MAX_BLK_WRITE(16384) > >> and test again? > > > > With aligned_buf_blks =3D FASTBOOT_MAX_BLK_WRITE(16384): > > Sending sparse 'super' 1/2 (768793 KB) OKAY [ 23.912s] > > Writing 'super' OKAY [ 15.780s] > > Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.581s] > > Writing 'super' OKAY [ 17.192s] > > Finished. Total time: 76.569s > > > > So using FASTBOOT_MAX_BLK_WRITE is slightly worse than using blkcnt. > > But allocations (for blksz =3D 512) are smaller: 8MB instead of 64MB in= my example. > > > > I can spin up a v2 with FASTBOOT_MAX_BLK_WRITE but i'm waiting a little > > more feedback before doing so. >=20 > Hi Marek, Tom, >=20 > What's your take on this ? Can we keep blkcnt or should I respin using > FASTBOOT_MAX_BLK_WRITE ? >=20 > I have also tested this on VIM3, on > U-Boot 2023.07-rc6-00003-g923de765ee1a: >=20 > Sending sparse 'super' 1/13 (114684 KB) OKAY [ 5.442s] > Writing 'super' OKAY [ 5.791s] > Sending sparse 'super' 2/13 (114684 KB) OKAY [ 5.706s] > Writing 'super' OKAY [ 5.607s] > Sending sparse 'super' 3/13 (114684 KB) OKAY [ 5.468s] > Writing 'super' OKAY [ 5.835s] > Sending sparse 'super' 4/13 (114684 KB) OKAY [ 5.703s] > Writing 'super' OKAY [ 5.618s] > Sending sparse 'super' 5/13 (114684 KB) OKAY [ 6.176s] > Writing 'super' OKAY [ 5.421s] > Sending sparse 'super' 6/13 (104176 KB) OKAY [ 5.204s] > Writing 'super' OKAY [ 5.199s] > Sending sparse 'super' 7/13 (108856 KB) OKAY [ 5.456s] > Writing 'super' OKAY [ 5.290s] > Sending sparse 'super' 8/13 (114684 KB) OKAY [ 6.122s] > Writing 'super' OKAY [ 5.838s] > Sending sparse 'super' 9/13 (114684 KB) OKAY [ 5.951s] > Writing 'super' OKAY [ 5.857s] > Sending sparse 'super' 10/13 (100980 KB) OKAY [ 4.902s] > Writing 'super' OKAY [ 4.749s] > Sending sparse 'super' 11/13 (114681 KB) OKAY [ 6.041s] > Writing 'super' OKAY [ 5.779s] > Sending sparse 'super' 12/13 (107212 KB) OKAY [ 5.174s] > Writing 'super' OKAY [ 6.587s] > Sending sparse 'super' 13/13 (71496 KB) OKAY [ 3.717s] > Writing 'super' OKAY [ 3.744s] > Finished. Total time: 142.578s >=20 > With this patch: > Sending sparse 'super' 1/13 (114684 KB) OKAY [ 7.149s] > Writing 'super' OKAY [ 1.639s] > Sending sparse 'super' 2/13 (114684 KB) OKAY [ 6.993s] > Writing 'super' OKAY [ 1.713s] > Sending sparse 'super' 3/13 (114684 KB) OKAY [ 7.029s] > Writing 'super' OKAY [ 1.107s] > Sending sparse 'super' 4/13 (114684 KB) OKAY [ 7.027s] > Writing 'super' OKAY [ 0.162s] > Sending sparse 'super' 5/13 (114684 KB) OKAY [ 6.930s] > Writing 'super' OKAY [ 1.643s] > Sending sparse 'super' 6/13 (104176 KB) OKAY [ 6.253s] > Writing 'super' OKAY [ 2.348s] > Sending sparse 'super' 7/13 (108856 KB) OKAY [ 6.346s] > Writing 'super' OKAY [ 0.723s] > Sending sparse 'super' 8/13 (114684 KB) OKAY [ 6.715s] > Writing 'super' OKAY [ 2.848s] > Sending sparse 'super' 9/13 (114684 KB) OKAY [ 6.888s] > Writing 'super' OKAY [ 1.928s] > Sending sparse 'super' 10/13 (100980 KB) OKAY [ 5.979s] > Writing 'super' OKAY [ 1.178s] > Sending sparse 'super' 11/13 (114681 KB) OKAY [ 6.822s] > Writing 'super' OKAY [ 2.652s] > Sending sparse 'super' 12/13 (107212 KB) OKAY [ 6.414s] > Writing 'super' OKAY [ 5.109s] > Sending sparse 'super' 13/13 (71496 KB) OKAY [ 4.238s] > Writing 'super' OKAY [ 0.252s] > Finished. Total time: 108.151s >=20 > It's probably too late for v2023.07 to pick this up but can we consider > taking it for next? I was waiting for a v2, and yes, it's too late for v2023.07. Sorry for not being clear enough. --=20 Tom --eH93xeDI7s9an2mU Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEEGjx/cOCPqxcHgJu/FHw5/5Y0tywFAmSm8ysACgkQFHw5/5Y0 tyyKPAv7BkeRrjwd34AX7U5zEtziFPSQ21+RGwNGerMYUZBsg59T3cC2Ra6tuDvD mR0kj1l6BTvRCfMHBSXjr4eT0dxbbl4S36neje7DKPn6gaKpkTCXLyPv42BmVZEO mhE2Met3RN/gp6ZgkFt3BgU5l3h1MwbovwH/to6ixkqpfuiAErQyfrjJ4YLt/HHk AOjij8hrW0g0WHWpUbtDCeXYlFdmVUVevPAlql7zGNj7vFkyhF2hXMq882GxwSa8 g4JwZcUFacdVMuBT9zdaxHIuUXECvOyuyMz0GbdAt2Z85gwR3HTHAR2AwbY+i5Ie ExXVLyUG4ka+RYKs620IlkXA9okzy0mCJ+tf0kTIe6kZtDCqs4QNTAuC/SP8gVLm uvg2RmFIXLiBg+1kaihQC9hPEwnJ/wJ4W38BB82WQgR3bxXBOc0CZsOL0LAy4c0/ HrzME/f8KFApESiO9ZW2Vy2PG6YTnLxcowCXetGzcWJL4i+D7zlqa3sLH6C0TfEq bpz5CqB+ =1IYH -----END PGP SIGNATURE----- --eH93xeDI7s9an2mU--