From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from phobos.denx.de (phobos.denx.de [85.214.62.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6EA43C0015E for ; Fri, 7 Jul 2023 06:52:59 +0000 (UTC) Received: from h2850616.stratoserver.net (localhost [IPv6:::1]) by phobos.denx.de (Postfix) with ESMTP id 49610863B2; Fri, 7 Jul 2023 08:52:57 +0200 (CEST) Authentication-Results: phobos.denx.de; dmarc=none (p=none dis=none) header.from=baylibre.com Authentication-Results: phobos.denx.de; spf=pass smtp.mailfrom=u-boot-bounces@lists.denx.de Authentication-Results: phobos.denx.de; dkim=pass (2048-bit key; unprotected) header.d=baylibre-com.20221208.gappssmtp.com header.i=@baylibre-com.20221208.gappssmtp.com header.b="SQnp2XjC"; dkim-atps=neutral Received: by phobos.denx.de (Postfix, from userid 109) id 36C47863B8; Fri, 7 Jul 2023 08:52:55 +0200 (CEST) Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)) (No client certificate requested) by phobos.denx.de (Postfix) with ESMTPS id 40E48861E6 for ; Fri, 7 Jul 2023 08:52:52 +0200 (CEST) Authentication-Results: phobos.denx.de; dmarc=none (p=none dis=none) header.from=baylibre.com Authentication-Results: phobos.denx.de; spf=pass smtp.mailfrom=mkorpershoek@baylibre.com Received: by mail-wr1-x42d.google.com with SMTP id ffacd0b85a97d-3142a9ff6d8so1549453f8f.3 for ; Thu, 06 Jul 2023 23:52:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=baylibre-com.20221208.gappssmtp.com; s=20221208; t=1688712771; x=1691304771; h=content-transfer-encoding:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from:from:to:cc:subject:date:message-id :reply-to; bh=gNx8Kw5FwFZply4iRnjV/rPHr+FSKLpXafpk1noGpOM=; b=SQnp2XjCBjLwGB9FEDCVFqyUbx/jKbaA54mTQqg6ruM2VQG643v9uUPKt52q5e6tnn uLTRN4SF5vjj45+sdoovH22qOwFbxHtQkMTtNmwkSitDawsjnvVBfhdB+yYa/dgpSqnG Lk2BYJGOzCJGNrQf4wpQu0cESzfGJxCOdLdiNN5UA4ebM/IqjDNjIy5ifA3L8fQ0wPHb 7MuK99nJncZ3pM4OhBCmjV+ZzLo+49HWOQw+CW3zspfMeLsEsOx6dGHvEU4Czv5pOR2v puqB4iiWdLC0BPbEqOxbu5GN9Zmp8yIRwaLhEZdCyiXtjk+Ger39CX82+61eFf0+90NN YJlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688712771; x=1691304771; h=content-transfer-encoding:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gNx8Kw5FwFZply4iRnjV/rPHr+FSKLpXafpk1noGpOM=; b=KEpfhkqUpYl5YZOIWfCZrwnVJhnYrzhhCCYFyyVJeRFhccCcMOx71R8Bllgop53gzZ RU6ee2LeWCowKkJJ+lJSyGlks+QMJEwwt6kqqF2iNmvizEvHBNQERW0kv/xckI9r55FK XtQnhzDg1IINY5YaIj904/66twU/w2+XsKDqfAfh8EaPlvDdNPV+Sgnpb7i5jgPUqofN ZMwnY+UVYf0IHNLQIM0lHureqZe8glqtwrlEBzef20nROIszChfEKXH4jjVLJ3IzlSRm HYQB7lIYKtTt1Kkj6LGmpiKa5cfuglZwF4CU29PhnoUQVa6uUceEer8mQ+M9Y+wKZkxD rj6Q== X-Gm-Message-State: ABy/qLY0RTYAtjIOV/3VxWs5BZkw9ZXI7yl5H4JmxWZUXgqPS8jM4RYT n/maGvQNbypPSxdbIBMNysKdgA== X-Google-Smtp-Source: APBJJlFl4sQBhMMFZi6iftev8PMNbGVOYDyQ6Vp7jHuTAQPesdhHG9x4/xPHZsLyce9kpQKl0puBkw== X-Received: by 2002:a05:6000:52:b0:314:1a09:6e71 with SMTP id k18-20020a056000005200b003141a096e71mr3211216wrx.53.1688712771550; Thu, 06 Jul 2023 23:52:51 -0700 (PDT) Received: from localhost ([2a01:cb19:85e6:1900:2bf7:7388:731d:c4e1]) by smtp.gmail.com with ESMTPSA id n15-20020a5d420f000000b00314145e6d61sm3675177wrq.6.2023.07.06.23.52.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jul 2023 23:52:51 -0700 (PDT) From: Mattijs Korpershoek To: Tom Rini Cc: qianfan , Sean Anderson , Marek Vasut , Guillaume La Roque , Gary Bisson , Troy Kisky , Neil Armstrong , u-boot@lists.denx.de Subject: Re: [PATCH] lib: sparse: allocate blkcnt instead of arbitrary small number In-Reply-To: <20230706170040.GA7930@bill-the-cat> References: <20230616-sparse-flash-fix-v1-1-6bafeacc567b@baylibre.com> <877crz7tgc.fsf@baylibre.com> <871qhle5ou.fsf@baylibre.com> <20230706170040.GA7930@bill-the-cat> Date: Fri, 07 Jul 2023 08:52:50 +0200 Message-ID: <87v8ewdxh9.fsf@baylibre.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: u-boot@lists.denx.de X-Mailman-Version: 2.1.39 Precedence: list List-Id: U-Boot discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: u-boot-bounces@lists.denx.de Sender: "U-Boot" X-Virus-Scanned: clamav-milter 0.103.8 at phobos.denx.de X-Virus-Status: Clean On jeu., juil. 06, 2023 at 13:00, Tom Rini wrote: > On Thu, Jul 06, 2023 at 11:43:13AM +0200, Mattijs Korpershoek wrote: >> On lun., juin 19, 2023 at 10:21, Mattijs Korpershoek wrote: >>=20 >> > Hi Qianfan, >> > >> > Thank you for your review. >> > >> > On lun., juin 19, 2023 at 14:19, qianfan wrote: >> > >> >> =E5=9C=A8 2023/6/16 21:26, Mattijs Korpershoek =E5=86=99=E9=81=93: >> >>> Commit 62649165cb02 ("lib: sparse: Make CHUNK_TYPE_RAW buffer aligne= d") >> >>> fixed cache alignment for systems with a D-CACHE. >> >>> >> >>> However it introduced some performance regressions [1] on system >> >>> flashing huge images, such as Android. >> >>> >> >>> On AM62x SK EVM, we also observe such performance penalty: >> >>> Sending sparse 'super' 1/2 (768793 KB) OKAY [ 23.954s] >> >>> Writing 'super' OKAY [ 75.926s] >> >>> Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.641s] >> >>> Writing 'super' OKAY [ 62.849s] >> >>> Finished. Total time: 182.474s >> >>> >> >>> The reason for this is that we use an arbitrary small buffer >> >>> (info->blksz * 100) for transferring. >> >>> >> >>> Fix it by using a bigger buffer (info->blksz * blkcnt) as suggested = in >> >>> the original's patch review [2]. >> >>> >> >>> With this patch, performance impact is mitigated: >> >>> Sending sparse 'super' 1/2 (768793 KB) OKAY [ 24.006s] >> >>> Writing 'super' OKAY [ 15.920s] >> >>> Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.651s] >> >>> Writing 'super' OKAY [ 14.665s] >> >>> Finished. Total time: 74.346s >> >>> >> >>> [1] https://lore.kernel.org/r/20221118121323.4009193-1-gary.bisson@b= oundarydevices.com >> >>> [2] https://lore.kernel.org/r/all/43e4c17c-4483-ec8e-f843-9b4c5569bd= 18@seco.com/ >> >>> >> >>> Fixes: 62649165cb02 ("lib: sparse: Make CHUNK_TYPE_RAW buffer aligne= d") >> >>> Signed-off-by: Mattijs Korpershoek >> >>> --- >> >>> lib/image-sparse.c | 2 +- >> >>> 1 file changed, 1 insertion(+), 1 deletion(-) >> >>> >> >>> diff --git a/lib/image-sparse.c b/lib/image-sparse.c >> >>> index 5ec0f94ab3eb..25aed0604192 100644 >> >>> --- a/lib/image-sparse.c >> >>> +++ b/lib/image-sparse.c >> >>> @@ -55,7 +55,7 @@ static lbaint_t write_sparse_chunk_raw(struct spar= se_storage *info, >> >>> void *data, >> >>> char *response) >> >>> { >> >>> - lbaint_t n =3D blkcnt, write_blks, blks =3D 0, aligned_buf_blks = =3D 100; >> >>> + lbaint_t n =3D blkcnt, write_blks, blks =3D 0, aligned_buf_blks = =3D blkcnt; >> >> Hi: >> >> >> >> It's a good point that this code report the performance was affected = by >> >> write large small >> >> mmc blks, not memory copy. >> > >> > I believe memory copy also affects performance, but in my case, >> > it has less impact than small mmc blks. >> > >> > With 62649165cb02 reverted: >> > Sending sparse 'super' 1/2 (768793 KB) OKAY [ 23.947s] >> > Writing 'super' OKAY [ 12.983s] >> > Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.600s] >> > Writing 'super' OKAY [ 12.796s] >> > Finished. Total time: 69.430s >> > >> > With aligned_buf_blks =3D blkcnt: >> > Sending sparse 'super' 1/2 (768793 KB) OKAY [ 24.072s] >> > Writing 'super' OKAY [ 16.177s] >> > Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.681s] >> > Writing 'super' OKAY [ 14.845s] >> > Finished. Total time: 74.919s >> > >> >> >> >> And I can not make sure whether memalign can always alloc such huge >> >> memory when we change the >> >> aligned_buf_blks to blkcnt. >> > >> > Could you clarify the concern here? I've dumped blkcnt for my board >> > (AM62x SK EVK) and the biggest blkcnt I found was: 131072 >> > >> > With info->blksz =3D 512, this gives me: 512 * 131072 =3D 67108864 >> > >> > Which is a memalign (memory alloc) of 64MB. Is 64MB really that big? (I >> > don't realize it's that much) >> > >> >> >> >> Could you please set aligned_buf_blks to FASTBOOT_MAX_BLK_WRITE(16384) >> >> and test again? >> > >> > With aligned_buf_blks =3D FASTBOOT_MAX_BLK_WRITE(16384): >> > Sending sparse 'super' 1/2 (768793 KB) OKAY [ 23.912s] >> > Writing 'super' OKAY [ 15.780s] >> > Sending sparse 'super' 2/2 (629819 KB) OKAY [ 19.581s] >> > Writing 'super' OKAY [ 17.192s] >> > Finished. Total time: 76.569s >> > >> > So using FASTBOOT_MAX_BLK_WRITE is slightly worse than using blkcnt. >> > But allocations (for blksz =3D 512) are smaller: 8MB instead of 64MB i= n my example. >> > >> > I can spin up a v2 with FASTBOOT_MAX_BLK_WRITE but i'm waiting a little >> > more feedback before doing so. >>=20 >> Hi Marek, Tom, >>=20 >> What's your take on this ? Can we keep blkcnt or should I respin using >> FASTBOOT_MAX_BLK_WRITE ? >>=20 >> I have also tested this on VIM3, on >> U-Boot 2023.07-rc6-00003-g923de765ee1a: >>=20 >> Sending sparse 'super' 1/13 (114684 KB) OKAY [ 5.442s] >> Writing 'super' OKAY [ 5.791s] >> Sending sparse 'super' 2/13 (114684 KB) OKAY [ 5.706s] >> Writing 'super' OKAY [ 5.607s] >> Sending sparse 'super' 3/13 (114684 KB) OKAY [ 5.468s] >> Writing 'super' OKAY [ 5.835s] >> Sending sparse 'super' 4/13 (114684 KB) OKAY [ 5.703s] >> Writing 'super' OKAY [ 5.618s] >> Sending sparse 'super' 5/13 (114684 KB) OKAY [ 6.176s] >> Writing 'super' OKAY [ 5.421s] >> Sending sparse 'super' 6/13 (104176 KB) OKAY [ 5.204s] >> Writing 'super' OKAY [ 5.199s] >> Sending sparse 'super' 7/13 (108856 KB) OKAY [ 5.456s] >> Writing 'super' OKAY [ 5.290s] >> Sending sparse 'super' 8/13 (114684 KB) OKAY [ 6.122s] >> Writing 'super' OKAY [ 5.838s] >> Sending sparse 'super' 9/13 (114684 KB) OKAY [ 5.951s] >> Writing 'super' OKAY [ 5.857s] >> Sending sparse 'super' 10/13 (100980 KB) OKAY [ 4.902s] >> Writing 'super' OKAY [ 4.749s] >> Sending sparse 'super' 11/13 (114681 KB) OKAY [ 6.041s] >> Writing 'super' OKAY [ 5.779s] >> Sending sparse 'super' 12/13 (107212 KB) OKAY [ 5.174s] >> Writing 'super' OKAY [ 6.587s] >> Sending sparse 'super' 13/13 (71496 KB) OKAY [ 3.717s] >> Writing 'super' OKAY [ 3.744s] >> Finished. Total time: 142.578s >>=20 >> With this patch: >> Sending sparse 'super' 1/13 (114684 KB) OKAY [ 7.149s] >> Writing 'super' OKAY [ 1.639s] >> Sending sparse 'super' 2/13 (114684 KB) OKAY [ 6.993s] >> Writing 'super' OKAY [ 1.713s] >> Sending sparse 'super' 3/13 (114684 KB) OKAY [ 7.029s] >> Writing 'super' OKAY [ 1.107s] >> Sending sparse 'super' 4/13 (114684 KB) OKAY [ 7.027s] >> Writing 'super' OKAY [ 0.162s] >> Sending sparse 'super' 5/13 (114684 KB) OKAY [ 6.930s] >> Writing 'super' OKAY [ 1.643s] >> Sending sparse 'super' 6/13 (104176 KB) OKAY [ 6.253s] >> Writing 'super' OKAY [ 2.348s] >> Sending sparse 'super' 7/13 (108856 KB) OKAY [ 6.346s] >> Writing 'super' OKAY [ 0.723s] >> Sending sparse 'super' 8/13 (114684 KB) OKAY [ 6.715s] >> Writing 'super' OKAY [ 2.848s] >> Sending sparse 'super' 9/13 (114684 KB) OKAY [ 6.888s] >> Writing 'super' OKAY [ 1.928s] >> Sending sparse 'super' 10/13 (100980 KB) OKAY [ 5.979s] >> Writing 'super' OKAY [ 1.178s] >> Sending sparse 'super' 11/13 (114681 KB) OKAY [ 6.822s] >> Writing 'super' OKAY [ 2.652s] >> Sending sparse 'super' 12/13 (107212 KB) OKAY [ 6.414s] >> Writing 'super' OKAY [ 5.109s] >> Sending sparse 'super' 13/13 (71496 KB) OKAY [ 4.238s] >> Writing 'super' OKAY [ 0.252s] >> Finished. Total time: 108.151s >>=20 >> It's probably too late for v2023.07 to pick this up but can we consider >> taking it for next? > > I was waiting for a v2, and yes, it's too late for v2023.07. Sorry for > not being clear enough. Oh, sorry I did not understand that. I understand for v2023.07. Thank you for the quick answer. Will send a v2 shortly using FASTBOOT_MAX_BLK_WRITE. > > --=20 > Tom