From: Matheus Tavares
To: git@vger.kernel.org
Cc: jeffhost@microsoft.com, chriscool@tuxfamily.org, peff@peff.net, t.gummerer@gmail.com, newren@gmail.com
Subject: [PATCH v2 00/19] Parallel Checkout (part I)
Date: Tue, 22 Sep 2020 19:49:14 -0300

This series adds helper workers to checkout, parallelizing the reading,
filtering and writing of multiple blobs to the working tree.

Since v1, I got the chance to benchmark parallel checkout on more
machines. The results showed that the parallelization is most effective
for repositories located on SSDs or over distributed file systems. For
local file systems on spinning disks, it does not always bring good
performance; in fact, it sometimes even causes a slowdown. But given the
results in the first two cases, I think it's worth having the parallel
code as an optional (and non-default) setting.

The size of the repository being checked out and the compression level
of the packfiles also influence how much performance we can gain from
parallel checkout. For example, downloading the Linux repo from GitHub
and from kernel.org, I got packfiles of 2.9GB and 1.4GB, respectively.
The number of objects was the same, but GitHub's packfile had fewer
delta chains with size >= 7 [A]. For this reason, the sequential
checkout after GitHub's clone was considerably faster than the
sequential checkout after kernel.org's clone, and the speedup from
parallel checkout was more modest (though still faster in absolute
terms).

[A]: https://docs.google.com/spreadsheets/d/1dDGLym77JAGCVYhKQHe44r3pqtrsvHrjS4NmD_Hqr6k/edit?usp=sharing

V2 got bigger with tests and some additional optimizations, so I decided
to divide the original series into two parts to facilitate reviewing.
This first part is organized as follows:

- The first 9 patches are preparatory steps in convert.c and entry.c.
- The middle 6 actually implement parallel checkout.
- The last 4 add tests.

Part II will contain some extra optimizations, like work stealing and
the creation of leading directories in parallel. With that, workers
won't need to stat() the path components again before opening the files
for writing. We will also skip some stat() calls during clone.

Major changes since v1:

General:

- Added tests.
- Parallel checkout is no longer the default, since not all machines
  benefit from it.
- Rebased on top of master to use the adjusted mem_pool API of
  en/mem-pool.

Patch 10:

- Converted BUG() to error() in handle_results(), for the case where we
  finish parallel checkout with pending entries. This is not really a
  BUG; it can happen when a worker dies before sending all of its
  results. Also, by emitting an error message instead of die()'ing, we
  can continue processing the next results and, thus, avoid wasting
  successful work.
- Added missing initialization of ci->status in enqueue_entry().
- Fixed a bug in which the collision report during clone would be
  incorrect when the file that is written first appears after its
  colliding pair in the cache array.
- Reworded the commit message and added a comment in handle_results()
  to explain why we retry writing entries with path collisions (see the
  sketch right after these notes).
- Renamed CI_RETRY to CI_COLLISION, to make it easier to change the
  behavior on collided entries in the future, if necessary.
- Some other minor changes, such as:
  * Removed the unnecessary PC_HANDLING_RESULTS status.
  * Statically allocated the global parallel_checkout struct.
  * Renamed checkout_item to parallel_checkout_item.
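To make the collision handling above a little more concrete, here is a
minimal, self-contained sketch of the retry idea. The types and names
below are simplified stand-ins, not the actual code from patch 10:
entries that a worker could not write because of a path collision are
flagged, and the main process writes them again sequentially after all
workers have reported back.

/* Illustrative sketch only; simplified stand-ins for the real types. */
#include <stdio.h>
#include <stddef.h>

enum item_status {
	ITEM_PENDING,    /* no result received (e.g. the worker died) */
	ITEM_SUCCESS,    /* written by a worker */
	ITEM_COLLISION,  /* worker hit a path collision; retry sequentially */
};

struct item_sketch {
	const char *path;
	enum item_status status;
};

/*
 * After the parallel phase, colliding entries are written one by one by
 * the main process. On a case-insensitive filesystem, "FILE" and "file"
 * collide, so only this sequential pass can safely write the second one
 * (and report the collision, e.g. during clone).
 */
static void handle_results_sketch(struct item_sketch *items, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (items[i].status == ITEM_PENDING) {
			/* Missing result: report an error, not a BUG. */
			fprintf(stderr, "error: '%s' was not checked out\n",
				items[i].path);
		} else if (items[i].status == ITEM_COLLISION) {
			/* The real code would call checkout_entry() here. */
			printf("retrying '%s' sequentially\n", items[i].path);
			items[i].status = ITEM_SUCCESS;
		}
	}
}

int main(void)
{
	struct item_sketch items[] = {
		{ "FILE", ITEM_SUCCESS },
		{ "file", ITEM_COLLISION },
	};
	handle_results_sketch(items, 2);
	return 0;
}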
Patch 11:

- Made parse_and_save_result() safer by checking that the received data
  has the expected size, instead of trusting ci->status and possibly
  accessing an invalid address on errors.
- Limited the number of workers to the number of enqueued entries.
- Added a comment in packet_to_ci() mentioning why it's OK to encode
  NULL as a zero-length string when sending the working_tree_encoding
  to workers.
- Split the subprocess spawning and finalizing loops, to mitigate the
  spawn/wait cost (see the simplified sketch at the end of this
  message).
- Don't die() when a worker exits with an error code (only report the
  error), so that we don't waste good work by failing to update the
  index with the stat information of the entries that were successfully
  written.
- Renamed checkout.workersThreshold to checkout.thresholdForParallelism.

Jeff Hostetler (4):
  convert: make convert_attrs() and convert structs public
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: add get_stream_filter_ca() variant
  convert: add conv_attrs classification

Matheus Tavares (15):
  entry: extract a header file for entry.c functions
  entry: make fstat_output() and read_blob_entry() public
  entry: extract cache_entry update from write_entry()
  entry: move conv_attrs lookup up to checkout_entry()
  entry: add checkout_entry_ca() which takes preloaded conv_attrs
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: support progress displaying
  make_transient_cache_entry(): optionally alloc from mem_pool
  builtin/checkout.c: complete parallel checkout support
  checkout-index: add parallel checkout support
  parallel-checkout: add tests for basic operations
  parallel-checkout: add tests related to clone collisions
  parallel-checkout: add tests related to .gitattributes
  ci: run test round with parallel-checkout enabled

 .gitignore                              |   1 +
 Documentation/config/checkout.txt       |  21 +
 Makefile                                |   2 +
 apply.c                                 |   1 +
 builtin.h                               |   1 +
 builtin/checkout--helper.c              | 142 ++++++
 builtin/checkout-index.c                |  17 +
 builtin/checkout.c                      |  21 +-
 builtin/difftool.c                      |   3 +-
 cache.h                                 |  34 +-
 ci/run-build-and-tests.sh               |   1 +
 convert.c                               | 121 +++--
 convert.h                               |  68 +++
 entry.c                                 | 102 ++--
 entry.h                                 |  54 ++
 git.c                                   |   2 +
 parallel-checkout.c                     | 631 ++++++++++++++++++++++++
 parallel-checkout.h                     | 103 ++++
 read-cache.c                            |  12 +-
 t/README                                |   4 +
 t/lib-encoding.sh                       |  25 +
 t/lib-parallel-checkout.sh              |  45 ++
 t/t0028-working-tree-encoding.sh        |  25 +-
 t/t2080-parallel-checkout-basics.sh     | 197 ++++++++
 t/t2081-parallel-checkout-collisions.sh | 116 +++++
 t/t2082-parallel-checkout-attributes.sh | 174 +++++++
 unpack-trees.c                          |  22 +-
 27 files changed, 1793 insertions(+), 152 deletions(-)
 create mode 100644 builtin/checkout--helper.c
 create mode 100644 entry.h
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h
 create mode 100644 t/lib-encoding.sh
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh
 create mode 100755 t/t2081-parallel-checkout-collisions.sh
 create mode 100755 t/t2082-parallel-checkout-attributes.sh

-- 
2.28.0
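As an aside for reviewers, below is a small, self-contained illustration
of the spawn/wait split mentioned in the Patch 11 notes. It uses plain
fork() and waitpid() rather than Git's run-command API, so the names are
only stand-ins: the point is that starting every worker before waiting
for any of them lets their start-up costs overlap instead of being paid
one after the other.

/* Illustrative sketch only; the series uses Git's subprocess helpers,
 * not raw fork()/waitpid(). */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NR_WORKERS 4

static void worker_main(int id)
{
	/* A real worker would receive work items from the parent here. */
	printf("worker %d running\n", id);
	_exit(0);
}

int main(void)
{
	pid_t pids[NR_WORKERS];
	int i;

	/* First loop: spawn every worker before waiting for any of them. */
	for (i = 0; i < NR_WORKERS; i++) {
		pids[i] = fork();
		if (pids[i] < 0) {
			perror("fork");
			return 1;
		}
		if (!pids[i])
			worker_main(i);
	}

	/* Second loop: only now collect the workers' exit statuses. */
	for (i = 0; i < NR_WORKERS; i++) {
		int status;

		if (waitpid(pids[i], &status, 0) < 0)
			perror("waitpid");
		else if (!WIFEXITED(status) || WEXITSTATUS(status))
			fprintf(stderr, "worker %d failed; continuing\n", i);
	}
	return 0;
}

The same structure also goes well with only reporting (rather than
die()'ing on) a worker that exits with an error, as mentioned in the
notes above.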