From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C2E7C433F5 for ; Mon, 24 Jan 2022 11:10:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237212AbiAXLKu (ORCPT ); Mon, 24 Jan 2022 06:10:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46530 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234207AbiAXLKu (ORCPT ); Mon, 24 Jan 2022 06:10:50 -0500 Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1CEDC06173D for ; Mon, 24 Jan 2022 03:10:49 -0800 (PST) Received: by mail-wr1-x42f.google.com with SMTP id h29so10203595wrb.5 for ; Mon, 24 Jan 2022 03:10:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:date:mime-version:user-agent:from:subject:reply-to:to:cc :references:content-language:in-reply-to:content-transfer-encoding; bh=KMIv1l+6cpJx5yTqG+dPbrzPR6YH9isja3uX5riRNDE=; b=Bdj354sgo3g8ewGJPTa4FgfOZ5LkSL9gkq9aYeW5fg8cvQyYKj5N20/0CR/2pzkL1+ Vw1N9+vpTuPVMqjzBRCQ/rkF0o2W2/XkGCZLx6pVw4RssY8WRITcyZwVeN+lofrj30mJ OUDmHhjKDdSZgBjyDc/dgOxHnH18h/JwycTNod/teFz8cW4uvuQ0ei+M/SNaJvinHVXH ATsEZm54Rf612HqL4Njy8lp/GwxLxGCX1OnrLOeY1rT0YczkownnUrbiCnVH5ExFL7OQ HE/q52CENuYdnjqb5DQSufW+7jw4zhv3WZrMN8rtzODFM6QeQ94GA9OYV3dhEHxs1Ach j1Xg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:from :subject:reply-to:to:cc:references:content-language:in-reply-to :content-transfer-encoding; bh=KMIv1l+6cpJx5yTqG+dPbrzPR6YH9isja3uX5riRNDE=; b=3wYglYod+C9iOsMNnpGenEOHIHw/fAE8FmbqCwYcKLBaUd7qSduPs0pyrKRJ4RPb0Y w40uc7r2+rCS2TmCw7w/irKfV9yFuh06DnlilODCw0QKpvM6Sp5RBv6ytRXCcy4OXin6 ok7xZjLwUdXEj60Phj2NCs2eXMNztQgSPc9PPBGZU2ya7iX7cqzBAHNJa16spBLDlbFD rPnZKUKMmUNl5fTc2xdhyHNDkv6w7bZI8edSxVjD6afac7+1HAdl+8Iw78x6P3FoKXXb MIOqxfqdOSyL+I8+wFH77dxBwkeUuFJPpG9bYAGcSSZzr5tYLKXxbPkn22BvqNm6BHTJ uhjw== X-Gm-Message-State: AOAM5300ORlrMLrMjDdJQaIYeyBBpNzaVAsYpBLGcmEdDxo+xig06Dmk EzPG15M58SnkDntJSaQv1b3HW30unFw= X-Google-Smtp-Source: ABdhPJynszGnm8NJy1T5inCtchSUIQj3YD6W8lGE370qm8E1RaZHHNzMMXp2mP5y6HIIcte2QB5hAA== X-Received: by 2002:a5d:4567:: with SMTP id a7mr13560914wrc.363.1643022648391; Mon, 24 Jan 2022 03:10:48 -0800 (PST) Received: from [192.168.1.201] ([31.185.185.186]) by smtp.googlemail.com with ESMTPSA id a6sm6068780wrx.101.2022.01.24.03.10.47 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 24 Jan 2022 03:10:48 -0800 (PST) Message-ID: <649cfa82-5fbf-71f3-244b-bc37404c40ab@gmail.com> Date: Mon, 24 Jan 2022 11:10:47 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.4.1 From: Phillip Wood Subject: Re: [PATCH v2 0/2] builtin add -p: fix hunk splitting Reply-To: phillip.wood@dunelm.org.uk To: Johannes Schindelin , Phillip Wood Cc: Junio C Hamano , =?UTF-8?B?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= , Phillip Wood via GitGitGadget , git@vger.kernel.org, =?UTF-8?Q?SZEDER_G=c3=a1bor?= References: <220111.861r1ezcxn.gmgdl@evledraar.gmail.com> <348d37a3-f01b-cda6-8ce4-1536e0bd3b06@gmail.com> <47e7b23a-56a3-b533-63ed-a854503b59ed@gmail.com> Content-Language: en-US In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Hi Dscho On 22/01/2022 09:05, Johannes Schindelin wrote: > Hi Phillip, > > first of all: thank you for these patches. I read over them and they have > my ACK. Thanks > On Wed, 19 Jan 2022, Phillip Wood wrote: > >> On 12/01/2022 18:51, Junio C Hamano wrote: >>> Phillip Wood writes: >>> >>>> I'm not sure I want to go with your extra changes. I've left some >>>> comments on them below >>>> >>>>> @@ -488,12 +499,12 @@ static int parse_diff(struct add_p_state *s, const >>>>> struct pathspec *ps) >>>>> else if (starts_with(p, "@@ ") || >>>>> (hunk == &file_diff->head && >>>>> (skip_prefix(p, "deleted file", &deleted)))) { >>>>> - if (marker == '-' || marker == '+') >>>>> - /* >>>>> - * Should not happen; previous hunk did not end >>>>> - * in a context line? Handle it anyway. >>>>> - */ >>>>> + hunk->splittable_into++; >>>>> + /* >>>>> + * Should not increment "splittable_into"; >>>>> + * previous hunk did not end in a context >>>>> + * line? Handle it anyway. >>>>> + */ >>>>> + complete_file(marker, &hunk->splittable_into); >>>>> ALLOC_GROW_BY(file_diff->hunk, file_diff->hunk_nr, 1, >>>>> file_diff->hunk_alloc); >>>> >>>> I deliberately left this alone as I think we should probably make >>>> this BUG() out instead of silently accepting an invalid diff. > > FWIW this was overzealous defensive programming on my part. More on that > below. > >>> As we are reading our own output, I agree that such a data error is >>> a BUG(). > > Indeed. I was less worried about the output format changing, and more > concerned with bugs in my parser ;-) > > Although, having said that, I had meant to verify that `git add -p` cannot > be asked to produce and consume diffs with `-U0` when I wrote that > comment. Now I did that, and I am now confident that there is no way to ask > `git add -p` to generate and use context line-free diffs: we neither add > `-U` in https://github.com/git/git/blob/v2.34.1/add-patch.c#L398-L417 > nor do we call the user-facing `git diff` command that would interpret > `diff.context`, but instead we use `git diff-index` and `git diff-files` > (which ignore that config setting). I did think about zero context diffs but realized that they can never be split so we don't need to worry about incrementing hunk->splittable_into in that case. It does mean that hunk->splittable_into will be zero in the -U0 case rather than one but I dont think that matters as we only care if it is >2 for splitting. Best Wishes Phillip >>> In any case, a helper to see if the file ended without post-context >>> is one thing, and a helper that specify what happens after we are >>> done with a single file, before we move on top the next file or >>> after processing the last file, is another thing. The latter may be >>> able to make use of the former, but the latter may want to do more >>> than that in the future. > > If you are concerned about the name of the function: maybe a better name > would be `maybe_increment_splittable_hunk_count(marker)`. > >>> >>> As complete_file() is about finalizing the processing we have done >>> to the current file, it should be used for that purpose, and nothing >>> else, I think the hunk I see at >>> https://github.com/git/git/commit/c082176f8c5a1fc1c8b2a93991ca28fd63aae73a >>> (reproduced below) is simply a nonsense. >>> >>> Stepping back a bit, though, is this helper really finalizing the >>> current file, or is it finalizing the current hunk? If it were the >>> latter, then its use in the hunk I called "nonsense" above actually >>> makes perfect sense. >> >> Even if the helper is finalizing the current hunk then I think that "nonsense" >> hunk would still wrong as it would be calling finalize_hunk() on _every_ >> context line in the hunk rather than just being called once to finalize the >> hunk. We could call the function something like update_splittable() but then >> we'd need to explain why we were calling that function at the start of a diff >> and at the end of the loop. > > Right. The point of this check is to see whether we missed counting a > splittable hunk. Then it makes more sense to call it at the beginning of a > file, at the end of a file _and_ at a context line. > > Having said all that, I am really fine with what landed in `next`. > > Thank you, > Dscho