From: Junio C Hamano <gitster@pobox.com>
To: "Nikolay Edigaryev via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Nikolay Edigaryev <edigaryev@gmail.com>
Subject: Re: [PATCH] rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer
Date: Tue, 16 Jan 2024 08:54:23 -0800 [thread overview]
Message-ID: <xmqqo7dl1ao0.fsf@gitster.g> (raw)
In-Reply-To: <pull.1645.git.git.1705261850650.gitgitgadget@gmail.com> (Nikolay Edigaryev via GitGitGadget's message of "Sun, 14 Jan 2024 19:50:50 +0000")
"Nikolay Edigaryev via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> From: Nikolay Edigaryev <edigaryev@gmail.com>
>
> '--filter=blob:limit=<n>' was introduced in 25ec7bcac0 (list-objects:
> filter objects in traverse_commit_list, 2017-11-21) and later expanded
> to bitmaps in 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
> 2020-02-14)
>
> The logic that was introduced in these commits (and that still persists
> to this day) omits blobs larger than _or equal_ to n bytes or units.
Good eyes. The former does this
if (object_length < filter_data->max_bytes)
goto include_it;
and the latter does this
if (!bitmap_get(tips, pos) &&
get_size_by_pos(bitmap_git, pos) >= limit)
bitmap_unset(to_filter, pos);
> However, the documentation (Documentation/rev-list-options.txt) states:
>
>>The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n
> bytes or units. n may be zero.
>
> Moreover, the t6113-rev-list-bitmap-filters.sh tests for exactly this
> logic, so it seems it is the documentation that needs fixing, not the
> code.
Yup. The mechanism is used for things like "we do not want a large
blob, like 100MB", and a byte on the boundary does not matter all
that much in such a countext, but it does not hurt to be more
correct ;-)
> The form '--filter=blob:none' omits all blobs.
> +
> -The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n bytes
> -or units. n may be zero. The suffixes k, m, and g can be used to name
> -units in KiB, MiB, or GiB. For example, 'blob:limit=1k' is the same
> -as 'blob:limit=1024'.
> +The form '--filter=blob:limit=<n>[kmg]' omits blobs of size at least n
> +bytes or units. n may be zero. The suffixes k, m, and g can be used
> +to name units in KiB, MiB, or GiB. For example, 'blob:limit=1k'
> +is the same as 'blob:limit=1024'.
With unnecessary paragraph wrapping, it is a bit hard to compare the
preimage and the postimage, but I manually checked that this only
does
"larger than" -> "of size at least"
and nothing else, which is expected and in line with what the
proposed commit message claimed to do. Good job.
Will queue. Thanks.
> +
> The form '--filter=object:type=(tag|commit|tree|blob)' omits all objects
> which are not of the requested type.
>
> base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
prev parent reply other threads:[~2024-01-16 16:54 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-14 19:50 [PATCH] rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer Nikolay Edigaryev via GitGitGadget
2024-01-16 16:54 ` Junio C Hamano [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqo7dl1ao0.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=edigaryev@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.