From: Derrick Stolee <derrickstolee@github.com>
To: "Junio C Hamano" <gitster@pobox.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org, me@ttaylorr.com
Subject: Re: [PATCH] object-file: reprepare alternates when necessary
Date: Wed, 8 Mar 2023 08:29:18 -0500 [thread overview]
Message-ID: <64903e15-014f-a81e-26dc-83e16b632b2b@github.com> (raw)
In-Reply-To: <xmqqttyw1ndr.fsf@gitster.g>
On 3/7/2023 12:29 PM, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> But in fact we've been doing the locking since 6c307626f1e (grep:
>> protect packed_git [re-]initialization, 2020-01-15). So the only thing
>> that really needs justification here is that putting the alternates
>> re-reading under the same lock
>>
>> There is a really interesting potential caveat here which you're not
>> discussing, which is...
>>> ...
>>> +void reprepare_alt_odb(struct repository *r)
>>> +{
>>> + r->objects->loaded_alternates = 0;
>>> + prepare_alt_odb(r);
>>> +}
>>> +
>>> ...
>> This seems reasonable, but wouldn't this do the same without introducing
>> an API function just for this one use-case?
>>
>> That's of course a nit, and you seem to have been adding this for
>> consistency with reprepare_packed_git(), but it already "owns" the
>> "approximate_object_count_valid" and "packed_git_initialized" flags in
>> "struct raw_object_store".
>>
>> So as we'll only need this from reprepare_packed_git() isn't it better
>> to declare that "loaded_alternates" is another such flag?
>
> I am not sure I got what you want to say 100%, but if you are saying
> that this "drop the 'loaded' flag and force prepare_*() function to
> redo its thing" must not be done only in reprepare_packed_git(), and
> that inlining the code there without introducing a helper function
> that anybody can casually call without thinking its consequenced
> through, then I tend to agree in principle. But it is just as easy
> to lift two lines of code from the rewritten/inlined code to a new
> place---to ensure people follow the obj_read_lock() rule, the
> comment before it may have to be a bit stronger, I wonder?
The fact that we do it in a lock in reprepare_packed_git() in the
only current caller does raise suspicion that someone could call it
later and not realize that the lock could be helpful. Inlining the
change within reprepare_packed_git() makes the most sense here
instead of mimicking the pattern.
>> Perhaps not, but as the resulting patch is much shorter it seems worth
>> considering.
The shortness of the patch is metric of quality, to me. The other
reason (we might introduce a footgun) is more interesting.
>> ...but to continue the above, the *really* important thing here (and
>> correct me if I'm wrong) is that we really need to *first* prepare the
>> alternates, and *then* do the rest, as our new alternates might point to
>> new loose objects and packs.
>
> Yes, and as Derrick explained in another message, we only have to
> worry about new ones getting added, not existing ones going away.
I'll be sure to clarify that in my next version.
>> So with both of the above (the same could be done with your new helper)
>> something like this would IMO make that much clearer:
>>
>> diff --git a/packfile.c b/packfile.c
>> index 79e21ab18e7..50cb46ca4b7 100644
>> --- a/packfile.c
>> +++ b/packfile.c
>> @@ -1008,6 +1008,13 @@ void reprepare_packed_git(struct repository *r)
>> struct object_directory *odb;
>>
>> obj_read_lock();
>> + /*
>> + * New alternates might point to new loose & pack dirs, so we
>> + * must first read those.
>> + */
>> + r->objects->loaded_alternates = 0;
>> + prepare_alt_odb(r);
>> +
>> for (odb = r->objects->odb; odb; odb = odb->next)
>> odb_clear_loose_cache(odb);
>>
>> And, I think this is an exsiting edge case, but we're only locking the
>> ODB of the "parent" repository in this case, so if we have alternates in
>> play aren't we going to racily compute the rest here, the loose objects
>> and packs of the alternates we're about to consult aren't under the same
>> lock?
I don't understand what you are saying here. odb_read_lock() does not
specify a repository and is instead a global lock on reading from any
object database.
Here is its implementation:
extern int obj_read_use_lock;
extern pthread_mutex_t obj_read_mutex;
static inline void obj_read_lock(void)
{
if(obj_read_use_lock)
pthread_mutex_lock(&obj_read_mutex);
}
static inline void obj_read_unlock(void)
{
if(obj_read_use_lock)
pthread_mutex_unlock(&obj_read_mutex);
}
So I don't believe that your proposed edge case exists.
Thanks,
-Stolee
next prev parent reply other threads:[~2023-03-08 13:31 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-06 20:59 [PATCH] object-file: reprepare alternates when necessary Derrick Stolee via GitGitGadget
2023-03-06 22:54 ` Junio C Hamano
2023-03-07 0:28 ` Taylor Blau
2023-03-07 14:52 ` Derrick Stolee
2023-03-07 17:16 ` Junio C Hamano
2023-03-08 15:55 ` Taylor Blau
2023-03-08 17:13 ` Derrick Stolee
2023-03-07 11:28 ` Ævar Arnfjörð Bjarmason
2023-03-07 17:29 ` Junio C Hamano
2023-03-07 18:18 ` Junio C Hamano
2023-03-08 13:29 ` Derrick Stolee [this message]
2023-03-08 18:47 ` [PATCH v2] " Derrick Stolee via GitGitGadget
2023-03-08 19:35 ` Junio C Hamano
2023-03-08 20:47 ` Taylor Blau
2023-03-09 7:24 ` Jeff King
2023-03-09 9:06 ` Eric Wong
2023-03-10 21:29 ` Jonathan Tan
2023-03-11 0:01 ` Junio C Hamano
2023-03-11 3:09 ` Jonathan Tan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=64903e15-014f-a81e-26dc-83e16b632b2b@github.com \
--to=derrickstolee@github.com \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).