git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Hogg <phogg@novamoon.net>
To: Git Mailing List <git@vger.kernel.org>
Subject: Fwd: [PATCH] pack-objects.c: Initialize read mutex in cmd_pack_objects
Date: Fri, 18 Jan 2019 08:07:20 -0500	[thread overview]
Message-ID: <CAFOcBz=pcLO-t0FSPHQ9eos4SN+nK3kyFr2xTTyGNu447PFnQw@mail.gmail.com> (raw)
In-Reply-To: <CAFOcBzmCWBjng_HqFthSrg3eKcEHpQLaa5buKAcm8JHt7EsGdA@mail.gmail.com>

On Fri, Jan 18, 2019 at 4:21 AM Duy Nguyen <pclouds@gmail.com> wrote:
>
> On Fri, Jan 18, 2019 at 9:28 AM Patrick Hogg <phogg@novamoon.net> wrote:
> >
> > ac77d0c37 ("pack-objects: shrink size field in struct object_entry",
> > 2018-04-14) added an extra usage of read_lock/read_unlock in the newly
> > introduced oe_get_size_slow for thread safety in parallel calls to
> > try_delta(). Unfortunately oe_get_size_slow is also used in serial
> > code, some of which is called before the first invocation of
> > ll_find_deltas. As such the read mutex is not guaranteed to be
> > initialized.
>
> This must be the SIZE() macros in type_size_sort(), isn't it? I think
> we hit the same problem (use of uninitialized mutex) in this same code
> not long ago. I wonder if there's anyway we can reliably test and
> catch this.

It was actually the SET_SIZE macro in check_object, at least for the
repo at my company that hits this issue.  I took a look at the call
tree for oe_get_size_slow and found that it's used in many places
outside of ll_find_deltas, so there are many potential call sites
where this could crop up:
oe_get_size_slow
    oe_size (SIZE macro)
        write_reuse_object
            write_object
                write_one
                    write_pack_file
                        cmd_pack_objects
        type_size_sort
            prepare_pack
                cmd_pack_objects
        try_delta
            find_deltas
                threaded_find_deltas
                    ll_find_deltas
                        prepare_pack
                            cmd_pack_objects
                ll_find_deltas
                    prepare_pack
                        cmd_pack_objects
        free_unpacked
            find_deltas
                threaded_find_deltas
                    ll_find_deltas
                        prepare_pack
                            cmd_pack_objects
                ll_find_deltas
                    prepare_pack
                        cmd_pack_objects
    oe_size_less_than
        prepare_pack
            cmd_pack_objects
    oe_size_greater_than
        write_no_reuse_object
            write_reuse_object
                write_object
                    write_one
                        write_pack_file
                            cmd_pack_objects
            write_object
                write_one
                    write_pack_file
                        cmd_pack_objects
        get_object_details
            prepare_pack
                cmd_pack_objects
    oe_set_size (SET_SIZE macro)
        check_object
            get_object_details
                prepare_pack
                    cmd_pack_objects
        drop_reused_delta
            break_delta_chains
                get_object_details
                    prepare_pack
                        cmd_pack_objects
(Sorry if this is redundant for those who know the code better)

>
>
> > Resolve this by splitting off the read mutex initialization from
> > init_threaded_search. Instead initialize (and clean up) the read
> > mutex in cmd_pack_objects.
>
> Maybe move the mutex to 'struct packing_data' and initialize it in
> prepare_packing_data(), so we centralize mutex at two locations:
> generic ones go there, command-specific mutexes stay here in
> init_threaded_search(). We could also move oe_get_size_slow() back to
> pack-objects.c (the one outside builtin/).

I was already thinking that generic mutexes should be separated from
command specific ones (that's why I introduced init_read_mutex and
cleanup_read_mutex, but that may well not be the right exposure.)
I'll try my hand at this tonight (just moving the mutex to struct
packing_data and initializing it in prepare_packing_data, I'll leave
large code moves to the experts) and see how it turns out.

>
>
> > Signed-off-by: Patrick Hogg <phogg@novamoon.net>
> > ---
> >  builtin/pack-objects.c | 16 ++++++++++++++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> > index 411aefd68..9084bef02 100644
> > --- a/builtin/pack-objects.c
> > +++ b/builtin/pack-objects.c
> > @@ -2381,22 +2381,30 @@ static pthread_cond_t progress_cond;
> >   */
> >  static void init_threaded_search(void)
> >  {
> > -       init_recursive_mutex(&read_mutex);
> >         pthread_mutex_init(&cache_mutex, NULL);
> >         pthread_mutex_init(&progress_mutex, NULL);
> >         pthread_cond_init(&progress_cond, NULL);
> >         old_try_to_free_routine = set_try_to_free_routine(try_to_free_from_threads);
> >  }
> >
> > +static void init_read_mutex(void)
> > +{
> > +       init_recursive_mutex(&read_mutex);
> > +}
> > +
> >  static void cleanup_threaded_search(void)
> >  {
> >         set_try_to_free_routine(old_try_to_free_routine);
> >         pthread_cond_destroy(&progress_cond);
> > -       pthread_mutex_destroy(&read_mutex);
> >         pthread_mutex_destroy(&cache_mutex);
> >         pthread_mutex_destroy(&progress_mutex);
> >  }
> >
> > +static void cleanup_read_mutex(void)
> > +{
> > +       pthread_mutex_destroy(&read_mutex);
> > +}
> > +
> >  static void *threaded_find_deltas(void *arg)
> >  {
> >         struct thread_params *me = arg;
> > @@ -3319,6 +3327,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
> >                 OPT_END(),
> >         };
> >
> > +       init_read_mutex();
> > +
> >         if (DFS_NUM_STATES > (1 << OE_DFS_STATE_BITS))
> >                 BUG("too many dfs states, increase OE_DFS_STATE_BITS");
> >
> > @@ -3495,5 +3505,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
> >                            _("Total %"PRIu32" (delta %"PRIu32"),"
> >                              " reused %"PRIu32" (delta %"PRIu32")"),
> >                            written, written_delta, reused, reused_delta);
> > +
> > +       cleanup_read_mutex();
> >         return 0;
> >  }
> > --
> > 2.20.1.windows.1
> >
>
>
> --
> Duy

  parent reply	other threads:[~2019-01-18 13:08 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-18  2:27 [PATCH] pack-objects.c: Initialize read mutex in cmd_pack_objects Patrick Hogg
2019-01-18  9:21 ` Duy Nguyen
     [not found]   ` <CAFOcBzmCWBjng_HqFthSrg3eKcEHpQLaa5buKAcm8JHt7EsGdA@mail.gmail.com>
2019-01-18 13:07     ` Patrick Hogg [this message]
2019-01-18 13:09     ` Duy Nguyen
2019-01-19  1:46       ` Patrick Hogg
2019-01-18 10:54 ` Johannes Schindelin
2019-01-22 23:05 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFOcBz=pcLO-t0FSPHQ9eos4SN+nK3kyFr2xTTyGNu447PFnQw@mail.gmail.com' \
    --to=phogg@novamoon.net \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).