All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: karthik nayak <karthik.188@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>
Subject: Re: [PATCH 07/10] reftable/reader: introduce refcounting
Date: Thu, 22 Aug 2024 13:35:49 +0200	[thread overview]
Message-ID: <ZscilS_kLcJyHThx@framework> (raw)
In-Reply-To: <CAOLa=ZThdm98qrcQVu2uXkHJ2meEnJJCsBSPSLMQeSwsojQ-Fg@mail.gmail.com>

On Thu, Aug 22, 2024 at 02:47:43AM -0700, karthik nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > It was recently reported that concurrent reads and writes may cause the
> > reftable backend to segfault. The root cause of this is that we do not
> > properly keep track of reftable readers across reloads.
> >
> > Suppose that you have a reftable iterator and then decide to reload the
> > stack while iterating through the iterator. When the stack has been
> > rewritten since we have created the iterator, then we would end up
> > discarding a subset of readers that may still be in use by the iterator.
> > The consequence is that we now try to reference deallocated memory,
> > which of course segfaults.
> >
> > One way to trigger this is in t5616, where some background maintenance
> > jobs have been leaking from one test into another. This leads to stack
> > traces like the following one:
> >
> >   + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
> >   AddressSanitizer:DEADLYSIGNAL
> >   =================================================================
> >   ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
> > 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
> >   ==657994==The signal is caused by a READ memory access.
> >       #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
> >       #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
> >       #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
> >       #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
> >       #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
> >       #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
> >       #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
> >       #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
> >       #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
> >       #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
> >       #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
> >       #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
> >       #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
> >       #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
> >       #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
> >       #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
> >       #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
> >       #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
> >       #18 0x55f23dba7764 in run_builtin git.c:484
> >       #19 0x55f23dba7764 in handle_builtin git.c:741
> >       #20 0x55f23dbab61e in run_argv git.c:805
> >       #21 0x55f23dbab61e in cmd_main git.c:1000
> >       #22 0x55f23dba4781 in main common-main.c:64
> >       #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> >       #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
> >       #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)
> >
> 
> The stacktrace is for iterating over refs, what I don't understand is
> where in this flow do we actually reload the stack.

Basically, whenever you call into the reftable backend we check whether
we need to reload the stack. So, when creating a reftable iterator,
reading a single ref, writing refs and so on. So in the above code flow
we had a ref iterator, but during iteration we ended up reading other
refs, as well.

> > While it is somewhat awkward that the maintenance processes survive
> > tests in the first place, it is totally expected that reftables should
> > work alright with concurrent writers. Seemingly they don't.
> >
> > The only underlying resource that we need to care about in this context
> > is the reftable reader, which is responsible for reading a single table
> > from disk. These readers get discarded immediately (unless reused) when
> > calling `reftable_stack_reload()`, which is wrong. We can only close
> > them once we know that there are no iterators using them anymore.
> >
> > Prepare for a fix by converting the reftable readers to be refcounted.
> >
> 
> Okay so my understanding is that `refcounted` refers to a reference
> count which keeps tracks of the stacks which are referring to the
> reader. The name is also used in `struct blame_origin` in blame.{c,h}.
> Makes a lot more sense now :)

Yup.

> > diff --git a/reftable/reader.h b/reftable/reader.h
> > index 88b4f3b421..3710ee09b4 100644
> > --- a/reftable/reader.h
> > +++ b/reftable/reader.h
> > @@ -50,6 +50,8 @@ struct reftable_reader {
> >  	struct reftable_reader_offsets ref_offsets;
> >  	struct reftable_reader_offsets obj_offsets;
> >  	struct reftable_reader_offsets log_offsets;
> > +
> > +	uint64_t refcount;
> 
> Wonder if there is a chance that we decrement refcount from 0 and hence
> cause a wraparound.

This should never happen in practice. And if it does, we would hit a
BUG():

    void reftable_reader_decref(struct reftable_reader *r)
    {
    	if (!r)
    		return;
    	if (!r->refcount)
    		BUG("cannot decrement ref counter of dead reader");
    	if (--r->refcount)
    		return;
    	block_source_close(&r->source);
    	FREE_AND_NULL(r->name);
    	reftable_free(r);
    }

If the refcount is at zero already, we hit the bug.

Patrick

  reply	other threads:[~2024-08-22 11:35 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-19 15:39 [PATCH 00/10] reftable: fix reload with active iterators Patrick Steinhardt
2024-08-19 15:39 ` [PATCH 01/10] reftable/blocksource: drop malloc block source Patrick Steinhardt
2024-08-19 15:39 ` [PATCH 02/10] reftable/stack: inline `stack_compact_range_stats()` Patrick Steinhardt
2024-08-22  8:23   ` karthik nayak
2024-08-19 15:39 ` [PATCH 03/10] reftable/reader: rename `reftable_new_reader()` Patrick Steinhardt
2024-08-22 18:51   ` Justin Tobler
2024-08-22 19:09     ` Junio C Hamano
2024-08-23  5:22       ` Patrick Steinhardt
2024-08-19 15:39 ` [PATCH 04/10] reftable/reader: inline `init_reader()` Patrick Steinhardt
2024-08-19 15:39 ` [PATCH 05/10] reftable/reader: inline `reader_close()` Patrick Steinhardt
2024-08-19 15:40 ` [PATCH 06/10] reftable/stack: fix broken refnames in `write_n_ref_tables()` Patrick Steinhardt
2024-08-19 15:40 ` [PATCH 07/10] reftable/reader: introduce refcounting Patrick Steinhardt
2024-08-22  9:47   ` karthik nayak
2024-08-22 11:35     ` Patrick Steinhardt [this message]
2024-08-23 10:14       ` karthik nayak
2024-08-19 15:40 ` [PATCH 08/10] reftable/reader: keep readers alive during iteration Patrick Steinhardt
2024-08-23 10:21   ` karthik nayak
2024-08-23 13:42     ` Patrick Steinhardt
2024-08-19 15:40 ` [PATCH 09/10] reftable/stack: reorder swapping in the reloaded stack contents Patrick Steinhardt
2024-08-19 15:40 ` [PATCH 10/10] reftable/stack: fix segfault when reload with reused readers fails Patrick Steinhardt
2024-08-22  8:13 ` [PATCH 00/10] reftable: fix reload with active iterators karthik nayak
2024-08-22 12:41 ` Jeff King
2024-08-22 12:53   ` Patrick Steinhardt
2024-08-22 15:15     ` Junio C Hamano
2024-08-23  5:23       ` Patrick Steinhardt
2024-08-22 15:11   ` Junio C Hamano
2024-08-23 14:12 ` [PATCH v2 " Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 01/10] reftable/blocksource: drop malloc block source Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 02/10] reftable/stack: inline `stack_compact_range_stats()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 03/10] reftable/reader: rename `reftable_new_reader()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 04/10] reftable/reader: inline `init_reader()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 05/10] reftable/reader: inline `reader_close()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 06/10] reftable/stack: fix broken refnames in `write_n_ref_tables()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 07/10] reftable/reader: introduce refcounting Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 08/10] reftable/reader: keep readers alive during iteration Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 09/10] reftable/stack: reorder swapping in the reloaded stack contents Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 10/10] reftable/stack: fix segfault when reload with reused readers fails Patrick Steinhardt
2024-08-23 16:49   ` [PATCH v2 00/10] reftable: fix reload with active iterators Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZscilS_kLcJyHThx@framework \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=karthik.188@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.