git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: karthik nayak <karthik.188@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>
Subject: Re: [PATCH 07/10] reftable/reader: introduce refcounting
Date: Fri, 23 Aug 2024 03:14:11 -0700	[thread overview]
Message-ID: <CAOLa=ZRf9cGyB23EUn_Mb1inJuqNVA85tGGN6SWxcVjbc98QQg@mail.gmail.com> (raw)
In-Reply-To: <ZscilS_kLcJyHThx@framework>

[-- Attachment #1: Type: text/plain, Size: 5696 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> On Thu, Aug 22, 2024 at 02:47:43AM -0700, karthik nayak wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>> > It was recently reported that concurrent reads and writes may cause the
>> > reftable backend to segfault. The root cause of this is that we do not
>> > properly keep track of reftable readers across reloads.
>> >
>> > Suppose that you have a reftable iterator and then decide to reload the
>> > stack while iterating through the iterator. When the stack has been
>> > rewritten since we have created the iterator, then we would end up
>> > discarding a subset of readers that may still be in use by the iterator.
>> > The consequence is that we now try to reference deallocated memory,
>> > which of course segfaults.
>> >
>> > One way to trigger this is in t5616, where some background maintenance
>> > jobs have been leaking from one test into another. This leads to stack
>> > traces like the following one:
>> >
>> >   + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
>> >   AddressSanitizer:DEADLYSIGNAL
>> >   =================================================================
>> >   ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
>> > 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
>> >   ==657994==The signal is caused by a READ memory access.
>> >       #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
>> >       #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
>> >       #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
>> >       #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
>> >       #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
>> >       #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
>> >       #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
>> >       #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
>> >       #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
>> >       #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
>> >       #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
>> >       #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
>> >       #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
>> >       #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
>> >       #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
>> >       #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
>> >       #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
>> >       #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
>> >       #18 0x55f23dba7764 in run_builtin git.c:484
>> >       #19 0x55f23dba7764 in handle_builtin git.c:741
>> >       #20 0x55f23dbab61e in run_argv git.c:805
>> >       #21 0x55f23dbab61e in cmd_main git.c:1000
>> >       #22 0x55f23dba4781 in main common-main.c:64
>> >       #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>> >       #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
>> >       #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)
>> >
>>
>> The stacktrace is for iterating over refs, what I don't understand is
>> where in this flow do we actually reload the stack.
>
> Basically, whenever you call into the reftable backend we check whether
> we need to reload the stack. So, when creating a reftable iterator,
> reading a single ref, writing refs and so on. So in the above code flow
> we had a ref iterator, but during iteration we ended up reading other
> refs, as well.
>

Yes, makes sense. Thanks for clarifying.

>> > While it is somewhat awkward that the maintenance processes survive
>> > tests in the first place, it is totally expected that reftables should
>> > work alright with concurrent writers. Seemingly they don't.
>> >
>> > The only underlying resource that we need to care about in this context
>> > is the reftable reader, which is responsible for reading a single table
>> > from disk. These readers get discarded immediately (unless reused) when
>> > calling `reftable_stack_reload()`, which is wrong. We can only close
>> > them once we know that there are no iterators using them anymore.
>> >
>> > Prepare for a fix by converting the reftable readers to be refcounted.
>> >
>>
>> Okay so my understanding is that `refcounted` refers to a reference
>> count which keeps tracks of the stacks which are referring to the
>> reader. The name is also used in `struct blame_origin` in blame.{c,h}.
>> Makes a lot more sense now :)
>
> Yup.
>
>> > diff --git a/reftable/reader.h b/reftable/reader.h
>> > index 88b4f3b421..3710ee09b4 100644
>> > --- a/reftable/reader.h
>> > +++ b/reftable/reader.h
>> > @@ -50,6 +50,8 @@ struct reftable_reader {
>> >  	struct reftable_reader_offsets ref_offsets;
>> >  	struct reftable_reader_offsets obj_offsets;
>> >  	struct reftable_reader_offsets log_offsets;
>> > +
>> > +	uint64_t refcount;
>>
>> Wonder if there is a chance that we decrement refcount from 0 and hence
>> cause a wraparound.
>
> This should never happen in practice. And if it does, we would hit a
> BUG():
>
>     void reftable_reader_decref(struct reftable_reader *r)
>     {
>     	if (!r)
>     		return;
>     	if (!r->refcount)
>     		BUG("cannot decrement ref counter of dead reader");
>     	if (--r->refcount)
>     		return;
>     	block_source_close(&r->source);
>     	FREE_AND_NULL(r->name);
>     	reftable_free(r);
>     }
>
> If the refcount is at zero already, we hit the bug.
>
> Patrick

Yup, this seems correct. Thanks.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

  reply	other threads:[~2024-08-23 10:14 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-19 15:39 [PATCH 00/10] reftable: fix reload with active iterators Patrick Steinhardt
2024-08-19 15:39 ` [PATCH 01/10] reftable/blocksource: drop malloc block source Patrick Steinhardt
2024-08-19 15:39 ` [PATCH 02/10] reftable/stack: inline `stack_compact_range_stats()` Patrick Steinhardt
2024-08-22  8:23   ` karthik nayak
2024-08-19 15:39 ` [PATCH 03/10] reftable/reader: rename `reftable_new_reader()` Patrick Steinhardt
2024-08-22 18:51   ` Justin Tobler
2024-08-22 19:09     ` Junio C Hamano
2024-08-23  5:22       ` Patrick Steinhardt
2024-08-19 15:39 ` [PATCH 04/10] reftable/reader: inline `init_reader()` Patrick Steinhardt
2024-08-19 15:39 ` [PATCH 05/10] reftable/reader: inline `reader_close()` Patrick Steinhardt
2024-08-19 15:40 ` [PATCH 06/10] reftable/stack: fix broken refnames in `write_n_ref_tables()` Patrick Steinhardt
2024-08-19 15:40 ` [PATCH 07/10] reftable/reader: introduce refcounting Patrick Steinhardt
2024-08-22  9:47   ` karthik nayak
2024-08-22 11:35     ` Patrick Steinhardt
2024-08-23 10:14       ` karthik nayak [this message]
2024-08-19 15:40 ` [PATCH 08/10] reftable/reader: keep readers alive during iteration Patrick Steinhardt
2024-08-23 10:21   ` karthik nayak
2024-08-23 13:42     ` Patrick Steinhardt
2024-08-19 15:40 ` [PATCH 09/10] reftable/stack: reorder swapping in the reloaded stack contents Patrick Steinhardt
2024-08-19 15:40 ` [PATCH 10/10] reftable/stack: fix segfault when reload with reused readers fails Patrick Steinhardt
2024-08-22  8:13 ` [PATCH 00/10] reftable: fix reload with active iterators karthik nayak
2024-08-22 12:41 ` Jeff King
2024-08-22 12:53   ` Patrick Steinhardt
2024-08-22 15:15     ` Junio C Hamano
2024-08-23  5:23       ` Patrick Steinhardt
2024-08-22 15:11   ` Junio C Hamano
2024-08-23 14:12 ` [PATCH v2 " Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 01/10] reftable/blocksource: drop malloc block source Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 02/10] reftable/stack: inline `stack_compact_range_stats()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 03/10] reftable/reader: rename `reftable_new_reader()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 04/10] reftable/reader: inline `init_reader()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 05/10] reftable/reader: inline `reader_close()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 06/10] reftable/stack: fix broken refnames in `write_n_ref_tables()` Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 07/10] reftable/reader: introduce refcounting Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 08/10] reftable/reader: keep readers alive during iteration Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 09/10] reftable/stack: reorder swapping in the reloaded stack contents Patrick Steinhardt
2024-08-23 14:12   ` [PATCH v2 10/10] reftable/stack: fix segfault when reload with reused readers fails Patrick Steinhardt
2024-08-23 16:49   ` [PATCH v2 00/10] reftable: fix reload with active iterators Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOLa=ZRf9cGyB23EUn_Mb1inJuqNVA85tGGN6SWxcVjbc98QQg@mail.gmail.com' \
    --to=karthik.188@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).