git.vger.kernel.org archive mirror
From: Eric Wong <e@80x24.org>
To: Jeff King <peff@peff.net>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	git@vger.kernel.org, "Nicolas Pitre" <nico@fluxnic.net>,
	"Lukas Sandström" <luksan@gmail.com>
Subject: Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)
Date: Fri, 5 Aug 2016 09:14:32 +0000
Message-ID: <20160805091432.GB30906@starla>
In-Reply-To: <20160805083446.3bweapo4pxsjrgcs@sigill.intra.peff.net>

Jeff King <peff@peff.net> wrote:
> On Fri, Aug 05, 2016 at 08:26:30AM +0000, Eric Wong wrote:
> 
> > > I'm not sure which mallocs you mean. I allocate one struct per node,
> > > which seems like a requirement for a linked list. If you mean holding an
> > > extra list struct around an existing pointer (rather than shoving the
> > > prev/next pointers into the pointed-to item), then yes, we could do
> > > that. But it feels a bit dirty, since the point of the list is
> > > explicitly to provide an alternate ordering over an existing set of
> > > items.
> > 
> > This pattern to avoid that one malloc-per-node using list_entry
> > (container_of) is actually a common idiom in the Linux kernel
> > and Userspace RCU (URCU).  Fwiw, I find it less error-prone and
> > easier-to-follow than the "void *"-first-element thing we do
> > with hashmap.
> 
> My big problem with it is that it gets confusing when a particular
> struct is a member of several lists. We have had bugs in git where
> a struct ended up being used in two different lists, but accidentally
> using the same "next" pointer.

It might actually be easier with list_head, since you would rarely
(if ever) touch the "next"/"prev" fields directly in your code.
That encourages giving each list_head field a meaningful name.
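
To illustrate (a rough sketch only, not git's actual list.h): the
helpers below are a minimal stand-in for a kernel-style list, and
"struct request" with its "by_arrival"/"by_priority" fields is a
hypothetical example.  Callers only ever name the list_head field
they mean, never a bare "next" pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a kernel-style list; an assumption, not git's list.h */
struct list_head {
	struct list_head *next, *prev;
};

/* Recover the containing struct from a pointer to its embedded list_head */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void init_list_head(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	assert(node && head);
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical struct that is a member of two lists at once */
struct request {
	int id;
	struct list_head by_arrival;	/* links for the arrival-order list */
	struct list_head by_priority;	/* links for the priority list */
};

/* Copy up to max ids into out, in arrival order; returns the count */
static size_t ids_by_arrival(struct list_head *head, int *out, size_t max)
{
	size_t n = 0;
	struct list_head *pos;

	for (pos = head->next; pos != head && n < max; pos = pos->next)
		out[n++] = list_entry(pos, struct request, by_arrival)->id;
	return n;
}

/* Same walk, but over the priority list via the other named field */
static size_t ids_by_priority(struct list_head *head, int *out, size_t max)
{
	size_t n = 0;
	struct list_head *pos;

	for (pos = head->next; pos != head && n < max; pos = pos->next)
		out[n++] = list_entry(pos, struct request, by_priority)->id;
	return n;
}
```

Mixing the two lists up would mean writing the wrong field name in
list_entry(), which tends to stand out in review more than a shared
"next" pointer does.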

> So you need one "list_head" for each list that your struct may be a part
> of. Sometimes that's simple, but it's awkward when the code which wants
> the list is different than the code which "owns" the struct. Besides
> leaking concerns across modules, the struct may not want to pay the
> memory price for storing pointers for all of the possible lists it could
> be a member of.

Yes; the key is that this list is flexible enough to be used either way:

	/* there are millions of these structs in practice */
	struct common_struct {
		struct list_head hot_ent;
		...
	};

	/* and only a handful of these */
	struct rarely_used_wrapper {
		struct list_head cold_ent;
		struct common_struct *common;
		...
	};
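
A fuller sketch of both styles, under stated assumptions: the list
helpers are a minimal stand-in (not git's list.h), the "value" field
fills in for the elided members purely for illustration, and
cold_list_add(), sum_hot() and sum_cold() are hypothetical names.
Only the rarely-used wrapper pays for a malloc:

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for a kernel-style list; an assumption, not git's list.h */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Common case: the links are embedded, so no extra allocation per node */
struct common_struct {
	struct list_head hot_ent;
	int value;	/* hypothetical payload standing in for elided fields */
};

/* Rare case: a separate node that points at an existing item */
struct rarely_used_wrapper {
	struct list_head cold_ent;
	struct common_struct *common;
};

/* Put an existing item on the cold list; costs one malloc per wrapped item */
static struct rarely_used_wrapper *cold_list_add(struct list_head *cold,
						 struct common_struct *common)
{
	struct rarely_used_wrapper *w = malloc(sizeof(*w));

	if (!w)
		return NULL;
	w->common = common;
	list_add_tail(&w->cold_ent, cold);
	return w;
}

/* Walk the hot list through the embedded links */
static int sum_hot(struct list_head *hot)
{
	int sum = 0;
	struct list_head *pos;

	for (pos = hot->next; pos != hot; pos = pos->next)
		sum += list_entry(pos, struct common_struct, hot_ent)->value;
	return sum;
}

/* Walk the cold list through the wrappers, then follow ->common */
static int sum_cold(struct list_head *cold)
{
	int sum = 0;
	struct list_head *pos;

	for (pos = cold->next; pos != cold; pos = pos->next)
		sum += list_entry(pos, struct rarely_used_wrapper,
				  cold_ent)->common->value;
	return sum;
}
```

The common struct never learns about the cold list, so it does not
pay for pointers it would almost never use.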

> For instance, I think it would be a mistake to convert the current
> commit_list code to something like this.

Of course, often a doubly-linked list is not needed or the extra
pointer is too expensive.  Linux/URCU have hlist for this reason.
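
For reference, a minimal sketch of the hlist idea, written from my
memory of the Linux layout rather than copied from it: the head is
a single pointer, and each node keeps "pprev" (the address of
whatever points at it) so deletion stays O(1) without a full "prev"
link.  "struct bucket_item" and hlist_count() are hypothetical:

```c
#include <stddef.h>

/* Node: forward pointer plus the address of the pointer that refers to us */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

/* Head: one pointer only, which matters when there are many (empty) heads */
struct hlist_head {
	struct hlist_node *first;
};

#define hlist_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n without needing the head or a walk from the start */
static void hlist_del(struct hlist_node *n)
{
	struct hlist_node *next = n->next;

	*n->pprev = next;
	if (next)
		next->pprev = n->pprev;
}

/* Hypothetical item, e.g. one entry in a hash bucket */
struct bucket_item {
	int key;
	struct hlist_node node;
};

static size_t hlist_count(const struct hlist_head *h)
{
	size_t n = 0;
	const struct hlist_node *pos;

	for (pos = h->first; pos; pos = pos->next)
		n++;
	return n;
}
```

The trade-off is that there is no O(1) access to the tail, so this
fits hash buckets better than queues.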

I'm no expert on git internals, either; but there can be
readability improvements, too.  For example, I find http-walker.c
easier to follow after 94e99012fc7a
("http-walker: reduce O(n) ops with doubly-linked list").
