git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: gitster@pobox.com, git@vger.kernel.org, ronniesahlberg@gmail.com,
	mhagger@alum.mit.edu
Subject: Re: [PATCH 3/4] refs.c: add a transaction function to append a reflog entry
Date: Wed, 3 Dec 2014 14:28:52 -0800	[thread overview]
Message-ID: <20141203222852.GB26810@google.com> (raw)
In-Reply-To: <20141203031519.GF6527@google.com>

On Tue, Dec 02, 2014 at 07:15:19PM -0800, Jonathan Nieder wrote:

I'll think about rewriting the commit message, so it is easier to digest.
I'll follow your suggestions except for the following annotations

> 
> > +int transaction_update_reflog(struct transaction *transaction,
> > +			      const char *refname,
> > +			      const unsigned char *new_sha1,
> > +			      const unsigned char *old_sha1,
> > +			      const char *email,
> > +			      unsigned long timestamp, int tz,
> > +			      const char *msg, int flags,
> > +			      struct strbuf *err)
> 
> This is an intimidating list of arguments.  Would it make sense to
> pack them into a struct, or to make the list less intimidating
> some other way (e.g. combining email + timestamp + tz into an
> ident string)?

It's true, that is's a huge list. One of the reasons Ronnie gave, 
was having symmetry in reflog.c:expire_reflog_ent, which has a very
similar signature, the  email + timestamp + tz are passed in as separate
arguments.

expire_reflog_ent is being called from within reflog.c:expire_reflog

	if (for_each_reflog_ent(ref, expire_reflog_ent, &cb)) {
		status |= error("%s", err.buf);
		goto cleanup;
	}

now if, we dive into the for_each_reflog_ent function, it's essentially 
just calling show_one_reflog_ent on each reflog entry.
In there we are trying to separate a reflogs entry into the fields
(new sha1, old sha1, committer, email, time, timezone, message) by the
lovely expression

	/* old SP new SP name <email> SP time TAB msg LF */
	if (sb->len < 83 || sb->buf[sb->len - 1] != '\n' ||
	    get_sha1_hex(sb->buf, osha1) || sb->buf[40] != ' ' ||
	    get_sha1_hex(sb->buf + 41, nsha1) || sb->buf[81] != ' ' ||
	    !(email_end = strchr(sb->buf + 82, '>')) ||
	    email_end[1] != ' ' ||
	    !(timestamp = strtoul(email_end + 2, &message, 10)) ||
	    !message || message[0] != ' ' ||
	    (message[1] != '+' && message[1] != '-') ||
	    !isdigit(message[2]) || !isdigit(message[3]) ||
	    !isdigit(message[4]) || !isdigit(message[5]))
		return 0; /* corrupt? */

so I wonder if it actually makes sense to first separate it out into fields,
and then joining to an ident string again. Is there some kind of struct already,
which groups together identifying information, such as name, email, timestamp + zone,
which could be reused here for passing on the data?

Anyway, this is a first step on transactioning the reflog code, so it is minimally
invasive, not changing a lot of business logic. If you look into the next patch,
(reflog.c: use a reflog transaction when writing during expire)
you'll see that it's essentially just a replacement of printfs to calling
the function.

-	if (cb->newlog) {
-		fprintf(cb->newlog, "%s %s %s %lu %c%04d\t%s",
-			sha1_to_hex(osha1), sha1_to_hex(nsha1),
-			email, timestamp, sign, zone,
-			message);
+	if (cb->t) {
+		if (transaction_update_reflog(cb->t, cb->refname, nsha1, osha1,
+					      email, timestamp, tz, message, 0,
+					      cb->err))

and the errorhandling was removed.
If we want to change the function signature of transaction_update_reflog, 
I'd do it in a follow up?



> 

> 
> [...]
> > +	if (flags & REFLOG_TRUNCATE) {
> > +		if (lseek(fd, 0, SEEK_SET) < 0 ||
> > +			ftruncate(fd, 0)) {
> > +			strbuf_addf(err, "Could not truncate reflog: %s. %s",
> > +				    refname, strerror(errno));
> 
> Odd error message format (using '.' to separate the refname from
> strerror(errno) is unusual).  Errors normally are supposed to start
> with a lowercase letter, like
> 
> 	cannot truncate reflog '%s': %s
> 
> > +			goto failure;
> > +		}
> > +	}
> 
> How does this cause the reflog to be populated in the !(flags &
> REFLOG_TRUNCATE) case?
> 
> Maybe I am misunderstanding the API.  If I use
> transaction_update_reflog() and have not updated the reflog
> previously, isn't this supposed to just append a new entry to the
> reflog?
> 

He, that's indeed a good catch. I was investigating the API again myself and
the REFLOG_TRUNCATE flag is only set on the very first call to transaction_update_reflog
and the subsequent calls are rebuilding the reflog by adding the unpruned lines 
again line by line.

I am wondering if we should actually put that into the same function now.
As the new reflog is build on the calls by transaction_update_reflog, it's actually
depending on the order strictly as opposed to the refs.
Maybe we want to have two separate functions
	transaction_update_begin_reflog(...)
	transaction_update_reflog_add_line(...)

instead? Then a reflog update could look like this:
	t=transaction_begin(...)
	
	transaction_update_begin_reflog(t, refname)
	for_each_reflog_entry_for_refname(entry, refname):
		if (want_to_keep(entry):
			transaction_update_reflog_add_line(entry, refname)

	transaction_commit(...)

The logic of want_to_keep is currently in reflog.c:expire_reflog_ent(...),
not sure if there are other places.


> [...]
> > +failure:
> > +	strbuf_release(&buf);
> > +	/*
> > +	 * As we are using the lock file API, any stale files left behind will
> > +	 * be taken care of, no need to do anything here.
> > +	 */
> 
> That's only true if the caller is going to exit instead of proceeding
> to other work.
> 
> With current callers, I assume that's true.  So should this comment
> say something like "No need to roll back stale lock files because
> the caller will exit soon"?  Or should this roll back the lockfile
> anyway, in case the caller wants to try again?

I am not sure if we ever want transactions be tried again without the user 
explicitely rerunning the command?

As we're operating on a lockfile, which was just created by us, and now in 
the failure command, we're likely to die? Maybe I remove the comment altogether, 
as the wondering reader will look into the API for lock files, when looking 
for problems here.

> > +	/* Commit all reflog updates*/
> > +	for_each_string_list_item(item, &transaction->reflog_updates) {
> > +		struct lock_file *lock = item->util;
> > +		commit_lock_file_to(lock, git_path("logs/%s", item->string));
> 
> Neat.
> 
> This seems like a good starting point.  Other code paths still write
> to reflogs directly --- is there a plan to get them to use this code,
> too, in a followup patch (e.g., by making write_ref_sha1() or
> log_ref_write() use its own small transaction for the reflog update)?
> 

That's the plan, but not part of this patch series yet.

> Thanks and hope that helps,
> Jonathan

Thanks for the comments,
Stefan

  reply	other threads:[~2014-12-03 22:28 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-02  7:46 [PATCHv2 0/4] refs.c: use transactions for the reflog Stefan Beller
2014-12-02  7:46 ` [PATCH 1/4] refs.c: rename the transaction functions Stefan Beller
2014-12-03  2:24   ` Jonathan Nieder
2014-12-02  7:46 ` [PATCH 2/4] refs.c: rename transaction.updates to transaction.ref_updates Stefan Beller
2014-12-03  2:43   ` Jonathan Nieder
2014-12-02  7:46 ` [PATCH 3/4] refs.c: add a transaction function to append a reflog entry Stefan Beller
2014-12-03  3:15   ` Jonathan Nieder
2014-12-03 22:28     ` Stefan Beller [this message]
2014-12-03 22:52       ` Jonathan Nieder
2014-12-02  7:46 ` [PATCH 4/4] reflog.c: use a reflog transaction when writing during expire Stefan Beller
2014-12-03  3:18   ` Jonathan Nieder
2014-12-03 18:05 ` [PATCHv2 0/4] refs.c: use transactions for the reflog Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2014-12-02  2:02 [PATCH 1/4] refs.c: rename the transaction functions Stefan Beller
2014-12-02  2:02 ` [PATCH 3/4] refs.c: add a transaction function to append a reflog entry Stefan Beller
2014-11-20 18:17 [PATCH v3 00/14] ref-transactions-reflog Jonathan Nieder
2014-11-27  5:34 ` [PATCH 0/4] Using transactions for the reflog Stefan Beller
2014-11-27  5:34   ` [PATCH 3/4] refs.c: add a transaction function to append a reflog entry Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141203222852.GB26810@google.com \
    --to=sbeller@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=mhagger@alum.mit.edu \
    --cc=ronniesahlberg@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).