git.vger.kernel.org archive mirror
* checkout-cache -f: a better way?
@ 2005-05-20 21:05 Jeff Garzik
  2005-05-20 22:38 ` Junio C Hamano
  2005-05-20 23:33 ` Linus Torvalds
  0 siblings, 2 replies; 11+ messages in thread
From: Jeff Garzik @ 2005-05-20 21:05 UTC (permalink / raw)
  To: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 563 bytes --]


Being a weirdo, I don't use cogito for kernel development, just git 
itself.  I store branches in .git/refs/heads/ per the de facto standard, 
and use the attached script to switch the working directory from one 
branch to another.

Problem is, 'git-checkout-cache -q -f -a' really pounds the disk, and 
takes quite a while.

Is there any way to avoid -f, while ensuring that the working directory 
truly represents the new branch?

BitKeeper has a secret checkout arg '-S', which will leave files 
untouched if the mtime/size information is unchanged.

	Jeff
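
For illustration only: the thread does not show how BitKeeper implements
'-S', but the general idea of an mtime/size check can be sketched in shell.
The helper names file_sig and is_unchanged are hypothetical, and the sketch
assumes that either GNU stat (-c) or BSD stat (-f) is available.

```shell
#!/bin/sh
# Hypothetical helpers sketching an mtime/size "is this file unchanged?" test.

# Print "<size> <mtime>" for a file: try GNU stat first, then BSD stat.
file_sig() {
	stat -c '%s %Y' "$1" 2>/dev/null || stat -f '%z %m' "$1"
}

# Succeed if the file still matches a previously recorded signature.
is_unchanged() {
	[ "$(file_sig "$1")" = "$2" ]
}
```

A checkout tool could record file_sig for each file it writes and later skip
any file for which is_unchanged still succeeds.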




[-- Attachment #2: git-switch-tree --]
[-- Type: text/plain, Size: 381 bytes --]

#!/bin/sh

# Optional argument: the branch to make current.
if [ "x$1" != "x" ]
then
	# '=' is the portable string test under /bin/sh ('==' is a bashism).
	if [ "$1" = "master" ]
	then
		( cd .git && rm -f HEAD && ln -s refs/heads/master HEAD )
	else
		if [ ! -f ".git/refs/heads/$1" ]
		then
			echo "Branch $1 not found."
			exit 1
		fi

		( cd .git && rm -f HEAD && ln -s "refs/heads/$1" HEAD )
	fi
fi

git-read-tree $(cat .git/HEAD) && \
	git-checkout-cache -q -f -a && \
	git-update-cache --refresh


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: checkout-cache -f: a better way?
  2005-05-20 21:05 checkout-cache -f: a better way? Jeff Garzik
@ 2005-05-20 22:38 ` Junio C Hamano
  2005-05-20 23:33   ` Jeff Garzik
  2005-05-20 23:33 ` Linus Torvalds
  1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2005-05-20 22:38 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Git Mailing List

>>>>> "JG" == Jeff Garzik <jgarzik@pobox.com> writes:

JG> Being a weirdo, I don't use cogito for kernel development, just git
JG> itself.

My customer, in other words ;-).

JG> git-read-tree $(cat .git/HEAD) && \
JG> 	git-checkout-cache -q -f -a && \
JG> 	git-update-cache --refresh

I have to check checkout-cache.c, but assuming that you start
from an already populated work tree with a valid cache when you
do the git-read-tree at the third line from the last, using
"git-read-tree -m HEAD" (you do not need to say $(cat .git/HEAD)
in the modern git anymore) would be a good place to start.

Also the modern git-checkout-cache has a '-u' option and with it
you should not need 'git-update-cache --refresh' after that.

Let me know if you have any problems.  Single tree '-m' is what
Linus did and '-u' option to git-checkout-cache is mine.





* Re: checkout-cache -f: a better way?
  2005-05-20 22:38 ` Junio C Hamano
@ 2005-05-20 23:33   ` Jeff Garzik
  2005-05-20 23:39     ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Garzik @ 2005-05-20 23:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

Junio C Hamano wrote:
>>>>>>"JG" == Jeff Garzik <jgarzik@pobox.com> writes:
> 
> 
> JG> Being a weirdo, I don't use cogito for kernel development, just git
> JG> itself.
> 
> My customer, in other words ;-).
> 
> JG> git-read-tree $(cat .git/HEAD) && \
> JG> 	git-checkout-cache -q -f -a && \
> JG> 	git-update-cache --refresh
> 
> I have to check checkout-cache.c, but assuming that you start
> from an already populated work tree with a valid cache when you
> do the git-read-tree at the third line from the last, using
> "git-read-tree -m HEAD" (you do not need to say $(cat .git/HEAD)
> in the modern git anymore) would be a good place to start.
> 
> Also the modern git-checkout-cache has a '-u' option and with it
> you should not need 'git-update-cache --refresh' after that.
> 
> Let me know if you have any problems.  Single tree '-m' is what
> Linus did and '-u' option to git-checkout-cache is mine.

Pardon my ignorance (I'm slow :)), but how do those changes address the 
fact that git-checkout-cache appears to check out the entire kernel tree 
(over 100MB of writes) when using '-f'?

git-checkout-cache -f writes out every file, even if it exists, correct?

	Jeff




* Re: checkout-cache -f: a better way?
  2005-05-20 21:05 checkout-cache -f: a better way? Jeff Garzik
  2005-05-20 22:38 ` Junio C Hamano
@ 2005-05-20 23:33 ` Linus Torvalds
  2005-05-20 23:51   ` Linus Torvalds
  1 sibling, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2005-05-20 23:33 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Git Mailing List



On Fri, 20 May 2005, Jeff Garzik wrote:
> 
> Problem is, 'git-checkout-cache -q -f -a' really pounds the disk, and 
> takes quite a while.

No. "git" is perfect, and "git-checkout-cache -f" already does exactly 
what you want.

> Is there any way to avoid -f, while ensuring that the working directory 
> truly represents the new branch?

You don't need to avoid -f, it already has the logic to avoid writing 
files that are already up-to-date.

HOWEVER, your script is broken:

	git-read-tree $(cat .git/HEAD) && \
	        git-checkout-cache -q -f -a && \
	        git-update-cache --refresh

you need to use the "-m" switch to git-read-tree to tell it to merge the 
index information from your previous tree with the new one.

Also, don't do the "$(cat .git/HEAD)" thing any more, since modern git 
does this so much more nicely, and allows you to use your branch names 
directly.

Finally, use the new "-u" flag to git-checkout-cache, which will update 
the cache as it goes along. 

In other words, those lines in your script should look like this:

	git-read-tree -m HEAD && git-checkout-cache -q -f -u -a

and you'll be a lot happier.

			Linus
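
Applying all three suggestions, the complete git-switch-tree script might
look like the sketch below.  It uses only the commands and flags quoted in
this thread; the function names are made up for illustration, and unlike the
original it also checks that the ref exists when the branch is "master".

```shell
#!/bin/sh
# Sketch of git-switch-tree with the suggestions from this thread applied.
# Function names are hypothetical; the git commands are the 2005-era ones.

# Point .git/HEAD at refs/heads/<branch>; fail if the branch is missing.
switch_head() {
	branch=$1
	if [ ! -f ".git/refs/heads/$branch" ]
	then
		echo "Branch $branch not found." >&2
		return 1
	fi
	( cd .git && rm -f HEAD && ln -s "refs/heads/$branch" HEAD )
}

# Merge the old index with the new tree (-m), then check out files and
# update the cache as it goes (-u), so no separate --refresh is needed.
refresh_tree() {
	git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
}

# Entry point: switch_tree [branch]
switch_tree() {
	if [ -n "$1" ]
	then
		switch_head "$1" || return 1
	fi
	refresh_tree
}
```

Run switch_tree from the top of the work tree, for example "switch_tree adma";
with no argument it only re-syncs the work tree with the current HEAD.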


* Re: checkout-cache -f: a better way?
  2005-05-20 23:33   ` Jeff Garzik
@ 2005-05-20 23:39     ` Junio C Hamano
  2005-05-20 23:58       ` Jeff Garzik
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2005-05-20 23:39 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Git Mailing List

>>>>> "JG" == Jeff Garzik <jgarzik@pobox.com> writes:

JG> git-checkout-cache -f writes out every file, even if it exists, correct?

No, that's not correct.  To translate my prose, you would want
this:

    git-read-tree -m HEAD && git-checkout-cache -q -f -u -a

(notice that I do not have git-update-cache --refresh after
that).






* Re: checkout-cache -f: a better way?
  2005-05-20 23:33 ` Linus Torvalds
@ 2005-05-20 23:51   ` Linus Torvalds
  2005-05-20 23:55     ` Jeff Garzik
  2005-05-20 23:59     ` Junio C Hamano
  0 siblings, 2 replies; 11+ messages in thread
From: Linus Torvalds @ 2005-05-20 23:51 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Git Mailing List



On Fri, 20 May 2005, Linus Torvalds wrote:
> 
> In other words, those lines in your script should look like this:
> 
> 	git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
> 
> and you'll be a lot happier.

Btw, I do realize that I'm a total wiener, and that my inability to use 
"getopt_long()" is shameful and stupid. 

What can I say? I'm easily confused, and besides, I really seldom program 
in user mode.

So if somebody were to getopt'ify git, _without_ adding crapola like
autoconf (which probably implies that git would just require GNU getopt),
and others agree that it's ok to just say that we expect getopt_long() to
exist, then I'd not have any objections to making the above just be

	git-read-tree -m HEAD && git-checkout-cache -fqua

(to which the beavis-and-butthead in me says "hehhehhehh.. He said fqua.  
Hehhehh. fire fire fire.")

		Linus
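
To illustrate the bundling being asked for, here is a sketch that uses the
POSIX shell builtin getopts rather than C getopt_long(); both accept
clustered short options, so '-fqua' parses the same as '-f -q -u -a'.
parse_flags and its variable names are hypothetical, and this is not how
git-checkout-cache parses its arguments.

```shell
#!/bin/sh
# Hypothetical helper: parse the four short flags with the getopts builtin
# and print which ones were seen.
parse_flags() {
	OPTIND=1	# reset so the function can be called more than once
	force=0 quiet=0 update=0 all=0
	while getopts fqua opt
	do
		case $opt in
		f) force=1 ;;
		q) quiet=1 ;;
		u) update=1 ;;
		a) all=1 ;;
		*) return 1 ;;	# unknown option
		esac
	done
	echo "force=$force quiet=$quiet update=$update all=$all"
}
```

Calling "parse_flags -fqua" and "parse_flags -q -f -u -a" prints the same
line, which is the behaviour a getopt-style parser would give the git tools.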


* Re: checkout-cache -f: a better way?
  2005-05-20 23:51   ` Linus Torvalds
@ 2005-05-20 23:55     ` Jeff Garzik
  2005-05-20 23:59     ` Junio C Hamano
  1 sibling, 0 replies; 11+ messages in thread
From: Jeff Garzik @ 2005-05-20 23:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 720 bytes --]

Linus Torvalds wrote:
> 
> On Fri, 20 May 2005, Linus Torvalds wrote:
> 
>>In other words, those lines in your script should look like this:
>>
>>	git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
>>
>>and you'll be a lot happier.
> 
> 
> Btw, I do realize that I'm a total wiener, and that my inability to use 
> "getopt_long()" is shameful and stupid. 

info libc argp :)  argp is a lot more flexible, but with the same basic 
structure as getopt_long().

If you pick a random git program, I would be willing to convert it as an 
example.  I attached my implementation of ipcrm[1] as an example.

	Jeff


[1] from 'posixutils', my project to implement all the POSIX command 
line utilities.  Yes, I'm crazy too.

[-- Attachment #2: ipcrm.c --]
[-- Type: text/x-csrc, Size: 5023 bytes --]

/*
 * Copyright 2004-2005 Jeff Garzik <jgarzik@pobox.com>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; see the file COPYING.  If not, write to
 * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
 *
 */


#ifndef HAVE_CONFIG_H
#error missing autoconf-generated config.h.
#endif
#include "posixutils-config.h"

#include <sys/types.h>
#include <errno.h>	/* errno */
#include <stdio.h>	/* fprintf, stderr */
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <argp.h>
#include <libpu.h>


static const char doc[] =
N_("ipcrm - remove a message queue, semaphore set or shared memory id");

static struct argp_option options[] = {
	{ NULL, 'q', "msgid", 0,
	  N_("Remove message queue identifier msgid from system") },
	{ NULL, 'm', "shmid", 0,
	  N_("Remove shared memory identifier shmid from system") },
	{ NULL, 's', "semid", 0,
	  N_("Remove semaphore identifier semid from system") },
	{ NULL, 'Q', "msgkey", 0,
	  N_("Remove message queue identifier, created with key msgkey, from system") },
	{ NULL, 'M', "shmkey", 0,
	  N_("Remove shared memory identifier, created with key shmkey, from system") },
	{ NULL, 'S', "semkey", 0,
	  N_("Remove semaphore identifier, created with key semkey, from system") },
	{ }
};

static error_t parse_opt (int key, char *arg, struct argp_state *state);
static const struct argp argp = { options, parse_opt, NULL, doc };

enum parse_options_bits {
	OPT_MSG			= (1 << 0),
	OPT_SHM			= (1 << 1),
	OPT_SEM			= (1 << 2),
	OPT_KEY			= (1 << 3),
};

struct arglist {
	struct arglist		*next;
	int			mask;
	unsigned long		arg;
};

#ifdef _SEM_SEMUN_UNDEFINED
   union semun
   {
     int val;
     struct semid_ds *buf;
     unsigned short int *array;
     struct seminfo *__buf;
   };
#endif

static int exit_status = EXIT_SUCCESS;
static struct arglist *arglist;


static const char *arg_name(int mask)
{
	if (mask & OPT_MSG) return "msg";
	if (mask & OPT_SHM) return "shm";
	if (mask & OPT_SEM) return "sem";
	return NULL;
}

static void push_opt(int mask, unsigned long arg)
{
	struct arglist *tmp, *node = xcalloc(1, sizeof(struct arglist));

	node->mask = mask;
	node->arg = arg;

	tmp = arglist;
	if (!tmp) {
		arglist = node;
	} else {
		while (tmp->next)
			tmp = tmp->next;
		tmp->next = node;
	}
}

static void push_arg_opt(int mask, const char *arg)
{
	int base = (mask & OPT_KEY) ? 0 : 10;
	char *end = NULL;
	unsigned long l;

	l = strtoul(arg, &end, base);

	if ((*end != 0) ||	/* entire string is -not- valid */
	    ((mask & OPT_KEY) && (l == IPC_PRIVATE))) {
		fprintf(stderr, "%s%s '%s' invalid\n",
			arg_name(mask),
			mask & OPT_KEY ? "key" : "id",
			arg);
		exit_status = EXIT_FAILURE;
		return;
	}

	push_opt(mask, l);
}

static error_t parse_opt (int key, char *arg, struct argp_state *state)
{
	switch (key) {
	case 'q': push_arg_opt(OPT_MSG, arg); break;
	case 'm': push_arg_opt(OPT_SHM, arg); break;
	case 's': push_arg_opt(OPT_SEM, arg); break;
	case 'Q': push_arg_opt(OPT_MSG | OPT_KEY, arg); break;
	case 'M': push_arg_opt(OPT_SHM | OPT_KEY, arg); break;
	case 'S': push_arg_opt(OPT_SEM | OPT_KEY, arg); break;

	default:
		return ARGP_ERR_UNKNOWN;
	}

	return 0;
}

static void pinterr(const char *msg, long l)
{
	fprintf(stderr, msg, l, strerror(errno));
	exit_status = 1;
}

static void remove_one(int mask, unsigned long arg)
{
	int rc;
	int id = (int) arg;
	const char *errmsg = NULL;

	if (mask & OPT_KEY) {
		if (mask & OPT_MSG)
			id = msgget(arg, 0);
		else if (mask & OPT_SHM)
			id = shmget(arg, 0, 0);
		else if (mask & OPT_SEM)
			id = semget(arg, 0, 0);
		else
			abort();	/* should never happen */
	}

	if (id < 0) {
		pinterr("key 0x%lx lookup failed: %s\n", arg);
		return;
	}

	if (mask & OPT_MSG) {
		rc = msgctl(id, IPC_RMID, NULL);
		errmsg = "msgctl(0x%x): %s\n";
	}
	else if (mask & OPT_SHM) {
		rc = shmctl(id, IPC_RMID, NULL);
		errmsg = "shmctl(0x%x): %s\n";
	}
	else if (mask & OPT_SEM) {
		union semun dummy;
		dummy.val = 0;

		rc = semctl(id, 0, IPC_RMID, dummy);
		errmsg = "semctl(0x%x): %s\n";
	}

	else
		abort();	/* should never happen */

	if (rc < 0) {
		fprintf(stderr, errmsg, id, strerror(errno));
		exit_status = 1;
	}
}

static void remove_stuff(void)
{
	struct arglist *tmp = arglist;

	while (tmp) {
		remove_one(tmp->mask, tmp->arg);
		tmp = tmp->next;
	}
}

int main (int argc, char *argv[])
{
	error_t rc;

	pu_init();

	rc = argp_parse(&argp, argc, argv, 0, NULL, NULL);
	if (rc) {
		fprintf(stderr, "argp_parse failed: %s\n", strerror(rc));
		return 1;
	}

	remove_stuff();

	return exit_status;
}



* Re: checkout-cache -f: a better way?
  2005-05-20 23:39     ` Junio C Hamano
@ 2005-05-20 23:58       ` Jeff Garzik
  2005-05-21  1:44         ` Linus Torvalds
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Garzik @ 2005-05-20 23:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

Junio C Hamano wrote:
>>>>>>"JG" == Jeff Garzik <jgarzik@pobox.com> writes:
> 
> 
> JG> git-checkout-cache -f writes out every file, even if it exists, correct?
> 
> No, that's not correct.  To translate my prose, you would want
> this:

Thanks, I stand corrected :)


>     git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
> 
> (notice that I do not have git-update-cache --refresh after
> that).

Yep, thanks.  Script does seem faster now.  Numbers for hot cache (first 
is pre-modification, second is with your mod):

[jgarzik@pretzel libata-dev]$ time git-switch-tree adma-mwi

real    0m7.069s
user    0m4.183s
sys     0m2.817s
[jgarzik@pretzel libata-dev]$ time git-switch-tree adma

real    0m0.389s
user    0m0.294s
sys     0m0.094s



* Re: checkout-cache -f: a better way?
  2005-05-20 23:51   ` Linus Torvalds
  2005-05-20 23:55     ` Jeff Garzik
@ 2005-05-20 23:59     ` Junio C Hamano
  1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2005-05-20 23:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff Garzik, Git Mailing List

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> (to which the beavis-and-butthead in me says "hehhehhehh.. He said fqua.  
LT> Hehhehh. fire fire fire.")

Earlier this week I sent out a "Request for Help" listing
some janitorial work, of which this was one of the items.  I
believe Jeff suggested using argp over GNU getopt(), but other
than that I do not think we had any volunteers (hint hint).  I
haven't looked into any of the RFH items myself yet.



* Re: checkout-cache -f: a better way?
  2005-05-20 23:58       ` Jeff Garzik
@ 2005-05-21  1:44         ` Linus Torvalds
  2005-05-21  1:45           ` Jeff Garzik
  0 siblings, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2005-05-21  1:44 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Junio C Hamano, Git Mailing List



On Fri, 20 May 2005, Jeff Garzik wrote:
>
> Yep, thanks.  Script does seem faster now. 

Yeah. They "seem faster".

> real    0m7.069s

to

> real    0m0.389s

That's what, 20 times faster? 

More, actually, I suspect, since the "-m" version is not only faster, but
it doesn't do much IO, so you'll not have tons of dirty pages/inodes etc
afterwards.

		Linus
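
As a quick check of the ratio, using the two wall-clock figures Jeff posted
(a single run each, switching to different branches, so only a rough
comparison); speedup is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: ratio of two timings in seconds, to one decimal place.
speedup() {
	awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

speedup 7.069 0.389	# prints 18.2
```

So "20 times" is a fair round figure for elapsed time alone, before counting
the writeback that the original script left behind.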


* Re: checkout-cache -f: a better way?
  2005-05-21  1:44         ` Linus Torvalds
@ 2005-05-21  1:45           ` Jeff Garzik
  0 siblings, 0 replies; 11+ messages in thread
From: Jeff Garzik @ 2005-05-21  1:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

Linus Torvalds wrote:
> That's what, 20 times faster? 

:)


> More, actually, I suspect, since the "-m" version is not only faster, but
> it doesn't do much IO, so you'll not have tons of dirty pages/inodes etc
> afterwards.

Yep.  A -lot- of writeback would occur, a few seconds after my original 
script completed.

	Jeff



