From: Jonathan Tan <jonathantanmy@google.com>
To: Jeff Hostetler <git@jeffhostetler.com>
Cc: git@vger.kernel.org, gitster@pobox.com, peff@peff.net,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH v3 4/6] list-objects: filter objects in traverse_commit_list
Date: Tue, 7 Nov 2017 15:20:34 -0800
Message-ID: <20171107152034.47686f6ece72ea3d43005b12@google.com>
In-Reply-To: <20171107193546.10017-5-git@jeffhostetler.com>

On Tue,  7 Nov 2017 19:35:44 +0000
Jeff Hostetler <git@jeffhostetler.com> wrote:

> +/*
> + * Reject the arg if it contains any characters that might
> + * require quoting or escaping when handing to a sub-command.
> + */
> +static int reject_injection_chars(const char *arg)
> +{
[snip]
> +}

Someone pointed me to quote.{c,h}, which is probably sufficient to
ensure shell safety if we do invoke subcommands through the shell. If
that is so, we probably don't need a blacklist.

Having said that, though, it might be safer to still introduce one, and
relax it later if necessary - it is much easier to relax a constraint
than to tighten one.
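
For illustration, a conservative check of this kind could be as small as
the sketch below. The function name and the list of rejected bytes are
my own assumptions, not the snipped body from the patch; the list is
deliberately broad so that it can be relaxed later.

```c
#include <string.h>

/*
 * Hypothetical sketch, not the snipped function from the patch:
 * return 1 (reject) if arg contains any byte from an assumed list of
 * characters that a POSIX shell may treat specially, 0 otherwise.
 */
static int reject_unsafe_chars_sketch(const char *arg)
{
	/* Assumed list; broad on purpose, since relaxing it later is easy. */
	static const char unsafe[] = " \t\n\r\"'`$\\;&|<>(){}[]*?!~#";

	/* strpbrk() returns NULL when none of the listed bytes occur. */
	return strpbrk(arg, unsafe) != NULL;
}
```

With this list, an argument such as "sparse:oid=master:path" passes,
while anything containing whitespace, quotes, or ";" is rejected.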

> +	} else if (skip_prefix(arg, "sparse:", &v0)) {
> +
> +		if (skip_prefix(v0, "oid=", &v1)) {
> +			struct object_context oc;
> +			struct object_id sparse_oid;
> +			filter_options->choice = LOFC_SPARSE_OID;
> +			if (!get_oid_with_context(v1, GET_OID_BLOB,
> +						  &sparse_oid, &oc))
> +				filter_options->sparse_oid_value =
> +					oiddup(&sparse_oid);
> +			return 0;
> +		}

In your recent e-mail [1], you said that you will change it to always pass
the original expression - is that still the plan?

[1] https://public-inbox.org/git/f698d5a8-bf31-cea1-a8da-88b755b0b7af@jeffhostetler.com/
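
If I understood that plan correctly, the parser would keep the text
after "oid=" as-is instead of resolving it on the spot. A minimal
sketch of that reading follows; skip_prefix_sketch() is a local
stand-in for Git's skip_prefix() helper, and the struct and field names
are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Local stand-in for Git's skip_prefix() helper (assumed behavior). */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Hypothetical container; the patch uses its own options struct. */
struct filter_spec_sketch {
	char *raw_expr;	/* unresolved text after "sparse:oid=", owned */
};

/*
 * Returns 0 and stores a copy of the unresolved expression on success;
 * returns -1 if arg is not a "sparse:oid=" filter or allocation fails.
 */
static int parse_sparse_oid_sketch(const char *arg,
				   struct filter_spec_sketch *spec)
{
	const char *v0, *v1;

	if (!skip_prefix_sketch(arg, "sparse:", &v0) ||
	    !skip_prefix_sketch(v0, "oid=", &v1))
		return -1;

	spec->raw_expr = malloc(strlen(v1) + 1);
	if (!spec->raw_expr)
		return -1;
	strcpy(spec->raw_expr, v1);
	return 0;
}
```

Whoever consumes the spec can then resolve raw_expr at the point where
the object is actually needed, and report an error there if it does not
name a blob.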

> +/* Remember to update object flag allocation in object.h */

You can probably delete this line.

> +/*
> + * FILTER_SHOWN_BUT_REVISIT -- we set this bit on tree objects
> + * that have been shown, but should be revisited if they appear
> + * in the traversal (until we mark it SEEN).  This is a way to
> + * let us silently de-dup calls to show() in the caller.

This was unclear to me on first reading. Maybe something like:

  FILTER_SHOWN_BUT_REVISIT -- we set this bit on tree objects that have
  been shown, but should not be skipped over if they reappear in the
  traversal. This ensures that the tree's descendants are re-processed
  if the tree reappears subsequently, and that the tree is not shown
  twice.
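
To check my understanding of the wording above, here is a toy model of
the two bits. All names are hypothetical, and "fully_resolved" stands
in for whatever decision the filter makes that nothing under the tree
needs another look; this is not the patch's code.

```c
#define SEEN_SKETCH			(1u << 0)
#define SHOWN_BUT_REVISIT_SKETCH	(1u << 1)

/* Toy tree object: flags plus counters to observe the behavior. */
struct tree_sketch {
	unsigned flags;
	int shown_count;	/* times the tree was shown */
	int walk_count;		/* times its entries were walked */
};

static void visit_tree_sketch(struct tree_sketch *t, int fully_resolved)
{
	if (t->flags & SEEN_SKETCH)
		return;		/* skipped over entirely */

	/* Show the tree only the first time we encounter it. */
	if (!(t->flags & SHOWN_BUT_REVISIT_SKETCH)) {
		t->shown_count++;
		t->flags |= SHOWN_BUT_REVISIT_SKETCH;
	}

	/* Stand-in for re-processing the tree's descendants. */
	t->walk_count++;

	/* Once nothing below needs another look, stop revisiting. */
	if (fully_resolved)
		t->flags |= SEEN_SKETCH;
}
```

Visiting the same tree several times shows it once, walks its entries
on every visit until it is marked as seen, and skips it afterwards.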

> + * This
> + * is subtly different from the "revision.h:SHOWN" and the
> + * "sha1_name.c:ONELINE_SEEN" bits.  And also different from
> + * the non-de-dup usage in pack-bitmap.c
> + */

Optional: I'm not sure if this comparison is useful. (Maybe it is useful
to others, though.)

> +/*
> + * A filter driven by a sparse-checkout specification to only
> + * include blobs that a sparse checkout would populate.
> + *
> + * The sparse-checkout spec can be loaded from a blob with the
> + * given OID or from a local pathname.  We allow an OID because
> + * the repo may be bare or we may be doing the filtering on the
> + * server.
> + */
> +struct frame {
> +	/*
> +	 * defval is the usual default include/exclude value that
> +	 * should be inherited as we recurse into directories based
> +	 * upon pattern matching of the directory itself or of a
> +	 * containing directory.
> +	 */
> +	int defval;

Can this be an "unsigned defval : 1" as well? In the function below, I
see that you assign to an "int val" first (which can take -1, 0, and 1)
and only then assign to this field, so a one-bit field should be fine.

Also, maybe a better name would be "exclude", with the documentation:

  1 if the directory is excluded, 0 otherwise. Excluded directories will
  still be recursed through, because an "include" rule for an object
  might override an "exclude" rule for one of its ancestors.
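
A rough sketch of what I have in mind is below. The struct and function
names are hypothetical, and the meanings I give to -1, 0, and 1 are
assumptions. The point is that the tri-state value is resolved before
it reaches the one-bit field, because storing -1 directly in an
unsigned one-bit field would silently become 1.

```c
/* Hypothetical stand-in for the frame in the patch. */
struct frame_sketch {
	/*
	 * 1 if the directory is excluded, 0 otherwise. Excluded
	 * directories are still recursed through.
	 */
	unsigned exclude : 1;
};

/*
 * Resolve a tri-state match result against the parent's value.
 * Assumed meanings: -1 = no pattern matched, 0 = include, 1 = exclude.
 */
static unsigned resolve_exclude_sketch(int val, unsigned parent_exclude)
{
	if (val < 0)
		return parent_exclude ? 1u : 0u;	/* inherit */
	return val ? 1u : 0u;
}
```

A child frame would then be set up with something like
"child.exclude = resolve_exclude_sketch(val, parent.exclude);".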
