public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Tian Yuchen <a3205153416@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH v1][RFC] symlinks: use unsigned int for flags
Date: Wed, 21 Jan 2026 10:33:11 +0100	[thread overview]
Message-ID: <aXCdVyySI_biFPzS@pks.im> (raw)
In-Reply-To: <20260120152219.398999-1-a3205153416@gmail.com>

On Tue, Jan 20, 2026 at 11:22:19PM +0800, Tian Yuchen wrote:
> The 'flags' and 'track_flags' fields in symlinks.c are used
> strictly as a collection of bits (using bitwise operators including
> &, |, ~).

That's one important data point. The other important data point is
whether or not we ever use any negative values here. But these flags are
defined like this:

    #define FL_DIR      (1 << 0)
    #define FL_NOENT    (1 << 1)
    #define FL_SYMLINK  (1 << 2)
    #define FL_LSTATERR (1 << 3)
    #define FL_ERR      (1 << 4)
    #define FL_FULLPATH (1 << 5)

So they indeed are only ever positive.

> Using a signed integer for bitmasks may lead to undefined
> behavior with shift operations and logic errors if the MSB is touched.

Mostly a theoretical issue, but we indeed prefer to use unsigned ints
for bitsets like this.

> diff --git a/symlinks.c b/symlinks.c
> index 9cc090d42c..ed63891149 100644
> --- a/symlinks.c
> +++ b/symlinks.c
> @@ -74,11 +74,11 @@ static inline void reset_lstat_cache(struct cache_def *cache)
>   */
>  static int lstat_cache_matchlen(struct cache_def *cache,
>  				const char *name, int len,
> -				int *ret_flags, int track_flags,
> +				unsigned int *ret_flags, unsigned int track_flags,

`ret_flags` is also a combination of `FL_*` flags, so it's never
expected to be negative, either.

>  				int prefix_len_stat_func)
>  {
>  	int match_len, last_slash, last_slash_dir, previous_slash;
> -	int save_flags, ret, saved_errno = 0;
> +	unsigned int save_flags, ret, saved_errno = 0;

Changing the type of `save_flags` makes sense, but you also change the
type of `ret` and `saved_errno` to become unsigned, which does not make
sense.

> @@ -192,10 +192,10 @@ static int lstat_cache_matchlen(struct cache_def *cache,
>  	return match_len;
>  }
>  
> -static int lstat_cache(struct cache_def *cache, const char *name, int len,
> -		       int track_flags, int prefix_len_stat_func)
> +static unsigned int lstat_cache(struct cache_def *cache, const char *name, int len,
> +		       unsigned int track_flags, int prefix_len_stat_func)
>  {
> -	int flags;
> +	unsigned int flags;
>  	(void)lstat_cache_matchlen(cache, name, len, &flags, track_flags,
>  			prefix_len_stat_func);
>  	return flags;

All callers already pass unsigned flags, so that's good. But there's two
callers that use the return value of this function:

  - `threaded_has_symlink_leading_path()` uses `lstat_cache() & FL_SYMLINK`.

  - `threaded_has_dirs_only_path()` uses `lstat_cache() & FL_DIR`.

Both of these functions really only care about whether or not the
statement evaluates to a truish value. Maybe it would make sense to have
a preparatory commit where convert the return values of those functions
to be booleans?

> @@ -234,7 +234,7 @@ int check_leading_path(const char *name, int len, int warn_on_lstat_err)
>  static int threaded_check_leading_path(struct cache_def *cache, const char *name,
>  				       int len, int warn_on_lstat_err)
>  {
> -	int flags;
> +	unsigned int flags;
>  	int match_len = lstat_cache_matchlen(cache, name, len, &flags,
>  			   FL_SYMLINK|FL_NOENT|FL_DIR, USE_ONLY_LSTAT);
>  	int saved_errno = errno;

Looks good.

> diff --git a/symlinks.h b/symlinks.h
> index 7ae3d5b856..25bf04f54f 100644
> --- a/symlinks.h
> +++ b/symlinks.h
> @@ -5,8 +5,8 @@
>  
>  struct cache_def {
>  	struct strbuf path;
> -	int flags;
> -	int track_flags;
> +	unsigned int flags;
> +	unsigned int track_flags;
>  	int prefix_len_stat_func;
>  };
>  #define CACHE_DEF_INIT { \

There is only one user of `struct cache_def` outside of "symlink.c",
which is "preload-index.c". That user doesn't care about the flag
definitions though, so this conversion looks good to me.

Thanks!

Patrick

      parent reply	other threads:[~2026-01-21  9:33 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-20 15:22 [PATCH v1][RFC] symlinks: use unsigned int for flags Tian Yuchen
2026-01-20 15:36 ` Tian Yuchen
2026-01-21  9:39   ` Patrick Steinhardt
2026-01-21  9:33 ` Patrick Steinhardt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aXCdVyySI_biFPzS@pks.im \
    --to=ps@pks.im \
    --cc=a3205153416@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox