From: "René Scharfe" <l.s.r@web.de>
To: "Junio C Hamano" <gitster@pobox.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] builtin/mv.c: use correct type to compute size of an array element
Date: Thu, 7 Jul 2022 21:11:28 +0200
Message-ID: <cb866b8c-dcc6-557f-da23-1c1972619a8a@web.de>
In-Reply-To: <xmqq1quw23r8.fsf@gitster.g>
On 07.07.22 at 20:10, Junio C Hamano wrote:
> While our eyes are on array.cocci, I have a few observations on it.
>
> This is not meant specifically to you, Ævar, but comments by those
> more familiar with Coccinelle (and I am sure the bar to pass is
> fairly low, as I am not all that familiar) are very much
> appreciated.
>
> @@
> expression dst, src, n, E;
> @@
> memcpy(dst, src, n * sizeof(
> - E[...]
> + *(E)
> ))
>
> This seems to force us to prefer sizeof(*(E)) over sizeof(E[i]) when
> it is used to compute the byte size of a memcpy() operation. There is
> no reason to prefer one over the other, but I presume it is there
> only for convenience for the other rules in this file (I recall
> vaguely reading somewhere that these rules do not "execute" from top
> to bottom, so I wonder how effective it is?).
It halves the number of syntax variants to deal with.
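The folding is safe because the two spellings always name the same size in C: E[i] is defined in terms of *((E) + (i)), and sizeof only looks at the type here. A minimal sketch, where `struct entry` and `entry_size` are hypothetical names used for illustration and do not come from array.cocci:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element type, used only for illustration. */
struct entry {
	int id;
	const char *name;
};

/* Returns the element size and checks that both spellings agree. */
static size_t entry_size(const struct entry *e)
{
	size_t by_index = sizeof(e[0]);	/* the form the rule removes */
	size_t by_deref = sizeof(*(e));	/* the form the rule keeps */

	assert(by_index == by_deref);
	return by_deref;
}
```

So later rules only need to match the dereference form.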
>
> @@
> type T;
> T *ptr;
> T[] arr;
> expression E, n;
> @@
> (
> memcpy(ptr, E,
> - n * sizeof(*(ptr))
> + n * sizeof(T)
> )
> |
> memcpy(arr, E,
> - n * sizeof(*(arr))
> + n * sizeof(T)
> )
> |
> memcpy(E, ptr,
> - n * sizeof(*(ptr))
> + n * sizeof(T)
> )
> |
> memcpy(E, arr,
> - n * sizeof(*(arr))
> + n * sizeof(T)
> )
> )
>
> Likewise, but this one is a lot worse.
>
> Taken alone, sizeof(*(ptr)) is far preferable to sizeof(T),
> because the code will be more maintainable.
>
> Side Note. I know builtin/mv.c had this type mismatch between
> the variable and sizeof() from the beginning when 11be42a4 (Make
> git-mv a builtin, 2006-07-26) introduced both the variable
> declaration "const char **source" and memmove() on it, but
> code that starts out with "char **src" with memmove() that moves
> part of src[] and uses sizeof(char *) to compute the byte size
> of the move would become broken the same way when a later
> developer tightens the declaration to use "const char **src"
> without realizing that they have to update the type used in
> sizeof().
>
> So even though I am guessing that this is to allow the later rules
> to worry only about sizeof(T), I am a bit unhappy to see the rule.
> If existing code matched this rule and got rewritten to use
> sizeof(T), not sizeof(*(ptr)), but did not match the other rules to
> be rewritten to use COPY_ARRAY(), the overall effect would be that
> the automation made the code worse.
True.
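To make the maintainability point concrete, here is a small sketch of how a spelled-out type can go stale. The names `item_t`, `copy_items` and `copy_items_stale` are made up for this example, and it assumes the element type was widened from int at some point:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical element type; imagine it was widened from int. */
typedef long long item_t;

/* The size follows the declared type of dst automatically. */
static void copy_items(item_t *dst, const item_t *src, size_t n)
{
	assert(dst && src);
	memcpy(dst, src, n * sizeof(*dst));
}

/*
 * Stale: still names the old element type, so it copies too little
 * wherever int is narrower than long long.
 */
static void copy_items_stale(item_t *dst, const item_t *src, size_t n)
{
	assert(dst && src);
	memcpy(dst, src, n * sizeof(int));
}
```

Both functions compile without complaint, which is exactly the problem: only the first one keeps working when the declaration changes.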
>
> @@
> type T;
> T *dst_ptr;
> T *src_ptr;
> T[] dst_arr;
> T[] src_arr;
> expression n;
> @@
> (
> - memcpy(dst_ptr, src_ptr, (n) * sizeof(T))
> + COPY_ARRAY(dst_ptr, src_ptr, n)
> |
> - memcpy(dst_ptr, src_arr, (n) * sizeof(T))
> + COPY_ARRAY(dst_ptr, src_arr, n)
> |
> - memcpy(dst_arr, src_ptr, (n) * sizeof(T))
> + COPY_ARRAY(dst_arr, src_ptr, n)
> |
> - memcpy(dst_arr, src_arr, (n) * sizeof(T))
> + COPY_ARRAY(dst_arr, src_arr, n)
> )
>
> I take it that thanks to the earlier "meh -- between sizeof(*p) and
> sizeof(p[0]) there is no reason to prefer one over the other" and
> "oh, no, we should prefer sizeof(*p) not sizeof(typeof(*p)) but this
> one is the other way around" rules, this one only has to deal with
> sizeof(T).
>
> Am I reading it correctly?
Yes. Without the ugly normalization step in the middle, we could either
use twelve cases instead of four here or use inline alternatives,
e.g.:
@@
type T;
T *dst_ptr;
T *src_ptr;
T[] dst_arr;
T[] src_arr;
expression n;
@@
(
- memcpy(dst_ptr, src_ptr, (n) * \( sizeof(*(dst_ptr)) \| sizeof(*(src_ptr)) \| sizeof(T) \) )
+ COPY_ARRAY(dst_ptr, src_ptr, n)
|
- memcpy(dst_ptr, src_arr, (n) * \( sizeof(*(dst_ptr)) \| sizeof(*(src_arr)) \| sizeof(T) \) )
+ COPY_ARRAY(dst_ptr, src_arr, n)
|
- memcpy(dst_arr, src_ptr, (n) * \( sizeof(*(dst_arr)) \| sizeof(*(src_ptr)) \| sizeof(T) \) )
+ COPY_ARRAY(dst_arr, src_ptr, n)
|
- memcpy(dst_arr, src_arr, (n) * \( sizeof(*(dst_arr)) \| sizeof(*(src_arr)) \| sizeof(T) \) )
+ COPY_ARRAY(dst_arr, src_arr, n)
)
I seem to remember that rules like this missed some cases, but perhaps
that's no longer an issue with the latest Coccinelle version?
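Whichever way the size is matched, the rewritten call no longer mentions it. As a rough sketch of why that is attractive, assume a simplified stand-in named COPY_ARRAY_SKETCH with a helper copy_array_sketch(); both names are hypothetical and this is not Git's actual definition:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies n elements of the given size; skips the call for n == 0. */
static void copy_array_sketch(void *dst, const void *src, size_t n, size_t size)
{
	/* Guard the multiplication below against overflow. */
	assert(size == 0 || n <= SIZE_MAX / size);
	if (n)
		memcpy(dst, src, n * size);
}

/*
 * Simplified stand-in for illustration only: the element size is
 * derived from dst, so the call site cannot name a wrong type.
 * Works for pointer and array operands alike.
 */
#define COPY_ARRAY_SKETCH(dst, src, n) \
	copy_array_sketch((dst), (src), (n), sizeof(*(dst)))
```

A call like COPY_ARRAY_SKETCH(dst, src, 3) then reads the same whether dst is declared as "int *" or "int[3]".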
>
> @@
> type T;
> T *dst;
> T *src;
> expression n;
> @@
> (
> - memmove(dst, src, (n) * sizeof(*dst));
> + MOVE_ARRAY(dst, src, n);
> |
> - memmove(dst, src, (n) * sizeof(*src));
> + MOVE_ARRAY(dst, src, n);
> |
> - memmove(dst, src, (n) * sizeof(T));
> + MOVE_ARRAY(dst, src, n);
> )
>
> What I find interesting is that this one seems to be able to do the
> necessary rewrite without having to do the "turn everything into
> sizeof(T) first" trick. If this approach works well, I'd rather see
> the COPY_ARRAY() done without the first two preliminary rewrite
> rules.
It doesn't support arrays (T[]). That doesn't matter in practice
because we don't have such cases (yet?).
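For illustration, this is the pointer-only shape the rule targets: shifting the tail of a "const char **" list down by one slot. MOVE_ARRAY_SKETCH and remove_at() are hypothetical names for this sketch, and the macro is a simplified stand-in rather than Git's definition:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in: memmove() with the size taken from dst. */
#define MOVE_ARRAY_SKETCH(dst, src, n) \
	memmove((dst), (src), (n) * sizeof(*(dst)))

/* Drop element i from list[0..*nr) by shifting the tail down one slot. */
static void remove_at(const char **list, size_t *nr, size_t i)
{
	assert(i < *nr);
	/* Source and destination overlap, hence memmove() not memcpy(). */
	MOVE_ARRAY_SKETCH(list + i, list + i + 1, *nr - i - 1);
	(*nr)--;
}
```

Both operands are pointer expressions here, so the missing T[] alternatives would not come into play.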
>
> I wonder if the pattern in the first rule catches sizeof(dst[0])
> instead of sizeof(*dst), though.
It doesn't.
>
> @@
> type T;
> T *ptr;
> expression n;
> @@
> - ptr = xmalloc((n) * sizeof(*ptr));
> + ALLOC_ARRAY(ptr, n);
>
> @@
> type T;
> T *ptr;
> expression n;
> @@
> - ptr = xmalloc((n) * sizeof(T));
> + ALLOC_ARRAY(ptr, n);
>
> Is it a no-op rewrite if we replace the above two rules with
> something like:
>
> . @@
> . type T;
> . T *ptr;
> . expression n;
> . @@
> . (
> . - ptr = xmalloc((n) * sizeof(*ptr));
> . + ALLOC_ARRAY(ptr, n);
> . |
> . - ptr = xmalloc((n) * sizeof(T));
> . + ALLOC_ARRAY(ptr, n);
> . )
I think so.
>
> or even following the pattern of the next one ...
>
> . @@
> . type T;
> . T *ptr;
> . expression n;
> . @@
> . - ptr = xmalloc((n) * \( sizeof(*ptr) \| sizeof(T) \))
> . + ALLOC_ARRAY(ptr, n);
>
> ... I have to wonder? I like the simplicity of this pattern.
In theory this is equivalent.
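Both size spellings end up as the same macro call. A minimal sketch of that call shape, assuming a stand-in named ALLOC_ARRAY_SKETCH built on plain malloc() instead of xmalloc(); the macro and make_squares() are hypothetical and not Git's code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in: the element size comes from the pointer itself. */
#define ALLOC_ARRAY_SKETCH(ptr, n) \
	((ptr) = malloc((n) * sizeof(*(ptr))))

/* Allocates n ints and fills them with squares; NULL on failure. */
static int *make_squares(size_t n)
{
	int *v;
	size_t i;

	/* The sketch macro does no overflow check, so guard here. */
	assert(n <= SIZE_MAX / sizeof(*v));
	ALLOC_ARRAY_SKETCH(v, n);
	if (!v)
		return NULL;
	for (i = 0; i < n; i++)
		v[i] = (int)(i * i);
	return v;
}
```

Unlike xmalloc(), malloc() can return NULL, which is why the sketch checks the result.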
>
> @@
> type T;
> T *ptr;
> expression n != 1;
> @@
> - ptr = xcalloc(n, \( sizeof(*ptr) \| sizeof(T) \) )
> + CALLOC_ARRAY(ptr, n)
>
> And this leaves xcalloc(1, ...) alone because it is a way to get a
> cleared block of memory that may not be an array at all. Shouldn't
> we have "n != 1" for xmalloc rule as well, I wonder, if only for
> consistency?
I agree that a single-element array is a bit awkward, so allowing the
explicit sizeof in that case is less iffy. ALLOC/CALLOC macros for
single items might make that automation more palatable.
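Such a single-item macro could look roughly like the sketch below. CALLOC_ONE_SKETCH, struct counter and new_counter() are hypothetical names, and plain calloc() stands in for xcalloc():

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical single object that is not an array. */
struct counter {
	int hits;
	const char *label;
};

/* Hypothetical macro: one zeroed object, size taken from the pointer. */
#define CALLOC_ONE_SKETCH(ptr) \
	((ptr) = calloc(1, sizeof(*(ptr))))

/* Returns a zero-initialized counter with its label set; NULL on failure. */
static struct counter *new_counter(const char *label)
{
	struct counter *c;

	assert(label != NULL);
	CALLOC_ONE_SKETCH(c);
	if (c)
		c->label = label;
	return c;
}
```

The call site then neither repeats the type nor pretends to allocate an array of one.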
René