[Qemu-devel] Re: Re: completely OT: C Q/As : let's feed the troll

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] Re: Re: completely OT: C Q/As : let's feed the troll
@ 2004-06-24 20:13 Bob Barry
  0 siblings, 0 replies; 2+ messages in thread
From: Bob Barry @ 2004-06-24 20:13 UTC (permalink / raw)
  To: qemu-devel

On Thu, 24 Jun 2004 16:21, Charlie Gordon wrote:
> Do you know any tool that can detect such horrors ?

dcc:
   http://wwwsi.supelec.fr/yn/presentdcc.eng.html

^ permalink raw reply	[flat|nested] 2+ messages in thread

* OT: C Q/As, was Re: [Qemu-devel] security_20040618
@ 2004-06-21  8:50 Christof Petig
  2004-06-21 15:44 ` Michael Jennings
  0 siblings, 1 reply; 2+ messages in thread
From: Christof Petig @ 2004-06-21  8:50 UTC (permalink / raw)
  To: gmane; +Cc: qemu-devel

Charlie Gordon schrieb:
> Charlie the C teaser.
> ---------------------
> one of my favorite Q/As : what is wrong with : enum BOOL { FALSE=0,
> TRUE=1 }; ?

can you enlighten me? The only drawback I see is that with plain C (no 
C++) typedef enum { ... } BOOL; would be more appropriate.

I would propose
#ifndef __cplusplus
typedef enum { false, true } bool;
#endif
as the optimal solution for a problem I hardly have (since I usually 
don't go back to coding in C)

    Christof

PS: I used to ask: Why does this crash later (if you are lucky)

const char *itoa(int i)
{  char x[20];
    snprintf(x,sizeof x,"%d",i);
    return x;
}

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: OT: C Q/As, was Re: [Qemu-devel] security_20040618
@ 2004-06-21 15:44 ` Michael Jennings
  2004-06-22  9:57   ` [Qemu-devel] Re: completely OT: C Q/As, was security_20040618 Charlie Gordon
  0 siblings, 1 reply; 2+ messages in thread
From: Michael Jennings @ 2004-06-21 15:44 UTC (permalink / raw)
  To: qemu-devel

On Monday, 21 June 2004, at 10:50:44 (+0200),
Christof Petig wrote:

> can you enlighten me? The only drawback I see is that with plain C
> (no C++) typedef enum { ... } BOOL; would be more appropriate.
> 
> I would propose
> #ifndef __cplusplus
> typedef enum { false, true } bool;
> #endif
> as the optimal solution for a problem I hardly have (since I usually 
> don't go back to coding in C)

There are two problems with using enum's and/or #define's for
TRUE/FALSE.  They should not be used in boolean expressions except
where the return value is of that same typedef (e.g., the function
returns bool) or is generated from the same set of #define's.  While
false is always 0, true is not always 1.  True is non-zero.

> const char *itoa(int i)
> {  char x[20];
>    snprintf(x,sizeof x,"%d",i);
>    return x;
> }

Forgot "static" before char x[20];, or to be more threadsafe, either
malloc() or pass in a buffer and max size.

> defensive programming would require that TRUE be also defined as
> 
> #define TRUE 1
> 
> as many unsuspecting programmers will expect TRUE and FALSE to be handled in
> the preprocessor phase eg:
> 
> #if TRUE
>     ...
>     somecode();
>     ...
> #endif

I disagree strongly.  Anyone who writes "#if TRUE" or "#if FALSE" is
just asking for trouble; (s)he needs to be taught a lesson.  "#if 0"
and "#if 1" are just as obvious, and both "#if 1" and "#if TRUE" are
equally ridiculous.

A better technique would be something more like this:

#define UNUSED_CODE_BLOCK 0

#if UNUSED_CODE_BLOCK
...
#else  /* used code below */
...
#endif

But even then it isn't as clean and readable as 1 or 0.  Plus, you've
still failed to solve the problem that true may or may not be 1.

Michael

-- 
Michael Jennings (a.k.a. KainX)  http://www.kainx.org/  <mej@kainx.org>
n + 1, Inc., http://www.nplus1.net/       Author, Eterm (www.eterm.org)
-----------------------------------------------------------------------
 "You can accomplish much if you don't care who gets the credit."
                                                      -- Ronald Reagan

^ permalink raw reply	[flat|nested] 2+ messages in thread

* [Qemu-devel] Re: completely OT: C Q/As, was Re: security_20040618
  2004-06-21 15:44 ` Michael Jennings
@ 2004-06-22  9:57   ` Charlie Gordon
  2004-06-22 15:38     ` [Qemu-devel] Re: completely OT: C Q/As Michael Jennings
  0 siblings, 1 reply; 2+ messages in thread
From: Charlie Gordon @ 2004-06-22  9:57 UTC (permalink / raw)
  To: qemu-devel

"Michael Jennings" <mej@eterm.org> wrote in message
news:20040621154440.GQ4686@kainx.org...
> There are two problems with using enum's and/or #define's for
> TRUE/FALSE.  They should not be used in boolean expressions except
> where the return value is of that same typedef (e.g., the function
> returns bool) or is generated from the same set of #define's.  While
> false is always 0, true is not always 1.  True is non-zero.

This is a common misunderstanding.  your last statement doesn't make sense :
in C, false IS 0 and true IS 1.  That is the result of a false comparison is
0 (eg: 2 == 3) and the result of a true comparison is 1 (as in 2 == 2).  As
such boolean expressions can only have 2 values : 0 and 1, appropriately
defined as FALSE and TRUE, be it via #define or enum.
Unlike its distant cousin Java, C allows non boolean expressions to control
test statements such as 'if', 'for', and 'while'.  In such constructs,
falseness is not limited to the integer value 0 : '\0', 0.0, and NULL
pointers also convert to false, while all other values convert to true.
Further confusion is caused by the convention used in the standard C library
where boolean valued function do not necessarily return 1 for trueness.
Furthermore, it is perfectly OK to compare values from different typedefs
but identical implementation types.
I do agree with you that there is room for confusion, especially for java
bred programmers, and it is usually better to avoid the extra clutter of
comparing to FALSE or TRUE in test expressions : just use the plain
expression or bang (!) it for the reverse condition.

> Forgot "static" before char x[20];, or to be more threadsafe, either
> malloc() or pass in a buffer and max size.

These are obvious solutions to the problem, the latter (buffer and size) is
actually the one used in the C library for that very function (itoa).  But
the question was a different one : Why does this crash later (if you are
lucky) ?

> > defensive programming would require that TRUE be also defined as
> >
> > #define TRUE 1
> >
> > as many unsuspecting programmers will expect TRUE and FALSE to be
handled in
> > the preprocessor phase eg:
> >
> > #if TRUE
> >     ...
> >     somecode();
> >     ...
> > #endif
>
> I disagree strongly.  Anyone who writes "#if TRUE" or "#if FALSE" is
> just asking for trouble; (s)he needs to be taught a lesson.  "#if 0"
> and "#if 1" are just as obvious, and both "#if 1" and "#if TRUE" are
> equally ridiculous.

You use #if 0 yourself, try pretending you never switch those to #if 1 ?
I used to be harsh like this and show contempt for imperfect programming.
I've had to grow more tolerant, as 99% of C or C++ code is indeed imperfect,
and most programmers are more focussed on problem solving than clean coding.
The truth is that *anyone* who writes C code is just asking for trouble.
Yet this is still my favorite language.  I do use #if 0 or #if 1 sometimes,
never #if TRUE or #if FALSE, but I do not believe people that do should be
taught such a difficult lesson : after all, it is an accepted convention
that all uppercase identifiers be preprocessor defined. Christof Petig
provided a more compelling example :

#define USE_MORE_CAUTION  TRUE
#if USE_MORE_CAUTION == TRUE
    // this code would compile
#endif
#if USE_MORE_CAUTION == FALSE
    // this code would compile as well
#endif

My point is that if you define TRUE and FALSE in a header file, your job is
to make sure that you are not setting up boobytraps for the unsuspecting
(and imperfect) programmer to trigger.

I took a look at libast and you DO redefine TRUE and FALSE so as to setup
such a trap !!! have pity on your users!

/* from libast/types.h */
...
#undef false
#undef False
#undef FALSE
#undef true
#undef True
#undef TRUE

typedef enum {
#ifndef __cplusplus
  false = 0,
#endif
  False = 0,
  FALSE = 0,
#ifndef __cplusplus
  true = 1,
#endif
  True = 1,
  TRUE = 1
} spif_bool_t;
...

While your code is amazingly consistent and readable (except for line
length), you can learn from this very thread and fix it : you use strncpy 10
times. 7 of those are incorrect and at least 5 will lead to potential buffer
overflow and inefficiencies, the remaining 3 could be replaced directly with
calls to memcpy.  You make one incorrect use of strncat, that may overflow
by just one byte.  pstrcpy provides the very semantics you require in most
of these cases.

Cheers,

Chqrlie.

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [Qemu-devel] Re: completely OT: C Q/As
  2004-06-22  9:57   ` [Qemu-devel] Re: completely OT: C Q/As, was security_20040618 Charlie Gordon
@ 2004-06-22 15:38     ` Michael Jennings
  2004-06-24 14:21       ` [Qemu-devel] Re: Re: completely OT: C Q/As : let's feed the troll Charlie Gordon
  0 siblings, 1 reply; 2+ messages in thread
From: Michael Jennings @ 2004-06-22 15:38 UTC (permalink / raw)
  To: qemu-devel

On Tuesday, 22 June 2004, at 11:57:23 (+0200),
Charlie Gordon wrote:

> This is a common misunderstanding.  your last statement doesn't make
> sense : in C, false IS 0 and true IS 1.  That is the result of a
> false comparison is 0 (eg: 2 == 3) and the result of a true
> comparison is 1 (as in 2 == 2).  As such boolean expressions can
> only have 2 values : 0 and 1, appropriately defined as FALSE and
> TRUE, be it via #define or enum.

Incorrect.  The result of a boolean operator is guaranteed to be 1 or
0.  However, values evaluated in a boolean context such as those in an
if statement are not bound by that rule.

> Unlike its distant cousin Java, C allows non boolean expressions to
> control test statements such as 'if', 'for', and 'while'.  In such
> constructs, falseness is not limited to the integer value 0 : '\0',
> 0.0, and NULL pointers also convert to false, while all other values
> convert to true.

All the values you mentioned evaluate to (int) 0 in a boolean context.

> Further confusion is caused by the convention used in the standard C
> library where boolean valued function do not necessarily return 1
> for trueness.

As there is no such thing as a "boolean value" in C, no standard C
function returns a boolean value.  Some do, however, return an integer
that may be successfully used in a boolean context (albeit possibly
negated).

> I do agree with you that there is room for confusion, especially for
> java bred programmers, and it is usually better to avoid the extra
> clutter of comparing to FALSE or TRUE in test expressions : just use
> the plain expression or bang (!) it for the reverse condition.

That was not my statement.  Again, it depends on what you're doing.
If you're guaranteed a return of 0 or non-zero, best to use == 0 or !=
0.  With pointers, the value itself is fine, negated or not, but other
valid approaches include == NULL, != NULL, or even a macro like
ISNULL().

> These are obvious solutions to the problem, the latter (buffer and
> size) is actually the one used in the C library for that very
> function (itoa).  But the question was a different one : Why does
> this crash later (if you are lucky) ?

You found my answer to be obvious.  I found the question to be
obvious.  So we're even. :-)

The answer you were looking for (regarding the stack) had already been
given.  I was simply taking a different tack.

> You use #if 0 yourself, try pretending you never switch those to #if 1 ?

In production-quality code?  Never.  For testing purposes where only I
was meant to see it?  Absolutely.

> I used to be harsh like this and show contempt for imperfect
> programming.  I've had to grow more tolerant, as 99% of C or C++
> code is indeed imperfect, and most programmers are more focussed on
> problem solving than clean coding.

And a lackadaisical attitude such as this is why so many programmers
just don't know any better.  Be tolerant in what you accept, but
strict in what you create.  And just because something doesn't
generate an error doesn't mean it shouldn't generate a warning.

> The truth is that *anyone* who writes C code is just asking for
> trouble.

Don't agree there.

> after all, it is an accepted convention that all uppercase
> identifiers be preprocessor defined.

We agree on that point at least. :-)

> Christof Petig provided a more compelling example :
> 
> #define USE_MORE_CAUTION  TRUE
> #if USE_MORE_CAUTION == TRUE
>     // this code would compile
> #endif
> #if USE_MORE_CAUTION == FALSE
>     // this code would compile as well
> #endif

Again, it's more properly written as "#if USE_MORE_CAUTION" or "#if
!USE_MORE_CAUTION"

> My point is that if you define TRUE and FALSE in a header file, your
> job is to make sure that you are not setting up boobytraps for the
> unsuspecting (and imperfect) programmer to trigger.

And again, I disagree.  By defining something in a header file that
other programmers will use, you're defining an API.  What is most
important is that it is clear and consistent, not that it follows any
one particular religion.

> I took a look at libast and you DO redefine TRUE and FALSE so as to
> setup such a trap !!! have pity on your users!
> 
> /* from libast/types.h */
> ...
> #undef false
> #undef False
> #undef FALSE
> #undef true
> #undef True
> #undef TRUE
> 
> typedef enum {
> #ifndef __cplusplus
>   false = 0,
> #endif
>   False = 0,
>   FALSE = 0,
> #ifndef __cplusplus
>   true = 1,
> #endif
>   True = 1,
>   TRUE = 1
> } spif_bool_t;
> ...

First off, I'm creating a well-defined API.  I use an enum to create
an actual type I can return, manipulate, check against, blah blah
blah.  I undefine those macros to prevent situations where some
platforms may expand one or more of the LHS values to something that's
no longer valid for the enum while others may not.

Secondly, libast headers are API headers, and are really not all that
different from system headers.  If a programmer wishes to use them,
(s)he should #include them BEFORE any custom headers (s)he may also be
using.  Nothing is stopping a programmer from redefining those macros
as (s)he sees fit.  But for my purposes, the correct choice is clear.

Besides, several other items which are more intended for preprocessor
use (e.g., FIXME_BLOCK and UNUSED_BLOCK) are defined separately.

> While your code is amazingly consistent and readable (except for
> line length),

LOL  You've got me there; I use GNU Emacs under X11, so my maximum
line length is largely dependent on how big the Emacs window was in
which I wrote that code. :-)

Wrapping code at 80 chars can result in some really ugly messes, so I
gave up on that.  I just haven't found a really good alternative yet.

> you can learn from this very thread and fix it : you use strncpy 10
> times. 7 of those are incorrect and at least 5 will lead to
> potential buffer overflow and inefficiencies,

My analysis of the code disagrees, but I'd like to look at any
evidence you may have.

> the remaining 3 could be replaced directly with calls to memcpy.
> You make one incorrect use of strncat, that may overflow by just one
> byte.

You're right, thanks.

Michael

-- 
Michael Jennings (a.k.a. KainX)  http://www.kainx.org/  <mej@kainx.org>
n + 1, Inc., http://www.nplus1.net/       Author, Eterm (www.eterm.org)
-----------------------------------------------------------------------
 "Did you really have to die for me?  All I am for all You are because
  What I need and what I believe are worlds apart."
                                       -- Jars of Clay, "Worlds Apart"

^ permalink raw reply	[flat|nested] 2+ messages in thread

* [Qemu-devel] Re: Re: completely OT: C Q/As : let's feed the troll
  2004-06-22 15:38     ` [Qemu-devel] Re: completely OT: C Q/As Michael Jennings
@ 2004-06-24 14:21       ` Charlie Gordon
  0 siblings, 0 replies; 2+ messages in thread
From: Charlie Gordon @ 2004-06-24 14:21 UTC (permalink / raw)
  To: qemu-devel

> As there is no such thing as a "boolean value" in C, no standard C
> function returns a boolean value.  Some do, however, return an integer
> that may be successfully used in a boolean context (albeit possibly
> negated).

In C there is no boolean type, but there are boolean values : results of
boolean operators as you pointed out.
I agree that most standard C functions do not return booleans values, which
is the cause of numerous bugs by unsuspecting but bold programmers, not
necessarily newbies. look at these examples:

if (isspace(*p) == isspace(*q)) ...   // stupid optimisation!
if (isdir(path1) & isdir(path2)) ...     // lack of C experience or plain
typo

These are horrible things to write, but I've seen much worse examples of non
portable code like this, that do work in a lot of environments, and will
fail without warning on others.
One would need to define an integer type that can only be compared to 0.
strcmp provides even more horrible semantics: C newbies routinely compare
the result to -1 !
Do you know any tool that can detect such horrors ?  I agree with teaching
offenders a lesson, that's what compile time warnings and errors are for.
If you let bugs like this crawl into production software, end users will
become victims, and maintainers will do the hard work...  Original
authors/sinners will have already moved on to other projects, and commit
more trash.

 My point is this : C is difficult enough as a language, try not to add more
traps unnecessarily.

> And just because something doesn't
> generate an error doesn't mean it shouldn't generate a warning.
>

I agree completely, as a matter of fact, I always compile with almost all
warnings on, and make gcc treat them as errors.

> > /* from libast/types.h */
> > ...
> > #undef false
> > #undef False
> > #undef FALSE
> > #undef true
> > #undef True
> > #undef TRUE
> >
> > typedef enum {
> > #ifndef __cplusplus
> >   false = 0,
> > #endif
> >   False = 0,
> >   FALSE = 0,
> > #ifndef __cplusplus
> >   true = 1,
> > #endif
> >   True = 1,
> >   TRUE = 1
> > } spif_bool_t;
> > ...
>
> First off, I'm creating a well-defined API.  I use an enum to create
> an actual type I can return, manipulate, check against, blah blah
> blah.  I undefine those macros to prevent situations where some
> platforms may expand one or more of the LHS values to something that's
> no longer valid for the enum while others may not.
>
> Secondly, libast headers are API headers, and are really not all that
> different from system headers.  If a programmer wishes to use them,
> (s)he should #include them BEFORE any custom headers (s)he may also be
> using.  Nothing is stopping a programmer from redefining those macros
> as (s)he sees fit.  But for my purposes, the correct choice is clear.

In the case of FALSE/TRUE, you know damn well you are not "creating" the
API, you are potentially redefining it, the fact that you first undef those
symbols proves my point.  I cannot understand why you define upto 3
different symbols for each boolean value !
In my original post, I was pointing out how such a redefinition has subtle
side-effects that you probably didn't anticipate.

For completeness, TRUE and FALSE are defined by numerous API headers.
Apache for one has a bogus definition in several header files :
apache/ap_ctx.h:
#ifndef FALSE
#define FALSE 0
#define TRUE  !FALSE
#endif

The missing parentheses should strike as an obvious mistake.
Further thinking may convince you otherwise...
Wrong ! here is a contrived counterexample :

    printf("this should print 98 : TRUE[\"ab\"]=%d\n", TRUE["ab"]);

will print 0 instead of 98 with that definition.

C IS DIFFICULT ! There is always one more bug.

> > While your code is amazingly consistent and readable (except for
> > line length),
>
> LOL  You've got me there; I use GNU Emacs under X11, so my maximum
> line length is largely dependent on how big the Emacs window was in
> which I wrote that code. :-)
>
> Wrapping code at 80 chars can result in some really ugly messes, so I
> gave up on that.  I just haven't found a really good alternative yet.

Good for you !  I can read C code much faster when it doesn't extend beyond
80 columns.
You get ugly messes because you do too much on one line, use too many
verbose macros...

    self->items = SPIF_CAST_C(spif_obj_t *) REALLOC(self->items,
SPIF_SIZEOF_TYPE(obj) * (--(self->len)));

Since REALLOC, and MALLOC always need a cast in C++, why not pass the type ?
Since realloc has despicable behaviour when running out of memory, why not
pass the address of the pointer ?
Then wrap the ugly mess in a macro (you know about that)...
This is much safer, more concise, and fits in 80 columns, and you could even
check the boolean return value :

    REALLOC(&self->items, --self->len, spif_obj_t);

> > you can learn from this very thread and fix it : you use strncpy 10
> > times. 7 of those are incorrect and at least 5 will lead to
> > potential buffer overflow and inefficiencies,
>
> My analysis of the code disagrees, but I'd like to look at any
> evidence you may have.

Lets go:

socket.c line 717
    addr->sun_path[0] = 0;
    strncat(addr->sun_path, SPIF_STR_STR(spif_url_get_path(self)),
sizeof(addr->sun_path));

strncat will copy upto sizeof(addr->sun_path) bytes from the source, and
tack a '\0'.
the destination happens to be empty (why use strncat ?), so this code can
only overflow addr->sun_path by 1 byte, it would be much worse if actual
concatenation took place.

file.c line 51:
    if (len) {
        strncpy(ftemplate, buff, len);
        ftemplate[len - 1] = 0;
    }

this is just inefficient and cumbersome. pstrcpy is what you want.  also len
is a misleading name for the buffer size.

mem.c line 78:

    strncpy(p->file, filename, LIBAST_FNAME_LEN);
    p->file[LIBAST_FNAME_LEN] = 0;

inefficiency again, luckily the buffer size is LIBAST_FNAME_LEN + 1, pstrcpy
fixes this.

mem.c line 135

   strncpy(p->file, filename, LIBAST_FNAME_LEN);

no extra null setting here ! code is inconsistent, I guess you rely on magic
side effects : the ptr_t structure is not allocated here and most likely the
last byte of the file buffer has already been set to 0.  This is unsafe.
pstrcpy fixes this.

strings.c line 99. line 120:

    tmpstr = (char *) MALLOC(cnt + 1);
    strncpy(tmpstr, str, cnt);
    tmpstr[cnt] = 0;
    return (tmpstr);

inefficiency !  pstrcpy fixes this.

conf.c line 485:

                  strncpy(newbuff + j, getenv("HOME"), max - j);
                  cnt1 = strlen(getenv("HOME")) - 1;
                  cnt2 = max - j - 1;
                  j += MIN(cnt1, cnt2);

given CONFIG_BUFF is defined as 20480, (max - j) is likely very large. So
inefficient!
It is also bogus as newbuff is not guarantied to be null terminated.
getting $HOME twice is childish.
relying on MIN to fix the unsigned overflow in case $HOME is the empty
string is nothing to be proud of.
Similar code is repeated 3 more times in the same function with similar
problems.

conf.c line 741:
        if (n > 0 && n <= maxpathlen) {
            /* Compose the /path/file combo */
            strncpy(full_path, path, n);
            if (full_path[n - 1] != '/') {
                full_path[n++] = '/';
            }
            full_path[n] = '\0';
            strcat(full_path, name);

strncpy is OK here, but not needed as memcpy can be used directly.
strcpy(full_path + n, name) would be more efficient as well.

I'll grant you that the bugs are very remote, but that makes them completely
unlikely to be caught
before production.

Just don't use strncpy, steal pstrcpy from here and use it, and propagate to
message !

Cheers,

Charlie Gordon.

PS: please let's stop polluting qemu, address further remarks to me
directly.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2004-06-24 20:12 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-06-24 20:13 [Qemu-devel] Re: Re: completely OT: C Q/As : let's feed the troll Bob Barry
  -- strict thread matches above, loose matches on Subject: below --
2004-06-21  8:50 OT: C Q/As, was Re: [Qemu-devel] security_20040618 Christof Petig
2004-06-21 15:44 ` Michael Jennings
2004-06-22  9:57   ` [Qemu-devel] Re: completely OT: C Q/As, was security_20040618 Charlie Gordon
2004-06-22 15:38     ` [Qemu-devel] Re: completely OT: C Q/As Michael Jennings
2004-06-24 14:21       ` [Qemu-devel] Re: Re: completely OT: C Q/As : let's feed the troll Charlie Gordon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).