Re: Cleaning up elkscmd and adding help text - Questions

public inbox for linux-8086@vger.kernel.org
 help / color / mirror / Atom feed

From: Nils Stec <nils.stec@gmail.com>
To: ELKS <linux-8086@vger.kernel.org>
Subject: Re: Cleaning up elkscmd and adding help text - Questions
Date: Fri, 27 Mar 2015 18:59:28 +0100	[thread overview]
Message-ID: <55159A80.4040408@gmail.com> (raw)
In-Reply-To: <551558D7.8040000@jodybruchon.com>

Wow i really belive that writing this took some time... Thank you, this 
gave me a lot of hints, do's and don'ts.

I read The Art of Unix Programming, but you showed me that i forgot a lot ;)

I think I'm going to rewrite my patches with all that things in mind, 
will send them soon.

I hope my submissions will be ready very soon.

Nils





On 03/27/2015 02:19 PM, Jody Bruchon wrote:
> On 3/27/2015 4:19 AM, Nils Stec wrote:
>> I wrote a patch for cat which shows the user some usage-information.
>> Jody Bruchon used write() for this purpose, other tools are using
>> fprintf(stderr, ....).
>> I don't know which i should prefer. In my opinion i would rather use
>> fprintf.
>> STDERR should be the output for operations, right?
>
> Some of the programs already used write() and write() avoids dragging 
> the code in for the printf family of functions (all ELKS programs are 
> statically linked), so if there is NO usage of the printf family of 
> functions in the program, it may make sense to use write() to minimize 
> total program size. That being said, even a trivial usage message 
> should mention the program being run in the usage, and it's not a good 
> idea to assume that i.e. 'ls' is ALWAYS run as 'ls' since someone 
> could symlink or alias it in the future. Because of this,
>
> fprintf(stderr, "usage: %s args\n", argv[0]);
>
> is a thousand times better than:
>
> write(STDERR_FILENO, "usage: ", 7);
> write(STDERR_FILENO, argv[0], strlen(argv[0]));
> write(STDERR_FILENO, "\n", 1);
>
> This results in three calls to write() and a call to strlen() when we 
> could have just used one fprintf() to do all the work. In the future, 
> I think many calls to write() will be eliminated, since even one call 
> to a printf family function brings in the core code for all printf 
> functions anyway.
>
>> Is this the right way to write code for this project? It's the first
>> time I'm doing this in here and i don't wan't to do things wrong...
>
> The cat patch should use fprintf(stderr, ...) since I've already 
> brought in the printf function family with my error_read: code and 
> write() is bad for the reasons outlined above.
>
> A few other comments:
>
> +#define NOT_OVERWRITING        0xABCD
> ...snip...
> +            return NOT_OVERWRITING;
>
> If the user asks you to not overwrite (clobber), you should silently 
> not clobber files and not consider it an error. If any program 
> succeeds in doing what the user asked for, it should return success 
> (0) with no messages. If you haven't read Eric S. Raymond's rules of 
> the Unix philosophy, take a look at them; I find them to be very 
> helpful guidelines when making program design decisions.
>
> +    if (stat(srcname, &statbuf1) < 0) {    // src nonexistent
>
> You can't write comments using // unless you're using a C99 compiler. 
> bcc is a K&R C compiler that is not even guaranteed to be C89 
> compliant, so no // comments. Use /* comment */ instead. Source 
> comments are nice to have as long as they're not excessive or 
> explaining something that is very obvious.
>
> +static int option_no_overwrite;
> +static int option_verbose;
> +static int option_interactive;
>
> Depending on how many options you add, you may want to use one global 
> static int called 'flags' or similar, and use bit flags instead of a 
> bunch of ints. From the program 'fdupes' in one of my GitHub repos, 
> snipped up for brevity:
>
> #define ISFLAG(a,b) ((a & b) == b)
> #define SETFLAG(a,b) (a |= b)
> #define F_RECURSE 0x0001
> #define F_HIDEPROGRESS 0x0002
> etc. etc.
> ...
> switch (opt) {
> case 'q':
>   SETFLAG(flags, F_HIDEPROGRESS);
>   break;
> ...
> if (ISFLAG(flags, F_RECURSEAFTER)) <code here>...
> if (!ISFLAG(flags, F_HIDEPROGRESS)) <code here>...
>
> I don't necessarily recommend using this code, it's just an example 
> and it's written to be usable with more than one variable that holds a 
> set of flags. I'd rewrite it like this for ELKS programs to make it 
> cleaner:
>
> static int flags;
> #define ISFLAG(a) ((flags & a) == a)
> #define SETFLAG(a) (flags |= a)
> #define F_QUIET 0x0001
> etc.
> ...
> if (strcmp(argv[i], "-q")) SETFLAG(F_QUIET);
> if (!ISFLAG(F_QUIET)) fprintf(stderr, "copied %d files\n", count);
>
> ---
> +    if(option_verbose) {
> +        printf("copied '%s' -> '%s'\n", srcname, destname);
> +    }
>
> Firstly, the output from GNU 'cp' looks like this:
>
> $ cp -v foo/* bar/
> 'foo/1' -> 'bar/1'
> 'foo/2' -> 'bar/2'
> 'foo/3' -> 'bar/3'
>
> I'd also prefer one-liners such as this with no multi-line 'else' 
> clause to be collapsed together (it makes the code easier to read and 
> avoids 'brace hell'). I also prefer that if, for, and while statements 
> have a space between the word and the (condition) part for the sake of 
> code readability. Write it like this instead:
>
> if (option_verbose) printf("'%s' -> '%s'\n", srcname, destname);
>
> Next:
>
> +    // set defaults for options //
> +    option_no_overwrite = 0;
> +    option_verbose = 0;
> +    option_interactive = 0;
>
> The comment is unnecessary since it's obvious that "options are being 
> initialized to zero" by the code itself. This is self-commenting code. 
> I'd also prefer to see the variable names shortened a bit ('o_' or 
> 'opt_' instead of 'option_' for example) and as mentioned previously, 
> it's probably better to use the bit flags method mentioned previously 
> if you're only using the variables as a yes/no flag.
>
> +    arg_counter = 1;    // start at argv[1]
> +    options_counter = 0;
> +    while(arg_counter < argc) {
> +        if(!strcmp(argv[arg_counter], "-n")) {
> +            options_counter++;
> +            option_no_overwrite = 1;
> +        }
> +        if(!strcmp(argv[arg_counter], "-v")) {
> +            options_counter++;
> +            option_verbose = 1;
> +        }
> +        if(!strcmp(argv[arg_counter], "-i")) {
> +            options_counter++;
> +            option_interactive = 1;
> +        }
> +        if(!strcmp(argv[arg_counter], "-h")) goto usage;
> +        arg_counter++;
> +    }
>
> -    if ((argc > 3) && !dirflag) {
> -        fprintf(stderr, "%s: not a directory\n", lastarg);
> +    if((argc > (3+options_counter)) && !dirflag) { // can't copy more 
> than one src-files into one dst-file
>
> Ouch. Rather than try to explain the problems in detail, let me 
> rewrite it.
>
>     arg_cnt = 1;    // start at argv[1]
>     opt_cnt = 0;
>     while(arg_cnt < argc) {
>         if (!strcmp(argv[arg_cnt], "-n")) {
>             opt_noclobber = 1;
>         } else if (!strcmp(argv[arg_cnt], "-v")) {
>             opt_verbose = 1;
>         } else if (!strcmp(argv[arg_cnt], "-i")) {
>             opt_interactive = 1;
>         } else if (!strcmp(argv[arg_cnt], "-h")) goto usage;
>         else break;
>         opt_cnt++;
>         arg_cnt++;
>     }
>
>     /* If multiple files specified, last one must be a directory */
>     if ((argc > (opt_cnt + 3)) && !dirflag) {
>
> This rewrite combines all the option count increments into one 
> instance which reduces code size and stops parsing options at the 
> first non-option string. The comparison on the last line is more 
> readable and it is clearer that it is "comparing total argument count 
> to the number of options specified plus 3." The 'if-elseif-else' tree 
> stops processing once a valid option is hit and "falls through" once a 
> non-option is hit. It could (and probably should) be improved by 
> handling the combined option case such as 'cp -iv src dest' but this 
> is sufficient for now as long as the usage text implies that option 
> flags must be specified separately.
>
> +    copy_count = 0;
> +    file_count = 0;
> +    while (argc-- > (2+options_counter)) {
> +        srcname = argv[options_counter+1+file_count];
>
> I've spent a lot of time cleaning these compressed mathematical 
> expressions out of the elkscmd code; let's not introduce more of them 
> (also shorten variable names while retaining their meaning, this is a 
> good example for why to use shorter names):
>
> +    copy_cnt = 0;
> +    file_cnt = 0;
> +    while (argc-- > (opt_cnt + 2)) {
> +        srcname = argv[opt_cnt + 1 + file_cnt];
>
> Next,
>
> -        if (dirflag) destname = buildname(destname, srcname);
> +        if(dirflag) destname = buildname(destname, srcname);
>
> Don't do this, if no other reason than the fact that 'if' is not a 
> function call. I understand that it comes down to individual coding 
> style and the compiler doesn't care if you write
>
> if((something+(q/2))==argv[x+2+y*z[aaa]])usage();
>
> but it is hard to read and readability is very important. I believe 
> that 'if', 'while', and 'for' shouldn't look like function calls in 
> code since they are actually C control statements. Using a space makes 
> those control statements immediately obvious, whereas if() and while() 
> and for() look like function calls when skimming the code.
>
> -        if (!copyfile(*++argv, destname, 0)) goto error_copy;
> +        retval = copyfile(srcname, destname, 0);
> +        if((!retval) && (!(retval == NOT_OVERWRITING))) goto error_copy;
>
> Remember the rule about silent success mentioned earlier? This change 
> can be eliminated because of it. Even if it wasn't, though, this code 
> could be simplified and made more readable in one shot:
>
> +        retval = copyfile(srcname, destname, 0);
> +        if (retval != 1) goto error_copy;
>
> I'd actually recommend reversing the return values in copyfile(), 
> using nonzero as an error return instead (most C library calls seem to 
> use this convention and consistency with that is rarely a bad thing) 
> so you could just go:
>
> +        if (retval) goto error_copy;
>
> Next,
>
> +    if(option_verbose) printf("copy-count = %d\n", copy_count);
>
> This shouldn't be done. Here's GNU coreutils 'cp' in action:
>
> file_utils $ cp -v foo/* bar/
> 'foo/COPYING' -> 'bar/COPYING'
> 'foo/COPYING.sash' -> 'bar/COPYING.sash'
> 'foo/Makefile' -> 'bar/Makefile'
> 'foo/README' -> 'bar/README'
> 'foo/WARRANTY' -> 'bar/WARRANTY'
> ...snip...
>
> If someone is interested in knowing a "copy-count" for some reason, 
> they'll almost certainly do this instead:
>
> file_utils $ cp -v foo/* bar/ | wc -l
> 46
>
> Your helpful message will throw this off and make it not work as 
> expected. A shell script author will have to introduce an incompatible 
> kludge to their script to make up for 'cp' misbehaving, i.e.
>
> CNT=$(cp -v $A/* $B/ | wc -l)
> CNT=$((CNT - 1))  # This shouldn't be necessary...
>
> Another problem is that you've declared a static global 'copy_count' 
> variable and you're introducing a local 'copy_count' variable as well. 
> It needs to be called something else for the sake of avoiding 
> programmer confusion. However, assuming you eliminate the "copy-count" 
> message I just talked about along with its local variable counterpart, 
> the global 'static unsigned int copy_count' is set (copy_count++) but 
> not used throughout the entire program and can be discarded.
>
> If you ever want an "official" reference for how a particular elkscmd 
> program "should" behave, take a look at the POSIX standards:
>
> http://pubs.opengroup.org/onlinepubs/9699919799/
>
> Whew, that took me a long time to write. I hope it's helpful to you 
> and everyone else on the list. Thanks for your impressive work so far! 
> I'm looking forward to your finalized [PATCH] submissions. :-)
>
> -Jody
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-8086" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

next prev parent reply	other threads:[~2015-03-27 17:59 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-27  8:19 Cleaning up elkscmd and adding help text - Questions Nils Stec
2015-03-27 13:19 ` Jody Bruchon
2015-03-27 14:10   ` Alan Cox
2015-03-27 16:17     ` Jody Bruchon
2015-03-27 18:19       ` MFLD
2015-03-27 18:26         ` Jody Bruchon
2015-03-27 18:52           ` MFLD
2015-03-27 22:19             ` Alan Cox
2015-03-28 15:35               ` LM
2015-03-27 18:41         ` Alan Cox
2015-03-27 17:59   ` Nils Stec [this message]
2015-03-27 18:31     ` Alan Cox
2015-03-27 13:54 ` Alan Cox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55159A80.4040408@gmail.com \
    --to=nils.stec@gmail.com \
    --cc=linux-8086@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox