From: Nils Stec <nils.stec@gmail.com>
To: ELKS <linux-8086@vger.kernel.org>
Subject: Re: Cleaning up elkscmd and adding help text - Questions
Date: Fri, 27 Mar 2015 18:59:28 +0100 [thread overview]
Message-ID: <55159A80.4040408@gmail.com> (raw)
In-Reply-To: <551558D7.8040000@jodybruchon.com>
Wow i really belive that writing this took some time... Thank you, this
gave me a lot of hints, do's and don'ts.
I read The Art of Unix Programming, but you showed me that i forgot a lot ;)
I think I'm going to rewrite my patches with all that things in mind,
will send them soon.
I hope my submissions will be ready very soon.
Nils
On 03/27/2015 02:19 PM, Jody Bruchon wrote:
> On 3/27/2015 4:19 AM, Nils Stec wrote:
>> I wrote a patch for cat which shows the user some usage-information.
>> Jody Bruchon used write() for this purpose, other tools are using
>> fprintf(stderr, ....).
>> I don't know which i should prefer. In my opinion i would rather use
>> fprintf.
>> STDERR should be the output for operations, right?
>
> Some of the programs already used write() and write() avoids dragging
> the code in for the printf family of functions (all ELKS programs are
> statically linked), so if there is NO usage of the printf family of
> functions in the program, it may make sense to use write() to minimize
> total program size. That being said, even a trivial usage message
> should mention the program being run in the usage, and it's not a good
> idea to assume that i.e. 'ls' is ALWAYS run as 'ls' since someone
> could symlink or alias it in the future. Because of this,
>
> fprintf(stderr, "usage: %s args\n", argv[0]);
>
> is a thousand times better than:
>
> write(STDERR_FILENO, "usage: ", 7);
> write(STDERR_FILENO, argv[0], strlen(argv[0]));
> write(STDERR_FILENO, "\n", 1);
>
> This results in three calls to write() and a call to strlen() when we
> could have just used one fprintf() to do all the work. In the future,
> I think many calls to write() will be eliminated, since even one call
> to a printf family function brings in the core code for all printf
> functions anyway.
>
>> Is this the right way to write code for this project? It's the first
>> time I'm doing this in here and i don't wan't to do things wrong...
>
> The cat patch should use fprintf(stderr, ...) since I've already
> brought in the printf function family with my error_read: code and
> write() is bad for the reasons outlined above.
>
> A few other comments:
>
> +#define NOT_OVERWRITING 0xABCD
> ...snip...
> + return NOT_OVERWRITING;
>
> If the user asks you to not overwrite (clobber), you should silently
> not clobber files and not consider it an error. If any program
> succeeds in doing what the user asked for, it should return success
> (0) with no messages. If you haven't read Eric S. Raymond's rules of
> the Unix philosophy, take a look at them; I find them to be very
> helpful guidelines when making program design decisions.
>
> + if (stat(srcname, &statbuf1) < 0) { // src nonexistent
>
> You can't write comments using // unless you're using a C99 compiler.
> bcc is a K&R C compiler that is not even guaranteed to be C89
> compliant, so no // comments. Use /* comment */ instead. Source
> comments are nice to have as long as they're not excessive or
> explaining something that is very obvious.
>
> +static int option_no_overwrite;
> +static int option_verbose;
> +static int option_interactive;
>
> Depending on how many options you add, you may want to use one global
> static int called 'flags' or similar, and use bit flags instead of a
> bunch of ints. From the program 'fdupes' in one of my GitHub repos,
> snipped up for brevity:
>
> #define ISFLAG(a,b) ((a & b) == b)
> #define SETFLAG(a,b) (a |= b)
> #define F_RECURSE 0x0001
> #define F_HIDEPROGRESS 0x0002
> etc. etc.
> ...
> switch (opt) {
> case 'q':
> SETFLAG(flags, F_HIDEPROGRESS);
> break;
> ...
> if (ISFLAG(flags, F_RECURSEAFTER)) <code here>...
> if (!ISFLAG(flags, F_HIDEPROGRESS)) <code here>...
>
> I don't necessarily recommend using this code, it's just an example
> and it's written to be usable with more than one variable that holds a
> set of flags. I'd rewrite it like this for ELKS programs to make it
> cleaner:
>
> static int flags;
> #define ISFLAG(a) ((flags & a) == a)
> #define SETFLAG(a) (flags |= a)
> #define F_QUIET 0x0001
> etc.
> ...
> if (strcmp(argv[i], "-q")) SETFLAG(F_QUIET);
> if (!ISFLAG(F_QUIET)) fprintf(stderr, "copied %d files\n", count);
>
> ---
> + if(option_verbose) {
> + printf("copied '%s' -> '%s'\n", srcname, destname);
> + }
>
> Firstly, the output from GNU 'cp' looks like this:
>
> $ cp -v foo/* bar/
> 'foo/1' -> 'bar/1'
> 'foo/2' -> 'bar/2'
> 'foo/3' -> 'bar/3'
>
> I'd also prefer one-liners such as this with no multi-line 'else'
> clause to be collapsed together (it makes the code easier to read and
> avoids 'brace hell'). I also prefer that if, for, and while statements
> have a space between the word and the (condition) part for the sake of
> code readability. Write it like this instead:
>
> if (option_verbose) printf("'%s' -> '%s'\n", srcname, destname);
>
> Next:
>
> + // set defaults for options //
> + option_no_overwrite = 0;
> + option_verbose = 0;
> + option_interactive = 0;
>
> The comment is unnecessary since it's obvious that "options are being
> initialized to zero" by the code itself. This is self-commenting code.
> I'd also prefer to see the variable names shortened a bit ('o_' or
> 'opt_' instead of 'option_' for example) and as mentioned previously,
> it's probably better to use the bit flags method mentioned previously
> if you're only using the variables as a yes/no flag.
>
> + arg_counter = 1; // start at argv[1]
> + options_counter = 0;
> + while(arg_counter < argc) {
> + if(!strcmp(argv[arg_counter], "-n")) {
> + options_counter++;
> + option_no_overwrite = 1;
> + }
> + if(!strcmp(argv[arg_counter], "-v")) {
> + options_counter++;
> + option_verbose = 1;
> + }
> + if(!strcmp(argv[arg_counter], "-i")) {
> + options_counter++;
> + option_interactive = 1;
> + }
> + if(!strcmp(argv[arg_counter], "-h")) goto usage;
> + arg_counter++;
> + }
>
> - if ((argc > 3) && !dirflag) {
> - fprintf(stderr, "%s: not a directory\n", lastarg);
> + if((argc > (3+options_counter)) && !dirflag) { // can't copy more
> than one src-files into one dst-file
>
> Ouch. Rather than try to explain the problems in detail, let me
> rewrite it.
>
> arg_cnt = 1; // start at argv[1]
> opt_cnt = 0;
> while(arg_cnt < argc) {
> if (!strcmp(argv[arg_cnt], "-n")) {
> opt_noclobber = 1;
> } else if (!strcmp(argv[arg_cnt], "-v")) {
> opt_verbose = 1;
> } else if (!strcmp(argv[arg_cnt], "-i")) {
> opt_interactive = 1;
> } else if (!strcmp(argv[arg_cnt], "-h")) goto usage;
> else break;
> opt_cnt++;
> arg_cnt++;
> }
>
> /* If multiple files specified, last one must be a directory */
> if ((argc > (opt_cnt + 3)) && !dirflag) {
>
> This rewrite combines all the option count increments into one
> instance which reduces code size and stops parsing options at the
> first non-option string. The comparison on the last line is more
> readable and it is clearer that it is "comparing total argument count
> to the number of options specified plus 3." The 'if-elseif-else' tree
> stops processing once a valid option is hit and "falls through" once a
> non-option is hit. It could (and probably should) be improved by
> handling the combined option case such as 'cp -iv src dest' but this
> is sufficient for now as long as the usage text implies that option
> flags must be specified separately.
>
> + copy_count = 0;
> + file_count = 0;
> + while (argc-- > (2+options_counter)) {
> + srcname = argv[options_counter+1+file_count];
>
> I've spent a lot of time cleaning these compressed mathematical
> expressions out of the elkscmd code; let's not introduce more of them
> (also shorten variable names while retaining their meaning, this is a
> good example for why to use shorter names):
>
> + copy_cnt = 0;
> + file_cnt = 0;
> + while (argc-- > (opt_cnt + 2)) {
> + srcname = argv[opt_cnt + 1 + file_cnt];
>
> Next,
>
> - if (dirflag) destname = buildname(destname, srcname);
> + if(dirflag) destname = buildname(destname, srcname);
>
> Don't do this, if no other reason than the fact that 'if' is not a
> function call. I understand that it comes down to individual coding
> style and the compiler doesn't care if you write
>
> if((something+(q/2))==argv[x+2+y*z[aaa]])usage();
>
> but it is hard to read and readability is very important. I believe
> that 'if', 'while', and 'for' shouldn't look like function calls in
> code since they are actually C control statements. Using a space makes
> those control statements immediately obvious, whereas if() and while()
> and for() look like function calls when skimming the code.
>
> - if (!copyfile(*++argv, destname, 0)) goto error_copy;
> + retval = copyfile(srcname, destname, 0);
> + if((!retval) && (!(retval == NOT_OVERWRITING))) goto error_copy;
>
> Remember the rule about silent success mentioned earlier? This change
> can be eliminated because of it. Even if it wasn't, though, this code
> could be simplified and made more readable in one shot:
>
> + retval = copyfile(srcname, destname, 0);
> + if (retval != 1) goto error_copy;
>
> I'd actually recommend reversing the return values in copyfile(),
> using nonzero as an error return instead (most C library calls seem to
> use this convention and consistency with that is rarely a bad thing)
> so you could just go:
>
> + if (retval) goto error_copy;
>
> Next,
>
> + if(option_verbose) printf("copy-count = %d\n", copy_count);
>
> This shouldn't be done. Here's GNU coreutils 'cp' in action:
>
> file_utils $ cp -v foo/* bar/
> 'foo/COPYING' -> 'bar/COPYING'
> 'foo/COPYING.sash' -> 'bar/COPYING.sash'
> 'foo/Makefile' -> 'bar/Makefile'
> 'foo/README' -> 'bar/README'
> 'foo/WARRANTY' -> 'bar/WARRANTY'
> ...snip...
>
> If someone is interested in knowing a "copy-count" for some reason,
> they'll almost certainly do this instead:
>
> file_utils $ cp -v foo/* bar/ | wc -l
> 46
>
> Your helpful message will throw this off and make it not work as
> expected. A shell script author will have to introduce an incompatible
> kludge to their script to make up for 'cp' misbehaving, i.e.
>
> CNT=$(cp -v $A/* $B/ | wc -l)
> CNT=$((CNT - 1)) # This shouldn't be necessary...
>
> Another problem is that you've declared a static global 'copy_count'
> variable and you're introducing a local 'copy_count' variable as well.
> It needs to be called something else for the sake of avoiding
> programmer confusion. However, assuming you eliminate the "copy-count"
> message I just talked about along with its local variable counterpart,
> the global 'static unsigned int copy_count' is set (copy_count++) but
> not used throughout the entire program and can be discarded.
>
> If you ever want an "official" reference for how a particular elkscmd
> program "should" behave, take a look at the POSIX standards:
>
> http://pubs.opengroup.org/onlinepubs/9699919799/
>
> Whew, that took me a long time to write. I hope it's helpful to you
> and everyone else on the list. Thanks for your impressive work so far!
> I'm looking forward to your finalized [PATCH] submissions. :-)
>
> -Jody
> --
> To unsubscribe from this list: send the line "unsubscribe linux-8086" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2015-03-27 17:59 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-27 8:19 Cleaning up elkscmd and adding help text - Questions Nils Stec
2015-03-27 13:19 ` Jody Bruchon
2015-03-27 14:10 ` Alan Cox
2015-03-27 16:17 ` Jody Bruchon
2015-03-27 18:19 ` MFLD
2015-03-27 18:26 ` Jody Bruchon
2015-03-27 18:52 ` MFLD
2015-03-27 22:19 ` Alan Cox
2015-03-28 15:35 ` LM
2015-03-27 18:41 ` Alan Cox
2015-03-27 17:59 ` Nils Stec [this message]
2015-03-27 18:31 ` Alan Cox
2015-03-27 13:54 ` Alan Cox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55159A80.4040408@gmail.com \
--to=nils.stec@gmail.com \
--cc=linux-8086@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox