From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nils Stec Subject: Re: Cleaning up elkscmd and adding help text - Questions Date: Fri, 27 Mar 2015 18:59:28 +0100 Message-ID: <55159A80.4040408@gmail.com> References: <551512AC.5050704@gmail.com> <551558D7.8040000@jodybruchon.com> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Return-path: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type:content-transfer-encoding; bh=/3AEYf4uni68cwZPRs614d8J7xChUik1fRepJUrbR6o=; b=oiyoxt4JEnSTGleh6JYjR+JBrsFVSTacCUbFDKbDC5bzATgmoH25GtpUxLiBVEaKRk Zmtvovw+kVWO47uNFWHXs4UYTIRp8UIhe8J2Xcd0DjIO1dRdUow6yTfziE0t2hwv7/5D g9VNEtOFGF8AXLJK4ai9UL0fX70GjEegN1lCcP0rnqw6TT5NrlfF799w7CyezXyrhskX 45u4CvbPhTHlwqtdlPMWtfPFqAF/r1UGzawnMR/wfDb7yKPKnCb2nZIqd9EppTzkuoC2 a8qY7Q8cGM09jYpICltUJFHwQb90hj9WB64t3SZulv9Zb4T2PGgOi2bChFYcpUPzl3hd 8ekQ== In-Reply-To: <551558D7.8040000@jodybruchon.com> Sender: linux-8086-owner@vger.kernel.org List-ID: Content-Type: text/plain; charset="us-ascii"; format="flowed" To: ELKS Wow i really belive that writing this took some time... Thank you, this gave me a lot of hints, do's and don'ts. I read The Art of Unix Programming, but you showed me that i forgot a lot ;) I think I'm going to rewrite my patches with all that things in mind, will send them soon. I hope my submissions will be ready very soon. Nils On 03/27/2015 02:19 PM, Jody Bruchon wrote: > On 3/27/2015 4:19 AM, Nils Stec wrote: >> I wrote a patch for cat which shows the user some usage-information. >> Jody Bruchon used write() for this purpose, other tools are using >> fprintf(stderr, ....). >> I don't know which i should prefer. In my opinion i would rather use >> fprintf. >> STDERR should be the output for operations, right? > > Some of the programs already used write() and write() avoids dragging > the code in for the printf family of functions (all ELKS programs are > statically linked), so if there is NO usage of the printf family of > functions in the program, it may make sense to use write() to minimize > total program size. That being said, even a trivial usage message > should mention the program being run in the usage, and it's not a good > idea to assume that i.e. 'ls' is ALWAYS run as 'ls' since someone > could symlink or alias it in the future. Because of this, > > fprintf(stderr, "usage: %s args\n", argv[0]); > > is a thousand times better than: > > write(STDERR_FILENO, "usage: ", 7); > write(STDERR_FILENO, argv[0], strlen(argv[0])); > write(STDERR_FILENO, "\n", 1); > > This results in three calls to write() and a call to strlen() when we > could have just used one fprintf() to do all the work. In the future, > I think many calls to write() will be eliminated, since even one call > to a printf family function brings in the core code for all printf > functions anyway. > >> Is this the right way to write code for this project? It's the first >> time I'm doing this in here and i don't wan't to do things wrong... > > The cat patch should use fprintf(stderr, ...) since I've already > brought in the printf function family with my error_read: code and > write() is bad for the reasons outlined above. > > A few other comments: > > +#define NOT_OVERWRITING 0xABCD > ...snip... > + return NOT_OVERWRITING; > > If the user asks you to not overwrite (clobber), you should silently > not clobber files and not consider it an error. If any program > succeeds in doing what the user asked for, it should return success > (0) with no messages. If you haven't read Eric S. Raymond's rules of > the Unix philosophy, take a look at them; I find them to be very > helpful guidelines when making program design decisions. > > + if (stat(srcname, &statbuf1) < 0) { // src nonexistent > > You can't write comments using // unless you're using a C99 compiler. > bcc is a K&R C compiler that is not even guaranteed to be C89 > compliant, so no // comments. Use /* comment */ instead. Source > comments are nice to have as long as they're not excessive or > explaining something that is very obvious. > > +static int option_no_overwrite; > +static int option_verbose; > +static int option_interactive; > > Depending on how many options you add, you may want to use one global > static int called 'flags' or similar, and use bit flags instead of a > bunch of ints. From the program 'fdupes' in one of my GitHub repos, > snipped up for brevity: > > #define ISFLAG(a,b) ((a & b) == b) > #define SETFLAG(a,b) (a |= b) > #define F_RECURSE 0x0001 > #define F_HIDEPROGRESS 0x0002 > etc. etc. > ... > switch (opt) { > case 'q': > SETFLAG(flags, F_HIDEPROGRESS); > break; > ... > if (ISFLAG(flags, F_RECURSEAFTER)) ... > if (!ISFLAG(flags, F_HIDEPROGRESS)) ... > > I don't necessarily recommend using this code, it's just an example > and it's written to be usable with more than one variable that holds a > set of flags. I'd rewrite it like this for ELKS programs to make it > cleaner: > > static int flags; > #define ISFLAG(a) ((flags & a) == a) > #define SETFLAG(a) (flags |= a) > #define F_QUIET 0x0001 > etc. > ... > if (strcmp(argv[i], "-q")) SETFLAG(F_QUIET); > if (!ISFLAG(F_QUIET)) fprintf(stderr, "copied %d files\n", count); > > --- > + if(option_verbose) { > + printf("copied '%s' -> '%s'\n", srcname, destname); > + } > > Firstly, the output from GNU 'cp' looks like this: > > $ cp -v foo/* bar/ > 'foo/1' -> 'bar/1' > 'foo/2' -> 'bar/2' > 'foo/3' -> 'bar/3' > > I'd also prefer one-liners such as this with no multi-line 'else' > clause to be collapsed together (it makes the code easier to read and > avoids 'brace hell'). I also prefer that if, for, and while statements > have a space between the word and the (condition) part for the sake of > code readability. Write it like this instead: > > if (option_verbose) printf("'%s' -> '%s'\n", srcname, destname); > > Next: > > + // set defaults for options // > + option_no_overwrite = 0; > + option_verbose = 0; > + option_interactive = 0; > > The comment is unnecessary since it's obvious that "options are being > initialized to zero" by the code itself. This is self-commenting code. > I'd also prefer to see the variable names shortened a bit ('o_' or > 'opt_' instead of 'option_' for example) and as mentioned previously, > it's probably better to use the bit flags method mentioned previously > if you're only using the variables as a yes/no flag. > > + arg_counter = 1; // start at argv[1] > + options_counter = 0; > + while(arg_counter < argc) { > + if(!strcmp(argv[arg_counter], "-n")) { > + options_counter++; > + option_no_overwrite = 1; > + } > + if(!strcmp(argv[arg_counter], "-v")) { > + options_counter++; > + option_verbose = 1; > + } > + if(!strcmp(argv[arg_counter], "-i")) { > + options_counter++; > + option_interactive = 1; > + } > + if(!strcmp(argv[arg_counter], "-h")) goto usage; > + arg_counter++; > + } > > - if ((argc > 3) && !dirflag) { > - fprintf(stderr, "%s: not a directory\n", lastarg); > + if((argc > (3+options_counter)) && !dirflag) { // can't copy more > than one src-files into one dst-file > > Ouch. Rather than try to explain the problems in detail, let me > rewrite it. > > arg_cnt = 1; // start at argv[1] > opt_cnt = 0; > while(arg_cnt < argc) { > if (!strcmp(argv[arg_cnt], "-n")) { > opt_noclobber = 1; > } else if (!strcmp(argv[arg_cnt], "-v")) { > opt_verbose = 1; > } else if (!strcmp(argv[arg_cnt], "-i")) { > opt_interactive = 1; > } else if (!strcmp(argv[arg_cnt], "-h")) goto usage; > else break; > opt_cnt++; > arg_cnt++; > } > > /* If multiple files specified, last one must be a directory */ > if ((argc > (opt_cnt + 3)) && !dirflag) { > > This rewrite combines all the option count increments into one > instance which reduces code size and stops parsing options at the > first non-option string. The comparison on the last line is more > readable and it is clearer that it is "comparing total argument count > to the number of options specified plus 3." The 'if-elseif-else' tree > stops processing once a valid option is hit and "falls through" once a > non-option is hit. It could (and probably should) be improved by > handling the combined option case such as 'cp -iv src dest' but this > is sufficient for now as long as the usage text implies that option > flags must be specified separately. > > + copy_count = 0; > + file_count = 0; > + while (argc-- > (2+options_counter)) { > + srcname = argv[options_counter+1+file_count]; > > I've spent a lot of time cleaning these compressed mathematical > expressions out of the elkscmd code; let's not introduce more of them > (also shorten variable names while retaining their meaning, this is a > good example for why to use shorter names): > > + copy_cnt = 0; > + file_cnt = 0; > + while (argc-- > (opt_cnt + 2)) { > + srcname = argv[opt_cnt + 1 + file_cnt]; > > Next, > > - if (dirflag) destname = buildname(destname, srcname); > + if(dirflag) destname = buildname(destname, srcname); > > Don't do this, if no other reason than the fact that 'if' is not a > function call. I understand that it comes down to individual coding > style and the compiler doesn't care if you write > > if((something+(q/2))==argv[x+2+y*z[aaa]])usage(); > > but it is hard to read and readability is very important. I believe > that 'if', 'while', and 'for' shouldn't look like function calls in > code since they are actually C control statements. Using a space makes > those control statements immediately obvious, whereas if() and while() > and for() look like function calls when skimming the code. > > - if (!copyfile(*++argv, destname, 0)) goto error_copy; > + retval = copyfile(srcname, destname, 0); > + if((!retval) && (!(retval == NOT_OVERWRITING))) goto error_copy; > > Remember the rule about silent success mentioned earlier? This change > can be eliminated because of it. Even if it wasn't, though, this code > could be simplified and made more readable in one shot: > > + retval = copyfile(srcname, destname, 0); > + if (retval != 1) goto error_copy; > > I'd actually recommend reversing the return values in copyfile(), > using nonzero as an error return instead (most C library calls seem to > use this convention and consistency with that is rarely a bad thing) > so you could just go: > > + if (retval) goto error_copy; > > Next, > > + if(option_verbose) printf("copy-count = %d\n", copy_count); > > This shouldn't be done. Here's GNU coreutils 'cp' in action: > > file_utils $ cp -v foo/* bar/ > 'foo/COPYING' -> 'bar/COPYING' > 'foo/COPYING.sash' -> 'bar/COPYING.sash' > 'foo/Makefile' -> 'bar/Makefile' > 'foo/README' -> 'bar/README' > 'foo/WARRANTY' -> 'bar/WARRANTY' > ...snip... > > If someone is interested in knowing a "copy-count" for some reason, > they'll almost certainly do this instead: > > file_utils $ cp -v foo/* bar/ | wc -l > 46 > > Your helpful message will throw this off and make it not work as > expected. A shell script author will have to introduce an incompatible > kludge to their script to make up for 'cp' misbehaving, i.e. > > CNT=$(cp -v $A/* $B/ | wc -l) > CNT=$((CNT - 1)) # This shouldn't be necessary... > > Another problem is that you've declared a static global 'copy_count' > variable and you're introducing a local 'copy_count' variable as well. > It needs to be called something else for the sake of avoiding > programmer confusion. However, assuming you eliminate the "copy-count" > message I just talked about along with its local variable counterpart, > the global 'static unsigned int copy_count' is set (copy_count++) but > not used throughout the entire program and can be discarded. > > If you ever want an "official" reference for how a particular elkscmd > program "should" behave, take a look at the POSIX standards: > > http://pubs.opengroup.org/onlinepubs/9699919799/ > > Whew, that took me a long time to write. I hope it's helpful to you > and everyone else on the list. Thanks for your impressive work so far! > I'm looking forward to your finalized [PATCH] submissions. :-) > > -Jody > -- > To unsubscribe from this list: send the line "unsubscribe linux-8086" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html