util-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Robin Jarry" <robin@jarry.cc>
To: "Thomas Weißschuh" <thomas@t-8ch.de>
Cc: <util-linux@vger.kernel.org>
Subject: Re: [RFC PATCH util-linux] text-utils: add bits command
Date: Wed, 16 Oct 2024 23:11:22 +0200	[thread overview]
Message-ID: <D4XJFK5WGV2R.244DMWY2H7ABM@ringo> (raw)
In-Reply-To: <0792f071-786b-4cb2-9733-2b10c98c20f2@t-8ch.de>

Hi Thomas,

Thomas Weißschuh, Oct 16, 2024 at 23:00:
> On 2024-10-16 22:26:22+0200, Robin Jarry wrote:
>> Add a new test utility to convert between bit masks in various formats.
>
> "test" or "text" utility?

That was a typo, I meant "text" :)

>> This can be handy to avoid parsing affinity masks in one's head and/or
>> to interact with the kernel in a more human friendly way.
>> 
>> This is a rewrite in C of the bits command from my linux-tools python
>> package so that it can be more widely available.
>> 
>> Here is an example:
>> 
>>  ~# cat /sys/kernel/debug/tracing/tracing_cpumask
>>  fffffff,ffffffff,ffffffff,ffffffff
>>  ~# bits -l ,$(cat /sys/kernel/debug/tracing/tracing_cpumask)
>>  0-128
>>  ~# bits -g 58,59,120,123
>>  9000000,00000000,0c000000,00000000
>>  ~# bits -g 58,59 > /sys/kernel/debug/tracing/tracing_cpumask
>>  ~# echo 1 > /sys/kernel/debug/tracing/tracing_on
>> 
>> Add man page and basic tests.
>> 
>> Link: https://git.sr.ht/~rjarry/linux-tools#bits
>> Signed-off-by: Robin Jarry <robin@jarry.cc>
>> ---
>>  bash-completion/bits                  |  21 ++
>>  meson.build                           |  11 ++
>>  tests/commands.sh                     |   1 +
>>  tests/expected/misc/bits              |   0
>>  tests/expected/misc/bits-and          |   1 +
>>  tests/expected/misc/bits-binary       |   1 +
>>  tests/expected/misc/bits-default      |   1 +
>>  tests/expected/misc/bits-grouped-mask |   1 +
>>  tests/expected/misc/bits-list         |   1 +
>>  tests/expected/misc/bits-mask         |   1 +
>>  tests/expected/misc/bits-not          |   1 +
>>  tests/expected/misc/bits-or           |   1 +
>>  tests/expected/misc/bits-overflow     |   1 +
>>  tests/expected/misc/bits-parse-mask   |   1 +
>>  tests/expected/misc/bits-parse-range  |   1 +
>>  tests/expected/misc/bits-stdin        |   1 +
>>  tests/expected/misc/bits-xor          |   1 +
>>  tests/ts/misc/bits                    |  82 ++++++++
>>  text-utils/Makemodule.am              |   6 +
>>  text-utils/bits.1.adoc                | 147 ++++++++++++++
>>  text-utils/bits.c                     | 265 ++++++++++++++++++++++++++
>>  text-utils/meson.build                |   4 +
>>  22 files changed, 550 insertions(+)
>>  create mode 100644 bash-completion/bits
>>  create mode 100644 tests/expected/misc/bits
>>  create mode 100644 tests/expected/misc/bits-and
>>  create mode 100644 tests/expected/misc/bits-binary
>>  create mode 100644 tests/expected/misc/bits-default
>>  create mode 100644 tests/expected/misc/bits-grouped-mask
>>  create mode 100644 tests/expected/misc/bits-list
>>  create mode 100644 tests/expected/misc/bits-mask
>>  create mode 100644 tests/expected/misc/bits-not
>>  create mode 100644 tests/expected/misc/bits-or
>>  create mode 100644 tests/expected/misc/bits-overflow
>>  create mode 100644 tests/expected/misc/bits-parse-mask
>>  create mode 100644 tests/expected/misc/bits-parse-range
>>  create mode 100644 tests/expected/misc/bits-stdin
>>  create mode 100644 tests/expected/misc/bits-xor
>>  create mode 100755 tests/ts/misc/bits
>>  create mode 100644 text-utils/bits.1.adoc
>>  create mode 100644 text-utils/bits.c
>
> ...
>
>> +++ b/tests/ts/misc/bits
>> @@ -0,0 +1,82 @@
>> +#!/bin/bash
>> +
>> +#
>> +# Copyright (c) 2024 Robin Jarry
>> +#
>> +# This file is part of util-linux.
>> +#
>> +# This file is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 2 of the License, or
>> +# (at your option) any later version.
>> +#
>> +# This file is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>
> New files should use SPDX tags.

Ack, will change in v2.

>> diff --git a/text-utils/bits.c b/text-utils/bits.c
>> new file mode 100644
>> index 000000000000..6dd0db81a5de
>> --- /dev/null
>> +++ b/text-utils/bits.c
>> @@ -0,0 +1,265 @@
>> +/*
>> + * SPDX-License-Identifier: GPL-2.0-or-later
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * Copyright (c) 2024 Robin Jarry
>> + *
>> + * bits - convert bit masks from/to various formats
>> + */
>> +
>> +#include <errno.h>
>> +#include <getopt.h>
>> +#include <sched.h>
>> +#include <stdbool.h>
>> +#include <stdint.h>
>> +#include <stdio.h>
>> +#include <string.h>
>> +#include <unistd.h>
>> +
>> +#include "c.h"
>> +#include "closestream.h"
>> +#include "cpuset.h"
>> +#include "nls.h"
>> +#include "strutils.h"
>> +#include "strv.h"
>> +
>> +typedef enum {
>> +	OP_OR,
>> +	OP_AND,
>> +	OP_NOT,
>> +	OP_XOR,
>> +} bitvise_op_t;
>
> "bitwise"?
> Also avoid using the _t suffix as it's reserved.
> "enum bitwise_op" seems clear enough.

Yes, bitwise is the correct spelling. I will probably remove that in v2 
as you suggested below.

>> +
>> +static int parse_mask_or_list(const char *cmdline_arg, cpu_set_t *all_bits)
>> +{
>> +	bitvise_op_t op = OP_OR;
>> +	cpu_set_t bits, copy;
>
> This seems very geared towards cpu masks.
> What happens if a user wants to use it with very large non-CPU bitmasks,
> as the tool is advertised as a generic utility?
>
> Call the utility "cpumask"?

The default cpu_set_t size can hold up to 1024 bits (128 bytes). 
I assumed it was more than enough for common use cases. I added a basic 
test that checks that the program does not crash when specifying a too 
large mask.

I also didn't want to call that tool "*mask" because it can also deal 
with lists. But I don't have a strong opinion on the matter.

>
>> +	char buf[BUFSIZ];
>
> Only use BUFSIZ in cases where the actual value does not matter,
> for example loops.
> On musl BUFSIZ == 1024 and this could not be enough fairly soon.

Good point. I wasn't sure what is a reasonable maximum argument size. 
execve(2) says MAX_ARG_STRLEN (32 pages) which seems excessive. I could 
hardcode 8192 and be done with it. What do you think?

>
>> +	char *arg = buf;
>> +
>> +	/* copy to allow modifying the argument contents */
>> +	strlcpy(buf, cmdline_arg, sizeof(buf));
>> +	buf[sizeof(buf) - 1] = '\0';
>
> You can just use xstrdup().

But then, I'd have to free it. I wanted to avoid this.

>
>> +
>> +	/* strip optional operator first */
>> +	if (startswith(arg, "&")) {
>> +		op = OP_AND;
>> +		arg++;
>> +	} else if (startswith(arg, "^")) {
>> +		op = OP_XOR;
>> +		arg++;
>> +	} else if (startswith(arg, "~")) {
>> +		op = OP_NOT;
>> +		arg++;
>> +	} else if (startswith(arg, "|")) {
>> +		op = OP_OR;
>> +		arg++;
>> +	}
>
> The whole op enum could also be replaced by the actual character,
> this is all in a single function.
>
> op = '&';
>
> ...
>
> switch (op) {
> case '&':
> }

Good point. I'll do that in v2.

>
>> +
>> +	if (startswith(arg, ",") || startswith(arg, "0x")) {
>> +		if (cpumask_parse(arg, &bits, sizeof(bits)) < 0)
>> +			return -EINVAL;
>
> As this is a commandline tool you can call errx() here to kill the
> process and provide better error information.
> It would also make the call sequence slightly more simple.

Ack.

>
>> +	} else {
>> +		if (cpulist_parse(arg, &bits, sizeof(bits), 1) < 0)
>> +			return -EINVAL;
>> +	}
>> +
>> +	switch (op) {
>> +	case OP_AND:
>> +		copy = *all_bits;
>> +		CPU_AND(all_bits, &copy, &bits);
>> +		break;
>> +	case OP_OR:
>> +		copy = *all_bits;
>> +		CPU_OR(all_bits, &copy, &bits);
>> +		break;
>> +	case OP_XOR:
>> +		copy = *all_bits;
>> +		CPU_XOR(all_bits, &copy, &bits);
>> +		break;
>> +	case OP_NOT:
>> +		for (int i = 0; i < CPU_SETSIZE; i++) {
>> +			if (CPU_ISSET(i, &bits))
>> +				CPU_CLR(i, all_bits);
>> +		}
>> +		break;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +typedef enum {
>> +	MODE_BINARY,
>> +	MODE_GROUPED_MASK,
>> +	MODE_LIST,
>> +	MODE_MASK,
>> +} output_mode_t;
>> +
>> +static void print_bits(cpu_set_t *bits, output_mode_t mode)
>> +{
>> +	bool started = false;
>> +	char buf[BUFSIZ];
>> +	ssize_t n = 0;
>> +
>> +	buf[0] = '\0';
>
> Is this necessary?

It may have been at some point but it isn't anymore. I'll remove it.

>
>> +
>> +	switch (mode) {
>> +	case MODE_MASK:
>> +		cpumask_create(buf, sizeof(buf), bits, sizeof(*bits));
>> +
>> +		/* strip leading zeroes */
>> +		while (buf[n] == '0')
>> +			n++;
>> +		if (buf[n] == '\0')
>> +			printf("0x0\n");
>> +		else
>> +			printf("0x%s\n", buf + n);
>> +		break;
>> +
>> +	case MODE_GROUPED_MASK:
>> +		cpumask_create(buf, sizeof(buf), bits, sizeof(*bits));
>> +
>> +		/* strip leading zeroes */
>> +		while (buf[n] == '0')
>> +			n++;
>> +
>> +		while (buf[n] != '\0') {
>> +			if (started && (n % 8) == 0)
>> +				printf(",");
>> +			if (buf[n] != '0')
>> +				started = true;
>> +			printf("%c", buf[n]);
>> +			n++;
>> +		}
>> +		printf("\n");
>> +		break;
>> +
>> +	case MODE_BINARY:
>> +		printf("0b");
>> +		for (n = CPU_SETSIZE; n >= 0; n--) {
>> +			if (started && ((n + 1) % 4) == 0)
>> +				printf("_");
>> +			if (CPU_ISSET(n, bits)) {
>> +				started = true;
>> +				printf("1");
>> +			} else if (started) {
>> +				printf("0");
>> +			}
>> +		}
>> +		printf("\n");
>> +		break;
>> +
>> +	case MODE_LIST:
>> +		cpulist_create(buf, sizeof(buf), bits, sizeof(*bits));
>
> This could use the return values of cpulist_create() and
> cpumask_create().

Since I am printing into a stack buffer, is there a benefit?

>> +		printf("%s\n", buf);
>> +		break;
>> +	}
>> +
>> +}
>> +
>> +static void __attribute__((__noreturn__)) usage(void)
>> +{
>> +	fputs(USAGE_HEADER, stdout);
>> +	fprintf(stdout, _(" %s [options] [<mask_or_list>...]\n"), program_invocation_short_name);
>> +
>> +	fputs(USAGE_SEPARATOR, stdout);
>> +	fputsln(_("Convert bit masks from/to various formats."), stdout);
>> +
>> +	fputs(USAGE_ARGUMENTS, stdout);
>> +	fputsln(_(" <mask_or_list>      A set of bits specified as a hex mask value (e.g. 0xeec2)\n"
>> +		"                     or as a comma-separated list of bit IDs.\n"
>> +		"                     If not specified, arguments will be read from stdin."), stdout);
>> +
>> +	fputs(USAGE_OPTIONS, stdout);
>> +	fprintf(stdout, USAGE_HELP_OPTIONS(21));
>> +
>> +	fputs(_("\nMode:\n"), stdout);
>> +	fputsln(_(" -m, --mask          Print the combined args as a hex mask value (default).\n"), stdout);
>> +	fputsln(_(" -g, --grouped-mask  Print the combined args as a hex mask value in 32bit\n"
>> +			"                     comma separated groups."), stdout);
>> +	fputsln(_(" -b, --binary        Print the combined args as a binary mask value."), stdout);
>> +	fputsln(_(" -l, --list          Print the combined args as a compressed list of bit IDs."), stdout);
>> +
>> +	fputs(USAGE_SEPARATOR, stdout);
>> +	fprintf(stdout, USAGE_MAN_TAIL("bits(1)"));
>> +	exit(EXIT_SUCCESS);
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +	output_mode_t mode = MODE_MASK;
>> +	char **stdin_lines = NULL;
>> +	cpu_set_t bits;
>> +	int c;
>> +
>> +	static const struct option longopts[] = {
>> +		{ "version",      no_argument, NULL, 'V'},
>> +		{ "help",         no_argument, NULL, 'h'},
>> +		{ "mask",         no_argument, NULL, 'm'},
>> +		{ "grouped-mask", no_argument, NULL, 'g'},
>> +		{ "binary",       no_argument, NULL, 'b'},
>> +		{ "list",         no_argument, NULL, 'l'},
>> +		{ NULL, 0, NULL, 0 }
>> +	};
>> +
>> +	setlocale(LC_ALL, "");
>> +	bindtextdomain(PACKAGE, LOCALEDIR);
>> +	textdomain(PACKAGE);
>> +	close_stdout_atexit();
>> +
>> +	while ((c = getopt_long(argc, argv, "Vhmgbl", longopts, NULL)) != -1)
>> +		switch (c) {
>> +		case 'm':
>> +			mode = MODE_MASK;
>> +			break;
>> +		case 'g':
>> +			mode = MODE_GROUPED_MASK;
>> +			break;
>> +		case 'b':
>> +			mode = MODE_BINARY;
>> +			break;
>> +		case 'l':
>> +			mode = MODE_LIST;
>> +			break;
>> +		case 'V':
>> +			print_version(EXIT_SUCCESS);
>> +		case 'h':
>> +			usage();
>> +		default:
>> +			errtryhelp(EXIT_FAILURE);
>> +		}
>> +
>> +	argc -= optind;
>> +	argv += optind;
>> +	if (argc == 0) {
>> +		/* no arguments provided, read lines from stdin */
>> +		char buf[BUFSIZ];
>> +
>> +		while (fgets(buf, sizeof(buf), stdin)) {
>> +			/* strip LF, CR, CRLF, LFCR */
>> +			buf[strcspn(buf, "\r\n")] = '\0';
>
> rtrim_whitespace()?

Ack.

>
>> +			strv_push(&stdin_lines, strdup(buf));
>> +		}
>> +
>> +		argc = strv_length(stdin_lines);
>> +		argv = stdin_lines;
>> +	}
>> +
>> +	CPU_ZERO(&bits);
>> +
>> +	for (; argc > 0; argc--, argv++)
>> +		if (parse_mask_or_list(*argv, &bits) < 0) {
>> +			fprintf(stderr, _("error: invalid argument: %s\n"), *argv);
>> +			errtryhelp(EXIT_FAILURE);
>> +		}
>> +
>> +	strv_free(stdin_lines);
>
> No need to free this here. The process will exit anyways.

I wanted to make ASAN happy. Is this not necessary?

>
>> +
>> +	print_bits(&bits, mode);
>> +
>> +	return EXIT_SUCCESS;
>> +}

Thanks for the review!


  reply	other threads:[~2024-10-16 21:11 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-16 20:26 [RFC PATCH util-linux] text-utils: add bits command Robin Jarry
2024-10-16 21:00 ` Thomas Weißschuh
2024-10-16 21:11   ` Robin Jarry [this message]
2024-10-18  6:53     ` Thomas Weißschuh 
2024-10-16 21:57 ` [PATCH util-linux v2] " Robin Jarry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D4XJFK5WGV2R.244DMWY2H7ABM@ringo \
    --to=robin@jarry.cc \
    --cc=thomas@t-8ch.de \
    --cc=util-linux@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).