BPF List
From: Eduard Zingerman <eddyz87@gmail.com>
To: bpf@vger.kernel.org, ast@kernel.org
Cc: andrii@kernel.org, daniel@iogearbox.net, kernel-team@fb.com,
	yhs@fb.com, arnaldo.melo@gmail.com,
	Eduard Zingerman <eddyz87@gmail.com>
Subject: [RFC bpf-next 08/12] kbuild: Script to infer header guard values for uapi headers
Date: Wed, 26 Oct 2022 01:27:57 +0300
Message-ID: <20221025222802.2295103-9-eddyz87@gmail.com>
In-Reply-To: <20221025222802.2295103-1-eddyz87@gmail.com>

The script infers header guard defines in headers from
include/uapi/**/*.h. E.g. the header guard for
`include/uapi/linux/tcp.h` is `_UAPI_LINUX_TCP_H`:

    include/uapi/linux/tcp.h:

      #ifndef _UAPI_LINUX_TCP_H
      #define _UAPI_LINUX_TCP_H
      ...
      union tcp_word_hdr {
            struct tcphdr hdr;
            __be32        words[5];
      };
      ...
      #endif /* _UAPI_LINUX_TCP_H */

The output of the script could be used as an input to pahole's
`--header_guards_db` parameter. This information is necessary to
repeat the same header guards in the `vmlinux.h` generated from BTF.
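
As an illustration of the output record, here is a minimal sketch. It
assumes the `<header-file> <header-guard>` format and the `_UAPI`
stripping described in the script itself; `format_record` is a
hypothetical helper name, not part of the patch:

```perl
use strict;
use warnings;

# Hypothetical helper (not in the patch): build one output line.
sub format_record {
    my ($header, $guard) = @_;
    # Drop the first "_UAPI" occurrence, as the script does before printing.
    $guard =~ s/_UAPI//;
    return "$header $guard";
}

print format_record("include/uapi/linux/tcp.h", "_UAPI_LINUX_TCP_H"), "\n";
# prints: include/uapi/linux/tcp.h _LINUX_TCP_H
```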

It is not possible to infer the guard names from header file names
alone; the file content has to be analyzed. The following heuristic is
used to infer the guard for a specific file. The regexes are tried in
order and matched case-insensitively:
- All pairs `#ifndef <candidate>` / `#define <candidate>` are collected;
- If a unique candidate matches regex `${headername}.*_H(EADER)?`, it
  is selected;
- Otherwise, if a unique candidate matches regex `_H(EADER)?_`, it is
  selected;
- Otherwise, if a unique candidate matches regex `_H(EADER)?$`, it is
  selected.
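
The heuristic above can be sketched on an in-memory header as follows.
This is a simplified illustration, not the code from the patch:
`guard_candidates` and `pick_guard` are hypothetical names, the sample
header text (including `TCP_EXAMPLE_OPTION`) is made up, and the sketch
quotes the names with `\Q...\E` and adds `\b`, which is slightly
stricter than the script:

```perl
use strict;
use warnings;

# Collect names that appear as `#ifndef NAME` followed by `#define NAME`.
sub guard_candidates {
    my ($text) = @_;
    my (@candidates, $pending);
    foreach my $line (split /\n/, $text) {
        if ($line =~ /^#ifndef[ \t]+(\w+)/) {
            $pending = $1;
        } elsif (defined $pending && $line =~ /^#define[ \t]+\Q$pending\E\b/) {
            push @candidates, $pending;
            $pending = undef;
        }
    }
    return @candidates;
}

# Try the three patterns in order; a pattern wins only if exactly one
# candidate matches it.
sub pick_guard {
    my ($basename, @candidates) = @_;
    my @patterns = (qr/\Q$basename\E.*_H(EADER)?/i,
                    qr/_H(EADER)?_/i,
                    qr/_H(EADER)?$/i);
    foreach my $re (@patterns) {
        my @hits = grep { $_ =~ $re } @candidates;
        return $hits[0] if @hits == 1;
    }
    return undef;
}

my $text = <<'EOT';
#ifndef _UAPI_LINUX_TCP_H
#define _UAPI_LINUX_TCP_H
#ifndef TCP_EXAMPLE_OPTION
#define TCP_EXAMPLE_OPTION 1
#endif
#endif /* _UAPI_LINUX_TCP_H */
EOT

my @candidates = guard_candidates($text);
print join(", ", @candidates), "\n";        # both #ifndef/#define pairs
print pick_guard("tcp", @candidates), "\n"; # _UAPI_LINUX_TCP_H
```

Both pairs are collected as candidates, but only `_UAPI_LINUX_TCP_H`
matches the first pattern, so it is selected without consulting the
remaining two.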

There is also a small list of headers that can't be caught by the
rules above, 16 in total: 7 with an explicit guard value and 9 that
are ignored. These headers and the corresponding guard values are
listed in the `%OVERRIDES` hash table.
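
A minimal sketch of how such a table can be consulted, assuming that a
key mapped to `undef` means "ignore this header" and that lookups use
the full header path. `lookup_override` is a hypothetical helper; the
two entries are copied from the patch:

```perl
use strict;
use warnings;

my %OVERRIDES = (
    "include/uapi/linux/hpet.h"  => "_UAPI__HPET__",  # explicit guard
    "include/uapi/linux/irqnr.h" => undef,            # header is ignored
);

# Hypothetical helper: returns (handled, guard).
# `exists` distinguishes "listed with undef" from "not listed at all".
sub lookup_override {
    my ($header) = @_;
    return (0, undef) unless exists $OVERRIDES{$header};
    return (1, $OVERRIDES{$header});
}

foreach my $header ("include/uapi/linux/hpet.h",
                    "include/uapi/linux/irqnr.h",
                    "include/uapi/linux/tcp.h") {
    my ($handled, $guard) = lookup_override($header);
    printf "%-28s handled=%d guard=%s\n",
           $header, $handled, $guard // "<none>";
}
```

Only headers that are not handled by the table would go through the
regex-based inference.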

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 scripts/infer_header_guards.pl | 191 +++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)
 create mode 100755 scripts/infer_header_guards.pl

diff --git a/scripts/infer_header_guards.pl b/scripts/infer_header_guards.pl
new file mode 100755
index 000000000000..201008ac83f3
--- /dev/null
+++ b/scripts/infer_header_guards.pl
@@ -0,0 +1,191 @@
+#!/usr/bin/env perl
+# SPDX-License-Identifier: GPL-2.0
+
+# This script scans the passed directory for header files (files ending with ".h").
+# For each header file it tries to infer the name of a C pre-processor
+# variable used as a double include guard (dubbed as "header guard").
+# For example:
+#
+#   #ifndef __MY_HEADER__  // <-- "header guard"
+#     ...
+#   #endif
+#
+# The inferred guards are printed to stdout in the following format:
+#
+#   <header-file> <header-guard>
+#
+# This is an expected format for pahole --header_guards_db parameter.
+# Intended usage is to infer header guards for Linux UAPI headers.
+# The collected information is further used in BTF embedded into kernel.
+#
+# The following inference logic is used for each file:
+# - find all pairs `#ifndef <name> #define <name>`
+# - if there is a unique <name> that matches a pattern - use this
+#   <name> as a header guard, (see subroutine `select_guard` for the
+#   list of the patterns);
+# - files containing only #include directives are safe to ignore.
+#
+# There are a few UAPI header files that don't fit in such logic,
+# header guards for these files are hard-coded in %OVERRIDES hash.
+#
+# The script reports inference failures only when the --report-failures
+# flag is passed. This flag is intended for BPF tests.
+#
+# See subroutine `help` for usage info.
+
+use strict;
+use warnings;
+use File::Basename;
+use File::Find;
+use Getopt::Long;
+
+sub help {
+	my $message = << "EOM";
+Usage:
+  $0 [--report-failures] directory-or-file...
+  $0 --help
+
+For a specific file or for each .h file in a directory infer the name
+of a C pre-processor variable used as a double include guard.
+
+Options:
+  --report-failures   Report inference errors to stderr,
+                      exit with non-zero code if guards were not inferred
+                      for some files.
+  --help              Print this message and exit.
+EOM
+	print $message;
+}
+
+my %OVERRIDES = (
+	# Header guards that don't follow common naming rules
+	"include/uapi/linux/cciss_ioctl.h" => "_UAPICCISS_IOCTLH",
+	"include/uapi/linux/hpet.h" => "_UAPI__HPET__",
+	"include/uapi/linux/if_ppp.h" => "_PPP_IOCTL_H",
+	"include/uapi/linux/netfilter/xt_NFLOG.h" => "_XT_NFLOG_TARGET",
+	"include/uapi/linux/netfilter_ipv6/ip6t_NPT.h" => "__NETFILTER_IP6T_NPT",
+	"include/uapi/linux/quota.h" => "_UAPI_LINUX_QUOTA_",
+	"include/uapi/linux/v4l2-common.h" => "__V4L2_COMMON__",
+	# Headers that should be ignored
+	"arch/x86/include/uapi/asm/hw_breakpoint.h" => undef,
+	"arch/x86/include/uapi/asm/posix_types.h" => undef,
+	"arch/x86/include/uapi/asm/setup.h" => undef,
+	"include/generated/uapi/linux/version.h" => undef,
+	"include/uapi/asm-generic/bitsperlong.h" => undef,
+	"include/uapi/asm-generic/kvm_para.h" => undef,
+	"include/uapi/asm-generic/unistd.h" => undef,
+	"include/uapi/linux/irqnr.h" => undef,
+	"include/uapi/linux/zorro_ids.h" => undef,
+	);
+
+sub get_basename {
+	my ($filename) = @_;
+	my $basename = fileparse($filename, qr/\.[^.]*/);
+	return $basename;
+}
+
+sub find_bracket_candidates {
+	my ($filename) = @_;
+	my @candidates = ();
+	my $guard_candidate = undef;
+	my $safe_to_ignore = 1;
+
+	open(my $file, '<', $filename) or die "Can't open file $filename: $!";
+	while (my $line = <$file>) {
+		if (not($line =~ "^#include")) {
+			$safe_to_ignore = 0;
+		}
+		if ($line =~ "^#ifndef[ \t]+([a-zA-Z0-9_]+)") {
+			$guard_candidate = $1;
+		} elsif ($guard_candidate && $line =~ "^#define[ \t]+${guard_candidate}") {
+			push(@candidates, $guard_candidate);
+			$guard_candidate = undef;
+		}
+	}
+	close $file;
+
+	return ($safe_to_ignore, @candidates);
+}
+
+sub select_guard {
+	my ($filename, @candidates) = @_;
+	my $basename = get_basename($filename);
+	my @regexes = ("$basename.*_H(EADER)?",
+		       "_H(EADER)?_",
+		       "_H(EADER)?\$");
+	foreach my $re (@regexes) {
+		my @filtered = grep(/$re/i, @candidates);
+		if (scalar(@filtered) == 1) {
+			return $filtered[0];
+		}
+	}
+
+	return undef;
+}
+
+sub collect_headers {
+	my ($dir) = @_;
+	my @headers = ();
+
+	find(sub { /\.h$/ && push(@headers, $File::Find::name); }, $dir);
+
+	return @headers;
+}
+
+my $report_failures = 0;
+my $options_parsed = GetOptions(
+	"report-failures" => \$report_failures,
+	"help" => sub { help(); exit 0; },
+    );
+
+if (!$options_parsed || scalar @ARGV == 0) {
+	help();
+	exit 1;
+}
+
+my @headers;
+
+foreach my $dir_or_file (@ARGV) {
+	if (-f $dir_or_file) {
+		push(@headers, $dir_or_file);
+	} elsif (-d $dir_or_file) {
+		push(@headers, collect_headers($dir_or_file));
+	} else {
+		print STDERR "'$dir_or_file' is not a file or directory.\n";
+		help();
+		exit 1;
+	}
+}
+
+my $rc = 0;
+
+foreach my $header (@headers) {
+	my $basename = get_basename($header);
+	my $guard;
+
+	if (exists $OVERRIDES{$header}) {
+		$guard = $OVERRIDES{$header};
+	} else {
+		my ($safe_to_ignore, @candidates) = find_bracket_candidates($header);
+		$guard = select_guard($header, @candidates);
+		if ((not $guard) && (not $safe_to_ignore) && $report_failures) {
+			print STDERR "Can't select guard for $header, candidates:\n";
+			print STDERR "  ";
+			if (scalar(@candidates)) {
+				print STDERR join(", ", @candidates);
+			} else {
+				print STDERR "<no candidates>";
+			}
+			print STDERR "\n";
+			$rc = 1;
+		}
+	}
+	if ($guard) {
+		# Remove the _UAPI prefix/suffix the same way
+		# scripts/headers_install.sh does it.
+		$guard =~ s/_UAPI//;
+		print("$header $guard\n");
+	}
+}
+
+exit $rc;
-- 
2.34.1

