From: Emil Velikov <emil.l.velikov@gmail.com>
To: linux-firmware@kernel.org
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Subject: [PATCH v2 07/16] copy-firmware.sh: flesh out and fix dedup-firmware.sh
Date: Mon, 23 Sep 2024 14:09:35 +0100
Message-ID: <20240923-misc-fixes-v2-7-397f23443628@gmail.com>
In-Reply-To: <20240923-misc-fixes-v2-0-397f23443628@gmail.com>

Split the de-duplication logic out into a separate script. copy-firmware.sh is
already complex enough, and de-duplication doesn't really fit in there.

In the process we replace the open-coded relative path computation with
`ln --relative`. We also avoid touching symlinks that were not created by
rdfind. Otherwise we end up "fixing" the folder-to-folder symlinks (created
earlier in the process) and things explode.
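
To illustrate the point about leaving non-rdfind symlinks alone, here is a rough
sketch (not part of the patch). It fakes a results file instead of running
rdfind; the file layout only approximates what rdfind writes, and every path
and value in it is hypothetical. GNU coreutils is assumed for `realpath` and
`ln --relative`.

```shell
#!/bin/sh
# Sketch only: relink just the paths listed as duplicates in a hand-written,
# hypothetical results file, and leave every other symlink untouched.
set -e
workdir="$(mktemp -d)"
destdir="$workdir/firmware"
results="$workdir/results.txt"   # the patch itself reads ./results.txt

mkdir -p "$destdir/vendor/v1"
echo blob > "$destdir/vendor/v1/fw.bin"

# Folder-to-folder link of the kind created earlier in the install process.
ln -s v1 "$destdir/vendor/latest"

# Stand-in for a link rdfind might leave behind: absolute target.
ln -s "$destdir/vendor/v1/fw.bin" "$destdir/vendor/copy.bin"

# Made-up results file; only the duptype and the trailing path matter here.
cat > "$results" <<EOF
# duptype id depth size device inode priority name
DUPTYPE_FIRST_OCCURRENCE 1 2 5 0 0 1 $destdir/vendor/v1/fw.bin
DUPTYPE_WITHIN_SAME_TREE -1 1 5 0 0 1 $destdir/vendor/copy.bin
EOF

# Same filtering idea as in dedup-firmware.sh: only listed duplicates are fixed.
grep DUPTYPE_WITHIN_SAME_TREE "$results" | grep -o "$destdir.*" | while read -r l; do
    target="$(realpath "$l")"
    ln --force --symbolic --relative "$target" "$l"
done

copy_target="$(readlink "$destdir/vendor/copy.bin")"
dir_target="$(readlink "$destdir/vendor/latest")"
echo "copy.bin -> $copy_target"
echo "latest   -> $dir_target"
rm -rf "$workdir"
```

Only copy.bin is rewritten to a relative target; the `latest` directory link
never appears in the results file, so the loop does not see it.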

As a result we also get a few bonuses:
 - the COPYOPTS shell injection is gone - the variable was never used
 - people can dedup as a separate step if/when they choose to do so

Aside: based on the noise in git log and around distros ... I'm wondering
whether making the de-duplication opt-in would have been better. Is it too
late to change that, or has the ship already sailed?

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
---
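Note on the `ln --relative` change: the sketch below is not part of the patch,
just a quick way to compare the old hand-computed relative target with what
`ln --relative` produces. It assumes GNU coreutils; the directory layout and
file names are throwaway and hypothetical.

```shell
#!/bin/sh
# Sketch: open-coded relative symlink vs. `ln --relative`.
set -e
workdir="$(mktemp -d)"
mkdir -p "$workdir/vendor/a" "$workdir/vendor/b"
echo blob > "$workdir/vendor/a/fw.bin"
target="$workdir/vendor/a/fw.bin"

# Before: compute the relative path by hand with realpath.
old_link="$workdir/vendor/b/old.bin"
ln -fs "$(realpath --relative-to="$(dirname "$old_link")" "$target")" "$old_link"

# After: let ln work out the relative path itself.
new_link="$workdir/vendor/b/new.bin"
ln --force --symbolic --relative "$target" "$new_link"

old_target="$(readlink "$old_link")"
new_target="$(readlink "$new_link")"
echo "old: $old_target"
echo "new: $new_target"
rm -rf "$workdir"
```

Both links should end up pointing at the same relative target.
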
 Makefile          |  9 +++++----
 check_whence.py   |  1 +
 copy-firmware.sh  | 23 -----------------------
 dedup-firmware.sh | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 58 insertions(+), 27 deletions(-)

diff --git a/Makefile b/Makefile
index ac43c97fbd0a35b610c4d21c2bf8e8cf21864c9a..216dd4885b5bdf35189bbcca325fc7535acbe8f0 100644
--- a/Makefile
+++ b/Makefile
@@ -26,21 +26,22 @@ deb:
 rpm:
 	./build_packages.py --rpm
 
-install:
-	install -d $(DESTDIR)$(FIRMWAREDIR)
-	./copy-firmware.sh $(COPYOPTS) $(DESTDIR)$(FIRMWAREDIR)
+install: install-nodedup
+	./dedup-firmware.sh $(DESTDIR)$(FIRMWAREDIR)
 
 install-nodedup:
 	install -d $(DESTDIR)$(FIRMWAREDIR)
-	./copy-firmware.sh --ignore-duplicates $(DESTDIR)$(FIRMWAREDIR)
+	./copy-firmware.sh $(DESTDIR)$(FIRMWAREDIR)
 
 install-xz:
 	install -d $(DESTDIR)$(FIRMWAREDIR)
 	./copy-firmware.sh --xz $(DESTDIR)$(FIRMWAREDIR)
+	./dedup-firmware.sh $(DESTDIR)$(FIRMWAREDIR)
 
 install-zst:
 	install -d $(DESTDIR)$(FIRMWAREDIR)
 	./copy-firmware.sh --zstd $(DESTDIR)$(FIRMWAREDIR)
+	./dedup-firmware.sh $(DESTDIR)$(FIRMWAREDIR)
 
 clean:
 	rm -rf release dist
diff --git a/check_whence.py b/check_whence.py
index afe97a76cf5ed609117d07387a19c2092fa00c6c..09dbdd5439e738385c0402e244bad11ed9fc5558 100755
--- a/check_whence.py
+++ b/check_whence.py
@@ -91,6 +91,7 @@ def main():
             "contrib/templates/debian.copyright",
             "contrib/templates/rpm.spec",
             "copy-firmware.sh",
+            "dedup-firmware.sh",
         ]
     )
     known_prefixes = set(name for name in whence_list if name.endswith("/"))
diff --git a/copy-firmware.sh b/copy-firmware.sh
index 6c557f234dd2b90e68ba26429ad3e788f2201c4f..344f478db74a7080b34c264645d6bcd9a80384a8 100755
--- a/copy-firmware.sh
+++ b/copy-firmware.sh
@@ -9,7 +9,6 @@ prune=no
 # shellcheck disable=SC2209
 compress=cat
 compext=
-skip_dedup=0
 
 while test $# -gt 0; do
     case $1 in
@@ -45,11 +44,6 @@ while test $# -gt 0; do
             shift
             ;;
 
-        --ignore-duplicates)
-            skip_dedup=1
-            shift
-            ;;
-
         -*)
             if test "$compress" = "cat"; then
                 echo "ERROR: unknown command-line option: $1"
@@ -75,13 +69,6 @@ if [ -z "$destdir" ]; then
 	exit 1
 fi
 
-if ! command -v rdfind >/dev/null; then
-	if [ "$skip_dedup" != 1 ]; then
-    		echo "ERROR: rdfind is not installed.  Pass --ignore-duplicates to skip deduplication"
-		exit 1
-	fi
-fi
-
 # shellcheck disable=SC2162 # file/folder name can include escaped symbols
 grep -E '^(RawFile|File):' WHENCE | sed -E -e 's/^(RawFile|File): */\1 /;s/"//g' | while read k f; do
     test -f "$f" || continue
@@ -95,16 +82,6 @@ grep -E '^(RawFile|File):' WHENCE | sed -E -e 's/^(RawFile|File): */\1 /;s/"//g'
     fi
 done
 
-if [ "$skip_dedup" != 1 ] ; then
-	$verbose "Finding duplicate files"
-	rdfind -makesymlinks true -makeresultsfile false "$destdir" >/dev/null
-	find "$destdir" -type l | while read -r l; do
-		target="$(realpath "$l")"
-		$verbose "Correcting path for $l"
-		ln -fs "$(realpath --relative-to="$(dirname "$(realpath -s "$l")")" "$target")" "$l"
-	done
-fi
-
 # shellcheck disable=SC2162 # file/folder name can include escaped symbols
 grep -E '^Link:' WHENCE | sed -e 's/^Link: *//g;s/-> //g' | while read f d; do
     if test -L "$f$compext"; then
diff --git a/dedup-firmware.sh b/dedup-firmware.sh
new file mode 100755
index 0000000000000000000000000000000000000000..2bbd637f0736252027bfe1f60f886772efec1d08
--- /dev/null
+++ b/dedup-firmware.sh
@@ -0,0 +1,52 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Deduplicate files in a given destdir
+#
+
+err() {
+    echo "ERROR: $*"
+    exit 1
+}
+
+verbose=:
+destdir=
+while test $# -gt 0; do
+    case $1 in
+        -v | --verbose)
+            # shellcheck disable=SC2209
+            verbose=echo
+            ;;
+        *)
+            if test -n "$destdir"; then
+                err "unknown command-line options: $*"
+            fi
+
+            destdir="$1"
+            ;;
+    esac
+    shift
+done
+
+if test -z "$destdir"; then
+    err "destination directory was not specified."
+fi
+
+if ! test -d "$destdir"; then
+    err "provided directory does not exist."
+fi
+
+if ! command -v rdfind >/dev/null; then
+    err "rdfind is not installed."
+fi
+
+$verbose "Finding duplicate files"
+rdfind -makesymlinks true -makeresultsfile true "$destdir" >/dev/null
+
+grep DUPTYPE_WITHIN_SAME_TREE results.txt | grep -o "$destdir.*" | while read -r l; do
+    target="$(realpath "$l")"
+    $verbose "Correcting path for $l"
+    ln --force --symbolic --relative "$target" "$l"
+done
+
+rm results.txt

-- 
2.46.1


Thread overview: 35+ messages
2024-09-23 13:09 [PATCH v2 00/16] Range of copy-firmware/check_whence fixes Emil Velikov
2024-09-23 13:09 ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 01/16] check_whence.py: use consistent naming Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 02/16] check_whence.py: ban link-to-a-link Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 03/16] check_whence.py: LC_ALL=C sort -u the filelist Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 04/16] check_whence.py: annotate replacement strings as raw Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 05/16] editorconfig: add initial config file Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 06/16] Style update yaml files Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` Emil Velikov [this message]
2024-09-23 13:09   ` [PATCH v2 07/16] copy-firmware.sh: flesh out and fix dedup-firmware.sh Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 08/16] Revert "copy-firmware: Support additional compressor options" Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 09/16] copy-firmware.sh: reset and consistently handle destdir Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 10/16] copy-firmware.sh: fix indentation Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 11/16] copy-firmware.sh: add err() helper Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 12/16] copy-firmware.sh: warn if the destination folder is not empty Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 13/16] copy-firmware.sh: call ./check_whence.py before parsing the file Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 14/16] copy-firmware.sh: remove no longer reachable test -f Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 15/16] copy-firmware.sh: remove no longer reachable test -L Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-09-23 13:09 ` [PATCH v2 16/16] copy-firmware.sh: rename variables in symlink hanlding Emil Velikov
2024-09-23 13:09   ` Emil Velikov via B4 Relay
2024-10-10  3:25 ` [PATCH v2 00/16] Range of copy-firmware/check_whence fixes Mario Limonciello
