From: "Yann E. MORIN" <yann.morin.1998@free.fr>
To: Herve Codina <herve.codina@bootlin.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>, buildroot@buildroot.org
Subject: Re: [Buildroot] [RFC PATCH 1/1] support/download: allow to use part of file in checksum checking
Date: Thu, 27 Jul 2023 17:59:44 +0200 [thread overview]
Message-ID: <20230727155944.GF3544423@scaer> (raw)
In-Reply-To: <20230727125123.927568-1-herve.codina@bootlin.com>
Hervé, All,
On 2023-07-27 14:51 +0200, Herve Codina spake thusly:
> The checksum checking is done on whole files only using the <pkg>.hash
> checksum references.
> Among the files checked, files related to licenses are checked.
>
> Some packages do not contain any specific license files and, for
> them, some source files are used. These source files contain license
> information (usually comments at the beginning of the file).
>
> Using the whole source file for checksum checking in this case can lead
> to issues if a br2-external is present and applies some patches to this
s/br2-external/global patch-dir/
> source file.
> Indeed, patching a package from a br2-external is allowed but in that
s/br2-external/global patch-dir/
> case the whole file checksum change and all checksum verification done
> on that file fails. In particular 'make legal-info' fails.
>
> Using only the license related part of a source file for checksum
> checking solve this issue.
> 'make legal-info' will fail only if the license part is modify.
modified
> Introduce the possibility to have a lines range in <pkg>.hash and, if
> present, compute checksum on the part defined by this lines range.
I'm afraid this is fraught with limitations.
For example, I've seen source files where the license texts (yes,
plural) are spread out out in the source file, intermixed between
parts of the code.
If you patch such a file, you are certain that:
1. its hash as a whole changes, as you noticed
2. the license will not be on the lines you expect them to be, and will
defeat your proposed scheme.
> For instance, in <pkg>.hash:
> sha256 xxxxxx foo.c <-- sha256 on the whole foo.c file
> sha256 xxxxxx foo.c 1,15 <-- sha256 on extraction from line 1 to 15
That's hackish.
Instead, I would argue for either or both of the following:
1. considering that the source file is used as a license file, it is
not non-redistributable, so it is already redistributed as part of
the legal-info artefacts, so I would suggest that we simply stop
using source file as a license file;
2. for such packages, add a post-extract hook that prepares the license
file(s), and register those as _FOO_LICENSE_FILES instead. By using
code, you can search for patterns rather than depend on constant
line numbers.
Both are trivial enough, I believe.
Regards,
Yann E. MORIN.
> Signed-off-by: Herve Codina <herve.codina@bootlin.com>
> ---
> support/download/check-hash | 24 +++++++++++++++++++-----
> 1 file changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/support/download/check-hash b/support/download/check-hash
> index 5a47f49bc3..a90c9ca58a 100755
> --- a/support/download/check-hash
> +++ b/support/download/check-hash
> @@ -45,10 +45,20 @@ fi
> # $1: algo hash
> # $2: known hash
> # $3: file (full path)
> +# $4: lines ranges in the form s,e. If present, the checksum is compute on the
> +# file extracted part from line number s to line number e included.
> +# The first line in the file is the line number 1.
> +# If not present, the whole file is used.
> check_one_hash() {
> _h="${1}"
> _known="${2}"
> _file="${3}"
> + _r="${4}"
> +
> + base_with_range=${base}
> + if [ ${_r} ]; then
> + base_with_range="${base_with_range}:${_r}"
> + fi
>
> # Note: md5 is supported, but undocumented on purpose.
> # Note: sha3 is not supported, since there is currently no implementation
> @@ -64,13 +74,17 @@ check_one_hash() {
> esac
>
> # Do the hashes match?
> - _hash=$( ${_h}sum "${_file}" |cut -d ' ' -f 1 )
> + if [ ${_r} ]; then
> + _hash=$( sed -n "${r} p" ${file} | ${_h}sum |cut -d ' ' -f 1 )
> + else
> + _hash=$( ${_h}sum "${_file}" |cut -d ' ' -f 1 )
> + fi
> if [ "${_hash}" = "${_known}" ]; then
> - printf "%s: OK (%s: %s)\n" "${base}" "${_h}" "${_hash}"
> + printf "%s: OK (%s: %s)\n" "${base_with_range}" "${_h}" "${_hash}"
> return 0
> fi
>
> - printf "ERROR: %s has wrong %s hash:\n" "${base}" "${_h}" >&2
> + printf "ERROR: %s has wrong %s hash:\n" "${base_with_range}" "${_h}" >&2
> printf "ERROR: expected: %s\n" "${_known}" >&2
> printf "ERROR: got : %s\n" "${_hash}" >&2
> printf "ERROR: Incomplete download, or man-in-the-middle (MITM) attack\n" >&2
> @@ -80,7 +94,7 @@ check_one_hash() {
>
> # Do we know one or more hashes for that file?
> nb_checks=0
> -while read t h f; do
> +while read t h f r; do
> case "${t}" in
> ''|'#'*)
> # Skip comments and empty lines
> @@ -88,7 +102,7 @@ while read t h f; do
> ;;
> *)
> if [ "${f}" = "${base}" ]; then
> - check_one_hash "${t}" "${h}" "${file}"
> + check_one_hash "${t}" "${h}" "${file}" "${r}"
> : $((nb_checks++))
> fi
> ;;
> --
> 2.41.0
>
--
.-----------------.--------------------.------------------.--------------------.
| Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ |
| +33 561 099 427 `------------.-------: X AGAINST | \e/ There is no |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. |
'------------------------------^-------^------------------^--------------------'
_______________________________________________
buildroot mailing list
buildroot@buildroot.org
https://lists.buildroot.org/mailman/listinfo/buildroot
next prev parent reply other threads:[~2023-07-27 15:59 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-27 12:51 [Buildroot] [RFC PATCH 1/1] support/download: allow to use part of file in checksum checking Herve Codina via buildroot
2023-07-27 15:59 ` Yann E. MORIN [this message]
2023-07-27 20:59 ` Thomas Petazzoni via buildroot
2023-07-28 7:04 ` Herve Codina via buildroot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230727155944.GF3544423@scaer \
--to=yann.morin.1998@free.fr \
--cc=buildroot@buildroot.org \
--cc=herve.codina@bootlin.com \
--cc=thomas.petazzoni@bootlin.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox