All of lore.kernel.org
 help / color / mirror / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	phillip.wood@dunelm.org.uk,
	Brian Norris <briannorris@chromium.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH] t3429: try to protect against a potential racy todo file problem
Date: Mon, 25 Nov 2019 04:10:05 +0100	[thread overview]
Message-ID: <20191125031005.GC23183@szeder.dev> (raw)
In-Reply-To: <20191124211021.GB23183@szeder.dev>

On Sun, Nov 24, 2019 at 10:10:21PM +0100, SZEDER Gábor wrote:
> To notice a changed todo file the sequencer stores the file's stat
> data in its 'struct todo_list' instance, and compares it with the
> file's current stat data after 'reword', 'squash' and 'exec'
> instructions.  If the two stat data doesn't match, it re-reads the
> todo file.
> 
> Sounds simple, but there are some subtleties going on here:
> 
>   - The 'struct todo_list' holds the stat data from the time when the
>     todo file was last read.
> 
>   - This stat data in 'struct todo_list' is not updated when the
>     sequencer itself writes the todo file.
> 
>   - Before executing each instruction during an interactive rebase,
>     the sequencer always updates the todo file by removing the
>     just-about-to-be-executed instruction.  This changes the file's
>     size and inode [1].
> 
> Consequently, when the sequencer looks at the stat data after a
> 'reword', 'squash' or 'exec' instruction, it most likely finds that
> they differ, even when the user didn't modify the todo list at all!
> This is not an issue in practice, it just wastes a few cycles on
> re-reading the todo list that matches what the sequencer already has
> in memory anyway.
> 
> However, an unsuspecting Git developer might try to "fix" it by simply
> updating the stat data each time the sequencer writes the todo list
> for an interactive rebase.  On first sight it looks quite sensible and
> straightforward, but we have to be very careful when doing that,
> because potential racy problems lie that way.
> 
> It is possible to overwrite the todo list file without modifying
> either its inode or size, and if such an overwrite were to happen in
> the same second when the file was last read (our stat data has one
> second granularity by default), then the actual stat data on the file
> system would match the stat data that the sequencer has in memory.
> Consequently, such a modification to the todo list file would go
> unnoticed.

> [1] The todo file is written using the lockfile API, i.e. first to the
>     lockfile, which is then moved to place, so the new file can't
>     possibly have the same inode as the file it replaces.  Note,
>     however, that the file system might reuse inodes, so it is
>     possible that the new todo file ends up with the same inode as is
>     recorded in the 'struct todo_list' from the last time the file was
>     read.

Unfortunately, we already do have an issue here when the sequencer can
overlook a modified todo list, but triggering it needs the lucky
"alignment" of inodes as well.  I can trigger it fairly reliably with
the test below, but forcing the inode match makes it kind of gross and
Linux-only.

  https://travis-ci.org/szeder/git/jobs/616460522#L1470


  ---  >8  ---

diff --git a/t/t9999-rebase-racy-todo-reread.sh b/t/t9999-rebase-racy-todo-reread.sh
new file mode 100755
index 0000000000..437ebd55e0
--- /dev/null
+++ b/t/t9999-rebase-racy-todo-reread.sh
@@ -0,0 +1,87 @@
+#!/bin/sh
+
+test_description='racy edit todo reread problem'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit first_ &&
+	test_commit second &&
+	test_commit third_ &&
+	test_commit fourth &&
+	test_commit fifth_ &&
+	test_commit sixth_ &&
+
+	write_script sequence-editor <<-\EOS &&
+		todo=.git/rebase-merge/git-rebase-todo &&
+		cat >"$todo" <<-EOF
+			reword $(git rev-parse second^0) second
+			reword $(git rev-parse third_^0) third_
+			reword $(git rev-parse fourth^0) fourth
+		EOF
+	EOS
+
+	write_script commit-editor <<-\EOS &&
+		read first_line <"$1" &&
+		echo "$first_line - edited" >"$1" &&
+
+		todo=.git/rebase-merge/git-rebase-todo &&
+
+		if test "$first_line" = second
+		then
+			stat --format=%i "$todo" >expected-ino
+		elif test "$first_line" = third_
+		then
+			ino=$(cat expected-ino) &&
+			file=$(find . -inum $ino) &&
+			if test -n "$file"
+			then
+				echo &&
+				echo "Trying to free inode $ino by moving \"$file\" out of the way" &&
+				cp -av "$file" "$file".tmp &&
+				rm -fv "$file"
+			fi &&
+
+			cat >"$todo".tmp <<-EOF &&
+			reword $(git rev-parse fifth_^0) fifth_
+			reword $(git rev-parse sixth_^0) sixth_
+			EOF
+			mv -v "$todo".tmp "$todo" &&
+
+			if test "$ino" -eq $(stat --format=%i "$todo")
+			then
+				echo "Yay! The todo list did get inode $ino, just what the sequencer is expecting!"
+			fi &&
+
+			if test -n "$file"
+			then
+				mv -v "$file".tmp "$file"
+			fi
+		fi
+	EOS
+
+	cat >expect <<-\EOF
+	sixth_ - edited
+	fifth_ - edited
+	third_ - edited
+	second - edited
+	first_
+	EOF
+'
+
+for trial in 0 1 2 3 4
+do
+	test_expect_success "demonstrate racy todo re-read problem #$trial" '
+		git reset --hard fourth &&
+		>expected-ino && # placeholder
+
+		GIT_SEQUENCE_EDITOR=./sequence-editor \
+		GIT_EDITOR=./commit-editor \
+		git rebase -i HEAD^^^ &&
+
+		git log --format=%s >actual &&
+		test_cmp expect actual
+	'
+done
+
+test_done

  parent reply	other threads:[~2019-11-25  3:10 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-22 23:10 git 2.24: git revert <commit1> <commit2> requires extra '--continue'? Brian Norris
2019-11-23  0:34 ` SZEDER Gábor
2019-11-23  9:53   ` Phillip Wood
2019-11-23 17:20     ` [PATCH] sequencer: don't re-read todo for revert and cherry-pick SZEDER Gábor
2019-11-23 21:14       ` Johannes Schindelin
2019-11-24  4:49       ` Junio C Hamano
2019-11-24 10:44         ` Phillip Wood
2019-11-24 21:10           ` [PATCH] t3429: try to protect against a potential racy todo file problem SZEDER Gábor
2019-11-25  1:28             ` Junio C Hamano
2019-11-25  3:10             ` SZEDER Gábor [this message]
2019-11-25 13:18             ` SZEDER Gábor
2019-11-25 14:43               ` Phillip Wood
2019-11-25 15:15                 ` SZEDER Gábor
2019-11-25 16:40                   ` Phillip Wood
2019-11-25  1:10           ` [PATCH] sequencer: don't re-read todo for revert and cherry-pick Junio C Hamano
2019-11-25 10:47             ` Phillip Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191125031005.GC23183@szeder.dev \
    --to=szeder.dev@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=briannorris@chromium.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.