From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@bugzilla.kernel.org
Subject: [Bug 29162] Reiserfs hang with dataloss sometimes
Date: Tue, 07 Apr 2015 10:05:14 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: reiserfs-devel-owner@vger.kernel.org
List-ID:
Content-Type: text/plain; charset="us-ascii"
To: reiserfs-devel@vger.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=29162
Hamdi Hamdi changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |hhamdi@sevone.com
--- Comment #66 from Hamdi Hamdi ---
Hi,
After some research, I suspect the following scenario (ref
http://lxr.free-electrons.com/source/fs/reiserfs/journal.c?v=3.3):
1) several threads call `queue_log_writer` and are put to sleep
2) J_WRITERS_QUEUED bit is set
3) `check_journal_end` returns a nonzero value and `do_journal_end`
continues its execution
4) at line 4252 it clears the bit and wakes 1 thread
5) for some unknown reason no other thread enters `queue_log_writer`, so
the bit is not set again
6) the awakened thread reaches `if (!check_journal_end(th, sb, nblocks,
flags))`,
where `check_journal_end` returns 0, which triggers an attempt to wake another
thread
7) no one is woken, because the bit is not set when `if
(test_and_clear_bit(J_WRITERS_QUEUED, &journal->j_state))` is evaluated;
the only other wake-up is when execution reaches line 4253
One ugly workaround for testing purposes at line 4253:
4253 wake_up(&(journal->j_join_wait));
replaced with
4253 wake_up_all(&(journal->j_join_wait));
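To make the suspected sequence easier to reason about, here is a minimal
single-threaded sketch of it in plain C. It is not kernel code: the struct and
every `model_*` name are hypothetical, and it simply assumes, as the scenario
above does, that the wake-up at line 4253 releases exactly one sleeper. Under
that assumption it counts how many threads are left asleep with a wake-one
call versus a wake-all call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the journal state; all names here are hypothetical. */
struct journal_model {
    bool writers_queued; /* models the J_WRITERS_QUEUED bit */
    int sleepers;        /* threads asleep on the join wait queue */
};

/* Steps 1-2: a thread sets the bit and goes to sleep. */
void model_queue_log_writer(struct journal_model *j)
{
    j->writers_queued = true;
    j->sleepers++;
}

/* Wake one sleeper (the behaviour the scenario assumes) or all of them. */
void model_wake(struct journal_model *j, bool wake_all)
{
    if (j->sleepers == 0)
        return;
    if (wake_all)
        j->sleepers = 0;
    else
        j->sleepers--;
}

/* Step 4: the bit is cleared unconditionally, then a wake-up is issued. */
void model_journal_end_tail(struct journal_model *j, bool wake_all)
{
    j->writers_queued = false;
    model_wake(j, wake_all);
}

/* Steps 6-7: the awakened thread wakes another one only if the bit is set. */
void model_test_and_clear_then_wake(struct journal_model *j)
{
    bool was_set = j->writers_queued;

    j->writers_queued = false;
    if (was_set)
        model_wake(j, false);
}

/* Replays the scenario and returns how many threads are still asleep. */
int model_stranded_sleepers(int nthreads, bool wake_all)
{
    struct journal_model j = { false, 0 };

    assert(nthreads >= 0);
    for (int i = 0; i < nthreads; i++)
        model_queue_log_writer(&j);          /* steps 1-2 */
    model_journal_end_tail(&j, wake_all);    /* step 4 */
    /* step 5: nobody re-enters the queue, so the bit stays clear */
    model_test_and_clear_then_wake(&j);      /* steps 6-7 */
    return j.sleepers;
}
```

With three queued threads, `model_stranded_sleepers(3, false)` leaves two
sleepers behind, while `model_stranded_sleepers(3, true)` leaves none, which
matches what the workaround is meant to test. The model says nothing about
whether the real wake-up actually behaves as assumed.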
There must be a reason why this happens only under high load, so any help
and/or suggestions are more than welcome!
--
You are receiving this mail because:
You are the assignee for the bug.