linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: bugzilla-daemon@bugzilla.kernel.org
To: linux-ext4@vger.kernel.org
Subject: [Bug 14602] New: JBD2 journal abort / checkpoint creation racy?
Date: Sat, 14 Nov 2009 12:05:15 GMT	[thread overview]
Message-ID: <bug-14602-13602@http.bugzilla.kernel.org/> (raw)

http://bugzilla.kernel.org/show_bug.cgi?id=14602

           Summary: JBD2 journal abort / checkpoint creation racy?
           Product: File System
           Version: 2.5
    Kernel Version: 2.6.32-rc6
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
        AssignedTo: fs_ext4@kernel-bugs.osdl.org
        ReportedBy: andi-bz@firstfloor.org
        Regression: No


I was testing a new file system feature that triggered IO errors
on inode read. I did not actually change the abort part here
so I believe this is unrelated to my changes.

I had one case during testing where the journal abort didn't work
and jbd2 journal abort errored out.

<putting some stress on the file system>
EXT4-fs error (device sda1): ext4_iget: triggering IO error
EXT4-fs error (device sda1): ext4_put_super: Couldn't clean up the journal

This happens in
ext4_put_super->jbd2_journal_destroy->jbd2_log_do_checkpoint
and then finally 


  if (journal->j_sb_buffer) {
                if (!is_journal_aborted(journal)) {
                        /* We can now mark the journal as empty. */
                        journal->j_tail = 0;
                        journal->j_tail_sequence =
                                ++journal->j_transaction_sequence;
                        jbd2_journal_update_superblock(journal, 1);
                } else {
                        err = -EIO; <------------ this is triggered
                }

So it looks like jbd2_log_do_checkpoint sometimes does not succeed?
It does a couple of retry, maybe that log is not enough.

The problem is not easy to trigger unfortunately, i only saw it very rarely.
I assume it's some race in the checkpoint creation.

I tried to reproduce it with jbd2 debugging enabled, but no luck so far.

-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.

             reply	other threads:[~2009-11-14 12:05 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-11-14 12:05 bugzilla-daemon [this message]
2009-11-15 22:28 ` [Bug 14602] JBD2 journal abort / checkpoint creation racy? bugzilla-daemon
2009-11-16 16:04 ` bugzilla-daemon
2009-11-16 18:59 ` bugzilla-daemon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-14602-13602@http.bugzilla.kernel.org/ \
    --to=bugzilla-daemon@bugzilla.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).