From: Amon Ott <a.ott@m-privacy.de>
To: Yehuda Sadeh Weinraub <yehuda.sadeh@dreamhost.com>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: BUG at fs/inode.c
Date: Tue, 25 Oct 2011 10:38:06 +0200
Message-ID: <201110251038.06482.a.ott@m-privacy.de>
In-Reply-To: <CAC-hyiEcNcJGsMLWa6WVhM6gABYKgeKo2aFMvOEtuu7kQqN9ow@mail.gmail.com>

On Monday 24 October 2011, Yehuda Sadeh Weinraub wrote:
> On Mon, Oct 24, 2011 at 3:39 AM, Amon Ott <a.ott@m-privacy.de> wrote:
> > we have hit a kernel bug with current ceph-client master (commit
> > a2742a09568f81315e0f30021f29f14e7cd3924b), which I assume to be a Ceph
> > bug.
>
> Is it easily reproducible? What's the scenario?

It is quite easy to reproduce. We run a virtual test cluster with two nodes, 
each running OSD, MDS and MON, but using "max mon = 1".

CephFS is mounted on both nodes so that they share the same data. The kernel
is 3.0.7 with PaX, RSBAC and ceph-client master. The goal is a scalable
cluster of servers in which any number of nodes may fail at any time, as long
as enough remain to keep at least one copy of the data and restore
redundancy. If it works out as expected, we want to scale to 20 or more
nodes, depending on the needs of our customers.

> > Kernel is x86-32, Ceph is running on a two node cluster over ext4. The
> > kernel traces are attached, the system dies shortly after these messages.
> > The bug is reproducible. I have not found anything useful in the Ceph bug
> > tracker when searching for "fs/inode.c".
>
> How many mds servers?

Two MDS daemons, one per node. As described above, each of the two nodes runs
OSD, MDS and MON, with "max mon = 1".

> > Around fs/inode.c line 1375 mentioned in the trace is the iput()
> > function:
> >
> > void iput(struct inode *inode)
> > {
> >        if (inode) {
> >                BUG_ON(inode->i_state & I_CLEAR);
> >
> >                if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock))
> >                        iput_final(inode);
> >        }
> > }
> >
> > So inode->i_state seems to be incorrect when iput() is called, maybe a
> > double call to iput() or a missing iget() somewhere. Is this really a
> > Ceph bug or have I messed up our kernel code when merging patches?
>
> What patches?

See above. PaX, RSBAC and Ceph master. I have been merging the first two in 
for years now, being the RSBAC main author myself.
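
For illustration only, the failure mode suspected in the quoted text (a double
iput() or a missing iget()) can be modelled in user space. This is a minimal
sketch, not kernel code: toy_inode, toy_iget(), toy_iput() and TOY_I_CLEAR are
hypothetical stand-ins, and the assumption that teardown marks the inode as
cleared is a simplification of what iput_final() leads to.

```c
/* Hypothetical user-space model; this is NOT the kernel's struct inode. */
#define TOY_I_CLEAR 0x1u

struct toy_inode {
	int count;           /* stands in for i_count */
	unsigned int state;  /* stands in for i_state */
};

/* Take a reference on the toy inode. */
static void toy_iget(struct toy_inode *inode)
{
	inode->count++;
}

/*
 * Drop a reference. Returns 0 on success and -1 at the point where the
 * real iput() would trip BUG_ON(inode->i_state & I_CLEAR).
 */
static int toy_iput(struct toy_inode *inode)
{
	if (!inode)
		return 0;
	if (inode->state & TOY_I_CLEAR)
		return -1;
	/* Assumption: dropping the last reference tears the inode down
	 * and marks it cleared, modelling the effect of iput_final(). */
	if (--inode->count == 0)
		inode->state |= TOY_I_CLEAR;
	return 0;
}
```

With one toy_iget() followed by two toy_iput() calls, the first put returns 0
and the second returns -1. A put without a matching get ends up in the same
place, which is why either mistake would fit the trace.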

> Also, the client logs could help shed light on the issue. You
> should have dynamic debugging turned on (CONFIG_DYNAMIC_DEBUG), and
> something along the lines of:
>
> # mount -t debugfs none /sys/kernel/debug
> # echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
> # echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control

New kernels are building right now. I have upgraded to 3.0.8, pulled in the
new ceph-client master fix 8ba1683acc83aee4bcab304844f8e60330e5ef1f and
enabled CONFIG_DYNAMIC_DEBUG. This kernel will go onto two big servers this
time to give it some real load. Let's see whether I can reproduce the bug
there, too. If so, I will provide the debug output as requested.

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH           Tel: +49 30 24342334
Am Köllnischen Park 1    Fax: +49 30 24342336
10179 Berlin             http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Geschäftsführer:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649
