From: Wido den Hollander <wido@42on.com>
To: Gregory Farnum <greg@gregs42.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: CephFS hangs when writing 10GB files in loop
Date: Thu, 18 Dec 2014 11:13:42 +0100
Message-ID: <5492A8D6.1020802@42on.com>
In-Reply-To: <CAC6JEv9JBUjKgVzFcribvxLSn+HGhXbpdaymVWYi1RoD+YCqeg@mail.gmail.com>

On 12/17/2014 07:42 PM, Gregory Farnum wrote:
> On Wed, Dec 17, 2014 at 8:35 AM, Wido den Hollander <wido@42on.com> wrote:
>> Hi,
>>
>> Today I've been playing with CephFS and the morning started great with
>> CephFS playing along just fine.
>>
>> Some information first:
>> - Ceph 0.89
>> - Linux kernel 3.18
>> - Ceph fuse 0.89
>> - One Active MDS, one Standby
>>
>> This morning I could write a 10GB file like this using the kclient:
>> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>>
>> That gave me 850MB/sec (all 10G network) and I could read the same file
>> again with 610MB/sec.
>>
>> After writing to it multiple times it suddenly started to hang.
>>
>> No real evidence on the MDS (debug mds set to 20) or anything on the
>> client. That specific operation just blocked, but I could still 'ls' the
>> filesystem in a second terminal.
>>
>> The MDS was showing in its log that it was checking active sessions of
>> clients. It showed the active session of my single client.
>>
>> The client renewed its caps and proceeded.
> 
> Can you clarify this? I'm not quite sure what you mean.
> 

I currently don't have the logs available. That was my problem when
typing the original e-mail.

>> I currently don't have any logs, but I'm just looking for a direction to
>> be pointed towards.
>>
>> Any ideas?
> 
> Well, now that you're on v0.89 you should explore the admin
> socket...there are commands on the MDS to dump ops in flight (and
> maybe to look at session states? I don't remember when that merged).

Sage's pointer towards the kernel debugging and the new admin socket
showed me that it was RADOS calls that were hanging.
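
As an illustration of this kind of check: the kernel client exposes its
in-flight requests through debugfs, in per-mount files named osdc (OSD/RADOS
requests) and mdsc (MDS requests). The following is only a minimal sketch,
assuming debugfs is mounted at /sys/kernel/debug and the script runs as root;
the function name dump_kclient_requests is hypothetical and not part of Ceph.

```shell
#!/bin/sh
# Print in-flight OSD (RADOS) and MDS requests for every kernel-client mount.
# $1: ceph debugfs directory (assumed default: /sys/kernel/debug/ceph).
dump_kclient_requests() {
    root="${1:-/sys/kernel/debug/ceph}"
    found=0
    for dir in "$root"/*/; do
        # Skip the unexpanded glob when the directory is missing or empty.
        [ -d "$dir" ] || continue
        found=1
        for f in osdc mdsc; do
            echo "== ${dir}${f}"
            if [ -r "${dir}${f}" ]; then
                cat "${dir}${f}"
            fi
        done
    done
    if [ "$found" -ne 1 ]; then
        echo "no ceph kernel client instances under $root"
    fi
}

dump_kclient_requests
```

Entries that stay listed in osdc across repeated runs would point at OSD
requests that are not completing. On the MDS side, the admin socket can be
queried in a similar spirit with `ceph daemon mds.<name> dump_ops_in_flight`,
where <name> is a placeholder and availability depends on the Ceph version.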

I investigated even further and it seems that this is not a CephFS
problem, but a local TCP issue which is only triggered when using CephFS.

At some point, which is still unclear to me, data transfer becomes very
slow. The MDS doesn't seem to be able to update the journal and the
client can't write to the OSDs anymore.

It happened after I did some very basic TCP tuning (timestamp, rmem,
wmem, sack, fastopen).
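
For reference, the settings named above presumably correspond to the standard
Linux sysctl keys under net.ipv4. That mapping is an assumption, and the
values that were actually applied are not given in this message. A minimal
sketch for printing the current values so they can be compared against the
distribution defaults (the function name show_tcp_settings is hypothetical):

```shell
#!/bin/sh
# Print the current value of each TCP sysctl that plausibly matches the
# tuning mentioned above. The key list is an assumption, not taken from
# the original message.
show_tcp_settings() {
    for key in tcp_timestamps tcp_rmem tcp_wmem tcp_sack tcp_fastopen; do
        path="/proc/sys/net/ipv4/$key"
        if [ -r "$path" ]; then
            # tcp_rmem and tcp_wmem hold three tab-separated numbers.
            printf 'net.ipv4.%s = %s\n' "$key" "$(tr '\t' ' ' < "$path")"
        else
            printf 'net.ipv4.%s = (unavailable)\n' "$key"
        fi
    done
}

show_tcp_settings
```

Saving this output before and after a tuning change makes it easier to tell
which individual key triggers the slowdown.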

Reverting to the Ubuntu 14.04 defaults resolved it all and CephFS
is running happily now.

I'll dig a bit deeper to see why this system was affected by those
changes. I applied these settings earlier on an RBD-only cluster without
any problems.

> -Greg
> 


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
