From: "Jim Schutt" <jaschut@sandia.gov>
To: sridhar basam <sri@basam.org>
Cc: ceph-devel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [RFC PATCH 0/6] Understanding delays due to throttling under very heavy write load
Date: Mon, 13 Feb 2012 08:26:03 -0700
Message-ID: <4F392B8B.4030204@sandia.gov>
In-Reply-To: <CAGnVnB=scn13bE-+0xn4cqYnStObN+VvSMv4bL3whvRzXv1dFw@mail.gmail.com>
On 02/10/2012 05:05 PM, sridhar basam wrote:
>> > But the server never ACKed that packet. Too busy?
>> >
>> > I was collecting vmstat data during the run; here's the important bits:
>> >
>> > Fri Feb 10 11:56:51 MST 2012
>> > vmstat -w 8 16
>> > procs  ---------memory---------- -swap ----io---- ---system---- -----cpu------
>> >   r  b swpd   free buff    cache si so bi      bo     in     cs us sy id wa st
>> >  13 10    0 250272  944 37859080  0  0  7    5346   1098    444  2  5 92  1  0
>> >  88  8    0 260472  944 36728776  0  0  0 1329838 257602  68861 19 73  5  4  0
>> > 100 10    0 241952  944 36066536  0  0  0 1635891 340724  85570 22 68  6  4  0
>> > 105  9    0 250288  944 34750820  0  0  0 1584816 433223 111462 21 73  4  3  0
>> > 126  3    0 259908  944 33841696  0  0  0  749648 225707  86716  9 83  4  3  0
>> > 157  2    0 245032  944 31572536  0  0  0  736841 252406  99083  9 81  5  5  0
>> >  45 17    0 246720  944 28877640  0  0  1  755085 282177 116551  8 77  9  5  0
> Holy crap! That might explain why you aren't seeing anything. You are
> writing out over 1.6 million blocks/sec, and that is averaged over an
> 8-second interval. I bet the missed acks are when this is happening.
> What sort of I/O load is going through this system during those times?
> What sort of filesystem and Linux system are these OSDs on?
Dual socket Nehalem EP @ 3 GHz, 24 ea. 7200RPM SAS drives w/ 64 MB cache,
3 LSI SAS HBAs w/8 drives per HBA, btrfs, 3.2.0 kernel. Each OSD
has a ceph journal and a ceph data store on a single drive.
I'm running 24 OSDs on such a box; all that write load is the result
of dd from 166 linux ceph clients.
FWIW, I've seen these boxes sustain > 2 GB/s for 60 sec or so under
this load, when I have TSO/GSO/GRO turned on, and am writing to
a freshly created ceph filesystem.
That lasts until my OSDs get stalled reading from a socket, as
documented by those packet traces I posted.
If you compare the timestamps on the retransmits to the times
that vmstat is dumping reports, at least some of the retransmits
hit the system when it is ~80% idle.
-- Jim
>
> Sridhar
>
>
>
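
For readers who want to relate the quoted vmstat "bo" column to the
throughput figures mentioned in this message, here is a minimal sketch.
It assumes that vmstat reports "bo" in 1024-byte blocks per second (this
is what the vmstat man page describes, but check it on your system) and
that the first row is the since-boot average. The names VMSTAT_ROWS,
parse_rows and write_rate_mb_per_s are hypothetical helpers and are not
part of any patch in this thread.

```python
# Rows copied from the vmstat output quoted in this message.
VMSTAT_ROWS = """\
13 10 0 250272 944 37859080 0 0 7 5346 1098 444 2 5 92 1 0
88 8 0 260472 944 36728776 0 0 0 1329838 257602 68861 19 73 5 4 0
100 10 0 241952 944 36066536 0 0 0 1635891 340724 85570 22 68 6 4 0
105 9 0 250288 944 34750820 0 0 0 1584816 433223 111462 21 73 4 3 0
126 3 0 259908 944 33841696 0 0 0 749648 225707 86716 9 83 4 3 0
157 2 0 245032 944 31572536 0 0 0 736841 252406 99083 9 81 5 5 0
45 17 0 246720 944 28877640 0 0 1 755085 282177 116551 8 77 9 5 0
"""

COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()

# Assumption: one vmstat "block" is 1024 bytes.
BLOCK_BYTES = 1024


def parse_rows(text):
    """Turn whitespace-separated vmstat rows into dicts keyed by column name."""
    return [
        dict(zip(COLUMNS, map(int, line.split())))
        for line in text.splitlines()
        if line.strip()
    ]


def write_rate_mb_per_s(row, block_bytes=BLOCK_BYTES):
    """Convert the 'bo' column (blocks/s) to decimal megabytes per second."""
    return row["bo"] * block_bytes / 1e6


rows = parse_rows(VMSTAT_ROWS)
for row in rows:
    rate = write_rate_mb_per_s(row)
    print(f"bo={row['bo']:>8} blocks/s  ~{rate:8.1f} MB/s  idle={row['id']}%")
```

Under that block-size assumption, the busiest 8-second interval in the
quoted output works out to a little under 1.7 GB/s of writes.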