From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike
Subject: Re: Cyclic performance drop
Date: Sat, 15 Oct 2016 16:22:10 +0300
Message-ID: <1476537730.1723.8.camel@gmail.com>
References: <716f53fa-0fce-a7b2-1d2a-f4bd10ea9133@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Return-path:
Received: from mail-lf0-f44.google.com ([209.85.215.44]:35892 "EHLO
	mail-lf0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753578AbcJONWb (ORCPT ); Sat, 15 Oct 2016 09:22:31 -0400
Received: by mail-lf0-f44.google.com with SMTP id b75so216368079lfg.3
	for ; Sat, 15 Oct 2016 06:22:30 -0700 (PDT)
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Sage Weil
Cc: ceph-devel

On Fri, 2016-10-14 at 18:53 +0000, Sage Weil wrote:
> On Fri, 14 Oct 2016, Mike wrote:
> > Hello.
> > On the latest Jewel release I see a cyclic performance drop on read operations.
> > Performance significantly drops every 4-5 seconds from ~70k IOPS to ~20k IOPS.
> >
> > It looks like this (some fields were truncated to fit in line length):
> >
> > ...
> > 19:46:10.432125 4096 pgs: 4096 active+clean; 67378 MB data, 82433 kB/s rd, 20608 op/s
> > 19:46:11.453338 4096 pgs: 4096 active+clean; 67378 MB data, 104 MB/s rd, 26857 op/s
> > 19:46:12.486138 4096 pgs: 4096 active+clean; 67378 MB data, 276 MB/s rd, 70879 op/s
> > 19:46:13.517175 4096 pgs: 4096 active+clean; 67378 MB data, 235 MB/s rd, 60375 op/s
> > 19:46:15.530826 4096 pgs: 4096 active+clean; 67378 MB data, 81768 kB/s rd, 20442 op/s
> > 19:46:16.561929 4096 pgs: 4096 active+clean; 67378 MB data, 132 MB/s rd, 33811 op/s
> > 19:46:17.582495 4096 pgs: 4096 active+clean; 67378 MB data, 277 MB/s rd, 71027 op/s
> > 19:46:18.614087 4096 pgs: 4096 active+clean; 67378 MB data, 200 MB/s rd, 51365 op/s
> > 19:46:20.643567 4096 pgs: 4096 active+clean; 67378 MB data, 97849 kB/s rd, 24462 op/s
> > 19:46:21.664988 4096 pgs: 4096 active+clean; 67378 MB data, 129 MB/s rd, 33108 op/s
> > 19:46:22.693243 4096 pgs: 4096 active+clean; 67378 MB data, 270 MB/s rd, 69269 op/s
> > 19:46:23.692111 4096 pgs: 4096 active+clean; 67378 MB data, 199 MB/s rd, 51186 op/s
> > 19:46:25.725054 4096 pgs: 4096 active+clean; 67378 MB data, 84951 kB/s rd, 21238 op/s
> > 19:46:26.746227 4096 pgs: 4096 active+clean; 67378 MB data, 132 MB/s rd, 33833 op/s
> > 19:46:27.779780 4096 pgs: 4096 active+clean; 67378 MB data, 293 MB/s rd, 75189 op/s
> > 19:46:28.775288 4096 pgs: 4096 active+clean; 67378 MB data, 204 MB/s rd, 52249 op/s
> > 19:46:30.795561 4096 pgs: 4096 active+clean; 67378 MB data, 75260 kB/s rd, 18815 op/s
> > 19:46:31.818544 4096 pgs: 4096 active+clean; 67378 MB data, 133 MB/s rd, 34243 op/s
> > 19:46:32.851392 4096 pgs: 4096 active+clean; 67378 MB data, 295 MB/s rd, 75755 op/s
> > 19:46:33.843960 4096 pgs: 4096 active+clean; 67378 MB data, 205 MB/s rd, 52649 op/s
> > 19:46:34.861416 4096 pgs: 4096 active+clean; 67378 MB data, 69177 kB/s rd, 17294 op/s
> > 19:46:35.872386 4096 pgs: 4096 active+clean; 67378 MB data, 85299 kB/s rd, 21324 op/s
> > 19:46:36.898020 4096 pgs: 4096 active+clean; 67378 MB data, 155 MB/s rd, 39896 op/s
> > 19:46:37.934147 4096 pgs: 4096 active+clean; 67378 MB data, 321 MB/s rd, 82209 op/s
> > 19:46:39.966386 4096 pgs: 4096 active+clean; 67378 MB data, 163 MB/s rd, 41735 op/s
> > 19:46:40.973110 4096 pgs: 4096 active+clean; 67378 MB data, 55481 kB/s rd, 13870 op/s
> > ...
> >
>
> You should probably confirm this result by looking at the raw perfcounter
> stats coming out of the OSD admin socket interface.  (Operators usually
> wire this up to graphite or similar monitoring tools.)
>
> If this is a smallish cluster, a simpler check would be
>
>  ceph daemonperf osd.0
>
> and see if the stats reported by a single OSD show the same behavior.
>
> The numbers reported by the monitor are not very accurate.  They average
> over a short period of time and can be sensitive to the timing of stat
> reports from OSDs (we're effectively taking the differential of a very
> choppy stair-step function and hoping for the best).
>
> sage
>

Thanks Sage!

You are right: when I use "ceph daemon perf ..." I don't see the issue
described above.

-- 
Mike, run.
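
To make Sage's "differential of a stair-step function" remark concrete, here is a minimal Python sketch of the effect. It is a toy model, not Ceph's actual monitor code: the function names, the four OSDs, the 5-second report interval, the report offsets and the 1-second averaging window are all assumptions chosen for illustration. Every simulated OSD serves reads at a perfectly constant rate, but its counter only becomes visible when it reports, so the aggregate is a stair-step and its short-window differential swings even though nothing is wrong.

```python
import math


def reported_total(t, offsets, interval, rate_per_osd):
    """Sum of the op counters as last reported by each OSD at time t.

    Each (hypothetical) OSD serves ops at a constant rate but only publishes
    its cumulative counter every `interval` seconds, starting at its offset.
    """
    total = 0.0
    for offset in offsets:
        if t < offset:
            continue  # this OSD has not sent its first report yet
        last_report = offset + interval * math.floor((t - offset) / interval)
        total += rate_per_osd * last_report
    return total


def windowed_rates(sample_times, window, offsets, interval, rate_per_osd):
    """Estimate op/s at each sample time by differencing the reported total."""
    return [
        (reported_total(t, offsets, interval, rate_per_osd)
         - reported_total(t - window, offsets, interval, rate_per_osd)) / window
        for t in sample_times
    ]


# Assumed toy parameters: 4 OSDs at 10k op/s each (true total: 40k op/s),
# reporting every 5 s at offsets 0, 1, 2 and 3 s; estimator window of 1 s.
offsets = [0.0, 1.0, 2.0, 3.0]
interval = 5.0
rate_per_osd = 10_000.0
sample_times = [10.5 + k for k in range(10)]

rates = windowed_rates(sample_times, 1.0, offsets, interval, rate_per_osd)
mean_rate = sum(rates) / len(rates)

for t, r in zip(sample_times, rates):
    print(f"t={t:5.1f}s  estimated {r:8.0f} op/s")
print(f"mean over two report cycles: {mean_rate:.0f} op/s")
```

With these parameters the estimate reads 50000 op/s for four consecutive samples and then 0 op/s for one, repeating every 5 seconds, while the mean over whole report cycles is the true 40000 op/s. A per-OSD view such as the daemonperf output avoids the problem because it reads the counters directly rather than through the periodic reports.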