From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 9 Jul 2015 17:32:50 +0000 (UTC)
From: Hogan Whittall
Message-ID: <110866563.1804043.1436463170539.JavaMail.yahoo@mail.yahoo.com>
Subject: Issue with RHEL6 mkfs.xfs (3.1.1+), HP P420 RAID, and MySQL replication
To: "xfs@oss.sgi.com"
List-Id: XFS Filesystem from SGI

Hello,

Recently we encountered a previously reported issue: write amplification with MySQL replication on XFS when used with certain RAID controllers (in our case, an HP P420). The earlier report, which matches our problem exactly, is here - http://oss.sgi.com/archives/xfs/2013-03/msg00133.html - but I don't see any resolution. I will say that the problem *does not* exist when mkfs.xfs 2.9.6 is used to format the filesystem on RHEL6, as that version sets sunit=0 and swidth=0 instead of deriving them from minimum_io_size and optimal_io_size.
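
For reference, the stripe values in the two listings further down can be related to the block-layer I/O hints with a little arithmetic. The sketch below assumes, as described above, that the stripe unit comes from minimum_io_size and the stripe width from optimal_io_size. The 262144 and 524288 byte figures are back-computed from the reported sunit=64 and swidth=128 blocks at bsize=4096; they were not read from the actual P420 volume. The helper names are hypothetical.

```python
from pathlib import Path

SECTOR = 512  # bytes; mkfs.xfs -d sunit/swidth are given in 512-byte units


def read_io_hints(dev):
    """Read the block-layer I/O hints (in bytes) for a device such as 'sda'."""
    queue = Path("/sys/block") / dev / "queue"
    min_io = int((queue / "minimum_io_size").read_text())
    opt_io = int((queue / "optimal_io_size").read_text())
    return min_io, opt_io


def stripe_geometry(min_io, opt_io, bsize=4096):
    """Convert I/O hints in bytes to stripe unit/width in sectors and fs blocks."""
    return {
        "sunit_sectors": min_io // SECTOR,
        "swidth_sectors": opt_io // SECTOR,
        "sunit_blks": min_io // bsize,
        "swidth_blks": opt_io // bsize,
    }


if __name__ == "__main__":
    # Assumed hints: 256 KiB minimum_io_size, 512 KiB optimal_io_size.
    print(stripe_geometry(262144, 524288))
```

With hints of zero (no topology reported), the same arithmetic yields sunit=0 and swidth=0, which is the geometry the older mkfs.xfs produced.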

We have systems that are identical in how they are built and configured. We can take a RHEL6 box that has the MySQL partition formatted with mkfs.xfs v3.1.1 and reproduce the write amplification problem with MySQL replication every single time. If we take the same box and format the MySQL partition with mkfs.xfs 2.9.6, then bring up MySQL with the exact same configuration, there is no problem. I've included the working and broken settings below. If it's not the sunit/swidth settings, then what would cause 7-10MB/s worth of writes to the XFS partition to become over 200MB/s downstream? The actual data change on the disks is not 200MB/s, but because the write ops are truly being amplified and not just misreported, our MySQL slaves with the bad XFS settings cannot keep up, and the lag steadily increases with no hope of ever becoming current.
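
A rough way to quantify the gap between the two layers is to compare the "sectors written" counters in /proc/diskstats for the upper and lower devices over the same interval. This is only a sketch: the original report does not say which tool produced the 7-10MB/s and 200MB/s figures, and the function names and device names are hypothetical.

```python
SECTOR = 512  # /proc/diskstats counts sectors in 512-byte units


def sectors_written(diskstats_text, dev):
    """Return the cumulative 'sectors written' counter for one device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major, minor, name, then the per-device counters.
        # 'sectors written' is the 7th counter, i.e. index 9 overall.
        if len(fields) > 9 and fields[2] == dev:
            return int(fields[9])
    raise KeyError(dev)


def write_rate_mb_s(before, after, dev, seconds):
    """Average write rate in MB/s between two /proc/diskstats snapshots."""
    delta = sectors_written(after, dev) - sectors_written(before, dev)
    return delta * SECTOR / seconds / 1e6


def amplification(upper_mb_s, lower_mb_s):
    """Ratio of writes seen below the filesystem to writes issued above it."""
    return lower_mb_s / upper_mb_s


if __name__ == "__main__":
    # Figures from the report: 7-10 MB/s in, over 200 MB/s downstream.
    print(amplification(10, 200), amplification(7, 200))
```

Taking the reported numbers at face value, the ratio is at least 20x and closer to 29x at the low end of the input range.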

I am happy to try other settings/options with the RHEL6 mkfs.xfs to see whether replication performance can match that of systems formatted with mkfs.xfs 2.9.6, but the values set by 3.1.1 with the P420 RAID do not work for MySQL replication. We have ruled out everything else as a possible cause; the only difference between these systems is the values set by mkfs.xfs.
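
One candidate experiment along these lines would be to ask the newer mkfs.xfs for the older geometry explicitly. The sketch below only builds and prints a command line; it does not run it. It assumes the documented mkfs.xfs options `-d agcount=`, `-d noalign`, `-l size=` and `-l lazy-count=` behave on xfsprogs 3.1.1 as the man page describes, which has not been verified here, and the default values are copied from the working listing below. The function name is hypothetical.

```python
import shlex


def mkfs_xfs_argv(device, agcount=4, log_blocks=32768, lazy_count=0):
    """Build (but do not run) a mkfs.xfs command line without stripe alignment."""
    # noalign asks mkfs.xfs to ignore the stripe geometry reported by the device.
    data_opts = f"agcount={agcount},noalign"
    # The 'b' suffix expresses the log size in filesystem blocks.
    log_opts = f"size={log_blocks}b,lazy-count={lazy_count}"
    return ["mkfs.xfs", "-d", data_opts, "-l", log_opts, device]


if __name__ == "__main__":
    # Printing only: mkfs.xfs destroys existing data on the target device.
    print(shlex.join(mkfs_xfs_argv("/dev/mapper/sys-home")))
```

Changing one option at a time (for example, only `noalign` first) would help separate the effect of stripe alignment from the other differences.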

============================================================
Working RHEL6 XFS partition:

meta-data=/dev/mapper/sys-home   isize=256    agcount=4, agsize=71271680 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=285086720, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

============================================================
Broken RHEL6 XFS partition:

meta-data=/dev/mapper/sys-home   isize=256    agcount=32, agsize=8908992 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=285086720, imaxpct=5
         =                       sunit=64     swidth=128 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=139264, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

============================================================
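
Comparing the two listings field by field shows exactly which values differ. The following is a minimal sketch with a hypothetical parser for this output format; it tags each key with the section label at the start of the line (meta-data, data, naming, log, realtime), because names such as sunit and blocks appear in more than one section.

```python
import re

WORKING = """\
meta-data=/dev/mapper/sys-home   isize=256    agcount=4, agsize=71271680 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=285086720, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
"""

BROKEN = """\
meta-data=/dev/mapper/sys-home   isize=256    agcount=32, agsize=8908992 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=285086720, imaxpct=5
         =                       sunit=64     swidth=128 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=139264, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
"""


def parse_geometry(text):
    """Map 'section.key' to its value for every key=value pair in a listing."""
    fields = {}
    section = None
    for line in text.splitlines():
        label, _, rest = line.partition("=")
        if label.strip():
            section = label.strip()  # continuation lines keep the last section
        for key, value in re.findall(r"([\w-]+)=(\S+?),?(?=\s|$)", rest):
            fields[f"{section}.{key}"] = value
    return fields


def diff_geometry(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = sorted(a.keys() | b.keys())
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    changes = diff_geometry(parse_geometry(WORKING), parse_geometry(BROKEN))
    for key, (old, new) in changes.items():
        print(f"{key}: {old} -> {new}")
```

Besides the data sunit and swidth, the listings also differ in agcount, agsize, log size, log sunit and lazy-count, so each of these is a candidate to vary when testing.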

Thanks!

-Hogan
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs