From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mondschein.lichtvoll.de ([194.150.191.11]:42219 "EHLO
    mail.lichtvoll.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1751505AbaL0TYJ (ORCPT );
    Sat, 27 Dec 2014 14:24:09 -0500
From: Martin Steigerwald
To: Hugo Mills
Cc: Zygo Blaxell, Robert White, linux-btrfs@vger.kernel.org
Subject: Re: BTRFS free space handling still needs more work: Hangs again (no complete lockups, "just" tasks stuck for some time)
Date: Sat, 27 Dec 2014 20:23:59 +0100
Message-ID: <2138510.KXMt4iLDat@merkaba>
In-Reply-To: <20141227184017.GL25267@carfax.org.uk>
References: <3738341.y7uRQFcLJH@merkaba> <20141227182846.GA11878@hungrycats.org> <20141227184017.GL25267@carfax.org.uk>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="nextPart28980453.7n3ZBEQpEO"; micalg="pgp-sha1"; protocol="application/pgp-signature"
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

--nextPart28980453.7n3ZBEQpEO
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

On Saturday, 27 December 2014, 18:40:17, Hugo Mills wrote:
> On Sat, Dec 27, 2014 at 01:28:46PM -0500, Zygo Blaxell wrote:
> > On Sat, Dec 27, 2014 at 09:30:43AM +0000, Hugo Mills wrote:
> > > On Sat, Dec 27, 2014 at 10:01:17AM +0100, Martin Steigerwald wrote:
> > > > On Friday, 26 December 2014, 14:48:38, Robert White wrote:
> > > > > On 12/26/2014 05:37 AM, Martin Steigerwald wrote:
> > > Now, since you're seeing lockups when the space on your disks is
> > > all allocated I'd say that's a bug. However, you're the *only* person
> > > who's reported this as a regular occurrence. Does this happen with all
> > > filesystems you have, or just this one?
> >
> > I do see something similar, but there are so many problems going on I
> > have no idea which ones to report, and which ones are my own doing. :-P
> >
> > I see lots of CPU being burned when all the disk space is allocated
> > to chunks, but there is still lots of space free (multiple GB) inside
> > the chunks.
> >
> > iotop shows a crapton of disk writes (1-5MB/sec) from one kworker.
> > There are maybe a few kB/sec of writes through the filesystem at the time.
> >
> > The filesystem where I see this most is on a laptop, so the disk writes
> > also hit the CPU again for encryption. There's so much CPU usage it's
> > worth mentioning twice. :-(
> >
> > 'watch cat /proc/12345/stack' on the active processes shows the kernel
> > fairly often in that new chunk deallocator function whose name escapes
> > me at the moment.
> >
> > Deleting a bunch of data then running balance helps return to sane CPU
> > usage...for a while (maybe a week?).
> >
> > It's not technically "locked up" per se, but when a 5KB download takes
> > a minute or more, most users won't wait around to see the difference.
> >
> > Kernel versions I'm using are 3.17.7 and 3.18.1.
>
> OK, so I'd like to change my statement above.
>
> When I first read Martin's problem, I thought that he was referring
> to a complete, hit-the-power-button kind of lock-up. Given that
> (erroneous) assumption, I stand by my (now pointless) statement. :)
>
> I realised during a brief conversation on IRC that Martin was
> actually referring to long but temporary periods where the machine is
> unusable by any process requiring disk activity. There's clearly a
> number of people seeing that.
>
> It doesn't stop it being a major problem, but it does change the
> interpretation considerably.

Ah, so my guess about who I was talking to there was right.
:)

Yeah, it does not seem to be a complete hang. I thought so initially,
because honestly, after waiting several minutes for my Plasma desktop to
come back, I just gave up. Maybe it would have returned at some point. I
just didn't have the patience to wait.

This time it did come back: in my last test the desktop session locked up
and I continued on tty1 (I had all the testing in a screen session). Some
time after the test completed I was able to use that desktop again, and I
am still using it.

So the issue I see is: one kworker uses 100% of one core for minutes, and
while it does so, processes that do I/O to the BTRFS filesystem I am
testing (/home in my case) seem to be stuck in uninterruptible sleep ("D"
process state). While I see this there is no huge load on the SSDs, so…
it seems to be something CPU bound. I haven't yet run strace on the
kworker process, or on the fio process at allocation time. Robert, that's
a good suggestion. From a gut feeling I wouldn't be surprised if I see
*nothing* in strace, as my bet is that the kworker thread deals with
finding free space inside the chunks and works on some data structures
while doing so. But that is really just a gut feeling, so an strace would
be nice.

I made a backup yesterday, so I think I can try the strace. But I have
also spent a considerable amount of time reproducing the issue and digging
deeper into it, so likely not this weekend anymore, although this is even
somewhat fun. But I see myself neglecting other stuff that's important to
me as well, so…

My simple test case didn't trigger it, and I do not have another twice
160 GiB available on these SSDs to try with a copy of my home filesystem.
With that I could safely test without bringing the desktop session to a
halt. Maybe someone has an idea on how to "enhance" my test case in order
to reliably trigger the issue.

It may be challenging though. My /home is quite a filesystem.
It has a maildir with at least one million files (yeah, I am performance
testing KMail and Akonadi to the limit as well!), and it has git repos and
this one VM image, and the desktop search and the Akonadi database. In
other words: it has been hit nicely with various, mostly random I think,
workloads over the last six months or so. I bet it's not that easy to
simulate that. Maybe some runs of compilebench to age the filesystem
before the fio test?

That said, BTRFS performs a lot better. The complete lockups without any
CPU usage of 3.15 and 3.16 are gone for sure. That's wonderful. But there
is this kworker issue now. I noticed it this badly only while trying to
complete this tax return stuff with the Windows XP VM. It may have
happened at other times, I have seen some backtraces in kern.log, but it
didn't last for minutes. So this is indeed less severe than the full
lockups with 3.15 and 3.16.

Zygo, what are the characteristics of your filesystem? Do you use
compress=lzo and skinny metadata as well? How are the chunks allocated?
What kind of data do you have on it?

Well, now off to some dancing event. That's just the right thing now :)

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7

--nextPart28980453.7n3ZBEQpEO
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part.
Content-Transfer-Encoding: 7Bit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEABECAAYFAlSfB1MACgkQmRvqrKWZhMdo9QCgtX9PvOonejBzXUUVimDSEzAH
/6IAn31tWaDpKM4541jEljUdT9bdRgrR
=fqeB
-----END PGP SIGNATURE-----

--nextPart28980453.7n3ZBEQpEO--