From: Zygo Blaxell
To: Duncan <1i5t5.duncan@cox.net>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: 5 _thousand_ snapshots? even 160? (was: device balance times)
Date: Wed, 22 Oct 2014 16:08:12 -0400
Message-ID: <20141022200812.GA17395@hungrycats.org>
References: <9cf38edae6c01b900d4ea0068d2dcfdd@admin.virtall.com>

On Wed, Oct 22, 2014 at 07:41:32AM +0000, Duncan wrote:
> Tomasz Chmielewski posted on Wed, 22 Oct 2014 09:14:14 +0200 as excerpted:
>
> >> Tho that is of course per subvolume. If you have multiple subvolumes
> >> on the same filesystem, that can still end up being a thousand or two
> >> snapshots per filesystem. But those are all groups of something under
> >> 300 (under 100 with hourly) highly connected to each other, with the
> >> interweaving inside each of those groups being the real complexity in
> >> terms of btrfs management.
>
> IOW, if you thin down the snapshots per subvolume to something reasonable
> (under 300 for sure, preferably under 100), then depending on the number
> of subvolumes you're snapshotting, you might have a thousand or two.
> However, of those couple thousand, btrfs will only have to deal with the
> under 300 and preferably well under a hundred in the same group, that are
> snapshots of the same thing and thus related to each other, at any given
> time.
> The other snapshots will be there but won't be adding to the
> complexity near as much since they're of different subvolumes and aren't
> logically interwoven together with the ones being considered at that
> moment.
>
> But even then, at say 250 snapshots per subvolume, 2000 snapshots is 8
> independent subvolumes. That could happen. But 5000 snapshots? That'd
> be 20 independent subvolumes, which is heading toward the extreme again.
> Yes it could happen, but better if it does to cut down on the per-
> subvolume snapshots further, to say the 25 per subvolume I mentioned, or
> perhaps even further. 25 snapshots per subvolume with those same 20
> subvolumes... 500 snapshots total instead of 5000. =:^)

If you have one subvolume per user and 1000 user directories on a server,
5000 snapshots is only 5 snapshots per user (last hour, last day, last
week, last month, and last year). I hear this is a normal use case in the
ZFS world. It would certainly be attractive if there were working quota
support.

I have datasets where I record 14000+ snapshots of filesystem directory
trees scraped from test machines and aggregated onto a single server for
deduplication...but I store each snapshot as a git commit, not as a btrfs
snapshot or even subvolume.

We do sometimes run queries like "in the last two years, how many times
did $CONDITION occur?" which will scan a handful of files in all of the
snapshots. The use case itself isn't unreasonable, although using the
filesystem instead of a more domain-specific tool to achieve it may be.
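
For reference, the snapshot counts traded back and forth in this thread can be checked with a small sketch like the one below. The function names are made up for illustration, and nothing here talks to btrfs; it only restates the arithmetic, assuming every subvolume keeps the same number of snapshots.

```python
# Snapshot-count arithmetic from the thread.
# Assumption: a uniform retention policy, i.e. the same number of
# snapshots kept for every subvolume. Function names are hypothetical.

def total_snapshots(subvolumes: int, per_subvolume: int) -> int:
    """Total snapshots on one filesystem."""
    return subvolumes * per_subvolume


def subvolumes_for(total: int, per_subvolume: int) -> int:
    """Subvolumes needed to reach `total` snapshots (rounded up)."""
    return -(-total // per_subvolume)  # ceiling division on integers


if __name__ == "__main__":
    # Duncan's figures: 250 snapshots per subvolume.
    print(subvolumes_for(2000, 250))   # subvolumes behind 2000 snapshots
    print(subvolumes_for(5000, 250))   # subvolumes behind 5000 snapshots
    # Thinning to 25 per subvolume across 20 subvolumes.
    print(total_snapshots(20, 25))
    # One subvolume per user, 1000 users, 5 snapshots each
    # (last hour, day, week, month, year).
    print(total_snapshots(1000, 5))
```

The last case is the one that matters for the argument: 5000 snapshots on a filesystem can come from a very light per-subvolume policy once there are enough subvolumes.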