* Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Victor Hooi @ 2020-03-29 22:30 UTC
To: linux-btrfs

Hi,

I have a small 12-bay SuperMicro server I'm using as a local NAS, with
FreeNAS/ZFS.

Each drive is a 12TB HDD.

I'm in the process of moving it to Linux - and I thought this might be
a good chance to try out BTRFS again =).

(I'd previously tried BTRFS many years ago, and hit some issues -
it's possible this may have been made worse by my inexperience with
BTRFS at the time - e.g.
https://www.spinics.net/lists/linux-btrfs/msg04240.html)

Anyhow - currently the server has a 750GB Intel Optane drive, that
we're using as a ZLOG/SIL drive:

https://www.ixsystems.com/community/threads/how-best-to-use-960gb-optane-in-freenas-build.75798/#post-527264

My question is - what's the equivalent in BTRFS-land?

Or what is the best way to use an ultra-fast Intel Optane drive to
accelerate reads/writes on a BTRFS array?

Thanks,
Victor
* Re: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Andrei Borzenkov @ 2020-03-30 5:46 UTC
To: Victor Hooi, linux-btrfs

30.03.2020 01:30, Victor Hooi wrote:
> [...]
> Anyhow - currently the server has a 750GB Intel Optane drive, that
> we're using as a ZLOG/SIL drive:

Do you mean ZIL/SLOG? ZIL == ZFS Intent Log, SLOG == Separate Log
(a dedicated log device).

> https://www.ixsystems.com/community/threads/how-best-to-use-960gb-optane-in-freenas-build.75798/#post-527264
>
> My question is - what's the equivalent in BTRFS-land?

Not on btrfs level. I guess using bcache underneath btrfs may achieve
some similar effects.

> Or what is the best way to use an ultra-fast Intel Optane drive to
> accelerate reads/writes on a BTRFS array?

ZIL is a *write* intent log; it does not directly accelerate reads. ZFS
supports an SSD as a second-level read cache (L2ARC), but as far as I
remember it is physically separate from the ZIL.
* RE: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Paul Jones @ 2020-03-30 6:00 UTC
To: Andrei Borzenkov, Victor Hooi, linux-btrfs

On Monday, 30 March 2020 4:46 PM, Andrei Borzenkov wrote:
> [...]
> > My question is - what's the equivalent in BTRFS-land?
>
> Not on btrfs level. I guess using bcache underneath btrfs may achieve
> some similar effects.
>
> > Or what is the best way to use an ultra-fast Intel Optane drive to
> > accelerate reads/writes on a BTRFS array?
>
> ZIL is a *write* intent log; it does not directly accelerate reads. ZFS
> supports an SSD as a second-level read cache (L2ARC), but as far as I
> remember it is physically separate from the ZIL.

I have used caching with lvm under btrfs. It's a pain to set up correctly
for a btrfs raid1 setup (you need separate volume groups with separate
logical volumes to ensure it's impossible to have two raid1 stripes on
the same physical disk without noticing it), but it did work quite well
and I never had any strange problems with it.

Paul.
* Re: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Eli V @ 2020-03-31 17:01 UTC
To: Paul Jones; +Cc: Andrei Borzenkov, Victor Hooi, linux-btrfs

On Mon, Mar 30, 2020 at 2:02 AM Paul Jones <paul@pauljones.id.au> wrote:
> [...]
> I have used caching with lvm under btrfs. It's a pain to set up correctly
> for a btrfs raid1 setup (you need separate volume groups with separate
> logical volumes to ensure it's impossible to have two raid1 stripes on
> the same physical disk without noticing it), but it did work quite well
> and I never had any strange problems with it.
>
> Paul.

Another option is to put the 12TB drives in an mdadm RAID, and then
use the mdadm RAID and the SSD for btrfs RAID1 metadata, with SINGLE
data on the array. Currently, this will make roughly half of the
metadata lookups run at SSD speed, but there is a pending patch to
allow all the metadata reads to go to the SSD. This option is, of
course, only useful for speeding up metadata operations. It can,
however, make large btrfs filesystems feel much more responsive in
interactive use.
* Re: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Andrei Borzenkov @ 2020-03-31 17:09 UTC
To: Eli V, Paul Jones; +Cc: Victor Hooi, linux-btrfs

31.03.2020 20:01, Eli V wrote:
>
> Another option is to put the 12TB drives in an mdadm RAID, and then
> use the mdadm RAID and the SSD for btrfs RAID1 metadata, with SINGLE
> data on the array.

How do you restrict a specific device to metadata only?

> Currently, this will make roughly half of the
> metadata lookups run at SSD speed, but there is a pending patch to
> allow all the metadata reads to go to the SSD. This option is, of
> course, only useful for speeding up metadata operations. It can,
> however, make large btrfs filesystems feel much more responsive in
> interactive use.
* Re: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Goffredo Baroncelli @ 2020-03-31 20:08 UTC
To: Andrei Borzenkov, Eli V, Paul Jones; +Cc: Victor Hooi, linux-btrfs

On 3/31/20 7:09 PM, Andrei Borzenkov wrote:
> 31.03.2020 20:01, Eli V wrote:
>>
>> Another option is to put the 12TB drives in an mdadm RAID, and then
>> use the mdadm RAID and the SSD for btrfs RAID1 metadata, with SINGLE
>> data on the array.
>
> How do you restrict a specific device to metadata only?

I never tried, but I don't think that it would be so complicated.

When BTRFS has to allocate a new chunk, it collects all the available
free space on the disks, sorts it by criteria such as the largest
contiguous area and how full each disk is, and picks the top one.

It could be sufficient to add another criterion to the sorting algorithm,
something like this:
- if the chunk is a metadata one, an SSD has a higher priority
- if the chunk is a data one, an SSD has a lower priority

So the metadata will have a higher likelihood of being on the SSD,
while the data will have a higher likelihood of being on a non-SSD disk.

Of course this is a soft constraint: when one kind of disk is full, it
will still be possible to use the other kind, only with a lower priority.

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6  5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Goffredo Baroncelli @ 2020-03-31 21:44 UTC
To: Andrei Borzenkov, Eli V, Paul Jones; +Cc: Victor Hooi, linux-btrfs

On 3/31/20 10:08 PM, Goffredo Baroncelli wrote:
> On 3/31/20 7:09 PM, Andrei Borzenkov wrote:
>> How do you restrict a specific device to metadata only?
>
> I never tried, but I don't think that it would be so complicated.
> [...]
> It could be sufficient to add another criterion to the sorting algorithm,
> something like this:
> - if the chunk is a metadata one, an SSD has a higher priority
> - if the chunk is a data one, an SSD has a lower priority
> [...]
> Of course this is a soft constraint: when one kind of disk is full, it
> will still be possible to use the other kind, only with a lower priority.

The patch below is only meant to give an idea. To enable the feature, the
filesystem must be mounted with the ssd_metadata option:

    # mount -o ssd_metadata /dev/sdX /mnt/test

(Don't try this at home!)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2e9f938508e9..0f3c09cc4863 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1187,6 +1187,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_fs_info *info)
 #define BTRFS_MOUNT_FREE_SPACE_TREE	(1 << 26)
 #define BTRFS_MOUNT_NOLOGREPLAY		(1 << 27)
 #define BTRFS_MOUNT_REF_VERIFY		(1 << 28)
+#define BTRFS_MOUNT_SSD_METADATA	(1 << 29)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
 #define BTRFS_DEFAULT_MAX_INLINE	(2048)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index c6557d44907a..d0a5cf496f90 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -346,6 +346,7 @@ enum {
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
 	Opt_ref_verify,
 #endif
+	Opt_ssd_metadata,
 	Opt_err,
 };
 
@@ -416,6 +417,7 @@ static const match_table_t tokens = {
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
 	{Opt_ref_verify, "ref_verify"},
 #endif
+	{Opt_ssd_metadata, "ssd_metadata"},
 	{Opt_err, NULL},
 };
 
@@ -853,6 +855,10 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 			btrfs_set_opt(info->mount_opt, REF_VERIFY);
 			break;
 #endif
+		case Opt_ssd_metadata:
+			btrfs_set_and_info(info, SSD_METADATA,
+					"enabling ssd_metadata");
+			break;
 		case Opt_err:
 			btrfs_info(info, "unrecognized mount option '%s'", p);
 			ret = -EINVAL;
@@ -1369,6 +1375,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 #endif
 	if (btrfs_test_opt(info, REF_VERIFY))
 		seq_puts(seq, ",ref_verify");
+	if (btrfs_test_opt(info, SSD_METADATA))
+		seq_puts(seq, ",ssd_metadata");
 	seq_printf(seq, ",subvolid=%llu",
 		   BTRFS_I(d_inode(dentry))->root->root_key.objectid);
 	seq_puts(seq, ",subvol=");
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a8b71ded4d21..43bb5d98a8cb 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4758,6 +4758,67 @@ static int btrfs_cmp_device_info(const void *a, const void *b)
 	return 0;
 }
 
+/*
+ * sort the devices with non-rotational ones first, then in
+ * descending order by max_avail, total_avail
+ */
+static int btrfs_cmp_device_info_metadata(const void *a, const void *b)
+{
+	const struct btrfs_device_info *di_a = a;
+	const struct btrfs_device_info *di_b = b;
+	const int nrot_a = test_bit(QUEUE_FLAG_NONROT,
+				&(bdev_get_queue(di_a->dev->bdev)->queue_flags));
+
+	const int nrot_b = test_bit(QUEUE_FLAG_NONROT,
+				&(bdev_get_queue(di_b->dev->bdev)->queue_flags));
+
+	/* metadata -> non rotational first */
+	if (nrot_a && !nrot_b)
+		return -1;
+	if (!nrot_a && nrot_b)
+		return 1;
+	if (di_a->max_avail > di_b->max_avail)
+		return -1;
+	if (di_a->max_avail < di_b->max_avail)
+		return 1;
+	if (di_a->total_avail > di_b->total_avail)
+		return -1;
+	if (di_a->total_avail < di_b->total_avail)
+		return 1;
+	return 0;
+}
+
+/*
+ * sort the devices with rotational ones first, then in
+ * descending order by max_avail, total_avail
+ */
+static int btrfs_cmp_device_info_data(const void *a, const void *b)
+{
+	const struct btrfs_device_info *di_a = a;
+	const struct btrfs_device_info *di_b = b;
+	const int nrot_a = test_bit(QUEUE_FLAG_NONROT,
+				&(bdev_get_queue(di_a->dev->bdev)->queue_flags));
+	const int nrot_b = test_bit(QUEUE_FLAG_NONROT,
+				&(bdev_get_queue(di_b->dev->bdev)->queue_flags));
+
+	/* data -> non rotational last */
+	if (nrot_a && !nrot_b)
+		return 1;
+	if (!nrot_a && nrot_b)
+		return -1;
+	if (di_a->max_avail > di_b->max_avail)
+		return -1;
+	if (di_a->max_avail < di_b->max_avail)
+		return 1;
+	if (di_a->total_avail > di_b->total_avail)
+		return -1;
+	if (di_a->total_avail < di_b->total_avail)
+		return 1;
+	return 0;
+}
+
+
+
 static void check_raid56_incompat_flag(struct btrfs_fs_info *info, u64 type)
 {
 	if (!(type & BTRFS_BLOCK_GROUP_RAID56_MASK))
@@ -4917,9 +4978,17 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	/*
 	 * now sort the devices by hole size / available space
 	 */
-	sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
-	     btrfs_cmp_device_info, NULL);
-
+	if (((type & BTRFS_BLOCK_GROUP_DATA) &&
+	     (type & BTRFS_BLOCK_GROUP_METADATA)) ||
+	    !btrfs_test_opt(info, SSD_METADATA))
+		sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
+		     btrfs_cmp_device_info, NULL);
+	else if (type & BTRFS_BLOCK_GROUP_DATA)
+		sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
+		     btrfs_cmp_device_info_data, NULL);
+	else
+		sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
+		     btrfs_cmp_device_info_metadata, NULL);
 	/*
 	 * Round down to number of usable stripes, devs_increment can be any
 	 * number so we can't use round_down()

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6  5F7D 17B2 0EDA 9B37 8B82 E0B5
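
For anyone who wants to play with the ordering outside the kernel, the
comparator logic of the patch above can be approximated in a few lines of
userspace C. This is only a sketch under stated assumptions, not part of
the posted patch: struct dev_info, the CHUNK_* flags and sort_devices() are
hypothetical stand-ins for btrfs_device_info, the BTRFS_BLOCK_GROUP_* flags
and the sort() call in __btrfs_alloc_chunk(), and qsort() replaces the
kernel's sort().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct btrfs_device_info. */
struct dev_info {
	const char *name;
	bool nonrot;		/* true for SSD-like (non-rotational) devices */
	uint64_t max_avail;	/* largest contiguous free area */
	uint64_t total_avail;	/* total free space */
};

/* Hypothetical chunk type flags (the kernel uses BTRFS_BLOCK_GROUP_*). */
enum { CHUNK_DATA = 1, CHUNK_METADATA = 2 };

/* Descending by max_avail, then total_avail: the ordering without the option. */
static int cmp_space(const void *a, const void *b)
{
	const struct dev_info *x = a, *y = b;

	if (x->max_avail != y->max_avail)
		return x->max_avail > y->max_avail ? -1 : 1;
	if (x->total_avail != y->total_avail)
		return x->total_avail > y->total_avail ? -1 : 1;
	return 0;
}

/* Metadata chunks: non-rotational devices first, then by free space. */
static int cmp_metadata(const void *a, const void *b)
{
	const struct dev_info *x = a, *y = b;

	if (x->nonrot != y->nonrot)
		return x->nonrot ? -1 : 1;
	return cmp_space(a, b);
}

/* Data chunks: non-rotational devices last, then by free space. */
static int cmp_data(const void *a, const void *b)
{
	const struct dev_info *x = a, *y = b;

	if (x->nonrot != y->nonrot)
		return x->nonrot ? 1 : -1;
	return cmp_space(a, b);
}

/*
 * Choose the comparator the same way the last hunk of the patch does:
 * mixed data+metadata chunks, or a mount without ssd_metadata, keep the
 * plain free-space ordering.
 */
static void sort_devices(struct dev_info *devs, size_t ndevs,
			 unsigned int type, bool ssd_metadata)
{
	bool mixed = (type & CHUNK_DATA) && (type & CHUNK_METADATA);

	assert(devs != NULL && ndevs > 0);
	if (mixed || !ssd_metadata)
		qsort(devs, ndevs, sizeof(*devs), cmp_space);
	else if (type & CHUNK_DATA)
		qsort(devs, ndevs, sizeof(*devs), cmp_data);
	else
		qsort(devs, ndevs, sizeof(*devs), cmp_metadata);
}
```

Because the device type is only the first sort key, a full SSD simply
drops out of the candidate list and the next (rotational) device is used,
which is the "soft constraint" behaviour described earlier in the thread.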
* Re: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Roman Mamedov @ 2020-03-31 17:17 UTC
To: Eli V; +Cc: Paul Jones, Andrei Borzenkov, Victor Hooi, linux-btrfs

On Tue, 31 Mar 2020 13:01:09 -0400
Eli V <eliventer@gmail.com> wrote:

> Another option is to put the 12TB drives in an mdadm RAID, and then
> use the mdadm RAID and the SSD for btrfs RAID1 metadata, with SINGLE
> data on the array. Currently, this will make roughly half of the
> metadata lookups run at SSD speed, but there is a pending patch to
> allow all the metadata reads to go to the SSD. This option is, of
> course, only useful for speeding up metadata operations. It can,
> however, make large btrfs filesystems feel much more responsive in
> interactive use.

If you're not taking advantage of Btrfs-side features for RAID, then you
might as well run LVM Cache on top of mdadm, and then Btrfs on top of the
cached LV.

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/logical_volume_manager_administration/lvm_cache_volume_creation
https://lukas.zapletalovi.com/2019/05/lvm-cache-in-six-easy-steps.html

Or Bcache, which is the same concept, but I do not suggest it over LVM cache
due to perceived lower code quality, i.e. many data loss bugs, at least in
the past. And as the 2nd article mentions, you can't un-bcache a block
device: even if the cache device is disabled, the metadata cannot be
removed. This is unlike LVM, where it is easy to switch an LV back to a
plain uncached one.

-- 
With respect,
Roman
* Re: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Eli V @ 2020-03-31 17:31 UTC
To: Roman Mamedov; +Cc: Paul Jones, Andrei Borzenkov, Victor Hooi, linux-btrfs

On Tue, Mar 31, 2020 at 1:17 PM Roman Mamedov <rm@romanrm.net> wrote:
> [...]
> If you're not taking advantage of Btrfs-side features for RAID, then you
> might as well run LVM Cache on top of mdadm, and then Btrfs on top of the
> cached LV.
> [...]
> Or Bcache, which is the same concept, but I do not suggest it over LVM cache
> due to perceived lower code quality, i.e. many data loss bugs, at least in
> the past.
> [...]

Yes, using lvm cache is an option, and will give you actual caching of
the data files as well. However, in my experience it doesn't do much
caching of metadata, so using it on large filesystems doesn't seem to
improve interactive usage much at all, e.g. ls -l, or btrfs filesystem
usage etc.

As to the question of "How do you restrict specific device for
metadata only?": with btrfs metadata as RAID1 and data as SINGLE, and
the mdadm array being much larger than the SSD, all data allocations
will naturally go to the mdadm array, and all metadata writes will go
to both the SSD and the array. Currently, the metadata reads will be
balanced across the 2 devices based on PID. Once the btrfs readmirror
patches are merged, you'll be able to have all the metadata reads
go to just the SSD.
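
The two behaviours described in this message, allocation by free space and
PID-based read balancing, can be illustrated with a small userspace C
model. This is a deliberately simplified sketch, not btrfs code: struct
dev, alloc_chunk() and pick_read_mirror() are hypothetical names, sizes are
in arbitrary units, and the real allocator considers more than total free
space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_COPIES 2	/* 1 copy models SINGLE, 2 copies model RAID1 */

/* Hypothetical, simplified device record: only free space matters here. */
struct dev {
	const char *name;
	uint64_t free_space;
};

/*
 * Pick the `copies` distinct devices with the most free space and reserve
 * chunk_size on each of them. Indices of the chosen devices are written to
 * chosen[]. Returns the number of devices used, or 0 if there are not
 * enough devices with room (in which case nothing is reserved).
 */
static size_t alloc_chunk(struct dev *devs, size_t ndevs, size_t copies,
			  uint64_t chunk_size, size_t *chosen)
{
	size_t picked = 0;

	assert(devs != NULL && chosen != NULL);
	assert(copies >= 1 && copies <= MAX_COPIES);

	while (picked < copies) {
		size_t best = ndevs;	/* ndevs means "none found yet" */

		for (size_t i = 0; i < ndevs; i++) {
			bool used = false;

			for (size_t j = 0; j < picked; j++)
				used = used || (chosen[j] == i);
			if (used || devs[i].free_space < chunk_size)
				continue;
			if (best == ndevs ||
			    devs[i].free_space > devs[best].free_space)
				best = i;
		}
		if (best == ndevs)
			return 0;	/* not enough devices with room */
		chosen[picked++] = best;
	}
	for (size_t j = 0; j < picked; j++)
		devs[chosen[j]].free_space -= chunk_size;
	return picked;
}

/*
 * PID-based read balancing as described above: the reading process's PID
 * selects the mirror, so with two copies roughly half of all readers end
 * up on each device.
 */
static size_t pick_read_mirror(int pid, size_t num_mirrors)
{
	assert(pid >= 0 && num_mirrors > 0);
	return (size_t)pid % num_mirrors;
}
```

With a large mdadm array and a small SSD, every single-copy (data) chunk in
this model lands on the array for as long as it has more free space than
the SSD, while every two-copy (metadata) chunk uses both devices. Reads of
those two-copy chunks are then split by PID parity, which matches the
"roughly half of the metadata lookups" figure given earlier in the thread.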
* Re: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Roman Mamedov @ 2020-03-31 17:42 UTC
To: Eli V; +Cc: Paul Jones, Andrei Borzenkov, Victor Hooi, linux-btrfs

On Tue, 31 Mar 2020 13:31:19 -0400
Eli V <eliventer@gmail.com> wrote:

> Yes, using lvm cache is an option, and will give you actual caching of
> the data files as well. However, in my experience it doesn't do much
> caching of metadata, so using it on large filesystems doesn't seem to
> improve interactive usage much at all, e.g. ls -l, or btrfs filesystem
> usage etc.

Forgot to mention that in my case (on a large media server) I had great
results with the described setup, especially noticeable in the mount time.
Walking large directories in a GUI file manager was more responsive too. Not
to mention mass deletion of snapshots. LVM cache seemed to know well to avoid
polluting itself with infrequently accessed sequential-pattern bulk operations
(i.e. copying or reading back the actual file data) and appeared to cache
mostly the metadata, as it should. For anyone considering this, give it a try,
and give it at least a few days of normal usage to properly warm up.

-- 
With respect,
Roman
* Re: Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?)
From: Eli V @ 2020-03-31 19:46 UTC
To: Roman Mamedov; +Cc: Paul Jones, Andrei Borzenkov, Victor Hooi, linux-btrfs

On Tue, Mar 31, 2020 at 1:42 PM Roman Mamedov <rm@romanrm.net> wrote:
> [...]
> Forgot to mention that in my case (on a large media server) I had great
> results with the described setup, especially noticeable in the mount time.
> [...]
> For anyone considering this, give it a try, and give it at least a few
> days of normal usage to properly warm up.

Yes, certainly test it out for yourself. My use case is quite
different: large (>300TB) btrfs filesystems used for rsync & snapshot
backups of proprietary NAS. The coolest thing is that, through the wonders
of btrfs and lvm, you can dynamically convert from one configuration to
the other. I don't think even a umount is needed.
end of thread

Thread overview: 11+ messages
2020-03-29 22:30 Using Intel Optane to accelerate a BTRFS array? (equivalent of ZLOG/SIL for ZFS?) Victor Hooi
2020-03-30  5:46 ` Andrei Borzenkov
2020-03-30  6:00   ` Paul Jones
2020-03-31 17:01     ` Eli V
2020-03-31 17:09       ` Andrei Borzenkov
2020-03-31 20:08         ` Goffredo Baroncelli
2020-03-31 21:44           ` Goffredo Baroncelli
2020-03-31 17:17       ` Roman Mamedov
2020-03-31 17:31         ` Eli V
2020-03-31 17:42           ` Roman Mamedov
2020-03-31 19:46             ` Eli V