linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space
@ 2018-08-21  5:40 Qu Wenruo
  2018-08-24  7:14 ` Misono Tomohiro
  0 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2018-08-21  5:40 UTC (permalink / raw)
  To: linux-btrfs

Commit c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
makes nocow check less frequent to improve performance.

However for quota enabled case, such optimization could lead to extra
unnecessary data reservation, which results failure for test case like
btrfs/153 in fstests.

Fix it by reverting to old behavior for quota enabled case.

Fixes: c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
changelog
v2:
  Fix regression for quota+cow case. (Previously it will skip data
  reservation if quota is enabled, causing regression for limit case.
  Pointed out by Misono)
---
 fs/btrfs/file.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 2be00e873e92..183e5fb96f42 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1584,6 +1584,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 	int ret = 0;
 	bool only_release_metadata = false;
 	bool force_page_uptodate = false;
+	bool quota_enabled = test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
 
 	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
 			PAGE_SIZE / (sizeof(struct page *)));
@@ -1624,13 +1625,28 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 				fs_info->sectorsize);
 
 		extent_changeset_release(data_reserved);
+
+		/*
+		 * If we have quota enabled, we must do the heavy lift nocow
+		 * check here to avoid reserving data space, or we can hit
+		 * limitation for NOCOW files.
+		 */
+		if (quota_enabled) {
+			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
+						      BTRFS_INODE_PREALLOC)) &&
+			    check_can_nocow(BTRFS_I(inode), pos,
+					    &write_bytes) > 0)
+				goto reserve_meta_only;
+		}
 		ret = btrfs_check_data_free_space(inode, &data_reserved, pos,
 						  write_bytes);
 		if (ret < 0) {
 			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
 						      BTRFS_INODE_PREALLOC)) &&
 			    check_can_nocow(BTRFS_I(inode), pos,
-					&write_bytes) > 0) {
+					&write_bytes) > 0 &&
+			    !quota_enabled) {
+reserve_meta_only:
 				/*
 				 * For nodata cow case, no need to reserve
 				 * data space.
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space
  2018-08-21  5:40 [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space Qu Wenruo
@ 2018-08-24  7:14 ` Misono Tomohiro
  2018-08-24  7:20   ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: Misono Tomohiro @ 2018-08-24  7:14 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

Hi,

On 2018/08/21 14:40, Qu Wenruo wrote:
> Commit c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
> makes nocow check less frequent to improve performance.
> 
> However for quota enabled case, such optimization could lead to extra
> unnecessary data reservation, which results failure for test case like
> btrfs/153 in fstests.
> 
> Fix it by reverting to old behavior for quota enabled case.
> 
> Fixes: c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> changelog
> v2:
>   Fix regression for quota+cow case. (Previously it will skip data
>   reservation if quota is enabled, causing regression for limit case.
>   Pointed out by Misono)
> ---
>  fs/btrfs/file.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 2be00e873e92..183e5fb96f42 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1584,6 +1584,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>  	int ret = 0;
>  	bool only_release_metadata = false;
>  	bool force_page_uptodate = false;
> +	bool quota_enabled = test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>  
>  	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
>  			PAGE_SIZE / (sizeof(struct page *)));
> @@ -1624,13 +1625,28 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>  				fs_info->sectorsize);
>  
>  		extent_changeset_release(data_reserved);
> +
> +		/*
> +		 * If we have quota enabled, we must do the heavy lift nocow
> +		 * check here to avoid reserving data space, or we can hit
> +		 * limitation for NOCOW files.
> +		 */
> +		if (quota_enabled) {
> +			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
> +						      BTRFS_INODE_PREALLOC)) &&
> +			    check_can_nocow(BTRFS_I(inode), pos,
> +					    &write_bytes) > 0)
> +				goto reserve_meta_only;
> +		}
>  		ret = btrfs_check_data_free_space(inode, &data_reserved, pos,
>  						  write_bytes);
>  		if (ret < 0) {
>  			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>  						      BTRFS_INODE_PREALLOC)) &&
>  			    check_can_nocow(BTRFS_I(inode), pos,
> -					&write_bytes) > 0) {
> +					&write_bytes) > 0 &&

> +			    !quota_enabled) {

(When we check this condition, quota_enabled must be false or otherwise
we have already goto reserve_meta_only. So it seems redundant.)

> +reserve_meta_only:
>  				/*
>  				 * For nodata cow case, no need to reserve
>  				 * data space.
> 

I applied this patch on today's misc-next and it seems mostly ok, but
btrfs/022 sometimes gives following warning:

[80244.152130] WARNING: CPU: 5 PID: 14575 at fs/btrfs/extent-tree.c:9742 btrfs_free_block_groups+0x2d7/0x440 [btrfs]
[80244.152132] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress xxhash raid6_pq xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc xt_conntrack ip_set nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security sunrpc intel_powerclamp kvm_intel kvm gpio_ich iTCO_wdt ipmi_ssif iTCO_vendor_support ipmi_si st irqbypass ipmi_devintf crct10dif_pclmul crc32_pclmul ipmi_msghandler ghash_clmulni_intel pcspkr acpi_power_meter i2c_i801 pcc_cpufreq i7core_edac lpc_ich acpi_cpufreq xfs libcrc32c mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb sr_mod hwmon uas ptp crc32c_intel cdrom usb_storage pps_core ata_generic megaraid_sas dca pata_acpi i2c_algo_bit ipv6 [last unloaded: xor]
[80244.152185] CPU: 5 PID: 14575 Comm: umount Tainted: G        W IO      4.18.0-rc8+ #98
[80244.152187] Hardware name: FUJITSU-SV                       PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.09.2619.N1           12/13/2010
[80244.152205] RIP: 0010:btrfs_free_block_groups+0x2d7/0x440 [btrfs]
[80244.152206] Code: 85 20 cb 00 00 48 39 c6 0f 84 b9 00 00 00 49 bf 00 01 00 00 00 00 ad de 48 8b 9d 20 cb 00 00 48 83 7b a0 00 0f 84 0d 01 00 00 <0f> 0b 48 8d 73 88 31 c9 31 d2 48 89 ef e8 27 7a ff ff 48 89 df e8
[80244.152235] RSP: 0018:ffff8ea10393fdb0 EFLAGS: 00010286
[80244.152237] RAX: ffff8c1025819e78 RBX: ffff8c1025819e78 RCX: 0000000000000000
[80244.152238] RDX: 0000000000000001 RSI: ffff8c115329cb20 RDI: ffff8c1025818e00
[80244.152239] RBP: ffff8c1153290000 R08: 0000000000000000 R09: 0000000000000000
[80244.152240] R10: ffff8c1025818e98 R11: 0000000000000002 R12: ffff8c11532900a0
[80244.152241] R13: 0000000000000000 R14: dead000000000200 R15: dead000000000100
[80244.152243] FS:  00007ff0cc3d8fc0(0000) GS:ffff8c1137d40000(0000) knlGS:0000000000000000
[80244.152244] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[80244.152245] CR2: 00007ffe75e53c88 CR3: 0000000235d16001 CR4: 00000000000206e0
[80244.152246] Call Trace:
[80244.152270]  close_ctree+0x146/0x320 [btrfs]
[80244.152276]  ? kthread_stop+0x42/0xf0
[80244.152280]  generic_shutdown_super+0x6c/0x110
[80244.152283]  kill_anon_super+0xe/0x20
[80244.152298]  btrfs_kill_super+0x13/0x100 [btrfs]
[80244.152301]  deactivate_locked_super+0x3f/0x70
[80244.152303]  cleanup_mnt+0x3b/0x70
[80244.152305]  task_work_run+0x84/0xa0
[80244.152308]  do_syscall_64+0x143/0x4cd
[80244.152311]  ? do_page_fault+0x31/0x130
[80244.152314]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[80244.152316] RIP: 0033:0x7ff0cb43c1a7
[80244.152317] Code: ad 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 ac 2b 00 f7 d8 64 89 01 48
[80244.152346] RSP: 002b:00007ffe75e554b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[80244.152348] RAX: 0000000000000000 RBX: 0000563ccced82d0 RCX: 00007ff0cb43c1a7
[80244.152349] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000563ccced84b0
[80244.152350] RBP: 0000563ccced84b0 R08: 0000000000000005 R09: 0000563ccced84d0
[80244.152351] R10: 00007ff0cb4ba320 R11: 0000000000000246 R12: 00007ff0cc1d0184
[80244.152352] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[80244.152354] ---[ end trace b5975ec96174c60c ]---
[80244.152358] BTRFS info (device sdh2): space_info 1 has 1022132224 free, is not full
[80244.152360] BTRFS info (device sdh2): space_info total=1082130432, used=60039168, pinned=0, reserved=0, may_use=18446744073709510656, readonly=0

Obviously, may_use value is underflowed.

Thanks,
Misono

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space
  2018-08-24  7:14 ` Misono Tomohiro
@ 2018-08-24  7:20   ` Qu Wenruo
  2018-08-24  7:54     ` Misono Tomohiro
  0 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2018-08-24  7:20 UTC (permalink / raw)
  To: Misono Tomohiro, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 7158 bytes --]



On 2018/8/24 下午3:14, Misono Tomohiro wrote:
> Hi,
> 
> On 2018/08/21 14:40, Qu Wenruo wrote:
>> Commit c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>> makes nocow check less frequent to improve performance.
>>
>> However for quota enabled case, such optimization could lead to extra
>> unnecessary data reservation, which results failure for test case like
>> btrfs/153 in fstests.
>>
>> Fix it by reverting to old behavior for quota enabled case.
>>
>> Fixes: c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> changelog
>> v2:
>>   Fix regression for quota+cow case. (Previously it will skip data
>>   reservation if quota is enabled, causing regression for limit case.
>>   Pointed out by Misono)
>> ---
>>  fs/btrfs/file.c | 18 +++++++++++++++++-
>>  1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>> index 2be00e873e92..183e5fb96f42 100644
>> --- a/fs/btrfs/file.c
>> +++ b/fs/btrfs/file.c
>> @@ -1584,6 +1584,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>>  	int ret = 0;
>>  	bool only_release_metadata = false;
>>  	bool force_page_uptodate = false;
>> +	bool quota_enabled = test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>>  
>>  	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
>>  			PAGE_SIZE / (sizeof(struct page *)));
>> @@ -1624,13 +1625,28 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>>  				fs_info->sectorsize);
>>  
>>  		extent_changeset_release(data_reserved);
>> +
>> +		/*
>> +		 * If we have quota enabled, we must do the heavy lift nocow
>> +		 * check here to avoid reserving data space, or we can hit
>> +		 * limitation for NOCOW files.
>> +		 */
>> +		if (quota_enabled) {
>> +			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>> +						      BTRFS_INODE_PREALLOC)) &&
>> +			    check_can_nocow(BTRFS_I(inode), pos,
>> +					    &write_bytes) > 0)
>> +				goto reserve_meta_only;
>> +		}
>>  		ret = btrfs_check_data_free_space(inode, &data_reserved, pos,
>>  						  write_bytes);
>>  		if (ret < 0) {
>>  			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>>  						      BTRFS_INODE_PREALLOC)) &&
>>  			    check_can_nocow(BTRFS_I(inode), pos,
>> -					&write_bytes) > 0) {
>> +					&write_bytes) > 0 &&
> 
>> +			    !quota_enabled) {
> 
> (When we check this condition, quota_enabled must be false or otherwise
> we have already goto reserve_meta_only. So it seems redundant.)

It's possible that we have quota enabled, and then
btrfs_check_data_free_space() failed with -EDQUOT.

In that case, we need above !quota_enabled check to avoid unnecessary
check and just go error branch.

> 
>> +reserve_meta_only:
>>  				/*
>>  				 * For nodata cow case, no need to reserve
>>  				 * data space.
>>
> 
> I applied this patch on today's misc-next and it seems mostly ok, but
> btrfs/022 sometimes gives following warning:

This looks like related to the regression caused by commit
c4c129db5da8f070147f175 ("btrfs: drop unused
parameter qgroup_reserved").

Would you please try reverting that patch?

Thanks,
Qu


> 
> [80244.152130] WARNING: CPU: 5 PID: 14575 at fs/btrfs/extent-tree.c:9742 btrfs_free_block_groups+0x2d7/0x440 [btrfs]
> [80244.152132] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress xxhash raid6_pq xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc xt_conntrack ip_set nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security sunrpc intel_powerclamp kvm_intel kvm gpio_ich iTCO_wdt ipmi_ssif iTCO_vendor_support ipmi_si st irqbypass ipmi_devintf crct10dif_pclmul crc32_pclmul ipmi_msghandler ghash_clmulni_intel pcspkr acpi_power_meter i2c_i801 pcc_cpufreq i7core_edac lpc_ich acpi_cpufreq xfs libcrc32c mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb sr_mod hwmon uas ptp crc32c_intel cdrom usb_storage pps_core ata_generic megaraid_sas dca pata_acpi i2c_algo_bit ipv6 [last unloaded: xor]
> [80244.152185] CPU: 5 PID: 14575 Comm: umount Tainted: G        W IO      4.18.0-rc8+ #98
> [80244.152187] Hardware name: FUJITSU-SV                       PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.09.2619.N1           12/13/2010
> [80244.152205] RIP: 0010:btrfs_free_block_groups+0x2d7/0x440 [btrfs]
> [80244.152206] Code: 85 20 cb 00 00 48 39 c6 0f 84 b9 00 00 00 49 bf 00 01 00 00 00 00 ad de 48 8b 9d 20 cb 00 00 48 83 7b a0 00 0f 84 0d 01 00 00 <0f> 0b 48 8d 73 88 31 c9 31 d2 48 89 ef e8 27 7a ff ff 48 89 df e8
> [80244.152235] RSP: 0018:ffff8ea10393fdb0 EFLAGS: 00010286
> [80244.152237] RAX: ffff8c1025819e78 RBX: ffff8c1025819e78 RCX: 0000000000000000
> [80244.152238] RDX: 0000000000000001 RSI: ffff8c115329cb20 RDI: ffff8c1025818e00
> [80244.152239] RBP: ffff8c1153290000 R08: 0000000000000000 R09: 0000000000000000
> [80244.152240] R10: ffff8c1025818e98 R11: 0000000000000002 R12: ffff8c11532900a0
> [80244.152241] R13: 0000000000000000 R14: dead000000000200 R15: dead000000000100
> [80244.152243] FS:  00007ff0cc3d8fc0(0000) GS:ffff8c1137d40000(0000) knlGS:0000000000000000
> [80244.152244] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [80244.152245] CR2: 00007ffe75e53c88 CR3: 0000000235d16001 CR4: 00000000000206e0
> [80244.152246] Call Trace:
> [80244.152270]  close_ctree+0x146/0x320 [btrfs]
> [80244.152276]  ? kthread_stop+0x42/0xf0
> [80244.152280]  generic_shutdown_super+0x6c/0x110
> [80244.152283]  kill_anon_super+0xe/0x20
> [80244.152298]  btrfs_kill_super+0x13/0x100 [btrfs]
> [80244.152301]  deactivate_locked_super+0x3f/0x70
> [80244.152303]  cleanup_mnt+0x3b/0x70
> [80244.152305]  task_work_run+0x84/0xa0
> [80244.152308]  do_syscall_64+0x143/0x4cd
> [80244.152311]  ? do_page_fault+0x31/0x130
> [80244.152314]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [80244.152316] RIP: 0033:0x7ff0cb43c1a7
> [80244.152317] Code: ad 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 ac 2b 00 f7 d8 64 89 01 48
> [80244.152346] RSP: 002b:00007ffe75e554b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> [80244.152348] RAX: 0000000000000000 RBX: 0000563ccced82d0 RCX: 00007ff0cb43c1a7
> [80244.152349] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000563ccced84b0
> [80244.152350] RBP: 0000563ccced84b0 R08: 0000000000000005 R09: 0000563ccced84d0
> [80244.152351] R10: 00007ff0cb4ba320 R11: 0000000000000246 R12: 00007ff0cc1d0184
> [80244.152352] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [80244.152354] ---[ end trace b5975ec96174c60c ]---
> [80244.152358] BTRFS info (device sdh2): space_info 1 has 1022132224 free, is not full
> [80244.152360] BTRFS info (device sdh2): space_info total=1082130432, used=60039168, pinned=0, reserved=0, may_use=18446744073709510656, readonly=0
> 
> Obviously, may_use value is underflowed.
> 
> Thanks,
> Misono
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space
  2018-08-24  7:20   ` Qu Wenruo
@ 2018-08-24  7:54     ` Misono Tomohiro
  2018-08-24  7:58       ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: Misono Tomohiro @ 2018-08-24  7:54 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On 2018/08/24 16:20, Qu Wenruo wrote:
> 
> 
> On 2018/8/24 下午3:14, Misono Tomohiro wrote:
>> Hi,
>>
>> On 2018/08/21 14:40, Qu Wenruo wrote:
>>> Commit c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>>> makes nocow check less frequent to improve performance.
>>>
>>> However for quota enabled case, such optimization could lead to extra
>>> unnecessary data reservation, which results failure for test case like
>>> btrfs/153 in fstests.
>>>
>>> Fix it by reverting to old behavior for quota enabled case.
>>>
>>> Fixes: c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>> ---
>>> changelog
>>> v2:
>>>   Fix regression for quota+cow case. (Previously it will skip data
>>>   reservation if quota is enabled, causing regression for limit case.
>>>   Pointed out by Misono)
>>> ---
>>>  fs/btrfs/file.c | 18 +++++++++++++++++-
>>>  1 file changed, 17 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>>> index 2be00e873e92..183e5fb96f42 100644
>>> --- a/fs/btrfs/file.c
>>> +++ b/fs/btrfs/file.c
>>> @@ -1584,6 +1584,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>>>  	int ret = 0;
>>>  	bool only_release_metadata = false;
>>>  	bool force_page_uptodate = false;
>>> +	bool quota_enabled = test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>>>  
>>>  	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
>>>  			PAGE_SIZE / (sizeof(struct page *)));
>>> @@ -1624,13 +1625,28 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>>>  				fs_info->sectorsize);
>>>  
>>>  		extent_changeset_release(data_reserved);
>>> +
>>> +		/*
>>> +		 * If we have quota enabled, we must do the heavy lift nocow
>>> +		 * check here to avoid reserving data space, or we can hit
>>> +		 * limitation for NOCOW files.
>>> +		 */
>>> +		if (quota_enabled) {
>>> +			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>>> +						      BTRFS_INODE_PREALLOC)) &&
>>> +			    check_can_nocow(BTRFS_I(inode), pos,
>>> +					    &write_bytes) > 0)
>>> +				goto reserve_meta_only;
>>> +		}
>>>  		ret = btrfs_check_data_free_space(inode, &data_reserved, pos,
>>>  						  write_bytes);
>>>  		if (ret < 0) {
>>>  			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>>>  						      BTRFS_INODE_PREALLOC)) &&
>>>  			    check_can_nocow(BTRFS_I(inode), pos,
>>> -					&write_bytes) > 0) {
>>> +					&write_bytes) > 0 &&
>>
>>> +			    !quota_enabled) {
>>
>> (When we check this condition, quota_enabled must be false or otherwise
>> we have already goto reserve_meta_only. So it seems redundant.)
> 
> It's possible that we have quota enabled, and then
> btrfs_check_data_free_space() failed with -EDQUOT.
> 
> In that case, we need above !quota_enabled check to avoid unnecessary
> check and just go error branch.

So should quota_enabled be checked before check_can_nocow()?

> 
>>
>>> +reserve_meta_only:
>>>  				/*
>>>  				 * For nodata cow case, no need to reserve
>>>  				 * data space.
>>>
>>
>> I applied this patch on today's misc-next and it seems mostly ok, but
>> btrfs/022 sometimes gives following warning:
> 
> This looks like related to the regression caused by commit
> c4c129db5da8f070147f175 ("btrfs: drop unused
> parameter qgroup_reserved").
> 
> Would you please try reverting that patch?

I think above commit is fixed by commit eb27db470 ("btrfs: fix
qgroup_free wrong num_bytes in btrfs_subvolume_reserve_metadata") which
is already in misc-next too.

I reverted above two patch (and one more related patch 6b0cb14901
("btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot")),
but get the same result.

Thanks,
Misono

> 
> Thanks,
> Qu
> 
> 
>>
>> [80244.152130] WARNING: CPU: 5 PID: 14575 at fs/btrfs/extent-tree.c:9742 btrfs_free_block_groups+0x2d7/0x440 [btrfs]
>> [80244.152132] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress xxhash raid6_pq xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc xt_conntrack ip_set nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security sunrpc intel_powerclamp kvm_intel kvm gpio_ich iTCO_wdt ipmi_ssif iTCO_vendor_support ipmi_si st irqbypass ipmi_devintf crct10dif_pclmul crc32_pclmul ipmi_msghandler ghash_clmulni_intel pcspkr acpi_power_meter i2c_i801 pcc_cpufreq i7core_edac lpc_ich acpi_cpufreq xfs libcrc32c mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb sr_mod hwmon uas ptp crc32c_intel cdrom usb_storage pps_core ata_generic megaraid_sas dca pata_acpi i2c_algo_bit ipv6 [last unloaded: xor]
>> [80244.152185] CPU: 5 PID: 14575 Comm: umount Tainted: G        W IO      4.18.0-rc8+ #98
>> [80244.152187] Hardware name: FUJITSU-SV                       PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.09.2619.N1           12/13/2010
>> [80244.152205] RIP: 0010:btrfs_free_block_groups+0x2d7/0x440 [btrfs]
>> [80244.152206] Code: 85 20 cb 00 00 48 39 c6 0f 84 b9 00 00 00 49 bf 00 01 00 00 00 00 ad de 48 8b 9d 20 cb 00 00 48 83 7b a0 00 0f 84 0d 01 00 00 <0f> 0b 48 8d 73 88 31 c9 31 d2 48 89 ef e8 27 7a ff ff 48 89 df e8
>> [80244.152235] RSP: 0018:ffff8ea10393fdb0 EFLAGS: 00010286
>> [80244.152237] RAX: ffff8c1025819e78 RBX: ffff8c1025819e78 RCX: 0000000000000000
>> [80244.152238] RDX: 0000000000000001 RSI: ffff8c115329cb20 RDI: ffff8c1025818e00
>> [80244.152239] RBP: ffff8c1153290000 R08: 0000000000000000 R09: 0000000000000000
>> [80244.152240] R10: ffff8c1025818e98 R11: 0000000000000002 R12: ffff8c11532900a0
>> [80244.152241] R13: 0000000000000000 R14: dead000000000200 R15: dead000000000100
>> [80244.152243] FS:  00007ff0cc3d8fc0(0000) GS:ffff8c1137d40000(0000) knlGS:0000000000000000
>> [80244.152244] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [80244.152245] CR2: 00007ffe75e53c88 CR3: 0000000235d16001 CR4: 00000000000206e0
>> [80244.152246] Call Trace:
>> [80244.152270]  close_ctree+0x146/0x320 [btrfs]
>> [80244.152276]  ? kthread_stop+0x42/0xf0
>> [80244.152280]  generic_shutdown_super+0x6c/0x110
>> [80244.152283]  kill_anon_super+0xe/0x20
>> [80244.152298]  btrfs_kill_super+0x13/0x100 [btrfs]
>> [80244.152301]  deactivate_locked_super+0x3f/0x70
>> [80244.152303]  cleanup_mnt+0x3b/0x70
>> [80244.152305]  task_work_run+0x84/0xa0
>> [80244.152308]  do_syscall_64+0x143/0x4cd
>> [80244.152311]  ? do_page_fault+0x31/0x130
>> [80244.152314]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> [80244.152316] RIP: 0033:0x7ff0cb43c1a7
>> [80244.152317] Code: ad 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 ac 2b 00 f7 d8 64 89 01 48
>> [80244.152346] RSP: 002b:00007ffe75e554b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>> [80244.152348] RAX: 0000000000000000 RBX: 0000563ccced82d0 RCX: 00007ff0cb43c1a7
>> [80244.152349] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000563ccced84b0
>> [80244.152350] RBP: 0000563ccced84b0 R08: 0000000000000005 R09: 0000563ccced84d0
>> [80244.152351] R10: 00007ff0cb4ba320 R11: 0000000000000246 R12: 00007ff0cc1d0184
>> [80244.152352] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>> [80244.152354] ---[ end trace b5975ec96174c60c ]---
>> [80244.152358] BTRFS info (device sdh2): space_info 1 has 1022132224 free, is not full
>> [80244.152360] BTRFS info (device sdh2): space_info total=1082130432, used=60039168, pinned=0, reserved=0, may_use=18446744073709510656, readonly=0
>>
>> Obviously, may_use value is underflowed.
>>
>> Thanks,
>> Misono
>>
>>
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space
  2018-08-24  7:54     ` Misono Tomohiro
@ 2018-08-24  7:58       ` Qu Wenruo
  2018-08-24  8:09         ` Misono Tomohiro
  0 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2018-08-24  7:58 UTC (permalink / raw)
  To: Misono Tomohiro, linux-btrfs


[-- Attachment #1.1: Type: text/plain, Size: 8308 bytes --]



On 2018/8/24 下午3:54, Misono Tomohiro wrote:
> On 2018/08/24 16:20, Qu Wenruo wrote:
>>
>>
>> On 2018/8/24 下午3:14, Misono Tomohiro wrote:
>>> Hi,
>>>
>>> On 2018/08/21 14:40, Qu Wenruo wrote:
>>>> Commit c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>>>> makes nocow check less frequent to improve performance.
>>>>
>>>> However for quota enabled case, such optimization could lead to extra
>>>> unnecessary data reservation, which results failure for test case like
>>>> btrfs/153 in fstests.
>>>>
>>>> Fix it by reverting to old behavior for quota enabled case.
>>>>
>>>> Fixes: c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>> ---
>>>> changelog
>>>> v2:
>>>>   Fix regression for quota+cow case. (Previously it will skip data
>>>>   reservation if quota is enabled, causing regression for limit case.
>>>>   Pointed out by Misono)
>>>> ---
>>>>  fs/btrfs/file.c | 18 +++++++++++++++++-
>>>>  1 file changed, 17 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>>>> index 2be00e873e92..183e5fb96f42 100644
>>>> --- a/fs/btrfs/file.c
>>>> +++ b/fs/btrfs/file.c
>>>> @@ -1584,6 +1584,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>>>>  	int ret = 0;
>>>>  	bool only_release_metadata = false;
>>>>  	bool force_page_uptodate = false;
>>>> +	bool quota_enabled = test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>>>>  
>>>>  	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
>>>>  			PAGE_SIZE / (sizeof(struct page *)));
>>>> @@ -1624,13 +1625,28 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>>>>  				fs_info->sectorsize);
>>>>  
>>>>  		extent_changeset_release(data_reserved);
>>>> +
>>>> +		/*
>>>> +		 * If we have quota enabled, we must do the heavy lift nocow
>>>> +		 * check here to avoid reserving data space, or we can hit
>>>> +		 * limitation for NOCOW files.
>>>> +		 */
>>>> +		if (quota_enabled) {
>>>> +			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>>>> +						      BTRFS_INODE_PREALLOC)) &&
>>>> +			    check_can_nocow(BTRFS_I(inode), pos,
>>>> +					    &write_bytes) > 0)
>>>> +				goto reserve_meta_only;
>>>> +		}
>>>>  		ret = btrfs_check_data_free_space(inode, &data_reserved, pos,
>>>>  						  write_bytes);
>>>>  		if (ret < 0) {
>>>>  			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>>>>  						      BTRFS_INODE_PREALLOC)) &&
>>>>  			    check_can_nocow(BTRFS_I(inode), pos,
>>>> -					&write_bytes) > 0) {
>>>> +					&write_bytes) > 0 &&
>>>
>>>> +			    !quota_enabled) {
>>>
>>> (When we check this condition, quota_enabled must be false or otherwise
>>> we have already goto reserve_meta_only. So it seems redundant.)
>>
>> It's possible that we have quota enabled, and then
>> btrfs_check_data_free_space() failed with -EDQUOT.
>>
>> In that case, we need above !quota_enabled check to avoid unnecessary
>> check and just go error branch.
> 
> So should quota_enabled be checked before check_can_nocow()?

Oh, yes, it should be put before nocow check.

> 
>>
>>>
>>>> +reserve_meta_only:
>>>>  				/*
>>>>  				 * For nodata cow case, no need to reserve
>>>>  				 * data space.
>>>>
>>>
>>> I applied this patch on today's misc-next and it seems mostly ok, but
>>> btrfs/022 sometimes gives following warning:
>>
>> This looks like related to the regression caused by commit
>> c4c129db5da8f070147f175 ("btrfs: drop unused
>> parameter qgroup_reserved").
>>
>> Would you please try reverting that patch?
> 
> I think above commit is fixed by commit eb27db470 ("btrfs: fix
> qgroup_free wrong num_bytes in btrfs_subvolume_reserve_metadata") which
> is already in misc-next too.
> 
> I reverted above two patch (and one more related patch 6b0cb14901
> ("btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot")),
> but get the same result.

So something really goes wrong.
I assume it's some error handler which double freed, but a quick glance
nor my initial test run doesn't show something obvious.

BTW, what's the possibility of such problem in your test environment?

Thanks,
Qu

> 
> Thanks,
> Misono
> 
>>
>> Thanks,
>> Qu
>>
>>
>>>
>>> [80244.152130] WARNING: CPU: 5 PID: 14575 at fs/btrfs/extent-tree.c:9742 btrfs_free_block_groups+0x2d7/0x440 [btrfs]
>>> [80244.152132] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress xxhash raid6_pq xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc xt_conntrack ip_set nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security sunrpc intel_powerclamp kvm_intel kvm gpio_ich iTCO_wdt ipmi_ssif iTCO_vendor_support ipmi_si st irqbypass ipmi_devintf crct10dif_pclmul crc32_pclmul ipmi_msghandler ghash_clmulni_intel pcspkr acpi_power_meter i2c_i801 pcc_cpufreq i7core_edac lpc_ich acpi_cpufreq xfs libcrc32c mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb sr_mod hwmon uas ptp crc32c_intel cdrom usb_storage pps_core ata_generic megaraid_sas dca pata_acpi i2c_algo_bit ipv6 [last unloaded: xor]
>>> [80244.152185] CPU: 5 PID: 14575 Comm: umount Tainted: G        W IO      4.18.0-rc8+ #98
>>> [80244.152187] Hardware name: FUJITSU-SV                       PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.09.2619.N1           12/13/2010
>>> [80244.152205] RIP: 0010:btrfs_free_block_groups+0x2d7/0x440 [btrfs]
>>> [80244.152206] Code: 85 20 cb 00 00 48 39 c6 0f 84 b9 00 00 00 49 bf 00 01 00 00 00 00 ad de 48 8b 9d 20 cb 00 00 48 83 7b a0 00 0f 84 0d 01 00 00 <0f> 0b 48 8d 73 88 31 c9 31 d2 48 89 ef e8 27 7a ff ff 48 89 df e8
>>> [80244.152235] RSP: 0018:ffff8ea10393fdb0 EFLAGS: 00010286
>>> [80244.152237] RAX: ffff8c1025819e78 RBX: ffff8c1025819e78 RCX: 0000000000000000
>>> [80244.152238] RDX: 0000000000000001 RSI: ffff8c115329cb20 RDI: ffff8c1025818e00
>>> [80244.152239] RBP: ffff8c1153290000 R08: 0000000000000000 R09: 0000000000000000
>>> [80244.152240] R10: ffff8c1025818e98 R11: 0000000000000002 R12: ffff8c11532900a0
>>> [80244.152241] R13: 0000000000000000 R14: dead000000000200 R15: dead000000000100
>>> [80244.152243] FS:  00007ff0cc3d8fc0(0000) GS:ffff8c1137d40000(0000) knlGS:0000000000000000
>>> [80244.152244] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [80244.152245] CR2: 00007ffe75e53c88 CR3: 0000000235d16001 CR4: 00000000000206e0
>>> [80244.152246] Call Trace:
>>> [80244.152270]  close_ctree+0x146/0x320 [btrfs]
>>> [80244.152276]  ? kthread_stop+0x42/0xf0
>>> [80244.152280]  generic_shutdown_super+0x6c/0x110
>>> [80244.152283]  kill_anon_super+0xe/0x20
>>> [80244.152298]  btrfs_kill_super+0x13/0x100 [btrfs]
>>> [80244.152301]  deactivate_locked_super+0x3f/0x70
>>> [80244.152303]  cleanup_mnt+0x3b/0x70
>>> [80244.152305]  task_work_run+0x84/0xa0
>>> [80244.152308]  do_syscall_64+0x143/0x4cd
>>> [80244.152311]  ? do_page_fault+0x31/0x130
>>> [80244.152314]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>> [80244.152316] RIP: 0033:0x7ff0cb43c1a7
>>> [80244.152317] Code: ad 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 ac 2b 00 f7 d8 64 89 01 48
>>> [80244.152346] RSP: 002b:00007ffe75e554b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>>> [80244.152348] RAX: 0000000000000000 RBX: 0000563ccced82d0 RCX: 00007ff0cb43c1a7
>>> [80244.152349] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000563ccced84b0
>>> [80244.152350] RBP: 0000563ccced84b0 R08: 0000000000000005 R09: 0000563ccced84d0
>>> [80244.152351] R10: 00007ff0cb4ba320 R11: 0000000000000246 R12: 00007ff0cc1d0184
>>> [80244.152352] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>>> [80244.152354] ---[ end trace b5975ec96174c60c ]---
>>> [80244.152358] BTRFS info (device sdh2): space_info 1 has 1022132224 free, is not full
>>> [80244.152360] BTRFS info (device sdh2): space_info total=1082130432, used=60039168, pinned=0, reserved=0, may_use=18446744073709510656, readonly=0
>>>
>>> Obviously, may_use value is underflowed.
>>>
>>> Thanks,
>>> Misono
>>>
>>>
>>
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space
  2018-08-24  7:58       ` Qu Wenruo
@ 2018-08-24  8:09         ` Misono Tomohiro
  2018-08-28  5:21           ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: Misono Tomohiro @ 2018-08-24  8:09 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On 2018/08/24 16:58, Qu Wenruo wrote:
> 
> 
> On 2018/8/24 下午3:54, Misono Tomohiro wrote:
>> On 2018/08/24 16:20, Qu Wenruo wrote:
>>>
>>>
>>> On 2018/8/24 下午3:14, Misono Tomohiro wrote:
>>>> Hi,
>>>>
>>>> On 2018/08/21 14:40, Qu Wenruo wrote:
>>>>> Commit c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>>>>> makes nocow check less frequent to improve performance.
>>>>>
>>>>> However for quota enabled case, such optimization could lead to extra
>>>>> unnecessary data reservation, which results failure for test case like
>>>>> btrfs/153 in fstests.
>>>>>
>>>>> Fix it by reverting to old behavior for quota enabled case.
>>>>>
>>>>> Fixes: c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>>> ---
>>>>> changelog
>>>>> v2:
>>>>>   Fix regression for quota+cow case. (Previously it will skip data
>>>>>   reservation if quota is enabled, causing regression for limit case.
>>>>>   Pointed out by Misono)
>>>>> ---
>>>>>  fs/btrfs/file.c | 18 +++++++++++++++++-
>>>>>  1 file changed, 17 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>>>>> index 2be00e873e92..183e5fb96f42 100644
>>>>> --- a/fs/btrfs/file.c
>>>>> +++ b/fs/btrfs/file.c
>>>>> @@ -1584,6 +1584,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>>>>>  	int ret = 0;
>>>>>  	bool only_release_metadata = false;
>>>>>  	bool force_page_uptodate = false;
>>>>> +	bool quota_enabled = test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>>>>>  
>>>>>  	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
>>>>>  			PAGE_SIZE / (sizeof(struct page *)));
>>>>> @@ -1624,13 +1625,28 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>>>>>  				fs_info->sectorsize);
>>>>>  
>>>>>  		extent_changeset_release(data_reserved);
>>>>> +
>>>>> +		/*
>>>>> +		 * If we have quota enabled, we must do the heavy lift nocow
>>>>> +		 * check here to avoid reserving data space, or we can hit
>>>>> +		 * limitation for NOCOW files.
>>>>> +		 */
>>>>> +		if (quota_enabled) {
>>>>> +			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>>>>> +						      BTRFS_INODE_PREALLOC)) &&
>>>>> +			    check_can_nocow(BTRFS_I(inode), pos,
>>>>> +					    &write_bytes) > 0)
>>>>> +				goto reserve_meta_only;
>>>>> +		}
>>>>>  		ret = btrfs_check_data_free_space(inode, &data_reserved, pos,
>>>>>  						  write_bytes);
>>>>>  		if (ret < 0) {
>>>>>  			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>>>>>  						      BTRFS_INODE_PREALLOC)) &&
>>>>>  			    check_can_nocow(BTRFS_I(inode), pos,
>>>>> -					&write_bytes) > 0) {
>>>>> +					&write_bytes) > 0 &&
>>>>
>>>>> +			    !quota_enabled) {
>>>>
>>>> (When we check this condition, quota_enabled must be false or otherwise
>>>> we have already goto reserve_meta_only. So it seems redundant.)
>>>
>>> It's possible that we have quota enabled, and then
>>> btrfs_check_data_free_space() failed with -EDQUOT.
>>>
>>> In that case, we need above !quota_enabled check to avoid unnecessary
>>> check and just go error branch.
>>
>> So should quota_enabled be checked before check_can_nocow()?
> 
> Oh, yes, it should be put before nocow check.
> 
>>
>>>
>>>>
>>>>> +reserve_meta_only:
>>>>>  				/*
>>>>>  				 * For nodata cow case, no need to reserve
>>>>>  				 * data space.
>>>>>
>>>>
>>>> I applied this patch on today's misc-next and it seems mostly ok, but
>>>> btrfs/022 sometimes gives following warning:
>>>
>>> This looks like related to the regression caused by commit
>>> c4c129db5da8f070147f175 ("btrfs: drop unused
>>> parameter qgroup_reserved").
>>>
>>> Would you please try reverting that patch?
>>
>> I think above commit is fixed by commit eb27db470 ("btrfs: fix
>> qgroup_free wrong num_bytes in btrfs_subvolume_reserve_metadata") which
>> is already in misc-next too.
>>
>> I reverted above two patch (and one more related patch 6b0cb14901
>> ("btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot")),
>> but get the same result.
> 
> So something really goes wrong.
> I assume it's some error handler which double freed, but a quick glance
> nor my initial test run doesn't show something obvious.
> 
> BTW, what's the possibility of such problem in your test environment?

It's like one in several times.
It may depend on hardware performance? (the machine is not so fast),

I also noticed following warning happens too (not always):

[84089.286669] WARNING: CPU: 4 PID: 19255 at fs/btrfs/extent-tree.c:4277 btrfs_free_reserved_data_space_noquota+0xd2/0xf0 [btrfs]
[84089.286670] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress xxhash raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison loop dm_flakey xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc xt_conntrack ip_set nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security sunrpc intel_powerclamp kvm_intel kvm gpio_ich iTCO_wdt ipmi_ssif iTCO_vendor_support ipmi_si st irqbypass ipmi_devintf crct10dif_pclmul crc32_pclmul ipmi_msghandler ghash_clmulni_intel pcspkr acpi_power_meter i2c_i801 pcc_cpufreq i7core_edac lpc_ich acpi_cpufreq xfs libcrc32c mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb sr_mod hwmon uas ptp crc32c_intel cdrom usb_storage pps_core ata_generic megaraid_sas dca pata_acpi
[84089.286708]  i2c_algo_bit ipv6 [last unloaded: xor]
[84089.286712] CPU: 4 PID: 19255 Comm: kworker/u25:1 Tainted: G        W IO      4.18.0-rc8+ #98
[84089.286713] Hardware name: FUJITSU-SV                       PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.09.2619.N1           12/13/2010
[84089.286717] Workqueue: writeback wb_workfn (flush-btrfs-474)
[84089.286731] RIP: 0010:btrfs_free_reserved_data_space_noquota+0xd2/0xf0 [btrfs]
[84089.286731] Code: 49 89 d8 4c 89 f1 4c 89 fa 4c 89 e6 e8 27 51 d1 d2 48 8b 45 00 48 85 c0 75 db 41 c6 45 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b 49 c7 45 28 00 00 00 00 e9 79 ff ff ff 0f 1f 44 00 00 66 2e
[84089.286752] RSP: 0018:ffff8ea1043176e0 EFLAGS: 00010287
[84089.286754] RAX: 0000000000004000 RBX: 0000000000005000 RCX: ffff8c1151e8cae8
[84089.286755] RDX: 0000000000000001 RSI: 00000000000c3000 RDI: ffff8c106602fa00
[84089.286755] RBP: ffffffffffffb000 R08: 0000000000000000 R09: 0000000000000143
[84089.286756] R10: ffff8c11dd95d000 R11: 00000000ffffff01 R12: ffff8c1151e80000
[84089.286757] R13: ffff8c106602fa00 R14: ffff8ea1043177e4 R15: ffff8c1151e80000
[84089.286758] FS:  0000000000000000(0000) GS:ffff8c1137d00000(0000) knlGS:0000000000000000
[84089.286759] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[84089.286760] CR2: 00007fcbb3a26000 CR3: 000000019400a002 CR4: 00000000000206e0
[84089.286761] Call Trace:
[84089.286779]  btrfs_clear_bit_hook+0x1a2/0x3f0 [btrfs]
[84089.286796]  clear_state_bit+0x51/0x1b0 [btrfs]
[84089.286812]  __clear_extent_bit+0x364/0x3c0 [btrfs]
[84089.286841]  extent_clear_unlock_delalloc+0x43/0x70 [btrfs]
[84089.286868]  run_delalloc_nocow+0x930/0xa60 [btrfs]
[84089.286897]  run_delalloc_range+0x19c/0x370 [btrfs]
[84089.286926]  writepage_delalloc+0xcf/0x180 [btrfs]
[84089.286955]  __extent_writepage+0x17a/0x310 [btrfs]
[84089.286984]  extent_write_cache_pages+0x131/0x390 [btrfs]
[84089.287012]  ? btrfs_i_callback+0x20/0x20 [btrfs]
[84089.287040]  extent_writepages+0x50/0x80 [btrfs]
[84089.287045]  do_writepages+0x4b/0xe0
[84089.287073]  ? btrfs_get_token_32+0x6b/0x130 [btrfs]
[84089.287077]  ? __writeback_single_inode+0x3d/0x320
[84089.287079]  __writeback_single_inode+0x3d/0x320
[84089.287082]  writeback_sb_inodes+0x19f/0x460
[84089.287086]  wb_writeback+0x107/0x300
[84089.287090]  ? wb_workfn+0xdf/0x410
[84089.287093]  ? current_is_workqueue_rescuer+0x27/0x40
[84089.287095]  wb_workfn+0xdf/0x410
[84089.287099]  ? put_prev_entity+0x20/0x100
[84089.287103]  process_one_work+0x18f/0x370
[84089.287106]  worker_thread+0x30/0x380
[84089.287109]  ? process_one_work+0x370/0x370
[84089.287112]  kthread+0x113/0x130
[84089.287115]  ? kthread_create_worker_on_cpu+0x70/0x70
[84089.287119]  ret_from_fork+0x35/0x40
[84089.287122] ---[ end trace b5975ec96174c615 ]---


> 
> Thanks,
> Qu
> 
>>
>> Thanks,
>> Misono
>>
>>>
>>> Thanks,
>>> Qu
>>>
>>>
>>>>
>>>> [80244.152130] WARNING: CPU: 5 PID: 14575 at fs/btrfs/extent-tree.c:9742 btrfs_free_block_groups+0x2d7/0x440 [btrfs]
>>>> [80244.152132] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress xxhash raid6_pq xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc xt_conntrack ip_set nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security sunrpc intel_powerclamp kvm_intel kvm gpio_ich iTCO_wdt ipmi_ssif iTCO_vendor_support ipmi_si st irqbypass ipmi_devintf crct10dif_pclmul crc32_pclmul ipmi_msghandler ghash_clmulni_intel pcspkr acpi_power_meter i2c_i801 pcc_cpufreq i7core_edac lpc_ich acpi_cpufreq xfs libcrc32c mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb sr_mod hwmon uas ptp crc32c_intel cdrom usb_storage pps_core ata_generic megaraid_sas dca pata_acpi i2c_algo_bit ipv6 [last unloaded: xor]
>>>> [80244.152185] CPU: 5 PID: 14575 Comm: umount Tainted: G        W IO      4.18.0-rc8+ #98
>>>> [80244.152187] Hardware name: FUJITSU-SV                       PRIMERGY RX300 S6             /D2619, BIOS 6.00 Rev. 1.09.2619.N1           12/13/2010
>>>> [80244.152205] RIP: 0010:btrfs_free_block_groups+0x2d7/0x440 [btrfs]
>>>> [80244.152206] Code: 85 20 cb 00 00 48 39 c6 0f 84 b9 00 00 00 49 bf 00 01 00 00 00 00 ad de 48 8b 9d 20 cb 00 00 48 83 7b a0 00 0f 84 0d 01 00 00 <0f> 0b 48 8d 73 88 31 c9 31 d2 48 89 ef e8 27 7a ff ff 48 89 df e8
>>>> [80244.152235] RSP: 0018:ffff8ea10393fdb0 EFLAGS: 00010286
>>>> [80244.152237] RAX: ffff8c1025819e78 RBX: ffff8c1025819e78 RCX: 0000000000000000
>>>> [80244.152238] RDX: 0000000000000001 RSI: ffff8c115329cb20 RDI: ffff8c1025818e00
>>>> [80244.152239] RBP: ffff8c1153290000 R08: 0000000000000000 R09: 0000000000000000
>>>> [80244.152240] R10: ffff8c1025818e98 R11: 0000000000000002 R12: ffff8c11532900a0
>>>> [80244.152241] R13: 0000000000000000 R14: dead000000000200 R15: dead000000000100
>>>> [80244.152243] FS:  00007ff0cc3d8fc0(0000) GS:ffff8c1137d40000(0000) knlGS:0000000000000000
>>>> [80244.152244] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [80244.152245] CR2: 00007ffe75e53c88 CR3: 0000000235d16001 CR4: 00000000000206e0
>>>> [80244.152246] Call Trace:
>>>> [80244.152270]  close_ctree+0x146/0x320 [btrfs]
>>>> [80244.152276]  ? kthread_stop+0x42/0xf0
>>>> [80244.152280]  generic_shutdown_super+0x6c/0x110
>>>> [80244.152283]  kill_anon_super+0xe/0x20
>>>> [80244.152298]  btrfs_kill_super+0x13/0x100 [btrfs]
>>>> [80244.152301]  deactivate_locked_super+0x3f/0x70
>>>> [80244.152303]  cleanup_mnt+0x3b/0x70
>>>> [80244.152305]  task_work_run+0x84/0xa0
>>>> [80244.152308]  do_syscall_64+0x143/0x4cd
>>>> [80244.152311]  ? do_page_fault+0x31/0x130
>>>> [80244.152314]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>> [80244.152316] RIP: 0033:0x7ff0cb43c1a7
>>>> [80244.152317] Code: ad 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 ac 2b 00 f7 d8 64 89 01 48
>>>> [80244.152346] RSP: 002b:00007ffe75e554b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>>>> [80244.152348] RAX: 0000000000000000 RBX: 0000563ccced82d0 RCX: 00007ff0cb43c1a7
>>>> [80244.152349] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000563ccced84b0
>>>> [80244.152350] RBP: 0000563ccced84b0 R08: 0000000000000005 R09: 0000563ccced84d0
>>>> [80244.152351] R10: 00007ff0cb4ba320 R11: 0000000000000246 R12: 00007ff0cc1d0184
>>>> [80244.152352] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>>>> [80244.152354] ---[ end trace b5975ec96174c60c ]---
>>>> [80244.152358] BTRFS info (device sdh2): space_info 1 has 1022132224 free, is not full
>>>> [80244.152360] BTRFS info (device sdh2): space_info total=1082130432, used=60039168, pinned=0, reserved=0, may_use=18446744073709510656, readonly=0
>>>>
>>>> Obviously, may_use value is underflowed.
>>>>
>>>> Thanks,
>>>> Misono
>>>>
>>>>
>>>
>>
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space
  2018-08-24  8:09         ` Misono Tomohiro
@ 2018-08-28  5:21           ` Qu Wenruo
  2018-08-30  1:57             ` Misono Tomohiro
  0 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2018-08-28  5:21 UTC (permalink / raw)
  To: Misono Tomohiro, Qu Wenruo, linux-btrfs@vger.kernel.org


[-- Attachment #1.1: Type: text/plain, Size: 1645 bytes --]

On 2018/8/24 下午4:09, Misono Tomohiro wrote:
[snip]
>>
>> BTW, what's the possibility of such problem in your test environment?
> 
> It's like one in several times.
> It may depend on hardware performance? (the machine is not so fast),
> 
> I also noticed following warning happens too (not always):
> 

After digging into the case, it's more complex than just my patch.

Firstly, we lacks a lot of underflow check when modifying bytes_may_use.
So we need to do all the underflow detection for every modifier of
bytes_may_use.

Secondly, btrfs_cross_ref_exist() check makes NODATACOW check in
__btrfs_buffered_write() unreliable.

For the following case, at __btrfs_buffered_write() time we're pretty
sure we could do NODATACOW, but at sync time, due to cloned range,
btrfs_cross_ref_exist() would detect reflinked prealloc extent, then
falls back to CoW, and finally cause bytes_may_use underflow:

---
mkfs.btrfs -f $dev > $full_log

mount $dev $mnt -o nospace_cache
btrfs quota enable $mnt
btrfs quota rescan -w $mnt

xfs_io -f -c "falloc 0 2M" $mnt/file1 > /dev/null
xfs_io -c "pwrite -b 1M 0 1M" $mnt/file1 > /dev/null
xfs_io -c "reflink $mnt/file1 1M 4M 1M" $mnt/file1 > /dev/null
sync
----

Even without my patch, the "pwrite" command is still CoWed, which could
be avoided.
And that's the reason my patch is causing the underflow.

To fix this, we need more accurate btrfs_cross_ref_exist() check, not
only for @disk_bytenr but also check @len.

Or we could try to flush the whole inode in clone_range() so we could go
through NOCOW routine before clone really happens.

Thanks,
Qu


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space
  2018-08-28  5:21           ` Qu Wenruo
@ 2018-08-30  1:57             ` Misono Tomohiro
  2018-08-30  2:14               ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: Misono Tomohiro @ 2018-08-30  1:57 UTC (permalink / raw)
  To: Qu Wenruo, Qu Wenruo, linux-btrfs@vger.kernel.org

On 2018/08/28 14:21, Qu Wenruo wrote:
> On 2018/8/24 下午4:09, Misono Tomohiro wrote:
> [snip]
>>>
>>> BTW, what's the possibility of such problem in your test environment?
>>
>> It's like one in several times.
>> It may depend on hardware performance? (the machine is not so fast),
>>
>> I also noticed following warning happens too (not always):
>>
> 
> After digging into the case, it's more complex than just my patch.
> 
> Firstly, we lacks a lot of underflow check when modifying bytes_may_use.
> So we need to do all the underflow detection for every modifier of
> bytes_may_use.
> 
> Secondly, btrfs_cross_ref_exist() check makes NODATACOW check in
> __btrfs_buffered_write() unreliable.
> 
> For the following case, at __btrfs_buffered_write() time we're pretty
> sure we could do NODATACOW, but at sync time, due to cloned range,
> btrfs_cross_ref_exist() would detect reflinked prealloc extent, then
> falls back to CoW, and finally cause bytes_may_use underflow:
> 
> ---
> mkfs.btrfs -f $dev > $full_log
> 
> mount $dev $mnt -o nospace_cache
> btrfs quota enable $mnt
> btrfs quota rescan -w $mnt
> 
> xfs_io -f -c "falloc 0 2M" $mnt/file1 > /dev/null
> xfs_io -c "pwrite -b 1M 0 1M" $mnt/file1 > /dev/null
> xfs_io -c "reflink $mnt/file1 1M 4M 1M" $mnt/file1 > /dev/null
> sync
> ----

Thanks for the explanation. I will try to understand the relevant code.

> 
> Even without my patch, the "pwrite" command is still CoWed, which could
> be avoided.
> And that's the reason my patch is causing the underflow.
> 
> To fix this, we need more accurate btrfs_cross_ref_exist() check, not
> only for @disk_bytenr but also check @len.
> 
> Or we could try to flush the whole inode in clone_range() so we could go
> through NOCOW routine before clone really happens.

So as your RFC patch does not work, the option is first one?

Thanks,
Misono

> 
> Thanks,
> Qu
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space
  2018-08-30  1:57             ` Misono Tomohiro
@ 2018-08-30  2:14               ` Qu Wenruo
  0 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2018-08-30  2:14 UTC (permalink / raw)
  To: Misono Tomohiro, Qu Wenruo, linux-btrfs@vger.kernel.org


[-- Attachment #1.1: Type: text/plain, Size: 2239 bytes --]



On 2018/8/30 上午9:57, Misono Tomohiro wrote:
> On 2018/08/28 14:21, Qu Wenruo wrote:
>> On 2018/8/24 下午4:09, Misono Tomohiro wrote:
>> [snip]
>>>>
>>>> BTW, what's the possibility of such problem in your test environment?
>>>
>>> It's like one in several times.
>>> It may depend on hardware performance? (the machine is not so fast),
>>>
>>> I also noticed following warning happens too (not always):
>>>
>>
>> After digging into the case, it's more complex than just my patch.
>>
>> Firstly, we lacks a lot of underflow check when modifying bytes_may_use.
>> So we need to do all the underflow detection for every modifier of
>> bytes_may_use.
>>
>> Secondly, btrfs_cross_ref_exist() check makes NODATACOW check in
>> __btrfs_buffered_write() unreliable.
>>
>> For the following case, at __btrfs_buffered_write() time we're pretty
>> sure we could do NODATACOW, but at sync time, due to cloned range,
>> btrfs_cross_ref_exist() would detect reflinked prealloc extent, then
>> falls back to CoW, and finally cause bytes_may_use underflow:
>>
>> ---
>> mkfs.btrfs -f $dev > $full_log
>>
>> mount $dev $mnt -o nospace_cache
>> btrfs quota enable $mnt
>> btrfs quota rescan -w $mnt
>>
>> xfs_io -f -c "falloc 0 2M" $mnt/file1 > /dev/null
>> xfs_io -c "pwrite -b 1M 0 1M" $mnt/file1 > /dev/null
>> xfs_io -c "reflink $mnt/file1 1M 4M 1M" $mnt/file1 > /dev/null
>> sync
>> ----
> 
> Thanks for the explanation. I will try to understand the relevant code.

Feel free to ask if there is anything I forgot to explain.

> 
>>
>> Even without my patch, the "pwrite" command is still CoWed, which could
>> be avoided.
>> And that's the reason my patch is causing the underflow.
>>
>> To fix this, we need more accurate btrfs_cross_ref_exist() check, not
>> only for @disk_bytenr but also check @len.
>>
>> Or we could try to flush the whole inode in clone_range() so we could go
>> through NOCOW routine before clone really happens.
> 
> So as your RFC patch does not work, the option is first one?

Yes, and it may be more complex to do that, while even fixed we could
only get diminishing return.

Thanks,
Qu

> 
> Thanks,
> Misono
> 
>>
>> Thanks,
>> Qu
>>
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-08-30  6:14 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-08-21  5:40 [PATCH v2] btrfs: Always check nocow for quota enabled case to make sure we won't reserve unnecessary data space Qu Wenruo
2018-08-24  7:14 ` Misono Tomohiro
2018-08-24  7:20   ` Qu Wenruo
2018-08-24  7:54     ` Misono Tomohiro
2018-08-24  7:58       ` Qu Wenruo
2018-08-24  8:09         ` Misono Tomohiro
2018-08-28  5:21           ` Qu Wenruo
2018-08-30  1:57             ` Misono Tomohiro
2018-08-30  2:14               ` Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).