* Flush warning
@ 2017-08-03 18:32 Steve Wise
2017-08-07 1:06 ` Sagi Grimberg
0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-03 18:32 UTC (permalink / raw)
To: sagi grimberg, Christoph Hellwig
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
Hey guys,
We're seeing a WARNING happening when running an fio test on a single NVMF
attached ramdisk over iw_cxgb4. While the fio test is running, the NVMF host is
also killing the controller via writing to
/sys/block/nvme*/device/reset_controller. Here is the script:
----
[root@trinitycraft ~]# cat fio_issue.sh
num=0
fio --rw=randrw --name=random --norandommap --ioengine=libaio --size=400m \
    --group_reporting --exitall --fsync_on_close=1 --invalidate=1 --direct=1 \
    --filename=/dev/nvme0n1 --time_based --runtime=30 --iodepth=32 --numjobs=8 \
    --unit_base=1 --bs=4k --kb_base=1000 &

sleep 2
while [ $num -lt 30 ]
do
    echo 1 > /sys/block/nvme0n1/device/reset_controller
    [ $? -eq 1 ] && echo "reset_controller operation failed: $num" && exit 1
    ((num++))
    sleep 0.5
done
-----
The WARNING seems to be due to nvmet_rdma_queue_connect() calling
flush_scheduled_work() while in the upcall from the RDMA_CM. It is running
on the iw_cm event workqueue, which is created with WQ_MEM_RECLAIM set. I'm
not sure what this WARNING is telling me. Does the iw_cm workqueue NOT need
WQ_MEM_RECLAIM set? Or is there some other issue with the nvmet/rdma code
doing work flushing in the iw_cm workqueue context?
This is with 4.12.0.
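
For reference, the check that fires looks roughly like this (my paraphrase
of check_flush_dependency() in kernel/workqueue.c, trimmed, so not the
exact 4.12 source):

----
#include <linux/workqueue.h>

/*
 * Paraphrased sketch: a worker running on a WQ_MEM_RECLAIM workqueue
 * must never flush (wait on) a workqueue that lacks WQ_MEM_RECLAIM,
 * because the target's work may block on a memory allocation and
 * deadlock the reclaim path.
 */
static void check_flush_dependency(struct workqueue_struct *target_wq,
				   struct work_struct *target_work)
{
	struct worker *worker;

	if (target_wq->flags & WQ_MEM_RECLAIM)
		return;		/* flushing a reclaim-safe queue is always OK */

	worker = current_wq_worker();

	WARN_ONCE(worker &&
		  (worker->current_pwq->wq->flags & WQ_MEM_RECLAIM),
		  "workqueue: WQ_MEM_RECLAIM %s:%pf is flushing !WQ_MEM_RECLAIM %s",
		  worker->current_pwq->wq->name, worker->current_func,
		  target_wq->name);
}
----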
Any thoughts? Thanks!
Steve.
---
[ 1887.155804] workqueue: WQ_MEM_RECLAIM iw_cm_wq:cm_work_handler [iw_cm] is
flushing !WQ_MEM_RECLAIM events: (null)
[ 1887.155811] ------------[ cut here ]------------
[ 1887.155816] WARNING: CPU: 6 PID: 3355 at kernel/workqueue.c:2423
check_flush_dependency+0xa9/0x100
[ 1887.155817] Modules linked in: nvmet_rdma nvmet rdma_ucm ib_uverbs iw_cxgb4
cxgb4 brd rdma_cm iw_cm ib_cm ib_core libcxgb xt_CHECKSUM iptable_mangle
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT
nf_reject_ipv4 fuse tun bridge stp llc ebtable_filter ebtables ip6table_filter
ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal
intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support mxm_wmi
irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc nvme nvme_core
aesni_intel mei_me crypto_simd ipmi_si glue_helper cryptd mei pcspkr sg ioatdma
lpc_ich ipmi_devintf shpchp i2c_i801 mfd_core ipmi_msghandler wmi acpi_pad
acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace
[ 1887.155849] sunrpc ip_tables ext4 jbd2 mbcache sd_mod ast drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb ahci libahci ptp
libata crc32c_intel pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash
dm_log dm_mod dax [last unloaded: nvmet]
[ 1887.155863] CPU: 6 PID: 3355 Comm: kworker/u32:1 Not tainted 4.12.0 #1
[ 1887.155864] Hardware name: Supermicro X10DRi/X10DRi, BIOS 2.1 09/13/2016
[ 1887.155866] Workqueue: iw_cm_wq cm_work_handler [iw_cm]
[ 1887.155867] task: ffff944585015700 task.stack: ffffb36b47604000
[ 1887.155869] RIP: 0010:check_flush_dependency+0xa9/0x100
[ 1887.155870] RSP: 0018:ffffb36b476079f8 EFLAGS: 00010246
[ 1887.155871] RAX: 000000000000006e RBX: ffff943d9f808e00 RCX: ffffffff8cc605a8
[ 1887.155872] RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000202
[ 1887.155873] RBP: ffffb36b47607a10 R08: 000000000000006e R09: 00000000000005e9
[ 1887.155873] R10: 0000000000000000 R11: 000000000000006e R12: ffff943d92c72e40
[ 1887.155874] R13: 0000000000000000 R14: 0000000000000006 R15: ffffb36b47607a50
[ 1887.155875] FS: 0000000000000000(0000) GS:ffff9445bfc80000(0000)
knlGS:0000000000000000
[ 1887.155876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1887.155877] CR2: 00000000006e8430 CR3: 0000000160c09000 CR4: 00000000001406e0
[ 1887.155878] Call Trace:
[ 1887.155881] flush_workqueue+0x15a/0x490
[ 1887.155885] nvmet_rdma_queue_connect+0x7cf/0xc70 [nvmet_rdma]
[ 1887.155887] ? nvmet_rdma_cm_reject+0xa0/0xa0 [nvmet_rdma]
[ 1887.155888] nvmet_rdma_cm_handler+0x12f/0x2f0 [nvmet_rdma]
[ 1887.155893] iw_conn_req_handler+0x186/0x230 [rdma_cm]
[ 1887.155894] cm_work_handler+0xcef/0xd10 [iw_cm]
[ 1887.155897] process_one_work+0x149/0x360
[ 1887.155898] worker_thread+0x4d/0x3c0
[ 1887.155901] kthread+0x109/0x140
[ 1887.155902] ? rescuer_thread+0x380/0x380
[ 1887.155903] ? kthread_park+0x60/0x60
[ 1887.155907] ? do_syscall_64+0x67/0x150
[ 1887.155910] ret_from_fork+0x25/0x30
[ 1887.155911] Code: 49 8b 54 24 18 48 8d 8b b0 00 00 00 48 81 c6 b0 00 00 00 4d
89 e8 48 c7 c7 c0 fc a2 8c 31 c0 c6 05 bd 62 c6 00 01 e8 31 82 10 00 <0f> ff e9
77 ff ff ff 45 31 ed e9 66 ff ff ff 80 3d a3 62 c6 00
[ 1887.155926] ---[ end trace c67d348e72eb38e9 ]---
* Re: Flush warning
  2017-08-03 18:32 Flush warning Steve Wise
@ 2017-08-07  1:06 ` Sagi Grimberg
  [not found]   ` <9bc142de-b8ba-acb6-5ea1-2ccdbb578655-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2017-08-07 1:06 UTC (permalink / raw)
  To: Steve Wise, Christoph Hellwig
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

> Hey guys,

Hey Steve,

> We're seeing a WARNING happening when running an fio test on a single
> NVMF attached ramdisk over iw_cxgb4. While the fio test is running, the
> NVMF host is also killing the controller via writing to
> /sys/block/nvme*/device/reset_controller. Here is the script:
>
> ----
> [root@trinitycraft ~]# cat fio_issue.sh
> num=0
>
> fio --rw=randrw --name=random --norandommap --ioengine=libaio --size=400m \
>     --group_reporting --exitall --fsync_on_close=1 --invalidate=1 --direct=1 \
>     --filename=/dev/nvme0n1 --time_based --runtime=30 --iodepth=32 --numjobs=8 \
>     --unit_base=1 --bs=4k --kb_base=1000 &
>
> sleep 2
> while [ $num -lt 30 ]
> do
>     echo 1 > /sys/block/nvme0n1/device/reset_controller
>     [ $? -eq 1 ] && echo "reset_controller operation failed: $num" && exit 1
>     ((num++))
>     sleep 0.5
> done
> -----
>
> The WARNING seems to be due to nvmet_rdma_queue_connect() calling
> flush_scheduled_work() while in the upcall from the RDMA_CM. It is
> running on the iw_cm event workqueue, which is created with
> WQ_MEM_RECLAIM set. I'm not sure what this WARNING is telling me. Does
> the iw_cm workqueue NOT need WQ_MEM_RECLAIM set? Or is there some other
> issue with the nvmet/rdma code doing work flushing in the iw_cm
> workqueue context?

This flush is designed to prevent nvmet-rdma from having too many
in-flight resources in case of a high pace of controller teardown and
establishment (like you trigger in your test).

Queue teardowns are run on system_wq; does iw_cm need memory reclamation
protection?
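
The pattern described above amounts to something like the following
sketch (illustrative only; made-up names, not the exact nvmet-rdma
source):

----
#include <linux/workqueue.h>

/* Runs on system_wq via schedule_work(); frees one queue's QP/CQ state. */
static void example_release_queue_work(struct work_struct *w)
{
	/* ... tear down and free one queue ... */
}

static int example_queue_connect(void)
{
	/*
	 * Wait for previously scheduled queue-release work before
	 * admitting a new queue, so a rapid reset/connect loop cannot
	 * pile up unbounded in-flight teardowns.  This is the flush
	 * that trips check_flush_dependency() when the connect upcall
	 * arrives on the WQ_MEM_RECLAIM iw_cm_wq worker.
	 */
	flush_scheduled_work();

	/* ... allocate and accept the new queue ... */
	return 0;
}
----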
[parent not found: <9bc142de-b8ba-acb6-5ea1-2ccdbb578655-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* RE: Flush warning
  [not found] ` <9bc142de-b8ba-acb6-5ea1-2ccdbb578655-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2017-08-07 15:28   ` Steve Wise
  2017-08-09 16:21     ` Steve Wise
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-07 15:28 UTC (permalink / raw)
  To: 'Sagi Grimberg', 'Christoph Hellwig'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

> > The WARNING seems to be due to nvmet_rdma_queue_connect() calling
> > flush_scheduled_work() while in the upcall from the RDMA_CM. It is
> > running on the iw_cm event workqueue, which is created with
> > WQ_MEM_RECLAIM set. I'm not sure what this WARNING is telling me. Does
> > the iw_cm workqueue NOT need WQ_MEM_RECLAIM set? Or is there some
> > other issue with the nvmet/rdma code doing work flushing in the iw_cm
> > workqueue context?
>
> This flush is designed to prevent nvmet-rdma from having too many
> in-flight resources in case of a high pace of controller teardown and
> establishment (like you trigger in your test).
>
> Queue teardowns are run on system_wq; does iw_cm need memory
> reclamation protection?

I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I don't
know how to tell if iw_cm needs this or not. Can you give me an example
of a workqueue that _does_ need WQ_MEM_RECLAIM? I _think_ it means your
workqueue is required to run something that would get triggered by the
oom OS code, but I don't know if that would include rdma CMs or not...
* RE: Flush warning
  2017-08-07 15:28 ` Steve Wise
@ 2017-08-09 16:21   ` Steve Wise
  2017-08-09 16:27     ` Jason Gunthorpe
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-09 16:21 UTC (permalink / raw)
  To: 'Sean Hefty'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Sagi Grimberg', 'Christoph Hellwig'

> > > The WARNING seems to be due to nvmet_rdma_queue_connect() calling
> > > flush_scheduled_work() while in the upcall from the RDMA_CM. It is
> > > running on the iw_cm event workqueue, which is created with
> > > WQ_MEM_RECLAIM set. I'm not sure what this WARNING is telling me.
> > > Does the iw_cm workqueue NOT need WQ_MEM_RECLAIM set? Or is there
> > > some other issue with the nvmet/rdma code doing work flushing in
> > > the iw_cm workqueue context?
> >
> > This flush is designed to prevent nvmet-rdma from having too many
> > in-flight resources in case of a high pace of controller teardown and
> > establishment (like you trigger in your test).
> >
> > Queue teardowns are run on system_wq; does iw_cm need memory
> > reclamation protection?
>
> I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I don't
> know how to tell if iw_cm needs this or not. Can you give me an example
> of a workqueue that _does_ need WQ_MEM_RECLAIM? I _think_ it means your
> workqueue is required to run something that would get triggered by the
> oom OS code, but I don't know if that would include rdma CMs or not...

Many of the workqueues in infiniband/core use WQ_MEM_RECLAIM: cma, iwcm,
mad, multicast, sa_query, and ucma.

Hey Sean, do you have any insight into whether the CMA modules really
need WQ_MEM_RECLAIM for their workqueues?

Does anyone else know?

Thanks!

Steve.
* Re: Flush warning
  2017-08-09 16:21 ` Steve Wise
@ 2017-08-09 16:27   ` Jason Gunthorpe
  [not found]     ` <20170809162749.GA4069-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2017-08-09 16:27 UTC (permalink / raw)
  To: Steve Wise
  Cc: 'Sean Hefty', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Sagi Grimberg', 'Christoph Hellwig'

On Wed, Aug 09, 2017 at 11:21:38AM -0500, Steve Wise wrote:

> > I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I don't
> > know how to tell if iw_cm needs this or not. Can you give me an
> > example of a workqueue that _does_ need WQ_MEM_RECLAIM? I _think_ it
> > means your workqueue is required to run something that would get
> > triggered by the oom OS code, but I don't know if that would include
> > rdma CMs or not...
>
> Many of the workqueues in infiniband/core use WQ_MEM_RECLAIM: cma, iwcm,
> mad, multicast, sa_query, and ucma.
>
> Hey Sean, do you have any insight into whether the CMA modules really
> need WQ_MEM_RECLAIM for their workqueues?
>
> Does anyone else know?

Consider that the ib_core can be used to back storage. Ie consider a
situation where iSER/NFS/SRP needs to reconnect to respond to kernel
paging/reclaim.

On the surface it seems reasonable to me that these are on a reclaim
path?

Jason
[parent not found: <20170809162749.GA4069-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* RE: Flush warning
  [not found] ` <20170809162749.GA4069-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-08-09 16:38   ` Steve Wise
  2017-08-13  6:46     ` Leon Romanovsky
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-09 16:38 UTC (permalink / raw)
  To: 'Jason Gunthorpe'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Sean Hefty',
      'Sagi Grimberg', linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

> On Wed, Aug 09, 2017 at 11:21:38AM -0500, Steve Wise wrote:
>
> > > I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I
> > > don't know how to tell if iw_cm needs this or not. Can you give me
> > > an example of a workqueue that _does_ need WQ_MEM_RECLAIM? I
> > > _think_ it means your workqueue is required to run something that
> > > would get triggered by the oom OS code, but I don't know if that
> > > would include rdma CMs or not...
> >
> > Many of the workqueues in infiniband/core use WQ_MEM_RECLAIM: cma,
> > iwcm, mad, multicast, sa_query, and ucma.
> >
> > Hey Sean, do you have any insight into whether the CMA modules really
> > need WQ_MEM_RECLAIM for their workqueues?
> >
> > Does anyone else know?
>
> Consider that the ib_core can be used to back storage. Ie consider a
> situation where iSER/NFS/SRP needs to reconnect to respond to kernel
> paging/reclaim.
>
> On the surface it seems reasonable to me that these are on a reclaim
> path?
>
> Jason

Hmm. That seems reasonable. Then I would think nvme_rdma would also need
to be using a reclaim workqueue.

Sagi, do you think I should add a private workqueue with WQ_MEM_RECLAIM
to nvme_rdma vs using the system_wq? nvme/target probably needs one
also...

Steve.
* Re: Flush warning
  2017-08-09 16:38 ` Steve Wise
@ 2017-08-13  6:46   ` Leon Romanovsky
  [not found]     ` <20170813064651.GR24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2017-08-13 6:46 UTC (permalink / raw)
  To: Steve Wise
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      'Sean Hefty', 'Sagi Grimberg',
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

On Wed, Aug 09, 2017 at 11:38:49AM -0500, Steve Wise wrote:
> > On Wed, Aug 09, 2017 at 11:21:38AM -0500, Steve Wise wrote:
> >
> > > > I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I
> > > > don't know how to tell if iw_cm needs this or not. Can you give
> > > > me an example of a workqueue that _does_ need WQ_MEM_RECLAIM? I
> > > > _think_ it means your workqueue is required to run something that
> > > > would get triggered by the oom OS code, but I don't know if that
> > > > would include rdma CMs or not...
> > >
> > > Many of the workqueues in infiniband/core use WQ_MEM_RECLAIM: cma,
> > > iwcm, mad, multicast, sa_query, and ucma.
> > >
> > > Hey Sean, do you have any insight into whether the CMA modules
> > > really need WQ_MEM_RECLAIM for their workqueues?
> > >
> > > Does anyone else know?
> >
> > Consider that the ib_core can be used to back storage. Ie consider a
> > situation where iSER/NFS/SRP needs to reconnect to respond to kernel
> > paging/reclaim.
> >
> > On the surface it seems reasonable to me that these are on a reclaim
> > path?
> >
> > Jason
>
> Hmm. That seems reasonable. Then I would think nvme_rdma would also
> need to be using a reclaim workqueue.
>
> Sagi, do you think I should add a private workqueue with WQ_MEM_RECLAIM
> to nvme_rdma vs using the system_wq? nvme/target probably needs one
> also...

A workqueue which frees memory and doesn't allocate memory during
execution is supposed to be marked as WQ_MEM_RECLAIM. This flag causes
a priority increase for such a workqueue during low-memory conditions.
I don't remember it for sure, but I think the shrinker will call into
such workqueues in these conditions.

In normal conditions, it won't change a lot.

Thanks
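
Concretely, the flag is chosen when the workqueue is allocated; a
hypothetical ULP split along the lines Leon describes (made-up names):

----
#include <linux/workqueue.h>

static struct workqueue_struct *teardown_wq;	/* only frees resources */
static struct workqueue_struct *setup_wq;	/* allocates QPs, CQs, ... */

static int __init example_init(void)
{
	/*
	 * WQ_MEM_RECLAIM gives the queue a dedicated rescuer thread so
	 * it can make forward progress under memory pressure; its work
	 * items must therefore never allocate memory themselves.
	 */
	teardown_wq = alloc_workqueue("ulp_teardown_wq", WQ_MEM_RECLAIM, 0);
	if (!teardown_wq)
		return -ENOMEM;

	/* Work that allocates memory gets no reclaim guarantee. */
	setup_wq = alloc_workqueue("ulp_setup_wq", 0, 0);
	if (!setup_wq) {
		destroy_workqueue(teardown_wq);
		return -ENOMEM;
	}
	return 0;
}
----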
[parent not found: <20170813064651.GR24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>]
* Re: Flush warning
  [not found] ` <20170813064651.GR24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-08-13  9:14   ` Sagi Grimberg
  [not found]     ` <90ada4f5-d6a1-2b93-5164-c593955c20cf-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2017-08-13 9:14 UTC (permalink / raw)
  To: Leon Romanovsky, Steve Wise
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      'Sean Hefty', linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

>>>> Does anyone else know?
>>>
>>> Consider that the ib_core can be used to back storage. Ie consider a
>>> situation where iSER/NFS/SRP needs to reconnect to respond to kernel
>>> paging/reclaim.
>>>
>>> On the surface it seems reasonable to me that these are on a reclaim
>>> path?

I'm pretty sure that ULP connect will trigger memory allocations, which
will fail under memory pressure... Maybe I'm missing something.

>> Hmm. That seems reasonable. Then I would think nvme_rdma would also
>> need to be using a reclaim workqueue.
>>
>> Sagi, do you think I should add a private workqueue with WQ_MEM_RECLAIM
>> to nvme_rdma vs using the system_wq? nvme/target probably needs one
>> also...

I'm not sure; being unable to flush the system workqueue from CM context
is somewhat limiting... We could use a private workqueue for nvmet
teardowns, but I'm not sure we want to do that.

> A workqueue which frees memory and doesn't allocate memory during
> execution is supposed to be marked as WQ_MEM_RECLAIM. This flag causes
> a priority increase for such a workqueue during low-memory conditions.

Which to my understanding means that the CM workqueue should not use it:
on each CM connect, by definition, the ULP allocates memory (qp, cq
etc).
[parent not found: <90ada4f5-d6a1-2b93-5164-c593955c20cf-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* Re: Flush warning
  [not found] ` <90ada4f5-d6a1-2b93-5164-c593955c20cf-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2017-08-13 10:33   ` Leon Romanovsky
  [not found]     ` <20170813103359.GW24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2017-08-13 10:33 UTC (permalink / raw)
  To: Sagi Grimberg, Hal Rosenstock
  Cc: Steve Wise, 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      'Sean Hefty', linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

On Sun, Aug 13, 2017 at 02:14:58AM -0700, Sagi Grimberg wrote:
>
> > > > > Does anyone else know?
> > > >
> > > > Consider that the ib_core can be used to back storage. Ie consider
> > > > a situation where iSER/NFS/SRP needs to reconnect to respond to
> > > > kernel paging/reclaim.
> > > >
> > > > On the surface it seems reasonable to me that these are on a
> > > > reclaim path?
>
> I'm pretty sure that ULP connect will trigger memory allocations, which
> will fail under memory pressure... Maybe I'm missing something.
>
> > > Hmm. That seems reasonable. Then I would think nvme_rdma would also
> > > need to be using a reclaim workqueue.
> > >
> > > Sagi, do you think I should add a private workqueue with
> > > WQ_MEM_RECLAIM to nvme_rdma vs using the system_wq? nvme/target
> > > probably needs one also...
>
> I'm not sure; being unable to flush the system workqueue from CM
> context is somewhat limiting... We could use a private workqueue for
> nvmet teardowns, but I'm not sure we want to do that.
>
> > A workqueue which frees memory and doesn't allocate memory during
> > execution is supposed to be marked as WQ_MEM_RECLAIM. This flag
> > causes a priority increase for such a workqueue during low-memory
> > conditions.
>
> Which to my understanding means that the CM workqueue should not use
> it: on each CM connect, by definition, the ULP allocates memory (qp,
> cq etc).

From my understanding too.

That workqueue was introduced in 2005, in a977049dacde ("[PATCH] IB:
Add the kernel CM implementation"); it is not clear if that was
intentional.

Hal,
do you remember the rationale there?

Thanks
[parent not found: <20170813103359.GW24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>]
* Re: Flush warning
  [not found] ` <20170813103359.GW24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-08-14 12:13   ` Hal Rosenstock
  [not found]     ` <3d9cca62-4612-53dc-776c-3aeb2df58365-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Hal Rosenstock @ 2017-08-14 12:13 UTC (permalink / raw)
  To: Leon Romanovsky, Sagi Grimberg, Hal Rosenstock
  Cc: Steve Wise, 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      'Sean Hefty', linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

On 8/13/2017 6:33 AM, Leon Romanovsky wrote:
> On Sun, Aug 13, 2017 at 02:14:58AM -0700, Sagi Grimberg wrote:
>>
>>>>>> Does anyone else know?
>>>>>
>>>>> Consider that the ib_core can be used to back storage. Ie consider a
>>>>> situation where iSER/NFS/SRP needs to reconnect to respond to kernel
>>>>> paging/reclaim.
>>>>>
>>>>> On the surface it seems reasonable to me that these are on a reclaim
>>>>> path?
>>
>> I'm pretty sure that ULP connect will trigger memory allocations, which
>> will fail under memory pressure... Maybe I'm missing something.
>>
>>>> Hmm. That seems reasonable. Then I would think nvme_rdma would also
>>>> need to be using a reclaim workqueue.
>>>>
>>>> Sagi, do you think I should add a private workqueue with
>>>> WQ_MEM_RECLAIM to nvme_rdma vs using the system_wq? nvme/target
>>>> probably needs one also...
>>
>> I'm not sure; being unable to flush the system workqueue from CM
>> context is somewhat limiting... We could use a private workqueue for
>> nvmet teardowns, but I'm not sure we want to do that.
>>
>>> A workqueue which frees memory and doesn't allocate memory during
>>> execution is supposed to be marked as WQ_MEM_RECLAIM. This flag causes
>>> a priority increase for such a workqueue during low-memory conditions.
>>
>> Which to my understanding means that the CM workqueue should not use
>> it: on each CM connect, by definition, the ULP allocates memory (qp,
>> cq etc).
>
> From my understanding too.
>
> That workqueue was introduced in 2005, in a977049dacde ("[PATCH] IB:
> Add the kernel CM implementation"); it is not clear if that was
> intentional.
>
> Hal,
> do you remember the rationale there?

Sean is best to respond to this.

-- Hal
[parent not found: <3d9cca62-4612-53dc-776c-3aeb2df58365-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* RE: Flush warning
  [not found] ` <3d9cca62-4612-53dc-776c-3aeb2df58365-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2017-08-14 17:01   ` Hefty, Sean
  [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB17595A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Hefty, Sean @ 2017-08-14 17:01 UTC (permalink / raw)
  To: Hal Rosenstock, Leon Romanovsky, Sagi Grimberg, Hal Rosenstock
  Cc: Steve Wise, 'Jason Gunthorpe',
      linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
      'Christoph Hellwig'

> >>> A workqueue which frees memory and doesn't allocate memory during
> >>> execution is supposed to be marked as WQ_MEM_RECLAIM. This flag
> >>> causes a priority increase for such a workqueue during low-memory
> >>> conditions.
> >>
> >> Which to my understanding means that the CM workqueue should not use
> >> it: on each CM connect, by definition, the ULP allocates memory (qp,
> >> cq etc).
> >
> > From my understanding too.
> >
> > That workqueue was introduced in 2005, in a977049dacde ("[PATCH] IB:
> > Add the kernel CM implementation"); it is not clear if that was
> > intentional.
> >
> > Hal,
> > do you remember the rationale there?
>
> Sean is best to respond to this.

I believe the work queue was to avoid potential deadlocks that could
arise from using the MAD work queue. The original submission did not
mark the work queue with WQ_MEM_RECLAIM. I do not know when that change
was added.

- Sean
[parent not found: <1828884A29C6694DAF28B7E6B8A82373AB17595A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* RE: Flush warning
  [not found] ` <1828884A29C6694DAF28B7E6B8A82373AB17595A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-08-14 17:21   ` Steve Wise
  2017-08-15  9:09     ` Sagi Grimberg
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-14 17:21 UTC (permalink / raw)
  To: 'Hefty, Sean', 'Hal Rosenstock', 'Leon Romanovsky',
      'Sagi Grimberg', 'Hal Rosenstock'
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

> > >>> A workqueue which frees memory and doesn't allocate memory during
> > >>> execution is supposed to be marked as WQ_MEM_RECLAIM. This flag
> > >>> causes a priority increase for such a workqueue during low-memory
> > >>> conditions.
> > >>
> > >> Which to my understanding means that the CM workqueue should not
> > >> use it: on each CM connect, by definition, the ULP allocates
> > >> memory (qp, cq etc).
> > >
> > > From my understanding too.
> > >
> > > That workqueue was introduced in 2005, in a977049dacde ("[PATCH] IB:
> > > Add the kernel CM implementation"); it is not clear if that was
> > > intentional.
> > >
> > > Hal,
> > > do you remember the rationale there?
> >
> > Sean is best to respond to this.
>
> I believe the work queue was to avoid potential deadlocks that could
> arise from using the MAD work queue. The original submission did not
> mark the work queue with WQ_MEM_RECLAIM. I do not know when that change
> was added.

By the way, the queue I'm actually having the problem with is in the
iwcm, not the ibcm. I think we should remove WQ_MEM_RECLAIM from all the
CM queues, because I agree now that ULPs potentially can/will allocate
memory in the context of these workqueues...
* Re: Flush warning
  2017-08-14 17:21 ` Steve Wise
@ 2017-08-15  9:09   ` Sagi Grimberg
  [not found]     ` <64134765-9a10-d014-3ac4-b0f747d6c670-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2017-08-15 9:09 UTC (permalink / raw)
  To: Steve Wise, 'Hefty, Sean', 'Hal Rosenstock',
      'Leon Romanovsky', 'Hal Rosenstock'
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

> By the way, the queue I'm actually having the problem with is in the
> iwcm, not the ibcm. I think we should remove WQ_MEM_RECLAIM from all
> the CM queues, because I agree now that ULPs potentially can/will
> allocate memory in the context of these workqueues...

Agreed, want to take it Steve or should I?
[parent not found: <64134765-9a10-d014-3ac4-b0f747d6c670-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* RE: Flush warning
  [not found] ` <64134765-9a10-d014-3ac4-b0f747d6c670-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2017-08-15 13:42   ` Steve Wise
  0 siblings, 0 replies; 14+ messages in thread
From: Steve Wise @ 2017-08-15 13:42 UTC (permalink / raw)
  To: 'Sagi Grimberg', 'Hefty, Sean', 'Hal Rosenstock',
      'Leon Romanovsky', 'Hal Rosenstock'
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

> > By the way, the queue I'm actually having the problem with is in the
> > iwcm, not the ibcm. I think we should remove WQ_MEM_RECLAIM from all
> > the CM queues, because I agree now that ULPs potentially can/will
> > allocate memory in the context of these workqueues...
>
> Agreed, want to take it Steve or should I?

Please do take it. That would help. Thanks!

Steve
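
The change agreed on above would amount to something like the following
in drivers/infiniband/core/iwcm.c (a sketch of the direction only; it
assumes the 4.12 allocation site reads as the first line):

----
/* Assumed current state (4.12): the iw_cm event workqueue claims
 * reclaim safety even though ULP connect handlers run on it and
 * allocate memory (QPs, CQs, ...). */
iwcm_wq = alloc_ordered_workqueue("iw_cm_wq", WQ_MEM_RECLAIM);

/* Proposed direction: drop the flag, so flushing !WQ_MEM_RECLAIM
 * queues (e.g. system_wq from nvmet_rdma_queue_connect()) from the
 * iw_cm worker no longer trips check_flush_dependency(). */
iwcm_wq = alloc_ordered_workqueue("iw_cm_wq", 0);
----

The same reasoning would apply to the other WQ_MEM_RECLAIM workqueues in
infiniband/core that invoke ULP callbacks.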
end of thread

Thread overview: 14+ messages
2017-08-03 18:32 Flush warning Steve Wise
2017-08-07 1:06 ` Sagi Grimberg
[not found] ` <9bc142de-b8ba-acb6-5ea1-2ccdbb578655-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2017-08-07 15:28 ` Steve Wise
2017-08-09 16:21 ` Steve Wise
2017-08-09 16:27 ` Jason Gunthorpe
[not found] ` <20170809162749.GA4069-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-08-09 16:38 ` Steve Wise
2017-08-13 6:46 ` Leon Romanovsky
[not found] ` <20170813064651.GR24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-08-13 9:14 ` Sagi Grimberg
[not found] ` <90ada4f5-d6a1-2b93-5164-c593955c20cf-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2017-08-13 10:33 ` Leon Romanovsky
[not found] ` <20170813103359.GW24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-08-14 12:13 ` Hal Rosenstock
[not found] ` <3d9cca62-4612-53dc-776c-3aeb2df58365-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2017-08-14 17:01 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373AB17595A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-08-14 17:21 ` Steve Wise
2017-08-15 9:09 ` Sagi Grimberg
[not found] ` <64134765-9a10-d014-3ac4-b0f747d6c670-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2017-08-15 13:42 ` Steve Wise