* Flush warning
@ 2017-08-03 18:32 Steve Wise
2017-08-07 1:06 ` Sagi Grimberg
0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-03 18:32 UTC (permalink / raw)
To: sagi grimberg, Christoph Hellwig
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
Hey guys,
We're seeing a WARNING happening when running an fio test on a single NVMF
attached ramdisk over iw_cxgb4. While the fio test is running, the NVMF host is
also killing the controller via writing to
/sys/block/nvme*/device/reset_controller. Here is the script:
----
[root@trinitycraft ~]# cat fio_issue.sh
num=0
fio --rw=randrw --name=random --norandommap --ioengine=libaio --size=400m \
    --group_reporting --exitall --fsync_on_close=1 --invalidate=1 --direct=1 \
    --filename=/dev/nvme0n1 --time_based --runtime=30 --iodepth=32 --numjobs=8 \
    --unit_base=1 --bs=4k --kb_base=1000 &

sleep 2
while [ $num -lt 30 ]
do
    echo 1 > /sys/block/nvme0n1/device/reset_controller
    [ $? -eq 1 ] && echo "reset_controller operation failed: $num" && exit 1
    ((num++))
    sleep 0.5
done
-----
The WARNING seems to be due to nvmet_rdma_queue_connect() calling
flush_scheduled_work() while in the upcall from the RDMA_CM. It is running
on the iw_cm event workqueue, which is created with WQ_MEM_RECLAIM set. I'm
not sure what this WARNING is telling me. Does the iw_cm workqueue NOT need
WQ_MEM_RECLAIM set? Or is there some other issue with the nvmet/rdma code
doing work flushing in the iw_cm workqueue context?
This is with 4.12.0.
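
For reference, the check that fires looks roughly like this (my paraphrase
of check_flush_dependency() in kernel/workqueue.c, trimmed, so not the
exact 4.12 source):

----
#include <linux/workqueue.h>

/*
 * Paraphrased sketch: a worker running on a WQ_MEM_RECLAIM workqueue
 * must never flush (wait on) a workqueue that lacks WQ_MEM_RECLAIM,
 * because the target's work may block on a memory allocation and
 * deadlock the reclaim path.
 */
static void check_flush_dependency(struct workqueue_struct *target_wq,
				   struct work_struct *target_work)
{
	struct worker *worker;

	if (target_wq->flags & WQ_MEM_RECLAIM)
		return;		/* flushing a reclaim-safe queue is always OK */

	worker = current_wq_worker();

	WARN_ONCE(worker &&
		  (worker->current_pwq->wq->flags & WQ_MEM_RECLAIM),
		  "workqueue: WQ_MEM_RECLAIM %s:%pf is flushing !WQ_MEM_RECLAIM %s",
		  worker->current_pwq->wq->name, worker->current_func,
		  target_wq->name);
}
----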
Any thoughts? Thanks!
Steve.
---
[ 1887.155804] workqueue: WQ_MEM_RECLAIM iw_cm_wq:cm_work_handler [iw_cm] is
flushing !WQ_MEM_RECLAIM events: (null)
[ 1887.155811] ------------[ cut here ]------------
[ 1887.155816] WARNING: CPU: 6 PID: 3355 at kernel/workqueue.c:2423
check_flush_dependency+0xa9/0x100
[ 1887.155817] Modules linked in: nvmet_rdma nvmet rdma_ucm ib_uverbs iw_cxgb4
cxgb4 brd rdma_cm iw_cm ib_cm ib_core libcxgb xt_CHECKSUM iptable_mangle
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT
nf_reject_ipv4 fuse tun bridge stp llc ebtable_filter ebtables ip6table_filter
ip6_tables iptable_filter intel_rapl sb_edac x86_pkg_temp_thermal
intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support mxm_wmi
irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc nvme nvme_core
aesni_intel mei_me crypto_simd ipmi_si glue_helper cryptd mei pcspkr sg ioatdma
lpc_ich ipmi_devintf shpchp i2c_i801 mfd_core ipmi_msghandler wmi acpi_pad
acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace
[ 1887.155849] sunrpc ip_tables ext4 jbd2 mbcache sd_mod ast drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb ahci libahci ptp
libata crc32c_intel pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash
dm_log dm_mod dax [last unloaded: nvmet]
[ 1887.155863] CPU: 6 PID: 3355 Comm: kworker/u32:1 Not tainted 4.12.0 #1
[ 1887.155864] Hardware name: Supermicro X10DRi/X10DRi, BIOS 2.1 09/13/2016
[ 1887.155866] Workqueue: iw_cm_wq cm_work_handler [iw_cm]
[ 1887.155867] task: ffff944585015700 task.stack: ffffb36b47604000
[ 1887.155869] RIP: 0010:check_flush_dependency+0xa9/0x100
[ 1887.155870] RSP: 0018:ffffb36b476079f8 EFLAGS: 00010246
[ 1887.155871] RAX: 000000000000006e RBX: ffff943d9f808e00 RCX: ffffffff8cc605a8
[ 1887.155872] RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000202
[ 1887.155873] RBP: ffffb36b47607a10 R08: 000000000000006e R09: 00000000000005e9
[ 1887.155873] R10: 0000000000000000 R11: 000000000000006e R12: ffff943d92c72e40
[ 1887.155874] R13: 0000000000000000 R14: 0000000000000006 R15: ffffb36b47607a50
[ 1887.155875] FS: 0000000000000000(0000) GS:ffff9445bfc80000(0000)
knlGS:0000000000000000
[ 1887.155876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1887.155877] CR2: 00000000006e8430 CR3: 0000000160c09000 CR4: 00000000001406e0
[ 1887.155878] Call Trace:
[ 1887.155881] flush_workqueue+0x15a/0x490
[ 1887.155885] nvmet_rdma_queue_connect+0x7cf/0xc70 [nvmet_rdma]
[ 1887.155887] ? nvmet_rdma_cm_reject+0xa0/0xa0 [nvmet_rdma]
[ 1887.155888] nvmet_rdma_cm_handler+0x12f/0x2f0 [nvmet_rdma]
[ 1887.155893] iw_conn_req_handler+0x186/0x230 [rdma_cm]
[ 1887.155894] cm_work_handler+0xcef/0xd10 [iw_cm]
[ 1887.155897] process_one_work+0x149/0x360
[ 1887.155898] worker_thread+0x4d/0x3c0
[ 1887.155901] kthread+0x109/0x140
[ 1887.155902] ? rescuer_thread+0x380/0x380
[ 1887.155903] ? kthread_park+0x60/0x60
[ 1887.155907] ? do_syscall_64+0x67/0x150
[ 1887.155910] ret_from_fork+0x25/0x30
[ 1887.155911] Code: 49 8b 54 24 18 48 8d 8b b0 00 00 00 48 81 c6 b0 00 00 00 4d
89 e8 48 c7 c7 c0 fc a2 8c 31 c0 c6 05 bd 62 c6 00 01 e8 31 82 10 00 <0f> ff e9
77 ff ff ff 45 31 ed e9 66 ff ff ff 80 3d a3 62 c6 00
[ 1887.155926] ---[ end trace c67d348e72eb38e9 ]---
* Re: Flush warning
  2017-08-03 18:32 Flush warning Steve Wise
@ 2017-08-07  1:06 ` Sagi Grimberg
  [not found]   ` <9bc142de-b8ba-acb6-5ea1-2ccdbb578655-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2017-08-07 1:06 UTC (permalink / raw)
  To: Steve Wise, Christoph Hellwig
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

> Hey guys,

Hey Steve,

> We're seeing a WARNING happening when running an fio test on a single
> NVMF attached ramdisk over iw_cxgb4. While the fio test is running, the
> NVMF host is also killing the controller via writing to
> /sys/block/nvme*/device/reset_controller. Here is the script:
>
> ----
> [root@trinitycraft ~]# cat fio_issue.sh
> num=0
>
> fio --rw=randrw --name=random --norandommap --ioengine=libaio --size=400m \
>     --group_reporting --exitall --fsync_on_close=1 --invalidate=1 --direct=1 \
>     --filename=/dev/nvme0n1 --time_based --runtime=30 --iodepth=32 --numjobs=8 \
>     --unit_base=1 --bs=4k --kb_base=1000 &
>
> sleep 2
> while [ $num -lt 30 ]
> do
>     echo 1 > /sys/block/nvme0n1/device/reset_controller
>     [ $? -eq 1 ] && echo "reset_controller operation failed: $num" && exit 1
>     ((num++))
>     sleep 0.5
> done
> -----
>
> The WARNING seems to be due to nvmet_rdma_queue_connect() calling
> flush_scheduled_work() while in the upcall from the RDMA_CM. It is
> running on the iw_cm event workqueue, which is created with
> WQ_MEM_RECLAIM set. I'm not sure what this WARNING is telling me. Does
> the iw_cm workqueue NOT need WQ_MEM_RECLAIM set? Or is there some other
> issue with the nvmet/rdma code doing work flushing in the iw_cm
> workqueue context?

This flush is designed to prevent nvmet-rdma from having too many
in-flight resources in case of a high pace of controller teardown and
establishment (like you trigger in your test).

Queue teardowns are run on system_wq; does iw_cm need memory reclamation
protection?
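
The pattern described above amounts to something like the following
sketch (illustrative only; made-up names, not the exact nvmet-rdma
source):

----
#include <linux/workqueue.h>

/* Runs on system_wq via schedule_work(); frees one queue's QP/CQ state. */
static void example_release_queue_work(struct work_struct *w)
{
	/* ... tear down and free one queue ... */
}

static int example_queue_connect(void)
{
	/*
	 * Wait for previously scheduled queue-release work before
	 * admitting a new queue, so a rapid reset/connect loop cannot
	 * pile up unbounded in-flight teardowns.  This is the flush
	 * that trips check_flush_dependency() when the connect upcall
	 * arrives on the WQ_MEM_RECLAIM iw_cm_wq worker.
	 */
	flush_scheduled_work();

	/* ... allocate and accept the new queue ... */
	return 0;
}
----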
[parent not found: <9bc142de-b8ba-acb6-5ea1-2ccdbb578655-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* RE: Flush warning
  [not found] ` <9bc142de-b8ba-acb6-5ea1-2ccdbb578655-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2017-08-07 15:28   ` Steve Wise
  2017-08-09 16:21     ` Steve Wise
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-07 15:28 UTC (permalink / raw)
  To: 'Sagi Grimberg', 'Christoph Hellwig'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

> > The WARNING seems to be due to nvmet_rdma_queue_connect() calling
> > flush_scheduled_work() while in the upcall from the RDMA_CM. It is
> > running on the iw_cm event workqueue, which is created with
> > WQ_MEM_RECLAIM set. I'm not sure what this WARNING is telling me. Does
> > the iw_cm workqueue NOT need WQ_MEM_RECLAIM set? Or is there some
> > other issue with the nvmet/rdma code doing work flushing in the iw_cm
> > workqueue context?
>
> This flush is designed to prevent nvmet-rdma from having too many
> in-flight resources in case of a high pace of controller teardown and
> establishment (like you trigger in your test).
>
> Queue teardowns are run on system_wq; does iw_cm need memory
> reclamation protection?

I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I don't
know how to tell if iw_cm needs this or not. Can you give me an example
of a workqueue that _does_ need WQ_MEM_RECLAIM? I _think_ it means your
workqueue is required to run something that would get triggered by the
oom OS code, but I don't know if that would include rdma CMs or not...
* RE: Flush warning
  2017-08-07 15:28 ` Steve Wise
@ 2017-08-09 16:21   ` Steve Wise
  2017-08-09 16:27     ` Jason Gunthorpe
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-09 16:21 UTC (permalink / raw)
  To: 'Sean Hefty'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Sagi Grimberg', 'Christoph Hellwig'

> > > The WARNING seems to be due to nvmet_rdma_queue_connect() calling
> > > flush_scheduled_work() while in the upcall from the RDMA_CM. It is
> > > running on the iw_cm event workqueue, which is created with
> > > WQ_MEM_RECLAIM set. I'm not sure what this WARNING is telling me.
> > > Does the iw_cm workqueue NOT need WQ_MEM_RECLAIM set? Or is there
> > > some other issue with the nvmet/rdma code doing work flushing in
> > > the iw_cm workqueue context?
> >
> > This flush is designed to prevent nvmet-rdma from having too many
> > in-flight resources in case of a high pace of controller teardown and
> > establishment (like you trigger in your test).
> >
> > Queue teardowns are run on system_wq; does iw_cm need memory
> > reclamation protection?
>
> I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I don't
> know how to tell if iw_cm needs this or not. Can you give me an example
> of a workqueue that _does_ need WQ_MEM_RECLAIM? I _think_ it means your
> workqueue is required to run something that would get triggered by the
> oom OS code, but I don't know if that would include rdma CMs or not...

Many of the workqueues in infiniband/core use WQ_MEM_RECLAIM: cma, iwcm,
mad, multicast, sa_query, and ucma.

Hey Sean, do you have any insight into whether the CMA modules really
need WQ_MEM_RECLAIM for their workqueues?

Does anyone else know?

Thanks!

Steve.
* Re: Flush warning
  2017-08-09 16:21 ` Steve Wise
@ 2017-08-09 16:27   ` Jason Gunthorpe
  [not found]     ` <20170809162749.GA4069-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2017-08-09 16:27 UTC (permalink / raw)
  To: Steve Wise
  Cc: 'Sean Hefty', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Sagi Grimberg', 'Christoph Hellwig'

On Wed, Aug 09, 2017 at 11:21:38AM -0500, Steve Wise wrote:

> > I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I don't
> > know how to tell if iw_cm needs this or not. Can you give me an
> > example of a workqueue that _does_ need WQ_MEM_RECLAIM? I _think_ it
> > means your workqueue is required to run something that would get
> > triggered by the oom OS code, but I don't know if that would include
> > rdma CMs or not...
>
> Many of the workqueues in infiniband/core use WQ_MEM_RECLAIM: cma, iwcm,
> mad, multicast, sa_query, and ucma.
>
> Hey Sean, do you have any insight into whether the CMA modules really
> need WQ_MEM_RECLAIM for their workqueues?
>
> Does anyone else know?

Consider that the ib_core can be used to back storage. Ie consider a
situation where iSER/NFS/SRP needs to reconnect to respond to kernel
paging/reclaim.

On the surface it seems reasonable to me that these are on a reclaim
path?

Jason
[parent not found: <20170809162749.GA4069-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>]
* RE: Flush warning
  [not found] ` <20170809162749.GA4069-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-08-09 16:38   ` Steve Wise
  2017-08-13  6:46     ` Leon Romanovsky
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-09 16:38 UTC (permalink / raw)
  To: 'Jason Gunthorpe'
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, 'Sean Hefty',
      'Sagi Grimberg', linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

> On Wed, Aug 09, 2017 at 11:21:38AM -0500, Steve Wise wrote:
>
> > > I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I
> > > don't know how to tell if iw_cm needs this or not. Can you give me
> > > an example of a workqueue that _does_ need WQ_MEM_RECLAIM? I
> > > _think_ it means your workqueue is required to run something that
> > > would get triggered by the oom OS code, but I don't know if that
> > > would include rdma CMs or not...
> >
> > Many of the workqueues in infiniband/core use WQ_MEM_RECLAIM: cma,
> > iwcm, mad, multicast, sa_query, and ucma.
> >
> > Hey Sean, do you have any insight into whether the CMA modules really
> > need WQ_MEM_RECLAIM for their workqueues?
> >
> > Does anyone else know?
>
> Consider that the ib_core can be used to back storage. Ie consider a
> situation where iSER/NFS/SRP needs to reconnect to respond to kernel
> paging/reclaim.
>
> On the surface it seems reasonable to me that these are on a reclaim
> path?
>
> Jason

Hmm. That seems reasonable. Then I would think nvme_rdma would also need
to be using a reclaim workqueue.

Sagi, do you think I should add a private workqueue with WQ_MEM_RECLAIM
to nvme_rdma vs using the system_wq? nvme/target probably needs one
also...

Steve.
* Re: Flush warning
  2017-08-09 16:38 ` Steve Wise
@ 2017-08-13  6:46   ` Leon Romanovsky
  [not found]     ` <20170813064651.GR24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2017-08-13 6:46 UTC (permalink / raw)
  To: Steve Wise
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      'Sean Hefty', 'Sagi Grimberg',
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

On Wed, Aug 09, 2017 at 11:38:49AM -0500, Steve Wise wrote:
> > On Wed, Aug 09, 2017 at 11:21:38AM -0500, Steve Wise wrote:
> >
> > > > I don't know. I read the workqueue doc on WQ_MEM_RECLAIM, but I
> > > > don't know how to tell if iw_cm needs this or not. Can you give
> > > > me an example of a workqueue that _does_ need WQ_MEM_RECLAIM? I
> > > > _think_ it means your workqueue is required to run something that
> > > > would get triggered by the oom OS code, but I don't know if that
> > > > would include rdma CMs or not...
> > >
> > > Many of the workqueues in infiniband/core use WQ_MEM_RECLAIM: cma,
> > > iwcm, mad, multicast, sa_query, and ucma.
> > >
> > > Hey Sean, do you have any insight into whether the CMA modules
> > > really need WQ_MEM_RECLAIM for their workqueues?
> > >
> > > Does anyone else know?
> >
> > Consider that the ib_core can be used to back storage. Ie consider a
> > situation where iSER/NFS/SRP needs to reconnect to respond to kernel
> > paging/reclaim.
> >
> > On the surface it seems reasonable to me that these are on a reclaim
> > path?
> >
> > Jason
>
> Hmm. That seems reasonable. Then I would think nvme_rdma would also
> need to be using a reclaim workqueue.
>
> Sagi, do you think I should add a private workqueue with WQ_MEM_RECLAIM
> to nvme_rdma vs using the system_wq? nvme/target probably needs one
> also...

A workqueue which frees memory and doesn't allocate memory during
execution is supposed to be marked as WQ_MEM_RECLAIM. This flag causes
a priority increase for such a workqueue during low-memory conditions.
I don't remember it for sure, but I think the shrinker will call into
such workqueues in these conditions.

In normal conditions, it won't change a lot.

Thanks
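
Concretely, the flag is chosen when the workqueue is allocated; a
hypothetical ULP split along the lines Leon describes (made-up names):

----
#include <linux/workqueue.h>

static struct workqueue_struct *teardown_wq;	/* only frees resources */
static struct workqueue_struct *setup_wq;	/* allocates QPs, CQs, ... */

static int __init example_init(void)
{
	/*
	 * WQ_MEM_RECLAIM gives the queue a dedicated rescuer thread so
	 * it can make forward progress under memory pressure; its work
	 * items must therefore never allocate memory themselves.
	 */
	teardown_wq = alloc_workqueue("ulp_teardown_wq", WQ_MEM_RECLAIM, 0);
	if (!teardown_wq)
		return -ENOMEM;

	/* Work that allocates memory gets no reclaim guarantee. */
	setup_wq = alloc_workqueue("ulp_setup_wq", 0, 0);
	if (!setup_wq) {
		destroy_workqueue(teardown_wq);
		return -ENOMEM;
	}
	return 0;
}
----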
[parent not found: <20170813064651.GR24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>]
* Re: Flush warning
  [not found] ` <20170813064651.GR24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-08-13  9:14   ` Sagi Grimberg
  [not found]     ` <90ada4f5-d6a1-2b93-5164-c593955c20cf-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2017-08-13 9:14 UTC (permalink / raw)
  To: Leon Romanovsky, Steve Wise
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      'Sean Hefty', linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

>>>> Does anyone else know?
>>>
>>> Consider that the ib_core can be used to back storage. Ie consider a
>>> situation where iSER/NFS/SRP needs to reconnect to respond to kernel
>>> paging/reclaim.
>>>
>>> On the surface it seems reasonable to me that these are on a reclaim
>>> path?

I'm pretty sure that ULP connect will trigger memory allocations, which
will fail under memory pressure... Maybe I'm missing something.

>> Hmm. That seems reasonable. Then I would think nvme_rdma would also
>> need to be using a reclaim workqueue.
>>
>> Sagi, do you think I should add a private workqueue with WQ_MEM_RECLAIM
>> to nvme_rdma vs using the system_wq? nvme/target probably needs one
>> also...

I'm not sure; being unable to flush the system workqueue from CM context
is somewhat limiting... We could use a private workqueue for nvmet
teardowns, but I'm not sure we want to do that.

> A workqueue which frees memory and doesn't allocate memory during
> execution is supposed to be marked as WQ_MEM_RECLAIM. This flag causes
> a priority increase for such a workqueue during low-memory conditions.

Which to my understanding means that the CM workqueue should not use it:
on each CM connect, by definition, the ULP allocates memory (qp, cq
etc).
[parent not found: <90ada4f5-d6a1-2b93-5164-c593955c20cf-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* Re: Flush warning
  [not found] ` <90ada4f5-d6a1-2b93-5164-c593955c20cf-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2017-08-13 10:33   ` Leon Romanovsky
  [not found]     ` <20170813103359.GW24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2017-08-13 10:33 UTC (permalink / raw)
  To: Sagi Grimberg, Hal Rosenstock
  Cc: Steve Wise, 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      'Sean Hefty', linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

On Sun, Aug 13, 2017 at 02:14:58AM -0700, Sagi Grimberg wrote:
>
> > > > > Does anyone else know?
> > > >
> > > > Consider that the ib_core can be used to back storage. Ie consider
> > > > a situation where iSER/NFS/SRP needs to reconnect to respond to
> > > > kernel paging/reclaim.
> > > >
> > > > On the surface it seems reasonable to me that these are on a
> > > > reclaim path?
>
> I'm pretty sure that ULP connect will trigger memory allocations, which
> will fail under memory pressure... Maybe I'm missing something.
>
> > > Hmm. That seems reasonable. Then I would think nvme_rdma would also
> > > need to be using a reclaim workqueue.
> > >
> > > Sagi, do you think I should add a private workqueue with
> > > WQ_MEM_RECLAIM to nvme_rdma vs using the system_wq? nvme/target
> > > probably needs one also...
>
> I'm not sure; being unable to flush the system workqueue from CM
> context is somewhat limiting... We could use a private workqueue for
> nvmet teardowns, but I'm not sure we want to do that.
>
> > A workqueue which frees memory and doesn't allocate memory during
> > execution is supposed to be marked as WQ_MEM_RECLAIM. This flag
> > causes a priority increase for such a workqueue during low-memory
> > conditions.
>
> Which to my understanding means that the CM workqueue should not use
> it: on each CM connect, by definition, the ULP allocates memory (qp,
> cq etc).

From my understanding too.

That workqueue was introduced in 2005, in a977049dacde ("[PATCH] IB:
Add the kernel CM implementation"); it is not clear if that was
intentional.

Hal,
do you remember the rationale there?

Thanks
[parent not found: <20170813103359.GW24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>]
* Re: Flush warning
  [not found] ` <20170813103359.GW24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-08-14 12:13   ` Hal Rosenstock
  [not found]     ` <3d9cca62-4612-53dc-776c-3aeb2df58365-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Hal Rosenstock @ 2017-08-14 12:13 UTC (permalink / raw)
  To: Leon Romanovsky, Sagi Grimberg, Hal Rosenstock
  Cc: Steve Wise, 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      'Sean Hefty', linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

On 8/13/2017 6:33 AM, Leon Romanovsky wrote:
> On Sun, Aug 13, 2017 at 02:14:58AM -0700, Sagi Grimberg wrote:
>>
>>>>>> Does anyone else know?
>>>>>
>>>>> Consider that the ib_core can be used to back storage. Ie consider a
>>>>> situation where iSER/NFS/SRP needs to reconnect to respond to kernel
>>>>> paging/reclaim.
>>>>>
>>>>> On the surface it seems reasonable to me that these are on a reclaim
>>>>> path?
>>
>> I'm pretty sure that ULP connect will trigger memory allocations, which
>> will fail under memory pressure... Maybe I'm missing something.
>>
>>>> Hmm. That seems reasonable. Then I would think nvme_rdma would also
>>>> need to be using a reclaim workqueue.
>>>>
>>>> Sagi, do you think I should add a private workqueue with
>>>> WQ_MEM_RECLAIM to nvme_rdma vs using the system_wq? nvme/target
>>>> probably needs one also...
>>
>> I'm not sure; being unable to flush the system workqueue from CM
>> context is somewhat limiting... We could use a private workqueue for
>> nvmet teardowns, but I'm not sure we want to do that.
>>
>>> A workqueue which frees memory and doesn't allocate memory during
>>> execution is supposed to be marked as WQ_MEM_RECLAIM. This flag causes
>>> a priority increase for such a workqueue during low-memory conditions.
>>
>> Which to my understanding means that the CM workqueue should not use
>> it: on each CM connect, by definition, the ULP allocates memory (qp,
>> cq etc).
>
> From my understanding too.
>
> That workqueue was introduced in 2005, in a977049dacde ("[PATCH] IB:
> Add the kernel CM implementation"); it is not clear if that was
> intentional.
>
> Hal,
> do you remember the rationale there?

Sean is best to respond to this.

-- Hal
[parent not found: <3d9cca62-4612-53dc-776c-3aeb2df58365-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* RE: Flush warning
  [not found] ` <3d9cca62-4612-53dc-776c-3aeb2df58365-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2017-08-14 17:01   ` Hefty, Sean
  [not found]     ` <1828884A29C6694DAF28B7E6B8A82373AB17595A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Hefty, Sean @ 2017-08-14 17:01 UTC (permalink / raw)
  To: Hal Rosenstock, Leon Romanovsky, Sagi Grimberg, Hal Rosenstock
  Cc: Steve Wise, 'Jason Gunthorpe',
      linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
      'Christoph Hellwig'

> >>> A workqueue which frees memory and doesn't allocate memory during
> >>> execution is supposed to be marked as WQ_MEM_RECLAIM. This flag
> >>> causes a priority increase for such a workqueue during low-memory
> >>> conditions.
> >>
> >> Which to my understanding means that the CM workqueue should not use
> >> it: on each CM connect, by definition, the ULP allocates memory (qp,
> >> cq etc).
> >
> > From my understanding too.
> >
> > That workqueue was introduced in 2005, in a977049dacde ("[PATCH] IB:
> > Add the kernel CM implementation"); it is not clear if that was
> > intentional.
> >
> > Hal,
> > do you remember the rationale there?
>
> Sean is best to respond to this.

I believe the work queue was to avoid potential deadlocks that could
arise from using the MAD work queue. The original submission did not
mark the work queue with WQ_MEM_RECLAIM. I do not know when that change
was added.

- Sean
[parent not found: <1828884A29C6694DAF28B7E6B8A82373AB17595A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* RE: Flush warning
  [not found] ` <1828884A29C6694DAF28B7E6B8A82373AB17595A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-08-14 17:21   ` Steve Wise
  2017-08-15  9:09     ` Sagi Grimberg
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2017-08-14 17:21 UTC (permalink / raw)
  To: 'Hefty, Sean', 'Hal Rosenstock', 'Leon Romanovsky',
      'Sagi Grimberg', 'Hal Rosenstock'
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

> > >>> A workqueue which frees memory and doesn't allocate memory during
> > >>> execution is supposed to be marked as WQ_MEM_RECLAIM. This flag
> > >>> causes a priority increase for such a workqueue during low-memory
> > >>> conditions.
> > >>
> > >> Which to my understanding means that the CM workqueue should not
> > >> use it: on each CM connect, by definition, the ULP allocates
> > >> memory (qp, cq etc).
> > >
> > > From my understanding too.
> > >
> > > That workqueue was introduced in 2005, in a977049dacde ("[PATCH] IB:
> > > Add the kernel CM implementation"); it is not clear if that was
> > > intentional.
> > >
> > > Hal,
> > > do you remember the rationale there?
> >
> > Sean is best to respond to this.
>
> I believe the work queue was to avoid potential deadlocks that could
> arise from using the MAD work queue. The original submission did not
> mark the work queue with WQ_MEM_RECLAIM. I do not know when that change
> was added.

By the way, the queue I'm actually having the problem with is in the
iwcm, not the ibcm. I think we should remove WQ_MEM_RECLAIM from all the
CM queues, because I agree now that ULPs potentially can/will allocate
memory in the context of these workqueues...
* Re: Flush warning
  2017-08-14 17:21 ` Steve Wise
@ 2017-08-15  9:09   ` Sagi Grimberg
  [not found]     ` <64134765-9a10-d014-3ac4-b0f747d6c670-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Sagi Grimberg @ 2017-08-15 9:09 UTC (permalink / raw)
  To: Steve Wise, 'Hefty, Sean', 'Hal Rosenstock',
      'Leon Romanovsky', 'Hal Rosenstock'
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

> By the way, the queue I'm actually having the problem with is in the
> iwcm, not the ibcm. I think we should remove WQ_MEM_RECLAIM from all
> the CM queues, because I agree now that ULPs potentially can/will
> allocate memory in the context of these workqueues...

Agreed, want to take it Steve or should I?
[parent not found: <64134765-9a10-d014-3ac4-b0f747d6c670-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* RE: Flush warning
  [not found] ` <64134765-9a10-d014-3ac4-b0f747d6c670-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2017-08-15 13:42   ` Steve Wise
  0 siblings, 0 replies; 14+ messages in thread
From: Steve Wise @ 2017-08-15 13:42 UTC (permalink / raw)
  To: 'Sagi Grimberg', 'Hefty, Sean', 'Hal Rosenstock',
      'Leon Romanovsky', 'Hal Rosenstock'
  Cc: 'Jason Gunthorpe', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
      linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
      'Christoph Hellwig'

> > By the way, the queue I'm actually having the problem with is in the
> > iwcm, not the ibcm. I think we should remove WQ_MEM_RECLAIM from all
> > the CM queues, because I agree now that ULPs potentially can/will
> > allocate memory in the context of these workqueues...
>
> Agreed, want to take it Steve or should I?

Please do take it. That would help. Thanks!

Steve
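
The change agreed on above would amount to something like the following
in drivers/infiniband/core/iwcm.c (a sketch of the direction only; it
assumes the 4.12 allocation site reads as the first line):

----
/* Assumed current state (4.12): the iw_cm event workqueue claims
 * reclaim safety even though ULP connect handlers run on it and
 * allocate memory (QPs, CQs, ...). */
iwcm_wq = alloc_ordered_workqueue("iw_cm_wq", WQ_MEM_RECLAIM);

/* Proposed direction: drop the flag, so flushing !WQ_MEM_RECLAIM
 * queues (e.g. system_wq from nvmet_rdma_queue_connect()) from the
 * iw_cm worker no longer trips check_flush_dependency(). */
iwcm_wq = alloc_ordered_workqueue("iw_cm_wq", 0);
----

The same reasoning would apply to the other WQ_MEM_RECLAIM workqueues in
infiniband/core that invoke ULP callbacks.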
end of thread

Thread overview: 14+ messages
2017-08-03 18:32 Flush warning Steve Wise
2017-08-07 1:06 ` Sagi Grimberg
[not found] ` <9bc142de-b8ba-acb6-5ea1-2ccdbb578655-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2017-08-07 15:28 ` Steve Wise
2017-08-09 16:21 ` Steve Wise
2017-08-09 16:27 ` Jason Gunthorpe
[not found] ` <20170809162749.GA4069-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-08-09 16:38 ` Steve Wise
2017-08-13 6:46 ` Leon Romanovsky
[not found] ` <20170813064651.GR24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-08-13 9:14 ` Sagi Grimberg
[not found] ` <90ada4f5-d6a1-2b93-5164-c593955c20cf-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2017-08-13 10:33 ` Leon Romanovsky
[not found] ` <20170813103359.GW24282-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-08-14 12:13 ` Hal Rosenstock
[not found] ` <3d9cca62-4612-53dc-776c-3aeb2df58365-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2017-08-14 17:01 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373AB17595A-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-08-14 17:21 ` Steve Wise
2017-08-15 9:09 ` Sagi Grimberg
[not found] ` <64134765-9a10-d014-3ac4-b0f747d6c670-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2017-08-15 13:42 ` Steve Wise