* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Lakshmi Narasimhan Sundararajan @ 2019-07-30 4:58 UTC
To: lvm-devel

Hi Team,

A very good day to all.

I am using lvmcache in writeback mode. When dirty blocks remain in the LV and the LV needs to be destroyed or flushed, there seem to be conditions under which the dirty-data flush gets stuck forever.

As an example:

root@pdc4-sm35:~# lvremove -f pwx0/pool
  367 blocks must still be flushed.
  367 blocks must still be flushed.
  367 blocks must still be flushed.
  367 blocks must still be flushed.
  367 blocks must still be flushed.
  367 blocks must still be flushed.
^C
root@pdc4-sm35:~#

I am running these versions:

root@pdc4-sm35:~# lvm version
  LVM version:     2.02.133(2) (2015-10-30)
  Library version: 1.02.110 (2015-10-30)
  Driver version:  4.34.0
root@pdc4-sm35:~#

This issue seems old and has been reported in multiple places. There has been some acknowledgement that it is resolved in 2.02.133, but I still see it. I have also seen posts reporting it in 2.02.170+ (for example https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=878441, Package: lvm2, Version: 2.02.173-1, Severity: normal).

I filed an issue myself, https://github.com/lvmteam/lvm2/issues/22, to understand from you experts where we stand on this.

I would sincerely appreciate your help in understanding the state of this issue in more detail.

Best regards
LN
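The stall in the transcript above (the same count repeated on every progress line) can be detected mechanically. The following is only a minimal sketch: the function names and the window of five samples are illustrative assumptions, and the only thing taken from the report is the wording of the progress line.

```python
import re
from typing import List

# Matches the progress line shown in the lvremove transcript.
FLUSH_LINE = re.compile(r"^\s*(\d+) blocks must still be flushed\.\s*$")


def dirty_counts(output: str) -> List[int]:
    """Extract the dirty-block count from each progress line, in order."""
    counts = []
    for line in output.splitlines():
        match = FLUSH_LINE.match(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


def looks_stuck(counts: List[int], window: int = 5) -> bool:
    """True if the last `window` samples are identical and non-zero.

    The window size is an arbitrary choice, not an LVM setting.
    """
    if len(counts) < window:
        return False
    tail = counts[-window:]
    return tail[0] > 0 and all(c == tail[0] for c in tail)


# Six identical lines, as in the report.
sample = "\n".join(["  367 blocks must still be flushed."] * 6)
print(looks_stuck(dirty_counts(sample)))
```

A count that keeps decreasing, however slowly, is not flagged; only a flat, non-zero tail is.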
* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Nikhil Kshirsagar @ 2019-07-30 8:02 UTC
To: lvm-devel

This used to happen when the chunk size was increased because covering the size of the cached LV would otherwise have needed more than a million chunks. What is the size of the pool?

Regards,
Nikhil.
* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Nikhil Kshirsagar @ 2019-07-30 8:15 UTC
To: lvm-devel

Please see https://bugzilla.redhat.com/show_bug.cgi?id=1665650 and https://bugzilla.redhat.com/show_bug.cgi?id=1661987

I think this was fixed by the migration threshold fix in https://bugzilla.redhat.com/show_bug.cgi?id=1665654

Regards,
Nikhil.
* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Lakshmi Narasimhan Sundararajan @ 2019-07-31 9:53 UTC
To: lvm-devel

Hi Nikhil,

Thank you for your email. Much appreciated.

In my environment, the chunk size is fixed at 1M irrespective of the pool size. This may take the number of entries over 1M and result in a kernel warning. But the systems we use are large, so memory and CPU bottlenecks do not seem to be a factor in our testing.

I looked at the bugs. For the first one, about chunk size > 1M, we should be safe given that our chunk size is fixed at 1MB.

The other one, about the migration threshold, is interesting; I will have to validate this again.

What is the unit of the migration threshold? Is it the number of 512-byte sectors? And what exactly is its definition?

Also, curiously, this does not seem to be exposed through the lvm CLI. Does it have to be fetched through dmsetup?

Thanks
LN
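The remark about a fixed 1M chunk size pushing the number of entries over 1M is simple arithmetic. As a rough sketch, assuming "1M" means 1 MiB and taking the "million chunks" figure mentioned earlier in the thread as the limit (the pool sizes below are hypothetical, since the actual pool size is not stated):

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4
CHUNK_LIMIT = 1_000_000  # "a million chunks"; the exact limit is an assumption


def chunk_count(pool_bytes: int, chunk_bytes: int) -> int:
    """Number of chunks needed to cover the pool, rounding up."""
    return -(-pool_bytes // chunk_bytes)


# Hypothetical pool sizes, all with a fixed 1 MiB chunk.
for pool_bytes in (512 * GIB, TIB, 2 * TIB):
    n = chunk_count(pool_bytes, MIB)
    note = "over the limit" if n > CHUNK_LIMIT else "within the limit"
    print(f"{pool_bytes // GIB} GiB pool -> {n} chunks ({note})")
```

Under these assumptions a pool of about 1 TiB is where a fixed 1 MiB chunk crosses the million-chunk mark.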
* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Nikhil Kshirsagar @ 2019-08-02 11:44 UTC
To: lvm-devel

Hello,

You are welcome.

The migration threshold is in terms of chunks, I think, so it should be at least one chunk so that the looping forever won't happen. The bug we found was that if the chunk size grows beyond a certain value, which is triggered by a cached LV larger than 1 TB, the migration threshold ends up hard-coded to a value lower than the increased chunk size.

Yes, the migration threshold needs better documentation and explanation right now, as well as the ability to see it from lvm commands just as we can see the chunk size. We are working on it through the BZs mentioned earlier. (See the BZ about the migration threshold needing better documentation in the man pages.)

I think right now you can get it only at the device-mapper layer; I will check.

Regards,
Nikhil.
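The condition described above, that the migration threshold must cover at least one chunk or flushing can never make progress, can be written as a small check. This is a sketch under stated assumptions: the kernel's dm-cache documentation gives migration_threshold as a number of 512-byte sectors, so the comparison is done in sectors; whether the threshold must be strictly larger than one chunk or merely equal to it is an assumption here and may differ between kernel versions, so the check defaults to the more conservative strict form. The function names are illustrative.

```python
SECTOR_BYTES = 512
MIB = 1024 ** 2


def chunk_sectors(chunk_bytes: int) -> int:
    """Size of one cache chunk in 512-byte sectors."""
    return chunk_bytes // SECTOR_BYTES


def threshold_allows_migration(threshold_sectors: int,
                               chunk_bytes: int,
                               strict: bool = True) -> bool:
    """True if one chunk fits within the migration threshold.

    strict=True requires threshold > chunk size (assumption: conservative);
    strict=False accepts threshold == chunk size as sufficient.
    """
    needed = chunk_sectors(chunk_bytes)
    if strict:
        return threshold_sectors > needed
    return threshold_sectors >= needed


# 1 MiB chunks with a 2048-sector threshold: exactly one chunk.
print(threshold_allows_migration(2048, MIB))                # strict check
print(threshold_allows_migration(2048, MIB, strict=False))  # lenient check
# A chunk size that has grown to 2 MiB no longer fits in 2048 sectors.
print(threshold_allows_migration(2048, 2 * MIB, strict=False))
```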
* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Nikhil Kshirsagar @ 2019-08-03 7:26 UTC
To: lvm-devel

Can you try increasing the migration threshold through the device-mapper commands and check whether this gets rid of the infinite flushes?
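One way to act on this suggestion is dmsetup's `message` subcommand, which the kernel's dm-cache documentation uses for runtime tunables such as migration_threshold. The sketch below only builds and prints the command rather than running it. The device-mapper name `pwx0-cachedlv` and the value 4096 are hypothetical placeholders, since the thread does not name the cached LV.

```python
import shlex
from typing import List


def migration_threshold_cmd(dm_name: str, sectors: int) -> List[str]:
    """Build a dmsetup command that sets migration_threshold (in sectors)."""
    if sectors <= 0:
        raise ValueError("sectors must be positive")
    # The "0" is the sector argument that `dmsetup message` takes
    # before the message text itself.
    return ["dmsetup", "message", dm_name, "0",
            "migration_threshold", str(sectors)]


# Hypothetical device name and example value; adjust both before use.
cmd = migration_threshold_cmd("pwx0-cachedlv", 4096)
print(shlex.join(cmd))
```

Building the argument list separately makes it easy to review the exact command before passing it to something like `subprocess.run(cmd, check=True)` as root.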
* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Lakshmi Narasimhan Sundararajan @ 2019-08-07 12:14 UTC
To: lvm-devel

Hi Nikhil,

So far, with migration_threshold set to 20480 instead of the original 2048, I have not seen this problem. I shall keep you posted on further internal testing.

But I would like to understand more about the tunables we have with lvmcache. Can you please help refine the definitions and my understanding of the below?

Defaults:
  migration_threshold 2048
  random_threshold 4
  sequential_threshold 512

1) migration_threshold:
This tunable controls how many sectors (512B) of data are pulled into or pushed out of the cache. So all flush/writeback operations from the cache device operate in multiples of this threshold. There is never any migration in a writethrough cache. A larger number of sectors helps move a larger context into or out of the cache immediately and improves sequential performance, but adversely affects random performance.

2) sequential_threshold:
This tunable is the count of IO requests that have to be contiguous (each starting where the last IO ended) for incoming IO to be treated as sequential. Each IO can be of any size; as long as the next IO is contiguous, it is counted. Only IOs arriving after the sequential_threshold has been hit bypass the cache. If even one IO breaks the sequential pattern, does the count get reset to zero? And are all intervening IOs cached?

3) random_threshold:
This tunable is the count of IO requests that must miss the sequential condition for the IO to be considered random.

With the defaults, the first 4 IO requests in a stream can never be cached. All IOs between request 4 and request 512 in the stream are cached. Only after 512 requests does the caching module recognize the incoming IO as sequential and stop caching further.

Outside of these I also see three other tunables:
    "read_promote_adjustment",
    "write_promote_adjustment",
    "discard_promote_adjustment"
I do not understand how these need to be configured. Are there any other tunables that I am not aware of?

Can you please help clarify the same?

Regards
LN
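To put the two migration_threshold values from this message side by side, the sketch below converts them from 512-byte sectors to MiB and to whole chunks. It assumes the 1 MiB chunk size reported earlier in the thread; the helper names are illustrative.

```python
SECTOR_BYTES = 512
MIB = 1024 ** 2


def sectors_to_mib(sectors: int) -> float:
    """Convert a count of 512-byte sectors to MiB."""
    return sectors * SECTOR_BYTES / MIB


def chunks_per_threshold(sectors: int, chunk_bytes: int = MIB) -> int:
    """Whole chunks that fit in the threshold (default chunk: 1 MiB, assumed)."""
    return (sectors * SECTOR_BYTES) // chunk_bytes


# The default value and the raised value from the message above.
for threshold in (2048, 20480):
    print(f"{threshold} sectors = {sectors_to_mib(threshold)} MiB "
          f"= {chunks_per_threshold(threshold)} chunk(s) of 1 MiB")
```

With 1 MiB chunks the default threshold corresponds to exactly one chunk, while the raised value leaves room for ten.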
* lvmcache in writeback mode gets stuckflushingdirtyblocks 2019-08-07 12:14 ` lvmcache in writeback mode gets stuck flushingdirtyblocks Lakshmi Narasimhan Sundararajan @ 2019-08-12 4:41 ` Lakshmi Narasimhan Sundararajan 0 siblings, 0 replies; 11+ messages in thread From: Lakshmi Narasimhan Sundararajan @ 2019-08-12 4:41 UTC (permalink / raw) To: lvm-devel Gentle reminder? I would sincerely appreciate clarification for the below. Regards LN Sent from Mail for Windows 10 From: Lakshmi Narasimhan Sundararajan Sent: Wednesday, August 7, 2019 5:44 PM To: LVM2 development Subject: RE: [lvm-devel] lvmcache in writeback mode gets stuckflushingdirtyblocks Hi Nikhil, So far with migration_threshold set to 20480 from original 2048 has not seen this problem. I shall keep you posted on further internal testing on this. But I would like to understand more on the tunables we have with lvmcache. Can you please help refine the definitions and my understanding of the below. Defaults: migration_threshold 2048 random_threshold 4 sequential_threshold 512 1) Migration_threshold: This tunable controls how many sectors (512B) of data are pulled in or pushed out of cache. So all flush/writeback operations from the cache device operates in multiples of this threshold. There is no migration ever in writethrough cache. ?Larger the number of sectors will help in moving larger context into/out of cache immediately and improve sequential performance, but shall adversely affect random performance. 2) Sequential_threshold: This tunable is a count of IO requests that have to be contiguous (start from last IO end) to treat incoming IO as sequential. Each IO can be of any size. As long as the next IO is contiguous it shall get counted. All IOs only after hitting the sequential_threshold shall be bypassed from cache. Even if one IO misses the sequential pattern from last IO, the threshold gets reset to zero? And all intervening IO are cached? 
3) Random_threshold: This tunable is the number of I/O requests that must miss the sequential condition before a stream is considered random I/O.

With the default settings, the first 4 I/O requests in a stream never get cached. All I/O between requests 4 and 512 in the stream gets cached. Only after 512 requests does the caching module recognize the incoming I/O as sequential and stop caching it.

Apart from these, I also see 3 other tunables:
- "read_promote_adjustment"
- "write_promote_adjustment"
- "discard_promote_adjustment"

I do not understand how these need to be configured.
Are there any other tunables that I am not aware of? Could you please help clarify?

Regards
LN

From: Nikhil Kshirsagar
Sent: Monday, August 5, 2019 2:42 PM
To: LVM2 development
Subject: Re: [lvm-devel] lvmcache in writeback mode gets stuck flushing dirty blocks

Can you try increasing the migration threshold through the device mapper commands and check whether this gets rid of the infinite flushes?

On Fri, 2 Aug, 2019, 5:14 PM Nikhil Kshirsagar, <nkshirsa@redhat.com> wrote:

Hello,

You are welcome. The migration threshold is in terms of chunks, I think. So it should be at least one chunk, so that the looping forever won't happen. The bug we found was that if the chunk size goes beyond a certain value, which is triggered by a cached LV larger than 1 TB, the migration threshold ends up hard-coded to a value lower than the increased chunk size.

Yes, the migration threshold right now needs better documentation and explanation, as well as the ability to see it from lvm commands just like we can see the chunk size. We are working on it through the bzs mentioned earlier. (See the bz about migration threshold needing better documentation in the man pages.)

I think right now you can get it only at the device mapper layer; I will check.

Regards,
Nikhil.

On Fri, 2 Aug, 2019, 5:09 PM Lakshmi Narasimhan Sundararajan, <lns@portworx.com> wrote:

Hi Nikhil,

Thank you for your email. Much appreciated.

In my environment, the chunk size is fixed at 1M irrespective of the pool size. This may take the number of entries over 1M and result in a kernel warning. But the class of systems we are using is huge, so memory and CPU bottlenecks do not seem to be a factor in our testing.

I looked up the bugs. On the first one, about chunk size > 1M, we should be safe given that our chunk size is fixed at 1MB. The other one, about the migration threshold, is interesting; I will have to validate this again.

What is the unit of the migration threshold? Is it the number of 512-byte sectors? And what exactly is its definition?

Also, curiously, this does not seem to be exported through the lvm CLI. Does it need to be fetched through dmsetup only?

Thanks
LN
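The chunk-size and migration-threshold arithmetic discussed in the message above can be sketched as follows. This is only an illustration, not lvm2 or kernel code. It assumes that the dm-cache migration_threshold is counted in 512-byte sectors with a default of 2048 (1 MiB), which is how the kernel's dm-cache documentation describes it; the thread itself leaves the unit open. All function names are hypothetical.

```python
SECTOR_BYTES = 512
MIB = 1024 * 1024

# Assumption (not stated in the thread): the dm-cache default is
# 2048 sectors, i.e. 1 MiB of in-flight migration.
DEFAULT_MIGRATION_THRESHOLD = 2048


def chunk_count(pool_bytes, chunk_bytes):
    """Number of chunks needed to cover the cache pool (rounded up)."""
    return -(-pool_bytes // chunk_bytes)


def chunk_sectors(chunk_bytes):
    """Size of one cache chunk expressed in 512-byte sectors."""
    return chunk_bytes // SECTOR_BYTES


def can_migrate_a_chunk(threshold_sectors, chunk_bytes):
    """True if at least one whole chunk fits under the threshold.

    The thread suggests that flushing can loop forever when the
    threshold is smaller than a single chunk.
    """
    return threshold_sectors >= chunk_sectors(chunk_bytes)


if __name__ == "__main__":
    two_tib = 2 * 1024 ** 4
    # A 2 TiB pool with 1 MiB chunks is well over a million chunks.
    print(chunk_count(two_tib, 1 * MIB))  # 2097152
    # A 1 MiB chunk is exactly 2048 sectors, so the assumed default fits.
    print(can_migrate_a_chunk(DEFAULT_MIGRATION_THRESHOLD, 1 * MIB))  # True
    # A 2 MiB chunk is 4096 sectors, which the assumed default cannot cover.
    print(can_migrate_a_chunk(DEFAULT_MIGRATION_THRESHOLD, 2 * MIB))  # False
```

With a chunk size fixed at 1 MiB, as described above, the assumed default would cover exactly one chunk. On a live system the actual value would have to be read at the device mapper layer, as noted in the thread.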
* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Zdenek Kabelac @ 2019-07-30 8:21 UTC
To: lvm-devel

On 30. 07. 19 at 6:58, Lakshmi Narasimhan Sundararajan wrote:
> Hi Team,
>
> A very good day to all.
>
> I am using lvmcache in writeback mode. When there are dirty blocks still in
> the lv, and if needs to be destroyed or flushed, then
>
> It seems to me that there are some conditions under which the dirty data flush
> gets stuck forever.
>
> As an example:
>
> root@pdc4-sm35:~# lvremove -f pwx0/pool
>   367 blocks must still be flushed.
>   367 blocks must still be flushed.
>   367 blocks must still be flushed.
>
> I am running these version:
>
> root@pdc4-sm35:~# lvm version
>   LVM version:     2.02.133(2) (2015-10-30)
>
> I filed one here myself, https://github.com/lvmteam/lvm2/issues/22, trying to
> understand from you experts where we are on this?
>
> I would sincerely appreciate your help in understanding the state of this
> issue in more detail.

Hi,

Yes, you are using a very old version of lvm2 (it is already 2019). The initial releases of lvm2 with writeback cache support, which you are still using, had a problem where uncaching did not switch to writethrough mode, and that was not the only problem.

Please consider using a much newer lvm2 and kernel.

Regards

Zdenek

* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Lakshmi Narasimhan Sundararajan @ 2019-07-30 9:23 UTC
To: lvm-devel

Hi Zdenek,

Thank you for acknowledging the issue.

I am not always at liberty to choose the environment, as most of the major distributions come bundled with 2.02.133.

I have three follow-up questions.

1/ Is there a way to tell that a particular version has a very critical bug (like the one I reported)? Currently, nothing short of hitting it seems to be the way to confirm.

2/ Which is the nearest stable release that addresses this particular issue?

3/ Does the latest lvm stable work well in old distributions and old Linux kernels too? What is the compatibility matrix here?

Regards
LN

* lvmcache in writeback mode gets stuck flushing dirty blocks
From: Zdenek Kabelac @ 2019-07-30 11:32 UTC
To: lvm-devel

On 30. 07. 19 at 11:23, Lakshmi Narasimhan Sundararajan wrote:
> Hi Zdenek,
>
> Thank you for acknowledging the issue.
>
> I am not always at liberty to choose the environment, as most of the
> major distributions come bundled with 2.02.133.
>
> I have three follow-up questions.
>
> 1/ Is there a way to tell that a particular version has a very critical bug
> (like the one I reported)? Currently, nothing short of hitting it seems to be
> the way to confirm.

See the WHATS_NEW file on the 'stable-2.02' branch:

https://sourceware.org/git/?p=lvm2.git;a=blob_plain;f=WHATS_NEW;hb=refs/heads/stable-2.02

It lists a huge number of bug fixes and improvements.

> 2/ Which is the nearest stable release that addresses this particular issue?

2.02.185.

> 3/ Does the latest lvm stable work well in old distributions and old Linux
> kernels too? What is the compatibility matrix here?

For cache or thin I would not use anything below a 4.20 kernel.

Also note that we provide support upstream, not for every individual distro out there in the universe. If you want or need to backport individual fixes into your version, you will likely need to ask your distro provider for this service.

lvm2 is maintained to be fully backward compatible (version 2.02), so it should work with almost any distro, to the extent that we are informed about bugs and problems :)

HEAD of git master is a bit more experimental (2.03).

Regards

Zdenek
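
The two numbers given in the reply above (2.02.185 as the nearest stable release with the fix, and 4.20 as the lowest kernel suggested for cache or thin) can be turned into a quick host check. The sketch below is only an illustration with hypothetical function names. It assumes the "LVM version:" line format shown earlier in this thread and a kernel release string that starts with "major.minor".

```python
import re

MIN_LVM = (2, 2, 185)   # "2.02.185" from the reply above
MIN_KERNEL = (4, 20)    # "anything below 4.20 kernel" from the reply above


def parse_lvm_version(text):
    """Extract (major, minor, patch) from `lvm version` output."""
    match = re.search(r"LVM version:\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no 'LVM version:' line found")
    return tuple(int(part) for part in match.groups())


def parse_kernel_release(release):
    """Extract (major, minor) from a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return (int(match.group(1)), int(match.group(2)))


def meets_recommendation(lvm_version_text, kernel_release):
    """True if both lvm2 and the kernel are at or above the suggested versions."""
    return (parse_lvm_version(lvm_version_text) >= MIN_LVM
            and parse_kernel_release(kernel_release) >= MIN_KERNEL)


if __name__ == "__main__":
    reported = "  LVM version:     2.02.133(2) (2015-10-30)"
    print(parse_lvm_version(reported))  # (2, 2, 133)
```

On a live host the first argument would be the captured output of `lvm version`, and the second could come from `platform.release()` in the Python standard library. Passing this check says nothing about distro backports, which the reply above leaves to the distro provider.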

end of thread (last message: 2019-08-12 4:41 UTC)

Thread overview: 11+ messages

2019-07-30  4:58 lvmcache in writeback mode gets stuck flushing dirty blocks Lakshmi Narasimhan Sundararajan
2019-07-30  8:02 ` Nikhil Kshirsagar
2019-07-30  8:15 ` Nikhil Kshirsagar
2019-07-31  9:53 ` Lakshmi Narasimhan Sundararajan
2019-08-02 11:44 ` Nikhil Kshirsagar
2019-08-03  7:26 ` Nikhil Kshirsagar
2019-08-07 12:14 ` Lakshmi Narasimhan Sundararajan
2019-08-12  4:41 ` Lakshmi Narasimhan Sundararajan
2019-07-30  8:21 ` Zdenek Kabelac
2019-07-30  9:23 ` Lakshmi Narasimhan Sundararajan
2019-07-30 11:32 ` Zdenek Kabelac