* [Drbd-dev] Crash in lru_cache.c
@ 2008-01-10 19:00 Graham, Simon
2008-01-10 20:19 ` Lars Ellenberg
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Graham, Simon @ 2008-01-10 19:00 UTC (permalink / raw)
To: drbd-dev
I've been seeing an occasional crash (using DRBD8.0) recently in the
lru_cache.c file that looks like this:
Dec 5 05:57:09 ------------[ cut here ]------------
Dec 5 05:57:09 kernel BUG at
/test_logs/builds/SuperNova/trunk/20071205-r21536/src/platform/drbd/src/
drbd/lru_cache.c:312!
Dec 5 05:57:09 invalid opcode: 0000 [#1]
Dec 5 05:57:09 SMP
Dec 5 05:57:09 Modules linked in: tun drbd cn ipmi_devintf ipmi_si
ipmi_msghandler bridge ipv6 binfmt_misc dm_mirror dm_multipath dm_mod
video thermal sbs processor i2c_ec i2c_core fan container button battery
asus_acpi ac parport_pc lp parport nvram ide_cd cdrom evdev intel_rng
pcspkr sg bnx2 shpchp piix zlib_inflate pci_hotplug serio_raw
serial_core rtc mptspi scsi_transport_spi ide_disk mptsas mptscsih
mptbase scsi_transport_sas sd_mod scsi_mod raid1 ehci_hcd ohci_hcd
uhci_hcd usbcore
Dec 5 05:57:09 CPU: 1
Dec 5 05:57:09 EIP: 0061:[<ee297564>] Tainted: GF VLI
Dec 5 05:57:09 EFLAGS: 00010046 (2.6.18-xen #1)
Dec 5 05:57:09 EIP is at lc_put+0x84/0xc0 [drbd]
Dec 5 05:57:10 eax: 00000000 ebx: ee24e000 ecx: ed3a4000 edx:
ee24fc50
Dec 5 05:57:10 esi: ee24fc50 edi: ed3a4000 ebp: deb55e30 esp:
deb55e28
Dec 5 05:57:10 ds: 007b es: 007b ss: 0069
Dec 5 05:57:10 Process drbd5_asender (pid: 12691, ti=deb54000
task=e6d960f0 task.ti=deb54000)
Dec 5 05:57:10 Stack: ed3a43b0 00000001 deb55e70 ee29480e 00000000
00000000 00004100 deb55e48
Dec 5 05:57:10 00000000 c031e320 00000000 00000000 deb55f38
ed3a4000 00000000 000000e6
Dec 5 05:57:10 c8d32288 ed3a4000 deb55ee8 ee29149e 00000001
ffffffff 00000000 00000000
Dec 5 05:57:10 Call Trace:
Dec 5 05:57:10 [<c0105dc1>] show_stack_log_lvl+0xb1/0xe0
Dec 5 05:57:10 [<c0105ffa>] show_registers+0x1aa/0x230
Dec 5 05:57:10 [<c01061b6>] die+0x136/0x300
Dec 5 05:57:10 [<c01063ff>] do_trap+0x7f/0xb0
Dec 5 05:57:10 [<c0106be7>] do_invalid_op+0x97/0xb0
Dec 5 05:57:10 [<c01058f3>] error_code+0x2b/0x30
Dec 5 05:57:10 [<ee29480e>] drbd_al_complete_io+0x6e/0x130 [drbd]
Dec 5 05:57:10 [<ee29149e>] _req_may_be_done+0x5ee/0x780 [drbd]
Dec 5 05:57:10 [<ee291993>] _req_mod+0x363/0xab0 [drbd]
Dec 5 05:57:10 [<ee29e7c1>] tl_release+0x51/0x1f0 [drbd]
Dec 5 05:57:10 [<ee28c576>] got_BarrierAck+0x16/0xb0 [drbd]
Dec 5 05:57:10 [<ee28d7b9>] drbd_asender+0x2e9/0x5a0 [drbd]
Dec 5 05:57:10 [<ee29ea0f>] drbd_thread_setup+0xaf/0xf0 [drbd]
Dec 5 05:57:10 [<c0103005>] kernel_thread_helper+0x5/0x10
The failing line of code is asserting in lc_put that the current ref
count is greater than zero. Now, I think this bug has been there for a
while if you are using protocol A or B and has now been exposed when
using protocol C because of the recent change to maintain the transfer
log for all protocols (i.e. it's my fault it got exposed!)
My theory is that the following occurred:
1. We were running normally; this means that the TL has at least one
   entry most of the time - this entry is a request that includes a
   reference to the AL cache for that write operation.
2. The local disk is detached for some reason (failure, or 'drbdsetup
   detach') - this causes the AL cache to be discarded.
3. The local disk is reattached - this creates a brand spanking new AL
   with no hot entries.
4. We process a second write for the same AL area as the one above -
   this will create a new hot entry in the cache, but the refcount will
   only be one even though there are two I/Os outstanding for the AL
   area covered by the entry.
5. Now we get a barrier ack that allows us to clear both entries from
   the TL; when we attempt the lc_put for the second one, we crash
   because the ref count is already zero.
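The five steps above can be replayed against a toy model. This is only a sketch: every `toy_` name is hypothetical, one plain counter per extent stands in for the real lru_cache, and the function returns `false` at the point where the real `lc_put` assertion would fire.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_AL_EXTENTS 4   /* hypothetical: extents tracked by the toy cache */

/* Toy stand-in for the activity log: one reference count per extent. */
struct toy_al {
    unsigned refcnt[TOY_AL_EXTENTS];
};

/* Toy stand-in for a transfer-log request that touched one extent. */
struct toy_req {
    unsigned extent;
};

/* A write begins: take a reference on its extent (the "get" side). */
static void toy_al_begin_io(struct toy_al *al, struct toy_req *req, unsigned extent)
{
    assert(extent < TOY_AL_EXTENTS);
    req->extent = extent;
    al->refcnt[extent]++;
}

/* Detach followed by attach: the cache comes back with no hot entries. */
static void toy_al_reset(struct toy_al *al)
{
    memset(al, 0, sizeof(*al));
}

/*
 * A request leaves the transfer log: drop its reference (the "put" side).
 * Returns false when the count is already zero, which is where the
 * assertion in the real code would trigger.
 */
static bool toy_al_complete_io(struct toy_al *al, const struct toy_req *req)
{
    if (al->refcnt[req->extent] == 0)
        return false;
    al->refcnt[req->extent]--;
    return true;
}

/* Replays steps 1-5; returns true if only the second put underflows. */
static bool toy_replay_scenario(void)
{
    struct toy_al al;
    struct toy_req first, second;

    toy_al_reset(&al);
    toy_al_begin_io(&al, &first, 2);    /* step 1: request sits in the TL */
    toy_al_reset(&al);                  /* steps 2-3: detach, then reattach */
    toy_al_begin_io(&al, &second, 2);   /* step 4: count is 1, two requests pending */

    bool first_ok = toy_al_complete_io(&al, &first);    /* step 5: 1 -> 0 */
    bool second_ok = toy_al_complete_io(&al, &second);  /* count already 0 */
    return first_ok && !second_ok;
}
```

Calling `toy_replay_scenario()` from a small `main()` is enough to see the underflow; the model deliberately ignores locking and cache eviction.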
So, first of all, does this seem a reasonable explanation, or did I miss
something?
Secondly, assuming I'm right, I see a couple of possible solutions:
1. Remember in the req structure whether this request holds a reference
   to the AL cache entry. When clearing the AL because of a detach, go
   through the TL list at that time and clear the flag - thus when we
   eventually remove the entry, we won't even try the lc_put.
2. When attaching a disk, run through the current TL and allocate AL
   entries for each request currently in the list. The problem with this
   is that the AL cache size might have changed in a way that doesn't
   allow sufficient hot entries (i.e. the cache size is less than the
   number of unique entries required by the current TL list).
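Option 1 could look roughly like the sketch below. It is a toy model, not DRBD code: the `holds_al_ref` flag, the singly linked transfer log and all `toy_` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_AL_EXTENTS 4   /* hypothetical: extents tracked by the toy cache */

struct toy_al {
    unsigned refcnt[TOY_AL_EXTENTS];
};

struct toy_req {
    unsigned extent;
    bool holds_al_ref;      /* hypothetical flag: request owns one AL reference */
    struct toy_req *next;   /* next request in the toy transfer log */
};

/* A write begins: take the reference and remember that we hold it. */
static void toy_begin_io(struct toy_al *al, struct toy_req *req, unsigned extent)
{
    assert(extent < TOY_AL_EXTENTS);
    req->extent = extent;
    req->holds_al_ref = true;
    al->refcnt[extent]++;
}

/* Detach: discard all counts and clear the flag on every request in the TL. */
static void toy_detach(struct toy_al *al, struct toy_req *tl_head)
{
    for (unsigned i = 0; i < TOY_AL_EXTENTS; i++)
        al->refcnt[i] = 0;
    for (struct toy_req *r = tl_head; r != NULL; r = r->next)
        r->holds_al_ref = false;
}

/* Request leaves the TL: only put if the flag survived. Returns true on put. */
static bool toy_complete_io(struct toy_al *al, struct toy_req *req)
{
    if (!req->holds_al_ref)
        return false;
    req->holds_al_ref = false;
    al->refcnt[req->extent]--;
    return true;
}

/* Same sequence as the crash scenario; returns true if nothing underflows. */
static bool toy_flag_demo(void)
{
    struct toy_al al = { {0} };
    struct toy_req first = { 0, false, NULL };
    struct toy_req second = { 0, false, NULL };

    toy_begin_io(&al, &first, 2);    /* request enters the TL */
    toy_detach(&al, &first);         /* detach: counts dropped, flag cleared */
    toy_begin_io(&al, &second, 2);   /* after reattach: count is 1 */

    bool skipped = !toy_complete_io(&al, &first);    /* stale request: no put */
    bool put_done = toy_complete_io(&al, &second);   /* 1 -> 0, no underflow */
    return skipped && put_done && al.refcnt[2] == 0;
}
```

The cost of this approach is the walk over the transfer log at detach time, which in real code would presumably have to happen under whatever lock protects the TL.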
Thoughts? I'm about to start on fixing this, so would welcome ideas...
Thanks,
Simon
* Re: [Drbd-dev] Crash in lru_cache.c
2008-01-10 19:00 [Drbd-dev] Crash in lru_cache.c Graham, Simon
@ 2008-01-10 20:19 ` Lars Ellenberg
2008-01-10 20:31 ` Graham, Simon
[not found] ` <342BAC0A5467384983B586A6B0B3767107C5AE95@EXNA.corp.s tratus.com>
2 siblings, 0 replies; 8+ messages in thread
From: Lars Ellenberg @ 2008-01-10 20:19 UTC (permalink / raw)
To: drbd-dev
On Thu, Jan 10, 2008 at 02:00:06PM -0500, Graham, Simon wrote:
> I've been seeing an occasional crash (using DRBD8.0) recently in the
> lru_cache.c file that looks like this:
>
> Dec 5 05:57:09 ------------[ cut here ]------------
> Dec 5 05:57:09 kernel BUG at
> /test_logs/builds/SuperNova/trunk/20071205-r21536/src/platform/drbd/src/
> drbd/lru_cache.c:312!

in what exact codebase do you see this?
up to which point have you merged upstream drbd-8.0.git?
what local patches are applied?

> Dec 5 05:57:09 invalid opcode: 0000 [#1]
> Dec 5 05:57:09 SMP
> Dec 5 05:57:09 Modules linked in: tun drbd cn ipmi_devintf ipmi_si
> ipmi_msghandler bridge ipv6 binfmt_misc dm_mirror dm_multipath dm_mod
> video thermal sbs processor i2c_ec i2c_core fan container button battery
> asus_acpi ac parport_pc lp parport nvram ide_cd cdrom evdev intel_rng
> pcspkr sg bnx2 shpchp piix zlib_inflate pci_hotplug serio_raw
> serial_core rtc mptspi scsi_transport_spi ide_disk mptsas mptscsih
> mptbase scsi_transport_sas sd_mod scsi_mod raid1 ehci_hcd ohci_hcd
> uhci_hcd usbcore
> Dec 5 05:57:09 CPU: 1
> Dec 5 05:57:09 EIP: 0061:[<ee297564>] Tainted: GF VLI
> Dec 5 05:57:09 EFLAGS: 00010046 (2.6.18-xen #1)
> Dec 5 05:57:09 EIP is at lc_put+0x84/0xc0 [drbd]
> Dec 5 05:57:10 eax: 00000000 ebx: ee24e000 ecx: ed3a4000 edx: ee24fc50
> Dec 5 05:57:10 esi: ee24fc50 edi: ed3a4000 ebp: deb55e30 esp: deb55e28
> Dec 5 05:57:10 ds: 007b es: 007b ss: 0069
> Dec 5 05:57:10 Process drbd5_asender (pid: 12691, ti=deb54000 task=e6d960f0 task.ti=deb54000)
> Dec 5 05:57:10 Stack: ed3a43b0 00000001 deb55e70 ee29480e 00000000 00000000 00004100 deb55e48
> Dec 5 05:57:10 00000000 c031e320 00000000 00000000 deb55f38 ed3a4000 00000000 000000e6
> Dec 5 05:57:10 c8d32288 ed3a4000 deb55ee8 ee29149e 00000001 ffffffff 00000000 00000000
> Dec 5 05:57:10 Call Trace:
> Dec 5 05:57:10 [<c0105dc1>] show_stack_log_lvl+0xb1/0xe0
> Dec 5 05:57:10 [<c0105ffa>] show_registers+0x1aa/0x230
> Dec 5 05:57:10 [<c01061b6>] die+0x136/0x300
> Dec 5 05:57:10 [<c01063ff>] do_trap+0x7f/0xb0
> Dec 5 05:57:10 [<c0106be7>] do_invalid_op+0x97/0xb0
> Dec 5 05:57:10 [<c01058f3>] error_code+0x2b/0x30
> Dec 5 05:57:10 [<ee29480e>] drbd_al_complete_io+0x6e/0x130 [drbd]
> Dec 5 05:57:10 [<ee29149e>] _req_may_be_done+0x5ee/0x780 [drbd]
> Dec 5 05:57:10 [<ee291993>] _req_mod+0x363/0xab0 [drbd]
> Dec 5 05:57:10 [<ee29e7c1>] tl_release+0x51/0x1f0 [drbd]
> Dec 5 05:57:10 [<ee28c576>] got_BarrierAck+0x16/0xb0 [drbd]
> Dec 5 05:57:10 [<ee28d7b9>] drbd_asender+0x2e9/0x5a0 [drbd]
> Dec 5 05:57:10 [<ee29ea0f>] drbd_thread_setup+0xaf/0xf0 [drbd]
> Dec 5 05:57:10 [<c0103005>] kernel_thread_helper+0x5/0x10
>
> The failing line of code is asserting in lc_put that the current ref
> count is greater than zero. Now, I think this bug has been there for a
> while if you are using protocol A or B and has now been exposed when
> using protocol C because of the recent change to maintain the transfer
> log for all protocols (i.e. it's my fault it got exposed!)
>
> My theory is that the following occurred:
>
> 1. We were running normally; this means that the TL has at least one
> entry most of the time - this entry is a request that includes a
> reference to the AL cache for that write operation.
>
> 2. The local disk is detached for some reason (failure, or 'drbdsetup
> detach') - this causes the AL cache to be discarded
>
> 3. The local disk is reattached - the creates a brand spanking new AL
> with no hot entries
>
> 4. We process a second write for the same AL area as the one above -
> this will create a new hot entry in the cache, but the refcount will
> only be one even though there are two I/O's outstanding for the AL
> area covered by the entry
>
> 5. Now we get a barrier ack that allows us to clear both entries from
> the TL when we attempt to lc_put for the second one we crash because
> the ref count is already zero.
>
> So, first of all, does this seem a reaonsable explanation, or did I miss
> something?
>
> Secondly, assuming I'm right, I see a couple of possible solutions:
>
> 1. Remember in the req structure if this request has a reference to the
> AL cache entry. When clearing the AL because of a detach, go through
> the TL list at that time and clear the flag - thus when we eventually
> remove the entry, we wont even try the lc_put.
>
> 2. When attached a disk, run through the current TL and allocate AL
> entries for each request currently in the list. The problem with this
> is that the AL cache size might have changed in a way that doesn't
> allow sufficient hot entries (i.e. the cache size is less than the
> number of unique entries required by the current TL list.
>
> Thoughts? I'm about to start on fixing this, so would welcome ideas...

in my version of drbd-8.0,
that would be in this code path:
    if (s & RQ_LOCAL_MASK) {
        if (inc_local_if_state(mdev,Failed)) {
            drbd_al_complete_io(mdev, req->sector);
            dec_local(mdev);
        } else {
            WARN("Should have called drbd_al_complete_io(, %llu), "
                 "but my Disk seems to have failed:(\n",
                 (unsigned long long) req->sector);
        }
    }

I don't see why there could possibly be requests in the tl
that have (s & RQ_LOCAL_MASK) when there is no disk.
if there are, that is the real bug, I think.

other than that, what about

 3. when attaching a disk,
    suspend incoming requests and wait for the tl to become empty.
    then attach, and resume.

--
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
* RE: [Drbd-dev] Crash in lru_cache.c
2008-01-10 19:00 [Drbd-dev] Crash in lru_cache.c Graham, Simon
2008-01-10 20:19 ` Lars Ellenberg
@ 2008-01-10 20:31 ` Graham, Simon
2008-01-12 13:51 ` Lars Ellenberg
[not found] ` <342BAC0A5467384983B586A6B0B3767107C5AE95@EXNA.corp.s tratus.com>
2 siblings, 1 reply; 8+ messages in thread
From: Graham, Simon @ 2008-01-10 20:31 UTC (permalink / raw)
To: Lars Ellenberg, drbd-dev
> > Dec 5 05:57:09 ------------[ cut here ]------------
> > Dec 5 05:57:09 kernel BUG at
> > /test_logs/builds/SuperNova/trunk/20071205-r21536/src/platform/drbd/src/
> > drbd/lru_cache.c:312!
>
> in what exact codebase do you see this?
> up to which point have you merged upstream drbd-8.0.git?
> what local patches are applied?
>

Yes - sorry... this is 8.0.4 plus a bunch of the fixes that are in 8.0.8
(but not all) plus a few more that I haven't submitted yet (but I will
once I wrestle git into submission); the specific change that exposes
this that I have pulled is the one to use the TL for Protocol C as well
as A and B -- however, I think this bug exists IF you are using A or B
without this fix.

> that would be in this code path:
>     if (s & RQ_LOCAL_MASK) {
>         if (inc_local_if_state(mdev,Failed)) {
>             drbd_al_complete_io(mdev, req->sector);
>             dec_local(mdev);
>         } else {
>             WARN("Should have called drbd_al_complete_io(, %llu), "
>                  "but my Disk seems to have failed:(\n",
>                  (unsigned long long) req->sector);
>         }
>     }
>

Exactly.

> I don't see why there could possibly be requests in the tl
> that have (s & RQ_LOCAL_MASK) when there is no disk.

Because there WAS a disk when the request was issued - in fact, the
local write to disk completed successfully, but the request is still
sitting in the TL waiting for the next barrier to complete. Subsequent
to that but while the request is still in the TL, the local disk is
detached.

> other than that, what about
>
> 3. when attaching a disk,
> suspend incoming requests and wait for the tl to become empty.
> then attach, and resume.
>

I think this might work but only as a side effect -- if you look back to
the sequence I documented, you will see that there has to be a write
request to the same AL area after the disk is reattached - this is
because drbd_al_complete_io quietly ignores the case where no active AL
extent is found for the request being completed.

You would also need to trigger a barrier op in this case to force the TL
to be flushed.

Simon
* Re: [Drbd-dev] Crash in lru_cache.c
2008-01-10 20:31 ` Graham, Simon
@ 2008-01-12 13:51 ` Lars Ellenberg
0 siblings, 0 replies; 8+ messages in thread
From: Lars Ellenberg @ 2008-01-12 13:51 UTC (permalink / raw)
To: drbd-dev
On Thu, Jan 10, 2008 at 03:31:02PM -0500, Graham, Simon wrote:
> > > Dec 5 05:57:09 ------------[ cut here ]------------
> > > Dec 5 05:57:09 kernel BUG at
> > > /test_logs/builds/SuperNova/trunk/20071205-r21536/src/platform/drbd/src/
> > > drbd/lru_cache.c:312!
> >
> > in what exact codebase do you see this?
> > up to which point have you merged upstream drbd-8.0.git?
> > what local patches are applied?
> >
>
> Yes - sorry... this is 8.0.4 plus a bunch of the fixes that are in 8.0.8
> (but not all) plus a few more than T haven't submitted yet (but I will
> once I wrestle git into submission); the specific change that exposes
> this that I have pulled is the one to use the TL for Protocol C as well
> as A and B -- however, I think this bug exists IF you are using A or B
> without this fix.
>
> > that would be in this code path:
> >     if (s & RQ_LOCAL_MASK) {
> >         if (inc_local_if_state(mdev,Failed)) {
> >             drbd_al_complete_io(mdev, req->sector);
> >             dec_local(mdev);
> >         } else {
> >             WARN("Should have called drbd_al_complete_io(, %llu), "
> >                  "but my Disk seems to have failed:(\n",
> >                  (unsigned long long) req->sector);
> >         }
> >     }
> >
>
> Exactly.
>
> > I don't see why there could possibly be requests in the tl
> > that have (s & RQ_LOCAL_MASK) when there is no disk.
>
> Because there WAS a disk when the request was issued - in fact, the
> local write to disk completed successfully, but the request is still
> sitting in the TL waiting for the next barrier to complete. Subsequent
> to that but while the request is still in the TL, the local disk is
> detached.

AND it is re-attached so fast,
that we have a new (uhm; well, probably the same?) disk again,
while still the very same request is sitting there
waiting for that very barrier ack?

now, how unlikely is THAT to happen in real life.

but I think I understand your scenario.

but how do you test this, actually?
inject io failures, and trigger a re-attach
as soon as you see the detach event?
is that to implement a "hot-spare" feature?

> > other than that, what about
> >
> > 3. when attaching a disk,
> > suspend incoming requests and wait for the tl to become empty.
> > then attach, and resume.
> >
>
> I think this might work but only as a side effect -- if you look back to
> the sequence I documented, you will see that there has to be a write
> request to the same AL area after the disk is reattached - this is
> because drbd_al_complete_io quietly ignores the case where no active AL
> extent is found for the request being completed.

huh?
I simply disallow re-attaching while there are still requests pending
from before the detach.
no more (s & RQ_LOCAL_MASK), no more un-accounted for references.

if I understand correctly,
you can reproduce this easily.
to underline my point,
does it still trigger when you do
"dd if=/dev/drbdX of=/dev/null bs=1b count=1 iflag=direct ; sleep 5"
before the re-attach?

(the dd, even if it only reads, due to using directio, and drbd being
diskless, will trigger any pending barrier to be sent)

> You would also need to trigger a barrier op in this case to force the
> TL to be flushed.

for other reasons, I think we need to rewrite the barrier code anyways
to send out the barrier as soon as possible, and not wait until the next
io request comes in.

--
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
[parent not found: <342BAC0A5467384983B586A6B0B3767107C5AE95@EXNA.corp.s tratus.com>]
* RE: [Drbd-dev] Crash in lru_cache.c
[not found] ` <342BAC0A5467384983B586A6B0B3767107C5AE95@EXNA.corp.s tratus.com>
@ 2008-01-12 15:23 ` Graham, Simon
2008-01-12 17:04 ` Lars Ellenberg
2008-01-12 23:37 ` Graham, Simon
0 siblings, 2 replies; 8+ messages in thread
From: Graham, Simon @ 2008-01-12 15:23 UTC (permalink / raw)
To: Lars Ellenberg, drbd-dev
> > Because there WAS a disk when the request was issued - in fact, the
> > local write to disk completed successfully, but the request is still
> > sitting in the TL waiting for the next barrier to complete. Subsequent
> > to that but while the request is still in the TL, the local disk is
> > detached.
>
> AND it is re-attached so fast,
> that we have a new (uhm; well, probably the same?) disk again,
> while still the very same request is sitting there
> waiting for that very barrier ack?
>

You got it!

> now, how unlikely is THAT to happen in real life.
>

Fairly rare I agree although someone could do a 'drbdadm detach' and
then 'drbdadm attach' -- that's how we hit this situation (and the
reason for THAT is as a way to test errors on meta-data reads)

Given that there is no real boundary on the lifetime of a request in the
TL, it's also feasible (although unlikely I agree) that a disk could
fail and be replaced and reattached whilst an old request is still in
the TL...

> > I think this might work but only as a side effect -- if you look back to
> > the sequence I documented, you will see that there has to be a write
> > request to the same AL area after the disk is reattached - this is
> > because drbd_al_complete_io quietly ignores the case where no active AL
> > extent is found for the request being completed.
>
> huh?
> I simply disallow re-attaching while there are still requests pending
> from before the detach.
> no more (s & RQ_LOCAL_MASK), no more un-accounted for references.
>

Yes but those requests that have unaccounted references from before the
detach are still in the TL -- it so happens that the code does not crash
in this case (completing a request in the TL when there is no matching
AL cache entry) but that's not very safe I think.

You also have to trigger a barrier as part of this -- not only block new
requests during attach until the TL is empty but also trigger a barrier
so that the TL will be emptied...

Both of these are why I like the idea of "reconnecting" the requests in
the TL to the AL cache when doing an attach...

> if I understand correctly,
> you can reproduce this easily.
> to underline my point,
> does it still trigger when you do
> "dd if=/dev/drbdX of=/dev/null bs=1b count=1 iflag=direct ; sleep 5"
> before the re-attach?

So, the real test is to do this _before_ the DETACH, then see what
happens when the requests are removed from the TL.

> for other reasons, I think we need to rewrite the barrier code anyways
> to send out the barrier as soon as possible, and not wait until the
> next io request comes in.

That's an interesting idea -- it would also allow you to use the Linux
barrier mechanism to implement. Still wouldn't handle this case I think
though -- you can have requests in the TL that do not yet require a
barrier when you lose the local disk...

Simon
* Re: [Drbd-dev] Crash in lru_cache.c
2008-01-12 15:23 ` Graham, Simon
@ 2008-01-12 17:04 ` Lars Ellenberg
2008-01-12 23:37 ` Graham, Simon
1 sibling, 0 replies; 8+ messages in thread
From: Lars Ellenberg @ 2008-01-12 17:04 UTC (permalink / raw)
To: Graham, Simon; +Cc: drbd-dev
On Sat, Jan 12, 2008 at 10:23:58AM -0500, Graham, Simon wrote:
> > > Because there WAS a disk when the request was issued - in fact, the
> > > local write to disk completed successfully, but the request is still
> > > sitting in the TL waiting for the next barrier to complete. Subsequent
> > > to that but while the request is still in the TL, the local disk is
> > > detached.
> >
> > AND it is re-attached so fast,
> > that we have a new (uhm; well, probably the same?) disk again,
> > while still the very same request is sitting there
> > waiting for that very barrier ack?
> >
>
> You got it!
>
> > now, how unlikely is THAT to happen in real life.
> >
>
> Fairly rare I agree although someone could do a 'drbdadm detach' and
> then 'drbdadm attach' -- that's how we hit this situation (and the
> reason for THAT is as a way to test errors on meta-data reads)
>
> Given that there is no real boundary on the lifetime of a request in the
> TL, it's also feasible (although unlikely I agree) that a disk could
> fail and be replaced and reattached whilst an old request is still in
> the TL...

well, there is.
the request will only live in the tl until either
 - connection is lost, and we call tl_clear
 - the corresponding barrier ack comes in

right, currently, a barrier is not sent when the epoch closes,
but before the next epoch starts, which may be a very long time.

but, we are changing this anyways, and will now send the barrier
as soon as we close the current epoch.
once that is done, soon (milliseconds) after any request is
reported as completed to upper layers (which is the event that is
causing the current epoch to close, the barrier to be sent),
it will also be cleared from the tl.

> > > I think this might work but only as a side effect -- if you look back to
> > > the sequence I documented, you will see that there has to be a write
> > > request to the same AL area after the disk is reattached - this is
> > > because drbd_al_complete_io quietly ignores the case where no active AL
> > > extent is found for the request being completed.
> >
> > huh?
> > I simply disallow re-attaching while there are still requests pending
> > from before the detach.
> > no more (s & RQ_LOCAL_MASK), no more un-accounted for references.
> >
>
> Yes but those requests that have unaccounted references from before the
> detach are still in the TL

no they are not, I just said I would not allow an attach
while they are still in there.

> -- it so happens that the code does not crash
> in this case (completing a request in the TL when there is no matching
> AL cache entry) but that's not very safe I think.
>
> You also have to trigger a barrier as part of this -- not only block new
> requests during attach until the TL is empty but also trigger a barrier
> so that the TL will be emptied...

as outlined earlier, and implemented next week hopefully,
barriers will be sent as soon as the old epoch is closed,
not only when the first new request for the new epoch comes in.

> Both of these are why I like the idea of "reconnecting" the requests in
> the TL to the AL cache when doing an attach...
>
> > if I understand correctly,
> > you can reproduce this easily.
> > to underline my point,
> > does it still trigger when you do
> > "dd if=/dev/drbdX of=/dev/null bs=1b count=1 iflag=direct ; sleep 5"
> > before the re-attach?
>
> So, the real test is to do this _before_ the DETACH, then see what
> happens when the requests are removed from the TL.

no. only a remote read can trigger a barrier.
as long as i have valid local data, all reads are local.

> > for other reasons, I think we need to rewrite the barrier code anyways
> > to send out the barrier as soon as possible, and not wait until the
> > next io request comes in.
>
> That's an interesting idea -- it would also allow you to use the Linux
> barrier mechanism to implement. Still wouldn't handle this case I think
> though -- you can have requests in the TL that do not yet require a
> barrier when you lose the local disk...

sure I can have requests there, but they are not yet completed to upper
layers. if they are, their corresponding barrier will have been sent out
already.

for attach, we would then do
  block new incoming requests
  wait for ap count to reach zero
  [in current code, send out a barrier now;
   with the idea outlined above, there is no need for that]
  wait for the latest barrier ack
  (tl now empty)
  attach
  unblock

am I still missing something?

--
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
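The attach sequence proposed in this message (block, drain, wait for the barrier ack, attach, unblock) can be sketched as a small polling state check. Everything here is a toy model with hypothetical `toy_` names; real code would presumably sleep on a wait queue rather than poll, and the counters only loosely mirror the ap count and the transfer log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified device state; not DRBD's real structures. */
struct toy_dev {
    bool blocked;     /* new application requests are refused while set */
    bool attached;    /* local disk present */
    unsigned ap_cnt;  /* application requests still in flight */
    unsigned tl_len;  /* requests still held in the transfer log */
};

enum toy_attach_step { TOY_WAIT_AP, TOY_WAIT_BARRIER_ACK, TOY_ATTACHED };

/* Accept a new request unless an attach has blocked the device. */
static bool toy_submit(struct toy_dev *d)
{
    if (d->blocked)
        return false;
    d->ap_cnt++;
    d->tl_len++;
    return true;
}

/* The request completed towards the upper layers; it stays in the TL. */
static void toy_complete(struct toy_dev *d)
{
    assert(d->ap_cnt > 0);
    d->ap_cnt--;
}

/* A barrier ack arrived: everything in the toy TL is released. */
static void toy_barrier_ack(struct toy_dev *d)
{
    d->tl_len = 0;
}

/* One non-blocking pass over the sequence; call again until TOY_ATTACHED. */
static enum toy_attach_step toy_try_attach(struct toy_dev *d)
{
    d->blocked = true;                 /* block new incoming requests */
    if (d->ap_cnt > 0)
        return TOY_WAIT_AP;            /* wait for ap count to reach zero */
    if (d->tl_len > 0)
        return TOY_WAIT_BARRIER_ACK;   /* wait for the latest barrier ack */
    d->attached = true;                /* TL is empty: attach */
    d->blocked = false;                /* unblock */
    return TOY_ATTACHED;
}

/* Walks the sequence once; returns true if every step behaved as described. */
static bool toy_attach_demo(void)
{
    struct toy_dev d = { false, false, 0, 0 };
    bool ok = toy_submit(&d);                               /* one write before attach */
    ok = ok && toy_try_attach(&d) == TOY_WAIT_AP;
    ok = ok && !toy_submit(&d);                             /* refused while blocked */
    toy_complete(&d);
    ok = ok && toy_try_attach(&d) == TOY_WAIT_BARRIER_ACK;
    toy_barrier_ack(&d);
    ok = ok && toy_try_attach(&d) == TOY_ATTACHED;
    ok = ok && d.attached && toy_submit(&d);                /* accepted again */
    return ok;
}
```

Because no request from before the attach can survive to see the new AL, the stale-reference case from the original report cannot arise in this model.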
* RE: [Drbd-dev] Crash in lru_cache.c
2008-01-12 15:23 ` Graham, Simon
2008-01-12 17:04 ` Lars Ellenberg
@ 2008-01-12 23:37 ` Graham, Simon
2008-01-13 3:14 ` Lars Ellenberg
1 sibling, 1 reply; 8+ messages in thread
From: Graham, Simon @ 2008-01-12 23:37 UTC (permalink / raw)
To: Lars Ellenberg; +Cc: drbd-dev
> sure I can have requests there, but they are not yet completed to upper
> layers. if they are, their correponding barrier will have been send out
> already.
>
> for attach, we would then do
> block new incomming request
> wait for ap count to reach zero
> [in current code, send out a barrier now;
> with the idea outline above, there is no need for that]
> wait for the lates barrier ack
> (tl now empty)
> attach
> unblock
>
> am I still missing something?

I think that works.

What's the appropriate mechanism for blocking new requests? There are
existing mechanisms based on locking the AL cache entry, but since there
is no AL at this time, we can't use that...

Thanks for the ideas!
Simon
* Re: [Drbd-dev] Crash in lru_cache.c
2008-01-12 23:37 ` Graham, Simon
@ 2008-01-13 3:14 ` Lars Ellenberg
0 siblings, 0 replies; 8+ messages in thread
From: Lars Ellenberg @ 2008-01-13 3:14 UTC (permalink / raw)
To: drbd-dev
On Sat, Jan 12, 2008 at 06:37:47PM -0500, Graham, Simon wrote:
> > sure I can have requests there, but they are not yet completed to upper
> > layers. if they are, their correponding barrier will have been send out
> > already.
> >
> > for attach, we would then do
> > block new incomming request
> > wait for ap count to reach zero
> > [in current code, send out a barrier now;
> > with the idea outline above, there is no need for that]
> > wait for the lates barrier ack
> > (tl now empty)
> > attach
> > unblock
> >
> > am I still missing something?
>
> I think that works.
>
> What's the appropriate mechanism for blocking new requests? There are
> existing mechanisms based on locking the AL cache entry, but since there
> is no AL at this time, we cant use that...

there is another one, examining the drbd state,
right as the first statement in
drbd_make_request_common
  inc_ap_bio
    __inc_ap_bio_cond
:-)

I have to think about whether we need yet another intermediate state.
probably blocking anything for Diskless < state < Inconsistent
would be enough.

--
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
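The state-based gate suggested here ("blocking anything for Diskless < state < Inconsistent") amounts to a range check on an ordered enum. The sketch below uses a made-up `toy_disk_state` enum; its members and their ordering are assumptions for illustration and are not copied from the DRBD headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ordered disk states; the real enum has more members. */
enum toy_disk_state {
    TOY_DISKLESS,
    TOY_ATTACHING,      /* assumed intermediate state during attach */
    TOY_NEGOTIATING,    /* assumed intermediate state during attach */
    TOY_INCONSISTENT,
    TOY_UP_TO_DATE
};

/*
 * New application requests may start unless the disk is in one of the
 * intermediate states strictly between "diskless" and "inconsistent".
 */
static bool toy_may_start_io(enum toy_disk_state s)
{
    assert(s <= TOY_UP_TO_DATE);
    return !(s > TOY_DISKLESS && s < TOY_INCONSISTENT);
}
```

With this check placed where requests are first admitted, I/O keeps flowing while diskless or fully attached and is held back only during the attach window.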