* [Drbd-dev] DRBD Bug Report
@ 2016-10-25 1:23 Guo, Lei
2016-10-31 10:58 ` Lars Ellenberg
0 siblings, 1 reply; 4+ messages in thread
From: Guo, Lei @ 2016-10-25 1:23 UTC (permalink / raw)
To: drbd-dev@lists.linbit.com
Version: 9.0.5/9.0.4
File: drbd_state.c
Two nodes are set up: node 1 is primary, node 2 is secondary.
On node 2, the command "drbdadm down r0" returns an error.
[error] Failed to disconnected or detach the r0.
cmd_result=[11], cmd_output=[r0: State change failed: (-10) State change was refused by peer node
additional info from kernel:failed to disconnect
The possible bug is as follows.
static enum outdate_what outdate_on_disconnect(struct drbd_connection *connection)
{
    struct drbd_resource *resource = connection->resource;

    if (connection->fencing_policy >= FP_RESOURCE &&
        resource->role[NOW] != connection->peer_role[NOW]) {
        if (resource->role[NOW] == R_PRIMARY)
            return OUTDATE_PEER_DISKS;
        if (connection->peer_role[NOW] != R_PRIMARY) <--------- should be "if (connection->peer_role[NOW] == R_PRIMARY)"
            return OUTDATE_DISKS;
    }
    return OUTDATE_NOTHING;
}
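The proposed one-line change can be tried outside the kernel with a minimal sketch like the one below. The enum values, the trimmed-down structs, and the outdate_for() helper are stand-ins assumed purely for illustration; they are not the DRBD definitions. Only the body of outdate_on_disconnect() follows the report, with the suggested "==" applied.

```c
/* Stand-in types: names follow the report, values and layout are assumed. */
enum drbd_role { R_UNKNOWN, R_PRIMARY, R_SECONDARY };
enum fencing_policy { FP_DONT_CARE, FP_RESOURCE, FP_STONITH };
enum outdate_what { OUTDATE_NOTHING, OUTDATE_DISKS, OUTDATE_PEER_DISKS };
enum which_state { NOW, NEW };

struct drbd_resource {
    enum drbd_role role[2];
};

struct drbd_connection {
    struct drbd_resource *resource;
    enum fencing_policy fencing_policy;
    enum drbd_role peer_role[2];
};

/* Function body as quoted in the report, with the suggested fix applied. */
static enum outdate_what outdate_on_disconnect(struct drbd_connection *connection)
{
    struct drbd_resource *resource = connection->resource;

    if (connection->fencing_policy >= FP_RESOURCE &&
        resource->role[NOW] != connection->peer_role[NOW]) {
        if (resource->role[NOW] == R_PRIMARY)
            return OUTDATE_PEER_DISKS;
        if (connection->peer_role[NOW] == R_PRIMARY) /* was "!=" */
            return OUTDATE_DISKS;
    }
    return OUTDATE_NOTHING;
}

/* Hypothetical helper: build a connection for the given roles and evaluate it. */
static enum outdate_what outdate_for(enum fencing_policy policy,
                                     enum drbd_role local,
                                     enum drbd_role peer)
{
    struct drbd_resource resource = { .role = { local, local } };
    struct drbd_connection connection = {
        .resource = &resource,
        .fencing_policy = policy,
        .peer_role = { peer, peer },
    };

    return outdate_on_disconnect(&connection);
}
```

In this sketch, outdate_for(FP_RESOURCE, R_SECONDARY, R_PRIMARY) yields OUTDATE_DISKS, which matches the reported scenario of a secondary disconnecting from a primary; with the original "!=" the same call would fall through to OUTDATE_NOTHING.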
That is all. Thank you in advance.
-------
郭磊 Guo Lei
Development Dept.III (3-2-3)
Nanjing Fujitsu Nanda Software Tech. Co., Ltd.(FNST)
TEL: +86+25-86630566-9437
E-mail: guol-fnst@cn.fujistu.com
* Re: [Drbd-dev] DRBD Bug Report
2016-10-25 1:23 [Drbd-dev] DRBD Bug Report Guo, Lei
@ 2016-10-31 10:58 ` Lars Ellenberg
2018-04-04 7:04 ` [Drbd-dev] DRBD Bug Report : DRBD 9.0 after-resync-target not being called after resync / reconnect Farhan Khan
0 siblings, 1 reply; 4+ messages in thread
From: Lars Ellenberg @ 2016-10-31 10:58 UTC (permalink / raw)
To: drbd-dev
On Tue, Oct 25, 2016 at 01:23:42AM +0000, Guo, Lei wrote:
> Version:9.0.5/9.0.4
> File:drbd_state.c
>
> Two nodes are setup, Node 1 is primary, node 2 is secondary.
> On node 2 , command “drbdadm down r0” returns error.
>
> [error] Failed to disconnected or detach the r0.
> cmd_result=[11], cmd_output=[r0: State change failed: (-10) State change was refused by peer node
> additional info from kernel:failed to disconnect
>
>
> The possible bug is as follows.
> static enum outdate_what outdate_on_disconnect(struct drbd_connection *connection)
> {
>     struct drbd_resource *resource = connection->resource;
>
>     if (connection->fencing_policy >= FP_RESOURCE &&
>         resource->role[NOW] != connection->peer_role[NOW]) {
>         if (resource->role[NOW] == R_PRIMARY)
>             return OUTDATE_PEER_DISKS;
>         if (connection->peer_role[NOW] != R_PRIMARY) <--------- should be “if (connection->peer_role[NOW] == R_PRIMARY)”
Yes.
And I thought I fixed that some weeks ago.
But apparently I did not push a test case,
and it got lost during the last merge/release cycle.
Pushed now.
Note, though, that this is far from the only thing
that is broken with fencing policies enabled on DRBD 9.
I was operating on the assumption that there had been only a few
missing parts regarding pacemaker (and other) integration. Turned out I
was wrong, and DRBD 9 + fencing policies is still pretty much completely
broken in the module itself. We are working on it.
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support
DRBD® and LINBIT® are registered trademarks of LINBIT
* [Drbd-dev] DRBD Bug Report : DRBD 9.0 after-resync-target not being called after resync / reconnect
2016-10-31 10:58 ` Lars Ellenberg
@ 2018-04-04 7:04 ` Farhan Khan
2018-04-04 10:37 ` Lars Ellenberg
0 siblings, 1 reply; 4+ messages in thread
From: Farhan Khan @ 2018-04-04 7:04 UTC (permalink / raw)
To: drbd-dev
Hello, I would like to report a bug:
Scenario:
- 2 nodes running as primary and secondary on CentOS 7.4 with pacemaker
and corosync, with a dedicated Gigabit link between the nodes
Bug description and replication:
- Disconnect the link between nodes
--> fence-peer is called and secondary node is correctly fenced
- Do not write any new data to primary while the link is broken
- Reconnect the link
--> nodes reconnect and re-sync correctly
--> after-resync-target is NOT called and location constraint remains
Further steps to manually trigger after-resync-target handler:
- wrote some data to primary
--> correctly syncs with secondary
--> after-resync-target is still not called
Temporary workaround:
- write a script to write some data to the primary node when it gets
disconnected, using something like "dd if=/dev/zero of=speetest bs=64k
count=1 conv=fdatasync"
--> this triggers after-resync-target and the location constraint is
successfully removed
my config:
resource r0 {
    device /dev/drbd0 minor 0;
    disk /dev/sda4;
    meta-disk internal;
    on storage1 {
        node-id 0;
        address ipv4 172.26.1.1:7790;
    }
    on storage2 {
        node-id 1;
        address ipv4 172.26.1.2:7790;
    }
    net {
        protocol C;
        sndbuf-size 0;
        max-buffers 8000;
        max-epoch-size 8000;
        after-sb-0pri discard-least-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri call-pri-lost-after-sb;
        fencing resource-only;
    }
    disk {
        md-flushes no;
        al-extents 3833;
        resync-rate 90M;
        c-plan-ahead 2;
        c-fill-target 2M;
        c-max-rate 100M;
        c-min-rate 25M;
    }
    handlers {
        fence-peer /usr/lib/drbd/crm-fence-peer.9.sh;
        after-resync-target /usr/lib/drbd/crm-unfence-peer.9.sh;
        pri-lost-after-sb /usr/lib/drbd/notify-pri-lost-after-sb.sh;
    }
}
* Re: [Drbd-dev] DRBD Bug Report : DRBD 9.0 after-resync-target not being called after resync / reconnect
2018-04-04 7:04 ` [Drbd-dev] DRBD Bug Report : DRBD 9.0 after-resync-target not being called after resync / reconnect Farhan Khan
@ 2018-04-04 10:37 ` Lars Ellenberg
0 siblings, 0 replies; 4+ messages in thread
From: Lars Ellenberg @ 2018-04-04 10:37 UTC (permalink / raw)
To: drbd-dev
On Wed, Apr 04, 2018 at 10:04:49AM +0300, Farhan Khan wrote:
> Hello, I would like to report a bug:
>
> Scenario:
> - 2 nodes running as primary and secondary on CentOS 7.4 with pacemaker and
> corosync, with a dedicated Gigabit link between the nodes
>
> Bug description and replication:
> - Disconnect the link between nodes
> --> fence-peer is called and secondary node is correctly fenced
> - Do not write any new data to primary while the link is broken
> - Reconnect the link
> --> nodes reconnect and re-sync correctly
> --> after-resync-target is NOT called and location constraint remains
Because it likely did not sync anything: there was nothing to sync,
so the node never became sync target, and there was nothing to do
for an "after resync target" handler.
You should use the "unfence-peer" handler to unfence.
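As a sketch of that advice, the handlers section from the reported configuration might be changed as follows. It assumes the same script paths as in the original report; the only difference is that the unfence script is attached to unfence-peer instead of after-resync-target.

```
handlers {
    fence-peer /usr/lib/drbd/crm-fence-peer.9.sh;
    unfence-peer /usr/lib/drbd/crm-unfence-peer.9.sh;
    pri-lost-after-sb /usr/lib/drbd/notify-pri-lost-after-sb.sh;
}
```

The resource definition would then need to be reloaded on both nodes (for example with "drbdadm adjust r0") before the change takes effect.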
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support
DRBD® and LINBIT® are registered trademarks of LINBIT