* [Drbd-dev] DRBD Bug Report
@ 2016-10-25  1:23 Guo, Lei
  2016-10-31 10:58 ` Lars Ellenberg
  0 siblings, 1 reply; 4+ messages in thread
From: Guo, Lei @ 2016-10-25  1:23 UTC (permalink / raw)
  To: drbd-dev@lists.linbit.com

Version: 9.0.5/9.0.4
File: drbd_state.c

Two nodes are set up: node 1 is primary, node 2 is secondary.
On node 2, the command "drbdadm down r0" returns an error.

[error] Failed to disconnected or detach the r0.
                                                 cmd_result=[11], cmd_output=[r0: State change failed: (-10) State change was refused by peer node
                                                 additional info from kernel:failed to disconnect


The possible bug is as follows.
static enum outdate_what outdate_on_disconnect(struct drbd_connection *connection)
{
        struct drbd_resource *resource = connection->resource;

        if (connection->fencing_policy >= FP_RESOURCE &&
            resource->role[NOW] != connection->peer_role[NOW]) {
                if (resource->role[NOW] == R_PRIMARY)
                        return OUTDATE_PEER_DISKS;
                if (connection->peer_role[NOW] != R_PRIMARY)           <--------- should be "if (connection->peer_role[NOW] == R_PRIMARY)"
                        return OUTDATE_DISKS;
        }
        return OUTDATE_NOTHING;
}


That is all. Thank you in advance.
-------
郭磊 Guo Lei
Development Dept.III (3-2-3)
Nanjing Fujitsu Nanda Software Tech. Co., Ltd.(FNST)
TEL: +86+25-86630566-9437
E-mail: guol-fnst@cn.fujistu.com






* Re: [Drbd-dev] DRBD Bug Report
  2016-10-25  1:23 [Drbd-dev] DRBD Bug Report Guo, Lei
@ 2016-10-31 10:58 ` Lars Ellenberg
  2018-04-04  7:04   ` [Drbd-dev] DRBD Bug Report : DRBD 9.0 after-resync-target not being called after resync / reconnect Farhan Khan
  0 siblings, 1 reply; 4+ messages in thread
From: Lars Ellenberg @ 2016-10-31 10:58 UTC (permalink / raw)
  To: drbd-dev

On Tue, Oct 25, 2016 at 01:23:42AM +0000, Guo, Lei wrote:
> Version:9.0.5/9.0.4
> File:drbd_state.c
> 
> Two nodes are setup, Node 1 is primary, node 2 is secondary.
> On node 2 , command “drbdadm down r0” returns error.
> 
> [error] Failed to disconnected or detach the r0.
>                                                  cmd_result=[11], cmd_output=[r0: State change failed: (-10) State change was refused by peer node
>                                                  additional info from kernel:failed to disconnect
> 
> 
> The possible bug is as follows.
> static enum outdate_what outdate_on_disconnect(struct drbd_connection *connection)
> {
>         struct drbd_resource *resource = connection->resource;
> 
>         if (connection->fencing_policy >= FP_RESOURCE &&
>             resource->role[NOW] != connection->peer_role[NOW]) {
>                 if (resource->role[NOW] == R_PRIMARY)
>                         return OUTDATE_PEER_DISKS;
>                 if (connection->peer_role[NOW] != R_PRIMARY)           <--------- should be “if (connection->peer_role[NOW] == R_PRIMARY)”

Yes.
And I thought I fixed that some weeks ago.
But apparently I did not push a test case,
and it got lost during the last merge/release cycle.
Pushed now.
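
For readers following along, here is a minimal, self-contained sketch of the check with the one-line correction from the report applied. It is not the pushed commit. The enum and struct definitions are simplified stand-ins for the kernel types (an assumption made only so that the snippet compiles outside the module); the function body is the one quoted above with `!=` changed to `==`.

```c
/* Simplified stand-ins for the kernel definitions (assumptions, for illustration only). */
enum drbd_role { R_UNKNOWN, R_PRIMARY, R_SECONDARY };
enum fencing_policy { FP_DONT_CARE, FP_RESOURCE, FP_STONITH };
enum outdate_what { OUTDATE_NOTHING, OUTDATE_DISKS, OUTDATE_PEER_DISKS };
enum which_state { NOW, NEW };

struct drbd_resource {
        enum drbd_role role[2];
};

struct drbd_connection {
        struct drbd_resource *resource;
        enum fencing_policy fencing_policy;
        enum drbd_role peer_role[2];
};

/* Decide which side's disks to outdate when a connection goes away. */
static enum outdate_what outdate_on_disconnect(struct drbd_connection *connection)
{
        struct drbd_resource *resource = connection->resource;

        if (connection->fencing_policy >= FP_RESOURCE &&
            resource->role[NOW] != connection->peer_role[NOW]) {
                /* We are primary: the peer's disks are the ones to outdate. */
                if (resource->role[NOW] == R_PRIMARY)
                        return OUTDATE_PEER_DISKS;
                /* Corrected test: the peer is primary, so outdate our own disks. */
                if (connection->peer_role[NOW] == R_PRIMARY)
                        return OUTDATE_DISKS;
        }
        return OUTDATE_NOTHING;
}
```

With the original `!=` test, a secondary whose peer is primary fell through to OUTDATE_NOTHING, while a secondary whose peer role was neither primary nor secondary got OUTDATE_DISKS. The corrected test swaps those two outcomes.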

Note, though, that this is far from the only thing
that is broken with fencing policies enabled on DRBD 9.

I was operating on the assumption that only a few parts were missing
regarding pacemaker (and other) integration.  It turned out I
was wrong, and DRBD 9 + fencing policies is still pretty much completely
broken in the module itself.  We are working on it.

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT


* [Drbd-dev] DRBD Bug Report : DRBD 9.0 after-resync-target not being called after resync / reconnect
  2016-10-31 10:58 ` Lars Ellenberg
@ 2018-04-04  7:04   ` Farhan Khan
  2018-04-04 10:37     ` Lars Ellenberg
  0 siblings, 1 reply; 4+ messages in thread
From: Farhan Khan @ 2018-04-04  7:04 UTC (permalink / raw)
  To: drbd-dev

Hello, I would like to report a bug:

Scenario:
- 2 nodes running as primary and secondary on CentOS 7.4 with pacemaker 
and corosync, with a dedicated Gigabit link between the nodes

Bug description and replication:
- Disconnect the link between nodes
      --> fence-peer is called and secondary node is correctly fenced
- Do not write any new data to primary while the link is broken
- Reconnect the link
      --> nodes reconnect and re-sync correctly
      --> after-resync-target is NOT called and location constraint remains

Further steps to manually trigger after-resync-target handler:
- Write some data to primary
      --> correctly syncs with secondary
      --> after-resync-target is still not called

Temporary workaround:
- Write a script that writes some data to the primary node when it gets 
disconnected, using something like "dd if=/dev/zero of=speetest bs=64k 
count=1 conv=fdatasync"
      --> this triggers after-resync-target and the location constraint is 
successfully removed

my config:
resource r0 {
     device               /dev/drbd0 minor 0;
     disk                 /dev/sda4;
     meta-disk            internal;
     on storage1 {
         node-id 0;
         address          ipv4 172.26.1.1:7790;
     }
     on storage2 {
         node-id 1;
         address          ipv4 172.26.1.2:7790;
     }
     net {
         protocol           C;
         sndbuf-size        0;
         max-buffers      8000;
         max-epoch-size   8000;
         after-sb-0pri    discard-least-changes;
         after-sb-1pri    discard-secondary;
         after-sb-2pri    call-pri-lost-after-sb;
         fencing          resource-only;
     }
     disk {
         md-flushes        no;
         al-extents       3833;
         resync-rate      90M;
         c-plan-ahead       2;
         c-fill-target     2M;
         c-max-rate       100M;
         c-min-rate       25M;
     }
     handlers {
         fence-peer       /usr/lib/drbd/crm-fence-peer.9.sh;
         after-resync-target /usr/lib/drbd/crm-unfence-peer.9.sh;
         pri-lost-after-sb /usr/lib/drbd/notify-pri-lost-after-sb.sh;
     }
}


* Re: [Drbd-dev] DRBD Bug Report : DRBD 9.0 after-resync-target not being called after resync / reconnect
  2018-04-04  7:04   ` [Drbd-dev] DRBD Bug Report : DRBD 9.0 after-resync-target not being called after resync / reconnect Farhan Khan
@ 2018-04-04 10:37     ` Lars Ellenberg
  0 siblings, 0 replies; 4+ messages in thread
From: Lars Ellenberg @ 2018-04-04 10:37 UTC (permalink / raw)
  To: drbd-dev

On Wed, Apr 04, 2018 at 10:04:49AM +0300, Farhan Khan wrote:
> Hello, I would like to report a bug:
> 
> Scenario:
> - 2 nodes running as primary and secondary on CentOS 7.4 with pacemaker and
> corosync with dedicated link Gigabit link between nodes
> 
> Bug description and replication:
> - Disconnect the link between nodes
>      --> fence-peer is called and secondary node is correctly fenced
> - Do not write any new data to primary while the link is broken
> -Reconnect the link
>      --> nodes reconnect and re-sync correctly
>      --> after-resync-target is NOT called and location constraint remains

Most likely because it did not sync anything: there was nothing to
sync, so the node never became sync target, and there was nothing for
an "after" resync target handler to do.

You should use the "unfence-peer" handler to unfence.
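
As a sketch of that suggestion (an assumption about how it maps onto the config quoted in the report, not a tested setup), the handlers section would register the same unfence script under "unfence-peer" instead of "after-resync-target":

```
     handlers {
         fence-peer       /usr/lib/drbd/crm-fence-peer.9.sh;
         unfence-peer     /usr/lib/drbd/crm-unfence-peer.9.sh;
         pri-lost-after-sb /usr/lib/drbd/notify-pri-lost-after-sb.sh;
     }
```

The script paths are copied from the reporter's config. Whether they match a given installation depends on the drbd-utils packaging.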

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT

