It seems connection->cstate is only updated via P_CONN_ST_CHG_REQ, but DRBD 8.3 sends the request via P_STATE_CHG_REQ.

2017-08-11 14:12 GMT+08:00 li songmin <lisongmin9 at gmail.com>:

> When I debugged the crash, the cstate of the connection at the time of
> the oops was C_WF_REPORT_PARAMS, not C_TEAR_DOWN.
>
> So in my opinion this problem may also occur in versions up to 8.4.10.
>
> The order of the state change and the ack_sender initialization in
> conn_connect() is:
>
> ```
> rv = conn_request_state(connection, NS(conn, C_WF_REPORT_PARAMS),
>                         CS_VERBOSE);   /* <-- change cstate here */
> if (rv < SS_SUCCESS || connection->cstate != C_WF_REPORT_PARAMS) {
>         clear_bit(STATE_SENT, &connection->flags);
>         return 0;
> }
>
> drbd_thread_start(&connection->ack_receiver);
> /* opencoded create_singlethread_workqueue(),
>  * to be able to use format string arguments */
> connection->ack_sender =               /* <-- init ack_sender here */
> #if LINUX_VERSION_CODE >= KERNEL_VERSION(3,3,0)
>         alloc_ordered_workqueue("drbd_as_%s", WQ_MEM_RECLAIM,
>                                 connection->resource->name);
> #else
>         create_singlethread_workqueue("drbd_ack_sender");
> #endif
> if (!connection->ack_sender) {
>         drbd_err(connection, "Failed to create workqueue ack_sender\n");
>         return 0;
> }
> ```
>
> and the code at the oops site decides whether ack_sender is valid by
> looking at cstate only:
>
> ```
> if (connection->cstate >= C_WF_REPORT_PARAMS) {
>         kref_get(&device->kref); /* put is in drbd_send_acks_wf() */
>         if (!queue_work(connection->ack_sender,
>                         &peer_device->send_acks_work))   /* <-- oops here */
>                 kref_put(&device->kref, drbd_destroy_device);
> }
> ```
>
> 2017-08-10 18:21 GMT+08:00 Lars Ellenberg <lars.ellenberg at linbit.com>:
>
>> On Wed, Aug 09, 2017 at 05:20:22PM +0800, li songmin wrote:
>> > Hi,
>> >
>> > When I upgrade drbd from 8.3.15 to 8.4.6-5, there is an oops caused
>> > by a NULL pointer error.
>>
>> We are at 8.4.10 already.
>> Just saying.
>>
>> >
>> > The upgrade steps are as follows:
>> >
>> > 1. The primary node works as normal.
>> > 2. Stop drbd 8.3.15 on the secondary node and upgrade it to 8.4.6-5.
>> > 3. Start the secondary node; data now begins to sync from the
>> >    primary node.
>> > 4. Upgrade the primary node with the following steps:
>> >    1. stop the business service on drbd
>> >    2. disconnect drbd so that it can be unmounted quickly
>> >       <-- oops on secondary node here?
>>
>> Why disconnect?
>>
>> >    3. umount filesystem
>> >    4. primary -> secondary
>> >    5. connect drbd and wait for the sync to complete.
>> >    6. the business service may start on the secondary node now.
>> >    7. stop drbd 8.3.15 on the primary node and upgrade it to 8.4.6-5.
>> >
>> > call stack:
>>
>> > <4>[66071017.155051] Modules linked in: softdog drbd(FN)
>>
>> What did you need to force the module for?
>> Probably *that* is your problem right there.
>>
>>
>> --
>> : Lars Ellenberg
>> : LINBIT | Keeping the Digital World Running
>> : DRBD -- Heartbeat -- Corosync -- Pacemaker
>>
>> DRBD® and LINBIT® are registered trademarks of LINBIT
>> __
>> please don't Cc me, but send to list -- I'm subscribed
>> _______________________________________________
>> drbd-user mailing list
>> drbd-user at lists.linbit.com
>> http://lists.linbit.com/mailman/listinfo/drbd-user
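
The ordering described in this message (cstate is set to C_WF_REPORT_PARAMS before ack_sender is allocated, while the acking path tests only cstate) can be modelled in plain userspace C. This is a minimal sketch for illustration, not DRBD code and not a proposed patch: the struct layouts, the abbreviated state enum, and every helper name (`may_queue_unguarded`, `may_queue_guarded`, `connection_in_window`, `check_window`) are assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Abbreviated stand-in for the DRBD connection states; only the relative
 * order of the states named in the thread is modelled here. */
enum conn_state {
    C_STANDALONE,
    C_TEAR_DOWN,
    C_WF_CONNECTION,
    C_WF_REPORT_PARAMS,
    C_CONNECTED
};

/* Hypothetical placeholder for the kernel workqueue. */
struct workqueue { int queued; };

/* Hypothetical, stripped-down connection object. */
struct connection {
    enum conn_state cstate;
    struct workqueue *ack_sender;   /* NULL until conn_connect() allocates it */
};

/* Same shape as the test at the oops site: only cstate is consulted. */
static bool may_queue_unguarded(const struct connection *c)
{
    return c->cstate >= C_WF_REPORT_PARAMS;
}

/* Illustrative defensive variant: additionally require the workqueue. */
static bool may_queue_guarded(const struct connection *c)
{
    return c->cstate >= C_WF_REPORT_PARAMS && c->ack_sender != NULL;
}

/* Models the window inside conn_connect(): cstate has already been
 * changed, but ack_sender has not been assigned yet. */
static struct connection connection_in_window(void)
{
    struct connection c = { .cstate = C_WF_REPORT_PARAMS, .ack_sender = NULL };
    return c;
}

/* Walks through the window: the cstate-only test passes with no queue. */
static void check_window(void)
{
    struct connection c = connection_in_window();
    assert(may_queue_unguarded(&c));   /* cstate-only test says "go" */
    assert(c.ack_sender == NULL);      /* yet there is no workqueue to use */
    assert(!may_queue_guarded(&c));    /* the extra NULL check refuses */
}
```

Calling `check_window()` exercises the window in question. In the kernel a bare NULL check would not be race-free on its own, since the pointer can change between the test and the `queue_work()` call; the sketch only shows why a cstate test by itself does not prove that ack_sender exists.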