[PATCH 04/11] drbd_transport_rdma: dont schedule retry_connect_work in active is false
Dongsheng Yang
dongsheng.yang at easystack.cn
Mon Jul 1 04:11:27 CEST 2024
在 2024/6/28 星期五 下午 7:51, Philipp Reisner 写道:
> Hello Dongsheng,
>
> Please explain what problem you are fixing with this change. Do you
> have a log that shows a problem in this area? Please describe why your
> proposed change improves DRBD's behavior.
retry_connect_work can be flushed in dtr_free, that's correct. but if we
schedule new work after that, there is a NULL pointer dereference in our
testing. So dont schedule new retry_connect_work when
rdma_transport->active is false. it is set to false in dtr_free before
flushing retry_connect_work.
>
> best regards,
> Philipp
>
> On Mon, Jun 24, 2024 at 9:28 AM zhengbing.huang
> <zhengbing.huang at easystack.cn> wrote:
>>
>> From: Dongsheng Yang <dongsheng.yang at easystack.cn>
>>
>> Signed-off-by: Dongsheng Yang <dongsheng.yang at easystack.cn>
>> ---
>> drbd/drbd_transport_rdma.c | 12 ++++++++++--
>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
>> index eccd0c6ce..b7ccb15d4 100644
>> --- a/drbd/drbd_transport_rdma.c
>> +++ b/drbd/drbd_transport_rdma.c
>> @@ -1089,9 +1089,13 @@ static void dtr_cma_retry_connect_work_fn(struct work_struct *work)
>> if (err) {
>> struct dtr_path *path = container_of(cs, struct dtr_path, cs);
>> struct drbd_transport *transport = path->path.transport;
>> + struct dtr_transport *rdma_transport =
>> + container_of(transport, struct dtr_transport, transport);
>>
>> tr_err(transport, "dtr_start_try_connect failed %d\n", err);
>> - schedule_delayed_work(&cs->retry_connect_work, HZ);
>> + if (rdma_transport->active) {
>> + schedule_delayed_work(&cs->retry_connect_work, HZ);
>> + }
>> }
>> }
>>
>> @@ -1116,6 +1120,8 @@ static void dtr_remove_cm_from_path(struct dtr_path *path, struct dtr_cm *failed
>> static void dtr_cma_retry_connect(struct dtr_path *path, struct dtr_cm *failed_cm)
>> {
>> struct drbd_transport *transport = path->path.transport;
>> + struct dtr_transport *rdma_transport =
>> + container_of(transport, struct dtr_transport, transport);
>> struct dtr_connect_state *cs = &path->cs;
>> long connect_int = 10 * HZ;
>> struct net_conf *nc;
>> @@ -1128,7 +1134,9 @@ static void dtr_cma_retry_connect(struct dtr_path *path, struct dtr_cm *failed_c
>> connect_int = nc->connect_int * HZ;
>> rcu_read_unlock();
>>
>> - schedule_delayed_work(&cs->retry_connect_work, connect_int);
>> + if (rdma_transport->active) {
>> + schedule_delayed_work(&cs->retry_connect_work, connect_int);
>> + }
>> }
>>
>> static void dtr_cma_connect_work_fn(struct work_struct *work)
>> --
>> 2.27.0
>>
> .
>
More information about the drbd-dev
mailing list