net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2160099
Upstream status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Tested: by IBM
Build-Info: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=52893145
Conflicts: None
commit 22a825c541d775c1dbe7b2402786025acad6727b
Author: D. Wythe <alibuda@linux.alibaba.com>
Date:   Wed Mar 8 16:17:12 2023 +0800

    net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()

    When performing a stress test on SMC-R by rmmod mlx5_ib driver
    during the wrk/nginx test, we found that there is a probability
    of triggering a panic while terminating all link groups.

    This issue is due to a race between smc_smcr_terminate_all()
    and smc_buf_create().

                            smc_smcr_terminate_all

    smc_buf_create
    /* init */
    conn->sndbuf_desc = NULL;
    ...

                            __smc_lgr_terminate
                                    smc_conn_kill
                                            smc_close_abort
                                                    smc_cdc_get_slot_and_msg_send

                            __softirqentry_text_start
                                    smc_wr_tx_process_cqe
                                            smc_cdc_tx_handler
                                                    READ(conn->sndbuf_desc->len);
                                                    /* panic due to NULL sndbuf_desc */

    conn->sndbuf_desc = xxx;

    This patch fixes the issue by always checking sndbuf_desc
    before sending any CDC message, so that no NULL pointer is
    seen during CQE processing.

    Fixes: 0b29ec6436 ("net/smc: immediate termination for SMCR link groups")
    Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
    Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
    Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
    Link: https://lore.kernel.org/r/1678263432-17329-1-git-send-email-alibuda@linux.alibaba.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Tobias Huschle <thuschle@redhat.com>
This commit is contained in:
Tobias Huschle 2023-05-26 09:39:20 +00:00
parent 3af290b771
commit e3f9a7c0c9
1 changed files with 3 additions and 0 deletions

@@ -114,6 +114,9 @@ int smc_cdc_msg_send(struct smc_connection *conn,
 	union smc_host_cursor cfed;
 	int rc;
 
+	if (unlikely(!READ_ONCE(conn->sndbuf_desc)))
+		return -ENOBUFS;
+
 	smc_cdc_add_pending_send(conn, pend);
 
 	conn->tx_cdc_seq++;