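Recently, while using zmq, my process has started aborting for no obvious reason. Output from the core dump: Output printed inside zmq: Judging by the output, an fd used internally was released too early, but I'm not sure what caused it. Has any fellow V2EX user run into this?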
1
hankai17 2023-07-13 09:49:34 +08:00
My guess is that your business callback does something proxy-like and forcibly closes some other fd.
|
2
NoAnyLove 2023-07-13 09:52:07 +08:00
Which zeromq version? Do you have a minimal piece of code that reproduces the error?
|
3
kkkbbb OP @NoAnyLove The version is 4.3.3. It doesn't reproduce every time, so I'm not sure what the problem might be.
Here is the send code. I looked through it and couldn't see where it would close the socket early:

void *cts_zmq_socket_connect(void *context, void *requester, const char *dest, short destory)
{
    void *pSocket = NULL;
    int optionVlalue = 0;
    int result;

    if (destory) {
        zmq_disconnect(requester, dest);
        zmq_close(requester);
    }
    pSocket = zmq_socket(pContext, ZMQ_REQ);
    if (NULL == pSocket) {
        return NULL;
    }
    zmq_setsockopt(pSocket, ZMQ_LINGER, &optionVlalue, sizeof(int));
    result = zmq_connect(pSocket, dest);
    if (result < 0) {
        zmq_close(pSocket);
        return NULL;
    }
    return pSocket;
}

int cts_zmq_send(ZMQ_MSG *msg, const char *dest)
{
    int result;
    int expect_reply = 1;
    int retries_left = REQUEST_RETRIES;
    void *pSocket = NULL;

    pSocket = cts_zmq_socket_connect(pContext, pSocket, dest, 0);
    if (pSocket == NULL) {
        return -1;
    }
    msg->MessageSequence = cts_zmq_random_get(10000);
    result = zmq_send(pSocket, msg, sizeof(*msg), 0);
    if (result < 0) {
        zmq_disconnect(pSocket, dest);
        zmq_close(pSocket);
        return -2;
    }
    while (expect_reply) {
        zmq_pollitem_t items[] = {{pSocket, 0, ZMQ_POLLIN, 0}};
        int rc = zmq_poll(items, 1, REQUEST_TIMEOUT);
        if (rc == -1) {
            result = -3;
            break;
        }
        if (items[0].revents & ZMQ_POLLIN) {
            ZMQ_MSG recvMsg;
            int rcv = zmq_recv(pSocket, &recvMsg, sizeof(recvMsg), 0);
            if (rcv < 0) {
                result = -4;
                break;
            } else {
                retries_left = REQUEST_RETRIES;
                expect_reply = 0;
                if (recvMsg.MessageSequence == msg->MessageSequence) {
                    result = 0;
                    break;
                } else {
                    result = -5;
                    break;
                }
            }
        } else {
            if (retries_left <= 0) {
                retries_left = REQUEST_RETRIES;
                result = -6;
                break;
            } else {
                retries_left -= 1;
                pSocket = cts_zmq_socket_connect(pContext, pSocket, dest, 1);
                if (pSocket == NULL) {
                    return -7;
                }
                msg->MessageSequence = cts_zmq_random_get(10000);
                zmq_send(pSocket, msg, sizeof(*msg), 0);
                continue;
            }
        }
    }
    zmq_disconnect(pSocket, dest);
    zmq_close(pSocket);
    return result;
}
|
5
NoAnyLove 2023-07-13 13:24:33 +08:00
That file descriptor is an eventfd, so in theory nothing should close it deliberately. Which raises a few questions:
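* Are you using fork?
* How many zmq sockets are being created?
* My feeling is that the problem is in the line below. This kind of initialization seems to run only on the first loop iteration, and the later retries close the previous socket:
zmq_pollitem_t items[] = {{pSocket, 0, ZMQ_POLLIN, 0}};
|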
6
NoAnyLove 2023-07-13 13:25:10 +08:00
Switch to assigning each member individually.
|
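The per-member assignment suggested in the last reply could look roughly like the sketch below. To keep it compilable without libzmq, it uses a stand-in struct `pollitem_t` whose fields mirror `zmq_pollitem_t`, and a stand-in flag `POLLIN_FLAG` in place of `ZMQ_POLLIN`; the helper names `pollitem_set` and `count_mismatches` are hypothetical and not part of the posted code. It also rebuilds the item with the brace initializer from the posted code on every pass, so the two forms can be compared while the socket pointer changes, as it does on the retry path.

```c
#include <stddef.h>

/* Stand-in for zmq_pollitem_t so the sketch compiles without libzmq.
   Field names follow zmq.h (socket, fd, events, revents); on Windows the
   real fd field is a SOCKET rather than an int. */
typedef struct {
    void *socket;
    int fd;
    short events;
    short revents;
} pollitem_t;

#define POLLIN_FLAG 1 /* assumed stand-in for ZMQ_POLLIN */

/* Hypothetical helper: fill one poll item field by field. */
void pollitem_set(pollitem_t *item, void *socket)
{
    item->socket = socket;
    item->fd = 0;
    item->events = POLLIN_FLAG;
    item->revents = 0;
}

/* Simulates the retry loop: the socket pointer changes on every pass and
   the poll item is rebuilt both ways. Returns the number of passes in which
   either form failed to point at the current socket or the two forms
   disagreed on the requested events. */
int count_mismatches(void **sockets, size_t n)
{
    int mismatches = 0;
    for (size_t i = 0; i < n; i++) {
        void *pSocket = sockets[i];

        /* Brace initializer, as in the posted code. */
        pollitem_t by_init[] = {{pSocket, 0, POLLIN_FLAG, 0}};

        /* Per-member assignment, as suggested in the reply. */
        pollitem_t by_member;
        pollitem_set(&by_member, pSocket);

        if (by_init[0].socket != pSocket || by_member.socket != pSocket
            || by_init[0].events != by_member.events) {
            mismatches++;
        }
    }
    return mismatches;
}
```

In real code the struct would be `zmq_pollitem_t` and the flag `ZMQ_POLLIN`. Standard C evaluates the initializer of an automatic variable each time its declaration is reached, so both forms should track the current socket on every pass. If a check like this reports no mismatches, the early fd release probably has a different cause, such as the fork and socket-count questions raised above.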