Description
Describe the bug
I have been load testing my Socket.IO service with Artillery (the script is attached as load-test.yml). Unfortunately, the runs end with results like the following:
All VUs finished. Total time: 12 minutes, 42 seconds
--------------------------------
Summary report @ 12:39:47(+0530)
--------------------------------
engine.socketio.emit: .......................................................... 137254
engine.socketio.emit_rate: ..................................................... 182/sec
errors.Error: timeout: ......................................................... 110
errors.Error: websocket error: ................................................. 317
errors.timeout: ................................................................ 150
errors.websocket error: ........................................................ 257
vusers.completed: .............................................................. 1774
vusers.created: ................................................................ 2400
vusers.created_by_name.Load Test Scenario: ..................................... 2400
vusers.failed: ................................................................. 626
vusers.session_length (msec):
min: ......................................................................... 140078.2
max: ......................................................................... 200970.3
median: ...................................................................... 148798.3
p95: ......................................................................... 167770.3
p99: ......................................................................... 174617.4
Log file: test-case-7.json
In other tests that I have run, the vusers.failed count has gone up to 1500. To debug further, I enabled my server's logs and found what looks like a memory leak: the service was running out of memory, crashing, and restarting. The log I found on the server was:
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: <--- Last few GCs --->
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: [45021:0x69022b0] 100805 ms: Scavenge (reduce) 2041.4 (2082.5) -> 2040.7 (2082.7) MB, 4.0 / 0.0 ms (average mu = 0.308, current mu = 0.292) allocation failure;
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: [45021:0x69022b0] 101973 ms: Mark-sweep (reduce) 2041.5 (2082.7) -> 2039.8 (2083.0) MB, 1165.3 / 0.0 ms (average mu = 0.182, current mu = 0.050) allocation failure; scavenge might not succeed
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: <--- JS stacktrace --->
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 1: 0xb7a940 node::Abort() [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 2: 0xa8e823 [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 3: 0xd5c940 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 4: 0xd5cce7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 5: 0xf3a3e5 [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 6: 0xf3b2e8 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 7: 0xf4b7f3 [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 8: 0xf4c668 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 9: 0xf26fce v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 10: 0xf28397 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 11: 0xf0956a v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 12: 0x12ce7af v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 /var/log/socket_service.log[45021]: 13: 0x16fb6b9 [/home/ubuntu/.nvm/versions/node/v18.16.0/bin/node]
Jun 09 12:10:37 ip-10-100-2-215 systemd[1]: socket-io.service: Main process exited, code=dumped, status=6/ABRT
Jun 09 12:10:37 ip-10-100-2-215 systemd[1]: socket-io.service: Failed with result 'core-dump'.
I have also taken heap snapshots, but unfortunately none of the retained objects trace back to the code I have written. Would this indicate that there is something wrong with the library? Please correct me if I am wrong. I am attaching screenshots of the comparison between heap snapshots:
To Reproduce
Socket.IO server version: 4.6.0
Server
index.js
import os from "os";
import { Server } from "socket.io";
import { createClient } from "redis";
import { createAdapter } from "@socket.io/redis-streams-adapter";
import { instrument, RedisStore } from "@socket.io/admin-ui";
import { createCustomLogger } from "./utils/logger";
import { MAIN_NAMESPACE_PREFIX } from "./utils/constants";
import initializeCobrowseNamespace from "./namespaces/cobrowse";
import initializeLiveChatNamespace from "./namespaces/livechat";
import {
  REDIS_HOST,
  REDIS_PORT,
  CORS_WHITELISTED_DOMAINS,
} from "./utils/config";

const logger = createCustomLogger();
// let garbageCollectorInterval = null;

const redisClient = createClient({
  socket: {
    host: REDIS_HOST,
    port: REDIS_PORT,
  },
});

// connect() is async; a plain try/catch would not catch a rejected promise
redisClient
  .connect()
  .catch((error) => logger.error(`Could not connect to redis client = ${error}`));

/*
  If the serveClient flag were to be marked as true, make sure the required
  changes are made in the webpack.config.js file.
  Documentation - https://socket.io/docs/v4/server-with-bundlers/
*/
const io = new Server(3000, {
  path: "/websocket/",
  maxHttpBufferSize: 1e8, // 100 MB
  adapter: createAdapter(redisClient),
  cors: {
    origin: CORS_WHITELISTED_DOMAINS,
    credentials: true,
  },
  serveClient: false,
});

/*
  Setting up the monitoring tool for the Socket.IO servers
*/
instrument(io, {
  auth: {
    type: "basic",
    username: "cognoai-test",
    password: "$2a$10$rhCtnxXuBRcBF2t3I4kFUOeHN/yLlXq1rl5iDQvSckuw7HVQ7eoZy",
  },
  namespaceName: "/admin-ui",
  serverId: `${os.hostname()}#${process.pid}`,
  store: new RedisStore(redisClient),
  mode: "production",
});

initializeCobrowseNamespace(io, logger);
// initializeLiveChatNamespace(io, logger);

/*
  Event - "connection"
  Triggered when a socket connection is established
*/
io.on("connection", (socket) => {
  logger.info(
    `${MAIN_NAMESPACE_PREFIX} request URL of the connection = ${socket.handshake.url}`
  );

  socket.on("test_event", (data) => {
    logger.info(`${MAIN_NAMESPACE_PREFIX} received data = ${data}`);
    io.emit("test_event_server", data);
  });

  socket.on("disconnect", (reason) => {
    logger.warn(
      `${MAIN_NAMESPACE_PREFIX} socket got disconnected due to the following reason = ${reason}`
    );
  });
});

/*
  Event - "connection_error"
  Emitted when a connection is abnormally closed
*/
io.engine.on("connection_error", (errorObject) => {
  logger.error(
    `${MAIN_NAMESPACE_PREFIX} connection_error event with request obj = ${errorObject.req} | error code = ${errorObject.code} | error message = ${errorObject.message} | error context = ${errorObject.context}`
  );
});
cobrowse.js
import { COBROWSE_NAMESPACE_PREFIX } from "../utils/constants";

const attachRoomEventListeners = (cobrowseNamespace, logger) => {
  logger.info(`${COBROWSE_NAMESPACE_PREFIX} attaching room event listeners`);

  cobrowseNamespace.adapter.on("create-room", (room) => {
    logger.info(`${COBROWSE_NAMESPACE_PREFIX} room ${room} was created`);
  });

  cobrowseNamespace.adapter.on("delete-room", (room) => {
    logger.info(`${COBROWSE_NAMESPACE_PREFIX} room ${room} was deleted`);
  });

  cobrowseNamespace.adapter.on("join-room", (room, id) => {
    logger.info(
      `${COBROWSE_NAMESPACE_PREFIX} socket ${id} has joined room ${room}`
    );
  });

  cobrowseNamespace.adapter.on("leave-room", (room, id) => {
    logger.info(
      `${COBROWSE_NAMESPACE_PREFIX} socket ${id} has left room ${room}`
    );
  });
};

const initializeCobrowseNamespace = (io, logger) => {
  logger.info("Initialising co-browse namespace");
  const cobrowseNamespace = io.of("/cobrowse");
  attachRoomEventListeners(cobrowseNamespace, logger);

  cobrowseNamespace.on("connection", (socket) => {
    logger.info(
      `${COBROWSE_NAMESPACE_PREFIX} request URL of the connection = ${socket.handshake.url}`
    );

    /*
      Extract the room key from the query parameter. Once we get the room key,
      we add the socket to the room created on the basis of that room key.
    */
    const roomName = socket.handshake.query.roomKey;
    logger.info(`${COBROWSE_NAMESPACE_PREFIX} room key extracted = ${roomName}`);
    socket.join(roomName);

    socket.on("cobrowse_packet", (data) => {
      logger.info(
        `${COBROWSE_NAMESPACE_PREFIX} received data cobrowse_packet = ${Object.keys(data)}`
      );
      cobrowseNamespace.to(roomName).emit("cobrowse_packet", data);
    });

    socket.on("cobrowse_sync_utils_packet", (data) => {
      logger.info(
        `${COBROWSE_NAMESPACE_PREFIX} received data cobrowse_sync_utils_packet = ${Object.keys(data)}`
      );
      cobrowseNamespace.to(roomName).emit("cobrowse_sync_utils_packet", data);
    });

    socket.on("cobrowse_signal_packet", (data) => {
      logger.info(
        `${COBROWSE_NAMESPACE_PREFIX} received data cobrowse_signal_packet = ${Object.keys(data)}`
      );
      cobrowseNamespace.to(roomName).emit("cobrowse_signal_packet", data);
    });

    socket.on("app_cobrowse_packet", (data) => {
      logger.info(
        `${COBROWSE_NAMESPACE_PREFIX} received data app_cobrowse_packet = ${Object.keys(data)}`
      );
      cobrowseNamespace.to(roomName).emit("app_cobrowse_packet", data);
    });

    socket.on("disconnect", (reason) => {
      logger.warn(
        `${COBROWSE_NAMESPACE_PREFIX} socket got disconnected due to the following reason = ${reason}`
      );
    });
  });
};

export default initializeCobrowseNamespace;
Socket.IO client version: artillery-engine-socketio-v3 - v1.1.3
Client
Artillery .yaml script
config:
  target: "https://xyz.com"
  socketio:
    path: "/websocket"
    transports: ["websocket"]
    query:
      roomKey: "abcd"
  phases:
    - duration: 600
      arrivalRate: 5
  engines:
    socketio-v3: {}
scenarios:
  - name: Load Test Scenario
    engine: socketio-v3
    flow:
      - loop:
          - namespace: "/cobrowse"
            emit:
              channel: "cobrowse_packet"
              data: "world"
          - think: 1
        count: 70
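The script was run with an invocation along these lines (assumed; the output filename matches the "Log file" in the summary above):

```shell
artillery run --output test-case-7.json load-test.yml
```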
Infrastructure Specifications:
There are 2 servers, each with 4 cores and 8 GB of RAM. Both run a Socket.IO instance, and one of them also runs Redis. An AWS ALB sits in front of the two websocket servers. I am using only websockets, not HTTP polling.
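One back-of-envelope observation about the test shape (my own estimate, not something the heap snapshots confirm): every VU connects with the same roomKey ("abcd"), so each cobrowse_packet is rebroadcast to every socket in that one room, and the outbound message volume the servers must buffer grows with the product of the emit rate and the room size:

```javascript
// Rough fan-out estimate under the test above. Figures are taken from the
// Artillery summary; actual room membership fluctuates as VUs come and go.
const emitRatePerSec = 182;   // engine.socketio.emit_rate
const socketsInRoom = 2400;   // vusers.created, all sharing roomKey "abcd"

const outboundMsgsPerSec = emitRatePerSec * socketsInRoom;
console.log(`~${outboundMsgsPerSec} outbound messages/sec across the room`);
```

If the receivers (the Artillery VUs do not consume broadcasts quickly, or at all), those outbound messages can pile up in per-connection write buffers, which would show up as exactly this kind of steady heap growth.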
Additional context
I also minify my code using webpack. The remaining package version details can be found in the package.json file. You can find my code here: WebSocketServiceExport.zip.