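This post benchmarks three ways of handling a sticky session in a Spring Boot (MVC) application after a bidirectional connection has been opened.
The client opens a WebSocket connection to server A, and A in turn opens a WebSocket connection to server B.
The goal is to use the resources available at the application-server level as fully as possible, and to find the optimization that fits the situation when work (checking a waiting queue, additional logic, and so on) continues after the WebSocket connection is established.
Connect Time, Sample Time: the time it takes to establish the WebSocket connection. In this test the two values are identical.
classic: the work is not run in the background; it is processed to completion inside the handler method itself.
(See the busy-waiting and Thread sleep code in the B server test.)
Spike test of WebSocket connections from the client to server A (1000)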
Sample Time avg : 4191ms
Connect Time avg: 4191ms
Throughput: 158
Notes
Once the number of connections exceeds Tomcat's default thread count of 200, Connect Time degrades markedly.
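The effect can be modelled without Tomcat at all. The sketch below is only an illustration under simplifying assumptions: a plain fixed thread pool stands in for the request threads, and `runBlockingJobs` is a hypothetical helper, not part of the benchmarked application. It shows that once there are more blocking jobs than threads, the extra jobs wait for a free thread and the total time grows in steps.

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import kotlin.system.measureTimeMillis

// Runs `tasks` blocking jobs on a fixed pool of `threads` workers and
// returns the total wall-clock time in milliseconds.
fun runBlockingJobs(threads: Int, tasks: Int, blockMs: Long): Long {
    val pool = Executors.newFixedThreadPool(threads)
    return measureTimeMillis {
        repeat(tasks) {
            // Each job holds its worker thread for the whole wait, like a
            // handler that does not return until its work is finished.
            pool.execute { Thread.sleep(blockMs) }
        }
        pool.shutdown()
        pool.awaitTermination(1, TimeUnit.MINUTES)
    }
}
```

Calling `runBlockingJobs(4, 8, 200)` needs at least two rounds of 200 ms because only four jobs can run at a time, whereas `runBlockingJobs(8, 8, 200)` can finish in roughly one round.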
// Pool that runs the outbound connection to server B off the request thread.
private val scheduler = Executors.newScheduledThreadPool(50)
private val scheduledTasks = ConcurrentHashMap<WebSocketSession, ScheduledFuture<*>>()
// Maps each client session to its session on server B.
private val managerMap = ConcurrentHashMap<WebSocketSession, WebSocketSession>()

override fun afterConnectionEstablished(session: WebSocketSession) {
    val task = Runnable {
        connectToSaas(session)
    }
    val scheduleAtFixedRate = scheduler.scheduleAtFixedRate(task, 0, 1, TimeUnit.SECONDS)
    scheduledTasks[session] = scheduleAtFixedRate
}

private fun connectToSaas(session: WebSocketSession) {
    val url = "ws://localhost:9999/internal"
    val client = StandardWebSocketClient()
    val header = getHeader(session)
    // Blocks this pool thread until the handshake with server B completes.
    val saasSession = client.execute(saasWebSocketHandler, header, URI(url)).get()
    managerMap[session] = saasSession
    saasSession.setUserSession(session)
    // Cancel the periodic task as well as removing it; removing the map
    // entry alone would leave it firing again every second.
    scheduledTasks.remove(session)?.cancel(false)
    logger().info(managerMap.size.toString())
}
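One detail of `scheduleAtFixedRate` is worth keeping in mind with this pattern: a periodic task keeps firing until its `ScheduledFuture` is cancelled, and dropping the future from a map does not cancel it. The following is a minimal sketch of that behaviour, using a string key and a counter in place of a real session; `runsAfterCleanup` is a hypothetical helper written for illustration only.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledFuture
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger

// Returns how many times the periodic task ran after its map entry was removed.
fun runsAfterCleanup(cancelOnRemove: Boolean): Int {
    val scheduler = Executors.newScheduledThreadPool(1)
    val tasks = ConcurrentHashMap<String, ScheduledFuture<*>>()
    val runs = AtomicInteger()

    // Periodic task: fires immediately, then every 50 ms.
    tasks["session-1"] = scheduler.scheduleAtFixedRate(
        Runnable { runs.incrementAndGet() }, 0, 50, TimeUnit.MILLISECONDS
    )
    Thread.sleep(120)

    // Clean up the bookkeeping entry, optionally cancelling the task too.
    val future = tasks.remove("session-1")
    if (cancelOnRemove) future?.cancel(false)
    val atCleanup = runs.get()

    Thread.sleep(200)
    val after = runs.get()
    scheduler.shutdownNow()
    return after - atCleanup
}
```

With `cancelOnRemove = true` the task stops at cleanup; with `false` it continues to run even though nothing references it from the map any more.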
Sample Time avg : 21ms
Connect Time avg: 21ms
Throughput: 1000
cpu
Sample Time avg : 26ms
Connect Time avg: 26ms
Throughput: 995
Scenario: after the WebSocket connection with server A is established, the handler has to busy-wait or the work takes a long time (2000)
Test environment (otherwise identical)
override fun afterConnectionEstablished(session: WebSocketSession) {
    WebSocketConnection.WAIT_USER.add(session)
    // Holds the calling thread for as long as the session stays open.
    while (session.isOpen) {
        Thread.sleep(1000)
    }
    log.info("size = ${WebSocketConnection.WAIT_USER.size}")
}
Timeouts occur once the load exceeds the B server's thread pool size of 50.
Exception in thread "pool-8-thread-48" java.util.concurrent.ExecutionException: jakarta.websocket.DeploymentException: The HTTP request to initiate the WebSocket connection to [ws://localhost:9999/internal] failed
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
at com.klleon.ncc.apiserver.websocket.handler.UserWebSocketHandler2.connectToSaas(UserWebSocketHandler2.kt:57)
at com.klleon.ncc.apiserver.websocket.handler.UserWebSocketHandler2.access$connectToSaas(UserWebSocketHandler2.kt:21)
at com.klleon.ncc.apiserver.websocket.handler.UserWebSocketHandler2$afterConnectionEstablished$1.invokeSuspend(UserWebSocketHandler2.kt:34)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:577)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1623)
Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@3a3a98bc, java.util.concurrent.ScheduledThreadPoolExecutor@70841eba[Running, pool size = 50, active threads = 50, queued tasks = 1471, completed tasks = 479]]
Caused by: jakarta.websocket.DeploymentException: The HTTP request to initiate the WebSocket connection to [ws://localhost:9999/internal] failed
at org.apache.tomcat.websocket.WsWebSocketContainer.connectToServerRecursive(WsWebSocketContainer.java:429)
at org.apache.tomcat.websocket.WsWebSocketContainer.connectToServer(WsWebSocketContainer.java:179)
at org.springframework.web.socket.client.standard.StandardWebSocketClient.lambda$executeInternal$0(StandardWebSocketClient.java:150)
at org.springframework.util.concurrent.FutureUtils.lambda$toSupplier$0(FutureUtils.java:74)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
... 1 more
Caused by: java.util.concurrent.TimeoutException: The HTTP upgrade to WebSocket failed but partial data may have been received: Status Code [0], HTTP headers [{}]
at org.apache.tomcat.websocket.WsWebSocketContainer.processResponse(WsWebSocketContainer.java:796)
at org.apache.tomcat.websocket.WsWebSocketContainer.connectToServerRecursive(WsWebSocketContainer.java:335)
... 5 more
Caused by: java.util.concurrent.TimeoutException
at java.base/sun.nio.ch.PendingFuture.get(PendingFuture.java:195)
at org.apache.tomcat.websocket.WsWebSocketContainer.processResponse(WsWebSocketContainer.java:793)
... 6 more
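The diagnostic line in the trace (`pool size = 50, active threads = 50, queued tasks = 1471`) describes a saturated pool: every worker is parked in a blocking call (here `CompletableFuture.get`), so new tasks can only queue. The sketch below reproduces that state on a small scale. It is an illustration only; `saturate` is a hypothetical helper and a latch stands in for the blocking handshake.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.ScheduledThreadPoolExecutor

// Submits `tasks` jobs that block until released and reports
// (active threads, queued tasks) while the pool is saturated.
fun saturate(poolSize: Int, tasks: Int): Pair<Int, Int> {
    val pool = ScheduledThreadPoolExecutor(poolSize)
    val started = CountDownLatch(minOf(poolSize, tasks))
    val release = CountDownLatch(1)

    repeat(tasks) {
        pool.execute {
            started.countDown()
            release.await() // stands in for a blocking call such as Future.get()
        }
    }

    started.await() // wait until every available worker is occupied
    val snapshot = pool.activeCount to pool.queue.size

    release.countDown() // let all jobs finish
    pool.shutdown()
    return snapshot
}
```

For example, `saturate(2, 6)` reports two active threads and four queued tasks, the same shape as the 50 active and 1471 queued in the trace.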
private val scheduler = Executors.newScheduledThreadPool(50)
private val scheduledTasks = ConcurrentHashMap<WebSocketSession, ScheduledFuture<*>>()

override fun afterConnectionEstablished(session: WebSocketSession) {
    WebSocketConnection.WAIT_USER.add(session)
    // Polls the waiting queue once per second instead of blocking a thread.
    val checkWaitStatusTask = Runnable {
        if (session.isOpen) {
            if (WebSocketConnection.WAIT_USER.indexOf(session) <= 0) {
                handleConnection(session)
            }
        } else {
            // Session closed: stop polling for it.
            scheduledTasks.remove(session)?.cancel(false)
        }
    }
    val scheduledFuture = scheduler.scheduleAtFixedRate(checkWaitStatusTask, 0, 1, TimeUnit.SECONDS)
    scheduledTasks[session] = scheduledFuture
    log.info("size = ${WebSocketConnection.WAIT_USER.size}")
}

private fun handleConnection(session: WebSocketSession) {
    println("session waiting ${session.id}")
}
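The queue check `indexOf(session) <= 0` is true for the session at the head of the list, and also for a session that is not in the list at all, because `indexOf` returns -1 for a missing element. The sketch below illustrates the difference. The definition of `WebSocketConnection.WAIT_USER` is not shown in this post, so a `CopyOnWriteArrayList` of session ids is assumed here, and both function names are hypothetical.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList

// Mirrors the check used in the handler: true for the head of the queue,
// and also for an id that is absent from the queue (indexOf == -1).
fun isNextOrAbsent(queue: List<String>, sessionId: String): Boolean =
    queue.indexOf(sessionId) <= 0

// Stricter variant: true only when the id is actually at the head.
fun isNext(queue: List<String>, sessionId: String): Boolean =
    queue.firstOrNull() == sessionId

// Assumed stand-in for WAIT_USER, holding session ids instead of sessions.
fun sampleQueue(): List<String> = CopyOnWriteArrayList(listOf("a", "b", "c"))
```

Whether the looser check matters depends on whether a session can be polled after it has been removed from the queue; if it can, the stricter variant avoids treating it as next in line.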
// Coroutines are dispatched onto a 50-thread pool.
private val coroutineDispatcher = Executors.newScheduledThreadPool(50).asCoroutineDispatcher()
private val handlerScope = CoroutineScope(coroutineDispatcher + SupervisorJob())
// Not referenced in the code shown here.
private val scheduler = Executors.newScheduledThreadPool(50)
private val scheduledTasks = ConcurrentHashMap<WebSocketSession, ScheduledFuture<*>>()

override fun afterConnectionEstablished(session: WebSocketSession) {
    // Return immediately; the waiting loop runs in a coroutine.
    handlerScope.launch {
        handleSignaling(session)
    }
    log.info("size = ${WebSocketConnection.WAIT_USER.size}")
}

private suspend fun handleSignaling(session: WebSocketSession) {
    WebSocketConnection.WAIT_USER.add(session)
    log.info("size = ${WebSocketConnection.WAIT_USER.size}")
    while (session.isOpen) {
        match(session)
        // delay suspends the coroutine without blocking a pool thread.
        delay(1000)
    }
}
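The reason this variant is not limited by the pool size in the same way as the classic one is that `delay` suspends the coroutine and gives the thread back, whereas `Thread.sleep` keeps the thread occupied. A minimal sketch of that difference follows, assuming `kotlinx-coroutines-core` is on the classpath; `runWaiters` is a hypothetical helper and the numbers are arbitrary.

```kotlin
import java.util.concurrent.Executors
import kotlin.system.measureTimeMillis
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.delay
import kotlinx.coroutines.joinAll
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Launches `count` coroutines on a pool of `threads` workers, each waiting
// `waitMs` with delay, and returns the total wall-clock time in milliseconds.
fun runWaiters(count: Int, threads: Int, waitMs: Long): Long {
    val dispatcher = Executors.newFixedThreadPool(threads).asCoroutineDispatcher()
    try {
        return measureTimeMillis {
            runBlocking {
                // delay suspends instead of blocking, so all waits overlap
                // even though there are far fewer threads than coroutines.
                val jobs = List(count) { launch(dispatcher) { delay(waitMs) } }
                jobs.joinAll()
            }
        }
    } finally {
        dispatcher.close() // shuts down the underlying executor
    }
}
```

With 1000 waiters of 200 ms on 4 threads, blocking sleeps would need about 250 rounds (around 50 seconds), while the suspending version should complete in well under a few seconds.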
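The WebSocket open max and avg values measured on server B are not very meaningful (they reflect the results between server A and the client).
Throughput: Scheduler ≥ Coroutine > Classic
Memory usage: Classic > Coroutine > Scheduler
CPU utilization: no meaningful difference
The classic approach is excluded because it cannot make progress once the number of threads is exceeded.
Throughput: Scheduler ≥ Coroutine
Memory usage: Coroutine > Scheduler
I benchmarked different ways of handling a connection once the WebSocket handshake has completed. The results will vary considerably with the internal implementation, but I think the metrics that came out are fairly meaningful. If the application needs to sustain a large number of connections, options such as WebFlux are well worth considering.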