So, in one of my apps, while QA testing, something very interesting happened. I observed that, while using Mutex from Kotlin Coroutines, I happened to see two coroutines enter the same mutex at the same time. The problem was, there was a socket creation in this protected-with-mutex method, so one of the object reference was lost, causing a memory leak.
I made sure the only access to the socket object is through the mutex, and nullifying the socket in any way was not performed.
Here is the service code:
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
class SomeAndroidService : LifecycleService() {
private val mutex: Mutex = Mutex()
private var socket: Socket? = null
@Inject
lateinit var getSocketTokenUseCase: GetSocketTokenUseCase
@Inject
lateinit var ioDispatcher: CoroutineDispatcher // same as Dispatchers.IO in GetSocketTokenUseCase
// ignore the method signature here, this is an example code
override fun onStartCommand(): Int {
lifecycleScope.launch(ioDispatcher) {
acquireSocket()?.let {
Log.d("SomeAndroidService", "socket.sendServiceCreatedEvent() called")
it.sendServiceCreatedEvent()
}
}
setupLocationCallbacks()
return START_STICKY
}
private fun setupLocationCallbacks() {
FusedLocationProviderClient.requestLocationUpdates(
someLocationRequest,
object: LocationListener {
override fun onNewLocation(loc: Location) {
lifecycleScope.launch(ioDispatcher) {
acquireSocket()?.let {
Log.d("SomeAndroidService", "socket.sendCurrentLocation() called")
it.sendCurrentLocation(loc)
}
}
}
}
)
}
private suspend fun acquireSocket() = mutex.withLock {
socket ?: createSocket()
}
private suspend fun createSocket(): Socket? {
val socketToken = getSocketTokenUseCase.invoke()
socket = Socket(socketToken)
socket.connect() // asynchronous call, not suspending
Log.d("SomeAndroidService", "0x${Integer.toHexString(System.identityHashCode(socket))} New socket instance created")
return socket
}
}
And here is the class that gets the token for the socket:
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.IO
import kotlinx.coroutines.withContext
class GetSocketTokenUseCase(
private val dispatcher: CoroutineDispatcher = Dispatchers.IO
) {
suspend operator fun invoke(): String {
return withContext(dispatcher) {
// get socket token here with suspending calls from preference data store
}
}
}
And, weirdly enough, here are the logs:
2024-01-14 15:30:03.123 SomeAndroidService 0x7f342642 New socket instance created
2024-01-14 15:30:03.124 SomeAndroidService socket.sendServiceCreatedEvent() called
2024-01-14 15:30:03.126 SomeAndroidService 0x7f2c1023 New socket instance created
2024-01-14 15:30:03.127 SomeAndroidService socket.sendCurrentLocation() called
Notice, "New socket instance created" was executed twice. The method itself checks for nullability, so it should've returned the existing socket for one of the coroutines. But even then, there were 2 instances created, and they have different addresses, which means one of the sockets was lost and remained connected, with no way to disconnect.
I checked:
- It is the same service (in my original logs I also printed out the service address via
System.identityHashCode(this)
which was the same while the socket addresses were different), - It is the same mutex,
- Both methods access the socket through
mutex.withLock
viaacquireSocket()
And yet, these are the logs.
Isn't the whole point of using mutex is to protect multiple threads / coroutines to enter a block of code at the same time, and make it execute only on one thread / coroutine? (I know threads and coroutines are different, but mutex works almost exactly the same here.)
If somebody can give me an explanation as to why two coroutines executed under the same mutex at the same time, I'd be really glad.
Thanks in advance.