This repository has been archived by the owner on Jul 11, 2024. It is now read-only.

Cannot Reconnect: connection already exist #390

Open
suhaibmalik opened this issue Apr 7, 2021 · 6 comments

Comments

@suhaibmalik

I'm getting a sporadic reconnect failure. I can't reliably reproduce it; all I can do is wait for it to happen again.

Logs:

DEBUG: [[ws-e,s:14506,shard:0] heartbeat ACK ok]
DEBUG: [[ws-e,s:14507,shard:0] sent heartbeat]
DEBUG: [[ws-e,s:14508,shard:0] heartbeat ACK ok]
DEBUG: [[ws-e,s:14509,shard:0] sent heartbeat]
INFO: [[ws-e,s:14510,shard:0] heartbeat ACK was not received, forcing reconnect]
DEBUG: [[ws-e,s:14511,shard:0] stopping pulse]
DEBUG: [[ws-e,s:14512,shard:0] is reconnecting]
DEBUG: [[ws-e,s:14513,shard:0] closing emitter]
DEBUG: [[ws-e,s:14514,shard:0] closing receiver]
INFO: [[ws-e,s:14515,shard:0] disconnected]
DEBUG: [[ws-e,s:14516,shard:0] trying to connect]
DEBUG: [[shardSync] shard 0 is waiting to identify]
DEBUG: [[ws-e,s:14517,shard:0] waiting to send identify/resume]
DEBUG: [[ws-e,s:14518,shard:0] starting receiver]
DEBUG: [[ws-e,s:14520,shard:0] starting emitter]
DEBUG: [[ws-e,s:14519,shard:0] Ready to receive operation codes...]
DEBUG: [[ws-e,s:14521,shard:0] closing receiver after read error]
ERROR: [[ws-e,s:14522,shard:0] discord timeout during connect (3 minutes). No idea what went wrong..]
DEBUG: [[shardSync] shard 0 waited and finished execution after 3m0.041366691s]
ERROR: [[ws-e,s:14523,shard:0] establishing connection failed:  websocket connected but was not able to send identify packet within 3 minutes]
INFO: [[ws-e,s:14524,shard:0] next connection attempt in  3s]
DEBUG: [[ws-e,s:14525,shard:0] trying to connect]
ERROR: [[ws-e,s:14526,shard:0] establishing connection failed:  cannot Connect while a connection already exist]
INFO: [[ws-e,s:14527,shard:0] next connection attempt in  7s]
DEBUG: [[ws-e,s:14528,shard:0] trying to connect]
ERROR: [[ws-e,s:14529,shard:0] establishing connection failed:  cannot Connect while a connection already exist]
INFO: [[ws-e,s:14530,shard:0] next connection attempt in  11s]
DEBUG: [[ws-e,s:14531,shard:0] trying to connect]
ERROR: [[ws-e,s:14532,shard:0] establishing connection failed:  cannot Connect while a connection already exist]
INFO: [[ws-e,s:14533,shard:0] next connection attempt in  15s]
DEBUG: [[ws-e,s:14534,shard:0] trying to connect]
ERROR: [[ws-e,s:14535,shard:0] establishing connection failed:  cannot Connect while a connection already exist]
INFO: [[ws-e,s:14536,shard:0] next connection attempt in  19s]
DEBUG: [[ws-e,s:14537,shard:0] trying to connect]
ERROR: [[ws-e,s:14538,shard:0] establishing connection failed:  cannot Connect while a connection already exist]
INFO: [[ws-e,s:14539,shard:0] next connection attempt in  23s]
DEBUG: [[ws-e,s:14540,shard:0] trying to connect]
ERROR: [[ws-e,s:14541,shard:0] establishing connection failed:  cannot Connect while a connection already exist]
INFO: [[ws-e,s:14542,shard:0] next connection attempt in  27s]
DEBUG: [[ws-e,s:14543,shard:0] trying to connect]
...

The workaround is to recreate the pod (container). The new process connects immediately.

Connection Code:

...

client := disgord.New(disgord.Config{
  ProjectName: "Corvis",
  BotToken:    token,
  Logger:      logger,
})
defer client.Gateway().StayConnectedUntilInterrupted()

client.Gateway().BotReady(func() {

...
  • Golang version: v1.16
  • Using Go modules
  • Disgord version: 29f9278

Also, if the issue is sporadic and difficult to resolve in code, it would be better for the process to exit and let the platform recover (e.g. exit the process after x retries).
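
The "exit after x retries" idea can be approximated outside the library by counting consecutive connection failures as they pass through the logger. This is only a minimal sketch under stated assumptions: that the `Logger` given to `disgord.Config` only needs `Debug`, `Info` and `Error` methods taking variadic arguments, and that the message substrings match the logs above. The names `exitAfterLogger`, `maxFailures` and `exit` are hypothetical and not part of disgord.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strings"
	"sync"
)

// exitAfterLogger is a hypothetical wrapper (not part of disgord). It forwards
// log lines to the standard logger and counts consecutive connection failures.
type exitAfterLogger struct {
	mu          sync.Mutex
	failures    int
	maxFailures int
	exit        func(code int) // os.Exit in production, a stub in tests
}

// Debug resets the failure counter when a heartbeat ACK shows the
// connection is healthy again (substring taken from the logs above).
func (l *exitAfterLogger) Debug(v ...interface{}) {
	msg := fmt.Sprint(v...)
	log.Print("DEBUG: ", msg)
	if strings.Contains(msg, "heartbeat ACK ok") {
		l.mu.Lock()
		l.failures = 0
		l.mu.Unlock()
	}
}

func (l *exitAfterLogger) Info(v ...interface{}) {
	log.Print("INFO: ", fmt.Sprint(v...))
}

// Error counts failed connection attempts and calls exit once
// maxFailures consecutive failures have been logged.
func (l *exitAfterLogger) Error(v ...interface{}) {
	msg := fmt.Sprint(v...)
	log.Print("ERROR: ", msg)
	if !strings.Contains(msg, "establishing connection failed") {
		return
	}
	l.mu.Lock()
	l.failures++
	n := l.failures
	l.mu.Unlock()
	if n >= l.maxFailures {
		l.exit(1)
	}
}

func main() {
	// Assumption: this value would be passed as Logger in disgord.Config.
	logger := &exitAfterLogger{maxFailures: 5, exit: os.Exit}
	logger.Info("watchdog logger ready")
}
```

Matching on log text is brittle, so this is a stopgap: a change in the library's wording would silently disable the watchdog.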

@andersfylling
Owner

You aren't alone with this issue. A discord user reported the same.

As for progress on improving the gateway: I've written a new gateway system. It's not done yet, but one of its features is to give you complete control over your shards and let you handle exit codes (if you want to). Otherwise it just runs in the background as normal. It's also much easier to write tests for.

https://github.com/andersfylling/discordgateway

Sadly, I'm uncertain when I will merge this into disgord. It is still missing write capability and proper testing of heartbeats.

@suhaibmalik
Author

@andersfylling All good! Good to know that I'm not missing some low-hanging fix.

I can implement a force-exit for myself in the meantime.

@suhaibmalik
Author

I've been saving debug logs for over a month in the hope of finding a specific error to target with a force-exit. However, I haven't encountered the same disconnect since. My assumption is that the root cause was on Discord's end. I would still like to see better error handling, but if that's already being tracked via another issue or branch (e.g. the gateway rewrite), I recommend closing this issue.

@andersfylling
Owner

There is still the matter of how shards should be implemented in disgord after that discordgateway project is "done". I think I want to allow people to inject a ShardManager so they can easily get complete control. But I'm uncertain how I want to deal with errors: blocking, non-blocking, etc.

@suhaibmalik
Author

My opinion: in a world where production services are orchestrated (Kubernetes, docker-compose, etc.), it's safer to default to crashing an app than to keep running in a deceptive, bad state. If I were in your shoes, I would have any unhandled error crash the app by default (i.e. the default panic handler log.Fatals the app). As long as users have the option to override that handler, you're not closing the door to any particular usage.

@FiHoEco

FiHoEco commented Oct 23, 2023

This has been happening since 2019 and is still not fixed in 2023.


3 participants