I'm trying to sink some pretty large topics from Kafka (5 topics with about 250 million events each) into BigQuery via a separate, fairly large Kafka Connect cluster (3 nodes with 8 CPUs and 32 GB RAM each). It starts up fine, but after about 2 minutes the Connect instances' CPUs are pegged at 100% and the nodes start disconnecting; ultimately the whole process restarts with little progress on getting any data into BigQuery.

I tried the same configuration in a replica of our environment with far fewer events (500,000) and it works fine.

Are there any configuration settings that can throttle the processing of events to keep the CPU in check? I tried tuning `queueSize` and `threadPoolSize`, as well as `max.queue.size` and `max.batch.size`, to no avail.
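For concreteness, here is a minimal sketch of the kind of sink config and throttling-related settings in question. All names and values are illustrative, not our actual config, and the `consumer.override.*` lines are an assumption on my part that would require `connector.client.config.override.policy=All` on the workers:

```properties
# Illustrative BigQuery sink connector config (example values only)
name=bigquery-sink-example
connector.class=com.wepay.kafka.connect.bigquery.BigQuerySinkConnector
tasks.max=6
topics=topic-a,topic-b,topic-c,topic-d,topic-e
project=example-gcp-project
defaultDataset=example_dataset
keyfile=/secrets/bq-writer-key.json

# Connector-side buffering/parallelism knobs mentioned above
queueSize=1000
threadPoolSize=10

# Per-connector consumer overrides to limit how much each task pulls per poll
# (assumes the workers allow client config overrides)
consumer.override.max.poll.records=500
consumer.override.max.partition.fetch.bytes=1048576
```

As I understand it, the `consumer.override.*` settings cap how many records and bytes each task fetches per poll, which indirectly limits the work done per cycle, but I'm not sure whether that addresses the CPU saturation here.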
Any hint/help would be very much appreciated!
Here's our config for reference: