
ERROR io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export logs. The request could not be executed. Full error message: timeout (logs error with grafana-otel-java-agent) #742

Open
Venture704 opened this issue Oct 11, 2024 · 8 comments


@Venture704

In my Kubernetes cluster, while using the grafana-opentelemetry-java agent as an init container in my Spring Boot apps, I am encountering this issue.
Details

Spring Boot version: 2.7.3
JHipster version: 7.9.3
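
For context, the init-container pattern described above typically copies the agent jar into a shared emptyDir volume and points the JVM at it via JAVA_TOOL_OPTIONS. A minimal sketch, assuming hypothetical image names and a hypothetical jar path (not the exact manifests from this issue):

# Sketch: attaching the Grafana OpenTelemetry Java agent via an init container.
# Image names, jar path, and endpoint are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: my-springboot-app
spec:
  volumes:
    - name: otel-agent
      emptyDir: {}
  initContainers:
    - name: copy-otel-agent
      image: my-registry/grafana-otel-javaagent:latest   # hypothetical image containing the agent jar
      command: ["cp", "/agent/grafana-opentelemetry-java.jar", "/otel/"]
      volumeMounts:
        - name: otel-agent
          mountPath: /otel
  containers:
    - name: app
      image: my-registry/my-springboot-app:latest        # hypothetical app image
      env:
        - name: JAVA_TOOL_OPTIONS
          value: "-javaagent:/otel/grafana-opentelemetry-java.jar"
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: "http://grafana-k8s-monitoring-grafana-agent.monitoring.svc.cluster.local:4318"
      volumeMounts:
        - name: otel-agent
          mountPath: /otel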

Full error:
[otel.javaagent 2024-10-11 08:13:14:920 +0000] [BatchLogRecordProcessor_WorkerThread-1] DEBUG io.opentelemetry.sdk.logs.export.BatchLogRecordProcessor$Worker - Exporter failed
[otel.javaagent 2024-10-11 08:13:25:931 +0000] [OkHttp http://grafana-k8s-monitoring-grafana-agent.monitoring.svc.cluster.local:4318/...] ERROR io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export logs. The request could not be executed. Full error message: timeout
java.net.SocketTimeoutException: timeout
at okio.SocketAsyncTimeout.newTimeoutException(JvmOkio.kt:146)
at okio.AsyncTimeout.access$newTimeoutException(AsyncTimeout.kt:186)
at okio.AsyncTimeout$source$1.read(AsyncTimeout.kt:390)
at okio.RealBufferedSource.indexOf(RealBufferedSource.kt:436)
at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.kt:329)
at okhttp3.internal.http1.HeadersReader.readLine(HeadersReader.kt:29)
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:180)
at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:110)
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at io.opentelemetry.exporter.sender.okhttp.internal.RetryInterceptor.intercept(RetryInterceptor.java:91)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:517)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.net.SocketException: Socket closed
at java.net.SocketInputStream.read(SocketInputStream.java:204)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at okio.InputStreamSource.read(JvmOkio.kt:93)
at okio.AsyncTimeout$source$1.read(AsyncTimeout.kt:153)
... 22 more
[otel.javaagent 2024-10-11 08:13:25:932 +0000] [BatchLogRecordProcessor_WorkerThread-1] DEBUG io.opentelemetry.sdk.logs.export.BatchLogRecordProcessor$Worker - Exporter failed
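
One thing worth ruling out with this symptom is the exporter timeout itself: the agent honors the standard OTel SDK environment variables, so the OTLP export timeout can be raised from its 10-second default. A sketch of the relevant container env entries (values are illustrative):

env:
  - name: OTEL_EXPORTER_OTLP_TIMEOUT        # timeout for all signals, in milliseconds
    value: "30000"
  - name: OTEL_EXPORTER_OTLP_LOGS_TIMEOUT   # or override only the logs exporter
    value: "30000"

If the backend is genuinely overloaded, though, raising the timeout only delays the failure, which matches the resolution found later in this thread.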

@zeitlinger
Member

@maryliag can we use the otel tester?

@maryliag

Yes!
@Venture704 we have a tool called otel-checker that currently has a few checks, such as permissions, correct environment variables, and exporter types, which can help identify any basic issues with your setup.

Are you able to install and run a Go binary in your cluster? If so, the steps can be found here: https://github.com/grafana/otel-checker

@Venture704
Author

Hi @maryliag, I am not allowed to use otel-checker; however, I found the issue. It was caused by Loki limits: with a large number of logs flowing into Loki (approx. 50k/sec), the Grafana Alloy buffer was filling up, so the "Failed to export logs" message kept repeating. As soon as I increased the Loki limits, it started working fine. The config I found online is:
retention_period: 72h
enforce_metric_name: false
reject_old_samples: true
reject_old_samples_max_age: 168h
max_cache_freshness_per_query: 10m
split_queries_by_interval: 15m
# for big logs tune
per_stream_rate_limit: 512M
per_stream_rate_limit_burst: 1024M
cardinality_limit: 200000
ingestion_burst_size_mb: 1000
ingestion_rate_mb: 10000
max_entries_limit_per_query: 1000000
max_label_value_length: 20480
max_label_name_length: 10240
max_label_names_per_series: 300
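
For reference, in the grafana/loki Helm chart these settings belong under loki.limits_config in the values file; a minimal placement sketch (keys taken from the block above, exact structure may vary between chart versions):

# values.yaml sketch for the grafana/loki Helm chart
loki:
  limits_config:
    retention_period: 72h
    per_stream_rate_limit: 512M
    per_stream_rate_limit_burst: 1024M
    ingestion_rate_mb: 10000
    ingestion_burst_size_mb: 1000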

@Venture704
Author

@maryliag however I ran into another problem. I want to increase my loki-write pods as log volume will grow, but now the Loki pods are not reaching the ready state. The error message is: Readiness probe failed: HTTP probe failed with statuscode: 503.
The chart I am using for Loki is https://github.com/grafana/loki/blob/main/production/helm/loki/Chart.yaml
What could be the issue now?
Thanks
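
For anyone scaling the same chart: in the simple-scalable deployment the write replica count is a chart value, sketched below with illustrative numbers. Note that a 503 from the readiness endpoint often just means the component is still starting up or joining the ring, so the logs of the unready loki-write pods are the first place to look.

# values.yaml sketch for the grafana/loki Helm chart (numbers illustrative)
write:
  replicas: 5            # scale out the write path
  resources:
    requests:
      cpu: "1"
      memory: 2Gi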

@maryliag

@Venture704 I've reached out to the loki team to get some help.
In the meantime, @zeitlinger, are you aware of any issues with the Java agent that would cause this?

@zeitlinger
Member

I can't think of anything in the Java agent that would cause this.

@Venture704
Author

Hi @zeitlinger, I just want to know whether there is any way to decrease the response time of the OTLP receiver of Grafana Alloy, as I have over 200 pods (Spring Boot 2.7) sending OTLP data to it. I will use otel-checker very soon.
The error is the same: [otel.javaagent 2024-12-04 14:17:23:900 +0000] [OkHttp http://grafana-k8s-monitoring-grafana-agent.monitoring.svc.cluster.local:4318/...] ERROR io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export logs. The request could not be executed. Full error message: timeout
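
On the agent side, export pressure from many pods can also be reduced by tuning the batch log record processor through the standard SDK environment variables; a sketch of the container env entries (values illustrative, not a recommendation from this thread):

env:
  - name: OTEL_BLRP_SCHEDULE_DELAY          # ms between exports; a larger delay means bigger, fewer requests
    value: "5000"
  - name: OTEL_BLRP_MAX_EXPORT_BATCH_SIZE   # log records per export request
    value: "1024"
  - name: OTEL_BLRP_MAX_QUEUE_SIZE          # buffered records before new ones are dropped
    value: "4096"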

@zeitlinger
Member

just want to know that is there any way by which i can decrease the response time of the OTLP receiver of the grafana alloy

Sounds like a great question for https://github.com/grafana/alloy
