Getting StatsD to work in OnlyOffice Kubernetes/Docker 8.2 #781

Open
torsten-simon opened this issue Nov 16, 2024 · 12 comments
Labels: confirmed-bug (Issues with confirmed bugs)
torsten-simon commented Nov 16, 2024

Do you want to request a feature or report a bug?

Bug

What is the current behavior?

StatsD / Metrics not working

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem.

metrics:
  enabled: true
log:
  level: ALL

In addition to that, since I had issues with the METRICS_ENABLED env variable in previous versions (7.5): add a secret for local-production-linux.json with the following contents:

    {
        "statsd": {
            "useMetrics": true,
            "host": "statsd-exporter-prometheus-statsd-exporter",
            "port": "8125",
            "prefix": "ds."
        }
    }

(I can confirm that the docserver is parsing this file since it fails with an error on startup if I put in a syntax error on purpose.)
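For completeness, the secret was created along these lines (a minimal sketch; the secret name and the way the chart mounts it into the docservice pods are specific to my setup):

    # save the JSON above as local-production-linux.json, then create the
    # secret that gets mounted into the docservice pods (name is an assumption)
    kubectl create secret generic local-production-linux \
      --from-file=local-production-linux.json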

What is the expected behavior?

Metrics being pushed to statsd in the cluster, and log messages like "Flushing stats at" appearing.

Did this work in previous versions of DocumentServer?
Yes, last tested on a Docker-based install with onlyoffice/documentserver-ee:7.5

DocumentServer Docker tag:
onlyoffice/docs-docservice-ee:8.2.0-1 (also tried with "de" variant)
Helm chart version 4.3.0

Host Operating System:
Kubernetes Rancher / Ubuntu Worker

igwyd (Member) commented Nov 19, 2024

Hello @torsten-simon, metrics generally work. I checked it on the latest Helm release v4.4.0 with Grafana, but there is an issue #71746 with some of them: no data in Authorizations, Get lock and Saving Changes. Do you have the same?

torsten-simon (Author) commented Nov 19, 2024

Hi @igwyd, thanks for taking a look!

Unfortunately, the only stats I'm able to retrieve come from the statsd exporter's own metrics.

Could you confirm that there should be log messages when stats are flushed by the docserver (I've set the log level to ALL)? I can't see any in 8.2, but I could see them in 7.5.

We're mostly interested in the editing and viewing stats (ds_expireDoc_connections_edit and ds_expireDoc_connections_view), to analyze current usage and verify whether the license is "large" enough.

igwyd (Member) commented Nov 19, 2024

Maybe it's because you added your local-production-linux.json. I didn't do that; I only changed metrics.enabled: true and installed statsd.
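For comparison, my minimal setup was roughly the following (a sketch; the exporter chart is from the prometheus-community repo, and the ONLYOFFICE release/chart names are assumptions that may differ in your install):

    # install the statsd exporter (this produces the
    # statsd-exporter-prometheus-statsd-exporter service mentioned above)
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm install statsd-exporter prometheus-community/prometheus-statsd-exporter

    # enable metrics in the ONLYOFFICE Docs chart (release/chart names assumed)
    helm upgrade documentserver onlyoffice/docs --set metrics.enabled=true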

Rita-Bubnova added the confirmed-bug label Nov 20, 2024
torsten-simon (Author) commented

Hi @igwyd, I tried both metrics.enabled: true and local-production-linux.json; in both cases I don't see any log messages about pushed stats. (metrics.enabled: true also doesn't have an effect on the value in the regular JSON config files, although I'm not sure whether that is the correct behaviour.)

igwyd (Member) commented Nov 28, 2024

Can you show the env for the docservice process? cat /proc/7/environ (where 7 is the docservice PID).
There will be a statsd object, and what is specified in it takes priority over what is passed in the config.
The values in env are taken from here.

PS: I don't know about metrics logging in K8s yet, but I will let you know.
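To make the NUL-separated environ file readable, something like this helps (a small sketch; PID 7 is from the example above and may differ on your pod):

    # /proc/<pid>/environ separates variables with NUL bytes; translate them
    # to newlines and filter for the statsd-related entries
    cat /proc/7/environ | tr '\0' '\n' | grep -i statsd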

torsten-simon (Author) commented

Hi @igwyd, thanks for coming back and sorry for the delay.

In my case, the metrics config shows up as expected as well:
[screenshot]

Hope that helps to further investigate it.

igwyd (Member) commented Dec 9, 2024

This is fine. Let's see kubectl get svc: is there a statsd-exporter-prometheus-statsd-exporter here?

kubectl get pods: are there statsd-exporter-prometheus-statsd-exporter-xxxxx-xxxxx and prometheus-server-xxxxx-xxxxx, and are they running?
[screenshot]

PS: please attach screenshots.

torsten-simon (Author) commented Dec 12, 2024

Hi @igwyd,

kubectl get svc output:

NAME                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                 AGE
prometheus-kube-state-metrics                ClusterIP   10.43.47.156    <none>        8080/TCP                                11m
prometheus-prometheus-node-exporter          ClusterIP   10.43.189.117   <none>        9100/TCP                                11m
prometheus-prometheus-pushgateway            ClusterIP   10.43.26.11     <none>        9091/TCP                                11m
prometheus-server                            ClusterIP   10.43.149.57    <none>        80/TCP                                  11m
redis-headless                               ClusterIP   None            <none>        6379/TCP                                28d
redis-master                                 ClusterIP   10.43.100.110   <none>        6379/TCP                                28d
statsd-exporter-prometheus-statsd-exporter   ClusterIP   10.43.164.118   <none>        9102/TCP,8126/TCP,8125/UDP              27d

kubectl get pods:

NAME                                                          READY   STATUS    RESTARTS   AGE
prometheus-kube-state-metrics-7df7665df9-8pbcz                1/1     Running   0          12m
prometheus-prometheus-node-exporter-2g8jk                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-47q5t                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-6crpd                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-74xj6                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-8x5tl                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-fs99w                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-fz44q                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-hzwcw                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-jgtdr                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-kgp4q                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-lvfqk                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-plwfl                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-qhgpp                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-w5sjq                     1/1     Running   0          12m
prometheus-prometheus-node-exporter-w7w8m                     1/1     Running   0          12m
prometheus-prometheus-pushgateway-6f8f9dd848-2l7kq            1/1     Running   0          12m
prometheus-server-6977d97578-rhv76                            2/2     Running   0          2m2s
redis-master-0                                                1/1     Running   0          28d
statsd-exporter-prometheus-statsd-exporter-6c8587bf54-ptzhg   1/1     Running   0          27d

Everything is running.
[Rancher screenshot (Pods)]

igwyd (Member) commented Dec 13, 2024

Open the document, edit it, and after a while go to some pod that has curl and execute curl statsd-exporter-prometheus-statsd-exporter:9102/metrics | grep -i ds_. Are there any ds metrics? For example, like these:
[screenshot]

torsten-simon (Author) commented Jan 5, 2025

Hi @igwyd,

sorry again for the long delay (holiday season), and a happy new year!

When running the command, I can indeed see some results:

# HELP ds_coauth_convertservice Metric autogenerated by statsd_exporter.
# TYPE ds_coauth_convertservice summary
ds_coauth_convertservice{quantile="0.5"} NaN
ds_coauth_convertservice{quantile="0.9"} NaN
ds_coauth_convertservice{quantile="0.99"} NaN
ds_coauth_convertservice_sum 0.11300000000000004
ds_coauth_convertservice_count 24
# HELP ds_coauth_openDocument_open Metric autogenerated by statsd_exporter.
# TYPE ds_coauth_openDocument_open summary
ds_coauth_openDocument_open{quantile="0.5"} 0.005
ds_coauth_openDocument_open{quantile="0.9"} 0.022
ds_coauth_openDocument_open{quantile="0.99"} 0.022
ds_coauth_openDocument_open_sum 1.0640000000000003
ds_coauth_openDocument_open_count 78
# HELP ds_coauth_saveAfterEditingSessionClosed Metric autogenerated by statsd_exporter.
# TYPE ds_coauth_saveAfterEditingSessionClosed summary
ds_coauth_saveAfterEditingSessionClosed{quantile="0.5"} NaN
ds_coauth_saveAfterEditingSessionClosed{quantile="0.9"} NaN
ds_coauth_saveAfterEditingSessionClosed{quantile="0.99"} NaN
ds_coauth_saveAfterEditingSessionClosed_sum 64.499
ds_coauth_saveAfterEditingSessionClosed_count 10
# HELP ds_coauth_saveFromChanges Metric autogenerated by statsd_exporter.
# TYPE ds_coauth_saveFromChanges summary
ds_coauth_saveFromChanges{quantile="0.5"} NaN
ds_coauth_saveFromChanges{quantile="0.9"} NaN
ds_coauth_saveFromChanges{quantile="0.99"} NaN
ds_coauth_saveFromChanges_sum 0.23700000000000007
ds_coauth_saveFromChanges_count 15
# HELP ds_coauth_session_edit Metric autogenerated by statsd_exporter.
# TYPE ds_coauth_session_edit summary
ds_coauth_session_edit{quantile="0.5"} NaN
ds_coauth_session_edit{quantile="0.9"} NaN
ds_coauth_session_edit{quantile="0.99"} NaN
ds_coauth_session_edit_sum 104217.62400000001
ds_coauth_session_edit_count 28
# HELP ds_coauth_session_view Metric autogenerated by statsd_exporter.
# TYPE ds_coauth_session_view summary
ds_coauth_session_view{quantile="0.5"} 52.139
ds_coauth_session_view{quantile="0.9"} 52.139
ds_coauth_session_view{quantile="0.99"} 52.139
ds_coauth_session_view_sum 48073.51900000001
ds_coauth_session_view_count 53
# HELP ds_conv_allconvert Metric autogenerated by statsd_exporter.
# TYPE ds_conv_allconvert summary
ds_conv_allconvert{quantile="0.5"} 0.162
ds_conv_allconvert{quantile="0.9"} 0.162
ds_conv_allconvert{quantile="0.99"} 0.162
ds_conv_allconvert_sum 255.78499999999997
ds_conv_allconvert_count 70
# HELP ds_conv_deleteFolderRecursive Metric autogenerated by statsd_exporter.
# TYPE ds_conv_deleteFolderRecursive summary
ds_conv_deleteFolderRecursive{quantile="0.5"} 0.001
ds_conv_deleteFolderRecursive{quantile="0.9"} 0.001
ds_conv_deleteFolderRecursive{quantile="0.99"} 0.001
ds_conv_deleteFolderRecursive_sum 0.07500000000000005
ds_conv_deleteFolderRecursive_count 70
# HELP ds_conv_downloadFile Metric autogenerated by statsd_exporter.
# TYPE ds_conv_downloadFile summary
ds_conv_downloadFile{quantile="0.5"} 0.096
ds_conv_downloadFile{quantile="0.9"} 0.096
ds_conv_downloadFile{quantile="0.99"} 0.096
ds_conv_downloadFile_sum 218.079
ds_conv_downloadFile_count 28
# HELP ds_conv_downloadFileFromStorage Metric autogenerated by statsd_exporter.
# TYPE ds_conv_downloadFileFromStorage summary
ds_conv_downloadFileFromStorage{quantile="0.5"} NaN
ds_conv_downloadFileFromStorage{quantile="0.9"} NaN
ds_conv_downloadFileFromStorage{quantile="0.99"} NaN
ds_conv_downloadFileFromStorage_sum 1.0999999999999999
ds_conv_downloadFileFromStorage_count 39
# HELP ds_conv_postProcess Metric autogenerated by statsd_exporter.
# TYPE ds_conv_postProcess summary
ds_conv_postProcess{quantile="0.5"} 0.021
ds_conv_postProcess{quantile="0.9"} 0.021
ds_conv_postProcess{quantile="0.99"} 0.021
ds_conv_postProcess_sum 1.677999999999997
ds_conv_postProcess_count 70
# HELP ds_conv_spawnSync Metric autogenerated by statsd_exporter.
# TYPE ds_conv_spawnSync summary
ds_conv_spawnSync{quantile="0.5"} 0.044
ds_conv_spawnSync{quantile="0.9"} 0.044
ds_conv_spawnSync{quantile="0.99"} 0.044
ds_conv_spawnSync_sum 34.79
ds_conv_spawnSync_count 65
# HELP ds_expireDoc_connections_edit Metric autogenerated by statsd_exporter.
# TYPE ds_expireDoc_connections_edit gauge
ds_expireDoc_connections_edit 1
# HELP ds_expireDoc_connections_liveview Metric autogenerated by statsd_exporter.
# TYPE ds_expireDoc_connections_liveview gauge
ds_expireDoc_connections_liveview 0
# HELP ds_expireDoc_connections_view Metric autogenerated by statsd_exporter.
# TYPE ds_expireDoc_connections_view gauge
ds_expireDoc_connections_view 0
go_gc_duration_seconds_sum 5.233183407
go_gc_duration_seconds_count 38675
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 1639.78

Does this mean there is an issue getting the stats into Prometheus itself?

igwyd (Member) commented Jan 13, 2025

I hope the holidays went well)
We see that there is data from statsd. Let's check Prometheus with two requests:

curl http://prometheus-server/api/v1/targets?scrapePool=statsd
curl http://prometheus-server/api/v1/query?query=ds_conv_allconvert_count

In the end you should get something like this:

{"status":"success","data":{"activeTargets":[{"discoveredLabels":{"__address__":"statsd-exporter-prometheus-statsd-exporter:9102","__metrics_path__":"/metrics","__scheme__":"http","__scrape_interval__":"30s","__scrape_timeout__":"10s","job":"statsd"},"labels":{"instance":"statsd-exporter-prometheus-statsd-exporter:9102","job":"statsd"},"scrapePool":"statsd","scrapeUrl":"http://statsd-exporter-prometheus-statsd-exporter:9102/metrics","globalUrl":"http://statsd-exporter-prometheus-statsd-exporter:9102/metrics","lastError":"","lastScrape":"2025-01-13T13:52:30.817877265Z","lastScrapeDuration":0.003627013,"health":"up","scrapeInterval":"30s","scrapeTimeout":"10s"}],

{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"ds_conv_allconvert_count","instance":"statsd-exporter-prometheus-statsd-exporter:9102","job":"statsd"},"value":[1736776436.737,"3"]}]}}

torsten-simon (Author) commented

Hi @igwyd,
Yes, I get output there; however, there are no targets in the list, nor any data:

{"status":"success","data":{"activeTargets":[],"droppedTargets":[],"droppedTargetCounts":{"kubernetes-apiservers":875,"kubernetes-nodes":0,"kubernetes-nodes-cadvisor":0,"kubernetes-pods":578,"kubernetes-pods-slow":590,"kubernetes-service-endpoints":831,"kubernetes-service-endpoints-slow":878,"kubernetes-services":172,"prometheus":0,"prometheus-pushgateway":171}}}

{"status":"success","data":{"resultType":"vector","result":[]}

However, I found out that it works with our cluster's ServiceMonitor, and the Prometheus there is getting stats now!

I can see several metrics starting with ds_, so I guess the issue can be closed :-)
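For anyone else going the Prometheus Operator route, a ServiceMonitor along these lines is what made it work for us (a sketch, not our exact manifest; the labels and port name are assumptions and must match your statsd exporter service and your Prometheus' serviceMonitorSelector):

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: statsd-exporter
      labels:
        release: prometheus    # assumed serviceMonitorSelector label
    spec:
      selector:
        matchLabels:
          app.kubernetes.io/name: prometheus-statsd-exporter    # assumed service label
      endpoints:
        - port: http           # assumed name of the 9102 metrics port
          interval: 30s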

Just one last question:

Does the metric ds_expireDoc_connections_edit equal the number of parallel edits, i.e. what is evaluated against the ordered license size? I'm asking in particular so I can build alerting for when our customer gets close to its license limit, so it can be increased if necessary.
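The kind of alerting rule I have in mind, in case it's useful to others (a minimal sketch; the 100-connection license limit is purely hypothetical, and the rule assumes the gauge really does track concurrent edits):

    groups:
      - name: onlyoffice-license
        rules:
          - alert: OnlyOfficeNearLicenseLimit
            # fire when concurrent edits stay above 80% of a hypothetical
            # 100-connection license for 5 minutes
            expr: ds_expireDoc_connections_edit > 0.8 * 100
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "OnlyOffice concurrent edits are near the license limit"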

Thanks again for your support and effort!
