RPS is lower when queue-proxy in request path #15627

Open
jokerwenxiao opened this issue Nov 25, 2024 · 4 comments
Labels
kind/question Further information is requested

Comments


jokerwenxiao commented Nov 25, 2024

Same question as #10085.
When I send requests directly to pod-ip:user-container-port, RPS is normal, but when I send them to pod-ip:queue-proxy-port, RPS is lower.

root@master:~# hey -z 60s -c 70 http://172.22.28.105

Summary:
  Total:        60.0059 secs
  Slowest:      0.2129 secs
  Fastest:      0.0007 secs
  Average:      0.0064 secs
  Requests/sec: 10906.5445

  Total data:   53665474 bytes
  Size/request: 82 bytes

Response time histogram:
  0.001 [1]     |
  0.022 [654035]        |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.043 [313]   |
  0.064 [74]    |
  0.086 [3]     |
  0.107 [1]     |
  0.128 [11]    |
  0.149 [13]    |
  0.170 [0]     |
  0.192 [5]     |
  0.213 [1]     |


Latency distribution:
  10% in 0.0057 secs
  25% in 0.0061 secs
  50% in 0.0064 secs
  75% in 0.0066 secs
  90% in 0.0070 secs
  95% in 0.0072 secs
  99% in 0.0090 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0000 secs, 0.0007 secs, 0.2129 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0071 secs
  resp wait:    0.0063 secs, 0.0004 secs, 0.2128 secs
  resp read:    0.0001 secs, 0.0000 secs, 0.0280 secs

Status code distribution:
  [200] 654457 responses



root@master:~# hey -z 60s -c 70 http://172.22.28.105:8012

Summary:
  Total:        60.0100 secs
  Slowest:      0.2062 secs
  Fastest:      0.0011 secs
  Average:      0.0112 secs
  Requests/sec: 6232.7635

  Total data:   30670296 bytes
  Size/request: 82 bytes

Response time histogram:
  0.001 [1]     |
  0.022 [354717]        |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.042 [19194] |■■
  0.063 [38]    |
  0.083 [35]    |
  0.104 [1]     |
  0.124 [2]     |
  0.145 [1]     |
  0.165 [2]     |
  0.186 [28]    |
  0.206 [9]     |


Latency distribution:
  10% in 0.0062 secs
  25% in 0.0073 secs
  50% in 0.0095 secs
  75% in 0.0141 secs
  90% in 0.0190 secs
  95% in 0.0217 secs
  99% in 0.0257 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0000 secs, 0.0011 secs, 0.2062 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0097 secs
  resp wait:    0.0111 secs, 0.0010 secs, 0.2061 secs
  resp read:    0.0000 secs, 0.0000 secs, 0.0282 secs

Status code distribution:
  [200] 374028 responses
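
As a rough sanity check on the two runs above: hey with `-c 70` keeps 70 requests in flight, so throughput should be close to the worker count divided by the average latency. The sketch below applies that relation to the reported averages; `closedLoopRPS` is a hypothetical helper name, and the model ignores client-side overhead, so treat the output as an approximation only.

```go
package main

import "fmt"

// closedLoopRPS estimates throughput for a closed-loop load generator:
// each worker issues one request at a time, so throughput is roughly
// workers / average latency (Little's law).
func closedLoopRPS(workers int, avgLatencySecs float64) float64 {
	return float64(workers) / avgLatencySecs
}

func main() {
	const workers = 70 // from `hey -c 70`

	// Average latencies are copied from the two hey summaries.
	direct := closedLoopRPS(workers, 0.0064) // user-container port
	viaQP := closedLoopRPS(workers, 0.0112)  // queue-proxy port 8012

	fmt.Printf("estimated RPS, direct:      %.0f (measured 10906.5)\n", direct)
	fmt.Printf("estimated RPS, queue-proxy: %.0f (measured 6232.8)\n", viaQP)
	fmt.Printf("measured throughput ratio:  %.2f\n", 6232.7635/10906.5445)
	fmt.Printf("added average latency:      %.1f ms\n", (0.0112-0.0064)*1000)
}
```

Both estimates land close to the measured values, which suggests the lower RPS is what the roughly 4.8 ms of extra average latency per request would produce at a fixed concurrency of 70.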


queue-proxy resource config
  queue-sidecar-cpu-limit: 20000m
  queue-sidecar-cpu-request: 20000m
  queue-sidecar-ephemeral-storage-limit: 2048Mi
  queue-sidecar-ephemeral-storage-request: 2048Mi
  queue-sidecar-memory-limit: 2048Mi
  queue-sidecar-memory-request: 2048Mi

To improve the RPS of queue-proxy, I have set the resource allocation of the queue-proxy container very high. With the default queue-proxy resources from the config-deployment ConfigMap, RPS stays below 100.

Knative version: v1.1.2

@jokerwenxiao jokerwenxiao added the kind/question Further information is requested label Nov 25, 2024
Contributor

skonto commented Nov 26, 2024

Hi @jokerwenxiao, in cases of CPU contention you may have to configure QP with higher resources; see this old ticket as well. What is the user container doing (is it a helloworld app or something more computationally expensive)? What resources are assigned to the user container? What is the CPU utilization on the node where the pod is running?
By the way, when you hit the QP your request goes through one more hop and two containers are using the CPU, so there is some overhead anyway.
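
To make the "one more hop" point concrete, here is a minimal local sketch using only the Go standard library: a stand-in backend plus a generic reverse proxy in front of it. This is not queue-proxy's actual implementation; `fetch` and `newBackendAndProxy` are hypothetical helpers, and a single timed request is noisy, so point a load generator at both URLs for a meaningful comparison.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"time"
)

// fetch issues one GET and returns the response body as a string.
func fetch(target string) (string, error) {
	resp, err := http.Get(target)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// newBackendAndProxy starts a stand-in "user container" and a generic
// reverse proxy that forwards every request to it (the extra hop).
func newBackendAndProxy() (backend, proxy *httptest.Server) {
	backend = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "response from backend")
	}))
	target, err := url.Parse(backend.URL)
	if err != nil {
		panic(err)
	}
	proxy = httptest.NewServer(httputil.NewSingleHostReverseProxy(target))
	return backend, proxy
}

func main() {
	backend, proxy := newBackendAndProxy()
	defer backend.Close()
	defer proxy.Close()

	// Same handler, reached directly and through the proxy hop.
	for _, u := range []string{backend.URL, proxy.URL} {
		start := time.Now()
		body, err := fetch(u)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %q in %v\n", u, body, time.Since(start))
	}
}
```

Every proxied request is parsed, forwarded over a second connection, and copied back, so the proxy process consumes CPU in addition to the backend.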

@jokerwenxiao
Author

Hi @skonto
this is my user-container code:

// main.go
package main

import (
	"log"
	"os"

	"github.com/valyala/fasthttp"
)

// requestHandler replies with the pod hostname and the "param" query argument.
func requestHandler(ctx *fasthttp.RequestCtx) {
	args := ctx.QueryArgs()
	hostname := os.Getenv("HOSTNAME")
	ctx.WriteString("response from host " + hostname + ", query parameter is " + string(args.Peek("param")))
}

func main() {
	// Serve on port 80, the user-container port.
	address := ":80"
	log.Printf("Starting server on %s", address)
	if err := fasthttp.ListenAndServe(address, requestHandler); err != nil {
		log.Fatalf("Error starting server: %s", err)
	}
}

user-container resource:

    Limits:
      cpu:     2
      memory:  4G
    Requests:
      cpu:     2
      memory:  4G

user-container cpu utilization:
[Image: Screenshot 2024-11-27 08 58 01]

Contributor

skonto commented Nov 27, 2024

Could you run kubectl describe node <node> and list the utilization of the running pods? How many CPUs do you have on the node? Could you also run the user container with lower resources and report back (it seems you are allocating a lot for the user container)? In general, QP does several things, e.g. proxying, draining requests, emitting metrics, etc. That means there is a penalty to pay, which is why queue.sidecar.serving.knative.dev/resource-percentage was introduced in the past, with some upper bound, to give flexibility in how much is allocated to QP relative to the user container's resources.
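
The idea behind a percentage-based setting with a bound can be sketched as follows. This only illustrates the arithmetic; it is not Knative's code, `sidecarMilliCPU` is a hypothetical helper, and the 25m/100m bounds are made-up placeholders rather than Knative's actual limits.

```go
package main

import "fmt"

// sidecarMilliCPU returns percent% of the user container's CPU (in
// millicores), clamped to the range [minMilli, maxMilli].
func sidecarMilliCPU(userMilli, percent, minMilli, maxMilli int64) int64 {
	v := userMilli * percent / 100
	if v < minMilli {
		return minMilli
	}
	if v > maxMilli {
		return maxMilli
	}
	return v
}

func main() {
	const userMilli = 2000 // user container in this issue: cpu 2

	// Hypothetical bounds, chosen only for illustration.
	const minMilli, maxMilli = 25, 100

	for _, percent := range []int64{1, 3, 10} {
		fmt.Printf("%d%% of %dm -> %dm\n",
			percent, userMilli, sidecarMilliCPU(userMilli, percent, minMilli, maxMilli))
	}
}
```

With a cap like this, raising the percentage stops helping once the bound is reached, so a high-throughput sidecar may need its resources set some other way.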

@aqemia-aymeric-alixe

Hey @skonto, I've just seen in the Knative documentation that the queue.sidecar.serving.knative.dev/resource-percentage annotation is deprecated.
Do you know what the best practice is for Knative v1.16?
Or is the annotation mechanism going to be replaced by another one?
