
client: Cache tikv request in tidb client side #1098

Merged: 33 commits into tikv:master on Feb 21, 2024

Conversation

@bufferflies (Contributor) commented Dec 28, 2023

Closes #1099

1. All requests are put into a priority queue (a rough sketch of this queue is shown below).
2. Requests are not canceled just because the client is temporarily unavailable; a request is removed from the queue only when the caller has already canceled it, e.g. due to a timeout or server shutdown.
3. Add a new config option to limit the number of concurrent requests sent to one TiKV store.
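
To make the queueing idea concrete, here is a minimal, self-contained sketch of a per-store priority queue whose take step honors a concurrency limit and drops caller-canceled entries. It only illustrates the approach described above, not the actual client-go implementation; the names entry, priorityQueue, and take are assumptions.

```go
package main

import (
	"container/heap"
	"fmt"
)

// entry stands in for one queued TiKV request.
type entry struct {
	priority uint64 // larger value = higher priority
	canceled bool   // set when the caller times out or shuts down
	payload  string
}

// priorityQueue orders entries so the highest priority is popped first.
type priorityQueue []*entry

func (q priorityQueue) Len() int            { return len(q) }
func (q priorityQueue) Less(i, j int) bool  { return q[i].priority > q[j].priority }
func (q priorityQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *priorityQueue) Push(x interface{}) { *q = append(*q, x.(*entry)) }
func (q *priorityQueue) Pop() interface{} {
	old := *q
	e := old[len(old)-1]
	*q = old[:len(old)-1]
	return e
}

// take dequeues at most limit live entries; entries already canceled by the
// caller are removed from the queue instead of being sent.
func take(q *priorityQueue, limit int) []*entry {
	var batch []*entry
	for len(batch) < limit && q.Len() > 0 {
		e := heap.Pop(q).(*entry)
		if e.canceled {
			continue
		}
		batch = append(batch, e)
	}
	return batch
}

func main() {
	q := &priorityQueue{}
	heap.Init(q)
	heap.Push(q, &entry{priority: 16, payload: "rg1 (high)"})
	heap.Push(q, &entry{priority: 8, payload: "rg2 (medium)"})
	heap.Push(q, &entry{priority: 8, payload: "rg2 (medium), canceled", canceled: true})
	for _, e := range take(q, 2) {
		fmt.Println("send:", e.payload)
	}
}
```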

Manual Test

rg1 is high priority, rg2 is medium priority.
The high-priority workload always uses 5 connections,
while the medium-priority workload is tested with [5, 20, 50, 100, 200] connections.
default

[image]

max-concurrency-request-limit = 30
[image]

Performance between master and this PR
[image]

The high-priority SQL runs faster than the medium-priority SQL after setting max-concurrency-request-limit = 30.

Performance comparison
[image]

@bufferflies changed the title from "Cache tikv request in tikv side" to "client: Cache tikv request in tikv side" on Dec 28, 2023
@bufferflies bufferflies marked this pull request as ready for review January 3, 2024 09:00
tikvrpc/tikvrpc.go (outdated review thread, resolved)
internal/client/client_batch.go (outdated review thread, resolved)
@bufferflies bufferflies force-pushed the cache_request branch 2 times, most recently from a74f4f9 to 5ebf055 Compare January 15, 2024 09:30
integration_tests/2pc_test.go (outdated review thread, resolved)
integration_tests/lock_test.go (outdated review thread, resolved)
internal/client/client_batch.go (outdated review thread, resolved)
if collect != nil {
	collect(b.idAlloc, e)
}
for (count < limit && b.entries.Len() > 0) || b.hasHighPriorityTask() {
Contributor:

Is it possible to disable the limit-related code path when limit is MaxInt64?

Contributor Author:

It behaves the same when the limit is MaxInt64: in that case the take operation simply returns all elements in the queue.
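
A self-contained illustration of this point, with a plain slice standing in for the real entries structure (the helper below is an assumption, not the actual client-go code): once the limit is math.MaxInt64, the count guard can never stop the loop before the queue is empty, so the limited path degenerates into the unlimited one.

```go
package main

import (
	"fmt"
	"math"
)

// take drains at most `limit` entries from the queue.
func take(queue []string, limit int64) []string {
	var taken []string
	var count int64
	for count < limit && len(queue) > 0 {
		taken = append(taken, queue[0])
		queue = queue[1:]
		count++
	}
	return taken
}

func main() {
	q := []string{"req-1", "req-2", "req-3"}
	fmt.Println(take(q, 2))             // limited: [req-1 req-2]
	fmt.Println(take(q, math.MaxInt64)) // "unlimited": [req-1 req-2 req-3]
}
```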

internal/client/client_batch.go (outdated review thread, resolved)
@cfzjywxk (Contributor):

@crazycs520 PTAL

metrics.TiKVNoAvailableConnectionCounter.Inc()

// Please ensure the error is handled in region cache correctly.
a.reqBuilder.cancel(errors.New("no available connections"))
Contributor:

After this change these requests are not canceled; they are only retried when a new request arrives, so they will be blocked if no new requests come in. Is that the intended behavior?
Another issue: if maxConcurrencyRequestLimit is not very large, the request builder can cache a lot of requests when the incoming request volume is high, which may lead to issues such as OOM.

Contributor Author:

> After this change these requests are not canceled; they are only retried when a new request arrives, so they will be blocked if no new requests come in. Is that the intended behavior?

Yes, a request may be blocked when no new requests arrive and the configured limit is small. In that case it will time out and then be retried. I will fix this with a notification mechanism.

> Another issue: if maxConcurrencyRequestLimit is not very large, the request builder can cache a lot of requests when the incoming request volume is high, which may lead to issues such as OOM.

Yes, but that can also happen with the original logic; the request object can be GCed after the response is received.

Contributor:

For the first issue, we may handle it in fetchAllPendingRequests. That is, we can skip waiting for headEntry when entries is not empty.

Contributor Author:

That may cause a busy loop if no client has returned a send token. I will optimize it by using a channel to notify the sender to send requests again.
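
A hedged sketch of that notification idea, using a token channel so the sender wakes up when a concurrency slot is released rather than only when a new request happens to arrive; the type and channel names are illustrative assumptions, not the actual client-go identifiers.

```go
package main

import (
	"fmt"
	"time"
)

// sender holds queued entries and a channel of available concurrency tokens.
type sender struct {
	pending chan string   // requests waiting to be sent
	tokens  chan struct{} // one element per available concurrency slot
}

// loop blocks on a token before sending each pending request, so it neither
// busy-loops when no slot is free nor depends on a new incoming request to
// wake it up: returning a token is itself the wake-up signal.
func (s *sender) loop() {
	for req := range s.pending {
		<-s.tokens // wait until a concurrency slot is released
		fmt.Println("send:", req)
	}
}

// releaseToken is called when a response arrives and frees a slot.
func (s *sender) releaseToken() {
	s.tokens <- struct{}{}
}

func main() {
	s := &sender{
		pending: make(chan string, 16),
		tokens:  make(chan struct{}, 1),
	}
	s.tokens <- struct{}{} // start with one free slot
	go s.loop()

	s.pending <- "req-1" // consumes the only token
	s.pending <- "req-2" // no slot free, so it stays queued
	time.Sleep(10 * time.Millisecond)

	s.releaseToken() // a response frees a slot; req-2 is sent without any new request arriving
	time.Sleep(10 * time.Millisecond)
}
```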

internal/client/client_batch.go (review thread, resolved)
internal/client/client_batch.go (outdated review thread, resolved)

@bufferflies changed the title from "client: Cache tikv request in tikv side" to "client: Cache tikv request in tidb client side" on Feb 2, 2024
reasons = append(reasons, SendFailedReasonTryLockForSendFail)
}
} else {
reasons = append(reasons, SendFailedReasonTryLockForSendFail)
Contributor:

Suggested change:
- reasons = append(reasons, SendFailedReasonTryLockForSendFail)
+ reasons = append(reasons, SendFailedReasonNoAvailableLimit)

@nolouch (Contributor) commented Feb 6, 2024

Can we merge it?

@@ -89,6 +90,9 @@ type TiKVClient struct {
// TTLRefreshedTxnSize controls whether a transaction should update its TTL or not.
TTLRefreshedTxnSize int64 `toml:"ttl-refreshed-txn-size" json:"ttl-refreshed-txn-size"`
ResolveLockLiteThreshold uint64 `toml:"resolve-lock-lite-threshold" json:"resolve-lock-lite-threshold"`
// MaxConcurrencyRequestLimit is the max number of concurrent requests that can be sent to one TiKV store.
// 0 means auto adjust by feedback.
MaxConcurrencyRequestLimit int64 `toml:"max-concurrency-request-limit" json:"max-concurrency-request-limit"`
Contributor:

This would be a public TiDB configuration; should it be approved by a PM member according to the current process requirements?
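
For reference, a hedged example of how the new option might be set in a TiDB configuration file; the [tikv-client] section placement is an assumption based on the toml tags in the diff above, and 0 keeps the default feedback-based auto adjustment.

```toml
[tikv-client]
# Assumed placement based on the `toml:"max-concurrency-request-limit"` tag above.
# Caps concurrent in-flight requests per TiKV store; 0 means auto adjust by feedback.
max-concurrency-request-limit = 30
```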

@bufferflies (Contributor Author):

> Can we merge it?

Waiting for the TPC-C performance test.

@bufferflies bufferflies merged commit 824302a into tikv:master Feb 21, 2024
10 checks passed
crazycs520 added a commit to crazycs520/client-go that referenced this pull request May 22, 2024
crazycs520 added a commit to crazycs520/client-go that referenced this pull request May 22, 2024
Successfully merging this pull request may close these issues.

cache tikv request in client side
6 participants