
client: Cache tikv request in tidb client side #1098

Merged: 33 commits into tikv:master on Feb 21, 2024

Conversation

@bufferflies (Contributor) commented Dec 28, 2023

Closes #1099

1. All requests are put into a priority queue (a rough sketch of this queue is shown below).
2. Requests are not canceled just because the client is temporarily unavailable; a request is removed from the queue only when the caller has already canceled it, e.g. due to a timeout or server shutdown.
3. Add a new config option to limit the number of concurrent requests sent to one TiKV store.
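
To make the queueing idea concrete, here is a minimal, self-contained sketch of a per-store priority queue whose take step honors a concurrency limit and drops caller-canceled entries. It only illustrates the approach described above, not the actual client-go implementation; the names entry, priorityQueue, and take are assumptions.

```go
package main

import (
	"container/heap"
	"fmt"
)

// entry stands in for one queued TiKV request.
type entry struct {
	priority uint64 // larger value = higher priority
	canceled bool   // set when the caller times out or shuts down
	payload  string
}

// priorityQueue orders entries so the highest priority is popped first.
type priorityQueue []*entry

func (q priorityQueue) Len() int            { return len(q) }
func (q priorityQueue) Less(i, j int) bool  { return q[i].priority > q[j].priority }
func (q priorityQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *priorityQueue) Push(x interface{}) { *q = append(*q, x.(*entry)) }
func (q *priorityQueue) Pop() interface{} {
	old := *q
	e := old[len(old)-1]
	*q = old[:len(old)-1]
	return e
}

// take dequeues at most limit live entries; entries already canceled by the
// caller are removed from the queue instead of being sent.
func take(q *priorityQueue, limit int) []*entry {
	var batch []*entry
	for len(batch) < limit && q.Len() > 0 {
		e := heap.Pop(q).(*entry)
		if e.canceled {
			continue
		}
		batch = append(batch, e)
	}
	return batch
}

func main() {
	q := &priorityQueue{}
	heap.Init(q)
	heap.Push(q, &entry{priority: 16, payload: "rg1 (high)"})
	heap.Push(q, &entry{priority: 8, payload: "rg2 (medium)"})
	heap.Push(q, &entry{priority: 8, payload: "rg2 (medium), canceled", canceled: true})
	for _, e := range take(q, 2) {
		fmt.Println("send:", e.payload)
	}
}
```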

Manual Test

rg1 is high priority, rg2 is medium priority.
The high-priority workload always uses 5 connections,
while the medium-priority workload is tested with [5, 20, 50, 100, 200] connections.
default

[image]

max-concurrency-request-limit = 30
[image]

Performance between master and this PR
[image]

The high-priority SQL runs faster than the medium-priority SQL after setting max-concurrency-request-limit = 30.

Performance comparison
[image]

@bufferflies changed the title from "Cache tikv request in tikv side" to "client: Cache tikv request in tikv side" on Dec 28, 2023
@bufferflies bufferflies marked this pull request as ready for review January 3, 2024 09:00
tikvrpc/tikvrpc.go (outdated review thread, resolved)
internal/client/client_batch.go (outdated review thread, resolved)
@bufferflies bufferflies force-pushed the cache_request branch 2 times, most recently from a74f4f9 to 5ebf055 Compare January 15, 2024 09:30
integration_tests/2pc_test.go (outdated review thread, resolved)
integration_tests/lock_test.go (outdated review thread, resolved)
internal/client/client_batch.go (outdated review thread, resolved)
if collect != nil {
	collect(b.idAlloc, e)
}
for (count < limit && b.entries.Len() > 0) || b.hasHighPriorityTask() {
Contributor:

Is it possible to disable the limit-related code path when limit is MaxInt64?

Contributor Author:

It behaves the same when the limit is MaxInt64: in that case the take operation simply returns all elements in the queue.
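
A self-contained illustration of this point, with a plain slice standing in for the real entries structure (the helper below is an assumption, not the actual client-go code): once the limit is math.MaxInt64, the count guard can never stop the loop before the queue is empty, so the limited path degenerates into the unlimited one.

```go
package main

import (
	"fmt"
	"math"
)

// take drains at most `limit` entries from the queue.
func take(queue []string, limit int64) []string {
	var taken []string
	var count int64
	for count < limit && len(queue) > 0 {
		taken = append(taken, queue[0])
		queue = queue[1:]
		count++
	}
	return taken
}

func main() {
	q := []string{"req-1", "req-2", "req-3"}
	fmt.Println(take(q, 2))             // limited: [req-1 req-2]
	fmt.Println(take(q, math.MaxInt64)) // "unlimited": [req-1 req-2 req-3]
}
```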

internal/client/client_batch.go (outdated review thread, resolved)
@cfzjywxk (Contributor):

@crazycs520 PTAL

metrics.TiKVNoAvailableConnectionCounter.Inc()

// Please ensure the error is handled in region cache correctly.
a.reqBuilder.cancel(errors.New("no available connections"))
Contributor:

After this change these requests are not canceled; they are only retried when a new request arrives, so they will be blocked if no new requests come in. Is that the intended behavior?
Another issue: if maxConcurrencyRequestLimit is not very large, the request builder can cache a lot of requests when the incoming request volume is high, which may lead to issues such as OOM.

Contributor Author:

> After this change these requests are not canceled; they are only retried when a new request arrives, so they will be blocked if no new requests come in. Is that the intended behavior?

Yes, a request may be blocked when no new requests arrive and the configured limit is small. In that case it will time out and then be retried. I will fix this with a notification mechanism.

> Another issue: if maxConcurrencyRequestLimit is not very large, the request builder can cache a lot of requests when the incoming request volume is high, which may lead to issues such as OOM.

Yes, but that can also happen with the original logic; the request object can be GCed after the response is received.

Contributor:

For the first issue, we may handle it in fetchAllPendingRequests. That is, we can skip waiting for headEntry when entries is not empty.

Contributor Author:

That may cause a busy loop if no client has returned a send token. I will optimize it by using a channel to notify the sender to send requests again.
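
A hedged sketch of that notification idea, using a token channel so the sender wakes up when a concurrency slot is released rather than only when a new request happens to arrive; the type and channel names are illustrative assumptions, not the actual client-go identifiers.

```go
package main

import (
	"fmt"
	"time"
)

// sender holds queued entries and a channel of available concurrency tokens.
type sender struct {
	pending chan string   // requests waiting to be sent
	tokens  chan struct{} // one element per available concurrency slot
}

// loop blocks on a token before sending each pending request, so it neither
// busy-loops when no slot is free nor depends on a new incoming request to
// wake it up: returning a token is itself the wake-up signal.
func (s *sender) loop() {
	for req := range s.pending {
		<-s.tokens // wait until a concurrency slot is released
		fmt.Println("send:", req)
	}
}

// releaseToken is called when a response arrives and frees a slot.
func (s *sender) releaseToken() {
	s.tokens <- struct{}{}
}

func main() {
	s := &sender{
		pending: make(chan string, 16),
		tokens:  make(chan struct{}, 1),
	}
	s.tokens <- struct{}{} // start with one free slot
	go s.loop()

	s.pending <- "req-1" // consumes the only token
	s.pending <- "req-2" // no slot free, so it stays queued
	time.Sleep(10 * time.Millisecond)

	s.releaseToken() // a response frees a slot; req-2 is sent without any new request arriving
	time.Sleep(10 * time.Millisecond)
}
```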

internal/client/client_batch.go (review thread, resolved)
internal/client/client_batch.go (outdated review thread, resolved)

@bufferflies changed the title from "client: Cache tikv request in tikv side" to "client: Cache tikv request in tidb client side" on Feb 2, 2024
reasons = append(reasons, SendFailedReasonTryLockForSendFail)
}
} else {
reasons = append(reasons, SendFailedReasonTryLockForSendFail)
Contributor:

Suggested change:
- reasons = append(reasons, SendFailedReasonTryLockForSendFail)
+ reasons = append(reasons, SendFailedReasonNoAvailableLimit)

@nolouch (Contributor) commented Feb 6, 2024

Can we merge it?

@@ -89,6 +90,9 @@ type TiKVClient struct {
// TTLRefreshedTxnSize controls whether a transaction should update its TTL or not.
TTLRefreshedTxnSize int64 `toml:"ttl-refreshed-txn-size" json:"ttl-refreshed-txn-size"`
ResolveLockLiteThreshold uint64 `toml:"resolve-lock-lite-threshold" json:"resolve-lock-lite-threshold"`
// MaxConcurrencyRequestLimit is the max number of concurrent requests that can be sent to one TiKV store.
// 0 means auto adjust by feedback.
MaxConcurrencyRequestLimit int64 `toml:"max-concurrency-request-limit" json:"max-concurrency-request-limit"`
Contributor:

This would be a public TiDB configuration; should it be approved by a PM member according to the current process requirements?
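
For reference, a hedged example of how the new option might be set in a TiDB configuration file; the [tikv-client] section placement is an assumption based on the toml tags in the diff above, and 0 keeps the default feedback-based auto adjustment.

```toml
[tikv-client]
# Assumed placement based on the `toml:"max-concurrency-request-limit"` tag above.
# Caps concurrent in-flight requests per TiKV store; 0 means auto adjust by feedback.
max-concurrency-request-limit = 30
```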

@bufferflies (Contributor Author):

> Can we merge it?

Waiting for the TPC-C performance test.

@bufferflies bufferflies merged commit 824302a into tikv:master Feb 21, 2024
10 checks passed
crazycs520 added a commit to crazycs520/client-go that referenced this pull request May 22, 2024
crazycs520 added a commit to crazycs520/client-go that referenced this pull request May 22, 2024
Successfully merging this pull request may close these issues.

cache tikv request in client side
6 participants