Add NATS integration and implement PdmV use case #617
Yuyi, I think I found appropriate places for the NATS integration and implemented the McM use case. I suggest profiling the insert APIs with and without NATS, so that you can be confident about whether NATS puts any additional load on DBS. I made NATS fully configurable: if it is not present in the configuration, no NATS interaction happens. To configure NATS you'll need the following parameters in your config file: To include NATS, the DBS spec file will need to be changed to include the CMSMonitoring dependency. I'll prepare a separate PR for that and link it here. Please review, ask questions and let me know. |
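The with/without comparison suggested above could be sketched roughly as follows. This is only a minimal timing harness: `insert_only` and `insert_and_publish` are hypothetical stand-ins for a DBS insert call without and with the NATS hook, not real DBS code, and the numbers it prints say nothing about the actual server.

```python
import time


def mean_seconds_per_call(func, calls=200):
    """Return the mean wall-clock time, in seconds, of one call to func."""
    start = time.perf_counter()
    for _ in range(calls):
        func()
    return (time.perf_counter() - start) / calls


def insert_only():
    # Hypothetical stand-in for a DBS insert API call without NATS.
    sum(range(1000))


def insert_and_publish():
    # Same stand-in work, plus a stand-in for preparing a NATS message.
    sum(range(1000))
    "dataset VALID".encode("utf-8")


baseline = mean_seconds_per_call(insert_only)
with_nats = mean_seconds_per_call(insert_and_publish)
print("without NATS: %.6f s/call" % baseline)
print("with NATS:    %.6f s/call" % with_nats)
```

For a real measurement the two stand-ins would be replaced by calls against two otherwise identical DBS deployments, one configured with NATS and one without.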
PR#5401 provides necessary changes for dbs spec to include CMSMonitoring dependency. |
PR#821 contains necessary changes for DBS configuration to include/exclude NATS. |
Valentin, |
Yuyi, don't worry, as I said I'll try to help as much as I can. For testing I can build new DBS RPMs, and set them up somewhere, but I need to know how to inject some data. If you can give me some examples it would be great, at least how to inject a new dataset would be sufficient. Of course I understand that I need to configure DBS with proper DB backend (whatever test DB you're using). Or, if you have working VM and can give me access to it, I can easily patch DBS over there and try it out. |
Valentin, |
So, can I adjust DBS in k8s then? |
yes. |
Yuyi, I deployed DBS with NATS on the k8s cluster (it is part of DBSGlobalWriter). Could you please let me know how to run an injection of a dataset and/or a test for the writer, so that I can see how the code is working? |
Valentin, There are a lot of examples at https://github.com/dmwm/DBS/tree/master/Client/utils |
Yuyi,
I can access DBSReader though, e.g.
So, what I need is
|
From DBSWriter logs on k8s I see this
which implies that my DN is not authorized to write to DBS. What do I need to change to get the access? |
You were not in the DBS operator group. I tried to add you in SiteDB, but failed. I opened a ticket on SiteDB and cc'ed you. I thought that you had written to DBS in the past, hadn't you? |
I never wrote to DBS, so, yes I need a permission. And, now we use CRIC instead of SiteDB. I think CRIC is managed by [email protected] or [email protected] |
The SiteDB page should redirect to CRIC. I wrote to them. |
Yuyi, now I seem to have access, since I'm getting a different error, but I don't know what's wrong. Please advise:
I already inserted the acquisition era, and I printed the dict I'm trying to insert, but apparently DBS complains that insertDataset must have "dataset, primary_ds_name, processed_ds_name, data_tier_name", all of which are present in my dict. |
this is how error looks in dbs server logs
|
and this is the client code I used:
|
You don’t need to insert dataset. You may do a listDatasets to get the dataset names and try to update the status to see if you get the NATS message.
Yuyi
On Wednesday, December 4, 2019 at 1:29 PM, Valentin Kuznetsov wrote:
and this is the client code I used:
from __future__ import print_function

import os

# DBS-3 client import
from dbs.apis.dbsClient import DbsApi

# Writer endpoint; override it with the DBS_WRITER_URL environment variable
url = os.getenv('DBS_WRITER_URL', "https://cmsweb-test.cern.ch/dbs/int/global/DBSWriter")
print(url)

# API object
dbs3api = DbsApi(url=url)

# The acquisition era was inserted once beforehand:
# acq_era = {'acquisition_era_name': 'cmsnats', 'description': 'testing_insert_era',
#            'start_date': 1234567890}
# print(dbs3api.insertAcquisitionEra(acq_era))

dataset = {'primary_ds_name': 'cmsnats_pri',
           'physics_group_name': 'Tracker',
           'processed_ds_name': 'cmsnats-v101',
           'dataset_access_type': 'VALID',
           'xtcrosssection': 123,
           'data_tier_name': 'GEN-SIM-DIGI-RAW',
           'acquisition_era_name': 'cmsnats',
           'processing_version': 101}
# The dataset path is /<primary_ds_name>/<processed_ds_name>/<data_tier_name>
dataset['dataset'] = '/%s/%s/%s' % (dataset['primary_ds_name'],
                                    dataset['processed_ds_name'],
                                    dataset['data_tier_name'])
print(dataset)
print(dbs3api.insertDataset(dataset))
|
It does not explain the error. The error is misleading, at least.
But I can try to get a list of datasets, even though I have no idea
which one I should use and which one I can update.
|
You can just pick any one of the datasets and switch its status back and forth to see the message. We just updated the integration DB from production for some testing. Please don't switch the status of a whole list of datasets; playing with a few of them will not destroy the data.
Cheers,
yuyi
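The back-and-forth status switch described above could look roughly like the sketch below. It assumes the DBS client object offers an `updateDatasetType(dataset=..., dataset_access_type=...)` call (the thread only shows the resulting `PUT /datasets` request, so the method name is an assumption); `RecordingApi` and the dataset path are hypothetical stand-ins so the example runs without a DBS server.

```python
def toggle_access_type(api, dataset, temporary="PRODUCTION", original="VALID"):
    """Switch one dataset to `temporary` and straight back to `original`.

    `api` is any object with an
    updateDatasetType(dataset=..., dataset_access_type=...) method.
    """
    for status in (temporary, original):
        api.updateDatasetType(dataset=dataset, dataset_access_type=status)


class RecordingApi(object):
    """Hypothetical stand-in for the DBS client; it only records calls."""

    def __init__(self):
        self.calls = []

    def updateDatasetType(self, dataset, dataset_access_type):
        self.calls.append((dataset, dataset_access_type))


api = RecordingApi()
toggle_access_type(api, "/primary/processed-v1/TIER")
print(api.calls)
```

Ending on the original status leaves the dataset as it was found, which matters because the integration DB is a copy of production data.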
|
Great !
Yuyi
On Wednesday, December 4, 2019 at 3:13 PM, Valentin Kuznetsov wrote:
Yuyi,
it is working!!! Here is the output from the DBS logs:
INFO:cherrypy.access:REQUEST [04/Dec/2019 22:07:19] 137.138.31.19 37756 PUT /datasets [/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov] [{'dataset_access_type': u'PRODUCTION', 'dataset': u'/ZMM_13TeV_TuneCP5-pythia8/RunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2/NANOAODSIM'}]
INFO:cherrypy.access:[04/Dec/2019:22:07:19] dbs-global-w-6dfdc68885-gw2zm 137.138.31.19 "PUT /dbs/int/global/DBSWriter/datasets?dataset_access_type=PRODUCTION&dataset=%2FZMM_13TeV_TuneCP5-pythia8%2FRunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2%2FNANOAODSIM HTTP/1.1" 200 OK [data: 2 in 4 out 271749 us ] [auth: OK "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov" "" ] [ref: "" "DBSClient/Unknown/" ]
INFO:cherrypy.access:REQUEST [04/Dec/2019 22:07:19] 137.138.31.19 54056 PUT /datasets [/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov] [{'dataset_access_type': u'VALID', 'dataset': u'/ZMM_13TeV_TuneCP5-pythia8/RunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2/NANOAODSIM'}]
INFO:cherrypy.access:[04/Dec/2019:22:07:20] dbs-global-w-6dfdc68885-gw2zm 137.138.31.19 "PUT /dbs/int/global/DBSWriter/datasets?dataset_access_type=VALID&dataset=%2FZMM_13TeV_TuneCP5-pythia8%2FRunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2%2FNANOAODSIM HTTP/1.1" 200 OK [data: 2 in 4 out 119986 us ] [auth: OK "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov" "" ] [ref: "" "DBSClient/Unknown/" ]
and here are the messages I got in my subscriber:
./nats-sub -t "cms.dbs.>"
Listening on [cms-nats.cern.ch/cms.dbs.>]
2019/12/04 22:07:19 /ZMM_13TeV_TuneCP5-pythia8/RunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2/NANOAODSIM PRODUCTION
2019/12/04 22:07:19 /ZMM_13TeV_TuneCP5-pythia8/RunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2/NANOAODSIM PRODUCTION
2019/12/04 22:07:19 /ZMM_13TeV_TuneCP5-pythia8/RunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2/NANOAODSIM VALID
2019/12/04 22:07:20 /ZMM_13TeV_TuneCP5-pythia8/RunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2/NANOAODSIM VALID
But I don't understand why I get 2 requests in DBS, which lead to two messages per single change of dataset access type. From the DBS log I see 2 PUT requests:
INFO:cherrypy.access:REQUEST [04/Dec/2019 22:07:19] 137.138.31.19 37756 PUT /datasets [/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov] [{'dataset_access_type': u'PRODUCTION', 'dataset': u'/ZMM_13TeV_TuneCP5-pythia8/RunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2/NANOAODSIM'}]
INFO:cherrypy.access:[04/Dec/2019:22:07:19] dbs-global-w-6dfdc68885-gw2zm 137.138.31.19 "PUT /dbs/int/global/DBSWriter/datasets?dataset_access_type=PRODUCTION&dataset=%2FZMM_13TeV_TuneCP5-pythia8%2FRunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2%2FNANOAODSIM HTTP/1.1" 200 OK [data: 2 in 4 out 271749 us ] [auth: OK "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov" "" ] [ref: "" "DBSClient/Unknown/" ]
The first uses the /datasets API call with payload data (JSON), while the second uses the /dbs/int/global/DBSWriter/datasets?dataset_access_type=PRODUCTION&dataset=%2FZMM_13TeV_TuneCP5-pythia8%2FRunIIAutumn18NanoAODv5-SNBHP_Nano1June2019_SNB_HP_102X_upgrade2018_realistic_v19-v2%2FNANOAODSIM API without payload. I doubt that it is related to my changes, since I didn't modify the DBS API logic; I only added NATS messaging. It is something that you should check.
To sum up, the new DBS image cmssw/dbs now contains this patch, which adds the NATS manager and yields messages. If you want to test the impact of NATS, I think we need to run a series of tests with and without NATS in an identical environment and measure the timing of the DBS APIs.
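The subscriber above listens on `cms.dbs.>`. As a rough illustration of what that pattern selects, here is a small sketch of NATS-style subject matching, where subjects are dot-separated tokens, `*` matches exactly one token and `>` matches one or more trailing tokens. The concrete subjects used in the example (`cms.dbs.dataset` and so on) are hypothetical; the thread does not show which subject DBS publishes to.

```python
def subject_matches(pattern, subject):
    """Return True if a NATS-style subject matches a subscription pattern."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, tok in enumerate(p_tokens):
        if tok == ">":
            # ">" must be the last token and needs at least one token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if tok != "*" and tok != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


for subject in ("cms.dbs.dataset", "cms.dbs", "cms.other.dataset"):
    print(subject, subject_matches("cms.dbs.>", subject))
```

This is only a local re-implementation for illustration; a real subscriber leaves the matching to the NATS server.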
|
Yuyi, I updated the PR with print statements (not exceptions) about possible NATS failures (if any), and all NATS calls are within try/except blocks, so they will not affect the usual DBS API flow. From my side the PR is done, and you can review it and decide when to merge it. I would suggest that you first put it into your VM and run your usual unit tests with and without NATS; then we can add the code to cmsweb-testbed, where we'll have some data flow and can observe it in NATS. If you need more info please feel free to let me know; I can easily chat and even show you a demo of how NATS behaves when I make changes in DBS. But I also want to understand the time scale you foresee for this issue. |
@yuyi, do you have time to review (and possibly merge) this PR before the holidays? Since we have monthly upgrade cycles, I would prefer to include this functionality in the next cmsweb upgrade so that we can properly test it and decide if we can enable it. |
@yuyi, can we include this PR in this cmsweb release upgrade cycle? I understand that you may still need to test the functionality before enabling NATS, but if we just include the code without enabling it, it will be there and we can avoid waiting for yet another cmsweb upgrade cycle. |
Hi Valentin, Also, please disable the DBS DB connection from your testing server. I need that DB to do DBS testing for the cmsweb-testbed because we have new DBS server deployed. |
Publishing is automatic, no configuration is needed, and it is decentralized, therefore messages can come from any host. I'm not sure about the DBS DB connection: which testing server are you referring to? I tested NATS using the k8s DBS deployment; I didn't make any special setup for it. Therefore I doubt I need to do anything. Regarding the testing message, the server can post messages at any time; we only need to run a subscriber which can consume messages. But since authentication is involved, you need a login/password. Feel free to ping me tomorrow on Slack or send an email and I can show you how to run it; it is just one executable (no setup needed) which can listen to messages. |
[VK] I'm not sure about DBS DB connection, which testing server you're referring? I tested NATS using k8s DBS deployment, I didn't make any special setup for it. Therefore I doubt I need to do anything. Valentin, Thanks, |
Yes, my k8s server is running; we also deployed a pre-production k8s cluster where DBS is running. On top of that we have DBS in the VM-based pre-production cluster. We may have other k8s deployments in the future. You should provide clear instructions on how DBS should be handled. My understanding is that unless we write something to DBS, the read-only DBs are fine to use. In my k8s cluster I use Meanwhile, please let me know when you'll address this PR. |
@vkuznet @muhammadimranfarooqi I was clear with Lina about the DB account used on k8s pre-production. I asked her to stop the server as soon as the quick test was done. But since the credential is in the system, it was used by Muhammad. I will create a new DBS account for k8s pre-production. Muhammad, could you please stop the k8s DBS pre-production servers for now? Otherwise, I cannot do anything for the DBS validation tests. Thanks, |
Yuyi, please use this ticket for NATS-related questions. All other issues should be redirected to the appropriate tickets (e.g. k8s and cmsweb related issues go to GitLab). I don't want to have a confusing thread about different topics not related to the issue in question. |
Valentin, Basically, the DBS client inserts into DBS by block. A new dataset is also inserted when the first block of the dataset is inserted. So what you need is to fire a NATS message when insertBulkBlock is called. Could you update your PR? I put one comment on the print in the code. Other than that, the rest of the code is OK to me. Thanks, |
Please be clear where to add publication. Should I remove it from |
You don't need to remove it from insertDataset. What you need to do is add publication to insertBulkBlock. The input JSON is most of the time a huge data block. DBS does not extract data from it at the web layer. I am afraid we would have to extract data from this block twice, which would cause a big memory footprint and add to the execution time of the API. We are already at the edge of timeout. Regarding the data structure, you may look at an example: https://github.com/dmwm/DBS/blob/master/Client/utils/blockdump.dict. |
…ATS block of insertDataset
Yuyi, I added code to insertBulkBlock, and there is no need to extract data twice since the code already does the JSON decoding (i.e. your indata is a dict, so I only needed to get the appropriate key). Please have a look at the updated code. I extracted the dataset info based on your blockdump.dict structure. |
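Picking the dataset info out of the already-decoded input could be sketched as below. This is not the code from the PR: the key layout (a top-level `dataset` entry holding `dataset` and `dataset_access_type`) is an assumption based on the blockdump.dict example, the `"<dataset> <status>"` message format simply mirrors the subscriber output earlier in the thread, and `dataset_message` is a hypothetical helper name.

```python
def dataset_message(indata):
    """Build a '<dataset> <status>' message from a decoded block dump.

    Returns None when the expected keys are missing, so the caller can
    skip publication instead of failing the insert.
    """
    dataset = indata.get("dataset") or {}
    name = dataset.get("dataset")
    status = dataset.get("dataset_access_type")
    if not name or not status:
        return None
    return "%s %s" % (name, status)


# Heavily trimmed, hypothetical block dump: only the keys used here.
indata = {
    "dataset": {"dataset": "/primary/processed-v1/TIER",
                "dataset_access_type": "VALID"},
    "files": [],
}
print(dataset_message(indata))
```

Because `indata` is already a dict at this point, the lookup is a couple of key accesses and does not parse the block a second time.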
Yuyi, regarding the big memory footprint: if you change your unstructured JSON input to a structured one, then all of your memory issues will be gone, since the latter can be parsed line by line instead of loading the entire dict into RAM. I already pointed out how this can be done in dmwm/DBS/issue/599 |
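The line-by-line idea can be illustrated with a generic sketch that reads one JSON record per line, so only a single record is held in memory at a time. The record layout here (a `type` field plus a few keys) is purely hypothetical and is not the format proposed in the referenced issue.

```python
import io
import json


def iter_records(stream):
    """Yield one decoded JSON object per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical payload: one small JSON object per line.
payload = io.StringIO(
    '{"type": "dataset", "dataset": "/a/b/TIER"}\n'
    '{"type": "file", "logical_file_name": "/store/f1.root"}\n'
    '{"type": "file", "logical_file_name": "/store/f2.root"}\n'
)
n_files = sum(1 for rec in iter_records(payload) if rec["type"] == "file")
print(n_files)
```

With a single nested JSON document, by contrast, the whole structure has to be decoded before any part of it can be used.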
I have urgent matter to handle, I may not get back to this until tomorrow or next week. I will look into your changes as soon as I have time and will let you know. |
I am going to merge this PR now. I understand that data you want to broadcast is the dataset info, no blocks or files involved. |
This PR contains the following changes:
The interaction with NATS is made optional and safe, i.e. if something happens during NATS publication (e.g. an exception is thrown), the DBS code should be neither affected nor blocked. All NATS publications are done asynchronously, so there are no blocking requests.
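One way to get the "safe and non-blocking" behaviour described above is sketched below. It is not the PR's implementation: `publish_safely` is a hypothetical helper, `publish` stands for whatever callable actually sends to NATS, and the subject and message are made up. The sketch runs the publication in a background thread and turns any exception into a printed message.

```python
import threading


def publish_safely(publish, subject, message):
    """Run publish(subject, message) in a background thread; never raise."""
    def worker():
        try:
            publish(subject, message)
        except Exception as exc:
            # Report the failure and carry on; the caller is not affected.
            print("NATS publication failed: %s" % exc)

    thread = threading.Thread(target=worker)
    thread.daemon = True
    thread.start()
    return thread


received = []


def broken_publish(subject, message):
    raise RuntimeError("NATS server unreachable")


t1 = publish_safely(lambda s, m: received.append((s, m)),
                    "cms.dbs.dataset", "/a/b/TIER VALID")
t2 = publish_safely(broken_publish, "cms.dbs.dataset", "/a/b/TIER VALID")
t1.join()
t2.join()
print(received)
```

The `join()` calls are only there so the example finishes deterministically; an API handler would return without waiting for the thread.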