solved #83 #120
Conversation
@m-appel All checks passed, can you review the PR? :-)
Hi, have you tried to run this code? For me the add_relationship_properties function produces invalid Cypher queries, but you do not need this function anyway. The basic outline of this crawler would be (a minimal sketch follows below):
- Fetch all data first and keep track of unique ASes and to which tag they should be connected
- Fetch/create all AS nodes using batch_get_nodes_by_single_prop
- Fetch/create the two tag nodes with get_node
- Create links and push them with batch_add_links

Like I wrote in the issue, this crawler will be very similar to the bgpkit.pfx2asn crawler, so please look at that first, understand how it works, and then try again :)
Thanks!
P.S.: I noticed that the RoVista API does not seem to return all results, I will follow up with the authors.
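For reference, a minimal sketch of that outline could look like the following. The helper names come from the list above, but their exact signatures are assumptions; the module constants, tag labels, and the 0.5 ratio threshold are borrowed from the code and naming suggestions later in this thread, and pagination is omitted for brevity:

import requests

from iyp import BaseCrawler, RequestStatusError

URL = 'https://api.rovista.netsecurelab.org/rovista/api/overview'
ORG = 'RoVista'
NAME = 'rovista.validating_rov'


class Crawler(BaseCrawler):

    def run(self):
        # 1. Fetch all data and keep track of unique ASes and the tag each
        #    one should be connected to.
        entries = []
        asns = set()
        req = requests.get(URL)
        if req.status_code != 200:
            raise RequestStatusError('Error while fetching RoVista data')
        for entry in req.json().get('data', []):
            label = 'Validating RPKI ROV' if entry['ratio'] > 0.5 else 'Not Validating RPKI ROV'
            asns.add(entry['asn'])
            entries.append((entry['asn'], entry['ratio'], label))

        # 2. Fetch/create all AS nodes in a single batched query.
        asn_id = self.iyp.batch_get_nodes_by_single_prop('AS', 'asn', asns)

        # 3. Fetch/create the two tag nodes.
        tag_id = {label: self.iyp.get_node('Tag', {'label': label})
                  for label in ('Validating RPKI ROV', 'Not Validating RPKI ROV')}

        # 4. Compute all links and push them in one batch.
        links = [{'src_id': asn_id[asn], 'dst_id': tag_id[label],
                  'props': [self.reference, {'ratio': ratio}]}
                 for asn, ratio, label in entries]
        self.iyp.batch_add_links('CATEGORIZED', links)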
iyp/crawlers/rov/rovista.py (Outdated)

from iyp import BaseCrawler, RequestStatusError

URL = 'https://api.rovista.netsecurelab.org/rovista/api/overview'
ORG = 'ROV'
Let's call the organization RoVista (and rename the folder to rovista).
iyp/crawlers/rov/rovista.py (Outdated)

URL = 'https://api.rovista.netsecurelab.org/rovista/api/overview'
ORG = 'ROV'
NAME = 'rov.rovista'
And the script validating_rov, so NAME = 'rovista.validating_rov'.
Hi @m-appel, in the add_relationship_properties function I made a mistake in the Cypher query.

import logging

import requests

from iyp import BaseCrawler, RequestStatusError

URL = 'https://api.rovista.netsecurelab.org/rovista/api/overview'
ORG = 'RoVista'
NAME = 'rovista.validating_rov'


class Crawler(BaseCrawler):

    def run(self):
        """Get RoVista data from their API."""
        batch_size = 1000  # Adjust batch size as needed
        offset = 0
        entries = []
        asns = set()

        while True:
            # Make a request with the current offset.
            response = requests.get(URL, params={'offset': offset, 'count': batch_size})
            if response.status_code != 200:
                raise RequestStatusError('Error while fetching RoVista data')

            data = response.json().get('data', [])
            for entry in data:
                asns.add(entry['asn'])
                if entry['ratio'] > 0.5:
                    entries.append({'asn': entry['asn'], 'ratio': entry['ratio'], 'label': 'Validating RPKI ROV'})
                else:
                    entries.append({'asn': entry['asn'], 'ratio': entry['ratio'], 'label': 'Not Validating RPKI ROV'})

            # Move to the next page.
            offset += batch_size

            # Break the loop if there is no more data.
            if len(data) < batch_size:
                break

        logging.info('Pushing nodes to neo4j...\n')
        # Get node IDs for all ASNs and for the two tags.
        self.asn_id = self.iyp.batch_get_nodes_by_single_prop('AS', 'asn', asns)
        tag_id_not_vali = self.iyp.get_node('Tag', {'label': 'Not Validating RPKI ROV'}, create=True)
        tag_id_vali = self.iyp.get_node('Tag', {'label': 'Validating RPKI ROV'}, create=True)

        # Compute links.
        links = []
        for entry in entries:
            asn_qid = self.asn_id[entry['asn']]
            if entry['ratio'] > 0.5:
                links.append({'src_id': asn_qid, 'dst_id': tag_id_vali, 'props': [self.reference, entry]})
            else:
                links.append({'src_id': asn_qid, 'dst_id': tag_id_not_vali, 'props': [self.reference, entry]})

        logging.info('Pushing links to neo4j...\n')
        # Push all links to IYP.
        self.iyp.batch_add_links('CATEGORIZED', links)

Is it correct?
Yes, this looks better, but you can just commit, then I can use the GitHub interface to give easier feedback. (I will also squash all commits of the PR into one, so no worries about polluting the tree or something.)

The properties for …

I also understood now why the API is weird, the …

Btw., for future pull requests, please provide a better name and description. I believe there is also a template displayed when you create a new PR; please do not just delete everything, there are some useful checks to be aware of, like "How did you test your code" and "Did you update the documentation". If you run and test your code, you can be more confident whether it is correct or not :)
Thanks @m-appel for the guidance and feedback. I am making the changes you requested :-)
Force-pushed from fc2ece4 to b697da9. Commits included:

…operty

DNS remodeling (InternetHealthReport#119)
* update url2domain to url2hostname
* remove iana root zone file and dns hierarchy from config file
* Atlas measurement targets are now hostnames
* update openintel crawlers to the new DNS model
* umbrella now ranks a mix of DomainName and HostName nodes and should be run after openintel.umbrella1m
* Add explanation for cloudflare DNS modeling
* lower umbrella crawler in config file
* update READMEs with the new DNS modeling
* add (:Service {name:'DNS'}) node and link it to authoritative name servers
* Nodes do not have reference properties
* Normalize IPv6 addresses
* Fix wrong crawler name
* Typos and formatting
* Remove infra_mx crawler since it does not do anything at the moment
* Update Cisco Umbrella crawler
  - Batch create new nodes (happens more often than expected)
  - Add logging output
  - Do not use builtins as variable names
* Remove redundant set and parameters
* Remove Service node for now (we could not decide on a name, so we will deal with this later)
Co-authored-by: Malte Tashiro <[email protected]>

Add OpenINTEL DNS dependency crawler
Integrate with existing files and remove some unnecessary stuff.
Co-authored-by: Raffaele Sommese <[email protected]>

precommit error rectified

Update __init__.py
@m-appel I modified the code and checked it, but I am still getting this pre-commit error.
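(If helpful: the same checks can usually be reproduced locally by installing pre-commit and running "pre-commit run --all-files", assuming the repository ships a .pre-commit-config.yaml, so the reported problems can be seen and fixed before pushing.)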
Force-pushed from eebd677 to 313ba93.
Description
Solves #83.
Added a new function to add relationship properties.
@m-appel, can you review my PR? :-)