Add IANA root zone file crawler #92

Merged


m-appel
Member

@m-appel m-appel commented Dec 20, 2023

This PR implements the IANA root zone file crawler and closes #82. Since this is the first crawler that adds multi-label nodes, additional changes to the OpenINTEL crawlers were required to prevent conflicts.

Description

The IANA root zone file contains NS records for the top-level domains, as well as A/AAAA records for the authoritative name servers.
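As a rough illustration of the input, the sketch below pulls the NS and A/AAAA records out of zone-file text. It assumes the common one-record-per-line layout (`name TTL class type rdata`); the `Record` type, the `parse_root_zone` function, and the sample lines are hypothetical and are not taken from the crawler in this PR.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str   # owner name, lowercased, trailing dot removed
    rtype: str  # "NS", "A", or "AAAA"
    rdata: str  # name server name (NS) or IP address (A/AAAA)


WANTED_TYPES = {"NS", "A", "AAAA"}


def parse_root_zone(text: str) -> list[Record]:
    """Extract NS/A/AAAA records from one-record-per-line zone text."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and zone-file comments.
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        if len(fields) < 5:
            continue
        name, rtype, rdata = fields[0], fields[3], fields[4]
        if rtype not in WANTED_TYPES:
            continue
        name = name.rstrip(".").lower()
        if rtype == "NS":
            # NS rdata is itself a domain name, so normalize it the same way.
            rdata = rdata.rstrip(".").lower()
        records.append(Record(name, rtype, rdata))
    return records


# Hypothetical sample in the assumed layout; the DS line is filtered out.
SAMPLE = """\
com. 172800 IN NS a.gtld-servers.net.
a.gtld-servers.net. 172800 IN A 192.5.6.30
a.gtld-servers.net. 172800 IN AAAA 2001:503:a83e::2:30
com. 86400 IN DS 19718 13 2 ABCDEF
"""

for record in parse_root_zone(SAMPLE):
    print(record)
```

Note that in this sketch the root zone's own records (owner name `.`) end up with an empty `name`; a real crawler would need to decide how to represent the root.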

This is the first crawler to introduce multi-label nodes: every name server is identified by a domain name, so its node now carries both labels, DomainName:AuthoritativeNameServer. Accordingly, this PR also updates the OpenINTEL crawler, the only other crawler that creates AuthoritativeNameServer nodes. Without this update, the two crawlers produce conflicting constraints.
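To make the multi-label idea concrete, here is a minimal sketch of a Cypher statement that merges on the DomainName label and then adds AuthoritativeNameServer to the same node. The `name_server_merge` helper and the `name` property are assumptions for illustration, not the project's actual helper API. One plausible source of the conflict is that a node created with only one of the two labels is not matched by a MERGE that requires both, so a second node with the same name would be attempted.

```python
def name_server_merge(name: str) -> tuple[str, dict]:
    """Build a Cypher statement and parameters for a dual-label name server node.

    Merging on a single label first, then adding the second label, means the
    same node is reused whichever label it was originally created with.
    """
    query = (
        "MERGE (n:DomainName {name: $name}) "
        "SET n:AuthoritativeNameServer "
        "RETURN n"
    )
    # Normalize the name so both crawlers would agree on the key (assumption).
    params = {"name": name.rstrip(".").lower()}
    return query, params


query, params = name_server_merge("A.GTLD-SERVERS.NET.")
print(query)
print(params)
```

With the official Neo4j Python driver, the pair could be passed to `session.run(query, params)`.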

As part of the OpenINTEL crawler changes, this PR also reduces the execution time of the crawler's link-computation phase by a factor of 10. The previous version used an inefficient method for iterating over the data.

How Has This Been Tested?

These changes have been tested as part of a full database creation and also repeated independently.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@romain-fontugne
Member

thanks!

@romain-fontugne romain-fontugne merged commit a6abff3 into InternetHealthReport:main Dec 20, 2023
2 checks passed
@m-appel m-appel deleted the 82-dns-root-zone-file-iana branch December 20, 2023 08:30
Development

Successfully merging this pull request may close these issues.

DNS root zone file (IANA)