Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #116 Include layer information to stanford.asdb AS categories #126

Merged
merged 2 commits into from
Feb 16, 2024

Conversation

JustinLoye
Copy link
Contributor

Description

Added a layer property to links -[r:CATEGORIZED {reference_name: "stanford.asdb"}]-
Added a PART_OF relationship (subcategory) -> (category)
Removed Other and other subcategories

Motivation and Context

Issue 116

How Has This Been Tested?

Thorough check of my file parsing (noticed new errors).

Overall number of links -[r:CATEGORIZED {reference_name: "stanford.asdb"}]-:
before 394 834, now roughly twice more 592 754

Pushed and observed changes on my local iyp

Screenshots (if appropriate):

Screenshot from 2024-02-16 14-08-46

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@m-appel m-appel self-requested a review February 16, 2024 05:57
@m-appel m-appel linked an issue Feb 16, 2024 that may be closed by this pull request
Copy link
Member

@m-appel m-appel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much good to go, functionally no problems, only two nitpicks (I would not have mentioned the linebreak, but since you'll touch the file anyways for the other comment I mentioned it).

Btw. I prefer using something like

if not category:
    continue
asns.add(asn)

this prevents having so many nested levels of code. But purely stylistic, you don't have to change this.

# Remove 'Other' subcategories
# Only store their parent category
if category == 'Other' or category == 'other':
layer = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can just continue here, since we already added this link in the previous loop iteration in the if above?


asn_qid = asn_id[asn]
category_qid = category_id[category]

links.append({'src_id': asn_qid, 'dst_id': category_qid, 'props': [self.reference]}) # Set AS category
links.append({'src_id': asn_qid, 'dst_id': category_qid, 'props': [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you break this line before 'props'? :)

@JustinLoye
Copy link
Contributor Author

As a side comment, I checked that the PART_OF links are well behaved

MATCH (n: Tag)-[r:PART_OF {reference_name: "stanford.asdb"}]-(m: Tag) return n, r, m
Should return ${number_of_layer1_categories} graphs of diameter 2.

As shown in this screenshot, it's almost the case
Screenshot from 2024-02-16 17-12-29

the exception (left-most graph) being caused by the error mentioned in #116 (comment)

Copy link
Member

@m-appel m-appel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks looks good to me.

@m-appel m-appel merged commit 236f736 into InternetHealthReport:main Feb 16, 2024
1 check passed
@JustinLoye JustinLoye deleted the issue-116/asdb-layer branch February 19, 2024 02:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Include layer information to stanford.asdb AS categories
2 participants