diff --git a/README.md b/README.md index 62ae48bbdf..cac7c3d266 100644 --- a/README.md +++ b/README.md @@ -39,16 +39,13 @@ Start [here](https://lyft.github.io/cartography/install.html). ## Usage Start with our [tutorial](https://lyft.github.io/cartography/usage/tutorial.html). Our [data schema](https://lyft.github.io/cartography/usage/schema.html) is a helpful reference when you get stuck. -## Contact +## Community - Join us on `#cartography` on the [Lyft OSS Slack](https://join.slack.com/t/lyftoss/shared_invite/enQtOTYzODg5OTQwNDE2LTFiYjgwZWM3NTNhMTFkZjc4Y2IxOTI4NTdiNTdhNjQ4M2Q5NTIzMjVjOWI4NmVlNjRiZmU2YzA5NTc3MmFjYTQ). - -## Community Meeting - -Talk to us and see what we're working on at our [monthly community meeting](https://calendar.google.com/calendar/embed?src=lyft.com_p10o6ceuiieq9sqcn1ef61v1io%40group.calendar.google.com&ctz=America%2FLos_Angeles). -- Meeting minutes are [here](https://docs.google.com/document/d/1VyRKmB0dpX185I15BmNJZpfAJ_Ooobwz0U1WIhjDxvw). -- Recorded videos are posted [here](https://www.youtube.com/playlist?list=PLMga2YJvAGzidUWJB_fnG7EHI4wsDDsE1). -- Our current project road map is [here](https://docs.google.com/document/d/18MOsGI-isFvag1fGk718Aht7wQPueWd4SqOI9KapBa8/edit#heading=h.15nsmgmjaaml). +- Talk to us and see what we're working on at our [monthly community meeting](https://calendar.google.com/calendar/embed?src=lyft.com_p10o6ceuiieq9sqcn1ef61v1io%40group.calendar.google.com&ctz=America%2FLos_Angeles). + - Meeting minutes are [here](https://docs.google.com/document/d/1VyRKmB0dpX185I15BmNJZpfAJ_Ooobwz0U1WIhjDxvw). + - Recorded videos are posted [here](https://www.youtube.com/playlist?list=PLMga2YJvAGzidUWJB_fnG7EHI4wsDDsE1). +- Our current project roadmap is [here](https://github.com/orgs/lyft/projects/26/views/1). ## Contributing Thank you for considering contributing to Cartography! 
diff --git a/docs/root/modules/_cartography-metadata/schema.md b/docs/root/modules/_cartography-metadata/schema.md new file mode 100644 index 0000000000..878d8268df --- /dev/null +++ b/docs/root/modules/_cartography-metadata/schema.md @@ -0,0 +1,18 @@ +## Cartography metadata schema + +.. _metadata_schema: + +Some Cartography sync jobs write nodes to convey information about the job itself. See https://github.com/lyft/cartography/issues/758 for more background on this. + +### SyncMetadata:ModuleSyncMetadata + +This is a node to represent metadata about the sync job of a particular module. Its existence indicates that a particular sync job did happen. +The 'types' used here should be actual node labels. For example, if we did sync a particular AWSAccount's S3Buckets, +the `grouptype` is 'AWSAccount', the `groupid` is the particular account's `id`, and the `syncedtype` is 'S3Bucket'. + +| Field | Description | Source| +|-------|-------------|------| +|**id**|`{group_type}_{group_id}_{synced_type}`|util.py| +|grouptype| The parent module's type |util.py| +|groupid|The parent module's id|util.py| +|syncedtype|The sub-module's type|util.py| diff --git a/docs/root/usage/schema.md b/docs/root/usage/schema.md index 3d6da845c4..a5f1d101d4 100644 --- a/docs/root/usage/schema.md +++ b/docs/root/usage/schema.md @@ -22,6 +22,7 @@ - In these docs, more specific nodes will be decorated with `GenericNode::SpecificNode` notation. For example, if we have a `Car` node and a `RaceCar` node, we will refer to the `RaceCar` as `Car::RaceCar`. +.. mdinclude:: ../modules/_cartography-metadata/schema.md .. mdinclude:: ../modules/aws/schema.md .. mdinclude:: ../modules/azure/schema.md .. 
mdinclude:: ../modules/crxcavator/schema.md diff --git a/docs/root/usage/tutorial.md b/docs/root/usage/tutorial.md index 357c8a7f2e..f0cb52fa9b 100644 --- a/docs/root/usage/tutorial.md +++ b/docs/root/usage/tutorial.md @@ -2,24 +2,16 @@ Once everything has been installed and synced, you can view the Neo4j web interface at http://localhost:7474. You can view the reference on this [here](https://neo4j.com/developer/guide-neo4j-browser/#_installing_and_starting_neo4j_browser). -### Permalinking Bookmarklet +If you already know Neo4j and just need to know the nodes, attributes, and graph relationships we use to represent infrastructure assets, you can view our [sample queries](samplequeries.html). More sample queries are available at https://github.com/marco-lancini/cartography-queries. -You can set up a bookmarklet that lets you quickly get a permalink to a Cartography query. To do so, add a bookmark with the following contents as the URL - make sure to replace `neo4j.contoso.com:7474` with your instance of Neo4j: +Otherwise, read on for this handhold-y tutorial filled with examples. Suppose we wanted to find out: -```javascript -javascript:(() => { const query = document.querySelectorAll('article label span')[0].innerText; if (query === ':server connect') { console.log('no query has been run!'); return; } const searchParams = new URLSearchParams(); searchParams.append('connectURL', 'bolt://neo4j:neo4j@neo4j.contoso.net:7687'); searchParams.append('cmd', 'edit'); searchParams.append('arg', query.replaceAll(/\r /g, '\r')); newURL = `http://neo4j.contoso.net:7474/browser/?${searchParams}`; window.open(newURL, '_blank', 'noopener'); })() -``` - -Then, any time you are in the web interface, you can click the bookmarklet to open a new tab with a permalink to your most recently executed query in the URL bar. - -### ℹ️ Already know [how to query Neo4j](https://neo4j.com/developer/cypher-query-language/)? You can skip to our reference material! 
-If you already know Neo4j and just need to know what are the nodes, attributes, and graph relationships for our representation of infrastructure assets, you can skip this handholdy walkthrough and see our [sample queries](samplequeries.md). - -### What [RDS](https://aws.amazon.com/rds/) instances are installed in my [AWS](https://aws.amazon.com/) accounts? -``` +### What [RDS](https://aws.amazon.com/rds/) instances are installed in my AWS accounts? +```cypher MATCH (aws:AWSAccount)-[r:RESOURCE]->(rds:RDSInstance) return * ``` + ![Visualization of RDS nodes and AWS nodes](../images/accountsandrds.png) In this query we asked Neo4j to find all `[:RESOURCE]` relationships from AWSAccounts to RDSInstances, and return the nodes and the `:RESOURCE` relationships. @@ -35,7 +27,7 @@ and then pick options on the menu that shows up at the bottom of the view like t ### Which RDS instances have [encryption](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.Encryption.html) turned off? -``` +```cypher MATCH (a:AWSAccount)-[:RESOURCE]->(rds:RDSInstance{storage_encrypted:false}) RETURN a.name, rds.id ``` @@ -49,7 +41,7 @@ If you want to go back to viewing the graph and not a table, simply make sure yo Let's look at some other AWS assets now. ### Which [EC2](https://aws.amazon.com/ec2/) instances are directly exposed to the internet? -``` +```cypher MATCH (instance:EC2Instance{exposed_internet: true}) RETURN instance.instanceid, instance.publicdnsname ``` @@ -60,7 +52,7 @@ These instances are open to the internet either through permissive inbound IP pe If you know a lot about AWS, you may have noticed that EC2 instances [don't actually have an exposed_internet field](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_Instance.html). We're able to query for this because Cartography performs some [data enrichment](#data-enrichment) to add this field to EC2Instance nodes. 
### Which [S3](https://aws.amazon.com/s3/) buckets have a policy granting any level of anonymous access to the bucket? -``` +```cypher MATCH (s:S3Bucket) WHERE s.anonymous_access = true RETURN s @@ -76,13 +68,81 @@ A couple of other things to notice: instead of using the "{}" notation to filter Let's go back to analyzing RDS instances. In an earlier example we queried for RDS instances that have encryption turned off. We can aggregate this data by AWSAccount with a small change: -``` +```cypher MATCH (a:AWSAccount)-[:RESOURCE]->(rds:RDSInstance) WHERE rds.storage_encrypted = false RETURN a.name as AWSAccount, count(rds) as UnencryptedInstances ``` ![Table of unencrypted RDS instances by AWS account](../images/unencryptedcounts.png) + +### Given a node label, what other node labels can be connected to it? + +Suppose we wanted to know what other assets can be connected to a DNSRecord. We would ask the graph like this: + +```cypher +match (d:DNSRecord)--(n) +return distinct labels(n); +``` + +This says "what are the possible labels for all nodes connected to all DNSRecord nodes `d` in my graph?" Your answer might look like this: + +``` +["AWSDNSRecord", "DNSRecord"] +["AWSDNSZone", "DNSZone"] +["LoadBalancerV2"] +["NameServer"] +["ESDomain"] +["LoadBalancer"] +["EC2Instance", "Instance"] +``` + +You can then make the path more specific like this: + +```cypher +match (d:DNSRecord)--(:EC2Instance)--(n) +return distinct labels(n); +``` + +And then you can continue building your query. + +We also include [full schema docs](schema.html), but this way of building a query can be faster and more interactive. + + +### Given a node label, what are the possible property names defined on it? 
+ +We can find what properties are available on an S3Bucket like this: + +```cypher +match (n:S3Bucket) return properties(n) limit 1; +``` + +The result will look like this: + +``` +{ + "bucket_key_enabled": false, + "creationdate": "2022-05-10 00:22:52+00:00", + "ignore_public_acls": true, + "anonymous_access": false, + "firstseen": 1652400141863, + "block_public_policy": true, + "versioning_status": "Enabled", + "block_public_acls": true, + "anonymous_actions": [], + "name": "my-fake-bucket-123", + "lastupdated": 1688605272, + "encryption_algorithm": "AES256", + "default_encryption": true, + "id": "my-fake-bucket-123", + "arn": "arn:aws:s3:::my-fake-bucket-123", + "restrict_public_buckets": false +} +``` + +Our [full schema docs](schema.html) describe all possible fields, but listing out properties this way lets you avoid switching between browser tabs. + + ### Learning more If you want to learn more in depth about Neo4j and Cypher queries you can look at [this tutorial](https://neo4j.com/developer/cypher-query-language/) and see this [reference card](https://neo4j.com/docs/cypher-refcard/current/). @@ -117,3 +177,14 @@ You can add your own custom attributes and relationships without writing Python ### Mapping AWS Access Permissions Cartography can map permissions between IAM Principals and resources in the graph. Here's [how](../modules/aws/permissions-mapping.html). + + +### Permalinking Bookmarklet + +You can set up a bookmarklet that lets you quickly get a permalink to a Cartography query. 
To do so, add a bookmark with the following contents as the URL, making sure to replace `neo4j.contoso.net` (in both the `bolt://` and `http://` addresses) with the hostname of your Neo4j instance: + +```javascript +javascript:(() => { const query = document.querySelectorAll('article label span')[0].innerText; if (query === ':server connect') { console.log('no query has been run!'); return; } const searchParams = new URLSearchParams(); searchParams.append('connectURL', 'bolt://neo4j:neo4j@neo4j.contoso.net:7687'); searchParams.append('cmd', 'edit'); searchParams.append('arg', query.replaceAll(/\r /g, '\r')); const newURL = `http://neo4j.contoso.net:7474/browser/?${searchParams}`; window.open(newURL, '_blank', 'noopener'); })() +``` + +Then, any time you are in the web interface, you can click the bookmarklet to open a new tab with a permalink to your most recently executed query in the URL bar. diff --git a/docs/schema/syncmetadata.md b/docs/schema/syncmetadata.md index baad4be28e..7572a7ff29 100644 --- a/docs/schema/syncmetadata.md +++ b/docs/schema/syncmetadata.md @@ -1,17 +1 @@ -## SyncMetadata - -SyncMetadata nodes are created by sync jobs to convey information about the job itself. See this doc for how this is -used. - -## SyncMetadata:ModuleSyncMetadata - -This is a node to represent some metadata about the sync job of a particular module or sub-module. Its existence should suggest that a paritcular sync job did happen. -The 'types' used here should be actual node labels. For example, if we did sync a particular AWSAccount's S3Buckets, -the `grouptype` is 'AWSAccount', the `groupid` is the particular account's `id`, and the `syncedtype` is 'S3Bucket'. - -| Field | Description | Source| -|-------|-------------|------| -|**id**|`{group_type}_{group_id}_{synced_type}`|util.py| -|grouptype| The parent module's type |util.py| -|groupid|The parent module's id|util.py| -|syncedtype|The sub-module's type|util.py| +This document has been moved [here](https://lyft.github.io/cartography/modules/_cartography-metadata/schema.html)