Releases: netsage-project/netsage-pipeline
v2.1.2
2.1.0
NetSage Ingest Pipeline 2.1.0 -- July 31, 2024
Features:
- All-new containers based on current OS builds and elimination of EOL platforms such as CentOS
- Updates to current versions of Logstash, RabbitMQ, Ofelia, and nfdump
- Streamlined codebase, eliminating unnecessary legacy code and repositories
- Unified, transparent naming scheme for containers
- New NetSage Manager container, responsible for necessary restarts of containers when Science Registry files are updated
- Ability to ingest Globus log data
- Revised flow tagging scheme using the NetSage Science Registry, resulting in more accurate information on NetSage Dashboards
Bug Fixes:
- Fixed issue where not all containers would restart after a host reboot
- Fixed issue where updated Science Registry data was not being used by Logstash
- Fixed issue with incorrect Docker container dependencies
v2.0.0
NetSage Deidentifier 2.0.0 -- September 19, 2023
Features:
- Updated docker_init.sh with new locations for Science Registry, CAIDA, and GeoLite databases
- Updated jobs in cron.d with new locations for Science Registry, CAIDA, and GeoLite databases
- General updates to all container images
- Updated pipeline-logstash, pipeline-importer, and netsage-nfdump-collector to point to TACC repositories
- Revised 15-sensor-specific-changes.conf to add filter-by-subnet flow collection filtering
- Added 57-ip-protocol.conf to tag IPv4/IPv6 flows
- AARNET privatization is no longer needed, so added .disabled to 80-privatize-org.conf, and made it into a generalized version as an example. Moved lines making the AARNET org name consistent to 95-cleanup.conf.
Documentation updates
- Dependabot automatic remediations of vulnerabilities (for docusaurus)
v1.2.12
GRNOC NetSage Deidentifier 1.2.12 -- Jan 4, 2022
Usage note: With this release, we will move to using logstash 7.16.2 to fix a Log4j vulnerability.
Bare-metal installations will need to upgrade logstash manually.
(Dec 14, 2021: original 1.2.12 release, with logstash 7.16.1 in the pipeline_logstash Dockerfile)
Features:
- In the dockerfile, increased the version on which the logstash docker container is based
- Added LEARN to the regexes in the sensor groups and types support files
v1.2.11
GRNOC NetSage Deidentifier 1.2.11 -- Sept 3, 2021
Features:
- Made filtering by ifindex (optionally) sensor-specific
- Added tags to flows with src and dst IPs = 0.0.0.x (user can set outputs filter to do something based on tags)
- When duration <= 0.002 sec, set duration, bits/s, and packets/s to 0 as rates are inaccurate for small durations
- Added NORDUnet* and tacc_netflows to sensor group and type regexes
- Added onenet-members-list.rb to the members-list files to download
- Increased version numbers for some website-related packages in response to Dependabot
- Documentation improvements
Bugs:
- Fixed es_doc_id. The hash had been missing meta.id due to a bug.
- At the beginning of the pipeline, set missing IPs to 0.0.0.0, missing ifindexes to -10, missing durations to 0.
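The two normalization rules in this release (defaults for missing fields, and zeroed rates for very short flows) can be sketched in Python as follows. This is only an illustration of the rules as described above, not the pipeline's own configuration, which is Logstash-based. The dictionary keys (`src_ip`, `src_ifindex`, `bits_per_second`, and so on) are hypothetical names chosen for the example.

```python
MIN_DURATION_SEC = 0.002  # threshold quoted in the release note


def fill_missing(flow):
    """Return a copy of the flow with defaults for absent fields.

    Key names are hypothetical; the defaults are the ones listed above.
    """
    out = dict(flow)
    out.setdefault("src_ip", "0.0.0.0")
    out.setdefault("dst_ip", "0.0.0.0")
    out.setdefault("src_ifindex", -10)
    out.setdefault("dst_ifindex", -10)
    out.setdefault("duration", 0)
    return out


def zero_short_flow_rates(flow):
    """Zero duration and rates when the duration is too small to trust."""
    out = dict(flow)
    if out["duration"] <= MIN_DURATION_SEC:
        out["duration"] = 0
        out["bits_per_second"] = 0
        out["packets_per_second"] = 0
    return out


# Example: a flow with no destination IP and a 1 ms duration.
raw = {"src_ip": "10.0.0.1", "duration": 0.001,
       "bits_per_second": 8_000_000, "packets_per_second": 1000}
flow = zero_short_flow_rates(fill_missing(raw))
```

Applying the defaults first means the rate rule never has to handle a missing duration.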
v1.2.10
GRNOC NetSage Deidentifier 1.2.10 -- May 10 2021
Usage note: With this release, we will move to using nfdump v1.6.23.
This includes a fix for IPs not being parsed in MPLS flows, as well as the fix for missing ASNs from April.
- docker-compose.override_example.yml has been updated to refer to this version.
Features:
- 15-sensor-specific-changes.conf can now be used to drop all flows from a certain sensor except those from listed ifindexes.
- 0.0.0.0 flows are no longer dropped
- Will now tag flows with the pipeline version number (@pipeline_ver)
- Added a script (to util/) that can be used to process as-org files from CAIDA into the ASN lookup files that we need.
- Documentation updates
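As a rough illustration of the new per-sensor ifindex filter, the rule "drop all flows from a certain sensor except those from listed ifindexes" might look like this in Python. The sensor name, the field names, and the choice to keep a flow when either its source or its destination ifindex is listed are all assumptions made for the sketch; the actual behavior is defined in 15-sensor-specific-changes.conf.

```python
def keep_flow(flow, sensor, allowed_ifindexes):
    """Return True if the flow should be kept.

    Flows from other sensors pass through untouched. Flows from `sensor`
    are kept only if one of their interface indexes is on the allow list
    (checking both src and dst is an assumption of this sketch).
    """
    if flow["sensor"] != sensor:
        return True
    return (flow["src_ifindex"] in allowed_ifindexes
            or flow["dst_ifindex"] in allowed_ifindexes)


# Hypothetical sample data.
flows = [
    {"sensor": "sensor-a", "src_ifindex": 10, "dst_ifindex": 99},
    {"sensor": "sensor-a", "src_ifindex": 30, "dst_ifindex": 40},
    {"sensor": "sensor-b", "src_ifindex": 30, "dst_ifindex": 40},
]
kept = [f for f in flows if keep_flow(f, "sensor-a", {10, 20})]
```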
v1.2.9
GRNOC NetSage Deidentifier 1.2.9 -- Apr 9 2021
Usage note: With this release, we are also moving to using a version of nfdump built from github master which includes commits through Feb 20, 2021. This includes a fix for incorrect ASNs being added to flows when the ASN data is actually missing.
- To go along with this, docker-compose.override_example.yml refers to a "nightly" tag of nfdump (this is not updated nightly!)
Features:
- The installed version of 15-sensor-specific-changes.conf now accommodates environment variables for
renaming sensors according to ifindex, and doing sampling corrections based on sensor name.
Env values are taken from the Docker .env file or from /etc/logstash/logstash-env-vars (filename is set in a new logstash systemd file).
- The flow-size threshold will now be 10 MB for Docker installations, down from 100 MB (changed the setting in the installed importer config)
- Added some new regexes to sensor_group and sensor_type lists
- For Docker installations, added support for alpine containers, set logging level to INFO
- Documentation updates
Bugs
- Flow-filter changes have been made to accommodate changes to simp
- Flows with IPs of 0.0.0.0 are dropped
- For Docker installs, the rabbit host name will be constant
- Docusaurus and some packages flagged by dependabot were upgraded
v1.2.8
GRNOC NetSage Deidentifier 1.2.8 -- Jan 28 2021
Features:
- Added 15-sensor-specific-changes.conf with multiplication by mirroring-sampling rate for a pacificwave sensor and changing of the sensor name for NEAAR flows using a certain ifindex.
- Started saving ifindexes to ES (at least for now)
- Added consideration of continents to possibly get a country_scope value when a country is missing.
- Stopped saving old 'projects' array field to ES (use project_names)
- Added example cron file to get member-org lists. Docker will automatically download them (and maxmind db's, etc) weekly.
- Documentation updates
- Misc. minor changes
Bugs
- Fixed a typo that affected src org when dst was Multicast or missing an asn or org
- Moved the es_doc_id calculation to after aggregation, so it would be saved to ES.
Changes
- Removed explicit logstash dependency in the spec file
- (Upgraded logstash to logstash.x86_64 version 7.10.2 outside of pipeline upgrade)
- Added Sun Corridor sensors to regexes for sensor type and group.
v1.2.7
GRNOC NetSage Deidentifier 1.2.7 -- Nov 17 2020
Features:
- Documentation updates
- Updated importer cache file handling so that if culling of nfcapd files is not enabled, only files in
directories newer than 2 months (by dir name) will be compared to those in the cache file to determine which have
not yet been imported, and files older than 3 months (by filename) will be removed from the cache file.
(If culling is enabled, these are not needed.)
- Aggregation timeouts and the aggregation map filename were added to the .env file for Docker, so they can be changed by the user.
- Added field es_doc_id (hash of meta.id and start time).
This field can be used as the elasticsearch doc id in order to do upserts instead of adding duplicates.
(This works for sflow, but for netflow, behavior is being investigated.)
- Added "replay" dataset for testing
- Added versioned documentation
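To illustrate why a deterministic es_doc_id allows upserts, here is a minimal Python sketch. The release note says only that the field is a hash of meta.id and start time; the use of SHA-256, the `|` separator, and the `meta_id`/`start` key names are assumptions made for this example, and a plain dictionary stands in for the Elasticsearch index.

```python
import hashlib


def es_doc_id(meta_id, start):
    """Derive a stable document id from meta.id and the flow start time.

    SHA-256 and the '|' separator are assumptions of this sketch.
    """
    key = f"{meta_id}|{start}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()


# Hypothetical events; the first two describe the same flow.
events = [
    {"meta_id": "abc123", "start": 1600000000.0, "bytes": 500},
    {"meta_id": "abc123", "start": 1600000000.0, "bytes": 500},
    {"meta_id": "abc123", "start": 1600000300.0, "bytes": 900},
]

# A dict keyed by doc id behaves like an upsert: re-sending the same
# flow overwrites the existing entry instead of adding a duplicate.
index = {}
for event in events:
    index[es_doc_id(event["meta_id"], event["start"])] = event
```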
Bugs:
- Added some type conversions back as they are needed for now.
- Exposed rabbitmq data volume to ensure data persists if pipeline is shut down (Docker)
- Exposed the importer cache file to the local data volume, so data is not reprocessed and duplicate flows created on restart (Docker)
- Fixed an issue where the aggregation map file was not saved to a proper location in Docker deployments
Changes:
- Uncommented and set min-bytes=100 MB and min-file-age=10 min in example importer config file so these values will be used
by default for Docker deployments instead of the importer's default of 500 MB and no min-file-age.
- Added SANReN and tacc_sflows to sensor_types and/or sensor_group regexes.
v1.2.6
GRNOC NetSage Deidentifier 1.2.6 -- Sept 17 2020
Logstash configs:
- Split input and output options into their own .conf files for easy enable/disable. Unused ones have .disabled extension.
- Split maxmind geoip-tagging config into 2 parts to separately get location and ASNs -- new 45-geoip-tagging.conf and 50-asn.conf.
- If the flow's original asn is private, 0, etc, try getting an asn from the maxmind ASN db by IP. If one is found, add tag "maxmind src/dst asn"
- If a public asn is not found, set asn to -1 (not 0)
- Will now get organizations from a CAIDA csv file instead of maxmind -- added 53-caida-org.conf
- If the org cannot be determined, set it to "Unknown"
- If lat/lon are unknown, don't set the fields at all. Set country and continent to Unknown.
- Convert the common variations of AARNET org names to "Australian Academic and Research Network (AARNet)", whether redacted or not.
(caida is now listing the abbr but redaction uses/has used the longer name)
- If dst is Multicast, set no country_scope.
- Moved @exit_time and @processing_time to 98-post-process.conf
- Corrected spelling of @injest_time to @ingest_time
- Added sox and fixed gpn and tacc in sensor_groups and sensor_types dictionaries
- We no longer need to check to see if events are flows so removed the "if [type] == flow" conditionals
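The ASN fallback described in this list can be sketched in Python as below. This is an illustration under stated assumptions, not the contents of 50-asn.conf: the note's "private, 0, etc" is read here as "missing, zero or negative, or inside the private-use ranges of RFC 6996", a plain dictionary (`asn_by_ip`) stands in for the maxmind ASN db, and the function names are hypothetical.

```python
# Private-use ASN ranges from RFC 6996 (16-bit and 32-bit).
PRIVATE_ASN_RANGES = ((64512, 65534), (4200000000, 4294967294))


def is_public_asn(asn):
    """Treat missing, non-positive, and private-use ASNs as not public."""
    if asn is None or asn <= 0:
        return False
    return not any(lo <= asn <= hi for lo, hi in PRIVATE_ASN_RANGES)


def resolve_asn(asn, ip, side, asn_by_ip, tags):
    """Return a usable ASN for one side ('src' or 'dst') of a flow.

    Keeps a public ASN as-is, otherwise tries a lookup by IP and records
    a tag when the lookup succeeds; falls back to -1 (not 0).
    """
    if is_public_asn(asn):
        return asn
    found = asn_by_ip.get(ip)
    if is_public_asn(found):
        tags.append(f"maxmind {side} asn")
        return found
    return -1


# Hypothetical lookup table using a documentation IP and ASN as sample data.
asn_by_ip = {"192.0.2.1": 64500}
tags = []
src_asn = resolve_asn(0, "192.0.2.1", "src", asn_by_ip, tags)
dst_asn = resolve_asn(64512, "198.51.100.7", "dst", asn_by_ip, tags)
```

Using -1 rather than 0 for the fallback keeps "no public ASN found" distinguishable from an ASN of 0 that arrived with the original flow.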
Other:
- Docker changes to allow more than one sflow/netflow sensor (default is 1 of each but user can edit shared file to set it up how they want)
- Added replay script given a valid input file.
- Started to add Automated Ruby Unit tests.
- Changed some dir names
- Renamed cron files and changed the times in them. Added netsage-caida-update.cron.
- Moved maxmind, caida, and science registry dbs to /var/lib/grnoc/netsage/ directory.
Temporary:
- Added "caida orgs" tags to all flows