Commit
Add synthetic seed data version 2025.1.15
Generated from recent crawls:
- a bunch of recent daily crawls:
  ./misc/merge_results.sh f3c7cc7 66b894a
- five 40K, one 35K and six 30K distributed crawl runs

Filtered to ignore a bunch of potentially malware-infected sites with
--load-data-ignore-sites=asianetnews.com,gumtree.co.za,malaysiakini.com,minna.cc,planetsuzy.org

Filtered out some invalid beacon destination domains.
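As a rough illustration of the site-ignore step, the sketch below drops crawl results for any site on a comma-separated ignore list. It is not the project's actual implementation: the function names, the `results` dictionary layout (visited site mapped to observed domains), and the choice to also match subdomains are all assumptions made for this example.

```python
def parse_ignore_sites(flag_value):
    """Split a comma-separated site list into a set of lowercase domains."""
    return {s.strip().lower() for s in flag_value.split(",") if s.strip()}


def is_ignored(site, ignored):
    """Return True if site equals an ignored domain or is a subdomain of one."""
    site = site.lower()
    return any(site == d or site.endswith("." + d) for d in ignored)


def filter_results(results, ignored):
    """Return a copy of results without entries for ignored sites.

    `results` is a hypothetical mapping of visited site -> observed domains.
    """
    return {
        site: data
        for site, data in results.items()
        if not is_ignored(site, ignored)
    }


# The ignore list from the commit message above.
ignored = parse_ignore_sites(
    "asianetnews.com,gumtree.co.za,malaysiakini.com,minna.cc,planetsuzy.org"
)

# Made-up sample data, for illustration only.
sample = {
    "www.gumtree.co.za": ["tracker.example"],
    "example.org": ["tracker.example"],
    "minna.cc": ["cdn.example"],
}

filtered = filter_results(sample, ignored)
```

Matching on a leading dot before the domain keeps a site such as `notminna.cc` from being dropped by accident while still catching `www.` and other subdomains.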