From 7852ab72cfc1beb0193f8586c056d0c88c5e4432 Mon Sep 17 00:00:00 2001
From: Misha Puzanov
Date: Tue, 18 Oct 2022 22:20:27 +0200
Subject: [PATCH 1/6] Add RSC doc section, improve docs

---
 README.md | 53 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 42 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index 1d052cc9..3efc5f7e 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,20 @@
-# rpki-prover
+# Contents
 
-Implementation of the [RPKI relying party software](https://rpki.readthedocs.io/en/latest/rpki/using-rpki-data.html) with the focus on a reasonable compromise between resource utilisation and ease of introducing changes.
+* [Introduction](#introduction)
+* [Usage](#usage)
+    - [Static Linux binary](#static-linux-binary)
+    - [Docker image](#docker-image)
+    - [Building from sources](#building-from-sources)
+* [HTTP API](#http-api)
+    - [Prometheus metrics](#prometheus-metrics)
+* [Support of RSC](#support-of-rsc)
+* [Resource consumption](#resource-consumption)
+* [Why Haskell?](#why-haskell)
+
+
+# Introduction
+
+RPKI prover is an implementation of the [RPKI relying party software](https://rpki.readthedocs.io/en/latest/rpki/using-rpki-data.html) with the focus on a reasonable compromise between resource utilisation and ease of introducing changes.
 
 Issues are tracked [here](https://github.com/lolepezy/rpki-prover/issues), any questions can be asked there as well.
 
@@ -24,7 +38,7 @@ Current and future work
 - CPU and memory optimisations
 
 
-# Using rpki-prover
+# Usage
 
 Running `rpki-prover --help` gives some reasonable help on CLI options.
 
@@ -40,11 +54,11 @@ There is no config file and all the configuration is provided with CLI (most of
 
 There is an initialise step necessary to start after downloading or building the executable: you need to run something like `rpki-prover.exe --initialise --rpki-root-directory /var/where-you-want-data-to-be` to create the necessary FS layout in `/var/where-you-want-data-to-be`. It will download the TAL files to `/var/where-you-want-data-to-be/tals` as well. The awkward part related to ARIN TAL license agreement is pretty much a rip off from the Routinator implementation as the most convenient for the user.
 
-## Static Linux binary
+## Static Linux binary
 
 Every [release](https://github.com/lolepezy/rpki-prover/releases) includes statically linked Linux x64 executable, just download and run it.
 
-## Docker image
+## Docker image
 
 It is possible to run rpki-prover as `docker run lolepezy/rpki-prover:latest`. The image is available on Docker Hub and it's about 80mb in size.
 
@@ -63,7 +77,7 @@ The important part here is `target=/rpki-data`, this directory is created by def
 docker run -p 9999:9999 --mount source=rpki-data,target=/something-else lolepezy/rpki-prover:latest --rpki-root-directory /something-else
 ```
 
-## Building from sources
+## Building from sources
 
 The software is a daemon written in Haskell and can be built using [`stack`](https://docs.haskellstack.org/en/stable/README/).
 
@@ -89,7 +103,7 @@ Normally it prints quite a lot of logs about what it's doing to the stdout. Afte
 
 Main page http://localhost:9999 is the UI that reports some metrics about trust anchorts, repositories and the list of errors and warnings.
 
-## HTTP API
+# HTTP API
 
 There are several API endpoints. Note that there is not much of design invested into the API, so more adjustments will come based on the feedback and feature requests.
 
@@ -106,11 +120,28 @@ There are several API endpoints. Note that there is not much of design invested
 - http://localhost:9999/api/object?uri=rsync://rpki.apnic.net/member_repository/A917D135/19712F8613DD11EB8FFBED74C4F9AE02/0u36vBm4dKB-OdauyKaLMLuB2lw.crl
 - or http://localhost:9999/api/object?hash=a92126d3a58a0f6593702f36713521fcd581e0a9e38a146f7f762c7d7d48ed0a
 
-## Prometheus metrics
+## Prometheus metrics
 
 Prometheus metrics are accessible via the standard `/metrics` path.
 
-## Resource consumption
+
+# Support of RSC
+
+RPKI prover supports validating RPKI Signed Checklists (https://datatracker.ietf.org/doc/draft-ietf-sidrops-rpki-rsc/).
+
+In order to validate a set of files with an RSC object it is necessary to have a running rpki-prover instance to be able to use its cache of validated objects. In the examples below it is assumed that there's an instance of rpki-prover (the same version) running with `/var/prover` set as `--rpki-root-directory` option. It is also possible to skip `--rpki-root-directory` parameter assuming that the default (`~/.rpki`) will be used.
+
+The following example validates two files `foo.txt` and `bar.bin` against the `checklist.sig` object:
+
+`rpki-prover --rpki-root-directory /var/prover --verify-signature --signature-file checklist.sig --verify-files foo.txt bar.bin`
+
+The following example validates all files in the `dir` directory against the `checklist.sig` object:
+
+`rpki-prover --rpki-root-directory /var/prover --verify-signature --signature-file checklist.sig --verify-directory ./dir`
+
+
+
+# Resource consumption
 
 Cold start, i.e. the first start without cache takes at least 2 minutes and consumes around 3 minutes of CPU time. This time can be slightly reduced by setting higher `--cpu-count` value in case multiple CPUs are available. While CPU-intensive tasks scale pretty well (speed-up is sublinear up to 8-10 CPU cores), the total warm up time is moslty limited by the download time of the slowest of RPKI repositories and cannot be reduced drastically.
 
@@ -129,12 +160,12 @@ Note that memory consumption is mostly determined by how big the biggest objects
 
 Disk space usage depends on the `--cache-lifetime-hours` parameter. The default is 72 hours and it results in a cache size about 2Gb. 72 hours is a little bit on a big side, so lower values would reduce the amount of data stored. However, LMDB is not very good in reusing the free space in its file, so physical size of the `cache` directory can be 2 or more times bigger than the total size of data in it. There is a compaction procedure that kicks in when the LMDB file size is 2 or more times bigger than the total size of all data. So overall, in the worst case scenario, it would need approximately 1GB of disk space for every 10 hours of `--cache-lifetime-hours`.
 
-## Known issues
+# Known issues
 
 - From time to time a message 'rpki-prover: Thread killed by timeout manager' may be printed to `stderr`. It's the result of a bug in the HTTP server used for API and UI and is harmless. It will be fixed one way or the other in future versions.
 - As mentioned before, total RSS of the process can go up to several gigabytes even though most of it mapped to LMDB cache and not in RAM. It may, however, be that `rpki-prover` is killed by OOM and some configuration adjustments would be needed to prevent it.
 
- ## Why Haskell?
+## Why Haskell?
 
 - Relatively small code-base. Currently the size of it is around 10KLOC, including a lot of functionality implemented from scratch, such as CMS-parsing.
 - Fast prototyping and smooth refactoring.

From e0b0f58b773b70c0e746a0c45b9f2d885d931a32 Mon Sep 17 00:00:00 2001
From: Misha Puzanov
Date: Tue, 18 Oct 2022 22:23:06 +0200
Subject: [PATCH 2/6] Fix doc

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3efc5f7e..bbb00335 100644
--- a/README.md
+++ b/README.md
@@ -165,7 +165,7 @@ Disk space usage depends on the `--cache-lifetime-hours` parameter. The default
 - From time to time a message 'rpki-prover: Thread killed by timeout manager' may be printed to `stderr`. It's the result of a bug in the HTTP server used for API and UI and is harmless. It will be fixed one way or the other in future versions.
 - As mentioned before, total RSS of the process can go up to several gigabytes even though most of it mapped to LMDB cache and not in RAM. It may, however, be that `rpki-prover` is killed by OOM and some configuration adjustments would be needed to prevent it.
 
-## Why Haskell?
+# Why Haskell?
 
 - Relatively small code-base. Currently the size of it is around 10KLOC, including a lot of functionality implemented from scratch, such as CMS-parsing.
 - Fast prototyping and smooth refactoring.

From 6b2438736931e56e430d84df4f61453bce0cebd7 Mon Sep 17 00:00:00 2001
From: Mikhail Puzanov
Date: Tue, 18 Oct 2022 22:24:47 +0200
Subject: [PATCH 3/6] Update README.md

Fix formatting
---
 README.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index bbb00335..4d4873b4 100644
--- a/README.md
+++ b/README.md
@@ -133,12 +133,15 @@ In order to validate a set of files with an RSC object it is necessary to have a
 
 The following example validates two files `foo.txt` and `bar.bin` against the `checklist.sig` object:
 
-`rpki-prover --rpki-root-directory /var/prover --verify-signature --signature-file checklist.sig --verify-files foo.txt bar.bin`
+```
+rpki-prover --rpki-root-directory /var/prover --verify-signature --signature-file checklist.sig --verify-files foo.txt bar.bin
+```
 
 The following example validates all files in the `dir` directory against the `checklist.sig` object:
 
-`rpki-prover --rpki-root-directory /var/prover --verify-signature --signature-file checklist.sig --verify-directory ./dir`
-
+```
+rpki-prover --rpki-root-directory /var/prover --verify-signature --signature-file checklist.sig --verify-directory ./dir
+```
 
 
 # Resource consumption

From 0b924bf28f7cbffc37ae6b17381510c564ec8e47 Mon Sep 17 00:00:00 2001
From: Mikhail Puzanov
Date: Tue, 18 Oct 2022 22:35:30 +0200
Subject: [PATCH 4/6] Update README.md

Fix resource consumption part
---
 README.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 4d4873b4..46daaa4c 100644
--- a/README.md
+++ b/README.md
@@ -150,14 +150,16 @@ Cold start, i.e. the first start without cache takes at least 2 minutes and cons
 
 After initial warmup, it's not a very CPU-bound application. With default settings RPKI Prover consumes about 1 hour of CPU time every 18 hours on a typical modern CPU, creating load average of 5-10%. Smaller revalidation interval will increase the load.
 
-The amount of memory needed for a smooth run for the current state of the repositories (6 trust anchors, including [AS0 TA](https://www.apnic.net/community/security/resource-certification/tal-archive/) of APNIC with about 330K of VRPs in total) is somewhere around 1.5GB. Adding or removing TAs can increase or reduce this amount. What can be confusing about memory usage is the figures given by `top/htop`.
+The amount of memory needed for a smooth run for the current state of the repositories (6 trust anchors, including [AS0 TA](https://www.apnic.net/community/security/resource-certification/tal-archive/) of APNIC with about 330K of VRPs in total) is somewhere around 1.5-2GB for all processes in total. Adding or removing TAs can increase or reduce this amount. What can be confusing about memory usage is the figures given by `top/htop`.
 
 An example of a server, running for a few days:
 ```
 VIRT RES SHR
-1.0T 5383M 3843M
+1.0T 4463M 3920M
 ```
-Here `SHR` is largely dominated by the LMDB cache and other mmap-ed files (temporary files used to download RRDP repositories, etc.). That means that actual heap of the process is about `5383-3843=1540M`.
+Here `SHR` is largely dominated by the LMDB cache and other mmap-ed files (temporary files used to download RRDP repositories, etc.). That means that actual heap of the process is about `4463-3920=543M`.
+
+Every validation or repository fetch runs as a separate process with it's own heap, with typical heap size for the validatro up to 600-700M and up to 100-200MB for a fetching process.
 
 Note that memory consumption is mostly determined by how big the biggest objects are and not that much by how many there are objects in total, so the growth of repositories is not such a big issue for rpki-prover. It it recommended to have 3GB of RAM available on the machine mostly to reduce the IOPS related to reading objects from the LMDB cache. Since every validation typically goes through 160K of objects (at the moment of writing), each of them being 3Kb in size on average, it would be benificial to have at least few hundred of megabytes in FS page cache.
 

From bb20e42b3db1c68f0dee95ba676e66abcc147e97 Mon Sep 17 00:00:00 2001
From: Mikhail Puzanov
Date: Tue, 18 Oct 2022 22:36:38 +0200
Subject: [PATCH 5/6] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 46daaa4c..64fe7b75 100644
--- a/README.md
+++ b/README.md
@@ -159,7 +159,7 @@ VIRT RES SHR
 ```
 Here `SHR` is largely dominated by the LMDB cache and other mmap-ed files (temporary files used to download RRDP repositories, etc.). That means that actual heap of the process is about `4463-3920=543M`.
 
-Every validation or repository fetch runs as a separate process with it's own heap, with typical heap size for the validatro up to 600-700M and up to 100-200MB for a fetching process.
+Every validation or repository fetch runs as a separate process with its own heap, with typical heap size for the validator up to 600-700M and up to 100-200MB for a fetching process.
 
 Note that memory consumption is mostly determined by how big the biggest objects are and not that much by how many there are objects in total, so the growth of repositories is not such a big issue for rpki-prover. It it recommended to have 3GB of RAM available on the machine mostly to reduce the IOPS related to reading objects from the LMDB cache. Since every validation typically goes through 160K of objects (at the moment of writing), each of them being 3Kb in size on average, it would be benificial to have at least few hundred of megabytes in FS page cache.
 

From 646e25010d47bb9c09ab9d5496e7dd3054859094 Mon Sep 17 00:00:00 2001
From: Mikhail Puzanov
Date: Tue, 18 Oct 2022 22:40:40 +0200
Subject: [PATCH 6/6] Update README.md

Add security pitch
---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 64fe7b75..512e1525 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,8 @@ RPKI prover is an implementation of the [RPKI relying party software](https://rp
 
 Issues are tracked [here](https://github.com/lolepezy/rpki-prover/issues), any questions can be asked there as well.
 
+This implementation seeks to address potential security vulnerabilities by utilising process isolation, memory and time constraints and other ways of preventing resource exhaustion attacks and make sure that "it keeps going" regardless of unstable or potentially maliciously constructed RPKI repositories.
+
 Implemented features are
 
 - Fetching from both rsync and RRDP repositories