Commit

Merge pull request #132 from lolepezy/rsc-doc
Add RSC doc section, improve docs
lolepezy authored Oct 19, 2022
2 parents 304c67b + 646e250 commit 49f180d
Showing 1 changed file with 52 additions and 14 deletions.
66 changes: 52 additions & 14 deletions README.md
@@ -1,9 +1,25 @@
# rpki-prover
# Contents

Implementation of the [RPKI relying party software](https://rpki.readthedocs.io/en/latest/rpki/using-rpki-data.html) with the focus on a reasonable compromise between resource utilisation and ease of introducing changes.
* [Introduction](#introduction)
* [Usage](#usage)
- [Static Linux binary](#static-linux-binary)
- [Docker image](#docker-image)
- [Building from sources](#building-from-sources)
* [HTTP API](#http-api)
- [Prometheus metrics](#prometheus-metrics)
* [Support of RSC](#support-of-rsc)
* [Resource consumption](#resource-consumption)
* [Why Haskell?](#why-haskell)


# Introduction <a name="introduction"></a>

RPKI prover is an implementation of the [RPKI relying party software](https://rpki.readthedocs.io/en/latest/rpki/using-rpki-data.html) with the focus on a reasonable compromise between resource utilisation and ease of introducing changes.

Issues are tracked [here](https://github.com/lolepezy/rpki-prover/issues); questions can be asked there as well.

This implementation seeks to address potential security vulnerabilities by using process isolation, memory and time constraints, and other ways of preventing resource exhaustion attacks, so that it "keeps going" regardless of unstable or potentially maliciously constructed RPKI repositories.

Implemented features are

- Fetching from both rsync and RRDP repositories
@@ -24,7 +40,7 @@ Current and future work
- CPU and memory optimisations


# Using rpki-prover
# Usage <a name="usage"></a>

Running `rpki-prover --help` gives some reasonable help on CLI options.

@@ -40,11 +56,11 @@ There is no config file and all the configuration is provided with CLI (most of

An initialisation step is required after downloading or building the executable: run something like `rpki-prover.exe --initialise --rpki-root-directory /var/where-you-want-data-to-be` to create the necessary FS layout in `/var/where-you-want-data-to-be`. It will also download the TAL files to `/var/where-you-want-data-to-be/tals`. The awkward part related to the ARIN TAL license agreement is largely borrowed from the Routinator implementation, as the most convenient approach for the user.

## Static Linux binary
## Static Linux binary <a name="static-linux-binary"></a>

Every [release](https://github.com/lolepezy/rpki-prover/releases) includes a statically linked Linux x64 executable; just download and run it.

## Docker image
## Docker image <a name="docker-image"></a>

It is possible to run rpki-prover as `docker run lolepezy/rpki-prover:latest`. The image is available on Docker Hub and is about 80MB in size.

@@ -63,7 +79,7 @@ The important part here is `target=/rpki-data`, this directory is created by def
docker run -p 9999:9999 --mount source=rpki-data,target=/something-else lolepezy/rpki-prover:latest --rpki-root-directory /something-else
```

## Building from sources
## Building from sources <a name="building-from-sources"></a>

The software is a daemon written in Haskell and can be built using [`stack`](https://docs.haskellstack.org/en/stable/README/).

@@ -89,7 +105,7 @@ Normally it prints quite a lot of logs about what it's doing to the stdout. Afte
The main page, http://localhost:9999, is the UI that reports some metrics about trust anchors and repositories, along with the list of errors and warnings.


## HTTP API
# HTTP API <a name="http-api"></a>

There are several API endpoints. Note that not much design effort has been invested in the API so far, so more adjustments will come based on feedback and feature requests.

@@ -106,35 +122,57 @@ There are several API endpoints. Note that there is not much of design invested
- http://localhost:9999/api/object?uri=rsync://rpki.apnic.net/member_repository/A917D135/19712F8613DD11EB8FFBED74C4F9AE02/0u36vBm4dKB-OdauyKaLMLuB2lw.crl
- or http://localhost:9999/api/object?hash=a92126d3a58a0f6593702f36713521fcd581e0a9e38a146f7f762c7d7d48ed0a
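
As a rough illustration of the object lookup endpoint above, the following Python sketch builds the two kinds of query URL. The helper name `object_url` is hypothetical, and the base address assumes the `http://localhost:9999` instance used in the examples; the response format is not described here, so the sketch only constructs the URL.

```python
from urllib.parse import urlencode

# Assumption: the instance address used in the examples above.
BASE_URL = "http://localhost:9999"


def object_url(uri=None, obj_hash=None, base=BASE_URL):
    """Build an /api/object lookup URL by rsync URI or by hash (hypothetical helper)."""
    if (uri is None) == (obj_hash is None):
        raise ValueError("pass exactly one of uri or obj_hash")
    params = {"uri": uri} if uri is not None else {"hash": obj_hash}
    # Keep ':' and '/' unescaped so the result matches the style of the examples above.
    return f"{base}/api/object?{urlencode(params, safe=':/')}"


print(object_url(obj_hash="a92126d3a58a0f6593702f36713521fcd581e0a9e38a146f7f762c7d7d48ed0a"))
```

The resulting URL can then be passed to any HTTP client, for example `urllib.request.urlopen`.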

## Prometheus metrics
## Prometheus metrics <a name="prometheus-metrics"></a>

Prometheus metrics are accessible via the standard `/metrics` path.
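
For quick checks without a Prometheus server, the text served at `/metrics` can be parsed by hand. The sketch below is a minimal parser for the Prometheus text exposition format; the metric names in the sample are made up for illustration and are not the names rpki-prover actually exports, and the parser assumes no `}` character appears after the label block.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition format into {series: value}; timestamps are ignored."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Series with labels: split after the closing brace of the label block.
            head, _, tail = line.rpartition("}")
            series, rest = head + "}", tail.split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        samples[series] = float(rest[0])
    return samples


# Hypothetical sample; real metric names will differ.
SAMPLE = """\
# HELP example_objects_total A made-up metric for illustration.
# TYPE example_objects_total counter
example_objects_total{ta="example"} 42
example_up 1
"""

print(parse_metrics(SAMPLE))
```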

## Resource consumption

# Support of RSC <a name="support-of-rsc"></a>

RPKI prover supports validating RPKI Signed Checklists (https://datatracker.ietf.org/doc/draft-ietf-sidrops-rpki-rsc/).

To validate a set of files against an RSC object, a running rpki-prover instance is required, because validation uses its cache of validated objects. The examples below assume that an instance of rpki-prover (the same version) is running with `/var/prover` set as the `--rpki-root-directory` option. The `--rpki-root-directory` parameter can also be omitted, in which case the default (`~/.rpki`) will be used.

The following example validates two files `foo.txt` and `bar.bin` against the `checklist.sig` object:

```
rpki-prover --rpki-root-directory /var/prover --verify-signature --signature-file checklist.sig --verify-files foo.txt bar.bin
```

The following example validates all files in the `dir` directory against the `checklist.sig` object:

```
rpki-prover --rpki-root-directory /var/prover --verify-signature --signature-file checklist.sig --verify-directory ./dir
```
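
When RSC verification is driven from a script, the two invocations above can be assembled programmatically. This is a minimal sketch: the helper `rsc_command` is hypothetical and only uses the flags shown in the examples. How rpki-prover reports the verification result (exit code, output) is not covered here, so the sketch stops at building the argument list.

```python
def rsc_command(root_dir, signature_file, files=None, directory=None):
    """Assemble the rpki-prover argument list for RSC verification (hypothetical helper)."""
    if bool(files) == bool(directory):
        raise ValueError("pass either files or directory, but not both")
    cmd = [
        "rpki-prover",
        "--rpki-root-directory", root_dir,
        "--verify-signature",
        "--signature-file", signature_file,
    ]
    if files:
        cmd += ["--verify-files", *files]
    else:
        cmd += ["--verify-directory", directory]
    return cmd


print(" ".join(rsc_command("/var/prover", "checklist.sig", files=["foo.txt", "bar.bin"])))
```

The returned list can be handed to `subprocess.run`, provided `rpki-prover` is on the `PATH` and an instance is running against the same root directory.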


# Resource consumption <a name="resource-consumption"></a>

Cold start, i.e. the first start without cache, takes at least 2 minutes and consumes around 3 minutes of CPU time. This time can be slightly reduced by setting a higher `--cpu-count` value if multiple CPUs are available. While CPU-intensive tasks scale pretty well (speed-up is sublinear up to 8-10 CPU cores), the total warm-up time is mostly limited by the download time of the slowest RPKI repository and cannot be reduced drastically.

After the initial warm-up, it's not a very CPU-bound application. With default settings RPKI Prover consumes about 1 hour of CPU time every 18 hours on a typical modern CPU, creating an average load of 5-10%. A smaller revalidation interval will increase the load.

The amount of memory needed for a smooth run for the current state of the repositories (6 trust anchors, including [AS0 TA](https://www.apnic.net/community/security/resource-certification/tal-archive/) of APNIC with about 330K of VRPs in total) is somewhere around 1.5GB. Adding or removing TAs can increase or reduce this amount. What can be confusing about memory usage is the figures given by `top/htop`.
The amount of memory needed for a smooth run for the current state of the repositories (6 trust anchors, including [AS0 TA](https://www.apnic.net/community/security/resource-certification/tal-archive/) of APNIC with about 330K of VRPs in total) is somewhere around 1.5-2GB for all processes in total. Adding or removing TAs can increase or reduce this amount. What can be confusing about memory usage is the figures given by `top/htop`.

An example of a server, running for a few days:
```
VIRT RES SHR
1.0T 5383M 3843M
1.0T 4463M 3920M
```
Here `SHR` is largely dominated by the LMDB cache and other mmap-ed files (temporary files used to download RRDP repositories, etc.). That means that actual heap of the process is about `5383-3843=1540M`.
Here `SHR` is largely dominated by the LMDB cache and other mmap-ed files (temporary files used to download RRDP repositories, etc.). That means that actual heap of the process is about `4463-3920=543M`.
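
The subtraction above can be repeated for any process. The following sketch shows the arithmetic on the `top` figures and, as an assumption about the environment, the same estimate taken from a Linux `/proc/<pid>/statm` line, whose second and third fields are resident and shared pages. The function names and the 4096-byte page size are illustrative assumptions.

```python
def heap_estimate_mb(res_mb, shr_mb):
    """Approximate the private heap as RES minus SHR, both in megabytes."""
    return res_mb - shr_mb


def heap_from_statm(statm_line, page_size=4096):
    """Same estimate from a /proc/<pid>/statm line; fields are counted in pages."""
    fields = statm_line.split()
    resident, shared = int(fields[1]), int(fields[2])
    return (resident - shared) * page_size / (1024 * 1024)


print(heap_estimate_mb(4463, 3920))  # 543
print(heap_estimate_mb(5383, 3843))  # 1540
```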

Every validation or repository fetch runs as a separate process with its own heap, with a typical heap size of up to 600-700MB for the validator and up to 100-200MB for a fetching process.

Note that memory consumption is mostly determined by how big the biggest objects are and not that much by how many objects there are in total, so the growth of repositories is not such a big issue for rpki-prover. It is recommended to have 3GB of RAM available on the machine, mostly to reduce the IOPS related to reading objects from the LMDB cache. Since every validation typically goes through 160K objects (at the moment of writing), each of them being 3KB in size on average, it would be beneficial to have at least a few hundred megabytes in the FS page cache.
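
The page cache figure follows from simple arithmetic on the numbers quoted above. A small sketch, with a hypothetical function name and using 1 MB = 1000 KB:

```python
def working_set_mb(object_count, avg_object_kb):
    """Total size in MB of the objects read during one validation run."""
    return object_count * avg_object_kb / 1000


# 160K objects of about 3KB each, as quoted above.
print(working_set_mb(160_000, 3))  # 480.0
```

Roughly 480MB of objects are touched per validation, which is why a few hundred megabytes of page cache noticeably reduces disk reads.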

Disk space usage depends on the `--cache-lifetime-hours` parameter. The default is 72 hours and it results in a cache size of about 2GB. 72 hours is somewhat on the large side, so lower values would reduce the amount of data stored. However, LMDB is not very good at reusing the free space in its file, so the physical size of the `cache` directory can be 2 or more times bigger than the total size of data in it. There is a compaction procedure that kicks in when the LMDB file size is 2 or more times bigger than the total size of all data. So overall, in the worst case scenario, it would need approximately 1GB of disk space for every 10 hours of `--cache-lifetime-hours`.
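
The rule of thumb above can be turned into a quick sizing check. This is a sketch of the arithmetic only: the function names are hypothetical, and `needs_compaction` merely restates the documented threshold rather than reproducing the actual compaction logic.

```python
GB_PER_10_CACHE_HOURS = 1.0  # worst-case rule of thumb quoted above


def worst_case_disk_gb(cache_lifetime_hours):
    """Worst-case disk usage for a given --cache-lifetime-hours value."""
    return cache_lifetime_hours / 10 * GB_PER_10_CACHE_HOURS


def needs_compaction(lmdb_file_bytes, data_bytes):
    """True when the LMDB file is at least twice the size of the data in it."""
    return lmdb_file_bytes >= 2 * data_bytes


print(worst_case_disk_gb(72))   # default cache lifetime
print(worst_case_disk_gb(24))   # a shorter lifetime needs less space
```

For the default of 72 hours this gives a worst case of roughly 7GB, well above the typical 2GB cache size, which leaves room for LMDB file growth between compactions.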

## Known issues
# Known issues <a name="known-issues"></a>

- From time to time a message 'rpki-prover: Thread killed by timeout manager' may be printed to `stderr`. It's the result of a bug in the HTTP server used for API and UI and is harmless. It will be fixed one way or the other in future versions.
- As mentioned before, the total RSS of the process can go up to several gigabytes, even though most of it is mapped to the LMDB cache and is not in RAM. It may, however, happen that `rpki-prover` is killed by the OOM killer, and some configuration adjustments would be needed to prevent it.

## Why Haskell?
# Why Haskell? <a name="why-haskell"></a>

- Relatively small code-base. Currently the size of it is around 10KLOC, including a lot of functionality implemented from scratch, such as CMS-parsing.
- Fast prototyping and smooth refactoring.