Skip to content

Commit

Permalink
Initial commit.
Browse files Browse the repository at this point in the history
  • Loading branch information
geerlingguy committed Nov 9, 2022
0 parents commit 67b04a9
Show file tree
Hide file tree
Showing 13 changed files with 634 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .github/FUNDING.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# These are supported funding model platforms
---
github: geerlingguy
patreon: geerlingguy
56 changes: 56 additions & 0 deletions .github/stale.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
# Configuration for probot-stale - https://github.com/probot/stale

# Number of days of inactivity before an Issue or Pull Request becomes stale
daysUntilStale: 90

# Number of days of inactivity before an Issue or Pull Request with the stale label is closed.
# Set to false to disable. If disabled, issues still need to be closed manually, but will remain marked as stale.
daysUntilClose: 30

# Only issues or pull requests with all of these labels are check if stale. Defaults to `[]` (disabled)
onlyLabels: []

# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable
exemptLabels:
- pinned
- security
- planned

# Set to true to ignore issues in a project (defaults to false)
exemptProjects: false

# Set to true to ignore issues in a milestone (defaults to false)
exemptMilestones: false

# Set to true to ignore issues with an assignee (defaults to false)
exemptAssignees: false

# Label to use when marking as stale
staleLabel: stale

# Limit the number of actions per hour, from 1-30. Default is 30
limitPerRun: 30

pulls:
markComment: |-
This pull request has been marked 'stale' due to lack of recent activity. If there is no further activity, the PR will be closed in another 30 days. Thank you for your contribution!
Please read [this blog post](https://www.jeffgeerling.com/blog/2020/enabling-stale-issue-bot-on-my-github-repositories) to see the reasons why I mark pull requests as stale.
unmarkComment: >-
This pull request is no longer marked for closure.
closeComment: >-
This pull request has been closed due to inactivity. If you feel this is in error, please reopen the pull request or file a new PR with the relevant details.
issues:
markComment: |-
This issue has been marked 'stale' due to lack of recent activity. If there is no further activity, the issue will be closed in another 30 days. Thank you for your contribution!
Please read [this blog post](https://www.jeffgeerling.com/blog/2020/enabling-stale-issue-bot-on-my-github-repositories) to see the reasons why I mark issues as stale.
unmarkComment: >-
This issue is no longer marked for closure.
closeComment: >-
This issue has been closed due to inactivity. If you feel this is in error, please reopen the issue or file a new issue with the relevant details.
31 changes: 31 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
---
name: CI
'on':
pull_request:
push:
branches:
- master

jobs:

lint:
name: Lint
runs-on: ubuntu-latest

steps:
- name: Check out the codebase.
uses: actions/checkout@v2

- name: Set up Python 3.
uses: actions/setup-python@v2
with:
python-version: '3.x'

- name: Install test dependencies.
run: pip3 install yamllint ansible

- name: Lint all the YAMLs.
run: yamllint .

- name: Run the HPL benchmark playbook without networking.
run: ansible-playbook main.yml --tags "setup,benchmark"
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
hosts.ini
11 changes: 11 additions & 0 deletions .yamllint
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---
extends: default
rules:
line-length:
max: 140
level: warning
truthy: false

ignore: |
**/.github/workflows/ci.yml
**/stale.yml
70 changes: 70 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
# Top500 Benchmark - HPL Linpack

[![CI](https://github.com/geerlingguy/top500-benchmark/workflows/CI/badge.svg?branch=master&event=push)](https://github.com/geerlingguy/top500-benchmark/actions?query=workflow%3ACI)

A common generic benchmark for clusters (or extremly powerful single node workstations) is Linpack, or HPL (High Performance Linpack), which is famous for its use in rankings in the [Top500 supercomputer list](https://top500.org) over the past few decades.

I wanted to see where my various clusters and workstations would rank, historically ([you can compare to past lists here](https://hpl-calculator.sourceforge.net/hpl-calculations.php)), so I built this Ansible playbook which installs all the necessary tooling for HPL to run, connects all the nodes together via SSH, then runs the benchmark and outputs the result.

## Why not PTS?

Phoronix Test Suite includes [HPL Linpack](https://openbenchmarking.org/test/pts/hpl) and [HPCC](https://openbenchmarking.org/test/pts/hpcc) test suites. I may see how they compare in the future.

When I initially started down this journey, the PTS versions didn't play nicely with the Pi, especially when clustered. And the PTS versions don't seem to support clustered usage at all!

## Benchmarking - Cluster

Make sure you have Ansible installed (`pip3 install ansible`), then set up a `hosts.ini` file in this directory based on the `example.hosts.ini` file.

Each host should be reachable via SSH using the username set in `ansible_user`. Other Ansible options can be set under `[cluster:vars]` to connect in more exotic clustering scenarios (e.g. via bastion/jump-host).

Tweak any settings inside `config.yml` as desired (the most important being `hpl_root`—this is where the compiled MPI, ATLAS, and HPL benchmarking code will live).

Then run the benchmarking playbook inside this directory:

```
ansible-playbook main.yml
```

This will run three separate plays:

1. Setup: downloads and compiles all the code required to run HPL. (This play takes a long time—up to many hours on a slower Raspberry Pi!)
2. SSH: configures the nodes to be able to communicate with each other.
3. Benchmark: creates an `HPL.dat` file and runs the benchmark, outputting the results in your console.

After the entire playbook is complete, you can also log directly into any of the nodes (though I generally do things on node 1), and run the following commands to kick off a benchmarking run:

```
cd ~/tmp/hpl-2.3/bin/rpi
mpirun -f cluster-hosts ./xhpl
```

> The configuration here was tested on smaller 1, 4, and 6-node clusters with 6-64 GB of RAM. Some settings in the `config.yml` file that affect the generated `HPL.dat` file may need diffent tuning for different cluster layouts!
### Benchmarking a Single Node

To run locally on a single node, clone or download this repository to the node where you want to run HPL. Make sure the `hosts.ini` is set up with the default options (with just one node, `127.0.0.1`).

Then, run the following command so the cluster networking portion of the playbook is not run:

```
ansible-playbook main.yml --tags "setup,benchmark"
```

> For testing, you can start an Ubuntu docker container:
>
> ```
> docker run -it --rm -v $PWD:/code geerlingguy/docker-ubuntu2204-ansible:latest bash
> ```
>
> Then go into the code directory (`cd /code`) and run the playbook using the command above.
## Results
In my testing on Raspberry Pi OS Bullseye, in November 2021, I got the following results:
| Benchmark | Configuration | Result | Wattage | Gflops/W |
| --- | --- | --- | --- | --- |
| HPL (1.5 GHz default clock) | Turing Pi 2 (4x CM4) | 44.942 Gflops | 24.5W | 1.83 Gflops/W |
| HPL (2.0 GHz overclock) | Turing Pi 2 (4x CM4) | 51.327 Gflops | 33W | 1.54 Gflops/W |
| HPL (1.5 GHz default clock) | DeskPi Super6c (6x CM4) | TODO Gflops | TODOW | TODO Gflops/W |
5 changes: 5 additions & 0 deletions ansible.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
[defaults]
nocows = true
inventory = hosts.ini
interpreter_python = /usr/bin/python3
stdout_callback = yaml
15 changes: 15 additions & 0 deletions config.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
---
# Working directory where HPL and associated applications will be compiled.
hpl_root: /home/pi

# HPL.dat configuration options.
# See: https://www.advancedclustering.com/act_kb/tune-hpl-dat-file/
# See also: https://hpl-calculator.sourceforge.net/HPL-HowTo.pdf
nodecount: "{{ ansible_play_hosts | length | int }}"
ram_in_gb: "{{ ( ansible_memtotal_mb / 1024 * 0.75 ) | int | abs }}"
hpl_dat_opts:
# sqrt((Memory in GB * 1024 * 1024 * 1024 * Node count) / 8) * 0.9
Ns: "{{ (((((ram_in_gb | int) * 1024 * 1024 * 1024 * (nodecount|int)) / 8) | root) * 0.90) | int }}"
NBs: 192
Ps: 2
Qs: 2
14 changes: 14 additions & 0 deletions example.hosts.ini
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# For single node benchmarking (default), use this:
[cluster]
127.0.0.1 ansible_connection=local

# For cluster benchmarking, delete everything above this line and uncomment:
# [cluster]
# node-01.local
# node-02.local
# node-03.local
# node-04.local
# node-05.local
#
# [cluster:vars]
# ansible_user=username
Loading

0 comments on commit 67b04a9

Please sign in to comment.