Add benchmarks #59

Merged
merged 62 commits into from
May 31, 2022
Changes from 14 commits
62 commits
a75351b
POC for a simple benchmark
CastilloDel Aug 1, 2021
235029f
Add config for the htsget-refserver
CastilloDel Aug 3, 2021
9fcb2ea
Add a little script to start the htsget-refserver
CastilloDel Aug 3, 2021
02075ea
Add a simple benchmark
CastilloDel Aug 4, 2021
650bbfd
Comparison between total download size
CastilloDel Aug 5, 2021
8adfd7a
Change url types
CastilloDel Aug 6, 2021
cbf58e0
Add static files to the server
CastilloDel Aug 6, 2021
4f28931
Show the benchmark download size properly
CastilloDel Aug 7, 2021
02a6418
Don't force to use '/data' as an extension for htsget-search urls
CastilloDel Aug 8, 2021
203abf9
Add another benchmark
CastilloDel Aug 8, 2021
f9f8b30
Refactor and adjustments
CastilloDel Aug 9, 2021
87ff1b3
Add a more complex benchmark
CastilloDel Aug 9, 2021
7c4924d
Add htsget-search benchmark
CastilloDel Aug 9, 2021
5e7bf21
Add more htsget-search tests
CastilloDel Aug 10, 2021
46baaac
Add VCF test
CastilloDel Aug 12, 2021
4d39271
Add a test with a big file
CastilloDel Aug 12, 2021
5b1f429
Add GitHub Action
CastilloDel Aug 12, 2021
d6d85b5
Use constants for the benchmark configuration
CastilloDel Aug 12, 2021
9874e0a
Fix CI
CastilloDel Aug 13, 2021
4674ddc
Fix CI
CastilloDel Aug 13, 2021
13f609d
Give exec bit to shell scripts, add html-reports to criterion to supr…
brainstorm Aug 16, 2021
8fb4a85
Try latest version of criterion-compare
brainstorm Aug 16, 2021
2875e2a
Add instruction on how to run the benchmarks in the README
CastilloDel Aug 16, 2021
7d0da7f
Fix typo
CastilloDel Aug 21, 2021
378e638
Fix dead code warnings for async/blocking versions
brainstorm Sep 17, 2021
0fbbefc
Merge branch 'main' into benchmarks
brainstorm Sep 17, 2021
27ff9ee
Fix bad first pass merge
brainstorm Sep 17, 2021
e98d828
Merge branch 'dead_code_warning_async_blocking' into benchmarks
brainstorm Sep 17, 2021
70652bc
Unused actix files, must check how @CastilloDel was approaching this …
brainstorm Sep 17, 2021
d4cdf58
Only get/post requests (with_range) failing on benchmarks branch as a…
brainstorm Sep 17, 2021
46ea97f
Further separating the blocking/async versions via #cfg directives, n…
brainstorm Sep 17, 2021
e0d0ab6
fmt that
brainstorm Sep 17, 2021
8fe539d
Arc only for async
brainstorm Sep 23, 2021
698aae2
Test both default --all-features and --no--default-features
brainstorm Sep 23, 2021
dfb26db
Argh, this should be in args, not command... also add cargo cache fro…
brainstorm Sep 23, 2021
8824d82
The blocking side has to have the handlers on 'pub async fn' for acti…
brainstorm Sep 27, 2021
72b8ed5
Merge branch 'dead_code_warning_async_blocking' into benchmarks
brainstorm Sep 27, 2021
998026e
Add missing actix_files::Files
brainstorm Sep 27, 2021
2af6bde
Fix several errors related to the static file server
CastilloDel Sep 28, 2021
838f78e
Fix benchmark so it can run as blocking
CastilloDel Sep 28, 2021
86be4dc
Merge branch 'main' of https://github.com/umccr/htsget-rs into benchm…
mmalenic May 22, 2022
9b40653
Merge branch 'main' of https://github.com/umccr/htsget-rs into benchm…
mmalenic May 24, 2022
50d78de
Refactor benchmarks, fix a few errors, rearrange some tests.
mmalenic May 24, 2022
5eca71a
Merge branch 'main' into benchmarks
brainstorm May 24, 2022
e7217e0
Refactor rcgen in axum server.
mmalenic May 25, 2022
65dbfd6
Merge branch 'main' of https://github.com/umccr/htsget-rs into benchm…
mmalenic May 25, 2022
8f89303
Merge branch 'benchmarks' of https://github.com/umccr/htsget-rs into …
mmalenic May 25, 2022
85a2d95
Implement script files into rust code.
mmalenic May 25, 2022
aaea724
Merge branch 'benchmarks' of https://github.com/umccr/htsget-rs into …
mmalenic May 25, 2022
4fdf781
Merge branch 'main' of https://github.com/umccr/htsget-rs into benchm…
mmalenic May 25, 2022
c89fac3
Fix localstorage path (#86)
mmalenic May 27, 2022
f44c52a
Merge branch 'benchmarks' of https://github.com/umccr/htsget-rs into …
mmalenic May 27, 2022
b252096
Fix certificate errors by using rustls-tls.
mmalenic May 27, 2022
25e364d
Http byte ranges are inclusive for ending range, fix this and all aff…
mmalenic May 30, 2022
020a9ad
Add light and heavy benchmarks, update README.md.
mmalenic May 30, 2022
a528f65
Update actions, remove files, update README.md.
mmalenic May 30, 2022
97d1875
Update action
mmalenic May 30, 2022
2a77de4
Resolve default/s3-storage feature clashes.
mmalenic May 31, 2022
da79a0f
Update action.yml.
mmalenic May 31, 2022
28e9347
Update action.yml.
mmalenic May 31, 2022
253f97f
Update action
mmalenic May 31, 2022
b13c410
Reduce number of samples.
mmalenic May 31, 2022
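
Commit 25e364d notes that the end of an HTTP byte range is inclusive. As a minimal sketch of the off-by-one involved, assuming byte positions are tracked internally as a half-open interval `[start, end)` (an assumption; the code touched by that commit is not shown here), a hypothetical `range_header` helper could look like this:

```rust
/// Hypothetical helper: formats a half-open byte interval `[start, end)`
/// as an HTTP `Range` header value, whose last-byte position is inclusive.
fn range_header(start: u64, end: u64) -> Option<String> {
    if end <= start {
        // An empty interval has no valid inclusive representation.
        return None;
    }
    // Subtract one so the exclusive end becomes the inclusive last byte.
    Some(format!("bytes={}-{}", start, end - 1))
}

fn main() {
    // The first 100 bytes of a file are bytes 0 through 99.
    println!("{:?}", range_header(0, 100));
}
```

Returning `None` for an empty interval is a design choice of this sketch, not something taken from the PR.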
11 changes: 10 additions & 1 deletion htsget-http-actix/Cargo.toml
@@ -8,9 +8,18 @@ edition = "2018"
 [dependencies]

 actix-web = "3"
+actix-files = "0.5"
 envy = "0.4"
 serde = { version = "1", features = ["derive"] }
 serde_json = "1"
 futures-util = { version = "0.3.5", default-features = false }
 htsget-http-core = { path = "../htsget-http-core" }
-htsget-search = { path = "../htsget-search" }
+htsget-search = { path = "../htsget-search" }
+
+[dev-dependencies]
+criterion = "0.3"
+reqwest = { version = "0.11", features = ["blocking", "json"] }
+
+[[bench]]
+name = "request-benchmark"
+harness = false
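
`harness = false` disables the default libtest harness for the bench target, so the bench file has to supply its own `main`; with Criterion that entry point is generated by `criterion_main!`. The std-only sketch below only illustrates the idea of a self-provided entry point. `time_iterations` is a hypothetical stand-in and performs none of Criterion's warm-up or statistical analysis.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a benchmark runner: calls `routine`
/// `iterations` times and returns the total elapsed wall-clock time.
fn time_iterations<F: FnMut()>(iterations: u32, mut routine: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        routine();
    }
    start.elapsed()
}

// With `harness = false`, a `main` like this is the bench target's entry point.
fn main() {
    let mut total: u64 = 0;
    let elapsed = time_iterations(1_000, || {
        total = total.wrapping_add(1);
    });
    println!("1000 iterations took {:?} (total = {})", elapsed, total);
}
```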
138 changes: 138 additions & 0 deletions htsget-http-actix/benches/request-benchmark.rs
@@ -0,0 +1,138 @@
use criterion::measurement::WallTime;
use criterion::{criterion_group, criterion_main, BenchmarkGroup, Criterion};
use htsget_http_core::{JsonResponse, PostRequest, Region};
use reqwest::{blocking::Client, Error as ActixError};
use serde::Serialize;
use std::collections::HashMap;
use std::{convert::TryInto, time::Duration};
#[derive(Serialize)]
struct Empty {}

const HTSGET_RS_URL: &str = "http://localhost:8080/reads/data/bam/htsnexus_test_NA12878";
const HTSGET_REFSERVER_URL: &str = "http://localhost:8081/reads/htsnexus_test_NA12878";

fn request(url: &str, json_content: &impl Serialize) -> Result<usize, ActixError> {
  let client = Client::new();
  let response: JsonResponse = client.get(url).json(json_content).send()?.json()?;
  Ok(
    response
      .htsget
      .urls
      .iter()
      .map(|json_url| {
        Ok(
          client
            .get(&json_url.url)
            .headers(
              json_url
                .headers
                .as_ref()
                .unwrap_or(&HashMap::new())
                .try_into()
                .unwrap(),
            )
            .send()?
            .text()?
            .len(),
        )
      })
      .collect::<Result<Vec<_>, ActixError>>()?
      .into_iter()
      .sum(),
  )
}

fn bench_request(
  group: &mut BenchmarkGroup<WallTime>,
  name: &str,
  url: &str,
  json_content: &impl Serialize,
) {
  println!(
    "\n\nDownload size: {} bytes",
    request(url, json_content).expect("Error during the request")
  );
  group.bench_function(name, |b| b.iter(|| request(url, json_content)));
}

fn criterion_benchmark(c: &mut Criterion) {
  let mut group = c.benchmark_group("Requests");
  group
    .sample_size(150)
    .measurement_time(Duration::from_secs(15));

  bench_request(
    &mut group,
    "htsget-rs simple request",
    HTSGET_RS_URL,
    &Empty {},
  );
  bench_request(
    &mut group,
    "htsget-refserver simple request",
    HTSGET_REFSERVER_URL,
    &Empty {},
  );

  let json_content = PostRequest {
    format: None,
    class: None,
    fields: None,
    tags: None,
    notags: None,
    regions: Some(vec![Region {
      reference_name: "20".to_string(),
      start: None,
      end: None,
    }]),
  };
  bench_request(
    &mut group,
    "htsget-rs with region",
    HTSGET_RS_URL,
    &json_content,
  );
  bench_request(
    &mut group,
    "htsget-refserver with region",
    HTSGET_REFSERVER_URL,
    &json_content,
  );

  let json_content = PostRequest {
    format: None,
    class: None,
    fields: None,
    tags: None,
    notags: None,
    regions: Some(vec![
      Region {
        reference_name: "20".to_string(),
        start: None,
        end: None,
      },
      Region {
        reference_name: "11".to_string(),
        start: Some(4999977),
        end: Some(5008321),
      },
    ]),
  };
  bench_request(
    &mut group,
    "htsget-rs with two regions",
    HTSGET_RS_URL,
    &json_content,
  );
  bench_request(
    &mut group,
    "htsget-refserver with two regions",
    HTSGET_REFSERVER_URL,
    &json_content,
  );

  group.finish();
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
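
The `request` function in this file adds up the body length of every URL in the ticket and stops at the first failed download by collecting into a `Result<Vec<_>, _>` before summing. The std-only sketch below isolates that pattern, with `String` standing in for the reqwest error type and plain values standing in for the downloads (both are assumptions made so the example runs without a server). It also shows an equivalent form that skips the intermediate `Vec`.

```rust
/// Sums the sizes of all downloads, returning the first error encountered.
/// Mirrors the collect-then-sum shape used in the benchmark's `request`.
fn total_size_collect(parts: Vec<Result<usize, String>>) -> Result<usize, String> {
    let sizes = parts.into_iter().collect::<Result<Vec<_>, String>>()?;
    Ok(sizes.into_iter().sum())
}

/// Same result without the intermediate `Vec`: `Sum` is implemented for
/// `Result`, so summing stops at the first `Err`.
fn total_size_direct(parts: Vec<Result<usize, String>>) -> Result<usize, String> {
    parts.into_iter().sum()
}

fn main() {
    let ok: Vec<Result<usize, String>> = vec![Ok(10), Ok(32)];
    let failed: Vec<Result<usize, String>> =
        vec![Ok(10), Err("connection reset".to_string())];
    println!("{:?}", total_size_collect(ok));
    println!("{:?}", total_size_direct(failed));
}
```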
2 changes: 2 additions & 0 deletions htsget-http-actix/docker-htsget-refserver.sh
@@ -0,0 +1,2 @@
docker image pull ga4gh/htsget-refserver:1.4.0
docker container run -d -p 8081:3000 -v $(pwd)/../data:/data -v $(pwd):/config ga4gh/htsget-refserver:1.4.0 ./htsget-refserver -config /config/htsget-refserver-config.json
18 changes: 18 additions & 0 deletions htsget-http-actix/htsget-refserver-config.json
@@ -0,0 +1,18 @@
{
  "htsgetConfig": {
    "props": {
      "port": 8081,
      "host": "http://localhost:8081/"
    },
    "reads": {
      "dataSourceRegistry": {
        "sources": [
          {
            "pattern": "^(?P<id>.*)$",
            "path": "/data/bam/{id}.bam"
          }
        ]
      }
    }
  }
}
5 changes: 4 additions & 1 deletion htsget-http-actix/src/main.rs
@@ -1,3 +1,4 @@
+use actix_files::Files;
 use actix_web::{web, App, HttpServer};
 use htsget_search::{
   htsget::{from_storage::HtsGetFromStorage, HtsGet},
@@ -43,12 +44,13 @@ async fn main() -> std::io::Result<()> {
   }
   let config = envy::from_env::<Config>().expect("The environment variables weren't properly set!");
   let address = format!("{}:{}", config.htsget_ip, config.htsget_port);
+  let storage_base_address = format!("{}/data", address);
   let htsget_path = config.htsget_path.clone();
   HttpServer::new(move || {
     App::new()
       .data(AppState {
         htsget: HtsGetFromStorage::new(
-          LocalStorage::new(htsget_path.clone())
+          LocalStorage::new(&htsget_path, &storage_base_address)
             .expect("Couldn't create a Storage with the provided path"),
         ),
         config: config.clone(),
@@ -79,6 +81,7 @@ async fn main() -> std::io::Result<()> {
           .route("/{id:.+}", web::get().to(get::variants::<HtsGetStorage>))
           .route("/{id:.+}", web::post().to(post::variants::<HtsGetStorage>)),
       )
+      .service(Files::new("/data", htsget_path.clone()))
   })
   .bind(address)?
   .run()
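
The change above passes a base address of the form `<ip>:<port>/data` to `LocalStorage`, and the same server mounts the data directory under `/data`, so ticket URLs can point back at the server over HTTP instead of using `file://` paths. The updated tests in `htsget-http-core` expect `http://localhost/data/bam/htsnexus_test_NA12878.bam` for a base address of `localhost/data`. The sketch below reproduces that mapping with a hypothetical `data_url` helper. How `LocalStorage` actually builds the URL is not part of this diff, so the scheme prefix and slash handling here are assumptions.

```rust
/// Hypothetical helper: builds the HTTP URL for a file relative to the
/// storage base address. The real `LocalStorage` logic is not shown in
/// this diff; the `http://` prefix is inferred from the updated tests.
fn data_url(storage_base_address: &str, key: &str) -> String {
    format!(
        "http://{}/{}",
        storage_base_address.trim_end_matches('/'),
        key.trim_start_matches('/')
    )
}

fn main() {
    // Same composition as in `main.rs`: "<ip>:<port>" followed by "/data".
    let address = format!("{}:{}", "127.0.0.1", 8080);
    let storage_base_address = format!("{}/data", address);
    println!(
        "{}",
        data_url(&storage_base_address, "bam/htsnexus_test_NA12878.bam")
    );
}
```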
37 changes: 18 additions & 19 deletions htsget-http-core/src/json_response.rs
@@ -1,13 +1,13 @@
 use std::collections::HashMap;

 use htsget_search::htsget::{Class, Format, Response, Url};
-use serde::Serialize;
+use serde::{Deserialize, Serialize};

 /// A helper struct to convert [Responses](Response) to JSON. It implements [serde's Serialize trait](Serialize),
 /// so it's trivial to convert to JSON.
-#[derive(Debug, PartialEq, Serialize)]
+#[derive(Debug, PartialEq, Serialize, Deserialize)]
 pub struct JsonResponse {
-  htsget: HtsGetResponse,
+  pub htsget: HtsGetResponse,
 }

 impl JsonResponse {
@@ -20,10 +20,10 @@ impl JsonResponse {

 /// A helper struct to represent a JSON response. It shouldn't be used
 /// on its own, but with [JsonResponse]
-#[derive(Debug, PartialEq, Serialize)]
+#[derive(Debug, PartialEq, Serialize, Deserialize)]
 pub struct HtsGetResponse {
-  format: String,
-  urls: Vec<JsonUrl>,
+  pub format: String,
+  pub urls: Vec<JsonUrl>,
 }

 impl HtsGetResponse {
@@ -39,26 +39,25 @@ impl HtsGetResponse {

 /// A helper struct to convert [Urls](Url) to JSON. It shouldn't be used
 /// on its own, but with [JsonResponse]
-#[derive(Debug, PartialEq, Serialize)]
+#[derive(Debug, PartialEq, Serialize, Deserialize)]
 pub struct JsonUrl {
-  url: String,
-  headers: HashMap<String, String>,
-  class: String,
+  pub url: String,
+  pub headers: Option<HashMap<String, String>>,
+  pub class: Option<String>,
 }

 impl JsonUrl {
   fn new(url: Url) -> Self {
     JsonUrl {
       url: url.url,
-      headers: match url.headers {
-        Some(headers) => headers.get_inner(),
-        None => HashMap::new(),
-      },
-      class: match url.class {
-        Class::Body => "body",
-        Class::Header => "header",
-      }
-      .to_string(),
+      headers: url.headers.map(|headers| headers.get_inner()),
+      class: Some(
+        match url.class {
+          Class::Body => "body",
+          Class::Header => "header",
+        }
+        .to_string(),
+      ),
     }
   }
 }
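
In `JsonUrl::new`, the `match` that turned missing headers into an empty map is replaced by `Option::map`, so absence is now carried through as `None`. The std-only sketch below contrasts the two behaviours. The `Headers` type here is a hypothetical stand-in for the one in `htsget-search`; only its `get_inner` method name is taken from the diff.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the header wrapper type used in the diff.
struct Headers(HashMap<String, String>);

impl Headers {
    fn get_inner(self) -> HashMap<String, String> {
        self.0
    }
}

/// Before: a missing header set became an empty map.
fn headers_or_empty(headers: Option<Headers>) -> HashMap<String, String> {
    match headers {
        Some(headers) => headers.get_inner(),
        None => HashMap::new(),
    }
}

/// After: absence is preserved as `None`.
fn headers_optional(headers: Option<Headers>) -> Option<HashMap<String, String>> {
    headers.map(|headers| headers.get_inner())
}

fn main() {
    println!("{:?}", headers_or_empty(None));
    println!("{:?}", headers_optional(None));
}
```

The benchmark client relies on the optional form when it calls `.as_ref().unwrap_or(&HashMap::new())` on `json_url.headers`.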
59 changes: 17 additions & 42 deletions htsget-http-core/src/lib.rs
@@ -118,7 +118,6 @@ mod tests {
     htsget::{from_storage::HtsGetFromStorage, Format, Headers, Url},
     storage::local::LocalStorage,
   };
-  use std::path::PathBuf;
   #[test]
   fn get_request() {
     let mut request = HashMap::new();
@@ -129,14 +128,10 @@ mod tests {
       get_response_for_get_request(&get_searcher(), request, Endpoint::Reads),
       Ok(JsonResponse::from_response(Response::new(
         Format::Bam,
-        vec![Url::new(format!(
-          "file://{}",
-          get_base_path()
-            .join("bam")
-            .join("htsnexus_test_NA12878.bam")
-            .to_string_lossy()
-        ))
-        .with_headers(Headers::new(headers))]
+        vec![
+          Url::new("http://localhost/data/bam/htsnexus_test_NA12878.bam")
+            .with_headers(Headers::new(headers))
+        ]
       )))
     )
   }
@@ -167,14 +162,10 @@ mod tests {
       get_response_for_get_request(&get_searcher(), request, Endpoint::Variants),
       Ok(JsonResponse::from_response(Response::new(
         Format::Vcf,
-        vec![Url::new(format!(
-          "file://{}",
-          get_base_path()
-            .join("vcf")
-            .join("sample1-bcbio-cancer.vcf.gz")
-            .to_string_lossy()
-        ))
-        .with_headers(Headers::new(headers))]
+        vec![
+          Url::new("http://localhost/data/vcf/sample1-bcbio-cancer.vcf.gz")
+            .with_headers(Headers::new(headers))
+        ]
       )))
     )
   }
@@ -200,14 +191,10 @@ mod tests {
       ),
       Ok(JsonResponse::from_response(Response::new(
         Format::Bam,
-        vec![Url::new(format!(
-          "file://{}",
-          get_base_path()
-            .join("bam")
-            .join("htsnexus_test_NA12878.bam")
-            .to_string_lossy()
-        ))
-        .with_headers(Headers::new(headers))]
+        vec![
+          Url::new("http://localhost/data/bam/htsnexus_test_NA12878.bam")
+            .with_headers(Headers::new(headers))
+        ]
       )))
     )
   }
@@ -260,27 +247,15 @@ mod tests {
       ),
       Ok(JsonResponse::from_response(Response::new(
         Format::Vcf,
-        vec![Url::new(format!(
-          "file://{}",
-          get_base_path()
-            .join("vcf")
-            .join("sample1-bcbio-cancer.vcf.gz")
-            .to_string_lossy()
-        ))
-        .with_headers(Headers::new(headers))]
+        vec![
+          Url::new("http://localhost/data/vcf/sample1-bcbio-cancer.vcf.gz")
+            .with_headers(Headers::new(headers))
+        ]
       )))
     )
   }

-  fn get_base_path() -> PathBuf {
-    std::env::current_dir()
-      .unwrap()
-      .parent()
-      .unwrap()
-      .join("data")
-  }
-
   fn get_searcher() -> impl HtsGet {
-    HtsGetFromStorage::new(LocalStorage::new("../data").unwrap())
+    HtsGetFromStorage::new(LocalStorage::new("../data", "localhost/data").unwrap())
   }
 }
5 changes: 5 additions & 0 deletions htsget-search/Cargo.toml
@@ -11,3 +11,8 @@ noodles = { version = "0.5.0", features = ["bam", "bcf", "bgzf", "cram", "csi",

 [dev-dependencies]
 tempfile = "3.2.0"
+criterion = "0.3"
+
+[[bench]]
+name = "benchmark"
+harness = false