Fix small phone display

hao-ai-lab · Mar 18, 2024 · 251706d
1 parent 5d7897e
commit 251706d

Showing 2 changed files with 7 additions and 7 deletions.
diff --git a/content/blogs/distserve/index.md b/content/blogs/distserve/index.md
@@ -118,18 +118,18 @@ Figure 5 illustrates how a request is processed in such a disaggregated system.
 Let’s go through a simple experiment to see why disaggregation is beneficial. We serve a 13B LLM on a single A100-80GB GPU with a synthetic workload of inputs of length 512 and output length 64 following [Poisson arrival](https://en.wikipedia.org/wiki/Poisson_point_process). We gradually increase the request rates (x-axis) and measure how the two latencies (P90 TTFT and P90 TPOT, y-axis) change in Figure 6.
 
 Suppose we set the SLO of P90 TTFT to 0.4 seconds and P90 TPOT to 0.04 seconds (the horizontal line in **Figure 6**). We observe that the existing system can support roughly 3 rps within the TTFT latency constraint using 1 GPU, whereas for TPOT it sustains 1.6 rps (**Figure 6a**). Since we need to satisfy both constraints, the goodput of the existing colocated system becomes:
-$$
-\text{	Goodput (colocate) = min(2.3, 1.6) = 1.6 rps (per GPU)
-}
-$$
+
+
+Goodput (colocate) = min(2.3, 1.6) = 1.6 rps (per GPU)
+
 
 
 The performance is significantly boosted after disaggregation. The prefill worker and the decode worker can each achieve a higher rps than before when handling only one phase: as shown in **Figure 6**, one prefill worker achieves roughly 5.6 rps and one decode worker achieves roughly 10 rps.
 
 More importantly, we can now **flexibly** allocate 2 prefill workers to pair with 1 decode worker (denoted 2P1D), 3 GPUs in total. The goodput becomes:
-$$
-\text{Goodput (2P1D) = min(5.6 x 2, 10) = 10 reqs/s / 3 GPUs ≈ 3.3 reqs/s (per GPU)}
-$$
+
+Goodput (2P1D) = min(5.6 x 2, 10) = 10 reqs/s / 3 GPUs ≈ 3.3 reqs/s (per GPU)
+
 
 
 **Simply disaggregating without any parallelism yields 2x goodput improvement.**
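
The goodput arithmetic in this hunk can be checked with a minimal sketch. The function names `colocated_goodput` and `disaggregated_goodput_per_gpu` are hypothetical and not part of the blog post or any library; the rates are the ones quoted in the formulas above (2.3 and 1.6 rps for the colocated system, 5.6 and 10 rps for one prefill and one decode worker). The sketch assumes throughput scales linearly with the number of workers of each kind, which is the simplification the 2P1D formula itself makes.

```python
def colocated_goodput(ttft_bound_rps: float, tpot_bound_rps: float) -> float:
    """Per-GPU goodput when one GPU must satisfy both SLOs at once."""
    # The tighter of the two SLO-bound request rates limits the system.
    return min(ttft_bound_rps, tpot_bound_rps)


def disaggregated_goodput_per_gpu(
    prefill_rps: float, decode_rps: float, n_prefill: int, n_decode: int
) -> float:
    """Per-GPU goodput of an nPmD placement, assuming linear scaling."""
    # The slower stage (all prefill workers vs. all decode workers) is the bottleneck.
    total_rps = min(prefill_rps * n_prefill, decode_rps * n_decode)
    # Normalize by the total number of GPUs used.
    return total_rps / (n_prefill + n_decode)


colocated = colocated_goodput(2.3, 1.6)
two_p_one_d = disaggregated_goodput_per_gpu(5.6, 10.0, n_prefill=2, n_decode=1)
speedup = two_p_one_d / colocated

print(f"colocated: {colocated:.1f} rps per GPU")
print(f"2P1D:      {two_p_one_d:.1f} rps per GPU")
print(f"speedup:   {speedup:.2f}x")
```

Using the "roughly 3 rps" TTFT figure from the prose instead of 2.3 gives the same colocated result, since the TPOT bound of 1.6 rps is the minimum in either case.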

diff --git a/...rve_anime-crop_hu5ed75df8f1915f731141f797d983e102_612366_720x0_resize_box_1.gif b/...rve_anime-crop_hu5ed75df8f1915f731141f797d983e102_612366_720x0_resize_box_1.gif