Key Performance Metrics (wip)

Natarajan Santhosh
2 min read · Nov 8, 2019


Why p99?

It’s the upper bound of the latencies experienced by 99% of flows (e.g., TCP flows, HTTP requests, RPCs, …). In other words, 99% of flows experience latency at or below the p99 (aka 99th-percentile) latency.

Why is it used?
In most applications, we want to minimize tail latencies, which correspond to the worst user experience. Given that roughly 1% of our measurements are noise (network congestion, outages, service degradations), the p99 latency is a good stand-in for the practical worst case. And, almost always, our goal is to reduce the p99 latency.

Let me give you an example. Let’s say you have a web app whose data is stored in a persistent DB and also partly cached in memory. If you answer 90% of requests from the cache but 10% from the persistent DB, the p99 latency is determined by the DB’s latency. At that point, you need to work on your DB design and caching strategy to improve p99; otherwise you’ll get lots of complaints from your customers (end users or other developers on your team).
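The cache/DB example above can be simulated in a few lines. This is a minimal sketch (not from the original post) using a nearest-rank percentile and made-up latencies: ~1 ms for a cache hit, ~100 ms for a DB read. With a 10% DB rate, the p99 lands squarely on the DB latency even though the median looks great.

```python
import random

def percentile(samples, p):
    """Nearest-rank percentile: the value at or below which p% of samples fall."""
    ordered = sorted(samples)
    k = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[k]

random.seed(42)
# 90% of requests hit the in-memory cache (~1 ms), 10% go to the DB (~100 ms)
latencies = [1.0 if random.random() < 0.9 else 100.0 for _ in range(10_000)]

p50 = percentile(latencies, 50)  # median: cache hit, 1.0 ms
p99 = percentile(latencies, 99)  # tail: dominated by the DB, 100.0 ms
```

The median hides the problem entirely; only the tail percentile exposes the DB path, which is why p99 (not p50) is the metric to watch here.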

Process vs Thread

Must read:

https://www.slashroot.in/difference-between-process-and-thread-linux
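The core difference the linked article covers can be shown directly: threads within one process share the same address space, so a variable mutated by one thread is visible to the others (and needs synchronization). A sketch in Python, with illustrative names of my own choosing:

```python
import threading

# Threads in one process share the same address space: all workers
# see and mutate the same `counter`. Separate processes would each
# get their own copy and would need IPC (pipes, sockets, shared
# memory) to coordinate instead.
counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        with lock:  # shared mutable state needs a lock
            counter += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All four threads incremented the one shared counter: 4 * 1000 = 4000
```

With processes you trade that cheap sharing for isolation: a crash or memory corruption in one process can’t clobber the others, at the cost of heavier creation and explicit IPC.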

Performance Testing Notes

unit testing: Rails’ built-in performance testing

incremental e2e load testing

Deploy a persistent test environment and keep it running.

On every change, add incremental performance data setup, then performance-test the new change plus a regression pass.

Memory Paging

The virtual memory concept comes from a time when memory was expensive.

A portion of the hard disk acts as an extension of physical memory, called a page file (or swap space).

When the machine runs out of memory, it moves inactive pages of memory to the hard disk, freeing RAM for active processes.

The more a workload relies on swapped-out pages, the more it will negatively affect performance.
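The performance hit is easy to quantify with the classic effective-access-time formula. The numbers below are my own illustrative assumptions (~100 ns for a RAM access, ~10 ms for a disk-backed page fault), not measurements from the post:

```python
# Effective access time when some fraction of memory accesses
# hit pages that were swapped out to disk.
RAM_NS = 100           # assumed RAM access time, ~100 ns
DISK_NS = 10_000_000   # assumed page-fault service time, ~10 ms

def effective_access_ns(fault_rate):
    """Average access time given the fraction of accesses that page-fault."""
    return (1 - fault_rate) * RAM_NS + fault_rate * DISK_NS

# Even a 0.1% fault rate makes average memory access ~100x slower:
# 0.999 * 100 ns + 0.001 * 10 ms ≈ 10,100 ns
slowdown = effective_access_ns(0.001) / RAM_NS
```

Because disk is several orders of magnitude slower than RAM, even a tiny swap rate dominates the average, which is why a workload that actively touches swapped pages degrades so badly.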
