The number of documents (denoted as N) to search over consumes resources and increases the overall latency of the search infrastructure. However, the amount of traffic ebbs and flows. Under periods of lower traffic, we likely can increase N to provide good search results without violating latency constraints. Conversely, high traffic periods likely requires lower values of N.
Let's then approximate system strain by p99 (or p99.XXX) latency.
Solution:
Use a PID controller to set N as a function of latency (p99, p99.5, etc.) of the cluster. This leads to the outcome where N reduces when p99 latency starts to spike (resource starvation), and increases when p99 is low.
The number of documents (denoted as N) to search over consumes resources and increases the overall latency of the search infrastructure. However, the amount of traffic ebbs and flows. Under periods of lower traffic, we likely can increase N to provide good search results without violating latency constraints. Conversely, high traffic periods likely requires lower values of N.
Let's then approximate system strain by p99 (or p99.XXX) latency.
Solution:
Use a PID controller to set N as a function of latency (p99, p99.5, etc.) of the cluster. This leads to the outcome where N reduces when p99 latency starts to spike (resource starvation), and increases when p99 is low.