What is being measured here? For end-to-end time, one model is:
t_total = t_network + t_queue + t_batch_wait + t_inference + t_service_overhead
What is being measured here? For end-to-end time, one model is:
t_total = t_network + t_queue + t_batch_wait + t_inference + t_service_overhead