One of the most important performance indicators for an online system is its response time, and high response times tend to drive revenue loss for online businesses. Research has shown that poor response time can result from unavoidable variability in server performance, which can produce very high tail latency. Although the impact of this variability is negligible in systems with a small number of servers, it becomes significant as the system scales, since user requests are more likely to be handled by a poorly performing server. However, scaled applications often employ redundant servers as replicas to increase resilience, an approach that is widespread in databases. This opens the possibility of using a replica selection algorithm that chooses the best-performing server each time a request needs to be processed, thereby reducing the effect of tail latency.
In this presentation, I will present my research on a replica selection algorithm based on a regression model. After learning query response time patterns, the algorithm can direct long-running queries to better-performing servers, helping to reduce tail latency while preserving system throughput.
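
To make the idea concrete, the following is a minimal Python sketch of regression-based replica selection. It is an illustration under stated assumptions, not the algorithm from the research itself: it assumes a single hypothetical feature (`query_cost`), a simple per-replica linear model fitted by least squares, a hypothetical `long_threshold_ms` cutoff for what counts as a long-running query, and a round-robin fallback for short queries as one possible way to keep load spread out.

```python
from itertools import cycle


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return intercept, slope


class ReplicaSelector:
    """Hypothetical selector: one linear response-time model per replica."""

    def __init__(self, history, long_threshold_ms):
        # history maps a replica name to observed (query_cost, response_ms)
        # pairs. Both the feature and the data layout are assumptions.
        self.models = {
            replica: fit_line(*zip(*observations))
            for replica, observations in history.items()
        }
        self.long_threshold_ms = long_threshold_ms
        self._round_robin = cycle(sorted(history))

    def predict(self, replica, query_cost):
        """Predicted response time in ms for this query on this replica."""
        intercept, slope = self.models[replica]
        return intercept + slope * query_cost

    def choose(self, query_cost):
        """Pick a replica for a query with the given estimated cost."""
        predictions = {r: self.predict(r, query_cost) for r in self.models}
        expected_ms = sum(predictions.values()) / len(predictions)
        if expected_ms >= self.long_threshold_ms:
            # Long-running query: send it to the replica predicted fastest.
            return min(predictions, key=predictions.get)
        # Short query: plain round-robin, so load stays evenly spread.
        return next(self._round_robin)
```

In this sketch only queries whose average predicted response time crosses the threshold are steered to the fastest replica; everything else is distributed evenly, which is one simple way to limit the throughput cost of always favoring a single server.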