## Abstract

Server-side variability – the idea that the same job can take longer to run on one server than another due to server-dependent factors – isan increasingly important concern in many queueing systems. One strategy for overcoming server-side variability to achieve low response time is redundancy, under which jobs create copies of themselves and send these copies to multiple different servers, waiting for only one copy to complete service. Most of the existing theoretical work on redundancy has focused on developing bounds, approximations, and exact analysis to study the response time gains offered by redundancy. However, response time is not the only important metric in redundancy systems: in addition to providing low overall response time, the system should also be fair in the sense that no job class should have a worse mean response time in the system with redundancy than it did in the system before redundancy is allowed. In this paper we use scheduling to address the simultaneous goals of (1) achieving low response time and (2) maintaining fairness across job classes. We develop new exact analysis for per-class response time under First-Come First-Served (FCFS) scheduling for a general type of system structure; our analysis shows that FCFS can be unfair in that it can hurt non-redundant jobs. We then introduce the Least Redundant First (LRF) scheduling policy, which we prove is optimal with respect to overall system response time, but which can be unfair in that it can hurt the jobs that become redundant. Finally, we introduce the Primaries First (PF) scheduling policy, which is provably fair and also achieves excellent overall mean response time.

Original language | English |
---|---|

Pages (from-to) | 1-25 |

Number of pages | 25 |

Journal | Performance Evaluation |

Volume | 116 |

DOIs | |

Publication status | Published - Nov 2017 |

### Bibliographical note

Funding Information:This work was supported by the NSF under grants NSF-XPS-1629444 , NSF-CMMI-1538204 , and NSF-CMMI-1334194 ; by a Google Faculty Research Award; by a Facebook Faculty Gift; by a Google Anita Borg Memorial Scholarship; by the Siebel Scholars Program; and by the Academy of Finland for the FQ4BD project (grant no. 296206 ). We would like to thank the referees for their detailed feedback. Appendix A

Funding Information:

Lemma 4 Proof We find the normalizing constant via induction on the number of job classes. Our base case is the W model, which has three classes and two servers (note that we can obtain systems with two or one classes by simply setting the arrival rates for unwanted classes to be 0). From [6] , we know that the normalizing constant in the W model is C W = 1 − ρ A 1 − ρ B 1 − ρ R . Now assume that the normalizing constant follows the form given in (4) for all systems with at most m job classes. Consider a system with m + 1 job classes, and let class R be the most redundant class (i.e., class- R jobs replicate to all servers). By Lemma 3 , we know that the limiting probability of being in state n R , R ; ( A ) ( B ) is π n R , R ; ( A ) ( B ) = C ⋅ λ μ A + μ B n R λ R μ A + μ B ⋅ π A ⋅ π B , where C is the normalizing constant, and where there are i ≤ m job classes in the left subsystem and m − i job classes in the right subsystem. Aggregating over all possible states for the left and the right subsystems, we find, for a constant ξ to be determined below, π n R , R ; ( ∗ ) ( ∗ ) = ξ λ μ A + μ B n R λ R μ A + μ B , where, letting A 1 , … , A i and B 1 , … , B n − i denote the job classes in subsystems A and B respectively, C A = ∏ y = 1 i ( 1 − ρ A y ) and C B = ∏ y = 1 m − i ( 1 − ρ B y ) follow from the inductive hypothesis, and C = ξ ⋅ C A ⋅ C B . Finally, we sum over all values of □ n , as well as the state in which there are no class- R jobs in the system; this sum must be equal to 1: 1 = π R , − , ( ∗ ) ( ∗ ) + ∑ n = 0 ∞ π R , n R , ( ∗ ) ( ∗ ) 1 = ξ + ∑ n = 0 ∞ ξ λ μ A + μ B n R λ R μ A + μ B 1 = ξ 1 + λ R μ A + μ B 1 1 − λ μ A + μ B 1 = ξ μ A + μ B − λ + λ R μ A + μ B − λ , so we have ξ = π − , R ; ( ∗ ) ( ∗ ) = μ A + μ B − λ μ A + μ B − λ + λ R = 1 − ρ R and hence C = ξ ⋅ C A ⋅ C B = ∏ i = 1 ℓ ( 1 − ρ i ) as desired. Kristen Gardner is an Assistant Professor of Computer Science at Amherst College. Her research interests are in queueing theory and performance modeling. She received her Ph.D. in Computer Science from Carnegie Mellon University in 2017, advised by Mor Harchol-Balter. She was a Class of 2017 Siebel Scholar, received a Google Anita Borg Memorial Scholarship in 2016, and was supported by an NSF Graduate Research Fellowship from 2012–2015. She obtained her B.A. in Computer Science from Amherst College in 2012. Mor Harchol-Balter is a Professor of Computer Science at Carnegie Mellon. She received her Ph.D. from U.C. Berkeley in 1996 under the direction of Manuel Blum. She joined CMU in 1999, and served as the Head of the PhD program from 2008-2011. Mor is a recipient of the McCandless Junior Chair, the NSF CAREER award, and several teaching awards, including the Herbert A. Simon Award. She has received faculty awards from a dozen companies including Google, Microsoft, IBM, EMC, Facebook, Intel, Yahoo, and Seagate. Mor’s work focuses on designing new resource allocation policies, including load balancing policies, power management policies, and scheduling policies, for distributed systems. She is known for her new techniques in queueing-theoretic modeling and analysis, as well as her work in computer systems implementation. Mor is heavily involved in the SIGMETRICS/PERFORMANCE research community, where she has received multiple best paper awards, and where she served as Technical Program Chair in 2007, as General Chair in 2013, and as the Keynote Speaker for 2016. She is also the author of a popular textbook, “Performance Analysis and Design of Computer Systems,” published by Cambridge University Press. Esa Hyytiä is an Associate Professor in the Department of Computer Science at the University of Iceland (HÎ). During 2005–2006, he was with the Norwegian University of Science and Technology (NTNU), Norway, 2006–2009 with Telecommunication Research Center Vienna (FTW), Austria, and 2009–2015 with Aalto University, Finland. He received his Dr.Sc. (Tech.) degree in Electrical Engineering from Helsinki University of Technology (TKK) in 2004. His research interests include performance analysis, modelling, design and optimization of computer and communications systems. Rhonda Righter is a Professor and past Chair of the Department of Industrial Engineering and Operations Research at the University of California, Berkeley. Before joining Berkeley she was a Professor of Operations Management and Information Systems in the Leavey School of Business at Santa Clara University. Her PhD is in Industrial Engineering and Operations Research from UC Berkeley. Her primary research and teaching interests are in the general area of stochastic modeling and optimization, especially as applied to service, manufacturing, computer communication, and cloud computing systems. She is an associate editor for the Journal of Scheduling, Queueing Systems, and the INFORMS Service Science Journal. She formerly served on the editorial boards of Management Science, Operations Research, and Operations Research Letters. She is the past (founding) Chair of the Applied Probability Society of INFORMS.

Publisher Copyright:

© 2017 Elsevier B.V.

## Other keywords

- Queueing theory
- Redundancy
- Replication
- Resource allocation
- Scheduling
- Stochastic processes