Queueing theory is the study of "the phenomena of standing, waiting, and serving" (definition given by Leonard Kleinrock in the introduction to his two-volume work, Queueing Systems). As a mathematical discipline, queueing theory draws on the work of many famous mathematicians of the past: Euler, Gauss, Poisson, etc., but the systematic application of queueing theory to telecommunications begins with the work of the great Danish telephone engineer Agner Erlang in the early 20th century.
On this page we give an introduction to queueing theory that is complete enough so that a specialist can understand the theoretical basis for the functions in the Erlang Library for Excel, but incomplete enough so that non-specialists will not get lost in a sea of details.
Spelling. In other contexts the word "queueing" is often spelled "queuing", but practioners of queueing theory prefer to place two e's in the word.
To use queueing theory, you need three things:
We may completely understand the queueing model, because it is a simplified system that exists only in the idealized world of mathematics. It may have precise formulas that describe it completely. By contrast, our understanding of a real-world system is always imperfect. Real-world systems are complex, messy, and mysterious. For that reason, it will never be true that a queueing model exactly represents a real-world system. The art of applying queueing theory consists in deciding which models to use with which systems, and how to perform the mapping (3) above in order to glean helpful information about real-world systems.
Beginning with Agner Erlang, some of the most successful applications of queueing theory have been in telecommunications. The Erlang Library for Excel gathers together several of the queueing functions that are most useful to telecommunications analysts. These functions are based on two queueing models:
To read about either model in detail, click on its link above.
In both models, we assume the existence of a finite number c of entities called servers. The servers operate inside a system containing a finite number r of slots or positions. Each server occupies a position. Therefore r is always greater than or equal to c (and r may be infinite). Positions not containing a server are called waiting positions. The number of waiting positions is r − c . Those positions containing a server will be called server positions.
Each position may hold 0 or 1 servers, and also 0 or 1 customers. What are customers? They are entities who arrive at the system and then either leave immediately, or enter the system and stay for a while. While in the system, a customer either occupies a waiting position, or shares a server position with a server.
How do customers behave? When a customer arrives at the system, the customer looks to see if any server positions have no customers in them. If so, the customer enters such a server position and begins service. After a while that customer ends service, leaving the system. On the other hand, it can happen that when the customer arrives at the system, all server positions are occupied by other customers receiving service. In that case, the arriving customer looks to see if any waiting positions are unoccupied. If so, then that customer enters a waiting position and waits.
When a customer finishes service, the customer leaves the system (never to return). The server who was sharing the server postion that the customer just vacated looks to see if any waiting positions have a customer in them. If so, the server chooses the longest waiting customer; that customer then vacates the waiting position, joins the server, and begins service. But if no customer is waiting, then the server becomes idle and waits until an arriving customer (see above) joins the server to begin service. (We have just specified here that the queueing discipline is FIFO, First-In-First-Out.)
We assume that the customers arrive at random times in what is called a Poisson process, with arrival rate λ > 0. This means that at any time t, the probability that at least one customer will arrive during the time interval from t to t + Δt (where Δt is a small time duration) is approximately λΔt. The exact value of this probability is 1 − e−λΔt.
Service time (also called duration of service, or handle time) is measured from the moment that a customer begins service until the moment that the customer ends service. (Time spent waiting is not included in service time.) We assume that service time is exponentially distributed with service rate μ > 0. The service rate is the reciprocal of the average service time, which we will call Average Handle Time, or AHT. Thus, μ = 1 / AHT. The phrase "exponentially distributed" has the meaning that the fraction f of service times that are less than a given value t is given by
f = 1 − e−μt
We define a quantity x called the offered traffic load as x = arrival rate divided by service rate. In other words,
x = λ / μ
You can think of x as the "average number of simultaneous services that are trying to take place" during the operation of the system. Since x is the quotient of two frequencies, it is said to be a "unitless" quantity. Nevertheless, its "units" are called erlangs, in honor of Agner Erlang. Thus, for example, if the arrival rate = 10/sec, and the service rate = 2/sec, then the traffic load is x = 5 erlangs.
Queueing theory analysts use a shorthand called Kendall 
  notation to quickly describe queueing models such as the Erlang B and 
  Erlang C models.  A Kendall expression most commonly takes the form A/B/C/K, 
  where
    A describes the arrival behavior of customers
    B describes the distribution of service times
    C is the number of servers, and
    K is the total number of positions in the system (including 
  both serving positions and waiting positions).  K may be infinite, 
  in which case K may be omitted.  If A=M, 
  then the customers arrive in a Poisson process.  If B=M, 
  then service times are exponentially distributed. 
Although queueing theory textbooks usually denote the number of servers by c, in this help file we sometimes use n or nsrv.
In the Erlang B queueing model we assume Poisson arrivals, exponential service times, c servers (where c is a non-negative integer), and no waiting positions. In Kendall notation, therefore, this model is denoted M/M/c/c .
In the Erlang C queueing model we assume Poisson arrivals, exponential service times, c servers (where c is a non-negative integer), and infinitely many waiting positions. In Kendall notation, this model becomes M/M/c/∞, or simply, M/M/c . An important special case is the M/M/1 queueing model, commonly called the single-server queue.