Permutations with Inversions

Barbara H. Margolius
Cleveland State University
Cleveland, Ohio 44115

Email address: margolius@math.csuohio.edu

Abstract: The number of inversions in a random permutation is a way to measure the extent to which the permutation is "out of order". Sequence A008302 in The On-line Encyclopedia of Integer Sequences is a triangle of these inversion numbers. The kth entry of the nth row of this triangle counts the number of distinct permutations of n elements with k inversions. In this paper, we derive an asymptotic formula for some of the diagonals of the triangle of the number of inversions in a random permutation. Asymptotic formulas are provided for the sequences: A000707, A001892, A001893, A001894, A005283, A005284, and A005285. Asymptotic behavior of the inversion numbers is illustrated with 2 interactive figures.

Contents

  1. Introduction
  2. Generating Function
  3. Asymptotic Normality
  4. An explicit formula for the inversion numbers
  5. An asymptotic formula for the inversion numbers
  6. References

1. Introduction

Let a_1, a_2, ..., a_n be a permutation of the set {1,2,...,n}. If i < j and a_i > a_j, the pair (a_i, a_j) is called an "inversion" of the permutation; for example, the permutation 3142 has three inversions: (3,1), (3,2), and (4,2). Each inversion is a pair of elements that is "out of sort", so the only permutation with no inversions is the sorted permutation.
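The definition can be checked directly on the example above. The following is a minimal sketch; the function name `inversions` is illustrative and not part of the paper.

```python
from itertools import combinations

def inversions(perm):
    """Return the inversions (a_i, a_j) of perm: pairs with i < j and a_i > a_j."""
    # combinations() yields pairs in positional order, so a always precedes b.
    return [(a, b) for a, b in combinations(perm, 2) if a > b]

print(inversions([3, 1, 4, 2]))   # [(3, 1), (3, 2), (4, 2)]
print(inversions([1, 2, 3, 4]))   # [] : the sorted permutation has no inversions
```

This brute-force check examines all n(n-1)/2 pairs, which is fine for illustration; it is not meant as an efficient counting method.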

2. Generating Function

Let I_n(k) denote the number of permutations of length n with k inversions.

Theorem 1

[Muir, 1898][10] The numbers I_n(k) have as a generating function

\[ F_n(x) = \sum_{k=0}^{n(n-1)/2} I_n(k)\, x^k = \prod_{j=1}^{n} \frac{1 - x^j}{1 - x}. \]

Clearly, the number of permutations with no inversions, I_n(0), is 1 for all n, and in particular I_1(0) = 1 = F_1(x), so the formula given in the theorem is correct for n = 1. Now consider a permutation of n-1 elements. We insert the nth element in the jth position, j = 1,2,...,n, choosing the insertion point at random. Since the nth element is larger than each of the n-1 elements of the set {1,2,...,n-1}, inserting it in the jth position adds n-j inversions. Each number of additional inversions, 0, 1, ..., n-1, arises from exactly one insertion point and so is equally likely; the generating function for the number of additional inversions is therefore

\[ 1 + x + x^2 + \cdots + x^{n-1} = \frac{1 - x^n}{1 - x}. \]

The additional inversions are independent of the inversions already present in the permutation of length n-1, so the generating function for the total number of inversions is the product of the generating function for permutations of n-1 elements and the generating function for the additional inversions:

\[ F_n(x) = F_{n-1}(x)\, \frac{1 - x^n}{1 - x}. \]

The required result then follows by induction.
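The insertion argument translates directly into a way of computing the inversion numbers: multiply the coefficient list for n-1 elements by 1 + x + ... + x^(n-1). The sketch below assumes nothing beyond that recurrence; the function name `inversion_numbers` is illustrative.

```python
def inversion_numbers(n):
    """Return [I_n(0), I_n(1), ..., I_n(n(n-1)/2)], the coefficients of F_n(x)."""
    coeffs = [1]                          # F_1(x) = 1
    for m in range(2, n + 1):
        # Multiply by 1 + x + ... + x^(m-1): inserting element m
        # adds between 0 and m-1 new inversions.
        new = [0] * (len(coeffs) + m - 1)
        for k, c in enumerate(coeffs):
            for extra in range(m):
                new[k + extra] += c
        coeffs = new
    return coeffs

print(inversion_numbers(4))   # [1, 3, 5, 6, 5, 3, 1]
```

The coefficients for each n sum to n!, since every permutation is counted exactly once.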

The table below lists the inversion numbers, sequence A008302 in [13]; see also [2], [3], [8], and [11]:

Table 1. I_n(k) = I_n((n(n-1)/2)-k); columns are indexed by k, the number of inversions

n\k     0     1     2     3     4     5     6     7     8     9    10    11    12    13
  1     1
  2     1     1
  3     1     2     2     1
  4     1     3     5     6     5     3     1
  5     1     4     9    15    20    22    20    15     9     4     1
  6     1     5    14    29    49    71    90   101   101    90    71    49    29    14
  7     1     6    20    49    98   169   259   359   455   531   573   573   531   455
  8     1     7    27    76   174   343   602   961  1415  1940  2493  3017  3450  3736
  9     1     8    35   111   285   628  1230  2191  3606  5545  8031 11021 14395 17957
 10     1     9    44   155   440  1068  2298  4489  8095 13640 21670 32683 47043 64889

Table 1 (continued). I_n(k) = I_n((n(n-1)/2)-k); columns are indexed by k, the number of inversions

n\k     14     15     16     17     18     19     20     21     22     23
  6      5      1
  7    359    259    169     98     49     20      6      1
  8   3836   3736   3450   3017   2493   1940   1415    961    602    343
  9  21450  24584  27073  28675  29228  28675  27073  24584  21450  17957
 10  86054 110010 135853 162337 187959 211089 230131 243694 250749 250749

3. Asymptotic Normality

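The unimodal behavior of the inversion numbers suggests that the number of inversions in a random permutation may be asymptotically normal. We explore this possibility by looking at the generating function for the probability distribution of the number of inversions. To get this generating function, we divide F_n(x) by n!, since each of the n! permutations is equally likely:

\[ \phi_n(x) = \frac{F_n(x)}{n!} = \prod_{j=1}^{n} \frac{1 - x^j}{j\,(1 - x)}. \]

Following Vladimir Sachkov, we have the moment generating function [12] of the number of inversions measured from its mean n(n-1)/4,

\[ M_n(t) = e^{-n(n-1)t/4}\, \phi_n(e^t) = e^{-n(n-1)t/4} \prod_{j=1}^{n} \frac{e^{jt} - 1}{j\,(e^t - 1)}. \]

The explicit formula for the generating function of the Bernoulli numbers is

\[ \frac{t}{e^t - 1} = \sum_{k=0}^{\infty} B_k \frac{t^k}{k!}, \qquad |t| < 2\pi. \]

So we have

\[ \frac{d}{dt} \ln\frac{e^t - 1}{t} = 1 + \frac{1}{t}\left( \frac{t}{e^t - 1} - 1 \right) = \frac{1}{2} + \sum_{k=1}^{\infty} B_{2k} \frac{t^{2k-1}}{(2k)!}, \]

\[ \ln\frac{e^t - 1}{t} = \frac{t}{2} + \sum_{k=1}^{\infty} \frac{B_{2k}}{2k\,(2k)!}\, t^{2k}, \]

where the final step follows from integrating both sides and noting that

\[ \lim_{t \to 0} \ln\frac{e^t - 1}{t} = 0, \]

so the constant of integration is zero.

Using this generating function, we get that the log of the moment generating function is

\[ \ln M_n(t) = -\frac{n(n-1)}{4}\, t + \sum_{j=1}^{n} \left[ \ln\frac{e^{jt} - 1}{jt} - \ln\frac{e^t - 1}{t} \right] = \sum_{k=1}^{\infty} \frac{B_{2k}}{2k\,(2k)!}\, t^{2k} \sum_{j=1}^{n} \left( j^{2k} - 1 \right). \]

Now consider ln M_n(t/s), where s is the standard deviation of the number of inversions in a random equiprobable permutation with n elements, s^2 = (2n^3+3n^2-5n)/72:

\[ \ln M_n(t/s) = \frac{t^2}{2} + \sum_{k=2}^{\infty} \frac{B_{2k}}{2k\,(2k)!}\, \frac{t^{2k}}{s^{2k}} \sum_{j=1}^{n} \left( j^{2k} - 1 \right). \]

The sum

\[ \sum_{j=1}^{n} \left( j^{2k} - 1 \right) \]

for k > 1 is bounded above by the following integral,

\[ \int_0^{n+1} x^{2k}\, dx = \frac{(n+1)^{2k+1}}{2k+1}, \]

so, since s^2 grows like n^3/36,

\[ \frac{1}{s^{2k}} \sum_{j=1}^{n} \left( j^{2k} - 1 \right) = O\left( n^{1-k} \right), \qquad k \ge 2. \]

Hence,

\[ \ln M_n(t/s) \to \frac{t^2}{2} \quad \text{as } n \to \infty, \]

uniformly for t from any bounded set.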

This leads to the following theorem:

Theorem 2 [Sachkov][12] If x_n is a random variable representing the number of inversions in a random equiprobable permutation of n elements, then the random variable

\[ \frac{x_n - n(n-1)/4}{\sqrt{(2n^3 + 3n^2 - 5n)/72}} \]

has an asymptotically normal distribution with parameters (0,1).
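The normalization in the theorem can be checked numerically. The sketch below computes the exact distribution from the insertion recurrence and compares its mean and variance with n(n-1)/4 and (2n^3+3n^2-5n)/72. It then evaluates the probability mass near the mean, scaled by the standard deviation, which should approach the standard normal density at zero, 1/sqrt(2*pi). All function names are illustrative, not from the paper.

```python
import math
from fractions import Fraction

def inversion_numbers(n):
    """Return [I_n(0), ..., I_n(n(n-1)/2)] via the insertion recurrence."""
    coeffs = [1]
    for m in range(2, n + 1):
        new = [0] * (len(coeffs) + m - 1)
        for k, c in enumerate(coeffs):
            for extra in range(m):        # element m adds 0..m-1 inversions
                new[k + extra] += c
        coeffs = new
    return coeffs

def mean_and_variance(n):
    """Exact mean and variance of the number of inversions, as Fractions."""
    coeffs = inversion_numbers(n)
    total = math.factorial(n)
    mean = Fraction(sum(k * c for k, c in enumerate(coeffs)), total)
    second = Fraction(sum(k * k * c for k, c in enumerate(coeffs)), total)
    return mean, second - mean ** 2

def scaled_pmf_near_mean(n):
    """s * P(x_n = k) at k = floor(n(n-1)/4); compare with 1/sqrt(2*pi)."""
    coeffs = inversion_numbers(n)
    s = math.sqrt((2 * n**3 + 3 * n**2 - 5 * n) / 72)
    return s * (coeffs[n * (n - 1) // 4] / math.factorial(n))

for n in (5, 10, 20, 40):
    mean, var = mean_and_variance(n)
    assert mean == Fraction(n * (n - 1), 4)
    assert var == Fraction(2 * n**3 + 3 * n**2 - 5 * n, 72)
    print(n, scaled_pmf_near_mean(n), 1 / math.sqrt(2 * math.pi))
```

Exact rational arithmetic is used for the moments so that the comparison with the closed forms is an equality rather than a floating-point approximation.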

The graph below shows the density for a standard normal random variable in black. The red curves give a continuous approximation for the discrete probability mass function for the number of inversions of a random permutation with n elements. Graphs are shown for n = 10 to n = 100. Holding the mouse over any of the numbers n = 10 to n = 100 shown at the bottom of the figure will show the graph for that n value. Note that as n increases, the red curve moves closer to the standard normal density so that it appears that the normal density may serve as a useful tool for approximating the inversion numbers.

Figure 1. Comparison of the inversion probability mass function to the standard normal density

The figure below shows the ratio of the inversion numbers to the estimate provided by the normal density. The better the approximation, the closer the curve will be to one. The graph is scaled so that the x-axis is the number of standard deviations from the mean.

Figure 2. The ratio of the inversion probability mass function to the standard normal density scaled by the number of standard deviations from the mean

The curves are shaped sort of like a cowboy hat. The top of the hat at about y = 1 seems to be getting broader as n increases (black is n=10, red is n=25, blue is n=50, and green is n=100), suggesting that the approximation improves with increasing n. Compare the figure above to the one below:

Figure 3. The ratio of the inversion probability mass function to the standard normal density scaled by the nonzero inversion numbers

The curves are rescaled in this figure so that 0 inversions is mapped to -0.5, and n(n-1)/2 inversions is mapped to 0.5 on the x-axis. In this way, we can see whether the estimates for the nonzero inversion numbers improve as a percentage of the total nonzero inversion numbers as n increases. Note that the colored curves are in the opposite order of the preceding figure. The figure suggests that the estimates actually get worse as n increases. The width of the top of the cowboy hat is getting narrower as n increases. What this shows is that the relative error of the normal density approximation increases as n increases as we move further into the tails of the distribution. We can examine the asymptotic behavior of I_n(k) for k ≤ n more closely.

4. An explicit formula for the inversion numbers

Donald Knuth has made the observation that we may write an explicit formula for the kth coefficient of the generating function when k ≤ n, [4], p.16. In that case,

Theorem 3 [Knuth, Netto][8],[11] The inversion numbers I_n(k) satisfy the formula

\[ I_n(k) = \binom{n+k-1}{k} + \sum_{j=1}^{\infty} (-1)^j \binom{n+k-u_j-j-1}{k-u_j-j} + \sum_{j=1}^{\infty} (-1)^j \binom{n+k-u_j-1}{k-u_j} \tag{1} \]

for k ≤ n.

The binomial coefficients are defined to be zero when the lower index is negative, so there are only finitely many nonzero terms: those with u_j + j ≤ k for the first sum, and those with u_j ≤ k for the second. The u_j are the pentagonal numbers, sequence A000326 in [13],

\[ u_j = \frac{j(3j-1)}{2}. \]

Donald Knuth's formula follows from the generating function and Euler's pentagonal number theorem.

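Theorem 4 [Euler][1][7][8] With u_j = j(3j-1)/2,

\[ \prod_{j=1}^{\infty} (1 - x^j) = 1 + \sum_{j=1}^{\infty} (-1)^j \left( x^{u_j} + x^{u_j + j} \right) = 1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + \cdots \]

Recall the generating function

\[ F_n(x) = \prod_{j=1}^{n} \frac{1 - x^j}{1 - x} = \frac{1}{(1-x)^n} \prod_{j=1}^{n} (1 - x^j), \]

and the binomial series

\[ \frac{1}{(1-x)^n} = \sum_{m=0}^{\infty} \binom{n+m-1}{m} x^m \]

for |x| < 1.

The coefficients of \prod_{j=1}^{n} (1 - x^j) will match those in the power series expansion of the infinite product given by Euler's pentagonal number theorem up to the coefficient on x^n. We consider the product

\[ \left( 1 + \sum_{j=1}^{\infty} (-1)^j \left( x^{u_j} + x^{u_j + j} \right) \right) \sum_{m=0}^{\infty} \binom{n+m-1}{m} x^m. \]

The coefficient on x^k is given by (1), for k ≤ n.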
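The explicit formula can be compared against a direct expansion of the generating function. The sketch below assumes formula (1) in the form obtained by multiplying Euler's pentagonal series by the binomial series for (1-x)^(-n): a leading term C(n+k-1, k), plus terms with sign (-1)^j for the exponents u_j and u_j + j, where u_j = j(3j-1)/2. The function names are illustrative.

```python
from math import comb

def pentagonal(j):
    """u_j = j(3j-1)/2."""
    return j * (3 * j - 1) // 2

def binom(top, bottom):
    """Binomial coefficient, taken to be zero when the lower index is negative."""
    return comb(top, bottom) if bottom >= 0 else 0

def knuth_formula(n, k):
    """I_n(k) for k <= n from the pentagonal-number expansion (assumed form of (1))."""
    if k > n:
        raise ValueError("the formula is only claimed for k <= n")
    total = comb(n + k - 1, k)
    j = 1
    while pentagonal(j) <= k:             # later terms vanish
        u = pentagonal(j)
        sign = -1 if j % 2 else 1
        total += sign * (binom(n + k - u - j - 1, k - u - j)
                         + binom(n + k - u - 1, k - u))
        j += 1
    return total

def inversion_numbers(n):
    """Coefficients of F_n(x) by repeated multiplication (reference values)."""
    coeffs = [1]
    for m in range(2, n + 1):
        new = [0] * (len(coeffs) + m - 1)
        for i, c in enumerate(coeffs):
            for extra in range(m):
                new[i + extra] += c
        coeffs = new
    return coeffs

print([knuth_formula(5, k) for k in range(6)])   # [1, 4, 9, 15, 20, 22]
print(inversion_numbers(5)[:6])                  # [1, 4, 9, 15, 20, 22]
```

The loop stops as soon as u_j exceeds k, which reflects the remark that only finitely many terms are nonzero.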

5. An asymptotic formula for the inversion numbers

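We are interested in the sequences I_{n+k}(n). For k ≥ 0, the nth term of the sequence is given by

\[ I_{n+k}(n) = \binom{2n+k-1}{n} + \sum_{j=1}^{\infty} (-1)^j \binom{2n+k-u_j-j-1}{n-u_j-j} + \sum_{j=1}^{\infty} (-1)^j \binom{2n+k-u_j-1}{n-u_j}. \tag{2} \]

With a = u_j+j or a = u_j, all terms are of the form:

\[ \binom{2n+k-a-1}{n-a}. \]

We can approximate the value of this number by applying Stirling's approximation ([4], p.54 or [6], p.452):

\[ n! \sim \sqrt{2\pi n} \left( \frac{n}{e} \right)^{n}. \]

So we have, for fixed a and k,

\[ \binom{2n+k-a-1}{n-a} = \frac{2^{2n+k-a-1}}{\sqrt{\pi n}} \left( 1 + O(n^{-1}) \right). \]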

With this asymptotic formula, we can compute an asymptotic formula for the sum I_{n+k}(n) given in equation (2).

where

\[ Q = \prod_{j=1}^{\infty} \left( 1 - 2^{-j} \right) \approx 0.2887880951 \]

is a digital search tree constant [5], http://pauillac.inria.fr/algo/bsolve/constant/dig/dig.html, and C_1 and C_2 are given by the convergent series

and

respectively. We summarize a less precise result as the following theorem:

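Theorem 5

\[ I_{n+k}(n) = \frac{2^{2n+k-1}}{\sqrt{\pi n}}\, Q \left( 1 + O(n^{-1}) \right), \]

where

\[ Q = \prod_{j=1}^{\infty} \left( 1 - 2^{-j} \right). \]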

This formula can be used to provide asymptotic estimates for the sequences: A000707, A001892, A001893, A001894, A005283, A005284, and A005285. In the formula, take k to be one for the first sequence, and increase k by one for each subsequent sequence.
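As a rough numerical check, the sketch below compares exact values of I_{n+k}(n) with a leading-order estimate. It assumes the estimate has the form 2^(2n+k-1) Q / sqrt(pi n), with Q the infinite product of (1 - 2^(-j)); this is what results from summing the Stirling approximations of the terms of (2) with Euler's theorem at x = 1/2. The function names and the truncation of the product at 60 factors are illustrative choices.

```python
import math

def inversion_number(n, k):
    """I_n(k) via the insertion recurrence, keeping only degrees 0..k."""
    coeffs = [1] + [0] * k
    for m in range(2, n + 1):
        new = [0] * (k + 1)
        window = 0                        # sum of coeffs[d-m+1 .. d]
        for d in range(k + 1):
            window += coeffs[d]
            if d - m >= 0:
                window -= coeffs[d - m]
            new[d] = window
        coeffs = new
    return coeffs[k]

def digital_search_tree_constant(terms=60):
    """Approximate Q = prod_{j>=1} (1 - 2^-j) by a finite product."""
    q = 1.0
    for j in range(1, terms + 1):
        q *= 1.0 - 2.0 ** -j
    return q

def estimate(n, k):
    """Assumed leading-order estimate of I_{n+k}(n)."""
    q = digital_search_tree_constant()
    return 2.0 ** (2 * n + k - 1) * q / math.sqrt(math.pi * n)

for n in (10, 50, 200):
    exact = inversion_number(n + 1, n)    # k = 1, the sequence A000707
    print(n, exact / estimate(n, 1))      # ratio should drift toward 1
```

The ratio of the exact value to the estimate differs from 1 by a term of order 1/n, so the agreement is only approximate for small n.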

The figure below shows the tail behavior of the number of permutations with k inversions for k ≤ n. The blue curve is n! times the normal density with mean n(n-1)/4 and variance (2n^3+3n^2-5n)/72, that is, the blue curve is the estimate of I_n(k) based on the normal density. The red dots are the values of the asymptotic estimate, and the green dots are the exact values of I_n(k). Where the red and green dots are not both visible, one dot covers the other. The figure shows the tail for n = 4 to n = 22. Move the cursor over the appropriate n to see its graph.

Figure 4. Comparison of normal density estimate to asymptotic formula and actual inversion numbers

From our asymptotic formula for I_n(n), we can see that

\[ \lim_{n \to \infty} \frac{I_n(n)}{I_{n-1}(n-1)} = 4, \]

but the normal density approximation for the ratio I_n(n)/I_{n-1}(n-1) gives the estimate n e^{-9/8} as n tends to infinity. Hence, the normal density approximation grows much faster than the inversion numbers in the tails do.
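The two growth rates can be compared numerically. The sketch below computes the exact ratio I_n(n)/I_{n-1}(n-1) from the insertion recurrence, and the same ratio for the normal-density estimate, n! times the normal density with mean n(n-1)/4 and variance (2n^3+3n^2-5n)/72. Logarithms are used for the normal estimate to avoid overflow; the function names are illustrative.

```python
import math

def inversion_number(n, k):
    """I_n(k) via the insertion recurrence, keeping only degrees 0..k."""
    coeffs = [1] + [0] * k
    for m in range(2, n + 1):
        new = [0] * (k + 1)
        window = 0                        # sum of coeffs[d-m+1 .. d]
        for d in range(k + 1):
            window += coeffs[d]
            if d - m >= 0:
                window -= coeffs[d - m]
            new[d] = window
        coeffs = new
    return coeffs[k]

def exact_ratio(n):
    """I_n(n) / I_{n-1}(n-1)."""
    return inversion_number(n, n) / inversion_number(n - 1, n - 1)

def log_normal_estimate(n, k):
    """Log of n! times the normal density at k, with the exact mean and variance."""
    mean = n * (n - 1) / 4
    var = (2 * n**3 + 3 * n**2 - 5 * n) / 72
    return (math.lgamma(n + 1) - 0.5 * math.log(2 * math.pi * var)
            - (k - mean) ** 2 / (2 * var))

def normal_ratio(n):
    """Ratio of the normal-density estimates of I_n(n) and I_{n-1}(n-1)."""
    return math.exp(log_normal_estimate(n, n) - log_normal_estimate(n - 1, n - 1))

for n in (10, 25, 50, 100):
    print(n, exact_ratio(n), normal_ratio(n), n * math.exp(-9 / 8))
```

The exact ratio stays near 4 while the normal-density ratio grows roughly linearly in n, which is the discrepancy described above.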

6. References

  1. G. E. Andrews, The Theory of Partitions, first paperback edition, Cambridge University Press, 1998.
  2. L. Comtet, Advanced Combinatorics, Reidel, 1974, p. 240.
  3. F. N. David, M. G. Kendall and D. E. Barton, Symmetric Function and Allied Tables, Cambridge, 1966, p. 241.
  4. W. Feller, An introduction to probability theory and its applications, second edition, John Wiley and Sons, New York, NY, 1971.
  5. S. Finch, Table of mathematical constants, published electronically at http://pauillac.inria.fr/algo/bsolve/constant/table.html, 2001.
  6. R. L. Graham, D. E. Knuth and O. Patashnik, Concrete Mathematics, 2d Ed., Addison-Wesley Publishing Company, Inc., Reading, MA, 1994.
  7. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, Oxford, Clarendon Press, 1954.
  8. D. E. Knuth, The Art of Computer Programming. Addison-Wesley, Reading, MA, Vol. 3, p. 15.
  9. R. H. Moritz and R. C. Williams, "A coin-tossing problem and some related combinatorics", Math. Mag., 61 (1988), 24-29.
  10. T. Muir, "On a simple term of a determinant," Proc. Royal Soc. Edinburgh, 21 (1898-9), 441-477.
  11. E. Netto, Lehrbuch der Combinatorik, 2nd ed., Teubner, Leipzig, 1927, p. 96.
  12. V. N. Sachkov, Probabilistic Methods in Combinatorial Analysis, Cambridge University Press, New York, NY, 1997.
  13. N. J. A. Sloane, The on-line encyclopedia of integer sequences, published electronically at http://www.research.att.com/~njas/sequences/, 2001.