Email address: margolius@math.csuohio.edu
Abstract: The number of inversions in a random permutation is a way to measure the extent to which the permutation is "out of order". Sequence A008302 in The On-line Encyclopedia of Integer Sequences is a triangle of these inversion numbers. The kth entry of the nth row of this triangle counts the number of distinct permutations of n elements with k inversions. In this paper, we derive an asymptotic formula for some of the diagonals of the triangle of the number of inversions in a random permutation. Asymptotic formulas are provided for the sequences: A000707, A001892, A001893, A001894, A005283, A005284, and A005285. Asymptotic behavior of the inversion numbers is illustrated with 2 interactive figures.
Theorem 1
[Muir, 1898][10] The numbers $I_n(k)$ have as a generating function
$$F_n(x) = \sum_{k=0}^{n(n-1)/2} I_n(k)\,x^k = \prod_{j=1}^{n} \frac{1-x^j}{1-x}.$$
Clearly, the number of permutations with no inversions, $I_n(0)$, is 1 for all n, and in particular $I_1(0) = 1 = F_1(x)$. So the formula given in the theorem is correct for n = 1. Now consider a permutation of n-1 elements. We insert the nth element in the jth position, j = 1, 2, ..., n, choosing the insertion point randomly. Since the nth element is larger than each of the n-1 elements in the set {1, 2, ..., n-1}, inserting it in the jth position adds n-j inversions. The generating function for the number of additional inversions is
$$1 + x + x^2 + \cdots + x^{n-1},$$
since each number of additional inversions, from 0 to n-1, is equally likely. The additional inversions are independent of the inversions present in the permutation of length n-1, so the generating function for the total number of inversions is the product of the generating function for permutations of n-1 elements and the generating function for the additional inversions:
$$F_n(x) = F_{n-1}(x)\,\bigl(1 + x + \cdots + x^{n-1}\bigr) = \prod_{j=1}^{n} \frac{1-x^j}{1-x}.$$
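The recurrence in the proof translates directly into a computation. The following is a minimal Python sketch; the function name `inversion_row` is our own choice for illustration and does not come from the paper. It builds the coefficients of F_n(x) by repeatedly multiplying by 1 + x + ... + x^(m-1).

```python
def inversion_row(n):
    """Return the coefficients of F_n(x).

    row[k] is the number of permutations of n elements with k inversions.
    (Hypothetical helper name, used only for this illustration.)
    """
    row = [1]  # F_1(x) = 1
    for m in range(2, n + 1):
        # Multiply the current polynomial by 1 + x + ... + x^(m-1).
        new = [0] * (len(row) + m - 1)
        for k, count in enumerate(row):
            for extra in range(m):  # inserting element m adds 0..m-1 inversions
                new[k + extra] += count
        row = new
    return row


print(inversion_row(4))  # [1, 3, 5, 6, 5, 3, 1]
```

The printed row agrees with the n = 4 row of Table 1, and the entries of each row sum to n!.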
The table below lists values of the inversion numbers, sequence A008302 in [13]; see also [2], [3], [8], and [11].
Table 1. Inversion numbers $I_n(k)$, which satisfy $I_n(k) = I_n(n(n-1)/2 - k)$. Columns are indexed by k, the number of inversions.
| n\k | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
| 1 | 1 | | | | | | | | | | | | | |
| 2 | 1 | 1 | | | | | | | | | | | | |
| 3 | 1 | 2 | 2 | 1 | | | | | | | | | | |
| 4 | 1 | 3 | 5 | 6 | 5 | 3 | 1 | | | | | | | |
| 5 | 1 | 4 | 9 | 15 | 20 | 22 | 20 | 15 | 9 | 4 | 1 | | | |
| 6 | 1 | 5 | 14 | 29 | 49 | 71 | 90 | 101 | 101 | 90 | 71 | 49 | 29 | 14 |
| 7 | 1 | 6 | 20 | 49 | 98 | 169 | 259 | 359 | 455 | 531 | 573 | 573 | 531 | 455 |
| 8 | 1 | 7 | 27 | 76 | 174 | 343 | 602 | 961 | 1415 | 1940 | 2493 | 3017 | 3450 | 3736 |
| 9 | 1 | 8 | 35 | 111 | 285 | 628 | 1230 | 2191 | 3606 | 5545 | 8031 | 11021 | 14395 | 17957 |
| 10 | 1 | 9 | 44 | 155 | 440 | 1068 | 2298 | 4489 | 8095 | 13640 | 21670 | 32683 | 47043 | 64889 |

Table 1 (continued). Entries for k > 23 follow from the symmetry $I_n(k) = I_n(n(n-1)/2 - k)$.
| n\k | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
| 6 | 5 | 1 | | | | | | | | |
| 7 | 359 | 259 | 169 | 98 | 49 | 20 | 6 | 1 | | |
| 8 | 3836 | 3736 | 3450 | 3017 | 2493 | 1940 | 1415 | 961 | 602 | 343 |
| 9 | 21450 | 24584 | 27073 | 28675 | 29228 | 28675 | 27073 | 24584 | 21450 | 17957 |
| 10 | 86054 | 110010 | 135853 | 162337 | 187959 | 211089 | 230131 | 243694 | 250749 | 250749 |
The unimodal behavior of the inversion numbers suggests that the number of inversions in a random permutation may be asymptotically normal. We explore this possibility by looking at the generating function for the probability distribution of the number of inversions. To get this generating function, we divide $F_n(x)$ by n!, since each of the n! permutations is equally likely.
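Following Vladimir Sachkov [12], we have the moment generating function
$$M_n(t) = \frac{F_n(e^t)}{n!} = \prod_{j=1}^{n} \frac{e^{jt}-1}{j\,(e^{t}-1)}.$$
The explicit formula for the generating function of the Bernoulli numbers is
$$\frac{t}{e^t-1} = 1 - \frac{t}{2} + \sum_{k=1}^{\infty} \frac{B_{2k}}{(2k)!}\,t^{2k}.$$
So we have
$$\frac{d}{dt}\ln\frac{e^t-1}{t} = 1 + \frac{1}{e^t-1} - \frac{1}{t} = \frac{1}{2} + \sum_{k=1}^{\infty} \frac{B_{2k}}{(2k)!}\,t^{2k-1}, \qquad \ln\frac{e^t-1}{t} = \frac{t}{2} + \sum_{k=1}^{\infty} \frac{B_{2k}}{2k\,(2k)!}\,t^{2k},$$
where the final step follows from integrating both sides and noting that
$$\lim_{t\to 0}\ln\frac{e^t-1}{t} = 0.$$
Using this generating function, we get that the log of the moment generating function is
$$\ln M_n(t) = \sum_{j=1}^{n}\left[\ln\frac{e^{jt}-1}{jt} - \ln\frac{e^{t}-1}{t}\right] = \frac{n(n-1)}{4}\,t + \sum_{k=1}^{\infty} \frac{B_{2k}\,t^{2k}}{2k\,(2k)!}\sum_{j=1}^{n}\bigl(j^{2k}-1\bigr).$$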
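Now consider $\ln M_n(t/\sigma)$, where $\sigma$ is the standard deviation of the number of inversions in a random equiprobable permutation with n elements, $\sigma^2 = (2n^3+3n^2-5n)/72$:
$$\ln M_n(t/\sigma) = \frac{n(n-1)}{4\sigma}\,t + \frac{t^2}{2} + \sum_{k=2}^{\infty} \frac{B_{2k}\,t^{2k}}{2k\,(2k)!\,\sigma^{2k}}\sum_{j=1}^{n}\bigl(j^{2k}-1\bigr).$$
The sum
$$\sum_{j=1}^{n}\bigl(j^{2k}-1\bigr)$$
for k > 1 is bounded above by the following integral,
$$\sum_{j=1}^{n}\bigl(j^{2k}-1\bigr) < \int_{0}^{n+1} x^{2k}\,dx = \frac{(n+1)^{2k+1}}{2k+1},$$
while $\sigma^{2k}$ grows like $n^{3k}$, so each term with k > 1 is $O(n^{1-k})$. So
$$\ln M_n(t/\sigma) - \frac{n(n-1)}{4\sigma}\,t \to \frac{t^2}{2}$$
as $n \to \infty$, uniformly for t from any bounded set.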
Theorem 2 [Sachkov][12] If $\xi_n$ is a random variable representing the number of inversions in a random equiprobable permutation of n elements, then the random variable
$$\frac{\xi_n - n(n-1)/4}{\sigma}, \qquad \sigma^2 = \frac{2n^3+3n^2-5n}{72},$$
has, as $n \to \infty$, an asymptotically normal distribution with parameters (0,1).
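The centering and scaling used in the theorem can be checked against the exact distribution for small n. The sketch below is our own illustration (the helper names `inversion_row` and `mean_and_variance` are hypothetical); it computes the mean and variance of the number of inversions with exact rational arithmetic and compares them with n(n-1)/4 and (2n^3+3n^2-5n)/72.

```python
from fractions import Fraction
from math import factorial


def inversion_row(n):
    """Coefficients of F_n(x): row[k] = permutations of n elements with k inversions."""
    row = [1]
    for m in range(2, n + 1):
        new = [0] * (len(row) + m - 1)
        for k, count in enumerate(row):
            for extra in range(m):
                new[k + extra] += count
        row = new
    return row


def mean_and_variance(n):
    """Exact mean and variance of the inversion count of a uniform random permutation."""
    row = inversion_row(n)
    total = factorial(n)
    mean = Fraction(sum(k * c for k, c in enumerate(row)), total)
    second_moment = Fraction(sum(k * k * c for k, c in enumerate(row)), total)
    return mean, second_moment - mean**2


for n in (2, 5, 10):
    mean, var = mean_and_variance(n)
    # Compare with the closed forms quoted in the text.
    assert mean == Fraction(n * (n - 1), 4)
    assert var == Fraction(2 * n**3 + 3 * n**2 - 5 * n, 72)
```

Using `Fraction` avoids any floating-point rounding, so the agreement for these values of n is exact rather than approximate.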
The graph below shows the density for a standard normal random variable in
black. The red curves give a continuous approximation for the discrete probability
mass function for the number of inversions of a random permutation with n
elements. Graphs are shown for n = 10 to n = 100. Holding the
mouse over any of the numbers n = 10 to n = 100 shown at the bottom
of the figure will show the graph for that n value. Note that as n
increases, the red curve moves closer to the standard normal density so that
it appears that the normal density may serve as a useful tool for approximating
the inversion numbers.
Figure 1. Comparison of the inversion probability mass function
to the standard normal density
The figure below shows the ratio of the inversion numbers to the estimate
provided by the normal density. The better the approximation, the closer the
curve will be to one. The graph is scaled so that the x-axis is the number
of standard deviations from the mean.
Figure 2. The ratio of the inversion probability mass function to the standard normal density scaled by the number of standard deviations from the mean

The curves are shaped roughly like a cowboy hat. The top of the hat at about
y = 1 seems to be getting broader as n increases (black is n=10,
red is n=25, blue is n=50, and green is n=100), suggesting
that the approximation improves with increasing n. Compare the figure
above to the one below:
Figure 3. The ratio of the inversion probability mass function to the standard normal density scaled by the nonzero inversion numbers

The curves are rescaled in this figure so that 0 inversions is mapped to -0.5,
and n(n-1)/2 inversions is mapped to 0.5 on the x-axis. In this
way, we can see whether the estimates for the nonzero inversion numbers improve
as a percentage of the total nonzero inversion numbers as n increases.
Note that the colored curves are in the opposite order of the preceding figure.
The figure suggests that the estimates actually get worse as n increases.
The width of the top of the cowboy hat is getting narrower as n increases.
What this shows is that the relative error of the normal density approximation
increases as n increases as we move further into the tails of the distribution.
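The growth of the relative error in the tails can be seen numerically even for small n. The following sketch is our own illustration, not the code behind the figures; `ratio_to_normal` is a hypothetical helper that divides the exact probability of k inversions by the normal density with mean n(n-1)/4 and variance (2n^3+3n^2-5n)/72.

```python
from math import exp, factorial, pi, sqrt


def inversion_row(n):
    """Coefficients of F_n(x): row[k] = permutations of n elements with k inversions."""
    row = [1]
    for m in range(2, n + 1):
        new = [0] * (len(row) + m - 1)
        for k, count in enumerate(row):
            for extra in range(m):
                new[k + extra] += count
        row = new
    return row


def ratio_to_normal(n, k):
    """Exact P(k inversions) divided by the matching normal density at k."""
    mean = n * (n - 1) / 4
    var = (2 * n**3 + 3 * n**2 - 5 * n) / 72
    exact = inversion_row(n)[k] / factorial(n)
    normal = exp(-((k - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)
    return exact / normal


for k in (22, 10, 0):
    # Near the mean (k = 22 for n = 10) the ratio is close to 1;
    # in the far tail (k = 0) it is much smaller than 1.
    print(k, ratio_to_normal(10, k))
```

A ratio near one means the normal density is a good estimate at that k; the ratio at k = 0 shows how badly the estimate overshoots in the extreme tail.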
4. An explicit formula for the inversion numbers
We can examine the asymptotic behavior of $I_n(k)$ for $k \le n$ more closely. Donald Knuth has made the observation that we may write an explicit formula for the kth coefficient of the generating function when $k \le n$ ([4], p.16).
Theorem 3
[Knuth, Netto][8],[11] For $k \le n$, the inversion numbers $I_n(k)$ satisfy the formula
$$I_n(k) = \binom{n+k-1}{k} + \sum_{j=1}^{\infty} (-1)^j \binom{n+k-u_j-j-1}{k-u_j-j} + \sum_{j=1}^{\infty} (-1)^j \binom{n+k-u_j-1}{k-u_j}. \tag{1}$$
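The binomial coefficients are defined to be zero when the lower index is negative, so there are only finitely many nonzero terms: those with $u_j + j \le k$ for the first sum, and those with $u_j \le k$ for the second. The $u_j$ are the pentagonal numbers, sequence A000326 in [13],
$$u_j = \frac{j(3j-1)}{2}.$$
Donald Knuth's formula follows from the generating function and Euler's pentagonal number theorem,
$$\prod_{j=1}^{\infty} (1-x^j) = 1 + \sum_{j=1}^{\infty} (-1)^j \left(x^{u_j} + x^{u_j+j}\right) \quad \text{for } |x| < 1.$$
Recall the generating function
$$F_n(x) = \prod_{j=1}^{n} \frac{1-x^j}{1-x} = (1-x)^{-n} \prod_{j=1}^{n} (1-x^j).$$
The coefficients of $\prod_{j=1}^{n}(1-x^j)$ will match those in the power series expansion of the infinite product given by Euler's pentagonal number theorem up to the coefficient on $x^n$. We consider the product
$$(1-x)^{-n}\left[1 + \sum_{j=1}^{\infty} (-1)^j \left(x^{u_j} + x^{u_j+j}\right)\right], \qquad (1-x)^{-n} = \sum_{i=0}^{\infty} \binom{n+i-1}{i} x^i.$$
The coefficient on $x^k$ is given by (1), for $k \le n$.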
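The explicit formula can be checked against the product form of the generating function. The sketch below assumes formula (1) has the form given in Knuth's The Art of Computer Programming, namely a leading binomial coefficient C(n+k-1, k) followed by signed terms indexed by the pentagonal numbers u_j = j(3j-1)/2; the function names are our own.

```python
from math import comb


def pentagonal(j):
    """Pentagonal number u_j = j(3j - 1)/2."""
    return j * (3 * j - 1) // 2


def binom(top, bottom):
    """Binomial coefficient, defined to be zero when the lower index is negative."""
    return comb(top, bottom) if bottom >= 0 else 0


def knuth_inversions(n, k):
    """I_n(k) from the explicit formula, valid for 1 <= n and 0 <= k <= n."""
    if not 0 <= k <= n:
        raise ValueError("the explicit formula requires 0 <= k <= n")
    total = binom(n + k - 1, k)
    j = 1
    while pentagonal(j) <= k:  # later terms have a negative lower index
        u = pentagonal(j)
        sign = -1 if j % 2 else 1
        total += sign * binom(n + k - u - j - 1, k - u - j)
        total += sign * binom(n + k - u - 1, k - u)
        j += 1
    return total


def inversion_row(n):
    """Coefficients of F_n(x) from the product recurrence, for cross-checking."""
    row = [1]
    for m in range(2, n + 1):
        new = [0] * (len(row) + m - 1)
        for i, count in enumerate(row):
            for extra in range(m):
                new[i + extra] += count
        row = new
    return row


print(knuth_inversions(10, 10))  # 21670
```

The printed value agrees with the entry for n = 10, k = 10 in Table 1. The loop stops once u_j exceeds k, which is the finiteness condition described in the text.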
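We are interested in the sequences $I_{n+k}(n)$. For $k \ge 0$, the nth term of the sequence is given by
$$I_{n+k}(n) = \binom{2n+k-1}{n} + \sum_{j=1}^{\infty} (-1)^j \binom{2n+k-u_j-j-1}{n-u_j-j} + \sum_{j=1}^{\infty} (-1)^j \binom{2n+k-u_j-1}{n-u_j}. \tag{2}$$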
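With $a = u_j + j$ or $a = u_j$, all terms are of the form
$$\binom{2n+k-a-1}{n-a}.$$
We can approximate the value of this number by applying Stirling's approximation ([4], p.54 or [6], p.452):
$$n! = \sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}\left(1 + \frac{1}{12n} + O(n^{-2})\right).$$
So we have, for fixed k and a,
$$\binom{2n+k-a-1}{n-a} = \frac{2^{2n+k-a-1}}{\sqrt{\pi n}}\left(1 - \frac{2(a+k-1)^2 + 2(k-1-a) + 1}{8n} + O(n^{-2})\right).$$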
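As a numerical sanity check on the Stirling step, one can compare the exact binomial coefficient with the leading-order estimate 2^(2n+k-a-1)/sqrt(pi n). This leading term is the standard consequence of Stirling's approximation and is stated here as our assumption about the form of the estimate; the function name `ratio_to_leading_order` is hypothetical.

```python
from math import comb, pi, sqrt


def ratio_to_leading_order(n, k, a):
    """Exact C(2n+k-a-1, n-a) divided by 2^(2n+k-a-1) / sqrt(pi * n)."""
    top = 2 * n + k - a - 1
    exact = comb(top, n - a)
    # Divide the two integers first: Python's integer true division stays
    # accurate even when both operands are far beyond floating-point range.
    return (exact / 2**top) * sqrt(pi * n)


for n in (10, 100, 400):
    # The ratio should approach 1 as n grows, with an error of order 1/n.
    print(n, ratio_to_leading_order(n, 1, 0), ratio_to_leading_order(n, 1, 2))
```

For k = 1 and a = 0 the coefficient is the central binomial coefficient C(2n, n), whose ratio to 4^n/sqrt(pi n) is known to behave like 1 - 1/(8n).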
With this asymptotic formula, we can compute an asymptotic formula for the sum $I_{n+k}(n)$ given in equation (2).
|
where
$$Q = \prod_{j=1}^{\infty}\left(1 - 2^{-j}\right) \approx 0.2887880951$$
is a digital search tree constant [5], http://pauillac.inria.fr/algo/bsolve/constant/dig/dig.html, and $C_1$ and $C_2$ are given by the convergent series
|
and
|
respectively. We summarize a less precise result as the following theorem:
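Theorem 5
For fixed $k \ge 0$,
$$I_{n+k}(n) = \frac{2^{2n+k-1}}{\sqrt{\pi n}}\,Q\,\bigl(1 + O(n^{-1})\bigr),$$
where
$$Q = \prod_{j=1}^{\infty}\left(1 - 2^{-j}\right) \approx 0.2887880951.$$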
This formula can be used to provide asymptotic estimates for the sequences: A000707, A001892, A001893, A001894, A005283, A005284, and A005285. In the formula, take k to be one for the first sequence, and increase k by one for each subsequent sequence.
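The estimate can be compared with exact values. The sketch below assumes the theorem takes the form I_{n+k}(n) ≈ 2^(2n+k-1) Q / sqrt(pi n), with Q the infinite product of (1 - 2^(-j)); both this reading of the theorem and the helper names are our assumptions. Exact values come from the product recurrence, truncated at degree k to keep the computation small.

```python
from math import pi, sqrt


def inversion_number(m, k):
    """I_m(k): permutations of m elements with k inversions.

    Uses the product recurrence, keeping only coefficients up to x^k.
    """
    row = [1] + [0] * k
    for size in range(2, m + 1):
        new = [0] * (k + 1)
        for i, count in enumerate(row):
            if count:
                for extra in range(min(size, k - i + 1)):
                    new[i + extra] += count
        row = new
    return row[k]


def digital_search_tree_q(terms=200):
    """Partial product of (1 - 2^-j); 200 factors is far beyond double precision."""
    q = 1.0
    for j in range(1, terms + 1):
        q *= 1.0 - 2.0 ** (-j)
    return q


def asymptotic_estimate(n, k):
    """Assumed leading-order estimate of I_{n+k}(n)."""
    return 2 ** (2 * n + k - 1) * digital_search_tree_q() / sqrt(pi * n)


for n in (10, 25, 100):
    exact = inversion_number(n + 1, n)  # k = 1, the A000707 diagonal
    print(n, exact / asymptotic_estimate(n, 1))
```

The printed ratios approach one as n increases, consistent with a relative error of order 1/n. The estimate uses floating-point powers of two, so this sketch is only suitable for n up to a few hundred.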
The figure below shows the tail behavior of the number of permutations with k inversions for $k \le n$. The blue curve is n! times the normal density with mean $n(n-1)/4$ and variance $(2n^3+3n^2-5n)/72$; that is, the blue curve is the estimate of $I_n(k)$ based on the normal density. The red dots are the values of the asymptotic estimate, and the green dots are the exact values of $I_n(k)$. Where the red and green dots are not both visible, one dot covers the other. The figure shows the tail for n = 4 to n = 22. Move the cursor over the appropriate n to see its graph.
Figure 4. Comparison of normal density estimate to asymptotic formula and actual inversion numbers
From our asymptotic formula for $I_n(n)$, we can see that
$$\lim_{n\to\infty} \frac{I_n(n)}{I_{n-1}(n-1)} = 4,$$
but the normal density approximation for the ratio $I_n(n)/I_{n-1}(n-1)$ gives the estimate $n e^{-9/8}$ as n tends to infinity. Hence, the normal density approximation grows much faster than the inversion numbers in the tails do.
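Both growth rates can be observed numerically. In the sketch below (our own illustration, with hypothetical helper names), the exact ratio is computed from the product recurrence, and the normal estimate of I_n(n) is taken to be n! times the normal density with mean n(n-1)/4 and variance (2n^3+3n^2-5n)/72, evaluated in logarithms to avoid overflow.

```python
from math import exp, lgamma, log, pi


def inversion_number(m, k):
    """I_m(k) via the product recurrence, keeping only coefficients up to x^k."""
    row = [1] + [0] * k
    for size in range(2, m + 1):
        new = [0] * (k + 1)
        for i, count in enumerate(row):
            if count:
                for extra in range(min(size, k - i + 1)):
                    new[i + extra] += count
        row = new
    return row[k]


def exact_ratio(n):
    """I_n(n) / I_{n-1}(n-1); requires n >= 4 so the denominator is nonzero."""
    return inversion_number(n, n) / inversion_number(n - 1, n - 1)


def log_normal_estimate(n, k):
    """Logarithm of n! times the normal density at k."""
    mean = n * (n - 1) / 4
    var = (2 * n**3 + 3 * n**2 - 5 * n) / 72
    return lgamma(n + 1) - 0.5 * log(2 * pi * var) - (k - mean) ** 2 / (2 * var)


def normal_ratio(n):
    """Ratio of the normal estimates of I_n(n) and I_{n-1}(n-1)."""
    return exp(log_normal_estimate(n, n) - log_normal_estimate(n - 1, n - 1))


for n in (20, 60, 200):
    # First ratio tends to 4; second ratio divided by n tends to exp(-9/8).
    print(n, exact_ratio(n), normal_ratio(n) / n, exp(-9 / 8))
```

The exact ratio settles near 4, while the ratio of normal estimates keeps growing in proportion to n.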