Combinatorics/Binomial coefficients

Combinatorial interpretation
What does it mean that we have $$n \choose k$$ ways of choosing a set of size k from a set of size n?

You start with an empty collection, {∅}, and start drawing items from the alphabet {a, b, c, …}. At the first step you decide whether to take item a into your collection. At the next step you decide whether letter b is needed, and so on. Now, the question is: how many ways do you have to end up with k items after n steps? Obviously, k cannot be outside the range [0,n]. But to answer the question, let's consider all choices you can make along n steps.

You'll have a binary tree with root (level 0)


 * Binomial_coeff_root_collection.svg

At the first level, you'll see the collections you can end up with after deciding whether to accept letter a or not,


 * Binomial_collections(level1).svg

You have one way to select 0 items, represented by the collection {∅} on the left, and 1 way to reach the collection {a} on the right. Here is the tree after n=3 steps


 * Binomial collections.svg

You see at the bottom level (n=3) that you have 1 way to select the empty set (leftmost), 3 ways to select 1 item, 3 ways to select 2 items and 1 way to select 3 items (rightmost). Continuing, you end up with a binary tree of height n. You have made n decisions, choosing one of 2^n paths, selected k items and rejected n-k, thus effectively dividing n items into two groups (thus, the name "bi"-nomial).

Note that every k-collection corresponds to a distinct route that was used to compose it. That is, {c} at level n=3 means that we have accepted only item c while declining the offers of letters a and b. So, our route was left, left, right. No matter which route you take, you need one right turn and all other turns left in order to get into a 1-item collection. Now, because we have all k-item collections (k-collections) grouped into one node of the tree, and there is a 1-to-1 correspondence between every collection and the route to get it, we see that the size of the group (the number of k-collections in the node) is equal to the number of ways to reach the group, starting from the root. So, basically, every collection in the node encodes one route by which you can reach the node, and to answer the question of how many ways you have to select k items from n, you may just count the k-collections in the node.

Instead of recording the items selected, we can just count the number of items that we have accepted. Here, for instance, you see that there are 8 accept/reject routes in total after n=3 choices, and that there are 3 ways to accept one item and 3 ways to accept two items.


 * Bin_coef_again.svg
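The grouping of routes by the number of accepted items can be checked by brute force. The following Python sketch (the function name `routes_by_accepts` is our own, introduced only for illustration) writes every accept/reject route of length n as a tuple of bits and counts how many routes accept exactly k items.

```python
from itertools import product
from collections import Counter

def routes_by_accepts(n):
    """Count accept/reject routes of length n, grouped by the number of accepts."""
    # Each route is a tuple of n bits: 1 = accept the letter, 0 = reject it.
    counts = Counter(sum(route) for route in product((0, 1), repeat=n))
    return [counts[k] for k in range(n + 1)]

print(routes_by_accepts(3))       # [1, 3, 3, 1]
print(sum(routes_by_accepts(3)))  # 8 routes in total
```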

You see that with every item you make a choice about, one more (terminal) layer is added at the bottom of the pyramid, which doubles the number of possible ways because you can go either left or right from every node, so that there are $$A_n = 2^n$$ ways in total, which is, not surprisingly, equal to the number of binary n-tuples. These $$2^n$$ n-tuples are partitioned into $$n+1$$ groups by the number k of accepted items (the number of bits equal to '1' in the figure above) when we are exposed to n choices. The number of tuples in group k is called the number of combinations or binomial coefficient. It also counts the ways you can reach the node with that number of accept-choices. The binomial coefficients are usually presented in the form of Pascal's Triangle


 * Pascal_triangle.svg

To revise, we can redraw the triangle with a rule to compute each binomial coefficient value


 * Binomial coefficients.svg

The coefficients $$C_n^k = {n \choose k}$$ simply show that we have a node of k-collections at level n. But, since the number of ways to reach that node is the amount of such collections in the node, we can use $${n \choose k}$$ also as a number of such ways to the node (aka collections in the node). It is a function of depth n and collection size k.

Let's reiterate. By definition, we already have an empty collection at the root level. So, you have 1 way to have an empty collection (k=0) at level n=0, $${0 \choose 0} = 1$$. Then, at step 1 we can either take a (right branch) or not take it (left branch).

The fact that you have one 0-element collection at the left, {∅}, corresponds to the fact that you have one route to obtain a 0-element collection. Therefore, we have $${1 \choose 0} = 1$$. Similarly, you have 1 way to obtain the 1-element collection {a}, that is to say $${1 \choose 1} = 1$$.

After the next step, 2, you still have 1 way to get to the leftmost {∅} (therefore there is only 1 collection). Therefore, $${2 \choose 0} = 1$$. You also have two ways to get to the middle group of 1-element collections, {a} and {b}. Thus, we have $${2 \choose 1} = 2$$. Finally, there is 1 way to the collection {a,b} on the right and we, thus, have $${2 \choose 2} = 1$$.

After every step, as you go 1 step deeper down the pyramid, its base becomes 1 group wider, since you may reach a larger collection if you collect all letters. After n steps, we will have a pyramid with n+1 target groups at the base.

You may notice that to compute the size of the group $${n \choose k}$$ of collections, which is located n steps down the pyramid from the root and k groups from the left edge, there is no need to look for all the paths from the root to it. You just look at the two parents, one level above. Consider that we arrive at child $${n \choose k}$$ by taking the right branch of the left parent $${n-1 \choose k-1}$$. This parent carries (k-1)-collections, and taking the right branch means expanding these collections with the n-th letter, obtaining k-item collections in the child group $${n \choose k}.$$ We can also arrive at this child from the right parent by taking its left branch and, thus, declining the offer of the n-th letter. Thus, the right parent is the group $${n-1 \choose k}$$, since it must already have k-item collections at level n-1. Thus, we get the recursive formula to compute the cardinality of the k-collection node at level n, $${n \choose k} = {n-1 \choose k-1} + {n-1 \choose k}$$, with initial condition $${0 \choose 0} = 1$$. Recall that this is also the number of ways to arrive at this group/node.

What about the leftmost and rightmost groups? They have only one parent. Since the number of collections in such a group is always 1 and the single parent already contributes this number, the other parent, which lies outside the pyramid, must contribute 0 collections (its size would be k < 0 or k > n). That is, for non-negative n, $${n \choose k} = 0$$ for $$k < 0$$ and $$k > n$$.

Using either recursive formula $${n \choose k} = {n-1 \choose k-1} + {n-1 \choose k}$$ or just counting the number of collections in every node, we arrive at the Pascal Triangle again.
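The two-parent rule translates directly into a short program. This is a minimal Python sketch (the name `pascal_rows` is our own); it pads each row with zeros on both sides to stand for the missing parents outside the pyramid.

```python
def pascal_rows(depth):
    """Return rows 0..depth of Pascal's triangle using the two-parent rule."""
    rows = [[1]]  # level 0: one way to have the empty collection
    for n in range(1, depth + 1):
        # Parents outside the pyramid (k < 0 or k > n-1) contribute 0.
        prev = [0] + rows[-1] + [0]
        # Child k is the sum of its left parent (k-1) and right parent (k).
        rows.append([prev[k] + prev[k + 1] for k in range(n + 1)])
    return rows

for row in pascal_rows(4):
    print(row)
# [1]
# [1, 1]
# [1, 2, 1]
# [1, 3, 3, 1]
# [1, 4, 6, 4, 1]
```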

Sum of n-th row
Summing all ways to reach row n, you will get 2ⁿ ways, $$\sum_{k=0}^{n} {n \choose k} = 2^n$$. This is not a surprise because we have a binary tree: at every step you have two options, you either accept a letter or decline it, and the number of nodes doubles. So you get strings "0" or "1" after step 1. The first one expands into "00" or "01" after step 2, while "1" can expand to "10" or "11". The string "001" would correspond to the "left,left,right" route above. You have twice as many options after every step, or 2ⁿ possible binary strings after n steps. The n-th character in such a string tells whether we include the n-th letter of the alphabet into our collection or not. Because such a string can be generated by coin tosses, there is a direct correspondence between the ways to get k heads in n tosses and the ways to choose k items from n.

Now you know the total number of ways to get to the n-th level, 2ⁿ, as well as the number of ways to reach a specific group k at that level. You may choose any path from the root to that group at random. If every path has equal probability, as it is with fair coin tossing, you have probability $$p={{n \choose k} / 2^n}$$ to end up with a collection of k items. If we toss a coin n times instead of choosing k of n items, then we can interpret $${n \choose k} $$ as the number of ways to get k heads in n trials, and p is the probability of k heads in n trials. Since this probability is proportional to $${n \choose k}$$, you see from every row n of the Pascal triangle that the probability is highest in the center. The distribution approaches normal as you go further down the triangle, that is, as $$n \to \infty.$$ In case the coin is not fair, going left and going right have different chances. But the binomial coefficients computed this way still help to find the binomial distribution.
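As a small numeric illustration of the fair-coin case, the sketch below (Python, using the standard `math.comb`; the function name `fair_coin_distribution` is our own) divides each entry of row n by 2ⁿ and shows that the probabilities add up to 1 and peak in the center.

```python
from math import comb

def fair_coin_distribution(n):
    """Probability of exactly k heads in n fair tosses, for k = 0..n."""
    return [comb(n, k) / 2**n for k in range(n + 1)]

dist = fair_coin_distribution(4)
print(dist)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(dist))  # 1.0
```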

I am not aware of combinatorial interpretations of negative k and n.

Binomial theorem
Now we cannot get away without relating the coefficients to the powers of sum $$(a+b)^n$$, called Newton's binomials, and probabilities (going left or right). Well,

$$(a+b)^2 = a^2 + 2ab + b^2$$, $$(a+b)^3 = (a+b)(a^2 + 2ab + b^2) = a^3 + 2a^2b + ab^2 + a^2b + 2ab^2 + b^3 = a^3 + 3a^2b + 3ab^2 +b^3$$. This looks suspiciously like
 * $$(a+b)^n = \sum_{k=0}^n {n \choose k} a^{n-k}b^k.$$

Indeed, a product like $$a^ib^j$$ means that you have a collection which consists of i items of kind a and j items of kind b. Taking every power, you expand your collections with either a or b, depending on the route you take: left or right (see picture on the right). On the n-th row it must be that i+j=n. The contrast with the first triangle that you see at the top of the page is that, instead of choosing a new item every time (a,b,c,d,...), powers produce collections that consist only of letters a and b in different proportions. At the bottom, after n multiplications, you will get all n+1 such collections. Again, binomial coefficients give us the number of ways to select exactly k b-items out of n (the remaining n-k items are a-items). Here is the diagram that represents both the selected collections and the number of ways you can arrive at them.



$$ \begin{matrix} & & & 1\times \{\emptyset\} & & &  \\  & & 1\times \{a\} & & 1\times \{b\} & &  \\  & 1\times \{a^2\} & & 2\times \{ab\} & & 1\times \{b^2\} & \\  1\times \{a^3\} & & 3\times \{a^2b\} & & 3\times \{a b^2\} & & 1\times \{b^3\} \end{matrix} $$

One can approach this from another angle. The standard argument is that the coefficient at $$a^ib^{n-i}$$ in $$(a+b)^n$$ is equal to $$n \choose i$$ because we have n parentheses in the product, we need to choose either a or b from each, and there are $$n \choose i$$ ways to select exactly i `a`s. We need to select either a or b from every parenthesis because (a+b) has only two monomials, a and b. The second-order product can be split as $$(a+b)^2= a(a+b) + b(a+b)$$. Multiplying once more, $$(a+b)^3= a[a(a+b) + b(a+b)] + b[a(a+b) + b(a+b)]$$. You see, we build up a binary pyramid. At the bottom, we get all the monomials of the polynomial if we open up the parentheses. We can arrive at any of them if we traverse the pyramid from the root and select a or b at every level (selecting a means that we choose the a(..) branch whereas selecting b stands for choosing the b(..) branch). So, we have n parentheses and need to mark some of them 1, leaving the others at 0. That is counted by the $$n \choose i$$ formula, derived above.
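The "pick one letter from every parenthesis" argument can be replayed mechanically. In this Python sketch (the helper name `expand_binomial` is our own), every way of picking a or b from n parentheses is enumerated, and the picks are grouped by how many a's and b's they contain.

```python
from itertools import product
from collections import Counter
from math import comb

def expand_binomial(n):
    """Map (i, j) -> coefficient of a^i b^j in (a+b)^n, picking one letter per parenthesis."""
    terms = Counter()
    for picks in product("ab", repeat=n):
        i = picks.count("a")       # number of parentheses from which a was taken
        terms[(i, n - i)] += 1     # the remaining n - i picks are b
    return dict(terms)

print(expand_binomial(3))  # {(3, 0): 1, (2, 1): 3, (1, 2): 3, (0, 3): 1}
```

The coefficient found for a^i b^(n-i) agrees with `comb(n, i)` for every i.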

Probabilities
You may set a=q=1-p and b=p, the probabilities of failure (choosing the left branch and not winning an item) and of success (choosing the right branch and, thus, picking an item), correspondingly. These can be heads/tails probabilities observed when tossing a coin, or p can be the probability of observing some value(s) when throwing a die. There is no requirement that p = q, just that p stays the same in all n trials. A series of such Bernoulli trials produces a binomial distribution

$$ \begin{matrix} n=0& & & {0 \choose 0}\times 1 & & \\ n=1& & {1 \choose 0}\times q & & {1 \choose 1}\times p & \\ n=2& {2 \choose 0}\times q^2 & & {2 \choose 1}\times pq & & {2 \choose 2}\times p^2 \\  & k=0 & & k=1 & & k=2 \end{matrix} $$

Now you can understand how the binomial distribution and the formula $$P(k) = {n \choose k} p^kq^{n-k}$$ arise when we compute the probability of winning k items. The probability of observing the string 00001001 is equal to $$p^2\times q^6$$. However, you can also pick 2 items taking another route, say, 10000100. The probability of this other route is also $$p^2\times q^6$$, as it is for every route that wins two items. There are $$C_8^2 = {8 \choose 2}$$ such routes and, therefore, the probability of two successes out of 8 is $$C_8^2p^2q^6$$. Generally, the probability of k successes in n trials is
 * $$P(k) = {n \choose k} p^kq^{n-k}$$

Using the Newton's binomial formula above, we can say that this probability is k-th member of expansion $$(p+q)^n = \sum_{k=0}^n{{n \choose k}p^kq^{n-k}}$$.
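A quick check of the formula with exact arithmetic: the sketch below (Python; the function name `binomial_pmf` and the choice p = 1/6, one face of a die, are our own illustrative assumptions) computes P(k) for 8 trials and confirms that the terms of the expansion of (p+q)^n add up to 1.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

p = Fraction(1, 6)  # assumed example: probability of one particular face of a fair die
print(binomial_pmf(8, 2, p))                         # 109375/419904
print(sum(binomial_pmf(8, k, p) for k in range(9)))  # 1
```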

If we are only interested in counting ways to choose item a (and ignore item b picked when choosing the other branch), we can set b=1 and reduce the binomial to a generating function, $$(x+1)^n = \sum_{k=0}^n{{n \choose k}x^k}$$. The binomial coefficient will be the coefficient at $$x^k$$.

Letting a=b=1 also explains why sum of combinatorial coefficients amounts to $$2^n$$ at n-th row. It is basically because $$(1+1)^n = \sum_{k=0}^n C_n^k\,1^k\,1^{n-k} = \sum C_n^k =2^n.$$

Multinomial coefficient (generalization)
Everything above says that $$C_n^k$$ stands for choosing k elements from an n-element set, and it can be interpreted as choosing k elements of some type from n elements. The other n-k elements have another type, and it is totally expected that $$C_n^{n-k} = {n!\over (n-k)!\, k!} = {n!\over k!\,(n-k)!} = C_n^{k}$$ because instead of choosing k elements of the first type we can come from the other end and choose n-k elements of the other type, leaving k elements of the first type. Here, we have $$k_1=k$$ and $$k_2=n-k$$, so that $$k_1+k_2=n$$. This case can be generalized to choosing $$k_1$$ elements of the first type, $$k_2$$ elements of the second type, $$k_3$$ elements of the third type and so on, so that $$k_1+k_2+\cdots+k_r=\sum_{i=1}^r k_i=n$$ over all r types of our choice. For example, $$k_c$$ may stand for the number of times letter $$c$$ occurs in a word, and you may look for the number of ways to produce n-character words.

Binomials generalize to multinomials:
 * $${n\choose k_1,k_2,\ldots,k_r} =\frac{n!}{k_1!\ k_2! \cdots k_r!}$$

which represent the coefficients of expansion
 * $$(x_1 + x_2 + \cdots + x_r)^n = \sum_{k_1+k_2+\cdots = n}{n\choose k_1,k_2,\ldots,k_r}x_1^{k_1}\,x_2^{k_2}\cdots x_r^{k_r}$$

We can interpret it as follows: n elements can be permuted in n! ways. There are $$k_1$$ positions that correspond to the first group. Some of the n! permutations just change the order of elements within this group. But combinations stand for unordered selections, and there are $$k_1!$$ such permutations for every combination. The order of items is also unimportant within all other r-1 groups. We therefore divide n! by $$k_1!$$, $$k_2!$$ and the factorials of all other group sizes to undo the permutations within all our r groups, getting $$n! \over k_1!k_2!\cdots k_r!$$ in the end.
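The "divide out the permutations within each group" argument can be tested on a small word. In this Python sketch, the helper `multinomial` and the example word "banana" are our own illustrative choices; the brute-force count of distinct letter arrangements is compared with n!/(k_1! k_2! ... k_r!).

```python
from itertools import permutations
from math import factorial
from collections import Counter

def multinomial(*ks):
    """n! / (k_1! k_2! ... k_r!) with n = k_1 + ... + k_r."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)  # undo the k! orderings inside this group
    return result

word = "banana"  # assumed example: 3 a's, 2 n's, 1 b
distinct = len(set(permutations(word)))  # brute force over all 6! orderings
print(distinct, multinomial(*Counter(word).values()))  # 60 60
```

With only two groups the same helper gives the binomial coefficient, e.g. `multinomial(2, 3)` equals 10.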

The binomial partitions an n-group into two classes of size k and n-k correspondingly, so you get $$n!\over k!(n-k)!$$ regardless of whether you think that you select k items out of n or n-k out of n. In fact, you see that
 * $$C_n^k = {n!\over k!(n-k)!} = {n \choose k} = {n \choose k,n-k} = {n \choose n-k, k} = {n \choose n-k} = {n!\over k!(n-k)!} = C_n^{n-k}.$$

I won't bother to draw the multidimensional "Pascal triangle", but we can similarly count the total number of outcomes at row n
 * $$(1 + 1 + \cdots + 1)^n = \sum_{k_1+k_2+\cdots+k_r = n}{n\choose k_1,k_2,\ldots,k_r}1^{k_1}\,1^{k_2}\cdots 1^{k_r} = \sum_{k_1+k_2+\cdots+k_r = n}{n\choose k_1,k_2,\ldots,k_r} = r^n$$

and use it to compute the probabilities of getting $$k_1$$ first items, $$k_2$$ second items and so on when drawing one item n times, where the probability of getting each item in a single drawing is $$p_1$$ to $$p_r$$ correspondingly:
 * $$1^n = (p_1 + p_2 + \cdots + p_r)^n = \sum_{k_1+k_2+\cdots = n}{n\choose k_1,k_2,\ldots,k_r}p_1^{k_1}\,p_2^{k_2}\cdots p_r^{k_r}$$

Every term of the sum is the probability of observing the first event $$k_1$$ times, the second event $$k_2$$ times, etc., when n events are observed in total.

Choosing with replacement (Coin Change generalization)
The binomial coefficient counts the ways to select either 1 or $$x^1$$ from each of the m parentheses in $$(1+x)^m$$. For instance, (1)(1)(0), (1)(0)(1) and (0)(1)(1) are the 3 ways to choose two x-es (and one $$x^0$$) when exposed to 3 choices. This means that $$C_3^2 = 3$$ and $$(1+x)^3 = C_3^0 x^0 + C_3^1 x^1 + C_3^2 x^2 + C_3^3 x^3$$. These are m binary choices, where you just mark every parenthesis with 0 (picked $$1 = x^0$$) or 1 (picked $$x = x^1$$). You can think of the m-th power of the binomial $$1+x$$ as a formal series (a polynomial) with coefficient $$C^i_m$$ at $$x^i$$: $$(1+x)^m = \sum_{i=0}^m C_m^i x^i$$.

We can interpret the parentheses as a means of grouping summands, whose values can be 0 or 1. That is, using "+" as the separator between parentheses, we can say that 2 can be expressed in 3 different ways, 2 = 1+1+0 or 2 = 1+0+1 or 2 = 0+1+1, when we use three summands that are each 0 or 1. Because powers of x naturally add up in the resulting polynomial, the coefficients $$C_m^n$$ in $$(1+x)^m = \sum_{n=0}^m C_m^n x^n$$ naturally count all the ways to change n into m such coins.

But what if our coins are not limited to values of 0 and 1? How many partitions of n do we have in this case? In other words, what is the number of ways to choose non-negative integers $$n_1, n_2, \ldots, n_m$$ so that $$n_1+n_2+\cdots + n_m = n$$? These will be the coefficients of the power series expansion of $$(1+x+x^2+x^3+\cdots)^m$$, since every time we take $$x^{n_1}$$ from the first parenthesis, $$x^{n_2}$$ from the second, etc., such that $$n_1+n_2+\cdots + n_m = n$$, we get $$x^{n}$$. In fact, we partition the integer n into m groups (treating every x in $$x^{n}$$ as a unit). For instance, one way to partition 5 into 3 summands would be 1 + 2 + 2, which we represent as 1+11+11; another would be 0 + 1 + 4, encoded as +1+1111. Here, we use "+" as the group separator instead of parentheses. Now, you are exposed to n (unit) choices and associate every unit 1 with one of the m parentheses. Ultimately, every parenthesis is associated with a non-negative number $$n_i \in \{0, 1, 2, \ldots\}$$ of 1s placed into it. The number of ways to make these choices is known as the number of combinations with replacement, $$\bar C_m^n$$. They differ from simple combinations without replacement, $$C_m^n$$, which mark each of the m groups with either 0 or 1, in that the same group can be selected more than once. Having n > m is not a problem, since more than one unit can go into the same box (more than one x can be extracted from the same parenthesis).

We have already associated the combinations with replacement, the number of ways to place n balls into m boxes, with binary strings like 11+111++1. It says that we have n+m-1 positions to place m-1 "+"-separators. The other positions are automatically set to 1. In other words, these are binary strings and exactly $$C_{n+m-1}^n$$ of them have n units (plus m-1 separators). That is, $$\bar{C}_m^n = C_{n+m-1}^n$$ is the number of ways to partition the integer n into m ordered summands (coins). Here "ordered" means that n = a + b and n = b + a are two different partitions.
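The separator argument can be verified by listing all ordered sums directly. The following Python sketch (the function name `ordered_splits` is our own) enumerates every way to write n as an ordered sum of m non-negative integers and compares the count with the number of binary strings that have n units and m-1 separators.

```python
from itertools import product
from math import comb

def ordered_splits(n, m):
    """All ways to write n as an ordered sum of m non-negative integers (brute force)."""
    return [parts for parts in product(range(n + 1), repeat=m) if sum(parts) == n]

n, m = 5, 3
splits = ordered_splits(n, m)
print(len(splits), comb(n + m - 1, n))           # 21 21
print((1, 2, 2) in splits, (0, 1, 4) in splits)  # True True
```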

Grouping n units into m boxes can be counted differently. We can ask: what is the number of ways to place m-1 separators ("+" marks) among n units? The answer must be the same, $$\bar{C}_m^n$$, but now there are n+1 places (before, between and after the units) to choose for the m-1 separators, and several separators may share the same place, which means that $$\bar{C}_{n+1}^{m-1} = \bar{C}_m^n.$$ Indeed, $$\bar{C}_m^n = C_{n+m-1}^n = C_{n+m-1}^{m-1} = \bar C_{n+1}^{m-1}.$$ This also suggests that n balls can be placed into m boxes so that at least one ball appears in every box in $$\bar C_m^{n-m} = C_{n-m+m-1}^{n-m} = C_{n-1}^{n-m} = C_{n-1}^{m-1}$$ ways, since we first place one ball into each of the m boxes and are then free to choose boxes for the n-m balls that are left.
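The count with at least one ball in every box can also be checked by brute force. This Python sketch (the name `positive_splits` is our own) counts ordered sums of m positive integers that add up to n and compares the result with the binomial coefficient "n-1 choose m-1", i.e. choosing m-1 of the n-1 gaps between adjacent units for the separators.

```python
from itertools import product
from math import comb

def positive_splits(n, m):
    """Count ordered sums of m positive integers that add up to n (brute force)."""
    return sum(1 for parts in product(range(1, n + 1), repeat=m) if sum(parts) == n)

print(positive_splits(5, 3), comb(5 - 1, 3 - 1))  # 6 6
```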

We have seen that binomial coefficients are the coefficients in the power series of the binomial $$(1+x)^m = \sum_{n=0}^m C_m^n x^n$$, and that combinations with replacement are the coefficients of $$\left(\sum_{i=0}^\infty x^i\right)^m = {1\over (1-x)^m} = \sum_{n=0}^\infty \bar{C}_m^n x^n$$; both correspond to changing money n into m coins. The binomial coins are limited to denominations of 0 and 1, whereas the denominations of $$(1+x+x^2+x^3+\cdots)$$ are any non-negative integers. It is reasonable to assume that the coefficients of $$(1+x^k+x^{2\cdot k}+x^{3\cdot k}+\cdots)^m$$ count the number of ways to exchange the sum n into m multiples of k; e.g. the three ways to exchange 10 into two multiples of 5 are 0 + 10, 5+5 and 10+0, and this number must be found as $$\bar C_2^2 = 3$$, the coefficient at $$x^{10}$$ in $$\sum_{n=0}^\infty\bar C_2^n x^{5n} = (1+x^5 + x^{10} + \cdots)^2$$. The fact that you have no more than 3 coins of denomination 5 for exchange is represented by the finite polynomial $$(1+x^5 + x^{10} + x^{15})$$ instead of the (infinite) series. In general, taking a product of available coins (polynomials or series) makes up a generating function for the number of ways to partition some sum into the coins, where the number of polynomials/series in the product is the number of coins in the sum. That is, the power series of $$(1+x)^m$$ counts the partitions of the kind $$a_1 + a_2 + \cdots + a_m = n$$, where each $$a_i \in \{0,1\}$$, whereas $$(1+x)(1+x^3 + x^{10})$$ says that we have a + b = n where a can be either 0 or 1 and b can be 0, 3 or 10, and the coefficients of the power series of the product count the number of ways to partition n.
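Multiplying "coin polynomials" is easy to do by machine. In the Python sketch below, polynomials are stored as coefficient lists indexed by the power of x; the helper names `poly_mul` and `coin_poly` are our own, and the infinite series is truncated at x^10, which does not change the coefficient at x^10.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b  # powers add up: x^i * x^j = x^(i+j)
    return out

def coin_poly(values):
    """Polynomial with coefficient 1 at x^v for each allowed coin value v."""
    poly = [0] * (max(values) + 1)
    for v in values:
        poly[v] = 1
    return poly

# (1 + x)(1 + x^3 + x^10): a is 0 or 1, b is 0, 3 or 10
ways = poly_mul(coin_poly([0, 1]), coin_poly([0, 3, 10]))
print([n for n, c in enumerate(ways) if c])  # [0, 1, 3, 4, 10, 11]

# (1 + x^5 + x^10)^2: ways to split 10 into two multiples of 5
fives = coin_poly([0, 5, 10])
print(poly_mul(fives, fives)[10])  # 3
```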

NE lattice paths and Catalan numbers
A NE (or northeast) lattice path is a path where all steps are either up or to the right. There are six NE lattice paths from (0,0) to (2,2).



NE lattice paths have close connections to the number of combinations. The number of lattice paths from $$ (0,0) $$ to $$ (n,k) $$ is equal to the binomial coefficient $$ \binom{n+k}{n} $$. The diagram shows this for $$ 0 \leq k \leq n =4 $$. Note the appearance of Pascal's triangle below and to the right. The figure above contains Pascal's triangle with the tip at the lower left corner. This triangle is shown below and to the left, and represents the situation in which ALL NE paths are allowed. A subset of all NE paths is one where the entire path is constrained to lie in the upper left corner.

To reach a square which is p steps upwards and q steps rightwards, you need to make p+q steps, and p of them must be upward (the rest are rightward). This means that you need to choose p of the p+q steps and mark them `upward`, which can be done in $$C_{p+q}^p$$ ways, or, equivalently, choose q steps and mark them rightward, in $$C_{p+q}^q$$ ways. Note that $$C_{p+q}^p = C_{p+q}^q$$. You can see the Pascal triangle, with the main diagonal as its bisector. The number of paths to a point on the main diagonal is $$C_{2n}^n$$, since if you move n steps upward, you need to move n additional steps rightward to get to the diagonal.
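The path count can be reproduced without any formula: every lattice point is reached either from the point to its left or from the point below it. The Python sketch below (the function name `count_ne_paths` is our own) fills the grid with this rule and compares the result with the binomial coefficient.

```python
from math import comb

def count_ne_paths(right, up):
    """Count NE lattice paths from (0, 0) to (right, up) by dynamic programming."""
    # paths[x][y] = number of paths reaching (x, y); points on the axes have exactly one.
    paths = [[1] * (up + 1) for _ in range(right + 1)]
    for x in range(1, right + 1):
        for y in range(1, up + 1):
            # The last step came either from the left or from below.
            paths[x][y] = paths[x - 1][y] + paths[x][y - 1]
    return paths[right][up]

print(count_ne_paths(2, 2), comb(4, 2))  # 6 6
print(count_ne_paths(4, 3), comb(7, 3))  # 35 35
```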

To the right we see the situation where all NE paths are also constrained to lie on or above the diagonal. The numbers on the main diagonal are known as Catalan numbers, $$C_n = {1\over n+1} C^n_{2n} = C^n_{2n} - C^{n+1}_{2n}.$$ They also count the number of valid parenthesized expressions with n pairs of parentheses: for example, () is valid but )( is not. That is, at no point when reading from left to right may the expression have more closing parentheses than opening ones and, in particular, it must start with an opening parenthesis.
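Both Catalan formulas and the parenthesis interpretation can be cross-checked numerically. This is a Python sketch with helper names of our own (`catalan`, `is_balanced`, `count_balanced`); the brute-force count tries every string of n opening and closing parentheses, so it is only practical for small n.

```python
from itertools import product
from math import comb

def catalan(n):
    """n-th Catalan number from the closed form C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def is_balanced(s):
    """True if no prefix has more ')' than '(' and the totals match."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:  # a closing parenthesis appeared before its opener
            return False
    return depth == 0

def count_balanced(n):
    """Brute-force count of valid strings of length 2n over the symbols '(' and ')'."""
    return sum(is_balanced(s) for s in product("()", repeat=2 * n))

print([catalan(n) for n in range(6)])         # [1, 1, 2, 5, 14, 42]
print([count_balanced(n) for n in range(5)])  # [1, 1, 2, 5, 14]
```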