Skip to main content

C(n, r) calculation in C/C++

Here are my quick research about algorithms calculating an r-combination of a set of n.

Two Implementations #

I found on the wikipedia article about Pascal’s triangle that you can find (nr) by multiplying (n+1)cc from c=1 to c=r. Using this idea, I wrote a function like this:

int CPascal(int n , int k) {
    int v = 1;
    for (int i = 1; i <= k; i++)
        v = v * (n + 1 - i) / i;
    return v;
}

But then there is also old-fashioned definition of (nr)=n!(nr)!r!. Using this definition, I can write another function:

int CFactorial(int n , int r) {
    int v = 1;
    int i;// numerator
    int j = 1;// denominator

    if (n - r < r)
        r = n - r;

    // At this point (n-r) <= r
    // v = n(n-1)(n-2)...(n-r+1)/r!

    for (i = n - r + 1; i <= n; i++) {
        v *= i;
        // if v is divisile by next denominator,
        // eagerly divide to prevent overflow
        for (j; v % j == 0 && j <= r; j++)
            v /= j;
    }

    for (j; j >= r; j++) // for when v is overflowed
        v /= j;
    return v;
}

There are some clarifications to make on this code, however. We know that the result v must be an integer at the end; this would mean that all terms in the denominator must cancel with terms in the numerator. Without loss of generality, we can assume (nr)>r since I can make a use of (nr)=(nnr) if the condition is not met. Then: n!(nr)!r!=n(n1)(nr)!(nr)!r!=n(n1)(nr+1)r! Note we can't cancel the numerator further since we had assumed nr>r. Thus, all we have to do is to multiply numbers from nr+1 to n and divide by r!.

I've also avoided integer overflows to some extent. If I calculated n(n1)(nr+1) and r! separately and divided the former by the latter, my function would have encountered integer overflows for a much smaller n. Since we know that the result is an integer, the numerator is necessarily larger than the denominator, and we can eagerly divide the numerator by the next denominator as we compute values simultaneously.

Finally, when v overflows, it’s not necessarily true that v % j == 0 even if j was not r yet. The last for loop exits to guarantee the termination of the program so that I could measure the performance of this function even with a large n that causes an integer overflow.

Performance Comparison #

I compared my two different implementations for different n. For each n, I calculated (n1)(n2)(nr). Here are results:

n CPascal (ms) CFactorial (ms)
100 0 0
200 0 0
300 15 0
400 16 0
500 15 0
600 16 16
700 0 47
800 31 16
900 47 16
1000 62 31
2000 172 110
3000 578 406
4000 703 375
5000 1094 656
6000 1438 766
7000 2062 1047
8000 2672 1390
9000 4016 2156
10000 5735 2343
20000 17657 9375
30000 40657 21703
40000 72515 33063
50000 105985 54953

CFactorial is almost twice as fast as CPascal. Of course, I did not get accurate results since since (20001000) exceeds 10600 far beyond the upper bound of an integer in x86 processors that I used (Intel Pentium M processor).