Performance comparison of Clojure, Ruby, and Python

Introduction

A fair performance comparison of implementations of different programming languages is difficult. Ideally, one would measure the performance for a wide range of algorithms and programming tasks. However, each language has different application areas where it performs best.

In the following we compare the performance of Clojure, Ruby, and Python on the factorial function. While the results do not allow us to make general statements about the performance of each language implementation, they give us an idea of the individual strengths and weaknesses.

The measurements were performed using an AMD Ryzen 7 4700U with 16GB RAM and 8 cores.
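
The post does not say which benchmarking tools produced the timings below. As a rough illustration only, per-call times in the nanosecond range can be estimated in Python with the standard timeit module. The helper name ns_per_call and the default repeat counts are assumptions made for this sketch, and the lambda wrapper adds a small call overhead of its own to every measurement.

```python
import math
import timeit


def ns_per_call(func, *args, number=100_000, repeat=5):
    """Estimate the time of one call to func(*args) in nanoseconds.

    Takes the best of `repeat` runs, each averaging over `number` calls,
    to reduce the influence of background noise.
    """
    timer = timeit.Timer(lambda: func(*args))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


if __name__ == "__main__":
    print(f"math.factorial(20): {ns_per_call(math.factorial, 20):.1f} ns")
```

Taking the minimum rather than the mean is a common choice for micro-benchmarks, because interference from other processes can only make a run slower.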

Implementations

There are different ways to implement the factorial function.

Recursive

In functional programming the recursive implementation of the factorial function is the most common.

The Clojure implementation is as follows.

(defn factorial [n]
  (if (zero? n)
    1
    (*' n (factorial (dec n)))))

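Note that we use *' instead of *. The *' function automatically promotes the result to an arbitrary-precision integer on overflow, whereas * throws an ArithmeticException once the result exceeds the 64-bit long range.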

The Ruby implementation is as follows.

def factorial n
  return 1 if n <= 1
  n * factorial(n - 1)
end

Finally the Python implementation is as follows.

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

Loop

If we introduce an additional variable, we can implement the factorial using a loop.

In Clojure we can use tail recursion as follows.

(defn factorial [n]
  (loop [result 1N n n]
    (if (zero? n)
      result
      (recur (*' result n) (dec n)))))

Note that we need to initialise the result with 1N (big number type) instead of 1, because the rebinding of the result variable does not allow a dynamic type change.

In Ruby we can implement factorial using a while loop.

def factorial n
  result = 1
  while n.positive?
    result *= n
    n -= 1
  end
  result
end

In Python we can implement factorial using a while loop as well.

def factorial(n):
    result = 1
    while n > 0:
        result *= n
        n -= 1
    return result

Reduce

Using the higher-order function reduce we can implement the factorial as follows.

In Clojure:

(defn factorial [n] (reduce *' (range 1 (inc n))))

In Ruby:

def factorial n
  1.upto(n).reduce :*
end

And in Python:

import operator
from functools import reduce

def factorial(n):
    return reduce(operator.mul, range(1, n + 1))
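
Before timing the variants it is worth confirming that they agree. The sketch below restates the three Python versions under distinct, hypothetical names (factorial_recursive, factorial_loop, factorial_reduce) and compares them with math.factorial. Unlike the listing above, reduce is given an initial value of 1 here so that an argument of 0 also works.

```python
import math
import operator
from functools import reduce


def factorial_recursive(n):
    return 1 if n == 0 else n * factorial_recursive(n - 1)


def factorial_loop(n):
    result = 1
    while n > 0:
        result *= n
        n -= 1
    return result


def factorial_reduce(n):
    # The initial value 1 makes the empty range (n == 0) return 1.
    return reduce(operator.mul, range(1, n + 1), 1)


def all_agree(n):
    """Return True if every variant matches math.factorial for n."""
    expected = math.factorial(n)
    variants = (factorial_recursive, factorial_loop, factorial_reduce)
    return all(f(n) == expected for f in variants)


if __name__ == "__main__":
    print(all(all_agree(n) for n in (0, 1, 20, 100)))
```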

Unchecked integer math

One can use unchecked integers and type annotations in Clojure if it is known that the result is not going to exceed the 64-bit integer range.

(set! *unchecked-math* true)
(defn factorial [^long n] (if (zero? n) 1 (* n (factorial (dec n)))))
(set! *unchecked-math* false)

A similar approach in Python for improving performance is to use the Cython compiler. Here the method is implemented in a dialect of Python which uses static typing.

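def factorial(int n):
    # long long is at least 64 bits wide, so 20! still fits
    cdef long long i, ret
    ret = 1
    # multiply by the loop counter i (1..n), not by n itself
    for i in range(1, n + 1):
        ret *= i
    return ret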

For Ruby there is the RubyInline library, which requires reimplementing the method in C.

require 'inline'

class Factorial
  inline do |builder|
    # result is declared long so that 20! fits where long is 64 bits wide
    builder.c "
        long factorial(int max) {
          long i = max, result = 1;
          while (i >= 2) { result *= i--; }
          return result;
        }"
  end
end

Other implementations

In Python one can instead use the factorial from the math module.
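
For reference, a minimal usage example of the standard library function:

```python
import math

# math.factorial takes a non-negative integer and returns an exact int.
fact_20 = math.factorial(20)
fact_100 = math.factorial(100)

print(fact_20)              # 2432902008176640000
print(len(str(fact_100)))   # number of decimal digits in 100!
```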

In Clojure one can apply the multiplication function to a range of numbers, since multiplication in Clojure can take an arbitrary number of arguments.

(defn factorial [n] (apply *' (range 1 (inc n))))

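Of the three languages, only Clojure is tested here with a parallel implementation; in their default builds, the standard Ruby and Python interpreters use a global lock that prevents threads from running interpreter code in parallel. For computing factorials we can use the fold function. Here we split the task into two chunks. Unfortunately range does not support random access, so we need to convert it to a vector.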

(defn factorial4 [n] (clojure.core.reducers/fold (quot n 2) *' *' (vec (range 1 (inc n)))))
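
The split-and-combine structure of fold can be sketched in Python as follows. This is an illustration of the idea, not a port of clojure.core.reducers: the name factorial_two_chunks is hypothetical, and because the default CPython build lets only one thread execute Python code at a time, the two threads here are not expected to give a speed-up.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def factorial_two_chunks(n):
    """Multiply 1..n as two half-ranges, then combine the partial products."""
    if n < 2:
        return 1
    middle = n // 2
    chunks = [range(1, middle + 1), range(middle + 1, n + 1)]
    with ThreadPoolExecutor(max_workers=2) as pool:
        # math.prod reduces each chunk; pool.map keeps the input order.
        low, high = pool.map(math.prod, chunks)
    # The final multiplication plays the role of fold's combine function.
    return low * high
```

As with the Clojure version, the cost of handing work to a pool is large compared with a few dozen multiplications, so this shape only pays off for much bigger inputs.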

Finally if the input argument is known at compile time, one can use a macro in Clojure. This obviously is going to have much better performance than all the other implementations.

(defmacro factorial-macro [n]
  `(fn [] (*' ~@(range 1 (inc n)))))

Factorial of 20

First we compared the performance of computing the factorial of 20.

implementation       Clojure 1.12.0   Ruby 3.4.1   Python 3.13.1
recursive            104 ns           538 ns       957 ns
loop                 164 ns           665 ns       717 ns
reduce               116 ns           1512 ns      718 ns
unchecked integer    44.4 ns          77.6 ns      41.6 ns
fold                 6211 ns          n/a          n/a
math library         n/a              n/a          45.4 ns
apply                178 ns           n/a          n/a
macro                0.523 ns         n/a          n/a

Clojure runs on the JVM, and its recursive, loop, and reduce implementations of factorial are the fastest of the three languages. Forcing fold, which is a parallel version of reduce, to use two threads does not help: at 6211 ns it is more than 50 times slower than the sequential reduce (116 ns).

Note that the recursive implementation in Ruby is faster than the one in Python. This may be due to YJIT, the optimizing JIT compiler built into the Ruby interpreter.

Also note that the loop implementation in Python is faster than the recursive implementation. Surprisingly the reduce implementation in Python has comparable performance to the loop implementation.

The factorial implementation of the Python math library is very fast. The Python math library implementation uses a lookup table for arguments up to 20.
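
The lookup-table idea can be sketched in pure Python as below. This is not the C code CPython actually uses; the names SMALL_FACTORIALS and factorial_with_table are made up for the illustration, and the fallback for larger arguments is a plain loop rather than whatever algorithm the math module applies.

```python
# Precompute 0! .. 20! once at import time; 20! is the largest factorial
# that fits in a signed 64-bit integer.
SMALL_FACTORIALS = [1]
for k in range(1, 21):
    SMALL_FACTORIALS.append(SMALL_FACTORIALS[-1] * k)


def factorial_with_table(n):
    """Return n!, using the table for n <= 20 and a loop beyond that."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    if n < len(SMALL_FACTORIALS):
        return SMALL_FACTORIALS[n]
    # Continue from 20! for larger arguments.
    result = SMALL_FACTORIALS[-1]
    for k in range(len(SMALL_FACTORIALS), n + 1):
        result *= k
    return result
```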

Since the result of factorial of 20 fits into a 64-bit integer, one can use unchecked integers in Clojure to get a fast implementation. At 44.4 ns it is slightly faster than the Python math library implementation (45.4 ns), which does not perform unchecked 64-bit integer math. In Python one can use the Cython dialect to compile to C and use unchecked math with 64-bit integers; at 41.6 ns this is the fastest runtime implementation for computing factorial of 20. Finally, one can use RubyInline to embed a C implementation in a Ruby program (77.6 ns).
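
The 64-bit limit mentioned above can be checked directly, since Python integers have arbitrary precision. The helpers wrap_int64 and unchecked_factorial are hypothetical functions written for this sketch; they emulate the two's-complement wraparound that unchecked 64-bit multiplication produces on the JVM.

```python
import math

INT64_MAX = 2**63 - 1


def wrap_int64(value):
    """Reduce an arbitrary-precision int to a signed 64-bit value."""
    value &= (1 << 64) - 1  # keep the low 64 bits
    return value - (1 << 64) if value > INT64_MAX else value


def unchecked_factorial(n):
    """Factorial with every multiplication wrapped to 64 bits."""
    result = 1
    for k in range(2, n + 1):
        result = wrap_int64(result * k)
    return result


if __name__ == "__main__":
    print(math.factorial(20) <= INT64_MAX)  # 20! still fits
    print(math.factorial(21) <= INT64_MAX)  # 21! does not
    print(unchecked_factorial(21))          # silently wrong result
```

For arguments up to 20 the wrapped and exact results coincide; from 21 onwards the unchecked version returns a wrong value without raising an error, which is why this optimization is only safe when the argument range is known.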

Finally using a macro, which of the three languages only Clojure supports, is faster than all other implementations but obviously limited to cases where the function argument is known at compile time.

Factorial of 100

When computing the factorial of arguments greater than 20, big integers are required. The following table shows the performance of computing the factorial of 100.

implementation    Clojure 1.12.0   Ruby 3.4.1   Python 3.13.1
recursive         3148 ns          27191 ns     5520 ns
loop              3434 ns          32426 ns     4430 ns
reduce            2592 ns          28851 ns     3790 ns
fold              16012 ns         n/a          n/a
math library      n/a              n/a          599 ns
apply             2624 ns          n/a          n/a
macro             3.70 ns          n/a          n/a

Again, the recursive, loop, and reduce implementations are fastest in Clojure. Using fold with two chunks again fares much worse (16012 ns versus 2592 ns for reduce).

The Ruby implementations fare much worse for big numbers: they are roughly five to eleven times slower than the Python and Clojure versions. Possibly the big number implementation in Ruby is considerably slower than those of Python and Clojure.

Overall the implementation in the Python math library is the fastest candidate (unless we can use a Clojure macro of course).

Conclusion

As stated before, one cannot generalize this limited performance comparison. However, perhaps one can make the following observations:

Any suggestions and comments are welcome.

Updates:

Because Python and Ruby bind methods late, there is a significant overhead when calling methods. In the following the identity function is tested in Clojure, Ruby, and Python.

implementation       Clojure 1.12.0   Ruby 3.4.1   Python 3.13.1
identity function    0.0023 ns        42.5 ns      44.2 ns
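
A comparable measurement can be sketched in Python with timeit. The function name identity, the helper call_overhead_ns, and the iteration count are assumptions of this sketch, not the setup used for the table. It times a call to a trivial function against evaluating the bare name, so the difference approximates the cost of the call itself.

```python
import timeit


def identity(x):
    return x


def call_overhead_ns(number=1_000_000):
    """Approximate per-call overhead of identity(x) in nanoseconds."""
    env = {"identity": identity, "x": 42}
    # Time the statement with the call and the same statement without it.
    with_call = timeit.timeit("identity(x)", globals=env, number=number)
    without_call = timeit.timeit("x", globals=env, number=number)
    return (with_call - without_call) / number * 1e9


if __name__ == "__main__":
    print(f"approximate call overhead: {call_overhead_ns():.1f} ns")
```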