Clojure Cookbook: Numbers

Converting Between Binary and Decimal

Problem

You want to produce the binary representation of an integer as a String, or you want to convert the binary representation into an integer.

Solution

Clojure has built-in support for integer literals in various bases. The Clojure reader will interpret nrdigits as a base-n integer:

2r1011 => 11
8r377 => 255
16rCAFEBABE => 3405691582
(Java trivia: Every Java .class file has these four bytes as its magic number—the first four bytes of the file.)

00000000:  CA FE BA BE 00 03 00 2D-00 84 0A 00 35 00 45 07  .......--....5.E.
00000010:  00 46 09 00 34 00 47 07-00 48 0A 00 04 00 49 0A  .F..4.G.-.H....I.
00000020:  00 4A 00 4B 07 00 4C 08-00 4D 0A 00 07 00 4E 0A  .J.K..L.-.M....N.
00000030:  00 4A 00 4F 08 00 50 08-00 51 08 00 52 08 00 53  .J.O..P.-.Q..R..S

Both the designator r and the representation of the number are read in a case-insensitive manner:

16rDEADBEEF => 3735928559
16Rdeadbeef => 3735928559

However, in order to perform the necessary conversions we will use two static methods of Java's Integer class.

Given an integer value, we produce a binary String representation using (Integer/toString integer radix):

(Integer/toString 19 2) => "10011"
(Integer/toString 45881 2) => "1011001100111001"

In the other direction, we can take a binary String and create the corresponding integer by using (Integer/parseInt string radix):

(Integer/parseInt "101010101" 2) => 341
(Integer/parseInt "110110110" 2) => 438

Converting Between Degrees and Radians

Problem

How do I work with the trigonometric functions in degrees or radians?

Solution

Java's trigonometric functions in the Math class expect their arguments to be given in radians. To convert from degrees to radians, use the static method toRadians():

(Math/toRadians 90) => 1.5707963267948966
(== (Math/toRadians 180) Math/PI) => true

And in the opposite direction, use toDegrees():

(Math/toDegrees Math/PI) => 180.0

Example - Compute Great-circle Distance

Compute the great-circle distance between two points on earth given their latitude and longitude in degrees. The distance is computed in kilometers by default. For miles or nautical miles use the appropriate optional fifth argument (3958.761 mi and 3440.069 nm, respectively). In fact, given the appropriate radius you can compute the distance between cities on other planets too!

(defn great-circle-distance
  ([lat1 long1 lat2 long2] (great-circle-distance lat1 long1 lat2 long2 6371.009))
  ([lat1 long1 lat2 long2 radius]
     (let [[lat1-r long1-r lat2-r long2-r]
           (map #(Math/toRadians %) [lat1 long1 lat2 long2])]
       (* radius
          (Math/acos (+ (* (Math/sin lat1-r) (Math/sin lat2-r))
                        (* (Math/cos lat1-r) (Math/cos lat2-r) (Math/cos (- long1-r long2-r)))) )))) )

Great-circle distance between Paris and San Francisco in nautical miles:

(great-circle-distance 48.87 2.33 37.8 -122.4 3440.069) => 4831.502535634215

Putting Commas in Numbers

Problem

How do I display a number with commas in the proper places?

Solution

In the United States, by convention, we divide large numbers into groups of three digits by means of commas. Since we group by counting from the right it is actually easier to work with a reversed numeric string and then reverse the result again when we are done. Here is a first attempt at an implementation:

(defn commify [s]
  (let [s (reverse s)]
    (loop [[group remainder] (split-at 3 s)
           result '()]
      (if (empty? remainder)
        (apply str (reverse (rest (concat result (list \,) group)))) ; rest to remove extraneous ,
        (recur (split-at 3 remainder) (concat result (list \,) group)))) ))

First we reverse the string and loop across it three characters at a time inserting commas. Once we have processed all of the characters there is an extraneous comma at the end, which we remove before performing the final reverse.

This seems to work:
(commify "1234567") => "1,234,567"

Unfortunately, it falters with negative integers or floating-point numbers:
(commify "-834196") => "-,834,196"
(commify "12345.67") => "12,345,.67"

An alternative solution involves Clojure's regular expression features. We will borrow a pattern from the Perl Cookbook:

(defn commify [s]
  (let [matcher (re-matcher #"(\d\d\d)(?=\d)(?!\d*\.)" (apply str (reverse s)))]
    (apply str (reverse (.replaceAll matcher "$1,")))) )

This handles all of our cases now:
(commify "1234567") => "1,234,567"
(commify "-834196") => "-834,196"
(commify "12345.67") => "12,345.67"

This version works fine in the U.S., but with a couple of changes we can create a version that works in many other countries as well:

(defn commify
  ([s] (commify s \,))
  ([s separator] (commify s separator 3))
  ([s separator group-size]
     (let [matcher (re-matcher (re-pattern (str "("
                                                (apply str (repeat group-size "\\d"))
                                                ")(?=\\d)(?!\\d*\\.)"))
                               (apply str (reverse s)))]
       (apply str (reverse (.replaceAll matcher (str "$1" separator)))) )))

We now have two optional arguments in addition to the required string. First, we can specify an alternative separator character, or default to a comma as above. Second, we can choose how many digits we want to group together. Here are some more examples:
(commify "12345.67") => "12,345.67"
Still works.
(commify "12345.67" \space 2) => "1 23 45.67"
(commify "1011010100111010" \space 4) => "1011 0101 0011 1010"
(commify "87234291" \.) => "87.234.291"
(commify "87234291" \') => "87'234'291"

User matti suggested a simpler solution that takes advantage of built-in Java features:

(import '(java.text NumberFormat)
        '(java.util Locale))
(defn commify
  ([n] (commify n (Locale/US)))
  ([n locale]
     (.format (NumberFormat/getInstance locale) (bigdec n))))

This implementation provides all of the functionality of the previous version aside from the size of the groupings. If you will excuse the bias towards the U.S. (introduced by me, not matti), which I weakly justify by the fact that Rich is American, you can specify other locales explicitly:
(commify "-123456.88") => "-123,456.88"
(commify "-123456.88" (Locale/GERMAN)) => "-123.456,88"

This version will also handle numeric input rather than the String required above:
(commify 121314151617) => "121,314,151,617"

Comparing Floating-point Numbers

Problem

How do I tell whether or not two floating-point numbers are equal?

Solution

Suppose that we have a function that performs a computation and yields a floating-point value as its result. Furthermore, suppose that we know what values to expect for certain inputs. How can we determine whether the computed values are correct considering the complexities involving floating-point arithmetic?

Floating-point numbers in Clojure are built on the primitive floating-point types of Java by default. Clojure also supports java.math.BigDecimal arbitrary-precision floats, but we will limit our discussion here to the fixed-precision primitive types float and double.

Clojure deals primarily with floating-point numbers of type double. These are automatically boxed in Java's wrapper class java.lang.Double:
(class 1.2) => java.lang.Double

Java uses 64-bit values to represent double's, and they provide about 16 digits of precision:
1234567890.12345678901234567890 => 1.2345678901234567E9

We can also work with Java's single-precision float type, but we need to be explicit and use the float function:
(class (float 1.2)) => java.lang.Float

For most purposes we'll stick with double's.

Because Clojure (and Java) use a fixed amount of memory to store a double value we obviously can't store a rational number with an infinite decimal expansion exactly:
(/ 1.0 3.0) => 0.3333333333333333

The fraction 1/3 is equal to 0.333…, where the trailing periods imply an infinite string of 3's. Clojure's representation terminates after 16 digits, so clearly it is only an approximation. How far off is this approximation? Well, let's ask Clojure to show us:
(rationalize (/ 1.0 3.0)) => 3333333333333333/10000000000000000

The function rationalize returns a rational number which represents the given floating-point argument.

So we see:
(- (rationalize (/ 1.0 3.0)) 1/3) => -1/30000000000000000
Thus our Clojure value is less than the true value by a very small amount.

But things get much weirder when working with floating-point numbers. A nice number such as 0.1 seems like it should be capable of exact representation as a double, after all it only requires one decimal digit after the decimal point. The problem is that the computer doesn't use decimal. Instead, it uses a binary representation of 0.1, which happens to be:
0.00011001100110011001100110011001100110011001100110011…

We see those periods again, so we know that we will have trouble with a fixed-precision representation. In fact, we can very easily see discrepancies:
(== (+ 0.1 0.1 0.1 0.1 0.1 0.1) (* 6 0.1)) => false
Evidently, adding 0.1 6 times is different from multiplying 0.1 by 6. And Clojure says:
(rationalize (+ 0.1 0.1 0.1 0.1 0.1 0.1)) => 3/5
(rationalize (* 6 0.1)) => 6000000000000001/10000000000000000
Wow! Off by just 1/10000000000000000.

As another example, let's take a look at how the same decimal fraction is stored differently in two different numbers. Consider the number 0.69. Clojure uses all available bits to represent those 2 decimal digits. However, in the number 66.69 some of those bits must be dedicated to the integer part as well. There are fewer bits available for the fractional part. The two numbers are represented as follows:

      0.10110000101000111101011100001010001111010111000010100
1000010.1011000010100011110101110000101000111101011100

This explains the following behavior:
(== 0.69 (- (+ 66 0.69) 66)) => false
After adding 66 and 0.69, some of the bits from 0.69 are lost, and it is not possible to recover them after subtracting 66 again.
(rationalize 0.69) => 69/100
(rationalize (- (+ 66 0.69) 66)) => 6899999999999977/10000000000000000

Note that we would not have this problem had we used Clojure's arbitrary-precision floats:
(== 0.69M (- (+ 66 0.69M) 66)) => true

But even with these java.lang.BigDecimal values there are limits:
(/ 1.0M 3.0M) =>
java.lang.ArithmeticException: Non-terminating decimal expansion; no exact representable decimal result. (NO_SOURCE_FILE:0)
It simply isn't possible to represent 1/3 as a finite decimal expansion.

So let's get back to the question of comparing floating-point numbers. What if we just use Clojure's built-in numeric equality test ==?

(defn trap [x]
  (prn x)
  (cond (== x 1) "How nice. 10 * 0.1 = 1"
        (> x 1) "Whoops. Good thing I had an escape hatch."
        :else (trap (+ x 0.1))))

(trap 0.1) =>
0.1
0.2
0.30000000000000004
0.4
0.5
0.6
0.7
0.7999999999999999
0.8999999999999999
0.9999999999999999
1.0999999999999999
"Whoops. Good thing I had an escape hatch."

The loop just blows right by the (== x 1) clause since the sum results in a value that is very close to but just under 1.0.

Considering all that we've discussed here, it should be clear that we don't normally want to compare two floating-point numbers exactly. But let's pause for a moment to consider what we mean when we say that two numbers are equal.

If we assert that two numbers x and y are equal, x = y, then there are two obvious facts which are closely related to this claim. First, if $x = y$, then $x - y = 0$. Second, if $x = y$, and $y $\neq$ 0$ (and consequently $x $\neq$ 0$), then $\frac{x}{y} = 1$. It is only a small step then to realize that if x is very nearly y, then x - y should be very nearly zero, and that the ratio of x to y should be very close to one. Each of these observations suggests a method of testing how close two numbers are.

First, we can check whether or not the difference between the two numbers is sufficiently close to zero. We will pick a small positive number, conventionally named epsilon (the fifth letter of the Greek alphabet), and check whether the numbers are less than epsilon apart. We can adjust epsilon depending on how close we want two numbers to be in order to consider them "equal".

Since we may not know ahead of time which of the two numbers is larger we don't want to risk subtracting the larger from the smaller. That would result in a negative number which would always be less than our positive epsilon. Instead we will simply take the absolute value of the difference. Then it doesn't matter which one is larger. Here is a first draft using 0.00001 for epsilon:

(defn float= [x y]
  (<= (Math/abs (- x y)) 0.00001))

Let's try it:
(float= (+ 0.1 0.1 0.1 0.1 0.1 0.1) (* 6 0.1)) => true
(float= 0.69 (- (+ 66 0.69) 66)) => true

Unfortunately, even this solution has a problem. It works fine for small numbers, but look what happens when we try to use it with some bigger numbers:
(float= 1.23456e38 (* 1.23456 (Math/pow 10 38))) => false
(* 1.23456 (Math/pow 10 38)) => 1.2345600000000001E38

Clearly with such a large number there aren't any bits available to the right of the decimal (binary) point. So our absolute epsilon is too precise a test in this case. What we really need is a relative value for epsilon based on the magnitude of the numbers we are testing. We can modify our function to accommodate this. But we must be careful when our numbers could be zero or negative. We don't want our epsilon to end up less than or equal to zero:

(defn float= [x y]
  (let [epsilon 0.00001
        scale (if (or (zero? x) (zero? y)) 1 (Math/abs x))]
    (<= (Math/abs (- x y)) (* scale epsilon))))

In particular, we have to check both x and y to see if either is zero. If we only checked x and used it to scale epsilon, then our function would not be commutative. We would get this result:
(float= 0.0 0.000000001) => true
(float= 0.000000001 0.0) => false

Our fixed function works properly now:
(float= 1.23456e38 (* 1.23456 (Math/pow 10 38))) => true
(float= -1.23456e38 (* -1.23456 (Math/pow 10 38))) => true

Depending on your application, you may need to adjust epsilon:
(float= 0.01 0.0) => false
(float= 0.001 0.0) => false
(float= 0.0001 0.0) => false
(float= 0.00001 0.0) => true

(defn float=
  ([x y] (float= x y 0.00001))
  ([x y epsilon]
     (let [scale (if (or (zero? x) (zero? y)) 1 (Math/abs x))]
       (<= (Math/abs (- x y)) (* scale epsilon)))) )

(float= 0.0 0.00001) => true
(float= 0.0 0.00001 1e-7) => false

Note that this function as written will not handle Clojure Ratio types:
(float= 1/3 (/ 1.0 3.0)) =>
java.lang.IllegalArgumentException: No matching method found: abs (NO_SOURCE_FILE:0)

You have to do the cast explicitly:
(float= (double 1/3) (/ 1.0 3.0)) => true

There's one final touch we can add. Let's define some relational functions for floats. We will refactor float=, then all we need to do is define just one more function and we can get the other three in terms of these two:

(defn- scale [x y]
  (if (or (zero? x) (zero? y))
    1
    (Math/abs x)))

(defn float=
  ([x y] (float= x y 0.00001))
  ([x y epsilon] (<= (Math/abs (- x y))
                     (* (scale x y) epsilon))))

(defn float<
  ([x y] (float< x y 0.00001))
  ([x y epsilon] (< x
                    (- y (* (scale x y) epsilon)))) )

We consider x to be less than y if it is further away from y than epsilon.

The remaining definitions are fairly obvious. x is greater than y when y is less than x. If x is not greater than y, then it is less than or equal to y. And x is greater than or equal to y if it is not less than y.

(defn float>
  ([x y] (float< y x))
  ([x y epsilon] (float< y x epsilon)))

(defn float<=
  ([x y] (not (float> x y)))
  ([x y epsilon] (not (float> x y epsilon))))

(defn float>=
  ([x y] (not (float< x y)))
  ([x y epsilon] (not (float< x y epsilon))))

As with float=, we can choose how strict we want the comparisons to be:
(float< 12.3049 12.305) => false
(float< 12.3049 12.305 1e-6) => true
(float<= 12.305 12.3049) => true
(float<= 12.305 12.3049 1e-6) => false
(float> 12.305 12.3049 1e-6) => true

The other way to compare our two floating-point numbers is to check whether their ratio is succiciently close to one.

Example - Newton-Raphson Method

Let's use our new ability to compare floats and explore how to compute the square root of a number.

In Clojure we are fortunate to have access to Java's Math.sqrt() method when we need to find the square root of some number. But somebody had to implement the method. How does one compute square roots efficiently? Here is one approach.

Isaac Newton created a method of approximating the roots of certain functions in the 17th century. Joseph Raphson later developed the method further, so the technique is often called the Newton-Raphson method. Suppose we want to find the square root of a number c. In other words, we would like to find the number x such that $x = \sqrt{c}$. But this is true when $x^{2} = c$ or $x^{2} - c = 0$. So we are faced with finding the positive root of the function $f(x) = x^{2} - c$, which is where the Newton-Raphson method comes in.

The process involves picking some initial approximation t. For our example we can simply choose $t = c$. Our plan is to compute successive values of t that are closer and closer to $\sqrt{c}$. Suppose $t = \sqrt{c}$, then $t^{2} = c$ or $t = \frac{c}{t}$. We have the exact value when $t - \frac{c}{t} = 0$, but we will be happy if we get close enough to the exact answer, i.e., if $|{t - \frac{c}{t}| < $\epsilon$$ for a sufficiently small value of epsilon.

Using some basic ideas from the differential calculus it can be shown that replacing t with $\frac{\frac{c}{t} + t}{2}$ will produce a closer approximation. We will repeatedly apply this transformation until our float= function says we are close enough:

(defn square-root
  ([c] (square-root c 1e-15))
  ([c epsilon]
     (loop [t (double c)]
       (if (float= t (/ c t) epsilon)
         t
         (recur (double (/ (+ t (/ c t)) 2)))) )))

(square-root 2) => 1.414213562373095
(square-root 2544545) => 1595.1630010754388
(Math/sqrt 2) => 1.4142135623730951
(Math/sqrt 2544545) => 1595.1630010754386

We do have to be careful though. If we specify too small a value for epsilon, we may never get close enough and wind up in an infinite loop:
(square-root 2544545 1e-16) => …crickets…

Finding Prime Factors

Problem

How do I find the prime factorization of a positive integer greater than 1?

Solution

(defn factors [n]
  (loop [i 2
         n n
         result []]
    (if (<= i (/ n i))
      (if (zero? (rem n i))
        (recur i (/ n i) (conj result i))
        (recur (inc i) n result))
      (if (> n 1)
        (conj result n)
        result))))

The loop begins with i equal to 2, the first possible prime factor. Each subsequent iteration sees i increase or n reduced by some prime factor. This continues until the test (<= i (/ n i)) fails, but i ≤ n/i => i² ≤ n => i ≤ √n. In other words, we can stop checking once i is greater than the square root of n. This is true because if n were to have a prime factor greater than √n, then it would have to have another factor less than √n, and we would have already removed this factor. When the loop terminates what's left of n is either 1 or the final prime factor.

There is a loop invariant to consider here. Each time that i is increased n will have no prime factors between 2 and i-1. Any such factors will have already been removed. Consequently, if i is not prime, then it will not divide n (the factors of i cannot be factors of n). Otherwise, i is prime and the other branch will conjoin i onto the result while reducing n by the appropriate factor.

(factors 2) => [2]
(factors 8) => [2 2 2]
(factors 86) => [2 43]
(factors 3757208) => [2 2 2 7 13 13 397]
(factors 287994837222311) => [17 1739347 9739789]

(let [factor-list '(2 2 2 7 13 19)]
  (= factor-list (factors (reduce * factor-list)))) =>
true

We can easily handle the examples from the Perl Cookbook:
(reduce * (map (fn [coll] (reduce * coll)) [(repeat 19 2) [3] (repeat 18 5) [39887]])) =>
239322000000000000000000

(factors 239322000000000000000000) =>
[2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 39887]

(reduce * (map (fn [coll] (reduce * coll)) [(repeat 24 2) (repeat 26 5)])) =>
25000000000000000000000000

(factors 25000000000000000000000000) =>
[2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5]

Back to Clojure Cookbook: Table of Contents


Comments

Add a New Comment
or Sign in as Wikidot user
(will not be published)
- +

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License