Fast algorithm for computing the integer square root of a number

Bo Tian

Determining whether a number is a perfect square and finding its square root if it is are common problems in number theory and computer science.

A number nn is a perfect square if there exists an integer kk such that

k2=nk^2 = n

Perfect squares play a crucial role in cryptographic algorithms, where square roots modulo a prime are commonly used. For example, in Fermat's method, a pair of perfect square numbers is needed to create a solution for integer factoring. In the Quadratic Sieve algorithm, a "smooth number" is relaxed by allowing an additional perfect square term, even when the root is not in the base prime set.

In the mentioned scenarios, determining if a number is a perfect square is just as important as computing the square root, especially as perfect square numbers become sparse with larger numbers. Therefore, it is important to give up early before investing time to calculate the square root if the number is not a perfect square.

It's worth noting that we cannot rely on floating-point arithmetic to solve this problem, as it can introduce precision errors. For example, using floating-point division to calculate the square root and then squaring the result may not yield the exact original number due to precision loss, especially for very large integers. This is why we need a specialized method to compute the square root of integers.

In this post, I will demonstrate a new method for efficiently determining if a number is a perfect square and finding its integer square root. Additionally, a performance comparison is provided to showcase the improvement.

Well-Known approaches to find integer square root

Checking if a number is a perfect square and finding its square root is a common computational task with various applications. The binary search method, Newton's method, and the bitwise method are commonly used ways to achieve this.

  1. Binary Search Algorithm:

    Method: This algorithm uses a binary search approach to find the integer square root. By repeatedly halving the search space, it converges to the largest integer whose square is less than or equal to nn.

    Complexity: O(log(n))O(log(n))

# the implementation of the binary search algorithm to find the square root of a number
def integer_square_root_binary_search(n):
    if n < 0:
        return False, None
    low, high = 0, n
    while low <= high:
        mid = (low + high) // 2
        square = mid * mid
        if square == n:
            return True, mid
        elif square < n:
            low = mid + 1
        else:
            high = mid - 1
    return False, None

The following is an example execution of the binary search algorithm (with additional debug information):

Finding the square root of 5438224 using binary search algorithm.
iteration          low         high
        1            0      5438224
        2            0      2719111
        3            0      1359554
        4            0       679776
        5            0       339887
        6            0       169942
        7            0        84970
        8            0        42484
        9            0        21241
       10            0        10619
       11            0         5308
       12            0         2653
       13         1327         2653
       14         1991         2653
       15         2323         2653
       16         2323         2487
       17         2323         2404
       18         2323         2362
       19         2323         2341

The square root of 5438224 is 2332
  1. Newton's Method:

    Method: An iterative approach that starts with an initial guess and refines it using the formula

    xk+1=12(xk+nxk)x_{k+1} = \frac{1}{2} (x_k + \frac{n}{x_k})

    This method converges quadratically, making it very efficient.

    Complexity: O(logn)O(logn) for the number of iterations

    Note: Python math.isqrt() is an implementation of Newton's method.

# the implementation of the Newton's method to find the square root of a number
def integer_square_root_newton_method(n):
    if n < 0:
        raise ValueError("Input must be a non-negative integer")
    x = n
    y = (x + n // x) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    is_square = (x * x == n)
    return is_square, x if is_square else None

The following is an example execution of the Newton's method:

Finding the square root of 5396329 using Newton's method
iteration            y            x
        1      2698165      5396329
        2      1349083      2698165
        3       674543      1349083
        4       337275       674543
        5       168645       337275
        6        84338       168645
        7        42200        84338
        8        21163        42200
        9        10708        21163
       10         5605        10708
       11         3283         5605
       12         2463         3283
       13         2326         2463
       14         2323         2326

The square root of 5396329 is 2323
  1. Bitwise Method:

    Method: This method iterates over the bits of the integer from the most significant to the least significant, constructing the result bit by bit.

    Complexity: O(logn)O(logn)

def integer_square_root_bitwise(n):
    if n < 0:
        raise ValueError("Input must be a non-negative integer")
    orig_n = n
    result = 0
    bit = 1 << (n.bit_length() // 2 * 2)
    while bit > 0:
        if n >= result + bit:
            n -= result + bit
            result = (result >> 1) + bit
        else:
            result >>= 1
        bit >>= 2
    is_square = (result * result == orig_n)
    return is_square, result if is_square else None

The following is an example execution of the bitwise approach:

Finding the square root of 5438224 using bitwise method
iteration       result          bit            n
        1            0      4194304      5438224
        2      4194304      1048576      1243920
        3      2097152       262144      1243920
        4      1048576        65536      1243920
        5       589824        16384       129808
        6       294912         4096       129808
        7       147456         1024       129808
        8        73728          256       129808
        9        37120           64        55824
       10        18624           16        18640
       11         9328            4            0
       12         4664            1            0

iteration:  13
The square root of 5438224 is 2332

The improvement

I've explored three methods for determining if a number is a perfect square and finding its square root. All these methods require a full computation to confirm whether the number is a perfect square, which can be quite costly. This is particularly problematic when searching for perfect square numbers using methods like Fermat's, as in many cases, the number in question is not a perfect square, making the entire calculation wasteful.

One more issue is that all three methods tend to slow down as they approach the final solution. For instance, when using Newton's method, the values of x and y decrease by half in the initial iterations, but the change becomes very small in the final iterations.

To address the issues, I incorporated several techniques to make the algorithm faster. In most cases, it finds the solution in less than 2 iterations. The techniques include:

  1. Early detection of non-square number
  2. Acurate initial guess
  3. Faster iteration

Early detection of non-square number

The early detection of non-square numbers is based on the modular property of square numbers. For example, the last digit in a square number can only be one of {0, 1, 4, 5, 6, 9} in the base-10 system. Therefore, any number ending with 2 can be immediately detected as a non-square number. This fact is related to the property of taking a square number modulo 10.

In computers, it is more convenient to use modulo 16. Similarly, we know that a square number modulo 16 must be one of {0, 1, 4, 9}. This can be seen by:

x≡±1mod  8  ⟹  x2≡1mod  16x \equiv \pm1 \mod 8 \implies x^2 \equiv 1 \mod 16
x≡±2mod  8  ⟹  x2≡4mod  16x \equiv \pm2 \mod 8 \implies x^2 \equiv 4 \mod 16
x≡±3mod  8  ⟹  x2≡9mod  16x \equiv \pm3 \mod 8 \implies x^2 \equiv 9 \mod 16
x≡±4mod  8  ⟹  x2≡0mod  16x \equiv \pm4 \mod 8 \implies x^2 \equiv 0 \mod 16

If the remainder of a number divided by 16 is not one of {0, 1, 4, 9}, then we can conclude that the number is not a square. This method efficiently eliminates most of the non-square numbers. However, if the remainder is one of {0, 1, 4, 9}, the number could still be non-square.

Acurate initial guess

All the three methods, i.e. the binary search, the Newton's method, and the bitwise method start searching from x=nx = n, which is far from the final solution, thus take longer time to converge. If we can provide a better initial estimation, the performance can then be improved.

Suppose nn has 8+2k8 + 2 k bits. Let aa be the number formed by the first 8 bits. We have ⌊a⌋2k≤n≤⌈a+1⌉2k\lfloor \sqrt{a} \rfloor 2^k\le n \le \lceil \sqrt{a+1} \rceil 2^k. This bound can be direstly used in the binary search schema.

In the algorithm to be discussed, I will take x=⌊a⌋2kx = \lfloor \sqrt{a} \rfloor 2^k as the initial estimation.

Faster iteration

As mentioned earlier, the problem of Newton's method is that it slows down at the final stage. To find a faster iteration, let's have a look at the equation x2=nx^2 = n.

Let xkx_k be the current estimation of xx, xk2<nx_k^2 \lt n, we can replace xx with xk+δx_k + \delta, δ>0\delta > 0. We have

(xk+δ)2=n(x_k + \delta)^2 = n

If the original equation has an integer solution, then the new equation with respect to δ\delta will also have an integer solution. Rewrite the equation as:

2xkδ+δ2=n−xk22 x_k \delta + \delta^2 = n - x_k^2

Let rk=n−xk2r_k = n - x_k^2, we have

2xkδ<2xkδ+δ2=rk2 x_k \delta \lt 2 x_k \delta + \delta^2 = r_k.

We have the upper bound of δ\delta as

δ≤δmax=⌊rk2xk⌋\delta \le \delta_{max} = \Bigl \lfloor \dfrac{r_k}{2 x_k} \Bigr \rfloor

With the known upper bound, we now have

rk=2xkδ+δ2≤2xkδ+δmaxδ=(2xk+δmax)δr_k = 2 x_k \delta + \delta^2 \le 2 x_k \delta + \delta_{max} \delta = (2 x_k + \delta_{max})\delta

thus we have the lower bound of δ\delta as

δ≥δmin=⌈rk2xk+δmax⌉\delta \ge \delta_{min} = \Bigl \lceil \dfrac{r_k}{2 x_k + \delta_{max}} \Bigr \rceil

As δmin\delta_{min} takes the square term into account, x+δminx + \delta_{min} is a better estimation of xx comparing to x+δmaxx + \delta_{max}. So we can update the estimation of xx as

xk+1=xk+δminx_{k+1} = x_k + \delta_{min}

rkr_{k} can be updated with

rk+1=rk−(2xk+δmin)δminr_{k+1} = r_k - (2 x_k + \delta_{min}) \delta_{min}

After this updating, if nn is not a perfect square number, rk+1r_{k+1} can be a negative number. In such case, the algorithm will report a non-square number and stop.

from math import floor, sqrt

def estimate_square_root(n, est = [int(floor(sqrt(i))) for i in range(256)]):
    m = n.bit_length()
    m = ((m+1) >> 1) << 1
    m -= 8
    x = (n >> m)    # x is the most significant 8 bits
    x = est[x] << (m >> 1)
    return x

def integer_square_root_bo_method(n):
    r16 = n & 15
    if r16 not in [0, 1, 4, 9]:
        return False, None

    # initial guess
    x = estimate_square_root(n)
    if (n < 256):
        if (x*x == n):
            return True, x
        else:
            return False, None

    upper = 0
    r = n - x*x
    while r > 0:
        _2x = 2*x
        upper = r // _2x
        lower = (r - 1) // (_2x + upper) + 1 # the same as ceil(r / (_2x + upper))
        if upper < lower:
            return False, -1

        r -= (_2x + lower) * lower
        x += lower

    if r == 0:
        return True, x

    return False, None

The following is an example execution of the proposed algorithm:

Finding the square root of 2758815150486084950425754176 using Bo's method
iteration                x            range r               
        0   48378511622144                - 418334763712162868194597440
        1   52517137969970     184933324488 765369929220257803953276
        2   52524424323189           505498 3676709702624455
        3   52524424323224                0 0               
The square root of 2758815150486084950425754176 is 52524424323224

The square number in this example consists of 28 digits. It takes 3 iterations to find the solution after the intial guessing. The same number takes 47 iterations for the bitwise method (the SOTA) to finish.

Performance comparison

The following table displays the number of iterations required for integers of various sizes. It is evident that the proposed method requires significantly fewer iterations, approximately two orders of magnitude less.

Num Digits Binary search Newton's method Bitwise method Bo's method
4 14 10 9 1
8 23 17 15 2
16 53 31 29 2
32 106 59 55 3
64 212 113 108 3
128 424 220 215 4

The following plot shows the average execution time for each algorithm to find the integer square root of the test numbers. The test numbers are chosen to be perfect square numbers. The X axes is the number of digits of the test numbers and the Y axes is the execution time.

The following plot shows another test with random input numbers. As most of the inputs can be ruled out as non-square number at the first step, the proposed algorithm runs even faster than in the case where the input numbers are perfect squares.

Discussion

Three factors contribute to the speed of the new algorithm. Early detection of non-square values reduces the unnecessary computation. This trick can be easily incorporated into other algorithms. The initial estimation provides a good starting point so that less search is required. The same technique can enhance Newton's method performance, although the improvement is limited.

The primary factor in boosting performance is the fast iteration. It is the key to bring down the execution time by magnitudes. However, the algorithm's complexity using the iteration has not yet been analyzed.

Another potential research direction involves generalizing the algorithm for floating point numbers. It would also be interesting to explore the applicability of this type of iteration in more general equation-solving scenarios.

Made with REPL Notes Build your own website in minutes with Jupyter notebooks.