Numbers in Base X Part 1: Rising and Falling Division
When I was first introduced to polynomial arithmetic, my teacher said it's very similar to regular arithmetic. In fact, because you don't need to worry about carrying, performing addition, subtraction, and multiplication is significantly easier. Are polynomials really that similar to numbers in base $x$? I felt compelled to find out, unaware of the fact that I was about to go down an unfathomably deep rabbit hole. This article will introduce you to various unusual approaches to dividing both polynomials and integers, and the algorithms we'll develop will serve as a foundation for many future explorations.
Preliminaries
In school, we learn that the position of each digit in the standard representation of an integer corresponds to the power of the associated base, usually $10$, that multiplies it. For example: $654321$ is really just $6\cdot 10^5 + 5\cdot 10^4 + 4\cdot 10^3 + 3\cdot 10^2 + 2\cdot 10^1 + 1\cdot 10^0$. I'll omit $10^0$, which is just $1$, and the power of $10^1$ from here on for the sake of brevity. Let's subtract $987$ from $654321$ using the expanded representation:

$$6\cdot 10^5 + 5\cdot 10^4 + 4\cdot 10^3 + (3-9)\cdot 10^2 + (2-8)\cdot 10 + (1-7) = 6\cdot 10^5 + 5\cdot 10^4 + 4\cdot 10^3 - 6\cdot 10^2 - 6\cdot 10 - 6$$

If we had $6$ instead of $-6$ in this example, we'd be done with the subtraction. However, we can only use digits $0$ to $9$ in the base $10$ representation, and we have to deal with the negative digits by borrowing from the next highest power:

$$6\cdot 10^5 + 5\cdot 10^4 + 3\cdot 10^3 + 4\cdot 10^2 - 6\cdot 10 - 6 = 6\cdot 10^5 + 5\cdot 10^4 + 3\cdot 10^3 + 3\cdot 10^2 + 4\cdot 10 - 6 = 6\cdot 10^5 + 5\cdot 10^4 + 3\cdot 10^3 + 3\cdot 10^2 + 3\cdot 10 + 4$$
Multiplication is also straightforward, provided we remember its distributive property, which tells us we can multiply each term inside one bracket by the terms in the other:

$$(4\cdot 10 + 7)\cdot(6\cdot 10 + 2) = 24\cdot 10^2 + (8 + 42)\cdot 10 + 14 = 24\cdot 10^2 + 50\cdot 10 + 14$$

Once again, we're required to normalise the representation, meaning every multiple of a power of $10$ has to be less than $10$. This is easily accomplished if we rewrite the "digits" as multiples of $10$ with a remainder:

$$(2\cdot 10 + 4)\cdot 10^2 + (5\cdot 10 + 0)\cdot 10 + (1\cdot 10 + 4)$$

Multiply through and carefully regroup to get the final result:

$$2\cdot 10^3 + (4+5)\cdot 10^2 + (0+1)\cdot 10 + 4 = 2\cdot 10^3 + 9\cdot 10^2 + 1\cdot 10 + 4$$
I would've done this much faster using the notation and algorithms taught in elementary school, but it's not too difficult conceptually.
Division is where my younger self got stuck. Like many students, I had memorised the long division algorithm without understanding why it works, and was also vaguely aware that I could, in principle, subtract repeatedly. The latter option seemed way too inelegant, not to mention tedious, and the only way I could adapt standard long division to the expanded representation was to normalise after each multiplication and subtraction step. This really bothered me because I could perform all the other operations while treating the intermediary results as polynomials, only normalising at the end. What made the problem worse is that the polynomial division algorithm my teacher taught me required the use of rational numbers, and those are nowhere to be seen when dividing integers! I gave up on this approach to division eventually, but the urge to compare integers and polynomials would frequently resurface over the next 20 years. Enough with the navel-gazing, we've got quite a bit of work ahead of us before we get to play!
Polynomial Division
So how do we divide polynomials? The algorithms I was taught were pretty much identical to those presented on Wikipedia, but I'd like to offer an alternative perspective that, in my opinion, provides a lot more insight into what's happening.
Multi-Pass Division
Returning to our running example, this time in base $x$ and with reversed digits, let's try to divide $6x^5 + 5x^4 + 4x^3 + 3x^2 + 2x + 1$ by $9x^2 + 8x + 7$. The strategy will be to rewrite the numerator, term by term, as multiples of the divisor, which requires us to first subtract all the lower degree (lower power of $x$) terms of the divisor from itself to isolate the leading term (the highest degree). Once we have a single term, we can clear out its coefficient (the leading coefficient) and multiply by an appropriate power of $x$. Symbolically, the process looks like this:

$$\begin{aligned} 6x^5+5x^4+4x^3+3x^2+2x+1 &= \frac{2}{3}x^3\left(9x^2+8x+7\right) - \frac{1}{3}x^4 - \frac{2}{3}x^3 + 3x^2 + 2x + 1\\ &= \left(\frac{2}{3}x^3 - \frac{1}{27}x^2\right)\left(9x^2+8x+7\right) - \frac{10}{27}x^3 + \frac{88}{27}x^2 + 2x + 1\\ &= \left(\frac{2}{3}x^3 - \frac{1}{27}x^2 - \frac{10}{243}x\right)\left(9x^2+8x+7\right) + \frac{872}{243}x^2 + \frac{556}{243}x + 1 \end{aligned}$$
Take a moment to fully understand the symbolic manipulations, because the entire algorithm is just a repeated application of this idea until we have a bunch of multiples of the divisor, and a remainder that has a lower degree. We still have one more term to rewrite:

$$\frac{872}{243}x^2 = \frac{872}{2187}\left(9x^2+8x+7\right) - \frac{6976}{2187}x - \frac{6104}{2187}$$
This was the last step because the remaining terms have a lower degree than the divisor, so we ran out of powers of $x$ to rewrite. We can factor out the $9x^2+8x+7$ to get our final result:

$$6x^5+5x^4+4x^3+3x^2+2x+1 = \left(\frac{2}{3}x^3 - \frac{1}{27}x^2 - \frac{10}{243}x + \frac{872}{2187}\right)\left(9x^2+8x+7\right) - \frac{1972}{2187}x - \frac{3917}{2187}$$
We effectively rewrote the numerator as a multiple of the divisor, the quotient, with an added remainder polynomial of lower degree. Conceptually, this algorithm splits the divisor into the leading term and the lower degree terms, and, at each step, we slide the lower degree terms to the right, and subtract them from the corresponding numerator coefficients. Before the subtraction, each divisor coefficient must be multiplied by the coefficient of the rewritten power of $x$, and divided by the divisor's leading coefficient. We can bring the resulting numerator coefficients under a common denominator, and, to avoid having to keep track of the various powers of the leading divisor coefficient, we can pre-emptively multiply all numerator coefficients, below the modified terms, by the leading divisor coefficient.
Here's what this algorithm looks like when translated into Python code:
from fractions import Fraction # rational numbers support

def polyDivMultiPass(num: list, div: list) -> tuple[list, list]:
    divLen = len(div)
    assert divLen > 0, 'An empty divisor list implies division by 0'
    divDeg = divLen - 1 # the divisor's degree
    leadingZeroTerms = None
    for i in range(divDeg, -1, -1):
        if div[i] != 0:
            leadingZeroTerms = divDeg - i
            divLen -= leadingZeroTerms
            divDeg -= leadingZeroTerms
            break
    assert leadingZeroTerms is not None, 'All divisor coefficients were zero'
    numDeg = len(num) - 1 # numerator degree
    # Not exiting early won't impact correctness, but it can
    # convert integers to rationals unnecessarily.
    if numDeg - divDeg < 0:
        return [], num
    divLc = div[divDeg] # the divisor's leading coefficient
    divLcAccum = divLc
    divLcAccumPrev = 1
    output = num.copy() # copy the numerator to preserve the input arguments
    for i in range(numDeg, divDeg - 1, -1): # decreasing sequence
        prevCoeff = output[i]
        for k in range(1, divLen): # skip the leading divisor coefficient
            # The value of i-k will produce a decreasing sequence of indices,
            # corresponding to the lower degree divisor terms that flow out
            # of the rewritten term. Multiplying by divLc is done to
            # bring the values under a common denominator.
            output[i - k] = divLc * output[i - k] - prevCoeff * div[divDeg - k]
        # Multiply all remaining coefficients by the leading divisor coefficient
        for k in range(i - divLen, -1, -1):
            output[k] *= divLc
        output[i] = Fraction(prevCoeff, divLcAccum)
        # Store the previous power of the leading divisor coefficient so we can
        # divide the remainders by it. Eliminating this store requires a
        # restructuring of the loop that will make the code harder to read.
        divLcAccumPrev = divLcAccum
        divLcAccum *= divLc
    # We didn't get to the remainders in the loop above, so make sure to
    # divide them as well.
    for i in range(divDeg):
        output[i] = Fraction(output[i], divLcAccumPrev)
    return output[divDeg:], output[:divDeg] # split into quotient and remainder

def printQuotRem(quotRem: tuple[list, list]):
    print('Quotient: [', ', '.join([str(p) for p in quotRem[0]]), ']'
          ' Remainder: [', ', '.join([str(p) for p in quotRem[1]]), ']')
I added "multi-pass" to the name of the function because we're doing multiple passes over the output list before we compute all coefficients. The input polynomials are specified as coefficient lists in increasing power order, which might look odd if you're not a programmer, but it makes adding higher powers of more efficient. I've also added a little helper function to print out the output of polyDivMultiPass
. Let's divide by using our code:
printQuotRem(polyDivMultiPass([1, 2, 3, 4, 5, 6], [7, 8, 9]))
Quotient: [ 872/2187, -10/243, -1/27, 2/3 ] Remainder: [ -3917/2187, -1972/2187 ]
Translating the result back to mathematical notation:

$$\frac{6x^5+5x^4+4x^3+3x^2+2x+1}{9x^2+8x+7} = \frac{2}{3}x^3 - \frac{1}{27}x^2 - \frac{10}{243}x + \frac{872}{2187} + \frac{-\frac{1972}{2187}x - \frac{3917}{2187}}{9x^2+8x+7}$$
If you install the SymPy Python library (enter pip install sympy
in a command prompt or terminal), you can validate the output against its polynomial division algorithm:
from sympy import *
x = var('x')
n = poly(1 + 2 * x + 3 * x**2 + 4 * x**3 + 5 * x**4 + 6 * x**5)
d = poly(7 + 8 * x + 9 * x**2)
quotient, remainder = div(n, d)
print(quotient.all_coeffs(), remainder.all_coeffs())
[2/3, -1/27, -10/243, 872/2187] [-1972/2187, -3917/2187]
If you're a programmer, try to spot the redundant computations and see if you can come up with a better implementation.
A Change in Perspective
Doing multiple passes over the coefficient list is a straightforward adaptation of how I presented the division algorithm, but we can also look at the computations from the perspective of a single output coefficient! Let's say you wanted to compute the 3rd quotient coefficient directly, and, to make the analysis simpler, we can assume the divisor polynomial is monic, meaning its leading coefficient is $1$. What information do we need to compute it? Well, if the divisor has degree $1$, this quotient coefficient will obviously depend on the constant divisor term (the degree $0$ one) from the rewrite of the previous (2nd) quotient coefficient. In other words, we have to multiply the constant divisor coefficient by the previously computed quotient, and then subtract it from the 3rd highest numerator coefficient. Similarly, if the divisor was of degree $2$, the 3rd quotient coefficient would depend on the $2$ previous quotient coefficients, but, this time, the first (highest) coefficient needs to be multiplied by the constant divisor coefficient, and the 2nd by the degree $1$ divisor coefficient. In general, a quotient coefficient will depend on, at most, $m$ previous coefficients, where $m$ is the length of the divisor's coefficient list minus $1$, i.e. its degree.
Feel free to use this idea to compute the quotient coefficients for the examples above if you're struggling to visualise what's going on. If it helps, what I see in my head is something similar to a linear recurrence, where an array of the negated lower degree divisor coefficients, expressed in increasing power order, is sliding against the numerator coefficients, in decreasing order; each step generates an entry in a quotient array, which is parallel to the numerator one. Where the quotient array and the sliding divisor overlap, we multiply the overlapping coefficients, sum all the products, and subtract the result from the corresponding numerator coefficient. If there are no overlapping entries in the arrays, you can assume the missing values are zero. Let's divide $6x^5+5x^4+4x^3+3x^2+2x+1$ by the monic polynomial $x^2+8x+7$ as an example:
| 6 | 5 | 4 | 3 | 2 | 1 |
|---|---|---|---|---|---|
| **6** | | | | | |

| 6 | 5 | 4 | 3 | 2 | 1 |
|---|---|---|---|---|---|
| -8 | | | | | |
| 6 | **-43** | | | | |

| 6 | 5 | 4 | 3 | 2 | 1 |
|---|---|---|---|---|---|
| -7 | -8 | | | | |
| 6 | -43 | **306** | | | |

| 6 | 5 | 4 | 3 | 2 | 1 |
|---|---|---|---|---|---|
| | -7 | -8 | | | |
| 6 | -43 | 306 | **-2144** | | |
The values in bold are the newly computed quotient coefficients, and the numerator entries directly above them are the ones we subtract from in each step. Unfortunately, this approach doesn't work for the remainder coefficients, besides the highest degree one, because we're not doing any rewriting for them. The solution turns out to be very straightforward if we apply the same analysis we used for the quotient. The constant term of the remainder would only be affected by the last rewrite step, so we need to subtract the product of the constant quotient and divisor coefficients. The degree $1$ remainder coefficient will depend on the last $2$ quotients, but this time the constant quotient coefficient is multiplied by the degree $1$ divisor coefficient, and the degree $1$ quotient coefficient by the constant divisor coefficient. At this point you might recognise that the steps are similar to the quotient computation, except we're sliding the divisor against the quotient array in reverse, and are not using the previous output in each step (the bold results below already include the numerator coefficients $2$ and $1$ that we add on top):
| 306 | -2144 |
|---|---|
| -7 | -8 |
| **15012** | |

| 306 | -2144 |
|---|---|
| 0 | -7 |
| **15009** | |
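If it helps to see the sliding recurrence as runnable code, here's a minimal sketch under the same conventions (increasing-power coefficient lists, monic divisor); polyDivMonic is my name for it, not something defined earlier:

def polyDivMonic(num: list, div: list) -> tuple[list, list]:
    numDeg, divDeg = len(num) - 1, len(div) - 1
    quot = [0] * (numDeg - divDeg + 1)
    for i in range(numDeg, divDeg - 1, -1):
        qIdx = i - divDeg
        acc = num[i]
        # Slide the lower degree divisor coefficients against the already
        # computed quotient entries, subtracting the overlapping products.
        for k in range(1, min(divDeg, numDeg - i) + 1):
            acc -= div[divDeg - k] * quot[qIdx + k]
        quot[qIdx] = acc
    rem = []
    for i in range(divDeg):
        acc = num[i]
        # Reverse slide: each remainder only needs the low quotient entries.
        for j in range(min(i, len(quot) - 1) + 1):
            acc -= quot[j] * div[i - j]
        rem.append(acc)
    return quot, rem

print(polyDivMonic([1, 2, 3, 4, 5, 6], [7, 8, 1]))
([-2144, 306, -43, 6], [15009, 15012])

The output matches the tables above, just stored in increasing power order.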
Depending on your maths background, you might've also noticed that we're effectively performing a partial discrete convolution of the coefficient lists to produce the values we need to subtract from the numerator. You can verify this using the numpy library in python (pip install numpy
if you don't have it):
import numpy
print(numpy.convolve([7, 8, 1], [-2144, 306, -43, 6]))
[-15008 -15010 3 4 5 6]
This isn't surprising given how we could, if we wanted to, compute the remainder as $\mathrm{numerator} - \mathrm{quotient}\cdot\mathrm{divisor}$, with the polynomial multiplication being a discrete convolution. The remainder algorithm above is simply an optimised version of this computation.
Modifying the monic algorithm to support non-monic divisors is also quite easy, because we can factor out the leading coefficient of the divisor, and bring it into the numerator:

$$\frac{6x^5+5x^4+4x^3+3x^2+2x+1}{9x^2+8x+7} = \frac{\frac{1}{9}\left(6x^5+5x^4+4x^3+3x^2+2x+1\right)}{x^2+\frac{8}{9}x+\frac{7}{9}}$$
The problem is now reduced to monic division with rational coefficients. We can use Python's Fraction
class to convert all coefficients to fractions before passing them to a monic division function (or implement that function so it uses rationals internally), but I find it more interesting to create an algorithm similar to polyDivMultiPass
, which computes the output numerators directly, and only generates the appropriate rational coefficients at the very end.
Using the multi-pass algorithm for inspiration, it's obvious that each numerator should be multiplied by an increasing power of the leading divisor coefficient, which we can label $c$. We must also bring the previous coefficients under a common denominator, and, because each of their denominators is a consecutive power of $c$, we can simply multiply each one by an increasing power of $c$, starting from $c^0$ for the most recently rewritten term, and going as far back as $c^{m-1}$, where $m$ is the degree of the divisor. Note how each coefficient of the divisor will always be multiplied by the same power of $c$, meaning we can precompute and cache their product.
The numerators in the remainder computation should all be multiplied by the same power of $c$, and the subtracted quotient numerators should be adjusted in the same way as the quotient computation, except the order is reversed. Unfortunately, precomputing the multiplier doesn't work for remainders, except for the leading coefficient. Bringing it all together, we get the following Python function:
def polyDivRecurrence(num: list, div: list) -> tuple[list, list]:
    divLen = len(div)
    assert divLen > 0, 'An empty divisor list implies division by 0'
    divDeg = divLen - 1 # the divisor's degree
    leadingZeroTerms = None
    for i in range(divDeg, -1, -1):
        if div[i] != 0:
            leadingZeroTerms = divDeg - i
            divLen -= leadingZeroTerms
            divDeg -= leadingZeroTerms
            break
    assert leadingZeroTerms is not None, 'All divisor coefficients were zero'
    numDeg = len(num) - 1 # numerator degree
    diffDeg = numDeg - divDeg
    if diffDeg < 0:
        return [], num
    divLc = div[divDeg] # the divisor's leading coefficient
    # Precompute the required powers of the leading divisor coefficient and the
    # scaled lower degree divisor coefficients to keep it consistent with the
    # pen-and-paper notation. Consumes more memory but avoids redundant computations.
    divLcAccum = divLc
    divLcPowers = [1, divLc]
    for i in range(diffDeg):
        divLcAccum *= divLc
        divLcPowers.append(divLcAccum)
    divDegMinus1 = divDeg - 1
    divScaled = [div[divDegMinus1]] # the divisor must be non-empty as a precondition
    for i in range(divDeg - 2, -1, -1): # avoiding list comprehensions for clarity
        divScaled.append(divLcPowers[divDegMinus1 - i] * div[i])
    quotientLen = diffDeg + 1
    # pre-allocate the lists, because we'll fill them backwards
    quotientNums = [0] * quotientLen
    quotients = [0] * quotientLen
    for i in range(numDeg, divDeg - 1, -1):
        iRev = numDeg - i # incrementing index; could `enumerate` the range too
        # Fill the quotient backwards to avoid having to reverse the list; the
        # index is off by 1 to avoid unnecessary subtractions in the next loop.
        quotientIndex = quotientLen - iRev
        quotNum = num[i] * divLcPowers[iRev] # scale the associated numerator coefficient
        for k in range(0, min(iRev, divDeg)):
            quotNum -= divScaled[k] * quotientNums[quotientIndex + k]
        quotientIndex -= 1 # the quotient index is off by one here, fix it
        quotientNums[quotientIndex] = quotNum
        quotients[quotientIndex] = Fraction(quotNum, divLcPowers[iRev + 1])
    remainders = []
    for i in range(divDeg):
        remNum = num[i] * divLcAccum
        for j in range(i + 1):
            # The values of quotientNums[j] * divLcPowers[j] could be precomputed if the
            # remainder degree tends to be very large. Would use more working memory.
            remNum -= quotientNums[j] * divLcPowers[j] * div[i-j]
        remainders.append(Fraction(remNum, divLcAccum))
    return quotients, remainders
Pen and Paper
If you don't have access to a computer, the notation inspired by this alternate algorithm is, I find, much more convenient when dividing large polynomials. The pen-and-paper example should also help clarify what the code does. Let's divide $6x^5+5x^4+4x^3+3x^2+2x+1$ by $9x^2+8x+7$ as an example. The same "slide the divisor against the quotient coefficients" idea still applies, but we need to scale by the appropriate powers of $9$ in each step. Start by computing all powers of $9$ needed for the division, and list them in increasing power order. The degrees of the numerator and divisor are $5$ and $2$ respectively, meaning we'll need $4$ powers of $9$:
9, 81, 729, 6561
If you have access to a calculator, you can type $9 \times 9$, and then press the equals key $3$ times to get all the powers of $9$. Next, list all the negated lower degree coefficients of the divisor in decreasing power order, then multiply each by the power of 9 above and one entry to the left - the first value will be multiplied by 1 implicitly. This means we get $-8$ and $-7\cdot 9 = -63$. You can do it all in one step, placing the results on the next line, but this time list them in increasing power order to avoid getting disoriented once we start calculating the quotient coefficients. To wrap up the preparations, you should bring the highest degree numerator coefficient, $6$, down to the third line, resulting in the following:
9, 81, 729, 6561
-63, -8
6
Our first quotient coefficient is $\frac{6}{9} = \frac{2}{3}$, and to compute the next one we evaluate one step of our recurrence relation by multiplying the next highest term of the numerator, which is $5$, by the lowest power of $9$ in the first row, and then adding the result of the linear recurrence formed by the 2nd and 3rd row. To avoid having to write a bunch of leading zeros in the first row, you can implicitly multiply the first numerator coefficient by $1$ instead of adding an explicit entry for it. This gives us the next value in the third row: $5\cdot 9 + (-8)\cdot 6 = -3$:
9, 81, 729, 6561
-63, -8
6, -3
This makes $-\frac{3}{81} = -\frac{1}{27}$ our next quotient coefficient. Continue by bringing down the $4$ from the numerator, multiply it by the next power of $9$, which is $81$, and add the result of the linear recurrence: $4\cdot 81 + (-8)\cdot(-3) + (-63)\cdot 6 = -30$:
9, 81, 729, 6561
-63, -8
6, -3, -30
The third quotient coefficient is $-\frac{30}{729}$, or $-\frac{10}{243}$. Identifying the inputs for the linear recurrence is quite easy: you start from the last computed value in the third row, and pair it with the corresponding value from the second row. For the last quotient coefficient, that's $(-8)\cdot(-30) + (-63)\cdot(-3) = 429$, which we add to $3\cdot 729 = 2187$ to get the last quotient numerator, $2616$:
9, 81, 729, 6561
-63, -8
6, -3, -30, 2616
Don't forget to divide by 6561 to get the final quotient coefficient: $\frac{2616}{6561} = \frac{872}{2187}$. I like to underline the denominator value from the first row after each coefficient computation so I don't lose track of where I am.
Computing the leading coefficient of the remainder can be done using the same process as the quotient coefficients. Bring down the 2 from the numerator and add the recurrence: $2\cdot 6561 + (-8)\cdot 2616 + (-63)\cdot(-30) = -5916$. I like to place the result in line with the quotient, separated by a vertical bar:
9, 81, 729, 6561
-63, -8
6, -3, -30, 2616 | -5916
Divide by 6561 to get $-\frac{5916}{6561} = -\frac{1972}{2187}$.
I visualise the remainder computation as a table where the first row comes from the low degree coefficients of the quotient, as many of them as there are lower degree terms in the divisor. The second row is comprised of powers of the leading divisor coefficient in decreasing power order, and the third contains the negated lower degree divisor coefficients in increasing power order.
-30, 2616
9, 1
-7, -8
Computing a coefficient of the remainder is now a matter of multiplying all values in the same column and summing the results: $(-30)\cdot 9\cdot(-7) + 2616\cdot 1\cdot(-8) = 1890 - 20928 = -19038$. This value is then added to the corresponding scaled numerator coefficient: $2\cdot 6561 - 19038 = -5916$. It just so happens that this computation coincides with what we did for the quotient. The next coefficient is computed by shifting the bottom row to the right, discarding the coefficient that went out of the table and filling the new empty spot with 0:
-30, 2616
9, 1
0, -7
Notice how the first two rows don't change. This means the values can be pre-computed! Carry out the same procedure by bringing down the last numerator coefficient: $1\cdot 6561 + 2616\cdot 1\cdot(-7) = -11751$:
9, 81, 729, 6561
-63, -8
6, -3, -30, 2616 | -5916, -11751
The constant term of the remainder is therefore $-\frac{11751}{6561} = -\frac{3917}{2187}$. If the divisor degree was higher, we would've kept shifting the bottom row to the right until we filled it with zeros. With a little bit of practice, you can do this without explicitly constructing the table, though I prefer to write the pre-multiplied quotients in the 4th row, because I find it less error-prone when the remainder has a large degree:
9, 81, 729, 6561
-63, -8
6, -3, -30, 2616 | -5916, -11751
-270, 2616
Your phone's calculator can make quick work of each step in the computation, or you can use a separate sheet of paper to keep the bookkeeping notation above nice and tidy. I find this notation significantly easier to work with compared to synthetic division, though, nowadays, I mostly let the computer do the work for me.
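Whenever a computer is within reach, you can check the table above against the polyDivRecurrence implementation from earlier:

printQuotRem(polyDivRecurrence([1, 2, 3, 4, 5, 6], [7, 8, 9]))
Quotient: [ 872/2187, -10/243, -1/27, 2/3 ] Remainder: [ -3917/2187, -1972/2187 ]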
If you're planning to teach this notation to someone else, I implore you to not skip the intuition behind its construction, as is often the case in school maths classes. It would be fairly difficult for students to reverse engineer the logic behind such a compact notation, and it encourages mindless memorisation.
Rising Division
One aspect of polynomial division that always bothered me was the focus on the leading term. It makes sense in the context of integer long division, but polynomials are different in one important aspect: $x$ can be less than $1$, and when it gets very close to $0$, arbitrarily close, if you will, the contribution of the higher powers of $x$ becomes vanishingly small. In fact, close to $0$, the smaller the power of $x$, the more it contributes to the overall value! The idea behind this vague notion of "close to $0$" is that, for any polynomial, you can pick a value for $x$, at and below which the absolute size of every term strictly decreases as the exponent of $x$ increases. I've assumed positive $x$ for simplicity, but the same idea applies for negative values, and, when you include those, it's more appropriate to think of a range centred on zero instead.
So, how do we divide polynomials under the assumption that the constant term is actually the leading one? Simple, we apply the same rewriting process as regular division, but starting from the low degree terms instead. Let's see how that works for a small example, dividing $1 + 2x + 3x^2$ by $9 + 2x$:

$$1 + 2x + 3x^2 = \frac{1}{9}(9+2x) + \frac{16}{9}x + 3x^2 = \left(\frac{1}{9} + \frac{16}{81}x\right)(9+2x) + \frac{211}{81}x^2$$
An easy way to verify this computation is to reverse the list of coefficients passed to our standard division function, then reverse the output again:
def polyDivRise(n: list, d: list) -> tuple[list, list]:
    quotient, remainder = polyDivMultiPass(list(reversed(n)), list(reversed(d)))
    quotient.reverse()
    remainder.reverse()
    return quotient, remainder
printQuotRem(polyDivRise([1, 2, 3], [9, 2]))
Quotient: [ 1/9, 16/81 ] Remainder: [ 211/81 ]
Even though it's inefficient, polyDivRise
is good enough for our current needs, and I like the fact that it explicitly demonstrates its connection to the standard algorithm. Is this type of polynomial division actually useful or merely a curiosity? In my opinion, the most important application of rising division can be found in calculus and analysis, because it allows us to divide truncations of two infinite series. I'll show you an example later on!
Naming
When I first stumbled upon this idea I wanted to find the name mathematicians have assigned to it, but my search yielded no results. I eventually stumbled upon "division by increasing power order", mentioned in the documentation for the Giac computer algebra system, which prompted me to search French Wikipedia, where I finally found a proper description of this type of division. The name was too verbose for my taste, so I decided to call it rising polynomial division instead, because we rewrite starting from the lowest degree term; this line of reasoning suggested an alternative name for standard polynomial division - falling polynomial division. Even though it's effectively the same algorithm, I'll use these names from now on to differentiate between the two rewriting directions.
Expansion
Another polynomial division rule that always bothered me was the fact that we had to stop after rewriting the numerator term whose degree matched the degree of the denominator's leading term. The algorithm for computing decimal expansions of rational numbers was taught in primary school, so, by this point, I was quite comfortable with the idea of an infinite process. As a reminder, the decimal expansion of, say, $\frac{1}{13}$ is:

$$\frac{1}{13} = 0.\overline{076923}$$
The line above the fractional digits is called an overline and is used to indicate that those digits will repeat indefinitely. For practical reasons, the repeating part is often omitted because even relatively small denominators might require the computation of hundreds of digits before the repeating pattern is established. The algorithm is fairly straightforward, and involves repeated multiplication by the base, or $10$ to be specific:

$$\frac{1}{13} = \frac{1}{10}\cdot\frac{10}{13} = \frac{1}{10}\left(0 + \frac{10}{13}\right) = \frac{1}{10}\left(0 + \frac{1}{10}\left(7 + \frac{9}{13}\right)\right) = \dots$$
Computing the first digits like this is straightforward but cumbersome, which is why schools usually teach a more compact notation that takes advantage of the fact that at each step we multiply the previous remainder by $10$ and divide by the denominator to decompose it into a quotient and remainder. The quotient is output as a digit, and the remainder is carried into the next step:

$$\begin{aligned} 1\cdot 10 &= 0\cdot 13 + 10\\ 10\cdot 10 &= 7\cdot 13 + 9\\ 9\cdot 10 &= 6\cdot 13 + 12\\ 12\cdot 10 &= 9\cdot 13 + 3\\ 3\cdot 10 &= 2\cdot 13 + 4\\ 4\cdot 10 &= 3\cdot 13 + 1 \end{aligned}$$
The trailing value in each row is the remainder that we carry forward into the next line, and the multipliers of $13$ are the output digits of the expansion. You can omit the divisor, or replace it with an easy to write symbol, once you're comfortable with the procedure. In the decimal notation, each digit after the dot is implicitly multiplied by a negative power of $10$ that corresponds to the index of the digit relative to the dot. For example, $0.076$ is a compact representation of the following:

$$0\cdot 10^{-1} + 7\cdot 10^{-2} + 6\cdot 10^{-3}$$
This is very similar to the implicit positive powers of 10 for regular integers. Given how each digit is supposed to be less than $10$, how can we be sure that the quotient after each division will produce a valid digit? Fortunately, the proof is quite simple and it relies on the observation that, by definition, each remainder is strictly smaller than the divisor. Suppose the divisor is $d$ and the $n$-th remainder is $r_n$. We can form an inequality based on the definition of a remainder, $r_n < d$, then divide both sides by $d$, followed by a multiplication by $10$:

$$\frac{10\,r_n}{d} < 10$$
The left-hand side is how we compute the next digit and remainder, and we can always ensure the previous remainder is less than the divisor by reducing the numerator with long division first to compute the integer part. This is a rather elementary proof by induction which, sadly, is usually not taught when decimal expansion is introduced. Note how it didn't matter that we multiplied both sides by 10. We could've chosen any other number and the inequality would still hold, which means we can compute expansions in any base! Here's $\frac{1}{13}$ in base $2$:

$$\begin{aligned} 1\cdot 2 &= 0\cdot 13 + 2\\ 2\cdot 2 &= 0\cdot 13 + 4\\ 4\cdot 2 &= 0\cdot 13 + 8\\ 8\cdot 2 &= 1\cdot 13 + 3\\ 3\cdot 2 &= 0\cdot 13 + 6\\ 6\cdot 2 &= 0\cdot 13 + 12\\ 12\cdot 2 &= 1\cdot 13 + 11\\ 11\cdot 2 &= 1\cdot 13 + 9\\ 9\cdot 2 &= 1\cdot 13 + 5\\ 5\cdot 2 &= 0\cdot 13 + 10\\ 10\cdot 2 &= 1\cdot 13 + 7\\ 7\cdot 2 &= 1\cdot 13 + 1 \end{aligned}$$
which gives us $0.\overline{000100111011}_2$. The little 2 subscript is the standard notation for specifying the base of the number. Turning this algorithm into code is very easy:
def intDivFalling(n: int, d: int, b: int = 10, termCount: int = 10) -> tuple[int, list]:
    expansion = []
    integerPart = n // d
    n %= d
    for _ in range(termCount):
        n *= b
        expansion.append(n // d)
        n %= d
    return integerPart, expansion
print(intDivFalling(1, 13))
(0, [0, 7, 6, 9, 2, 3, 0, 7, 6, 9])
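We can also ask for the base $2$ digits computed by hand above; the termCount parameter controls how many we get:

print(intDivFalling(1, 13, 2, 12))
(0, [0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1])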
Given how we can expand in any base through repeated multiplication by that base, why not do the same with polynomial falling division, using $x$ as the base?
Rising and Falling Expansions
Let's apply our rational number expansion intuition to the falling division of $x$ by $x^2 - x - 1$:

$$\frac{x}{x^2-x-1} = \frac{1}{x}\cdot\frac{x^2}{x^2-x-1} = \frac{1}{x}\left(1 + \frac{x+1}{x^2-x-1}\right) = \dots$$
You could continue doing this by hand, but, if you're lazy like me, you might already be thinking of ways to reuse our polynomial division code. The good news is that you don't even need to touch the code! Repeatedly multiplying by $x$ and dividing is equivalent to multiplying the numerator by a power of $x$ that corresponds to the number of terms you want to compute, and then applying the standard division algorithm. All you have to do at the end is divide the result by the same power of $x$ to preserve the equality:

$$\frac{x}{x^2-x-1} = \frac{1}{x^{10}}\cdot\frac{x^{11}}{x^2-x-1}$$
Multiplying by powers of $x$ is equivalent to prepending zeros to the coefficient list that represents a polynomial:
n = [0] * 10 + [0, 1] # compute 10 terms by prepending 10 zeros, equal to a multiplication by x^10
d = [-1, -1, 1] # x^2 - x - 1
printQuotRem(polyDivMultiPass(n, d))
Quotient: [ 55, 34, 21, 13, 8, 5, 3, 2, 1, 1 ] Remainder: [ 55, 89 ]
Applying the division by $x^{10}$ to undo the previous multiplication we get:

$$\frac{x}{x^2-x-1} = \frac{1}{x} + \frac{1}{x^2} + \frac{2}{x^3} + \frac{3}{x^4} + \frac{5}{x^5} + \frac{8}{x^6} + \frac{13}{x^7} + \frac{21}{x^8} + \frac{34}{x^9} + \frac{55}{x^{10}} + \frac{89x + 55}{x^{10}\left(x^2-x-1\right)}$$
You might recognise the numerators as the Fibonacci numbers, and it shouldn't surprise you that they came out given the previous discussion about linear recurrences. The division turned into a pure linear recurrence because the leading divisor coefficient is $1$, and the lower degree coefficients were filled with zeros, so there was nothing to subtract from when computing the quotient coefficients.
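To see the connection from the other direction, here's a hypothetical little helper (linearRecurrence is my name, not something defined earlier) that runs the recurrence directly; with the Fibonacci seed and coefficients it reproduces the quotient numerators:

def linearRecurrence(seed: list, coeffs: list, termCount: int) -> list:
    # Each new term is the dot product of the coefficients with the
    # most recent terms, newest first.
    terms = list(seed)
    while len(terms) < termCount:
        terms.append(sum(c * t for c, t in zip(coeffs, reversed(terms))))
    return terms

print(linearRecurrence([1, 1], [1, 1], 10))
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]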
Rising division is even simpler because we don't need to worry about adjusting the powers of $x$ at the end. Let's apply it to a special case of the geometric series formula:

$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \dots$$
This time we need to append zeros instead because the list will be reversed:
num = [1] + [0]*10
div = [1, -1]
printQuotRem(polyDivRise(num, div))
Quotient: [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ] Remainder: [ 1 ]
Which corresponds to:

$$\frac{1}{1-x} = 1 + x + x^2 + \dots + x^9 + \frac{x^{10}}{1-x}$$
I promised to give you a calculus example earlier, so let's take a few terms from the $\sin(x)$ and $\cos(x)$ series expansions, and use them to compute the first few terms of the $\tan(x)$ series:
from fractions import Fraction as Q
num = [0, Q(1), 0, Q(-1, 6), 0, Q(1, 120), 0, Q(-1, 5040)] + [0]*6 # sin(x) series
div = [Q(1), 0, Q(-1, 2), 0, Q(1, 24), 0, Q(-1, 720)] # cos(x) series
printQuotRem(polyDivRise(num, div))
Quotient: [ 0, 1, 0, 1/3, 0, 2/15, 0, 17/315 ] Remainder: [ 0, 331/15120, 0, -13/6300, 0, 17/226800 ]
The quotient gives us the correct expansion up to the 7th power:

$$\tan(x) = x + \frac{x^3}{3} + \frac{2x^5}{15} + \frac{17x^7}{315} + \dots$$
The general formula for the tangent series coefficients is quite complicated, because it involves the Bernoulli numbers, and yet we can compute as many terms as we want with a simple algorithm! You can find more efficient formulas for the coefficients of various series by carefully studying the behaviour of rising division, and the patterns in each iteration of its computation. You can also use both rising and falling expansions to approximate the values of tricky integrals.
If you're familiar with calculus you might've realised that our algorithms are producing an infinite series approximation of a rational function. You might also suspect that the series will diverge around the zeros (poles) of the divisor, which, combined with a simple divergence testing method like the ratio test, can give us one of the oldest polynomial root-finding algorithms - Bernoulli's Method. Optimisations based on this idea underlie many modern root-finding algorithms, which will be the subject of a future article.
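As a tiny taste of the idea, take the quotient coefficients we computed for $\frac{x}{x^2-x-1}$ above: the ratios of consecutive coefficients already approach the dominant root of $x^2 - x - 1$, the golden ratio:

fib = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55] # quotient numerators from x/(x^2 - x - 1)
ratios = [b / a for a, b in zip(fib, fib[1:])]
print(ratios[-3:])
[1.6153846153846154, 1.619047619047619, 1.6176470588235294]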
We've only scratched the surface of the beautiful structure that underlies these simple algorithms, so feel free to play around with the code, evaluate it symbolically in terms of the roots of the divisor, and look for patterns.
Future Explorations
In future articles, we'll explore how expressing linear recurrences as matrix operations will allow us to create an important bridge between linear algebra and polynomials through the Companion matrix, whose eigenvalues can be used to approximate the roots of a polynomial. We'll also see how any term in a linear recurrence can be directly computed through a formula that depends on the roots of the polynomial division that corresponds to it. This also provides a natural connection between polynomials and a certain class of pseudo-random number generators.
Reversing polynomial expansion will turn out to be important for the efficient implementation of certain error correcting codes, the inversion of Toeplitz matrices, and even rational approximation.
Polynomial division leads to an algorithm for solving systems of polynomial equations, which, in turn, provides the tools necessary for a deep exploration of the theory of algebraic numbers with minimal background knowledge requirements. It turns out that the same ideas can be applied to the problem of partial fraction decomposition.
Division also turns out to be important in explaining what it means to change the basis of a polynomial, and how that relates to various exotic number bases. The subjects of interpolation and approximation theory will also make an appearance during this investigation.
Integers and rational numbers will closely accompany us at each step along the way, providing context and inspiration, particularly when it comes to data compression. This reminds me, before I wrap up, there's one last key tool missing from our division algorithms toolkit.
Rising Integer Division
After playing around with polynomial expansions for a while, I started wondering if there's an integer equivalent for rising division. Falling polynomial division was inspired by the classic integer expansion algorithm I learned in school, so maybe I can go in the opposite direction, and adapt the polynomial rewriting approach for integers! Let's try this idea with our old friend, $\frac{1}{7}$. Can I rewrite the $1$ as a sum of a digit, which is less than $10$, multiplied by $7$, and some remainder? We have a lot of choices if the remainder is negative, but given the fact that we'll continuously rewrite this remainder, it better be a multiple of the base, which will be $10$ in this example, so we can get an increasingly larger base exponent. These requirements correspond to the following symbolic constraints:

$$1 = 7a + 10b, \qquad 0 \le a \le 9$$
Is it even possible to find solutions to the equation? If $7$ and $10$ had a common factor $g$, greater than $1$, we could rewrite the equation as $1 = g(ma + nb)$, for some $m$ and $n$, which obviously has no integer solutions because $g > 1$ cannot divide $1$. I picked $7$ and $10$ so they're relatively prime (coprime), meaning their greatest common factor is 1, but if we were trying to find an expansion for, say, $\frac{1}{30}$ in base $10$, we'd fail, because $30$ and $10$ have $10$ as a common factor. This is a rather serious limitation compared to the usual expansion, though we don't need to worry about it if we pick a prime number as the base, and factor out all multiples of the base from the denominator before we begin the expansion.
Putting this limitation aside, how do we find appropriate values for $a$ and $b$? The brute-force approach here would be to simply try out increasingly larger values for $a$ until the product $7a$ has a lowest digit equal to $1$. It only takes $3$ attempts to arrive at $7\cdot 3 = 21$, which means we can subtract $20$ to get 1, giving us $a = 3$ and $b = -2$.
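Here's a minimal sketch of that brute-force search (the function name is mine):

def bruteForceCoeffs(d: int, b: int) -> tuple[int, int]:
    # Try increasingly larger digits until d * a ends in 1,
    # then b's coefficient is whatever multiple of b remains.
    for a in range(b):
        if (d * a) % b == 1:
            return a, (1 - d * a) // b
    raise ValueError('no solution: d and b share a common factor')

print(bruteForceCoeffs(7, 10)) # (3, -2)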
A more elegant approach relies on the observation that we can rewrite $10$ as $7 + 3$, which allows us to reduce the size of the coefficients we're considering:

$$1 = 7a + (7 + 3)b = 7(a + b) + 3b$$

We can repeat this process until one of the outer coefficients becomes $1$:

$$1 = 7(a + b) + 3b = (2\cdot 3 + 1)(a + b) + 3b = (a + b) + 3\left(2(a + b) + b\right)$$

One of the multipliers became 1, so at this point we have two obvious options for its corresponding expression: $a + b = 1$ with $2(a + b) + b = 0$, or $a + b = -2$ with $2(a + b) + b = 1$.

The first gives us:

$$a + b = 1, \qquad 2 + b = 0$$

which we can solve through simple substitution to find $a = 3$ and $b = -2$. You can check for yourself that the second option results in $a = -7$ and $b = 5$. The latter doesn't satisfy the inequality, but we can easily fix any such solution by rewriting $a$ as $(a + 10) - 10$; expand and re-collect the terms:

$$1 = 7a + 10b = 7(a + 10) + 10(b - 7)$$

which turns $a = -7,\ b = 5$ into $a = 3,\ b = -2$.
Depending on your maths background, you might've recognised the rewriting trick as an application of the Extended Euclidean algorithm (EEA), and the equation as a special case of Bézout's identity. The EEA will be our constant companion throughout the upcoming series of articles, and we'll explore it in more detail in part 2!
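If you'd like to experiment ahead of part 2, here's a minimal recursive sketch of the EEA (the function name is mine):

def extendedEuclid(m: int, n: int) -> tuple[int, int, int]:
    # Returns (g, s, t) such that s*m + t*n == g == gcd(m, n).
    if n == 0:
        return m, 1, 0
    g, s, t = extendedEuclid(n, m % n)
    # n*s + (m % n)*t == g, and m % n == m - (m // n)*n
    return g, t, s - (m // n) * t

print(extendedEuclid(7, 10)) # (1, 3, -2), i.e. 3*7 - 2*10 == 1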
With our $a$ and $b$ coefficients in hand, we can now rewrite $\frac{1}{7}$ as:

$$\frac{1}{7} = \frac{7\cdot 3 + 10\cdot(-2)}{7} = 3 + 10\cdot\frac{-2}{7}$$
To proceed from here we need to repeat the same process for $\frac{-2}{7}$, which is similar to our remainder term in the regular expansion algorithm. Because we already have a way to express $\frac{1}{7}$, let's just multiply that equation on both sides by $-2$:

$$\frac{-2}{7} = -6 + 10\cdot\frac{4}{7}$$
Negative digits aren't allowed in the usual decimal expansion, so let's express $-6$ as $4 - 10$, which allows us to combine the $-10$ term with the $10\cdot\frac{4}{7}$ one:

$$\frac{-2}{7} = 4 + 10\cdot\frac{4 - 7}{7} = 4 + 10\cdot\frac{-3}{7}$$
We've finally arrived at an appropriate expression for $\frac{-2}{7}$, which we can substitute back into our expansion:

$$\frac{1}{7} = 3 + 10\cdot\frac{-2}{7} = 3 + 4\cdot 10 + 10^2\cdot\frac{-3}{7}$$
From here on we just repeat the same process, but with $-3$ instead:

$$\frac{-3}{7} = -9 + 10\cdot\frac{6}{7} = 1 + 10\cdot\frac{6 - 7}{7} = 1 + 10\cdot\frac{-1}{7}$$
Rewrite the rational term of the expansion once again:

$$\frac{1}{7} = 3 + 4\cdot 10 + 1\cdot 10^2 + 10^3\cdot\frac{-1}{7}$$
Our next digit is $1$ and rewriting $\frac{-1}{7}$ is equally straightforward:

$$\frac{-1}{7} = -3 + 10\cdot\frac{2}{7} = 7 + 10\cdot\frac{2 - 7}{7} = 7 + 10\cdot\frac{-5}{7}$$
At this point you might be starting to see the pattern. The power of $10$ increases by $1$ and the corresponding coefficient (digit) is $7$. We can express $-5/7$ as $-15 + 10\cdot\frac{10}{7}$, and rewrite $-15$ as $5 - 20$, which obviously makes the digit after the $7$ a $5$, while the next remainder is $-4$. Our expansion so far looks like this:

$$\frac{1}{7} = 3 + 4\cdot 10 + 1\cdot 10^2 + 7\cdot 10^3 + 5\cdot 10^4 + 10^5\cdot\frac{-4}{7}$$
We can follow a similar logic for $-4$, giving us digit $8$ and remainder $-6$. Repeat one more time: $\frac{-6}{7} = 2 + 10\cdot\frac{-2}{7}$, which gives us digit $2$ and remainder $-2$. Perhaps unsurprisingly, we eventually end up with a remainder that we've seen before, and in this case it goes all the way back to when we computed the digit $4$. This means that the digits will continue repeating. Here's what we've computed so far:

$$\frac{1}{7} = 3 + 4\cdot 10 + 1\cdot 10^2 + 7\cdot 10^3 + 5\cdot 10^4 + 8\cdot 10^5 + 2\cdot 10^6 + 10^7\cdot\frac{-2}{7}$$
This looks an awful lot like a regular integer, except the powers continue to increase, and the repeating part is similar to repeating decimal expansions. Writing out the powers of 10 is becoming cumbersome, so let's repurpose regular integer notation, and combine it with the overline used for repeated digits in regular expansions:

$$\frac{1}{7} = \overline{285714}3$$
Compare this against the usual base $10$ expansion:

$$\frac{1}{7} = 0.\overline{142857}$$
Do you see the pattern? Starting from the 2nd digit of the repetend and going backwards we get $4$, and $1$, then we loop to the end and continue: $7, 5, 8, 2$. This is a cyclic permutation (rotation) of the repeated digits.
Compact Notation
It would be nice to have a compact notation similar to the one we used for falling integer expansions:

$$\begin{aligned} \frac{1}{7} &= 3 + 10\cdot\frac{-2}{7}\\ \frac{-2}{7} &= 4 + 10\cdot\frac{-3}{7}\\ \frac{-3}{7} &= 1 + 10\cdot\frac{-1}{7}\\ \frac{-1}{7} &= 7 + 10\cdot\frac{-5}{7}\\ \frac{-5}{7} &= 5 + 10\cdot\frac{-4}{7}\\ \frac{-4}{7} &= 8 + 10\cdot\frac{-6}{7}\\ \frac{-6}{7} &= 2 + 10\cdot\frac{-2}{7} \end{aligned}$$

The equality in each row is easy to verify, but the downside is that you have to perform all the important calculations somewhere else. If the divisor is very large, you can substitute any easy to write symbol for it. Also, feel free to omit the divisions by $7$ too, if you feel comfortable with the calculations. An alternative, more computation-friendly, notation can look something like this:

$$\begin{aligned} 1 &= 3\cdot 7 - 2\cdot 10\\ 1 &\equiv \underline{3} \rightarrow 7\cdot 0 - 2\cdot 1 = -2\\ -2 &\equiv \underline{4} \rightarrow 7\cdot(-1) - 2\cdot(-2) = -3\\ -3 &\equiv \underline{1} \rightarrow 7\cdot(-1) - 2\cdot(-3) = -1\\ -1 &\equiv \underline{7} \rightarrow 7\cdot(-1) - 2\cdot(-1) = -5\\ -5 &\equiv \underline{5} \rightarrow 7\cdot(-2) - 2\cdot(-5) = -4\\ -4 &\equiv \underline{8} \rightarrow 7\cdot(-2) - 2\cdot(-4) = -6\\ -6 &\equiv \underline{2} \rightarrow 7\cdot(-2) - 2\cdot(-6) = -2 \end{aligned}$$
The underlined values are the (positive) remainders when dividing the current remainder, multiplied by the $3$ from our starting identity at the very top, by 10 - they're the digits of the expansion. The idea is that you take the rightmost number in each row, the next remainder, and bring it down to the start of the next row. The equation to the right of the arrow combines the quotient of that same division by 10 with the current remainder to produce the next one. The quotient from the first division by 10 is just $0$, so you could omit that term too.
The leftmost equals sign is now a triple bar, used to indicate that we're dealing with a remainder. Feel free to replace the arrow with whatever you prefer, and I don't actually write all the underlines and intermediate products either; those are there to help you get comfortable with the notation.
I've yet to explain where the equation to the right of the arrow comes from.
Alternative Formulation
The fact that $a$ and $b$ are related by the starting equation suggests a potential optimisation:

$$1 = 7a + 10b$$
So, when computing the $k$-th digit of the expansion, we multiply the starting equation by the previous remainder, $r_{k-1}$, and the coefficient in front of $10$ becomes:

$$b\,r_{k-1}$$

On the other hand, the coefficient in front of the divisor becomes:

$$a\,r_{k-1} = 10\,q_k + s_k$$

for some quotient $q_k$ and remainder $s_k$. We can substitute $10\,q_k + s_k$ for $a\,r_{k-1}$ in the previous equation to get:

$$r_{k-1} = 7\,(10\,q_k + s_k) + 10\,b\,r_{k-1} = 7\,s_k + 10\,(7\,q_k + b\,r_{k-1})$$

Recall that in our original algorithm we eventually add the multiple of $10$ to the multiplied value of $\frac{1}{7}$, which we can rewrite as a single fraction to bring it under a common denominator:

$$\frac{r_{k-1}}{7} = s_k + 10\cdot\frac{7\,q_k + b\,r_{k-1}}{7}$$

so the $k$-th digit is $s_k$, and the next remainder is $r_k = 7\,q_k + b\,r_{k-1}$.
We didn't take advantage of the specific values of the divisor and base, so the expression holds for any divisor, $d$, and base, $\beta$:

$$r_k = d\,q_k + b\,r_{k-1}, \qquad \text{where } a\,r_{k-1} = \beta\,q_k + s_k \text{ and } 1 = a\,d + b\,\beta$$
This optimisation is useful because it removes a multiplication step from the calculation, and eliminates one of the variables from the symbolic computation, which is useful for proofs. As an exercise, use this formula to prove that every remainder after the first step is negative, and that its absolute value is less than the divisor.
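Here's a minimal sketch of the optimised step in code (intDivRisingOpt is my name for it, and it borrows SymPy's mod_inverse to find the starting identity):

from sympy import mod_inverse

def intDivRisingOpt(n: int, d: int, b: int, termCount: int = 10) -> list:
    # With 1 == a*d + bz*b as the starting identity, the digit is s from
    # a*r = q*b + s (floored), and the next remainder is d*q + bz*r --
    # no division of the remainder by the base required.
    a = mod_inverse(d, b)
    bz = (1 - a * d) // b
    expansion, r = [], n
    for _ in range(termCount):
        q, s = divmod(a * r, b) # Python's divmod floors, which is what we need
        expansion.append(s)
        r = d * q + bz * r
    return expansion

print(intDivRisingOpt(1, 7, 10))
[3, 4, 1, 7, 5, 8, 2, 4, 1, 7]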
Automated Computation
It's time to make the computer do the annoying work for us. Assuming we have a way to compute the coefficients for the starting equation, the rest of the algorithm is trivial to implement. One important observation about that equation is that, if we compute its remainder after dividing by either $7$ or $10$, we'll get $1$, and $a$ and $b$ are the values that make that happen. Computations with remainders are so useful that there's an entire branch of mathematics, called modular arithmetic (or clock arithmetic), that studies them. In regular rational arithmetic, if the result of multiplying two rational numbers equals $1$, we know that one of those numbers is the multiplicative inverse of the other. The exact same idea applies in modular arithmetic, where it's called the modular multiplicative inverse instead. As you can imagine, computing such inverses is essential if you want to do modular arithmetic, so you can easily find an implementation that computes them for you. SymPy's implementation is called mod_inverse, and we can use it for our first implementation:
from sympy import mod_inverse

def intDivRisingSimple(n: int, d: int, b: int, termCount: int = 10) -> list:
    expansion = []
    inv = mod_inverse(d, b)
    for _ in range(termCount):
        digit = (n * inv) % b
        n = (n - d * digit) // b
        expansion.append(digit)
    return expansion
print(intDivRisingSimple(1, 7, 10))
[3, 4, 1, 7, 5, 8, 2, 4, 1, 7]
The major problem with intDivRisingSimple
is that it will error out if we plug in any divisor that contains multiples of the base, such as $30$. We know we can factor out the $10$, expand $\frac{1}{3}$, and then divide by $10$ afterwards. This implies something interesting about our notation: there can be a finite number of fractional digits, similar to how decimals can have a finite number of integer digits! To make our function more robust, we can divide out as many multiples of the base from the divisor as we can, and then return the negative power of the base that would need to be applied to the list of coefficients:
def intDivRising(n: int, d: int, b: int, termCount: int = 10) -> tuple[list, int]:
    exp = 0
    while not d % b:
        exp -= 1
        d = d // b
    expansion = []
    inv = mod_inverse(d, b)
    for _ in range(termCount):
        digit = (n * inv) % b
        n = (n - d * digit) // b
        expansion.append(digit)
    return expansion, exp

print(intDivRising(1, 30, 10, 4))
print(intDivRising(1, 30, 10, 4))
([7, 6, 6, 6], -1)
This would correspond to $\overline{6}.7$ in our revised notation, or, in expanded form:

$$\frac{1}{30} = 7\cdot 10^{-1} + 6 + 6\cdot 10 + 6\cdot 10^2 + \dots$$
Programming Warning
If you're trying to follow along in a different language, you might run into some issues related to how division and remainder operations are implemented. Python is somewhat unusual because its remainder operator, %, always returns a non-negative value when the divisor is positive. This is very convenient when implementing maths algorithms, but it comes with the consequence that its integer division operation can return negative values, and rounds down (floor) instead of the typical truncation towards zero in other languages. For example, the C99 standard explicitly states that integer division will truncate the quotient (section 6.5.5 "Multiplicative Operators"). Following suit, the C++11 standard also defined integer division as a truncating operation (section 5.6.4 "Multiplicative operators"). In practice, truncation was the de-facto standard regardless of what the language standards say, because that's how hardware implementations typically operate. Some languages implement both remainder and modulo to differentiate between truncation and flooring. Fortunately, implementing your own flooring division isn't too difficult if your language doesn't support it out of the box. Here's how you can do it in C:
typedef long long integer;

integer divModFloorConstantTime(integer n, integer d, integer* o_remainder) {
    integer quotient = n / d;
    integer remainder = n % d;
    // Only adjust the output when either the numerator or divisor is negative, which
    // translates to an XOR operation. Clang, gcc, and msvc will optimise isNegative
    // to (n ^ d) >> (BIT_SIZE - 1); I didn't do it for the sake of clarity.
    char isNegative = (n < 0) ^ (d < 0);
    char isApproximate = remainder != 0;
    char shouldAdjust = isApproximate & isNegative;
    quotient -= shouldAdjust;
    remainder += shouldAdjust * d;
    *o_remainder = remainder;
    return quotient;
}

integer divModFloor(integer n, integer d, integer* o_remainder) {
    integer quotient = n / d;
    integer remainder = n % d;
    if (remainder != 0 && ((n < 0) ^ (d < 0))) {
        quotient -= 1;
        remainder += d;
    }
    *o_remainder = remainder;
    return quotient;
}
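As a quick sanity check, assuming the definitions above, floored division should match Python's // and % on a negative numerator:

#include <stdio.h>

int main(void) {
    integer remainder;
    integer quotient = divModFloor(-13, 10, &remainder);
    printf("%lld %lld\n", quotient, remainder); /* prints -2 7, matching Python's -13 // 10 and -13 % 10 */
    return 0;
}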
Playing Around
Truncating after a specific number of digits in our computation is practical yet boring. Here's a simple rising expansion implementation that won't stop until it finds the repeating digits:
from sympy import mod_inverse

def intDivRisingRepeat(n: int, d: int, b: int) -> tuple[list, list, int]:
    exp = 0
    while not d % b:
        exp -= 1
        d = d // b
    expansion = []
    inv = mod_inverse(d, b)
    remainderIndex = {} # keep track of where we encounter specific remainders
    currIndex = 0
    while True:
        digit = (n * inv) % b
        n = (n - d * digit) // b
        expansion.append(digit)
        if n == 0:
            return expansion, [], exp
        currIndex += 1
        # `setdefault` will return the existing index, if it finds one during insertion
        prevIndex = remainderIndex.setdefault(n, currIndex)
        if prevIndex != currIndex:
            return expansion[:prevIndex], expansion[prevIndex:], exp
print(intDivRisingRepeat(1, 7, 10))
([3], [4, 1, 7, 5, 8, 2], 0)
What else can we throw at this function? Try an integer like $12$:

print(intDivRisingRepeat(12, 1, 10))
([2, 1], [], 0)
It just gave us back the 12 (the list is ordered by increasing powers). Can we convert $16$ to base $2$ with this function? Sure:
print(intDivRisingRepeat(16, 1, 2))
([0, 0, 0, 0, 1], [], 0)
How about a negative number like $-13$:

print(intDivRisingRepeat(-13, 1, 10))
([7, 8], [9], 0)
Interesting, it has a repeating 9! Here's $-127$:

print(intDivRisingRepeat(-127, 1, 10))
([3, 7, 8], [9], 0)
Repeating 9 again... If you carefully study the behaviour of the expansion formula we derived above, you'll see why the repeating 9 will continue to appear for every negative integer you input. In fact, the behaviour is the same regardless of the base, you'll just get a repeating $b - 1$, where $b$ is the base. Programmers reading this article might recognise what's going on here: these expansions are just $b$'s complement representations of negative integers, which behave the exact same way as the 2's complement used by computers!
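For instance, $-1$ comes out as an endless string of 9s:

print(intDivRisingRepeat(-1, 1, 10))
([9], [9], 0)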
We haven't tried any negative rationals yet, so let's see what we get for $-\frac{1}{7}$, compared to $\frac{1}{7}$:
1/7: ([3], [4, 1, 7, 5, 8, 2], 0)
-1/7: ([7], [5, 8, 2, 4, 1, 7], 0)
We get a different fixed term and the repeating part is rotated! Can you come up with an algorithm to negate the expansion of a rational without using our division algorithm? See if you can do the same for the integers.
At this point you might be wondering why we'd even care about this alternative representation of numbers. Depending on your background, you might be aware of just how useful linear algebra is in a variety of fields. Reducing difficult problems to equivalent, and often approximate, formulations in linear algebra is how a lot of mathematics is done in the wild, because the tools of linear algebra are powerful, well-understood, and well-behaved. I'd argue that the theory of finite fields is very similar in this regard, yet, when it comes to questions involving rational numbers, or approximations with them, we can't use the tools of finite field arithmetic unless we come up with a compatible encoding for our numbers. The representation we came up with appears to be a promising candidate that may allow us to, metaphorically, import the tools of finite fields in our mathematical projects, though, so far, we've only scratched the surface.
Conclusion
As is customary with my writing style, the unavoidable jargon comes at the end. What we've been doing so far is exploring a well-known mathematical field: the p-adic numbers. There are many high quality presentations that feature these curious numbers, some notable examples being 3Blue1Brown's, Eric Rowland's, and, more recently, Veritasium's videos. However, what all presentations I've seen have in common is a distinct lack of computational grounding. Norman Wildberger was the closest to providing this grounding, yet the clumsy calculations kept getting in the way. What I find most regrettable, however, is that proofs about the p-adics, at least those I've seen, require heavy mathematical machinery, and are practically inaccessible for most people - even though there are elementary alternatives accessible to anyone with a basic understanding of number theory and proof writing. I'll go over those proofs in future articles, and I hope this brief introduction has sparked your interest!
Historically, the invention of the p-adics was, indeed, inspired by what I called rising division, more specifically, Taylor series - we'll explore how those two are connected later on. Kurt Hensel, the inventor of p-adic numbers, was interested in using analysis tools for problems in number theory, and it seems the educational material on the subject hasn't evolved much since the early 20th century, because it's still deeply tied to analysis.
My interests in error correction and random number generators also exposed me to the lacking educational materials in those fields. I've yet to see a single book or paper properly explain why linear recurrences are, formally, represented by rational functions. Generating functions, as is typical for them, are thrown into the mix to sow even more confusion, all while simple ideas are buried under piles of needless abstraction.
Algebraic geometry is no exception: a lack of understanding about polynomial division is apparent in most treatments, especially when it comes to elimination theory and Gröbner bases.
The motivation to work on this article came from my frustration with the state of education in fields I find beautiful, and I hope I can properly convey this beauty to people who are interested in mathematics and computer science, yet find the deeper ideas nigh on impenetrable.