Sums of four and more unit fractions and approximate parametrizations

Abstract We prove new upper bounds on the number of representations of rational numbers $m/n$ as a sum of four unit fractions, giving five different regions, depending on the size of $m$ in terms of $n$. In particular, we improve the most relevant cases, when $m$ is small, and when $m$ is close to $n$. The improvements stem from not only studying complete parametrizations of the set of solutions, but simplifying this set appropriately. Certain subsets of all parameters define the set of all solutions, up to applications of divisor functions, which have little impact on the upper bound of the number of solutions. These 'approximate parametrizations' were the key to enabling computer programmes to filter through a large number of equations and inequalities. Furthermore, this result leads to new upper bounds for the number of representations of rational numbers as sums of more than four unit fractions.


Introduction
We consider the problem of representing an arbitrary positive rational number $m/n$ as a sum of $k$ unit fractions. This leads to Diophantine equations of the form
$$\frac{m}{n} = \frac{1}{a_1} + \cdots + \frac{1}{a_k}. \tag{1}$$
This equation has been studied from a variety of viewpoints; we mention only results of Croot [3], Graham [8], Konyagin [11] and Martin [12].
In this paper, we are interested in upper bounds for the number of solutions of (1) in $(a_1, \dots, a_k) \in \mathbb{N}^k$, in particular for fixed $m, n, k \in \mathbb{N}$, where we consider the $a_i$ to be given in increasing order.
The most important special case of equation (1) is when $m = 4$ and $k = 3$, which is linked to the famous Erdős-Straus conjecture. This conjecture states that for any $n \geq 2$, the rational number $4/n$ has a representation as a sum of three unit fractions (see [7]). For a survey of recent results and for later use, we borrow the following notation from [2]:
$$f_k(m, n) = \left|\left\{(a_1, \dots, a_k) \in \mathbb{N}^k : a_1 \leq \dots \leq a_k,\ \frac{m}{n} = \frac{1}{a_1} + \cdots + \frac{1}{a_k}\right\}\right|.$$
In the case of the Erdős-Straus equation with $n = p$ prime, Elsholtz and Tao [6] proved that
$$f_3(4, p) \ll_\varepsilon p^{3/5+\varepsilon}. \tag{2}$$
For general $m, n \in \mathbb{N}$, we have
$$f_3(m, n) \ll_\varepsilon n^\varepsilon \left(\frac{n}{m}\right)^{2/3} \quad \text{(Browning and Elsholtz [2])} \tag{3}$$
and
$$f_3(m, n) \ll_\varepsilon n^\varepsilon \left(\frac{n^3}{m^2}\right)^{1/5} \quad \text{(Elsholtz and Planitzer [5])}. \tag{4}$$
Note that the upper bound in (4) is stronger than (3) if $m \ll n^{1/4}$. In particular, the bound in (4) allows one to deduce the Elsholtz-Tao exponent $3/5$ in (2) for the Erdős-Straus equation also for general denominators $n$.
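To make the counting function concrete, the following minimal sketch enumerates by brute force the nondecreasing solutions of equation (1) for small $m$, $n$ and $k$. The names `unit_fraction_solutions` and `f` are illustrative and not taken from the cited papers. The search range for the smallest remaining denominator $a$ uses only the elementary inequalities $1/a < r \leq k/a$, where $r$ is the value still to be represented by $k$ unit fractions.

```python
from fractions import Fraction
from math import floor


def unit_fraction_solutions(r, k, lo=1):
    """Yield nondecreasing tuples (a_1, ..., a_k) with a_1 >= lo and sum(1/a_i) == r."""
    if r <= 0:
        return
    if k == 1:
        # A single unit fraction must equal r exactly.
        if r.numerator == 1 and r.denominator >= lo:
            yield (r.denominator,)
        return
    # The smallest denominator a satisfies 1/a < r <= k/a.
    start = max(lo, floor(1 / r) + 1)
    stop = floor(k / r)
    for a in range(start, stop + 1):
        for rest in unit_fraction_solutions(r - Fraction(1, a), k - 1, a):
            yield (a,) + rest


def f(m, n, k):
    """Number of nondecreasing solutions of m/n = 1/a_1 + ... + 1/a_k."""
    return sum(1 for _ in unit_fraction_solutions(Fraction(m, n), k))
```

For instance, `f(1, 1, 3)` counts the three decompositions of 1 into three unit fractions. Such an enumeration is feasible only for very small parameters, which is one reason why bounds of the type (2)-(4) are of interest.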
In the case $k = 3$, the bounds in (2)-(4) were derived by analyzing suitable parametrizations of solutions of equation (1) together with an application of the classical divisor bound. The method of Elsholtz and Tao [6] leading to (2) is possibly the limit of that method, and the same seems to be true for the bound in (4) (at least for constant $m$). However, we believe that these bounds are still quite far from the truth. Indeed, it was suggested by Heath-Brown to Elsholtz that even $f_3(m, n) = O_\varepsilon(n^\varepsilon)$ appears possible, as $n$ tends to infinity. More generally, and somewhat stronger, we think that it is also quite possible that the following conjecture holds true.
for a positive constant $C_{m,k}$ depending only on $m$ and $k$.
The bounds in (7) were derived via an application of a lifting procedure first introduced by Browning and Elsholtz [2]. The improvement in the bounds in (7) compared to the original bounds by Browning and Elsholtz comes from taking into account a small part of the information coming from parametrizations of solutions of (1) for k = 4 when lifting from k = 3.
In this paper, our goal is to prove better upper bounds in the $k = 4$ case directly by using suitable parametrizations of the solutions and not by lifting from the $k = 3$ case. The difficulty with this approach is that the number of parameters in the parametrization we want to use increases exponentially with $k$. The new method therefore not only uses a suitable parametrization but, in view of the increased complexity, also has a computational part. In particular, we make heavy use of a computer algebra system to accomplish the following tasks.
• Find many defining sets. By this we mean subsets of the parameters such that once they are fixed, we have at most of order $n^\varepsilon$ choices for the remaining parameters.
• Find products of parameters which are small in terms of n and such that the parameters appearing as factors may be partitioned into many defining sets.
Note that what we call 'defining sets' above are approximate parametrizations in some sense. 'Defining sets' are not in one-to-one correspondence with solutions of equation (1) as we would have with a full parametrization. Nonetheless, fixing integer values for all parameters in a 'defining set' allows for very few (in our sense $O_\varepsilon(n^\varepsilon)$) solutions for this equation instead of just a single one.
Our main result is the following.
Together with the two bounds in (5) and (7), this gives:
This new result shows that the analysis of the number of sums of 4 and more unit fractions might be much more complicated than was previously known.
Remark 1. In equation (1) with $k = 4$, one generally has that $a_1 \ll n$, $a_2 \ll n^2$, $a_3 \ll n^4$. Hence there are at most $O(n^7)$ choices for $a_1$, $a_2$ and $a_3$, and then $a_4$ is unique, if it exists. Hence, $f_4(m, n) \ll n^7$ is a completely trivial upper bound. However, fixing only $a_1$ and $a_2$, one sees that the number of pairs $(a_3, a_4)$ is bounded by a divisor function (for details, see, for example, [5]). Hence $f_4(m, n) \ll n^{3+\varepsilon}$ is still a trivial upper bound. The worst we would get from Theorem 1, when $m$ is small, would be an upper bound of order $n^{3/2+\varepsilon}$.
[...] in (7) and $n^\varepsilon\left((n/m)^{5/3} + n^{4/3}/m^{2/3}\right)$ in (5), we see that each of these four bounds is best in some cases, and when splitting the contributions of the two parts in $O\left(n^{28/17}/m^{8/5} + n^{4/3}/m^{2/3}\right)$, we see that there are even five different upper bounds involved. To present these results in a uniform way, we write exponents as $\alpha/30345$, where 30345 is the smallest integer avoiding further fractions in the boundaries below. For fractions $m/n$ with $m = n^{\alpha/30345}$, where $\alpha$ is a real parameter in $0 \leq \alpha \leq 30345$, the following holds (omitting the $n^\varepsilon$ factor):
• $0 \leq \alpha \leq 5250$: the upper bound of order $n^{3/2}/m^{3/4}$ from Theorem 1 is the sharpest one.
• $5250 \leq \alpha \leq 8925$: the bound $n^{28/17}/m^{8/5}$ from (7) gives the best bound.
At the points of transition, that is, $\alpha \in \{5250, 8925, 10115, 10200, 24276\}$, the corresponding upper bounds in these inequalities are equally sharp.
We summarize this in the following corollary, and present a graphical display in Figures 1 and 2, with $c = \alpha/30345$ on the x-axis and the exponent of $n$ on the y-axis.
[...] listing these solutions. In particular, we can decide within the same time constraints whether the rational number $m/n$ has a representation of this form. A precise formulation of this result would make use of the complexity of factorizations. For details, we refer to [5].
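The transition points can be checked with exact rational arithmetic. The sketch below writes $m = n^c$ with $c = \alpha/30345$ and compares the exponents of $n$ in the competing bounds. The pairing of each transition point with two bounds is an assumption of this sketch, obtained by solving the linear equations between exponents. The bound $n^{8/5}/m$ is the second bound for $f_4(m, n)$ quoted in the section on $k \geq 5$ unit fractions.

```python
from fractions import Fraction as F

D = 30345  # common denominator used in the text; 30345 = 3 * 5 * 7 * 17**2

# Exponent of n in each bound when m = n**c (the n**eps factor is omitted).
bounds = {
    "n^(3/2)/m^(3/4)": lambda c: F(3, 2) - F(3, 4) * c,
    "n^(8/5)/m": lambda c: F(8, 5) - c,
    "n^(28/17)/m^(8/5)": lambda c: F(28, 17) - F(8, 5) * c,
    "(n/m)^(5/3)": lambda c: F(5, 3) * (1 - c),
    "n^(4/3)/m^(2/3)": lambda c: F(4, 3) - F(2, 3) * c,
}

# Assumed pairing: which two bounds have equal exponents at each transition point.
transitions = {
    5250: ("n^(3/2)/m^(3/4)", "n^(28/17)/m^(8/5)"),
    8925: ("n^(28/17)/m^(8/5)", "(n/m)^(5/3)"),
    10115: ("(n/m)^(5/3)", "n^(4/3)/m^(2/3)"),
    10200: ("n^(28/17)/m^(8/5)", "n^(4/3)/m^(2/3)"),
    24276: ("n^(8/5)/m", "n^(4/3)/m^(2/3)"),
}


def meets(alpha, first, second):
    """True if the two named bounds have the same exponent of n at c = alpha / D."""
    c = F(alpha, D)
    return bounds[first](c) == bounds[second](c)


checks = {alpha: meets(alpha, *pair) for alpha, pair in transitions.items()}
```

All five entries of `checks` come out true, which is consistent with 30345 clearing every denominator that occurs at the boundaries.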
Again the bound on sums of four unit fractions can be lifted to upper bounds for k > 4.
Theorem 2. For $m, n \in \mathbb{N}$ and $k \geq 5$, we have
Note that the improvement in the upper bound in Theorem 2 concerns the constant $8/5$ in the exponent. If we compare the result with the bounds in (8), we see that, depending on $k$, the difference in the corresponding exponents of $n$ is $\frac{4}{85} \cdot 2^{k-4}$. The results in Theorem 2 immediately improve several upper bounds for the special case of representing 1 as a sum of unit fractions. Some of these results are mentioned in [2] with improved upper bounds in [5]. Here we just reformulate [5, Corollary 3] by giving the improved upper bounds one gets by using Theorem 2. The proof is the same as in [2,5] after plugging in the new bound.
(2) Let $(u_n)_{n \in \mathbb{N}}$ be the sequence recursively defined by $u_0 = 1$ and $u_{n+1} = u_n(u_n + 1)$ and set $c_0 = \lim_{n \to \infty} u_n^{2^{-n}}$. Then for $\varepsilon > 0$ and $k \geq k(\varepsilon)$, we have
(3) For $\varepsilon > 0$ and $k \geq k(\varepsilon)$, the number of positive integer solutions of the equation
Remark 3. The sequence $u_n$, starting with 1, 2, 6, 42, 1806, ..., is listed as A007018 in the On-Line Encyclopedia of Integer Sequences (OEIS), and is a shifted copy of the well-known Sylvester sequence (A000058 of the OEIS): 2, 3, 7, 43, 1807, ... It is known that the limit $c_0 = \lim_{n \to \infty} u_n^{2^{-n}} = 1.5979102\ldots$ exists and is irrational; for details see [1,13]. Graham, Knuth and Patashnik [9, Exercise 4.37] sketch a proof of (in our notation) $u_n = \lfloor c_0^{2^n} - \frac{1}{2} \rfloor$. The existence of the limit can be proved directly, as it follows inductively that $u_n \leq 2^{2^n}/2$, so that the sequence $q_n := (u_n)^{1/2^n}$ is bounded from above by 2, and $u_{n+1} \geq u_n^2$ implies that $(u_{n+1})^{1/2^{n+1}} \geq (u_n)^{1/2^n}$, so that the sequence of the $q_n$ is also monotonically increasing.
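The convergence described in Remark 3 is easy to observe numerically. The following sketch, with illustrative function names, computes the first terms of $u_n$ exactly and evaluates $q_n = (u_n)^{1/2^n}$ in floating point via logarithms, so the values of $q_n$ are approximations only.

```python
from math import exp, log


def u_sequence(count):
    """First `count` terms of u_0 = 1, u_{n+1} = u_n * (u_n + 1), as exact integers."""
    u = [1]
    while len(u) < count:
        u.append(u[-1] * (u[-1] + 1))
    return u


def q(n, u_n):
    """Floating-point approximation of (u_n) ** (1 / 2**n)."""
    return exp(log(u_n) / 2 ** n)


us = u_sequence(11)
qs = [q(n, u_n) for n, u_n in enumerate(us)]
```

The values in `qs` increase from 1 and stay below 2, and `qs[10]` already agrees with the quoted value $1.5979102\ldots$ to the digits shown.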
At the end of this introduction, we comment on the most important aspects of the notation used in the following. The letters $\mathbb{N}$ and $\mathbb{P}$, as usual, denote the sets of positive integers and positive primes. The function $d(n)$ denotes the number of positive divisors of $n$. By $\nu_p(n)$, $p \in \mathbb{P}$, we denote the $p$-adic valuation of $n$, that is, the exponent of the highest power of $p$ dividing $n$. We use the symbols $\ll$ and $O$ in the contexts of the well-known Vinogradov and Landau notations. Dependencies of the implied constants on additional parameters will be indicated by a subscript.

Patterns and parameters
In this section, we introduce a method of parametrization for solutions of equation (1) which is based on what we will call relative greatest common divisors and patterns. This type of parametrization has been used before in connection with sums of unit fractions. Elsholtz first used relative greatest common divisors as described below in [4] while patterns played a role in proving results in [5]. For a more thorough introduction to this method and for some historical comments, see [4,5].
We start by writing the denominators of the unit fractions on the right-hand side of equation (1) as $a_i = n_i t_i$, where $n_i = \gcd(a_i, n)$. We note that by definition $\gcd(t_i, n/n_i) = 1$, and for given $(a_1, \dots, a_k) \in \mathbb{N}^k$, we call $(n_1, \dots, n_k) \in \mathbb{N}^k$ the pattern of the solution. To bound the number of patterns for given $n \in \mathbb{N}$, we make use of the classical divisor bound, which was also one of the main ingredients in Elsholtz and Tao's proof of an upper bound for $f_3(4, p)$ in [6]. We will use it in the following form (see [10, Theorem 315]).
Lemma A (Classical divisor bound). Let $d(n) = \sum_{d \mid n} 1$ be the number of positive divisors of an integer $n$. Then for any $\varepsilon > 0$, we have
$$d(n) \ll_\varepsilon n^\varepsilon.$$
When trying to find upper bounds on $f_4(m, n)$, we can consider the pattern of the solutions to be fixed, since the upper bound we will establish is independent of the pattern. Lemma A tells us that we have at most $O_\varepsilon(n^\varepsilon)$ such patterns, and when looking at the result in Theorem 1 we see that an additional factor of $n^\varepsilon$ does not change the upper bound there. Hence from now on we consider the pattern $(n_1, n_2, n_3, n_4)$ to be fixed.
Note that the trivial upper bound for the number of patterns would rather be of order $n^{4\varepsilon}$ and to get the above bound we need to redefine $\varepsilon$. Also below we will often apply the divisor bound several times in a row to conclude that there are at most of order $n^\varepsilon$ choices for some parameters. In any such situation, this upper bound is achieved after possibly redefining $\varepsilon$, and we will not explicitly state this henceforth.
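As a small illustration of patterns, the sketch below splits each denominator of a solution as $a_i = n_i t_i$ with $n_i = \gcd(a_i, n)$ and checks the coprimality of $t_i$ and $n/n_i$. The decomposition $2/15 = 1/10 + 1/60 + 1/90 + 1/180$ and the function names are chosen here for illustration and do not appear in the source.

```python
from fractions import Fraction
from math import gcd


def pattern_and_cofactors(a, n):
    """Return the pattern (n_1, ..., n_k) and cofactors (t_1, ..., t_k) with a_i = n_i * t_i."""
    pattern = tuple(gcd(a_i, n) for a_i in a)
    cofactors = tuple(a_i // n_i for a_i, n_i in zip(a, pattern))
    return pattern, cofactors


def divisor_count(n):
    """d(n): number of positive divisors of n, by naive trial division."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


m, n = 2, 15
a = (10, 60, 90, 180)
is_solution = sum(Fraction(1, a_i) for a_i in a) == Fraction(m, n)

pattern, cofactors = pattern_and_cofactors(a, n)
coprime = all(gcd(t_i, n // n_i) == 1 for t_i, n_i in zip(cofactors, pattern))

# Every n_i divides n, so there are at most d(n)**k candidate patterns.
max_patterns = divisor_count(n) ** len(a)
```

Here the pattern is $(5, 15, 15, 15)$ with cofactors $(2, 4, 6, 12)$, and the crude count $d(n)^k$ is the quantity that Lemma A bounds by a power $n^{4\varepsilon}$.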
Next we set $I = \{1, \dots, k\}$ to be the index set and write the factors $t_i$ as a product of what we want to call relative greatest common divisors, denoted by $x_J$, $J = \{i_1, \dots, i_{|J|}\} \subset I$. Here we recursively define these relative greatest common divisors $x_J$ as follows: $x_I = \gcd(t_1, \dots, t_k)$ and
$$x_J = \frac{\gcd(t_{i_1}, \dots, t_{i_{|J|}})}{\prod_{J \subsetneq K \subset I} x_K}.$$
With this definition, we have
$$t_i = \prod_{J \subset I,\ i \in J} x_J$$
for $1 \leq i \leq k$, and it is easy to see that
$$\gcd(x_J, x_K) = 1 \quad \text{whenever } J \not\subset K \text{ and } K \not\subset J. \tag{9}$$
See, for example, [5] for a short proof of the last statement.
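The recursive construction can be sketched in a few lines. The code below assumes the usual reading of the definition, in which the greatest common divisor of the $t_i$ with $i \in J$ is divided by all $x_K$ belonging to strict supersets $K$ of $J$. The function name and the test tuple $(2, 4, 6, 12)$ are illustrative choices.

```python
from functools import reduce
from itertools import combinations
from math import gcd, prod


def relative_gcds(t):
    """Map each nonempty index set J (a tuple of 1-based indices) to x_J."""
    k = len(t)
    x = {}
    # Work from the full index set down to singletons, so that every strict
    # superset K of J has been handled before J itself.
    for size in range(k, 0, -1):
        for J in combinations(range(1, k + 1), size):
            g = reduce(gcd, (t[i - 1] for i in J))
            for K, x_K in x.items():
                if set(J) < set(K):  # K is a strict superset of J
                    assert g % x_K == 0
                    g //= x_K
            x[J] = g
    return x


t = (2, 4, 6, 12)
x = relative_gcds(t)

# t_i should be the product of all x_J with i in J.
products = tuple(prod(x_J for J, x_J in x.items() if i in J) for i in range(1, 5))
nontrivial = {J: x_J for J, x_J in x.items() if x_J > 1}
```

For this tuple the only relative greatest common divisors larger than 1 are $x_{1234} = 2$, $x_{24} = 2$ and $x_{34} = 3$, and the two incomparable index sets $\{2, 4\}$ and $\{3, 4\}$ carry coprime values, in line with the coprimality statement.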
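To keep things readable, and since in the cases we use it no ambiguity will arise, below we will often resort to the following simplified notation. If $J = \{i_1, \dots, i_{|J|}\}$ and the $i_j$ are given in increasing order, then we write $x_J = x_{i_1 i_2 \dots i_{|J|}}$.
We now apply this parametrization and patterns in the special case of sums of 4 unit fractions, that is, equation (1) with $k = 4$:
$$\frac{m}{n} = \frac{1}{n_1 t_1} + \frac{1}{n_2 t_2} + \frac{1}{n_3 t_3} + \frac{1}{n_4 t_4}, \qquad t_i = \prod_{J \subset I,\ i \in J} x_J, \tag{10}$$
where $a_1 \leq \dots \leq a_4$.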
Next we multiply the last equation by $n$ and the least common denominator of the unit fractions on the right-hand side. Note that after doing so, the variable $x_i$, for $1 \leq i \leq 4$, appears in exactly three of the four summands on the right-hand side and in the product on the left-hand side. This means that also the fourth summand on the right-hand side, of which $x_i$ is not a factor, has to be divisible by $x_i$. This summand is of the form
$$\frac{n}{n_i} \prod_{J \subset I,\ i \notin J} x_J,$$
where we use the set-index notation for convenience. By (9), $x_i$ is coprime to $\prod_{J \subset I,\ i \notin J} x_J$. Furthermore, by the definition of a pattern, we also have $\gcd(x_i, n/n_i) = 1$, which leaves $x_i = 1$ for $1 \leq i \leq 4$. With this simplification, we get
$$m \prod_{J \subset I,\ |J| \geq 2} x_J = \sum_{i=1}^{4} \frac{n}{n_i} \prod_{J \subset I,\ |J| \geq 2,\ i \notin J} x_J. \tag{11}$$
We introduce the parameters $d_{\{i,j\}} = d_{ij} = \gcd(n/n_i, n/n_j)$ for $1 \leq i < j \leq 4$ and $d_{\{i,j,k\}} = d_{ijk} = \gcd(n/n_i, n/n_j, n/n_k)$ for $1 \leq i < j < k \leq 4$, and we note that they are fixed by the pattern $(n_1, \dots, n_4)$. Furthermore, again by definition of a pattern, we have that $d_{ij}$ is coprime to all relative greatest common divisors with an $i$ or a $j$ in the index. The same holds true for $d_{ijk}$ and relative greatest common divisors with an $i$, $j$ or $k$ in the index.
In [2,5,6], it turned out to be useful to consider divisibility relations in the equation corresponding to (11) in the three unit fractions case. We will also do this and define the following integer parameters:
In the following, we will only use the parameters $z_J$ defined above. For a general definition of $z_J$, $J \subset \{1, \dots, 4\}$, $2 \leq |J| \leq 3$, see Section 6.
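As a worked illustration of the divisibility argument, consider the decomposition $2/15 = 1/10 + 1/60 + 1/90 + 1/180$, which is chosen here for illustration and does not appear in the source. The identity in the last line is what one obtains by multiplying equation (1) by $n$ and by the product of all $x_J$; its exact displayed form is an assumption of this sketch.

```latex
\begin{align*}
  n &= 15, \quad (n_1, n_2, n_3, n_4) = (5, 15, 15, 15), \quad (t_1, t_2, t_3, t_4) = (2, 4, 6, 12), \\
  x_{1234} &= 2, \quad x_{24} = 2, \quad x_{34} = 3, \quad x_J = 1 \text{ for all other } J
    \text{ (in particular } x_1 = x_2 = x_3 = x_4 = 1\text{)}, \\
  \tfrac{n}{n_1} &= 3, \quad \tfrac{n}{n_2} = \tfrac{n}{n_3} = \tfrac{n}{n_4} = 1,
    \quad \text{so all } d_{ij} = d_{ijk} = 1, \\
  m \prod_{|J| \geq 2} x_J &= 2 \cdot (2 \cdot 2 \cdot 3) = 24, \\
  \sum_{i=1}^{4} \frac{n}{n_i} \prod_{|J| \geq 2,\ i \notin J} x_J
    &= 3 \cdot (x_{24} x_{34}) + 1 \cdot x_{34} + 1 \cdot x_{24} + 1 \cdot 1
     = 18 + 3 + 2 + 1 = 24.
\end{align*}
```

Both sides agree, and each singleton parameter $x_i$ equals 1, as the argument above predicts.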

Defining sets for sums of four unit fractions
In this section, we will determine several defining sets for sums of four unit fractions. We define these sets in the following way.
[...] are the sets of parameters introduced in Section 2. We call a set $S \subset P$ a (four unit fractions) defining set if assigning a positive integer value to every parameter in $S$ allows for at most $O_\varepsilon(n^\varepsilon)$ positive integer assignments to variables in $X \setminus S$ such that
Note that the idea behind the 'defining sets' was already applied in [6, Section 3] and [5] when dealing with sums of three unit fractions (in [5] actually also in the four unit fractions case, but to a very limited extent). Since the larger number of parameters in the four unit fractions case leads to a lot more possibilities for defining sets than we had when dealing with sums of three unit fractions, it seems impractical to determine these sets by hand. In Section 6, we describe how we computed many defining sets via a structured approach using a computer algebra system. Any of these new defining sets can easily be verified by hand. In particular, we will prove the following lemma, which covers only the defining sets used to prove Theorem 1.
Proof. With the help of equations (11)-(13), we derive the following set of equations:
$$z_{23} x_{23} = \frac{n}{n_2 d_{23}} x_{13} x_{34} x_{134} + \frac{n}{n_3 d_{23}} x_{12} x_{24} x_{124}, \tag{17}$$
$$z_{34} x_{34} = \frac{n}{n_3 d_{34}} x_{14} x_{24} x_{124} + \frac{n}{n_4 d_{34}} x_{13} x_{23} x_{123}, \tag{18}$$
The method of proof will be as follows. We show that fixing positive integer values for the parameters in the sets in the statement of the lemma fixes the right-hand side of at least one of the equations (15)-(21). From the divisor bound in Lemma A, we may then deduce that we have at most of order $n^\varepsilon$ choices for the variables on the left-hand side of the corresponding equation. For any of these choices of new parameters, we may then iterate the argument.
Here we note that the right-hand sides of equations (15)-(21) are at most of polynomial size in $n$. By definition, the parameters $d_J$, $J \subset \{1, \dots, 4\}$, $2 \leq |J| \leq 3$, are bounded from above by $n$. If we have a look at the definition of the parameters in the set $Z$ in (14), we see that they are certainly of size at most polynomial in $n$, if the same is true for the parameters in the set $X$. To see that the relative greatest common divisors in $X$ are of size at most polynomial in $n$, we use the fact that any of them is a factor of at least two of the denominators $a_i$, $1 \leq i \leq 4$. In particular, if we have $\frac{m}{n} = \frac{1}{a_1} + \cdots + \frac{1}{a_4}$ with $0 < a_1 \leq \dots \leq a_4$, then $\frac{m}{n} \leq \frac{4}{a_1}$ and $a_1 \leq \frac{4n}{m}$.
With a similar argument, we get $a_2 \leq 3 n a_1 \leq \frac{12 n^2}{m}$. Finally we derive from the last two inequalities $a_3 \leq 2 n a_1 a_2 \leq \frac{96 n^4}{m^2}$. We now go through all defining sets in the statement of the lemma.
(1) Once we fix positive integer values for $z_{23}$ and $z_{234}$, we deduce from equation (16) that we have at most of order $n^\varepsilon$ many choices for all relative greatest common divisors with a '1' in the index. Equation (19) then implies the same for the variables $x_{24}$, $x_{34}$ and $x_{234}$. Finally, the missing variable $x_{23}$ is uniquely determined by (11).
(2) We now consider $z_{234}$, $x_{23}$ and $x_{24}$ to be fixed. Again we have at most of order $n^\varepsilon$ choices for all relative greatest common divisors with a '1' in the index by (16). Now the same holds true for the parameters $z_{34}$ and $x_{34}$ by equation (18). Via equation (20) we deduce that we have at most of order $n^\varepsilon$ choices for the missing parameter $x_{234}$.
(3) Having assigned positive integer values to the parameters $z_{234}$, $x_{23}$ and $x_{234}$, we again use equation (19) to deduce that we have at most of order $n^\varepsilon$ many choices for all parameters with a '1' in the index. Now only assignments for the parameters $x_{24}$ and $x_{34}$ are missing.
To see that we also have at most of order $n^\varepsilon$ many choices for these two parameters, we will apply a method of factoring equation (11) which was already used by Browning and Elsholtz [2]. As two of the five terms of equation (11) contain the factor $x_{24} x_{34}$, it may be rewritten in the form
and further
where the constants $C_i$, $1 \leq i \leq 4$, depend only on relative greatest common divisors $x_J$ which are known. The last equation implies that also in this case, for the remaining parameters $x_{24}$ and $x_{34}$ we have at most of order $n^\varepsilon$ many choices.
(4) In the case of $z_{34}$, $x_{12}$, $x_{123}$, $x_{124}$ and $x_{1234}$ being fixed, we see that we have at most of order $n^\varepsilon$ choices for the parameters $z_{134}$ and $z_{234}$ by equation (21). From equations (15) and (16) we now see that we have at most of order $n^\varepsilon$ choices for $x_{13}$, $x_{14}$, $x_{23}$, $x_{24}$, $x_{123}$ and $x_{234}$. The last parameter, $x_{34}$, is finally uniquely determined by (11).
(5) If all the parameters $x_{12}$, $x_{13}$, $x_{24}$, $x_{34}$, $x_{123}$, $x_{124}$, $x_{134}$ and $x_{1234}$ are fixed, we see from equation (17) that we have at most of order $n^\varepsilon$ choices for the parameter $x_{23}$. Now only the parameters $x_{14}$ and $x_{234}$ are missing. At this point, we again use that equation (11) factors. Indeed, we may rearrange this equation to take the form
where $C_1$, $C_2$, $C_3$ and $C_4$ are integer constants. This equation factors as in point (3), which leads to at most $O_\varepsilon(n^\varepsilon)$ choices for $x_{14}$ and $x_{234}$.
(6) We now deal with the case when $x_{12}$, $x_{13}$, $x_{14}$, $x_{23}$, $x_{123}$, $x_{124}$, $x_{134}$, $x_{234}$ and $x_{1234}$ are all fixed. Note that only the two variables $x_{24}$ and $x_{34}$ are missing. We already proved in point (3) that in this case we have at most of order $n^\varepsilon$ many choices for these two parameters.
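The factoring step used in points (3), (5) and (6) can be sketched as follows. The displayed equations of that step are not reproduced here, so the form $C_1 xy = C_2 x + C_3 y + C_4$ with $C_1 > 0$ is an assumption of this sketch, with $x$ and $y$ playing the roles of the two missing parameters. Multiplying by $C_1$ gives $(C_1 x - C_3)(C_1 y - C_2) = C_1 C_4 + C_2 C_3$, so every solution comes from a divisor of the fixed integer on the right, and the divisor bound applies. The function names are illustrative.

```python
from math import isqrt


def divisors(n):
    """All positive divisors of |n|, for n != 0."""
    n = abs(n)
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))


def solve_bilinear(c1, c2, c3, c4):
    """Positive integer solutions (x, y) of c1*x*y = c2*x + c3*y + c4, for c1 > 0."""
    # Equivalent product form: (c1*x - c3) * (c1*y - c2) = c1*c4 + c2*c3.
    N = c1 * c4 + c2 * c3
    if N == 0:
        raise ValueError("degenerate case: the product form gives no finite bound")
    solutions = set()
    for d in divisors(N):
        for u in (d, -d):  # u = c1*x - c3 may be negative
            v = N // u     # exact, since u divides N
            if (u + c3) % c1 == 0 and (v + c2) % c1 == 0:
                x, y = (u + c3) // c1, (v + c2) // c1
                if x > 0 and y > 0:
                    solutions.add((x, y))
    return solutions


example = solve_bilinear(3, 5, 7, 11)
```

The number of solutions found this way is at most twice the number of divisors of $C_1 C_4 + C_2 C_3$, which is of order $n^\varepsilon$ whenever the constants are of polynomial size in $n$.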

Upper bounds on sums of four unit fractions
In this section, we apply the parametrization introduced in Section 2 and defining sets in Section 3 together with ideas from [6, Section 3] and [5] to prove Theorem 1. Recall that with a fixed pattern all variables $n_i$, $d_{ij}$ and $d_{ijk}$ are fixed for $1 \leq i, j, k \leq 4$ and we have $O_\varepsilon(n^\varepsilon)$ patterns altogether.
We now use the fact that the denominators $a_i = n_i t_i$ are given in increasing order. The inequalities $a_2 \leq a_3$ and $a_3 \leq a_4$ may be rewritten as
$$x_{12} x_{24} x_{124} \leq \frac{n_3}{n_2} x_{13} x_{34} x_{134}, \qquad x_{13} x_{23} x_{123} \leq \frac{n_4}{n_3} x_{14} x_{24} x_{124},$$
by just plugging in the corresponding products of relative greatest common divisors for the $t_i$, $2 \leq i \leq 4$. Combining these last inequalities with three of the equations in (12) and (13)
It may seem a bit mysterious how equations (26) and (27) were found. In Section 6, we describe how we used a computer programme to list many suitable inequalities of this type based on a precomputed list of defining sets. From a list of given inequalities, we have chosen the best ones we found.

Upper bounds on sums of $k \geq 5$ unit fractions
In this section, we prove Theorem 2. We do so by applying a lifting method by Browning and Elsholtz [2] to the result in Theorem 1.
We first derive the bound on $f_5(m, n)$ by summing our upper bound from Theorem 1 over several choices of the smallest denominator $a_1$ in the decomposition. Here, we will only consider the bound $f_4(m, n) \ll_\varepsilon n^\varepsilon \frac{n^{8/5}}{m}$. The reason for this is that summing over the bound $f_4(m, n) \ll_\varepsilon n^\varepsilon \frac{n^{3/2}}{m^{3/4}}$ leads to worse upper bounds for $f_5(m, n)$ because the exponent of $m$ is too small. In particular, for given $a_1 \in \mathbb{N}$, we consider decompositions of $\frac{m}{n} - \frac{1}{a_1} = \frac{m a_1 - n}{n a_1}$ as a sum of four unit fractions. We set $m a_1 - n = u$, and with the trivial bounds $\frac{n}{m} < a_1 \leq \frac{5n}{m}$,
Lemma B together with our bound on $f_5(m, n)$ above proves Theorem 2.

Computational aspects
Here, we describe how we found the proof of Theorem 1. To find inequalities of the type (26) and (27), we used a computer algebra system. As stated earlier, there are two stages at which computational aspects came into play, the first of which was finding many defining sets. Here we used 96 equations of type
After multiplying any number of such inequalities up, we divide by the product of all relative greatest common divisors on the right-hand side. To clear the resulting denominator on the new left-hand side, we use inequality (25) together with the inequalities $t_2 \leq \frac{12 n^2}{n_2 m}$ and $t_3 \leq \frac{96 n^4}{n_3 m^2}$, which we derived in the proof of Lemma 1. Note that apart from clearing denominators, we can add any number of these three inequalities to our previously selected ones.
Furthermore, we took into account that $n_i n_j d_{ij} \geq n$ and $n_i n_j n_k d_{ijk} \geq n$ for all $1 \leq i, j, k \leq 4$. This may lead to a further reduction in size in terms of $n$ on the right-hand side of inequalities constructed as above. However, we cannot prove that our computer search covered all possible defining sets and all relevant combinations of inequalities. Hence, it may well be that the exponent in Theorem 1 can be improved by conducting a more complete search.