Linear correlations of multiplicative functions

We prove a Green–Tao type theorem for multiplicative functions.


Introduction
The purpose of this paper is to establish an asymptotic result for correlations of complex-valued multiplicative functions. More precisely, if $h_1, \dots, h_r : \mathbb{N} \to \mathbb{C}$ are multiplicative functions that belong to a class $\mathcal{F}$ that will be introduced soon (see Section 1.1), then we wish to asymptotically evaluate expressions of the form
$$\sum_{n \in \mathbb{Z}^s \cap NK} h_1(\varphi_1(n)) \cdots h_r(\varphi_r(n)), \qquad (N \to \infty), \tag{1.1}$$
where $s \geq 2$, where $K \subset \mathbb{R}^s$ is a fixed bounded convex subset that defines the family $NK = \{Nx \in \mathbb{R}^s : x \in K\}$ of homogeneously expanding regions, and where $\varphi_1, \dots, \varphi_r \in \mathbb{Z}[u_1, \dots, u_s]$ are fixed linear polynomials in $s \geq 2$ variables whose non-constant parts are pairwise linearly independent.
There has been extensive work on establishing upper bounds on expressions of the form (1.1), both for linear polynomials and for polynomials of higher degree. In [8], Erdős obtained a correct-order upper bound on the sum $\sum_{n \leq N} d(P(n))$, where $d$ denotes the divisor function and $P$ an irreducible polynomial. Wolke [33] extended Erdős's approach from [8] to all non-negative multiplicative functions $h$ that satisfy $h(p^k) \leq C_1 k^{C_2}$ at all prime powers $p^k$, with fixed constants $C_1$ and $C_2$. In the case of linear polynomials, Shiu [31] extended this work to all non-negative multiplicative functions satisfying the bound $h(p^k) \leq A^k$ at all prime powers as well as $h(n) \ll_\varepsilon n^\varepsilon$ for all $n \in \mathbb{N}$ and $\varepsilon > 0$. Nair [29] extended Shiu's result in two directions, allowing polynomials of higher degree with integer coefficients and non-zero discriminant as well as sub-multiplicative functions, that is, functions $h$ satisfying $h(mn) \leq h(m)h(n)$ whenever $\gcd(m, n) = 1$. Nair's work was further generalised by Nair and Tenenbaum [30], who replaced the sub-multiplicative function $h$ of one variable by any non-negative function $F$ of $r \geq 1$ variables which satisfies the following sub-multiplicativity-type condition:
$$F(m_1 n_1, \dots, m_r n_r) \leq \min(A^{\Omega(m)}, B m^{\varepsilon}) F(n_1, \dots, n_r), \qquad (m = m_1 \cdots m_r),$$
for all $r$-tuples $(m_1, \dots, m_r), (n_1, \dots, n_r) \in \mathbb{N}^r$ such that $\gcd(m_i, n_i) = 1$ for $1 \leq i \leq r$. The problem that Nair and Tenenbaum [30] study is that of establishing upper bounds for sums of the form $\sum_{x < n \leq x+y} F(|P_1(n)|, \dots, |P_r(n)|)$ for polynomials $P_i \in \mathbb{Z}[X]$ such that $P = \prod_{1 \leq i \leq r} P_i$ has no fixed prime factor. Nair and Tenenbaum's result covers the particular case where $F$ is given by $F(m_1, \dots, m_r) = h_1(m_1) \cdots h_r(m_r)$ for multiplicative functions $h_i$. This case corresponds to the setting of (1.1). In unpublished work, Daniel [5] established a Nair-Tenenbaum result with bounds that are uniform in the discriminant of $P$. Finally, Daniel's work has been improved upon and extended by Henriot in [17].
Asymptotic results on expressions of the form (1.1) are known in many special cases: Green and Tao [14] establish such results for the Möbius function $\mu$. Using the machinery from [13,14], the author proved asymptotic results for the divisor function $d$ in [26] and, in [25,27], for the function $r$ counting representations by sums of two squares (and generalisations thereof to representations by binary quadratic forms). Lachand [21] considers the characteristic function of $x^{1/u}$-smooth numbers.
The problem of finding upper bounds or even asymptotics for (1.1) in the case where $s = 1$ is significantly harder and includes the famous open problem of finding the correct asymptotic behaviour of $D_h(N) = \sum_{n \leq N} d(n)d(n+h)d(n-h)$ as a function of $h$. When averaging over $h \in \{1, \dots, H\}$ with $H = N$, this question takes the form (1.1) with $s = 2$. Browning [3] showed that the average $\sum_{h \leq H} D_h(N)$ can be asymptotically evaluated for smaller values of $H$, more precisely for $H \geq N^{3/4+\varepsilon}$. Using spectral methods, Blomer [2] proves an asymptotic formula for (a smooth version of) $\sum_{h \leq H} \sum_{n \leq N} a(n)d(n+h)d(n-h)$, where $H \geq N^{1/3+\varepsilon}$ and where $a$ is an arbitrary complex-valued arithmetic function. Finally, in work improving on many previous results (see the references in [24]), Matomäki, Radziwiłł and Tao [24] recently proved asymptotic formulae for expressions of the form $\sum_{X < n \leq 2X} f(n)g(n+h)$ valid for almost all $|h| \leq H$ with $H \leq X^{1-\varepsilon}$ and where each of $f$ and $g$ can be a higher divisor function $d_k$, $k \geq 2$, or the von Mangoldt function.†
To describe the known asymptotic results for (1.1) that apply to larger classes of multiplicative functions, we recall that a bounded multiplicative function $h : \mathbb{N} \to \mathbb{C}$ is said to be pretentious if there exist $t_h \in \mathbb{R}$ and a Dirichlet character $\chi_h$ such that
$$\sum_{p} \frac{1 - \Re\big(h(p)\overline{\chi_h(p)}\,p^{-it_h}\big)}{p} < \infty. \tag{1.2}$$
Frantzikinakis and Host [9] obtained asymptotic results for (1.1) with $K = (0, 1]^s$ that apply to all bounded complex-valued multiplicative functions. In the case where their asymptotic formula for (1.1) carries a genuine main term, all multiplicative functions $h_i$ have the property that $|N^{-1} \sum_{n \leq N} h_i(n)| \gg 1$. By Halász's theorem [16], the latter condition implies that all $h_i$ are pretentious. Klurman [19] recently succeeded in asymptotically evaluating correlations of the form $\sum_{n \leq N} h_1(P_1(n)) \cdots h_r(P_r(n))$ for bounded pretentious multiplicative functions $h_1, \dots, h_r$ and for arbitrary polynomials $P_1, \dots, P_r \in \mathbb{Z}[X]$. In [20], Klurman and Mangerel obtain explicit main and error terms for the asymptotic result from [9].
† As the authors point out in [24], their methods do in fact apply to functions $f$ and $g$ stemming from a larger class of multiplicative functions.
Our aim here is to leave the pretentious setting and prove a general asymptotic result for (1.1) that applies to unbounded multiplicative functions such as $d$ or $r$, as well as to functions with small mean values. To illustrate the latter goal, let us mention two examples of functions that the result applies to. Firstly, the result applies, yielding a genuine main term, when we let each $h_i$ in (1.1) be a function of the form $h_i : n \mapsto \delta^{\omega(n)}$, where $\delta \in (0, 1)$ and $\omega(n) = \#\{p : p \mid n\}$. Such a function $h_i$ has a small mean value in the sense that $N^{-1} \sum_{n \leq N} h_i(n) \asymp (\log N)^{-1+\delta} = o(1)$, and the main term in our result will carry the correct logarithmic factor to capture this behaviour. A second example of an admissible function of small mean is $h_i : n \mapsto b(n)$, where $b$ denotes the characteristic function of sums of two squares. In this case it follows from Landau's work [22] that $N^{-1} \sum_{n \leq N} h_i(n) \sim C (\log N)^{-1/2}$ for a positive constant $C$. To handle functions with small mean values, we prove an asymptotic formula with an error term that, instead of being $o(1)$, reflects the order of the mean values $N^{-1} \sum_{n \leq N} |h_i(n)|$ for $1 \leq i \leq r$. This allows us to obtain asymptotic results with a genuine main term in, amongst others, the above two examples. We proceed by introducing the precise classes of multiplicative functions our main result will apply to.
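The two small-mean examples above can be explored numerically. The following is a minimal Python sketch, not part of the paper's argument; the helper names `omega_sieve`, `mean_delta_omega` and `sums_of_two_squares` are hypothetical. It tabulates ω(n) with a sieve and compares the empirical mean of δ^ω(n) with (log N)^(δ−1), and the density of sums of two squares with (log N)^(−1/2). The ratios are only expected to remain bounded, and convergence is slow at these small ranges.

```python
import math


def omega_sieve(limit):
    """Return a list w with w[n] = number of distinct primes dividing n."""
    w = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if w[p] == 0:  # no smaller prime divides p, so p is prime
            for m in range(p, limit + 1, p):
                w[m] += 1
    return w


def mean_delta_omega(limit, delta):
    """Empirical mean (1/N) * sum_{n <= N} delta**omega(n) with N = limit."""
    w = omega_sieve(limit)
    return sum(delta ** w[n] for n in range(1, limit + 1)) / limit


def sums_of_two_squares(limit):
    """Return b with b[n] = 1 if n = x^2 + y^2 for integers x, y, else 0."""
    b = [0] * (limit + 1)
    x = 0
    while x * x <= limit:
        y = x
        while x * x + y * y <= limit:
            b[x * x + y * y] = 1
            y += 1
        x += 1
    return b


if __name__ == "__main__":
    delta = 0.5  # illustrative value in (0, 1)
    for N in (10**3, 10**4, 10**5):
        b = sums_of_two_squares(N)
        mean_b = sum(b[1:]) / N
        ratio_omega = mean_delta_omega(N, delta) / math.log(N) ** (delta - 1)
        ratio_b = mean_b / math.log(N) ** (-0.5)
        print(N, round(ratio_omega, 4), round(ratio_b, 4))
```

Both ratios compare an empirical mean with the predicted logarithmic factor only; the constants implicit in the asymptotics are not modelled here.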

The classes F and F * of multiplicative functions
Throughout this paper we write for any $q, a \in \mathbb{Z}$, $q \neq 0$, and $x \geq 1$. The first two conditions in the following definition should be compared to the conditions appearing in the results by Shiu [31], Nair [29] and Nair-Tenenbaum [30] mentioned in the preceding discussion on upper bounds.
holds for all $x \geq 2$ and $x' \in (x(\log x)^{-C}, x)$, and for all progressions $A \pmod{q}$ with $\gcd(q, A) = 1$ and with a modulus $q \in [1, (\log x)^C)$ that is divisible by the primorial $\prod_{p < \log\log x} p$.
Together with $\mathcal{F}$, we consider the following slightly larger class $\mathcal{F}^* \supset \mathcal{F}$ that contains $n^{it}$-twists of elements of $\mathcal{F}$. In other words, condition (iv) only needs to hold for an $n^{it}$-twist of $f$, but not necessarily for $f$ itself.
Definition 1.2. Let F * denote the class of multiplicative functions h : N → C that satisfy conditions (i)-(iii) of Definition 1.1, as well as the following variant of condition (iv).
(iv) For every constant $C > 0$ there exists a function $\varphi = \varphi_C : \mathbb{R}_{>0} \to \mathbb{R}_{\geq 0}$ with $\varphi(x) \to 0$ as $x \to \infty$ and, given $C > 0$ and $x > 1$, there exists $t_x \in \mathbb{R}$ with $|t_x| \leq 2 \log x$ such that the function $h^* = h^*_x : n \mapsto h(n)n^{-it_x}$ satisfies the estimate: for all $x' \in (x(\log x)^{-C}, x)$ and all progressions $A \pmod{q}$ with $\gcd(q, A) = 1$ and with a modulus $q \in [1, (\log x)^C)$ that is divisible by the primorial $\prod_{p < \log\log x} p$.
The classes $\mathcal{F}$ and $\mathcal{F}^*$ of functions are closely related to the classes $\mathcal{F}_H$ and $\mathcal{F}_{H,n^{it}}$ studied in [28]. Here, we impose the additional assumption (ii), which states that the growth of the function $h$ is bounded like that of the divisor function. Of the four conditions above, the last one is perhaps the least intuitive and the most difficult to check in an application. Condition (iv) and its variant from Definition 1.2 have been studied in detail in [28, § 4], where we prove several sufficient conditions for them. These ought to be significantly easier to check in many applications, as they either involve only the values of $h$ at primes or only require bounding the correlation of $h$ with certain Dirichlet characters. To conclude our discussion of the function classes relevant to this paper, let us record some explicit examples of functions satisfying the abstract set of rules defining $\mathcal{F}$:
(1) the general divisor functions $d_k(n) = 1 * \cdots * 1(n) = 1^{*k}(n)$ for $k \geq 2$;
(2) the function $\frac{1}{4} r(n) = \frac{1}{4} \#\{(x, y) \in \mathbb{Z}^2 : x^2 + y^2 = n\}$ which, up to the factor $\frac{1}{4}$, counts representations as a sum of two squares;
(3) the characteristic function $b(n)$ of the set of sums of two squares;
(4) the function $n \mapsto \delta^{\omega(n)}$ for $\delta \in (0, 2)$, where $\omega(n) = \#\{p \text{ prime} : p \mid n\}$;
(5) the function $n \mapsto |\lambda_f(n)|$, where $\lambda_f(n)$ describes the normalised Fourier coefficients of a primitive holomorphic cusp form.
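As a small illustration of example (1), the general divisor functions can be tabulated as iterated Dirichlet convolutions of the constant function 1. This Python sketch uses hypothetical helper names and is not taken from the paper; it also checks multiplicativity on a finite range, which is of course only a sanity check and not a proof of membership in the class.

```python
from math import gcd


def dirichlet_convolve(f, g, limit):
    """Tabulate (f * g)(n) = sum_{d | n} f(d) g(n / d) for 1 <= n <= limit.

    f and g are lists indexed from 0; index 0 is unused.
    """
    out = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(d, limit + 1, d):
            out[m] += f[d] * g[m // d]
    return out


def general_divisor_function(k, limit):
    """Tabulate d_k = 1 * ... * 1 (k factors) up to limit."""
    one = [0] + [1] * limit
    d = one
    for _ in range(k - 1):
        d = dirichlet_convolve(d, one, limit)
    return d


def is_multiplicative(h, limit):
    """Check h(1) = 1 and h(mn) = h(m) h(n) for coprime m, n with mn <= limit."""
    if h[1] != 1:
        return False
    for m in range(1, limit + 1):
        for n in range(1, limit // m + 1):
            if gcd(m, n) == 1 and h[m * n] != h[m] * h[n]:
                return False
    return True


if __name__ == "__main__":
    d3 = general_divisor_function(3, 100)
    print(d3[1:13], is_multiplicative(d3, 100))
```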

Main result and underlying method of proof
The main result of this paper (Theorem 2.4) establishes an asymptotic formula for (1.1) under the assumption that $h_1, \dots, h_r$ all belong to $\mathcal{F}^*$. The proof of this result proceeds via Green and Tao's nilpotent Hardy-Littlewood method (see [13]), which consists of two main parts. One of them requires us to establish that a 'W-tricked' version of any $h \in \mathcal{F}^*$ is orthogonal to arbitrary nilsequences. The other part amounts to showing that this W-tricked version of $h$ has a majorant function which is pseudo-random in the sense of [13] and of the 'correct' average order in a sense we will specify later. The first task, namely that of finding a suitable W-trick and obtaining non-trivial estimates for the correlation of the W-tricked version of $h$ with nilsequences, has been carried out in [28] for all $h \in \mathcal{F}^*$. The second task, namely the construction of correct-order pseudo-random majorants, will be the main focus of this work. The fact that we are seeking a majorant function for $h$ means that our work is closely related to the sequence of papers studying upper bounds on expressions of the form (1.1) that were referred to at the very start of this introduction.

Overview
This paper is organised as follows. In Section 2, we give the precise statements of our main result and an easier special case of it. Section 3 describes a reformulation of the main result in terms of short character sums, allowing one in special cases to deduce local-global principles.
As an application, we recover a result of Frantzikinakis-Host [9] concerning the case where in (1.1) all the arithmetic functions $h_1, \dots, h_r$ are 'pretentious'. Sections 4 and 5 are concerned with the more technical parts of the proof of the main result: Section 4 (conditionally) proves a 'W-tricked' version of the main result, in which each of the multiplicative functions $h_j$ from (2.3) is replaced by a function of the form $n \mapsto h_j(W_j n + A_j)$ for a suitable integer $W_j$ and a reduced residue $A_j \pmod{W_j}$. To prove this result, we provisionally assume the existence of families of pseudo-random majorants for the W-tricked functions $n \mapsto h_j(W_j n + A_j)$. In Section 5, we then deduce the main theorem from its just established W-tricked version. Sections 6-10 are independent of all preceding sections, apart from the definitions made in this introduction, and contain the main new input of this paper, namely the construction of the required families of pseudo-random majorants for all (W-tricked versions of the) multiplicative functions from $\mathcal{F}^*$. The construction itself takes place in Sections 6-8. Here, the main difficulty lies in establishing suitable majorant functions of the correct average order for bounded multiplicative functions, see Section 8. In Section 9 we recall and introduce all relevant concepts around the notion of families of pseudo-random majorants. The task of checking that the constructed majorant functions do indeed give rise to a family of pseudo-random majorants is carried out in Sections 9 and 10. Section 11, finally, contains the proofs of the results from Section 3, and Section 12 discusses the application of our main result to the arithmetic functions $h_j(n) = |\lambda_{f_j}(n)|$, where $\lambda_{f_j}(n)$ denotes the normalised Fourier coefficients of a primitive holomorphic cusp form $f_j$.

Statement of main result
This section contains the precise statement of our main result, which we present after that of an easier, but important, special case.
Let $w : \mathbb{R}_{>0} \to \mathbb{R}_{>0}$ be any function such that $\frac{\log\log x}{\log\log\log x} < w(x) \leq \log\log x$ for all sufficiently large $x$, and define $W(x) = \prod_{p \leq w(x)} p$. The asymptotic formula below features an integer multiple $\widetilde{W}(N)$ of $W(N)$ with the property that for each $h_j \in \mathcal{F}^*$ appearing in the statement, the mean value $S_{h_j}(N; q\widetilde{W}(N), A)$ in progressions $A \pmod{q\widetilde{W}(N)}$ shows a certain amount of regularity as $1 \leq q \leq (\log N)^E$ varies over small integers and $A$ varies over reduced residues, that is, $\gcd(A, q\widetilde{W}(N)) = 1$. The type of regularity we require is that $S_{h_j}(N; q\widetilde{W}(N), A) \sim S_{h_j}(N; \widetilde{W}(N), A)$ for $A$ and $q$ as above. The existence of such values of $\widetilde{W}(N)$ was established in [28], and any function of the form $n \mapsto h_j(\widetilde{W}(N)n + A)$ for $A$ with $\gcd(A, \widetilde{W}(N)) = 1$ will be referred to as a W-tricked version of $h_j$. By working with such W-tricked versions of each $h_j$, one removes the potentially irregular contribution from small primes (that is, primes dividing $\widetilde{W}(N)$), the contribution of which can then be handled separately.
We begin by stating the special case of the main result where $h_1, \dots, h_r \in \mathcal{F}$. The restriction to $\mathcal{F}$ simplifies the asymptotic formula significantly. This version of the result applies to, amongst others, all the examples mentioned at the end of Section 1.1.
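To make the W-trick concrete, the sketch below computes the primorial W(x), the product of all primes p ≤ w(x), and builds a W-tricked function n ↦ h(Wn + A). It is an illustration only: the helper names are hypothetical, the choice w(x) = log log x is just one admissible option, and the sketch uses the plain primorial rather than the specific integer multiple of it whose existence is established in [28].

```python
import math


def primes_up_to(y):
    """All primes p <= y by trial division (adequate for the tiny y used here)."""
    primes = []
    for n in range(2, int(math.floor(y)) + 1):
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
    return primes


def primorial(y):
    """Product of all primes p <= y (equal to 1 if there are none)."""
    return math.prod(primes_up_to(y))


def W_of(x):
    """W(x) for the assumed choice w(x) = log log x."""
    return primorial(math.log(math.log(x)))


def reduced_residues(q):
    """Representatives A in [1, q] with gcd(A, q) = 1."""
    return [a for a in range(1, q + 1) if math.gcd(a, q) == 1]


def w_tricked(h, W, A):
    """Return the function n -> h(W * n + A) for a reduced residue A mod W."""
    if math.gcd(A, W) != 1:
        raise ValueError("A must be coprime to W")
    return lambda n: h(W * n + A)


if __name__ == "__main__":
    W = W_of(10**1000)  # log log x is so small that W stays tiny even here
    print(W, reduced_residues(W)[:8])
```

The extremely slow growth of log log x is the reason why W(N) is bounded by a power of log N in the range of interest.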

1)
where $\widetilde{W} = \widetilde{W}(N)$, $w = \mathrm{lcm}(w_1, \dots, w_r)$ and (2.2)
Remark 2.2 (On removing the ε). The dependence on $\varepsilon$ is an artefact of the generality of the result. In cases where the right-hand side of (2.1) can be reformulated as a closed expression that is independent of $\widetilde{W}$ and $B_2$, this dependence can be removed. Such a reformulation always exists if for each $j \in \{1, \dots, r\}$ the set of primitive characters $\chi$ for which $x^{-1} |\sum_{n \leq x} h_j(n)\chi(n)|$ is (close to) maximal does not depend on $x$ as soon as $x$ is sufficiently large. (This assumption allows one to replace $S_{h_j}(N, \widetilde{W}, A_j)$ by a short character sum involving a fixed set of characters.) We will describe this set-up in Theorem 3.5.
Remark 2.3 (On the parameter T). In applications it is often essential that one can vary the cut-off parameter slightly while preserving the shape of the main term as well as uniformity in the error term. For this reason we introduced a second cut-off parameter $T$ that is closely related to the value of $N$ which determines $\widetilde{W}(N)$.
The full version of our main result extends the class of admissible functions to $\mathcal{F}^*$. Compared with the statement of Theorem 2.1, part (i) from the assumptions and the asymptotic formula (2.1) itself have to be adjusted. In order to simplify the asymptotic formula, we also slightly change the assumptions in part (ii). This leads to the following statement.
Theorem 2.4 (Main theorem). Let $N > 1$ be an integer parameter, let $r, s, L \geq 2$ and $B_0 \geq 0$ be integers, let $\delta \in (0, 1)$ be a parameter, fix a value of $c \in (0, 1)$ and let $C > 1$ be a constant. Suppose further that we are given the following data.
(2) $\widetilde{W}$ satisfies the bound $\widetilde{W}(x) \leq (\log x)^{B_1}$ for all $x$; and (3) as $N \to \infty$, the following asymptotic formula holds uniformly for all $\varphi_1, \dots, \varphi_r$ as above and all $T \in [N(\log N)^{-B_0}, N]$, provided $C$ (see assumption (i)) is sufficiently large with respect to $B_0$, $H$, $r$, $s$, $L$, $\alpha$ and $\delta$:

2.4)
The main term in (2.3) can be subsumed in the error term as soon as for one $j \in \{1, \dots, r\}$ the sequence of functions $h^*_j = h^*_{j,N} : n \mapsto h_j(n)n^{-it_{j,N}}$ satisfies $|S_{h^*_{j,N}}(N)| = o_{N \to \infty}(S_{|h_j|}(N))$.

A short character sum version of the main theorem and corollaries
This section describes a reformulation of Theorem 2.4 in terms of short character sums, allowing one in special cases to reinterpret the main term in the asymptotic formula as a product over local factors. The starting point for such a reformulation lies in the observation that for many multiplicative functions of interest, the asymptotic behaviour of is determined by only finitely many characters of bounded conductor, allowing for the main term of the asymptotic formula (2.3) to be simplified. Before we consider this problem in general, let us state the special case which assumes that S h * (T ; W , A) is determined by the trivial character modulo W for each function h = h j appearing in the correlation.
|q, all reduced residues A (mod q) and where χ 0 denotes the trivial character modulo q. Assuming, in addition, that all the assumptions from Theorem 2.4 hold, we have the following asymptotic formula: and where
As an example of a function for which (3.1) is not determined by $\chi_0$ but nonetheless by finitely many characters, we may consider the function $h(n) = \frac{1}{4} r(n) = (1 * \chi_{-1})(n)$, where $\chi_{-1}$ denotes the non-principal character modulo 4 and where $r$ is the function that counts representations by sums of two squares.
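The classical identity behind this example, namely that r(n)/4 equals the sum of the non-principal character modulo 4 over the divisors of n, can be checked by brute force on a finite range. The function names in this Python sketch are hypothetical and the check is purely illustrative.

```python
import math


def chi_mod4(n):
    """Non-principal Dirichlet character modulo 4: 1, 0, -1 at n = 1, 2, 3 mod 4."""
    return (0, 1, 0, -1)[n % 4]


def r_count(n):
    """Number of pairs (x, y) of integers with x^2 + y^2 = n, for n >= 1."""
    count = 0
    bound = math.isqrt(n)
    for x in range(-bound, bound + 1):
        rest = n - x * x
        y = math.isqrt(rest)
        if y * y == rest:
            # y = 0 gives one solution for this x, otherwise y and -y both work
            count += 1 if y == 0 else 2
    return count


def one_conv_chi(n):
    """Dirichlet convolution (1 * chi)(n) = sum of chi(d) over divisors d of n."""
    return sum(chi_mod4(d) for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    for n in (1, 2, 5, 25, 65):
        print(n, r_count(n), 4 * one_conv_chi(n))
```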
In situations where (3.1) is determined by only finitely many characters, our main theorem can be reformulated in terms of short character sums, and Theorem 3.5 is such a reformulation. The fact that the character sum can be truncated relies on the following consequence of the repulsion of characters phenomenon described in [1] and refined in [10,11]. Define for any given value of N > 1 the set Remark 3.3. Note that the statement above simplifies if h is such that the sets E (x, C) are independent of x as soon as x is sufficiently large. In this case E N is just given by the fixed set E (x, C) = {χ 1 , . . . , χ k } of the first k characters in the sequence for any C and any sufficiently large x.
The finite set of characters picked out by the proposition above may still be larger than strictly necessary in order for (3.3) to hold, and results by Elliott [6], Tenenbaum [32] or Mangerel [23] relating the mean value of a multiplicative function h to that of |h| can be used to further restrict the set E N (q). To illustrate the character of these comparison results, we include the following qualitative lemma, which is a straightforward consequence of Elliott [6,Theorems 2 and 4]. In fact, the first part of this lemma follows already from earlier work of Elliott and Kish [7, Lemma 21].

Lemma 3.4 (Elliott, Elliott-Kish). Suppose h is a multiplicative function that satisfies conditions (i) and (iii) from Definition 1.1 and p H k
More precisely, if there exists t as above, then Under stronger assumptions on the behaviour of y1<p y2 for certain ranges of y 1 , y 2 , Tenenbaum's [32, Theorem 1.3] yields an asymptotic formula for S h (x) (or S hχ (x)) in terms of S |h| (x) with explicit error terms. This result allows one to consider twists of h by characters of conductor depending on x and can therefore be used to check the conditions of the following result in suitable applications.
for all y ∈ [N 1/2 , N] and for every χ * ∈ E * j induced from some χ ∈ E + j,N , and for all y ∈ [N 1/2 , N] and for every χ * ∈ E * j induced from some χ ∈ E j,N \ E + j,N . Then (i) as N → ∞, the following asymptotic formula holds for T ∈ [N (log N ) −B0 , N]: where χ j = p χ j,p denotes the decomposition of χ j (mod q j ) into characters modulo p vp(qj ) , and where, given any Dirichlet character χ, we let χ denote the completely multiplicative function defined via ] be a cut-off parameter that is sufficiently large in terms of r, H and the bound L on the coefficients of ψ 1 , . . . , ψ r and let Q denote product of all primes p < B and of the conductors of the characters in E + 1,N , . . . , E + r,N ; then, for all p Q, and, writing β Q (χ 1 , . . . , χ r ) = p|Q β p (χ 1 , . . . , χ r ), we have and χ ∈ E j,N \ E + j,N in many explicit applications.
In those cases where the local factors $\beta_p(\chi_1, \dots, \chi_r)$ are in fact independent of the characters $\chi_1, \dots, \chi_r$, or when, for instance, $\mathcal{E}_{j,N}$ is independent of $N$ and $\#\mathcal{E}^+_{j,N} = 1$ for all $j \in \{1, \dots, r\}$, the main term in the previous theorem becomes a product of local factors, allowing one to prove a local-to-global principle. The latter condition holds, for example, when $h_j$ is $\chi_j(n)n^{it_j}$-pretentious, that is, when $h_j$ is bounded and (1.2) holds for some character $\chi_j$ and some $t_j \in \mathbb{R}$. Asymptotic results for correlations of bounded pretentious multiplicative functions were first proved by Frantzikinakis and Host in [9, Theorem 1.1] and re-proved with explicit main and error terms by Klurman and Mangerel in [20]. As a corollary to our main result, we obtain the following version of the pretentious case of these results.
while the local factors β p (χ 1 , . . . , χ r ) are as in Theorem 3.5 and the factors β p are given by While the present paper is mainly concerned with the construction of correct-order pseudorandom majorants, no such construction is required in the case of the above corollary, and more generally in the setting of Frantzikinakis and Host's work [9]. The reason for this lies in the fact that the trivial majorant given by the all-one function 1 is a pseudo-random majorant of the correct average order for every pretentious multiplicative function. Working with the all-one function 1 as a majorant leads to an error term of the form . To be precise, the average order of the majorant appears as a factor in the error term. This is why a majorant of correct average order is required in order to capture the behaviour of functions with small mean values in this asymptotic formula.

A W -tricked version of the main theorem
In this section we prove, assuming the existence of suitable majorant functions, a special case of the main theorem. The main theorem itself will be deduced from this special case in Section 5, while most of the remaining sections of this paper will be concerned with the construction of majorant functions and the verification of the required properties: in Sections 6-8 we construct the majorants, in Section 9 we recall and introduce all relevant concepts around the notion of families of pseudo-random majorants, and in Sections 9 and 10 we check that the constructed majorant functions do in fact give rise to families of pseudo-random majorants. For reference purposes, we summarise the results of Sections 6-10 in the statement below, emphasising that all terms are properly introduced in Section 9.
The following special case of Theorem 2.4 works in sub-progressions whose common difference is an integer divisible by the primorial W (N ). This procedure removes potential irregularities in the behaviour of the multiplicative functions h j ∈ F * that occur when working in progressions to small moduli. Then there exist positive constants c(r, δ) and such that the following holds, provided C was sufficiently large with respect to H, α, r and c(r, δ). Let and all ϕ 1 , . . . , ϕ r , W 1 , . . . W r and A 1 , . . . , A r as above, and where κ is a function that satisfies κ(δ) → 0 as δ → 0.
. . , h r are such that W (N ) and the t j are independent of δ and C, then the term κ(δ) in the error term above can be omitted. This is for instance the case when for every j ∈ {1, . . . , r}, both h j ∈ F and when the set of primitive characters χ for which |x −1 n x h j (n)χ(n)| is maximal does not depend on x as soon as x is sufficiently large.
Observe that Proposition 4.2 is a statement about a family of W-tricked versions of the functions $h_1, \dots, h_r \in \mathcal{F}^*$. To prove this result, we will begin by introducing a suitable choice of $\widetilde{W}(N)$, the existence of which is claimed in the statement. Our choice of $\widetilde{W}(N)$ will arise from an application of [28, Proposition 5.1]. Once we have set up this application and defined $\widetilde{W}(N)$, we will show that the main result of [28] can in fact be applied to obtain bounds on the correlation with nilsequences that apply uniformly to all W-tricked versions of the $h_j$ that arise in the statement of the proposition. This in turn will allow us to apply the machinery from [13,15] to deduce the proposition.
Proof. Let W (N ) denote the integer produced by [ holds uniformly for all intervals Let G/Γ be a nilmanifold together with a filtration G • of G of degree d and let g ∈ poly(Z, G • ) a polynomial sequence. Suppose that G/Γ has a M 0 -rational Mal'cev basis adapted to G • for some M 0 2 and let G/Γ be equipped with the metric defined by this basis. Let F : G/Γ → C be a 1-bounded Lipschitz function. Then [28, Theorem 6.1] implies that, provided E 1 is sufficiently large with respect to d, the dimension of G, α and H, we have The bound (4.3) applies in particular to all W j appearing in the statement of Proposition 4 , defined on the integers 1 n T /W j , is orthogonal to nilsequences provided E, and hence B 2 and C, are sufficiently large with respect to d, dim G, α and H. By (4.2), it follows moreover that where W = W (N ) and where E hj (N ; W ) is as in (1.3). Thus, the function , defined on 1 n T /W j , is also orthogonal to nilsequences provided E, and hence B 2 and C, are sufficiently large with respect to d, dim G, α and H. Consider the normalisation of h j given byh (N ; W (N )) .
Mertens' estimate and property (1) is pseudo-random. We now seek to apply the (transferred) inverse theorem for uniformity norms from [15] (see [13, Proposition 10.1; 15, Conjecture 1.2 and Theorem 1.3]). Note that this statement only involves linear sequences, that is, their degree is equal to the step of the nilmanifold involved. If $\mathcal{M}_{r-2,\delta}$ denotes the finite set of $(r-2)$-step nilmanifolds from the statement of the inverse theorem, let $c(r, \delta)$ be the maximum dimension of the elements of this set. Then (4.3) applies to all linear sequences associated to manifolds in $\mathcal{M}_{r-2,\delta}$ provided $E$ is sufficiently large with respect to the step $r - 2$ and $c(r, \delta)$. Thus, as soon as $N$ is sufficiently large, it follows from the bound (4.3), from Theorem 4.1 (cf. Theorem 9.2) and from the inverse theorem for uniformity norms from [15] that this normalisation of $h_j$ satisfies where, following [13], $\kappa(\delta) \to 0$ as $\delta \to 0$, and where we used the notation $\mathbb{E}_{s \in S} = \frac{1}{\#S} \sum_{s \in S}$ for finite sets $S$.
In the case where h 1 , . . . , h r ∈ F , that is, where t 1 = · · · = t r = 0, the above estimate concludes the proof of the proposition, which could have been simplified in many places.
To further simplify the main term of (4.4) in the remaining case where not all t j vanish, our next aim is to show that the summation argument (W j ϕ j (n) + A j ) itj can be replaced by (W j ψ j (n)) itj . More precisely, using the fact that n itj varies slowly, we will show that, outside an exceptional set, (W j ϕ j (n) + A j ) itj can approximated by (W j ψ j (n)) itj . We begin with the exceptional set. Recall that the linear form ψ j = ϕ j − a j has bounded coefficients and note that for all 0 < T < T we have Thus, if K * := {x ∈ K : |ψ i (x)| > (T /W ) c * for all 1 i r} for some c * ∈ (0, 1), then We will use the fact that |ψ j (n)| > (T /W ) c * is large for all 1 j r and n ∈ (T /W )K * to show that we may approximate (W j ϕ j (n) + A j ) itj by (W j ψ j (n)) itj . To do so, recall that |a j (N )| N c = (T /W ) c+o (1) for some c ∈ (0, 1) and all 1 j r and N 1, and set Putting everything together and recalling that the ψ j are linear forms, we obtain Multiplying through by 1 j r S h * j (N, W , A j ), which is bounded by 1 j r E hj (N ; W ), Proposition 4.2 now follows from the combination of (4.4) and (4.5).
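The slow variation of n ↦ n^{it} used in this argument rests on the elementary inequality |(m+a)^{it} − m^{it}| ≤ |t| · |log(1 + a/m)|, valid for real t and positive m and m + a, which follows from |e^{iθ} − 1| ≤ |θ|. The Python sketch below, with hypothetical function names and illustrative parameter values, checks this inequality numerically. It is not the paper's estimate (4.5), only the underlying mechanism.

```python
import cmath
import math


def n_it(n, t):
    """The value n^{it} = exp(i * t * log n) for n > 0 and real t."""
    return cmath.exp(1j * t * math.log(n))


def twist_gap(m, a, t):
    """Distance |(m + a)^{it} - m^{it}| between the twists at m + a and at m."""
    return abs(n_it(m + a, t) - n_it(m, t))


def gap_bound(m, a, t):
    """Elementary upper bound |t| * |log(1 + a / m)| for twist_gap(m, a, t)."""
    return abs(t) * abs(math.log1p(a / m))


if __name__ == "__main__":
    # A shift a that is small compared with m barely changes the twist.
    for m, a, t in [(10**6, 10**3, 5.0), (10**8, 10**3, 5.0), (10**8, 10**3, 30.0)]:
        print(m, a, t, twist_gap(m, a, t), gap_bound(m, a, t))
```

The bound is small as soon as |t| · |a| is small compared with m, which matches the role of the lower bound on |ψ_j(n)| outside the exceptional set.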

Proof of the main result from its special case
In this section we deduce Theorem 2.4 from the W -tricked version presented in Proposition 4.2, that is to say, assuming the results about pseudo-random majorants we summarised in Theorem 4.1. Since Proposition 4.2 assumes that the quantities W i , W (N ) as well as N/T are all bounded above by a fixed power of log N (an assumption that is essential for the application of the results from [28]), the proof of Theorem 2.4 will require us to truncate certain summations. For this purpose we introduce the following exceptional set.
provided D is sufficiently large with respect to α, r, s, L and H.
Proof. To start with, we have The following simple observation will allow us to discard most cases in which, in the proof of Theorem 2.4, Proposition 4.2 would need to be applied with W j > (log T ) D for some j ∈ {1, . . . , r}.   N ). We aim to prove this using the strategy from the proof of Lemma 5.2 and therefore require a bound on

Lemma 5.4 (Exceptional set II). Under the assumptions of Theorem 2.4, we have
Let M 2 denote the number of possible values that m 2 can take and let us begin by bounding this quantity. Since W (N ) (log N ) B1 , the total number of primes p > w(N ) dividing W (N ) is at most (B 1 log log N )/(log w(N )). Since m 2 m (log N ) 3B1 , this implies Ω(m 2 ) 3B1 log log N log w(N ) . Taking further into account that w(N ) (log log N )/ log log log N , we deduce that where log (k) N = log log · · · log N denotes the k-fold logarithm of N . For all sufficiently large N we therefore obtain By combining this bound with [4, Lemma 7.9] or (5.3), it follows by Cauchy-Schwarz and the assumptions on D that The lemma follows from the above and (5.6).
Returning to the deduction of Theorem 2.4, observe that Lemma 5.4 implies that the expression from Theorem 2.4 satisfies: where gcd(n, m ∞ ) := p|m p vp(n) . The following definition will simplify the notation as well as making changes of variables. Taking into account Definition 5.5 as well as the decomposition we claim that Lemma 5.6. The main term from (5.7) equals w1,...,wr ∈W (T ) (U1,...,Ur) ∈U (w1,...,wr) and it replaces the summation range T K by which is a translate of the region K * by a vector with coordinates all bounded by 1. Let Since v/(w W ) has bounded length, K * Δ is contained in a neighbourhood of bounded diameter of the boundary of K * . Since K * is convex, this implies that vol(K * Δ ) = O((T /(w W )) s−1 ). Hence, the error term incurred by replacing K * − v/(w W ) by K * is bounded by

which by Cauchy-Schwarz and (5.3) is bounded by
To sum this bound over all (U 1 , . . . , U r ) ∈ U (w 1 , . . . , w r ) and all w 1 , . . . , w r ∈ W (N ), note that the outer summation over w 1 , . . . , w r ∈ W (N ) has at most (log N ) B2r terms since w i (log N ) B2 for every w i ∈ W (N ). In particular, w = lcm(w 1 , . . . , w r ) (log N ) B2r , which implies that the summation over U (w 1 , . . . , w r ) has (w W ) s (log N ) (B1+B2r)s terms. Combining these bounds with (5.4) we deduce that the total error term incurred by replacing N ) Or,s,H,B 0 ,B 1 ,B 2 (1 which is o N,T →∞ (vol(T K) i S |hi| (N ))). This completes the proof of the lemma.
We proceed by analysing the main term appearing in Lemma 5.6. Invoking the multiplicativity of the h i , this expression equals w1,...,wr Recalling the definition of U (w 1 , . . . , w r ), making the change of variables A j = U j /w j and noting that h j (w j )/w itj j = h * j (w j ), we deduce that (5.8) equals and where β ϕ (w 1 A 1 , . . . , w r A r ) is defined in (2.4).
Since the main term in the expression above agrees with the one in (2.3), it now remains to show that the error term in that expression is of the shape Recalling the definition of E hj (N, W (N )) from (1.3), this will follow, provided we can show that A1,...,Ar To this end, note that where L = max 1 i r { ϕ i , r, s} and where ϕ i denotes the maximum modulus of the coefficients of ϕ i . Let L 0 L 1 be such that the second alternative applies to p > L 0 and suppose further that L 0 > H. Since, given any k ∈ N 0 there are at most k r tuples (a 1 , . . . , a r ) with a max = k and taking into account that |h j (p k )| H k and h j (p k ) ε p εk , we deduce that (5.9) is bounded by This completes the proof of Theorem 2.4 and leaves us with the task of establishing Theorem 4.1, that is, the existence of families of pseudo-random majorants for elements of F * .

Majorants for multiplicative functions
The following three sections are devoted to the construction of suitable majorant functions for elements of the class F * . We will define in Section 9 what it means for a majorant function to be pseudo-random. Here, we will show that for every N ∈ N, every γ ∈ (0, 1/2) and every h ∈ F * , there exists a function ν h = ν (2) (Truncated divisor sum structure). Outside an exceptional set, we have Condition (2) is not a necessary condition. We will, however, make essential use of the truncated divisor sum structure in order to show in Section 9 that the majorants that we are about to construct are indeed pseudo-random. Condition (3) will be established in Section 9.1. This condition is important since, when applying the machinery from [13]  respectively. It is immediate that |h(n)| h (n) for all n ∈ N. The function h belongs to the class of functions for which pseudo-random majorants were already constructed in [4, § 7]. Our pseudo-random majorant for h will arise as a product of separate majorants for h and h . Thus, our main task in the present work is to construct general pseudo-random majorants for bounded multiplicative functions.

A majorant for h
Before we turn to the case of bounded multiplicative functions, let us record the known family of pseudo-random majorants ν (N ) : {1, . . . , N} → R 0 for h .
For this purpose, set g = μ * h and define for any γ ∈ (0, 1/2) and any sufficiently large N the truncation With the truncation (7.1) of h in place, [4,Proposition 7.6] shows that for any fixed value of γ ∈ (0, 1/2) we have for n N , where the exceptional set is defined as and where the majorant outside the exceptional set is given by where each set U (λ, κ) is a sparse subset of the integers up to N γ that is defined as follows.
Set ω(λ, κ) = γκ(λ+3−log 2 κ) if κ = 4/γ and λ = log 2 κ − 2, We note for later reference that u ∈ P whenever u ∈ U (λ, κ) and that p N 1/(log log N ) 3 for any prime divisor p|u of any u ∈ U (λ, κ) that appears in the definition of ν . An important technical property of the above majorant construction is that the set of integers u ∈ κ,λ U (λ, κ) is fairly sparse. In particular, one can show (cf. the computation at the end of [26, § 7]) that whenever f is a multiplicative function that is bounded at primes (for example, Moreover, the following stronger version holds: To prove (7.4), note that gcd(u 1 , u 2 ) = 1 for u i ∈ U (λ i , κ i ), i = 1, 2 unless λ 1 = λ 2 . Thus, (7.4) is bounded by and it suffices to show that for every 1 d j r, the respective factor above converges. To show this, let C > 1 be a constant such that |f (p)| C for all primes and let 1 d r. Then, the above factor satisfies Since ω(lcm(u 1 , . . . , u d )) max i ω(λ, κ i ) 1 , κ), the above is bounded by Recall that each interval I λ is of the form [y, y 2 ] with y > N 1/O((log log N ) 3 ) , that is, y → ∞ as N → ∞. Thus, the above is bounded by .

This expression, finally, is bounded by
which is seen to converge, since $\sum_{j \geq 1} A^j j^{-\alpha j} < \infty$ for all constants $A, \alpha > 0$. This completes the proof of (7.4).
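The convergence used in the last step can be sanity-checked numerically. The following is a minimal sketch and not part of the original argument: the function name `partial_sum` and the sample values A = 10, α = 1/2 are illustrative assumptions. It evaluates partial sums of $\sum_{j \geq 1} A^j j^{-\alpha j}$ through the logarithms of the terms, so that large intermediate powers do not overflow.

```python
import math


def partial_sum(A: float, alpha: float, J: int) -> float:
    """Return sum_{j=1}^{J} A^j * j^(-alpha*j), computed via logarithms."""
    total = 0.0
    for j in range(1, J + 1):
        # Logarithm of the j-th term: j * (log A - alpha * log j).
        log_term = j * (math.log(A) - alpha * math.log(j))
        total += math.exp(log_term)
    return total


if __name__ == "__main__":
    # The partial sums stabilise once j^alpha exceeds A.
    for J in (50, 100, 500, 1000):
        print(J, partial_sum(10.0, 0.5, J))
```

Since the j-th term equals $(A / j^{\alpha})^j$, the terms decay faster than any geometric sequence as soon as $j^{\alpha} > A$, which is what the stabilising partial sums reflect.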

Majorants for bounded multiplicative functions
Any non-negative bounded multiplicative function h has the property that whenever m|n is a divisor that is coprime to its cofactor n/m, then h (n) h (m). The key step in turning this simple observation into the construction of a pseudo-random majorant is to find a systematic way of assigning to any integer n a suitable divisor m. The main property this map must have is that the pre-image of any divisor m should be easily reconstructible, a property which will allow us to swap the order of summation in later computations. The following type of assignment already featured in Erdős's work [8] on the divisor function. Given a cut-off parameter N and an integer n ∈ [N γ , N], let D γ (n) denote the largest divisor of n that is of the form p<Q p vp(n) , (Q ∈ N) but does not exceed N γ . If m ∈ [1, N γ ) is an integer then its inverse image takes the form where P + (m), respectively, P − (m), denote the largest, respectively, the smallest, prime factor of an integer m. For our purpose it turns out to be of advantage to restrict attention to divisors where † P = {p : |h(p)| < 1}. † We note as an aside that by Lemma 3.4 the values of h at higher prime powers do not influence the asymptotic order of S h (N ) and need for this reason not be taken into account in the construction of ν .
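To make the assignment n ↦ D_γ(n) concrete, here is a small sketch under stated assumptions: the cut-off N^γ is passed as an integer `bound`, the name `initial_segment_divisor` is hypothetical, and the further restriction to divisors built from a prescribed set of primes is not implemented. The function multiplies the prime powers of n in increasing order of the primes and stops just before the product would exceed the bound.

```python
def initial_segment_divisor(n: int, bound: int) -> int:
    """Largest divisor of n of the form prod_{p < Q} p^{v_p(n)} that is <= bound."""
    if n < 1 or bound < 1:
        raise ValueError("n and bound must be positive integers")
    m = 1            # divisor built so far
    p = 2            # current trial divisor
    remaining = n    # part of n not yet factored
    while remaining > 1:
        if p * p > remaining:
            p = remaining  # what is left is a single prime
        if remaining % p == 0:
            prime_power = 1
            while remaining % p == 0:
                remaining //= p
                prime_power *= p
            if m * prime_power > bound:
                break  # including this prime would exceed the cut-off
            m *= prime_power
        else:
            p += 1
    return m


if __name__ == "__main__":
    # 1848 = 2^3 * 3 * 7 * 11; the admissible divisors are 1, 8, 24, 168, 1848.
    print(initial_segment_divisor(1848, 30))
```

Because the admissible divisors form a chain under divisibility, scanning the primes in increasing order and stopping at the first overshoot is enough to find the largest one below the cut-off.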
Thus, if n ∈ [1, N] is an integer that factorises as n = mq with m ∈ P and p ∈ P for all prime divisors p|q, then we set Our next aim is to show that a sufficiently smoothed version of the function D γ can be written as a truncated divisor sum. To detect whether a given divisor m|n is of the form p Q p vp(n) for some Q or, equivalently, whether q = n m has no prime factor p Q, we use a sieve majorant similar to the one considered in [13,Appendix D]. The two essential differences are that the parameter corresponding to Q cannot be fixed in our application and that the divisor sum will be restricted to elements of the set P . Thus, let σ : R × N → R 0 be defined as where χ : R → R 0 is a smooth function with support in [−1, 1] and the property χ(x) = 1 for x ∈ [− 1 2 , 1 2 ]. This yields a non-negative function with the property that σ (Q; q) = 1 if q is free from prime factors p ∈ P with p Q. Setting, for 1 n N , we obtain a (preliminary) majorant ν : N → R 0 for h . The first of two small modifications consists of inserting smooth cut-offs for m and Q k m, leading to the majorant function ν : where λ is a smooth cut-off of the interval [1,2] which is supported in [1/2, 4] and takes the value 1 on the interval [1,2]. To carry out the second simplification, we exclude a sparse exceptional set related to the one from Section 7. If Q N γ/(log log N ) 3 , then an integer n < N certainly belongs to the exceptional set S from Section 7 if it has a divisor of the form 4 and if N is sufficiently large, then Q 2 > (log N ) C1 so that any multiple of Q 2 again belongs to S . Thus, by defining that is, that S is indeed an exceptional set.
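For illustration, one standard way to obtain a smooth cut-off χ with support in [−1, 1] and χ(x) = 1 on [−1/2, 1/2] is sketched below. This particular construction (based on the function exp(−1/t)) is an assumption of the sketch and need not be the choice made in the paper. The helper names are hypothetical.

```python
import math


def _flat(t: float) -> float:
    """Smooth function that vanishes for t <= 0 and is positive for t > 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def smooth_step(t: float) -> float:
    """Smooth transition: equals 0 for t <= 0 and 1 for t >= 1."""
    # The denominator is never zero because t > 0 or 1 - t > 0 always holds.
    return _flat(t) / (_flat(t) + _flat(1.0 - t))


def chi(x: float) -> float:
    """Smooth cut-off: 1 on [-1/2, 1/2], 0 outside (-1, 1), values in [0, 1]."""
    return 1.0 - smooth_step(2.0 * abs(x) - 1.0)


if __name__ == "__main__":
    for x in (0.0, 0.5, 0.75, 1.0, 2.0):
        print(x, chi(x))
```

The same recipe, applied on a logarithmic scale and with different break points, would give a cut-off of the kind described for λ.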

The product ν ν is pseudo-random
The aim of this section is to show that the majorant functions constructed in the previous section are pseudo-random, much in the sense of [13, § 6]. However, since we are working with a larger class of functions, several adjustments to this notion are required and we include for this reason a complete definition. (1) (Normalisation). We have (2) (Majorisation condition). There exists an absolute constant C > 0 and weights E for all τ ∈ T , all n N τ \ S (N τ ) and every 1 i r, where S (N τ ) is any exceptional set with the property that, if B 0 is fixed and T ∈ [N τ (log N τ ) −B , N τ ], then, as N τ , T → ∞, we have (3) (Linear forms condition). For each T ∈ N, let T ∈ (T, O D (T )) be a prime that is sufficiently large for the following to hold. Let 1 d, t D and let φ 1 , . . . φ t ∈ Z[X 1 , . . . , X d ] be any system of linear polynomials such that |φ 1 (0)|, . . . , |φ d (0)| DT and such that all other coefficients are bounded by D in absolute value. Suppose further that φ i (n) 0 holds for all n ∈ {1, . . . , T } d and all 1 i t. Then 0 φ i (n) < T for all n ∈ {1, . . . , T } d , that is, the embedding ι : Z/T Z → N 0 that sends each class to its smallest non-negative representative induces the identity (n ∈ {1, . . . , T } d ). Our aim is to prove the following result.
Since the majorisation condition holds for our majorant by construction, the proof of Theorem 9.2 reduces to establishing Propositions 9.3, 9.4 and Proposition 9.5. To see this, one splits the majorantν (T,N) from Definition 9.1(3) into its individual parts and proceeds as described in the subsection 'Construction of the enveloping sieve' of [13,Appendix D]. hr is sufficiently small with respect to D: provided the parameter γ defining ν hr is small enough with respect to D.
Proof of Proposition 9.4. For each 1 i r, writeφ i (n) = W i φ i (n) + A i . Inserting all definitions and combining the two parts of the majorant ν hj into one by extending the summation in Q j to include Q j = 1, we obtain where Q * j := Q j (N γ ) 1Q 1 =1 takes the value Q j unless Q j = 1 and thus Q * j = N γ . Our aim is to show that in the final expression above the summation in n and the product can be interchanged at the expense of a small error, that is, that (9.3) equals and where κ, λ, u, Q, m, δ, δ , and d run over r-tuples of integers that satisfy component-wise the summation conditions imposed in (9.3). The proof that '(9.3) = (9.4)' follows the approach of [4, § 9] closely, but the situation here bears a few extra difficulties. All essential tools in this analysis were derived or developed starting out from material in [13,Appendix D].
Multiplying out the product in (9.3), collecting together r-tuples and slightly rearranging them, the first step in our proof, as in [4, § 9], is to show that the summation over r-tuples κ, λ, u, Q, m, δ, δ and d can be restricted to such tuples that satisfy the additional conditions that for every 1 i r we have gcd(Q i m i δ i δ i d i , r j=1 u j ) = 1 and gcd(m i δ i δ i d i , r j=1 Q j ) = 1, and that all tuples (u 1 . . . , u r ) consist of pairwise coprime integers and all prime-tuples (Q 1 , . . . , Q r ) have pairwise distinct entries.
To prove this, observe first that all the above coprimality conditions are automatically satisfied for the factors with index $j = i$. Thus, if any choice of tuples $\kappa, \lambda, u, Q, m, \delta, \delta'$, and $d$ violates the above additional conditions, then at least one of the following three alternatives holds. There is a prime factor $p \mid u_j$ for some $1 \leq j \leq r$ which divides two entries of $(\Delta_1, \dots, \Delta_r)$, or, for some $1 \leq j \leq r$, the prime $Q_j$ either divides two entries of $(\Delta_1, \dots, \Delta_r)$ or we have $Q_j^2 \mid \Delta_i$ for some $1 \leq i \leq r$. Recall that all prime factors $p$ of any $u_j \in U(\kappa_j, \lambda_j)$ satisfy the lower bound $p \geq N^{(\log\log N)^{-3}}$, and that also $Q_j > N^{(\log\log N)^{-3}}$.
We seek to apply the Cauchy-Schwarz inequality to show that the contribution of the excluded choices r-tuples makes a negligible contribution. If T denotes the set of all integers that are divisible by the square of a prime from the interval [N (log log N ) −3 , N γ ), and if 1 T denotes the corresponding characteristic function, then Next, we seek a bound on the second moment of (9.3) with respect to the summation in n. Note that H κj H (log log N ) 3 , that |μ(δ j )μ(δ j )h (m j Q j )| 1 and that |χ|, |λ| 1. Further, for all divisors d|φ j (n). Finally, κ,λ,u,Q m,δ,δ ,d r j=1 1 lcm(δj mj Qj ,δ j mj Qj ,uj ,dj )|φj (n) r j=1 d 7 (φ j (n)), (9.5) since Δ j = lcm(δ j m j Q j , δ j m j Q j , u j , d j )|φ j (n) implies the decomposition of φ j (n) into seven factors as follows: where Q j m j is regarded as one factor since Q j = P + (Q j m j ) is uniquely determined. Thus, where the last step follows from the kth moment bound [4,Proposition 7.9], applied to the function f (n) = d 7 (n)(h j (n)) 2 . Indeed, the function f satisfies the conditions of [4,Proposition 7.9]. The proof of the proposition is easily adjusted to apply to a system of forms (W i φ i (m) + A i ) 1 i r , where (φ i (m)) 1 i r has the same properties as the system (ψ i (m)) 1 i r in the statement. (The W -trick would in fact allow us to remove the assumption that |f (n)| ε n ε from the statement.) Putting everything together, the contribution from the excluded tuples is bounded above by (2) Similarly as above, one can also show that the summations in (9.4) can be restricted to tuples such that for every 1 i r we have gcd(Q i m i δ i δ i d i , r j=1 u j ) = 1 and gcd(m i δ i δ i d i , r j=1 Q j ) = 1, and that all tuples (u 1 . . . , u r ) consist of pairwise coprime integers and all prime-tuples (Q 1 , . . . , Q r ) have pairwise distinct entries. To see this, we bound n1,...,nr in essentially the same way as before.
(3) For any system of linear forms ϕ 1 , . . . , ϕ r ∈ Z[x 1 , . . . , x s ] and for any prime p let α ϕ (p c1 , . . . , p cr ) = 1 p ms where m = max(c 1 , . . . , c r ), and extend α ϕ to composite arguments multiplicatively (cf. [13, p. 1831; 25, Definition 8.4]). Suppose that for each i ∈ {1, . . . , r}, we are given coprime integers W i and A i and let ϕ i (m) = W i ϕ i (m) + A i . If n(c) denote the number of non-zero components of c and if ϕ has finite complexity in the sense of [13], that is, if the linear forms ϕ i are pairwise linearly independent, then the following asymptotic formulae hold (cf. [4, equation (5.6)]): with the same summation conditions on all components as in (9.3) and where indicates that the additional coprimality conditions from (1) also are in place. Due to the additional coprimality conditions from (1), the argument of (9.10) equals Applying the asymptotic evaluation (9.8) of αφ and (9.9) to the system of length r = 1 in (9.4) and taking into account step (2), our original aim translates as follows: our task is to show that, although αφ is not multiplicative across its components if r > 1, the expression (9.10) equals again with the summation conditions from (9.3) and the coprimality conditions from step (2) in place. Note that (9.11) no longer features the linear polynomials φ i , a fact that can only be achieved because we are working with a W -trick.
(4) Following [13, Appendix D], we essentially replace χ(log m/ log Q) and λ(log m/ log Q)) by multiplicative functions in m, using the Fourier-type transforms which define rapidly decaying functions θ, θ : R → R. Since χ and λ have compact support, Fourier inversion and integration by parts shows that (9.14) Inserting these new expressions for χ and λ at all instances in (9.10) (respectively, (9.11)) and multiplying out, we obtain a main term and an error term. Any integral occurring in the main term runs over the compact interval I and, thanks to the factor α φ (Δ 1 , . . . , Δ r ) (respectively, summation over κ, λ, u, Q, m, δ, δ , d is absolutely convergent. Thus, in the main term of (9.10) (respectively, (9.11)) we can swap sums and integrals. Taking into account steps (1)-(3), as well as the convergence of (7.3), the expression (9.10) then becomes and , as before. Similarly, the expression (9.11) takes the same form as (9.15) but with α φ (Δ 1 , . . . ,Δ r ) replaced by (Δ 1 . . .Δ r ) −1 in both instances. To proceed further, the following two lemmas are required.
Lemma 9.6. Suppose κ, λ, u i ∈ U (κ i , λ i ) and Q i N (log log N ) −3 for 1 i r are fixed and let M 1 be any given integer, for example, M = 1 j r Q j u j . Let J j for 1 j r be the quantity defined in (9.17). Then, uniformly in M , where o(1) is uniform in all parameters and where indicates that all components are coprime to M .
Lemma 9.7. Suppose κ, λ and u with u i ∈ U (κ i , λ i ), 1 i r are fixed, and suppose that M 1 is any given integer, for example, M = 1 i r u i . Let J * j for 1 j r be the quantity defined in (9.16

The average order of ν ν
As a consequence of the proof of Proposition 9.4, we shall now deduce Proposition 9.3, that is, that ν ν has the 'correct' average order. To start with, note that the first bound of (9.1) is immediate since ν ν is, outside of the sparse set S , a majorant for |h|. The second bound follows from the upper bound (9.15) on where we are only interested in the case s = 1. To see this, we apply first Lemma 9.6 and then Lemma 9.7 to the integrand, and finally recall that the outer sum (7.3) converges.

Proof of Proposition 9.5
To complete the proof of Theorem 9.2, it remains to prove Proposition 9.5.
Recall that τ = (N, W 1   Note that whenever the collection (a j ) 1 j d ∈ {1, . . . , T /W } d contains two identical elements, then σ τ,d (0) appears in the bound (9.2) we seek to establish. Following [12,13] closely, we seek to handle this case using the fact that the above lemma allows us to choose σ τ,d (0) to be rather large. To start with, note that the generalised divisor function d k satisfies condition (i) of Definition 1.1 with H = k. Further, given h ∈ F * , condition (i) from Definition 1. for some absolute constant C H thus ensures that (9.2) holds whenever 1 d D, and that the condition on σ τ,d (0) in Lemma 10.1 is satisfied.
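The remark on the generalised divisor function can be checked on small cases. The sketch below reads condition (i) as the bound $|h(p^a)| \leq H^a$ used earlier in this section, which is an assumption about Definition 1.1. The helper names are hypothetical. It computes $d_k$ from the identity $d_k(p^a) = \binom{a+k-1}{k-1}$ together with multiplicativity, and compares it with a brute-force count of ordered factorisations into k factors.

```python
from math import comb


def factorize(n: int) -> dict:
    """Return the prime factorisation of n as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def d_k(n: int, k: int) -> int:
    """Number of ways to write n as an ordered product of k positive integers."""
    result = 1
    for a in factorize(n).values():
        result *= comb(a + k - 1, k - 1)  # value of d_k at the prime power p^a
    return result


def d_k_bruteforce(n: int, k: int) -> int:
    """Same quantity via the recursion d_k(n) = sum_{d | n} d_{k-1}(n / d)."""
    if k == 1:
        return 1
    return sum(d_k_bruteforce(n // d, k - 1) for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    print(d_k(12, 2), d_k(12, 3), d_k(2**5, 7), 7**5)
```

The inequality $\binom{a+k-1}{k-1} \leq k^a$ follows from writing the binomial coefficient as $\prod_{i=1}^{a} (k-1+i)/i$ and noting that each factor is at most k.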
In the remaining case where the a j are pairwise distinct, the system of linear forms is less degenerate and we may employ the same techniques used to prove Proposition 9.4. The key observation is that whenever a prime p divides two distinct polynomials W ij (m + a j ) + A ij and W i j (m + a j ) + A i j at the integer m, then it divides Δ(a j − a j ). Moreover, we claim that if Δ := j =j Δ(a j − a j ) and the a j are pairwise distinct, then where φ = (φ i1 , . . . , φ i d ) : Z → Z d is the system of linear forms φ j (m) = W j (m + a j ) + A j for 1 j r, and where E hi j (N ; W ij ) was defined in (1.3). We shall take (10.1) on trust for now and defer its proof to the very end of this section. Note that, instead of (9.8), the given system φ (of infinite complexity) satisfies Since there are at most (j + 1) d integer tuples (c 1 , . . . , c d ) with max i c i = j and since (j + 1) d (2 4 H) dj p j/2 /2 for all j 1 and all p > w(N ) as soon as N is sufficiently large, we have Since exp(x 1 + · · · + x d ) exp(d max i x i ) the above is bounded by The proposition now follows from Lemma 10.1.
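The counting bound on exponent tuples used above is elementary and can be confirmed by enumeration. In this short sketch (the function name is hypothetical) tuples of non-negative integers $(c_1, \dots, c_d)$ with $\max_i c_i = j$ are counted directly. Their exact number is $(j+1)^d - j^d$, which is at most $(j+1)^d$.

```python
from itertools import product


def count_tuples_with_max(d: int, j: int) -> int:
    """Count tuples (c_1, ..., c_d) of non-negative integers with max_i c_i == j."""
    return sum(1 for c in product(range(j + 1), repeat=d) if max(c) == j)


if __name__ == "__main__":
    for d in (1, 2, 3):
        for j in (0, 1, 2, 3):
            print(d, j, count_tuples_with_max(d, j), (j + 1) ** d)
```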
Proof of Theorem 3.5. Since Proposition 3.2 implies that h 1 , . . . , h r ∈ F * , Theorem 2.4 applies and yields an asymptotic formula of the form (2.3) for the correlation h i (ϕ i (n)), which involves certain constants B 1 , B 2 > 0 and an integer-valued function W with the property that W (N ) (log N ) B1 and W (N )| W (N ) for all N > 1. Our aim is to show that under the assumptions of Theorem 3.5, the main term of (2.3), that is, w1,...,wr p|wi⇒p| W wi (log N ) B 2 A1,...,Ar can be factorised into a product over primes. To start with, we consider the expression S h * (N ; W , A) for h ∈ {h 1 , . . . , h r } and for any reduced residue A (mod W (N )). Omitting the index N , let E , E + and E * denote the respective sets of characters from the statement, and, given any χ ∈ E or χ ∈ E + , let χ * denote the corresponding induced character in E * , if it exists. Then, by (3.3) for q = W (N ), (3.9) and (3.8), we obtain the expansion Recalling (2.2) and comparing (11.2) with (11.1) suggests to analyse the expression w1,...,wr By the Chinese remainder theorem, the inner sum of this expression is also multiplicative and we may, in particular, factorise the summation condition ϕ j (v) ≡ w j A j (mod w j W ) into congruences modulo prime powers. To handle the new summation conditions that arise, we will invoke the notion of divisor densities already introduced in (9. Recall further that these quantities can be asymptotically evaluated and satisfy (5.10). Extending α ϕ multiplicatively, it follows from (5.10) that α ϕ (n 1 , . . . , n r ) L (lcm(n 1 , . . . , n r )) −1 L (max j n j ) −1 .
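The local divisor densities can be computed by brute force for small moduli. The sketch below assumes that $\alpha_\varphi(p^{c_1}, \dots, p^{c_r})$ denotes the proportion of $u \in (\mathbb{Z}/p^m\mathbb{Z})^s$, with $m = \max_i c_i$, for which $p^{c_i} \mid \varphi_i(u)$ for every i. This is how such densities are usually defined, but the defining formula is not fully legible in this section, so the reading is an assumption. The function name and the encoding of a linear polynomial as a pair (coefficients, constant term) are hypothetical.

```python
from fractions import Fraction
from itertools import product


def local_divisor_density(forms, p, exponents):
    """Proportion of u mod p^m (m = max exponent) with p^{c_i} | phi_i(u) for all i.

    forms     -- list of (coeffs, const), encoding phi_i(u) = sum coeffs[k]*u[k] + const
    p         -- a prime
    exponents -- tuple (c_1, ..., c_r) of non-negative integers
    """
    m = max(exponents)
    s = len(forms[0][0])
    modulus = p ** m
    hits = 0
    for u in product(range(modulus), repeat=s):
        if all(
            (sum(a * x for a, x in zip(coeffs, u)) + const) % (p ** c) == 0
            for (coeffs, const), c in zip(forms, exponents)
        ):
            hits += 1
    return Fraction(hits, modulus ** s)


if __name__ == "__main__":
    # Sample system: phi_1(u) = u_1 + u_2 and phi_2(u) = u_1 - u_2.
    forms = [((1, 1), 0), ((1, -1), 0)]
    print(local_divisor_density(forms, 3, (1, 1)))
    print(local_divisor_density(forms, 3, (2, 1)))
    print(local_divisor_density(forms, 2, (1, 1)))
```

For the sample system the two forms are independent modulo 3 but coincide modulo 2, so the density at (2, 2) is 1/2 rather than 1/4. This is the kind of behaviour at small primes that the dependence of the implied constant on L accounts for.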
We will use the decomposition To bound the above, we use the following inequalities. For all $a \geq 1$, $r \geq 1$ and $\varepsilon > 0$ we have $a^r \leq p^{a\varepsilon}$ if $p > r^{r/\varepsilon}$ and $a^r \ll_{r,\varepsilon} p^{a\varepsilon}$ for all $p$.
To prove the first bound, note that if $p > r^{r/\varepsilon}$ and $1 \leq a \leq r$, then $a^r \leq r^r \leq p^{\varepsilon} \leq p^{a\varepsilon}$. Furthermore, if $p > r^{1/\varepsilon}$ and $2 \leq r < a$, then $a^r < r^a < p^{a\varepsilon}$. The second bound can be proved with implied constant given by $C^*_{r,\varepsilon} = \max_{a \leq C_{r,\varepsilon}} (C_{r,\varepsilon}, a^r)$, where $C_{r,\varepsilon} = 2^r r^2 \varepsilon^{-1}$. Indeed, $a^r \leq C^*_{r,\varepsilon} 2^{\varepsilon a}$ holds for all $a \leq C_{r,\varepsilon}$. Further, if $a \geq C_{r,\varepsilon} = 2^r r^2 \varepsilon^{-1}$ and $a^r \leq C_{r,\varepsilon} 2^{\varepsilon a}$, then $\left(\frac{a+1}{a}\right)^r = (1 + a^{-1})^r \leq 1 + \frac{2^r r}{a} < 2^{\varepsilon}$, and hence $(a+1)^r \leq 2^{\varepsilon} a^r \leq C_{r,\varepsilon} 2^{\varepsilon} 2^{\varepsilon a} = C_{r,\varepsilon} 2^{\varepsilon(a+1)}$, as required. Setting $\varepsilon = 1/8$, we deduce from the above two bounds that (11.3) is bounded by (1) , (11.4) since $\omega(W(N)) \leq \pi(\log\log N) + \sum_{p \mid W(N),\, p > \log\log N} 1 \ll (\log\log N)(\log\log\log N)^{-1}$. (N ; q)), uniformly for all $q$ as above and $T \in [N^{1/2}, N]$, and where $\chi_0 \pmod{q}$ is the trivial character. Thus, setting $q = W(N) \leq (\log N)^{B_1}$ with W as in the statement of Theorem 2.4 (in fact, we have W (N ) = W (N ) in this case), the conditions of Corollary 3.1 are seen to be satisfied and the lemma follows.
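The two elementary inequalities can be verified exactly on sample ranges. The following sketch makes several assumptions that are not taken from the paper: the constant is read as $C_{r,\varepsilon} = 2^r r^2 \varepsilon^{-1}$, the value ε = 1/8 is used, the sample ranges are restricted to 2 ≤ r ≤ 4, and the function names are hypothetical. Exact integer and rational arithmetic is used so that no rounding enters the comparisons.

```python
import math
from fractions import Fraction


def first_bound_holds(a: int, r: int, eps: Fraction, p: int) -> bool:
    """Exact test of a^r <= p^(a*eps) for rational eps = num/den.

    Raising both sides to the power den gives a^(r*den) <= p^(a*num).
    """
    return a ** (r * eps.denominator) <= p ** (a * eps.numerator)


def induction_step_holds(a: int, r: int, eps: Fraction) -> bool:
    """Exact test of (1 + 1/a)^r <= 1 + 2^r * r / a < 2^eps."""
    lhs = (1 + Fraction(1, a)) ** r
    mid = 1 + Fraction(2 ** r * r, a)
    # mid < 2^eps is equivalent to mid^den < 2^num for eps = num/den.
    return lhs <= mid and mid ** eps.denominator < 2 ** eps.numerator


if __name__ == "__main__":
    eps = Fraction(1, 8)
    for r in (2, 3, 4):
        p = r ** (8 * r) + 1  # smallest integer exceeding r^(r/eps)
        a0 = math.ceil(Fraction(2 ** r * r * r) / eps)  # assumed constant C_{r,eps}
        ok_first = all(first_bound_holds(a, r, eps, p) for a in range(1, 61))
        ok_step = all(induction_step_holds(a, r, eps) for a in range(a0, a0 + 50))
        print(r, ok_first, ok_step)
```

The first test does not require p to be prime, since the inequality only depends on the size of p.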