Monday 18 June 2007

nt.number theory - Erik Westzynthius's cool upper bound argument: update?

After studying Kanold's 1967 paper on Jacobsthal's function, (and being inspired by a preprint http://arxiv.org/abs/1208.5342 that I discuss below,) I found an argument, mostly very simple, which gives some nice results for the effort given.  While Kanold deserves some of the credit for the argument, I have yet to see a statement by him or by anyone else that gives these results, so I present them here.  (Kanold wrote several articles on Jacobsthal's function, many of which I am tracking down, which might have this argument.  I am happy to accept help in obtaining electronic copies of them.)  This is the post I promised over a few months ago in a supplement to a question of Timothy Foo,
Analogues of Jacobsthal's function  .



For maximum ooh-aah effect, I assume $n$ is squarefree and has $k gt 2$ prime factors, one of which I call Peter, or $p$ for short.Now $1+tn$ is coprime to $n$ for any integer $t$.  So are most integers of the form $1 + tn/p$, the exceptions being those that are multiples of $p$, and those multiples do not occur as consecutive terms.  Thus, every interval of length $2n/p (=g(p)n/p)$ has at least one integer coprime to $n$ of the form $1+tn/p$.



Let's go further with this.  Let $d gt 1$ and divide $n$, and let $f=n/d$.  (Here I use $n$ squarefree to get $f$ coprime to $d$.)  Then numbers of the shape $1+tf$ form an arithmetic progression, are coprime to $f$, and (as can be seen by multiplying by $f$'s inverse in the ring of integers mod $d$) you can't pick $g(d)$ consecutive members of this progression without hitting something coprime to $d$ also.  So $g(n) leq g(d)f = g(d)n/d$ .



While I'm here, let me sharpen the inequality, assuming $f gt 1$ and $d gt 1$ are coprime:
there are $phi(f)$ totients $c$ of $f$ in the interval $[0,f]$, so I can repeat the argument with $c+tf$ instead of $1+tf$.  In the worst case, using all $phi(f)$ progressions, I get $g(fd) leq g(d)f - f + g(f)$, which mildly improves upon Kanold's bound $g(d)f -phi(f)+1$, and matches it when $f$ is prime.  (Of course, for $n=fd$ I really want $g(n)$ to be near $O(g(d)+g(f))$, but I don't yet know how to show that with grade school arithmetic.)



How to use this inequality? Pick the largest divisor $d$ for which one can comfortably compute (a subquadratic in $k$ upper bound for) $g(d)$; I pick $d$ to contain most of the large prime factors of $n$: find prime $q$ dividing $n$ so that $sigma^{-1}(d)=sum_{p text{ prime,} p|n, p geq q} 1/p$ is less than $1 + 1/2q$; a routine argument yields $g(d)$ is $O(qk)$.  The ugly part is to show that $q lt k^{0.5}$ (or else $d=n$), that $n/d lt 2^{3q/2}$ which for large $k$ approaches $2^{3(k^{epsilon + 1/e})/2}$, and that asymptotically $g(n)$ is $O(e^{k^{1/e}+Dlog(k)})$.  This isn't hard after using one of Mertens's theorems and a Chebyshev function; it just isn't pretty.  (Also for smaller $n$, $epsilon + 1/e$ can be close to $1/2$, but with patience $epsilon$ will tend to zero.)



This gives a bound that is asymptotically better than my first efforts at this, improves slightly ($k^{0.5}$ replaced by $Ck^{0.37}+ Dlog(k)$ on Kanold's bound of $2^sqrt{k}$ for $k$ not too large, and does not need Kanold's requirement that $k > e^{50}$.  Up until one chooses $d$ and crunches the formulae, it is also a very elementary argument; I suspect even Legendre knew about using the multiplicative inverse to transform a general arithmetic progression to a (effectively) consecutive sequence of integers and still preserve the property of interest here, being a unit in a certain ring (or missing it by that much). 



(One of the benefits of letting this sit for a few months before posting is that I can add cool observations like: If I could get the inequality down to $g(n) leq g(d)g(n/d)$, I could iterate the
above simple estimate to get an explicit bound of $O(k^c)$, where $c$ is a positive number less than 3.  Or like: using more advanced work combined with the above, I can get $g(n)
leq e^{k^{e^{-a}}}Ck^{a}$ for some integers $a$, which seems better than $Ck^{4loglog{k}}$ if you don't look too closely.)



Further, one can use a computer to refine the method slightly and get estimates which do quite well for small values of $n$, where small here means $k<100$.  Asymptotically though, Stevens's and my upper bounds eventually outperform this bound.



Also, there has been a nice result out of University College Dublin that I will briefly interpret.  Fintan Costello and Paul Watts find a way of presenting a related function recursively, then numerically compute a lower bound on this function which implies an upper bound on Jacobsthal's function computed on some particular values.  I thank them for reminding me about using a multiplicative inverse mod $d$ for $f$, so they deserve a "piece of the action".



These authors work in (and sometimes away from) the integer interval $BM = [b+1,b+2,ldots,b+m]$.  Given squarefree $n$ and its distinct prime factors, listed in some order as $q_1$ to $q_k$, define $Q_i$ as  $prod_{0 lt j leq i} q_j$. One approach to computing the size $pi(b,m,n)$ of the set $CP(b,m,n)$ which has those integers in $BM$ coprime to $n$ is to do the standard inclusion-exclusion argument: if we represent by $F(b,m,d)$ the multiples of $d$ in $BM$, and say there are $f(b,m,d)$ many such multiples, and abuse some notation, I then write
$CP(b,m,n) = sum_{d | n} sgn(d,F(b,m,d))$ .  Here $sgn$ is to suggest adding elements of the set $F(b,m,d)$ if $d$ has an even number of prime factors, and subtracting them instead when $d$ has an odd number of prime factors.



To set up for the recurrent expression, Costello and Watts use just some of the terms on the right hand side of the abused equation, and reorganize the rest of the terms.  In my interpretation of their work, they start with the multiset identity



$$CP(b,m,n) cup biguplus_{0 lt i leq k} F(b,m,q_i) =
BM uplus  biguplus_{0 lt i lt j leq k} RCP(i,j)$$



where $RCP(i,j)$ is $F(b,m,q_iq_j) cap CP(b,m,Q_{i-1})$, or the subset of $BM$ which has those multiples of $q_iq_j$ whose soonest prime factor in common with $n$ is $q_i$. 



One sees this identity holds by considering a member of $BM$ which has exactly $t$ distinct prime factors in common with $n$.
If $t$ is $0$, then the member occurs only once in $CP(b,m,n)$ and similarly only once in $BM$.  Otherwise, it occurs exactly $t$ times in the left hand side in $t$ distinct terms $F(b,m,q_i)$, and if $l$ is soonest such that $q_l$ is a prime factor of the member, the member occurs only once in each of $t-1$ sets
$RCP(l,j)$ (remember $l$ comes sooner than $j$) and only once in $BM$.



Now the term $RCP(i,j)$ is a subset of an arithmetic progression $A$ with common difference $q_iq_j$. By using the technique above of multiplying by a suitable inverse of $q_iq_j$ in the ring of integers mod $Q_{i-1}$, $A$ corresponds with an integer interval starting near some integer $c_{ijbm}$ of length $f(b,m,q_iq_j)$ which preserves the coprimality status with respect to $Q_{i-1}$: to wit, the size of $RCP(i,j)$ is $pi(c_{ijbm},f(b,m,q_iq_j),Q_{i-1})$.  Using the $pi$ term for the size of $CP$ and translating the other sets to numbers gives the numerical recurrent formula of Costello and Watts:
$$pi(b,m,n) = m - sum_{0 lt i leq k} f(b,m,q_i)
+ sum_{0 lt i lt j leq k} pi(c_{ijbm},f(b,m,q_iq_j),Q_{i-1})$$.



Following work of Hagedorn who computed $h(k)=g(P_k)$ for $k$ less than 50, where $P_k$ is the $k$th primorial, Costello and Watts use their formula and some analysis of coincidence of prime residues to compute an inequality for $pi_{min}(m,n)$ which is the minimum over all integers $b$ of $pi(b,m,n)$.  They underestimate $f(b,m,q_iq_j)$ by $lfloor m/q_iq_j rfloor$, ignore the $c$'s by using $pi_min$, pull out the $i=1$ terms from the double sum and rewrite that portion to include a term $E$, depending only on $m$ and the $p_i$, which arises from looking at when estimates for the sizes of the $F(b,m,p_i)$  and $F(b,m,2p_i)$ sets can be improved, and come up with (a refined version, using $p$'s for $q$'s, of) the inequality
$$m - sum_{0 lt i leq k}  lceil frac{m}{p_i} rceil + sum_{1 lt i leq k} lfloor frac{m}{2p_i} rfloor + E + sum_{1 lt i lt j leq k} pi_{min}(lfloor frac{m}{p_ip_j} rfloor,P_{i-1}) leq pi_{min}(m,P_k)$$.



With this inequality, Costello and Watts compute $pi_{low}$, a lower bound approximation to $pi_{min}$.  Since $h(k) leq m$ iff $pi_{min}(m,P_k) gt 0$, computing $pi_{low}(m,P_k)$ for various $m$ will give an upper bound on $h(k)$.  They say their computations for $k leq 10000$ suggest $h(k) leq Ck^2 log k$, where $C$ is a constant less than $0.3$ .  Although this data is achieved using data from Hagedorn's work, even without that their algorithm yields values which are a vast improvement on known and easily computable bounds, even the ones listed above.



One item to explore is how an algorithm based on this approximation will perform given different orderings of the prime factors.  I suspect that letting the larger primes come first will give tighter results.  Another item to explore is to see if there is a better term $F$ that will supplant $E$ and some of the recurrent terms in the double sum.  The idea of rewriting the $pi$ function recursively, while not new, is given new life in this double sum form, and suggests revisiting some old approaches with an eye toward computability.



Gerhard "Ask Me About Coprime Integers" Paseman, 2013.02.05

No comments:

Post a Comment