Monday, 18 June 2007

nt.number theory - Erik Westzynthius's cool upper bound argument: update?

After studying Kanold's 1967 paper on Jacobsthal's function, (and being inspired by a preprint http://arxiv.org/abs/1208.5342 that I discuss below,) I found an argument, mostly very simple, which gives some nice results for the effort given.  While Kanold deserves some of the credit for the argument, I have yet to see a statement by him or by anyone else that gives these results, so I present them here.  (Kanold wrote several articles on Jacobsthal's function, many of which I am tracking down, which might have this argument.  I am happy to accept help in obtaining electronic copies of them.)  This is the post I promised over a few months ago in a supplement to a question of Timothy Foo,
Analogues of Jacobsthal's function  .



For maximum ooh-aah effect, I assume nn is squarefree and has kgt2kgt2 prime factors, one of which I call Peter, or pp for short.Now 1+tn1+tn is coprime to nn for any integer tt.  So are most integers of the form 1+tn/p1+tn/p, the exceptions being those that are multiples of pp, and those multiples do not occur as consecutive terms.  Thus, every interval of length 2n/p(=g(p)n/p)2n/p(=g(p)n/p) has at least one integer coprime to nn of the form 1+tn/p1+tn/p.



Let's go further with this.  Let dgt1dgt1 and divide nn, and let f=n/df=n/d.  (Here I use nn squarefree to get ff coprime to dd.)  Then numbers of the shape 1+tf1+tf form an arithmetic progression, are coprime to ff, and (as can be seen by multiplying by ff's inverse in the ring of integers mod dd) you can't pick g(d)g(d) consecutive members of this progression without hitting something coprime to dd also.  So g(n)leqg(d)f=g(d)n/dg(n)leqg(d)f=g(d)n/d .



While I'm here, let me sharpen the inequality, assuming fgt1fgt1 and dgt1dgt1 are coprime:
there are phi(f)phi(f) totients cc of ff in the interval [0,f][0,f], so I can repeat the argument with c+tfc+tf instead of 1+tf1+tf.  In the worst case, using all phi(f)phi(f) progressions, I get g(fd)leqg(d)ff+g(f)g(fd)leqg(d)ff+g(f), which mildly improves upon Kanold's bound g(d)fphi(f)+1g(d)fphi(f)+1, and matches it when ff is prime.  (Of course, for n=fdn=fd I really want g(n)g(n) to be near O(g(d)+g(f))O(g(d)+g(f)), but I don't yet know how to show that with grade school arithmetic.)



How to use this inequality? Pick the largest divisor dd for which one can comfortably compute (a subquadratic in kk upper bound for) g(d)g(d); I pick dd to contain most of the large prime factors of nn: find prime qq dividing nn so that sigma1(d)=sumptextprime,p|n,pgeqq1/psigma1(d)=sumptextprime,p|n,pgeqq1/p is less than 1+1/2q1+1/2q; a routine argument yields g(d)g(d) is O(qk)O(qk).  The ugly part is to show that qltk0.5qltk0.5 (or else d=nd=n), that n/dlt23q/2n/dlt23q/2 which for large kk approaches 23(kepsilon+1/e)/223(kepsilon+1/e)/2, and that asymptotically g(n)g(n) is O(ek1/e+Dlog(k))O(ek1/e+Dlog(k)).  This isn't hard after using one of Mertens's theorems and a Chebyshev function; it just isn't pretty.  (Also for smaller nn, epsilon+1/eepsilon+1/e can be close to 1/21/2, but with patience epsilonepsilon will tend to zero.)



This gives a bound that is asymptotically better than my first efforts at this, improves slightly (k0.5k0.5 replaced by Ck0.37+Dlog(k)Ck0.37+Dlog(k) on Kanold's bound of 2sqrtk2sqrtk for kk not too large, and does not need Kanold's requirement that k>e50k>e50.  Up until one chooses dd and crunches the formulae, it is also a very elementary argument; I suspect even Legendre knew about using the multiplicative inverse to transform a general arithmetic progression to a (effectively) consecutive sequence of integers and still preserve the property of interest here, being a unit in a certain ring (or missing it by that much). 



(One of the benefits of letting this sit for a few months before posting is that I can add cool observations like: If I could get the inequality down to g(n)leqg(d)g(n/d)g(n)leqg(d)g(n/d), I could iterate the
above simple estimate to get an explicit bound of O(kc)O(kc), where cc is a positive number less than 3.  Or like: using more advanced work combined with the above, I can get g(n)leqekeaCkag(n)leqekeaCka for some integers aa, which seems better than Ck4loglogkCk4loglogk if you don't look too closely.)



Further, one can use a computer to refine the method slightly and get estimates which do quite well for small values of nn, where small here means k<100k<100.  Asymptotically though, Stevens's and my upper bounds eventually outperform this bound.



Also, there has been a nice result out of University College Dublin that I will briefly interpret.  Fintan Costello and Paul Watts find a way of presenting a related function recursively, then numerically compute a lower bound on this function which implies an upper bound on Jacobsthal's function computed on some particular values.  I thank them for reminding me about using a multiplicative inverse mod dd for ff, so they deserve a "piece of the action".



These authors work in (and sometimes away from) the integer interval BM=[b+1,b+2,ldots,b+m]BM=[b+1,b+2,ldots,b+m].  Given squarefree nn and its distinct prime factors, listed in some order as q1q1 to qkqk, define QiQi as  prod0ltjleqiqjprod0ltjleqiqj. One approach to computing the size pi(b,m,n)pi(b,m,n) of the set CP(b,m,n)CP(b,m,n) which has those integers in BMBM coprime to nn is to do the standard inclusion-exclusion argument: if we represent by F(b,m,d)F(b,m,d) the multiples of dd in BMBM, and say there are f(b,m,d)f(b,m,d) many such multiples, and abuse some notation, I then write
CP(b,m,n)=sumd|nsgn(d,F(b,m,d))CP(b,m,n)=sumd|nsgn(d,F(b,m,d)) .  Here sgnsgn is to suggest adding elements of the set F(b,m,d)F(b,m,d) if dd has an even number of prime factors, and subtracting them instead when dd has an odd number of prime factors.



To set up for the recurrent expression, Costello and Watts use just some of the terms on the right hand side of the abused equation, and reorganize the rest of the terms.  In my interpretation of their work, they start with the multiset identity



CP(b,m,n)cupbiguplus0ltileqkF(b,m,qi)=BMuplusbiguplus0ltiltjleqkRCP(i,j)CP(b,m,n)cupbiguplus0ltileqkF(b,m,qi)=BMuplusbiguplus0ltiltjleqkRCP(i,j)



where RCP(i,j)RCP(i,j) is F(b,m,qiqj)capCP(b,m,Qi1)F(b,m,qiqj)capCP(b,m,Qi1), or the subset of BMBM which has those multiples of qiqjqiqj whose soonest prime factor in common with nn is qiqi



One sees this identity holds by considering a member of BMBM which has exactly tt distinct prime factors in common with nn.
If tt is 00, then the member occurs only once in CP(b,m,n)CP(b,m,n) and similarly only once in BMBM.  Otherwise, it occurs exactly tt times in the left hand side in tt distinct terms F(b,m,qi)F(b,m,qi), and if ll is soonest such that qlql is a prime factor of the member, the member occurs only once in each of t1t1 sets
RCP(l,j)RCP(l,j) (remember ll comes sooner than jj) and only once in BMBM.



Now the term RCP(i,j)RCP(i,j) is a subset of an arithmetic progression AA with common difference qiqjqiqj. By using the technique above of multiplying by a suitable inverse of qiqjqiqj in the ring of integers mod Qi1Qi1, AA corresponds with an integer interval starting near some integer cijbmcijbm of length f(b,m,qiqj)f(b,m,qiqj) which preserves the coprimality status with respect to Qi1Qi1: to wit, the size of RCP(i,j)RCP(i,j) is pi(cijbm,f(b,m,qiqj),Qi1)pi(cijbm,f(b,m,qiqj),Qi1).  Using the pipi term for the size of CPCP and translating the other sets to numbers gives the numerical recurrent formula of Costello and Watts:
pi(b,m,n)=msum0ltileqkf(b,m,qi)+sum0ltiltjleqkpi(cijbm,f(b,m,qiqj),Qi1)pi(b,m,n)=msum0ltileqkf(b,m,qi)+sum0ltiltjleqkpi(cijbm,f(b,m,qiqj),Qi1).



Following work of Hagedorn who computed h(k)=g(Pk)h(k)=g(Pk) for kk less than 50, where PkPk is the kkth primorial, Costello and Watts use their formula and some analysis of coincidence of prime residues to compute an inequality for pimin(m,n)pimin(m,n) which is the minimum over all integers bb of pi(b,m,n)pi(b,m,n).  They underestimate f(b,m,qiqj)f(b,m,qiqj) by lfloorm/qiqjrfloorlfloorm/qiqjrfloor, ignore the cc's by using piminpimin, pull out the i=1i=1 terms from the double sum and rewrite that portion to include a term EE, depending only on mm and the pipi, which arises from looking at when estimates for the sizes of the F(b,m,pi)F(b,m,pi)  and F(b,m,2pi)F(b,m,2pi) sets can be improved, and come up with (a refined version, using pp's for qq's, of) the inequality
msum0ltileqklceilfracmpirceil+sum1ltileqklfloorfracm2pirfloor+E+sum1ltiltjleqkpimin(lfloorfracmpipjrfloor,Pi1)leqpimin(m,Pk)msum0ltileqklceilfracmpirceil+sum1ltileqklfloorfracm2pirfloor+E+sum1ltiltjleqkpimin(lfloorfracmpipjrfloor,Pi1)leqpimin(m,Pk).



With this inequality, Costello and Watts compute pilowpilow, a lower bound approximation to piminpimin.  Since h(k)leqmh(k)leqm iff pimin(m,Pk)gt0pimin(m,Pk)gt0, computing pilow(m,Pk)pilow(m,Pk) for various mm will give an upper bound on h(k)h(k).  They say their computations for kleq10000kleq10000 suggest h(k)leqCk2logkh(k)leqCk2logk, where CC is a constant less than 0.30.3 .  Although this data is achieved using data from Hagedorn's work, even without that their algorithm yields values which are a vast improvement on known and easily computable bounds, even the ones listed above.



One item to explore is how an algorithm based on this approximation will perform given different orderings of the prime factors.  I suspect that letting the larger primes come first will give tighter results.  Another item to explore is to see if there is a better term FF that will supplant EE and some of the recurrent terms in the double sum.  The idea of rewriting the pipi function recursively, while not new, is given new life in this double sum form, and suggests revisiting some old approaches with an eye toward computability.



Gerhard "Ask Me About Coprime Integers" Paseman, 2013.02.05

No comments:

Post a Comment