Chronicles of Harry - New integer-complexity related upper bounds!

2014 Apr 13

18:09:00 - New integer-complexity related upper bounds!

Previous Entry Add to Memories Share Next Entry

So! Juan Arias de Reyna and Jan Van de Lune recently posted to arXiv their paper "Algorithms for Determining integer complexity". In it, they A. improve the known bounds on time complexity for computing ||n||, and B. get a new bound on ||n|| that works for almost all n.

(Hey, I'm blogging about things adjacent to my work again! Well, a little, anyway. More will have to wait.)

To review -- a big problem in integer complexity is determining the limsup of ||n||/(log n), which I'll denote Cmax. Certainly Cmax≤3/(log 2), or about 4.328, since ||n||≤3log2n for all n≥1; this bound follows from writing n in base 2. And Josh Zelinsky, in a paper he still needs to write, has lowered this to ||n||≤27log1260n for all n≥1, so Cmax≤27/(log 1260), or about 3.782. The highest known value of ||n||/(log n) occurs at n=1439, for a value of 26/(log 1439), or about 3.576. This seems to be a unique occurrence, so one ought to have Cmax<26/(log 1439), but the current best upper bounds have not beaten this milestone.

For comparison, if it is indeed the case that ||2k||=2k for all k≥1, then one would have Cmax≥2/(log 2), which is about 2.885; Josh has previously suggested that perhaps the limsup is exactly this number. (Though Iraids et al suggested that perhaps it's even larger.) However, at present, nobody's even been able to prove that the limsup is any greater than the liminf, which is 3/(log 3), or about 2.731 (and which is also an absolute lower bound). And indeed, various people have suggested that perhaps the limsup simply is the liminf[0]. Josh and I attempted a while ago to show it was at least slightly larger, but that ended up not working out, though I'm of the opinion it's worth re-exploring.

Anyway. So the state of Cmax is not good. But we can also define a number I'll call Cavg: We'll define Cavg to be the inf of all C such that ||n||≤Clog(n) for almost all n, i.e., on a set of density 1. (So certainly 3/(log 3)≤Cavg≤Cmax). And it's here that Arias de Reyna and Van de Lune have made an improvement. (But first a bit more history, if you don't mind.)

A number of people have noted that based on writing n in base 2, one can show Cavg≤5/(2log2), or about 3.607. (Already this is better than Josh's bound.) Richard Guy and John Isbell took this a step further and tried writing n in base 24, yielding a bound of Cavg≤265/(24log24), or about 3.474. (This is even better than 26/(log 1439), which the current Cmax bounds are not!) Well, now, based on writing n in base 2936, they've shown that in fact


which is about 3.321. Quite an improvement, in my opinion!

(As for the true value of Cavg, who knows. Maybe it and Cmax are both equal to 2/(log 2). Well -- if Cmax is at most 2/(log 2), then so is Cavg, and if Cmax is equal to 3/(log 3), then so is Cavg; and if either is true, then even this new bound on Cavg is quite a ways away from the true value. Still. A substantial improvement.)

So what's going on here? How did they do this? Well, it's the same method as Richard Guy and John Isbell used, just applied with a much higher base with the help of a bit of automation. Let's go into a bit more detail.

Let's define D(b,r), as Arias de Reyna and Van de Lune do, to be the smallest number of ones needed to turn a number x into the number bx+r. That's not a formal definition, but it's not too hard to write one down; you can write down a recursion similar to that for integer complexity, allowing you to compute it algorithmically. For r=0, D(b,r) will simply be ||b||. (And for b=0, D(b,r) will simply be ||r||, though we won't care about that here.) We'll only be considering here D(b,r) for 0≤r<b, though one can consider it for r≥b as well. Note, by the way, that (excluding r=0 or b=0) one always has D(b,r)≥max(||b||,||r||). (Actually, so long as r>0, one has D(b,r)>||b||, and so long as b>0, one has D(b,r)>||r||.)

With this language, we can make some nice statements. The method of getting upper bounds on Cmax by writing numbers in different bases simply becomes the statement that, for any b≥2,

Cmax ≤ max0≤r<b D(b,r)/(log b)

...or does it? It's possible I'm neglecting some subtleties with the initial digit here. Well -- it's ultimately irrelevant, because so far nobody's ever found a base that does better than base 2 for getting an upper bound on Cmax! And in base 2 one does not have to worry about such things.

But what certainly is true, and much more useful, is that

Cavg ≤ avg0≤r<b D(b,r)/(log b);

this is the method of writing things in base b to get upper bounds on Cavg. Well, Arias de Reyna and Van de Lune have done this with b=2936, and gotten a pretty nice bound.

But -- and now we get to part A -- this isn't the only thing they've done with the values D(b,r)! They present also in their paper an algorithm for computing ||n||, which is apparently due to Martin Fuller. Now, we already knew that you could do better than O(n2) in terms of the time required to compute ||n||; Srinivas and Shankar presented an algorithm that showed that the exponent 2 could be lowered to log23, or about 1.585.

Arias de Reyna and Van de Lune present a time analysis of Fuller's algorithm, and show that it runs in time O(nα), where α is about 1.246. Interestingly, their time analysis depends on the numbers D(b,r) -- once again, you pick a base b, do computations involving D(b,r) for 0≤r<b, and get out a bound. In this case, the computation involved is rather more complicated than just taking an average, and doesn't yield a number that can be written down in a nice short exact form, but it still is a computation based on D(b,r) for 0≤r<b. The best base they found for this problem was 66, which yielded the above value of α.

So, pretty neat! A little different from what I'm mostly working on, but, it's not like integer complexity is a widely-studied thing. And it shows that the numbers D(b,r) have more than one purpose, and may be worth further study.


[0]Indeed, if instead of the integer complexity one considers the addition chain length l(n), which is similar in a number of ways (which reminds me, I've never really posted here about any of my work regarding addition chains; I should probably get around to that sometime), one has that the liminf and limsup of l(n)/(log n) are both equal to each other, both being equal to 1/(log 2).


Date:2014 Apr 14 08:54:21 (UTC)
I modified the program to give the exact value

α = log( 3^4 2^(-5) (86990+60081*a+41610*a^2) ) / log(2^6 3^6) with a=3^(1/3)

also I have now a better value for α

α = log( 3^6 2^(-9) (1794205+1247109*a+859894*a^2) ) / log(2^9 3^6)= 1.235528...

obtained with the base 2^9 3^6.


(Reply) (Thread)
[User Picture]
Date:2014 Apr 14 18:06:56 (UTC)
Oh, huh. I didn't realize it could be written in such a form.

I have to admit I wasn't certain whether you used 6^6 because it was actually the best you found or because you just decided to stop there. I would expect larger 3-smooth bases to continue to do better, so I'm glad to see that does seem to be the case.

I'm hoping you upload an update with this; this blog isn't exactly widely-read. :)
(Reply) (Parent) (Thread)