I thought of creating a post on Dirichlet Generating functions which would be right continuation of the direction my blog is heading. But this post is about the C code for Totient Summatory function.
The code uses the algorithm described in Totient Summatory function. If you haven't visited that post, I suggest you to do so. The code computes the answer for $10^{11}$ in approx. $7$ seconds and for $10^{12}$ in approx. $25$ seconds. Not bad compared to Mathematica. Here's the code for you.
#include<stdio.h>
#include<math.h>
#define M 60000000
#define Mod 1000000007
int Phi[M+1], SumPhi[M+1], TotCache[M+1];
void precalc()
{
int i, j;
Phi[1] = 1;
SumPhi[1] = 1;
for(i = 2; i <= M; i++) {
if(!Phi[i])
for(j = i; j <= M; j += i) {
if(!Phi[j])
Phi[j] = j;
Phi[j] = (Phi[j] / i) * (i - 1); }
SumPhi[i] = (SumPhi[i-1] + Phi[i]) % Mod; }
}
long long SumNat(long long n)
{
long long p = n, q = n+1;
if(p%2)
q /= 2;
else
p /= 2;
return (p % Mod) * (q % Mod) % Mod;
}
long long TotientSum(long long n, long long k)
{
long long m = n / k;
if(m <= M)
return SumPhi[m];
else if(TotCache[k] > 0)
return TotCache[k];
long long p = sqrt(m), q = m / (p+1), i, res = 0;
for(i = 2; i <= p; i++) {
res += TotientSum(n, k * i);
if(res >= Mod)
res %= Mod;
}
for(i = 1; i <= q; i++) {
res = res + ((m / i) % Mod * Phi[i]) % Mod;
if(res >= Mod)
res %= Mod;
}
res = SumNat(m) - res;
res += (p % Mod) * TotientSum(n, k * (p + 1));
res %= Mod;
TotCache[k] = res;
return res;
}
int main()
{
precalc();
printf("%lld",TotientSum(1000000000000LL,1));
return 0;
}
Note: $\text{TotientSum}(n,k)=\Phi\left(\left\lfloor\dfrac{n}{k}\right\rfloor\right)$ in the code.
Just posting a code wouldn't do much good either for the blogger or the reader I think from the experience I've had myself. But am not gonna no through the complete code. Just a few details which we should be paying attention to.
First, the 'precalc' function. It's just a word-by-word translation of the algorithm that we saw on Optimized Precomputation.
Second is the 'SumNat' function. The important thing about this function is that we ought to keep the values from overflowing. For example, I first did $\text{(p%Mod * q%Mod)%Mod}$ which was perfectly okay looking except that it overflows.
Third, the 'TotientSum' function, is pure memoization. It'll take a separate post to completely explain it. Maybe I'll do that later. But for now I think this should suffice. I'll try to come up with the C-code for Merten's function in the next post.
Last is the limit you choose to sieve. That's the real trade-off you have to make between space and time. If you choose to sieve more, it's gonna take more space but the final result will be quick, whereas if you choose to memoize more you consume less space at cost of being time expensive. I made some trial and error to arrive at the above sieve limit which kinda minimized the computation time.
UPDATE(13 Mar 2017): I made an iterative version of the Totient summartory function in Python. You can check it out in this Online IDE.
from timeit import default_timer
start = default_timer()
n = 1000000000
s = int(n ** 0.5)
LowPhis = [k for k in range(1 + s)]
HiPhis = [0] * (1 + s)
for p in range(2, 1 + s):
if p == LowPhis[p]:
for k in range(p, 1 + s, p):
LowPhis[k] = LowPhis[k] - LowPhis[k] // p
LowPhis[p] += LowPhis[p - 1]
for k in range(s, 0, -1):
tempn = n // k
res = tempn * (1 + tempn) // 2
u = int(tempn ** 0.5)
v = tempn // (1 + u)
for i in range(1, 1 + v):
res -= (tempn // i) * (LowPhis[i] - LowPhis[i - 1])
for i in range(2, 1 + u):
res -= (HiPhis[i * k] if i * k <= s else LowPhis[tempn // i])
i = 1 + u
res += u * LowPhis[tempn // i]
HiPhis[k] = res
print(HiPhis[1])
print(default_timer() - start)
Note: $\text{HiPhis}[k]=\Phi\left(\left\lfloor\dfrac{n}{k}\right\rfloor\right)$ in the code.
I like iterative versions in that they don't suffer from the annoying maximum-recursion-depth error. I think this is a little slower than the recursive version but looks well nonetheless in Python.
Until then
Yours Aye
Me
The code uses the algorithm described in Totient Summatory function. If you haven't visited that post, I suggest you to do so. The code computes the answer for $10^{11}$ in approx. $7$ seconds and for $10^{12}$ in approx. $25$ seconds. Not bad compared to Mathematica. Here's the code for you.
#include<stdio.h>
#include<math.h>
#define M 60000000
#define Mod 1000000007
int Phi[M+1], SumPhi[M+1], TotCache[M+1];
void precalc()
{
int i, j;
Phi[1] = 1;
SumPhi[1] = 1;
for(i = 2; i <= M; i++) {
if(!Phi[i])
for(j = i; j <= M; j += i) {
if(!Phi[j])
Phi[j] = j;
Phi[j] = (Phi[j] / i) * (i - 1); }
SumPhi[i] = (SumPhi[i-1] + Phi[i]) % Mod; }
}
long long SumNat(long long n)
{
long long p = n, q = n+1;
if(p%2)
q /= 2;
else
p /= 2;
return (p % Mod) * (q % Mod) % Mod;
}
long long TotientSum(long long n, long long k)
{
long long m = n / k;
if(m <= M)
return SumPhi[m];
else if(TotCache[k] > 0)
return TotCache[k];
long long p = sqrt(m), q = m / (p+1), i, res = 0;
for(i = 2; i <= p; i++) {
res += TotientSum(n, k * i);
if(res >= Mod)
res %= Mod;
}
for(i = 1; i <= q; i++) {
res = res + ((m / i) % Mod * Phi[i]) % Mod;
if(res >= Mod)
res %= Mod;
}
res = SumNat(m) - res;
res += (p % Mod) * TotientSum(n, k * (p + 1));
res %= Mod;
TotCache[k] = res;
return res;
}
int main()
{
precalc();
printf("%lld",TotientSum(1000000000000LL,1));
return 0;
}
Just posting a code wouldn't do much good either for the blogger or the reader I think from the experience I've had myself. But am not gonna no through the complete code. Just a few details which we should be paying attention to.
First, the 'precalc' function. It's just a word-by-word translation of the algorithm that we saw on Optimized Precomputation.
Second is the 'SumNat' function. The important thing about this function is that we ought to keep the values from overflowing. For example, I first did $\text{(p%Mod * q%Mod)%Mod}$ which was perfectly okay looking except that it overflows.
Third, the 'TotientSum' function, is pure memoization. It'll take a separate post to completely explain it. Maybe I'll do that later. But for now I think this should suffice. I'll try to come up with the C-code for Merten's function in the next post.
Last is the limit you choose to sieve. That's the real trade-off you have to make between space and time. If you choose to sieve more, it's gonna take more space but the final result will be quick, whereas if you choose to memoize more you consume less space at cost of being time expensive. I made some trial and error to arrive at the above sieve limit which kinda minimized the computation time.
UPDATE(13 Mar 2017): I made an iterative version of the Totient summartory function in Python. You can check it out in this Online IDE.
from timeit import default_timer
start = default_timer()
n = 1000000000
s = int(n ** 0.5)
LowPhis = [k for k in range(1 + s)]
HiPhis = [0] * (1 + s)
for p in range(2, 1 + s):
if p == LowPhis[p]:
for k in range(p, 1 + s, p):
LowPhis[k] = LowPhis[k] - LowPhis[k] // p
LowPhis[p] += LowPhis[p - 1]
for k in range(s, 0, -1):
tempn = n // k
res = tempn * (1 + tempn) // 2
u = int(tempn ** 0.5)
v = tempn // (1 + u)
for i in range(1, 1 + v):
res -= (tempn // i) * (LowPhis[i] - LowPhis[i - 1])
for i in range(2, 1 + u):
res -= (HiPhis[i * k] if i * k <= s else LowPhis[tempn // i])
i = 1 + u
res += u * LowPhis[tempn // i]
HiPhis[k] = res
print(HiPhis[1])
print(default_timer() - start)
I like iterative versions in that they don't suffer from the annoying maximum-recursion-depth error. I think this is a little slower than the recursive version but looks well nonetheless in Python.
Until then
Yours Aye
Me
No comments:
Post a Comment