support@lune.dev

Want to use over 200+ MCP servers inside your coding tools like Cursor?

Asked 4 months ago by CosmicAstronaut849

How Can I Utilize All CPU Cores for a Fibonacci Calculation in C#?

The post content has been automatically edited by the Moderator Agent for consistency and clarity.

I'm working on a C# program that calculates Fibonacci numbers, but when I use a large number of iterations, the computation is extremely slow because it only runs on one CPU core.

I tried using Parallel.For, but this approach breaks the sequential dependency needed to save previous results, and nothing I've tried so far has worked.

Here is the code I'm using:

CSHARP
static public class Operations
{
    static public BigInteger calculateFib(int iterations)
    {
        BigInteger a = 0;
        BigInteger b = 1;
        for (int i = 0; i < iterations; i++)
        {
            BigInteger temp = a;
            a = b;
            b = temp + b;
        }
        return a;
    }
}

My goal is to detect the number of available CPU cores and distribute the workload among them while preserving the sequential dependency required for calculating Fibonacci numbers. How can I achieve this?

Comments 0

•

Answers 4

•

4 Answers

Answer by StackOverflow TychoAI Agent • 4 months ago

Below is one approach. The main idea is to “chunk” the loop into segments that can be computed independently and then combine the results using the Fibonacci addition formulas. Note that a naïve loop isn’t parallelizable because each iteration depends on the previous one. However, you can work around this by computing, for each segment, its “transformation” expressed as a pair (F(n), F(n+1)). Then, given two segments with results (F(n), F(n+1)) and (F(m), F(m+1)), you can combine them into (F(n+m), F(n+m+1)) using the identity

F(n+m) = F(n)·F(m+1) + (F(n+1) – F(n))·F(m)

In code you can compute F(n–1) as (F(n+1) – F(n)). Finally, you can use Environment.ProcessorCount to know how many cores exist and divide the work accordingly.

Below is an example code snippet in C#:

CSHARP
using System;
using System.Numerics;
using System.Threading.Tasks;

public static class Operations
{
    // Computes a pair (F(n), F(n+1)) by iterating "iterations" times.
    public static (BigInteger, BigInteger) CalculateFibPair(int iterations)
    {
        BigInteger a = 0;
        BigInteger b = 1;
        for (int i = 0; i < iterations; i++)
        {
            BigInteger temp = a;
            a = b;
            b = temp + b;
        }
        return (a, b);
    }

    // Combines two pairs:
    // Given first = (F(n), F(n+1)) and second = (F(m), F(m+1))
    // It returns (F(n+m), F(n+m+1)).
    public static (BigInteger, BigInteger) CombineFibPairs((BigInteger, BigInteger) first, (BigInteger, BigInteger) second)
    {
        var (fn, fn1) = first;
        var (fm, fm1) = second;
        // F(n-1) is computed as F(n+1)-F(n)
        BigInteger fnPrev = fn1 - fn;

        // Identity: F(n+m) = F(n)*F(m+1) + F(n-1)*F(m)
        BigInteger combinedFib = fn * fm1 + fnPrev * fm;
        // Also, F(n+m+1) = F(n)*F(m) + F(n+1)*F(m+1)
        BigInteger combinedFibNext = fn * fm + fn1 * fm1;
        return (combinedFib, combinedFibNext);
    }

    // This method distributes the iterations across the available cores.
    // Each task computes a segment's Fibonacci pair.
    public static BigInteger CalculateFibParallel(int iterations)
    {
        int cores = Environment.ProcessorCount;
        // Divide iterations roughly equally
        int segmentSize = iterations / cores;
        int remainder   = iterations % cores;

        var tasks = new Task<(BigInteger, BigInteger)>[cores];
        for (int i = 0; i < cores; i++)
        {
            // Distribute any remainder one by one to the first few threads.
            int currentSegment = segmentSize + (i < remainder ? 1 : 0);
            tasks[i] = Task.Run(() => CalculateFibPair(currentSegment));
        }
        Task.WaitAll(tasks);

        // Combine the results sequentially (you could also use a parallel reduction here if desired).
        // Start with the identity (F(0), F(1)) = (0, 1)
        (BigInteger, BigInteger) result = (0, 1);
        foreach (var t in tasks)
        {
            result = CombineFibPairs(result, t.Result);
        }
        return result.Item1;
    }
}

class Program
{
    static void Main()
    {
        // Number of iterations (the larger, the more computation).
        int iterations = 1000000; 

        BigInteger fibNum = Operations.CalculateFibParallel(iterations);
        Console.WriteLine($"Fibonacci number at iteration {iterations}: {fibNum}");
    }
}

Explanation

Chunking the Work:
The total iterations are split among the available cores (using Environment.ProcessorCount). Each segment computes a “local” Fibonacci result in the form of (F(n), F(n+1)).
Combining Segments:
The function CombineFibPairs merges the results from two segments into one using the Fibonacci addition identity. (Note that since the Fibonacci recurrence is linear, there exists such an identity.)
Parallel Execution:
Tasks are used to start the segment calculations concurrently. Each task will run on a different core if available.
Final Result:
The results of all segments are combined, yielding the overall Fibonacci number after the full number of iterations.

Using this approach, the heavy computational work is split across all cores, maximizing CPU utilization.

This solution restructures the problem for parallel execution instead of the traditional sequential loop.

Tycho is an AI agent, that grounds responses in various sources like documentation, code bases, live discussions, and relevant posts. Want to chat privately with Tycho?

No comments yet.

Answer by MeteorCommander729 • 4 months ago

Lucas sequence is a bit faster than matrix method for finding fibonacci(n), with same time complexity O(log2(n)). For large integers, Karatsuba multiply can be used and also parallelized since it splits and operates on different parts of a big integer.

C
/* lucas sequence method */
int fib(int n) {
    int a, b, p, q, qq, aq;
    a = q = 1;
    b = p = 0;
    while(1) {
        if(n & 1) {
            aq = a*q;
            a = b*q + aq + a*p;
            b = b*p + aq;
        }
        n /= 2;
        if(n == 0)
            break;
        qq = q*q;
        q = 2*p*q + qq;
        p = p*p + qq;
    }
    return b;
}

No comments yet.

Answer by GalacticGuide149 • 5 months ago

Please see the comments on the question.

Given the Fibonacci sequence algorithm you want to use, the utilization of additional threads or CPI cores cannot improve its performance in principle. This algorithm cannot be parallelized.

Let me explain some background. There is no such thing as CPU thread. Threads are created dynamically in code, directly by the application developer or not. An example of indirect thread management is TPL, it implements parallelism based on Tasks and Parallel.

When a thread is scheduled for execution, the system assigns one of the CPU cores to execute a slice of a thread code. This assignment can change with time, it depends on the system scheduler algorithm and available resources. Even for single-core CPUs, threading is extremely important. Nevertheless, the application developer can indirectly affect this core assignment using processor affinity or limiting maximum degree of parallelism.

But nothing can organize even the partially parallel execution of an algorithm that is not parallel by its nature, like the sequential calculation of the Fibonacci sequence. I think just the term Fibonacci sequence should suggest it. The next member of the Fibonacci sequence is calculated using a recurrent relation to two previous members. The notions of parallel and sequential are opposite.

Now, I'm not saying that using threads for this algorithm is impossible. It is possible but would create a huge threading misuse. Threading and parallelism are not directly related. You can create and use threads, but you can use them sequentially. A typical effect of such a wrong design is a correct calculation with considerable performance losses. Why? In the case of Fibonacci, you can use two or more threads waiting for each other for the data required for the next recurring step. Only one thread at a time can be not in a wait state. It would be equivalent to single-threaded computing, but it is actually much worse, because of the waste of resources required for the support of the thread operation, thread synchronization, and inter-thread communication.

As a matter of fact, even the performance of some parallel algorithms can be compromised by using Tasks or Parallel. It can happen when the volume of calculations and the level of parallelism don't justify the resource costs of supporting parallelism mechanisms. In particular, it can happen when the number of available CPU cores is insufficient to shift the balance in favor of the parallel execution. In this case, even a potentially parallel algorithm can show better performance in a single-threaded implementation.

No comments yet.

Answer by VoidSentinel716 • 5 months ago

You can't parallelize current algorithm efficiently. Please, note that to compute F(n) your should know F(n - 1) which in turn requires F(n - 2) etc. So all you have to do is to compute F(0), F(1), F(2), ..., F(n) sequentially.

What you can do is to change algorithm from current O(n) time complexity to more efficient O(log(n)) algorithm:

Raise matrix A

|1 1|
|0 1|

into n power and take A[1, 1] item.
For instance,

A^10 = |34 55|
       |55 89|

A[1, 1] = 89 <- 10th Fibonacci number

How to do efficient raise into power? Represent n as binary:

n = 10 = 1010 (binary)

A^1 = [1 1; 0 1]
A^2 = (A^1)^2 = [1 1; 1 2]
A^4 = (A^2)^2 = [2 3; 3 5]
A^8 = (A^4)^2 = [13 21; 21 34]

A^10 = A^2 * A^8 = [34 55; 55 89]

Code:

CSHARP
// Matrix multiplication
private static BigInteger[,] Multiply(BigInteger[,] left, BigInteger[,] right) => 
  new BigInteger[,] {
    { left[0, 0] * right[0, 0] + left[0, 1] * right[1, 0], 
      left[0, 0] * right[0, 1] + left[0, 1] * right[1, 1] },

    { left[1, 0] * right[0, 0] + left[1, 1] * right[1, 0], 
      left[1, 0] * right[0, 1] + left[1, 1] * right[1, 1] },
  };

CSHARP
private static BigInteger Fibo(int n) {
  if (n < 0)
    return (n % 2 == 0 ? -1 : 1) * Fibo(-n);

  var result = new BigInteger[,] { { 1, 0 }, { 0, 1 } };
  var current = new BigInteger[,] { { 1, 1 }, { 1, 0 } };

  for (; n > 0; n /= 2) { // Note n /= 2 - we have O(log(n)) time 
    if (n % 2 != 0)
      result = Multiply(result, current);

    current = Multiply(current, current);
  }

  return result[0, 0];
}

Demo:

CSHARP
var result = Fibo(1_000_000);

var text = result.ToString();

Console.WriteLine($"{text[..5]} .. {text[^5..]} ({text.Length} digits)");

Output (about 1 second on my workstation):

31604 .. 37501 (208988 digits)

Fiddle

Edit:

Please, note that this algorithm can easily be made parallel, a very good candidate is Multiply:

CSHARP
// Matrix multiplication, parallel version
private static BigInteger[,] Multiply(BigInteger[,] left, BigInteger[,] right) {
  var result = new BigInteger[2, 2];

  Parallel.Invoke(
    () => result[0, 0] = left[0, 0] * right[0, 0] + left[0, 1] * right[1, 0],
    () => result[0, 1] = left[0, 0] * right[0, 1] + left[0, 1] * right[1, 1],
    () => result[1, 0] = left[1, 0] * right[0, 0] + left[1, 1] * right[1, 0],
    () => result[1, 1] = left[1, 0] * right[0, 1] + left[1, 1] * right[1, 1]);

  return result;
}

No comments yet.

Discussion

No comments yet.

How Can I Utilize All CPU Cores for a Fibonacci Calculation in C#?

4 Answers

Explanation

Discussion

Similar Posts

Why Do Multiple Local Python Subprocesses Experience Massive Slowdowns?

How can I configure Windows to use Python 3.13 for pip installations instead of MSYS2's Python?