[Bug] TParallel.For Parallelism Bug in FPC System.Threading When Parallelism = 1

Summary

When using TParallel.For from the System.Threading unit in an FPC project, the loop body is not executed correctly when the “parallelism” is 1, leading to incorrect computations. The same logic works correctly when the parallelism is greater than 1.

In other words: TParallel.For(0, 0, ...) behaves differently (and incorrectly) compared to TParallel.For(0, N, ...) with N >= 1.

This works correctly in Delphi.


System Information

  • Operating system: Windows 10 (64-bit)
  • Processor architecture: x86-64
  • Compiler version: FPC 3.3.1 x86_64-win64-win32/win64 (trunk)
  • Device: Desktop PC

Steps to reproduce

  1. Create a new console program.

  2. Paste the following code:

     program TestParallelForBug;
    
     {$IFDEF FPC}
       {$MODE DELPHI}
     {$ENDIF}
    
     {$APPTYPE CONSOLE}
    
     uses
       SysUtils,
       System.Threading;
    
     const
       NUM_ITEMS = 16;
    
     procedure RunTest(AParallelism: Integer);
     var
       Data, SerialData: TArray<Integer>;
       I: Integer;
     begin
       Writeln('--- Parallelism = ', AParallelism, ' ---');
    
       SetLength(Data, NUM_ITEMS);
       SetLength(SerialData, NUM_ITEMS);
    
       // Baseline: serial computation
       for I := 0 to NUM_ITEMS - 1 do
         SerialData[I] := I * 2;
    
       // Parallel version
       TParallel.&For(
         0,
         AParallelism - 1,
         procedure(Lane: Integer)
         var
           J, StartIdx, EndIdx, ChunkSize: Integer;
         begin
           // Simple even split of work between lanes
           ChunkSize := NUM_ITEMS div AParallelism;
    
           StartIdx := Lane * ChunkSize;
           if Lane = AParallelism - 1 then
             EndIdx := NUM_ITEMS - 1 // last lane takes the remainder
           else
             EndIdx := StartIdx + ChunkSize - 1;
    
           for J := StartIdx to EndIdx do
             Data[J] := J * 2;
         end
       );
    
       // Compare serial vs parallel result
       for I := 0 to NUM_ITEMS - 1 do
       begin
         if Data[I] <> SerialData[I] then
           Writeln(Format('Mismatch at index %d: expected %d, got %d',
             [I, SerialData[I], Data[I]]));
       end;
    
       Writeln;
     end;
    
     begin
       try
         RunTest(1);  // this is where the FPC TParallel bug should show
         RunTest(4);  // this is correct on both Delphi and FPC
       except
         on E: Exception do
           Writeln(E.ClassName, ': ', E.Message);
       end;
    
       Writeln('Done. Press ENTER to exit.');
       Readln;
     end.
    
  3. Compile and run the program.

  4. Observe the output for:

    • Parallelism = 1
    • Parallelism = 4

What is the current bug behavior?

  • For Parallelism = 1, the TParallel.For section does not produce the same result as the serial loop.
  • The output shows mismatches between Data[] (parallel) and SerialData[] (baseline).
  • For Parallelism = 4, the results match and no mismatches are printed.

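This suggests that FPC's System.Threading implementation of TParallel.For behaves incorrectly (or does not execute the body as expected) when the loop range collapses to a single iteration (e.g. 0..0). Note that index 0 can never be reported as a mismatch in this test, because its expected value (0) equals the zero-initialized default of the array, so the output is also consistent with the body not running at all.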
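
To tell "body never runs" apart from "body runs with wrong data", a small diagnostic along the following lines may help. It is only a sketch: the program name and the CountCalls helper are hypothetical and not part of the original test, and it assumes the same TParallel.&For call shape and compiler setup as the reproduction program above. Each index writes only its own slot of the array, so no locking is used.

```pascal
program CountLaneCalls;

{$IFDEF FPC}
  {$MODE DELPHI}
{$ENDIF}

{$APPTYPE CONSOLE}

uses
  SysUtils,
  System.Threading;

// Hypothetical helper (not part of the original report): records how many
// times the loop body is invoked for each index in 0..ALaneCount - 1.
procedure CountCalls(ALaneCount: Integer);
var
  Hits: TArray<Integer>;
  I: Integer;
begin
  SetLength(Hits, ALaneCount); // SetLength zero-initializes the new elements

  TParallel.&For(
    0,
    ALaneCount - 1,
    procedure(Lane: Integer)
    begin
      // Each lane touches only its own slot; out-of-range indices are
      // reported instead of being written.
      if (Lane >= 0) and (Lane < ALaneCount) then
        Hits[Lane] := Hits[Lane] + 1
      else
        Writeln('Unexpected lane index: ', Lane);
    end
  );

  for I := 0 to ALaneCount - 1 do
    Writeln(Format('lanes=%d index=%d calls=%d', [ALaneCount, I, Hits[I]]));
end;

begin
  CountCalls(1); // range 0..0, the failing case in this report
  CountCalls(4); // range 0..3, the working case in this report
end.
```

If the expected behavior described below holds, every index should report exactly one call; any other count for the 0..0 range would help narrow down where the implementation goes wrong.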


What is the expected (correct) behavior?

  • TParallel.For(0, AParallelism - 1, ...) should behave like a normal for-loop over that inclusive integer range, including when AParallelism = 1.
  • For both Parallelism = 1 and Parallelism = 4, the computed Data[] array should be identical to the serial baseline SerialData[] (where SerialData[I] = I * 2).
  • No mismatches should be printed in either case.
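
As a cross-check that the chunking logic of the test itself is sound, the lane body can be run in a plain for-loop over the same inclusive range, without System.Threading. The following is a minimal sketch; the program name and the CountSerialMismatches function are hypothetical names introduced here for illustration.

```pascal
program SerialLaneReference;

{$IFDEF FPC}
  {$MODE DELPHI}
{$ENDIF}

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  NUM_ITEMS = 16;

// Hypothetical reference: runs the report's lane body for
// Lane = 0..AParallelism - 1 in an ordinary for-loop and returns the
// number of elements that differ from the expected value I * 2.
function CountSerialMismatches(AParallelism: Integer): Integer;
var
  Data: TArray<Integer>;
  Lane, J, StartIdx, EndIdx, ChunkSize: Integer;
begin
  SetLength(Data, NUM_ITEMS);
  ChunkSize := NUM_ITEMS div AParallelism;

  for Lane := 0 to AParallelism - 1 do
  begin
    StartIdx := Lane * ChunkSize;
    if Lane = AParallelism - 1 then
      EndIdx := NUM_ITEMS - 1 // last lane takes the remainder
    else
      EndIdx := StartIdx + ChunkSize - 1;

    for J := StartIdx to EndIdx do
      Data[J] := J * 2;
  end;

  Result := 0;
  for J := 0 to NUM_ITEMS - 1 do
    if Data[J] <> J * 2 then
      Inc(Result);
end;

begin
  Writeln('Parallelism = 1: ', CountSerialMismatches(1), ' mismatches');
  Writeln('Parallelism = 4: ', CountSerialMismatches(4), ' mismatches');
end.
```

This serial version fills all 16 elements for both values, so it reports 0 mismatches in each case. That indicates the mismatches shown in the logs below come from how the single-iteration range is dispatched rather than from the partitioning arithmetic.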

Relevant logs and/or screenshots

Example output on the affected setup (FPC System.Threading TParallel.For):

--- Parallelism = 1 ---
Mismatch at index 1: expected 2, got 0
Mismatch at index 2: expected 4, got 0
Mismatch at index 3: expected 6, got 0
Mismatch at index 4: expected 8, got 0
Mismatch at index 5: expected 10, got 0
Mismatch at index 6: expected 12, got 0
Mismatch at index 7: expected 14, got 0
Mismatch at index 8: expected 16, got 0
Mismatch at index 9: expected 18, got 0
Mismatch at index 10: expected 20, got 0
Mismatch at index 11: expected 22, got 0
Mismatch at index 12: expected 24, got 0
Mismatch at index 13: expected 26, got 0
Mismatch at index 14: expected 28, got 0
Mismatch at index 15: expected 30, got 0

--- Parallelism = 4 ---

Done. Press ENTER to exit.

  • Parallelism = 1 produces wrong data.
  • Parallelism > 1 works correctly.

Edited Nov 27, 2025 by Ugochukwu Mmaduekwe