Hashing bug related in generic dictionaries on Darwin/x86-64 and CPU i3-8100B
Summary
A simple program that uses a generic dictionary built with trunk compiler on macOS/x86_64 allows duplicate keys.
System Information
- Operating system: macOS
- Processor architecture: x86-64 Intel i3-8100B
- Compiler version: trunk [3.3.1-12931-gfa44cd9d]
- Device: Computer
Steps to reproduce
- Create a program named test:
program test;
{$MODE DELPHI}
uses
Lang;
begin
LangObj := TLangClass.Create;
LangObj.Test('Russian');
end.
- Create a second unit named lang:
unit Lang;
{$MODE DELPHI}
interface
uses
Generics.Collections;
type
TLangClass = class(TObjectDictionary<string, TObject>)
public
procedure Test(const Value: string);
end;
var
LangObj: TLangClass;
implementation
{TLanguage}
procedure TLangClass.Test(const Value: string);
const
SRussian = 'Russian';
var
Key: string;
LangDict: TObject;
begin
writeln('Adding string = ', SRussian);
WriteLn('Input equals const = ', Value = SRussian);
Add(SRussian, TObject.Create);
WriteLn('Added item found = ', TryGetValue(Value, LangDict));
WriteLn('Now oddly dictionary can hold a duplicate');
Add(Value, TObject.Create);
WriteLn('Dictionary keys: ');
for Key in Keys do
WriteLn('K = ', Key, ' V = ', Items[Key].GetHashCode);
end;
end.
- Compile it with trunk compiler on a x86_64 based macOS.
- Run the program and observe the following odd output:
user201578@SY451 x86_64-darwin % ./fpc test.pas && ./test
Free Pascal Compiler version 3.3.1-12931-gfa44cd9da6 [2023/07/10] for x86_64
Copyright (c) 1993-2023 by Florian Klaempfl and others
Target OS: Darwin for x86_64
Compiling test.pas
Assembling (pipe) test.s
Linking test
11 lines compiled, 1.2 sec, 1212416 bytes code, 393215 bytes data
Adding string = Russian
Input equals const = TRUE
Added item found = FALSE
Now oddly dictionary can hold a duplicate
Dictionary keys:
K = Russian V = 4375294400
K = Russian V = 4375294368
user201578@SY451 x86_64-darwin %
Example Project
Trivial example project attached: bug-generics-compiler.zip
What is the current bug behavior?
As you may see, the Test method takes a string value "Russian". Inside the method there is a constant with the same value. The input string and the constant are identical as confirmed by direct string comparison.
Yet when we add the constant to the dictionary and then try finding the input value, nothing is found. Consequentially the dictionary allows two identical string keys to be added.
What is the expected (correct) behavior?
Finding the input value must succeed as the same value was already added. Furthermore, dictionary, by design, must not allow duplicates.
Additional details
-
The bug only exhibits itself when the dictionary class is in a separate unit and it was introduced in commit e1bbcf70
-
The string itself has something to do with it. For example, if I replace "Russian" with "ABC" in both units, it works as expected.
-
Only the x86_64-darwin target seems affected, possibly with the specific CPU. I have further tested with aarch64-darwin, i386-win32, x86_64-win64 and x86_64-linux with newer CPUs and they all work correctly.
-
It's likely caused by Darwin's ABI issues with the ASM code from the above-mentioned commit, or the CPU doesn't understand some of the ASM instructions from that commit.