Tags
Tags give you the ability to mark specific points in a repository's history as being important.
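The `master-<shorthash>` naming in the list below suggests one tag per merged commit. A minimal, self-contained sketch of how such a lightweight tag can be created and listed with plain git (the temporary repository and empty commit here are purely illustrative, not how the project actually cuts its tags):

```shell
# Set up a throwaway repository with one commit (illustrative only)
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "llama : fixed rlimit error message (#888)"

# Create a lightweight tag named after the short hash,
# mirroring the master-<shorthash> scheme seen below
git tag "master-$(git rev-parse --short HEAD)"

# List tags matching the pattern; prints the tag just created
git tag -l 'master-*'
```

Annotated tags (`git tag -a <name> -m <message>`) would additionally record a tagger and message, but the lightweight form above is enough to pin a commit by name.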
master-25d7abb · 25d7abbd · llama : fixed rlimit error message (#888) · Apr 21, 2023
master-018f227 · 018f2279 · cmake : link threads publicly to ggml (#1042) · Apr 21, 2023
master-9411288 · 94112882 · main : evaluate tokens in batches after swapping context (#1014) · Apr 21, 2023
master-8687c1f · 8687c1f2 · llama : remember and restore kv cache data pointers (#1104) · Apr 21, 2023
master-1bfc153 · 1bfc153e · ggml : a faster version for Q4_1 x Q8_0 dot products (#1083) · Apr 21, 2023
master-3d59769 · 3d59769c · Show perplexity ETA in hours and minutes (#1096) · Apr 21, 2023
master-d40fded · d40fded9 · llama : fix comment for "output.weight" tensor · Apr 21, 2023
master-12b5900 · 12b5900d · ggml : sync ggml (add GPT-NeoX RoPE implementation) · Apr 20, 2023
master-9ff334f · 9ff334f3 · ggml : fix bug in ggml_compute_forward_dup_f32() · Apr 20, 2023
master-2005469 · 2005469e · Add Q4_3 support to cuBLAS (#1086) · Apr 20, 2023
master-8a1756a · 8a1756ab · ggml : do not break cuBLAS build (Q4_3 is not yet implemented) · Apr 20, 2023
master-66aab46 · 66aab460 · ggml : fix Q4_3 quantization · Apr 20, 2023
master-38de86a · 38de86a7 · llama : multi-threaded quantization (#1075) · Apr 20, 2023
master-e0305ea · e0305ead · ggml : add Q4_3 quantization (#1082) · Apr 20, 2023
master-6a9661e · 6a9661ea · ci : remove the LLAMA_ACCELERATE matrix dimension from Ubuntu builds in the CI (#1074) · Apr 20, 2023
master-5addcb1 · 5addcb12 · fix: LLAMA_CUBLAS=1 undefined reference 'shm_open' (#1080) · Apr 20, 2023
master-c8c2c52 · c8c2c524 · AVX2 optimization for vec_dot_q4_2_q8_0 (#1068) · Apr 20, 2023
master-02d6988 · 02d69881 · Improve cuBLAS performance by dequantizing on the GPU (#1065) · Apr 20, 2023
master-f7d0509 · f7d05095 · Q4_2 quantization with rmse-optimized scale and quants (#1062) · Apr 19, 2023
master-884e7d7 · 884e7d7a · ggml : use 8-bit precision for Q4_1 intermediate results (#1047) · Apr 19, 2023