Tags
Tags give you the ability to mark specific points in a repository's history as being important.
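The `master-<shorthash>` naming in the list below suggests one tag per merged commit. A minimal, self-contained sketch of how such a lightweight tag can be created and listed with plain git (the temporary repository and empty commit here are purely illustrative, not how the project actually cuts its tags):

```shell
# Set up a throwaway repository with one commit (illustrative only)
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "llama : fixed rlimit error message (#888)"

# Create a lightweight tag named after the short hash,
# mirroring the master-<shorthash> scheme seen below
git tag "master-$(git rev-parse --short HEAD)"

# List tags matching the pattern; prints the tag just created
git tag -l 'master-*'
```

Annotated tags (`git tag -a <name> -m <message>`) would additionally record a tagger and message, but the lightweight form above is enough to pin a commit by name.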
master-25d7abb · 25d7abbd · llama : fixed rlimit error message (#888) · Apr 21, 2023
master-018f227 · 018f2279 · cmake : link threads publicly to ggml (#1042) · Apr 21, 2023
master-9411288 · 94112882 · main : evaluate tokens in batches after swapping context (#1014) · Apr 21, 2023
master-8687c1f · 8687c1f2 · llama : remember and restore kv cache data pointers (#1104) · Apr 21, 2023
master-1bfc153 · 1bfc153e · ggml : a faster version for Q4_1 x Q8_0 dot products (#1083) · Apr 21, 2023
master-3d59769 · 3d59769c · Show perplexity ETA in hours and minutes (#1096) · Apr 21, 2023
master-d40fded · d40fded9 · llama : fix comment for "output.weight" tensor · Apr 21, 2023
master-12b5900 · 12b5900d · ggml : sync ggml (add GPT-NeoX RoPE implementation) · Apr 20, 2023
master-9ff334f · 9ff334f3 · ggml : fix bug in ggml_compute_forward_dup_f32() · Apr 20, 2023
master-2005469 · 2005469e · Add Q4_3 support to cuBLAS (#1086) · Apr 20, 2023
master-8a1756a · 8a1756ab · ggml : do not break cuBLAS build (Q4_3 is not yet implemented) · Apr 20, 2023
master-66aab46 · 66aab460 · ggml : fix Q4_3 quantization · Apr 20, 2023
master-38de86a · 38de86a7 · llama : multi-threaded quantization (#1075) · Apr 20, 2023
master-e0305ea · e0305ead · ggml : add Q4_3 quantization (#1082) · Apr 20, 2023
master-6a9661e · 6a9661ea · ci : remove the LLAMA_ACCELERATE matrix dimension from Ubuntu builds in the CI (#1074) · Apr 20, 2023
master-5addcb1 · 5addcb12 · fix: LLAMA_CUBLAS=1 undefined reference 'shm_open' (#1080) · Apr 20, 2023
master-c8c2c52 · c8c2c524 · AVX2 optimization for vec_dot_q4_2_q8_0 (#1068) · Apr 20, 2023
master-02d6988 · 02d69881 · Improve cuBLAS performance by dequantizing on the GPU (#1065) · Apr 20, 2023
master-f7d0509 · f7d05095 · Q4_2 quantization with rmse-optimized scale and quants (#1062) · Apr 19, 2023
master-884e7d7 · 884e7d7a · ggml : use 8-bit precision for Q4_1 intermediate results (#1047) · Apr 19, 2023