inference

Projects with this topic

Ciprian Mandache / aigate

[mirror] A self-hosted AI platform — inference, tool use, browser automation, image generation, speech synthesis, transcription, object storage, agentic code execution, and more — behind a single OpenAI-compatible endpoint. One docker-compose up.

AI ai-gateway browser-auto... claude-code Docker docker-compose image-genera... inference litellm llm llm-gateway LLMOps mcp MCP-server ollama openai-compa... self-hosted self-hosted-ai text-to-speech

0

Updated Jul 31, 2026


hadarosht-dev / mockaccel-runtime

Mock embedded inference-accelerator runtime: a small C++/Python SDK and Linux daemon used as a test fixture for build automation and CI tooling.

embedded cmake pybind11 cpp inference

0

Updated Jun 17, 2026


Syed Rizwan Shah / AI Chat LLM Local

A powerful, single-file frontend for LM Studio — run local LLMs with a professional chat interface, knowledge base, guardrails, and full document support. No installation. No server. Just open the HTML file.

AI LLMs inference HTML JavaScript MCP-server local-ai

1

Updated Jun 02, 2026


Lexy Callemeyn / Hive Inference P2P

Hive is a peer-to-peer system that distributes AI inference tasks across volunteer workers ("bees") running local or cloud LLMs. Send the same task to multiple bees in parallel, then automatically merge their outputs over several rounds to make small models smarter together. Built with Rust, Tauri, and libp2p.

Rust tauri AI inference p2p libp2p peer-to-peer distributed ... llm decentralized collaborative open source local-llm

0

Updated May 31, 2026


Ayi NEDJIMI / VRAMSwapper

Intelligent VRAM/RAM swapping (offloading) for LLM inference - Extension of KVortex

https://ayinedjimi-consultants.fr

cuda gpu inference llm memory-manag... nvidia offloading Python vRAM kvortex deep-learning

0

Updated May 22, 2026


Ayi NEDJIMI / ModelBench

Automated LLM Benchmarking on GPU - tokens/sec, latency percentiles, VRAM profiling, multi-format support (HuggingFace, GGUF, GPTQ)

https://ayinedjimi-consultants.fr

benchmark cuda gguf gptq gpu inference llm mlops performance PyTorch transformers vRAM benchmarking deep-learning nvidia Python

0

Updated May 22, 2026


Ayi NEDJIMI / KVortex

VRAM to RAM Offloader for AI and vLLM - High-Performance C++23 KV Cache Engine with Multi-Stream GPU Transfers

https://ayinedjimi-consultants.fr

AI cpp23 cuda GPU-computing high-perform... kv-cache llm-inference machine-lear... vllm vram-offload cpp deep-learning gpu inference nvidia vRAM

0

Updated May 22, 2026


Ayi NEDJIMI / flashquant

Extreme KV Cache Compression for LLM Inference — C++17/CUDA implementation of TurboQuant (arXiv 2504.19874). 7.5x compression, <2% quality loss.

https://ayinedjimi-consultants.fr

compression cpp cuda flash-attention gpu inference kv-cache llm machine-lear... PyTorch quantization transformer turboquant vllm

0

Updated May 22, 2026


Giulio Tani Raffaelli / Bayesian Authorship Attribution II

In this project, we discard the hypothesis of a discrete distribution for the probability of the words and look for the proper correction to exchangeability to better attribute books to authors.

bayes inference authorship poisson-diri...

0

Updated Nov 16, 2025


Michael Schmitz / Fast_Faster_Mask R-CNN Inference Comparision

Evaluation of Fast, Faster and Mask R-CNN regarding their inference times

Faster R-CNN Fast R-Cnn Mask R-CNN detectron2 inference

0

Updated May 08, 2022
