Yet Another Soft Robot Evolver. Evolutionary computation experiments using Evolution Gym as a base to play with evolutionary computation algorithms and other weirder things. It can also be useful as a minimalist codebase to learn how to use evogym without having to worry about PPO and stuff.
Open WebUI is an extensible, feature-rich, and user-friendly self-hosted WebUI designed to operate entirely offline. It supports various LLM runners, including Ollama and OpenAI-compatible APIs. Effortlessly integrate OpenAI-compatible APIs for versatile conversations alongside Ollama models. Customize the OpenAI API URL to link with LMStudio, GroqCloud, Mistral, OpenRouter, and more. Seamlessly integrate custom logic and Python libraries into Open WebUI using Pipelines Plugin Framework. Launch your Pipelines instance, set the OpenAI URL to the Pipelines URL, and explore endless possibilities. Examples include Function Calling, User Rate Limiting to control access, Usage Monitoring with tools like Langfuse, Live Translation with LibreTranslate for multilingual support, Toxic Message Filtering and much more. Enjoy a seamless experience across Desktop PC, Laptop, and Mobile devices.
Txtify is a free and open-source web app for converting audio and video to text using advanced AI models. It supports YouTube videos and personal media files, offering fast and accurate transcriptions. Txtify can be self-hosted, giving you full control over your transcription process.
Singulatron is an app that lets you run AI anywhere! It is private, works offline, and can run on your laptop, PC, or even on your company computers or servers. It's not just an app but also a platform that enables building other AI applications on top of it.
Singulatron aims to be both a desktop app for local usage and also to work as a distributed daemon to drive servers, with a web app frontend client that is the same as the local app. Private: your chats never leave your computer.
Works without an internet connection. The prompt queue system lets you input many prompts at once - even across threads - they will be processed sensibly. You can leave threads and return - streaming won't be interrupted. A download manager makes sure your models are well kept.
Unfortunately, it's an Electron app.
Perplexica is an open-source, AI-powered search engine that goes deep into the internet to find answers. Inspired by Perplexity AI, it's an open-source option that not only searches the web but also understands your questions. It uses machine learning techniques such as similarity search and embeddings to refine results, and provides clear answers with sources cited. Using SearxNG to stay current and fully open source, Perplexica ensures you always get the most up-to-date information without compromising your privacy.
You can make use of local LLMs such as Llama3 and Mixtral using Ollama. Normal or Copilot modes. Special modes to better answer specific types of questions. Some search tools might give you outdated info because they use data from crawling bots, convert it into embeddings and store it in an index. Unlike them, Perplexica uses SearxNG, a metasearch engine, to fetch the results, then reranks them and picks out the most relevant sources, ensuring you always get the latest information without the overhead of daily data updates.
Has a documented installation process that doesn't require Docker.
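The reranking step described for Perplexica can be illustrated with a small sketch. This is not Perplexica's code: the bag-of-words embed() function is a toy stand-in for a real embedding model, and the result dictionaries (url, snippet) are a hypothetical shape for what a metasearch engine might return.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query: str, results: list, top_k: int = 3) -> list:
    """Sort freshly fetched results by similarity to the query, best first."""
    q = embed(query)
    ranked = sorted(results, key=lambda r: cosine(q, embed(r["snippet"])), reverse=True)
    return ranked[:top_k]


# Hypothetical results, shaped like what a metasearch engine might return.
results = [
    {"url": "https://example.org/a", "snippet": "recipes for sourdough bread"},
    {"url": "https://example.org/b", "snippet": "how to self-host a search engine"},
    {"url": "https://example.org/c", "snippet": "self-host your own metasearch engine with docker"},
]
top = rerank("self-host metasearch engine", results, top_k=2)
for hit in top:
    print(hit["url"])
```

Because the results are fetched at query time and only then scored, nothing has to be crawled and indexed in advance, which is the trade-off the description points at.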
Chunking documents is a challenging task that underpins any RAG system. High quality results are critical to a successful AI application, yet most open-source libraries are limited in their ability to handle complex documents. Open Parse is designed to fill this gap by providing a flexible, easy-to-use library capable of visually discerning document layouts and chunking them effectively.
Visually driven. Parses Markdown. Can analyze data tables by extracting them into Markdown tables.
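To make "chunking" concrete, here is a minimal sketch of a heading-aware chunker for Markdown text. It is not Open Parse's API and does none of its visual layout analysis; the function names and the max_chars parameter are assumptions made for illustration. It also treats any line starting with "#" as a heading, so it would be fooled by comments inside code fences.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_sections(text):
    """Yield (heading, body) pairs, starting a new section at each Markdown heading."""
    heading, lines = "", []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            if heading or any(l.strip() for l in lines):
                yield heading, "\n".join(lines).strip()
            heading, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    if heading or any(l.strip() for l in lines):
        yield heading, "\n".join(lines).strip()


def chunk_markdown(text, max_chars=200):
    """Chunk by heading first, then pack paragraphs greedily up to max_chars."""
    chunks = []
    for heading, body in split_sections(text):
        current = ""
        for para in re.split(r"\n\s*\n", body):
            para = para.strip()
            if not para:
                continue
            if current and len(current) + len(para) + 2 > max_chars:
                # Adding this paragraph would overflow: close the current chunk.
                chunks.append({"heading": heading, "text": current})
                current = para
            else:
                current = f"{current}\n\n{para}" if current else para
        if current:
            chunks.append({"heading": heading, "text": current})
    return chunks


doc = """# Intro
Short opening paragraph.

## Details
First paragraph of details.

Second paragraph of details.
"""
chunks = chunk_markdown(doc, max_chars=40)
for chunk in chunks:
    print(chunk["heading"], "|", chunk["text"])
```

Keeping the heading with every chunk means each piece still carries some context when it is retrieved on its own. A single paragraph longer than max_chars is kept whole rather than cut mid-sentence.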
Surya is a document OCR toolkit.
Built on top of PyTorch. Multiple models.
Cross-platform, open-source voice assistant and framework to build fully-featured, offline machines you can talk to. Self-hosted. Desktop and mobile clients. Repos of note:
AlgorithmWatch is a human rights organization based in Berlin and Zurich. We fight for a world where algorithms and Artificial Intelligence (AI) do not weaken justice, democracy, and sustainability, but strengthen them.
Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying. To support the research community, we are providing access to pretrained model checkpoints, which are ready for inference and available for commercial use.
Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases. This enables vector search with SQL, topic modeling, retrieval augmented generation and more. Embeddings databases can stand on their own and/or serve as a powerful knowledge source for large language model (LLM) prompts.
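The idea of "vector search with SQL" can be sketched with nothing but Python's standard library. This is only an illustration of the concept, not how any particular embeddings database is implemented: the docs table, the JSON-encoded vec column and the hand-written 3-dimensional vectors are all assumptions, and a real system would use a model to produce the vectors and an index instead of a full scan.

```python
import json
import math
import sqlite3


def cosine(a_json: str, b_json: str) -> float:
    """Cosine similarity between two vectors stored as JSON arrays."""
    a, b = json.loads(a_json), json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL so it can be used in ORDER BY.
conn.create_function("cosine", 2, cosine)
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, topic TEXT, body TEXT, vec TEXT)")

# Hand-written vectors for illustration only.
rows = [
    (1, "pets", "cats and dogs", [0.9, 0.1, 0.0]),
    (2, "pets", "feeding a parrot", [0.7, 0.3, 0.1]),
    (3, "cars", "engine maintenance", [0.0, 0.2, 0.9]),
    (4, "cars", "pet-friendly car seats", [0.6, 0.1, 0.6]),
]
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?, ?)",
    [(i, topic, body, json.dumps(vec)) for i, topic, body, vec in rows],
)

query_vec = json.dumps([1.0, 0.0, 0.0])

# Pure similarity search: the two rows closest to the query vector.
nearest = conn.execute(
    "SELECT id FROM docs ORDER BY cosine(vec, ?) DESC LIMIT 2", (query_vec,)
).fetchall()

# Similarity search combined with an ordinary relational filter.
filtered = conn.execute(
    "SELECT id, body FROM docs WHERE topic = ? ORDER BY cosine(vec, ?) DESC LIMIT 1",
    ("cars", query_vec),
).fetchall()
print(nearest, filtered)
```

The second query is the interesting one: the relational filter and the similarity ranking live in the same statement, which is what the union of vector indexes and relational databases buys you.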
Features
Artificial Intelligence (AI) is often presented as a complex field: the state of the art impossible to understand, models too large to train, incredible work in progress that could change anything, and yet a black box inscrutable to anyone except the select few.
This is truly damaging to the field as it is a fascinating topic and even though indeed nobody can understand it all, we can all benefit from tinkering with it, learning from it and possibly even using it.
Regardless of all those limitations, the goal here is to showcase that even though not everything can be done on your desktop, a lot can. Building on that and learning how it works can help to reconsider a potential feeling of helplessness. Not only can you self-host AI models, use them and adapt them, but there is a whole community and set of tools to help you do so. This movement itself is very encouraging. AI does not have to be a black box. Your digital life does not have to be owned by someone else, even for the state of the art.
Your AI second brain. A copilot to search and chat (using RAG) with your knowledge base (pdf, markdown, org). Use powerful, online (e.g. gpt4) or private, offline (e.g. mistral) LLMs. Self-host locally or have it always accessible on the cloud. Access from Obsidian, Emacs, Desktop app, Web or WhatsApp.
Khoj is an AI application to search and chat with your notes and documents. It is open-source, self-hostable and accessible on Desktop, Emacs, Obsidian, Web and WhatsApp. It works with pdf, markdown, org-mode, Notion files and GitHub repositories. It can paint, search the internet and understand speech.
Weaviate is an open source vector database that stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance and scalability of a cloud-native database, all accessible through GraphQL, REST, and various language clients.
With Weaviate, you can turn your text, images and more into a searchable vector database using state-of-the-art ML models. Weaviate typically performs a 10-nearest-neighbor (10-NN) search out of millions of objects in single-digit milliseconds. You can use Weaviate to conveniently vectorize your data at import time, or alternatively you can upload your own vectors (say, if you download a model from OpenAI or HuggingFace). Weaviate powers lightning-fast vector searches, but it is capable of much more. Some of its other superpowers include recommendation, summarization, and integrations with neural search frameworks.
Milvus is an open-source vector database built to power embedding similarity search and AI applications. Milvus makes unstructured data search more accessible, and provides a consistent user experience regardless of the deployment environment.
Millisecond search on trillion vector datasets. Rich APIs designed for data science workflows. Consistent user experience across laptop, local cluster, and cloud. Embed real-time search and analytics into virtually any application. Component-level scalability makes it possible to scale up and down on demand. Milvus can autoscale at a component level according to the load type, making resource scheduling much more efficient.
Welcome to Machine Learning Systems with TinyML. This book is your gateway to the fast-paced world of AI systems through the lens of embedded systems. It is an extension of the course, TinyML from CS249r at Harvard University.
Our aim is to make this open-source book a collaborative effort that brings together insights from students, professionals, and the broader community of applied machine learning practitioners. We want to create a one-stop guide that dives deep into the nuts and bolts of AI systems and their many uses.
An interactive visualization (with simple explanations) of how large language models work.
A database that tries to make it easy to build an LLM-like search database. Super-simple API for loading data and querying it.
You can do everything in your code or run it as a server (chroma run --path /path/to/datastore/on/disk) and use an HTTP client to interact with it.
This is an amalgam of TTPs for different offensive ML attacks, encompassing the ML supply chain and adversarial ML attacks.
It is focused heavily on attacks that have code you can use to perform the attacks right away, rather than a database of research papers. (PoC or GTFO type logic). Generally speaking if it is here I have tested it and it works. The intent is to help red teams and offensive practitioners quickly understand what tool in the toolbox to use to attack ML environments.
This is a living vault. It is very much not a finished list of resources. There are pages that are polished, and some that are little more than placeholders with a few bullet points that I jotted down during conferences or on the fly.
The goal is to organize the attacks in a way that is useful to red team operators rather than useful for, say, academics trying to understand adversarial ML.
The "Awesome GPTs (Agents) Repo" represents an initial effort to compile a comprehensive list of GPT agents focused on cybersecurity (offensive and defensive), created by the community. Please note, this repository is a community-driven project and may not list all existing GPT agents in cybersecurity. Contributions are welcome – feel free to add your own creations!
Disclaimer: Users should exercise caution and evaluate the agents before use. Additionally, please note that some of these GPTs are still in an experimental test phase.