page 1 / 2
39 results tagged nlp
valeriansaliou/sonic https://github.com/valeriansaliou/sonic
Mon 26 Aug 2024 12:22:30 PM PDT archive.org

Sonic is a fast, lightweight and schema-less search backend. It ingests search texts and identifier tuples that can then be queried against in a microsecond's time.

Sonic can be used as a simple alternative to super-heavy and full-featured search backends such as Elasticsearch in some use-cases. It is capable of normalizing natural language search queries, auto-completing a search query and providing the most relevant results for a query. Sonic is an identifier index, rather than a document index; when queried, it returns IDs that can then be used to refer to the matched documents in an external database.

Strong attention was given to performance and code cleanliness when designing Sonic. It aims to be crash-free and super-fast, and to put minimal strain on server resources (our measurements have shown that Sonic, when under load, responds to search queries in the μs range, eats ~30MB RAM and has a low CPU footprint).

Available in Arch as extra/sonic.

Configuration docs: https://github.com/valeriansaliou/sonic/blob/master/CONFIGURATION.md
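
To make the "identifier index, rather than a document index" point concrete, here is a minimal sketch in plain Python. It is not Sonic's protocol or client API; the class, the method names, and the sample documents are all hypothetical. It only shows the shape of the idea: the index hands back IDs, and the documents themselves are fetched from a separate store.

```python
from collections import defaultdict


class IdentifierIndex:
    """Toy inverted index that stores only object IDs, never document bodies."""

    def __init__(self):
        # term -> set of object IDs whose text contained that term
        self._postings = defaultdict(set)

    def push(self, object_id, text):
        """Ingest a (text, identifier) pair; only the identifier is kept."""
        for term in text.lower().split():
            self._postings[term].add(object_id)

    def query(self, terms):
        """Return the sorted IDs of objects that contain every query term."""
        words = terms.lower().split()
        if not words:
            return []
        hits = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted(hits)


# The documents live in an external store (a plain dict stands in for a database).
documents = {
    "doc:1": "Sonic is a lightweight search backend",
    "doc:2": "Elasticsearch is a full-featured search backend",
}

index = IdentifierIndex()
for object_id, body in documents.items():
    index.push(object_id, body)

matched_ids = index.query("lightweight search")
matched_docs = [documents[i] for i in matched_ids]
```

The real thing adds normalization, auto-completion and relevance ranking on top; the toy version only does exact term matching.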

rust search searchengine lightweight nlp exocortex archival
UKPLab/sentence-transformers https://github.com/UKPLab/sentence-transformers
Tue 09 Jul 2024 01:42:17 PM PDT archive.org

This framework provides an easy method to compute dense vector representations for sentences, paragraphs, and images. The models are based on transformer networks like BERT / RoBERTa / XLM-RoBERTa etc. and achieve state-of-the-art performance in various tasks. Text is embedded in vector space such that similar texts are close together and can be found efficiently using cosine similarity. We provide an increasing number of state-of-the-art pretrained models for more than 100 languages, fine-tuned for various use-cases. Further, this framework allows easy fine-tuning of custom embedding models, to achieve maximal performance on your specific task. CUDA enabled.

Seems to lend itself to research coding. The real winner here is that you can generate embeddings and vectors for arbitrary text, which would make it ideal for writing a utility that could do only this without a lot of heavy lifting.

Comes with pre-trained models for over 100 languages. Has documentation and examples for building your own models.
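
A rough sketch of the "similar texts are found using cosine similarity" part, in plain Python with no dependencies. The three-dimensional vectors are invented for illustration and did not come from any model; a real utility would get its vectors from the library instead. The function names are my own, not part of sentence-transformers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_vec, corpus):
    """Return the corpus key whose vector has the highest cosine similarity to query_vec."""
    return max(corpus, key=lambda name: cosine_similarity(query_vec, corpus[name]))


# Hypothetical toy "embeddings"; real models emit hundreds of dimensions.
corpus = {
    "a cat sat on the mat": [0.9, 0.1, 0.0],
    "stock prices fell today": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # imagined embedding of "a kitten on a rug"

best = most_similar(query, corpus)
```

With these made-up vectors the query lands next to the cat sentence, which is the behavior the embedding space is supposed to give for semantically close text.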

python modules vectors nlp images models
fastText https://fasttext.cc/
Sat 30 Sep 2023 05:09:24 PM PDT archive.org

FastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to even fit on mobile devices.

Pre-trained word vectors can be downloaded.

Github: https://github.com/facebookresearch/fastText/

ml nlp classifiers library
AI Emoji https://emoji.fly.dev/
Fri 15 Sep 2023 05:19:10 PM PDT archive.org

Uses sdxl-emoji to turn natural language descriptions into custom emoji.

Github: https://github.com/cbh123/emoji

online ai emoji generators nlp
LibreTranslate/LibreTranslate https://github.com/LibreTranslate/LibreTranslate
Sun 17 Apr 2022 07:54:46 PM PDT archive.org

Free and Open Source Machine Translation API, entirely self-hosted. Unlike other APIs, it doesn't rely on proprietary providers such as Google or Azure to perform translations. Instead, its translation engine is powered by the open source Argos Translate library.

Supports per-user rate limits, e.g. you can issue API keys to users so that they can enjoy higher request limits per minute (if you also set --req-limit). By default all users are rate-limited based on --req-limit, but passing an optional api_key parameter to the REST endpoints allows a user to enjoy higher request limits. To use API keys simply start LibreTranslate with the --api-keys option.

There are also F/OSS mobile clients for Android and browser plugins.
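
A sketch of how the optional api_key parameter might be passed to the REST API, using only the standard library. Assumptions: a self-hosted instance at http://localhost:5000, a JSON POST to /translate with q, source, target and format fields, and a made-up key value. The helper name is hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_translate_request(base_url, text, source, target, api_key=None):
    """Build (but do not send) a POST request for a translation endpoint."""
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    if api_key:
        # Only included when the caller has been issued a key.
        payload["api_key"] = api_key
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local instance and a placeholder key, both hypothetical.
request = build_translate_request(
    "http://localhost:5000", "Hello", "en", "es", api_key="hypothetical-key"
)
```

Sending it would be a matter of calling urllib.request.urlopen(request) against a running instance and decoding the JSON response.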

python translation api webapps nlp foss exocortex leandra
LOCO: the 88-million word language of conspiracy corpus https://osf.io/snpcg/
Wed 16 Mar 2022 03:31:17 PM PDT archive.org
dataset corpus conspiracies linguistics nlp
jyguyomarch/awesome-conversational-ai http://jyguyomarch/awesome-conversational-ai
Sat 25 Dec 2021 06:38:31 PM PST archive.org

A curated list of delightful Conversational AI resources.

awesome ai ml conversational bots ui books nlp nlu
N2ITN/Reddit_Persona https://github.com/N2ITN/Reddit_Persona
Wed 15 Dec 2021 04:24:57 PM PST archive.org

Reddit Persona is a python module that extracts personality insights, sentiment & interests from a user account. (Support for subreddit analysis is not working due to the praw update from v3 to v5; fix incoming.)

Text is collected via reddit's python API, praw, and NLP is powered by the indico.io API.

python modules cli analysis socialnetworks nlp profiles
Intellexer REST API documentation http://esapi.intellexer.com/Home/Help
Wed 20 Oct 2021 04:51:30 PM PDT archive.org

Intellexer™ is a linguistic platform developed by EffectiveSoft.

Our API and SDK incorporate powerful linguistic tools for analyzing text in natural language. We encourage both developers and integrators to use them for improving existing or creating new Document/Knowledge management systems.

Our API and SDK provide effective capabilities for the development of various semantics-based solutions. The solutions can vary in the number and algorithmic complexity of the linguistic instruments used, depending on the customer's needs.

Free API key.

rest api exocortex nlp service
NLP Cloud https://nlpcloud.io/
Thu 22 Apr 2021 04:50:29 PM PDT archive.org

High performance NLP models as a service. Pre-trained. You can upload and run your own spaCy models as well. Seems to be GPU accelerated on the back-end because they're an nVidia partner.

Named entity recognition, classification, summarization, question in context answering, sentiment analysis, part of speech tagging.

Free tier: All pre-trained models, 3 API requests per minute.

Starter tier: All pre-trained models, 15 requests per minute, $39 USD/month.

service rest api nlp exocortex text
GitHub - MycroftAI/lingua-franca: Mycroft's multilingual text parsing and formatting library https://github.com/MycroftAI/lingua-franca
Thu 05 Dec 2019 01:48:08 PM PST archive.org

Lingua Franca is our multilingual Natural Language Processing library. It allows Mycroft to both understand and respond with naturally expressed entities such as numbers, dates and times. Stand-alone Python module. Ready-to-use and currently has support for Danish, Dutch, English, French, German, Hungarian, Italian, Portuguese, Spanish, and Swedish. Heuristic parsing routines to extract numbers, dates, times, or durations from a spoken language transcription. Natural language formatters for numbers, dates, times and durations as well as utilities for working with lists in multiple languages. Can reformat figures so they can be better pronounced by a synthesizer. Extract information from text to use in figuring out what the user wants and grab the stuff needed to do it.

python modules nlp nlu exocortex mycroft parser formats
GitHub - huggingface/transformers https://github.com/huggingface/transformers
Mon 02 Dec 2019 02:03:46 PM PST archive.org

Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides state-of-the-art general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet, CTRL...) for Natural Language Understanding (NLU) and Natural Language Generation (NLG) with 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.

ai ml python exocortex nlp nlg nlu languages modules neuralnetworks
chartbeat-labs/textacy https://github.com/chartbeat-labs/textacy
Wed 20 Nov 2019 01:31:50 PM PST archive.org

textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals --- tokenization, part-of-speech tagging, dependency parsing, etc. --- delegated to another library, textacy focuses primarily on the tasks that come before and follow after. Abstracts away the boilerplate for the stuff you actually care about.

Quickstart: https://chartbeat-labs.github.io/textacy/getting_started/quickstart.html

python modules exocortex nlp faas tokens tagging dependencies parsing ai ml
GitHub - probot/background-check: A GitHub App built with probot that performs a "background check" to identify users who have been toxic in the past, and shares their toxic activity in the maintainer’s repo. https://github.com/probot/background-check
Sun 06 Oct 2019 06:01:51 PM PDT archive.org

A bot implemented as a Github App which analyzes the interactions a user has had elsewhere on Github and uses sentiment analysis to figure out how toxic the user is likely to be in their interactions with your project.

Uses the Probot framework.

github apps bot sentiment nlp nodejs
Apertium | A free/open-source machine translation platform https://www.apertium.org/index.eng.html?dir=cat-por#translation
Sat 31 Aug 2019 08:18:37 PM PDT archive.org

A F/OSS natural language translation system that seems to want to give Google Translate a run for its money. The corpuses used for training appear to be crowdsourced, and I think you can download the trained models on their own. Aims to be self-hosted.

Github: https://github.com/apertium

Installation docs: http://wiki.apertium.org/wiki/Installation

translation foss ai ml nlp crowdsourcing languages
GitHub - microsoft/NeuronBlocks: NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego https://github.com/microsoft/NeuronBlocks
Mon 24 Jun 2019 04:39:07 PM PDT archive.org

An NLP deep learning toolkit for building training pipelines. Tries to minimize the effort for constructing the training and inference stages. Defines modular building blocks of neural network components, and a suite of NLP models. The end goal is to make building a neural network as easy as playing with Legos. Supports English and Chinese.

python ai ml deeplearning nlp toolkit
facebookresearch/pytext https://github.com/facebookresearch/pytext
Mon 03 Jun 2019 10:15:39 AM PDT archive.org

A deep learning NLP modeling framework based on PyTorch. Text classifiers, sequence taggers, joint intent-slot models.

python nlp deeplearning ai ml modules analysis frameworks
GitHub - wit-ai/pywit: Python library for Wit.ai https://github.com/wit-ai/pywit
Fri 24 May 2019 03:34:13 PM PDT archive.org
python modules api clients nlp
vaderSentiment · PyPI https://pypi.org/project/vaderSentiment/
Wed 08 May 2019 01:23:20 PM PDT archive.org

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media. It is fully open-sourced under the MIT License. Incorporated into NLTK.
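
To show what "lexicon and rule-based" means in practice, here is a toy scorer in plain Python. It is not VADER: the word list, the valence numbers, and the two rules (a booster word and a negation window) are all invented for illustration, and the function name is hypothetical.

```python
# Hypothetical mini-lexicon (word -> valence); these values are made up.
LEXICON = {"good": 2.0, "great": 3.0, "bad": -2.0, "terrible": -3.0}
NEGATIONS = {"not", "never", "no"}
BOOSTERS = {"very": 0.5, "really": 0.5}


def score(text):
    """Sum lexicon valences, applying a booster rule and a negation rule."""
    cleaned = text.lower()
    for mark in "!.,?":
        cleaned = cleaned.replace(mark, " ")
    tokens = cleaned.split()

    total = 0.0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        valence = LEXICON[token]
        # Rule 1: a booster directly before the word intensifies it.
        if i > 0 and tokens[i - 1] in BOOSTERS:
            boost = BOOSTERS[tokens[i - 1]]
            valence += boost if valence > 0 else -boost
        # Rule 2: a negation within the two preceding tokens flips the sign.
        if any(word in NEGATIONS for word in tokens[max(0, i - 2):i]):
            valence = -valence
        total += valence
    return total


positive = score("This is very good!")
negative = score("This is not very good.")
```

The lexicon supplies the raw polarity and the rules adjust it for context, which is why "not very good" comes out negative even though "good" is a positive word.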

python modules nlp sentiment analysis exocortex betafork
spacy · PyPI https://pypi.org/project/spacy/
Wed 08 May 2019 01:05:48 PM PDT archive.org

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pre-trained statistical models and word vectors, and currently supports tokenization for 45+ languages. It features the fastest syntactic parser in the world, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license.

python nlp exocortex models tokens languages ml betafork modules text