Pinecone is an expensive online vector database that can easily be replaced with any number of free, local alternatives (e.g. Faiss). I dunno who is throwing money around to make everyone promote it, but it's trivial to swap in other tools, most of which are also supported in LangChain.
It's a Python library for stitching together existing APIs for AI models.
Regarding 3)
It means that rather than training a new model to do your thing, you use existing models and combine them in interesting ways. For example, AutoGPT works roughly like this: give it a task, and it uses the ChatGPT API to create a plan to achieve that task. It tries to do the first item in the plan by picking a tool from a preconfigured toolbox (Google search, generating an image with Stable Diffusion, some predefined prompts, ...). Afterwards it assesses how far it got, updates the task list, and loops until the task is done.
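The loop described above can be sketched in a few lines of Python. This is a hedged, illustrative sketch, not AutoGPT's actual code: `call_llm`, `TOOLS`, and `pick_tool` are all made-up names, and the LLM call is stubbed out where a real agent would hit the ChatGPT API.

```python
# Sketch of an AutoGPT-style loop: plan, pick a tool, execute, reassess.
# All names here are illustrative, not real AutoGPT internals.

def call_llm(prompt):
    # Stub standing in for a ChatGPT API call; returns a canned plan.
    if "plan" in prompt:
        return ["search the web for X", "summarize the results"]
    return "done"

TOOLS = {
    "search": lambda step: f"results for: {step}",
    "summarize": lambda step: f"summary of: {step}",
}

def pick_tool(step):
    # Naive keyword-based tool selection; a real agent would ask the
    # LLM itself which tool fits the current step.
    return TOOLS["search"] if "search" in step else TOOLS["summarize"]

def run_agent(task, max_iterations=10):
    plan = call_llm(f"Make a plan for: {task}")
    log = []
    for _ in range(max_iterations):
        if not plan:
            break
        step = plan.pop(0)              # take the next step of the plan
        log.append(pick_tool(step)(step))  # execute it with a tool
        # A real agent would ask the LLM to revise the plan here,
        # based on how far the previous step got.
    return log

print(run_agent("research topic X"))
```

The interesting part is the revision step in the loop body: because the plan is re-assessed each iteration, the agent can recover when a step fails or turns out to be unnecessary.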
Regarding 4)
Some can run on your machine, some run in the cloud and you'll need to pay and get an API key.
Regarding 1) you could watch https://www.youtube.com/watch?v=klTvEwg3oJ4 . Pinecone is a vector database, and LLMs can use one to extend their memory beyond their token limit. Traditionally an LLM can answer only according to what's provided in its context, which is capped by the token limit; with a vector database, the LLM can query it to retrieve information, such as your name.
So.... if I wrote a book manuscript, and wanted an LLM to help me track plot holes by asking it questions about it, I can't do that with token limits (aside from various summarization tricks people use with ChatGPT), but I could somehow parse/train a system to represent the manuscript in the vector database and hook that up with my LLM?
You would partition the manuscript into a sequence of chunks, then call the OpenAI API to calculate a vector embedding for each chunk.
When you want to query your manuscript, you call the OpenAI API to calculate a vector embedding for the query, locally find the chunks "near" your query, concatenate those chunks, and pass this context text along with your query to GPT-3.5 Turbo or GPT-4.
I have written up small examples for doing this in Swift [1] and Common Lisp [2].
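The chunk/embed/retrieve steps above can be sketched as a small Python program. To keep it self-contained and runnable offline, `embed` here is a toy bag-of-words stand-in for a real embeddings API call (e.g. OpenAI's text-embedding endpoint, which returns a dense float vector); the function names and the chunk size are illustrative choices, not a prescribed implementation.

```python
# Minimal sketch of the retrieve-then-ask pattern: chunk the manuscript,
# embed each chunk, then at query time find the nearest chunks.
import math
from collections import Counter

def embed(text):
    # Toy embedding: a word-count vector. A real system would call an
    # embeddings API here and get a dense vector back.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(manuscript, chunk_size=200):
    # Partition the manuscript into fixed-size word chunks and embed each.
    words = manuscript.split()
    chunks = [" ".join(words[i:i + chunk_size])
              for i in range(0, len(words), chunk_size)]
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, query, k=3):
    # Embed the query and return the k nearest chunks; the caller would
    # concatenate these and prepend them to the prompt sent to the LLM.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

A dedicated vector database like Pinecone or Faiss replaces the linear scan in `retrieve` with an approximate nearest-neighbor index, which matters once you have far more chunks than a single manuscript produces.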
And the missing glue is that "vectors closest to the question string" actually produces pretty good results. You won't reach Google-level relevance, but for "free", with a really dumb search algorithm, you'll be at the level of an Elasticsearch instance tuned by someone who knows what they're doing.
I think, in all the chaos of the other cool stuff you can do with these models, people are glossing over that these LLMs close the loop on search based on word and sentence embedding techniques like word2vec, GloVe, ELMo, and BERT. The fact that you can generate quality embeddings for arbitrary text that represent its meaning semantically as a whole is cool as shit.
Adding to the other user's definition of LangChain: LLMs have what is called a "context", which is basically the amount of information the model can take into account at any one time. For GPT-3 it was about 2 pages of text; GPT-4 currently handles about 6 pages and will soon handle about 40. If you want the LLM to work with more data than that, LangChain lets you "chain" together multiple contexts that the LLM can gather data across.
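One simple way to picture that chaining: split the document into context-sized windows, ask the model about each window, then combine the partial answers in a final pass. This is a rough sketch of the idea (a map-reduce-style chain), not LangChain's API; `ask_llm` is a stub for a chat-completion call, and splitting by character count is a stand-in for real token counting.

```python
# Sketch of chaining multiple contexts over one long document.
# ask_llm() stands in for a chat-completion API call.

def ask_llm(prompt):
    # Stub; a real implementation would call an LLM API here.
    return f"partial answer based on {len(prompt)} chars"

def chain_over_document(document, question, window=2000):
    # Split into windows that each fit the model's context.
    # (Real splitters count tokens, not characters.)
    windows = [document[i:i + window]
               for i in range(0, len(document), window)]
    # "Map" step: ask the question against each window separately.
    partials = [ask_llm(f"{question}\n\nContext:\n{w}") for w in windows]
    # "Reduce" step: combine the partial answers into one final answer.
    return ask_llm(question + "\n\n" + "\n".join(partials))
```

The embedding-based retrieval described elsewhere in this thread is the other common strategy: instead of asking about every window, you first select only the windows relevant to the question.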
1. What's Pinecone and what does it solve?
2. Same with LangChain.
3. What does it mean to "build something with a pre-existing model"?
4. How do people actually run these models? (e.g. if I want access to Segment-Anything, how do I get that?).