
Thanks, I'll have a look.

My use for this is very different--I want to be able to use a specific subset of my archived pages (which is mostly reference documentation) to "chat" with, providing different LLM prompts depending on subset and fetching plaintext chunks as reference info for the LLM to summarize (and point me back to the archived pages if I need more info).
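
For illustration only, the flow described above (pick a prompt per subset, pull plaintext chunks, hand them to the LLM with links back to the archive) might look roughly like the sketch below. Everything named here is hypothetical: the `PROMPTS` table, the subset names, `chunk_text`, `build_prompt`, and the example URL are not from ArchiveBox or from the setup described in this comment, and the actual LLM call is left out.

```python
# Hypothetical per-subset system prompts; the subset names are made up.
PROMPTS = {
    "reference-docs": "Answer using only the excerpts below and cite their URLs.",
    "default": "Summarize the excerpts below and cite their URLs.",
}


def chunk_text(text, max_chars=500):
    """Pack paragraphs into chunks of roughly max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def build_prompt(subset, question, sources):
    """sources: list of (archived_url, chunk) pairs used as reference info."""
    system = PROMPTS.get(subset, PROMPTS["default"])
    excerpts = "\n\n".join(f"[{url}]\n{chunk}" for url, chunk in sources)
    return f"{system}\n\n{excerpts}\n\nQuestion: {question}"


if __name__ == "__main__":
    page = "FTS5 is an SQLite extension.\n\nIt provides full-text search."
    sources = [("https://example.com/archive/fts5", c) for c in chunk_text(page)]
    print(build_prompt("reference-docs", "What is FTS5?", sources))
```

Keeping the URL next to each chunk is what lets the model point back at the archived page when the summary is not enough.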



OK, that makes sense. I think ArchiveBox works as the first step in a pipeline there, with some other tool doing the LLM analysis and querying.


Yep. That's what I've built for myself, but I can't really get at the data inside ArchiveBox until I upgrade.


How did you build it?

I can imagine an architecture where I throw everything into ArchiveBox, then run VectorDB as a plugin with Gradio or some such as the client.

https://vectordb.com/


You're overcomplicating things. You don't need a vector database; FTS (full-text search) works just as well for non-homogeneous content.
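
As a rough sketch of the FTS route, assuming the chunks have already been extracted to plaintext and that your Python build ships SQLite with the FTS5 extension (most do, but it is not guaranteed): the table name `chunks`, the helper names, and the example URLs below are made up for illustration, not anything ArchiveBox provides.

```python
import sqlite3


def build_index(chunks):
    """chunks: iterable of (url, text) pairs. Returns an in-memory connection."""
    conn = sqlite3.connect(":memory:")
    # url is stored but not indexed; only body is searchable.
    conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(url UNINDEXED, body)")
    conn.executemany("INSERT INTO chunks (url, body) VALUES (?, ?)", chunks)
    conn.commit()
    return conn


def to_match_query(user_query):
    """Quote each term so punctuation is not parsed as FTS5 query syntax."""
    terms = user_query.split()
    return " ".join('"' + t.replace('"', '""') + '"' for t in terms)


def search(conn, user_query, limit=3):
    """Return up to `limit` (url, body) rows, best BM25 match first."""
    match = to_match_query(user_query)
    if not match:
        return []
    rows = conn.execute(
        "SELECT url, body FROM chunks WHERE chunks MATCH ? "
        "ORDER BY bm25(chunks) LIMIT ?",
        (match, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_index([
        ("https://example.com/a", "SQLite full text search with FTS5"),
        ("https://example.com/b", "Postgres vacuum tuning notes"),
    ])
    for url, body in search(conn, "vacuum"):
        print(url, "->", body)
```

The matched rows can be passed to the LLM as reference text along with their URLs. Multiple terms are ANDed by default in FTS5, so long natural-language questions may need trimming down to keywords first.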



