Ask HN: AI to study my DSL and then output it?
70 points by onesphere on April 19, 2023 | 24 comments
Ideally I want to contain and run LLM output written in my domain-specific language, but it seems I would need to fine-tune existing models. What’s the easiest online or local solution?

How can I automatically generate: a broad array of security tests; the most efficient code; the most readable and extensible code?



There are a couple different approaches:

- Use multi-shot prompting with something like guardrails to try prompting a commercial model until it works. [1]

- Use a local model with a final layer that steers token selection towards syntactically valid tokens [2]

[1] https://github.com/ShreyaR/guardrails

[2] "Structural Alignment: Modifying Transformers (like GPT) to Follow a JSON Schema" @ https://github.com/newhouseb/clownfish (full disclosure: this is my work)


Regarding [2], dang, I am working on exactly this! I mean, it's not that novel of a technique once you start controlling the sampling process directly, but you beat me to the punch.

This technique generalizes to pretty much any grammar one can specify. I weakly hypothesize that by making it impossible for the LM to output syntactically invalid text, the model's task performance improves not just because all of its outputs are valid, but also because part of the model's "processing power" gets "rerouted" from tracking the grammar it's writing toward reasoning about the task itself.


Nice! I've been wondering whether you could eke out more intelligence through methods like these; to quote the end of my write-up:

> Does structured decoding increase the observability of emergent world models in these models? To make an analogy: I may not represent an opinion of how something works if I am not confident in it, but if I am forced to present an opinion we might find out that I in fact have (or have not) grasped something.

In practice, however, without tight integration with beam search, the autoregressive nature of these models means that syntactic steering can cause a model to rabbit-hole itself, since it lacks forward-looking visibility into what the defined grammar will require. E.g., if it were forced to choose between "Don't jump" and "Do run" in some hypothetical grammar, the tokens it would actually be deciding between are "Don't" and "Do", with no idea what will end up being syntactically required after them.
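
A toy way to see the fix: instead of greedily committing at the first token, score each complete grammar-legal continuation by its total log-probability. This is a crude stand-in for beam search, and only workable when the candidates are enumerable:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def sequence_logprob(text: str) -> float:
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits
        # Log-probability of each token given the preceding ones.
        logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
        picked = logprobs[torch.arange(ids.shape[1] - 1), ids[0, 1:]]
        return picked.sum().item()

    # Greedy constrained decoding would commit to "Don't" vs "Do" blindly;
    # scoring the full candidates compares where each actually ends up.
    candidates = ["Don't jump.", "Do run."]
    print(max(candidates, key=sequence_logprob))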


There are actually a few papers already on constrained decoding. I won’t link them, but if you go on arXiv and really look you will find a couple from the past year.


What if the output schema were something like instruction code? Just get rid of the need for programming languages altogether.


Spends years working on an AI solution to problems caused by using postgres as a KV store. That's quite a branch.


I like that you use a local model to start off; why switch to OpenAI for tokenization?


The code supports both local models and OpenAI as backends. I added OpenAI because their models are still miles better than anything I can run locally (even 65B LLaMA).


Honestly ChatGPT has worked well for things like this in my experience. If you can fit enough examples within a prompt, you may not need anything special.
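
For example, a minimal few-shot setup with the OpenAI chat API as it existed at the time; the DSL example pairs are invented for illustration:

    import openai

    # Hypothetical (English request, DSL output) pairs for your language.
    EXAMPLES = [
        ("draw a red box", "shape(box, color=red)"),
        ("draw two circles", "shape(circle, count=2)"),
    ]

    def generate(request: str) -> str:
        messages = [{"role": "system",
                     "content": "You translate English into MyDSL. "
                                "Respond with MyDSL code only."}]
        for english, dsl in EXAMPLES:
            messages.append({"role": "user", "content": english})
            messages.append({"role": "assistant", "content": dsl})
        messages.append({"role": "user", "content": request})
        response = openai.ChatCompletion.create(model="gpt-3.5-turbo",
                                                messages=messages)
        return response.choices[0].message.content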


LLMs like GPT-4 'natively' speak certain syntaxes very well - e.g. Python, JSON. I'd suggest you take advantage of that if at all possible, rather than embark on training or fine-tuning your own LLM.

If you want the LLM to generate or manipulate a particular data structure that isn't well represented in the training set, consider writing a translator: convert it into a format the LLM natively 'speaks', use the LLM on that, and then translate back into your DSL.

Combining this with examples in some sort of vector store, as others have suggested, could work well.
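
A sketch of what that round-trip might look like; to_json, from_json, and ask_llm are all hypothetical stand-ins you'd implement for your own DSL and LLM backend:

    import json

    def to_json(dsl_source: str) -> dict:
        ...  # parse your DSL into a plain dict

    def from_json(doc: dict) -> str:
        ...  # serialize the dict back into DSL source

    def ask_llm(prompt: str) -> str:
        ...  # e.g. an OpenAI chat completion

    def edit_program(dsl_source: str, instruction: str) -> str:
        # Let the LLM work in JSON, a format it speaks natively.
        doc = to_json(dsl_source)
        prompt = (f"Here is a program as JSON:\n{json.dumps(doc, indent=2)}\n"
                  f"Apply this change and return only the updated JSON:\n"
                  f"{instruction}")
        updated = json.loads(ask_llm(prompt))
        return from_json(updated)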


The best answer, by far, would be ChatGPT and GPT4 with some well-written prompts.

I'd be super impressed if any other approach worked as well and would fall under the category of "easy". Keep us updated on what you go with!


See https://huggingface.co/blog/codeparrot for some idea of how to train a code generator.
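
For a much smaller-scale version of the same idea, here's roughly what fine-tuning an off-the-shelf causal LM on a corpus of DSL samples looks like with the Hugging Face Trainer. dsl_corpus.txt is a hypothetical file of example programs; the CodeParrot post trains from scratch on far more data:

    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # One DSL program per line in a hypothetical text file.
    dataset = load_dataset("text", data_files="dsl_corpus.txt")["train"]
    dataset = dataset.map(
        lambda x: tokenizer(x["text"], truncation=True, max_length=512),
        remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="dsl-model", num_train_epochs=3),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()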


On https://flowchart.fun I found that I got better overall results by asking GPT for an intermediate syntax that it was less likely to mess up (and that was easier for me to parse), and then parsing and transforming that syntax into my DSL. The relevant code: https://github.com/tone-row/flowchart-fun/blob/main/api/prom...
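
The rough shape of the trick, with an invented "A -> B" line format standing in for the real intermediate syntax (this is not flowchart.fun's actual code):

    def intermediate_to_dsl(intermediate: str) -> str:
        # Mechanically transform the simple format the model is unlikely
        # to get wrong into the target DSL; "edge(...)" is hypothetical.
        out = []
        for raw in intermediate.splitlines():
            if "->" in raw:
                source, target = (p.strip() for p in raw.split("->", 1))
                out.append(f"edge({source}, {target})")
        return "\n".join(out)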


We have a similar issue - we have a domain-specific schema that we want GPT4 to author SQL for. The challenge for us is that a full explanation of everything in the schema absolutely blows out the token limits.

Right now, we are playing around with the idea of using a classification layer to detect which schema elements are likely involved, and then dynamically including explanations for those elements in the final prompt.
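
One plausible way to implement that layer is embedding similarity rather than a trained classifier: embed each schema element's description once, then include only the closest matches in the prompt. The snippet uses the OpenAI embeddings endpoint of the time, and SCHEMA_DOCS is an invented stand-in for the real schema documentation:

    import numpy as np
    import openai

    SCHEMA_DOCS = {"orders": "Table of customer orders...",
                   "users": "Table of registered users..."}

    def embed(text: str) -> np.ndarray:
        result = openai.Embedding.create(model="text-embedding-ada-002",
                                         input=text)
        return np.array(result["data"][0]["embedding"])

    # Precompute one vector per schema element.
    DOC_VECTORS = {name: embed(doc) for name, doc in SCHEMA_DOCS.items()}

    def relevant_schema(question: str, k: int = 5) -> str:
        # ada-002 vectors are normalized, so dot product ~ cosine similarity.
        q = embed(question)
        ranked = sorted(SCHEMA_DOCS,
                        key=lambda name: -np.dot(DOC_VECTORS[name], q))
        return "\n".join(SCHEMA_DOCS[name] for name in ranked[:k])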

Our attempts at fine-tuning ended after about two weeks of struggling. I don't think it's viable for a certain range of domain-specific tasks.


I've had good success teaching GPT4 a language interactively: provide documentation and examples, then ask it to generate examples of increasing complexity and correct it when it's wrong.

See previous comment here: https://news.ycombinator.com/item?id=35447368


This DSL might suspend instead of halt. Your comment got me thinking about using LLMs to generate new language grammars.

EDIT: is suspension halting?


LangChain with a vector store of examples of your DSL. https://python.langchain.com/en/latest/modules/indexes/vecto...
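
Concretely, with LangChain's API of the time, that might look like the following (the example pairs are invented):

    from langchain.embeddings import OpenAIEmbeddings
    from langchain.prompts import FewShotPromptTemplate, PromptTemplate
    from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
    from langchain.vectorstores import FAISS

    examples = [
        {"english": "draw a red box", "dsl": "shape(box, color=red)"},
        {"english": "connect a to b", "dsl": "edge(a, b)"},
    ]

    # Store examples in a FAISS vector store; pull the k most similar
    # ones into each prompt.
    selector = SemanticSimilarityExampleSelector.from_examples(
        examples, OpenAIEmbeddings(), FAISS, k=2)

    prompt = FewShotPromptTemplate(
        example_selector=selector,
        example_prompt=PromptTemplate(input_variables=["english", "dsl"],
                                      template="{english}\n{dsl}"),
        prefix="Translate English to MyDSL:",
        suffix="{input}\n",
        input_variables=["input"],
    )
    print(prompt.format(input="draw two circles"))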


What have you tried so far?


I’m somewhere between thinking that a prompt won’t be enough to get it to think deeply/expertly about a limited subject, and realizing I don’t know my weight decays from my gradients.

What I want to do is train on some amount of inputted documentation, sample code, and maybe even the interpreter's implementation source, and then ask it: “Generate lots of instructions to gain elevated access.” Or maybe even: “Generate a social media widget site.” But of course, in the given language.


Maybe I'm looking for too specific a definition. I've been considering https://en.wikipedia.org/wiki/PaLM but am currently trying to find its pretraining dataset. Edit: "The API will first be available to a limited number of developers who join a waitlist before being opened to the public"

Implementation of PaLM in Elemental (I guess?): https://thetaplane.com/ai/palm


This is very interesting.

I’m still noodling on how to send a full page screenshot to a model and get it to return the individual images (or the bounds of them) in the page.



txtai accomplished a similar task by fine-tuning a very small T5 model. Here's a notebook with usage samples (the training code should be somewhere nearby):

https://github.com/neuml/txtai/blob/master/examples/33_Query...


AI today is not intelligent; it is just a sophisticated generator using patterns it was trained on.




