I went with a manual approach because I needed more than capturing text input in a shell. I was showing things like Vim and tmux workflows, so it involved capturing my entire terminal at the application level.

What I ended up doing was taking an individual screenshot of each frame I wanted in the animation. Most of the animations were only ~5 frames long.

Then, if needed, I enhanced each screenshot by adding text labels. For example, when I demoed Vim splits I wanted to put a "1", "2", etc. on each split to make it easier to see what's going on.
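
I added my labels by hand, but if you wanted to script that step, here's a rough sketch of the idea. It assumes the Pillow library (I didn't use it for this), and `label_frame`, the badge size, and the coordinates are all made up for illustration. The blank image stands in for a real screenshot.

```python
from PIL import Image, ImageDraw

def label_frame(img, labels):
    """Return a copy of img with each (text, (x, y)) label drawn on it."""
    out = img.convert("RGB")  # convert() returns a new image, so img is untouched
    draw = ImageDraw.Draw(out)
    for text, (x, y) in labels:
        # Dark badge behind the text so it stays readable on any background.
        # The badge size is a rough guess for Pillow's default font.
        draw.rectangle((x - 4, y - 4, x + 8 * len(text) + 4, y + 14), fill=(0, 0, 0))
        draw.text((x, y), text, fill=(255, 255, 0))
    return out

# Stand-in for a screenshot of two side-by-side Vim splits (hypothetical size).
frame = Image.new("RGB", (320, 200), (30, 30, 30))
labeled = label_frame(frame, [("1", (10, 10)), ("2", (170, 10))])
```

With real screenshots you'd load each one with `Image.open(...)` and pick coordinates that land inside each split.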

Then I took all of those individual screenshots and fed them into https://ezgif.com/maker, an online tool that converts a set of images into an animated GIF. It let me configure how long each frame should stay visible before transitioning to the next one.
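
If you'd rather do the assembly step locally instead of through ezgif, something like this should get you close. To be clear, this isn't what I used. It's a minimal sketch assuming the Pillow library, and `build_gif` plus the frame colors and durations are just placeholders.

```python
import os
import tempfile
from PIL import Image

def build_gif(frames, durations_ms, out_path):
    """Write frames to an animated GIF, showing each one for its own duration."""
    if len(frames) != len(durations_ms):
        raise ValueError("need exactly one duration per frame")
    first, rest = frames[0], frames[1:]
    first.save(
        out_path,
        save_all=True,          # write every frame, not just the first
        append_images=rest,
        duration=durations_ms,  # per-frame display time in milliseconds
        loop=0,                 # 0 means loop forever
    )

# Solid-color stand-ins for three screenshots (real ones would come from Image.open).
frames = [Image.new("RGB", (64, 48), c) for c in [(200, 0, 0), (0, 200, 0), (0, 0, 200)]]
out_path = os.path.join(tempfile.mkdtemp(), "demo.gif")
build_gif(frames, [1500, 500, 2000], out_path)
```

GIF stores frame delays in hundredths of a second, so durations that are multiples of 10 ms come through exactly.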

It sounds like a lot of work, but it wasn't too bad considering I sometimes needed to add graphics to a frame and control how long each frame was shown. Once you get the workflow down, it's pretty fast.