
Thank you for the great question.

Thinking back to when I started this, I initially just wanted to keep everything simple, so I avoided pulling in a large, high-level lib like pptr and went with chrome-remote-interface.

I looked at pptr and IIRC at that time (~12 months ago) there was no clear way for me to handle multiple tabs (a key "real UI" use case). The same goes for Cyrus' lib.

With Cyrus' lower-level lib I could hack around that by doing my own target and session management, but at some point in the last couple of months I hit a wall with chrome-remote-interface. Cyrus' lib was not up to date with the latest ToT API (specifically flat session mode). I worked out that I could replace the entirety of chrome-remote-interface with some simple code that sends a message down a WebSocket, saves a Promise (keyed by message id) and returns it, and resolves that Promise when a message tagged with the corresponding id comes back. It was also simple to write an 'on' function to add listeners for various events. So that was that.
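A rough sketch of that idea in TypeScript (illustrative only, not the actual code from the project). The names `SocketLike`, `ProtocolClient` and `pendingCount` are made up for this example, and `SocketLike` is a deliberately tiny stand-in for a WebSocket so the sketch runs without a browser; a real WebSocket would likely need a thin adapter to fit it. The only protocol facts assumed are that commands carry an `id`, `method` and `params`, that replies echo the `id` with a `result` or `error`, that events arrive with a `method` and no `id`, and that flat session mode puts a `sessionId` at the top level of the message.

```typescript
// Hypothetical minimal socket shape: just enough to send text and receive messages.
interface SocketLike {
  send(data: string): void;
  onmessage: ((event: { data: unknown }) => void) | null;
}

type Listener = (params: unknown) => void;

interface Pending {
  resolve: (result: unknown) => void;
  reject: (error: Error) => void;
}

class ProtocolClient {
  private socket: SocketLike;
  private nextId = 1;
  private pending = new Map<number, Pending>();
  private listeners = new Map<string, Listener[]>();

  constructor(socket: SocketLike) {
    this.socket = socket;
    socket.onmessage = (event) => this.handle(String(event.data));
  }

  // Send a command, remember its Promise under the message id, and return it.
  send(method: string, params: object = {}, sessionId?: string): Promise<unknown> {
    const id = this.nextId++;
    const message = sessionId ? { id, method, params, sessionId } : { id, method, params };
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.socket.send(JSON.stringify(message));
    });
  }

  // Register a listener for a protocol event such as "Target.targetCreated".
  on(eventName: string, listener: Listener): void {
    const existing = this.listeners.get(eventName) ?? [];
    existing.push(listener);
    this.listeners.set(eventName, existing);
  }

  // Number of commands still waiting for a reply (exposed for illustration).
  get pendingCount(): number {
    return this.pending.size;
  }

  private handle(raw: string): void {
    const message = JSON.parse(raw);
    if (typeof message.id === "number") {
      // A reply: settle the Promise saved under the same id.
      const entry = this.pending.get(message.id);
      if (!entry) return;
      this.pending.delete(message.id);
      if (message.error) entry.reject(new Error(String(message.error.message)));
      else entry.resolve(message.result);
      return;
    }
    // No id means an event: fan it out to the listeners for that method name.
    for (const listener of this.listeners.get(message.method) ?? []) {
      listener(message.params);
    }
  }
}
```

With something like this in place, `client.send("Target.getTargets")` returns a Promise for the reply, and passing a `sessionId` as the third argument is one way to address a specific tab once flat session mode is in use.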

Basically, the DevTools protocol is a well-specced, well-tested, simple protocol, and all these libs (like pptr and chrome-remote-interface) began simply as wrappers around the WebSocket, with an API to map function calls to protocol messages and add listeners for events. PPTR has evolved into much more than that now, and during the same period I evolved my own "BG protocol" atop the CDTP (Chrome DevTools Protocol). It became easier to deal with the single source of truth that CDTP is, and get the full expressiveness of the latest ToT protocol, than to deal with the limitations and abstractions of other things built atop it.
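For a sense of how thin such a wrapper can be, here is one hedged way to map function calls onto protocol messages using a `Proxy`. This is a sketch only, not how pptr or chrome-remote-interface are actually implemented; `createApi`, `Send`, `DomainApi` and `ProtocolApi` are names invented for the example, and `send` is assumed to be any function that takes a method name plus params and returns a Promise for the reply.

```typescript
// Any function that ships a protocol command and returns a Promise for its reply.
type Send = (method: string, params?: object) => Promise<unknown>;

type DomainApi = Record<string, (params?: object) => Promise<unknown>>;
type ProtocolApi = Record<string, DomainApi>;

// api.Page.navigate({ url }) becomes send("Page.navigate", { url }).
function createApi(send: Send): ProtocolApi {
  return new Proxy({} as ProtocolApi, {
    get: (_target, domain) =>
      new Proxy({} as DomainApi, {
        get: (_inner, command) =>
          (params: object = {}) => send(`${String(domain)}.${String(command)}`, params),
      }),
  });
}
```

Because the method name is built from the property path at call time, a wrapper like this never needs updating when the protocol adds commands, which is part of why staying close to the raw protocol can be attractive. The trade-off is that nothing checks the domain or command name before it is sent.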

Specifically, PPTR did not have (and I believe probably still does not have, though I have not checked deeply) an easy way to control and manage multiple tabs. And even if it does, I'd have no use for it, because I already have the code that does all that anyway. Scanning the PPTR docs now, I see that I prefer the abstractions, naming, etc. of the CDTP itself to the ones PPTR provides. Like I said, the CDTP is very comprehensive and consistent, it makes a lot of sense, and I know it very well. For me and my use case, it's just a better fit.

The way I think about this is not that PPTR has some problem; it's that the "BG protocol" and PPTR (et al) are trying to solve fundamentally different problems. PPTR (et al) try to provide a clean developer experience for common browser tasks (such as automation, screenshots, PDFs, testing, etc). That's a particular domain, and not exactly the same as what BG protocol does. BG protocol attempts to provide as realistic and familiar an experience of using a browser as possible (when you're actually controlling a remote browser through the CDTP). That's not entirely the same domain, because some things that users want are not required in automation, and some things that automation does are not required or done by users.

One of the ways I code is by picking the right tool for the job, and if that tool doesn't exist, or no longer works, I build the tool. I want to work with tools that fit right. So for this domain and use case, BG protocol is a better fit than PPTR.


