C++, despite being a very powerful language, has a really primitive build story. Developers have to wade through a complex soup of Makefiles/CMake/Gyp/ninja etc. just to get a project set up. There's Conan for package management, but it pales in comparison to Cargo or Gradle. Moreover, setting up Conan is a full-time job in itself, and it's not without weird quirks either. I've not tried Bazel yet, and would love to hear if it's as good as Xooglers say.
C++ on Jupyter looks amazing, and I really hope it makes starting with C++ easier.
The conda package manager and its subproject conda-build are fantastic tools to solve this problem. I write scientific software (primarily a mix of C++ and Python), and I use conda for literally every dependency I need, including pure C or C++ dependencies. Most of them are already available via the conda-forge project, and the ones that aren't are easily integrated by writing my own recipes to package with conda-build.
I write in a mix of Python and C++ as well, so I've also gradually moved towards using conda. What's your dev workflow like for the code that's not a dependency? Especially if the code you're developing is C++. Do you make a conda package of it as you develop? I've found that too much of a hassle to rebuild. I usually just use CMake with the conda environment dir as the CMAKE_INSTALL_PREFIX.
Yeah, that's what I do also. conda-build makes it easier to use your code downstream, but doesn't obviate the need for build scripts in general.
However, if you're using conda, you might be able to at least simplify your build scripts, even though you can't eliminate them.
If you are okay with requiring your users to have conda, then you can exploit that fact to simplify your CMakeLists.txt in some ways. For instance, find_library, etc.. can be replaced with hard-coded links to ${CONDA_PREFIX}/lib/...
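As a sketch of that approach (assuming an activated conda environment, and a hypothetical library name "foo"), a CMakeLists.txt fragment might look like:

```cmake
# Sketch only: point CMake straight at the active conda environment
# instead of using find_library(). Assumes CONDA_PREFIX is set, i.e.
# the environment is activated, and that "foo" was installed into it.
if(DEFINED ENV{CONDA_PREFIX})
    set(CONDA_PREFIX $ENV{CONDA_PREFIX})
    target_include_directories(myapp PRIVATE ${CONDA_PREFIX}/include)
    target_link_libraries(myapp PRIVATE ${CONDA_PREFIX}/lib/libfoo${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
```

You lose portability to non-conda setups, but you also skip all the find-module guesswork.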
(I'm not saying that's necessarily "best practice" for all projects, but it's a nice option to consider, especially in the early phases of development.)
Generally I've found it's fine for most things, but issues come up when the scientific libraries are built with or without certain options. The lack of OpenMP support in the FFTW build, for example, is annoying.
Yes, that's of course a potential issue with any distribution. The good news is that it's relatively easy to tweak their recipe and build your own version.
Bazel is what I really like (xoogler here, used it mainly with java+protos, minor python/c++, testing, other things). Similarly - gn (though different internals, easier platform switch, etc.), buck, pants, etc.
In all of these I like the single "BUILD" file per directory, and simple text file explaining dependencies.
Also these tools work awesome when you are in a company, and you have underneath - dozen or more libs, sdks, projects that need to run together.
> Also these tools work awesome when you are in a company, and you have underneath - dozen or more libs, sdks, projects that need to run together.
I'm pretty sure it works great then. But for the bigger C/C++ ecosystem the problem is that everyone uses a different build system. From Makefiles to Automake, CMake, Gyp, SCons, Bazel, etc., everything is represented. As soon as you try to use dependencies that favor a different format, you either have to rewrite their project definition in your format, or at least build them as external subprojects and manually add the include directories.
Imho that's one of the biggest weaknesses of C++, one that even outweighs language concerns for me: it's often a lot of work to integrate third-party libraries. And because of this, there isn't a really great ecosystem of libraries (compared to, e.g., JavaScript and Java).
There is another big reason why there isn't a great ecosystem.
The C mentality of some devs dragged into C++ world.
Already in the 90's we had very nice high level C++ libraries, that could compete with what Java offered later.
Turbo Vision on MS-DOS, Object Windows Library and Visual Components Library on Windows. All from Borland.
PowerPlant on Mac OS, from Metrowerks.
From Microsoft we had MFC, which started out high level like those (it was originally named Afx), but the beta testers requested that it just be a thin layer over Win16, and it was reborn as MFC.
Many of the modern C++ patterns were already possible with those libraries, but received C wisdom kept many from using them.
Another modern example is Android NDK, Google writes the code in nice C++ classes, that get exposed as low level C APIs, or have a Java JNI barrier (e.g. Skia).
> ...Google writes the code in nice C++ classes, that get exposed as low level C APIs...
I'm not on the Android team, but in general it's easier to maintain backward compatibility with C ABIs than with C++ ABIs.
And the plethora of libraries and the modest size of the standard library are part of the problem. Let's say you're writing a C++ library that understands git and lets users develop more complex applications with your libgitxx as a building block. Whose filesystem primitives (file, directory, path, user, etc.) do you use? There's really not a satisfactory answer to that question.
Point being, C ABIs don't have this problem as much. They tend to just pass around ints and chars. Maybe there are some structs to pass around, but those tend to be made up of ints, chars, etc.
@ Google, most of the projects are linked statically, and all source code is present at compile time, with very few exceptions (all open-source packages are recompiled too)...
As such, backward compatibility (binary-wise) is not needed.
Android, Chrome, etc. are another matter - former has an SDK, and I guess you may need that there, but I don't have much experience with it.
True, but it's my understanding that Google doesn't pursue semantic versioning, at least with C++, partly because it's hard for a human being to understand which C++ changes affect an ABI and which don't.
The need to standardize things like std::filesystem was my point.
> Additionally there are ways to implement compatible ABIs in C++.
Eh... when even compilation flags affect ABIs, it's hard for a particular C++ file to have a stable ABI. The same is true in C, of course, but the way C libraries are generally designed, it's simpler to be ABI compatible. In C++, arcane and unexpected things like noexcept specifications (affected by changes in underlying libraries!) can change your ABI.
> Instead, one needs to write unsafe C code, JNI boilerplate or wrap them again in C++.
That's a fair point. There might be reasons they'd do that (SLAs they can afford), but that's a presumption.
> The need to standardize things like std::filesystem was my point.
Which circles back to my statement about "The C mentality of some devs dragged into C++ world.".
Many useful C++ libraries were not standardized in the early days beyond STL, because many thought C++ should be like C where libraries are whatever the OS provides and nothing else.
Which is clearly visible in the decision to have ANSI C and POSIX APIs as two separate standards, although most C compilers try to provide some kind of additional POSIX compatibility, which is especially visible on non-POSIX OSes.
Thankfully, the ANSI C++ committee realized the mistake and now we are getting a useful set of libraries into the standard.
As for compilation flags, even for C it is only true if all compilers on the platform follow the same ABI.
And you still need to provide all variants anyway (debug, single and multithreaded, hard and soft float, pre-defined pre-processor macros, ...).
C ABI only appears to be simple in OSes that happen to be written in C.
Granted, that is the case nowadays, with most OSes being UNIX clones besides Windows and mainframes, but it wasn't always like that.
I remember the pain of trying to mix multiple C compilers back in the 90's.
> C ABI only appears to be simple in OSes that happen to be written in C.
I'd say that C is the only language that (mostly accidentally and in a non-standard way, for the reasons you state) provides a simple and (relatively) stable ABI.
Using another build system might seem like a problem, but at the end of the day it's a list of .cpp and .h files, some defines, and possibly some intermediate steps. These rules, lists of files, etc. tend to change much less in a well-established library, so converting to another build system isn't that big of a deal, even if two or more are supported.
Today Qt, for example, supports qmake, CMake, Qbs, and possibly others.
At work, I'm heavily invested in using Microsoft's vcpkg for new builds of open-source libs. Internally it uses CMake, and CMakeLists.txt files weren't even readily available for some of these projects, so the vcpkg maintainers wrote them.
At some point someone will make a converter from one system to another (or maybe it already exists), possibly a lossy one. For example, it's possible to parse everything you need, correctly, from a .sln/.vcxproj using the MSBuild framework, which is now available everywhere (except the actual .props and .targets files, which are Visual Studio specific). But on a machine with the Community version of the compiler, these are free to use (they should even work under Wine). And since a lot of other build systems target .sln/.vcxproj files, this could in a way be a target for a lot of systems to converge on, and from there you can extract what you want:
- List of source, header files.
- Separation between projects
- Defines for each project.
- etc.
It doesn't have to be perfect, but it's possible.
Or maybe from ninja generated files (though not meant to be used that way).
Luckily there is a sane way to verify whether an alternative build of a project works: simply compile it, and check whether the produced artifacts are the same, or close enough (list of produced header files, exported symbols; you could go even deeper).
And yes, I do miss Turbo/Borland Pascal's .TPU files, where no .h file was needed, or Borland C++'s simple project file that just listed the .c files line by line. But back then the projects I was familiar with were much smaller; even gcc, the Linux kernel, and others were much smaller then.
It really doesn't matter what build system other libraries use. If you use CMake, you just need to use `find_package` directives (https://cmake.org/cmake/help/v3.10/manual/cmake-packages.7.h...) to integrate third party libraries (which should be separately installed in the system).
It only seems difficult if you're stuck in the mentality of shipping all your dependencies with your source code and statically linking everything. C and C++ libraries actually care about maintaining API (and usually ABI) compatibility, so you don't have to ensure you have a specific version of dependencies.
Come on, you cannot even link C++ shared libraries built against different versions of the standard library, and potentially different compilers.
ABI compatibility in C++ does not exist.
In C, you still had incompatibilities between compilers. (MS fastcall convention and declspec come to mind...)
There's better ABI compatibility than in the languages where you have to rebuild after every minor version bump. The C++ standard library version (and to some extent, the compiler) should be considered part of the platform.
C++ was the first programming language I picked up that made me feel like I was really making something (after VisualBASIC and HTML), when I was 13. Things were simpler back then. I used Borland C++ and it could produce .exe exports from .cpp sources, pulling in referenced #includes. Running it in Windows would launch a little DOS prompt that ran your program and then quit (unless you put a getch()!)
Like any teenager tinkering, I opened up some sample cpp file that came with the compiler and started just throwing in cout<< all over the place doing random stuff with variables. My older brother explained "loops", "functions" and later the basics of "arrays" and I completed my first game by 16, written purely in C++ -- a Tank clone that was a single long-ass cpp file using purely #include<graphics.h> calls, putting individual pixels on a 640x480 screen. I hadn't learned what classes were yet, so it was some form of "functional programming", just starting from a void main()
The point was, it was easy to tinker and learn and make stuff and explore in C++, the language isn't innately arcane. I didn't even have the internet back then to look stuff up. A teacher gave me a textbook for "graphics in C++" because my high school (which I wasn't in yet) computer science curriculum didn't cover graphics, but I hated copying code from the textbook (it seemed outdated) so I just opened up header files and tried to call functions I saw in there. I didn't need to use any fancy C++ syntax, just the basics my brother taught me. No make files, no build scripts, no gcc. Just an IDE that existed on a floppy disk and the most common OS available at the time (DOS).
It must be a sign I'm getting crotchety and old, but "back in my day" C++ wasn't such a confounding beast. And I'm sad to say I think modern-day python is going down a similar path, it started out intending to be the easiest language to tinker in, but it's slowly becoming arcane just to import a module and launch a program for a beginner.
I stepped back into C++ recently trying to make something "I can see" from scratch, and it's damn near impossible without visiting 4-8 different instructional sites, depending on how you define "something I can look at".
I wanted to write a Postgres extension/fork, and was thinking "oh I'll just replace some bits with Rust".
Unfortunately everything is `make`, which is great when you're setting something up and have a huge process... but it's very hard to make some changes without figuring out the entire flow.
A tool that somehow wrapped make to simplify incremental changes on existing projects could help fuel a lot more experimental changes.
I feel like I "get" make, but when you have autotools in front of it you end up with stuff like [0], and Makefiles kicking off other Makefiles.
It's not the worst thing to ever grace the planet by a long shot, but I think it suffers from the same problem that bash scripts have. Everything is string-ly typed and you can only build up a thing to a certain level of complexity easily.
There are of course a lot of requirements in systems software. And a lot of the difficulty stems from the C ecosystem's difficulties with packaging (where are the C equivalents to the "environments" you find in Python/Ruby, so you don't have to pass every lib in explicitly?).
I think there are some useful opportunities here.
> Everything is string-ly typed and you can only build up a thing to a certain level of complexity easily.
Yes, make was designed for the 1970s world, where all you cared about was building simple utilities on Unix. There was none of today's complexity (out-of-tree builds, cross-compilation, etc.), nor "alien" platforms like Windows where people have spaces in their paths.
What we need is to redesign make to handle today's requirements. And this is what we are trying to do with build2.
I'm a bit surprised to read this. It's been a long time since I've done C++ outside of the embedded space, but isn't this a problem that's solved by libraries and LTO? Compiling individual source units isn't terribly difficult either. There's a single tool that turns source files into object files, and a single tool to link them all up. It all feels very UNIX-y to me.
LTO is unrelated. Compiling individual units is only one of many build system responsibilities. There's also dependency management, cross-platform support, iterative builds, etc. And it all needs to be fast.
Written by Matt Hermann, who I believe is now at Google (I've never met him). I worked somewhere that used it and have used it ever since for every greenfield C++ project I've done, and I just don't think about my build system anymore. For me this just works for everything I need. Adding a link flag for this one .cpp file only? Dead simple, hard to imagine it being simpler. Building tests and running them with valgrind, equally so. I have a single top-level makefile that literally just specifies targets, e.g. release, test, asan_test, clang_release, etc. Each is a single cake line.
I highly recommend it for C++ on linux (haven't used it elsewhere to say), for me it's a thing of beauty and a joy forever. Not dealing with autotools or scons or complicated make based build systems makes C++ so much nicer to work with.
- I have to start separate small test and prototype projects every now and then.
- I have to include third-party libraries occasionally, and sometimes weigh whether it wouldn't just be easier, faster, and better for my mental health to reinvent the wheel instead.
- I have to check out, test, or use third-party projects every now and then. They all use different build systems, different ways to use those build systems, and different workarounds to get around the limitations of those build systems.
I like C++11 and love C++17 but absolutely hate building and managing C++ projects. It's a mess.
I guess it also applies when you want to integrate a new third-party library once a week, in cases where the library's build system doesn't match yours.
Very cool! I watched the SciPy talk on xtensor (same developers as xeus) and the project looked pretty awesome. Many people in my field used CINT for a long time and cling is so much better. It's nice to see it used outside of particle physics.
One thing I disagree with is the idea that the lack of a good interactive environment for C++ makes it difficult to teach. C++ is a compiled language, so learning how to compile a C++ library or program is _part_ of C++. I feel like C++ beginners should go interactive _after_ learning how to compile a project. I write that with first hand experience; my first programming experience was with writing interpreted C++ in CINT and I feel like it hindered my ability to eventually understand what a real C++ program was.
I agree that the compilation process is part of C++. However (and that's from my experience as a teacher), teaching it to beginners as a first lesson can be complicated (and you have to teach it early so they can build their first program). Besides, it's not rewarding when your students are familiar with other languages like Python, where they can write some code and get immediate results without any additional step.
So having a good interactive environment is really useful for the first lessons; then you can teach the compilation toolchain and switch to a real environment.
Compiling a project with the right incantation of flags is not rocket science, but it is time-consuming work that a build system should be better equipped to handle.
Interactive development has immense benefits, both for experts exploring APIs and for beginners learning the language. And C++ is not well known for its compile times.
Someone else commented on the semantic difference between the narrow definition of C++ as a specification and the looser "C++" as a part of the full stack.
With regard to the full stack, I tend to agree with you, at least in the academic environment. Organizations that don't have a lot of software engineers will shoot themselves in the foot if they focus on the C++ language rather than the infrastructure around it. An interactive notebook is great to teach someone to add a few lines of code here or there, or to run quick tests, but what we're lacking is people who understand how to plug their Jupyter notebook into the production line.
C++ on environments like C++ Builder is relatively easy.
The issue is developers having an open mind about such environments; luckily that seems to be changing with Qt Creator, CLion, and VC++, especially thanks to clang being implemented as a library.
> so learning how to compile a C++ library or program is _part_ of C++.
I entirely disagree. This is only for historical reasons, the language itself does not care at all about the compilation model. This mindset is what is keeping C++ back, both for the industry AND in students' minds.
> I feel like it hindered my ability to eventually understand what a real C++ program was.
The moment you have a .o file it's not a C++ program anymore, but a platform-specific object file. You aren't learning C++ but windows / mac / linux's native binary production toolchain.
> The moment you have a .o file it's not a C++ program anymore, but a platform-specific object file. You aren't learning C++ but windows / mac / linux's native binary production toolchain.
Then that is at least part of the learning curve of seriously using C++, so I think as of today it's (potentially) dangerous to start with only interpreted code and wait to be introduced to compiling, build tools, etc. They should be taught together. I also think it's not the greatest environment, and I hope that changes; perhaps projects like this will help!
Indeed, the language does not care, which means you should. An object file is not even a compilation unit at times.
You get to build the object files, shared objects, DLLs, what have you.
Then use the linker to link it to turn it into an executable.
Some modern languages specify the runtime very deep. For example Python or Java - so much that it is hard to separate language from runtime or standard library.
I think jupyter with C++ is an amazing idea, and I hope it will help more people to start with, in my subjective opinion, one of the most important programming languages today.
That being said, I look down on all the commentators who say that C++ has a primitive build system/package manager etc. Let's be clear: C++ has no package manager, and, in my view of the world, no compiled language has a native build system; it's just that most compiled languages don't separate compilation from linking, and most languages support modules :)
C++ is hard. Throwing a fancy frontend and a REPL at it will only make you realise that sooner. The complexity is mostly unjustified, but the language is so powerful that you can do anything with it: from trading strategies to power plant control systems to game engines. I have been studying C++ for 9 years, have used it for 7, and have been a professional developer for only 2.5 years, and I think I have only scratched the surface. I have used Python for a lot less time, but I feel more comfortable with it because Python is trivial to learn (for a person with a comp sci background, ofc). What you think you are cutting away (dependencies/build info) is something essential to learning C++ and how everything comes together. Better to learn it sooner than later, imho.
> no compiled language has a native build system; it's just that most compiled languages don't separate compilation from linking, and most languages support modules :)
Most modern, mainstream compiled languages have a build system that wraps around the compiler and linker. Invoking the compiler and linker are responsibilities of the build system, and frequently modules are compilation units.
Modern C++ is a neat language (I used to be a C++ developer), but its tooling alone makes it unsuitable for any project where rapid iteration is remotely important.
Qt follows a certain style, which is mostly object-oriented and heavily PIMPL-based (it thereby compiles fast).
The consistent style is actually something nice. However, lots of C++ programmers won't necessarily like the OO style, because it can lead to relatively slow programs. With all the indirection going on, plus the metadata-based reflection and signal/slot system, it might not really be ahead of managed languages, which are even more convenient to use.
If you look at the Boost (or also the C++11/14/17) libraries, you will often find a very different programming style, one heavily based on templates. It's a lot harder to read and understand for non-experts, but it makes even higher performance achievable. The authors might think that this is the preferred model for C++.
Qt is vastly more complex than the Boost libraries due to the meta-object compiler and qmake build process. Furthermore, signals and slots are dispatched using lock-free mechanisms, and you will find they hold their own in benchmarks.
The biggest advantage is the thread safety Qt affords. Not sure why you equate templates with expert programmers, because template-heavy systems often impose unacceptable build times on large programs (the Boost object serialization library, for example).
Qt lends itself to writing more correct and simpler-to-digest code, which any expert programmer will tell you is far more important than fast code.
I didn't imply in my post that Boost is the better choice for experts, exactly because I also don't think that is always true. I only mentioned that lots of Boost libs are harder to use for beginners than the Qt-based ones, since one often needs to understand a good amount of templates, whereas Qt is mostly based on a simpler object-oriented model with inheritance.
Having been a former Qt developer, I couldn't disagree more. I have no desire to touch that dumpster fire again. To its credit, it's best in class, but that isn't saying much.
I'm a fairly new Qt developer, so I really only have experience with Qt5. Perhaps I was fortunate to miss the growing pains you seem to have endured. For me, the ease of doing cross-platform development, along with the ability to create really high-quality GUIs with QML, made me fall in love.
Qt5 is much nicer, but still a pain. QML is even better, just too coupled to Qt. Unfortunately you have a point that there isn't a better cross-platform native GUI toolkit.
I can't help mentioning my own project - https://github.com/aldanor/ipybind - which provides a lightweight Jupyter interface to pybind11 [1], allowing you to quickly sketch working C++ extensions in the notebook; all features of pybind11, like numpy support, are immediately available. I've been using it myself quite a lot for prototyping C++/Python code.
Finally! Thanks a ton to both Cling developers and Jupyter.
Personally, a lot of my programming involves prototyping and playing with a lot of different ideas and implementations before settling on a solution.
Compiled languages put up a steep barrier to this way of thinking, which is why Python has always been my favorite.
I have come back to C++ thanks to C++14, C++17, and great new features like this!
Good tip! IMHO, I always recommend the Miniconda installer over the full Anaconda installer, except for absolute beginners who would struggle to understand basic concepts like environments or the PATH environment variable, etc.
> Most of the progress made in software projects comes from incrementalism. Obstacles to fast iteration hinder progress.
I would like to print it in very big letters and put it at the entrance of slow-moving IT departments. "cost of change" is likely one of the most important metrics to measure the effectiveness of a team.
Being risk-averse out of fear of change leads to stagnation.
"More Jupyter notebooks and fewer spreadsheets" could be a useful simplification for conveying this change of POV.
You can very easily move some functions out of a notebook and into a proper module, that can be included in a program / production system. Autoreload extension in Python Jupyter notebooks makes it extra nice.
It is also easy and convenient to write py.test-style tests inside a notebook; just call them manually inside the cell. Then when you move them out to a module, you have tests that will automatically be picked up by your test runner.
You can. Most don't. And moving something from a free form environment to a tested continuous delivery one, is not automatic.
Luckily, the performance constraints of most environments are such that Python is not an automatic deal breaker nowadays. That said, proofs of code correctness are usually a different matter from the correctness of machine-learning algorithms, such that mixing them seems to just fool both sets of practitioners.
Meh. With some of the practices I've already seen, at least spreadsheets always show the data. You send me a spreadsheet, I can verify most of it. Send me a notebook, I can typically only audit what you did.
This is how I do rapid interactive C++: combined shell and C++ in a single file that compiles itself and either runs itself or runs objdump on itself. I find it really useful and fast for testing something in junkcode. Not quite the same problem as interactive C++ with Jupyter, but there's some overlap. Overlap with godbolt as well; I find this quicker, perhaps because I'm more used to it?
I only wish it had come a few months earlier. I've had to switch from python in a jupyter notebook to C++ for performance reasons, and I really felt the loss of incrementalism.
You can turn your project into a Python library by wrapping it with ctypes or the cffi package. I do that with some of my numerical simulation code, and it doesn't take away from the interactivity of Python. Once I get a feature more or less complete, I rewrite it in C/Fortran and compile it away. This maintains performance as well as focus, because the 'interesting' thing you'll be working on is quick and dirty in Python, and the stuff you've already okayed is in a compiled language, out of sight.
It is this transpiling from Python to C++ that I want to avoid, especially because there is a lot of iterative development on the C++ side.
However, I'll essentially never be able to run my code from a notebook because it runs on a cluster I need to SSH into. Thing is, I'd like to develop that code locally and iteratively.
But I thought you said you already rewrote it from Python to C++ anyway? I was trying to point out it doesn't have to be either/or. That being said, the fact that you need access through SSH is another hurdle altogether.
I switched once I needed the performance, but that was nowhere near the end of development. The switch was made far from the point where the code, or even the algorithms, were finished.
Thus a lot of incremental development had to happen in C++. It wouldn't make sense to add a manual transpilation step at every increment, and I needed the actual C++ code to judge performance.
>This also makes C++ more difficult to teach. The first hours of a C++ class are rarely rewarding as the students must learn how to set up a small project before writing any code. And then, a lot more time is required before their work can result in any visual outcome.
Install a C++ compiler, create main.cpp, run `g++ main.cpp`.
The students don't know anything about the CLI. They might not know what a directory is or how to cd to it. They may have to install multiple packages. On certain operating systems they'll have to set some PATH variables.
I'm not saying it shouldn't be that easy, but due to failures in earlier education it's not.
I'd like to use Jupyter notebooks to do support tasks - eg check on system hosts and do reports, maybe restart processes. I think it could be perfect. I'm just hesitant because no one else seems to use it this way. Any advice out there?
Yes, numba is great; I've had a good experience with it for some functions.
Some problems aren't suited well for it though, especially those that involve Python dictionaries or sets. For those problems, switching to C++ results in MUCH faster code (std::unordered_map is way faster than Python's dict, especially for small elements).
As far as I'm aware, Python dictionaries are not supported by Numba; it will still run in "object mode", which won't give you much of a performance boost. I tend to always mark Numba functions with njit so object mode is disallowed. But yeah, true, Numba does have shortcomings; maybe dictionaries and classes will be supported in the future.
Actually, I've found Xcode to be a very easily usable prototyping tool for C++. I haven't used it in any large project, but it's super simple to just prototype some concepts.
Having that in a browser is even better.