I want a small, well-defined subset of C++ that is sufficient for writing large, maintainable, structured code (so it has features like classes and exceptions) but leaves out the bells and whistles that tempt C++ programmers into writing unreadable, unmaintainable and inefficient code (which is pretty much everything else).
At the moment my work project involves a library written in C. Why C? Because it has to be fast, small, cross-platform, and callable by clients written in a variety of languages.
Unfortunately, coding anything nontrivial in raw C sucks. Coding the way the language wants you to results in piles of tangly functions and raw structs. It’s missing important features like:
To me the semi-obvious answer is to use C++ instead, since it provides all those missing elements while being easily compatible with C. But as soon as one mentions C++, the obvious objections are made:
These are all valid, and they remind me of a great quote by my first manager at Apple, Steve Friedrich, which I paraphrase as:
We’re going to port MacApp from Object Pascal, which we liken to a crayon, to C++, which is like a double-edged razor blade.
I’ve written my fair share of C++. I’m kind of a language geek and at first I enjoyed the ability to use operator overloading, copy constructors and templates to create really high-level constructs that made complex operations look trivial. But getting those to work right was incredibly painful, involving slogging through twenty-line syntax error messages filled with tangles of angle brackets. In the end I suspected that the time taken to get the fancy labor-saving devices working more than ate up the time they saved in the rest of the coding.
I’ve also optimized enough C++ that I’m no longer shocked when I disassemble a fairly small function/method and find the enormous and inefficient assembly code that it expands to once all the template expansions, virtual calls and exception handlers are generated.
THERE’S NO OTHER CHOICE
Unfortunately I don’t know of another language that fits all the goals of fast, small and cross-platform. D comes close — it’s a really sweet language, like C++ rethought and done right — but its compiler support is still limited; for example, as far as I know it doesn’t compile to ARM. Some functional languages like Haskell and OCaml claim they can approach the speed of C, but from what I’ve heard this requires really clever and non-intuitive coding patterns. And I suspect the runtime libraries will add significant size to the program.
A bit of history: Long ago, in the late ’80s, there was a Mac C compiler and IDE called Lightspeed C (later renamed THINK C). It was fast and reliable and very popular, but everyone started clamoring for C++ support, because C++ was the New Hotness. The developers realized that C++ was (even then) a very complex language, so they decided to break the task into steps. In version 4 (I think?), released in 1990, they added support for a subset of C++, which was quickly nicknamed “C-plus-minus”.
C-plus-minus pretty much only added classes. And not even all the features of C++ classes — no stack-based objects, no copy constructors, etc. I can’t remember whether it even allowed constructors to have parameters. But it was still really, really useful, because it let you structure your code and data into classes and objects.
(What happened next? Symantec bought THINK, bought another startup with a C++ compiler, bolted the two together and produced THINK C++. It supported all of C++, but the compiler was slow and the IDE was buggy. Developers suffered with it for a few years until Metrowerks showed up with CodeWarrior and ate their lunch. But that’s not relevant to this story.)
So now I’m wondering whether it makes sense to voluntarily adopt a similarly small subset of C++. At this point the limitation is obviously not technical but self-imposed, rather like the Amish rejecting modern technology. The question is: what subset do you choose?
I’ve been thinking about it for the past few days, and it’s not as easy as it appears. THINK’s C-plus-minus is a good place to start from, but there are definitely features I want to add, like exceptions. Reference parameters are simple and useful too. But then it gets tricky.
The biggest issue seems to be whether to allow stack-based objects. On the one hand, these are obviously great for efficiency (since you save the overhead of malloc/free), and arguably for reliability (since the object is guaranteed never to be leaked). But they also seem to be the thin end of a wedge, and when I think of what they imply, that wedge seems to turn into a razor blade slicing into my skin:
On the other hand, if you don’t allow stack-based objects, you can’t implement the “resource acquisition is initialization” (RAII) pattern that’s important for writing reliable code in C++. Without this, exception handling gets messier, especially because C++ for some reason doesn’t support the “finally” clause that most other languages have.
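To make the RAII point concrete, here is a minimal sketch of what a stack-based guard object buys you. Everything in it is hypothetical: `ScopedResource`, `doRiskyWork` and the `g_liveResources` counter are illustrative names, and the counter merely stands in for a real resource such as an open file or a lock. It sticks to plain classes and exceptions, with no templates.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical counter standing in for a real resource (file, lock, buffer).
static int g_liveResources = 0;

// Acquires in the constructor, releases in the destructor.
class ScopedResource {
public:
    ScopedResource() { ++g_liveResources; }
    ~ScopedResource() {
        assert(g_liveResources > 0);
        --g_liveResources;  // runs on normal return and during stack unwinding
    }
private:
    // Declared but never defined, so the guard cannot be copied.
    ScopedResource(const ScopedResource&);
    ScopedResource& operator=(const ScopedResource&);
};

// May throw partway through; no explicit cleanup code is needed here.
void doRiskyWork(bool fail) {
    ScopedResource guard;
    if (fail) {
        throw std::runtime_error("something went wrong");
    }
}

// Returns how many resources are still held after doRiskyWork has exited.
int liveAfterWork(bool fail) {
    try {
        doRiskyWork(fail);
    } catch (const std::runtime_error&) {
        // Nothing to release here: the guard's destructor already ran.
    }
    return g_liveResources;
}
```

Without stack-based objects, `doRiskyWork` would need a try/catch that releases the resource and rethrows, which is exactly the boilerplate a “finally” clause would otherwise absorb.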
There are other sticky issues. Using the STL seems right out since it’s neck-deep in template goop. But without templates, it becomes difficult to write generic container classes: the classes have to use some type for their values, and it seems like you either have to use void* or declare a base Object class that everything inherits from. Either way, you have to do a lot of pointer-casting on values returned from collections. This may not be so terrible; after all, Java dealt with this for the better part of a decade before it finally added generics, its own template-like facility.
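Here is a rough sketch of the void* option, just to show where the casting lands. `PointerList`, `Point` and `sumOfX` are made-up names for illustration, and the fixed capacity is an arbitrary simplification to keep the example short; a real container would grow.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical template-free container: holds untyped pointers, owns nothing.
class PointerList {
public:
    PointerList() : count_(0) {}

    // Returns false when the (arbitrary, fixed) capacity is exhausted.
    bool add(void* item) {
        if (count_ == kCapacity) return false;
        items_[count_++] = item;
        return true;
    }

    void* at(std::size_t index) const {
        assert(index < count_);
        return items_[index];
    }

    std::size_t count() const { return count_; }

private:
    enum { kCapacity = 16 };
    void* items_[kCapacity];
    std::size_t count_;
};

struct Point {
    int x;
    int y;
};

// The caller has to remember what it stored and cast back on the way out;
// the compiler cannot check that the list really contains Points.
int sumOfX(const PointerList& list) {
    int total = 0;
    for (std::size_t i = 0; i < list.count(); ++i) {
        const Point* p = static_cast<const Point*>(list.at(i));
        total += p->x;
    }
    return total;
}
```

The cast in `sumOfX` is the cost being weighed: it is one line per access, but a wrong guess about the stored type compiles cleanly and fails at runtime.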
Why am I posting about this? Because I think it would be useful to have an explicit, documented standard “sane” subset of C++. It needs to be clearly specified so it can be used as a coding standard among teams, and also so that individual developers aren’t too tempted to backslide and use templates “just this once”. Who knows, if it gets popular maybe compilers can add a flag to enforce its limitations.
As you can tell, I’m still not clear on what this subset should allow and disallow. It’s an interesting question, and I hope to get other people thinking about it. Have you worked on a C++ project that had coding guidelines that mandated a very limited subset of the language? Or do you have your own ad-hoc limits that you keep in mind while coding?