Criticism of C++

C++ is a general-purpose programming language with imperative, object-oriented and generic programming features. Many criticisms have been leveled at the programming language from, among others, prominent software developers like Linus Torvalds,[1] Richard Stallman,[2] and Ken Thompson.[3]

C++ is a multiparadigm programming language[4] with backward compatibility with the programming language C.[5] This article focuses not on C features like pointer arithmetic, operator precedence or preprocessor macros, but on pure C++ features that are confusing, inefficient or dangerous for normal users of the language.

Slow compile times

In C and C++, header files are the natural interface between source files. Whenever a header file is modified, every source file that includes it must be recompiled. Headers are slow to process because they are textual and context dependent, a consequence of the preprocessor.[6] C headers carry only a limited amount of information, chiefly struct declarations and function prototypes. C++ classes are declared in header files, which expose not only their public variables and public functions (as C does with its structs and function prototypes) but also their private members. Changing a private member therefore forces an unnecessary recompile of every source file that includes the header. The problem is magnified when classes are written as templates, because all of their code must then live in the slow header files, which is the case for much of the C++ standard library. Large C++ projects can therefore be extremely slow to compile.

One solution is the Pimpl idiom. The class in the header holds only a pointer to an implementation object on the heap, so the private members are declared in the source file and can change without altering the size or layout of the class that clients see. This comes at the cost of an additional heap allocation and a pointer indirection for each object. In addition, precompiled headers can be used for header files that rarely change.
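The following is a minimal single-file sketch of the Pimpl idiom, assuming C++11. The names Widget, Impl, name() and use() are hypothetical and chosen only for illustration; in a real project the first part would live in a header and the second part in a source file.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- What would live in the header (for example widget.h) ---
// Clients see only the public interface and an opaque pointer.
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                        // defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;

    const std::string& name() const;
    int use();                        // returns how often the widget was used

private:
    struct Impl;                      // forward declaration only
    std::unique_ptr<Impl> impl_;      // the one extra heap allocation
};

// --- What would live in the source file (for example widget.cpp) ---
// Private data can change here without touching the header.
struct Widget::Impl {
    explicit Impl(std::string n) : name(std::move(n)), uses(0) {}
    std::string name;
    int uses;
};

Widget::Widget(std::string name) : impl_(new Impl(std::move(name))) {}
Widget::~Widget() = default;
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;

const std::string& Widget::name() const { return impl_->name; }

int Widget::use() {
    assert(impl_ && "use() called on a moved-from Widget");
    return ++impl_->uses;
}
```

Adding a member to Widget::Impl changes only the source file, so files that include the header need no recompilation; the price is the allocation in the constructor and the indirection on every call.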

Another suggested solution is to use a module system;[7] modules were added to the language in C++20.

Global format state of <iostream>

Unlike C <stdio.h>, where the format is given with each call, C++ <iostream> relies on format state that persists in the stream object, which for std::cout is effectively global. This fits very poorly with exceptions: a function may be interrupted by an error after it has changed the format state but before it has reset it. One fix is to restore the state with Resource Acquisition Is Initialization (RAII), which is implemented in Boost[8] but is not a part of the C++ Standard Library.

The global state of <iostream> uses static constructors, which cause overhead.[9] Another source of poor performance is the use of std::endl instead of '\n' for output, because std::endl flushes the stream as a side effect. C++ <iostream> is by default synchronized with <stdio.h>, which can cause performance problems. Turning the synchronization off can improve performance but forces giving up thread safety.
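The difference between std::endl and '\n' can be made visible with a stream buffer that counts flush requests. This is a small sketch, assuming C++11; CountingBuf and count_flushes are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// A string buffer that counts how often the stream asks it to flush.
class CountingBuf : public std::stringbuf {
public:
    int flushes = 0;

protected:
    int sync() override {
        ++flushes;
        return std::stringbuf::sync();
    }
};

// Writes `lines` lines and returns the number of flush requests seen.
int count_flushes(int lines, bool use_endl) {
    assert(lines >= 0);
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) {
        if (use_endl)
            out << i << std::endl;  // newline plus flush
        else
            out << i << '\n';       // newline only
    }
    return buf.flushes;
}
```

Writing 100 lines with std::endl requests 100 flushes, while '\n' requests none. On a file or terminal stream each of those flushes typically becomes a system call.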

Here follows an example where an exception interrupts the function before std::cout can be restored from hexadecimal to decimal. The error number in the catch statement will be written out in hexadecimal which probably isn't what one wants:

#include <cstdlib>
#include <exception>
#include <iostream>
#include <vector>

int main() {
    try {
        std::cout << std::hex;
        std::cout << 0xFFFFFFFF << std::endl;
        // Requesting this many elements throws an exception
        std::vector<int> vector(0xFFFFFFFFFFFFFFFFL, 0);
        std::cout << std::dec; // Never reached
    } catch (const std::exception &e) {
        // std::cout is still in hexadecimal mode, so 10 is printed as "a"
        std::cout << "Error number: " << 10 << std::endl;
    }
    return EXIT_SUCCESS;
}
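The RAII fix mentioned earlier in this section can be sketched with a small guard object. This is not the Boost implementation; FormatGuard and demo_format_guard are hypothetical names, and the sketch saves only the format flags (not the fill character, width or precision).

```cpp
#include <cassert>
#include <exception>
#include <ios>
#include <sstream>
#include <stdexcept>
#include <string>

// Remembers a stream's format flags and restores them when the guard
// goes out of scope, even during stack unwinding.
class FormatGuard {
public:
    explicit FormatGuard(std::ios_base& stream)
        : stream_(stream), saved_(stream.flags()) {}
    ~FormatGuard() { stream_.flags(saved_); }

    FormatGuard(const FormatGuard&) = delete;
    FormatGuard& operator=(const FormatGuard&) = delete;

private:
    std::ios_base& stream_;
    std::ios_base::fmtflags saved_;
};

// Writes a value in hexadecimal, throws, and then reports an error number.
// Returns everything that was written to the stream.
std::string demo_format_guard() {
    std::ostringstream out;
    try {
        FormatGuard guard(out);
        out << std::hex << 255 << '\n';  // written as "ff"
        throw std::runtime_error("simulated failure");
    } catch (const std::exception&) {
        // The guard was destroyed during unwinding: decimal is active again.
        out << "Error number: " << 10 << '\n';
    }
    return out.str();
}

// Usage sketch: the error number appears in decimal.
void check_format_guard() {
    assert(demo_format_guard() == "ff\nError number: 10\n");
}
```

The guard takes a std::ios_base reference, so the same class would work for std::cout, file streams and string streams alike.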

It is acknowledged even by some members of the C++ standards body[10] that the iostreams interface is an aging interface that needs to be replaced eventually. This design forces the library implementers to adopt solutions that impact performance greatly.[citation needed]

Lacking POSIX support in <iostream>

One of the greatest strengths of C as a system programming language is its close bond with Unix and POSIX. POSIX uses file descriptors to access data in files and various IO devices. The C <stdio.h> uses file structs as portable abstractions that can easily be converted to and from file descriptors.[11][12] It is therefore easy to write portable C code using the standard library and to drop down to POSIX only in the few functions that need features tied to the operating system.[original research?]

In the C++ <iostream> there is no portable way to convert its classes to or from file descriptors. The only[citation needed][original research?] way to do this without resorting to C <stdio.h> is to use Boost[13] which isn't part of the C++ standard library. This limits the use of C++ as a system programming language.[citation needed]

#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include <poll.h> // not part of the c standard library

bool POSIX_Poll(char buffer[], size_t size, size_t seconds, FILE *file)
{
    if (!buffer || !file)
        return false;

    struct pollfd fd = {
        .fd = fileno(file),
        .events = POLLIN,
        .revents = 0,
    };
    
    int ret = poll(&fd, 1, 1000 * seconds);
    if (ret < 0) {
        snprintf(buffer, size, "%s", "Poll Error");
        return false;
    } else if (!ret) {
        snprintf(buffer, size, "%s", "Poll Timeout");
        return false;
    } else {
        if (!fgets(buffer, size, file)) {
            snprintf(buffer, size, "%s", "Read Error");
            return false;
        }
    }
    
    return true;
}

int main()
{
    // Let's wait for input from the stdin or some other 
    // file stream for 5 seconds. If nothing comes we'll
    // set an error text. This is not possible with
    // std::cin or other related C++ classes

    char buffer[100] = {0};

    if (!POSIX_Poll(buffer, sizeof(buffer), 5, stdin)) {
        puts(buffer);
        return EXIT_FAILURE;
    }

    printf("Input: %s", buffer);
    return EXIT_SUCCESS;
}

Heap allocations in containers

After the inclusion of the STL in C++, its templated containers were promoted while traditional C arrays were strongly discouraged.[14] An important property of containers like std::string and std::vector is that they keep their elements on the heap instead of on the stack like C arrays.[15][16] To stop them from allocating on the heap, one would be forced to write a custom allocator, which the standard library does not provide. Heap allocation is slower than stack allocation, which makes claims that the classical C++ containers are "just as fast" as C arrays somewhat untrue.[17][18] They are just as fast to use, but not to construct. One way to address this problem was to introduce stack-allocated containers like boost::array[19] and std::array.[20]

For strings there is the possibility of SSO (short string optimization), where only strings exceeding a certain size are allocated on the heap. The heap is most useful for large objects that need to be copied, using things like move semantics, shallow copies and memory exceptions. What counts as a large object is a question for each function, module or project, and it is an important one for performance and security. There is, however, no standard way in C++ for the user to choose the SSO limit; it remains hard coded and implementation specific.[21]

Here is an example where the heap allocations are written out, just for the trivial task of initializing a vector with integers. Using a C array wouldn't trigger any heap allocations at all, which shows the cost of some of these high level abstractions:[22][23][24]

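#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <new>
#include <vector>

// Replace the global allocation function to log every heap allocation.
void *operator new(std::size_t size) {
    std::puts("Heap allocation"); // not std::cout, which might allocate itself
    if (void *ptr = std::malloc(size ? size : 1))
        return ptr;
    throw std::bad_alloc();
}

// Matching deallocation function for memory obtained from malloc.
void operator delete(void *ptr) noexcept {
    std::free(ptr);
}

int main() {
    // Will probably allocate on the heap 5 times (implementation dependent).
    // We "could" reserve memory up front with reserve(),
    // but let's forget it!

    std::vector<int> vector;
    for (auto i = 0; i < 10; ++i)
        vector.push_back(i);

    for (auto &i : vector)
        std::cout << i << std::endl;

    return EXIT_SUCCESS;
}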
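The effect of reserving memory up front, which the example above deliberately skips, can be measured with a counting allocator. This is a sketch under assumptions: CountingAllocator, g_allocations and allocations_for are hypothetical names, and the exact count without reserve() depends on the growth policy of the standard library in use.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Counter shared by all allocator instances to keep the sketch short.
static int g_allocations = 0;

// Minimal C++11 allocator that counts calls to allocate().
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return false;
}

// Fills a vector with `count` integers and returns the number of heap
// allocations that were needed.
int allocations_for(int count, bool reserve_first) {
    assert(count >= 0);
    g_allocations = 0;
    std::vector<int, CountingAllocator<int>> numbers;
    if (reserve_first)
        numbers.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i)
        numbers.push_back(i);
    return g_allocations;
}
```

With reserve() a single allocation is expected for ten elements; without it the vector reallocates several times as it grows. Neither variant reaches zero allocations, which is the point of the criticism.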

Iterators

The philosophy of the Standard Template Library (STL) embedded in the C++ Standard Library is to use generic algorithms in the form of templates using iterators. Iterators are hard to implement efficiently which caused Alexander Stepanov to blame some compiler writers for their initial weak performance.[25] The complex categories of iterators have also been criticized,[26][27] and ranges have been proposed for the C++ standard library.[28]

One big problem is that iterators often refer to heap-allocated data in the C++ containers and become invalid if the containers move that data. Functions that change the size of a container often invalidate all iterators pointing into it, creating dangerous cases of undefined behavior. Here is an example where the iterator in the for loop is invalidated because the std::string container changes its size on the heap:

#include <cstdlib>
#include <iostream>
#include <string>

int main() {
    std::string text = "One\nTwo\nThree\nFour\n";
    // Let's add an '!' where we find newlines
    for (auto i = text.begin(); i != text.end(); ++i) {
        if (*i == '\n') {
            // Intentional bug: insert() invalidates i. The fix would be
            //     i = text.insert(i, '!') + 1;
            text.insert(i, '!');
            // Without updating the iterator this program has
            // undefined behavior and will likely crash
        }
    }

    std::cout << text;
    return EXIT_SUCCESS;
}
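For comparison, here is a sketch of the repaired loop, in which the iterator returned by insert() replaces the invalidated one. The function name mark_line_ends is hypothetical.

```cpp
#include <cassert>
#include <string>

// Inserts `mark` in front of every newline and returns the result.
std::string mark_line_ends(std::string text, char mark) {
    for (auto it = text.begin(); it != text.end(); ++it) {
        if (*it == '\n') {
            // insert() returns a valid iterator to the inserted character;
            // adding one moves back onto the newline that was just found,
            // and the loop increment then steps past it.
            it = text.insert(it, mark) + 1;
        }
    }
    return text;
}

// Usage sketch.
void check_mark_line_ends() {
    assert(mark_line_ends("One\nTwo\n", '!') == "One!\nTwo!\n");
}
```

The fix is a single assignment, which illustrates how easy the mistake is to make: both versions compile without diagnostics.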

Auto and deduced type systems

Introduced with the C++11 standard, the auto keyword provides a way to declare variables without an explicitly written type, which prior C++ standards required. The necessity for an auto feature might be seen as a result of the strong influence of the intuitionistic type theory introduced by Per Martin-Löf in the 1970s. As mathematics and computer science became more closely related, perhaps to the detriment of the original relationship of C++ with its roots in computer engineering, it was unsurprising that functional programming dominated the development of the language from the 1990s on, with marked influences from dependently typed languages like Agda.

However, the introduction of a context-dependent type system into a language that is used in mission-critical systems such as X-ray machines, internet routers, avionics and gaming systems introduces an uncertainty that can be fatal. The first two chapters (40 pages) of what many people consider the authoritative book on modern C++, Effective Modern C++ by Scott Meyers,[29] deal with unraveling the extremely complex set of rules that govern type deduction by templates and auto.

The complexity introduced by auto is so severe that LLVM adopts a policy in its Coding Standards[30] to use auto if and only if it makes the code more readable or easier to maintain, thus discouraging the policy of "almost always auto". LLVM also warns[31] that the default behavior of auto is to copy, which can be particularly expensive in range-based for loops.

#include <array>
#include <cstdlib>
#include <iostream>

class expensive {
public:
    expensive() = default;
    expensive(const expensive &) {
        std::cout << "Expensive copy" << std::endl;
    }

    void calculate() {
        std::cout << "Expensive calculation" << std::endl;
    }
};

int main() {
    std::array<expensive, 10> array;
    // We forgot the reference: "for (auto &i : array)" would avoid the copies.
    // As written, every element is copied before it is used.
    for (auto i : array)
        i.calculate();

    return EXIT_SUCCESS;
}
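A few of the deduction rules behind this behavior can be checked at compile time. The sketch below assumes C++11; Tracked, auto_deduction_rules and copies_made are hypothetical names, and the examples cover only a small part of the rules discussed above.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Compile-time checks of what `auto` deduces in common situations.
void auto_deduction_rules() {
    int value = 42;
    const int& ref = value;

    auto copy = ref;           // reference and const are dropped: int
    auto& alias = ref;         // const is kept behind a reference: const int&
    const auto& view = value;  // explicitly const reference: const int&
    auto&& fwd = value;        // forwarding reference bound to an lvalue: int&

    static_assert(std::is_same<decltype(copy), int>::value, "int");
    static_assert(std::is_same<decltype(alias), const int&>::value, "const int&");
    static_assert(std::is_same<decltype(view), const int&>::value, "const int&");
    static_assert(std::is_same<decltype(fwd), int&>::value, "int&");

    copy = 7;                  // modifies only the copy
    assert(value == 42);
    (void)alias; (void)view; (void)fwd;
}

// Element type that counts how often it is copied.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

// Returns the number of copies made by one pass over three elements.
int copies_made(bool by_reference) {
    std::vector<Tracked> items(3);
    Tracked::copies = 0;
    if (by_reference) {
        for (const auto& item : items) (void)item;
    } else {
        for (auto item : items) (void)item;
    }
    return Tracked::copies;
}
```

Iterating by value copies each of the three elements once, while iterating by const reference copies none; the two loops differ by a single character.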

Uniform initialization syntax

The C++11 uniform initialization syntax and std::initializer_list construction share the same brace syntax, and which of them is triggered depends on the constructors that a class declares. If there is a matching std::initializer_list constructor, it is called. Otherwise the normal constructors are called with the uniform initialization syntax. This can be confusing for beginners and experts alike.[32][33]

#include <cstdlib>
#include <iostream>
#include <vector>

int main() {
    int integer1{10}; // int
    int integer2(10); // int
    std::vector<int> vector1{10, 0}; // std::initializer_list: two elements
    std::vector<int> vector2(10, 0); // size_t, int: ten zeros

    std::cout << "Will print 10"
    << std::endl << integer1 << std::endl;
    std::cout << "Will print 10"
    << std::endl << integer2 << std::endl;

    std::cout << "Will print 10,0," << std::endl;
    for (auto &i : vector1) std::cout << i << ',';
    std::cout << std::endl;
    std::cout << "Will print 0,0,0,0,0,0,0,0,0,0," << std::endl;
    for (auto &i : vector2) std::cout << i << ',';
    std::cout << std::endl;

    return EXIT_SUCCESS;
}
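The same ambiguity appears in user-defined classes as soon as a std::initializer_list constructor is added. The class Grid below is a hypothetical example written for this illustration, not taken from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical class with both a two-argument constructor and a
// std::initializer_list constructor.
struct Grid {
    std::size_t cells;
    bool from_list;

    Grid(std::size_t rows, std::size_t cols)
        : cells(rows * cols), from_list(false) {}
    Grid(std::initializer_list<std::size_t> values)
        : cells(values.size()), from_list(true) {}
};

// Usage sketch: parentheses and braces select different constructors.
void check_grid() {
    Grid parens(3, 4);  // calls Grid(rows, cols): 12 cells
    Grid braces{3, 4};  // prefers the initializer_list overload: 2 cells
    assert(!parens.from_list && parens.cells == 12);
    assert(braces.from_list && braces.cells == 2);
}
```

If the std::initializer_list constructor were added to Grid in a later version, every existing caller that used braces would silently change meaning.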

Exceptions

One problem with C++ exceptions is that they do not work properly in destructors, which impedes the use of the RAII idiom. RAII advises acquiring resources in the constructor and releasing them in the destructor. Exceptions result in stack unwinding, which calls more destructors. If a destructor throws while another exception is already being propagated, the C++ runtime calls std::terminate(), which ends the program.[34] This forces one to use global variables or other tricks to report errors from destructors. Since C++11 every destructor is implicitly marked noexcept by default, so an exception leaving it immediately calls std::terminate(). Exceptions are needed in C++ because its operator overloading leaves no room for error return values.

Another concern is that the zero-overhead principle[35] isn't compatible with exceptions.[36]

Here is an example where exceptions are allowed to leave destructors and therefore provoke a std::terminate():

#include <cstdlib>
#include <iostream>
#include <stdexcept>

class connection {
public:
    connection() = default;
    ~connection() noexcept(false) {
        throw std::runtime_error("Connection Error!");
    }
};

static void InitConnection() {
    connection connection1;
    connection connection2;
    // connection2 is destroyed first and throws. While the stack is being
    // unwound, the destructor of connection1 throws a second exception,
    // which leads to std::terminate().
}

int main() {
    try {
        InitConnection();
    } catch (const std::runtime_error &e) {
        std::cout << e.what() << std::endl;
    }

    return EXIT_SUCCESS;
}
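One of the "other tricks" for reporting errors is to offer an explicit release function that may throw and to let the destructor swallow failures. The sketch below assumes C++11; Connection, close() and close_two_connections are hypothetical names.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical resource whose release step can fail.
class Connection {
public:
    explicit Connection(bool fail_on_close) : fail_on_close_(fail_on_close) {}

    // Explicit release: callers that care about errors call this and may
    // catch the exception.
    void close() {
        if (!open_) return;
        open_ = false;
        if (fail_on_close_)
            throw std::runtime_error("Connection Error!");
    }

    // The destructor stays implicitly noexcept and swallows failures, so
    // it is safe to run during stack unwinding.
    ~Connection() {
        try { close(); } catch (const std::exception&) { /* log or ignore */ }
    }

private:
    bool fail_on_close_;
    bool open_ = true;
};

// Returns the error text reported by close(), or an empty string.
std::string close_two_connections() {
    std::string error;
    try {
        Connection first(true);
        Connection second(true);
        second.close();  // throws; both objects are then destroyed safely
    } catch (const std::runtime_error& e) {
        error = e.what();
    }
    return error;
}

// Usage sketch: the error is reported once and the program keeps running.
void check_close_two_connections() {
    assert(close_two_connections() == "Connection Error!");
}
```

The drawback is visible in the destructor: the failure of the first connection is silently lost, so RAII alone no longer guarantees that errors are reported.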

Strings without Unicode

The C++ Standard Library offers no real support for Unicode compared to frameworks like Qt.[37][38] std::basic_string::length only returns the length of the underlying array, which is acceptable when using ASCII or UTF-32 but not when using variable-length encodings like UTF-8 or UTF-16. In these encodings the array length has little to do with the string length in code points.[39] There is no support for advanced Unicode concepts like normalization, surrogate pairs, bidi or conversion between encodings. There isn't even a way to change between lowercase and uppercase letters without resorting to the C standard library.[40]

This will print the lengths of two strings with the same number of Unicode code points:

#include <cassert>
#include <cstdlib>
#include <iostream>
#include <string>

int main() {
    // This will print "22 18".
    // The u8 prefix is just to be explicit. Since C++20 u8 literals have
    // type char8_t and no longer initialize a std::string, so this
    // example compiles as C++11 through C++17.
    std::string utf8  = u8"Vår gård på Öland!";
    std::string ascii = u8"Var gard pa Oland!";
    std::cout << utf8.length() << " " << ascii.length() << std::endl;
    assert(utf8.length() == ascii.length()); // Fail!
    return EXIT_SUCCESS;
}
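Counting code points has to be written by hand. A minimal sketch follows, assuming the input is valid UTF-8; the function name utf8_code_points is hypothetical, and the string is written with byte escapes so that the example does not depend on the source file encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts code points in a string that is assumed to hold valid UTF-8.
// Continuation bytes have the bit pattern 10xxxxxx and are skipped.
std::size_t utf8_code_points(const std::string& text) {
    std::size_t count = 0;
    for (unsigned char byte : text) {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// Usage sketch with the sentence from the example above.
void check_utf8_code_points() {
    // "Vår gård på Öland!" spelled out as UTF-8 bytes.
    const std::string utf8 = "V\xC3\xA5r g\xC3\xA5rd p\xC3\xA5 \xC3\x96land!";
    assert(utf8.length() == 22);           // bytes
    assert(utf8_code_points(utf8) == 18);  // code points
    assert(utf8_code_points("Var gard pa Oland!") == 18);
}
```

This counts code points only. It does not validate the input and does not handle combining characters, which is where a full Unicode library becomes necessary.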

Verbose assembly and code bloat

There have long been accusations that C++ generates code bloat.[41]

Important C99 features missing

Several important C99 features are missing from C++. This makes it harder, and sometimes impossible, to compile newer C code with C++ compilers. The criticism must be regarded as somewhat milder because C++ is a different language from C. It is mentioned here only because these features create big problems when modern C code is used in C++ projects.

Designated Initializers

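With designated initializers, arrays can be initialized more flexibly at compile time. In C++ one would instead use constexpr functions to do it. C++20 added designated initializers for the members of aggregates, but not the array index form used here.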

#include <stdio.h>
#include <stdlib.h>

enum
{
    WELCOME,
    STATEMENT = 3,
    COMMENT   = 6,
    GOODBYE   = 11,
    MAX,
};

const char *Text[MAX] =
{
    [WELCOME]   = "Welcome fellow programmer!",
    [STATEMENT] = "Here we are using designated initializers",
    [COMMENT]   = "They are not a part of the C++ standard",
    [GOODBYE]   = "See you later!"
};

int main()
{
    for (size_t i = 0; i < MAX; i++) {
        if (!Text[i]) puts("Waiting");
        else puts(Text[i]);
    }
    return(EXIT_SUCCESS);
}
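The constexpr alternative mentioned above might look as follows. This is a sketch assuming C++14 or later; TextTable, make_text_table and text_or are hypothetical names, and the texts are adapted from the C example.

```cpp
#include <cassert>
#include <cstddef>

enum : std::size_t { WELCOME = 0, STATEMENT = 3, COMMENT = 6, GOODBYE = 11, MAX };

// Wrapper so that the whole table can be returned from a function.
struct TextTable {
    const char* entries[MAX];
};

// Requires C++14, which allows assignments inside constexpr functions.
constexpr TextTable make_text_table() {
    TextTable table{};  // every entry starts as a null pointer
    table.entries[WELCOME]   = "Welcome fellow programmer!";
    table.entries[STATEMENT] = "Here the table is filled by a constexpr function";
    table.entries[COMMENT]   = "Entries that are not set stay null";
    table.entries[GOODBYE]   = "See you later!";
    return table;
}

constexpr TextTable Text = make_text_table();

static_assert(Text.entries[WELCOME] != nullptr, "set at compile time");
static_assert(Text.entries[1] == nullptr, "gaps are null");

// Returns the text for `index`, or "Waiting" for entries that were not set.
const char* text_or(std::size_t index) {
    assert(index < MAX);
    return Text.entries[index] ? Text.entries[index] : "Waiting";
}
```

The table is still built at compile time, but it takes a wrapper type and a helper function to express what C states in a single initializer.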

Variable Length Arrays

This example uses a variable-length array (VLA) to allocate dynamically sized memory on the stack, which can replace heap allocations with faster stack allocations. Several C++ compilers support this feature as an extension, independently of the C++ standard.[42] Standard C++ has no similar feature, so one is forced to use the heap for dynamic memory allocation.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void Fibonacci(size_t size)
{
    // Here we are using a Variable Length Array.
    // We can use the stack for dynamic allocation.

    if (size < 2) return;
    unsigned long long array[size];
    memset(array, 0, sizeof(array));
    array[0] = 0;
    array[1] = 1;
    for (size_t i = 2; i < size; i++) {
        array[i] = array[i-1] + array[i-2];
        if (array[i] < array[i-1]) {
            array[i] = 0;
            break;
        }
    }
    
    for (size_t i = 0; i < size; i++)
        printf("%llu ", array[i]);
}

int main()
{
    Fibonacci(100);
    return(EXIT_SUCCESS);
}

Restrict keyword

For very time-critical tasks, restrict pointers can help the compiler optimize the code by promising that the pointers do not alias. Standard C++ has no such keyword; without it, or a compiler-specific extension such as __restrict, one might be required to use inline assembler to achieve the same performance in C++.
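In practice many C++ compilers accept a non-standard spelling of the keyword. The sketch below assumes a compiler that supports the __restrict extension (GCC, Clang and MSVC do); it is not standard C++, and scale_add is a hypothetical function.

```cpp
#include <cassert>
#include <cstddef>

// `__restrict` is a compiler extension, not part of standard C++.
// It promises that `out` and `in` never refer to overlapping memory,
// so the compiler may keep values in registers and vectorize the loop.
void scale_add(float* __restrict out, const float* __restrict in,
               float factor, std::size_t count) {
    assert(out != in);  // overlapping buffers would break the promise
    for (std::size_t i = 0; i < count; ++i)
        out[i] += in[i] * factor;
}
```

Calling the function with overlapping buffers is undefined behavior, and code that relies on the extension is tied to the compilers that provide it.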
