ikerhurtado.com
You're in
Iker Hurtado's pro blog
Developer | Entrepreneur | Investor
Software engineer (entrepreneur and investor at times). These days doing performant frontend and graphics on the web platform at Barcelona Supercomputing Center

Relearning C++: Notes on the Standard Template Library (STL)

6 Mar 2015   |   iker hurtado  
Share on Twitter Share on Google+ Share on Facebook
In this post I write down some notes as from my study of C++ on the Standard Template Library

The Standard Template Library provides a collection of standard C++ templates representing containers, iterators, algorithms and function objects.

Containers

Sequence containers

They are container classes that maintain the ordering of elements as these are inserted. There are 3 sequence containers:

vector

Vectors are dynamic arrays capable of growing as needed to contain its elements.

They use contiguous storage locations for their elements, as efficiently as in arrays. But like their size can change dynamically, their storage is handled automatically by the container.

Internally, vectors use a dynamically allocated array to store their elements. This array may need to be reallocated in order to grow in size. This is a relatively expensive task, so vectors do not reallocate each time an element is added. Instead, vector containers may allocate some extra storage to accommodate for possible growth.

Vectors are very efficient accessing its elements and quite efficient adding or removing elements from its end. For operations that involve inserting or removing elements at positions other than the end, they perform worse than the others, and have less consistent iterators and references than lists and forward_lists.

To use size_type, we must name the type in which it is defined. A vector type always includes its element type:

vector::size_type // ok
The standard requires that vector implementations can efficiently add elements at run time. So, it is often unnecessary to define a vector of a specific size.

deque

This is a double-ended queue class, implemented as a dynamic array that can grow from both ends. It allows elements to be added and removed from both the front and the rear, at constant time.

list

It's a double-linked list that can be transverse in both direction. It support constant-time insertion and removal, but does not allow direct (random) access (except with a indexing operator).

forward_list

Added in C++11 version. It's single-linked list that support forward transversal only. It is simpler and with similar features than list.

Associative containers

This type of containers store key-value pairs: associate a key with a value, and use a key to find the value. Some of them (sorted) automatically sort their inputs when inserted into the container.

set and multiset

In this containers, the key is the same as the value. Both are sorted and reversible. Only multiset allows duplicate values.

Although the set types define both the iterator and const_iterator types, both types of iterators give us read-only access to the elements in the set. Just as we cannot change the key part of a map element, the keys in a set are also const.

map and multimap

This containers store key-value pairs, where keys are unique and each key is associated with one value (map) or more values (multimap). Both are sorted and reversible.

In order to represent each key-value, STL provides a template class called pair.

Subscripting a map behaves quite differently from subscripting an array or vector: Using a key that is not already present adds an element with that key to the map. So, we cannot use the subscript operator to determine whether an element is present. Better use the find() method.

Usually, the type returned by dereferencing an iterator and the type returned by the subscript operator are the same. Not so for maps: when we subscript a map, we get a mapped_type object; when we dereference a map iterator, we get a value_type object.

Unordered associative containers

The C++11 standard added the unordered versions of the previously refered containers: unordered_set, unordered_multiset, unordered_map, and unordered_multimap. These are implemented by hash table (efficient in insertion, removal and search, but requires more storage spaces); whereas the ordered counterparts are based on tree.

Although hashing gives better average case performance in principle, achieving good results in practice often requires a fair bit of performance testing and tweaking. As a result, it is usually easier (and often yields better performance) to use an ordered container. Use an unordered container if the key type is inherently unordered or if performance testing reveals problems that hashing might solve.

Iterators

An iterator is an object intended to traverse (iterate over) a container without the user having to know how the container is implemented. For many containers (particularly lists and the associative classes), iterators are the usual way to access their elements.

Iterators give indirect access to an object. That object is an element in a container or a character in a string.

An iterator can be defined as a class which allows to efficiently access items in a specific container. An iterator is a pure abstraction, that is, any element that acts like an iterator is defined as an iterator. In other words, an iterator is an abstraction of the concept of pointer to an element of a sequence.

The main key elementes of a iterator are:

  • The element pointed to (dereferenced using the operators * and ->).
  • The ability to point to the next element (increasing by ++ operator).
  • Equality (represented by the == operator).

The main advantages of using an iterator over trying the direct access to a container element are:

  • Direct access breaks the encapsulation of the wrapper class. On the contrary, the iterator is usually a friend of the class and can efficiently iterate without exposing implementation details.
  • The iterator simplifies the process of iteration. Most iterators act as indexes or pointers, allowing the use of loop structures to traverse the container by increment and comparison operators.

A valid iterator either denotes an element or a position one past the last element in a container. All other iterator values are invalid.

The types with iterators have members that return iterators. The way to get them:

auto b = v.begin();
auto e = v.end(); // e denotes one past the last element in v

The iterator returned by end() is positioned one past the end of the container, therefore it denotes a nonexistent element in the container. It is used as a marker indicating when we have processed all the elements.

If the container is empty, begin returns the same iterator as the one returned by end.

When using auto with begin or end, the iterator type obtained depends on the container type and it is irrelevant for working with the iterators.

An iterator range is denoted by a pair of iterators each of which refers to an element, or to one past the last element, in the same container.

The arrow operator (->) combines dereference and member access into a single operation, and it is very used with iterator. So, it->mem is equivalent for (* it).mem.

iterator and const_iterator

In addition to iterator type, the container type define const_iterator as read-only iterator.

The type returned by begin() and end() depends on the object. If the object is const, they return a const_iterator. To let us ask specifically for the const_iterator type, There are the functions: cbegin() and cend() (use if write access is not needed).

Containers operations

Access to elements

The members that access elements in a container return references. If the container is a const object, the return is a reference to const. If the container is not const, the return is an ordinary reference that we can use to change the value of the fetched element.

This lines are very didactic:

vector1.front()  = 12;      // assigns 12 to the first element
auto &e =  vector1.back();  // get a reference to the last element
e = 104;                    // changes the element 
auto e2 =  vector1.back();  // e2 is not a reference; it's a copy of vector1.back()
e2 = 0;                     // no change to the element 
As usual, if we use an auto variable to store the return and we want to change the element, we must define this variable as a reference.

Some containers, what provide fast random access also provide the subscript operator. This operator takes an index and returns a reference to the element at that position in the container.

It is up to the program to ensure the index validity; the subscript operator does not check whether the index is in range. In the other hand, the at() member that checks the validity of the index and throws an out_of_range exception if invalid.

Assignment

The assignment operator replaces the entire range of elements in the left-hand container with copies of the elements from the right-hand operand (set the same size):

container = contatiner2;  

The sequential containers also define a member named assign that lets assign from a different but compatible type, or assign from a subsequence of a container. Like in assignment operator, it replaces all the elements with copies of the elements specified by its arguments.

vector.assign (7,100);             // 7 ints with a value of 100
vector2.assign (it,vector.end()-1);

Swapping

Swapping operation exchanges values of two objects (of the same type). This is a fast operation because the elements themselves are not swapped; internal data structures are swapped (with move operations). Excepting array, swap does not copy, delete, or insert any elements and is guaranteed to run in constant time.

swap(v1, v2);

Insertion

The push_back function creates a new element (copy of param value) at the end of sequential container, increasing the size by 1.

Each of the insert functions takes an iterator as its first argument. The iterator indicates where in the container to put the element(s). It can refer to any position in the container, including one past the end of the container. Because the iterator might refer to a nonexistent element off the end of the container, and because it is useful to have a way to insert elements at the beginning of a container, element(s) are inserted before the position denoted by the iterator.

The emplace functions construct elements in the container. The arguments to these functions must match a constructor for the element type.

c.emplace_back("97867567", 15.99);
When we use an object to initialize a container, or insert an object into a container, a copy of that object’s value is placed in the container, not the object itself. Subsequent changes to the element in the container have no effect on the original object, and viceversa.

Deletion and resizing

The members that remove elements do not check their arguments. We must ensure that elements exist before removing them.

With the usual exception of arrays, we can use resize to make a container larger or smaller.

Hidden invalidations

Operations that add or remove elements from a container can invalidate (no longer points to an element) pointers, references, or iterators to elements.

For example (for other secuential types can be diferent) after an operation that adds or remove elements to a container:

ADDITION: Iterators, pointers, and references to a vector or string are invalid if the container was reallocated. If no reallocation happens, indirect references to elements before the insertion remain valid; those to elements after the insertion are invalid.

DELETION: After we remove an element, all other iterators, references, or pointers to a vector or string remain valid for elements before the removal point. The off-the-end iterator is always invalidated when we remove elements.

Hence, when you use an iterator (or a reference or pointer to a container element), it is a good idea to minimize the part of the program during which an iterator must stay valid.

Generic Algorithms

In addition to containers and iterators, STL also provides a number of generic algorithms for working with the elements of the container classes. These allow search, sort, insert, reorder, remove, and copy elements of the container class. These algorithms are implemented as global functions that operate using iterators.

Usually, the algorithms do not work directly on a container. Instead, they operate by traversing a range of elements bounded by two iterators.

With only a few exceptions, the algorithms operate over a range of elements (called input range). The algorithms that take an input range always use their first two parameters to denote that range. These parameters are iterators denoting the first and one past the last elements to process.

Ordinarily it is best to use cbegin() and cend() with algorithms that read, but do not write, the elements. However, if you plan to use the iterator returned by the algorithm to change an element’s value, then you need to pass begin() and end().

The tuple Type

A tuple is useful when we want to combine some data into a single object but do not want to bother to define a data structure. A tuple can be thought of as a quick and dirty data structure. Example:

tuple, int> tupleExample("sometext", {3.5, 24.3}, 23);

Alternatively, the library defines a make_tuple function that generates a tuple object:

auto tupleExample1 = make_tuple("sometext", 3, 20.00);
This function uses the types of the supplied initializers to infer the type of the tuple.

It's possible to pass sequences of tuples to the algorithms and can use a tuple as key type in an ordered container.

A common use of tuple is to return multiple values from a function.


This topic is very well covered in the book Desarrollo de Videojuegos: un enfoque práctico -Chapters 5, 7 and 21.

POST A COMMENT: