using_algorithms [SFTL]

- Using algorithms

Using algorithms

This section gives some details about the use of algorithms.

The map algorithm will be used throughout for illustration.

The base example

// examples/basic_example.cpp
 
#include <iostream>
#include <vector>
#include "sftl.h"
 
using namespace std;
 
// inverse function
double inverse( int x ) {
	return (1.0/x);
}
 
int main() {
 
	// create numbers 1..9
	vector<int> numbers;
	for (int i=0;i<9;i++)
		numbers.push_back(i+1);
 
	// get their inverse
	vector<double> inverses = map( numbers, inverse );
 
	// print them out
	for (int i=0;i<9;i++)
		cout << inverses[i] << endl; 
}

In this base example, map is used to get the inverse value of some numbers.

Containers are passed by value

We can see that numbers, the input container, is passed by value and that inverses, the output container, is also returned by value.

This may seem inefficient but this is an important design choice because it allows to chain algorithms.

For instance, by including sftl_utils.h (bundled in the SFTL distribution), the same code could be rewritten as :

...
 
int main() {
    cout << toString( map( makeRange(1,9), inverse ) ) << endl;
}

In this code makeRange creates a vector initialized to numbers 1..9. That vector is passed by value to map that transforms it into a vector of inverses. That vector is in turn piped into toString that converts it into a string. That string is finally streamed to the standard output.

This simply would not have been possible if containers were passed by reference.

This latter code may seem less readable for eyes not used to functional programming but it has many advantages :

There are fewer lines to read
Obviously, it required less typing
There's no chance you could have messed up with your indexes (because there aren't any!)
You didn't need to come up with names for your temporary variables

As for why the containers are passed by value, the library is designed as such because it's the nicest way to pipe results from algorithms directly into other algorithms.

There are other ways to pipe results (for instance through pointers) but :

it would be at the expense of syntax-friendliness
it would require dodgy background machinery to release dynamically allocated memory for containers

~~The SFTL provides variants of the algorithms that accept arguments by reference but, as they're breaking the philosophy of the library, they'll only be introduced later on.~~

Container type in = container type out

The algorithms (such as map, filter and reduce) return a container of the same template type of the input container.

(this is not about the type of the objects stored in the container, this is about the type of the container type itself : vector, list, deque, …).

In the base example numbers is defined as a vector of ints. Thanks to the C++ template mechanism, the result container can be instantiated of the same type, and therefore it's a vector of double that is returned.

This means one could equally write :

...
list<int> numbers;
...
list<double> inverses = map( numbers, inverse );

In the current state of the SFTL, only sequence containers are supported (vector,liste,deque).

Automatic type inference for output objects

Back to the base example : numbers is a vector of ints and inverse is a function that takes an int as argument and returns a double.

The type of objects in the result container is given by the result type of the given functor. The result type being double, it's therefore a container of double that is returned.

Simply put, given :

a container of objects T
a functor that takes an argument of type T and returns a value of type U

map will give a container of objects U

As for the filter algorithm, the behavior is different. filter doesn't transform objects, it just selects some of them. That means objects in the result container will be of the same type as those from the input container, since they're the ones matching the functor test.

Cases when type inference doesn't work

There are cases when type inference can't work its magic. This boils down to the type of functor used with the algorithm.

Type inference cannot be obtained when the functor (see using Functors) is a function object that doesn't provide type information (that is, a function object that doesn't provide a result_type typedef, unlike the ones in STL).

In practice, that means that in the absence of type information, type information must be given explicitly.

Compared to the base example, if inverse was defined as a function object

struct InverseFunctionObject {
    double operator () (int x) {
        return (1.0/x);
    }
};

The return type would have to specified explicitly in order to be used :

...
vector<double> inverses = map<double>( numbers, InverseFunctionObject() );

Notice the <double> right after map ? That's the extra type information provided because it cannot be inferred. It wasn't needed in the earlier examples because it could be deduced automatically.

Automatic referencing and dereferencing

Ok, here's another sexy feature.

So far algorithms have been taking :

a container with objects of type T
a functor taking an argument of type T and returning type U.

This is the perfect case, because the input type (T) matches the functor argument type (T). This is what we have in the base example : numbers holds objects of type int and the inverse functor takes precisely an int argument.

Now, what if it's not the case ? What if the input container hold pointers to ints instead of ints ?

This is illustrated in the following example :

// square functor taking an int 
int square( int x ) {
    return x*x;
}
...
vector<int*> numbers; // vector of pointers to ints
...
vector<int> squares = map( numbers, square ); // <- possible ???

Would it be possible to use the square functor although it takes an int instead of an int* ?

In the opposite case, what if it's the functor that required an int* as parameter instead of an int ?

This is illustrated here :

// inverse functor taking a pointer to an int instead of an int
int square_p( int *x ) { 
    return (*x)*(*x);
}
...
vector<int> numbers; // vector of ints
...
vector<int> squares = map( numbers, square_p ); // <- possible ???

Would it be possible to use square_p on numbers ?

In both cases the functor argument type doesn't match the container's objects type anymore. They're just a reference away from equality. It's type T versus type T*, and vice-versa. Does it mean it's void ?

Well, here is good news : The SFTL takes care of this. The SFTL provides an automatic (de)referencing mechanism that plugs the functor with the input objects if it's just a matter of referencing. In other words, the SFTL adapts the functor call so that the expected type is provided.

In practice, this means that :

if the container holds objects of type T and the functor expects an argument of type T*, the SFTL will, for each input object, get its address and give it to the functor. This is the act of referencing.
if the container holds objects of type T* and the functor expects an argument of type T, the SFTL will, for each input pointer, get the pointed object and give it to the functor. This is the act of dereferencing.

To be complete, if the container holds objects of type T* and the functor also expects an argument of type T*, no translation has to be done since they're both of the same type.

No extra syntax is needed to use this type translation, it's automatic. That also means that the two code samples given just above are perfectly valid and working !

About ranges

When an algorithm is applied on a container, the functor is applied to every element in the container.

This is very different from the STL philosophy : In STL, one always has to specify the range over which to operate. That is, two informations must be provided : an iterator that gives the start of the range, and another one that gives the end of the range. 1)

In SFTL, the algorithms work over the entire range of a container. There's no way to apply a functor to a sub-range of a container. This is not a limitation but a design decision.

In some way, the STL ability to specify ranges can seem powerful but in practice it's not used much and it bloats the code more than anything. <RANT> Is it worth having to type it.begin() and it.end() all over the place just because we occasionally need to use a sub-range ? we're processing the entire range 99.99% of the time anyway…</RANT>

This design decision illustrates one of the goals of the SFTL, that is to exchange, whenever possible, unneeded power for simplicity and conciseness.

1) well, one step past the end

using_algorithms.txt · Last modified: 2007/08/25 01:13 (external edit)