CS105: Introduction to Computer Programming: C++

Integral types

Here is some code that computes the factorial of a number:

#include <iostream>
using std::cout;
using std::endl;

int fact( int n ) {
  if( n <= 1 ) return 1;
  else return n * fact(n-1);
}

int main() {

  for( int i = 0; i < 15; i++ ) {
    cout << "fact(" << i << ") is " << fact(i) << endl;
  }

  return 0;
}

If you run this program on a standard 32-bit computer, you might notice that some of the values that the fact function computes are not correct. This is because of integer overflow: the compiler will try to store int variables using a fixed amount of space, which means that you can only store numbers up to a certain size.

C++ provides some other data types that might be helpful. If we don't care about negative numbers, we can use unsigned ints, which allow us to store positive numbers that are twice as big as those we can store with ints. There are also short and long modifiers, which provide an amount of storage that is less than or equal to ints (for short) or greater than or equal to ints (for long. Here is a version of the factorial function that uses unsigned long ints:

#include <iostream>
using std::cout;
using std::endl;

unsigned long int ulfact( unsigned long int n ) {
  if( n <= 1 ) return 1;
  else return n * ulfact(n-1);
}

int main() {

  for( int i = 0; i < 15; i++ ) {
    cout << "ulfact(" << i << ") is " << ulfact(i) << endl;
  }

  return 0;
}
While unsigned long int provides more storage than regular int, it won't necessarily help us for the factorial function. The factorial function grows so quickly that simply doubling the amount of storage won't help!

Other data types

C++ has most of the normal data types: int, double, float, char, bool (not boolean, but bool!), and wchar_t (a "wide" character used for non-ASCII languages).

The amount of storage allocated for the primitive data types in C++ depends on your computer. You can use the sizeof operator to determine how big a variable or data type is. The number returned by sizeof is a multiple of sizeof(char), which is defined as 1. The number of bits in a char is stored in the constant CHAR_BIT, defined in the <climits> header file.

One might think that since the double data type can represent much larger numbers, we should make a double-based factorial function:

#include <iostream>
using std::cout;
using std::endl;
#include <iomanip>
using std::fixed;

double dfact( double n ) {
  if( n <= 1 ) return 1;
  else return n * dfact(n-1);
}

int main() {

  for( int i = 0; i < 25; i++ ) {
    cout << "dfact(" << i << ") is " << dfact(i) << endl;
  }

  return 0;
}

Unfortunately, this too has problems. Since doubles can't represent every integer precisely, small errors will accumulate as dfact multiplies each number. By the time we get to 25!, the answer is close, but noticably off. If you're interested in learning more about this problem, check out the IEEE 754 standard for representing floating-point numbers.

Casting

C++ uses four different casting operators. In order to do simple double to int casting, use static_cast:

#include <iostream>
using std::cout;
using std::endl;
#include <cmath>

bool isPrime( int n ) {
  if( n == 2 ) return true;
  if( n % 2 == 0 ) return false;

  int sqrtN = static_cast<int>( sqrt(n) );
  for( int i = 3; i <= sqrtN; i += 2 ) {
    if( n % i == 0 ) return false;
  }

  return true;
}


int main() {
  for( int i = 0; i < 50000; i++ ) {
    if( isPrime(i) ) cout << i << " is prime" << endl;
  }
  
  return 0;
}

Bitwise Operators

There are several C++ operators that deal with individual bits: <<, >> &, |, ~, and ^. Here are several examples of a function that uses bitwise operators to compute the binary version of a number:

#include <iostream>
using std::cout;
#include <limits>
#include <string>
using std::string;

string to_binary( unsigned int n ) {
  unsigned int mask = 1;
  string result = "";

  while( mask != 0 ) {
    if( (n & mask) > 0 ) result = '1' + result;  // these two lines
    else result = '0' + result;                  // re-allocate the string
    mask <<= 1;                                  // data repeatedly
  }

  return result;
}


string to_binary_faster( unsigned int n ) {
  int num_bits = sizeof(n) * CHAR_BIT;
  char buf[num_bits+1];                          // one allocation => faster
  unsigned int mask = 1;

  int idx = num_bits - 1;
  while( mask != 0 ) {
    if( (n & mask) > 0 ) buf[idx] = '1';
    else buf[idx] = '0';
    mask <<= 1;
    --idx;
  }

  buf[num_bits] = '\0';                          // don't forget null termination!

  return string(buf);
}



string to_binary_fasteragain( unsigned int n ) {
  int num_bits = sizeof(n) * CHAR_BIT;
  string result( num_bits, '0' );             // fast, and uses the string class

  for( int idx = num_bits - 1;
       idx >= 0 && n > 0;
       --idx, n >>= 1 ) {                     // shift n to the right by 1
    if( (n&1) == 1 ) result[idx] = '1';       // is the right-most bit set?
  }

  return result;
}




int main() {

  // this will take a while
  for( int i = 0; i < 1000000; ++i ) {
    string s = to_binary(i);
  }

  // these are much faster!
  for( int i = 0; i < 1000000; ++i ) {
    //string s = to_binary_faster(i);
    string s = to_binary_fasteragain(i);
  }

}
In the first to_binary function above, using the + operator to insert 0 or 1 in the beginning of the string causes the buffer behind the string to be re-allocated a bunch of times. This results in a rather slow function! It is much faster to make sure the buffer only gets allocated once, and then fill it appropriately.