Fallacies Programmers believe about Compiling

  1. Every line compiles into a single instruction.

This is obviously wrong. Just think of a for-loop: a single line that needs an initialization, a comparison, an increment and a jump.
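
To make the for-loop case concrete, here is a minimal sketch that writes the same loop twice: once as a normal one-line for-loop, and once spelled out with labels and gotos, roughly the shape a compiler lowers it to. The function names are made up for illustration, and a real compiler may reorder or optimize these steps.

```cpp
// Sums 0..n-1 with an ordinary for-loop: one line of loop header.
int sum_for(int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) total += i;
    return total;
}

// The same loop with its hidden steps written out by hand
// (hypothetical lowering, not actual compiler output).
int sum_lowered(int n) {
    int total = 0;
    int i = 0;                     // initialization
loop_test:
    if (!(i < n)) goto loop_end;   // compare + conditional branch
    total += i;                    // loop body
    ++i;                           // increment
    goto loop_test;                // unconditional jump back
loop_end:
    return total;
}
```

Both functions return the same result; the second one just shows that the single for line hides at least four separate operations.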

  1. Every statement compiles into a single instruction.

Take c = sqrt(a * a + b * b); I don't think any CPU has a Pythagoras instruction.

  1. Well, every expression compiles into an instruction.

Nope, also not true. sqrt(a * a + b * b) is a single expression, yet it needs two multiplications, an addition and a square root.
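
As a rough sketch, the same expression can be split into the individual operations it contains. The helper names are invented for this example, and the actual instruction sequence depends on the compiler and target (it may fuse a multiply and add, or call a library routine for the square root).

```cpp
#include <cmath>

// The expression as a programmer would write it.
double hypotenuse(double a, double b) {
    return std::sqrt(a * a + b * b);
}

// The same computation, one operation per line.
double hypotenuse_steps(double a, double b) {
    double aa = a * a;        // multiply
    double bb = b * b;        // multiply
    double sum = aa + bb;     // add
    return std::sqrt(sum);    // square root (instruction or library call)
}
```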

  1. But each atomic expression compiles into an instruction (for some definition of atomic).

Well, a*a could be an overloaded operation in C++, which turns it into a function call. Also, a could be a reference, so its value has to be loaded from memory first. Both cases add more complexity.
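
Here is a small sketch of the overloading case. Poly is a hypothetical type invented for this example; the point is only that a * a on such a type runs a nested loop and allocates memory instead of issuing a single multiply.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical polynomial type: coefficients, lowest degree first.
struct Poly {
    std::vector<int> coeff;
};

// "a * a" on two Poly values calls this function.
Poly operator*(const Poly& lhs, const Poly& rhs) {
    if (lhs.coeff.empty() || rhs.coeff.empty()) return Poly{};
    // Result has degree deg(lhs) + deg(rhs).
    Poly out{std::vector<int>(lhs.coeff.size() + rhs.coeff.size() - 1, 0)};
    for (std::size_t i = 0; i < lhs.coeff.size(); ++i)
        for (std::size_t j = 0; j < rhs.coeff.size(); ++j)
            out.coeff[i + j] += lhs.coeff[i] * rhs.coeff[j];
    return out;
}
```

With Poly a{{1, 1}} (that is, 1 + x), the expression a * a yields the coefficients 1, 2, 1.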

  1. Every instruction takes one cycle.

Even a simple RISC processor typically needs at least a fetch/decode cycle and an execute cycle per instruction. A simple two-stage pipeline overlaps them, so it only feels like every instruction needs one cycle. CISC machines such as x86 have complex instructions, for example for cosine or AES rounds, which take longer than one cycle. In fact, their execution time may vary.

  1. Every basic arithmetic operation takes one cycle.

Not true for floating-point numbers. On microcontrollers without a floating-point unit, even a single addition has to be emulated in software with many integer instructions.

  1. Every basic arithmetic operation for native data-types takes one cycle.

Here is a tricky one: Java uses a different floating-point model, optimized for SPARCs. They have built-in instructions for denormalized floating-point numbers; x86, however, does not. So doing a lot of floating-point operations with Java on x86 can yield surprising performance issues.
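
For readers who have not met denormalized (subnormal) numbers: they are the values between zero and the smallest normal floating-point number. The sketch below only shows how to detect one in C++ with the standard library; is_denormal and smallest_normal are names made up here, and it assumes the compiler is not set to flush denormals to zero (as some fast-math options do).

```cpp
#include <cmath>
#include <limits>

// Smallest positive *normal* float; anything smaller (but nonzero) is denormal.
float smallest_normal() {
    return std::numeric_limits<float>::min();
}

// True if x is a denormalized (subnormal) value.
bool is_denormal(float x) {
    return std::fpclassify(x) == FP_SUBNORMAL;
}
```

Halving smallest_normal() produces a denormal, while ordinary values such as 1.0f and exact zero are not denormal.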

  1. Every basic arithmetic operation on integers takes one cycle.

Nope, division is hard. ADD, SUB and MUL can be implemented very efficiently in hardware. DIV is completely different: it typically takes many cycles, and some small cores have no divide instruction at all.
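
To get a feeling for why division is hard, here is a minimal sketch of textbook shift-and-subtract (restoring) division. This is not how any particular CPU implements DIV, and divide_u32 is a name chosen for this example; it only illustrates that each quotient bit depends on the result of the previous step, so the work is inherently sequential.

```cpp
#include <cstdint>

// Unsigned 32-bit division, one quotient bit per iteration.
// Assumes divisor != 0.
std::uint32_t divide_u32(std::uint32_t dividend, std::uint32_t divisor) {
    std::uint32_t quotient = 0;
    std::uint64_t remainder = 0;  // 64-bit so the shift cannot overflow
    for (int bit = 31; bit >= 0; --bit) {
        // Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((dividend >> bit) & 1u);
        // If the divisor fits, subtract it and set this quotient bit.
        if (remainder >= divisor) {
            remainder -= divisor;
            quotient |= (std::uint32_t{1} << bit);
        }
    }
    return quotient;
}
```

An addition can compute all its bits more or less in parallel; this loop needs 32 dependent compare-and-subtract steps.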

  1. One instruction is executed per cycle.

Now that's just cute.

  1. At least one instruction is executed per cycle.

Nope, if the CPU is waiting for data from RAM, the whole pipeline can stall.

  1. At most one instruction is executed per cycle.

Modern CPUs are superscalar. They have multiple execution units, so several independent instructions can be executed in the same cycle.
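
A sketch of what "independent instructions" means in source code: the first function below has one long chain of dependent additions, the second keeps two separate running sums that do not depend on each other. The function names are invented for this example. Whether the second version actually runs faster depends on the CPU and on what the compiler does with either loop, so treat it as an illustration, not a benchmark.

```cpp
#include <cstddef>
#include <vector>

// Every addition depends on the previous one.
long long sum_one_chain(const std::vector<int>& v) {
    long long total = 0;
    for (int x : v) total += x;
    return total;
}

// Two independent chains; a superscalar CPU may overlap them.
long long sum_two_chains(const std::vector<int>& v) {
    long long even = 0, odd = 0;
    std::size_t i = 0;
    for (; i + 1 < v.size(); i += 2) {
        even += v[i];       // does not depend on `odd`
        odd += v[i + 1];    // does not depend on `even`
    }
    if (i < v.size()) even += v[i];  // leftover element for odd sizes
    return even + odd;
}
```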

  1. Instructions leave the CPU in the order they come in.

Out-of-order execution can make some instructions overtake others.

  1. At least, instructions enter the CPU as defined in the program.

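Even that may not be guaranteed: with branch prediction, the CPU speculatively fetches instructions along a predicted path, which may turn out not to be the path the program takes.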

to be continued