CSAPP-lecture-6

CSAPP lecture 6

1. Condition code

  • Four condition codes:

    • CF: Carry Flag. Set to 1 if the most recent operation generated a carry out of the most significant bit. Used to detect overflow in unsigned operations.
    • ZF: Zero Flag. Set to 1 if the most recent operation yielded 0.
    • SF: Sign Flag. Set to 1 if the most recent operation yielded a value whose sign bit is 1 (a negative value).
    • OF: Overflow Flag. Set to 1 if the most recent operation caused a two's-complement overflow, either positive or negative.

    The leaq instruction does not alter any condition code, because it is intended for address computation. The other arithmetic and logical instructions set the condition codes as a side effect.

  • Logical operations (such as xor or and) set CF and OF to 0.

  • Shift operations set CF to the last bit shifted out and set OF to 0.

  • The inc and dec instructions set OF and ZF but leave CF unchanged.

  • Two instruction classes, cmp and test, set the condition codes without changing any other register: cmp behaves like sub and test behaves like and, except that the result is discarded.

We can check whether a value is negative, zero, or positive with testq %rax, %rax (assuming the value is stored in register %rax).
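
To make the four flag definitions concrete, here is a minimal C sketch that models which flags a 64-bit addition would leave behind. The struct flags type and the function name flags_after_add are hypothetical helpers invented for this illustration; real code cannot read the flags this way, and the sketch only mirrors the definitions listed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four flags; not a real hardware interface. */
struct flags {
    bool cf, zf, sf, of;
};

/* Models the condition codes after adding a and b as 64-bit values. */
struct flags flags_after_add(int64_t a, int64_t b) {
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;
    uint64_t ut = ua + ub;            /* wraps modulo 2^64, like the hardware */
    bool t_neg = (ut >> 63) != 0;     /* sign bit of the result */

    struct flags f;
    f.cf = ut < ua;                   /* carry out: unsigned overflow */
    f.zf = ut == 0;                   /* result is zero */
    f.sf = t_neg;                     /* result has its sign bit set */
    /* signed overflow: operands share a sign that the result does not have */
    f.of = ((a < 0) == (b < 0)) && (t_neg != (a < 0));
    return f;
}
```

For instance, flags_after_add(INT64_MAX, 1) reports OF and SF but not CF, while flags_after_add(-1, 1) reports CF and ZF but not OF.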

2. Access the condition code

  • Access the condition code: rather than reading the condition codes directly, we usually use them indirectly, in one of three ways:
    • Set a single byte to 0 or 1, depending on some combination of the condition codes.
    • Conditionally jump to some other part of the program.
    • Conditionally transfer data.
2.1 How to set a single byte according to the condition code

Each instruction in the set class has a single operand D, the destination. The destination must be either a single-byte register (such as %al, the low-order byte of %rax) or a single-byte memory location. The instruction sets that byte to 0 or 1. The upper bytes of the containing register are left unchanged, so they usually need to be cleared afterwards.

An example:

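comp:
    cmpq %rsi, %rdi     # compare %rdi with %rsi (computes %rdi - %rsi, sets flags only)
    setl %al            # %al = 1 if %rdi < %rsi (signed), else 0
    movzbq %al, %rax    # zero-extend %al to fill %rax
    ret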
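
The cmpq/setl pair can be modelled in C to show where the 0 or 1 comes from. This is a minimal sketch, assuming the usual description of setl as testing SF ^ OF after the comparison. The function name setl_after_cmp is hypothetical, and the code models the flags rather than reading them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models "cmpq b, a" followed by "setl": returns 1 when a < b (signed). */
int setl_after_cmp(int64_t a, int64_t b) {
    /* cmp computes a - b and keeps only the flags; do it with wrapping */
    uint64_t ut = (uint64_t)a - (uint64_t)b;
    bool sf = (ut >> 63) != 0;                          /* sign of a - b */
    /* subtraction overflows when the operands differ in sign and the
       result's sign differs from a's sign */
    bool of = ((a < 0) != (b < 0)) && (sf != (a < 0));
    return sf ^ of;                                     /* the setl condition */
}
```

The overflow term matters at the extremes: for a = INT64_MIN and b = 1 the wrapped difference is positive, so SF alone would give the wrong answer, but SF ^ OF still yields 1.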
2.2 (Conditionally) jump to some part of the program

Unconditionally jump:

    movq $0, %rax
    jmp .L1               # unconditional jump to .L1
    movq (%rax), %rdx     # skipped, never executed
.L1:
    popq %rdx

The jmp instruction jumps to the label .L1, so the movq (%rax), %rdx instruction (which would dereference a null pointer) is skipped.

jmp is the unconditional jump instruction. The jump can be direct or indirect. A direct jump names its target as a label, as in jmp .L1. An indirect jump is written with a *, as in jmp *%rax, which uses the value in register %rax as the jump target, or jmp *(%rax), which reads the jump target from memory at the address held in %rax.

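Conditional jump: instructions such as je, jne, jg, jl, ja, and jb jump to the label only when the condition codes satisfy the named condition. Otherwise execution continues with the next instruction. Conditional jumps can only be direct.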
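
A conditional jump behaves like C's `if (cond) goto label;`, and an unconditional jump like a plain `goto`. The following is a minimal sketch of an if/else written in that goto style, roughly the shape a compiler produces before emitting jump instructions. The function name max_goto and its labels are made up for illustration.

```c
/* Hypothetical example: returns the larger of x and y using goto-style flow. */
long max_goto(long x, long y) {
    long result;
    if (x < y)          /* conditional jump: taken only when x < y */
        goto y_larger;
    result = x;         /* fall-through path: x >= y */
    goto done;          /* unconditional jump over the other arm */
y_larger:
    result = y;
done:
    return result;
}
```

Here the test `x < y` plays the role of cmp followed by a jl-style jump, and `goto done` plays the role of jmp.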

2.3 Jump instruction encodings

In assembly code, we specify a jump target simply by writing a label such as .L1. In disassembled machine code (assembly code recovered from machine code), the jump instead appears in a form like jmp 0x1137: the operand 0x1137 is the address of the instruction to jump to, and execution continues from the instruction at that address.

But how does the machine code itself specify the jump target? There are two encodings. The most common is PC-relative: the instruction encodes the difference between the target address and the address of the instruction immediately following the jump. The other encoding gives the absolute target address directly.

(Here, the difference between a and b means a - b.)

The advantage of the PC-relative encoding is that the encoded jump targets stay the same when the whole program is moved to a different address.

eb 03    # 0x03 is the encoded jump target. Add it to the address of the next instruction to get the target address.
# "eb 03" is the machine code, shown as hexadecimal bytes: "eb" specifies the type of operation and "03" is the encoded jump target.

Note: the encoded offset is a two's-complement number. For example, if the byte were 0xFF it would mean -1, and the jump target would be the address of the next instruction minus 1.
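
The decoding rule for a one-byte PC-relative offset can be written out as a short C sketch. The function name rel8_jump_target is hypothetical, and the addresses in the usage note are made-up values chosen only to exercise a positive and a negative offset.

```c
#include <stdint.h>

/* Computes the target of a jump with a 1-byte PC-relative offset.
   next_insn_addr is the address of the instruction after the jump. */
uint64_t rel8_jump_target(uint64_t next_insn_addr, uint8_t encoded_offset) {
    /* reinterpret the byte as a two's-complement value in [-128, 127] */
    int8_t displacement = (int8_t)encoded_offset;
    /* sign-extend to 64 bits, then add with wrap-around arithmetic */
    return next_insn_addr + (uint64_t)(int64_t)displacement;
}
```

With the next instruction at 0x5, the offset byte 0x03 gives the target 0x8. With the next instruction at 0xd, the offset byte 0xf8 (that is, -8) gives the target 0x5.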

Question

Sometimes we see jmp .L1 and sometimes we see jmp 0x1137. Which one (possibly both) is assembly code? What is the difference between the two forms?

2.4 Conditional Move

Apart from jumps, there is another way to act on a condition: the conditional move instruction.

long absdiff(long x, long y) {
    long res;
    if (x < y)
        res = y - x;
    else
        res = x - y;
    return res;
}

For example, this C code might be translated into the following assembly code:

absdiff:
    movq %rsi, %rax
    subq %rdi, %rax     # %rax = y - x
    movq %rdi, %rdx
    subq %rsi, %rdx     # %rdx = x - y
    cmpq %rsi, %rdi     # compare x with y
    cmovge %rdx, %rax   # if x >= y, copy %rdx (x - y) into %rax; otherwise %rax is left unchanged
    ret
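
The same computation can be sketched in C so that it mirrors the conditional-move assembly: both differences are computed first, and the comparison only selects which one to keep. The function name cmovdiff and its local variable names are illustrative, and a compiler is free to emit a branch for this code anyway.

```c
/* Mirrors the conditional-move version: compute both results, then select. */
long cmovdiff(long x, long y) {
    long rval = y - x;      /* result if x < y */
    long eval = x - y;      /* result if x >= y */
    long ntest = x >= y;    /* the condition that cmovge tests */
    if (ntest)
        rval = eval;        /* the single conditional move */
    return rval;
}
```

For example, cmovdiff(3, 5) and cmovdiff(5, 3) both return 2.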

Question

Why does a conditional move reduce the misprediction penalty? The processor still has to decide whether to perform the move, so it seems that a wrong guess should affect the following pipeline stages as well.

The class of conditional move:

Some properties of conditional moves:

  • The destination must be a register. The source can be a register or a memory location.
  • The data can be 2, 4, or 8 bytes long, but not 1 byte.
  • The instruction does not need a suffix to specify the data type: the assembler infers the operand length from the name of the destination register.

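A conditional move cannot be used if evaluating either of the two expressions might be invalid, because both expressions are evaluated before the condition selects one.

For example: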

int *p;
if (!p)
    return 0;
else
    return *p;

We cannot compute *p beforehand, because p may be a null pointer, in which case dereferencing it is invalid.
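
Wrapped into a complete function, the fragment looks like the sketch below. The name read_or_zero is hypothetical. The point is that the dereference sits in only one arm of the branch, so it must not be evaluated before the null check, unlike the two subtractions in the absolute-difference example.

```c
/* Hypothetical wrapper around the fragment above.
   The load *p happens only on the path where p is known to be non-null. */
int read_or_zero(const int *p) {
    if (!p)
        return 0;   /* null pointer: never touch memory */
    else
        return *p;  /* safe: p was checked first */
}
```

Calling read_or_zero with a null pointer returns 0 without touching memory, while passing the address of an int holding 7 returns 7.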