CSAPP-lecture-6

Published 2024-08-02, updated 2025-05-04

CSAPP lecture 6

1. Condition code

Four condition codes:
- CF: Carry Flag. Set to 1 if the most recent operation generated a carry out of the most significant bit. Used to detect overflow in unsigned arithmetic.
- ZF: Zero Flag. Set to 1 if the most recent operation yielded 0.
- SF: Sign Flag. Set to 1 if the most recent operation yielded a value whose sign bit is 1, i.e. a negative value.
- OF: Overflow Flag. Set to 1 if the most recent operation caused a two's-complement overflow, either positive or negative.
The leaq instruction does not alter any condition code. All other arithmetic instructions update them.
Logical operations (such as xor or and) set CF and OF to 0.
Shift operations set CF to the last bit shifted out and set OF to 0.
The inc and dec instructions set OF and ZF but leave CF unchanged.
Two instruction classes, cmp and test, set the condition codes without changing any register: cmp sets them according to the difference of its two operands (like sub), and test sets them according to their bitwise AND (like and).

We can check whether a value is negative, zero, or positive with testq %rax, %rax (assuming the value is stored in register %rax).
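As a rough illustration of the four flags, the following C sketch models what an addition t = a + b on 64-bit operands would do to CF, ZF, SF and OF. The struct flags type and the add_flags function are hypothetical names made up for this note; real hardware sets the flags as a side effect of the instruction, not through a function call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model type for this note, not a real API. */
struct flags { int cf, zf, sf, of; };

/* Model the condition codes after t = a + b on 64-bit operands. */
static struct flags add_flags(uint64_t a, uint64_t b) {
    uint64_t t = a + b;                 /* wraps modulo 2^64 */
    struct flags f;
    f.cf = t < a;                       /* carry out of bit 63: unsigned overflow */
    f.zf = (t == 0);                    /* result is zero */
    f.sf = (int)(t >> 63);              /* sign bit of the result */
    /* signed overflow: a and b share a sign bit that the result does not */
    f.of = (int)((~(a ^ b) & (a ^ t)) >> 63);
    return f;
}
```

For instance, add_flags(UINT64_MAX, 1) reports CF = 1 and ZF = 1 but OF = 0, because the same bit pattern read as signed is -1 + 1 = 0, which does not overflow.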

2. Access the condition code

Access the condition code: Rather than reading the condition codes directly, we usually use them indirectly, in one of the following three ways:
- We can set a single byte to 0 or 1, depending on some combination of the condition codes.
- We can conditionally jump to some other part of the program.
- We can conditionally transfer data.

2.1 How to set a single byte according to the condition code

The set instructions take a single operand D, the destination. The destination must be a single-byte register (such as %al, not the full %rax) or a single-byte memory location. The instruction sets that byte to 0 or 1. When the destination is a register, only its least significant byte is written; the upper bytes are left unchanged.

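An example:

comp:
	cmpq	%rsi, %rdi	# compute %rdi - %rsi and set the condition codes
	setl	%al		# %al = 1 if %rdi < %rsi (signed), else 0
	movzbq	%al, %rax	# zero-extend %al to fill %rax
	ret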
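The setl condition is SF ^ OF: after the subtraction, the result is "less" either when it is negative without overflow or non-negative with overflow. The C sketch below models that combination and can be checked against the ordinary < operator. The function name setl_model is hypothetical, and the subtraction is done on unsigned values so that wraparound is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "cmpq b, a; setl %al": returns 1 if a < b as signed values. */
static int setl_model(int64_t a, int64_t b) {
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t t = ua - ub;                        /* what cmp computes, modulo 2^64 */
    int sf = (int)(t >> 63);                     /* sign bit of the difference */
    /* signed overflow on subtraction: a and b differ in sign,
       and the result's sign differs from a's */
    int of = (int)(((ua ^ ub) & (ua ^ t)) >> 63);
    return sf ^ of;
}
```

The interesting cases are the ones that overflow: setl_model(INT64_MIN, 1) still yields 1 even though the wrapped difference is positive, because OF flips the answer back.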

2.2 (Conditionally) jump to some part of the program

Unconditionally jump:

	movq $0, %rax
	jmp .L1
	movq (%rax), %rdx
.L1:
	popq %rdx

The jmp instruction jumps to the label .L1, so the movq (%rax), %rdx instruction is skipped.

jmp is the unconditional jump instruction. The jump can be direct or indirect. A direct jump names its target as a label, for example jmp .L1. An indirect jump reads its target from a register or from memory: jmp *%rax uses the value in %rax as the target address, and jmp *(%rax) reads the target address from the memory location whose address is in %rax.

Conditional jump:

2.3 Jump instruction encodings

In assembly code, we specify a jump target simply by writing a label such as .L1. In disassembled machine code (assembly recovered from the machine code), the jump appears in a form like jmp 0x1137, where the operand 0x1137 is the address of the instruction to execute next.

But how does the machine code itself specify the jump target? There are two encodings. The first, and the most commonly used, is PC-relative: the target is encoded as the difference between the target address and the address of the instruction immediately following the jump instruction. The second gives an absolute address that specifies the target directly.

Here, the difference between a and b means a - b.

The advantage of the PC-relative encoding is that the encoded jump target stays the same when the whole program is moved to a different address.


eb 03    # "eb" is the opcode, which specifies the type of instruction; "03" is the encoded jump target.
         # To get the target address, add 0x03 to the address of the next instruction.
         # Both bytes are machine code, shown here in hexadecimal.

Note: the encoded offset is a two's-complement number. For example, if it were 0xff instead of 0x03, it would mean -1, and the jump target would be the address of the next instruction minus 1.
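The arithmetic for a one-byte PC-relative offset can be sketched in C as follows. The function names rel8_target and rel8_encode, and the addresses used in the usage note, are made up for illustration; the sign extension is written out by hand so that it does not depend on implementation-defined integer conversions.

```c
#include <assert.h>
#include <stdint.h>

/* Decode: target = address of the next instruction + sign-extended 1-byte offset. */
static uint64_t rel8_target(uint64_t next_addr, uint8_t encoded) {
    /* interpret the byte as two's complement: 0x80..0xff mean -128..-1 */
    int64_t offset = encoded < 0x80 ? (int64_t)encoded : (int64_t)encoded - 0x100;
    return next_addr + (uint64_t)offset;
}

/* Encode: offset = target - address of the next instruction, kept to one byte. */
static uint8_t rel8_encode(uint64_t target, uint64_t next_addr) {
    return (uint8_t)(target - next_addr);
}
```

With a hypothetical next-instruction address of 0x1005, the byte 0x03 decodes to the target 0x1008 and the byte 0xff decodes to 0x1004. Shifting both the target and the next-instruction address by the same amount leaves rel8_encode unchanged, which is the relocation property described above.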

Question

Sometimes we see jmp .L1 and sometimes jmp 0x1137. Which one is assembly code (possibly both)? What is the difference between the two forms?

2.4 Conditional Move

Besides conditional jumps, there is a class of instructions called conditional moves.

long absdiff(long x, long y) {
    long res;
    if (x < y)
        res = y - x;
    else
        res = x - y;
    return res;
}

For example, this C code might be translated into the following assembly code:

absdiff:
	movq 	%rsi, %rax
	subq	%rdi, %rax
	movq	%rdi, %rdx
	subq	%rsi, %rdx
	cmpq	%rsi, %rdi
	cmovge	%rdx, %rax	# If the comparison found %rdi >= %rsi, copy %rdx into %rax; otherwise do nothing.
	ret
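Read back into C, the assembly above computes both differences first and only then picks one. A sketch of that shape, using the hypothetical name absdiff_cmov:

```c
#include <assert.h>

/* Mirrors the assembly: both results are computed, then one is selected. */
static long absdiff_cmov(long x, long y) {
    long rval = y - x;      /* movq %rsi, %rax ; subq %rdi, %rax */
    long eval = x - y;      /* movq %rdi, %rdx ; subq %rsi, %rdx */
    if (x >= y)             /* cmpq %rsi, %rdi */
        rval = eval;        /* cmovge %rdx, %rax */
    return rval;
}
```

The remaining if only selects between two values that are already computed, so a compiler can implement it as a data transfer rather than a change of control flow. Whether it actually emits cmovge depends on the compiler and the optimization level.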

Question

Why does a conditional move reduce the misprediction penalty? It still has to decide whether to perform the move, so shouldn't a wrong guess affect the following pipeline stages as well?

The class of conditional move:

Some properties of conditional moves:

- The destination must be a register.
- The operand can be 2, 4, or 8 bytes long, but not 1 byte.
- The instructions need no size suffix: the assembler infers the operand length from the name of the destination register.

A conditional move evaluates both outcomes before selecting one, so it cannot be used when evaluating either expression might be invalid.

For example:

int *p;
if (!p)
    return 0;
else
    return *p;
Here *p cannot be computed ahead of time, because p may be a null pointer and dereferencing it would be invalid.
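As a minimal sketch, the fragment can be wrapped in a complete function. The name read_or_zero is hypothetical. The point is only that the dereference sits behind the null check, so it must not execute on the null path, which rules out the "compute both, then select" pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Returns *p, or 0 when p is null. The load from p happens only
 * after the null check has passed. */
static int read_or_zero(const int *p) {
    if (!p)
        return 0;
    return *p;
}
```

Calling read_or_zero(NULL) returns 0 without touching memory. A version that loaded *p first and then chose between the two values would fault on that same call.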