Experiment 1: To study the basics of forwarding. For all the programs, make sure that Stall Detection and Forwarding are both ON.
(a) Write a sample program that forwards between the EXE stage and the MEM stage on the upper input. Write the program below and show the forwarding with an arrow. Test it on the simulator and mention what value is being forwarded. DO NOT USE THE LW INSTRUCTION.
Instruction
    ADD R1,R2,R3
    SUB R5,R1,R4
Cycle  ADD   SUB
1      IF
2      ID    IF
3      EXE   ID
4      MEM   EXE
5      WB    MEM
6      -     WB
The ALU output of ADD in the MEM stage, 555 (R2+R3), is forwarded to the upper input of the EXE stage of SUB.
(b) Write a sample program that forwards between the EXE stage and the WB stage on the upper input. Write the program below and show the forwarding with an arrow. Test it on the simulator and mention what value is being forwarded. DO NOT USE THE LW INSTRUCTION.
Instruction
    ADD R1,R2,R3
    NOP
    SUB R5,R1,R4
Cycle  ADD   NOP   SUB
1      IF
2      ID    IF
3      EXE   ID    IF
4      MEM   EXE   ID
5      WB    MEM   EXE
6      -     WB    MEM
7      -     -     WB
The output of ADD in the WB stage, 555 (R2+R3), is forwarded to the upper input of the EXE stage of SUB.
(c) Write a sample program that forwards between the EXE stage and the MEM stage on the lower input. Write the program below and show the forwarding with an arrow. Test it on the simulator and mention what value is being forwarded. DO NOT USE THE LW INSTRUCTION.
Instruction
    ADD R1,R2,R3
    SUB R5,R4,R1
Cycle  ADD   SUB
1      IF
2      ID    IF
3      EXE   ID
4      MEM   EXE
5      WB    MEM
6      -     WB
The ALU output of ADD in the MEM stage, 555 (R2+R3), is forwarded to the lower input of the EXE stage of SUB.
(d) Write a sample program that forwards between the EXE stage and the WB stage on the lower input. Write the program below and show the forwarding with an arrow. Test it on the simulator and mention what value is being forwarded. DO NOT USE THE LW INSTRUCTION.
Instruction
    ADD R1,R2,R3
    NOP
    SUB R5,R4,R1
Cycle  ADD   NOP   SUB
1      IF
2      ID    IF
3      EXE   ID    IF
4      MEM   EXE   ID
5      WB    MEM   EXE
6      -     WB    MEM
7      -     -     WB
The output of ADD in the WB stage, 555 (R2+R3), is forwarded to the lower input of the EXE stage of SUB.
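The four cases in parts (a) to (d) follow one pattern: the distance between producer and consumer selects the stage the value comes from, and the operand position selects the ALU input. The following is a minimal Python sketch of that pattern, not the simulator's logic. It assumes that the first source operand feeds the upper ALU input and the second feeds the lower one (which matches the answers above), and that a result more than two instructions old is read from the register file. The function name `forwarding_paths` is hypothetical.

```python
def forwarding_paths(producer_dest, consumer_srcs, distance):
    """List the (stage, alu_input) forwarding paths for an ALU-to-ALU dependence.

    producer_dest -- register written by the earlier ALU instruction
    consumer_srcs -- (first_source, second_source) of the later instruction;
                     assumed to feed the upper and lower ALU inputs in that order
    distance      -- 1 if the consumer directly follows the producer,
                     2 if one instruction (for example a NOP) sits in between
    """
    # Assumption: distance 1 forwards from the MEM stage, distance 2 from WB.
    stage = {1: "MEM", 2: "WB"}.get(distance)
    if stage is None:
        return []  # assumed to be read from the register file, no forwarding
    inputs = ("upper", "lower")
    return [(stage, alu_input)
            for src, alu_input in zip(consumer_srcs, inputs)
            if src == producer_dest]


cases = {
    "a": (("R1", "R4"), 1),  # ADD R1,R2,R3 ; SUB R5,R1,R4
    "b": (("R1", "R4"), 2),  # ADD R1,R2,R3 ; NOP ; SUB R5,R1,R4
    "c": (("R4", "R1"), 1),  # ADD R1,R2,R3 ; SUB R5,R4,R1
    "d": (("R4", "R1"), 2),  # ADD R1,R2,R3 ; NOP ; SUB R5,R4,R1
}
for part, (srcs, distance) in cases.items():
    print(part, forwarding_paths("R1", srcs, distance))
```

Under these assumptions the sketch reports MEM/upper, WB/upper, MEM/lower and WB/lower for parts (a) to (d) respectively.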
(e) Write a sample program that forwards between the MEM stage and the WB stage. Write the program below and show the forwarding with an arrow. Test it on the simulator and mention what value is being forwarded.
Instruction
    LW R1,0(R2)
    SW R3,0(R1)
Cycle  Instr 1  Instr 2
1      IF
2      ID       IF
3      EXE      ID
4      MEM      EXE
5      WB       MEM
6      -        WB
The output of the ADD instruction in the WB stage, 555 (R2+R3), is forwarded to the MEM stage of the SW instruction.
(f) Write a program that causes a Load Use Delay Stall. See what data is to be moved and notice exactly when the required data is passed on to the waiting instruction. Show it as an arrow on the following diagram.
Instruction
    LW R1,0(R3)
    ADD R2,R1,R3
Cycle  LW    ADD
1      IF
2      ID    IF
3      EXE   ID
4      MEM   EXE
5      WB    EXE
6      -     MEM
7      -     WB
BRANCH HAZARDS (Lab Experiment 2)
(a) Let us now study some branch hazards. First of all, make sure that the Aggressive
branching option is OFF, Stall Detection is ON and Forwarding is ON. Select the Always Flush option from the Branch Policy and write the following program. Does this program work properly? If not, modify the program so that it works properly. It fills 10 memory locations with the value 222. Check what the values of all the registers should be if this program is to work properly (no useful instruction turning into a NOP or getting flushed). Carefully note the use of the SLTI instruction in the following loop.
      ADDI R3, R0, 0
      ADDI R1, R0, 0
      ADDI R2, R0, 222
Loop: ADDI R1, R1, 4
      SW   R2, 100(R1)
      ADDI R3, R3, 1
      SLTI R5, R3, 10
      BNEQ R5, R0, Loop
      ADDI R7, R1, 10
      ADDI R8, R2, 5
      ADDI R2, R2, 100
(i) Calculate the CPI for this program. 61/65
(ii) What changes can we make to this program so that it works properly (useful instructions after the program do not flush)?
Three ADD R0,R0,R0 instructions can be added after the BNEQ branch.
(iii) Run the same program with the Predict NT option in Branch Policy. What difference do you see when the loop completes?
Predict NT: once the loop has finished, the remaining 3 instructions execute in sequence.
Always Flush: once the loop has finished, the remaining 3 instructions are still flushed.
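The register values that part (a) asks for can be cross-checked with a short architectural emulation of the loop. The Python sketch below ignores the pipeline entirely and only mirrors the instruction semantics. It assumes that SLTI writes 1 when the register is less than the immediate and 0 otherwise, and that BNEQ branches when its two registers differ. The function name `run_loop_program` is hypothetical.

```python
def run_loop_program():
    """Emulate the store loop instruction by instruction (no pipeline effects)."""
    reg = {f"R{i}": 0 for i in range(9)}  # R0..R8, R0 stays 0
    mem = {}

    reg["R3"] = 0                                # ADDI R3, R0, 0
    reg["R1"] = 0                                # ADDI R1, R0, 0
    reg["R2"] = 222                              # ADDI R2, R0, 222
    while True:
        reg["R1"] += 4                           # Loop: ADDI R1, R1, 4
        mem[100 + reg["R1"]] = reg["R2"]         # SW   R2, 100(R1)
        reg["R3"] += 1                           # ADDI R3, R3, 1
        reg["R5"] = 1 if reg["R3"] < 10 else 0   # SLTI R5, R3, 10 (assumed semantics)
        if reg["R5"] == reg["R0"]:               # BNEQ R5, R0, Loop falls through
            break
    reg["R7"] = reg["R1"] + 10                   # ADDI R7, R1, 10
    reg["R8"] = reg["R2"] + 5                    # ADDI R8, R2, 5
    reg["R2"] = reg["R2"] + 100                  # ADDI R2, R2, 100
    return reg, mem


registers, memory = run_loop_program()
print(registers)
print(sorted(memory.items()))
```

Under these assumptions the loop body runs 10 times and stores 222 at addresses 104 through 140. The final values are R1 = 40, R2 = 322, R3 = 10, R5 = 0, R7 = 50 and R8 = 227. If the simulator shows anything else for R7, R8 or R2, the instructions after the branch were flushed.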
(b) This problem is similar to problem A-1 at the end of the book. (Exercise A-1.) Note that you need to calculate the offset in the actual program in terms of number of instructions.
Offset = -6
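The value -6 is consistent with counting the offset in instructions from the instruction that follows the branch. The Python sketch below works this out for the loop body of the exercise. It assumes that convention (target index minus the index of the instruction after the branch); the helper name `branch_offset` is hypothetical.

```python
def branch_offset(labels, branch_index, target_label):
    """Offset in instructions, measured from the instruction after the branch.

    Assumption: the simulator adds the offset to the address of the
    instruction that follows the branch.
    """
    return labels[target_label] - (branch_index + 1)


loop_body = [
    "LW R1, 0(R2)",     # index 0, carries the label Loop1
    "ADDI R1, R1, #1",
    "SW R1, 0(R2)",
    "ADDI R2, R2, #4",
    "SUB R4, R3, R2",
    "BNEZ R4, Loop1",   # index 5, the branch
]
labels = {"Loop1": 0}

offset = branch_offset(labels, loop_body.index("BNEZ R4, Loop1"), "Loop1")
print(offset)
```

With the branch at index 5 and the label at index 0, the sketch gives 0 - (5 + 1) = -6.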
(A) Processor configuration: Stall Detection: ON, Forwarding: OFF, Aggressive Branching: YES, Branch Policy: Always Flush. Run the above program and fill in the following table for the instructions in the loop body for the first 2 or 3 iterations. Total clock cycles to run the program: ____
Columns: I1 = LW R1, 0(R2); I2 = ADDI R1, R1, #1; I3 = SW R1, 0(R2); I4 = ADDI R2, R2, #4; I5 = SUB R4, R3, R2; I6 = BNEZ R4, Loop1; I7 = ADDI R2, R0, 0; I8 = ADDI R3, R0, 0. A "-" marks an empty cell.
Cycle  I1    I2    I3    I4    I5    I6    I7    I8
1      IF
2      ID    IF
3      EXE   ID    IF
4      MEM   ID    IF
5      WB    ID    IF
6      -     EXE   ID    IF
7      -     MEM   ID    IF
8      -     WB    ID    IF
9      -     -     EXE   ID    IF
10     -     -     MEM   EXE   ID    IF
11     -     -     WB    MEM   ID    IF
12     -     -     -     WB    ID    IF
13     -     -     -     -     EXE   ID    IF
14     -     -     -     -     MEM   ID
15     -     -     -     -     WB    ID    IF
16     IF    -     -     -     -     EXE   ID
17     ID    IF    -     -     -     MEM   EXE
18     EXE   ID    IF    -     -     WB    MEM
19     MEM   ID    IF    -     -     -     WB
20     WB    ID    IF
21     -     EXE   ID    IF
22     -     MEM   ID    IF
23     -     WB    ID    IF
24     -     -     EXE   ID    IF
25     -     -     MEM   EXE   ID    IF
26     -     -     WB    MEM   ID    IF
(B) Redo part (A) with Forwarding ON and the rest of the processor configuration unchanged. Total clock cycles: ___
Columns: I1 = LW R1, 0(R2); I2 = ADDI R1, R1, #1; I3 = SW R1, 0(R2); I4 = ADDI R2, R2, #4; I5 = SUB R4, R3, R2; I6 = BNEZ R4, Loop1; I7 = ADDI R2, R0, 0; I8 = ADDI R3, R0, 0. A "-" marks an empty cell.
Cycle  I1    I2    I3    I4    I5    I6    I7    I8
1      IF
2      ID    IF
3      EXE   ID    IF
4      MEM   EXE   ID    IF
5      WB    EXE   ID    IF
6      -     MEM   EXE   ID    IF
7      -     WB    MEM   EXE   ID    IF
8      -     -     WB    MEM   EXE   ID    IF
9      -     -     -     WB    MEM   ID    IF
10     IF    -     -     -     WB    EXE   ID
11     ID    IF    -     -     -     MEM   EXE
12     EXE   ID    IF    -     -     WB    MEM
13     MEM   EXE   ID    IF    -     -     WB
14     WB    EXE   ID    IF
15     -     MEM   EXE   ID    IF
16     -     WB    MEM   EXE   ID    IF
17     -     -     WB    MEM   EXE   ID    IF
18     -     -     -     WB    EXE   ID    IF
19     IF    -     -     -     WB    EXE   ID
20     ID    IF    -     -     -     MEM   EXE
21     EXE   ID    IF    -     -     WB    MEM
22     MEM   EXE   ID    IF    -     -     WB
23     WB    EXE   ID    IF
24     -     MEM   EXE   ID    IF
25     -     WB    MEM   EXE   ID    IF
26     -     -     WB    MEM   EXE   ID    IF
Lab reflections:
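Through this lab I gained a deeper understanding of how forwarding is implemented in a pipeline, and of how different types of instructions forward at different stages and in different ways. I also came away with a clearer picture of how the data hazards associated with branch instructions are resolved: branches are mainly handled by rewriting the code inside the loop and by loop unrolling, and working through the different examples made it clear how the strategy to consider changes from case to case.
The lab also gave me a clearer understanding of the theory covered in class and a more complete, systematic view of complex pipelining. I learned a great deal.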