Pipelining has become a common feature of modern processors (Davidson
et al., 1975). Today’s processors use pipelining to achieve Instruction Level Parallelism (ILP) (Forsell, 1996). Pipelining is the technique of partitioning the instruction execution process; it was invented to speed up the execution of instructions (Flynn, 1995; Forsell, 1997). To this end, pipelining partitions the execution of an instruction into several smaller, autonomous but interconnected subtasks (pieces) and allocates separate dedicated hardware to each subtask. These dedicated hardware units are termed pipe stages (Kogge, 1981). A pipelined processor generally has five pipe stages, namely load (L), decode (D), fetch (F), execute (E) and write (W) (Soliman, 2013). In the first stage, the instruction is loaded into a register from the cache or main memory, depending on the value of the Program Counter (PC). In the second stage, the loaded instruction is decoded; here, decoding means determining the behavior, operands and addressing modes of the loaded instruction and identifying the resources needed by the subsequent pipe stages. In the third stage, the data (operands or operand values) are fetched and the required resources are reserved. The instruction is executed in the fourth stage, and finally, in the last stage, the result is written back into memory (Hwang, 2001; Kim and Kim, 2005; Saravanan et al., 2015). There
are mainly five types of instruction pipelines: scalar, superscalar, super pipeline,
under pipeline and superscalar super pipeline. The scalar pipeline is also known
as the simple pipeline or base pipeline. In a scalar pipeline, one instruction is issued per clock cycle and only one instruction is completed in each clock cycle (Jouppi and
Wall, 1989; Johnson, 1989; Steven et al., 1997).
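The overlap among the five pipe stages, and the one-instruction-per-cycle behavior of a scalar pipeline, can be sketched with a small timing simulation. This is a hypothetical illustration (the stage names follow the L, D, F, E, W convention above) and assumes an ideal pipeline: every stage takes exactly one cycle and there are no hazards or stalls.

```python
# Sketch of the timing of an ideal 5-stage scalar pipeline (L, D, F, E, W).
# Assumption: one instruction issued per clock cycle, one cycle per stage,
# no hazards or stalls.

STAGES = ["L", "D", "F", "E", "W"]

def pipeline_timeline(n_instructions):
    """Return, for each instruction, the clock cycle in which each stage runs."""
    timeline = []
    for i in range(n_instructions):
        # Instruction i enters the load stage in cycle i + 1 (cycles are 1-indexed),
        # then advances one stage per cycle.
        timeline.append({stage: i + 1 + s for s, stage in enumerate(STAGES)})
    return timeline

def total_cycles(n_instructions, n_stages=len(STAGES)):
    """Ideal pipelined execution time: k + (n - 1) cycles for n instructions
    on a k-stage pipeline, versus k * n cycles without pipelining."""
    return n_stages + (n_instructions - 1)

if __name__ == "__main__":
    for i, t in enumerate(pipeline_timeline(3)):
        print(f"I{i + 1}: {t}")
    # Without pipelining, 3 instructions would need 5 * 3 = 15 cycles;
    # the ideal pipeline finishes them in 5 + (3 - 1) = 7 cycles.
    print("pipelined cycles:", total_cycles(3))
```

The simulation shows why a scalar pipeline completes one instruction per cycle once the pipeline is full: after the first instruction leaves the write stage in cycle 5, each subsequent instruction finishes exactly one cycle later.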