A register file comprises a series of parallel load registers and is an integral part of a CPU. The register file outputs the contents of chosen registers to the rest of the CPU and loads registers with input values given by the rest of the CPU.
Consider a 16-bit CPU with eight parallel load registers.
A typical design needs to handle, at most, operations that supply one value to load into a register and require the contents of two registers as output.
An example of such an operation is the addition of two register contents, with the result loaded into a third register.
Let's call the two register contents that are output from the register file A and B.
The input value to be loaded into a register we'll call D (for destination).
The register file will need three 3-bit addresses to select these registers from the eight possible (2^3 = 8).
Let's call these addresses AA, BA, and DA, and we'll have them as inputs to the register file.
Since the registers are sequential logic circuits, we will have a clock input called CLK.
The one remaining input is called Load. Its purpose is to enable or disable the loading of the register addressed by DA.
This is needed because it may be the case that no register is to be loaded: for example, the A and B outputs may supply the memory address and data for a memory write operation.
In total we have inputs AA, BA, DA, D, CLK, and Load.
Our outputs are A and B.
If the output of each register is fed into a multiplexer, we can use AA and BA to select the register whose contents are to become the A or B output.
So, we will have two internal multiplexers, muxa and muxb.
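The selection performed by muxa and muxb can also be sketched behaviourally. The following is a minimal sketch under stated assumptions, not part of the structural model: the module name read_port_sketch and the Q_flat port are hypothetical, and the eight register outputs are packed into one 128-bit bus because Verilog-2001 ports cannot be arrays.

```verilog
// Hypothetical behavioural sketch of the two read ports.
// Q_flat packs the eight 16-bit register outputs:
// register 0 in bits [15:0] up to register 7 in bits [127:112].
module read_port_sketch(A, B, Q_flat, AA, BA);
output [15:0] A;       // Contents of the register addressed by AA.
output [15:0] B;       // Contents of the register addressed by BA.
input [127:0] Q_flat;  // Assumed packing of the register outputs.
input [2:0] AA;        // Address of A reg.
input [2:0] BA;        // Address of B reg.

// Indexed part-select: 16 bits starting at bit 16*address.
assign A = Q_flat[AA*16 +: 16];
assign B = Q_flat[BA*16 +: 16];
endmodule // read_port_sketch
```

Each assign plays the role of one 8-to-1 multiplexer: the address picks which 16-bit slice reaches the output.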
The D input is fed to every register so that the selected register can store it on a positive clock edge.
The input address, DA, can be used to select the Load input of one of the eight registers if we feed DA into an octal decoder and connect a decoder output to the Load input of each register.
Instead of making this connection directly, we use an AND gate to generate the Load input for each register.
The inputs to each gate are the register file's Load input and the appropriate decoder output.
This permits a register load only when the register file's Load input is 1.
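The combined effect of the decoder and the eight AND gates can be summarised in a single expression. This is a minimal behavioural sketch, assuming a hypothetical module name load_decode_sketch and an 8-bit load vector in place of eight separate wires; it is meant only to illustrate the gating, not to replace the structural design.

```verilog
// Hypothetical behavioural equivalent of the decoder plus the eight AND gates.
module load_decode_sketch(load, DA, Load);
output [7:0] load; // load[i] would drive the Load input of register i.
input [2:0] DA;    // Destination register address.
input Load;        // Register file load enable, active high.

// One-hot decode of DA, forced to all zeros when Load is 0.
assign load = Load ? (8'b0000_0001 << DA) : 8'b0000_0000;
endmodule // load_decode_sketch
```

With DA = 3'b101 and Load = 1, only load[5] is 1; with Load = 0, every bit of load is 0 whatever the value of DA, so no register is loaded.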
Our bill of materials is now eight registers, eight AND gates, one decoder, and two multiplexers. A schematic of the register file is shown below; to save space, it has been simplified to show only four registers.
Below is a Verilog structural model for the example register file (with all eight registers). The code for the flip-flops, multiplexers, and the decoder is also shown for completeness.
module register_file(A, B, D, AA, BA, DA, Load, CLK);
output [15:0] A; // Data contents of A reg.
output [15:0] B; // Data contents of B reg.
input [15:0] D; // Data to load into D reg.
input [2:0] AA; // Address of A reg.
input [2:0] BA; // Address of B reg.
input [2:0] DA; // Address of D reg.
input Load; // Enable loading of D reg - active high.
input CLK; // Clock.
wire [15:0] Q0, Q1, Q2, Q3, Q4, Q5, Q6, Q7;
wire dr0, dr1, dr2, dr3, dr4, dr5, dr6,dr7;
wire load0, load1, load2, load3, load4, load5, load6, load7;
multiplexer_8_1 muxa(A, Q0, Q1, Q2, Q3, Q4, Q5, Q6, Q7, AA);
multiplexer_8_1 muxb(B, Q0, Q1, Q2, Q3, Q4, Q5, Q6, Q7, BA);
octal_decoder decd(dr0, dr1, dr2, dr3, dr4, dr5, dr6, dr7, DA[2], DA[1], DA[0], 1'b1);
and(load0, dr0, Load);
and(load1, dr1, Load);
and(load2, dr2, Load);
and(load3, dr3, Load);
and(load4, dr4, Load);
and(load5, dr5, Load);
and(load6, dr6, Load);
and(load7, dr7, Load);
register_parallel_load r0(Q0, D, load0, CLK);
register_parallel_load r1(Q1, D, load1, CLK);
register_parallel_load r2(Q2, D, load2, CLK);
register_parallel_load r3(Q3, D, load3, CLK);
register_parallel_load r4(Q4, D, load4, CLK);
register_parallel_load r5(Q5, D, load5, CLK);
register_parallel_load r6(Q6, D, load6, CLK);
register_parallel_load r7(Q7, D, load7, CLK);
endmodule // register_file
module register_parallel_load(Q, D, Load, CLK);
output [15:0] Q;
input [15:0] D;
input Load;
input CLK;
wire Loadn;
wire w1, w2, w3, w4, w5, w6, w7, w8, w9, w10, w11, w12; // Connecting wires.
wire w13, w14, w15, w16, w17, w18, w19, w20, w21, w22, w23, w24; // Connecting wires.
wire w25, w26, w27, w28, w29, w30, w31, w32, w33, w34, w35, w36; // Connecting wires.
wire w37, w38, w39, w40, w41, w42, w43, w44, w45, w46, w47, w48; // Connecting wires.
wire [15:0] Qn; // Unused.
not(Loadn, Load);
and(w1, Q[0], Loadn);
and(w2, D[0], Load);
or(w3, w2, w1);
and(w4, Q[1], Loadn);
and(w5, D[1], Load);
or(w6, w5, w4);
and(w7, Q[2], Loadn);
and(w8, D[2], Load);
or(w9, w8, w7);
and(w10, Q[3], Loadn);
and(w11, D[3], Load);
or(w12, w11, w10);
and(w13, Q[4], Loadn);
and(w14, D[4], Load);
or(w15, w14, w13);
and(w16, Q[5], Loadn);
and(w17, D[5], Load);
or(w18, w17, w16);
and(w19, Q[6], Loadn);
and(w20, D[6], Load);
or(w21, w20, w19);
and(w22, Q[7], Loadn);
and(w23, D[7], Load);
or(w24, w23, w22);
and(w25, Q[8], Loadn);
and(w26, D[8], Load);
or(w27, w26, w25);
and(w28, Q[9], Loadn);
and(w29, D[9], Load);
or(w30, w29, w28);
and(w31, Q[10], Loadn);
and(w32, D[10], Load);
or(w33, w32, w31);
and(w34, Q[11], Loadn);
and(w35, D[11], Load);
or(w36, w35, w34);
and(w37, Q[12], Loadn);
and(w38, D[12], Load);
or(w39, w38, w37);
and(w40, Q[13], Loadn);
and(w41, D[13], Load);
or(w42, w41, w40);
and(w43, Q[14], Loadn);
and(w44, D[14], Load);
or(w45, w44, w43);
and(w46, Q[15], Loadn);
and(w47, D[15], Load);
or(w48, w47, w46);
d_flip_flop_edge_triggered dff0(Q[0], Qn[0], CLK, w3);
d_flip_flop_edge_triggered dff1(Q[1], Qn[1], CLK, w6);
d_flip_flop_edge_triggered dff2(Q[2], Qn[2], CLK, w9);
d_flip_flop_edge_triggered dff3(Q[3], Qn[3], CLK, w12);
d_flip_flop_edge_triggered dff4(Q[4], Qn[4], CLK, w15);
d_flip_flop_edge_triggered dff5(Q[5], Qn[5], CLK, w18);
d_flip_flop_edge_triggered dff6(Q[6], Qn[6], CLK, w21);
d_flip_flop_edge_triggered dff7(Q[7], Qn[7], CLK, w24);
d_flip_flop_edge_triggered dff8(Q[8], Qn[8], CLK, w27);
d_flip_flop_edge_triggered dff9(Q[9], Qn[9], CLK, w30);
d_flip_flop_edge_triggered dff10(Q[10], Qn[10], CLK, w33);
d_flip_flop_edge_triggered dff11(Q[11], Qn[11], CLK, w36);
d_flip_flop_edge_triggered dff12(Q[12], Qn[12], CLK, w39);
d_flip_flop_edge_triggered dff13(Q[13], Qn[13], CLK, w42);
d_flip_flop_edge_triggered dff14(Q[14], Qn[14], CLK, w45);
d_flip_flop_edge_triggered dff15(Q[15], Qn[15], CLK, w48);
endmodule // register_parallel_load
module d_flip_flop_edge_triggered(Q, Qn, C, D);
output Q;
output Qn;
input C;
input D;
wire Cn; // Control input to the D latch.
wire Cnn; // Control input to the SR latch.
wire DQ; // Q output of the D latch, S input to the gated SR latch.
wire DQn; // Qn output of the D latch, R input to the gated SR latch.
not(Cn, C);
not(Cnn, Cn);
d_latch dl(DQ, DQn, Cn, D);
sr_latch_gated sr(Q, Qn, Cnn, DQ, DQn);
endmodule // d_flip_flop_edge_triggered
module d_latch(Q, Qn, G, D);
output Q;
output Qn;
input G;
input D;
wire Dn;
wire D1;
wire Dn1;
not(Dn, D);
and(D1, G, D);
and(Dn1, G, Dn);
nor(Qn, D1, Q);
nor(Q, Dn1, Qn);
endmodule // d_latch
module sr_latch_gated(Q, Qn, G, S, R);
output Q;
output Qn;
input G;
input S;
input R;
wire S1;
wire R1;
and(S1, G, S);
and(R1, G, R);
nor(Qn, S1, Q);
nor(Q, R1, Qn);
endmodule // sr_latch_gated
module multiplexer_8_1(X, A0, A1, A2, A3, A4, A5, A6, A7, S);
parameter WIDTH=16; // How many bits wide are the lines
output [WIDTH-1:0] X; // The output line
input [WIDTH-1:0] A7; // Input line with id 3'b111
input [WIDTH-1:0] A6; // Input line with id 3'b110
input [WIDTH-1:0] A5; // Input line with id 3'b101
input [WIDTH-1:0] A4; // Input line with id 3'b100
input [WIDTH-1:0] A3; // Input line with id 3'b011
input [WIDTH-1:0] A2; // Input line with id 3'b010
input [WIDTH-1:0] A1; // Input line with id 3'b001
input [WIDTH-1:0] A0; // Input line with id 3'b000
input [2:0] S;
assign X = (S[2] == 0
            ? (S[1] == 0
               ? (S[0] == 0
                  ? A0       // {S2,S1,S0} = 3'b000
                  : A1)      // {S2,S1,S0} = 3'b001
               : (S[0] == 0
                  ? A2       // {S2,S1,S0} = 3'b010
                  : A3))     // {S2,S1,S0} = 3'b011
            : (S[1] == 0
               ? (S[0] == 0
                  ? A4       // {S2,S1,S0} = 3'b100
                  : A5)      // {S2,S1,S0} = 3'b101
               : (S[0] == 0
                  ? A6       // {S2,S1,S0} = 3'b110
                  : A7)));   // {S2,S1,S0} = 3'b111
endmodule // multiplexer_8_1
module octal_decoder(X0, X1, X2, X3, X4, X5, X6, X7, A2, A1, A0, E);
output X0; // Minterm 0
output X1; // Minterm 1
output X2; // Minterm 2
output X3; // Minterm 3
output X4; // Minterm 4
output X5; // Minterm 5
output X6; // Minterm 6
output X7; // Minterm 7
input A2; // Input binary code most significant bit
input A1; // Input binary code middle bit
input A0; // Input binary code least significant bit
input E; // Enable signal
wire A2n; // A2 negated
wire A1n; // A1 negated
wire A0n; // A0 negated
not(A2n, A2);
not(A1n, A1);
not(A0n, A0);
and(X0, A2n, A1n, A0n, E); // Minterm 0: 000
and(X1, A2n, A1n, A0, E); // Minterm 1: 001
and(X2, A2n, A1, A0n, E); // Minterm 2: 010
and(X3, A2n, A1, A0, E); // Minterm 3: 011
and(X4, A2, A1n, A0n, E); // Minterm 4: 100
and(X5, A2, A1n, A0, E); // Minterm 5: 101
and(X6, A2, A1, A0n, E); // Minterm 6: 110
and(X7, A2, A1, A0, E); // Minterm 7: 111
endmodule // octal_decoder
The waveforms below were generated by a simple test run that loaded A5A5 (hex) into register 7 and output registers 0 and 1. This was followed by loading 1234 (hex) into register 7 and outputting registers 0 and 7.
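A test run along these lines could be sketched as a small self-checking testbench. This is an illustration under stated assumptions, not the testbench that produced the waveforms: the names register_file_behav and register_file_tb, the 10-unit clock period, and the exact stimulus timing are all hypothetical. To keep the example self-contained it drives a behavioural stand-in with the same port list; swapping in register_file should exercise the structural model instead.

```verilog
// Hypothetical behavioural stand-in with the same ports as register_file.
module register_file_behav(A, B, D, AA, BA, DA, Load, CLK);
output [15:0] A;
output [15:0] B;
input [15:0] D;
input [2:0] AA, BA, DA;
input Load, CLK;

reg [15:0] regs [0:7]; // Eight 16-bit registers, unknown (x) until loaded.

assign A = regs[AA];
assign B = regs[BA];

always @(posedge CLK)
  if (Load) regs[DA] <= D; // Load the addressed register on a positive edge.
endmodule // register_file_behav

module register_file_tb;
reg [15:0] D;
reg [2:0] AA, BA, DA;
reg Load, CLK;
wire [15:0] A, B;

// Replace register_file_behav with register_file to test the structural model.
register_file_behav dut(A, B, D, AA, BA, DA, Load, CLK);

always #5 CLK = ~CLK; // Assumed clock period of 10 time units.

initial begin
  // Load A5A5 into register 7 while registers 0 and 1 are selected for output.
  CLK = 0; Load = 1; D = 16'hA5A5; DA = 3'd7; AA = 3'd0; BA = 3'd1;
  @(posedge CLK); #1;
  $display("A=%h B=%h", A, B); // Registers 0 and 1 have not been loaded.

  // Select registers 0 and 7; register 7 should still hold A5A5.
  BA = 3'd7; D = 16'h1234;
  #1;
  if (B !== 16'hA5A5) $display("FAIL: register 7 = %h, expected a5a5", B);

  // Load 1234 into register 7 on the next positive edge.
  @(posedge CLK); #1;
  if (B !== 16'h1234) $display("FAIL: register 7 = %h, expected 1234", B);

  Load = 0;
  $finish;
end
endmodule // register_file_tb
```

Because registers 0 and 1 are never loaded in this run, a simulator would normally report them as unknown; only the register 7 values are checked.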
Mano, M. Morris, and Kime, Charles R. Logic and Computer Design Fundamentals. 2nd Edition. Prentice Hall, 2000.
Copyright © 2014 Barry Watson. All rights reserved.