Changes

gling · 9e04a54e
--- a/mac_5x9x8-Revised.md
+++ b/mac_5x9x8-Revised.md
-We have defined a KxMxN MAC unit as a device which takes K signed M-bit numbers ($a_0, a_1, ..., a_{K-1}$) and K signed N-bit numbers ($b_0, b_1, ..., b_{K-1}$), and computes 
-$$a_0*b_0 + a_1*b_1 + ... + a_{K-1}*b_{K-1}$$
+We have defined a KxMxN MAC unit as a device which takes K signed M-bit numbers ($a_0, a_1, ..., a_{K-1}$) and K signed N-bit numbers ($b_0, b_1, ..., b_{K-1}$), and computes $$a_0 \ast b_0 + a_1 \ast b_1 + ... + a_{K-1} \ast b_{K-1}$$

 For making an KxMxN MAC unit, the number of output bits will be
 $$\lceil\log_2(K * 2^M * 2^N)\rceil = \lceil\log_2 K\rceil + M + N$$
@@ -10,84 +9,25 @@ First, to perform a signed multiplication, we use the modified Baugh-Wooley form
 \url{https://web.ece.ucsb.edu/~parhami/pres_folder/f31-book-arith-pres-pt3.pdf}. A more clear representation of this method is shown at \url{https://en.wikipedia.org/wiki/Binary_multiplier#Signed_integers} with the exception of a missing 1 in the MSB of the last partial sum.

 Using an example calculation with a 4-bit by 5-bit signed multiply with multiplicands `a` and `b`:
-\begin{verbatim}
+```
 p0[4:0] = a[4:0] * b[0]
 p1[4:0] = a[4:0] * b[1]
 p2[4:0] = a[4:0] * b[2]
 p3[4:0] = a[4:0] * b[3]

-                             ~p0[4] +  p0[3] +  p0[2] +  p0[1] +  p0[0]
-                    ~p0[4] +  p0[3] +  p0[2] +  p0[1] +  p0[0]
-           ~p0[4] +  p0[3] +  p0[2] +  p0[1] +  p0[0]
-  ~p0[4] +  p0[3] +  p0[2] +  p0[1] +  p0[0]
-----------------------------------------------------------------------
-
-\end{verbatim}
-
-
-
-
-
-
-We have defined a 5x9x8 MAC unit as a device which takes 5 signed 9-bit numbers (a0, ..., a4) and 5 signed 8-bit numbers (b0, ..., b4), and computes a0*b0 + a1*b1 + a2*b2 + a3*b3 + a4*b4.
+                                    ~p0[4]    p0[3]    p0[2]    p0[1]    p0[0]
+                           ~p0[4]    p0[3]    p0[2]    p0[1]    p0[0]
+                  ~p0[4]    p0[3]    p0[2]    p0[1]    p0[0]
+          p0[4]   ~p0[3]   ~p0[2]   ~p0[1]   ~p0[0]
+ + 1       0        0        0        0        1        0        0        0
+-------------------------------------------------------------------------------
+  P[8]    P[7]     P[6]     P[5]     P[4]     P[3]     P[2]     P[1]     P[0]
+```

-For making an KxMxN MAC unit, the number of output bits will be
-ceil(log2(K * 2^M * 2^N)) = ceil(log2(K)) + M + N
-For a 5x9x8 MAC unit, ceil(log2(5)) + 8 + 9 = 20 bits.

-First, to perform a signed multiplication, the modified Baugh-Wooley form on slide 64 of
-https://web.ece.ucsb.edu/~parhami/pres_folder/f31-book-arith-pres-pt3.pdf was used. A more clear representation of this method is shown at https://en.wikipedia.org/wiki/Binary_multiplier#Signed_integers with the exception of a missing 1 in the MSB of the last partial sum.

-A corrected form for a 9-bit input `a` and an 8-bit input `b`:
-```
-p0[8:0] = a[8:0] * b[0]
-p1[8:0] = a[8:0] * b[1]
-p2[8:0] = a[8:0] * b[2]
-p3[8:0] = a[8:0] * b[3]
-p4[8:0] = a[8:0] * b[4]
-p5[8:0] = a[8:0] * b[5]
-p6[8:0] = a[8:0] * b[6]
-p7[8:0] = a[8:0] * b[7]

-                                                       ~p0[7]  p0[6]  p0[5]  p0[4]  p0[3]  p0[2]  p0[1]  p0[0]
-                                                ~p1[7] +p1[6] +p1[5] +p1[4] +p1[3] +p1[2] +p1[1] +p1[0]   0
-                                         ~p2[7] +p2[6] +p2[5] +p2[4] +p2[3] +p2[2] +p2[1] +p2[0]   0      0
-                                  ~p3[7] +p3[6] +p3[5] +p3[4] +p3[3] +p3[2] +p3[1] +p3[0]   0      0      0
-                           ~p4[7] +p4[6] +p4[5] +p4[4] +p4[3] +p4[2] +p4[1] +p4[0]   0      0      0      0
-                    ~p5[7] +p5[6] +p5[5] +p5[4] +p5[3] +p5[2] +p5[1] +p5[0]   0      0      0      0      0
-             ~p6[7] +p6[6] +p6[5] +p6[4] +p6[3] +p6[2] +p6[1] +p6[0]   0      0      0      0      0      0
-      +p7[7] ~p7[6] ~p7[5] ~p7[4] ~p7[3] ~p7[2] ~p7[1] ~p7[0]   0      0      0      0      0      0      0
-1       0      0      0      0      0      0      0      1      0      0      0      0      0      0      0  
- ------------------------------------------------------------------------------------------------------------
-P[15]  P[14]  P[13]  P[12]  P[11]  P[10]   P[9]   P[8]   P[7]   P[6]   P[5]   P[4]   P[3]   P[2]   P[1]  P[0]
-```




- *
- * For making a 5x8x9 MAC unit, the number of output bits will be
- * log2(2^8 * 2^9 * 5) = 8 + 9 + log2(5) = 19.3219280949 => 20 bits
- *
- * This works by taking the partial sums of the 5 multiplies, adding the 5
- * parital sums of each place value together first, then summing the 8 partial
- * sums together last.
- *
- * This is derived from the Baugh-Wooley method for signed multiplication, but
- * modified to support summing 5 items without finishing the multiplies. First,
- * XOR every initial partial product by (1 << 8), in other words invert the sign
- * bits. Invert the bits last partial product corresponding to the sign bit of
- * the second multiplicand.
- *
- * Now, to complete the Baugh-Wooley method for a single multiply, we need to
- * add (1 << 15) (one in the sign bit of the output) which effectively is
- * inverting the sign bit of the final output. And add (1 << 9), a bit in the
- * position one place value greater than the sign bit of the first multiplicand.
- *
- * Now, when summing these partial sums together, they need to be sign-extended.
- * Unfortunately, due to the array adder outputting two partial sums, the sign
- * bit cannot easily be determined, so it would be better if we can modify the
- * Baugh-Wooley method to produce a 20-bit output directly and sum those
- * instead.
- * 
- * It ends up that simply adding a set of 4 bits in