Day 20
I. Last Time:
Lab #2 Due

A. Pointers vs. Array indexing:
Pointers often more efficient / faster than array indexing.
(often easier in ASM too)
      clear1 vs. clear2 (pg 171):
Pointer version needed nearly half the instructions.
(So roughly twice as fast, assuming each instruction costs about the same)
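
For reference, the two routines being compared look roughly like this in C++. This is a sketch modeled on the textbook's clear1/clear2; the exact signatures in the book may differ.

```cpp
#include <cstddef>

// Array-index version: each iteration forms the element address
// from the base and the index (base + i * sizeof(int)).
void clear1(int array[], std::size_t size) {
    for (std::size_t i = 0; i < size; ++i) {
        array[i] = 0;
    }
}

// Pointer version: the pointer itself is advanced, so the loop body
// does not need a separate multiply/shift and add to form the address.
void clear2(int* array, std::size_t size) {
    for (int* p = array; p < array + size; ++p) {
        *p = 0;
    }
}
```

Note that a modern optimizing compiler will often produce the same code for both versions, so the difference shows up mainly in unoptimized or hand-written assembly.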

 
II. New Stuff
A. Call By Value vs. Call By Reference
  
  Multi/Large Argument Functions
1. C/C++ use a mechanism called
"Pass by Value" by default
(In C++, an & in the parameter declaration specifies pass by
reference; in C you pass a pointer explicitly)
Pass by value - Pass the "value" of something.
I.e. make a copy of it and let the function
work with the copy.
Pass by reference - Pass the address of the object.
I.e. let the function work with the actual
object.
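
A minimal sketch of the difference, using made-up function names (add_one_by_value and friends are illustrative only):

```cpp
// Pass by value: the function receives its own copy of x.
void add_one_by_value(int x) {
    x += 1;  // changes only the local copy
}

// Pass by reference (C++ only): x is another name for the caller's object.
void add_one_by_reference(int& x) {
    x += 1;  // changes the caller's variable
}

// C-style equivalent: pass the address explicitly and dereference it.
void add_one_by_pointer(int* x) {
    *x += 1;  // changes the object the pointer refers to
}
```

After calling all three on the same int, only the reference and pointer versions have changed the caller's variable.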

       What if you want to pass more than 128 bits (more than fits in a0-a3)?
       What if you want to return more than 64 bits (more than fits in v0-v1)?
       
Example Handout/Overhead

    Summary/Things We've learned:
       Large Static Initializations are done with a "copy" loop
       at the beginning of the function where they are declared.

       Compilers play some funny tricks to optimize copying data.
       These tricks are typically specific to a single 
       implementation of the ISA.

       Compilers will sometimes add in what look like unneeded 
       branches, but these may really be harmless.

       Passing (large data) by value is much less efficient than 
       passing by reference because the entire object must be copied.

       Compilers use pointers to help deal with return values. 
       The return value is written into the caller's stack space
       and then copied (by the caller) into its final destination.

       Saving space for a0-a3 in the caller's frame gives us the 
       ability to completely reconstruct large arguments in memory.
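
The cost of passing a large object by value can be made visible with a copy counter. This is a sketch under assumptions: Big, big_copies and the two accessor functions are hypothetical names, not taken from the handout.

```cpp
#include <cstddef>

int big_copies = 0;  // counts how many times a Big has been copied

// Hypothetical large object: 64 ints, far more than the 128 bits
// that fit in the four argument registers.
struct Big {
    int data[64];
    Big() : data{} {}
    Big(const Big& other) {
        // The element-by-element "copy loop" that pass by value requires.
        for (std::size_t i = 0; i < 64; ++i) data[i] = other.data[i];
        ++big_copies;
    }
};

// By value: the whole object is copied before the function runs.
int first_by_value(Big b) { return b.data[0]; }

// By reference: only the object's address is passed, nothing is copied.
int first_by_reference(const Big& b) { return b.data[0]; }
```

Calling first_by_value on an existing Big raises big_copies by one; calling first_by_reference leaves it unchanged.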

B. The Stack in C++ (Example from Visual C)

C. The data segment in C++ (Example from Visual C)

D. The Stack and Security / Buffer overflow
Why overflowing a buffer on the stack can be VERY bad
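
Local arrays share the stack frame with saved registers, so an unchecked copy into a local buffer can write past its end and damage them. One common defense is a copy that takes the destination size and never writes beyond it. A minimal sketch (bounded_copy is a hypothetical helper, not a standard library function):

```cpp
#include <cstddef>
#include <cstring>

// Copies at most dst_size - 1 characters and always NUL-terminates.
// Returns true if the whole string fit, false if it was truncated.
bool bounded_copy(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return false;          // no room even for the terminator
    std::size_t len = std::strlen(src);
    std::size_t n = (len < dst_size - 1) ? len : dst_size - 1;
    std::memcpy(dst, src, n);                 // never more than the buffer holds
    dst[n] = '\0';
    return n == len;
}
```

With an 8-byte local buffer, copying "0123456789" stores only "0123456" and reports truncation instead of overwriting whatever sits next to the buffer on the stack.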

   E. MIPS Floating Point Processor

      MIPS uses a separate "Co-Processor" for floating point
      arithmetic. (Co-Processor 1) 

      1. Registers: $f0-$f31
         Single Precision (32-bit) numbers only require 1 reg.
         so, 32 regs may be used.
         Double Precision (64-bit) numbers require 2 regs
         so, only the 16 "even" registers may be used: $f0,
         $f2,etc.

      2. Math Operations:
             FUNCTION.TYPE 
             TYPE is s for single prec, d for double
             add.s/add.d; sub.s/sub.d; mul.s/mul.d; div.s/div.d
         "R-format" instructions similar to the ones we've
         already seen, but work only with F registers.

      3. Floating Point Branches/Comparisons:
             c.OP.TYPE 
             TYPE is s for single prec, d for double
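             OP is eq (==),neq(!=),lt(<),le(<=),gt(>),ge(>=)
             (The hardware compares include eq, lt and le; the others
             can be had by swapping operands or branching with bc1f.)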
             Sets a condition bit - No "destination" like slt
          Once the comparison is made, branches can be made with:
             bc1t - Branch on Co Pro 1 true
             bc1f - Branch on Co Pro 1 false
          These just check the condition bit.
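
The compare-then-branch pattern can be modeled in C++ as a single flag that the compare writes and the branch reads. This is a toy model only; Cop1 and its member names are made up for illustration. It also shows one way a greater-than or not-equal test can be built from lt and eq.

```cpp
// Toy model of coprocessor 1's condition bit (names are illustrative only).
struct Cop1 {
    bool cond = false;
    void c_eq_s(float a, float b) { cond = (a == b); }  // models c.eq.s
    void c_lt_s(float a, float b) { cond = (a < b); }   // models c.lt.s
    void c_le_s(float a, float b) { cond = (a <= b); }  // models c.le.s
    bool bc1t() const { return cond; }    // branch is taken if the bit is true
    bool bc1f() const { return !cond; }   // branch is taken if the bit is false
};

// "a > b" expressed as "b < a": swap the operands of c.lt.s, then bc1t.
bool branch_if_greater(Cop1& fpu, float a, float b) {
    fpu.c_lt_s(b, a);
    return fpu.bc1t();
}

// "a != b" expressed as c.eq.s followed by bc1f.
bool branch_if_not_equal(Cop1& fpu, float a, float b) {
    fpu.c_eq_s(a, b);
    return fpu.bc1f();
}
```

Unlike slt, the compare has no destination register: the result lives only in the condition bit until the next compare overwrites it.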

      4. Data Loading/Moving:
            lwc1 - Load word to c1
            swc1 - Store word from c1
         Only difference from lw/sw is that the rt field specifies an f register.

      5. Movement between R-regs and Co-Pro regs:
         mfc1.TYPE - Move From Co-Pro 1 to R reg
            TYPE is s=single prec, d=double prec
            mfc1 $t0,$f0
         mtc1.TYPE - Move to Co-Pro 1 from R reg
            mtc1 $t0,$f0
         These copy the bits unchanged; no int/float conversion
         is done (use cvt for that).

      6. Misc. Others:
         abs.TYPE - Compute Absolute Value
         cvt.ETYPE.ETYPE
            ETYPE is w=integer, s=single prec, d=double prec
            Works only with float regs.
         mov.TYPE - Move data from one reg to another.
         neg.TYPE - Negate a value                  
         li.TYPE - Load an immediate float (the value must be
            written as a float, i.e. with a decimal point)
            Ex: li.s $f0,0.0
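
The difference between moving bits (mtc1/mfc1) and converting a value (cvt) is easy to miss. The sketch below models both in C++, assuming IEEE 754 single precision; the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE 754 floats");
static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Like mtc1: copy the 32 bits unchanged into a float "register".
float move_bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Like mfc1: copy the 32 bits of a float unchanged into an integer "register".
std::uint32_t move_bits_from_float(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Like mtc1 followed by cvt.s.w: convert the integer's value to a float.
float convert_int_to_float(std::int32_t value) {
    return static_cast<float>(value);
}
```

Moving the integer 1 into a float register without converting it gives a tiny denormal, not 1.0; the bit pattern for 1.0 is 0x3F800000.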

7. Floating Point Register Usage: (From Pg 6 of handout)
$f0-$f3 - Function Return Values
$f4-$f11 - Temps (Functions may change)
$f12-$f15 - Function Arguments
$f16-$f19 - Temps (Functions may change)
$f20-$f31 - Saved Temps

    F. Multi-Dimensional Array Overview
       How is Multi-Dimensional Data stored in memory?
       int Array[2][3] =  {{1,2,3},{4,5,6}};
       Read Indexes from left to right - 2 groups of 3 elements each.

       Memory is one-dimensional/flat - we must somehow "map"
       this array into flat memory.

            Row\Column   0 1 2
              0          1 2 3
              1          4 5 6
          2 groups of 3 elements each


          This can be "mapped" to "flat" memory in 2 ways:
                       ROW MAJOR ORDER   COLUMN MAJOR ORDER
            Mem[Array]       1                    1
            Mem[Array+4]     2                    4
            Mem[Array+8]     3                    2
            Mem[Array+12]    4                    5
            Mem[Array+16]    5                    3
            Mem[Array+20]    6                    6 

        Row Major - Store whole "rows" first (C/C++ Style)
        Column Major - Store whole "columns" first (Fortran Style)

        To compute the Flat address of a specific element:
           ROW MAJOR ORDER: Work from left to right multiplying each
                            index by the product of the remaining sizes
               Ex: Array[1][0] from above
                   Mem[Array+1*3*sizeof(int)+0*sizeof(int)] = Mem[Array+12]
           COLUMN MAJOR ORDER: Work from right to left multiplying each
                               index by the product of the remaining sizes
               Ex: Array[1][0] from above
                   Mem[Array+0*2*sizeof(int)+1*sizeof(int)] = Mem[Array+4]
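
The two mappings for a 2-D array can be written as small functions. This is a sketch; the function names are illustrative, and elem_size stands in for sizeof(int) in the examples above.

```cpp
#include <cstddef>

// Row major: skip `row` whole rows of `cols` elements, then `col` elements.
std::size_t row_major_offset(std::size_t row, std::size_t col,
                             std::size_t cols, std::size_t elem_size) {
    return (row * cols + col) * elem_size;
}

// Column major: skip `col` whole columns of `rows` elements, then `row` elements.
std::size_t column_major_offset(std::size_t row, std::size_t col,
                                std::size_t rows, std::size_t elem_size) {
    return (col * rows + row) * elem_size;
}
```

With 4-byte ints, Array[1][0] in the 2x3 example lands at byte offset 12 in row major order and at byte offset 4 in column major order, matching the table.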

                               
       int Array[2][3][4] = { {{1,2,3,4},    {5,6,7,8},    {9,10,11,12}}, 
                              {{13,14,15,16},{17,18,19,20},{21,22,23,24}} };
           2 groups of {3 groups of 4 elements each}
           ROW MAJOR ORDER: Work from left to right multiplying each
                            index by the product of the remaining sizes
               Ex: Array[1][0][2] (15) from above
                   Mem[Array+1*12*sizeof(int)+0*sizeof(int)*4+2*sizeof(int)] =
                   Mem[Array+48+8] = Mem[Array+56]
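
The same rule extends to any number of dimensions. A minimal sketch (row_major_offset_nd is a hypothetical name) that accumulates the flat index one dimension at a time; this is algebraically the same as multiplying each index by the product of the remaining sizes.

```cpp
#include <cstddef>
#include <vector>

// Row-major byte offset for any number of dimensions.
// dims[k] is the size of dimension k; index[k] is the subscript used there.
// Assumes dims and index have the same length and every index is in range.
std::size_t row_major_offset_nd(const std::vector<std::size_t>& dims,
                                const std::vector<std::size_t>& index,
                                std::size_t elem_size) {
    std::size_t flat = 0;
    for (std::size_t k = 0; k < dims.size(); ++k) {
        // Scale what we have so far by this dimension, then add its index.
        flat = flat * dims[k] + index[k];
    }
    return flat * elem_size;
}
```

For Array[1][0][2] in the 2x3x4 example the loop computes ((1*3)+0)*4+2 = 14 elements, i.e. byte offset 56 with 4-byte ints.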

   G. MIPS Addressing Styles
       Addressing refers to identifying a piece of data
       which you want to work with. More specifically, 
       addressing is describing where the data is or what
       it is.
       1. Register Addressing - The data you want is in a 
          register. Ex: Add inst - all the elements to add 
          are in registers.
       2. Base/Displacement Addressing - Describing where the
          operand is based on a "base" address. Ex: lw
          (A base + displacement = true address)
          In C an array's name is just used as the base.
          The index is used to compute an offset from the
          base.
          A class instance's name is also a "base" and the 
          class members are displacements from the base.
       3. Immediate Addressing - The data is immediately 
          available - i.e. it's part of the instruction.
          Ex: li, la, etc. 
       4. PC-Relative Addressing - The desired location is 
          relative to the current instruction. Ex: beq/bne
          (target = PC + 4 + 4*offset).
       5. Pseudo-Direct/Absolute Addressing - The exact  
          location is provided. Ex: j is pseudo-absolute: 
          the 26-bit field supplies most of the address, but the
          upper 4 bits come from the current PC.
     Note that some instructions "mix" these.
     addi - register and immediate.
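
Three of these styles involve a small address calculation, sketched below in C++. Point, branch_target and jump_target are hypothetical names, and the formulas assume the standard 32-bit MIPS encodings for beq/bne and j.

```cpp
#include <cstddef>
#include <cstdint>

// Base/displacement: a member lives at the object's base address plus a
// fixed displacement, the same shape as lw $t0, disp($base).
struct Point {   // hypothetical example type
    int x;       // displacement 0
    int y;       // displacement offsetof(Point, y)
};

// PC-relative (beq/bne): the signed 16-bit immediate counts instructions
// (4 bytes each) from the instruction after the branch, i.e. from PC + 4.
std::uint32_t branch_target(std::uint32_t pc, std::int16_t imm16) {
    std::int32_t byte_offset = static_cast<std::int32_t>(imm16) * 4;
    return pc + 4 + static_cast<std::uint32_t>(byte_offset);
}

// Pseudo-direct (j/jal): the 26-bit field is shifted left by 2 and the
// upper 4 bits are taken from PC + 4.
std::uint32_t jump_target(std::uint32_t pc, std::uint32_t addr26) {
    return ((pc + 4) & 0xF0000000u) | ((addr26 & 0x03FFFFFFu) << 2);
}
```

This is why j is only "pseudo" absolute: a jump cannot leave the 256 MB region selected by the upper 4 bits of the PC.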

H. C/C++ and ASM:
1. All programs are translated to ASM then Machine Language
2. Inlining - In C++ an "Inline" function can save several instructions
(The overhead of calling and setting up a function)
3. Locals - Local Variables are on the Stack! (Except static locals)
  Locals are Uninitialized unless you initialize them (take on a "random" value)
 Overwriting a local array will destroy "local" info.
(possibly the saved return address, frame pointer, or other saved registers)
   4. Globals &  Initializing constants are in the static data segment
Overwriting a global array will destroy other global variables!
Initializers take space and time!
  5. Pointers/Arrays
Pointers are often more efficient, but trickier to read in C/C++.
6. Pass by Value/Pass by reference
Pass by reference is more efficient (no "copy" loop)
for large structs/classes
7. C/C++ use "short circuit" evaluation.
I.e. they only look at the minimum needed in an expression.
Ex: if(a&&b&&c&&d) looks only at a when a is false.
8. Multi Dim Arrays - really they are mapped into flat memory
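
Short-circuit evaluation (item 7) can be observed by counting how many operands actually get evaluated. A minimal sketch; g_evaluated, operand and count_and_evaluations are made-up names.

```cpp
// Counts how many operands were actually evaluated.
int g_evaluated = 0;

// Wraps a value so that each evaluation is counted.
bool operand(bool value) {
    ++g_evaluated;
    return value;
}

// Returns the number of operands evaluated for a && b && c && d.
int count_and_evaluations(bool a, bool b, bool c, bool d) {
    g_evaluated = 0;
    if (operand(a) && operand(b) && operand(c) && operand(d)) {
        // Reached only when all four operands were true.
    }
    return g_evaluated;
}
```

If a is false only one operand is evaluated; all four are evaluated only when the first three are true. In assembly this shows up as a branch after each operand that skips the rest of the expression.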

III. Next Time:
A. Multi-Dim arrays
B. MIPS floating point
C. More C++ / ASM examples
D. Stack & security concerns.