Day 6

I. Last Time:
A. Analytical Performance Completed:
1. It's all about time/steps

B. Amdahl's Law and Upgrading
You can only "upgrade" hardware so much -
then you're just "moving" the bottleneck around.
(I.e. Amdahl's law will always be a problem...)
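A quick sketch of why upgrades only go so far, using the usual form of
Amdahl's law: speedup = 1 / ((1 - f) + f / s). Here f is the fraction of
run time the upgraded part accounts for and s is how much faster that part
gets. The function name and the 60% figure are made up for illustration.

```python
def amdahl_speedup(fraction_enhanced, enhancement_speedup):
    """Overall speedup when only part of the run time is improved."""
    remaining = 1.0 - fraction_enhanced
    return 1.0 / (remaining + fraction_enhanced / enhancement_speedup)

# Hypothetical case: the CPU accounts for 60% of the run time.
for s in (2, 10, 1000):
    print(f"CPU {s}x faster -> overall {amdahl_speedup(0.6, s):.2f}x")
```

No matter how large s gets, the overall speedup in this example stays below
1 / (1 - 0.6) = 2.5x, because the other 40% becomes the bottleneck.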

C. Benchmarking:
Synthetic vs. Non-Synthetic

II. New Stuff:
A. Benchmark Measures:
        Often MIPS and MFLOPS are used as performance measures. 
        MIPS - Count of the millions of instructions executed per second
        MFLOPS - Count of the millions of Floating Point Operations per second

        These are only useful when comparing identical ISAs. Why?
        Because different machines may do different quantities of "work" for 
        different instructions.

        Ex: Memory Copy (ObjectA=ObjectB - Done OFTEN)
            A RISC Machine (like MIPS) may require 200-300 instructions to 
            perform a memory copy.
            A CISC Machine (like Intel) may require only a single instruction 
            to do the same work.
            Similar to asking two people to move a pile of bricks.
            Person A is "smart" and can just be told...move the bricks there.
            Person B requires detailed instructions...Move this brick...now this.

        Just comparing the number of instructions done in a given time doesn't
        really measure the WORK done in that time.

        If both machines complete the copy in the same time, we'd say they have
        the same performance for the copy...

        But the RISC machine would have a much higher MIPS rating.
      
        Really, execution time is probably the "best" measure.
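        A rough sketch of the memory copy argument in Python. The instruction
        counts and the 1 microsecond copy time are made-up numbers, and mips()
        is just a helper name for this example.

```python
def mips(instruction_count, seconds):
    """Millions of instructions executed per second."""
    return instruction_count / (seconds * 1e6)

# Hypothetical: both machines finish the same copy in 1 microsecond.
copy_time = 1e-6
risc_instructions = 250  # many simple instructions (made-up, in the 200-300 range)
cisc_instructions = 1    # one complex instruction

risc_mips = mips(risc_instructions, copy_time)
cisc_mips = mips(cisc_instructions, copy_time)

print(f"RISC: {risc_mips:.0f} MIPS, CISC: {cisc_mips:.0f} MIPS")
print(f"Execution time for both: {copy_time} s")
```

        Same work, same time, yet the MIPS ratings differ by a factor of
        about 250. Execution time is the number that matches what the user sees.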

    Other Poor Metrics (Besides MIPS and MFLOPS):
       Peak MIPS - WORSE than MIPS. Just a measure of the number of times the 
                   fastest instruction can execute per second. 
                   (May give some idea of CPU design, but BAD measure of perf)
       Relative MIPS - Based on a workload and a common machine, so a little better.

B. Problems with Benchmark results:
        1. Application Mix/Usage - Needs to be same as in benchmark
           We want to test apps that are as close to what the machine
           will be used for as possible. 
           I.e. 3D performance shouldn't be measured if we're only
                going to use the machine for spreadsheets!
        2. "Ranking" in suites - How are different results combined?
            Look at page 32 of the SPEC handout
            How do we really combine these results to determine 
            what is best?
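            One way to see why the combining method matters: SPEC's summary
            figure is a geometric mean of per-benchmark ratios against a
            reference machine. The sketch below compares that with a plain
            arithmetic mean. The two machines and their ratios are made up.

```python
import math

def arithmetic_mean(ratios):
    return sum(ratios) / len(ratios)

def geometric_mean(ratios):
    # n-th root of the product, computed through logs
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical speed ratios (higher is better) on three benchmarks.
machine_a = [1.0, 1.0, 10.0]  # great on one benchmark, mediocre elsewhere
machine_b = [3.0, 3.0, 3.0]   # consistently decent

for name, ratios in (("A", machine_a), ("B", machine_b)):
    print(name, round(arithmetic_mean(ratios), 2), round(geometric_mean(ratios), 2))
```

            With these numbers A wins on the arithmetic mean and B wins on the
            geometric mean, so the "best" machine depends on how the results
            are combined.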

    C. SPEC  - Standard Performance Evaluation Corporation
               (originally the System Performance Evaluation Cooperative)
       1. What's Measured: Int and Float Performance
          Int: "Normal" Tasks
               Compression: gzip
               Compilation: gcc
               chess, perl, combinatorial optimization, databases, 
               logic simulation, etc...
          Floating Point: Number crunching
               Image Processing/Neural Networks (Matrix Manip), Fluid Dynamics, 
               Primality Test, 3D Graphics, Finite Elements/Simulation, 
               Nuclear Physics, etc... 
      2. What's used to measure these?
         REAL Apps (Non-Synthetic Benchmark) for "realistic" performance
         Must be compute (rather than IO) bound.
         Must be "unique" compared to other benchmark members.
         Must do meaningful/useful work.
      3. Who's behind SPEC?
         For the most part: Industry (HP, IBM, SGI, Compaq, Sun, Intel, etc.)
      4. What are the problems in developing SPEC
         a. Selecting a program. MUST be REALLY portable
            18 Platforms, some 32-bit, some 64-bit
            11 types of unix, and 2 windows NTs
            multiple compilers
         b. Programming Languages & Compilers
            C:11 in int, 4 in FP
            C++: 1 in int.
            F77: 6 in fp
            F90: 4 in FP 
            Why Fortran in FP?
            Why so little C++?
         c. Problems with FP
            Numerical variability - subtle implementation differences 
                                    will actually give different results.
         d. Vendor self interest:
            results are usually confidential
            voted for by diverse sub-committee

         e. Misc:
               Compilers: Optimization/differences = Big Impact
               181.mcf, 252.eon (pg32) 500 MHz beats 533 3 times.

D. Binary Numbers:
     1. Back to the basics: 1s and 0s
         Computers use a 2 digit counting system, corresponding to the on/off states of the "switches" used.
         Really it works just like the counting system you're used to, just fewer digits:
         Counting Sequence: 0, 1, 10, 11, 100, 101, etc.
      2. Since we use at least 3 numbering systems in here, I'll try to use one of the following notations:
         binary: b or base2
         decimal: d or nothing
         hexadecimal: h, or 0x 
      3. These digits can be used to represent numbers by realizing that, like decimal,
         each place has a value dependent on the base. I.e. :
         9,824 = 9*1,000+8*100+2*10+4*1 = 9*10^3+8*10^2+2*10^1+4*10^0.
         Binary works essentially the same, but we only have 2 digits 
         (0-1 instead of 0-9), and we use powers of 2 rather than powers of 10.
         EX: 10 1101b = 1*2^5+0*2^4+1*2^3+1*2^2+0*2^1+1*2^0 = 32+8+4+1 = 45d
      4. Conversion from bin to decimal - see above
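         The place-value sum can be sketched in Python as below. The function
         name binary_to_decimal is just for this example; Python's built-in
         int(text, 2) does the same job.

```python
def binary_to_decimal(bits):
    """Sum digit * 2^position, counting positions from the right."""
    bits = bits.replace(" ", "")  # ignore spaces used for grouping
    total = 0
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total

print(binary_to_decimal("10 1101"))  # 45
```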
      5. Conversion from decimal to bin - repeated division by 2 w/ remainders
         EX: 13/2=6 r 1
              6/2=3 r 0
              3/2=1 r 1
              1/2=0 r 1
         Divide until 0, read remainders from bottom to top: 13d = 1101b.
         (Actually, this is the same as dividing by 2^3, 2^2, 2^1, 2^0, etc.)
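         A minimal Python sketch of the repeated-division method. The function
         name decimal_to_binary is made up for this example, and it assumes a
         non-negative integer.

```python
def decimal_to_binary(n):
    """Divide by 2 until 0, then read the remainders last-to-first."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient and remainder in one step
        remainders.append(str(remainder))
    return "".join(reversed(remainders))

print(decimal_to_binary(13))  # 1101
```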

    E. Hexadecimal to store/represent binary - Note 1-to-1 correspondence
       Hexadecimal is base 16. (Digit values of 0-15, written 0-9 and A-F)
       Note that in binary, 4 digits can count up to 15.
       This is really why hexadecimal is used - 4 bin digits = 1 hex digit.
       4 binary digits is called a nibble.
       1 nibble = 4 bits
       1 byte = 2 nibbles = 8 bits

       The most fundamental unit of storage in most computers is 1 byte = 
       8 bits = 2 hexadecimal digits, so hex works really well.

       So, hex has 16 digits. They represent 0-15. 
       Decimal     Binary      Hexadecimal
         0          0000            0
         1          0001            1
         2          0010            2
         .          ....            .
     
       It's probably best to memorize this table so you can easily convert.
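       If you want the whole table to study from, a short Python sketch can
       print it. The helper name hex_table is made up for this example.

```python
def hex_table():
    """Rows of (decimal, 4-bit binary, hex digit) for 0 through 15."""
    return [(value, format(value, "04b"), format(value, "X")) for value in range(16)]

print("Decimal     Binary      Hexadecimal")
for decimal, binary, hex_digit in hex_table():
    print(f"{decimal:>3}          {binary}            {hex_digit}")
```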

       Convert 0xAC to binary: Ah=1010b, Ch=1100b => 1010 1100b
               0x4E to binary: 4h=0100b, Eh=1110b => 0100 1110b
   
       Convert 110100b to Hex: pad on the left to whole nibbles, 0011 0100b = 0x34

       Easy conversion to decimal (Same process as binary and decimal):
          0x7AC = 7*16^2 + A(10)*16^1 + C(12)*16^0 = 1964
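       The three conversions above, sketched in Python. All function names are
       made up for this example, and the input is assumed to contain only valid
       digits.

```python
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_binary(hex_string):
    """Replace each hex digit with its 4-bit nibble."""
    return " ".join(format(HEX_DIGITS.index(d), "04b") for d in hex_string.upper())

def binary_to_hex(bits):
    """Pad on the left to whole nibbles, then translate nibble by nibble."""
    bits = bits.replace(" ", "")
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(nibble, 2)] for nibble in nibbles)

def hex_to_decimal(hex_string):
    """Place-value sum with powers of 16, accumulated left to right."""
    total = 0
    for d in hex_string.upper():
        total = total * 16 + HEX_DIGITS.index(d)
    return total

print(hex_to_binary("AC"))      # 1010 1100
print(binary_to_hex("110100"))  # 34
print(hex_to_decimal("7AC"))    # 1964
```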

    F. Sign Representations
       How do we do negatives in a computer? 
       We have to somehow store the sign within the number. 
       
       1. Sign Bit  (Sign Magnitude) - Assign a bit to represent the sign.
          Usually MSB is used.

          Pitfalls: This makes subtraction/negatives difficult
                    There are 2 "zeros" (one with MSB=1, one with MSB=0)
                    The wraparound has 2 boundaries

          Pros: Numbers are "easy" to read

          3 bit example: 000 001 010 011, 100 101 110 111
                          0   1   2   3   -0  -1  -2  -3

          Uses: There are a few uses of sign magnitude representation...
                IEEE-754 Floating Point uses something similar
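          A small Python sketch of sign magnitude, working on bit strings so
          the patterns are easy to see. The function names and the default
          3-bit width are choices made for this example.

```python
def sign_magnitude_encode(value, bits=3):
    """MSB holds the sign, the remaining bits hold the magnitude."""
    magnitude_bits = bits - 1
    if abs(value) >= 2 ** magnitude_bits:
        raise ValueError("magnitude does not fit")
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{magnitude_bits}b")

def sign_magnitude_decode(pattern):
    magnitude = int(pattern[1:], 2)
    return -magnitude if pattern[0] == "1" else magnitude

print(sign_magnitude_encode(-3))  # 111
# Both 000 and 100 decode to zero: the "2 zeros" pitfall.
print(sign_magnitude_decode("000"), sign_magnitude_decode("100"))
```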

       2. Binary Offset
          Add a fixed offset (bias) of about half the range to every value
          before storing it, so the pattern in the middle of the range
          (100 for 3 bits) is called 0. Everything else is numbered outward.

          3 bit example: 000 001 010 011 100 101 110 111
                         -4  -3  -2  -1   0   1   2   3
          
          Pitfalls: Again, Subtraction/negatives is difficult
          Uses: IEEE-754 uses binary offset to store the exponent on a number
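          A sketch of binary offset in Python, using a bias of 2^(bits-1) so
          it reproduces the 3 bit example. The function names and the choice
          of bias are assumptions for this example.

```python
def offset_encode(value, bits=3):
    """Stored pattern = true value + bias."""
    bias = 2 ** (bits - 1)  # 4 for 3 bits, so 100 means 0
    stored = value + bias
    if not 0 <= stored < 2 ** bits:
        raise ValueError("value out of range")
    return format(stored, f"0{bits}b")

def offset_decode(pattern):
    """True value = stored pattern - bias."""
    bias = 2 ** (len(pattern) - 1)
    return int(pattern, 2) - bias

print([offset_encode(v) for v in range(-4, 4)])
print(offset_decode("100"))  # 0
```

          Note that real formats may pick a slightly different bias. IEEE-754
          single precision, for instance, biases its 8-bit exponent by 127
          rather than 128.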

       3. 1's Complement
          In one's complement a positive number is written in the same
          manner as an unsigned number. To find a negative, find the 
          representation of the magnitude and then flip (invert) each bit.

          Pitfalls: Subtraction is a little more difficult than 2's comp.

          Uses: JPEG uses 1's complement to store numbers.
                (It's trying to compress data - minimize the number of
                 bits used. 1's complement numbers can be stored in exactly
                 the number of bits needed for the magnitude of the number:
                 ex: 72: 100 1000b, -72: 011 0111b. 7 bits either way.
                 Positive numbers begin with 1, negatives with 0)
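          The bit flip is easy to sketch in Python on a bit string. The
          function name is made up for this example.

```python
def ones_complement_negate(pattern):
    """Flip every bit; the width stays the same."""
    return "".join("1" if bit == "0" else "0" for bit in pattern)

seventy_two = format(72, "b")  # minimal-width magnitude
print(seventy_two, ones_complement_negate(seventy_two))  # 1001000 0110111
```

          Flipping twice gets you back to the original pattern, so the same
          operation negates in both directions.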

       4. 2's Complement
          Positive: Same as unsigned
          Negative: Invert each bit and add 1

          3 bit example: 000 001 010 011 100 101 110 111
                          0   1   2   3  -4  -3  -2  -1

          A nice "ring" with only 1 abrupt boundary.

          Most commonly used representation for negative integers in computers.
 
          Pros: Makes subtraction/negatives easy (A - B is just A + (-B)); only 1 zero
          Uses: Practically all integer math...where negatives are allowed

          Shortcut: Locate RIGHTMOST ONE bit and invert everything to its 
                    LEFT. (DO NOT invert that bit itself)
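          Both ways of negating can be sketched in Python on fixed-width bit
          strings, which makes it easy to check that the shortcut agrees with
          "invert and add 1". All function names are made up for this example.

```python
def twos_complement_negate(pattern):
    """Invert each bit, add 1, and keep the same width."""
    width = len(pattern)
    inverted = int(pattern, 2) ^ (2 ** width - 1)
    return format((inverted + 1) % 2 ** width, f"0{width}b")

def twos_complement_shortcut(pattern):
    """Invert everything to the LEFT of the rightmost 1 bit."""
    last_one = pattern.rfind("1")
    if last_one == -1:
        return pattern  # all zeros: negating 0 gives 0
    flipped = "".join("1" if bit == "0" else "0" for bit in pattern[:last_one])
    return flipped + pattern[last_one:]

def twos_complement_decode(pattern):
    """A leading 1 means the value is the unsigned value minus 2^width."""
    value = int(pattern, 2)
    return value - 2 ** len(pattern) if pattern[0] == "1" else value

print(twos_complement_negate("011"), twos_complement_shortcut("011"))  # 101 101
print(twos_complement_decode("101"))  # -3
```

          Note that 100 (the most negative 3 bit value, -4) negates to itself.
          That is the 1 abrupt boundary on the ring.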

 III. Next Time:
A. Continuing Number / Data Representation