Day 6
I. Last Time:
A. Analytical Performance Completed:
1. It's all about time/steps
B. Amdahl's Law and Upgrading
You can only "upgrade" hardware so much -
then you're just "moving" the bottleneck around.
(I.e. Amdahl's law will always be a problem...)
C. Benchmarking:
Synthetic vs. Non-Synthetic
II. New Stuff:
A. Benchmark Measures:
Often MIPS and MFLOPS are used as performance measures.
MIPS - Count of the millions of instructions executed per second
MFLOPS - Count of the millions of Floating Point Operations per second
These are only useful when comparing identical ISAs. Why?
Because different machines may do different quantities of "work" for
different instructions.
Ex: Memory Copy (ObjectA=ObjectB - Done OFTEN)
A RISC Machine (like MIPS) may require 200-300 instructions to
perform a memory copy.
A CISC Machine (like Intel) may require only a single instruction
to do the same work.
Similar to asking two people to move a pile of bricks.
Person A is "smart" and can just be told...move the bricks there.
Person B requires detailed instructions...Move this brick...now this.
Just comparing the number of instructions done in a given time doesn't
really measure the WORK done in that time.
If both machines complete the copy in the same time, we'd say they have
the same performance for the copy...
But the RISC machine would have a much higher MIPS rating.
Really, execution time is probably the "best" measure.
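A quick sketch of the memory copy comparison above. The numbers are made up for illustration (1,000,000 copies in 1.0 second on both machines, 250 instructions per copy on the RISC machine, 1 on the CISC machine); mips_rating is just a helper name for this example.

```python
def mips_rating(instruction_count, seconds):
    """Millions of instructions executed per second."""
    return instruction_count / (seconds * 1_000_000)

# HYPOTHETICAL workload: both machines do the same work in the same time.
copies = 1_000_000
elapsed_seconds = 1.0

risc_instructions = 250 * copies  # assumed: 250 instructions per copy
cisc_instructions = 1 * copies    # assumed: 1 instruction per copy

risc_mips = mips_rating(risc_instructions, elapsed_seconds)
cisc_mips = mips_rating(cisc_instructions, elapsed_seconds)

# Same work, same execution time, very different MIPS ratings.
print("RISC MIPS:", risc_mips)
print("CISC MIPS:", cisc_mips)
```

Execution time is identical here, so the 250x gap in MIPS says nothing about which machine is faster at the copy.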
Other Poor Metrics (Besides MIPS and MFLOPS):
Peak MIPS - WORSE than MIPS. Just a measure of the number of times the
fastest instruction can execute per second.
(May give some idea of CPU design, but BAD measure of perf)
Relative MIPS - Based on a workload and a common machine, so a little better.
B. Problems with Benchmark results:
1. Application Mix/Usage - Needs to be same as in benchmark
We want to test apps that are as close to what the machine
will be used for as possible.
I.e. 3D performance shouldn't be measured if we're only
going to use the machine for spreadsheets!
2. "Ranking" in suites - How are different results combined?
Look at page 32 of the SPEC handout
How do we really combine these results to determine
what is best?
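One way to see why combining results is tricky: the choice of average can change the ranking. The sketch below compares an arithmetic mean with a geometric mean (SPEC's summary numbers are geometric means of per-benchmark ratios against a reference machine). The ratios are invented for illustration and are NOT from the handout.

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    # n-th root of the product, computed through logs
    return math.exp(sum(math.log(v) for v in values) / len(values))

# HYPOTHETICAL speed ratios vs. a reference machine (bigger = faster).
machine_a = [1.0, 1.0, 10.0]  # great at one benchmark, average at the rest
machine_b = [3.0, 3.0, 3.0]   # uniformly good

for name, ratios in (("A", machine_a), ("B", machine_b)):
    print(name, arithmetic_mean(ratios), geometric_mean(ratios))
```

With these made-up numbers the arithmetic mean ranks A first, while the geometric mean ranks B first, so "what is best" depends on how the suite is summarized.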
C. SPEC - Standard Performance Evaluation Corporation
(originally the System Performance Evaluation Cooperative)
1. What's Measured: Int and Float Performance
Int: "Normal" Tasks
Compression: gzip
Compilation: gcc
chess, perl, combinatorial optimization, databases,
logic simulation, etc...
Floating Point: Number crunching
Image Processing/Neural Networks (Matrix Manip), Fluid Dynamics,
Primality Test, 3D Graphics, Finite Elements/Simulation,
Nuclear Physics, etc...
2. What's used to measure these?
REAL Apps (Non-Synthetic Benchmark) for "realistic" performance
Must be compute (rather than I/O) bound.
Must be "unique" compared to other benchmark members.
Must do meaningful/useful work.
3. Who's behind SPEC?
For the most part: Industry (HP, IBM, SGI, Compaq, Sun, Intel, etc.)
4. What are the problems in developing SPEC
a. Selecting a program. MUST be REALLY portable
18 Platforms, some 32-bit, some 64-bit
11 types of Unix, and 2 versions of Windows NT
multiple compilers
b. Programming Languages & Compilers
C: 11 in int, 4 in FP
C++: 1 in int
F77: 6 in FP
F90: 4 in FP
Why Fortran in FP?
Why so little C++?
c. Problems with FP
Numerical variability - subtle implementation differences
will actually give different results.
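A tiny illustration of the kind of difference meant here: floating point addition is not associative, so just reordering a sum (as a different compiler or library might) changes the last bits of the answer. The variable names are only for this example.

```python
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # one evaluation order
right_to_left = a + (b + c)   # same math, different order

print(left_to_right)                   # 0.6000000000000001
print(right_to_left)                   # 0.6
print(left_to_right == right_to_left)  # False
```

Both answers are "right" to about 16 digits, which is why a benchmark has to decide how much difference in the output it will tolerate.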
d. Vendor self interest:
results are usually confidential
voted for by diverse sub-committee
e. Misc:
Compilers: Optimization/differences = Big Impact
181.mcf, 252.eon (pg32) 500 MHz beats 533 MHz 3 times.
D. Binary Numbers:
1. Back to the basics: 1s and 0s
Computers use a 2 digit counting system, which corresponds to the on/off states of the "switches" used.
Really it works just like the counting system you're used to, just fewer digits:
Counting Sequence: 0, 1, 10, 11, 100, 101, etc.
2. Since we use at least 3 numbering systems in here, I'll try to use one of the following notations:
binary: b or base2
decimal: d or nothing
hexadecimal: h, or 0x
3. These digits can be used to represent numbers by realizing that, like decimal,
each place has a value dependent on the base. I.e. :
9,824 = 9*1,000+8*100+2*10+4*1 = 9*10^3+8*10^2+2*10^1+4*10^0.
Binary works essentially the same, but we only have 2 digits
(0-1 instead of 0-9), and we use powers of 2 rather than powers of 10.
EX: 10 1101b = 1*2^5+0*2^4+1*2^3+1*2^2+0*2^1+1*2^0 = 45d
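The place-value sum can be sketched in a few lines. binary_to_decimal is a name made up for this example; Python's built-in int(text, 2) does the same job.

```python
def binary_to_decimal(bits):
    """Sum digit * 2^position, with position 0 at the right."""
    bits = bits.replace(" ", "")  # allow grouping spaces like "10 1101"
    total = 0
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total

print(binary_to_decimal("10 1101"))  # 32 + 8 + 4 + 1 = 45
```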
4. Conversion from bin to decimal - see above
5. Conversion from decimal to bin - repeated division by 2 w/ remainders
EX: 13/2=6 r 1
6/2=3 r 0
3/2=1 r 1
1/2=0 r 1
Divide until the quotient is 0, then read the remainders from bottom to top: 13d = 1101b.
(Actually, this is the same as dividing by 2^3, 2^2, 2^1, 2^0, etc.)
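The repeated division can be written out directly. A minimal sketch for non-negative integers; decimal_to_binary is a name chosen for this example.

```python
def decimal_to_binary(n):
    """Repeated division by 2, collecting the remainders."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)   # e.g. 13 -> quotient 6, remainder 1
        remainders.append(str(remainder))
    # The first remainder is the LEAST significant bit, so read them backwards.
    return "".join(reversed(remainders))

print(decimal_to_binary(13))  # 1101
```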
E. Hexadecimal to store/represent binary - Note 1-to-1 correspondence
Hexadecimal is base 16. (Digits of 0-15)
Note that in binary, 4 digits can count up to 15.
This is really why hexadecimal is used - 4 bin digits = 1 hex digit.
4 binary digits is called a nibble.
1 nibble = 4 bits
1 byte = 2 nibbles = 8 bits
The most fundamental unit of storage in most computers is 1 byte =
8 bits = 2 hexadecimal digits, so hex works really well.
So, hex has 16 digits: 0-9 and then A-F. They represent 0-15.
Decimal  Binary  Hexadecimal
0        0000    0
1        0001    1
2        0010    2
.        ....    .
It's probably best to memorize this table so you can easily convert.
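If you want the whole table to study from, a short sketch like this will print all 16 rows. conversion_table is just a helper name for this example.

```python
def conversion_table():
    """Return (decimal, binary, hex) rows for the values 0 through 15."""
    rows = []
    for value in range(16):
        rows.append((value, format(value, "04b"), format(value, "X")))
    return rows

print("Decimal  Binary  Hexadecimal")
for decimal, binary, hexadecimal in conversion_table():
    print(f"{decimal:<8} {binary}    {hexadecimal}")
```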
Convert 0xAC to binary: Ah=1010b, Ch=1100b => 1010 1100b
Convert 0x4E to binary: 4h=0100b, Eh=1110b => 0100 1110b
Convert 110100b to hex: 11 0100b = 0011 0100b = 0x34
Easy conversion to decimal (Same process as binary and decimal):
0x7AC = 7*16^2 + A(10)*16^1 + C(12)*16^0 = 1964
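The same conversions, sketched one nibble at a time. The function names are made up for this example, and hex_to_decimal builds up the same place-value sum from left to right (multiply by 16, add the next digit).

```python
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_binary(hex_string):
    """Replace each hex digit with its 4-bit nibble."""
    nibbles = [format(HEX_DIGITS.index(d), "04b") for d in hex_string.upper()]
    return " ".join(nibbles)

def binary_to_hex(bits):
    """Pad on the left to a multiple of 4 bits, then convert each nibble."""
    bits = bits.replace(" ", "")
    padded_length = -(-len(bits) // 4) * 4   # round up to a multiple of 4
    bits = bits.zfill(padded_length)
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)]
                   for i in range(0, len(bits), 4))

def hex_to_decimal(hex_string):
    total = 0
    for digit in hex_string.upper():
        total = total * 16 + HEX_DIGITS.index(digit)
    return total

print(hex_to_binary("AC"))      # 1010 1100
print(binary_to_hex("110100"))  # 34
print(hex_to_decimal("7AC"))    # 1964
```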
F. Sign Representations
How do we do negatives in a computer?
We have to somehow store the sign within the number.
1. Sign Bit (Sign Magnitude) - Assign a bit to represent the sign.
Usually MSB is used.
Pitfalls: This makes subtraction/negatives difficult
There are 2 "zeros": one at MSB=1 and one at MSB=0
The wraparound has 2 boundaries
Pros: Numbers are "easy" to read
3 bit example: 000 001 010 011 100 101 110 111
                 0   1   2   3  -0  -1  -2  -3
Uses: There are a few uses of sign magnitude representation...
IEEE-754 Floating Point uses something similar
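A minimal sketch of sign magnitude, assuming the MSB is the sign bit and bit patterns are passed around as strings. The function names are made up for this example. Python integers have no -0, so both "zeros" decode to plain 0.

```python
def sign_magnitude_encode(value, width):
    magnitude = abs(value)
    if magnitude >= 2 ** (width - 1):
        raise ValueError("magnitude does not fit in width - 1 bits")
    sign_bit = "1" if value < 0 else "0"
    return sign_bit + format(magnitude, f"0{width - 1}b")

def sign_magnitude_decode(bits):
    sign = -1 if bits[0] == "1" else 1   # MSB holds the sign
    return sign * int(bits[1:], 2)       # remaining bits hold the magnitude

for pattern in range(8):
    bits = format(pattern, "03b")
    print(bits, sign_magnitude_decode(bits))
```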
2. Binary Offset
Pick a fixed offset of about half the unsigned range (4 for 3 bits) and
subtract it from the unsigned value of the bits. The pattern equal to the
offset is called 0. Everything else is numbered outward.
3 bit example: 000 001 010 011 100 101 110 111
                -4  -3  -2  -1   0   1   2   3
Pitfalls: Again, subtraction/negatives are difficult
Uses: IEEE-754 uses binary offset to store the exponent of a number
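A minimal sketch of binary offset that matches the 3 bit example. The offset (bias) of 2^(width-1) is an assumption made to reproduce that table; real formats pick their own offset. The function names are made up for this example.

```python
def offset_encode(value, width):
    bias = 2 ** (width - 1)          # assumed offset: 4 for 3 bits
    stored = value + bias            # what actually goes in the bits
    if not 0 <= stored < 2 ** width:
        raise ValueError("value out of range for this width")
    return format(stored, f"0{width}b")

def offset_decode(bits):
    bias = 2 ** (len(bits) - 1)
    return int(bits, 2) - bias       # unsigned value minus the offset

for pattern in range(8):
    bits = format(pattern, "03b")
    print(bits, offset_decode(bits))
```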
3. 1's Complement
In one's complement a positive number is written in the same
manner as an unsigned number. To find a negative, find the
representation of the magnitude and then flip (invert) each bit.
Pitfalls: Subtraction is a little more difficult than 2's comp.
Uses: JPEG uses 1's complement to store numbers.
(It's trying to compress data - minimize the number of
bits used. 1's complement numbers can be stored in exactly
the number of bits needed for the magnitude of the number:
ex: 72: 100 1000b, -72 = 011 0111b. 7 bits either way.
Positive numbers begin with 1, negatives with 0)
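A sketch of the minimum-width idea described above. This is NOT the actual JPEG codec, just the scheme as stated in these notes; the function names are made up, and zero is left out of this sketch because it has no leading 1 bit.

```python
def flip(bits):
    """Invert every bit in a string of 0s and 1s."""
    return "".join("1" if b == "0" else "0" for b in bits)

def encode_min_width(value):
    if value == 0:
        raise ValueError("zero is not handled in this sketch")
    magnitude_bits = format(abs(value), "b")   # exactly as many bits as needed
    return magnitude_bits if value > 0 else flip(magnitude_bits)

def decode_min_width(bits):
    if bits[0] == "1":                 # leading 1 -> positive
        return int(bits, 2)
    return -int(flip(bits), 2)         # leading 0 -> negative, flip back

print(encode_min_width(72))    # 1001000
print(encode_min_width(-72))   # 0110111
```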
4. 2's Complement
Positive: Same as unsigned
Negative: Invert each bit and add 1
3 bit example: 000 001 010 011 100 101 110 111
                 0   1   2   3  -4  -3  -2  -1
A nice "ring" with only 1 abrupt boundary.
Most Commonly Used representation for negative integers in computers.
Pros: Makes subtraction/negatives easy (to subtract, just add the negative). Only 1 zero.
Uses: Practically all integer math...where negatives are allowed
Shortcut: Locate RIGHTMOST ONE bit and invert everything to its
LEFT. (DO NOT invert that bit itself)
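Both ways of negating can be sketched and checked against each other. Bit patterns are fixed-width strings here, and the function names are made up for this example.

```python
def invert_and_add_one(bits):
    """Negate by flipping every bit and adding 1 (wrapping at the width)."""
    width = len(bits)
    inverted = int(bits, 2) ^ (2 ** width - 1)
    return format((inverted + 1) % 2 ** width, f"0{width}b")

def rightmost_one_shortcut(bits):
    """Negate by flipping only the bits LEFT of the rightmost 1."""
    index = bits.rfind("1")
    if index == -1:
        return bits                      # zero negates to itself
    flipped = "".join("1" if b == "0" else "0" for b in bits[:index])
    return flipped + bits[index:]        # rightmost 1 and trailing 0s unchanged

def twos_complement_decode(bits):
    value = int(bits, 2)
    if bits[0] == "1":                   # MSB set -> negative
        value -= 2 ** len(bits)
    return value

for pattern in range(8):
    bits = format(pattern, "03b")
    print(bits, twos_complement_decode(bits),
          invert_and_add_one(bits), rightmost_one_shortcut(bits))
```

Note that 100 (-4) negates to itself: +4 does not fit in 3 bits, which is the one abrupt boundary on the ring.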
III. Next Time:
A. Continuing Number / Data Representation