Lecture: Computer Architecture - Chapter 3: Computer arithmetic


  1. COMPUTER ARCHITECTURE
     Chapter 3: Computer arithmetic
     Computer Engineering – CSE – HCMUT
  2. Outline
     • Integer operations
       – Addition and subtraction
       – Multiplication and division
     • Floating-point numbers
       – Representation
       – Operations and instructions
     Computer Architecture (c) Cuong Pham-Quoc/HCMUT
  3. INTEGER OPERATIONS
  4. Integer addition
     • Example: $7_{10} + 6_{10} = 0111_2 + 0110_2$
     • Overflow: result out of range
       – Adding a +ve and a –ve operand: no overflow
       – Adding two +ve operands: overflow if result sign is 1
       – Adding two –ve operands: overflow if result sign is 0
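The sign rule on this slide can be sketched in C for 4-bit two's-complement operands. This is a minimal illustration rather than anything taken from the slides; the helper names `add4` and `to_signed4` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interpret the low 4 bits of x as a two's-complement value in [-8, 7]. */
int to_signed4(uint8_t x) {
    x &= 0xF;
    return (x & 0x8) ? (int)x - 16 : (int)x;
}

/* Add two 4-bit two's-complement patterns; *overflow follows the sign rule. */
uint8_t add4(uint8_t a, uint8_t b, bool *overflow) {
    uint8_t sum = (uint8_t)((a + b) & 0xF);   /* keep only 4 result bits */
    bool sa = a & 0x8, sb = b & 0x8, ss = sum & 0x8;
    /* Overflow only when both operands share a sign and the result's sign differs. */
    *overflow = (sa == sb) && (ss != sa);
    return sum;
}
```

Under these assumptions, `add4(0x7, 0x6, &ov)` returns the pattern 1101 with `ov` set: the true sum 13 lies outside the 4-bit signed range of −8 to 7, and the pattern reads back as −3 through `to_signed4`.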
  5. Integer subtraction
     • Add the negation (2's complement) of the second operand
     • Example: $7 - 6 = 7 + (-6) = 0111_2 + 1010_2 = 0001_2$ (the carry out of the top bit is discarded)
     • Overflow if result out of range
       – Subtracting two +ve or two –ve operands: no overflow
       – Subtracting a +ve from a –ve operand, e.g. $-7 - 6$: overflow if result sign is 0
       – Subtracting a –ve from a +ve operand, e.g. $7 - (-6)$: overflow if result sign is 1
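Subtraction by adding the two's-complement negation, together with the overflow cases listed on this slide, might look as follows in C for 4-bit operands. The function name `sub4` is hypothetical and the code is only a sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compute a - b on 4-bit two's-complement patterns as a + (~b + 1). */
uint8_t sub4(uint8_t a, uint8_t b, bool *overflow) {
    uint8_t neg_b = (uint8_t)((~b + 1) & 0xF);     /* two's-complement negation */
    uint8_t diff = (uint8_t)((a + neg_b) & 0xF);   /* carry out of bit 3 is dropped */
    bool sa = a & 0x8, sb = b & 0x8, sd = diff & 0x8;
    /* Overflow only when operand signs differ and the result's sign differs from a's. */
    *overflow = (sa != sb) && (sd != sa);
    return diff;
}
```

Here `sub4(0x7, 0x6, &ov)` gives 0001 with no overflow, while `sub4(0x9, 0x6, &ov)` (that is, −7 − 6) sets `ov` because the result sign is 0.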
  6. Deal with overflow
     • Some languages (e.g., C) ignore overflow
       – Use MIPS addu, addiu, subu instructions
     • Other languages (e.g., Ada, Fortran) require raising an exception
       – Use MIPS add, addi, sub instructions
       – On overflow, the hardware invokes the exception handler
         • Save PC in the exception program counter (EPC) register
         • Jump to the predefined handler address
         • The mfc0 (move from coprocessor register) instruction can retrieve the EPC value, to return after corrective action
  7. Hardware for multiplication
         multiplicand       1000
         multiplier       × 1001
                            ----
                            1000
                           0000
                          0000
                         1000
                         -------
         product         1001000
     Length of product is the sum of operand lengths
  8. Hardware operation (flowchart figure; m0 is the least-significant bit of the multiplier)
  9. Example
     • Using 4-bit numbers, calculate $2_{10} \times 3_{10} = 0010_2 \times 0011_2$
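The sequential shift-and-add scheme behind this example can be sketched in C. This is an illustrative model of the algorithm, not the slide's actual datapath; `shift_add_mul` is a hypothetical name, and the operands are assumed to be unsigned and at most 16 bits wide.

```c
#include <stdint.h>

/* Unsigned shift-and-add multiply of two n-bit operands (n <= 16). */
uint32_t shift_add_mul(uint16_t multiplicand, uint16_t multiplier, int n) {
    uint32_t mcand = multiplicand;   /* widened register, shifted left each step */
    uint32_t mplier = multiplier;    /* shifted right each step */
    uint32_t product = 0;
    for (int i = 0; i < n; i++) {
        if (mplier & 1u)             /* m0 == 1: add the multiplicand to the product */
            product += mcand;
        mcand <<= 1;
        mplier >>= 1;
    }
    return product;
}
```

For the slide's operands, `shift_add_mul(2, 3, 4)` adds 0010 in the first step and 0100 in the second, giving 0110 (6).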
  10. Optimized hardware (figure)
      • Optimized for hardware usage, not for performance
  11. MIPS multiplication instructions
      • Two 32-bit registers for product
        – HI: most-significant 32 bits
        – LO: least-significant 32 bits
      • Instructions
        – mult rs, rt / multu rs, rt
          • 64-bit product in HI/LO
        – mfhi rd / mflo rd
          • Move from HI/LO to rd
          • Can test HI value to see if product overflows 32 bits
        – mul rd, rs, rt
          • ONLY least-significant 32 bits of product → rd
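The HI/LO split and the "test HI" overflow check can be modeled in C with a 64-bit intermediate. This is a sketch under stated assumptions: the type `hilo_t` and the functions `multu_model`, `mult_model` and `mult_overflows32` are hypothetical names, not MIPS or library APIs.

```c
#include <stdint.h>

/* Hypothetical model of the HI/LO register pair. */
typedef struct { uint32_t hi, lo; } hilo_t;

/* Like multu: full 64-bit unsigned product, split across HI and LO. */
hilo_t multu_model(uint32_t rs, uint32_t rt) {
    uint64_t p = (uint64_t)rs * rt;
    hilo_t r = { (uint32_t)(p >> 32), (uint32_t)p };
    return r;
}

/* Like mult: full 64-bit signed product, split across HI and LO. */
hilo_t mult_model(int32_t rs, int32_t rt) {
    uint64_t p = (uint64_t)((int64_t)rs * rt);
    hilo_t r = { (uint32_t)(p >> 32), (uint32_t)p };
    return r;
}

/* A signed product fits in 32 bits only if HI is the sign extension of LO. */
int mult_overflows32(hilo_t r) {
    uint32_t sign_ext = (r.lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    return r.hi != sign_ext;
}
```

For unsigned products the corresponding check is simply whether HI is nonzero.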
  12. Division
      • Long division approach
        – If divisor ≤ dividend bits: 1 bit in quotient, subtract
        – Otherwise: 0 bit in quotient, bring down next dividend bit
      • Restoring division
        – Do the subtract, and if remainder goes < 0, add divisor back
      • Signed division
        – Divide using absolute values
        – Adjust sign of quotient and remainder as required
      • n-bit operands yield n-bit quotient and remainder
  13. Hardware for division
  14. Example
      • Using 4-bit numbers, calculate $7_{10} \div 2_{10} = 0111_2 \div 0010_2$
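Restoring division can be sketched in C as bit-serial long division: bring down one dividend bit, try the subtraction, and add the divisor back if the remainder goes negative. The sketch follows the long-division formulation rather than the exact register layout of the hardware figure; `restoring_div` and `divres_t` are hypothetical names, and the divisor is assumed to be nonzero.

```c
#include <stdint.h>

typedef struct { uint32_t quotient, remainder; } divres_t;

/* Unsigned restoring division of n-bit operands (n <= 32, divisor != 0). */
divres_t restoring_div(uint32_t dividend, uint32_t divisor, int n) {
    int64_t rem = 0;
    uint32_t quot = 0;
    for (int i = n - 1; i >= 0; i--) {
        rem = (rem << 1) | ((dividend >> i) & 1u);  /* bring down the next dividend bit */
        rem -= divisor;                             /* trial subtraction */
        if (rem < 0) {
            rem += divisor;                         /* restore: add the divisor back */
            quot <<= 1;                             /* quotient bit 0 */
        } else {
            quot = (quot << 1) | 1u;                /* quotient bit 1 */
        }
    }
    divres_t r = { quot, (uint32_t)rem };
    return r;
}
```

For the slide's operands, `restoring_div(7, 2, 4)` produces quotient 0011 (3) and remainder 0001 (1).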
  15. Optimized hardware
  16. MIPS division instructions
      • Use HI/LO registers for result
        – HI: 32-bit remainder
        – LO: 32-bit quotient
      • Instructions
        – div rs, rt / divu rs, rt
        – No overflow or divide-by-0 checking
          • What are values in HI/LO if divisor is 0?
          • Software must perform checks if required
        – Use mfhi, mflo to access result
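Because div performs no checks itself, the software checks mentioned on this slide could take roughly the following shape in C. The wrapper `checked_div` is hypothetical; it rejects a zero divisor and the one signed case whose quotient does not fit in 32 bits, then returns the quotient (LO) and remainder (HI).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrapper: performs the checks that div itself does not. */
bool checked_div(int32_t rs, int32_t rt, int32_t *lo, int32_t *hi) {
    if (rt == 0)
        return false;                      /* divide by zero */
    if (rs == INT32_MIN && rt == -1)
        return false;                      /* quotient +2^31 does not fit in 32 bits */
    *lo = rs / rt;                         /* quotient, truncated toward zero */
    *hi = rs % rt;                         /* remainder, takes the sign of the dividend */
    return true;
}
```

A caller would test the return value before reading `lo` and `hi`, in the same way that compiled code would branch around the div when the divisor is zero.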
  17. FLOATING POINT NUMBERS
  18. Floating point
      • Representation for non-integral numbers
        – Including very small and very large numbers
      • Like scientific notation
        – Normalized: $-2.54 \times 10^{56}$
        – Not normalized: $0.002 \times 10^{-4}$; $987.6 \times 10^{3}$
      • In binary
        – $\pm 1.xxxx_2 \times 2^{yyyy}$
      • In ANSI C: types float and double
  19. Floating point standard
      • Defined by IEEE Std 754-1985 (IEEE-754)
        – Developed in response to divergence of representations
        – Portability issues for scientific code
      • Now almost universally adopted
      • Two representations
        – Single precision (32-bit): float (C)
        – Double precision (64-bit): double (C)
  20. IEEE-754 format
      • Fields, from most to least significant bit:
        – S: 1 bit
        – Exponent: 8 bits (single), 11 bits (double)
        – Fraction: 23 bits (single), 52 bits (double)
      • $X_{10} = (-1)^{S} \times (1 + \text{Fraction}) \times 2^{(\text{Exponent} - \text{Bias})}$
      • S: sign bit (0 ⇒ non-negative, 1 ⇒ negative)
      • Normalized significand: 1.0 ≤ |significand| < 2.0
        – Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
        – Significand is Fraction with the "1." restored: 0 ≤ Fraction < 1.0
      • Exponent = actual exponent + Bias
        – Ensures exponent is unsigned
        – Single: Bias = 127; Double: Bias = 1023
  21. Example
      • Question: What is the decimal value of the floating point number 0x414C0000?
      • Answer:
        – 0x414C0000 ⇒ single precision
          • S = 0
          • Exponent = $1000\_0010_2 = 130$
          • F = $100\_1100\_0000\_\ldots\_0000_2 = 2^{-1} + 2^{-4} + 2^{-5} = 0.59375$
        – $X = (-1)^{0} \times (1 + 0.59375) \times 2^{130 - 127} = 12.75$
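The decoding steps of this example can be sketched in C by extracting the three fields from the 32-bit pattern and applying the formula for normalized numbers. The names `fields_t`, `split_single` and `decode_normalized` are hypothetical, and the sketch deliberately ignores the reserved exponent fields 0 and 255.

```c
#include <stdint.h>

typedef struct { unsigned sign; unsigned exponent; uint32_t fraction; } fields_t;

/* Split a single-precision bit pattern into S, Exponent and Fraction. */
fields_t split_single(uint32_t bits) {
    fields_t f = { bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu };
    return f;
}

/* Value of a normalized single (exponent field 1..254), computed in double. */
double decode_normalized(uint32_t bits) {
    fields_t f = split_single(bits);
    double significand = 1.0 + (double)f.fraction / 8388608.0;  /* 1 + Fraction / 2^23 */
    int e = (int)f.exponent - 127;                              /* remove the bias */
    double scale = 1.0;
    for (; e > 0; e--) scale *= 2.0;
    for (; e < 0; e++) scale /= 2.0;
    double x = significand * scale;
    return f.sign ? -x : x;
}
```

Applied to 0x414C0000, the fields come out as S = 0, Exponent = 130 and Fraction = 0x4C0000, and the decoded value is 12.75, matching the hand calculation.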
  22. Single precision range
      • Exponents 00000000 and 11111111 reserved
      • Smallest value
        – Exponent: 00000001 ⇒ actual exponent = 1 − 127 = −126
        – Fraction: 000…00 ⇒ significand = 1.0
        – $\pm 1.0 \times 2^{-126} \approx \pm 1.2 \times 10^{-38}$
      • Largest value
        – Exponent: 11111110 ⇒ actual exponent = 254 − 127 = +127
        – Fraction: 111…11 ⇒ significand ≈ 2.0
        – $\pm 2.0 \times 2^{127} \approx \pm 3.4 \times 10^{38}$
  23. Double precision range
      • Exponents 0000…00 and 1111…11 reserved
      • Smallest value
        – Exponent: 00000000001 ⇒ actual exponent = 1 − 1023 = −1022
        – Fraction: 000…00 ⇒ significand = 1.0
        – $\pm 1.0 \times 2^{-1022} \approx \pm 2.2 \times 10^{-308}$
      • Largest value
        – Exponent: 11111111110 ⇒ actual exponent = 2046 − 1023 = +1023
        – Fraction: 111…11 ⇒ significand ≈ 2.0
        – $\pm 2.0 \times 2^{1023} \approx \pm 1.8 \times 10^{308}$
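The range limits for both precisions can be rebuilt from the bias and field widths and compared with the constants in `<float.h>`. The sketch below assumes that `float` and `double` are the IEEE-754 single and double formats, which is common but not guaranteed by the C standard; the function names are hypothetical.

```c
#include <float.h>

/* 2^e by repeated doubling or halving; exact in double for -1022 <= e <= 1023. */
double two_to_the(int e) {
    double r = 1.0;
    for (; e > 0; e--) r *= 2.0;
    for (; e < 0; e++) r /= 2.0;
    return r;
}

/* Smallest normalized magnitude: significand 1.0, exponent field 1. */
double smallest_normal(int bias) {
    return 1.0 * two_to_the(1 - bias);
}

/* Largest finite magnitude: all-ones fraction, largest non-reserved exponent field. */
double largest_finite(int bias, int fraction_bits, int max_exp_field) {
    double significand = 2.0 - two_to_the(-fraction_bits);   /* 1.111...1 in binary */
    return significand * two_to_the(max_exp_field - bias);
}

/* Returns 1 if the reconstructed limits agree with <float.h> (assumes IEEE-754 types). */
int matches_float_h(void) {
    return smallest_normal(127) == (double)FLT_MIN
        && largest_finite(127, 23, 254) == (double)FLT_MAX
        && smallest_normal(1023) == DBL_MIN
        && largest_finite(1023, 52, 2046) == DBL_MAX;
}
```

The largest significand is written as 2 − 2^(−fraction bits) rather than 2.0, which is why the slides use "≈ 2.0".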
  24. Convert to IEEE-754
      • Step 1: Decide S (1: negative; 0: positive)
      • Step 2: Decide Fraction
        – Convert the integer part to binary
        – Convert the fractional part to binary
        – Adjust the integer and fractional parts according to the Significand format (1.xxx)
      • Step 3: Decide Exponent
  25. Example
      • Question: what is the IEEE-754 representation of 12.75?
      • Answer:
        – S = 0
        – $12.75 = 1100.11_2 = 1.10011_2 \times 2^{3}$
        – Exponent = 3 + 127 = 130
        – Fraction: $100\_1100\_0000\_0000\_0000\_0000_2$
        – 12.75 = 0x414C0000 in IEEE-754 single precision
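The three conversion steps can be sketched in C for values in the normalized single-precision range. The function `encode_single` is a hypothetical, simplified encoder: it truncates fraction bits beyond 23 instead of rounding and does not handle infinities, NaNs or denormals. `bits_of_float` reads the pattern the compiler itself produces, assuming `float` is IEEE-754 single precision, so the two can be compared.

```c
#include <stdint.h>
#include <string.h>

/* Encode a finite value in the normalized single range (extra fraction bits truncated). */
uint32_t encode_single(double x) {
    if (x == 0.0) return 0;                       /* zero is a special encoding */
    uint32_t sign = x < 0;                        /* Step 1: decide S */
    if (x < 0) x = -x;
    int e = 0;                                    /* Step 2: normalize to 1.xxx */
    while (x >= 2.0) { x /= 2.0; e++; }
    while (x < 1.0)  { x *= 2.0; e--; }
    uint32_t fraction = (uint32_t)((x - 1.0) * 8388608.0);  /* 23 fraction bits */
    uint32_t exponent = (uint32_t)(e + 127);      /* Step 3: add the bias */
    return (sign << 31) | (exponent << 23) | fraction;
}

/* Bit pattern of a float as stored by the machine (assumes IEEE-754 single). */
uint32_t bits_of_float(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For 12.75 the loop divides by two three times, leaving 1.59375 with exponent 3, so the encoder returns 0x414C0000, the same pattern as `bits_of_float(12.75f)` on an IEEE-754 machine.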
  26. Floating point addition
      • Question: how to add two 4-digit decimal floating point numbers: $9.999 \times 10^{1} + 1.610 \times 10^{-1}$
      • Answer: do the following steps
        1. Align decimal points
           – Shift the number with the smaller exponent
           – $9.999 \times 10^{1} + 0.016 \times 10^{1}$
        2. Add significands
           – $9.999 \times 10^{1} + 0.016 \times 10^{1} = 10.015 \times 10^{1}$
        3. Normalize result & check for over/underflow
           – $1.0015 \times 10^{2}$
        4. Round and renormalize if necessary
           – $1.002 \times 10^{2}$
  27. Floating point addition
      • Now consider a 4-digit binary example: $1.000_2 \times 2^{-1} + (-1.110_2 \times 2^{-2})$, i.e. $0.5 + (-0.4375)$
        1. Align binary points
           – Shift the number with the smaller exponent
           – $1.000_2 \times 2^{-1} + (-0.111_2 \times 2^{-1})$
        2. Add significands
           – $1.000_2 \times 2^{-1} + (-0.111_2 \times 2^{-1}) = 0.001_2 \times 2^{-1}$
        3. Normalize result & check for over/underflow
           – $1.000_2 \times 2^{-4}$, with no over/underflow
        4. Round and renormalize if necessary
           – $1.000_2 \times 2^{-4}$ (no change) $= 0.0625$
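The four addition steps can be sketched in C on a toy format with a 4-bit significand. Everything here is a hypothetical simplification: `toyfp_t` stores the significand as an integer scaled by 8 (so 1.000 is 8 and 1.110 is 14), eight guard bits are kept during alignment with bits shifted past them simply dropped, rounding is round-half-up on the magnitude, and exponent over/underflow is not checked.

```c
#include <stdint.h>

enum { FRAC = 3, GUARD = 8 };

/* Toy value = (-1)^sign * (sig / 2^FRAC) * 2^exp, with sig in 8..15 when normalized. */
typedef struct { int sign; unsigned sig; int exp; } toyfp_t;

toyfp_t toy_add(toyfp_t a, toyfp_t b) {
    if (a.exp < b.exp) { toyfp_t t = a; a = b; b = t; }     /* a has the larger exponent */
    int shift = a.exp - b.exp;
    int64_t wa = (int64_t)a.sig << GUARD;
    /* Step 1: align by shifting the smaller-exponent significand right. */
    int64_t wb = shift < 63 ? ((int64_t)b.sig << GUARD) >> shift : 0;
    /* Step 2: add the signed significands. */
    int64_t sum = (a.sign ? -wa : wa) + (b.sign ? -wb : wb);
    toyfp_t r = { sum < 0, 0, a.exp };
    uint64_t mag = (uint64_t)(sum < 0 ? -sum : sum);
    if (mag == 0) { r.sign = 0; return r; }                 /* exact zero */
    const uint64_t one = (uint64_t)1 << (FRAC + GUARD);
    /* Step 3: normalize so that 1.0 <= magnitude < 2.0. */
    while (mag >= 2 * one) { mag >>= 1; r.exp++; }
    while (mag < one)      { mag <<= 1; r.exp--; }
    /* Step 4: round away the guard bits, then renormalize if rounding carried out. */
    mag = (mag + ((uint64_t)1 << (GUARD - 1))) >> GUARD;
    if (mag >= ((uint64_t)2 << FRAC)) { mag >>= 1; r.exp++; }
    r.sig = (unsigned)mag;
    return r;
}
```

With a = {0, 8, −1} and b = {1, 14, −2}, the slide's operands, the aligned sum is 0.001 × 2^(−1) and normalization yields {0, 8, −4}, that is 1.000 × 2^(−4) = 0.0625.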
  28. Floating point adder hardware (figure: datapath annotated with Step 1 to Step 4)
  29. Floating point multiplication
      • Question: how to multiply two 4-digit decimal numbers: $1.110 \times 10^{10} \times 9.200 \times 10^{-5}$
      • Answer: do the following steps
        1. Add exponents
           – For biased exponents, subtract bias from sum
           – New exponent $= 10 + (-5) = 5$
        2. Multiply significands
           – $1.110 \times 9.200 = 10.212 \Rightarrow 10.212 \times 10^{5}$
        3. Normalize result & check for over/underflow
           – $1.0212 \times 10^{6}$
        4. Round and renormalize if necessary
           – $1.021 \times 10^{6}$
        5. Determine sign of result from signs of operands
           – $+1.021 \times 10^{6}$
  30. Floating point multiplication
      • Now consider a 4-digit binary example: $1.000_2 \times 2^{-1} \times (-1.110_2 \times 2^{-2})$, i.e. $0.5 \times (-0.4375)$
        1. Add exponents
           – Unbiased: $-1 + (-2) = -3$
           – Biased: $(-1 + 127) + (-2 + 127) - 127 = -3 + 254 - 127 = -3 + 127$
        2. Multiply significands
           – $1.000_2 \times 1.110_2 = 1.110_2 \Rightarrow 1.110_2 \times 2^{-3}$
        3. Normalize result & check for over/underflow
           – $1.110_2 \times 2^{-3}$ (no change), with no over/underflow
        4. Round and renormalize if necessary
           – $1.110_2 \times 2^{-3}$ (no change)
        5. Determine sign: +ve × −ve ⇒ −ve
           – $-1.110_2 \times 2^{-3} = -0.21875$
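The five multiplication steps can be sketched in C on the same kind of toy format with a 4-bit significand. The type `toyfp_t` and the functions `biased_exponent_sum` and `toy_mul` are hypothetical; operands are assumed to be normalized and nonzero (significand integer between 8 and 15, meaning 1.000 to 1.111), rounding is round-half-up, and exponent range is not checked.

```c
enum { FRAC = 3 };

/* Toy value = (-1)^sign * (sig / 2^FRAC) * 2^exp, with sig in 8..15. */
typedef struct { int sign; unsigned sig; int exp; } toyfp_t;

/* Step 1 with biased exponents: the bias is counted twice in the sum, so remove it once. */
int biased_exponent_sum(int ea_biased, int eb_biased) {
    return ea_biased + eb_biased - 127;
}

toyfp_t toy_mul(toyfp_t a, toyfp_t b) {
    toyfp_t r;
    r.exp = a.exp + b.exp;                      /* Step 1: add (unbiased) exponents */
    unsigned prod = a.sig * b.sig;              /* Step 2: product scaled by 2^(2*FRAC) */
    unsigned shift = FRAC;                      /* bits to drop to return to FRAC fraction bits */
    if (prod >= (2u << (2 * FRAC))) {           /* Step 3: product in [2, 4): normalize */
        shift++;
        r.exp++;
    }
    prod = (prod + (1u << (shift - 1))) >> shift;      /* Step 4: round half up */
    if (prod >= (2u << FRAC)) { prod >>= 1; r.exp++; } /* renormalize after rounding */
    r.sig = prod;
    r.sign = a.sign ^ b.sign;                   /* Step 5: sign from the operand signs */
    return r;
}
```

For the slide's operands {0, 8, −1} and {1, 14, −2}, the result is {1, 14, −3}, that is −1.110 × 2^(−3) = −0.21875, and `biased_exponent_sum(126, 125)` gives 124 = −3 + 127.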
  31. FP instructions in MIPS
      • FP hardware is coprocessor 1
        – Adjunct processor that extends the ISA
      • Separate FP registers
        – 32 single-precision: $f0, $f1, …, $f31
        – Paired for double-precision: $f0/$f1, $f2/$f3, …
          • Odd-numbered registers: right half of 64-bit floating-point numbers
      • FP instructions operate only on FP registers
        – Programs generally don't do integer ops on FP data, or vice versa
        – More registers with minimal code-size impact
      • FP load and store instructions
        – lwc1, ldc1, swc1, sdc1
          • e.g., ldc1 $f8, 32($sp)
  32. FP instructions in MIPS
      • Single-precision arithmetic
        – add.s, sub.s, mul.s, div.s
          • e.g., add.s $f0, $f1, $f6
      • Double-precision arithmetic
        – add.d, sub.d, mul.d, div.d
          • e.g., mul.d $f4, $f4, $f6
      • Single- and double-precision comparison
        – c.xx.s, c.xx.d (xx is eq, lt, le, …)
        – Sets or clears FP condition-code bit
          • e.g., c.lt.s $f3, $f4
      • Branch on FP condition code true or false
        – bc1t, bc1f
          • e.g., bc1t TargetLabel
  33. Example: °F to °C
      • C code:
          float f2c (float fahr){
            return ((5.0/9.0)*(fahr - 32.0));
          }
        – fahr in $f12, result in $f0, literals in global memory space
      • Compiled MIPS code:
          f2c: lwc1  $f16, const5($gp)
               lwc1  $f18, const9($gp)
               div.s $f16, $f16, $f18
               lwc1  $f18, const32($gp)
               sub.s $f18, $f12, $f18
               mul.s $f0, $f16, $f18
               jr    $ra
  34. FP machine instructions
  35. Accurate arithmetic
      • IEEE Std 754 specifies additional rounding control
        – Extra bits of precision (guard, round, sticky)
        – Choice of rounding modes
        – Allows programmer to fine-tune numerical behavior of a computation
      • Not all FP units implement all options
        – Most programming languages and FP libraries just use defaults
      • Trade-off between hardware complexity, performance, and market requirements
      • Who Cares About FP Accuracy?
  36. Concluding remarks
      • Bits have no inherent meaning
        – Interpretation depends on the instructions applied
      • Computer representations of numbers
        – Finite range and precision
        – Need to account for this in programs
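Both remarks can be illustrated with a short C sketch: the same 32 bits read as an integer or as a float, and an addition whose exact result is not representable. The helper names are hypothetical, and the sketch assumes that `float` is IEEE-754 single precision with the default round-to-nearest mode.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret 32 bits as a float (assumes float is IEEE-754 single precision). */
float bits_as_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret the same 32 bits as a signed two's-complement integer. */
int32_t bits_as_int(uint32_t bits) {
    int32_t i;
    memcpy(&i, &bits, sizeof i);
    return i;
}

/* Single-precision add; volatile forces the result to be rounded to float. */
float add_single(float a, float b) {
    volatile float s = a + b;
    return s;
}
```

The pattern 0x414C0000 is 12.75 through `bits_as_float` but 1095499776 through `bits_as_int`. With a 24-bit significand, `add_single(16777216.0f, 1.0f)` returns 16777216.0f again, because 2^24 + 1 has no single-precision representation.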
  37. The end